Planet LUV

December 04, 2020

etbe: KDE Icons Disappearing in Debian/Unstable

One of my workstations is running Debian/Unstable with KDE and SDDM on an AMD Radeon R7 260X video card. Recently it stopped displaying things correctly after a reboot: all the icons failed to display, as did many of the Qt controls. When I ran a KDE application from the command line I got the error “QSGTextureAtlas: texture atlas allocation failed, code=501”. Googling that error gave a blog post about a very similar issue in 2017 [1]. From that blog post I learned that I could stop the problem by setting MESA_EXTENSION_OVERRIDE="-GL_EXT_bgra -GL_EXT_texture_format_BGRA8888" in the environment. In a quick test I found that the environment variable setting worked, making the KDE apps display correctly and not report an error about a texture atlas.

I created a file ~/.config/plasma-workspace/env/bgra.sh with the following contents:

export MESA_EXTENSION_OVERRIDE="-GL_EXT_bgra -GL_EXT_texture_format_BGRA8888"

Then after the next login things worked as desired!
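
For a one-off test before committing to a login script, the same override can be applied to a single process. This is a minimal Python sketch, not something from the original fix: env_with_override and launch_with_override are hypothetical helper names, and the command you pass is whichever application misbehaves.

```python
import os
import subprocess

# Value taken from the post; it asks Mesa to hide these two GL extensions.
OVERRIDE = "-GL_EXT_bgra -GL_EXT_texture_format_BGRA8888"


def env_with_override(base=None):
    """Return a copy of the environment with MESA_EXTENSION_OVERRIDE set."""
    env = dict(os.environ if base is None else base)
    env["MESA_EXTENSION_OVERRIDE"] = OVERRIDE
    return env


def launch_with_override(command):
    """Run one command (a list of arguments) with the override applied.

    Hypothetical helper: only the child process sees the variable, so the
    rest of the session is unaffected.
    """
    return subprocess.run(command, env=env_with_override(), check=False)
```

Calling launch_with_override(["dolphin"]) would start that one application with the override (dolphin is only an example), which makes it easy to confirm the workaround without logging out.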

Now the issue is, where is the bug? GL, X, and the internals of KDE are things I don’t track much. I welcome suggestions from readers of my blog as to what the culprit might be and where to file a Debian bug – or a URL to a Debian bug report if someone has already filed one.

Update

When I run the game warzone2100 with this setting it crashes with the below output. So this Mesa extension override isn’t always a good thing, it just works around one corner case of a bug.

$ warzone2100 
/usr/bin/gdb: warning: Couldn't determine a path for the index cache directory.
27      ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
No frame at level 0x7ffc3392ab50.
Saved dump file to '/home/etbe/.local/share/warzone2100-3.3.0//logs/warzone2100.gdmp-VuGo2s'
If you create a bugreport regarding this crash, please include this file.
Segmentation fault (core dumped)

Update 2

Carsten provided the REAL solution to this: run “apt remove libqt5quick5-gles”, which will automatically install “libqt5quick5” and make things work. Another workstation I run that tracks Testing had libqt5quick5 installed, which was why it didn’t have the problem.

The system in question had most of KDE removed due to package dependency issues when tracking Unstable and when I reinstalled it I guess the wrong one was installed.

November 25, 2020

Stewart Smith: Why you should use `nproc` and not grep /proc/cpuinfo

There’s something really quite subtle about how the nproc utility from GNU coreutils works. If you look at the man page, it’s even the very first sentence:

Print the number of processing units available to the current process, which may be less than the number of online processors.

So, what does that actually mean? Well, just because the computer some code is running on has a certain number of CPUs (and here I mean “number of hardware threads”) doesn’t necessarily mean that you can spawn a process that uses that many. What’s a simple example? Containers! Did you know that when you invoke docker to run a container, you can easily limit how much CPU the container can use? In this case, we’re looking at the --cpuset-cpus parameter, as the --cpus one works differently.

$ nproc
8

$ docker run --cpuset-cpus=0-1 --rm=true -it  amazonlinux:2
bash-4.2# nproc
2
bash-4.2# exit

$ docker run --cpuset-cpus=0-2 --rm=true -it  amazonlinux:2
bash-4.2# nproc
3

As you can see, nproc here gets the right bit of information, so if you’re wanting to do a calculation such as “Please use up to the maximum available CPUs” as a parameter to the configuration of a piece of software (such as how many threads to run), you get the right number.

But what if you use some of the other common methods?

$ /usr/bin/lscpu -p | grep -c "^[0-9]"
8
$ grep -c 'processor' /proc/cpuinfo 
8

$ docker run --cpuset-cpus=0-1 --rm=true -it  amazonlinux:2
bash-4.2# yum install -y /usr/bin/lscpu
......
bash-4.2# /usr/bin/lscpu -p | grep -c "^[0-9]"
8
bash-4.2# grep -c 'processor' /proc/cpuinfo 
8
bash-4.2# nproc
2

In this case, if you base your number of threads off grepping lscpu you take another dependency (on the util-linux package), which isn’t needed. You also get the wrong answer, as you do by grepping /proc/cpuinfo. So, what this will end up doing is just increase the number of context switches, possibly also adding a performance degradation. It’s not just in docker containers where this could be an issue of course, you can use the same mechanism that docker uses anywhere you want to control resources of a process.
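
The same distinction shows up inside programs that size their own thread pools. A minimal Python sketch, assuming Linux: os.sched_getaffinity(0) returns the set of CPUs the current process may run on (roughly the number nproc prints), while os.cpu_count() normally reports the CPUs in the machine. The function names usable_cpus and online_cpus are mine, not from any library.

```python
import os


def usable_cpus():
    """CPUs this process is allowed to run on (respects cpusets/affinity)."""
    try:
        # Linux-only call; honours restrictions such as docker --cpuset-cpus.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: fall back to the machine count.
        return os.cpu_count() or 1


def online_cpus():
    """CPUs in the machine, comparable to counting /proc/cpuinfo entries."""
    return os.cpu_count() or 1


print("usable:", usable_cpus(), "online:", online_cpus())
```

Sizing a worker pool from usable_cpus() rather than online_cpus() avoids the oversubscription described above when the process is confined to a subset of CPUs.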

Another subtle thing to watch out for is differences in /proc/cpuinfo content depending on CPU architecture. You may not think it’s an issue today, but who wants to needlessly debug something?

tl;dr: for determining “how many processes to run”: use nproc, don’t grep lscpu or /proc/cpuinfo

November 16, 2020

Stewart Smith: Photos from Tasmania (2017)

On the random old photos train, there’s some from spending time in Tasmania post linux.conf.au 2017 in Hobart.

All of these are Kodak E100VS film, which was no doubt a bit out of date by the time I shot it (and when they stopped making Ektachrome for a while). It was a nice surprise to be reminded of a truly wonderful Tassie trip, taken with friends, and after the excellent linux.conf.au.

November 15, 2020

Stewart Smith: Photos from Melbourne

I recently got around to scanning some film that took an awful long time to make its way back to me after being developed. There’s some pictures from home.

The rest of this roll of 35mm Fuji Velvia 50 is from Tasmania, which would place this all around December 2016.

Stewart Smith: Photos from long ago….

It’s strange to get unexpected photos from a while ago. It’s also joyous.

These photos above are from a park down the street from where we used to live. I believe it was originally a quarry, and a number of years ago the community got together and turned it into a park. It’s a quite decent size (Parkrun is held there), and there’s plenty of birds (and ducks!) to see.

Moorabbin Station

It’s a very strange feeling seeing photos from both the before time, and from where I used to live. I’m sure that if the world wasn’t the way it was now, and there wasn’t a pandemic, it would feel different.

All of the above were shot on a Nikon F80 with 35mm Fuji Velvia 50 film.

November 08, 2020

etbe: Links November 2020

KDE has a long term problem of excessive CPU time used by the screen locker [1]. Part of it is due to software GL emulation, and part of it is due to the screen locker doing things like flashing the cursor when nothing else is happening. One of my systems has an NVidia card and enabling GL would cause it to crash. So now I have kscreenlocker using 30% of a CPU core even when the screen is powered down.

Informative NYT article about the latest security features for iPhones [2]. Android needs new features like this!

Russ Allbery wrote an interesting review of the book Hand to Mouth by Linda Tirado [3], it’s about poverty in the US and related things. Linda first became Internet famous for her essay “Why I Make Terrible Decisions or Poverty Thoughts” which is very insightful and well written, this is the latest iteration of that essay [4].

This YouTube video by Ruby Payne gives great insights to class based attitudes towards time and money [5].

Newsweek has an interesting article about chicken sashimi, apparently you can safely eat raw chicken if it’s prepared well [6].

Vanity Fair has an informative article about how QAnon and Trumpism have infected the Catholic Church [7]. Some of Mel Gibson’s mental illness is affecting a significant portion of the Catholic Church in the US and some parts in the rest of the world.

Noema has an interesting article on toxic Internet culture, Japan’s 2chan, 4chan, 8chan/8kun, and the conspiracy theories they spawned [8].

Benjamin Corey is an ex-Fundie who wrote an amusing analysis of the Biblical statements about the anti-Christ [9].

NYMag has an interesting article The Final Gasp of Donald Trump’s Presidency [10].

Mother Jones has an informative article about the fact that Jim Watkins (the main person behind QAnon) has a history of hosting child porn on sites he runs [11], but we all knew QAnon was never about protecting kids.

Eand has an insightful article America’s Problem is That White People Want It to Be a Failed State [12].

October 19, 2020

etbe: Video Decoding

I’ve had a saga of getting 4K monitors to work well. My latest issue has been video playing, the dreaded mplayer error about the system being too slow. My previous post about 4K was about using DisplayPort to get more than 30Hz scan rate at 4K [1]. I now have a nice 60Hz scan rate which makes WW2 documentaries display nicely among other things.

But when running a 4K monitor on a 3.3GHz i5-2500 quad-core CPU I can’t get a FullHD video to display properly. Part of the process of decoding the video and scaling it to 4K resolution is too slow, so action scenes in movies lag. When running a 2560*1440 monitor on a 2.4GHz E5-2440 hex-core CPU with the mplayer option “-lavdopts threads=3” everything is great (but it fails if mplayer is run with no parameters). In doing tests with apparent performance it seemed that the E5-2440 CPU gains more from the threaded mplayer code than the i5-2500, maybe the E5-2440 is more designed for server use (it’s in a Dell PowerEdge T320 while the i5-2500 is in a random white-box system) or maybe it’s just because it’s newer. I haven’t tested whether the i5-2500 system could perform adequately at 2560*1440 resolution.

The E5-2440 system has an ATI HD 6570 video card which is old, slow, and only does PCIe 2.1 which gives 5GT/s or 8GB/s. The i5-2500 system has a newer ATI video card that is capable of PCIe 3.0, but “lspci -vv” as root says “LnkCap: Port #0, Speed 8GT/s, Width x16” and “LnkSta: Speed 5GT/s (downgraded), Width x16 (ok)”. So for reasons unknown to me the system with a faster PCIe 3.0 video card is being downgraded to PCIe 2.1 speed. A quick check of the web site for my local computer store shows that all ATI video cards costing less than $300 have PCIe 3.0 interfaces and the sole ATI card with PCIe 4.0 (which gives double the PCIe speed if the motherboard supports it) costs almost $500. I’m not inclined to spend $500 on a new video card and then a greater amount of money on a motherboard supporting PCIe 4.0 and CPU and RAM to go in it.

According to my calculations 3840*2160 resolution at 24bpp (probably 32bpp data transfers) at 30 frames/sec means 3840*2160*4*30/1024/1024=950MB/s. PCIe 2.1 can do 8GB/s so that probably isn’t a significant problem.
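
The arithmetic above can be reproduced in a few lines. This is just a worked check of the figures in the paragraph, assuming 4 bytes per pixel and treating the quoted 8GB/s as 8000MB/s; the variable names are mine.

```python
WIDTH, HEIGHT = 3840, 2160   # 4K resolution from the post
BYTES_PER_PIXEL = 4          # 24bpp colour, assumed to be moved as 32bpp
FRAMES_PER_SECOND = 30

bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FRAMES_PER_SECOND
mib_per_second = bytes_per_second / 1024 / 1024

# The 8GB/s figure quoted for a PCIe 2.1 x16 link, taken as 8000MB/s.
PCIE_2_X16_MB_PER_SECOND = 8000
share_of_link = (bytes_per_second / 1e6) / PCIE_2_X16_MB_PER_SECOND

print(f"{mib_per_second:.0f} MiB/s, about {share_of_link:.0%} of the link")
```

The result is roughly 949MiB/s, or about an eighth of the link, which supports the conclusion that the bus itself is unlikely to be the bottleneck.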

I’d been planning on buying a new video card for the E5-2440 system, but due to some combination of having a better CPU and lower screen resolution it is working well for video playing so I can save my money.

As an aside the T320 is a server class system that had been running for years in a corporate DC. When I replaced the high speed SAS disks with SATA SSDs it became quiet enough for a home workstation. It works very well at that task but the BIOS is quite determined to keep the motherboard video running due to the remote console support. So swapping monitors around was more pain than I felt like going through, I just got it working and left it. I ordered a special GPU power cable but found that the older video card that doesn’t need an extra power cable performs adequately before the cable arrived.

Here is a table comparing the systems.

              2560*1440 works well    3840*2160 goes slow
System        Dell PowerEdge T320     White Box PC from rubbish
CPU           2.4GHz E5-2440          3.3GHz i5-2500
Video Card    ATI Radeon HD 6570      ATI Radeon R7 260X
PCIe Speed    PCIe 2.1 – 8GB/s        PCIe 3.0 downgraded to PCIe 2.1 – 8GB/s

Conclusion

The ATI Radeon HD 6570 video card is one that I had previously tested and found inadequate for 4K support, I can’t remember if it didn’t work at that resolution or didn’t support more than 30Hz scan rate. If the 2560*1440 monitor dies then it wouldn’t make sense to buy anything less than a 4K monitor to replace it which means that I’d need to get a new video card to match. But for the moment 2560*1440 is working well enough so I won’t upgrade it any time soon. I’ve already got the special power cable (specified as being for a Dell PowerEdge R610 for anyone else who wants to buy one) so it will be easy to install a powerful video card in a hurry.

October 12, 2020

etbe: First Attempt at Gnocchi-Statsd

I’ve been investigating the options for tracking system statistics to diagnose performance problems. The idea is to track all sorts of data about the system (network use, disk IO, CPU, etc) and look for correlations at times of performance problems. DataDog is pretty good for this but expensive, it’s apparently based on or inspired by the Etsy Statsd. It’s claimed that the gnocchi-statsd is the best implementation of the protocol used by the Etsy Statsd, so I decided to install that.

I use Debian/Buster for this as that’s what I’m using for the hardware that runs KVM VMs. Here is what I did:

# it depends on a local MySQL database
apt -y install mariadb-server mariadb-client
# install the basic packages for gnocchi
apt -y install gnocchi-common python3-gnocchiclient gnocchi-statsd uuid

In the Debconf prompts I told it to “setup a database” and not to manage keystone_authtoken with debconf (because I’m not doing a full OpenStack installation).

This gave a non-working configuration as it didn’t configure the MySQL database for the [indexer] section and the sqlite database that was configured didn’t work for unknown reasons. I filed Debian bug #971996 about this [1]. To get this working you need to edit /etc/gnocchi/gnocchi.conf and change the url line in the [indexer] section to something like the following (where the password is taken from the [database] section).

url = mysql+pymysql://gnocchi-common:PASS@localhost:3306/gnocchidb

To get the statsd interface going you have to install the gnocchi-statsd package and edit /etc/gnocchi/gnocchi.conf to put a UUID in the resource_id field (the Debian package uuid is good for this). I filed Debian bug #972092 requesting that the UUID be set by default on install [2].
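
The two manual edits could be scripted along these lines. This is only a sketch under stated assumptions: it works on an in-memory sample rather than the real /etc/gnocchi/gnocchi.conf, the sqlite URL in the sample is a placeholder, and I am assuming resource_id lives in a [statsd] section. patch_config is a hypothetical helper name. Note that configparser drops comments when it rewrites a file, so check the result before using anything like this on a real system.

```python
import configparser
import io
import uuid

# Stand-in for gnocchi.conf; the real file has many more options.
# The sqlite URL and the [statsd] section name are assumptions.
SAMPLE = """\
[indexer]
url = sqlite:////var/lib/gnocchi/gnocchidb

[statsd]
resource_id =
"""


def patch_config(text, db_password):
    """Return config text with the indexer URL and statsd resource_id set."""
    # interpolation=None so a '%' in the password is not treated specially.
    conf = configparser.ConfigParser(interpolation=None)
    conf.read_string(text)
    conf["indexer"]["url"] = (
        f"mysql+pymysql://gnocchi-common:{db_password}@localhost:3306/gnocchidb"
    )
    # A random UUID, the same kind of value the uuid package would print.
    conf["statsd"]["resource_id"] = str(uuid.uuid4())
    out = io.StringIO()
    conf.write(out)
    return out.getvalue()


print(patch_config(SAMPLE, "PASS"))
```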

Here’s an official page about how to operate Gnocchi [3]. The main thing I got from this was that the following commands need to be run from the command-line (I ran them as root in a VM for test purposes but would do so with minimum privs for a real deployment).

gnocchi-api
gnocchi-metricd

To communicate with Gnocchi you need the gnocchi-api program running, which uses the uwsgi program to provide the web interface by default. It seems that this was written for a version of uwsgi different than the one in Buster. I filed Debian bug #972087 with a patch to make it work with uwsgi [4]. Note that I didn’t get to the stage of an end to end test, I just got it to basically run without error.

After getting “gnocchi-api” running (in a terminal not as a daemon as Debian doesn’t seem to have a service file for it), I ran the client program “gnocchi” and then gave it the “status” command which failed (presumably due to the metrics daemon not running), but at least indicated that the client and the API could communicate.

Then I ran the “gnocchi-metricd” and got the following error:

2020-10-12 14:59:30,491 [9037] ERROR    gnocchi.cli.metricd: Unexpected error during processing job
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/gnocchi/cli/metricd.py", line 87, in run
    self._run_job()
  File "/usr/lib/python3/dist-packages/gnocchi/cli/metricd.py", line 248, in _run_job
    self.coord.update_capabilities(self.GROUP_ID, self.store.statistics)
  File "/usr/lib/python3/dist-packages/tooz/coordination.py", line 592, in update_capabilities
    raise tooz.NotImplemented
tooz.NotImplemented

At this stage I’ve had enough of gnocchi. I’ll give the Etsy Statsd a go next.

Update

Thomas has responded to this post [5]. At this stage I’m not really interested in giving Gnocchi another go. There’s still the issue of the indexer database which should be different from the main database somehow and sqlite (the config file default) doesn’t work.

I expect that if I was to persist with Gnocchi I would encounter more poorly described error messages from the code which either don’t have Google hits when I search for them or have Google hits to unanswered questions from 5+ years ago.

The Gnocchi systemd config files are in different packages to the programs, this confused me and I thought that there weren’t any systemd service files. I had expected that installing a package with a daemon binary would also get the systemd unit file to match.

The cluster features of Gnocchi are probably really good if you need that sort of thing. But if you have a small instance (EG a single VM server) then it’s not needed. Also one of the original design ideas of the Etsy Statsd was that UDP was used because data could just be dropped if there was a problem. I think for many situations the same concept could apply to the entire stats service.
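
The fire-and-forget idea is easy to see in code. Below is a minimal Python sketch of a statsd-style sender: one "name:value|type" line per UDP datagram, with no reply expected. Port 8125 is the conventional statsd port; format_metric and send_metric are hypothetical helper names, not part of any statsd client library.

```python
import socket


def format_metric(name, value, metric_type="c"):
    """Build one statsd line, e.g. b'disk.reads:1|c' for a counter."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    """Send a metric over UDP and return immediately.

    Nothing is awaited and nothing is retried, so if the collector is down
    the datagram is simply lost.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_metric(name, value, metric_type), (host, port))
```

A call such as send_metric("example.requests", 1) increments a counter, and send_metric("example.latency", 320, "ms") records a timing. The application never blocks on the stats service, which is the property that makes dropping data acceptable.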

If the other statsd programs don’t do what I need then I may give Gnocchi another go.

July 17, 2020

Dave Hall: If You’re not Using YAML for CloudFormation Templates, You’re Doing it Wrong

In my last blog post, I promised a rant about using YAML for CloudFormation templates. Here it is. If you persevere to the end I’ll also show you how to convert your existing JSON based templates to YAML.

Many of the points I raise below don’t just apply to CloudFormation. They are general comments about why you should use YAML over JSON for configuration when you have a choice.

One criticism of YAML is its reliance on indentation. A lot of the code I write these days is Python, so indentation being significant is normal. Use a decent editor or IDE and this isn’t a problem. It doesn’t matter if you’re using JSON or YAML, you will want to validate and lint your files anyway. How else will you find that trailing comma in your JSON object?

Now we’ve got that out of the way, let me try to convince you to use YAML.

As developers we are regularly told that we need to document our code. CloudFormation is Infrastructure as Code. If it is code, then we need to document it. That starts with the Description property at the top of the file. If you use JSON for your templates, that’s it, you have no other opportunity to document your templates. On the other hand, if you use YAML you can add inline comments. Anywhere you need a comment, drop in a hash # and your comment. Your team mates will thank you.

JSON templates don’t support multiline strings. These days many developers have 4K or ultra wide monitors, but we don’t want a string that spans the full width of our 34” screen. Text becomes harder to read once you exceed that “90ish” character limit. With JSON your multiline string becomes "[90ish-characters]\n[another-90ish-characters]\n[and-so-on]". If you opt for YAML, you can use the greater than symbol (>) and then start your multiline string like so:

Description: >
  This is the first line of my Description
  and it continues on my second line
  and I'll finish it on my third line.

As you can see, it is much easier to work with multiline strings in YAML than in JSON.

“Folded blocks” like the one above are created using >, which replaces new lines with spaces. This allows you to format your text in a more readable way while still letting a machine use it as intended. If you want to preserve the new lines, use the pipe (|) to create a “literal block”. This is great for inline Lambda functions, where the code remains readable and maintainable.

  APIFunction:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          import json
          import random


          def lambda_handler(event, context):
              return {"statusCode": 200, "body": json.dumps({"value": random.random()})}
      FunctionName: "GetRandom"
      Handler: "index.lambda_handler"
      MemorySize: 128
      Role: !GetAtt LambdaServiceRole.Arn
      Runtime: "python3.7"
      Timeout: 5

Both JSON and YAML require you to escape multibyte characters. That’s less of an issue with CloudFormation templates as generally you’re only using the ASCII character set.

In a YAML file generally you don’t need to quote your strings, but in JSON double quotes are used everywhere: keys, string values and so on. If your string contains a quote you need to escape it. The same goes for tabs, new lines, backslashes and so on. JSON based CloudFormation templates can be hard to read because of all the escaping. It also makes it harder to handcraft your JSON when your code is a long escaped string on a single line.
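
To see how much escaping that involves, the sketch below feeds the handler from the Lambda example above through Python's json module. It is an illustration only; nothing here is CloudFormation specific.

```python
import json

# The same handler as in the Lambda example, as ordinary source text.
source = '''import json
import random


def lambda_handler(event, context):
    return {"statusCode": 200, "body": json.dumps({"value": random.random()})}
'''

# What a JSON template has to carry for the ZipFile property:
# one long line with every quote and new line escaped.
escaped = json.dumps({"ZipFile": source})
print(escaped)

# Round-tripping shows nothing is lost; it is just hard to read and edit.
assert json.loads(escaped)["ZipFile"] == source
```

The printed value is a single line full of \n and \" sequences, which is what you would have to maintain by hand in a JSON template.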

Some configuration in CloudFormation can only be expressed as JSON. Step Functions and some of the AppSync objects in CloudFormation only allow inline JSON configuration. You can still use a YAML template and it is easier if you do when working with these objects.

The JSON only configuration needs to be inlined in your template. If you’re using JSON you have to supply this as an escaped string, rather than nested objects. If you’re using YAML you can inline it as a literal block. Both YAML and JSON templates support functions such as Sub being applied to these strings, but it is so much more readable with YAML. See this Step Function example lifted from the AWS documentation:

MyStateMachine:
  Type: "AWS::StepFunctions::StateMachine"
  Properties:
    DefinitionString:
      !Sub |
        {
          "Comment": "A simple AWS Step Functions state machine that automates a call center support session.",
          "StartAt": "Open Case",
          "States": {
            "Open Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:open_case",
              "Next": "Assign Case"
            }, 
            "Assign Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:assign_case",
              "Next": "Work on Case"
            },
            "Work on Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:work_on_case",
              "Next": "Is Case Resolved"
            },
            "Is Case Resolved": {
              "Type": "Choice",
              "Choices": [
                {
                  "Variable": "$.Status",
                  "NumericEquals": 1,
                  "Next": "Close Case"
                },
                {
                  "Variable": "$.Status",
                  "NumericEquals": 0,
                  "Next": "Escalate Case"
                }
              ]
            },
            "Close Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:close_case",
              "End": true
            },
            "Escalate Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:escalate_case",
              "Next": "Fail"
            },
            "Fail": {
              "Type": "Fail",
              "Cause": "Engage Tier 2 Support."
            }
          }
        }
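
Because the definition is just text inside a literal block, a stray comma in it will not be caught by a YAML linter. One way to check it locally is sketched below in Python, under some assumptions: lint_definition is a hypothetical helper, the definition is a shortened version of the example above, and replacing every ${...} placeholder with 0 is my own trick so the text parses whether the placeholder sits inside a string or stands alone as a value. It does not reproduce what CloudFormation's Sub actually substitutes.

```python
import json
import re

# Matches CloudFormation-style ${...} placeholders inside a Sub string.
PLACEHOLDER = re.compile(r"\$\{[^}]+\}")


def lint_definition(definition):
    """Parse an inline definition after neutralising ${...} placeholders.

    Raises json.JSONDecodeError if the JSON is malformed.
    """
    return json.loads(PLACEHOLDER.sub("0", definition))


# Shortened from the example above; "End": true replaces the later states.
definition = """
{
  "StartAt": "Open Case",
  "States": {
    "Open Case": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:open_case",
      "End": true
    }
  }
}
"""
parsed = lint_definition(definition)
print(sorted(parsed["States"]))
```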

If you’re feeling lazy you can use inline JSON for IAM policies that you’ve copied from elsewhere. It’s quicker than converting them to YAML.

YAML templates are smaller and more compact than the same configuration stored in a JSON based template. Smaller yet more readable is winning all round in my book.

If you’re still not convinced that you should use YAML for your CloudFormation templates, go read Amazon’s blog post from 2017 advocating the use of YAML based templates.

Amazon makes it easy to convert your existing templates from JSON to YAML. cfn-flip is a Python based AWS Labs tool for converting CloudFormation templates between JSON and YAML. I will assume you’ve already installed cfn-flip. Once you’ve done that, converting your templates with some automated cleanups is just a command away:

cfn-flip --clean template.json template.yaml

git rm the old json file, git add the new one and git commit and git push your changes. Now you’re all set for your new life using YAML based CloudFormation templates.

If you want to learn more about YAML files in general, I recommend you check out Learn X in Y Minutes’ Guide to YAML. If you want to learn more about YAML based CloudFormation templates, check Amazon’s Guide to CloudFormation Templates.

July 09, 2020

Dave Hall: Logging Step Functions to CloudWatch

Many AWS Services log to CloudWatch. Some do it out of the box, others need to be configured to log properly. When Amazon released Step Functions, they didn’t include support for logging to CloudWatch. In February 2020, Amazon announced Step Functions could now log to CloudWatch. Step Functions still support CloudTrail logs, but CloudWatch logging is more useful for many teams.

Users need to configure Step Functions to log to CloudWatch. This is done on a per State Machine basis. Of course you could click around the console to enable it, but that doesn’t scale. If you use CloudFormation to manage your Step Functions, it is only a few extra lines of configuration to add the logging support.

In my example I will assume you are using YAML for your CloudFormation templates. I’ll save my “if you’re using JSON for CloudFormation you’re doing it wrong” rant for another day. This is a cut down example from one of my services:

---
AWSTemplateFormatVersion: '2010-09-09'
Description: StepFunction with Logging Example.
Parameters:
Resources:
  StepFunctionExecRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service: !Sub "states.${AWS::Region}.amazonaws.com"
          Action:
          - sts:AssumeRole
      Path: "/"
      Policies:
      - PolicyName: StepFunctionExecRole
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - lambda:InvokeFunction
            - lambda:ListFunctions
            Resource: !Sub "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:my-lambdas-namespace-*"
          - Effect: Allow
            Action:
            - logs:CreateLogDelivery
            - logs:GetLogDelivery
            - logs:UpdateLogDelivery
            - logs:DeleteLogDelivery
            - logs:ListLogDeliveries
            - logs:PutResourcePolicy
            - logs:DescribeResourcePolicies
            - logs:DescribeLogGroups
            Resource: "*"
  MyStateMachineLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: /aws/stepfunction/my-step-function
      RetentionInDays: 14
  DashboardImportStateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      StateMachineName: my-step-function
      StateMachineType: STANDARD
      LoggingConfiguration:
        Destinations:
          - CloudWatchLogsLogGroup:
              LogGroupArn: !GetAtt MyStateMachineLogGroup.Arn
        IncludeExecutionData: True
        Level: ALL
      DefinitionString:
        !Sub |
        {
          ... JSON Step Function definition goes here
        }
      RoleArn: !GetAtt StepFunctionExecRole.Arn

The key pieces in this example are the second statement in the IAM Role with all the logging permissions, the LogGroup defined by MyStateMachineLogGroup and the LoggingConfiguration section of the Step Function definition.

The IAM role permissions are copied from the example policy in the AWS documentation for using CloudWatch Logging with Step Functions. The CloudWatch IAM permissions model is pretty weak, so we need to grant these broad permissions.

The LogGroup definition creates the log group in CloudWatch. You can use what ever value you want for the LogGroupName. I followed the Amazon convention of prefixing everything with /aws/[service-name]/ and then appended the Step Function name. I recommend using the RetentionInDays configuration. It stops old logs sticking around for ever. In my case I send all my logs to ELK, so I don’t need to retain them in CloudWatch long term.

Finally we use the LoggingConfiguration to tell AWS where we want to send our logs. You can only specify a single entry in Destinations. IncludeExecutionData determines whether the inputs and outputs of each function call are logged. You should not enable this if you are passing sensitive information between your steps. The verbosity of logging is controlled by Level. Amazon has a page on Step Function log levels. For dev you probably want to use ALL to help with debugging but in production you probably only need ERROR level logging.

I removed the Parameters and Outputs from the template. Use them as you need to.
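
If you generate templates from code, the dev versus production choice can be captured in one place. The Python sketch below is an illustration only: logging_configuration is a hypothetical helper, the "prod" environment name and the example ARN are placeholders, and turning off IncludeExecutionData in production is my assumption based on the sensitivity warning, not an AWS requirement.

```python
import json

# Levels Step Functions accepts for LoggingConfiguration.Level.
VALID_LEVELS = {"ALL", "ERROR", "FATAL", "OFF"}


def logging_configuration(log_group_arn, environment):
    """Build a LoggingConfiguration mapping for the given environment."""
    is_prod = environment == "prod"
    level = "ERROR" if is_prod else "ALL"
    assert level in VALID_LEVELS
    return {
        # Only a single destination is allowed.
        "Destinations": [
            {"CloudWatchLogsLogGroup": {"LogGroupArn": log_group_arn}}
        ],
        # Assumption: keep step inputs and outputs out of production logs.
        "IncludeExecutionData": not is_prod,
        "Level": level,
    }


# Placeholder ARN for illustration; use the ARN of your own log group.
EXAMPLE_ARN = (
    "arn:aws:logs:us-east-1:123456789012:"
    "log-group:/aws/stepfunction/my-step-function:*"
)
print(json.dumps(logging_configuration(EXAMPLE_ARN, "dev"), indent=2))
```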

April 01, 2020

Dave Hall: Zoom's Make or Break Moment

Zoom is experiencing massive growth as large sections of the workforce transition to working from home. At the same time many problems with Zoom are coming to light. This is their make or break moment. If they fix the problems they end up with a killer video conferencing app. The alternative is that they join Cisco's Webex in the dumpster fire of awful enterprise software.

In the interest of transparency I am a paying Zoom customer and I use it for hours every day. I also use Webex (under protest) as it is a client's video conferencing platform of choice.

In the middle of last year Jonathan Leitschuh disclosed two bugs in Zoom with security and privacy implications. There was a string of failures that led to these bugs. To Zoom’s credit they published a long blog post about why these “features” were there in the first place.

Over the last couple of weeks other issues with Zoom have surfaced. “Zoom bombing” or using random 9 digit numbers to find meetings has become a thing. This is caused by Zoom’s meeting rooms having a 9 digit code to join. That’s really handy when you have to dial in and enter the number on your telephone keypad. The down side is that you have a 1 in 999 999 999 chance of joining a meeting when using a random number. Zoom does offer the option of requiring a password or PIN for each call. Unfortunately it isn’t the default. Publishing a blog post on how to secure your meetings isn’t enough, the app needs to be more secure by default. The app should default to enabling a 6 digit PIN when creating a meeting.
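
How much a PIN helps is simple arithmetic. In practice the odds of a random guess hitting a live meeting also depend on how many meetings are running at once, so the Python sketch below takes that as a parameter. Everything in it is an assumption for illustration: guess_odds is a hypothetical helper, every 9 digit number is treated as a possible ID, and the figure of 100,000 concurrent meetings is invented.

```python
def guess_odds(active_meetings, id_digits=9, pin_digits=0):
    """Chance that one random guess lands in a live meeting.

    Treats every id_digits-long number as a possible ID, which is a
    simplification, and assumes the PIN must be guessed at the same time.
    """
    id_space = 10 ** id_digits
    pin_space = 10 ** pin_digits
    return active_meetings / (id_space * pin_space)


# 100,000 concurrent meetings is an invented figure for illustration only.
without_pin = guess_odds(100_000)
with_pin = guess_odds(100_000, pin_digits=6)
print(f"no PIN: 1 in {1 / without_pin:,.0f}")
print(f"6 digit PIN: 1 in {1 / with_pin:,.0f}")
```

Whatever the real number of meetings is, a 6 digit PIN makes a blind guess a million times less likely to succeed, which is why it matters as a default.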

The Intercept is reporting Zoom’s marketing department got a little carried away when describing the encryption used in the product. This is an area where words matter. Encryption in transit is a base line requirement in communication tools these days. Zoom has this, but their claims about end to end encryption appear to be false. End to end encryption is very important for some use cases. I await the blog post explaining this one.

I don’t know why Proton Mail’s privacy issues blog post got so much attention. This appears to be based on someone skimming the documentation rather than any real testing. Regardless the post got a lot of traction. Some of the same issues were flagged by the EFF.

Until recently Zoom’s FAQ read “Does Zoom sell Personal Data? […] Depends what you mean by ‘sell’”. I’m sure that sounded great in a meeting but it is worrying when you read it as a customer. Once called out on social media it was quickly updated and a blog post published. In the post, Zoom assures users it isn’t selling their data.

Joseph Cox reported late last week that Zoom was sending data to Facebook every time someone used their iOS app. It is unclear if Joe gave Zoom an opportunity to fix the issue before publishing the article. The company pushed out a fix after the story broke.

The most recent issue broke yesterday about the Zoom macOS installer behaving like malware. This seems pretty shady behaviour, like their automatic reinstaller that was fixed last year. To his credit, Zoom Founder and CEO, Eric Yuan engaged with the issue on twitter. This will be one to watch over the coming days.

Over the last year I have seen a consistent pattern when Zoom is called out on security and valid privacy issues with their platform. They respond publicly with “oops my bad” blog posts. Many of the issues appear to be a result of them trying to deliver a great user experience. Unfortunately they sometimes lean too far toward the UX and ignore the security and privacy implications of their choices. I hope that over the coming months we see Zoom correct this balance as problems are called out. If they do they will end up with an amazing platform in terms of UX while keeping their users safe.

Update: Since publishing this post additional issues with Zoom were reported. Zoom's CEO announced the company was committed to fixing their product.

November 16, 2019

Dave HallDrupalSouth Diversity Scholarship Winner Announced

A few weeks ago we announced our diversity scholarship for DrupalSouth. Before announcing the winner I want to talk a bit about our experience doing this for the first time.

DrupalSouth is the largest Drupal event held in Oceania every year. It provides a great marketing opportunity for businesses wanting to promote their products and services to the Drupal community. Dave Hall Consulting planned to sponsor DrupalSouth to promote our new training business - Getting It Live training. By the time we got organised all of the (affordable) sponsorship opportunities had gone. After considering various opportunities around the event we felt the best way of investing a similar amount of money and giving something back to the community was through a diversity scholarship.

The community provided positive feedback about the initiative. However despite the enthusiasm and working our networks to get a range of applicants, we only ended up with 7 applicants. They were all guys. One applicant was from Australia, the rest were from overseas. About half the applicants dropped out when contacted to confirm that they could cover their own travel and visa expenses.

We are likely to offer other scholarships in the future. We will start earlier and explore other channels for promoting the program.

The scholarship has been awarded to Yogesh Ingale, from Mumbai, India. Over the last 3 years Yogesh has been employed by Tata Consultancy Services’ digital operations team as a DevOps Engineer. During this time he has worked with Drupal, Cloud Computing, Python and Web Technologies. Yogesh is interested in automating processes. When he’s not working, Yogesh likes to travel, automate things and write blog posts. Disclaimer: I know Yogesh through my work with one of my clients. Sometimes the Drupal community feels pretty small.

Congratulations Yogesh! I am looking forward to seeing you in Hobart.

If you want to meet Yogesh before DrupalSouth, we still have some seats available for our 2 day git training course that’s running on 25-26 November. If you won’t be in Hobart, contact us to discuss your training needs.

November 10, 2019

Julien GoodwinSome thoughts on Storytelling as an engineering teaching tool

Every week at work on Wednesday afternoons we have the SRE ops review, a relaxed two hour affair where SREs (& friends of, not all of whom are engineers) share interesting tidbits that have happened over the last week or so. This might be a great success, an outage, a weird case, or even a thorny unsolved problem. Usually these relate to a service the speaker is oncall for, or perhaps a dependency or customer service, but we also discuss major incidents both internal & external. Sometimes a recent issue will remind one of the old-guard (of which I am very much now a part) of a grand old story and we share those too.

Often the discussion continues well into the evening as we decant to one of the local pubs for dinner & beer, sometimes chatting away until closing time (probably quite regularly actually, but I'm normally long gone).

It was at one of these nights at the pub two months ago (sorry!), that we ended up chatting about storytelling as a teaching tool, and a colleague asked an excellent question, that at the time I didn't have a ready answer for, but I've been slowly pondering, and decided to focus on over an upcoming trip.

As I start to write the first draft of this post I've just settled in for cruise on my first international trip in over six months[1], popping over to Singapore for the Melbourne Cup weekend, and whilst I'd intended this to be a holiday, I'm so terrible at actually having a holiday[2] that I've ended up booking two sessions of storytelling time, where I present the history of Google's production networks (for those of you reading this who are current or former engineering Googlers, similar to Traffic 101). It's with this perspective of planning, and having run those sessions that I'm going to try and answer the question that I was asked.

Or at least, I'm going to split up the question I was asked and answer each part.

"What makes storytelling good"

On its own this is hard to answer, there are aspects that can help, such as good presentation skills (ideally keeping to spoken word, but simple graphs, diagrams & possibly photos can help), but a good story can be told in a dry technical monotone and still be a good story. That said, as with the rest of these items charisma helps.

"What makes storytelling interesting"

In short, a hook or connection to the audience, for a lot of my infrastructure related outage stories I have enough context with the audience to be able to tie the impact back in a way that resonates with a person. For larger disparate groups shared languages & context help ensure that I'm not just explaining to one person.

In these recent sessions one was with a group of people who work in our Singapore data centre, in that session I focused primarily on the history & evolution of our data centre fabrics, giving them context to understand why some of the (at face level) stranger design decisions have been made that way.

The second session was primarily people involved in the deployment side of our backbone networks, and so I focused more on the backbones, again linking with knowledge the group already had.

"What makes storytelling entertaining"

Entertaining storytelling is a matter of style, skills and charisma, and while many people can prepare (possibly with help) an entertaining talk, the ability to tell an entertaining story off the cuff is more of a skill, luckily for me, one I seem to do ok with. Two things that can work well are dropping in surprises, and where relevant some level of self-deprecation, however both need to be done very carefully.

Surprises can work very well when telling a story chronologically "I assumed X because Y, <five minutes of waffling>, so it turned out I hadn't proved Y like I thought, so it wasn't X, it was Z", they can help the audience to understand why a problem wasn't solved so easily, and explaining "traps for young players" as Dave Jones (of the EEVblog) likes to say can themselves be really helpful learning elements. Dropping surprises that weren't surprises to the story's protagonist generally only works if it's as a punchline of a joke, and even then it often doesn't.

Self-deprecation is an element that I've often used in the past, however more recently I've called others out on using it, and have been trying to reduce it myself, depending on the audience you might appear as a bumbling success or stupid, when the reality may be that nobody understood the situation properly, even if someone should have. In the ops review style of storytelling, it can also lead to a less experienced audience feeling much less confident in general than they should, which itself can harm productivity and careers.

If the audience already had relevant experience (presenting a classic SRE issue to other SREs for example, a network issue to network engineers, etc.) then audience interaction can work very well for engagement. "So the latency graph for database queries was going up and to the right, what would you look at?" This is also similar to one of the ways to run a "wheel of misfortune" outage simulation.

"What makes storytelling useful & informative at the same time"

In the same way as interest, to make storytelling useful & informative for the audience involves consideration for the audience, as a presenter if you know the audience, at least in broad strokes this helps. As I mentioned above, when I presented my talk to a group of datacenter-focused people I focused on the DC elements, connecting history to the current incarnations; when I presented to a group of more general networking folk a few days later, I focused more on the backbones and other elements they'd encountered.

Don't assume that a story will stick wholesale, just leaving a few keywords, or even just a vague memory with a few key words they can go digging for can make all the difference in the world. Repetition works too, sharing many interesting stories that share the same moral (for an example, one of the ops review classics is demonstrations about how lack of exponential backoff can make recovery from outages hard), hearing this over dozens of different stories over weeks (or months, or years...) it eventually seeps in as something to not even question having been demonstrated as such an obvious foundation of good systems.
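
For anyone who hasn't sat through one of those backoff stories yet, the moral looks roughly like this in code. This is a minimal sketch of capped exponential backoff with full jitter; the function name and the default values are my own picks for illustration, not anything from a particular system.

```python
import random


def backoff_delays(base=0.5, cap=60.0, attempts=8, rng=None):
    """Return one randomised delay (in seconds) per retry attempt.

    The ceiling doubles with each attempt up to `cap`, and the actual
    delay is drawn uniformly from [0, ceiling] so that clients which
    failed together do not all retry together.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        delays.append(rng.uniform(0, ceiling))
    return delays


for attempt, delay in enumerate(backoff_delays()):
    print(f"retry {attempt}: wait {delay:.2f}s")
```

Without the doubling and the jitter, every client retries on the same fixed schedule, which is exactly the thundering herd that stops a recovering service from getting back on its feet.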

When I'm speaking to an internal audience I'm happy if they simply remember that I (or my team) exist and might be worth reaching out to in future if they have questions.

Lastly, storytelling is a skill you need to practice. Whether it's a keynote presentation in front of a few thousand people, or just telling tall tales to some mates at the pub, practice helps, and eventually many of the elements I've mentioned above become almost automatic. As can probably be seen from this post I could do with some more practice on the written side.

1: As I write these words I'm aboard a Qantas A380 (QF1) flying towards Singapore, the book I'm currently reading, of all things about mechanical precision ("Exactly: How Precision Engineers Created the Modern World" or as it has been retitled for paperback "The Perfectionists"), has a chapter themed around QF32, the Qantas A380 that notoriously had to return to Singapore after an uncontained engine failure. Both the ATSB report on the incident and the captain Richard de Crespigny's book QF32 are worth reading. I remember I burned through QF32 one (very early) morning when I was stuck in GlobalSwitch Sydney waiting for approval to repatch a fibre, one of the few times I've actually dealt with the physical side of Google's production networks, and to date the only time the fact I live just a block from that facility has been used at all sensibly.

2: To date, I don't think I've ever actually had a holiday that wasn't organised by family, or attached to some conference, event or work travel I'm attending. This trip is probably the closest I've ever managed (roughly equal to my burnout trip to Hawaii in 2014), and even then I've ruined it by turning two of the three weekdays into work. I'm much better at taking breaks that simply involve not leaving home or popping back to stay with family in Melbourne.

April 27, 2019

Julien GoodwinBuilding new pods for the Spectracom 8140 using modern components

I've mentioned a bunch of times on the time-nuts list that I'm quite fond of the Spectracom 8140 system for frequency distribution. For those not familiar with it, it's simply running a 10MHz signal against a 12v DC power feed so that line-powered pods can tap off the reference frequency and use it as an input to either a buffer (10MHz output pods), decimation logic (1MHz, 100kHz etc.), or a full synthesizer (Versa-pods).

It was only in October last year that I got a house frequency standard going using an old Efratom FRK-LN which now provides the reference; I'd use a GPSDO, but I live in a ground floor apartment without a usable sky view, this of course makes it hard to test some of the GPS projects I'm doing. Despite living in a tiny apartment I have test equipment in two main places, so the 8140 is a great solution to allow me to lock all of them to the house standard.


(The rubidium is in the chunky aluminium chassis underneath the 8140)

Another benefit of the 8140 is that many modern pieces of equipment (such as my [HP/Agilent/]Keysight oscilloscope) have a single connector for reference frequency in/out, and should the external frequency ever go away it will switch back to its internal reference, but also send that back out the connector, which could lead to other devices sharing the same signal switching to it. The easy way to avoid that is to use a dedicated port from a distribution amplifier for each device like this, which works well enough until you have this situation in multiple locations.

As previously mentioned the 8140 system uses pods to add outputs, while these pods are still available quite cheaply used on eBay (as of this writing, for as low as US$8, but ~US$25/pod has been common for a while), recently the cost of shipping to Australia has gone up to the point I started to plan making my own.

By making my own pods I also get to add features that the original pods didn't have[1], I started with a quad-output pod with optional internal line termination. This allows me to have feeds for multiple devices with the annoying behaviour I mentioned earlier. The enclosure is a Pomona model 4656, with the board designed to slot in, and offer pads for the BNC pins to solder to for easy assembly.



This pod uses a Linear Technologies (now Analog Devices) LTC6957 buffer for the input stage replacing a discrete transistor & logic gate combined input stage in the original devices. The most notable change is that this stage works reliably down to -30dBm input (possibly further, couldn't test beyond that), whereas the original pods stop working right around -20dBm.
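
To give a feel for how small those input levels are, here's a quick conversion from dBm to RMS volts. It assumes a 50 ohm system, which is my assumption for the sake of the arithmetic and not a measured property of the line.

```python
import math


def dbm_to_vrms(dbm, impedance_ohms=50.0):
    """Convert a power level in dBm to RMS volts across `impedance_ohms`."""
    power_watts = 1e-3 * 10 ** (dbm / 10)  # 0 dBm is defined as 1 mW
    return math.sqrt(power_watts * impedance_ohms)


for level in (-20, -30):
    print(f"{level} dBm is about {dbm_to_vrms(level) * 1000:.2f} mV RMS into 50 ohms")
```

Every 10dB step is a factor of ten in power, so a factor of roughly 3.16 in voltage.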

As it turns out, although it can handle lower input signal levels, in other ways including power usage it seems very similar. One notable downside is the chip tops out at 4v absolute maximum input, so a separate regulator is used just to feed this chip. The main regulator has also been changed from a 7805 to an LD1117 variant.

On this version the output stage is the same TI 74S140 dual 4-input NAND gate as was used on the original pods, just in SOIC form factor.

As with the next board there is one error on the board, the wire loop that forms the ground connection was intended to fit a U-type pin header, however the footprint I used on the boards was just too tight to allow the pins through, so I've used some thin bus wire instead.



The second major variant I designed was a combo version, allowing sine & square outputs by just switching a jumper, or isolated[2] or line-regenerator (8040TA from Spectracom) versions with a simple sub-board containing just an inductor (TA) or 1:1 transformer (isolated).



This is the second revision of that board, where the 74S140 has been replaced by a modern TI 74LVC1G17 buffer. This version of the pod, set for sine output, uses almost exactly 30mA of current (since both the old & new pods use linear supplies that's the most sensible unit), whereas the original pods are right around 33mA. The empty pads at the bottom-left are simply placeholders for 2 100 ohm resistors to add 50 ohm line termination if desired.
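
The arithmetic behind those two remarks, as a small sketch. The 12 volt figure is the line feed voltage mentioned earlier; treating the pod as drawing the same current from the line that its load draws is the usual simplification for a linear regulator, and the helper names are just for illustration.

```python
def parallel(*resistors_ohms):
    """Equivalent resistance of resistors wired in parallel."""
    return 1 / sum(1 / r for r in resistors_ohms)


def line_power_watts(current_ma, line_volts=12.0):
    """Power taken from the line by a pod drawing `current_ma`."""
    return line_volts * current_ma / 1000


print("termination:", parallel(100, 100), "ohms")
print("new pod:", line_power_watts(30), "W")
print("original pod:", line_power_watts(33), "W")
```

Because a linear regulator burns off the excess voltage as heat, power from the line scales directly with current, which is why comparing milliamps is enough.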

The board fits into the Pomona 2390 "Size A" enclosures, or for the isolated version the Pomona 3239 "Size B". This is the reason the BNC connectors have to be extended to reach the board, on the isolated boxes the BNC pins reach much deeper into the enclosure.

If the jumpers were removed, plus the smaller buffer it should be easy to fit a pod into the Pomona "Miniature" boxes too.



I was also due to create some new personal business cards, so I arranged the circuit down to a single layer (the only jumper is the requirement to connect both ground pins on the connectors) and merged it with some text converted to KiCad footprints to make a nice card on some 0.6mm PCBs. The paper on that photo is covering the link to the build instructions, which weren't written at the time (they're *mostly* done now, I may update this post with the link later).

Finally, while I was out travelling at the start of April my new (to me) HP 4395A arrived so I've finally got some spectrum output. The output is very similar between the original and my version, with the major notable difference being that my version is 10dB worse at the third harmonic. I lack the equipment (and understanding) to properly measure phase noise, but if anyone in AU/NZ wants to volunteer their time & equipment for an afternoon I'd love an excuse for a field trip.



Spectrum with input sourced from my house rubidium (natively a 5MHz unit) via my 8140 line. Note that despite saying "ExtRef" the analyzer is synced to its internal 10811 (which is an optional unit, and uses an external jumper, hence the display note).



Spectrum with input sourced from the analyzer's own 10811, and power from the DC bias generator also from the analyzer.


1: Or at least I didn't think they had, I've since found out that there was a multi output pod, and one is currently in the post heading to me.
2: An option on the standard Spectracom pods, albeit a rare one.

January 13, 2019

Julien GoodwinTransport security for BGP, AKA BGP-STARTTLS, a proposal

Several days ago, inspired in part by an old work mail thread being resurrected I sent this image as a tweet captioned "The state of BGP transport security.":



The context of the image for those not familiar with it is this image about noSQL databases.

This triggered a bunch of discussion, with multiple people saying "so how would *you* do it", and we'll get to that (or for the TL;DR skip to the bottom), but first some background.

The tweet is a reference to the BGP protocol the Internet uses to exchange routing data between (and within) networks. This protocol (unless inside a separate container) is never encrypted, and can only be authenticated (in practice) by a TCP option known as TCP-MD5 (standardised in RFC2385). The BGP protocol itself has no native encryption or authentication support. Since routing data can often be inferred by the packets going across a link anyway, this has led to this not being a priority to fix.

Transport authentication & encryption is a distinct issue from validation of the routing data transported by BGP, an area already being worked on by the various RPKI projects, eventually transport authentication may be able to benefit from some of the work done by those groups.

TCP-MD5 is quite limited, and while generally supported by all major BGP implementations it has one major limitation that makes it particularly annoying, in that it takes a single key, making key migration difficult (and in many otherwise sensible topologies, impossible without impact). Being a TCP option is also a pain, increasing fragility.

At the time of its introduction TCP-MD5 gave two main benefits. The first was to have some basic authentication beyond the basic protocol (for which the closest element in the protocol is the validation of peer-as in the OPEN message, and a mismatch will helpfully tell you who the far side was looking for). The second was making it harder to interfere with the TCP session, which on many of the TCP implementations of the day was easier than it should have been. Time, however, has marched on, and protection against session interference from non-MITM is no longer needed. The major silent MITM case of Internet Exchanges using hubs is long obsolete. Plus, in part due to the pain associated with changing keys, many networks have a "default" key they will use when turning up a peering session, and these keys are often so well known for major networks that they've been shared on public mailing lists, eliminating what little security benefit TCP-MD5 still brings.

This has been known to be a problem for many years, and the proposed replacement TCP-AO (The TCP Authentication Option) was standardised in 2010 as RFC5925, however, to the best of my knowledge eight years later no mainstream BGP implementation supports it, and as it too is a TCP option, not only does it still have many of the downsides of MD5, but major OS kernels are much less likely to implement new ones (indeed, an OS TCP maintainer commenting such on the thread I mentioned above is what kicked off my thinking).

TCP, the wire format, is in practice unchangeable. This is one of the major reasons for QUIC, the TCP replacement protocol that underlies the soon to be standardised HTTP/3, so for universal deployment any proposal that requires changes to TCP is a non-starter.

Any solution must be implementable & deployable.
  • Implementable - BGP implementations must be able to implement it, and do so correctly, ideally with a minimum of effort.
  • Deployable - Networks need to be able to deploy it, when authentication issues occur error messages should be no worse than with standard BGP (this is an area most TCP-MD5 implementations fail at, of those I've used JunOS is a notable exception, Linux required kernel changes for it to even be *possible* to debug)


Ideally any security-critical code should already exist in a standardised form, with multiple widely-used implementations.

Fortunately for us, that exists in the form of TLS. IPSEC, while it exists, fails the deployable tests, as almost anyone who's ever had the misfortune of trying to get IPSEC working between different organisations using different implementations can attest, sure it can usually be made to work, but nowhere near as easily as TLS.

Discussions about the use of TLS for this purpose have happened before, but always quickly run into the problem of how certificates for this should be managed, and that is still an open problem, potentially the work on RPKI may eventually provide a solution here, but until that time we have a workable alternative in the form of TLS-PSK (first standardised in RFC4279), a set of TLS modes that allow the use of pre-shared keys instead of certificates (for those wondering, not only does this still exist in TLS1.3 it's in a better form). For a variety of reasons, not the least the lack of real-time clocks in many routers that may not be able to reach an NTP server until BGP is established, PSK modes are still more deployable than certificate verification today. One key benefit for TLS-PSK is it supports multiple keys to allow migration to a new key in a significantly less impactful manner.
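
The multi-key point is easiest to see as a lookup table keyed by PSK identity. This is only a sketch of the idea: it doesn't call any TLS library, and the class, identities and keys are all hypothetical. In a real implementation the lookup would sit behind whatever PSK callback the TLS library provides.

```python
class PskKeyring:
    """Maps PSK identities to keys so an old and a new key can overlap."""

    def __init__(self):
        self._keys = {}

    def add(self, identity, key):
        self._keys[identity] = key

    def retire(self, identity):
        self._keys.pop(identity, None)

    def lookup(self, identity):
        """Return the key for `identity`, or None if it is not accepted."""
        return self._keys.get(identity)


ring = PskKeyring()
ring.add("peer-2018", b"old-shared-secret")   # hypothetical existing key
ring.add("peer-2019", b"new-shared-secret")   # added before either side switches
# Both sides can now move to "peer-2019" at their own pace...
ring.retire("peer-2018")                      # ...and the old key is dropped afterwards
print(ring.lookup("peer-2018"), ring.lookup("peer-2019"))
```

With TCP-MD5 there is only the one slot, so both ends have to change it at the same moment or the session drops.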

The most obvious way to support BGP-in-TLS would simply be to listen on a new port (as is done for HTTPS for example), however there's a few reasons why I don't think such a method is deployable for BGP, primarily due to the need to update control-plane ACLs, a task that in large networks is often distant from the folk working with BGP, and in small networks may not be understood by any current staff (a situation not dissimilar to the state of TCP). Another option would simply be to use protocol multiplexing and do a TLS negotiation if a TLS hello is received, or unencrypted BGP for a BGP OPEN, this would violate the general "principle of least astonishment", and would be harder for operators to debug.

Instead I propose a design similar to that used by SMTP (where it is known as STARTTLS), during early protocol negotiation support for TLS is signalled using a zero-length capability in the BGP OPEN, the endpoints do a TLS negotiation, and then the base protocol continues inside the new TLS tunnel. Since this negotiation happens during the BGP OPEN, it does mean that other data included in the OPEN leaks. Primarily this is the ASN, but also the various other capabilities supported by the implementation (which could identify the implementation). I suggest that if TLS is required, information in the initial OPEN not be validated, a standard reserved ASN be sent instead, and any other capabilities not strictly required not be sent, with a fresh OPEN containing all normal information sent inside the TLS session.

Migration from TCP-MD5 is a key point, however not one I can find any good answer for. Some implementations already allow TCP-MD5 to be optional, and that would allow an easy upgrade, however such support is rare, and unlikely to be more widely supported.

On that topic, allowing TLS to be optional in a consistent manner is particularly helpful, and something that I believe SHOULD be supported to allow cases like currently unauthenticated public peering sessions to be migrated to TLS with minimal impact. Allowing this does open the possibility of a downgrade attack, and make more plausible attacks causing state machine confusions (implementation believes it's talking on a TLS-secured session when it isn't).

What do we lose from TCP-MD5? Some performance, whilst this is not likely to be an issue for most situations, it is likely not an option for anyone still trying to run IPv4 full tables on a Cisco Catalyst 6500 with Sup720. We do also lose the TCP manipulation prevention aspects, however these days those are (IMO) of minimal value in practice. There's also the cost of implementations needing to include a TLS implementation, and whilst nearly every system will have one (at the very least for SSH) it may not already be linked to the routing protocol implementation.

Lastly, my apologies to anyone who has proposed this before, but neither I nor my early reviewers were aware of such a proposal. Should such a proposal already exist, meeting the goals of implementable & deployable it may be sensible to pick that up instead.

The IETF is often said to work on "rough consensus and running code", for this proposal here's what I believe a minimal *actual* demonstration of consensus with code would be:
  • Two BGP implementations, not derived from the same source.
  • Using two TLS implementations, not derived from the same source.
  • Running on two kernels (at the very least, Linux & FreeBSD)


The TL;DR version:
  • Using a zero-length BGP capability in the BGP OPEN message implementations advertise their support for TLS
    • TLS version MUST be at least 1.3
    • If TLS is required, the AS field in the OPEN MAY be set to a well-known value to prevent information leakage, and other capabilities MAY be removed, however implementations MUST NOT require the TLS capability be the first, last or only capability in the OPEN
    • If TLS is optional (which MUST NOT be default behaviour), the OPEN MUST (other than the capability) be the same as for a session configured for no encryption
  • After the TCP client receives a confirmation of TLS support from the TCP server's OPEN message, a TLS handshake begins
    • To make this deployable TLS-PSK MUST be supported, although exact configuration is TBD.
    • Authentication-only variants of TLS (ex RFC4785) REALLY SHOULD NOT be supported.
    • Standard certificate-based verification MAY be supported, and if supported MUST use client certificates, validating both. However, how roots of trust would work for this has not been investigated.
  • Once the TLS handshake completes the BGP state starts over with the client sending a new OPEN
    • Signalling the TLS capability in this OPEN is invalid and MUST be rejected
  • (From here, everything is unchanged from normal BGP)
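
As a sanity check on the list above, here is how I'd expect the decision logic to look once the first OPEN has been exchanged. This is a sketch of my reading of the rules and not a tested implementation; the policy names and return strings are made up for illustration.

```python
from enum import Enum

EXPERIMENTAL_STARTTLS = 219  # development capability code proposed in this post


class TlsPolicy(Enum):
    DISABLED = "disabled"   # never advertise or use TLS
    OPTIONAL = "optional"   # use TLS when the peer offers it (not the default)
    REQUIRED = "required"   # refuse to run without TLS


def after_first_open(policy, peer_capabilities):
    """Decide what to do once the peer's plaintext OPEN has been read."""
    peer_supports_tls = EXPERIMENTAL_STARTTLS in peer_capabilities
    if policy is TlsPolicy.DISABLED:
        return "continue-plaintext"
    if peer_supports_tls:
        return "start-tls-handshake"
    if policy is TlsPolicy.REQUIRED:
        return "reject"
    return "continue-plaintext"


def inner_open_is_valid(capabilities):
    """Inside the TLS session, signalling the TLS capability again is invalid."""
    return EXPERIMENTAL_STARTTLS not in capabilities


print(after_first_open(TlsPolicy.REQUIRED, [65, 219]))
print(after_first_open(TlsPolicy.REQUIRED, [65]))
print(inner_open_is_valid([65, 219]))
```

The optional-without-TLS branch is where the downgrade risk lives: an attacker who can strip the capability from the OPEN lands the session in plaintext.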


Magic numbers for development:
  • Capability: (to be referred to as EXPERIMENTAL-STARTTLS) 219
  • ASN (for avoiding data leaks in OPEN messages): 123456
    • Yes this means also sending 4-byte capability. Every implementation that might possibly implement this already supports 4-byte ASNs.
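
For anyone wanting to poke at this, below is a sketch of what the initial plaintext OPEN could look like on the wire with those magic numbers. The message layout follows the standard BGP OPEN and capability encodings; the hold time and router ID are arbitrary example values, and the function names are mine.

```python
import struct

BGP_MARKER = b"\xff" * 16
MSG_TYPE_OPEN = 1
OPT_PARAM_CAPABILITIES = 2        # optional parameter type for capabilities
CAP_FOUR_OCTET_AS = 65            # 4-byte ASN capability
CAP_EXPERIMENTAL_STARTTLS = 219   # development value from this post
AS_TRANS = 23456                  # 2-byte stand-in when the ASN needs 4 bytes
PLACEHOLDER_ASN = 123456          # development value from this post


def capability(code, value=b""):
    """Encode one capability as code, length, value."""
    return struct.pack("!BB", code, len(value)) + value


def build_pre_tls_open(hold_time=90, bgp_id=b"\x0a\x00\x00\x01"):
    """Build an OPEN carrying only the 4-byte ASN and zero-length TLS capabilities."""
    caps = (capability(CAP_FOUR_OCTET_AS, struct.pack("!I", PLACEHOLDER_ASN))
            + capability(CAP_EXPERIMENTAL_STARTTLS))
    opt_params = struct.pack("!BB", OPT_PARAM_CAPABILITIES, len(caps)) + caps
    body = struct.pack("!BHH4sB", 4, AS_TRANS, hold_time, bgp_id,
                       len(opt_params)) + opt_params
    length = len(BGP_MARKER) + 2 + 1 + len(body)
    return BGP_MARKER + struct.pack("!HB", length, MSG_TYPE_OPEN) + body


def advertised_capabilities(open_msg):
    """Return the capability codes found in an OPEN message, in order."""
    opt_len = open_msg[28]
    params = open_msg[29:29 + opt_len]
    codes, i = [], 0
    while i < len(params):
        p_type, p_len = params[i], params[i + 1]
        value = params[i + 2:i + 2 + p_len]
        if p_type == OPT_PARAM_CAPABILITIES:
            j = 0
            while j < len(value):
                codes.append(value[j])
                j += 2 + value[j + 1]
        i += 2 + p_len
    return codes


msg = build_pre_tls_open()
print(len(msg), "bytes, capabilities:", advertised_capabilities(msg))
```

Note the 2-byte AS field carries AS_TRANS, with the placeholder ASN only appearing inside the 4-byte ASN capability, which is why that capability has to be sent too.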


The key words "MUST (BUT WE KNOW YOU WON'T)", "SHOULD CONSIDER", "REALLY SHOULD NOT", "OUGHT TO", "WOULD PROBABLY", "MAY WISH TO", "COULD", "POSSIBLE", and "MIGHT" in this document are to be interpreted as described in RFC 6919.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119.

August 23, 2018

Julien GoodwinCustom output pods for the Stanford Research CG635 Clock Generator

As part of my previously mentioned side project the ability to replace crystal oscillators in a circuit with a higher quality frequency reference is really handy, to let me eliminate a bunch of uncertainty from some test setups.

A simple function generator is the classic way to handle this, although if you need square wave output it quickly gets hard to find options, with arbitrary waveform generators (essentially just DACs) the common option. If you can get away with just sine wave output an RF synthesizer is the other main option.

While researching these I discovered the CG635 Clock Generator from Stanford Research, and some time later picked one of these up used.

As well as being a nice square wave generator at arbitrary voltages these also have another set of outputs on the rear of the unit on an 8p8c (RJ45) connector, in both RS422 (for lower frequencies) and LVDS (full range) formats, as well as some power rails to allow a variety of less common output formats.

All I needed was 1.8v LVCMOS output, and could get that from the front panel output, but I'd then need a coax tail on my boards, as well as potentially running into voltage rail issues so I wanted to use the pod output instead. Unfortunately none of the pods available from Stanford Research do LVCMOS output, so I'd have to make my own, which I did.

The key chip in my custom pod is the TI SN65LVDS4, a 1.8v capable single channel LVDS receiver that operates at the frequencies I need. The only downside is this chip is only available in a single form factor, a 1.5mm x 2mm 10 pin UQFN, which is far too small to hand solder with an iron. The rest of the circuit is just some LED indicators to signal status.


Here's a rendering of the board from KiCad.

Normally "not hand solderable" for me has meant getting the board assembled, however my normal assembly house doesn't offer custom PCB finishes, and I wanted these to have white solder mask with black silkscreen as a nice UX when I go to use them, so instead I decided to try my hand at skillet reflow as it's a nice option given the space I've got in my tiny apartment (the classic tutorial on this from SparkFun is a good read if you're interested). Instead of just a simple plate used for cooking you can now buy hot plates with what are essentially just soldering iron temperature controllers, sold as pre-heaters making it easier to come close to a normal soldering profile.

Sadly, actually acquiring the hot plate turned into a bit of a mess, the first one I ordered in May never turned up, and it wasn't until mid-July that one arrived from a different supplier.

Because of the aforementioned lack of space instead of using stencils I simply hand-applied (leaded) paste, without even an assist tool (which I probably will acquire for next time), then hand-mounted the components, and dropped them on the plate to reflow. I had one resistor turn 90 degrees, and a few bridges from excessive paste, but for a first attempt I was really happy.


Here's a photo of the first two just after being taken off the hot plate.

Once the reflow was complete it was time to start testing, and this was where I ran into my biggest problems.

The two big problems were with the power supply I was using, and with my oscilloscope.

The power supply (a Keithley 228 Voltage/Current source) is from the 80's (Keithley's "BROWN" era), and while it has nice specs, doesn't have the most obvious UI. Somehow I'd set it to limit at 0mA output current, and if you're not looking at the segment lights it's easy to miss. At work I have an EEZ H24005 which also resets the current limit to zero on clear, however it's much more obvious when limiting, and a power supply with that level of UX is now on my "to buy" list.

The issues with my scope were much simpler. Currently I only have an old Rigol DS1052E scope, and while it works fine it is a bit of a pain to use, but ultimately I made a very simple mistake while testing. I was feeding in a trigger signal direct from the CG635's front outputs, and couldn't figure out why the generator was putting out such a high voltage (implausibly so). To cut the story short, I'd simply forgotten that the scope was set for use with 10x probes, and once I realised that everything made rather more sense. An oscilloscope with auto-detection for 10x probes, as well as a bunch of other features I want in a scope (much bigger screen for one), has now been ordered, but won't arrive for a while yet.
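
For anyone who hasn't been bitten by this one yet, the mistake comes down to a single multiplication. The voltage below is just an example figure, not what I actually had the generator set to.

```python
def displayed_volts(actual_volts, scope_probe_setting=10.0, real_attenuation=1.0):
    """Voltage the scope reports, given its probe setting and the real attenuation.

    A 10x probe divides the signal by ten and the scope multiplies the
    reading back up. Feed the signal in directly (1x) with the scope still
    set for 10x and the reading comes out ten times too high.
    """
    return actual_volts * scope_probe_setting / real_attenuation


print("direct feed, scope set to 10x:", displayed_volts(1.8))
print("real 10x probe, scope set to 10x:", displayed_volts(1.8, 10.0, 10.0))
```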

Ultimately the boards work fine. Until the new scope arrives I can't determine their signal quality, but at least they're ready for when I'll need them, which is great for flow.

August 01, 2013

Tim ConnorsNo trains for the corporatocracy

Sigh, look, I know we don't actually live in a democracy (but a corporatocracy instead), and I should never expect the relevant ministers to care about my meek little protestations otherwise, but I keep writing these letters to ministers for transport anyway, under the vague hope that it might remind them that they're ministers for transport, and not just roads.


Dear Transport Minister, Terry Mulder,

I encourage you and your fellow ministers to read this article
("Tracking the cost", The Age, June 13 2009) from 2009, back when the
Liberals claimed to have a very different attitude, and when
circumstances seemed to mirror the current time:
http://www.theage.com.au/national/tracking-the-cost-20090612-c67m.html

The costs of building the first extensions to the Melbourne
public transport system in 80 years eventually blew out from $8M to
$500M over the short life of the South Morang project, despite it
being a much smaller project than the entire rail lines built more
cheaply by cities such as Perth in recent years.

The increased cost is explained away as a safety requirement - it
being so important to now start building grade separated lines rather
than level crossings regardless of circumstances. Perceived safety
trumps real safety (I'd much rather be in a train than suffer from one
of the 300 Victorian deaths on the roads each year), but more sinister
is that because of this inflated expense, we'll probably never see
another rail line like this built at all in Melbourne (although we'll
build at public expense a wonderful road tunnel that no-one but
Lindsay Fox will use at more than 10 times the cost).

I suspect the real reason for grade separation is not safety, but to
cause less inconvenience to car drivers stuck for 30 seconds at these
minor crossings. Since the delays at level crossings are a roads
problem, and collisions of errant motorists with trains at level
crossings are a roads problem, and the South Morang railway reservation
existed far before any of the roads were put in place, I'm wondering
whether you can answer why the blowout in costs of construction of
train lines comes out of the public transport budget, and not at the
expense of what causes these problems in the first place - the roads?
These train lines become harder to build because of an artificial cost
inflation caused by something that will be less of a problem if only
we could build more rail lines and actually improve the Melbourne
public transport system and make it attractive to use, for once (we've
been waiting for 80 years).


Yours sincerely,


And a little while later, the reply!

July 01, 2013

Tim ConnorsYarra trail pontoon closures

I do have to admit, I had some fun writing this one:

Dear Transport Minister, Terry Mulder (Denis Napthine, Local MP Ted Baillieu, Ryan Smith MP responsible for Parks Victoria, Parks Victoria itself, and Bicycle Victoria CCed),

I am writing about the sudden closure of the Main Yarra bicycle trail around Punt Road. The floating sections of the trail have been closed for the foreseeable future because of some over-zealous lawyer at Parks Victoria who has decided that careless riders might injure themselves on the rare occasion when the pontoon is both icy, and resting on the bottom of the Yarra at very low tides, sloping sideways at a minor angle. The trail has been closed before Parks Victoria have even planned for how they're going to rectify the problem with the pontoons. Instead, the lawyers have forced riders to take to parallel streets such as Swan St (which I took tonight in the rain, negotiating the thin strip between parked cars far enough from their doors being flung out illegally by careless drivers, and the wet tram tracks beside them). Obviously, causing riders to take these detours will be very much less safe than just keeping the trail open until a plan is developed, but I can see why Parks Victoria would want to shift the legal burden away from themselves.

I have no faith that the pontoon will be fixed in the foreseeable future without your intervention, because of past history -- that trail has been partially closed for about 18 months out of the past 3 years due to the very important works on the freeway above (keeping the economy going, as they say, by digging ditches and filling them immediately back up again).

Since we're already wasting $15B on an east-west freeway tunnel that will do absolutely nothing to alleviate traffic congestion because the outbound (Easterly direction) freeway is already at capacity in the afternoon without the extra induced traffic this project will add, I was wondering if you could spare a few million to duplicate the only easterly bicycle trail we have, so that these sorts of incidents don't reoccur and have so much impact on riders in the future.

I do hope that this trail will be fixed in a timely fashion, before I and the other 3000-4000 cyclists who currently use the trail every day resort to riding through any of your freeway tunnels.

Yours sincerely,

Me

April 14, 2013

Tim Connors

Oh well, if The Age aren't going to publish my Thatcher rant, I will:

Jan White (Letters, 11 Apr) is heavily misguided if she believes that Thatcher was one of Britain's greatest leaders. For whom? By any metric 70% of Brits cared about, she was one of the worst. Any harmony, strength of character and respect Brits may be missing now would be due to her having nearly destroyed everything about British society with her Thatchernomics. Her funeral should be privatised and definitely not funded by the state as it is going to be. Instead, it could be funded by the long queue of people who want to dance on her grave.

March 21, 2013

Tim ConnorsRagin' on the road

Since The Age didn't publish my letter, my 3 readers ought to see it anyway:


Reynah Tang of the Law Institute of Victoria says that road rage offences shouldn't necessarily lead to loss of licence ("Offenders risk losing their licence", The Age, Mar 21). He misses the point -- a vehicle is a weapon. Road ragers demonstrably do not have enough self control to drive. They have already lost their temper when in control of such a weapon, so they must never be given a licence to use that weapon again (the weapon should also be forfeited). The same is presumably true of gun murderers after their initial jail time (which road ragers rarely are given). RACV's Brian Negus also doesn't appear to realise that a driving licence is a privilege, not an automatic right. You can still have all your necessary mobility without your car - it's not a human rights issue.


It was less than 200 words even, dammit! But because the editor didn't check the basic arithmetic in a previous day's letter, they had to publish someone's correction.

November 18, 2012

Ben McGinnesFixed it

I've fixed the horrible errors that were sending my tweets here; it only took a few hours.

To do that I've had to disable cross-posting and it looks like it won't even work manually, so my updates will likely only occur on my own domain.

Details of the changes are here. They include better response times for my domain and no more Twitter posts on the main page, which should please those of you who hate that. Apparently that's a lot of people, but since I hate being inundated with FarceBook's crap I guess it evens out.

The syndicated feed for my site around here somewhere will get everything, but there's only one subscriber to that (last time I checked) and she's smart enough to decide how she wants to deal with that.

Ben McGinnesTweet Sometimes I amaze even myself; I remembered the pa…

Sometimes I amaze even myself; I remembered the passphrases to old PGP keys I thought had been lost to time. #crypto

Originally published at Organised Adversary. Please leave any comments there.

Ben McGinnesTweet These are the same keys I referred to in the PPAU…

These are the same keys I referred to in the PPAU #NatSecInquiry submission as being able to be used against me. #crypto

Originally published at Organised Adversary. Please leave any comments there.

Ben McGinnesTweet Now to give them their last hurrah: sign my curren…

Now to give them their last hurrah: sign my current key with them and then revoke them! #crypto

Originally published at Organised Adversary. Please leave any comments there.

October 26, 2011

Donna Benjaminheritage and hysterics

Originally published at KatteKrab. Please leave any comments there.

This gorgeous photo of The Queen in Melbourne on the Royal Tram made me smile this morning.

I've long been a proponent of an Australian Republic - but the populist hysteria of politicians, this photo, and the Kingdom of the Netherlands are actually making me rethink that position.

At least for today.  Long may she reign over us.

"Queen Elizabeth II smiles as she rides on the royal tram down St Kilda Road"
Photo from Getty Images published on theage.com.au

October 02, 2011

Donna BenjaminSticks and Stones and Speech

Originally published at KatteKrab. Please leave any comments there.

THE law does treat race differently: it is not unlawful to publish an article that insults, offends, humiliates or intimidates old people, for instance, or women, or disabled people. Professor Joseph, director of the Castan Centre for Human Rights Law at Monash University, said in principle ''humiliate and intimidate'' could be extended to other anti-discrimination laws. But historically, racial and religious discrimination is treated more seriously because of the perceived potential for greater public order problems and violence.

Peter Munro, The Age, 2 Oct 2011

Ahaaa. Now I get it! We've been doing it wrong. 

Racial vilification is against the law because it might be more likely to lead to violence than vilifying women, the elderly or the disabled.

Interesting debates and articles about free speech and discrimination are bobbing up and down in the flotsam and jetsam of the Bolt decision. Much of it seems to hinge on some kind of legal see-saw around notions of a bad law about bad words.

I've always been a proponent of the sticks and stones philosophy.  For those not familiar, it's the principle behind a children's nursery rhyme.

Sticks and Stones may break my bones
But words will never hurt me

But I'm increasingly disturbed by the hateful culture of online comment.  I am a very strong proponent of the human right to free expression, and abhor censorship, but I'm seriously sick of "My right to free speech" being used as the ultimate excuse for people using words to denigrate, humiliate, intimidate, belittle and attack others, particularly women.

We should defend a right to free speech, but condemn hate speech whenever and wherever we see it.  Maybe we actually need to get violent to make this stop? Surely not.

September 20, 2011

Donna BenjaminQantas Pilots

Originally published at KatteKrab. Please leave any comments there.

The Qantas Pilot Safety culture is something worth fighting to protect. I read Malcolm Gladwell's Outliers whilst on board a Qantas flight recently. While Qantas itself isn't mentioned in the book, a footnote listed Australia as having the 2nd lowest Pilot Power-Distance Index (PDI) in the world. New Zealand had the lowest. The entire chapter "The Ethnic Theory of Plane Crashes" is the strongest argument I've seen which explains the Qantas safety record. The experience of pilots and relationships amongst the entire air crew is a crucial differentiating factor. Other airlines work hard to develop this culture, often needing to work against their own cultural patterns to achieve it. At Qantas, and likely at other Australian airlines too, this culture is the norm.

I want Australian Qantas Pilots flying Qantas planes. I'd like an Australian in charge too.

If you too support Qantas Pilots - go to their website, sign the petition.

Do your own reading.

G.R. Braithwaite, R.E. Caves, J.P.E. Faulkner, Australian aviation safety — observations from the ‘lucky’ country, Journal of Air Transport Management, Volume 4, Issue 1, January 1998: 55-62.

Anthony Dennis, What it takes to become a Qantas pilot, news.com.au, 8 September 2011.

Ashleigh Merritt, Culture in the Cockpit: Do Hofstede’s Dimensions Replicate?  Journal of Cross-Cultural Psychology, May 2000 31: 283-30.

Matt Phillips, Malcolm Gladwell on Culture, Cockpit Communication and Plane Crashes, WSJ Blogs, 4 December 2008.

 

September 18, 2011

Donna BenjaminRegistering for LCA2012

Originally published at KatteKrab. Please leave any comments there.

linux.conf.au ballarat 2012

I am right now, at this very minute, registering for linux.conf.au in Ballarat in January. Creating my planet feed. Yep. Uhuh.

I reckon the "book a bus" feature of rego is pretty damn cool.  I won't be using it, because I'll be driving up from Melbourne. Serious kudos to the Ballarat team. Also nice to see they'll add busses from Avalon airport as well as from Tullamarine airport if there's demand.

Too cool.