Planet Bozo

August 20, 2019

Worse Than FailureCodeSOD: I'm Sooooooo Random, LOL

There are some blocks of code that require a preamble, and an explanation of the code and its flow. Often you need to provide some broader context.

Sometimes, you get some code like Wolf found, which needs no explanation:

export function generateRandomId(): string { counter++; return 'id' + counter; }

I mean, I guess that's a slightly better than this solution. Wolf found this because some code downstream was expecting random, unique IDs, and wasn't getting them.

[Advertisement] ProGet can centralize your organization's software applications and components to provide uniform access to developers and servers. Check it out!

August 19, 2019

Worse Than FailureLowest Bidder Squared

Stack of coins 0214

Initech was in dire straits. The website was dog slow, and the budget had been exceeded by a factor of five already trying to fix it. Korbin, today's submitter, was brought in to help in exchange for decent pay and an office in their facility.

He showed up only to find a boxed-up computer and a brand new flat-packed desk, also still in the box. The majority of the space was a video-recording studio that saw maybe 4-6 hours of use a week. After setting up his office, Korbin spent the next day and a half finding his way around the completely undocumented C# code. The third day, there was a carpenter in the studio area. Inexplicably, said carpenter decided he needed to contact-glue carpet to a set of huge risers ... indoors. At least a gallon of contact cement was involved. In minutes, Korbin got a raging headache, and he was essentially gassed out of the building for the rest of the day. Things were not off to a good start.

Upon asking around, Korbin quickly determined that the contractors originally responsible for coding the website had underbid the project by half, then subcontracted the whole thing out to a team in India to do the work on the cheap. The India team had then done the very same thing, subcontracting it out to the most cut-rate individuals they could find. Everything had been written in triplicate for some reason, making it impossible to determine what was actually powering the website and what was dead code. Furthermore, while this was a database-oriented site, there were no stored procedures, and none of the (sub)subcontractors seemed to understand how to use a JOIN command.

In an effort to tease apart what code was actually needed, Korbin turned on profiling. Only ... it was already on in the test version of the site. With a sudden ominous hunch, he checked the live site—and sure enough, profiling was running in production as well. He shut it off, and instantly, the whole site became more responsive.

The next fix was also pretty simple. The site had a bad habit of asking for information it already had, over and over, without any JOINs. Reducing the frequency of database hits improved performance again, bringing it to within an order of magnitude of what one might expect from a website.

While all this was going on, the leaderboard page had begun timing out. Sure enough, it was an N-squared solution: open database, fetch record, close database, repeat, then compare the two records, putting them in order and beginning again. With 500 members, it was doing 250,000 passes each time someone hit the page. Korbin scrapped the whole thing in favor of the site's first stored procedure, then cached it to call only once a day.

The weeks went on, and the site began to take shape, finally getting something like back on track. Thanks to the botched rollout, however, many of the company's endorsements had vanished, and backers were pulling out. The president got on the phone with some VIP about Facebook—because as we all know, the solution to any company's problem is the solution to every company's problems.

"Facebook was written in PHP. He told me it was the best thing out there. So we're going to completely redo the website in PHP," the president confidently announced at the next all-hands meeting. "I want to hear how long everyone thinks this will take to get done."

The only developers left at that point were Korbin and a junior kid just out of college, with one contractor with some experience on the project.

"Two weeks. Maybe three," the kid replied.

They went around the table, and all the non-programmers chimed in with the 2-3 week assessment. Next to last came the experienced contractor. Korbin's jaw nearly dropped when he weighed in at 3-4 weeks.

"None of that is realistic!" Korbin proclaimed. "Even with the existing code as a road map, it's going to take 4-6 months to rewrite. And with the inevitable feature-creep and fixes for things found in testing, it is likely to take even longer."

Korbin was told the next day he could pick up his final check. Seven months later, he ran into the junior kid again, and asked how the rewrite went.

"It's still ongoing," he admitted.

[Advertisement] Ensure your software is built only once and then deployed consistently across environments, by packaging your applications and components. Learn how today!

XKCDConference Question

August 16, 2019

Worse Than FailureError'd: What About the Fish?

"On the one hand, I don't want to know what the fish has to do with Boris Johnson's love life...but on the other hand I have to know!" Mark R. writes.

 

"Not sure if that's a new GDPR rule or the Slack Mailbot's weekend was just that much better then mine," Adam G. writes.

 

Connor W. wrote, "You know what, I think I'll just stay inside."

 

"It's great to see that an attempt at personalization was made, but whatever happened to 'trust but verify'?" writes Rob H.

 

"For a while, I thought that, maybe, I didn't actually know how to use my iPhone's alarm. Instead, I found that it just wasn't working right. So, I contacted Apple Support, and while they were initially skeptical that it was an iOS issue, this morning, I actually have proof!" Markus G. wrote.

 

Tim G. writes, "I guess that's better than an angry error message."

 

[Advertisement] Continuously monitor your servers for configuration changes, and report when there's configuration drift. Get started with Otter today!

XKCDSerena Versus the Drones

August 15, 2019

Worse Than FailureCodeSOD: A Devil With a Date

Jim was adding a feature to the backend. This feature updated a few fields on an object, and then handed the object off as JSON to the front-end.

Adding the feature seemed pretty simple, but when Jim went to check out its behavior in the front-end, he got validation errors. Something in the data getting passed back by his web service was fighting with the front end.

On its surface, that seemed like a reasonable problem, but when looking into it, Jim discovered that it was the record_update_date field which was causing validation issues. The front-end displayed this as a read only field, so there was no reason to do any client-side validation in the first place, and that field was never sent to the backend, so there was even less than no reason to do validation.

Worse, the field had, at least to the eye, a valid date: 2019-07-29T00:00:00.000Z. Even weirder, if Jim changed the backend to just return 2019-07-29, everything worked. He dug into the validation code to see what might be wrong about it:

/**
 * Custom validation
 *
 * This is a callback function for ajv custom keywords
 *
 * @param  {object} wsFormat aiFormat property content
 * @param  {object} data Data (of element type) from document where validation is required
 * @param  {object} itemSchema Schema part from wsValidation keyword
 * @param  {string} dataPath Path to document element
 * @param  {object} parentData Data of parent object
 * @param  {string} key Property name
 * @param  {object} rootData Document data
 */
function wsFormatFunction(wsFormat, data, itemSchema, dataPath, parentData, key, rootData) {

    let valid;
    switch (aiFormat) {
        case 'date': {
            let regex = /^\d\d\d\d-[0-1]\d-[0-3](T00:00:00.000Z)?\d$/;
            valid = regex.test(data);
            break;
        }
        case 'date-time': {
            let regex = /^\d\d\d\d-[0-1]\d-[0-3]\d[t\s](?:[0-2]\d:[0-5]\d:[0-5]\d|23:59:60)(?:\.\d+)?(?:z|[+-]\d\d:\d\d)$/i;
            valid = regex.test(data);
            break;
        }
        case 'time': {
            let regex = /^(0[0-9]|1[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]$/;
            valid = regex.test(data);
            break;
        }
        default: throw 'Unknown wsFormat: ' + wsFormat;
    }

    if (!valid) {
        wsFormatFunction['errors'] = wsFormatFunction['errors'] || [];

        wsFormatFunction['errors'].push({
            keyword: 'wsFormat',
            dataPath: dataPath,
            message: 'should match format "' + wsFormat + '"',
            schema: itemSchema,
            data: data
        });
    }

    return valid;
}

When it starts with “Custom validation” and it involves dates, you know you’re in for a bad time. Worse, it’s custom validation, dates, and regular expressions written by someone who clearly didn’t understand regular expressions.

Let’s take a peek at the branch which was causing Jim’s error, and examine the regex:

/^\d\d\d\d-[0-1]\d-[0-3](T00:00:00.000Z)?\d$/

It should start with four digits, followed by a dash, followed by a value between 0 and 1. Then another digit, then a dash, then a number between 0 and 3, then the time (optionally), then a final digit.

It’s obvious why Jim’s perfectly reasonable date wasn’t working: it needed to be 2019-07-2T00:00:00.000Z9. Or, if Jim just didn’t include the timestamp, not only would 2019-07-29 be a valid date, but so would 2019-19-39, which just so happens to be my birthday. Mark your calendars for the 39th of Undevigintiber.

[Advertisement] BuildMaster allows you to create a self-service release management platform that allows different teams to manage their applications. Explore how!

August 14, 2019

XKCDOld Game Worlds

August 12, 2019

XKCDE Scooters

June 29, 2019

etbeLong-term Device Use

It seems to me that Android phones have recently passed the stage where hardware advances are well ahead of software bloat. This is the point that desktop PCs passed about 15 years ago and laptops passed about 8 years ago. For just over 15 years I’ve been avoiding buying desktop PCs, the hardware that organisations I work for throw out is good enough that I don’t need to. For the last 8 years I’ve been avoiding buying new laptops, instead buying refurbished or second hand ones which are more than adequate for my needs. Now it seems that Android phones have reached the same stage of development.

3 years ago I purchased my last phone, a Nexus 6P [1]. Then 18 months ago I got a Huawei Mate 9 as a warranty replacement [2] (I had swapped phones with my wife so the phone I was using which broke was less than a year old). The Nexus 6P had been working quite well for me until it stopped booting, but I was happy to have something a little newer and faster to replace it at no extra cost.

Prior to the Nexus 6P I had a Samsung Galaxy Note 3 for 1 year 9 months which was a personal record for owning a phone and not wanting to replace it. I was quite happy with the Note 3 until the day I fell on top of it and cracked the screen (it would have been ok if I had just dropped it). While the Note 3 still has my personal record for continuous phone use, the Nexus 6P/Huawei Mate 9 have the record for going without paying for a new phone.

A few days ago when browsing the Kogan web site I saw a refurbished Mate 10 Pro on sale for about $380. That’s not much money (I usually have spent $500+ on each phone) and while the Mate 9 is still going strong the Mate 10 is a little faster and has more RAM. The extra RAM is important to me as I have problems with Android killing apps when I don’t want it to. Also the IP67 protection will be a handy feature. So that phone should be delivered to me soon.

Some phones are getting ridiculously expensive nowadays (who wants to walk around with a $1000+ Pixel?) but it seems that the slightly lower end models are more than adequate and the older versions are still good.

Cost Summary

If I can buy a refurbished or old model phone every 2 years for under $400 that will make using a phone cost about $0.50 per day. The Nexus 6P cost me $704 in June 2016 which means that for the past 3 years my phone cost was about $0.62 per day.

It seems that laptops tend to last me about 4 years [3], and I don’t need high-end models (I even used one from a rubbish pile for a while). The last laptops I bought cost me $289 for a Thinkpad X1 Carbon [4] and $306 for the Thinkpad T420 [5]. That makes laptops about $0.20 per day.

In May 2014 I bought a Samsung Galaxy Note 10.1 2014 edition tablet for $579. That is still working very well for me today, apart from only having 32G of internal storage space and an OS update preventing Android apps from writing to the micro SD card (so I have to use USB to copy TV shows on to it) there’s nothing more than I need from a tablet. Strangely I even get good battery life out of it, I can use it for a couple of hours without the battery running out. Battery life isn’t nearly as good as when it was new, but it’s still OK for my needs. As Samsung stopped providing security updates I can’t use the tablet as a SSH client, but now that my primary laptop is a small and light model that’s less of an issue. Currently that tablet has cost me just over $0.30 per day and it’s still working well.

Currently it seems that my hardware expense for the forseeable future is likely to be about $1 per day. 20 cents for laptop, 30 cents for tablet, and 50 cents for phone. The overall expense is about $1.66 per month as I’m on a $20 per month pre-paid plan with Aldi Mobile.

Saving Money

A laptop is very important to me, the amounts of money that I’m spending don’t reflect that. But it seems that I don’t have any option for spending more on a laptop (the Thinkpad X1 Carbon I have now is just great and there’s no real option for getting more utility by spending more). I also don’t have any option to spend less on a tablet, 5 years is a great lifetime for a device that is practically impossible to repair (repair will cost a significant portion of the replacement cost).

I hope that the Mate 10 can last at least 2 years which will make it a new record for low cost of ownership of a phone for me. If app vendors can refrain from making their bloated software take 50% more RAM in the next 2 years that should be achievable.

The surprising thing I learned while writing this post is that my mobile phone expense is the largest of all my expenses related to mobile computing. Given that I want to get good reception in remote areas (needs to be Telstra or another company that uses their network) and that I need at least 3GB of data transfer per month it doesn’t seem that I have any options for reducing that cost.

January 13, 2019

etbeAre Men the Victims?

A very famous blog post is Straight White Male: The Lowest Difficulty Setting There Is by John Scalzi [1]. In that post he clearly describes that life isn’t great for straight white men, but that there are many more opportunities for them.

Causes of Death

When this post is mentioned there are often objections, one common objection is that men have a lower life expectancy. The CIA World factbook (which I consider a very reliable source about such matters) says that the US life expectancy is 77.8 for males and 82.3 for females [2]. The country with the highest life expectancy is Monaco with 85.5 for males and 93.4 years for females [3]. The CDC in the US has a page with links to many summaries about causes of death [4]. The causes where men have higher rates in 2015 are heart disease (by 2.1%), cancer (by 1.7%), unintentional injuries (by 2.8%), and diabetes (by 0.4%). The difference in the death toll for heart disease, cancer, unintentional injuries, and diabetes accounts for 7% of total male deaths. The male top 10 lists of causes of death also includes suicide (2.5%) and chronic liver disease (1.9%) which aren’t even in the top 10 list for females (which means that they would each comprise less than 1.6% of the female death toll).

So the difference in life expectancy would be partly due to heart problems (which are related to stress and choices about healthy eating etc), unintentional injuries (risk seeking behaviour and work safety), cancer (the CDC reports that smoking is more popular among men than women [5] by 17.5% vs 13.5%), diabetes (linked to unhealthy food), chronic liver disease (alcohol), and suicide. Largely the difference seems to be due to psychological and sociological issues.

The American Psychological Association has for the first time published guidelines for treating men and boys [6]. It’s noteworthy that the APA states that in the past “psychology focused on men (particularly white men), to the exclusion of all others” and goes on to describe how men dominate the powerful and well paid jobs. But then states that “men commit 90 percent of homicides in the United States and represent 77 percent of homicide victims”. They then go on to say “thirteen years in the making, they draw on more than 40 years of research showing that traditional masculinity is psychologically harmful and that socializing boys to suppress their emotions causes damage that echoes both inwardly and outwardly”. The article then goes on to mention use of alcohol, tobacco, and unhealthy eating as correlated with “traditional” ideas about masculinity. One significant statement is “mental health professionals must also understand how power, privilege and sexism work both by conferring benefits to men and by trapping them in narrow roles”.

The news about the new APA guidelines focuses on the conservative reaction, the NYT has an article about this [7].

I think that there is clear evidence that more flexible ideas about gender etc are good for men’s health and directly connect to some of the major factors that affect male life expectancy. Such ideas are opposed by conservatives.

Risky Jobs

Another point that is raised is the higher rate of work accidents for men than women. In Australia it was illegal for women to work in underground mines (one of the more dangerous work environments) until the late 80’s (here’s an article about this and other issues related to women in the mining industry [8]).

I believe that people should be allowed to work at any job they are qualified for. I also believe that we need more occupational health and safety legislation to reduce the injuries and deaths at work. I don’t think that the fact that a group of (mostly male) politicians created laws to exclude women from jobs that are dangerous and well-paid while also not creating laws to mitigate the danger is my fault. I’ll vote against such politicians at every opportunity.

Military Service

Another point that is often raised is that men die in wars.

In WW1 women were only allowed to serve in the battlefield as nurses. Many women died doing that. Deaths in war has never been an exclusively male thing. Women in many countries are campaigning to be allowed to serve equally in the military (including in combat roles).

As far as I am aware the last war where developed countries had conscription was the Vietnam war. Since then military technology has developed to increasingly complex and powerful weapons systems with an increasing number of civilians and non-combat military personnel supporting each soldier who is directly involved in combat. So it doesn’t seem likely that conscription will be required for any developed country in the near future.

But not being directly involved in combat doesn’t make people safe. NPR has an interesting article about the psychological problems (potentially leading up to suicide) that drone operators and intelligence staff experience [9]. As an aside the article reference two women doing that work.

Who Is Ignoring These Things?

I’ve been accused of ignoring these problems, it’s a general pattern on the right to accuse people of ignoring these straight white male problems whenever there’s a discussion of problems that are related to not being a straight white man. I don’t think that I’m ignoring anything by failing to mention death rates due to unsafe workplaces in a discussion about the treatment of trans people. I try to stay on topic.

The New York Times article I cited shows that conservatives are the ones trying to ignore these problems. When the American Psychological Association gives guidelines on how to help men who suffer psychological problems (which presumably would reduce the suicide rate and bring male life expectancy closer to female life expectancy) they are attacked by Fox etc.

My electronic communication (blog posts, mailing list messages, etc) is mostly connected to the free software community, which is mostly male. The majority of people who read what I write are male. But it seems that the majority of positive feedback when I write about such issues is from women. I don’t think there is a problem of women or left wing commentators failing men. I think there is a problem of men and conservatives failing men.

What Can We Do?

I’m sure that there are many straight white men who see these things as problems but just don’t say anything about it. If you don’t want to go to the effort of writing a blog post then please consider signing your name to someone else’s. If you are known for your work (EG by being a well known programmer in the Linux community) then you could just comment “I agree” on a post like this and that makes a difference while also being really easy to do.

Another thing that would be good is if we could change the hard drinking culture that seems connected to computer conferences etc. Kara has an insightful article on Model View Culture about drinking and the IT industry [10]. I decided that drinking at Linux conferences had got out of hand when about 1/3 of the guys at my table at a conference dinner vomited.

Linux Conf Au (the most prestigious Linux conference) often has a Depression BoF which is really good. I hope they have one this year. As an aside I have problems with depression, anyone who needs someone to talk to about such things and would rather speak to me than attend a BoF is welcome to contact me by email (please take a failure to reply immediately as a sign that I’m behind on checking my email not anything else) or social media.

If you have any other ideas on how to improve things please make a comment here, or even better write a blog post and link to it in a comment.

January 06, 2019

etbePhotograph Your Work

One thing I should have learned before (but didn’t) and hope I’ve learned now is to photograph sysadmin work.

If you work as a sysadmin you probably have a good phone, if you are going to run ssh from a phone or use a phone to read docs while in a server room with connectivity problems you need a phone with a good screen. You will also want a phone that has current security support. Such a phone will have a reasonable amount of storage space, I doubt that you can get a phone with less than 32G of storage that has a decent screen and Android security support. Admittedly Apple has longer security support for iPhones than Google does for Nexus/Pixel phones so it might be possible to get an older iPhone with a decent screen and hardly any space (but that’s not the point here).

If you have 32G of storage on your phone then there’s no real possibility of using up your storage space by photographing a day’s work. You could probably take an unreasonable number of photos of a week’s work as well as a few videos and not use up much of that.

The first time I needed photos recently was about 9 months ago when I was replacing some network gear (new DSL modem and switch for a client). The network sockets in the rack weren’t labelled and I found it unreasonably difficult to discover where everything was (the tangle of cables made tracking them impossible). What I should have done is to photograph the cables before I started and then I would have known where to connect everything. A 12MP camera allows zooming in on photos to get details, so a couple of quick shots of that rack would have saved me a lot of time – and in the case where everything goes as planned taking a couple of photos isn’t going to delay things.

Last night there was a power failure in a server room that hosts a couple of my machines. When power came back on the air-conditioner didn’t start up and the end result was a server with one of it’s disks totally dead (maybe due to heat, maybe power failures, maybe it just wore out). For unknown reasons BTRFS wouldn’t allow me to replace the disk in the RAID-1 array so I needed to copy the data to a new disk and create a new mirror (taking a lot of my time and also giving downtime). While I was working on this the filesystem would only mount read-only so no records of the kernel errors were stored. If I had taken photos of the screen I would have records of this which might allow me to reproduce the problem and file a bug report. Now I have no records, I can’t reproduce it, and I have a risk that next time a disk dies in a BTRFS RAID-1 I’ll have the same problem. Also presumably random people all over the world will suffer needless pain because of this while lacking the skills to file a good bug report because I didn’t make good enough records to reproduce it.

Hopefully next time I’m in a situation like this I’ll think to take some photos instead of just rebooting and wiping the evidence.

As an aside I’ve been finding my phone camera useful for zooming in on serial numbers that I can’t read otherwise. I’ve got new glasses on order that will hopefully address this, but in the mean time it’s the only way I can read the fine print. Another good use of a phone camera is recording error messages that scroll past too quickly to read and aren’t logged. Some phones support slow motion video capture (up to 120fps or more) and even for phones that don’t you can use slow play (my favourite Android video player MX Player works well at 5% normal speed) to capture most messages that are too quick to read.

September 20, 2018

etbeWords Have Meanings

As a follow-up to my post with Suggestions for Trump Supporters [1] I notice that many people seem to have private definitions of words that they like to use.

There are some situations where the use of a word is contentious and different groups of people have different meanings. One example that is known to most people involved with computers is “hacker”. That means “criminal” according to mainstream media and often “someone who experiments with computers” to those of us who like experimenting with computers. There is ongoing discussion about whether we should try and reclaim the word for it’s original use or whether we should just accept that’s a lost cause. But generally based on context it’s clear which meaning is intended. There is also some overlap between the definitions, some people who like to experiment with computers conduct experiments with computers they aren’t permitted to use. Some people who are career computer criminals started out experimenting with computers for fun.

But some times words are misused in ways that fail to convey any useful ideas and just obscure the real issues. One example is the people who claim to be left-wing Libertarians. Murray Rothbard (AKA “Mr Libertarian”) boasted about “stealing” the word Libertarian from the left [2]. Murray won that battle, they should get over it and move on. When anyone talks about “Libertarianism” nowadays they are talking about the extreme right. Claiming to be a left-wing Libertarian doesn’t add any value to any discussion apart from demonstrating the fact that the person who makes such a claim is one who gives hipsters a bad name. The first time penny-farthings were fashionable the word “libertarian” was associated with left-wing politics. Trying to have a sensible discussion about politics while using a word in the opposite way to almost everyone else is about as productive as trying to actually travel somewhere by penny-farthing.

Another example is the word “communist” which according to many Americans seems to mean “any person or country I don’t like”. It’s often invoked as a magical incantation that’s supposed to automatically win an argument. One recent example I saw was someone claiming that “Russia has always been communist” and rejecting any evidence to the contrary. If someone was to say “Russia has always been a shit country” then there’s plenty of evidence to support that claim (Tsarist, communist, and fascist Russia have all been shit in various ways). But no definition of “communism” seems to have any correlation with modern Russia. I never discovered what that person meant by claiming that Russia is communist, they refused to make any comment about Russian politics and just kept repeating that it’s communist. If they said “Russia has always been shit” then it would be a clear statement, people can agree or disagree with that but everyone knows what is meant.

The standard response to pointing out that someone is using a definition of a word that is either significantly different to most of the world (or simply inexplicable) is to say “that’s just semantics”. If someone’s “contribution” to a political discussion is restricted to criticising people who confuse “their” and “there” then it might be reasonable to say “that’s just semantics”. But pointing out that someone’s writing has no meaning because they choose not to use words in the way others will understand them is not just semantics. When someone claims that Russia is communist and Americans should reject the Republican party because of their Russian connection it’s not even wrong. The same applies when someone claims that Nazis are “leftist”.

Generally the aim of a political debate is to convince people that your cause is better than other causes. To achieve that aim you have to state your cause in language that can be understood by everyone in the discussion. Would the person who called Russia “communist” be more or less happy if Russia had common ownership of the means of production and an absence of social classes? I guess I’ll never know, and that’s their failure at debating politics.

August 25, 2018

Dave HallAWS Parameter Store

Anyone with a moderate level of AWS experience will have learned that Amazon offers more than one way of doing something. Storing secrets is no exception. 

It is possible to spin up Hashicorp Vault on AWS using an official Amazon quick start guide. The down side of this approach is that you have to maintain it.

If you want an "AWS native" approach, you have 2 services to choose from. As the name suggests, Secrets Manager provides some secrets management tools on top of the store. This includes automagic rotation of AWS RDS credentials on a regular schedule. For the first 30 days the service is free, then you start paying per secret per month, plus API calls.

There is a free option, Amazon's Systems Manager Parameter Store. This is what I'll be covering today.

Structure

It is easy when you first start out to store all your secrets at the top level. After a while you will regret this decision. 

Parameter Store supports hierarchies. I recommend using them from day one. Today I generally use /[appname]-[env]/[KEY]. After some time with this scheme I am finding that /[appname]/[env]/[KEY] feels like it will be easier to manage. IAM permissions support paths and wildcards, so either scheme will work.

If you need to migrate your secrets, use Parameter Store namespace migration script

Access Controls

Like most Amazon services IAM controls access to Parameter Store. 

Parameter Store allows you to store your values as plain text or encrypted using a key using KMS. For encrypted values the user must have have grants on the parameter store value and KMS key. For consistency I recommend encrypting all your parameters.

If you have a monolith a key per application per envionment is likely to work well. If you have a collection of microservices having a key per service per environment becomes difficult to manage. In this case share a key between several services in the same environment.

Here is an IAM policy for an Lambda function to access a hierarchy of values in parameter store:

To allow your developers to manage the parameters in dev you will need a policy that looks like this:

Amazon has great documentation on controlling access to Parameter Store and KMS.

Adding Parameters

Amazon allows you to store almost any string up to 4Kbs in length in the Parameter store. This gives you a lot of flexibility.

Parameter Store supports deep hierarchies. You will find this becomes annoying to manage. Use hierarchies to group your values by application and environment. Within the heirarchy use a flat structure. I recommend using lower case letters with dashes between words for your paths. For the parameter keys use upper case letters with underscores. This makes it easy to differentiate the two when searching for parameters. 

Parameter store encodes everything as strings. There may be cases where you want to store an integer as an integer or a more complex data structure. You could use a naming convention to differentiate your different types. I found it easiest to encode every thing as json. When pulling values from the store I json decode it. The down side is strings must be wrapped in double quotes. This is offset by the flexibility of being able to encode objects and use numbers.

It is possible to add parameters to the store using 3 different methods. I generally find the AWS web console easiest when adding a small number of entries. Rather than walking you through this, Amazon have good documentation on adding values. Remember to always use "secure string" to encrypt your values.

Adding parameters via boto3 is straight forward. Once again it is well documented by Amazon.

Finally you can maintain parameters in with a little bit of code. In this example I do it with Python.

Using Parameters

I have used Parameter Store from Python and the command line. It is easier to use it from Python.

My example assumes that it a Lambda function running with the policy from earlier. The function is called my-app-dev. This is what my code looks like:

If you want to avoid loading your config each time your Lambda function is called you can store the results in a global variable. This leverages Amazon's feature that doesn't clear global variables between function invocations. The catch is that your function won't pick up parameter changes without a code deployment. Another option is to put in place logic for periodic purging of the cache.

On the command line things are little harder to manage if you have more than 10 parameters. To export a small number of entries as environment variables, you can use this one liner:

Make sure you have jq installed and the AWS cli installed and configured.

Conclusion

Amazon's System Manager Parameter Store provides a secure way of storing and managing secrets for your AWS based apps. Unlike Hashicorp Vault, Amazon manages everything for you. If you don't need the more advanced features of Secrets Manager you don't have to pay for them. For most users Parameter Store will be adequate.

July 05, 2018

Dave HallMigrating AWS System Manager Parameter Store Secrets to a new Namespace

When starting with a new tool it is common to jump in start doing things. Over time you learn how to do things better. Amazon's AWS System Manager (SSM) Parameter Store was like that for me. I started off polluting the global namespace with all my secrets. Over time I learned to use paths to create namespaces. This helps a lot when it comes to managing access.

Recently I've been using Parameter Store a lot. During this time I have been reminded that naming things is hard. This lead to me needing to change some paths in SSM Parameter Store. Unfortunately AWS doesn't allow you to rename param store keys, you have to create new ones.

There was no way I was going to manually copy and paste all those secrets. Python (3.6) to the rescue! I wrote a script to copy the values to the new namespace. While I was at it I migrated them to use a new KMS key for encryption.

Grab the code from my gist, make it executable, pip install boto3 if you need to, then run it like so:

copy-ssm-ps-path.py source-tree-name target-tree-name new-kms-uuid

The script assumes all parameters are encrypted. The same key is used for all parameters. boto3 expects AWS credentials need to be in ~/.aws or environment variables.

Once everything is verified, you can use a modified version of the script that calls ssm.delete_parameter() or do it via the console.

I hope this saves someone some time.

September 24, 2017

Dave HallDrupal Puppies

Over the years Drupal distributions, or distros as they're more affectionately known, have evolved a lot. We started off passing around database dumps. Eventually we moved onto using installations profiles and features to share par-baked sites.

There are some signs that distros aren't working for people using them. Agencies often hack a distro to meet client requirements. This happens because it is often difficult to cleanly extend a distro. A content type might need extra fields or the logic in an alter hook may not be desired. This makes it difficult to maintain sites built on distros. Other times maintainers abandon their distributions. This leaves site owners with an unexpected maintenance burden.

We should recognise how people are using distros and try to cater to them better. My observations suggest there are 2 types of Drupal distributions; starter kits and targeted products.

Targeted products are easier to deal with. Increasingly monetising targeted distro products is done through a SaaS offering. The revenue can funds the ongoing development of the product. This can help ensure the project remains sustainable. There are signs that this is a viable way of building Drupal 8 based products. We should be encouraging companies to embrace a strategy built around open SaaS. Open Social is a great example of this approach. Releasing the distros demonstrates a commitment to the business model. Often the secret sauce isn't in the code, it is the team and services built around the product.

Many Drupal 7 based distros struggled to articulate their use case. It was difficult to know if they were a product, a demo or a community project that you extend. Open Atrium and Commerce Kickstart are examples of distros with an identity crisis. We need to reconceptualise most distros as "starter kits" or as I like to call them "puppies".

Why puppies? Once you take a puppy home it becomes your responsibility. Starter kits should be the same. You should never assume that a starter kit will offer an upgrade path from one release to the next. When you install a starter kit you are responsible for updating the modules yourself. You need to keep track of security releases. If your puppy leaves a mess on the carpet, no one else will clean it up.

Sites build on top of a starter kit should diverge from the original version. This shouldn't only be an expectation, it should be encouraged. Installing a starter kit is the starting point of building a unique fork.

Project pages should clearly state that users are buying a puppy. Prospective puppy owners should know if they're about to take home a little lap dog or one that will grow to the size of a pony that needs daily exercise. Puppy breeders (developers) should not feel compelled to do anything once releasing the puppy. That said, most users would like some documentation.

I know of several agencies and large organisations that are making use of starter kits. Let's support people who are adopting this approach. As a community we should acknowledge that distros aren't working. We should start working out how best to manage the transition to puppies.

September 16, 2017

Dave HallTrying Drupal

While preparing for my DrupalCamp Belgium keynote presentation I looked at how easy it is to get started with various CMS platforms. For my talk I used Contentful, a hosted content as a service CMS platform and contrasted that to the "Try Drupal" experience. Below is the walk through of both.

Let's start with Contentful. I start off by visiting their website.

Contentful homepage

In the top right corner is a blue button encouraging me to "try for free". I hit the link and I'm presented with a sign up form. I can even use Google or GitHub for authentication if I want.

Contentful signup form

While my example site is being installed I am presented with an overview of what I can do once it is finished. It takes around 30 seconds for the site to be installed.

Contentful installer wait

My site is installed and I'm given some guidance about what to do next. There is even an onboarding tour in the bottom right corner that is waving at me.

Contentful dashboard

Overall this took around a minute and required very little thought. I never once found myself thinking come on hurry up.

Now let's see what it is like to try Drupal. I land on d.o. I see a big prominent "Try Drupal" button, so I click that.

Drupal homepage

I am presented with 3 options. I am not sure why I'm being presented options to "Build on Drupal 8 for Free" or to "Get Started Risk-Free", I just want to try Drupal, so I go with Pantheon.

Try Drupal providers

Like with Contentful I'm asked to create an account. Again I have the option of using Google for the sign up or completing a form. This form has more fields than contentful.

Pantheon signup page

I've created my account and I am expecting to be dropped into a demo Drupal site. Instead I am presented with a dashboard. The most prominent call to action is importing a site. I decide to create a new site.

Pantheon dashboard

I have to now think of a name for my site. This is already feeling like a lot of work just to try Drupal. If I was a busy manager I would have probably given up by this point.

Pantheon create site form

When I submit the form I must surely be going to see a Drupal site. No, sorry. I am given the choice of installing WordPress, yes WordPress, Drupal 8 or Drupal 7. Despite being very confused I go with Drupal 8.

Pantheon choose application page

Now my site is deploying. While this happens there is a bunch of items that update above the progress bar. They're all a bit nerdy, but at least I know something is happening. Why is my only option to visit my dashboard again? I want to try Drupal.

Pantheon site installer page

I land on the dashboard. Now I'm really confused. This all looks pretty geeky. I want to try Drupal not deal with code, connection modes and the like. If I stick around I might eventually click "Visit Development site", which doesn't really feel like trying Drupal.

Pantheon site dashboard

Now I'm asked to select a language. OK so Drupal supports multiple languages, that nice. Let's select English so I can finally get to try Drupal.

Drupal installer, language selection

Next I need to chose an installation profile. What is an installation profile? Which one is best for me?

Drupal installer, choose installation profile

Now I need to create an account. About 10 minutes I already created an account. Why do I need to create another one? I also named my site earlier in the process.

Drupal installer, configuration form part 1
Drupal installer, configuration form part 2

Finally I am dropped into a Drupal 8 site. There is nothing to guide me on what to do next.

Drupal site homepage

I am left with a sense that setting up Contentful is super easy and Drupal is a lot of work. For most people wanting to try Drupal they would have abandoned someway through the process. I would love to see the conversion stats for the try Drupal service. It must miniscule.

It is worth noting that Pantheon has the best user experience of the 3 companies. The process with 1&1 just dumps me at a hosting sign up page. How does that let me try Drupal?

Acquia drops onto a page where you select your role, then you're presented with some marketing stuff and a form to request a demo. That is unless you're running an ad blocker, then when you select your role you get an Ajax error.

The Try Drupal program generates revenue for the Drupal Association. This money helps fund development of the project. I'm well aware that the DA needs money. At the same time I wonder if it is worth it. For many people this is the first experience they have using Drupal.

The previous attempt to have simplytest.me added to the try Drupal page ultimately failed due to the financial implications. While this is disappointing I don't think simplytest.me is necessarily the answer either.

There needs to be some minimum standards for the Try Drupal page. One of the key item is the number of clicks to get from d.o to a working demo site. Without this the "Try Drupal" page will drive people away from the project, which isn't the intention.

If you're at DrupalCon Vienna and want to discuss this and other ways to improve the marketing of Drupal, please attend the marketing sprints.

AttachmentSize
try-contentful-1.png342.82 KB
try-contentful-2.png214.5 KB
try-contentful-3.png583.02 KB
try-contentful-5.png826.13 KB
try-drupal-1.png1.19 MB
try-drupal-2.png455.11 KB
try-drupal-3.png330.45 KB
try-drupal-4.png239.5 KB
try-drupal-5.png203.46 KB
try-drupal-6.png332.93 KB
try-drupal-7.png196.75 KB
try-drupal-8.png333.46 KB
try-drupal-9.png1.74 MB
try-drupal-10.png1.77 MB
try-drupal-11.png1.12 MB
try-drupal-12.png1.1 MB
try-drupal-13.png216.49 KB