Google Cloud arrives Down Under

On June 20th 2017, Google finally announced the availability of its Cloud Platform in Sydney, Australia, with 3 availability zones. This is hugely exciting news, as Australian organisations can finally choose between the 3 leading cloud providers, all with a local presence: AWS, Azure and Google Cloud.

Having used AWS almost exclusively since 2008 and gained my Solutions Architect certification along the way, I have been waiting for something different that would help bridge some of the gaps I have faced with other cloud providers.

Google Cloud has come a long way. It was one of the first to offer a powerful PaaS in App Engine, but it suffered early on from a poorly designed console UI, a lack of services and limited worldwide presence, which enabled AWS to gain a huge share of the market. Fast forward to 2017: GCP has many cool services and a completely redesigned console.

This blog post is not a personal statement telling everyone to make the switch to Google Cloud, but rather a non-exhaustive list of little things I like, from a DevOps perspective.
I hope it will help solutions architects such as myself to better advise on which provider suits best. Google Cloud does not have an equivalent for every AWS service, but it’s inevitable to compare the two.

Console

The Google Cloud console is extremely simple, fast and uncluttered, unlike the AWS console. Perhaps the thing I appreciate most is that I do not have to select a region first in each service. I’m sure many of you have already experienced that tummy scare moment when you load the EC2 console in AWS and see no running instances, until you realise you’re in the wrong region…
Not having to select a region first makes it much easier to get an overall view of all your resources.

Cloud Shell

Any DevOps engineer knows that despite all the automation tools available these days, occasional manual troubleshooting is always required. This typically involves launching a new instance, waiting until it’s up and running, then installing tools such as Docker, the AWS CLI etc…
Cloud Shell is an interactive terminal available directly in the Google console (Azure has one too). It takes a couple of seconds to be ready, has 5GB of persistent storage and, more importantly, comes with all the common tools installed: Cloud SDK, Docker (no authentication needed with the Google Container Registry), plus the usual commands such as ping, telnet, curl etc…
Cloud Shell saves a lot of time when quickly managing and troubleshooting resources.

Compute Engine

Compute Engine does not have all the equivalent features of EC2, but it has a couple of things I really like. Apart from the blazing speed and fast launch times of its machines, the ability to choose custom machine types is a big plus: you can pick your own amount of CPU and RAM if none of the predefined machine types suits your needs.

Container Engine

The main service which actually got me to try out Google Cloud is Container Engine (GKE), a fully managed Kubernetes cluster service. It is well known that setting up a highly available Kubernetes cluster is fairly complex and that upgrades aren’t always smooth. Even with tools such as Kops or Kargo, in my opinion, setting up and managing a cluster relies too much on code. I strongly believe that not everything is code, and I encourage taking more advantage of platforms.
I’ve had cases when I needed a Kubernetes cluster up and running quickly so that I could test my containers. With GKE, your cluster is ready after a couple of clicks.
AWS does not even have a proper fully managed Kubernetes service, and in my opinion its ECS service lacks many features required for managing and orchestrating a Docker cluster.
GKE also enables you to create additional node “pools”, which can have different machine types to match the resource needs of specific containers.
Finally, GKE is (optionally, at a cost) fully integrated with Stackdriver to provide monitoring and centralised logging without any extra configuration in your Kubernetes manifests.

Many organisations are adopting Docker micro-services with Kubernetes and I see GKE becoming an integral part of Google Cloud. Using GKE also means that your Kubernetes manifests stay agnostic with no cloud vendor lock-in.

SQL

RDS is one of AWS’s best services for a fully managed database. Google Cloud SQL is very limited in features compared to RDS, but it has some little things which the RDS console does not offer: in Cloud SQL you can create users and databases directly via the console.
The biggest handicap of Cloud SQL is that it’s a public service: it cannot be launched inside a virtual private network. Instead you need to set up a Cloud SQL Proxy, which provides a secure tunnel between your SQL instance and GKE or Compute Engine.

Storage

We’ve seen during S3’s recent outage in the US East region that it’s critical to have multi-region replication. When you create a bucket in AWS S3 you can choose to replicate it to another region.
Google Cloud Storage has taken a slightly different and better approach to cross-region replication. When you create a Storage bucket, you can choose to replicate it across a whole geographic area (multi-regional): US, Europe or Asia. This is a much simpler and more attractive approach for storing mission critical data.

Pricing

Naturally Google Cloud’s pricing is aggressive and lower than the cost of running on AWS. But the best thing about their pricing model is the degressive pricing: the longer a machine runs during the month, the cheaper it gets.
On AWS you either pay on demand or purchase Reserved Instances (where you need to commit for 1 or 3 years, usually with an upfront payment). Despite the small changes you can make to Reserved Instances, they still require a long term commitment to a vendor and instance type, which isn’t always ideal.
With Google Cloud the discount does not depend on a Reserved Instances approach: for example, if you leave your machine running for a full month you automatically get 30% off. This pricing model is perfectly suited to cloud resources.

Final words

I’ve compared a couple of Google Cloud’s services with AWS and listed their advantages. There is no doubt we will see all cloud providers offering similar features. At the end of the day, whether you use Google, AWS, Azure or X depends on many parameters and on the problems you’re trying to solve, to which there are no immediate answers.

I look forward to seeing how Google Cloud will disrupt the Australian market.

If you need any help with cloud practice, do not hesitate to contact me.

Docker Build Tips

With Docker’s rising popularity, many people are building and publishing their own images. It’s easy to get started and build, but it can feel like going back to the days before configuration management arrived, with lots of messy bash scripting. Unsurprisingly, Docker Hub has its fair share of poorly written Dockerfiles.

This article offers some tips for structuring Dockerfiles better, while keeping the resulting images as small as possible.

Layers

It is important to understand that Docker images are based on layers: every command in the Dockerfile produces one. The rule of thumb is to create as few layers as possible and to separate the ones that rarely change from those that change frequently.

Structuring the Dockerfile

If your Docker image typically installs a package, adds a couple of files then runs the app, then it’s good to adhere to what I call the “FMERAEEC” method: that is, using the FROM, MAINTAINER, ENV, RUN, ADD, ENTRYPOINT, EXPOSE and CMD commands, in that order. Of course not everything will fit into this.

FROM

Use your preferred base image; which one is entirely up to you. It is preferable not to use the latest tag, as you need to know when your base image changes so that you can verify that the app still runs OK.
debian:jessie tends to be a popular base image on the Docker Hub. We’ll discuss slim images later on.

MAINTAINER

Use the RFC 5322 compliant “Name <email>” format, e.g.

MAINTAINER Tom Murphy <tom@bluemalkin.net>

ENV

If you are installing packages, it is a good idea to specify which version of the main package is being installed. e.g.
ENV NGINX_VERSION 1.9.12

RUN

This layer will change frequently, so the golden rule is to chain all the bash commands into a single RUN wherever possible.
Typically you will update the package lists, install the package(s) (using the version specified in ENV), then clean up the package lists.
Then you may run some configuration commands such as sed, or create symbolic links from the log files to the standard output/error etc…
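
As a rough sketch of that pattern (not a definitive build), the Dockerfile below chains the package list update, a version-pinned install, a configuration tweak and the cleanup into one RUN. It assumes the apt repositories configured in the base image really provide an nginx package whose full version string equals NGINX_VERSION (repository versions often carry a distribution suffix), and the sed edit is purely illustrative.

```dockerfile
FROM debian:jessie

# Assumption: this must match the full package version offered by the
# configured apt repositories, otherwise the install step fails.
ENV NGINX_VERSION 1.9.12

# One RUN, one layer: update the lists, install the pinned version,
# adjust the configuration, then remove the package lists again.
RUN apt-get update \
 && apt-get install -y --no-install-recommends nginx=${NGINX_VERSION} \
 && sed -i 's/worker_processes .*/worker_processes 1;/' /etc/nginx/nginx.conf \
 && rm -rf /var/lib/apt/lists/*
```

Because everything happens in a single RUN, the package lists never end up stored in an intermediate layer.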

Building from source

If you build your container frequently and it needs something built from source, it is preferable not to compile it as part of the image build. Those builds take longer and, with the build dependencies required, you may end up with a large layer. Instead, you may want a separate process which builds, packages and stores the result somewhere, so that your main app build can simply fetch and install the packages.

ADD

Add your configuration files, artifacts built by your CI/CD etc… plus your entrypoint.sh script.
ADD accepts more kinds of source than COPY: it can fetch remote URLs and automatically extracts local tar archives. Note that COPY is not deprecated, and Docker’s best practices guide prefers it for plain local files, where that extra behaviour is not needed.

A common misconception is that you need to RUN mkdir commands first. The destination path in the container is created automatically, including any missing parent directories.
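
A small sketch of how ADD treats its destination, assuming three hypothetical files (myapp.conf, artifact.tar.gz and entrypoint.sh) sit next to the Dockerfile in the build context:

```dockerfile
FROM debian:jessie

# /etc/myapp/conf.d does not exist in the base image; ADD creates the
# missing parent directories, so no RUN mkdir is needed.
ADD myapp.conf /etc/myapp/conf.d/myapp.conf

# A local tar archive given to ADD is unpacked into the destination directory.
ADD artifact.tar.gz /opt/myapp/

# The script keeps the file mode it has in the build context.
ADD entrypoint.sh /entrypoint.sh
```

Each ADD still produces its own layer, so group files into directories where that makes sense.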

ENTRYPOINT

When you run your Docker container, the entrypoint script runs first. This is a good place to make configuration changes to the container, based on any environment variables you pass in at run time, before issuing:
exec "$@" which executes the CMD command.

A classic one many forget is to add the execute bit to the script. Ensure it’s set in your source control, rather than adding another RUN command to chmod the file.

EXPOSE

Expose the container port(s) regardless of whether you will use Docker links or bind to the host.

CMD

Finally, CMD specifies the command (and its options, such as running in the foreground) that the container runs.

It is preferred to put the command and options in the form of:
CMD ["executable","param1","param2"]
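
To show how ENTRYPOINT, EXPOSE and CMD fit together, here is a minimal sketch. It assumes nginx has been installed by an earlier RUN and that entrypoint.sh is a hypothetical script of your own, already executable in source control, whose last line is exec "$@".

```dockerfile
FROM debian:jessie

# ... ENV, RUN and ADD instructions as described in the earlier sections ...

# Hypothetical script: adjusts the configuration from environment variables,
# then hands over with: exec "$@"
ADD entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

EXPOSE 80 443

# Exec form: these values are passed to the entrypoint script as its
# arguments ("$@"), so nginx runs in the foreground as the main process.
CMD ["nginx", "-g", "daemon off;"]
```

Anything given after the image name on docker run replaces CMD, while the entrypoint script still runs first.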

Other commands notes

LABEL: many issue one LABEL command for the container description and another for the version of the image produced. Unless you have a good reason to use them, such as a platform that queries the metadata, avoid them, especially the description label, which mostly remains static. The version of the image is what you tag the image with. Remember that each LABEL command produces another layer.

Logging

Many mount a volume to the container so that the host can access the logs and send them somewhere. Whilst this approach works, it is a lot simpler for the container to send logs to the standard and error outputs, then use a logging driver in Docker.

Running the container CMD in the foreground should produce at least the startup log on the standard output. To get the full logs, redirect the log files to the outputs by creating symbolic links in a RUN command.
nginx is a good example: its access and error log files can be linked to /dev/stdout and /dev/stderr.
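
A minimal sketch of this redirection, assuming nginx is installed from the distribution repository and writes to /var/log/nginx/access.log and /var/log/nginx/error.log:

```dockerfile
FROM debian:jessie

# Install nginx, clean up the package lists, then point the log files at the
# container's standard output and standard error, all in one layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends nginx \
 && rm -rf /var/lib/apt/lists/* \
 && ln -sf /dev/stdout /var/log/nginx/access.log \
 && ln -sf /dev/stderr /var/log/nginx/error.log

# Run in the foreground so the logs keep flowing to the logging driver.
CMD ["nginx", "-g", "daemon off;"]
```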

Ensure that the user has permission to write to the outputs.

Keeping images the smallest size possible

As mentioned earlier, try and keep the number of layers as low as possible. Some other tips are:

  • Remove package lists at the end of the RUN command: rm -rf /var/lib/apt/lists/*
    And optionally any other temporary files under /var/tmp/* and /tmp/*
  • If you are downloading archives, remove them after extracting
  • If you are building from source, remove the packages required for building and their dependencies
  • Consider building using a slim base image
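
The sketch below applies the first two tips to a downloaded archive. The URL and the myapp name are placeholders rather than a real artifact, and curl is removed again because it is only needed for the download.

```dockerfile
FROM debian:jessie

# https://example.com/myapp.tar.gz is a placeholder URL.
# Download, extract, delete the archive and clean up in the same RUN, so the
# temporary files are never stored in a layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && mkdir -p /opt/myapp \
 && curl -fsSL -o /tmp/myapp.tar.gz https://example.com/myapp.tar.gz \
 && tar -xzf /tmp/myapp.tar.gz -C /opt/myapp \
 && rm -f /tmp/myapp.tar.gz \
 && apt-get purge -y --auto-remove curl \
 && rm -rf /var/lib/apt/lists/* /var/tmp/* /tmp/*
```

The cleanup has to happen in the same RUN as the download: deleting a file in a later layer hides it but does not make the image any smaller.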

Do not flatten the image with a docker export unless you have a good reason to. An export does not preserve the layers, so the image can no longer take advantage of cached layers.

Slim Images using Alpine Linux

Whenever possible it’s good to use a slim image to speed up builds and deployments.
The busybox image has been around for a while but recently there has been an increasing trend in adopting the Alpine Linux image.
It is a security-oriented, lightweight Linux distribution based on musl libc and busybox. It’s only 4.8MB!
Several official images are published to Docker Hub with an Alpine-based slim variant; keep an eye out for tags suffixed with -alpine.

Alpine Linux currently has a limited number of packages in its package repository. It still has some popular ones: nginx, squid, redis, openjdk7 etc… For comparison, the openjdk image based on Alpine is 100MB, which is over 5 times smaller than the Debian based image (560MB).
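
For illustration, an Alpine based nginx image could look roughly like the sketch below. The alpine:3.4 tag is an assumption, chosen as a release whose apk supports --no-cache; check which tag and package versions suit your case.

```dockerfile
# Assumption: an Alpine release with apk --no-cache and an nginx package.
FROM alpine:3.4

# --no-cache fetches the package index on the fly, so there are no package
# lists to remove afterwards. Some Alpine nginx packages expect /run/nginx
# to exist for the pid file, hence the mkdir.
RUN apk add --no-cache nginx \
 && mkdir -p /run/nginx

EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```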

Some may have security reservations about slim images versus a full O.S. image. In my opinion it’s reasonably secure as long as: 1. the upstream image is frequently updated, 2. you always pull the latest base image on all builds, and 3. more importantly, your host O.S. is regularly patched.

Final words

There are many, many topics to learn and cover in Docker. Make sure you are familiar with the Docker build documentation at https://docs.docker.com/engine/reference/builder/
Read the Dockerfile best practices at https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
Check how others build their images on Docker Hub, GitHub etc… learn from them and enhance yours. Try to keep your build structure and naming simple and consistent.

The next Docker topic will cover the Docker platform: rancher.com. Watch this space and happy Dockering !

How to wing the AWS certification exam

A couple of months ago I undertook the AWS Solutions Architect – Associate exam, which I happily passed with a score of 85%.

Whilst I went into the exam with almost no preparation at all, I’ve put together some tips to best prepare yourself for the exam.

Please note that when you undertake the exam, you are required to sign an NDA, which forbids you from sharing the contents of the exam questions.

The certification

AWS certifications are valid for 2 years. They are useful to test your knowledge and boost your credentials, plus you get access to the Amazon Partner Network.

The next level after the associate exam is the professional exam. The main differences between the two are:

Associate level

  • Technical
  • Troubleshooting
  • Common scenarios

Professional level

  • Much more in depth
  • Complex scenarios

The associate exam lasts 80 minutes, whilst the professional exam lasts 170 minutes. You are taken into the exam room (you cannot bring anything at all with you) and the questions are multiple choice. At the end of the exam you are immediately presented with the results on the screen.

If you fail the exam, you’ll have to wait 30 days before you can try again.

Preparation

The exam questions are well written. It’s not an exam you can just study for and hope to pass: you need plenty of hands-on experience, and the best experience you can get is in your profession.

Some tips to best prepare yourself:

  • Practice, practice, practice !
  • Read the AWS whitepapers aws.amazon.com/whitepapers
  • Sign up to Cloud Academy cloudacademy.com
  • Sign up to Linux Academy linuxacademy.com
  • Read the AWS sample questions and discuss them with your colleagues
  • Undertake the AWS practice exam (US$20: 20 questions / 30 mins)

Cloud Academy and Linux Academy both have online courses and lots of quizzes. The official AWS practice exam is best undertaken last, as you get to practise against the timer, which can be distracting.

AWS Solutions Architect – Associate exam

The scoring breakdown for the exam I undertook is:

  • Designing highly available, cost efficient, fault tolerant, scalable systems
  • Implementation / Deployment
  • Security
  • Troubleshooting

My impressions are:

  • It’s not easy: the questions are well composed for architects with plenty of experience
  • Some of the questions are long
  • AWS states one year minimum experience, but it depends on how many services you have been exposed to. Despite having used AWS extensively since 2008, I found some of the questions challenging
  • The questions are high level but also hands-on
  • The exam covers most main AWS services

For the AWS services covered, whilst each exam is different, they cover roughly:

  • 75% EC2 (ELB, EBS, AMI…), VPC & IAM roles
  • 25% other services (Storage Gateway, Route53, Cloudfront, SQS, RDS, SES, DynamoDB etc…)

Exam gotchas

For the Solutions Architect exam:

  • There are architectures for totally different scenarios
  • Be mindful of cost effective vs best design vs default architecture
  • Security is very important to know (e.g. security groups vs network ACLs, stateful vs stateless etc…)
  • Good practice with troubleshooting is essential
  • Some questions can be easily answered by elimination

Exam tips

Some tips for when you sit the exam:

  • Prepare yourself
  • Take your time
  • Don’t pay too much attention to the timer
  • Read the questions carefully
  • Mark questions for review later
  • Leave at least 10 to 15 mins to review

The AWS certification exam can be stressful but also fun. Good luck if you intend to undertake it!