
Creating a Customized Linux Amazon Beanstalk AMI

Amazon Web Services (AWS) has launched some cool stuff in the last year.  One of the newest and most exciting offerings is Amazon Elastic Beanstalk.  The purpose of Beanstalk is to let developers create web applications and deploy them into an auto-scaling, cloud-hosted environment.  I'm going to share a little about this, and then talk about how to make your own customized Linux machine image (and why you would want one anyway...).
Prior to Beanstalk, if you wanted to create an 'elastic' web application that scaled automatically depending on its resource needs, you would need to manually provision an elastic IP, a load balancer (maybe), and build some logic into your web application to know when to spin up new machine images to handle the load. (If I've lost you so far, you might as well move on to some lighter reading).


Beanstalk is profound in that it creates this automatically managed environment for you, and lets you simply upload your web application to the "cloud."  Beanstalk monitors the health and load of your application, and along with some features you can tweak, will know when to spin up new machine images to handle the load.

If this sounds cool to you, it's because it is!  Read more here: http://aws.amazon.com/elasticbeanstalk/

As of the writing of this blog, Elastic Beanstalk only supports Java web applications, and they deploy on Amazon's own flavor of Linux in a Tomcat 6.x container.  Good news!  This is exactly the environment I prefer.  Those of you who like to deploy on Windows are out of luck for now, but I hear Amazon is working on Windows server images for Beanstalk.  You might even be able to roll your own after looking at one of their Linux images.

Ok, so if this is so cool, why would I need to make my own customized machine image (AMI)?  Well, unless a vanilla Linux server environment has everything your web application needs, you may need some extras.  For example, you may want some custom packages installed on the server, like ImageMagick, or server-side log or security monitoring.  These things are easy to configure on your own dedicated servers, but in Beanstalk land, every machine image that boots is vanilla unless you customize it.  This article exists because it took me a while to figure all this stuff out, so now my hard work helps you out.

For my latest project, I needed some custom software and services to run on my Linux machine images, and you simply can't install software packages, compile code, or start/stop services from a Java application (not easily, anyway).  So that leaves me with one option: roll my own image and use it to run my Beanstalk app.

The first thing you might consider is to just launch your Beanstalk app, SSH into the server and customize it on the fly, then bundle the disk image.  Nice concept, but no go.  If you customize your disk image while it is running in a Beanstalk environment, bundling it simply doesn't work and you get a non-bootable machine image.

No, in fact the way to do this is to spin up an EC2 instance manually, using one of Amazon's Beanstalk AMIs.  I used a 64-bit image, ami-100fff79.  Launching in EC2 as opposed to Beanstalk means that just the core services get launched, which makes it easy for us to customize, and most importantly, bundling it into your own custom AMI actually works.

Go through the normal steps of launching an EC2 instance, create a keypair (newbies read http://chris-richardson.blog-city.com/amazon_ec2_keypairs_and_other_stumbling_blocks_1.htm), and SSH into your instance.

Side Track:

Now, one of the things I needed on my custom machine image is a common location to mount an Amazon S3 bucket.  Why?  Because my web application needs temporary storage, and there may be multiple machine images running at the same time, so they all need access to the same temporary storage.  For that, there is a nifty Linux tool called s3fs, which lets you mount an Amazon S3 bucket as part of your file system.
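To give you an idea of what that looks like once s3fs is installed, here is a rough sketch (the bucket name, mount point, and credential values are examples, not from my actual setup):

```shell
# s3fs reads credentials from /etc/passwd-s3fs in KEY:SECRET form,
# and requires the file not be world-readable.
echo 'ACCESS_KEY_ID:SECRET_ACCESS_KEY' | sudo tee /etc/passwd-s3fs > /dev/null
sudo chmod 640 /etc/passwd-s3fs

# Create a mount point and mount the shared bucket so every
# running instance sees the same files.
sudo mkdir -p /mnt/s3
sudo s3fs my-shared-bucket /mnt/s3 -o allow_other
```

The -o allow_other option is what lets non-root processes (like Tomcat) read and write the mounted bucket.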

The problem I ran into was that Amazon's Linux machine image doesn't include much of the stuff I need to compile custom code, so I was forced to install a metric crap-ton of packages just to get basic compilation done, and even then I had issues with dependencies on certain versions of packages that Amazon doesn't make available and you have to compile yourself.  Lots of work.  Well, lots of work UNLESS you find wonderful people who work hard and post their instructions for everyone's benefit (kind of why I am doing this, ya know?).  One such honorable person is Matthew Stump.  He figured out all the packages needed to compile s3fs and packaged them up into RPMs that install nicely on Amazon's Linux AMIs.  Check them out here: http://eclecticengineer.blogspot.com/2011/03/amazon-ami-linux-rpms-for-s3fs.html.

Quick note:  When you install your own packages, you may need to turn off yum's gpg checking.  To do that, edit /etc/yum.conf, and set gpgcheck to 0.
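If you'd rather not hand-edit the file, a one-liner does it (this assumes the stock gpgcheck=1 line; I keep a backup just in case):

```shell
# Back up yum.conf, then disable GPG signature checking for packages.
sudo cp /etc/yum.conf /etc/yum.conf.bak
sudo sed -i 's/^gpgcheck=1/gpgcheck=0/' /etc/yum.conf
```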

Ok back to the main topic:

Now that you have installed and configured all your custom packages and you are happy with your environment, you need to save it to your very own AMI to use with Beanstalk.  I found out at this point that this is not so easy.  Yes, yes, the nifty tools that are available for Eclipse make it easy to bundle Windows images, but they simply don't work in this case for the Linux image I want.  You have to do it the old-fashioned way, with the command line.

In order to save your own AMI, you need an Amazon S3 bucket.  Be smart: create a separate bucket for the disk image so you don't accidentally overwrite or delete part of it later.  Also, make sure your bucket name is all letters and numbers, no underscores.  For some reason, underscores caused me grief when using Amazon's API tools, triggering "v2 compatibility" bucket warnings.  Safer just to avoid underscores.

First of all, you need to install the API tools.  Grab a copy here: http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip (wget works great).  You can follow these directions to configure the tools (requires environment variables, etc.): http://linuxsysadminblog.com/2009/06/howto-get-started-with-amazon-ec2-api-tools/
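For reference, the environment setup from that guide boils down to something like this (the paths here are examples; point them at wherever you unzipped the tools and wherever Java lives on your instance):

```shell
# Environment variables the EC2 API tools expect.
export EC2_HOME=/home/ec2-user/ec2-api-tools
export JAVA_HOME=/usr/lib/jvm/jre
export PATH=$PATH:$EC2_HOME/bin
export EC2_PRIVATE_KEY=/mnt/pk-XXXXXXXXXXXXXXXXXXXXXXXX.pem
export EC2_CERT=/mnt/cert-XXXXXXXXXXXXXXXXXXXXXXXXXX.pem
```

Put these in your shell profile if you don't want to re-export them every session.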

I also grabbed an older copy of these tools which helped me out.  The older file is here: http://aws.amazon.com/developertools/368?_encoding=UTF8&jiveRedirect=1


Note the difference:  One is the Amazon aMi tools, and the other is the Amazon aPi tools.  I ended up needing both to get the job done.  I wish it were less confusing.  Seriously, I do.

Update on 3/19/2011 - I found out that the newest Amazon AMIs already have the API tools installed, which saves you some time.  They are all located in /opt/aws/bin/.

You will also need your X.509 cert.  When you sign up for Amazon Web Services, they give you an access key ID and a secret access key, but you can also create an X.509 cert.  Do it -- you will need the private key (pk) and certificate (cert) files to create your own AMIs.  Once you have those files, you will need them on your running EC2 instance.  I installed ftp (yum install ftp) so I could connect out to another account and grab my certs.  Once I had them, I put them in the /mnt directory on my running EC2 instance.  Get them on the server any way you prefer.

Now comes the magic, here is the consolidation of all my effort:  the command line entries that actually do the work.

1. To build the image:

cd /mnt

nohup sudo -E /home/ec2-user/ec2-ami-tools/bin/ec2-bundle-vol -r x86_64 -d /mnt -k /mnt/pk-XXXXXXXXXXXXXXXXXXXXXXXX.pem -c /mnt/cert-XXXXXXXXXXXXXXXXXXXXXXXXXX.pem -u ############ &

Notes: The reason I use "nohup" is so that this runs in the background, so even if I log out, or the ssh connection gets dropped (happened to me many times), it will keep running until it is done.  It generates a text file in your user directory called "nohup.out" that has all the output of the process.  The pk-XXXX.pem and cert-XXXX.pem are the X.509 certs you got from the AWS security credentials page and that you put on the server.  The ######## at the end is your Amazon account ID.  If you log into your AWS account, it is displayed with dashes, like this: ####-####-#### (where #'s are numbers of course).  The number you use in this command is the same number, just without the dashes. I know it's confusing.

This can take quite a while to complete in my experience if you are using one of Amazon's small EC2 instances with limited CPU.  Much faster on a normal or larger instance.

You MUST wait until this process finishes before you move on.  Check the nohup.out file for progress.  Errors will be obvious.  Successful completion will be obvious too, and will result in many image.xxx files in the /mnt directory.
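A couple of commands I used to keep an eye on it (file locations follow the bundling command above, which was run from /mnt):

```shell
# Peek at the bundler's latest output.
tail -n 20 nohup.out

# A finished bundle leaves a manifest plus many numbered parts in /mnt.
ls -lh /mnt/image.manifest.xml /mnt/image.part.*
```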

2. Upload the image files to your S3 bucket:

nohup sudo -E /home/ec2-user/ec2-ami-tools/bin/ec2-upload-bundle -b YOURBUCKETNAME -m /mnt/image.manifest.xml -a ACCESS_KEY_ID -s SECRET_ACCESS_KEY &

Same deal with this command.  nohup makes sure it doesn't time out, even though your SSH session will.  

Replace "YOURBUCKETNAME" with the name of your Amazon S3 bucket, and remember, no underscores or spaces in the name of your bucket, otherwise you will get unusual "v2 compatibility" errors.

Replace "ACCESS_KEY_ID" with your Amazon Access Key ID, and replace "SECRET_ACCESS_KEY" with your Secret Access Key. 
Check "nohup.out" for progress.  This usually doesn't take too long.  Maybe 5 minutes.

You can also log into the Amazon S3 console and verify that you have image.xxx files (lots of them).

3.  Register your custom image with Amazon EC2

This is the process that tells Amazon where your disk image is, and associates it with your account so you can create instances using it.  


sudo -E /home/ec2-user/ec2-api-tools/bin/ec2-register YOURBUCKETNAME/image.manifest.xml -K /mnt/pk-XXXXXXXXXXXXXXXXXX.pem -C /mnt/cert-XXXXXXXXXXXXXXXXXXXXX.pem

Note that we don't use "nohup" here.  The reason is that this command usually runs and completes quickly (within a minute).  You certainly can use nohup (and the & at the end); there's no harm in doing so.

Just like previous commands, substitute your bucket name, and your pk and cert file names. 

When this completes, it will display the name of your custom image like so:

IMAGE   ami-XXXXXXXXX

If you want to write it down, you can; however, it will also show up in your Amazon console after successful registration.

Wrapping up:
Now that your image is created and registered, you can launch a new EC2 instance with it.  More importantly, you can use it for Elastic Beanstalk.  The way you do it is a little strange, but it works.
  1. Launch your Beanstalk application (via Eclipse, or via Amazon console).  Wait until it is in "ready" state.
  2. Expand "Environment Details" and click on "Edit Configuration."  The configuration window will appear.
  3. In the Custom AMI ID input box, enter the ami-XXXXXXX identifier of the custom image you registered in step 3 above.  Click Apply and you are good to go!
In summary, we've gone over some basics about Elastic Beanstalk, and how to boot, customize, and create a custom AMI from Amazon's Beanstalk machine image.
Have fun!  If you enjoyed this article, Tweet about it, post it on Facebook, and let people know about it!
