Amazon Web Services, or more simply AWS, provides a wide range of web services for building technical infrastructure. It's not a replacement for having an ISP (or a data center, if your firm is that large), but it's a great way to avoid spending too much money on infrastructure before you even have a proven business model. It's also a brilliant way to scale up resources on demand, even if – and sometimes, especially when – you already have a successful business model. In this article, let's introduce the web services available from Amazon, discuss a bit about use cases, then talk about business strategy for leveraging AWS.
Have you ever had a great idea for a new tech business, but weren't sure what to do next? Write a business plan? Set up a meeting with a venture capital firm? Launch a web site? Getting from A to B can become enormously expensive, when you start to think about buying servers, hiring sysadmins, leasing facilities, etc., before you can even demonstrate your idea running on the Internet.
Amazon considered that problem. For a few years at the start of this decade – post Bubble-bursting – they analyzed what it cost for new businesses to launch on the Internet, where the risks seemed to hit worst. They took their internal technology infrastructure and re-examined what parts could be rolled out as public services. Associates (ECS) was launched in 2002, then much of what people recognize now as AWS really began to catch on in 2006. More services roll out just about every quarter, it seems.
So about your great idea for a new tech business... Do it. If you have a web app, then go prototype it using Ruby on Rails, or Django, or any number of PHP frameworks. Build it on your laptop, upload code to EC2, save an AMI, replicate as needed. (Don't worry, we'll catch up on these acronyms in a bit here.) Or, if your business idea involves a lot of data crunching, maybe something like Hadoop, again you can build on your personal computer, then upload and run it at scale on EC2. The bottom line is: pay only for what you use, and scale up to enterprise size on demand.
One minor thing you need to know beforehand: Amazon.com staff abbreviate terms and even acronyms using numbers. Have you seen how the term internationalization sometimes gets abbreviated as "I18N"? They do that. A lot. Apparently, it comes from the top; it's a geek thing, and at Amazon.com even the business types are dyed-in-the-wool tech geeks, all the way up to the CEO. So the AWS web service names follow similar spelling conventions: instead of saying Simple Storage Service, they use the term S3. Having a good decoder ring might help make reading their documentation a wee bit simpler. Ahem.
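As a decoder ring of sorts, here is a minimal Python sketch of both conventions: the "I18N" style (first letter, count of middle letters, last letter) and the "S3" style (repeated initials collapsed into a letter plus a count). The function names are made up for illustration; nothing here comes from Amazon.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + number of middle letters + last letter."""
    if len(word) <= 3:
        return word  # too short to be worth abbreviating
    return f"{word[0]}{len(word) - 2}{word[-1]}"


def initials_with_counts(name: str) -> str:
    """Take the initial of each word, collapsing consecutive repeats into a count."""
    runs = []  # list of [letter, run_length]
    for word in name.split():
        letter = word[0].upper()
        if runs and runs[-1][0] == letter:
            runs[-1][1] += 1
        else:
            runs.append([letter, 1])
    return "".join(l if n == 1 else f"{l}{n}" for l, n in runs)


print(numeronym("internationalization").upper())       # I18N
print(initials_with_counts("Simple Storage Service"))  # S3
print(initials_with_counts("Elastic Compute Cloud"))   # EC2
```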
Let's take a tour through their current set of web services...
Associates
Amazon Associates, formerly called ECS, was the first web service offering from Amazon. Use it in conjunction with the Amazon Associates Program to create web storefronts. In other words, your site acts as a reseller that refers customers to products sold by Amazon and its third-party vendors. Earn up to 8.5% in referral fees – which can be quite good money, overall.
EC2
Elastic Compute Cloud provides resizable compute capacity. Servers on demand, rented per hour. Grid computing, clouds, or – as they say on Sand Hill Road – "Google scale". You can configure a Linux server, then store it as an image (Amazon Machine Image, or AMI) and launch as many as you need. It takes only a few minutes to launch a new server, and you can stop it anytime you like. You pay by the hour, paying only for capacity that you actually use.
Small Instance (default)
- $0.10/machine/hour, plus data transfer
- 1.7 GB memory, 1 32-bit virtual core with 1 compute unit, 160 GB disk
Large Instance
- $0.40/machine/hour, plus data transfer
- 7.5 GB memory, 2 64-bit virtual cores with 2 compute units each, 850 GB disk
Extra Large Instance
- $0.80/machine/hour, plus data transfer
- 15 GB memory, 4 64-bit virtual cores with 2 compute units each, 1690 GB disk
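To get a feel for what pay-by-the-hour means in practice, here is a rough Python sketch that multiplies out the hourly rates listed above. The tier keys ("small", "large", "extra_large") and the function name are labels chosen for this sketch, and data transfer charges are deliberately left out.

```python
# Hourly compute rates in USD, taken from the price list above.
HOURLY_RATE_USD = {"small": 0.10, "large": 0.40, "extra_large": 0.80}


def ec2_compute_cost(instance_type: str, instances: int, hours: float) -> float:
    """Estimate compute cost only; data transfer is billed separately."""
    return round(HOURLY_RATE_USD[instance_type] * instances * hours, 2)


# One small server left running for a 30-day month (720 hours).
print(ec2_compute_cost("small", 1, 720))

# Ten large servers spun up for a single 8-hour crunching job.
print(ec2_compute_cost("large", 10, 8))
```

The second call is the interesting one for a start-up: a burst of ten servers for a working day costs about the same as keeping one large server up for a little over three days.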
Are there any downsides? Sure, EC2 is not a direct replacement for running your own data center – if you need that. For starters, there are no dedicated IP addresses (they're working on it). You'll need to arrange for dynamic DNS. That happens to be a very good way to combine the strengths of a local ISP and EC2, by resolving your DNS from the local ISP and pointing into the cloud.
Another downside is that while you do get root on your virtual server, you never really know where the actual hardware lives, and you face some restrictions when it comes time to make kernel mods. Amazon documentation keeps repeating the refrain about a hypothetical "circa 2007 1.7 GHz Xeon processor", but that may be the most you'll ever learn about their data centers. (I hear they're stacked in shipping containers near hydroelectric plants, but that's just rumor – albeit a rather "cleantech" rumor, at that.) My dev team ran into problems with some special uses of MySQL that required kernel changes. Hadoop won't run on the standard AMI, but there are special Hadoop AMIs which work well. Generally there are work-arounds, but be forewarned that you're not going to walk into any data center and push a reboot button on an EC2 instance. Ever.
Still, I find that EC2 runs better than other Linux virtualization systems that I've tried. Tough to beat that price-point, too.
S3
Simple Storage Service allows you to write, read, and delete objects in "buckets". Each object can range from 1 byte to 5 GB in size, with an unlimited number of objects per bucket. Think of S3 as a very large hashtable – think of key/value pairs – persisted to disk and replicated across several different data centers. Use REST or SOAP operations to read and write objects. You can also use BitTorrent for streaming media out of S3.
One application is to store large objects (LOBs) in S3 instead of in your database. Or serialize large objects directly out of your middleware. A good approach is to use some transport language like XML or JSON to encode data objects, so they can be used directly in an AJAX call by a web client. An even better idea is to put a Distributed Hash Table (DHT) in front as a kind of middle-tier cache for objects persisted out to S3. Read more about that in Amazon's paper about their Dynamo project.
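The hashtable-plus-cache idea can be sketched in a few lines of Python. Everything below is an in-memory stand-in: FakeBucket is a hypothetical class that mimics only the write/read/delete behavior and size limits described above (it makes no network calls and is not Amazon's API), and ReadThroughCache uses a plain local dict where a real deployment would use a distributed hash table.

```python
import json

MAX_OBJECT_BYTES = 5 * 1024**3  # assumes the 5 GB limit means 5 * 2**30 bytes


class FakeBucket:
    """In-memory stand-in for a bucket: a mapping from key to bytes."""

    def __init__(self):
        self._objects = {}
        self.reads = 0  # counts how often the slow store is actually hit

    def put(self, key: str, body: bytes) -> None:
        if not 1 <= len(body) <= MAX_OBJECT_BYTES:
            raise ValueError("object must be between 1 byte and 5 GB")
        self._objects[key] = body

    def get(self, key: str) -> bytes:
        self.reads += 1
        return self._objects[key]

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)


class ReadThroughCache:
    """Middle-tier cache: serve from memory, fall back to the bucket on a miss."""

    def __init__(self, bucket: FakeBucket):
        self._bucket = bucket
        self._cache = {}

    def put_json(self, key: str, value) -> None:
        body = json.dumps(value).encode("utf-8")
        self._bucket.put(key, body)  # persist first, then cache
        self._cache[key] = body

    def get_json(self, key: str):
        if key not in self._cache:
            self._cache[key] = self._bucket.get(key)
        return json.loads(self._cache[key].decode("utf-8"))


bucket = FakeBucket()
cache = ReadThroughCache(bucket)
cache.put_json("users/42", {"name": "Ada", "plan": "free"})
print(cache.get_json("users/42"), bucket.reads)
```

Because the values are stored as JSON, whatever comes back from get_json can be handed straight to a web client without another translation step.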
How much, you ask? $0.15/GB/month, plus data transfer. That's almost getting cheaper than buying a disk upgrade for your laptop, byte by byte.
One thing to consider: how do you manage your business requirements for off-site backups? I know that our business insurance certainly requires that kind of practice in place. Frankly, with S3 you must manage it yourself, and that implies costs for network transfer. One suggestion to Amazon would be to provide alternate means to bring S3 data out at a reduced speed, trickled out for backups.
SQS
Simple Queue Service provides a way to manage message queues – up to 8 KB of text data per message – which can scale arbitrarily large. It's inexpensive and highly reliable. This is my favorite part of AWS, and potentially the most valuable to Internet entrepreneurs.
If you're familiar with MQSeries from IBM, you understand what this provides. For example, when I was working in banking software, MQSeries could harden gigabytes of transaction data reliably, while we waited to have new mainframes booted on the other side of the queue. That may be only 25 words of description, but in practice it was a nightmare made simple through an amazing IBM technology.
SQS follows that pattern. Integrating SQS into a web app is a little bit of a cognitive stretch for many developers; you really must embrace a different mindset. Once you do, you probably won't be thinking in terms of MVC design patterns much longer.
Costs? $0.000001/message, plus data transfer. In other words, $1 per million messages processed.
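The change in mindset is easier to see in code. In the Python sketch below, the standard library's queue.Queue stands in for an SQS queue (it is local and in-memory, so it shows the pattern, not the service), and the function names are invented for illustration. The web-facing side only enqueues a small text message and returns; a separate worker drains the queue whenever it is ready.

```python
import queue

MAX_MESSAGE_BYTES = 8 * 1024  # the 8 KB per-message limit quoted above


def enqueue(q: queue.Queue, text: str) -> None:
    """Producer side: reject oversized messages, otherwise queue and return."""
    if len(text.encode("utf-8")) > MAX_MESSAGE_BYTES:
        raise ValueError("message exceeds 8 KB")
    q.put(text)


def drain(q: queue.Queue, handler) -> int:
    """Consumer side: process messages until the queue is empty."""
    handled = 0
    while True:
        try:
            message = q.get_nowait()
        except queue.Empty:
            return handled
        handler(message)
        handled += 1


def message_cost_usd(count: int) -> float:
    """$1 per million messages, excluding data transfer."""
    return count / 1_000_000


work = queue.Queue()
enqueue(work, "resize upload 1001")
enqueue(work, "send welcome email to user 42")

done = []
print(drain(work, done.append), done)
print(message_cost_usd(250_000))
```

The producer never waits on the consumer, which is why the queue can absorb a backlog while new workers are still booting on the other side.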
SimpleDB
SimpleDB might be described as a cross between a relational database and a spreadsheet, except that each cell may have different attributes. The whole shebang gets indexed automagically, then you can perform queries, joins, intersections, etc. – in very large quantities. Think of it as a good place to store pointers and metadata for those large objects stored in S3. You can hit the SimpleDB part of an object first, run its metadata through your business logic, then determine whether or not you want to stream out gigabytes from, say, some video you stored in an S3 bucket.
How well does that compare with Oracle licenses? At a mere $0.14/machine/hour, plus $1.50/GB/month, plus data transfer, and with less headache about scaling and performance issues, it does look rather compelling.
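The metadata-first pattern might look something like the Python sketch below. Plain dicts stand in for SimpleDB items, and the item names, attribute names, and both functions are hypothetical; the point is only that the small metadata record gets consulted before anything large is streamed out of S3.

```python
# Hypothetical metadata items. As in a spreadsheet with free-form cells,
# items do not need to share the same set of attributes.
ITEMS = {
    "video-001": {"bucket": "media", "key": "talks/intro.mp4",
                  "size_bytes": 2_500_000_000, "access": "members"},
    "video-002": {"bucket": "media", "key": "clips/teaser.mp4",
                  "size_bytes": 40_000_000, "access": "public",
                  "language": "en"},
}


def select(items: dict, **conditions) -> list:
    """Return the names of items whose attributes match every condition."""
    return sorted(
        name for name, attrs in items.items()
        if all(attrs.get(k) == v for k, v in conditions.items())
    )


def object_to_stream(items: dict, name: str, viewer_is_member: bool):
    """Run business logic on metadata; return (bucket, key) or None."""
    attrs = items[name]
    if attrs["access"] == "members" and not viewer_is_member:
        return None  # skip the expensive transfer entirely
    return attrs["bucket"], attrs["key"]


print(select(ITEMS, access="public"))
print(object_to_stream(ITEMS, "video-001", viewer_is_member=False))
print(object_to_stream(ITEMS, "video-001", viewer_is_member=True))
```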
FPS
Having worked around e-commerce for 15 years, I think Flexible Payments Service has got to be one of the more ingenious parts of AWS. It's a web service that simplifies the process of taking money reliably (for you, the seller) and conveniently (for your customer). It handles payment the same way people check out on Amazon purchases – which is pretty much guaranteed to be familiar to consumers. You can program rules about billing, create recurring fees, add a Pay Now widget to any web site, or set up to handle micro-payments which would be prohibitively expensive through most financial processors.
You may be able to negotiate a better rate with your bank – once you have sales established – if you're not concerned about little nuances like chargebacks, fraud, etc.
MTurk
Mechanical Turk is described as "a marketplace for work that requires human intelligence" or an "elastic workforce". See the Wikipedia entry for more background about the name. You describe a kind of task, called a "HIT", then people sign up to perform your HITs. Amazon takes a 10% commission on what you pay, with a minimum commission of $0.005 per HIT.
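For budgeting, the fee structure works out roughly as in the Python sketch below. It assumes the commission is 10% of the reward you offer per HIT, never less than $0.005 per HIT, and is charged on top of the reward; the function names are made up, and Decimal is used so that fractions of a cent do not pick up floating-point noise.

```python
from decimal import Decimal

COMMISSION_RATE = Decimal("0.10")   # 10% of the reward (assumed basis)
MIN_COMMISSION = Decimal("0.005")   # minimum commission per HIT, in USD


def hit_fee(reward: Decimal) -> Decimal:
    """Commission for a single HIT: 10% of the reward, with a floor."""
    return max(reward * COMMISSION_RATE, MIN_COMMISSION)


def batch_cost(reward: Decimal, hits: int) -> Decimal:
    """Total outlay for a batch: worker rewards plus commission."""
    return (reward + hit_fee(reward)) * hits


# 1,000 two-cent HITs: the minimum commission applies to each one.
print(batch_cost(Decimal("0.02"), 1000))

# 100 one-dollar HITs: the 10% rate applies.
print(batch_cost(Decimal("1.00"), 100))
```

Note how the floor dominates for very cheap tasks: at two cents per HIT, the commission is a quarter of the reward rather than a tenth.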
This is the kind of service which has been used to coordinate massively collaborative search for downed planes and lost ships, such as the search for Steve Fossett.
Alexa
There are also four Alexa web services, primarily for web analytics: Site Thumbnail, Top Sites, Web Information Service, and Web Search. I've used them a little commercially, and they are quite different from the other AWS offerings. Probably best to reserve that discussion for a follow-up article.
Strategy...
Let's repeat this point, because it's important: the strategy for leveraging AWS is to build out your engineering so that operations can scale from Day One. Get running quickly, and pay only for what you use. Get early feedback from testers and customers about your business ideas. Most importantly: save your seed equity for expenditures which are more important than technology infrastructure.
What could be more important in a tech start-up than technology costs? The following items top my list:
- Legal fees for NDA, HR, contracts, financing, etc.
- Health insurance and family benefits for employees
- Patent and trademark filings, which grow increasingly complex internationally
- SG&A to establish initial sales
AWS has been crafted by some of the most successful people working in e-commerce to give you exactly that kind of quick-start advantage. And why not? Ask yourself, would you rather earn a few hundred dollars each month from multitudes of new tech start-ups, early in their growth curve, assisting them to grow larger (and spend more)... or would you prefer to wait until a start-up has proven itself in the marketplace, then try to sell e-commerce and fulfillment services? Amazon has been active in both areas, but I have a hunch that the former strategy (AWS) earns more revenue over time.
We can discuss more later about system architecture, design patterns, and how to build infrastructure that scales up and scales down on-demand. Meanwhile, keep in mind that AWS can provide you with enabling services so that your new business idea has a much better path toward becoming a success.
2 comments:
Paco,
Good stuff. I'm sure there are a lot of folks out there that have run across the mention of Amazon Web Services, maybe even EC2, S3, etc. and shrugged it off figuring it was a way to build an on-line book store or some experimental thing...which I suppose it was. :-)
One bit I'd add is that there are real companies using this; it's not just an "oh, that's sort of neat". While using EC2, S3, etc. directly is not for the faint of heart, there are companies that have already built more accessible services on top.
Red Hat's Cloud Computing offering is just one of the most notable recent examples utilizing AWS:
http://www.redhat.com/solutions/cloud/
I've seen AWS used by both start-ups and mature companies, for everything from prototyping and starter production use to off-site data backups, disaster recovery, overflow traffic handling (e.g. the Slashdot effect), and full-scale production use.
Some might be interested, to get ideas and wrap their heads around the possible applications in their own organizations, in perusing the AWS success stories:
http://www.amazon.com/Success-Stories-AWS-home-page/b?ie=UTF8&node=182241011
Sh**... Amazon helped destroy Fringeware.