Tuesday, April 8, 2008

Amazon Web Services has another outage

Amazon's cloud computing service was down on Monday morning for more than an hour, following an outage on its hosted storage service two months ago.
While Amazon appears to have learned some lessons since the previous outage, the incidents underscore the immaturity of the services, an analyst said.

"In terms of Amazon, what you need to know is that this is very new," said Phil Shih, an analyst with Tier 1 Research, a division of The 451 Group. "It's not something they've perfected. Because of this, we don't advise anybody to use this for anything mission-critical."

Amazon's Elastic Compute Cloud is a Web service that offers hosted computing. Users can quickly scale up or down the amount of processing power that they need, based on their changing requirements.

On Monday at around 2 a.m. Pacific Time, the first EC2 customer reported on Amazon's Web services forum that there were problems accessing the service. Others quickly chimed in.

Within 15 minutes, an Amazon employee acknowledged reading about the problems and said the company was investigating them. That note, and subsequent messages at regular intervals, seemed to placate some customers. "Not all doom and gloom," one person wrote on the forum. "It should be noted that [Amazon Web Services] are keeping us up to date... 10 out of 10 for communication. Bravo!"

That's a very different response from the one customers had after the S3 outage in mid-February, when some users were quite angry at the lack of acknowledgement and information from Amazon about the outage, which lasted for as long as three hours.

At 3:21 a.m. Pacific Time on Monday, the first customer posted a note saying that the EC2 service was back up. Others followed. On the forum, Amazon said it would post more details about what caused the problem, but it had not done so by Monday afternoon. An Amazon spokesman said he was working to get answers to questions about the outage.

Still, improvements in communication don't change the reliability of the services. Shih recommends that companies only consider using Amazon's Web services for small internal development products, where a company can absorb the risks and potential downtime.

But that recommendation could change in the future. "Do I expect them to raise their game and get better over time? Absolutely," Shih said. "They're pouring resources into this, and they're serious about it."

While these types of outages are a black eye for Amazon, they likely don't cost the company in terms of service level agreement payouts, Shih said. Late last year, Amazon created an SLA that lets companies apply for credits in the event of an outage. "Most people won't bother to get their money back," Shih said. "It's such a small amount, and it requires more paperwork to get the credit." But an SLA is something Amazon has to offer in order for companies to consider it a true enterprise-class service, he said.
