DynamoDB Integrates With Amazon Elastic MapReduce

By Darryl K. Taft  |  Posted 2012-01-18


Selipsky said Amazon DynamoDB also integrates with Amazon Elastic MapReduce (Amazon EMR). Amazon EMR allows businesses to perform complex analytics on their large datasets using a hosted, pay-as-you-go Hadoop framework on AWS. With the launch of Amazon DynamoDB, customers can easily use Amazon EMR to analyze datasets stored in DynamoDB and archive the results in Amazon Simple Storage Service (Amazon S3), while keeping the original dataset in DynamoDB intact. Businesses can also use Amazon EMR to access data in multiple stores (such as Amazon DynamoDB, Amazon RDS and Amazon S3), perform complex analysis over the combined dataset and store the results in Amazon S3.

"A lot of what we've been doing at AWS for years has been trying to help developers spend less time with the complex management of infrastructure that is not necessarily differentiating to their businesses," Selipsky said. "Nowhere is that need more pressing than in the area of databases. Databases traditionally involve a lot of complexity and difficulty in scaling workloads, and incurring a lot of costs or involving downtime for applications. So DynamoDB is aimed squarely at removing all of that muck and providing very predictable performance and high scalability, all without requiring any intervention or management from customers. And the customers we've been working with are excited about that."

"Elsevier is a $3 billion enterprise that provides science and health information to more than 30 million scientists, students and medical professionals worldwide," said Darren Person, chief architect of Elsevier, in a statement. "Each year we publish thousands of books, nearly 2,000 journals and more than 250,000 articles, which means our datasets are constantly and rapidly changing. We are always evaluating new technologies that will enable us to handle our large, varying workloads. Operating a distributed data store on our own is orders of magnitude more complicated and expensive to manage than traditional databases. DynamoDB delivers a high-performance service that can be easily scaled up or down to meet our needs, helping us eliminate complexity and lower costs."

"DynamoDB is a truly revolutionary product which allows SmugMug to finally realize its goal of being 100% cloud-based," added Don MacAskill, CEO of SmugMug, in a statement. "I love how DynamoDB enables us to provision our desired throughput, and achieve low latency and seamless scale, even with our constantly growing workloads. Even though we have years of experience with large, complex architectures, we are happy to be finally out of the business of managing it ourselves, and to be using DynamoDB to get even higher performance and stability than we can achieve on our own. Most importantly, DynamoDB allows SmugMug to spend even more time and energy on what really matters-our product and customer experience."

"DynamoDB solves our problem of distributing and storing high-volume writes in a straightforward and cost-effective way," said Rob Storrs, head of engineering at Formspring, in a statement. "Our rapid growth meant that we were spending significant resources managing our own large-scale database systems.  DynamoDB gives us low latency and easy scalability, which allows us to keep our costs low and our engineers focused on building what our customers want.  It's another example of AWS listening to their customers and building services that solve real problems."

"Prior to Amazon DynamoDB, many of our customers were forced to spend weeks forecasting, planning, and preparing their database deployments to perform well at peak loads," said Raju Gulabani, vice president of Database Services at AWS, in a statement. "DynamoDB makes those processes obsolete. Now businesses can quickly add capacity with a few clicks in the management console. During our private beta, we saw customers successfully scale up from 100s of writes per second to over 100,000 writes per second without having to change a single line of code. This level of elasticity, coupled with consistent performance, reduces the cost and the risk of building a fast-growing application."

As mentioned earlier, Vogels said DynamoDB is the result of 15 years of learning. More specifically, it is related to an internal technology known as Dynamo that the company began writing about seven or eight years ago, Vogels said. DynamoDB is a follow-on to that research with input from some other areas, he said.



Darryl K. Taft covers the development tools and developer-related issues beat from his office in Baltimore. He has more than 10 years of experience in the business and is always looking for the next scoop. Taft is a member of the Association for Computing Machinery (ACM) and was named 'one of the most active middleware reporters in the world' by The Middleware Co. He also has his own card in the 'Who's Who in Enterprise Java' deck.
