With a theme of “Giant Ideas,” MongoDB is breaking new ground as a hot NoSQL database. This week at its MongoDB World conference, the company declared that its database enables developers to launch their giant ideas and create big things. As part of that, MongoDB announced Atlas, its new database-as-a-service offering, as well as a new MongoDB Connector for Apache Spark. That connector enables developers and data scientists to gain insights from live, operational and streaming data. Meanwhile, the company continues to gain momentum in the database world and is picking up new customers, many of which have come from traditional SQL databases like Oracle and Microsoft’s SQL Server. Many of its user organizations are .NET shops. “The whole raison d’etre of MongoDB is to unleash developer productivity,” Dev Ittycheria, MongoDB’s president and CEO, said during his keynote at the show.
In an interview from MongoDB World, Kelly Stirman, vice president of strategy at MongoDB, talks to eWEEK about what “giant ideas” got the company to this point and where it’s going from here.
What’s the hot new thing coming from MongoDB for this event?
We announced a bunch of things, but I think the thing that’s most interesting is Atlas, this new database as a service. An analogy I’ve been using is when you think about transportation, you can own your car and you have to deal with insurance and keeping fuel in the car and a whole lot of other issues. But you could also have something like Uber where you don’t need to own a car and you can just click a button and a car shows up and they drive you where you need to go. You don’t have to worry about navigating or anything; you just sit back and focus on what matters to you.
So for years now we’ve had products that matter to people who own their own car, products that make it easier and safer and all that good stuff. But this is a new product that is more like Uber, where you just let us take care of things for you and you just pay by the hour for exactly how much you use. Some people prefer this. Think of developers trying to get their idea off the ground who just want somebody to take care of the infrastructure for them. That’s this service.
Think of a really big company that is all about owning their own cars, but they might decide that this is good for the next application or maybe for their development and test environments. And they’re going to use both. They’re going to own their own car and they’re going to use Uber for some things as well. That’s kind of how I think about it. We’re really excited about it. It’s the first time we’re launching a service like this. It starts on AWS, but we’ll have it available on Azure and Google within the next year or so.
Are there plans to support any other cloud providers?
Not right now. Those are the three that really seem to matter to the people we talk to.
Why did it take you so long to get into the DBaaS arena? There have been other companies that have made a good showing in the MongoDB-as-a-service space.
I have two answers. One is this is the culmination of something we started five years ago, where we started with a monitoring service that got to about 60,000 users. Then we added a fully managed backup service where you pay by the gigabyte for backups. Then we created an automation service to do installations, upgrades and config changes that don’t take your database offline. So what this does is take all of that work that we’ve been developing for the past four or five years, plus all the experience we have helping people manage their infrastructure through our support, and build software that manages not just MongoDB but the underlying infrastructure as well, using what we think are the best configurations and security for MongoDB, and bundle that into this nice service. So it’s something we’ve been working on for a long time.
Where Is MongoDB Taking Its Giant Ideas?
But you’re right; other people did it. Why couldn’t we have done it sooner? I think that really just boils down to being a company that’s relatively small and focusing on first building a database and building great commercial products. And now we’re going after what for us is a new segment of the market, which includes people who are not going to spend thousands of dollars per server per year, but if you bundle the whole thing up with the underlying infrastructure, then that starts to make a whole lot more sense to them.
There seemed to be a big reception for this service. Was this move demand-driven? What caused you to do this now?
If you look at the market, just look at the companies you alluded to. Others have been doing this. I think there are four companies to think about. One is IBM, which has a product called Compose. Rackspace has a product called ObjectRocket. There’s an independent company called mLab. And then there’s Parse, which was acquired by Facebook a few years ago and which announced several months ago that it is winding down that service, so people have to find a new home. Parse itself has half a million apps in production. mLab has 300,000. I don’t know how many Compose and ObjectRocket have. But it’s somewhere between 800,000 and a million apps running on comparable services for MongoDB. So there is clearly demand. We invited about 2,000 customers to try out Atlas on a private beta program and there was overwhelming enthusiasm, like, yes, finally, what took you guys so long? So I think there is plenty of demand.
Where we have a limitation today is that we’re not on Azure and we’re not on Google. And we’re not on all Amazon regions, but we’ll get there. We’ll get to additional regions on Amazon in the next few months and the other cloud platforms later. But we know that just with the four regions we’re launching on, there’s lots and lots of opportunity for us.
What was the impetus for the new Spark Connector?
There’s a lot of interest in using Spark with MongoDB. What that’s about is, if you think about the way people use MongoDB with Hadoop, they’ve got different operational systems and their data moves through ETL [Extract, Transform and Load] or some other process into Hadoop. Then people start to run analytics on it. And maybe Spark is faster than MapReduce, but it still takes all this time to move data out of the operational system into Hadoop. But people are saying that with the kind of machine learning and analytics that they’re doing on data today, they want to move some of that to run on the operational data as it’s being created. And that’s the demand for using Spark with MongoDB. So last year, we took our connector for Hadoop and enhanced it so that it would be compatible with Spark. We learned a lot and decided there’s enough interest there to make an engineering investment in a dedicated connector for Spark.
I wouldn’t be surprised if it is as popular as the Hadoop connector, if not more popular than that several months from now. Clearly in the Hadoop community there is a lot of focus on Spark these days. I think it will be quite popular, but give us a couple of months to see what the data looks like.
What’s big in MongoDB 3.4 that we will see later this year?
We previewed a couple of things in 3.4. One of them is graph technology, which I think will be really interesting to some folks. What’s going to be nice about graphs in MongoDB is you’re going to be able to take advantage of all the other capabilities such as availability and scalability and security and so forth, where it seems like the graph databases out there are less far along in those areas than MongoDB. So we’ll have some of the core graph analytical capabilities in the database, but we won’t have everything a dedicated graph database has. So graph is one thing.
Another new thing is faceted navigation. If you’ve ever shopped on Amazon or any e-commerce site and you could narrow your search results by size or color or price, that’s faceted navigation. Lots of people who build apps on MongoDB want faceted navigation and today they do that by adding Elasticsearch or some other search technology like Solr. And that means more servers, more things to manage and more complexity. And there’s always some delay between the search index and the actual database, so we think that faceted navigation should just be in the database. You should be able to do great searches in the database directly without standing up a whole other cluster of machines.
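As a rough illustration of the faceted navigation Stirman describes, the sketch below counts how many items fall under each facet value and narrows a result set by a chosen value. It is plain Python over an in-memory list, not MongoDB's implementation or API; the catalog, field names and helper functions (`facet_counts`, `narrow`) are all hypothetical.

```python
from collections import Counter

# Hypothetical product catalog; names, fields and values are illustrative only.
PRODUCTS = [
    {"name": "tee", "color": "red", "size": "M", "price": 15},
    {"name": "tee", "color": "blue", "size": "L", "price": 15},
    {"name": "hoodie", "color": "red", "size": "L", "price": 40},
    {"name": "cap", "color": "blue", "size": "M", "price": 12},
]


def facet_counts(items, fields):
    """Count how many items carry each value of each facet field."""
    return {field: Counter(item[field] for item in items) for field in fields}


def narrow(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in selected.items())
    ]


# Facet counts a shopper would see next to the full result list.
all_counts = facet_counts(PRODUCTS, ["color", "size"])

# After the shopper clicks "red", recompute the size facet on the narrowed set.
red_items = narrow(PRODUCTS, color="red")
red_size_counts = facet_counts(red_items, ["size"])
```

A search engine or database would compute these counts from an index rather than by scanning a list, but the result shown to the shopper has the same shape: one count per facet value, recomputed each time the selection narrows.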
Then the most interesting thing to me was what we call zone sharding. Sharding is a capability that all NoSQL databases have, which is the idea that you can scale out horizontally. But the way that pretty much everyone does it is they randomly assign the data to different servers. The truth is there are a lot of cases where you want to control exactly which servers the data is on for a few different reasons. One example is data sovereignty. If you’re running a global deployment, there are some countries that require that the data not leave the data centers of that country.
Another example is what some companies call “multi-temperature storage.” So you have really high-performance storage for your “hot” data that you’re accessing frequently and then lower-cost storage for different access patterns. With zone sharding you can assign data to different physical data centers or to different types of storage. A third use case is if you want to ensure a great experience with low-latency writes and reads for users depending on where they live. So your New York users can read and write their data really fast and your California users can read and write their data really fast. You understand where those users are and if one goes on vacation, you can assign them to the other group.
So this concept of zone sharding lets people take MongoDB for global deployments, for multi-temperature storage, for data sovereignty requirements and address those out of the database, which is really exciting and which is really charting new territory in terms of how people are running global databases. No other NoSQL product is going to give you that kind of functionality.
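To make the contrast with random placement concrete, here is a minimal sketch of the idea behind zone-based placement: each user is pinned to a zone, and data is spread only across the shards inside that zone. This is a conceptual toy in plain Python, not MongoDB's sharding implementation or API; the zone names, shard names and functions (`shard_for`, `reassign`) are assumptions made for illustration.

```python
import zlib

# Hypothetical zones and the shards that belong to each one.
ZONE_SHARDS = {
    "new_york": ["ny-shard-1", "ny-shard-2"],
    "california": ["ca-shard-1", "ca-shard-2"],
}

# Hypothetical assignment of each user to a home zone.
user_zone = {"alice": "new_york", "bob": "california"}


def shard_for(user_id):
    """Pick a shard inside the user's zone, never outside it."""
    shards = ZONE_SHARDS[user_zone[user_id]]
    # crc32 is a stable hash, so the same user always lands on the same shard.
    return shards[zlib.crc32(user_id.encode("utf-8")) % len(shards)]


def reassign(user_id, zone):
    """Move a user to another zone, for example while they are traveling."""
    if zone not in ZONE_SHARDS:
        raise KeyError(f"unknown zone: {zone}")
    user_zone[user_id] = zone


home_shard = shard_for("alice")      # one of the New York shards
reassign("alice", "california")      # alice goes on vacation
travel_shard = shard_for("alice")    # now one of the California shards
```

The same pattern would cover the other two use cases in the interview by changing what a zone stands for: a country's data centers for data sovereignty, or a storage tier for multi-temperature storage.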
What can you say about future directions for MongoDB?
Well, I think Atlas is very exciting and clearly we’re going to keep building out where you can take advantage of Atlas. I’m not announcing anything, but to me intuitively, if you look at running an app there’s a lot of moving parts. What we’ve tried to do with Atlas is take a bunch of those moving parts and simplify them into a service that you just pay for by the hour. There are still layers of the stack that we could build into that service. And those exist from other technologies; they exist from each of the cloud providers that have their own pieces of the stack that we could potentially integrate with. So I think there’s more opportunity for us to expand where you can take advantage of Atlas, but to also integrate with other pieces of the stack more efficiently.
I think another thing that we can do in the bigger picture is automatically fix things before they go off the rails. We have an enormous amount of data about how people use MongoDB based on five years of monitoring history and thousands and thousands of deployments, and we can mine that to come up with predictive models that suggest the indicators that something is about to go wrong. We can tell users that they might want to fix a problem, or let them click a button and we’ll fix it for them, or we’ll just fix it for them automatically. I think about that as a prescriptive management capability that would also be charting new territory for databases, something no one is really doing right now. They may give you better tools to fix things, but going that next step to pre-emptively notify you that something could go wrong? I think that is really interesting new territory.