




















Summary
MongoDB Days 2014 is a seven-date, four-country tour to highlight existing capabilities, upcoming features, and early adopter case studies for this supplier's database technology. ARC Advisory Group recently attended the Boston event. MongoDB is an example of an emerging class of database broadly known as NoSQL ("Not-only SQL") databases. NoSQL databases use different storage structures than the row-column schema used by relational databases and are typically designed to be distributed, horizontally scalable, and highly available.
MongoDB, Inc. has raised over $230 million in venture funding and has a number of strategic partnerships. One of these is with Bosch Software Innovations, which sees MongoDB as central to its Industrial Internet of Things (IIoT) strategy. Roughly 300 people attended the event in Boston, mostly systems architects and developers. As a result, many of the sessions throughout the day were quite technical and engineering focused. However, several customers also presented, including mature, established companies.
Key takeaways from the event include:
What Is MongoDB, and Where Did It Come From?
Relational database management systems (RDBMS) have grown to dominate the storage market for enterprise applications over the last 30 years. But relational databases were designed in the era of the enterprise server – a self-contained computer system with CPU, RAM, and disk in a single box. As such, they could really only scale performance vertically, by adding more or better hardware within that single box. The emergence of internet-based business models rapidly exposed the limitations of this approach. Internet pioneers also discovered that data structures consisting of tables with rows and columns were a poor fit for many emerging applications. For example, data representing website traffic is semi-structured and so does not fit neatly into rows and columns. In addition, the rigidity of the relational schema, once it had been defined, hampered agility in the application development cycle.
At that time, there was a distinct lack of cost-effective, commercially available solutions for companies like Yahoo, Amazon, and Google, so they built their own scalable data infrastructure technologies. Some, such as Hadoop and Cassandra, later became publicly available. In a similar way, MongoDB started life as an internal project at 10gen to meet the need for a database to underpin a commercial cloud infrastructure service. However, 10gen's founders quickly realized the potential of the database. A strategic pivot followed, along with renaming the company to MongoDB, Inc. in August 2013. Now, MongoDB is open source, available either as a free download or a more capable commercially licensed product. The database is horizontally scalable via "sharding" (similar to partitioning in relational databases), and provides high availability through replication. Equally important, MongoDB is a type of NoSQL database known as a document store, and does not have tables organized by rows and columns as a more traditional relational database does. (The term document store is unfortunate. It doesn't mean that MongoDB stores traditional office documents, such as presentations and spreadsheets. It simply means that data is stored in a more flexible way that can accommodate a variety of data structures.)
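To make the two ideas above more concrete, the following is a minimal sketch in plain Python, not MongoDB's actual storage or routing logic. It shows two records of different shapes living in the same collection, and a simple hash-based assignment of records to shards. The field names, the choice of `_id` as the shard key, and the helper functions `shard_for` and `distribute` are all hypothetical and exist only for illustration.

```python
import hashlib
import json

# Two hypothetical "documents" in one collection. Unlike rows in a table,
# they do not have to share the same set of fields.
page_view = {"_id": 1, "type": "page_view", "url": "/pricing", "referrer": "search"}
purchase = {
    "_id": 2,
    "type": "purchase",
    "items": [{"sku": "A-100", "qty": 2}],  # nested structure inside the record
    "total": 59.90,
}
collection = [page_view, purchase]


def shard_for(shard_key_value, num_shards):
    """Map a shard-key value to a shard index using a stable hash (illustrative only)."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(docs, shard_key, num_shards):
    """Group documents by the shard their key hashes to."""
    shards = {i: [] for i in range(num_shards)}
    for doc in docs:
        shards[shard_for(doc[shard_key], num_shards)].append(doc)
    return shards


shards = distribute(collection, "_id", 3)
for index, docs in shards.items():
    print(index, json.dumps(docs))
```

The point of the sketch is only that each record carries its own structure, and that spreading records across servers by a key is what lets capacity grow by adding machines rather than by enlarging a single box.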
Who Uses MongoDB, and Why?
At this point, the company claims that over 1,000 organizations use the commercial MongoDB products, including 30 of the Fortune 100. MongoDB Days 2014 in Boston featured case studies (and speakers) from the following organizations:
The Broad Institute
The Broad Institute is a biomedical and genomic research center located in Cambridge, Massachusetts. Corey Flynn, a bioinformatics scientist, presented the organization's story (the squeamish might want to skip slide 2). In part, the research is aimed at moving towards personalized medicine by matching specific drugs and medications more closely to the genetics of an individual. MongoDB helps them do that by storing the results of 1.4 million experiments and over 12,000 compounds. The flexibility of MongoDB has been critical, as their database is frequently refactored as the application is enhanced. Overall, the solution has improved the ability of researchers to predict drug function, to find novel drug targets, and to repurpose drugs that were unsuccessful in other treatment scenarios.
CARFAX
CARFAX is a web-based provider of vehicle history information used by millions of consumers and car dealers every year. Prior to adopting MongoDB, CARFAX used an in-house-developed key-value store first written in 1984. The company currently has 13.6 billion records on vehicles and receives information from more than 34,000 different data sources. Notably, the presenter, Jai Hirsch, described this as a medium-size data problem, but a complex data problem. MongoDB's ability to manage this complex data led to its selection during the technology evaluation phase of the project. That is, the legacy document structure mapped well to the MongoDB document structure. In addition, the product was suitable for working with sparse data – records that may have hundreds of fields but with only a few populated in each record. And MongoDB's built-in high availability was critical for a "bet-the-business" application. The production environment consists of 108 servers and a 10.6 TB database, and services queries in 200 ms.
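The sparse-data point above can be illustrated with a small sketch in plain Python. It is not CARFAX's schema: the field names, the count of 300 optional fields, and the sample record are all invented for illustration. It contrasts a fixed-width row, which reserves a slot for every possible column, with a document that keeps only the fields a given record actually populates.

```python
# Hypothetical schema: one identifier plus 300 optional fields.
ALL_FIELDS = ["vin"] + [f"field_{i:03d}" for i in range(300)]


def as_row(record):
    """Relational-style row: one slot per column, None where nothing was reported."""
    return [record.get(name) for name in ALL_FIELDS]


def as_document(record):
    """Document-style record: keep only the fields that are populated."""
    return {key: value for key, value in record.items() if value is not None}


# A made-up incoming record with only three of the 301 possible fields set.
report = {
    "vin": "HYPOTHETICAL-VIN-0001",
    "field_007": "odometer reading",
    "field_120": "title issued",
}

row = as_row(report)
doc = as_document(report)

populated_in_row = sum(value is not None for value in row)
print(len(row), populated_in_row, len(doc))
```

In the row form, most slots are empty placeholders; in the document form, the record is only as wide as the data it carries.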
Jai Hirsch offered a number of suggestions which he felt were critical to the success of the project, including:
MetLife
MetLife, a global provider of insurance, annuities and employee benefits, has approximately 100 million customers and 65,000 employees. MongoDB now lies at the heart of a mission-critical application known as "The Wall." This application provides a 360-degree view of the customer for customer service representatives spanning 45 million agreements with 140 million transactions.
Greg Novikov, a database specialist at MetLife, presented an operational perspective of day-to-day life with MongoDB supporting this key application. The company's clearly defined standards for applications are important for operating the business. For example, the loss of a single data center for an indeterminate time should not compromise the application. In addition, all data centers had to be on premises for compliance reasons. MetLife also anticipated growing data volumes. These requirements drove a number of systems architecture decisions:
While The Wall is live and performing well, there are some areas in which MongoDB is less mature than database products that have been around for many years. For example, MetLife found the solution somewhat lacking in security features, with weak password protection and third-party products used for data encryption and some auditing functions. Similarly, Greg felt that the automation of administrative functions and workload management could be improved.
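The requirement mentioned earlier, that losing a single data center should not compromise the application, can be reasoned about with a short sketch. A MongoDB replica set needs a majority of its voting members to elect a primary, so the question is whether a majority survives the loss of any one site. The layouts below are hypothetical and are not MetLife's actual topology, which this report does not describe; the function names are likewise invented for illustration.

```python
def has_majority(members_by_site, failed_site):
    """True if the members outside the failed site still form a strict majority."""
    total = sum(members_by_site.values())
    surviving = total - members_by_site.get(failed_site, 0)
    return surviving > total // 2


def survives_any_single_site_loss(members_by_site):
    """True if a majority remains no matter which single site goes down."""
    return all(has_majority(members_by_site, site) for site in members_by_site)


# Hypothetical layouts: voting members per data center.
two_sites = {"dc_a": 3, "dc_b": 2}
three_sites = {"dc_a": 2, "dc_b": 2, "dc_c": 1}

print(survives_any_single_site_loss(two_sites))    # losing dc_a leaves 2 of 5
print(survives_any_single_site_loss(three_sites))  # any loss leaves at least 3 of 5
```

Under these assumptions, a two-site layout always has one site whose loss removes the majority, which is why a requirement of this kind typically pushes a design toward three or more locations.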
What's Up-and-Coming in MongoDB?
Eliot Horowitz, CTO at MongoDB, provided some highlights from the MongoDB roadmap. To start, Eliot talked about three design principles that direct everything the research and development team does:
The upcoming release (2.8) adds capabilities in three main areas over the current release (2.6):
Conclusion
Relational database systems aren't going away any time soon. They underpin many of the business world's mission-critical applications reliably and invisibly. However, the emergence of the internet-based business model ushered in a new class of application – applications that needed to ingest large volumes of data from new sources, 24x7x365.
As is always the case, changing business needs present opportunities for new and emerging technologies. Today, enterprises must be agile and always "open for business." Consequently, low-cost, easy-to-administer high availability is becoming more important. Likewise, the flexibility to deal with different data types and morph to support different data structures is growing in importance.
At this point, MongoDB is a bit of a rough diamond. The product enables cost-effective high availability and data flexibility. Organizations that need those characteristics might want to evaluate MongoDB (free download). Since some desirable enterprise-class capabilities, such as security and easy administration for large-scale deployments, are less mature, any evaluation must consider compliance with corporate policies and standards.
Keywords: MongoDB, Database, Big Data, NoSQL, IIoT.