
    Does NoSQL Matter for Your Organization?

    By Jason Brooks - September 8, 2011

      Over the past few years, NoSQL databases have received a great deal of attention, with countless attestations of their virtues in enabling consumer-facing, Web-based businesses to manage fast-growing user demand and make use of the huge quantities of data that their users create.

      It’s clear that NoSQL adoption has paid dividends to the Twitters and Netflixes of the world. But it’s been less apparent just how much attention mainstream organizations ought to pay to the trend, since relational databases are familiar and well-entrenched, and since many well-established solutions exist for scaling relational databases.

      Despite the heavy focus on the virtues of NoSQL for “Internet-scale” businesses, the products and services covered under the NoSQL umbrella are well worth consideration by organizations of all sizes, not as across-the-board replacements for relational databases but as additional tools for meeting business goals.

      NoSQL refers to a broad class of database products that tend not to expose SQL interfaces. What separates these products from traditional databases has less to do with SQL and more to do with a departure from relational models. In particular, these databases do away with fixed schemas, which can be beneficial when developing applications with changing requirements. For this reason, non-relational is a better, if less broadly referenced, handle for this group of products.

      One of the canonical documents describing the design concepts and rationales for non-relational databases is Amazon.com’s 2007 paper on its “Dynamo” data store, which the company developed to meet its internal service-level requirements.

      The paper describes how traditional relational database management systems, which prioritize data consistency above write-operation availability, proved ill-suited to the Web retailer’s needs in the context of Amazon’s infrastructure, which consists of large numbers of commodity servers of varying capacities. For Amazon, blocking customers from adding new items to their carts while waiting for separate application nodes to get in sync was too high a price to pay, so Dynamo was designed to boost availability by de-prioritizing consistency.
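
      The trade-off described above can be sketched in a few lines of Python. This is a toy model, not Dynamo's design: the `CartReplica` class and `reconcile` function are hypothetical names, and merging by set union is just one simple reconciliation strategy assumed here for illustration. Each replica accepts cart additions without waiting for the others, and the copies are brought back into agreement afterward.

```python
class CartReplica:
    """Hypothetical replica that always accepts writes locally."""

    def __init__(self):
        self.items = set()

    def add_item(self, item):
        # No coordination with other replicas: the write is never blocked.
        self.items.add(item)


def reconcile(*replicas):
    """Merge divergent carts by taking the union of their items (assumed strategy)."""
    merged = set().union(*(replica.items for replica in replicas))
    for replica in replicas:
        replica.items = set(merged)
    return merged


# Two replicas receive different writes while out of sync.
replica_a = CartReplica()
replica_b = CartReplica()
replica_a.add_item("book")
replica_b.add_item("lamp")

diverged = replica_a.items != replica_b.items  # the copies temporarily disagree
merged_cart = reconcile(replica_a, replica_b)  # later, they converge
```

      The cost of never blocking a write is the window in which the two copies disagree; a consistency-first system would instead refuse or delay one of the writes.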

      While the scale of Amazon’s infrastructure and user base is unusual (as is Amazon’s capacity for rolling its own data store solution), the need to prioritize certain application characteristics above others is common to every organization. Today’s crop of non-relational database products provides businesses with more options without requiring that they create solutions from scratch.

      There are several different types of non-relational databases that fall under the NoSQL umbrella, including key-value stores, document-oriented databases, columnar databases and graph databases, each with its own data model, scaling strategy and use cases.

      Pinning down particular NoSQL databases into a specific category can get confusing, as some of the categories tend to blend into each other. For understanding the broad categories of NoSQL data stores, I found a paper by Rick Cattell helpful. In it, the former Sun Microsystems database architect breaks down the options into key-value stores, document stores and extensible-record stores.

      In a key-value store, individual records amount to some arbitrary lump of information, indexed by a key. These systems typically do not interpret the data themselves, leaving that function to the application. Riak, which is supported by Basho Technologies, and Oracle’s Berkeley DB are examples of popular key-value stores.
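
      As a rough illustration of that division of labor, here is a minimal in-memory sketch. The `KeyValueStore` class is hypothetical and does not reflect the API of Riak or Berkeley DB; the point is only that the store treats values as opaque bytes and the application decides how to encode and decode them (JSON is an arbitrary choice here).

```python
import json
from typing import Dict, Optional


class KeyValueStore:
    """Toy key-value store: values are opaque bytes indexed by a key."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._data.get(key)


store = KeyValueStore()

# The application, not the store, chooses the encoding.
store.put("cart:42", json.dumps({"items": ["book", "lamp"]}).encode("utf-8"))

raw = store.get("cart:42")
cart = json.loads(raw.decode("utf-8"))
```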

      In a document store, records are documents that consist of a variable number of named attributes of various types, such as integers, strings and nested objects. Document-oriented databases tend to recognize the structure of the data they store and have more querying functionality than key-value stores. MongoDB, from 10gen, and Apache CouchDB, which is supported by Couchbase, are examples of popular document stores.
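
      A small sketch can make the contrast with key-value stores concrete. The `find` helper below is a hypothetical stand-in for a document database's query facility, not the interface of MongoDB or CouchDB. Documents are plain dictionaries with differing sets of attributes, and because the store can see inside them, it can filter on attribute values.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


def find(collection: List[Document], **criteria: Any) -> List[Document]:
    """Return documents whose attributes match every given criterion."""
    return [
        doc
        for doc in collection
        if all(field in doc and doc[field] == value
               for field, value in criteria.items())
    ]


# Documents in one collection need not share the same attributes.
users = [
    {"name": "Ada", "age": 36, "address": {"city": "London"}},
    {"name": "Grace", "languages": ["COBOL"]},
    {"name": "Linus", "age": 36},
]

matches = find(users, age=36)
names = [doc["name"] for doc in matches]
```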

      Extensible-record stores, which are also known as wide-column stores, provide a data model similar to relational databases, but with a focus on organizing data into columns (rather than rows) and column families (rather than tables). Apache Cassandra, which is supported by DataStax, and Apache HBase, supported by Cloudera, are examples of popular extensible-record stores.
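
      One way to picture this data model is as nested maps: row key, then column family, then column name. The `WideColumnTable` class below is a hypothetical toy, not how Cassandra or HBase store data internally. It assumes column families are fixed when the table is created, while the columns inside a family can differ from row to row.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy wide-column table: row key -> column family -> column -> value."""

    def __init__(self, column_families):
        self._families = set(column_families)
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        # Return a copy of one column family for one row (empty if absent).
        return dict(self._rows.get(row_key, {}).get(family, {}))


table = WideColumnTable(["profile", "activity"])
table.put("user1", "profile", "name", "Ada")
table.put("user1", "profile", "email", "ada@example.com")
table.put("user2", "profile", "name", "Grace")  # no email column for this row

user1_profile = table.get_family("user1", "profile")
user2_profile = table.get_family("user2", "profile")
```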

      More important than worrying about which bucket a given non-relational database fits into is focusing on the particular set of features it offers: which controls it provides for balancing availability, consistency and fault tolerance, how it handles scaling and which interfaces it provides for accessing data.

      For example, Apache Cassandra enables administrators to set their desired trade-offs between availability and consistency on a per-query basis. To maximize consistency, administrators can configure a Cassandra cluster to hold off on reporting a write complete or responding to a read until all of the nodes holding replicas of the data have responded. To maximize availability, the system can complete an operation as soon as any one node completes a write or responds to a read. Administrators can also opt for several gradations in between to reach a balance and to provide for resiliency in case nodes fail.
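
      The arithmetic behind those gradations is simple enough to sketch. The functions below are a hypothetical model, not part of any Cassandra driver; the level names ONE, QUORUM and ALL mirror Cassandra's consistency levels, and a quorum is assumed to mean a simple majority of the replicas.

```python
def required_acks(level: str, replicas: int) -> int:
    """How many replicas must respond before an operation is reported complete."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replicas // 2 + 1  # simple majority
    if level == "ALL":
        return replicas
    raise ValueError(f"unknown consistency level: {level}")


def operation_succeeds(level: str, replicas: int, responding: int) -> bool:
    """True if enough replicas are responding to satisfy the chosen level."""
    return responding >= required_acks(level, replicas)


# With three replicas and one of them down:
survives_at_one = operation_succeeds("ONE", replicas=3, responding=2)
survives_at_quorum = operation_succeeds("QUORUM", replicas=3, responding=2)
survives_at_all = operation_succeeds("ALL", replicas=3, responding=2)
```

      In this model, a single failed node blocks operations only at the strictest level, which is the resiliency argument for choosing one of the in-between settings.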

      MongoDB provides for scaling out across nodes in a cluster through auto-partitioning. If a data set grows too large for a single machine, MongoDB can chunk up the collection and distribute it across the nodes assigned to it, with distributed replica sets to recover from a node failure.
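
      The general idea of chunking a collection and spreading it across nodes can be sketched as follows. This is not MongoDB's actual partitioning or balancing logic: the function names, the fixed chunk size and the round-robin placement are all assumptions made for illustration.

```python
from typing import Dict, List


def split_into_chunks(keys: List[str], max_chunk_size: int) -> List[List[str]]:
    """Sort keys and cut them into contiguous ranges of at most max_chunk_size."""
    ordered = sorted(keys)
    return [
        ordered[start:start + max_chunk_size]
        for start in range(0, len(ordered), max_chunk_size)
    ]


def assign_chunks(chunks: List[List[str]], nodes: List[str]) -> Dict[str, List[List[str]]]:
    """Place chunks on nodes round-robin (assumed placement policy)."""
    placement: Dict[str, List[List[str]]] = {node: [] for node in nodes}
    for index, chunk in enumerate(chunks):
        placement[nodes[index % len(nodes)]].append(chunk)
    return placement


keys = ["u07", "u01", "u05", "u03", "u02", "u06", "u04"]
chunks = split_into_chunks(keys, max_chunk_size=3)
placement = assign_chunks(chunks, ["shard-a", "shard-b"])
```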

      Among the primary challenges for administrators working to wrap their minds around NoSQL databases are the differences in accessing data stored in these systems. Because these products differ so much from one another, there is no direct equivalent of the role SQL plays in the relational world. Rather, most non-relational databases provide bindings for accessing data using multiple programming languages.

      There are a number of SQL-like querying languages that have sprouted up to offer higher-level data access, such as Google’s GQL for its AppEngine platform as a service (PaaS), MongoDB’s Mongo Query Language, Cassandra Query Language and the nascent UnQL (Unstructured Query Language). For Apache Hadoop-based systems, Apache Pig and Apache Hive offer two separate routes for working with data from a higher level.

      In my own efforts to better understand the differences in accessing data on relational and non-relational data stores, I’ve found the open-source Django-nonrelational project helpful. Django is a Python framework for building Web-based applications that sports an object-relational mapping layer for abstracting the differences between separate relational databases. Django-nonrelational supports Google’s AppEngine datastore, and offers in-development backend support for Cassandra and MongoDB.

      For administrators and developers familiar with Django, experimenting with the various backends provides a hands-on reference for the differences between relational and non-relational stores, and between some of the different NoSQL systems.

      Jason Brooks
      As Editor in Chief of eWEEK Labs, Jason Brooks manages the Labs team and is responsible for eWEEK's print edition. Brooks joined eWEEK in 1999, and has covered wireless networking, office productivity suites, mobile devices, Windows, virtualization, and desktops and notebooks. Jason's coverage is currently focused on Linux and Unix operating systems, open-source software and licensing, cloud computing and Software as a Service. Follow Jason on Twitter at jasonbrooks, or reach him by email at jbrooks@eweek.com.