Understanding the Scale Limitations of Graph Databases | eWEEK | eWeek

Understanding the Scale Limitations of Graph Databases

enterprise IT
Written By
eWEEK EDITORS
eWEEK EDITORS
May 26, 2022
4 minute read
eWeek content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More

Graph databases and models have been around for well over a decade, and are among the most impactful technologies to emerge from the NoSQL movement.

Graph data models are natively designed to focus on the relationships within and between data, representing data as nodes connected by edges. As such, the graph model is strikingly similar to the way humans often think and talk.

The node-edge-node pattern in a graph corresponds directly to the subject-predicate-object pattern common to languages like English. So, if you’ve ever used mind-mapping technology or diagrammed ideas on a whiteboard, you’ve created a graph.

Graph data models have become part of the standard toolkit for data scientists applying artificial intelligence (AI) to everything from fraud detection and manufacturing control systems to recommendation engines and customer 360s.

Given this broad applicability, it’s no surprise Gartner believes that graph database technologies will be used in more than 80% of data and analytics innovations, including real-time event streaming, by 2025. But as adoption accelerates, limitations and challenges are emerging. And one of the most significant limitations graph databases face is their inability to scale.

Also see: Real Time Data Management Trends

Volume and Velocity of Modern Data Generation

Much has changed since the emergence of the most recent generation of graph databases from a decade ago. Enterprises are dealing with previously unimaginable volumes of data to potentially query. That data enters and streams through the enterprise in a variety of channels, and enterprises want action on that information in real time.

Original graph designs couldn’t have imagined today’s sheer volume of data or the computation power needed to put that data to work. And it’s not just the volume of data dragging graph databases down. It’s the velocity of that data.

While graph databases can excel at computation on moderately-sized sets of data at rest, they get especially siloed and suffer significant tradeoffs when real-time actions on streaming data are desired. Streaming is actively moving data; it constantly arrives from diverse sources.

And enterprises want to act upon it immediately in event-processing pipelines because when certain events are not caught quickly, as they happen, the opportunity to act disappears. For example, security incidents, transaction processing (such as fraud or credit validations), and automated machine-to-machine actions.

Anomalies and patterns need to be recognized with AI and ML algorithms that can automate (or at least escalate) an action. And that recognition needs to occur before an automated action can proceed.

Graph databases were simply never built for this scenario. They are typically restricted to hundreds or thousands of events per second. But today’s enterprises need to be able to process a velocity of millions of events per second and, in some advanced use cases, tens of millions.

There’s a hard limit both on how quickly graph systems can process data and on how much complexity (like how many hops in the query) they can handle. Because of those limits, graph systems often don’t get used. Since graph systems don’t get used, data engineering teams have no option other than to recreate the graph database-like functionality spread throughout their microservices architecture.

Also see: Best Data Analytics Tools 

The Rise of Custom Data Pipeline Development

These workarounds to query the event streams in real time require significant effort. Developers typically turn to event stream processing systems like Flink and ksqlDB, which make it possible, but not easy, to use familiar SQL query syntax to query the event streams.

It’s not uncommon for enterprises to have teams of data engineers developing extensive and complex microservice architectures for months or years to get up to the scale and speed needs of streaming data. However, these systems tend to lack the expressive query structures needed to find complex patterns in streams efficiently.

As noted, to operate at the volume and velocity that enterprises require, these systems have had to make tough tradeoffs that lead to significant limitations.

For example, time windows can restrict a system’s ability to connect events that do not arrive within a narrow time interval (often measured in seconds or minutes). This means that rather than providing some critical insight or business value, an event is instead simply ignored if it arrives even seconds too late.

Even with costly limitations like time windows, event stream processing systems have been successful. Many can even scale to process millions of events per second—but with significant effort and limitations that fail to deliver the full power of graph data models.

Also see: Why Cloud Means Cloud Native

Advertisement

Innovation Will Rise to Meet Demand

The demand for insights from instant event data streams and the value they deliver has never been higher. As adoption accelerates, businesses should expect to see new data infrastructure emerge to eliminate many of the scale struggles that can hold back the power of graph database models.

About the Author: 

Rob Malnati is the COO of thatDot

eWeek Logo

eWeek has the latest technology news and analysis, buying guides, and product reviews for IT professionals and technology buyers. The site's focus is on innovative solutions and covering in-depth technical content. eWeek stays on the cutting edge of technology news and IT trends through interviews and expert analysis. Gain insight from top innovators and thought leaders in the fields of IT, business, enterprise software, startups, and more.

Property of TechnologyAdvice. © 2026 TechnologyAdvice. All Rights Reserved

Advertiser Disclosure: Some of the products that appear on this site are from companies from which TechnologyAdvice receives compensation. This compensation may impact how and where products appear on this site including, for example, the order in which they appear. TechnologyAdvice does not include all companies or all types of products available in the marketplace.