How Startup Cribl Enables Companies to Get More Value Out of Data

eWEEK PRODUCT ANALYSIS: Intelligent handling of machine data enables IT to move at digital speeds.


Earlier this month, I wrote a post on the importance of quality machine data for improving analytics and security tools. High-quality data isn’t created out of thin air, however, and I noted that it requires a new approach to machine data management. While there are vendors that currently claim to do this, the legacy vendors are either too slow or only address part of the problem. This makes sense, because these companies were founded when the volume of machine data (defined as metrics, logs, and traces) was fairly low compared to today.

In my post, I highlighted a number of attributes that a modernized solution would need, including:

  • A unified set of consolidated data that is a single source of truth.
  • Pre-processing of machine data so analytic tools only process the data they require instead of everything. This would include de-duplication, removal of null events and dynamic sampling of the stream.
  • Normalizing the data so it’s consistent and in a format that’s usable by all the tools.
  • Optimizing data flows for performance and cost.
  • Directing only the data required to specific tools. There’s no point in having a tool process data only to drop it.
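
To make the pre-processing attribute more concrete, here is a minimal Python sketch of what de-duplication, null-event removal, normalization and sampling might look like on a stream of events. It is purely illustrative and is not any vendor's implementation: the function names, the `level` and `msg` fields, and the `sample_rates` parameter are all assumptions made for the example, and a fixed per-level rate stands in for truly dynamic sampling.

```python
import hashlib
import json
from collections import Counter


def normalize(event):
    """Lower-case field names and trim whitespace from string values."""
    return {
        str(key).strip().lower(): value.strip() if isinstance(value, str) else value
        for key, value in event.items()
    }


def is_null(event):
    """Treat an event as null when it has no fields or only empty values."""
    return all(value in (None, "") for value in event.values())


def fingerprint(event):
    """Stable hash of an event, used to spot exact duplicates."""
    payload = json.dumps(event, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def preprocess(events, sample_rates=None):
    """Yield normalized events, dropping nulls and duplicates and sampling by level.

    sample_rates (hypothetical) maps a log level to N, meaning "keep 1 in N".
    Levels that are not listed are kept in full.
    """
    sample_rates = sample_rates or {}
    seen = set()
    counts = Counter()
    for raw in events:
        event = normalize(raw)
        if is_null(event):
            continue  # removal of null events
        digest = fingerprint(event)
        if digest in seen:
            continue  # de-duplication
        seen.add(digest)
        level = str(event.get("level", "")).lower()
        rate = sample_rates.get(level, 1)
        counts[level] += 1
        if (counts[level] - 1) % rate != 0:
            continue  # sampled out
        yield event
```

Calling `preprocess(stream, sample_rates={"debug": 10})` would, under these assumptions, pass every error through while keeping only one in 10 debug events, so downstream tools see less data without losing the events that matter.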

Cribl gets a round of funding, validating there is a problem with data management 

Earlier this month, start-up Cribl announced a Series B funding round of $35 million, led by Sequoia, to address this problem. CRV also participated in this round, bringing the total investment in Cribl to $46 million. This may seem like a hefty amount for a data management company in a mature market, but as I pointed out, there isn’t currently a vendor that can meet the demands of digital organizations. Given that modern security and analytic tools are based on machine learning, a higher volume of quality data will have a significant impact on the output from said tools. Based on the criteria I outlined, Cribl promises to be a complete solution provider.

The company positions its flagship product, named LogStream, as an observability pipeline. Observability here is defined as the ability to interrogate the environment without knowing in advance the questions that will be asked. Essentially, the observability pipeline provides a single universal receiver and universal router to stream data that can be used by all of an organization’s IT tools, including log analysis, SIEMs, UEBA and data lakes. 

LogStream built to work at petabyte scale 

The company purpose-built LogStream to analyze machine data while it’s in motion, and the product can process data at petabyte scale. It also has an admin-friendly management console that provides visibility into up to 1,000 nodes. The benefit of this approach is that it enables customers to continue to analyze data as the business changes. Typically, organizations build a data set to be analyzed in a certain way. When the business changes, that data set is outdated and needs to be rebuilt for the new requirements. This has the obvious disadvantage of being slow and time-consuming, causing organizations to fall behind.

Customers who deploy LogStream will realize a number of benefits, including: 

  • intelligent routing where data is pushed to multiple systems in the most cost-effective way; 
  • data security through sub-field encryption and data hashing to preserve the uniqueness of information; 
  • cost optimization at scale, with data ingest being cut by as much as 50%. The reduction in data also cuts spending on tools, which are often priced by volume of data.
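
As a rough illustration of two of these ideas, the Python sketch below hashes a single field with a secret key so that identical values still map to the same token (which preserves uniqueness for counting and joins), and then picks destinations for each event. This is not LogStream's configuration or API. The `mask_field` and `route` helpers, the `user` and `category` fields, and the destination names are hypothetical, and a keyed HMAC-SHA256 is simply one common way to do this kind of hashing.

```python
import hashlib
import hmac


def mask_field(event, field, key):
    """Return a copy of the event with one field replaced by a keyed hash.

    Equal inputs produce equal tokens, so distinct values stay distinct
    without the original value being exposed downstream.
    """
    masked = dict(event)
    if field in masked:
        value = str(masked[field]).encode("utf-8")
        masked[field] = hmac.new(key, value, hashlib.sha256).hexdigest()
    return masked


def route(event):
    """Choose destinations for an event (names are illustrative only)."""
    destinations = ["data_lake"]  # assumed low-cost store that receives everything
    if event.get("category") == "auth":
        destinations.append("siem")  # only security-relevant events reach the SIEM
    return destinations
```

Sending everything to inexpensive storage and only a subset to volume-priced tools is the kind of choice that would drive the cost reduction described above.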

The volume of machine data will continue to surge. In its press release, Cribl said that log, metric and trace data has grown by 25% year-over-year for the past few years. Given the trends in the industry, I can see the growth rate accelerating by 30% or even more.

If companies don’t get a handle on this now, they face spending more on tools and getting less-accurate output. If the goal is to do more with less, the machine data deluge can create the unenviable opposite, “do less with more,” which has been the trend. Cribl’s LogStream can help reverse this trend.

Zeus Kerravala is an eWEEK regular contributor and the founder and principal analyst with ZK Research. He spent 10 years at Yankee Group and prior to that held a number of corporate IT positions.