Reduced Data Footprint

 
 
By Rick Abbott and Bob Zurek  |  Posted 2010-04-28
 
 
 
 
 
 
 


1. Reduced data footprint

In recent years, column-oriented databases have been widely cited as a preferred architecture for high-volume analytics. A column-oriented database stores data column by column instead of row by row. This has several advantages. Most analytic queries involve only a subset of the columns in a table, so a column-oriented database retrieves only the data the query requires. This speeds up queries and reduces disk I/O and the computing resources consumed.
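The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table, its column names (`order_id`, `region`, `amount`, `notes`) and the helper functions are hypothetical, and in-memory lists stand in for what a real database would keep on disk.

```python
# Hypothetical toy table; column names and values are illustrative only.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0, "notes": "gift wrap"},
    {"order_id": 2, "region": "west", "amount": 75.5, "notes": ""},
    {"order_id": 3, "region": "east", "amount": 42.0, "notes": "expedite"},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)


def total_amount_row_store(rows):
    # Row layout: every full record is visited, including fields
    # (order_id, region, notes) that the query never uses.
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    # Column layout: only the single column the query needs is read.
    return sum(columns["amount"])
```

Both functions return the same total. The point of the sketch is that the column version touches one list, while the row version walks every field of every record.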

Furthermore, these databases enable efficient data compression because each column stores a single data type, whereas a row typically contains several data types. Compression can be optimized for each particular data type, reducing the amount of storage needed for the database. Column orientation also greatly accelerates query processing, which significantly increases the number of concurrent queries a server can process.
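One way to see why a single-type column compresses well is run-length encoding, sketched below. This is an assumption chosen for illustration, not a description of what any particular product does; actual column stores use a range of encodings. The `region_column` data and the function names are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a column into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical column with long runs of repeated values, as is common
# when a column holds one kind of data (here, a region name).
region_column = ["east"] * 4 + ["west"] * 3 + ["north"] * 2
encoded = rle_encode(region_column)
```

Here nine stored values shrink to three pairs, and `rle_decode(encoded)` restores the original column. The same data interleaved with other fields in a row layout would not form such runs.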

There are a variety of column-oriented solutions on the market. Some duplicate data and require as large a hardware footprint as traditional row-based systems. Others combine column orientation with other technologies, which eliminates the need for data duplication. As a result, users don't need as many servers or as much storage to analyze the same volume of data.

For example, some column-oriented databases can achieve compression ratios ranging from 10:1 (a 10TB database becomes a 1TB database) to more than 40:1, depending on the data. With this level of compression, a distributed server environment can be reduced by a factor of 20 to 50 and brought down to a single box, slashing heat, power consumption and carbon emissions.
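The arithmetic behind these ratios is simple enough to sketch. The 10TB database and the 10:1 and 40:1 ratios come from the example above. The per-server capacity used in the server count is a hypothetical figure chosen only to make the calculation concrete, and the function names are likewise illustrative.

```python
import math


def compressed_size_tb(raw_tb, ratio):
    """Size in TB after compression at ratio:1."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return raw_tb / ratio


def servers_needed(data_tb, capacity_tb_per_server):
    """Whole servers required to hold data_tb (capacity is an assumption)."""
    return math.ceil(data_tb / capacity_tb_per_server)


raw_tb = 10                                   # from the example above
size_at_10_to_1 = compressed_size_tb(raw_tb, 10)
size_at_40_to_1 = compressed_size_tb(raw_tb, 40)

# Hypothetical capacity per server, for illustration only.
capacity_tb = 0.5
servers_uncompressed = servers_needed(raw_tb, capacity_tb)
servers_at_40_to_1 = servers_needed(size_at_40_to_1, capacity_tb)
```

Under that assumed capacity, the uncompressed database spans many servers while the 40:1 result fits on one. Actual server counts depend on hardware, workload and redundancy requirements that this sketch ignores.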

Virtual data marts are also coming on the scene, leveraging Enterprise Information Integration (EII) technologies to create specialized views of data sets without the need for physical storage. The downside to this approach is that complex queries can be sluggish, which can be a problem when analytic needs call for close to real-time insight.
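The idea of a specialized view with no physical copy can be illustrated with an ordinary database view. This is a deliberately narrow sketch: EII products federate data across multiple systems, whereas the example below uses a single in-memory SQLite database from Python's standard library. The `sales` table, the `east_sales` view and the sample values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("west", 75.5), ("east", 42.0)],
)

# The view stores only its defining query, not a copy of the data.
conn.execute(
    "CREATE VIEW east_sales AS "
    "SELECT region, amount FROM sales WHERE region = 'east'"
)


def east_total(conn):
    # The view's query is evaluated against the base table at read time.
    return conn.execute("SELECT SUM(amount) FROM east_sales").fetchone()[0]


before = east_total(conn)
conn.execute("INSERT INTO sales VALUES ('east', 8.0)")
after = east_total(conn)  # reflects the new row without any reload step
```

Because the view is recomputed on every read, it always reflects the current source data. That recomputation is also the cost noted above: the more complex the underlying query, the slower each read becomes.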

Open-source software takes efficient resource utilization a step further as it typically does not require proprietary hardware or specialized appliances.




 
 
 
 
RICK ABBOTT BIO: Rick Abbott is President of 360DegreeView, LLC. Rick has over 19 years of information management and technology experience, including private and public sector work. On the commercial side, Rick has significant experience in both the telecommunications and financial services industries. Rick has over eight years of "Big 5" experience, including an associate partnership position with Deloitte Consulting. Rick's primary focus over the past 13 years has been on large-scale business intelligence initiatives. He has direct experience in all aspects of business intelligence and data warehouse projects including business case development, strategic planning and business alignment, business requirements, and technical architecture and design. He possesses over 10 years of large, IT-related project management experience. Rick also has significant experience in assisting clients in negotiating large technology product, service, and outsourcing contracts. Read Rick's blog at www.360degreeinsight.com. He can also be reached at rick@360degreeview.com.

BOB ZUREK BIO: Bob Zurek is Chief Technology Officer and Vice President of Product Management at Infobright. Bob is also responsible for client services including sales engineering and implementation services. Bob has over 25 years of proven success in software development, technology research, and product management. He also possesses deep expertise in database management systems, business intelligence, and open-source technologies. Prior to joining Infobright, Bob was vice president of products and CTO at EnterpriseDB. While at EnterpriseDB, Bob led the company's technology and product management operations for their open-source product line. Prior to EnterpriseDB, Bob held management positions at IBM, Ascential Software and other technology companies where he consistently demonstrated the ability to define and deliver market-leading products with strong competitive differentiation. Read Bob's blog at http://www.infobright.org/Open-source/Blog/bob_zureks_blog. He can also be reached at bob.zurek@infobright.com.
 
 
 
 
 
 
 
























 
 
 
 
 
 
 
 
 
 
 