Greenplum Technology Speeds Data Loading for Enterprise Data Warehouses | eWeek

Written by
Brian Prince
Mar 16, 2009
2 minute read
Greenplum is banking on new technology to speed the data loading process for companies dealing with large data warehouses.

Greenplum’s massively parallel processing (MPP) Scatter/Gather Streaming (SG Streaming) technology is designed to eliminate the bottlenecks associated with other approaches to data loading. At its core, the technology takes a parallel-everywhere approach to loading, in which data flows from one or more source systems to every node of the database.

The technology is part of the company’s bid to challenge players such as Teradata, Oracle and Netezza. Customers are running into cost and performance constraints with competing solutions, and are looking for scalable software solutions to meet their needs, opined Paul Salazar, vice president of marketing.

According to Greenplum, SG Streaming differs from the traditional bulk loading technologies used by most mainstream database and MPP appliance vendors, which push data from a single source, often over a single channel or a small number of parallel channels. That design can result in bottlenecks and longer load times.

“With our approach we hit fully linear parallelism because we take all the source systems and we essentially do what we call scatter the data,” explained Ben Wether, director of product management at Greenplum. “We break it up into chunks that are sprayed across hundreds or thousands of parallel streams into the database and received…by all the nodes of the database in parallel. The essence of it is we eliminate all the bottlenecks.”

Performance scales with the number of Greenplum Database nodes, and the technology supports both large batch and continuous near-real-time loading patterns, company officials said. Data can be transformed and processed in-flight, leveraging all nodes of the database in parallel.

Final gathering and storage of data to disk takes place on all nodes simultaneously, with data automatically partitioned across nodes and optionally compressed, Greenplum officials explained.
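The scatter and gather steps the officials describe can be illustrated with a small Python sketch. It is not Greenplum's implementation: the function names, the chunk size, the four-node cluster, and the CRC32 hash routing are all assumptions made for illustration, and threads stand in for the parallel network streams. In this toy version the final merge runs in one place, whereas the article describes gathering happening on all nodes at once.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32

NUM_NODES = 4    # hypothetical cluster size
CHUNK_SIZE = 3   # rows per chunk, kept tiny for illustration


def scatter(rows, chunk_size=CHUNK_SIZE):
    """Break the source rows into fixed-size chunks."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def route(chunk, num_nodes=NUM_NODES):
    """Transform rows in flight and assign each one to a node by key hash."""
    routed = {node: [] for node in range(num_nodes)}
    for key, value in chunk:
        node = crc32(key.encode("utf-8")) % num_nodes  # assumed routing rule
        routed[node].append((key, value.strip().upper()))
    return routed


def load(rows, num_nodes=NUM_NODES):
    """Process chunks on parallel workers, then gather the rows per node."""
    nodes = {node: [] for node in range(num_nodes)}
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        # Each worker handles one chunk; results come back in chunk order.
        for routed in pool.map(lambda c: route(c, num_nodes), scatter(rows)):
            # Simplification: this merge runs in the calling thread.
            for node, part in routed.items():
                nodes[node].extend(part)
    return nodes
```

Calling `load([("k1", " a "), ("k2", " b ")])` returns a dictionary keyed by node number, with every row placed on exactly one node and its value trimmed and upper-cased along the way.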

“Our objective as we go through the product evolution…is to build out a range of capabilities that are just again appealing to the customers who we have today who want in many cases ever-increasing rates of speed and loading, speed of query response, flexibility of doing embedded analytics and really to most easily access very vast volumes of data without having to do a lot of manipulation or a lot of moving of data,” Salazar said.
