Vertica has a number of performance enhancements on the menu for the upcoming version of its column database. When the company releases Vertica 3.5 later this year, users can expect new data storage and processing capabilities, as well as support for MapReduce.
Vertica Systems has enhanced its column database with a new data
storage and processing architecture designed to improve
performance.
The company has dubbed the new architecture FlexStore. With it,
customers can organize different parts of the database in different
ways to achieve maximum performance and compression, Dave Menninger,
vice president of marketing and product management at Vertica, told
eWEEK.
"The whole objective of this is to reduce the amount of I/O that's
necessary to satisfy a query," he said. "Reducing I/O [input/output]
gives you better performance."
Vertica 3.5 automatically applies a variety of physical design,
database storage and query execution techniques that keep the database
optimized for the analytic workload it's supporting at the time. For
example, users can group multiple columns into a single disk file to
minimize file input/output for workloads that read a large percentage
of the columns in a table, perform single-row look-ups, query many
small columns or frequently update data in those columns.
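The trade-off behind column grouping can be illustrated with a small Python sketch. It is not Vertica code or Vertica syntax; the table, the column names and both helper functions are hypothetical, and the model simply assumes that each group of columns is stored in one disk file and that a query must open every file holding a column it needs.

```python
def files_read(layout, columns_needed):
    """Count the disk files a query opens, given one file per column group."""
    needed = set(columns_needed)
    return sum(1 for group in layout if needed & set(group))


def columns_scanned(layout, columns_needed):
    """Count the columns pulled off disk, including unneeded ones that
    share a file with a needed column."""
    needed = set(columns_needed)
    return sum(len(group) for group in layout if needed & set(group))


# Hypothetical table with six columns.
columns = ["id", "name", "city", "state", "zip", "balance"]

# Layout A: every column in its own file (pure column store).
per_column = [[c] for c in columns]

# Layout B: the small address-style columns grouped into a single file.
grouped = [["id"], ["name", "city", "state", "zip"], ["balance"]]

wide_query = ["id", "name", "city", "state", "zip"]   # reads most columns
narrow_query = ["city"]                               # reads one column

for label, layout in (("per-column", per_column), ("grouped", grouped)):
    print(label,
          "| wide query files:", files_read(layout, wide_query),
          "| narrow query columns scanned:", columns_scanned(layout, narrow_query))
```

Under these assumptions the wide query opens five files in the per-column layout but only two in the grouped layout, while the narrow query scans four columns instead of one when the columns are grouped. That is why grouping suits workloads that touch most of a table's columns rather than a few.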
The new database also allows users to automate the creation of tiered storage to improve information lifecycle management.
"The reason you would care about this is if you have different parts of
your architecture that have different I/O characteristics," Menninger
said. "Once you recognize that you can have different performance
characteristics in different parts of your system, you want to be able
to take advantage of that and put the data that is accessed most
frequently on the portions of the disk that perform the best, or on the
disks that perform the best."
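The placement idea Menninger describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Vertica implements tiered storage: the partition names, access counts, tier names and capacities are all invented, and the function uses a simple greedy rule that fills the fastest tier with the most frequently accessed partitions first.

```python
def assign_tiers(access_counts, tiers):
    """Greedily place partitions on storage tiers.

    access_counts: dict mapping partition name -> number of accesses.
    tiers: list of (tier_name, capacity_in_partitions), fastest tier first.
    Partitions that exceed the total capacity are left unplaced.
    """
    # Hottest partitions first.
    ranked = sorted(access_counts, key=lambda p: access_counts[p], reverse=True)
    placement = {}
    start = 0
    for tier_name, capacity in tiers:
        for partition in ranked[start:start + capacity]:
            placement[partition] = tier_name
        start += capacity
    return placement


# Hypothetical access statistics per data partition.
access_counts = {"2009-Q3": 900, "2009-Q2": 400, "2009-Q1": 120, "2008": 15}

# Hypothetical tiers, ordered from best-performing to slowest.
tiers = [("fast", 1), ("standard", 2), ("archive", 10)]

placement = assign_tiers(access_counts, tiers)
for partition, tier in placement.items():
    print(partition, "->", tier)
```

With these made-up numbers, the most heavily queried partition lands on the fast tier, the next two on standard disks and the rarely touched one in the archive tier.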
FlexStore is one of two main enhancements the company is touting;
the second is support for Apache Hadoop, an open-source implementation
of MapReduce. The announcement follows similar moves by companies such
as Greenplum and Aster Data Systems. Vertica officials argue, however,
that they have taken a slightly different approach to MapReduce than other vendors.
"Vertica's
introduction of support for MapReduce differs from the approach of
other vendors in that it supports it as a parallel capability rather
than as something integrated with SQL," noted Philip Howard, an analyst
with Bloor Research, in a statement. "This makes sense because most
people using MapReduce are not SQL programmers, and vice versa."
A MapReduce job is typically a big one, Menninger said, and since the
Vertica cluster is usually fully loaded, it does not make sense to
throw another big job on the cluster.
"We believe, and
our customers have told us, that they deliberately want those two
environments, the Vertica and MapReduce environment, to be separate but
equal," he said.
The new version
of the database will be generally available in October and can run on
Linux servers, VMware vSphere or VMware Server-supported hardware or
on-demand in enterprise and public clouds such as the Amazon Elastic
Compute Cloud (Amazon EC2).