The latest Index Engines appliance features auto-restarting for data collection and extraction jobs along with higher-speed data deduplication.
Index Engines announced performance enhancements, including deduplication,
to its large-scale data discovery platform on Oct. 5.
Generally available immediately, the latest release allows users more
control over the indexing and data deduplication processes, the company said
in a statement. IT managers can
search and extract data more efficiently during data discovery in case of
litigation or regulatory compliance. The new hardware platform also boasts 16
core processors and 72GB devoted to index storage.
"As enterprises become more litigation ready, they are proactively
processing larger volumes of ESI [electronically stored information]," said
Jim McGann, vice president of information discovery
at Index Engines.
The Index Engines platform now supports auto-network discovery, making the
discovery process more complete. IT
managers can automatically find all network locations and endpoints, including
servers and desktops, and not rely on their memory to create the list. Once the
location has been discovered, the Index Engines appliance crawls the content to
create the index.
Over NFS/CIFS file systems, Index Engines crawls content at speeds of 1TB per
hour per node, the company claimed.
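To put the claimed rate in context, the arithmetic can be sketched as below. This is a rough estimate only: the function name and the assumption that throughput scales linearly with the number of nodes are illustrative and do not come from Index Engines.

```python
def estimated_crawl_hours(data_tb: float, nodes: int,
                          tb_per_hour_per_node: float = 1.0) -> float:
    """Estimate the hours needed to crawl data_tb terabytes.

    Assumes (hypothetically) that throughput scales linearly with node
    count; the default rate is the vendor-claimed 1TB per hour per node.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if tb_per_hour_per_node <= 0:
        raise ValueError("tb_per_hour_per_node must be positive")
    return data_tb / (nodes * tb_per_hour_per_node)


if __name__ == "__main__":
    # A hypothetical 48TB environment spread across four nodes.
    print(estimated_crawl_hours(48, 4))
```

Real-world rates would likely vary with file sizes, network load and storage performance, so a figure like this is best treated as a lower bound on crawl time.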
Index Engines 3.3 includes automated restart features for LAN indexing and
backup tape extraction, ensuring uninterrupted data collection, the company
said.
This is particularly useful as faulty tape libraries or corrupt tapes can
interrupt data processing, and data can be lost.
If the extraction or indexing processes are interrupted, the appliance
auto-restarts and resumes the jobs, finishing the index without leaving any
gaps, according to the company.
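Index Engines has not published how the restart mechanism works. As a general illustration of the idea, a job can record which items it has finished and skip them when it starts again after a failure. Everything in the sketch below (the function name, the retry limit, the use of OSError to represent a read failure) is a hypothetical stand-in, not the appliance's actual behavior.

```python
from typing import Callable, Sequence, Set


def run_with_restart(items: Sequence[str],
                     process: Callable[[str], None],
                     done: Set[str],
                     max_restarts: int = 3) -> int:
    """Process every item, restarting after read errors.

    `done` acts as the checkpoint: items already in it are skipped, so a
    restart resumes where the job stopped instead of starting over.
    Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        try:
            for item in items:
                if item in done:
                    continue  # finished before the interruption
                process(item)
                done.add(item)  # checkpoint only after success
            return restarts
        except OSError:
            # Stand-in for a faulty tape library or corrupt tape.
            restarts += 1
            if restarts > max_restarts:
                raise
```

In a real system the checkpoint would have to be stored durably, for example on disk, so that it survives a crash of the process itself.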
Typical large-scale deployments consist of multiple Index Engines appliances
to process large network data environments and offline tape, according to the
company. This means there is a possibility of the same piece of data being
crawled by multiple appliances and indexed, which can be confusing when
extracting results during data recovery.
To prevent such an occurrence, the release features distributed
deduplication functionality so that the system doesn't save multiple copies of
the same data. With this functionality, multiple Index Engines agents analyze
the data and coordinate content extraction so that only unique files and
e-mails are saved. This streamlines the collection process and saves storage
space.
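The company did not describe how its agents coordinate. One common approach to this kind of deduplication is to fingerprint each item's content with a cryptographic hash and keep a shared record of fingerprints already claimed. The sketch below assumes that approach; the class and function names are hypothetical, and an in-memory dictionary stands in for whatever shared store a real deployment would use.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


class SharedDigestRegistry:
    """Hypothetical shared record of content fingerprints already saved."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}

    def claim(self, digest: str, agent: str) -> bool:
        """Return True if this agent is the first to see the digest."""
        if digest in self._owners:
            return False
        self._owners[digest] = agent
        return True


def extract_unique(agent: str,
                   items: Iterable[Tuple[str, bytes]],
                   registry: SharedDigestRegistry) -> List[str]:
    """Return the names of items whose content no agent has saved yet."""
    kept = []
    for name, content in items:
        # Identical content yields the same SHA-256 digest, whatever the
        # file name or location.
        digest = hashlib.sha256(content).hexdigest()
        if registry.claim(digest, agent):
            kept.append(name)
    return kept


if __name__ == "__main__":
    registry = SharedDigestRegistry()
    print(extract_unique("agent-a", [("memo.doc", b"Q3 forecast"),
                                     ("copy.doc", b"Q3 forecast")], registry))
    print(extract_unique("agent-b", [("tape/memo.doc", b"Q3 forecast"),
                                     ("notes.txt", b"meeting notes")], registry))
```

Here the second agent skips its copy of the memo because the first agent has already claimed the same content. Agents running on separate appliances would need the registry to handle concurrent claims safely, which this single-process sketch does not attempt.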
Other new features include auto-tagging electronic data based on stored
queries, extracting e-mail to MSG format, and enhanced PDF support to discover
and identify suspicious documents.
Companies need to retain all e-mails, documents and files for a specified
period of time, but the information can get unwieldy and hard to manage. Not
being able to find a specific file or a series of e-mail communications when
needed can be disastrous during a lawsuit or regulatory hearing. A data
discovery platform like Index Engines provides an easy-to-search index that
makes data extraction and recovery painless, regardless of whether the data
is online or stored in proprietary backup and transfer formats, according
to the company.
The appliances support indexing and searching of up to 1 billion data
objects in a single box, the company claimed.
The base price for the unit is $85,000, and orders ship within four weeks
of being placed.