Atigeo Launches Big Data Semantic Search Tool Using NIH PubMed

Atigeo has introduced PubMed Explorer, a semantic tool that combs the National Institutes of Health's PubMed database to produce more relevant, context-based search results for medical data.

Atigeo, a big data analytics company, has launched PubMed Explorer, an application that lets medical researchers search the federal database and view results of medical studies, ranked by context, in a graphical display.

PubMed is a National Institutes of Health database that provides access to more than 400,000 medical research documents.

PubMed Explorer runs on Atigeo's cloud-based xPatterns big data semantic search platform, which fine-tunes search results by learning the user's search patterns. PubMed's linear search capabilities slow medical research, according to Atigeo.

xPatterns can apply analytics to PubMed's unstructured data set to deliver relevant results by generating domain concepts, Atigeo reported.

Semantic search is a technology that delivers results based on the context of a search term.

"We created our xPatterns unique semantic platform based on our own internal algorithms to help derive relevance from classifying data sets in addition to the types of searches we're familiar with, like Boolean searches," Christopher Burgess, chief operating officer and chief security officer for Atigeo, told eWEEK.

Atigeo announced PubMed Explorer July 24.

"Our goal is to provide medical researchers with the appropriate tools to shorten research cycles, enable breakthroughs and ultimately improve our health," Michael Sandoval, chairman and CEO of Atigeo, said in a statement.

PubMed Explorer acts as a domain expert: an algorithm extracts relevant terms from research studies or clinical EHRs and generates a graph of connections between the documents and the discovered data, Burgess explained.
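The idea of extracting terms and linking documents through them can be illustrated with a minimal Python sketch. This is not Atigeo's algorithm, which the company has not published; the stopword list, the function names `extract_terms` and `build_graph`, and the sample documents are all hypothetical, and real term extraction would be far more sophisticated than splitting on whitespace.

```python
from itertools import combinations

# Hypothetical stopword list; a real system would use a much richer model.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "for", "with", "to", "is"}

def extract_terms(text):
    """Naive term extraction: lowercase words with stopwords removed."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}

def build_graph(docs):
    """Connect each pair of documents that share at least one term.

    Returns a dict mapping (doc_id_a, doc_id_b) to the set of shared terms.
    """
    terms = {doc_id: extract_terms(text) for doc_id, text in docs.items()}
    edges = {}
    for a, b in combinations(sorted(terms), 2):
        shared = terms[a] & terms[b]
        if shared:
            edges[(a, b)] = shared
    return edges

# Made-up sample documents, for illustration only.
docs = {
    "doc1": "Screening for colon cancer in adults",
    "doc2": "Colon cancer and diet",
    "doc3": "Asthma treatment in children",
}
graph = build_graph(docs)
for pair, shared in graph.items():
    print(pair, sorted(shared))
```

In this toy data set, only the two colon cancer documents end up connected, which is the kind of link a graphical display could then draw between them.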

The tool then ranks search results using visualization tools like an "orange carrot" rather than just listing search results, said Burgess. The company also calls the graphical interface "bubbles and sticks."

Over time, the tool learns the context of searches, said Burgess.

"As your queries change, the manner in which the query relates to the data may also change," he said.

The xPatterns platform is able to find patterns in PubMed's unstructured data. A query of "cancer" and "colon," for example, would bring up graphical results showing articles on colon cancer.

A common Boolean search combines terms with operators such as "and," "not" and "or." However, xPatterns goes beyond Boolean searching to find the relevance between search terms, said Burgess.
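The difference between a Boolean match and a relevance ranking can be shown with a short Python sketch. The scoring here is a plain count of query-term occurrences, chosen only for illustration; it is an assumption and not the xPatterns relevance model, and the function names and sample documents are hypothetical.

```python
def tokenize(text):
    """Split text into lowercase words, stripping basic punctuation."""
    return [w.strip(".,;:()").lower() for w in text.split()]

def boolean_and(docs, terms):
    """Boolean AND: return ids of documents containing every query term."""
    return [doc_id for doc_id, text in docs.items()
            if all(t in tokenize(text) for t in terms)]

def rank_by_term_count(docs, terms):
    """Rank documents by how often query terms occur; drop non-matches."""
    scores = {doc_id: sum(tokenize(text).count(t) for t in terms)
              for doc_id, text in docs.items()}
    matches = [d for d in scores if scores[d] > 0]
    return sorted(matches, key=lambda d: scores[d], reverse=True)

# Made-up sample documents, for illustration only.
docs = {
    "a": "Colon cancer screening reduces colon cancer deaths",
    "b": "Cancer of the lung",
    "c": "Colon anatomy and colon function",
}
query = ["colon", "cancer"]
print(boolean_and(docs, query))
print(rank_by_term_count(docs, query))
```

The Boolean query returns only the document containing both terms, while the ranked version keeps partial matches and orders them by score, so a researcher still sees loosely related material below the strongest hit.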

"We believe it will show a revelation of data across large data sets that previously wasn't possible," said Burgess.

"The beauty is it works in a cloud-based environment," said Burgess. "It's a light lift. It runs over any number of infrastructure stacks within the cloud environment to include IBM InfoSphere BigInsights and Cloudera Hadoop."

In addition to medical research in PubMed, Atigeo uses its xPatterns platform to automate coding of insurance claims based on the Systematized Nomenclature of Medicine (SNOMED) classification system, said Burgess.

"We've identified 80 data sets we believe we can marry together to help in this," he said. "Our goal is to marry up as much of the world's research as we can."

Atigeo also offers xPatterns Computer-Assisted Coding (CAC), which uses semantic search to help with the health care industry's transition to International Classification of Diseases (ICD)-10 diagnosis codes. The Department of Health and Human Services has extended the implementation deadline for these codes to 2014. With a dual view of ICD-9 and ICD-10 data, CAC can reduce coding errors, according to Atigeo.