There are more than a few reasons why graph search has matured into a hot enterprise IT item as we move into the new year 2017: It just works, and it’s fast and efficient to boot.
Graph search, an open-source database approach built on the networking that people around the world do online every day, is the most far-reaching search technology to go mainstream since Google started indexing and ranking Websites 18 years ago. Basically, a graph search database anonymously uses the contacts in all the networks in which you work to help you find information.
Anything you touch, any service you use and anything people in your networks touch can eventually help speed information back to you. Graph search skips irrelevant material that would slow down the search.
A graph database uses graph structures for semantic queries with nodes, edges and properties to represent and store data. Graph databases are used for storing, managing and querying complex and highly connected data. Moreover, the graph database architecture is particularly well-suited for exploring data to find commonalities and anomalies among large data volumes and unlocking the value contained in the data’s relationships.
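The nodes, edges and properties described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how Neo4j or any other product stores data; the class names, the relationship types (FRIEND_OF, REVIEWED) and the sample people and restaurant are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Node:
    node_id: str
    label: str                      # e.g. "Person" or "Restaurant"
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    source: str                     # node_id of the start node
    target: str                     # node_id of the end node
    rel_type: str                   # e.g. "FRIEND_OF"
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Toy property graph: nodes and directed edges, each with properties."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, node_id: str, label: str, **properties: Any) -> None:
        self.nodes[node_id] = Node(node_id, label, dict(properties))

    def add_edge(self, source: str, target: str, rel_type: str,
                 **properties: Any) -> None:
        # Refuse edges that point at nodes we have never seen.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, rel_type, dict(properties)))

    def neighbors(self, node_id: str,
                  rel_type: Optional[str] = None) -> List[Node]:
        """Nodes reachable over one outgoing edge, optionally filtered by type."""
        return [
            self.nodes[e.target]
            for e in self.edges
            if e.source == node_id and (rel_type is None or e.rel_type == rel_type)
        ]


def build_demo_graph() -> PropertyGraph:
    """Hypothetical sample data: two people and one restaurant."""
    g = PropertyGraph()
    g.add_node("alice", "Person", name="Alice")
    g.add_node("bob", "Person", name="Bob")
    g.add_node("pho", "Restaurant", name="Pho Midtown", cuisine="Vietnamese")
    g.add_edge("alice", "bob", "FRIEND_OF")
    g.add_edge("bob", "pho", "REVIEWED", rating=5)
    return g
```

Because the relationships are stored as first-class records, a question such as "what have Alice's friends reviewed?" becomes two calls to neighbors() rather than a join across tables.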
Hugely popular cloud services such as Google, Yahoo, Bing, Twitter, Facebook, Pinterest, LinkedIn, Google+ and Web-based email all use graph search. Because they are massive holders of these connections, they have not only improved the way people interconnect, socialize and do business but also help improve our search for information on the Web.
How the Busiest Web Sites Enable Graph Search
Because computers and networks remember everything we do when we punch a keyboard, click on a Website or touch a virtual keyboard, the massive logs of data that networks save are leading us into a new world of search, one that is gradually moving toward graph search.
With the coming of the internet of things, you can bet that more and more enterprise IT managers are going to be looking for fast, inexpensive search mechanisms in order to keep track of all the new devices that will be coming onto the internet.
Here’s a typical use case: You might be looking for a particular type of restaurant in New York City–say, Vietnamese. Using a graph search engine–let’s use the one now powering Facebook’s internal search–you enter a query (e.g., “Vietnamese restaurant in midtown Manhattan”); the graph search database then connects the dots across your friends’ accounts and identifies whether they’ve talked about, photographed or reviewed Vietnamese restaurants in midtown Manhattan.
It then looks at previous queries you have made about Vietnamese restaurants, checks the geographic area where you currently are, and delivers the search results to you in microseconds.
The result is that you get a much more relevant type of search via people with whom you are connected–not a list of Web pages based on keywords that may or may not satisfy your query.
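The restaurant example can be sketched roughly as follows. This is not Facebook's actual algorithm, which the article does not describe in detail; it is a toy ranking that assumes a friends list, a log of friend interactions and a restaurant catalog, all with invented names and data, and it leaves out the query-history and location signals.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical sample data: who is friends with whom.
FRIENDS: Dict[str, Set[str]] = {
    "you": {"ana", "ben"},
    "ana": {"you"},
    "ben": {"you"},
    "cal": set(),
}

# Hypothetical interaction log: (person, action, restaurant).
INTERACTIONS: List[Tuple[str, str, str]] = [
    ("ana", "reviewed", "Pho Midtown"),
    ("ben", "photographed", "Pho Midtown"),
    ("ben", "reviewed", "Saigon Corner"),
    ("cal", "reviewed", "Hanoi House"),      # cal is not your friend
    ("ana", "reviewed", "Trattoria Uno"),    # wrong cuisine
]

# Hypothetical restaurant properties.
RESTAURANTS: Dict[str, Dict[str, str]] = {
    "Pho Midtown": {"cuisine": "Vietnamese", "area": "Midtown Manhattan"},
    "Saigon Corner": {"cuisine": "Vietnamese", "area": "Midtown Manhattan"},
    "Hanoi House": {"cuisine": "Vietnamese", "area": "Midtown Manhattan"},
    "Trattoria Uno": {"cuisine": "Italian", "area": "Midtown Manhattan"},
}


def friend_recommendations(user: str, cuisine: str,
                           area: str) -> List[Tuple[str, int]]:
    """Rank matching restaurants by how many times the user's friends
    talked about, photographed or reviewed them."""
    friends = FRIENDS.get(user, set())
    counts: Counter = Counter()
    for person, _action, place in INTERACTIONS:
        info = RESTAURANTS.get(place)
        if (person in friends and info is not None
                and info["cuisine"] == cuisine and info["area"] == area):
            counts[place] += 1
    # Most friend activity first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


print(friend_recommendations("you", "Vietnamese", "Midtown Manhattan"))
# prints [('Pho Midtown', 2), ('Saigon Corner', 1)]
```

Hanoi House matches the query but is left out because nobody in the user's network has touched it, which is the difference from a keyword-ranked list of Web pages.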
Well-Known Use Case for Neo4j: Panama Papers
Here’s another use case, this one better known: the Panama Papers, an exposé of the offshore tax haven activity of many members of the global elite–political and otherwise.
The International Consortium of Investigative Journalists (ICIJ) used Neo4j and Linkurious, a graph visualization library for Neo4j, to unearth the details of the Panama Papers, the reporting of which took several years. The final report, published in book form, was released in April 2016.
“Neo4j is a revolutionary discovery tool that’s transformed our investigative journalism process because relationships are all important in telling you where the criminality lies, who works with whom, and so on,” ICIJ Data and Research Unit editor Mar Cabra said. “Understanding relationships at huge scale is where graph techniques excel.”
The Papers consisted of 2.6TB of data that was obtained by German newspaper Süddeutsche Zeitung and shared with the ICIJ. There were more than 11.5 million documents in all; that’s certainly big data. The international project, which included more than 300 journalists, investigated the rogue financial services businesses that assisted public political figures in several countries who were hiding their (and often the public’s) money offshore.
The ICIJ needed an intuitive, easy-to-use solution that did not require the intervention of data scientists or developers, so that journalists could work with the data regardless of their technical abilities. Thus, Neo4j became the perfect data management and storage tool, because it worked across clouds, languages and security requirements.
Graph DBs Continue to Gain Market Share
Graph databases are rapidly growing in usage, although most people still are not aware of them. From Websites adding social-network features to telecoms providing personalized customer services to bioinformatics research, organizations are adopting graph databases as an efficient way to model and query already-connected data. Most of the growth of the genre thus far has been the result of word-of-mouth among database admins and CTOs.
Facebook put graph search on the mainstream IT map in early 2013 when CEO Mark Zuckerberg announced it was going into early adopters’ accounts. Later that year it went into general availability on the site. Other companies have been rolling out their own versions of the database based on the open-source model. The world’s largest social network also launched Graph Search for its iOS application and for its Messenger app.
Although Facebook has made graph search widely known and freely available within its own environment, the social network did not invent the technology; it started as an open-source project in the late 1990s.
Neo Technology researchers in Sweden have pioneered graph databases–they’ve been working on them since 2000–and have been instrumental in bringing them to a growing number of enterprises worldwide. Among those are Global 2000 companies such as Cisco Systems, Accenture, Deutsche Telekom and Telenor.