Weaving the Semantic Web

eWEEK's Emerging Technology Looks at the Semantic Web

Semantic Web Technology Gains Steam - eWEEK Labs tests out tools for building the Semantic Web

Using the Semantic Web in the Real World - A visual look at several real-world deployments of Semantic Web technology and how they compare to current methods

Podcast: The Challenges of the Semantic Web - In this podcast I speak to Tim Berners-Lee about the current status of the Semantic Web, the challenges it faces and its future. I also speak to Eric Miller of Zepheira and to Stephen Downes, a researcher at the National Research Council's Institute for Information Technology in Canada.

And more to come. Check back later this week as I'll be posting my full interview with Tim Berners-Lee.

In 1991 Tim Berners-Lee created the World Wide Web and forever changed business, education, and the way people interact. A few years after that, he began speaking about his next vision for the Web, one which would do for data what the original Web had done for unstructured content.

Berners-Lee called this new vision the Semantic Web. Put simply, the Semantic Web would make it possible to treat the entire Web as if it were a database. In the same way that a developer can query data in a standard database and build applications that use that data, people would be able to query data from across the entire web and build as-needed applications that pulled related but diverse data from multiple sources.

On the Semantic Web, it wouldn't be necessary to infer what something was about through the use of text searches and guesswork since the information would be specifically tagged and marked up to clearly say what it was. More importantly, the Semantic Web would make it possible to easily link to and find similar and related data.

However, it has taken many years to get all of the pieces of the Semantic Web into place. Several core pieces, including the querying language, only recently came close to being standardized by the World Wide Web Consortium, of which Berners-Lee is the Director. To many it seemed as if the Semantic Web was an emerging technology that was taking an awfully long time to emerge.

Now the Semantic Web is finally taking shape. Businesses, sites and Web applications are beginning to define, link to, and create data models that take advantage of Semantic Web technologies to provide new types of functionality. Now is the time for businesses, developers and Web users to get ready: the Semantic Web is finally here.

eWEEK Labs interviewed the man himself, Tim Berners-Lee, to get his take on the status and outlook of the Semantic Web. We also spoke to Eric Miller, the longtime head of the Semantic Web initiative at the World Wide Web Consortium and the President of Zepheira, a company that helps businesses deploy and leverage Semantic Web technologies.

In order to shed light on the real world implications of the Semantic Web, we've also evaluated several examples of publicly accessible real-world implementations of Semantic Web technologies.

Finally, we've studied the challenges facing the Semantic Web, from security risks to hype curves to proprietary data islands, and spoken to Stephen Downes, a researcher at the National Research Council's Institute for Information Technology in Canada, who believes that the Semantic Web will ultimately fail because of proprietary data protections.

With this information, we hope you'll gain a better understanding of what the Semantic Web is, where it stands now, its outlook for the future, and, most importantly, how it will impact your business.

Under Construction

To a large degree, the Semantic Web has been a work in progress for as long as the term has existed. Berners-Lee said, "The last ten years we've been building the foundation of the Semantic Web in the sense of building the data formats and building the ontology language and all the things related to them."

The Semantic Web relies on several key technologies to make content data-aware. The first was part of the original Web: URIs (Uniform Resource Identifiers). Any time you use the Web you are using URIs, because they are its core addressing method (every standard URL Web address is a type of URI). URIs matter to the Semantic Web because you must be able to address and identify a data source in order to access it, just as you would a website.
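To make the URL-versus-URI distinction concrete, here is a minimal Python sketch that uses only the standard library. The two identifiers are hypothetical (example.org is a reserved documentation domain); the point is simply that a web address and a non-web identifier share the same basic structure.

```python
from urllib.parse import urlparse

# Hypothetical identifiers, used purely for illustration.
identifiers = [
    "http://example.org/people/jim#me",  # an ordinary web address (a URL)
    "mailto:jim@example.org",            # a URI that does not point at a web page
]

for uri in identifiers:
    parts = urlparse(uri)
    # Every URI has a scheme; the remaining parts depend on the scheme.
    print(parts.scheme, parts.netloc or "-", parts.path, parts.fragment or "-")
```

Both strings parse the same way, which is what lets Semantic Web data name a person, a document or a dataset with one kind of identifier.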

Even more central to the Semantic Web is RDF (Resource Description Framework), which was the first Semantic Web standard to be defined. RDF makes it possible to describe Web-based content so that it is understandable to machines. Good examples of RDF are FOAF (Friend of a Friend) files, which are essentially Semantic Web files about people.

For instance, the FOAF file for "Jim Rapoza" makes it possible for a program to understand that there is a person whose name is Jim Rapoza, who has specific websites, business and educational affiliations, and friends. Most importantly, those friends have their own FOAF and RDF files, and the machine can follow those links. This is itself a core aspect of the Semantic Web: data leads to other related and relevant data.
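The idea of following links from one person's data to another's can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular FOAF tool works: RDF statements are modeled as plain (subject, predicate, object) tuples, the person URIs and the friends' names are hypothetical, and everything lives in one in-memory set rather than in separate files fetched from the Web. The FOAF namespace itself is the real one.

```python
FOAF = "http://xmlns.com/foaf/0.1/"  # the FOAF vocabulary namespace

# Hypothetical person URIs and names, for illustration only.
JIM = "http://example.org/people/jim#me"
ANN = "http://example.org/people/ann#me"
BOB = "http://example.org/people/bob#me"

# Each RDF statement is a (subject, predicate, object) triple.
triples = {
    (JIM, FOAF + "name", "Jim Rapoza"),
    (JIM, FOAF + "knows", ANN),
    (ANN, FOAF + "name", "Ann Example"),
    (ANN, FOAF + "knows", BOB),
    (BOB, FOAF + "name", "Bob Example"),
}

def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def reachable_people(start):
    """Follow foaf:knows links breadth-first, starting from one person."""
    seen, frontier = set(), [start]
    while frontier:
        person = frontier.pop(0)
        for friend in objects(person, FOAF + "knows"):
            if friend not in seen and friend != start:
                seen.add(friend)
                frontier.append(friend)
    return seen

names = sorted(
    name
    for person in reachable_people(JIM)
    for name in objects(person, FOAF + "name")
)
print(names)  # ['Ann Example', 'Bob Example']
```

Starting from Jim's data alone, the program reaches Bob only because Ann's data links to him, which is the "data leads to data" behavior described above.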

For a while RDF was pretty much the only Semantic Web standard, and while this led to some interesting RDF implementations, the Semantic Web remained stalled. Then the W3C released the Web Ontology Language, or OWL, which was especially important for business use, since the ability to define ontologies is key to categorizing and classifying groups of related data.
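One small piece of what an ontology provides is a class hierarchy that lets software classify data it was never explicitly told about. The sketch below is a deliberately simplified Python illustration of subclass reasoning only; OWL itself is far richer. The class names and instances are hypothetical, and the hierarchy is assumed to contain no cycles.

```python
# Hypothetical ontology: each class maps to its direct superclass.
SUBCLASS_OF = {
    "Sitcom": "TelevisionShow",
    "TelevisionShow": "CreativeWork",
}

# Hypothetical instance data: each item is declared with one class only.
INSTANCE_TYPE = {"ShowA": "Sitcom", "ShowB": "TelevisionShow"}

def all_classes(cls):
    """Return cls plus every superclass reachable from it (assumes no cycles)."""
    result = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result

def instances_of(cls):
    """Return instances whose declared class is cls or a subclass of cls."""
    return sorted(i for i, t in INSTANCE_TYPE.items() if cls in all_classes(t))

print(instances_of("TelevisionShow"))  # ['ShowA', 'ShowB']
```

ShowA was only ever declared a sitcom, yet it is returned for the broader class because the ontology says every sitcom is a television show.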

Still, the Semantic Web had a key weakness in that it had no querying language. As Berners-Lee said, "Imagine trying to develop relational databases without SQL." However, this was addressed with SPARQL, which brings SQL-like querying capabilities to RDF and the Semantic Web.

So there's your alphabet soup of standards and technologies. But how are people using Semantic Web technologies, and how do they differ from traditional Web resources? Really, the best way to understand the Semantic Web is through examples.
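At its heart, a SPARQL query is a set of triple patterns with variables that must all match the data at once. The following Python sketch imitates that matching on a tiny in-memory list of triples. It is a toy for illustration, not a SPARQL engine: the short names stand in for full URIs, the data is hypothetical, and any term starting with "?" is treated as a variable.

```python
def match(pattern, triple, bindings):
    """Extend bindings so pattern matches triple, or return None on conflict."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new

def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions

# Hypothetical data; short names stand in for full URIs.
data = [
    ("jim", "knows", "ann"),
    ("ann", "knows", "bob"),
    ("ann", "name", "Ann Example"),
    ("bob", "name", "Bob Example"),
]

# Roughly: SELECT ?name WHERE { jim knows ?friend . ?friend name ?name }
results = query([("jim", "knows", "?friend"), ("?friend", "name", "?name")], data)
print([row["?name"] for row in results])  # ['Ann Example']
```

The shared variable ?friend is what joins the two patterns, much as a join condition links two tables in SQL.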

Semantic Web in Action

There are lots of classic examples of Semantic Web technologies that can help with really thorny problems, such as life science applications that help researchers search, access and understand medicines and diseases that can be identified and referred to using multiple names. But there are also examples that apply to everyday Web usage.

DBpedia.org is a project that takes Semantic Web technology and applies it to the vast amount of data inside the popular Wikipedia.org Internet encyclopedia. Using DBpedia, it is possible to use SPARQL to query Wikipedia in a much more powerful way than is possible using standard search tools. For example, using the Wikipedia search engine to look for television sitcoms set in New York produces a pretty much useless set of results, in which only one sitcom even appears on the initial results page. However, using the Semantic Web-powered DBpedia, a fully accurate list of popular TV shows set in New York is returned, almost as if you had queried a SQL database rather than a website that had been made semantically aware.

Another example is the much-hyped Joost online television service, which uses Semantic technology on the back end to help users better understand the relationships between particular pieces of content, which, in turn, helps users find the content that they most want.
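A query of this kind might look something like the sketch below, which builds a request URL for DBpedia's public SPARQL endpoint without sending it. This is not the query we ran in our tests: the class and property names are assumptions chosen for illustration, and the terms DBpedia actually uses for a show's setting may differ, so check its ontology before relying on them.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # DBpedia's public SPARQL endpoint

# The class and property names in this query are assumptions for illustration.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?show WHERE {
  ?show a dbo:TelevisionShow ;
        dbo:location dbr:New_York_City .
}
LIMIT 20
"""

def build_request_url(query, endpoint=ENDPOINT):
    """Encode a SPARQL query as the 'query' parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query})

url = build_request_url(QUERY)
print(url)  # No network request is made here; the URL is only constructed.
```

Fetching that URL with any HTTP client would return the matching resources as structured results rather than as a page of text hits.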

Helping businesses overcome the hurdles to understanding and deploying Semantic Web technologies was one of the reasons why the Semantic Web Initiative's Eric Miller started the company Zepheira. Miller said, "There are lots of good standards and technologies out there but the gap between the standards and technologies was still quite large."

One key aspect that Miller has identified after deploying Semantic Web technologies in a wide variety of businesses is that most companies already have a great deal of rich semantic data in many existing systems, from mail applications to calendaring tools to databases to company LDAP directories. He said, "Enterprises are realizing that they have huge intellectual capital that they are not harnessing effectively."

Miller said that much of the work being done now in businesses revolves around freeing data from proprietary systems so that this data can be used in Semantic Web applications. He also said that more and more Semantic Web technologies are being used for traditional business integration. This is in line with comments from Berners-Lee, who told us that "The number one role of Semantic Web technologies is data integration across applications."
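Data integration of this sort comes down to expressing records from different systems in one common shape and letting shared identifiers join them. The Python sketch below is a minimal illustration with hypothetical records, field names and predicate names; it is not drawn from Zepheira's work. A directory entry and a calendar entry are both turned into triples keyed by the same mailto: URI.

```python
# Hypothetical records from two separate systems.
directory = [{"mail": "ann@example.org", "cn": "Ann Example", "dept": "Sales"}]
calendar = [{"attendee": "ann@example.org", "event": "Quarterly review"}]

def person_uri(email):
    """Use a mailto: URI as the shared identifier for a person."""
    return "mailto:" + email

def directory_triples(rows):
    """Convert directory rows into (subject, predicate, object) triples."""
    for row in rows:
        subject = person_uri(row["mail"])
        yield (subject, "name", row["cn"])
        yield (subject, "department", row["dept"])

def calendar_triples(rows):
    """Convert calendar rows into triples about the same subjects."""
    for row in rows:
        yield (person_uri(row["attendee"]), "attends", row["event"])

# Merging is just a set union, because both sources now share one data model.
graph = set(directory_triples(directory)) | set(calendar_triples(calendar))

def describe(subject):
    """Collect everything known about one subject, whatever its source."""
    return {p: o for s, p, o in graph if s == subject}

print(describe("mailto:ann@example.org"))
```

Neither system knows about the other, yet a single lookup returns the person's name, department and meeting, because both sets of triples use the same identifier.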

Adoption Roadblocks

While the potential of the Semantic Web is very high, there are definitely plenty of issues and potential gotchas that face this emerging technology. Since it is a web-based technology, the Semantic Web will be vulnerable to scammers and bad guys who will try to use the technology to their advantage. For example, just as there are phishing sites that try to look like other legitimate sites, it is possible that similar techniques will be used to trick users with false data that appears to come from a legitimate source.

Also, access control is an important issue for Semantic Web applications, especially in business implementations, where it will be important to make sure data doesn't go to people who don't have the right to see it. Berners-Lee said that this is an area of focus for the Semantic Web community and pointed to the Policy Aware Web project, which is working towards creating access control rules for emerging web technologies.

Another challenge facing the Semantic Web is hype. The Semantic Web has recently come into vogue among vendors hoping to gain attention for their products, with some marketers already using the term Web 3.0 to describe Semantic Web products and technologies.

Typically what happens when a technology gets hyped is that lots of products start to claim that they are part of this new in-crowd, even if they really aren't. We've already received pitches of products claiming to be Semantic Web technologies that clearly have nothing to do with the Semantic Web. Often these types of hype cycles can actually slow down the progress of an emerging technology as they confuse potential customers and distract developers.

Berners-Lee said that there's one simple way to determine if a product is actually a Semantic Web technology: Look for the standards support. If the product doesn't support core standards like RDF, OWL or SPARQL, then it isn't a Semantic Web product.

However, to some observers the biggest challenge facing the Semantic Web isn't security or hype or standards support; it's greed. One argument is that businesses, software vendors and large commercial websites won't want to expose their data, and that they'll develop their own proprietary formats in order to keep people on their products and sites.

This is the argument made by researcher Stephen Downes, who wrote a blog essay entitled "Why the Semantic Web Will Fail". In our interview with him, Downes said, "Companies first and foremost attempt to secure a monopoly over a particular format or a particular standard."

Downes, who has worked with Semantic Web and similar technologies in his work in online learning, pointed out that while technologies like RDF have been around for years, many large companies have avoided using them in projects where they should have made sense. And it isn't hard to see his point, from public sites like Flickr and even Google to corporate products like IBM's Lotus Connections, which has lots of semantic capabilities but doesn't use RDF or other Semantic Web technologies.

However, both Berners-Lee and Miller pointed out to us the many ways that proprietary data can be easily converted into Semantic Web data, noting, for example, that sites like Flickr are already machine readable. Also, Berners-Lee said that in order for sites and products to remain competitive they will have to make their proprietary data Semantic Web aware. He said that people won't give sites and companies their data (which is what makes most sites valuable) unless they can reuse it: "All of these sites, no matter how fancy they are, they are going to have to realize that the users will want their data back."
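As a rough illustration of how little it can take to turn a site's own records into Semantic Web data, the Python sketch below writes a record out as N-Triples, a simple line-based RDF format. The photo record, its field names and the vocabulary URI are all hypothetical and do not reflect Flickr's actual data or API; the escaping covers only the most common cases.

```python
def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )

def to_ntriples(subject, record, vocabulary):
    """Serialize a flat record as N-Triples, one statement per field."""
    lines = []
    for field, value in sorted(record.items()):
        predicate = vocabulary + field
        if isinstance(value, str) and value.startswith(("http://", "https://")):
            obj = "<" + value + ">"  # treat web addresses as links to resources
        else:
            obj = '"' + escape_literal(str(value)) + '"'  # everything else is text
        lines.append("<%s> <%s> %s ." % (subject, predicate, obj))
    return "\n".join(lines)

# Hypothetical photo record and vocabulary.
photo = {
    "title": 'Sunset "over" Boston',
    "owner": "http://example.org/people/jim#me",
}
output = to_ntriples("http://example.org/photos/42", photo, "http://example.org/vocab/")
print(output)
```

The owner field becomes a link to another resource rather than a piece of text, so a program reading the output can go on to look up data about that person.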

The Semantic Future

So what is the outlook for the Semantic Web? Will we continue to see lots of islands of proprietary semantic data that don't integrate well together? Or will we soon see the giant all-the-Web-as-a-database scenario, in which the Semantic Web makes possible all kinds of new and exciting types of applications?

Our take is that the Semantic Web will eventually succeed, as it holds too many benefits for too many people to fall by the wayside. But it's also likely that the Semantic Web won't come about in the exact same way that many people envision. The lesson of the Web 2.0 technologies is that users are often surprising in the way they utilize new technologies. Miller told us that he had already seen businesses using Semantic Web technologies in interesting and unexpected ways.

But one thing is certain. The way that information is found, data is analyzed and web applications are built is going to change radically because of these new technologies. Businesses should start investigating these technologies and figuring out how to best leverage them in their infrastructure.

Or to use the words of Tim Berners-Lee, "It's time to get Semantic Web wise."