Content Repository API for Java, a specification released as Version 1.0 in June 2005, gives applications access to content collections with minimal concern for storage details. Also known by its Java Specification Request number, JSR-170, or by its nickname, JCR (Java Content Repository), it defines an object model and interfaces for standardized and disciplined access to many storage types and arrangements.
Enterprises continue to find that key information about their customers, their suppliers and competitive environments is in different places and varied forms, making applications both overly complex and insufficiently flexible as new data sources emerge.
JCR lead company Day Software, in Irvine, Calif., has released an open-source JCR implementation and has augmented it to deliver a commercial JCR product to address this critical enterprise concern. eWEEK Technology Editor Peter Coffee spoke with Day's chief scientist, Roy Fielding, about key JCR concepts.
As I understand it, one of the most distinctive features of Day's Communiqué content management system is the fact that it's one of the few full-scale implementations of JCR technology. Is that correct?
Yes, we're one of the developers of the JCR technology. We based it on our earlier content bus ideas.
How would you characterize the distinctive opportunities that the existence of JCR as a standard creates for people over and above the ways that were available to do content management on the Java platform—or, for that matter, on any platform—before JCR took form?
Well, you lose some of the vendor lock-in, and, as a result of that, you have more opportunity for the global network effect of having many more application developers working on a common platform.
What we've done is try to increase adoption of the technology by making it available through Apache, through Jackrabbit, and trying to get more people involved with building applications on top of it as a platform.
And companies can develop on the JCR platform—they can build products that compete essentially with our existing Communiqué product. Of course, there are many improvements that are made in the process of standardizing on JCR, but you create a much larger base of developers working on applications, and as those applications mature, it becomes an infrastructure issue of which content repository you want to have underneath that. Right now, we also have a CRX content repository that is our version of the JCR implementation.
You've used the phrase “content bus.” I've heard the concept of “bus” applied to services and a number of other things, but it's the first time I've heard of a “content bus.” Could you elaborate on that?
The term came from the earlier versions of our Communiqué product. It was the name that we gave to what is now called the Content Repository API for Java technology, which is the JCR interface. It's an architecture for creating a content-centric integration interface with data that you're involved with in an enterprise, so that you operate on that data regardless of where that data is stored.
It's a way of providing an abstraction on all of the methods of storage you might have in an enterprise, without having to have the application developer write to a specific interface for each one of those storage methods.
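To make the "content bus" idea concrete, here is a minimal Java sketch of an application addressing content by path while different storage arrangements sit behind one interface. Every name in it (`ContentStore`, `MapStore`, `RowStore`, `ContentBus`, and the example paths) is invented for illustration. It is not the JCR API, which defines its own interfaces in the `javax.jcr` package, and it is not how Communiqué or Jackrabbit is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical interface for illustration only; this is NOT the javax.jcr API.
interface ContentStore {
    Optional<String> read(String path);
    void write(String path, String value);
}

// One storage arrangement: a plain in-memory map keyed by path.
class MapStore implements ContentStore {
    private final Map<String, String> data = new LinkedHashMap<>();

    public Optional<String> read(String path) {
        return Optional.ofNullable(data.get(path));
    }

    public void write(String path, String value) {
        data.put(path, value);
    }
}

// A different arrangement: a list of {path, value} rows that must be scanned,
// standing in for something like a flat file or a simple table.
class RowStore implements ContentStore {
    private final List<String[]> rows = new ArrayList<>();

    public Optional<String> read(String path) {
        for (String[] row : rows) {
            if (row[0].equals(path)) return Optional.of(row[1]);
        }
        return Optional.empty();
    }

    public void write(String path, String value) {
        for (String[] row : rows) {
            if (row[0].equals(path)) { row[1] = value; return; }
        }
        rows.add(new String[] {path, value});
    }
}

// The "bus": routes each path to whichever store is mounted under its prefix,
// so calling code never names a specific storage method.
class ContentBus implements ContentStore {
    private final Map<String, ContentStore> mounts = new LinkedHashMap<>();

    void mount(String prefix, ContentStore store) {
        mounts.put(prefix, store);
    }

    private Map.Entry<String, ContentStore> route(String path) {
        for (Map.Entry<String, ContentStore> e : mounts.entrySet()) {
            if (path.startsWith(e.getKey() + "/")) return e;
        }
        throw new IllegalArgumentException("No store mounted for " + path);
    }

    public Optional<String> read(String path) {
        Map.Entry<String, ContentStore> e = route(path);
        return e.getValue().read(path.substring(e.getKey().length()));
    }

    public void write(String path, String value) {
        Map.Entry<String, ContentStore> e = route(path);
        e.getValue().write(path.substring(e.getKey().length()), value);
    }
}

class ContentBusDemo {
    public static void main(String[] args) {
        ContentBus bus = new ContentBus();
        bus.mount("/customers", new MapStore());
        bus.mount("/suppliers", new RowStore());

        // The application uses the same calls for both backends.
        bus.write("/customers/acme/name", "Acme Corp.");
        bus.write("/suppliers/initech/name", "Initech");
        System.out.println(bus.read("/customers/acme/name").orElse("(missing)"));
        System.out.println(bus.read("/suppliers/initech/name").orElse("(missing)"));
    }
}
```

In this sketch, swapping `RowStore` for another backend changes only the `mount` call; the reads and writes in the application stay the same.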
It sounds to me as if it's important not to confuse what you've just described with other abstraction schemes, such as CORBA, that try to make you indifferent to the physical device on which something is stored. You're talking about going to another level of abstraction: not even knowing what specific object you're asking for but, rather, being able to ask a pool of objects which of them has various content characteristics. Do I have that about right?
It's close, but not quite. It's actually a very similar concept of application integration. In a CORBA or a .Net style of interface, it's a controlled interaction, controlled-space interface. What I mean by that is if you think of an application like a word processor, such as Microsoft Word, you go up to all of the menus and you'll see a File menu, which has Open, Close, Save As and so on. Those are all control interfaces for the Word application.
The JCR interface takes a different perspective. It uses the same style of Web interface that I developed as part of the work on the World Wide Web project and applies it to the Java environment to create a data-centric interface.
So the data-centric interface is going to treat the data that's manipulated by the word processor as being more significant than the controls in the application itself. So, for example, it will focus on what a paragraph is or what style sheets are, things like that. You can have a more general concept: data can be less specific to particular applications than tools are, and this allows various advantages in terms of being able to apply more tool sets to that data.
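The contrast between a control interface and a data-centric one can be sketched in a few lines of Java. This is an illustration under stated assumptions: `WordProcessorControls`, `ContentNode`, `NodeTools` and the property names `text` and `style` are hypothetical and do not come from the JCR specification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical control-centric interface: verbs that belong to one application.
interface WordProcessorControls {
    void open(String file);
    void saveAs(String file);
    void close();
}

// Hypothetical data-centric model: a generic node with a type, properties and
// children. Nothing here is tied to a particular application.
class ContentNode {
    final String type;
    final Map<String, String> properties = new LinkedHashMap<>();
    final List<ContentNode> children = new ArrayList<>();

    ContentNode(String type) { this.type = type; }

    ContentNode set(String key, String value) {
        properties.put(key, value);
        return this;
    }

    ContentNode add(ContentNode child) {
        children.add(child);
        return this;
    }
}

// Two unrelated tools that work on the same generic data.
class NodeTools {
    // Counts whitespace-separated words in the "text" of every paragraph node.
    static int countWords(ContentNode node) {
        int total = 0;
        if (node.type.equals("paragraph")) {
            String text = node.properties.getOrDefault("text", "").trim();
            if (!text.isEmpty()) total += text.split("\\s+").length;
        }
        for (ContentNode child : node.children) total += countWords(child);
        return total;
    }

    // Collects the distinct "style" values in the tree, in first-seen order.
    static Set<String> stylesUsed(ContentNode node) {
        Set<String> styles = new LinkedHashSet<>();
        collectStyles(node, styles);
        return styles;
    }

    private static void collectStyles(ContentNode node, Set<String> out) {
        String style = node.properties.get("style");
        if (style != null) out.add(style);
        for (ContentNode child : node.children) collectStyles(child, out);
    }
}

class DataCentricDemo {
    public static void main(String[] args) {
        ContentNode doc = new ContentNode("document")
            .add(new ContentNode("paragraph")
                .set("text", "Content lives apart from tools").set("style", "body"))
            .add(new ContentNode("paragraph")
                .set("text", "Any tool can read it").set("style", "quote"));

        System.out.println(NodeTools.countWords(doc));
        System.out.println(NodeTools.stylesUsed(doc));
    }
}
```

Neither tool goes through `WordProcessorControls`; each works from the node's type and properties alone, which is the sense in which data can be less application-specific than tools.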
I guess that's the real key point to make here: that many attempts have been made to deliver the independence of content from form. You're talking about once again delivering that notion that data can be put into whatever application or whatever vehicle meets your needs without bringing with it a bunch of baggage from how it was originally expected to be used.
Right. There are a number of differences, big differences, in the way that JCR operates compared with the Web. The Web focuses on standardized data formats, while JCR, because it's an internal server API, really focuses on objects that can be manipulated as generic data. So it doesn't have the same parsing overhead as, for example, doing things with XML directly.
And that's a critical point, considering that the bandwidth explosion associated with that XML overhead and the parsing overhead you've just been describing are impediments to its use in some situations.
That's correct.