When an IT vendor wants to get my attention, the most effective technique is to rub my nose in one of my own recent columns and say, "You wanted someone to do this? Look at us."
I invited this kind of response when I wrote at the end of September about the crying need for much-improved analysis of software failures. I observed that airliners and now even automobiles have standards for "black box" capture and replay of the circumstances of failure. I soon heard about Israel's Identify Software, formerly known as Mutek Solutions, whose AppSight Black Box monitors and records an application's environment and behavior. In an on-line discussion late last year, one industry user called it "an awesome piece of software" and added, "I can't stop thinking of reasons to use it."
Identify will today release version 5.5 of the AppSight Black Box product, able to follow a problem across the boundaries between J2EE and .Net modules in hybrid applications, which seem to be more the rule than the exception among today's new development projects.
The product launch comes with a garnish of commissioned research that I find quite persuasive on one point in particular: the finding, from a survey of 479 companies, that developers "spend 75 percent of their time reproducing and communicating the problem to various team members and just 25 percent actually fixing it." To me, this has the ring of truth, because I know how much of my time while writing a product review goes merely into setting up a scenario for a communicative screen shot, let alone documenting precisely the circumstances of any observed misbehavior in a product.
For a dispersed development team, or for situations in which an early-adopter customer wants to communicate a problem back to a vendor who's still in the throes of completing the 1.0 release of a product, Identify will earn its keep by capturing a reproducible representation of a software failure that can easily be shared with others.
Another, less obvious dividend of AppSight Black Box is in effectively outsourcing the nuisance of writing logging code as part of every application. When I spoke with the Identify team about their imminent launch, they estimated that 10 to 15 percent of a developer's time may go into devising and crafting internal logging utilities for an enterprise application that needs to be monitored in the field. Using AppSight Black Box gives that developer an off-the-shelf logging system that captures everything, not just the behaviors that the developer suspected as possible problems. Identify has also provided welcome management tools, such as the ability to use one failure condition to trigger a more complete level of application recording for some period of time. This is a useful way to get the level of detail that's needed to characterize a failure, when it occurs, without filling unreasonable amounts of storage with verbose logs of perfectly normal operation.
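For readers who want to see the shape of that failure-triggered recording idea, here is a minimal sketch using Python's standard logging module. It is not AppSight's mechanism or API, about which I have no implementation details; the class name EscalatingHandler, the 60-second window and the demo messages are all hypothetical, chosen only for illustration.

```python
import logging
import time


class EscalatingHandler(logging.Handler):
    """Hypothetical handler: keeps warnings and above in normal operation,
    and keeps every record for `window` seconds after an error is seen."""

    def __init__(self, window=60.0, clock=time.monotonic):
        super().__init__(level=logging.DEBUG)
        self.window = window                  # seconds of verbose capture after an error
        self.clock = clock                    # injectable clock, so the demo needs no sleeping
        self.verbose_until = float("-inf")    # no verbose window is open at start
        self.captured = []                    # stands in for real log storage

    def emit(self, record):
        now = self.clock()
        if record.levelno >= logging.ERROR:
            # The failure condition opens (or extends) the verbose window.
            self.verbose_until = now + self.window
        if record.levelno >= logging.WARNING or now < self.verbose_until:
            self.captured.append(f"{record.levelname}: {record.getMessage()}")


def demo():
    fake_time = [0.0]                         # assumed fake clock for the demonstration
    handler = EscalatingHandler(window=60.0, clock=lambda: fake_time[0])
    logger = logging.getLogger("escalation_demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers.clear()
    logger.addHandler(handler)

    logger.debug("routine detail")            # normal operation: not stored
    logger.error("payment failed")            # failure: stored, opens the window
    logger.debug("retrying payment")          # inside the window: stored
    fake_time[0] = 120.0                      # window has expired
    logger.debug("routine detail again")      # back to normal: not stored
    return handler.captured


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The sketch only captures detail after the triggering failure. Recording what led up to it would need something more, such as a rolling buffer of recent records that gets flushed when the error arrives.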
Another comment on my September "black box" column suggested that it offered "another reason why the EPR approach can be preferred in terms of system predictability." If EPR is a label that's new to you, as it was to me, then you'll find it worth your time to drop by the EPRforum site to look over its materials on the Electronic PRocess effort to define and deploy reliable service-based solutions.
Finally, a reader of that September column suggested that other readers might want to look into the work that's being done on Notification of Failure (NOF) and Time to Perform (TTP) in the context of ebXML: the ebxml-bp archive contains some recent comments on efforts in this area.
Notify me of my failures at firstname.lastname@example.org