Zeroing In on Site and Security Flaws
To think about software problems in terms of statistics is tempting, and better than doing no analysis at all, but stopping there is terribly hazardous to system health. It's simply not enough to get most of the code working most of the time: not when the performance of that code determines the first impressions that new customers form of an enterprise, or when the security of that code maintains the integrity of relationships with established supply-chain partners.
Statistical thinking is especially flawed in the realm of security, where failure is actively sought by opponents rather than merely occurring by chance. This is, as I've said before, an exercise in game theory rather than a species of traditional debugging.
In most cases, moreover, defects or vulnerabilities represent weak links in a chain: Break any one, and the system fails. As obvious as this might seem, it's not the impression that's created by many diagrams of IT system architecture, which often seem to suggest that there are layers of redundant strength.
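The arithmetic behind that chain metaphor is worth making explicit. The sketch below uses purely illustrative numbers, not measurements from any real system: when every component must work for the system to work, individually impressive reliability figures multiply down to something far less reassuring.

```python
# Serial "chain" reliability: the system works only if every link works.
# Numbers are illustrative only.
def chain_reliability(link_reliabilities):
    """Probability the whole chain holds: the product of each link's reliability."""
    result = 1.0
    for r in link_reliabilities:
        result *= r
    return result

# Twenty components, each 99 percent reliable, still yield a system
# that fails nearly one time in five.
links = [0.99] * 20
print(round(chain_reliability(links), 3))  # 0.818
```

This is why "most of the code working most of the time" is such a weak guarantee: the failures compound rather than average out.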
We readily fall into that trap because we've learned to expect redundancy in physical systems. We even see it in the computer systems that operate critical real-world mechanisms like the space shuttle. But we've also been known to fool ourselves into thinking that we had redundancy when we were actually vulnerable to a single common failure mode.
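That self-deception also has simple arithmetic behind it. In this sketch (again with illustrative numbers, not real measurements), redundant copies look spectacular on paper until a shared dependency, a common power feed, configuration, or software bug, caps the whole arrangement:

```python
def parallel_reliability(r, n):
    """n independent redundant copies, each with reliability r:
    the arrangement fails only if all n copies fail at once."""
    return 1 - (1 - r) ** n

def with_common_mode(r, n, common_ok):
    """Same redundancy, but a shared failure mode (power, config,
    a common software bug) takes out every copy simultaneously
    with probability 1 - common_ok."""
    return common_ok * parallel_reliability(r, n)

# Three independent 99%-reliable copies: six nines, on paper.
print(round(parallel_reliability(0.99, 3), 6))        # 0.999999
# Add one 99%-reliable shared dependency, and the redundancy
# buys almost nothing: the system is back to roughly 99%.
print(round(with_common_mode(0.99, 3, 0.99), 4))      # 0.99
```

The redundancy is only as real as the independence of the copies.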
Enterprise systems such as e-business Web sites often seem even more dangerously brittle and more optimistically designed. Too often, we rely on the system's own opinion of whether or not it's working, instead of enabling independent measurement of what's actually going on: We put a lot of money and analytic effort into tools that don't always measure the right things. "Even with network monitoring tools in place, a staggering 72.6% first learn about performance problems from end-user calls to the help desk, and another 82.3% said employee complaints usually are the first they hear of slowdowns on their networks," summarized a report compiled this past July from a survey of enterprise network managers.
It was an interesting coincidence, then, that saw me meeting last week with toolmakers that raise the bar for relevant measurement of both application performance and network security policy. I made my first acquaintance with TeaLeaf Technology Inc., whose IntegriTea products capture Web site session histories and enable rapid analysis of unsatisfactory behavior; I also got an update on what's happening at Preventsys, whose automated security audit technology I've previously discussed with you, and whose latest version 1.5 products accelerate developers' efforts to comply with expanding regulatory requirements.
One independent study last year using TeaLeaf's tools found application errors in five-eighths of a sample of major e-business sites. During my meeting with TeaLeaf last week, we went through an extended scenario based on a financial services site: I was impressed by the power of its logging technology and analytic tools to bring out patterns of Web site behavior that quickly zeroed in on the kinds of flaws that lead to infuriating waste of customer time and often to the abandonment of transactions.
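To make the idea concrete, here is a toy sketch in the spirit of session-capture analysis, emphatically not TeaLeaf's actual product or data format: given per-session event streams, it flags the pages where customers hit an error and then gave up without completing the transaction.

```python
from collections import Counter

def abandonment_hotspots(sessions):
    """Count, per page, the sessions in which an error was followed by
    an exit without a completed transaction. The event schema here
    (dicts with 'page' and optional 'error' keys) is hypothetical."""
    hotspots = Counter()
    for events in sessions:
        pages = [e["page"] for e in events]
        if "checkout_complete" in pages:
            continue  # transaction finished; not an abandonment
        error_pages = [e["page"] for e in events if e.get("error")]
        if error_pages:
            hotspots[error_pages[-1]] += 1  # blame the last page that errored
    return hotspots

sessions = [
    [{"page": "quote"}, {"page": "apply", "error": "500"}],        # abandoned
    [{"page": "quote"}, {"page": "apply"},
     {"page": "checkout_complete"}],                               # completed
    [{"page": "login", "error": "timeout"},
     {"page": "login", "error": "timeout"}],                       # abandoned
]
print(abandonment_hotspots(sessions))  # 'apply' and 'login' each flagged once
```

Even this crude version shows why session histories are so valuable: the pattern of where transactions die points directly at the flaws worth fixing first.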
And whether we're talking about simple failure to complete a transaction correctly, or about complex vulnerabilities that might result in disclosure of sensitive data or in financial losses due to fraud or to system disruption, we're talking about enterprise systems that represent promises made to other parties. When software doesn't keep those promises, we can expect to face rising expectations, in courts of law, that by now we should be getting these things right.
Let's set our sights on building systems that work; let's use the tools that are right for that job.