JMP Sifts Insight From Mountains of Data

 
 
By Peter Coffee  |  Posted 2002-10-07

Statistical app gains new analytic capabilities; key upgrade features are missing from Mac edition.

When you think you know what's happening, conventional statistical software can help you decide if you're just imagining things. In enterprise settings, it's more common to have a pile of data and to wonder if useful insights are buried within, a situation that demands a different approach to software design, as attempted by SAS Institute Inc.'s JMP 5.0 "statistical discovery software."

Shipping now for Windows and Mac OS, the $995 package ($395 for an upgrade from Version 4.0) is still distinguished by the ease of navigation, from raw data through related analyses and graphics, that's been this product's hallmark since our review hailed Version 1.0 as one of the breakthrough software designs of the early 1990s.

To our surprise, the release candidate code that we reviewed froze up several times on the Windows 98 workstation that we used for our tests; buyers considering a volume purchase will do well to conduct a trial of the current code on their own intended platform.

The multitabbed JMP Starter dialog offered useful capsule descriptions of each available family of statistical tests. Each test presented a construction panel for choosing the variables that would be examined. Resulting plots were not just static diagrams but remained tied to the data.

When we selected the data points in a particular cluster on an X-Y scatter plot, for example, we saw the corresponding entries immediately highlighted in the nearby cluster hierarchy view and the tabular data set window.

Novel analytic methods, such as clustering and neural-network modeling, came readily to hand during our tests of JMP 5.0, encouraging exploration by users who probably would not acquire a dedicated tool for these unfamiliar purposes. The product's manual and sample data sets will quickly make clear the power that these tests provide.

We also applaud the more general treatment and presentation of data errors in JMP 5.0, ranging from optional error bars on bar plots to the availability of partial least-squares fits (which avoid the common problem of overfitting data so that the sample is well-described but future observations are poorly predicted). For situations where a lead user devises procedures for others, JMP 5.0 expands the product's scripting language with new commands for matrix operations and for managing interactive sessions.
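To make the overfitting problem concrete, here is a minimal Python sketch. It is not partial least squares and has nothing to do with JMP's own code: it simply compares a straight-line least-squares fit with a polynomial forced through every sample point. The data values are made up for illustration (a y = x trend with small alternating errors), and all function names are our own.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a predictor function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial passing exactly through every sample point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mean_abs_error(model, xs, ys):
    """Average absolute gap between the model's predictions and the data."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical sample: underlying trend y = x, plus small measurement errors.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.3, 0.7, 2.4, 2.6, 4.3, 4.8]
# A "future observation" that follows the same trend.
new_x = [6.0]
new_y = [6.0]

line = fit_line(train_x, train_y)
wiggly = fit_interpolant(train_x, train_y)

print("line:        sample error", mean_abs_error(line, train_x, train_y),
      "new-data error", mean_abs_error(line, new_x, new_y))
print("interpolant: sample error", mean_abs_error(wiggly, train_x, train_y),
      "new-data error", mean_abs_error(wiggly, new_x, new_y))
```

The interpolating polynomial describes the sample perfectly and then misses the new observation by a wide margin, while the humbler straight line stays close. Methods such as partial least squares aim to limit model flexibility so that this trade is made more sensibly.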

In general, JMP 5.0 has been scaled up from previous versions to handle the larger problems that are likely to be found in today's data-rich enterprise. The cluster algorithm is still quite memory-intensive (product documents describe a 4,700-row test as consuming 125MB), but SAS claims that speed is much improved. We did not have an earlier version available to verify this.

Random number generation has also been re-implemented to reduce the chance of repeating a sequence: At 10 billion random numbers per second, hypothetically, the new algorithm should not repeat in less than 10^5984 years (that is, a 1 followed by about 6,000 zeros). We think we can live with that.
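The arithmetic behind a figure of that size can be checked in a few lines of Python. The review does not name the generator, so the sketch below assumes a period of 2^19937 - 1, the period of the widely used Mersenne Twister; that choice is our assumption, as are the constant and function names. Working in base-10 logarithms keeps the numbers manageable.

```python
import math

# Assumption (not stated in the review): the generator's period is
# 2**19937 - 1, as for the Mersenne Twister. The "- 1" is negligible here.
PERIOD_LOG10 = 19937 * math.log10(2)

DRAWS_PER_SECOND = 10_000_000_000        # 10 billion, per the review
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def log10_years_to_repeat(period_log10, draws_per_second):
    """Base-10 log of the years needed to exhaust the period at a given rate."""
    draws_per_year = draws_per_second * SECONDS_PER_YEAR
    return period_log10 - math.log10(draws_per_year)


exponent = log10_years_to_repeat(PERIOD_LOG10, DRAWS_PER_SECOND)
print(f"about 10^{math.floor(exponent)} years")  # prints: about 10^5984 years
```

Under that assumed period, the result lands on the same order of magnitude as the figure quoted above.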

Analytic power isn't much use, though, without transparent access to data, and JMP 5.0's Internet access option is marred by lack of generality, unable to work with non-HTTP URLs such as those for local files.

An option to extract an HTML-formatted table directly into a JMP data set was handicapped, in the code we tested, by lack of freedom to specify whether the first row should be regarded as header or data. The manual (which we saw in final form) called this "an experimental command," and we hope that it will quickly be fleshed out.
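The kind of control we missed is easy to picture. The Python sketch below is not JMP's scripting language and makes no claim about how JMP implements its import; it only shows an HTML table reader with a caller-supplied switch for treating the first row as header or as data. The names `TableRows`, `table_to_dataset` and `first_row_is_header` are hypothetical, and the sample table is invented.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every <tr> in the document as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_dataset(html, first_row_is_header=True):
    """Return (column_names, data_rows), honoring the caller's header choice."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    rows = parser.rows
    if first_row_is_header and rows:
        return rows[0], rows[1:]
    # No header row: invent generic column names and keep every row as data.
    width = max((len(row) for row in rows), default=0)
    return [f"Column {i + 1}" for i in range(width)], rows


SAMPLE = ("<table><tr><td>height</td><td>weight</td></tr>"
          "<tr><td>59</td><td>95</td></tr></table>")

print(table_to_dataset(SAMPLE))                             # first row as header
print(table_to_dataset(SAMPLE, first_row_is_header=False))  # first row as data
```

A single flag of this sort lets the user, rather than the tool, decide what the first row means.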

While they're at it, we hope SAS developers will incorporate accessed URLs into JMP's automatically maintained list of recently opened files.

We were dismayed to find even this limited Internet access offered in only the Windows version of the product. Nor is this the only example of platform bias. Version 5.0's new facilities for opening SAS data files and for one-step import of formatted text as data are likewise Windows-only features. This seems to us a breach of faith with academic users, who, in many cases, still favor the Macintosh.

We hope that SAS will not fail to take full advantage of Mac OS X's graphical power, as well as offering Internet access to Mac users, soon.

Technology Editor Peter Coffee can be reached at peter_coffee@ziffdavis.com.



 
 
 
 
Peter Coffee is Director of Platform Research at salesforce.com, where he serves as a liaison with the developer community to define the opportunity and clarify developers' technical requirements on the company's evolving Apex Platform. Peter previously spent 18 years with eWEEK (formerly PC Week), the national news magazine of enterprise technology practice, where he reviewed software development tools and methods and wrote regular columns on emerging technologies and professional community issues. Before he began writing full-time in 1989, Peter spent eleven years in technical and management positions at Exxon and The Aerospace Corporation, including management of the latter company's first desktop computing planning team and applied research in applications of artificial intelligence techniques. He holds an engineering degree from MIT and an MBA from Pepperdine University, and he has held teaching appointments in computer science, business analytics and information systems management at Pepperdine, UCLA, and Chapman College.
 
 
 
 
 
 
 
