JMP Sifts Insight From Mountains of Data

 
 
By Peter Coffee  |  Posted 2002-10-07
 
 
 


When you think you know what's happening, conventional statistical software can help you decide if you're just imagining things. In enterprise settings, it's more common to have a pile of data and to wonder if useful insights are buried within. That situation demands a different approach to software design, as attempted by SAS Institute Inc.'s JMP 5.0 "statistical discovery software."

Shipping now for Windows and Mac OS, the $995 package ($395 for an upgrade from Version 4.0) is still distinguished by the ease of navigation, from raw data through related analyses and graphics, that's been this product's hallmark since our review hailed Version 1.0 as one of the breakthrough software designs of the early 1990s.

To our surprise, the release candidate code that we reviewed froze up several times on the Windows 98 workstation that we used for our tests; buyers considering a volume purchase will do well to conduct a trial of the current code on their own intended platform.

The multitabbed JMP Starter dialog offered useful capsule descriptions of each available family of statistical tests. Each test presented a construction panel for choosing the variables that would be examined. Resulting plots were not just static diagrams but remained tied to the data.

When we selected the data points in a particular cluster on an X-Y scatter plot, for example, we saw the corresponding entries immediately highlighted in the nearby cluster hierarchy view and the tabular data set window.

Novel analytic methods, such as clustering and neural-network modeling, came readily to hand during our tests of JMP 5.0, encouraging exploration by users who probably would not acquire a dedicated tool for these unfamiliar purposes. The product's manual and sample data sets will quickly make clear the power that these tests provide.

We also applaud the more general treatment and presentation of data errors in JMP 5.0, ranging from optional error bars on bar plots to the availability of partial least-squares fits (which avoid the common problem of overfitting, in which the sample is well-described but future observations are poorly predicted). For situations where a lead user devises procedures for others, JMP 5.0 expands the product's scripting language with new commands for matrix operations and for managing interactive sessions.

In general, JMP 5.0 has been scaled up from previous versions to handle the larger problems that are likely to be found in today's data-rich enterprise. The cluster algorithm is still quite memory-intensive (product documents describe a 4,700-row test as consuming 125MB), but SAS claims that speed is much improved. We did not have an earlier version available to verify this.

Random number generation has also been re-implemented to reduce the chance of repeating a sequence: At 10 billion random numbers per second, hypothetically, the new algorithm should not repeat in less than 10^5984 years (as in, 1 followed by about 6,000 zeros). We think we can live with that.
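For readers who want to check that arithmetic, here is a rough sketch in Python. It assumes the generator's period is 2^19937 - 1, the period of the widely used Mersenne Twister; that figure is our assumption, since nothing in the materials quoted here names the algorithm. The draw rate is the hypothetical 10 billion numbers per second.

```python
import math

# Assumption (not stated in the review): a period of 2**19937 - 1,
# as in the Mersenne Twister. The "- 1" is negligible at this scale.
PERIOD_LOG10 = 19937 * math.log10(2)

# Hypothetical draw rate from the text: 10 billion numbers per second.
DRAWS_PER_SECOND = 10_000_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# Work in logarithms, because the period itself overflows a float.
log10_years = PERIOD_LOG10 - math.log10(DRAWS_PER_SECOND * SECONDS_PER_YEAR)

print(f"about 10^{math.floor(log10_years)} years before the sequence repeats")
```

Under those assumptions, the exponent agrees with the figure of roughly 6,000 zeros.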

Analytic power isn't much use, though, without transparent access to data, and JMP 5.0's Internet access option is marred by lack of generality, unable to work with non-HTTP URLs such as those for local files.

An option to extract an HTML-formatted table directly into a JMP data set was handicapped, in the code we tested, by lack of freedom to specify whether the first row should be regarded as header or data. The manual (which we saw in final form) called this "an experimental command," and we hope that it will quickly be fleshed out.
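To make the missing choice concrete, here is a minimal sketch in Python (not JMP's scripting language, and not how JMP implements the command) of an HTML table importer that lets the caller say whether the first row is header or data. The names `TableRows`, `table_to_columns` and `first_row_is_header` are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every <tr> in the document, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_columns(html, first_row_is_header=True):
    """Return {column name: [values]}; the caller decides how to treat row 1."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    rows = parser.rows
    if not rows:
        return {}
    if first_row_is_header:
        names, body = rows[0], rows[1:]
    else:
        # Hypothetical default names when the first row is ordinary data.
        names = [f"Column {i + 1}" for i in range(len(rows[0]))]
        body = rows
    # Pad short rows with empty strings rather than failing.
    return {
        name: [row[i] if i < len(row) else "" for row in body]
        for i, name in enumerate(names)
    }


sample = (
    "<table><tr><td>height</td><td>weight</td></tr>"
    "<tr><td>60</td><td>110</td></tr></table>"
)
as_header = table_to_columns(sample)
as_data = table_to_columns(sample, first_row_is_header=False)
```

With the sample table, the first call treats "height" and "weight" as column names, while the second keeps them as the first data row under generated names.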

While they're at it, we hope SAS developers will incorporate accessed URLs into JMP's automatically maintained list of recently opened files.

We were dismayed to find even this limited Internet access offered in only the Windows version of the product. Nor is this the only example of platform bias. Version 5.0's new facilities for opening SAS data files and for one-step import of formatted text as data are likewise Windows-only features. This seems to us a breach of faith with academic users, who, in many cases, still favor the Macintosh.

We hope that SAS will soon take full advantage of Mac OS X's graphical power, as well as offering Internet access to Mac users.

Technology Editor Peter Coffee can be reached at peter_coffee@ziffdavis.com.

Executive Summary: JMP 5.0

Usability: Excellent
Capability: Good
Performance: Good
Interoperability: Good
Manageability: Good
Scalability: Good
Security: Good

JMP 5.0 updates SAS' highly integrated statistical exploration package with advanced analytic functions, such as neural-net analysis of data relationships, and improves the product's performance in large-scale tasks. Access to Internet-based data sources remains surprisingly tentative and, even in its present incomplete form, is offered only in the Windows version of the product; Mac OS users will find that several of the update's innovations are absent on their platform.

COST ANALYSIS

Statistical functions in spreadsheets, such as those in Microsoft Corp.'s Excel, are far less accessible and much less sophisticated (especially in their handling of error analysis) than those in JMP. At $995 for a single copy, JMP is not inexpensive, but its integrated scripting facilities can provide considerable leverage in the hands of a few power users, generating reports and presentation graphics that can support improved decision making.

(+) Ease of learning; broad analytic capabilities; outstanding ease of access to related analyses and presentation graphics.

(-) Limited Internet data acquisition tools; surprising feature gaps between Mac OS and more capable Windows versions.

EVALUATION SHORT LIST

  • Microsoft's Excel
  • Data Description Inc.'s Data Desk
  • SPSS Inc.'s SigmaStat
  • www.jmp.com
