Opinion: As hardware gets softer and software gets more modular, it's interesting to think about implementing compute-intensive, relatively stable portions of an application in silicon rather than in software.
People take it for granted that computer hardware should be universal, while applications should be specific to particular tasks. It seems obvious that this makes sense, since hardware is expensive to prototype but cheap to mass-produce, while software modules can be cost-effectively refined and customized.
At what point, though, do the cost curves cross to make specialized hardware more cost-effective than a special-purpose application on a general-purpose machine? As hardware gets softer with technology such as the FPGA (field-programmable gate array) and software gets more modular with architectures such as Web services, it's interesting to think about implementing compute-intensive, relatively stable portions of an application in silicon rather than in software.
Chunks of specialized hardware, side by side, make it easy to do things simultaneously. Tracy Kidder made this point in "The Soul of a New Machine," his 1981 saga of the genesis of a minicomputer. "I wondered," Kidder wrote, "why they had to struggle to fit Eagle's CPU onto seven boards when elsewhere engineers were packing entire CPUs onto single chips. The general answer was that a multiboard CPU simultaneously performs many operations that a single-chip CPU can do only sequentially."
Two decades later, single-chip CPUs are doing many more things concurrently than they used to. That's what makes an AMD Athlon, for example, so interesting to explore. Even so, compilers and operating systems still have to work hard to avoid wasting time while independent tasks wait for each other, or while the same thing gets done with independent data values several times in succession.
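The second kind of wasted time, the same operation repeated over independent data values, is easy to see in code. Here is a minimal C sketch (the function and data are illustrative, not from any real workload): because no iteration reads another iteration's result, the work could in principle be issued concurrently by a superscalar CPU, or laid out side by side in hardware, yet a naive sequential execution does it one element at a time.

```c
/* Each iteration is independent: out[i] depends only on a[i] and b[i],
 * never on out[j] for some other j. That independence is exactly what
 * parallel hardware -- or a sufficiently clever compiler -- can exploit. */
void scale_add(const int *a, const int *b, int *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = 2 * a[i] + b[i];
}
```

Spotting this independence automatically, across whole programs rather than toy loops, is the hard part, which is the thread the next paragraphs pick up.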
The challenge of predicting parallelism is what has made Intel's Itanium odyssey so perilous. Itanium's Explicitly Parallel Instruction Computing (EPIC) design depends on devising compilers that identify instruction dependencies in advance. EPIC tries to steer between Very Long Instruction Word computing, with cycle-by-cycle planning that's specific to particular hardware, and traditional compilation techniques that leave it up to hardware to achieve concurrency at run-time.
The EPIC approach makes for bulky code, averaging almost 43 bits per instruction compared with a mainstream CPU's 32-bit instructions, and that bulk drives up the cost of hardware resources such as main memory and cache. It's the reason why the latest (and probably last) single-core Itanium 2 processors, unveiled July 21, have as much as 9MB of on-chip cache fed by a 667MHz front-side bus moving data at speeds up to 10.6GB per second between the CPU and main memory. The dual-core Itanium 2 processors expected in volume next year will have a staggering 24MB of on-chip cache, and they'll need it.
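The "almost 43 bits" figure falls straight out of the IA-64 encoding, which packs three 41-bit instruction slots plus a 5-bit template field into each 128-bit bundle. A quick check of the arithmetic:

```c
/* IA-64 bundle: three 41-bit instructions + 5-bit template = 128 bits.
 * Effective cost per instruction is therefore 128/3 bits -- just under
 * 43, versus 32 bits for a conventional RISC encoding. */
double epic_bits_per_instruction(void)
{
    const int bundle_bits = 3 * 41 + 5;   /* = 128 */
    return bundle_bits / 3.0;             /* ~42.67 bits */
}
```

At roughly 42.7 bits per instruction, EPIC code is about a third bulkier than an equivalent 32-bit encoding, which is exactly the memory and cache pressure described above.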
Does the hardware burden of all that generality make you wonder about the alternative of identifying tasks that can be made parallel in an application, and building cheap hardware modules to match? If you've ever looked at that option, you've hit a barrier of programmer productivity. Even VHDL, whose name proclaims it a "very high-level" design language that represents in hundreds of lines what used to take hundreds of pages of schematics, still typically requires 10 to 100 times the number of lines needed to express the same function in C.
It's therefore interesting to consider the possible impact of something such as Mitrion-c, a C-like language for generating FPGA specifications that will appear in September in tools from Mitrionics. As with Java, which achieves portability by compiling to a virtual machine that's readily implemented on varying hardware, Mitrion-c compiles its code into an instance of a virtual processor that's then implemented in FPGA form.
"The language helps you find the parallelism in the program," said Mitrionics CEO Anders Dellson while showing me the company's visual tools as they simulated the hardware implementation and execution of an algorithm. His productivity target, moreover, is high: on the order of 1,000 lines of generated VHDL per line of Mitrion-c, with at least a tenfold reduction in development time.
The next time someone says, "That's the hard part of the problem," maybe you should take that description literally. Perhaps using softer hardware, rather than writing harder software, will be the cheapest way to get something done.
Technology Editor Peter Coffee can be reached at firstname.lastname@example.org.
Check out eWEEK.com for the latest news, views and analysis on servers, switches and networking protocols for the enterprise and small businesses.
Peter Coffee is Director of Platform Research at salesforce.com, where he serves as a liaison with the developer community to define the opportunity and clarify developers' technical requirements on the company's evolving Apex Platform. Peter previously spent 18 years with eWEEK (formerly PC Week), the national news magazine of enterprise technology practice, where he reviewed software development tools and methods and wrote regular columns on emerging technologies and professional community issues. Before he began writing full-time in 1989, Peter spent eleven years in technical and management positions at Exxon and The Aerospace Corporation, including management of the latter company's first desktop computing planning team and applied research in applications of artificial intelligence techniques. He holds an engineering degree from MIT and an MBA from Pepperdine University, and he has held teaching appointments in computer science, business analytics and information systems management at Pepperdine, UCLA, and Chapman College.