The Evolving Memory Landscape

 
 
By Loyd Case  |  Posted 2004-07-19
 
 
 



Robert Heinlein is famous for writing some of science fiction's classic novels. We remember him today, however, for his famous acronym: TANSTAAFL ("There Ain't No Such Thing As A Free Lunch"). Heinlein was by no means the first writer to express that sentiment, but he used the acronym often enough that it stuck. TANSTAAFL is really a restatement of the second law of thermodynamics: systems tend towards greater disorder unless you feed the system energy from an outside source.

In the case of PC performance, we can see that increasing amounts of energy need to be fed into the system to increase performance. Processors are getting faster -- and hotter. We now have graphics hardware that can draw 75W or more of power, and generate substantial heat in its own right. Memory is no exception. People are increasingly loading up their PCs with more and faster memory. The net result is added power draw and heat output.

Most new PC systems sold in the past year used DDR (double data rate) memory. DDR memory pumps out two data samples per clock cycle. You'll typically see DDR memory rated at the effective clock rate of the memory, as if it were sending one data sample per clock cycle. For example, DDR400 memory really runs at 200MHz. You'll also see this listed as "400MHz effective."

An alternative naming scheme uses the peak bandwidth of the module. So DDR400, which is capable of shipping out 3.2 gigabytes per second of data, is also called PC3200 (for 3200MB per second). That seems pretty fast, but by today's processor standards, it's not. A 3.4GHz Pentium 4 or a 2.4GHz Athlon 64 FX-53 still spends a lot of time waiting for memory. That's one reason cache sizes inside the CPU have been increasing: to reduce the time spent waiting for data to be retrieved from or written to memory.
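The two naming schemes are related by simple arithmetic. The short Python sketch below shows the conversion; the function name `ddr_ratings` is made up for this illustration, and the 64-bit module data width is an assumption not stated above (it is the standard width of a desktop DIMM).

```python
def ddr_ratings(clock_mhz, bus_width_bits=64):
    """Return (effective rate, peak bandwidth in MB/s) for a DDR module.

    clock_mhz:      the real memory clock (200 for DDR400)
    bus_width_bits: module data width; 64 bits is assumed here
    """
    effective_rate = clock_mhz * 2                     # two data samples per clock cycle
    bandwidth_mb_s = effective_rate * bus_width_bits // 8
    return effective_rate, bandwidth_mb_s

rate, bandwidth = ddr_ratings(200)
print(f"DDR{rate} = PC{bandwidth}")   # DDR400 = PC3200
```

The same arithmetic applied to a 166MHz clock gives the nominal DDR333 and PC2700 labels once the results are rounded to marketing-friendly numbers.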




DDR has other issues that affect its stability and performance. Memory termination, for example, isn't built into DDR memory, so it's actually handled by a set of resistor packages built onto the motherboard itself. Termination is needed to minimize signal reflections, which degrade stability. This adds some cost, and also increases the risk of instability as clock rates go higher, since the termination resistors are far away (relatively speaking) from the DRAM chips. Another escalating problem is heat and power draw as memory clock rates go up.

So a new type of DDR memory has arrived on the scene.

DDR2: Power and Latencies

The latest memory for mainstream personal computers is DDR2. DDR2 solves some of the problems inherent with original DDR (now known as DDR1). For example, DDR2 has on-die termination, which improves signal integrity at higher clock rates.

DDR2 also solves the power and heat problem in a clever way. The actual memory core clock is half the clock rate "seen" by the system. DDR1, by contrast, clocks the core at the same speed as the external I/O clock. For example, DDR2/533 clocks at 266MHz -- but the internal core clock is 133MHz. The I/O buffer clock is 266MHz, and that's the clock rate that the system understands. To get around this seeming contradiction, DDR2 batches up four bits per core clock cycle.

Since the I/O buffers run twice as fast as the core, the memory really only hands off two bits per I/O clock cycle. So internally, the core is presenting data to the I/O buffers at quad data rate, but externally, the system sees two data items per clock cycle. In other words, DDR2 prefetches four data items per core clock cycle, while DDR1 only prefetches two.
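The clock relationships are easier to see as arithmetic. The following Python sketch is illustrative only: the function name `ddr_clocks` and its parameters are made up for this example, and it works from the nominal speed-grade numbers in the module names.

```python
def ddr_clocks(data_rate, prefetch):
    """Return (core_clock_mhz, io_clock_mhz) for a nominal per-pin data rate.

    data_rate: the number in the module name (400 for DDR400, 533 for DDR2/533)
    prefetch:  data items fetched per core clock (2 for DDR1, 4 for DDR2)
    """
    io_clock = data_rate / 2            # double data rate: two transfers per I/O clock
    core_clock = data_rate / prefetch   # the core supplies `prefetch` items per core clock
    return core_clock, io_clock

print(ddr_clocks(400, 2))   # DDR400:   (200.0, 200.0)
print(ddr_clocks(400, 4))   # DDR2/400: (100.0, 200.0)
print(ddr_clocks(533, 4))   # DDR2/533: core runs at half the I/O clock
```

At the same external data rate, the DDR2 core runs at half the speed of the DDR1 core, which is where part of the power saving comes from.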



Latencies are also different. DDR1 CAS latencies could be as low as 2 clock cycles, though typical modules in OEM systems are 2.5 or 3 clocks. The DDR write latency is one clock, but as the external frequency goes up, that's too little time. So DDR2 adopts a simple rule, where write latency is always CAS latency - 1. So if CAS is set to 4, which is typical for current DDR2 modules, write latency is 3 cycles.

From the perspective of the system, the actual delay hasn't changed much. At DDR400's 200MHz clock, CAS2 works out to 10ns and CAS3 to 15ns, while CAS4 for DDR2/533 is also about 15ns. Overall bandwidth still goes up, because that relatively long latency applies only to the first read of a memory row. After that, data streams out to the system at the higher clock rate. Of course, if the system is running DDR2/400, you may see little or no gain in performance.
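Cycle counts convert to wall-clock time by dividing by the I/O clock. Here is a minimal Python sketch of that conversion, together with the DDR2 write-latency rule; the function names are hypothetical, and 266.67MHz is used as the nominal DDR2/533 clock.

```python
def cas_latency_ns(cas_cycles, io_clock_mhz):
    """Convert a CAS latency in clock cycles to nanoseconds."""
    return cas_cycles * 1000 / io_clock_mhz

def ddr2_write_latency(cas_cycles):
    """DDR2 rule: write latency is always CAS latency minus one."""
    return cas_cycles - 1

print(cas_latency_ns(2, 200))                 # DDR400, CAS2: 10.0
print(cas_latency_ns(3, 200))                 # DDR400, CAS3: 15.0
print(round(cas_latency_ns(4, 266.67), 1))    # DDR2/533, CAS4: 15.0
print(ddr2_write_latency(4))                  # 3
```

The higher cycle count on DDR2 is offset by the shorter cycle, so the first-access delay in nanoseconds stays in the same range.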

Power issues are addressed by lowering the voltage from 2.5V to 1.8V. As memory capacities increase, the power required by the added memory also goes up. Dr. Michael Schuette of Lost Circuits estimates that 4GB of DDR1 memory consumes 35-40W of power. The reduced voltage means power requirements also go down, to about 25-30W for a 4GB system. The lower voltage also helps enable higher clock frequencies.

Frontside Bus, Memory Controllers


The current LGA775 Pentium 4 processors run their frontside bus at 200MHz (800MHz effective). But DDR2/533 runs its memory clock at 266MHz. Because the two clocks don't line up, memory efficiency and throughput suffer. In an ideal world, you'd like to have the frontside bus clock be synchronous with the memory clock. It's likely that Intel will increase the frontside bus speed of the Pentium 4 line later this year.

AMD, on the other hand, is faced with a slightly different issue. The Athlon 64's memory controller is built onto the CPU die itself. The positive side of this is that memory controller latencies are radically reduced, since it's running at the same speed as the processor. But it also means that AMD will need to re-spin the Athlon 64 to support DDR2. Whether they'll do that for the current CPU generation or wait until they make the move to 90nm is an open question. Given the excellent performance they've been getting from vanilla DDR400, they're certainly not hurting at present.
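To put rough numbers on the Pentium 4 mismatch described above, here is a small Python sketch. Several figures are assumptions not stated in the text: the frontside bus and each memory channel are taken to be 64 bits (8 bytes) wide, and the dual-channel case is included only for comparison. The variable names are made up for this example.

```python
from fractions import Fraction

fsb_clock_mhz = 200
mem_clock_mhz = Fraction(800, 3)     # DDR2/533's nominal clock, roughly 266MHz

# The clocks stand in a 3:4 ratio, so they are not synchronous.
clock_ratio = Fraction(fsb_clock_mhz) / mem_clock_mhz
print(clock_ratio)                   # 3/4

BYTES_PER_TRANSFER = 8               # assumed 64-bit data paths
fsb_mb_s = 800 * BYTES_PER_TRANSFER            # 800MHz effective frontside bus
single_channel_mb_s = 533 * BYTES_PER_TRANSFER # one channel of DDR2/533
dual_channel_mb_s = 2 * single_channel_mb_s    # assumed dual-channel board

print(fsb_mb_s, single_channel_mb_s, dual_channel_mb_s)   # 6400 4264 8528
```

Under these assumptions, two channels of DDR2/533 can deliver more data than the 800MHz frontside bus can carry, which is one way to see why a faster bus would help.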

Other Memories

While DDR and DDR2 get the lion's share of system sales, other memory types are vying for some OEM attention. These include QBM (quad-band memory) and Rambus' new XDR memory.

QBM support was announced by VIA with some fanfare last year. Unlike DDR and DDR2, but similar to Rambus, QBM is the product of one company, Kentron. Kentron doesn't build the memory, but rather takes off-the-shelf DDR and uses switching technology to double the throughput. Think of it conceptually as a kind of dual-channel memory on a single module. Kentron's idea is to take lower speed, lower cost memory and effectively double the performance while keeping the clock rate low.

XDR is Rambus' latest attempt to woo PC and memory manufacturers to the company's proprietary technology. Rambus' past effort in this arena was RDRAM, which gained some sway due to Intel's original acceptance of RDRAM in the Intel 820, 850, and 860 chipsets.

XDR DRAM is really a forward-looking technology. Rambus estimates that DDR2/667 may end up being DDR2's maximum speed, though some other observers suggest the technology can progress as high as DDR2/800. Whatever the case, Rambus suggests that XDR DRAM could be the next step up in memory performance, sometime in late 2005 or early 2006. Of course, given Rambus' litigious history, it remains to be seen who might actually work with them. XDR is based on Rambus' Yellowstone signaling technology, which is clearly Rambus-owned IP. It certainly bears watching, but it will be at least 18 months before the technology becomes a factor. Much can change in the technology world in that time.

What to Buy


The scenarios are pretty simple: either you're buying a new system, or you're not. If you have an existing system based on DDR1, upgrading your memory means buying DDR1 modules. However, if you have an older system you're planning to upgrade with a new processor, it may be worth holding off and seeing what your memory needs will be when you perform that system upgrade.

If you're in the market for a new system, then the choice is a fork in the road: AMD or Intel? If you're going the Intel route, it's really worth getting a Socket T board that supports the new 900 series chipsets and DDR2 memory. As we noted in our 925X preview, the 900 series offers other useful new features, like four serial ATA ports supporting native command queuing. If you do go that route, spend the extra few dollars and get DDR2/533. Some companies are even starting to offer DDR2/667 already, though that's currently just for the overclocking set. It's unclear whether modules with the DDR2/667 label today will actually work in DDR2/667 systems when they arrive on the scene. The price premium is pretty serious, too, so for now, DDR2/533 is the performance sweet spot.

If you want to upgrade to an Athlon 64 today, your choice is simple: DDR400. Given the natural efficiency of the Athlon 64's integrated memory controller, going with good quality, low-latency DDR400 does boost performance a bit. But weigh your decision carefully. If you can wait until autumn, you may be rewarded with new core logic and even DDR2 support for the Athlon 64.

Rocket Fuel