Cover Story (sidebar) / July 1993

New Memory Architectures
to Boost Performance

Tom R. Halfhill

One of the system bottlenecks exposed by high-speed processors like the Pentium is the interface to main memory. This interface is the most crucial pathway in the entire computer, because it's responsible for carrying a constant flow of program instructions and data between memory chips and the CPU. If memory or the pathway fails to keep pace with the CPU's insistent requests, the CPU stalls in a wait state and valuable processing time is lost.

Today's DRAM chips — variously known as asynchronous, page-mode, or generic DRAMs — are constrained by both their internal architecture and their interface to the CPU's memory bus. DRAM architecture hasn't changed significantly since 1974; neither has the memory interface in desktop PCs, except that memory buses have grown wider — from 8 bits on the 8088 to 64 bits on the Pentium. Although wider buses have increased the available raw bandwidth, throughput still lags behind the spiraling demands of faster microprocessors.

More and more PCs now bridge the gap by using high-speed SRAM (static RAM) chips to cache traffic between the CPU and DRAMs. A typical 486 or Pentium system might have 256 KB of SRAM cache. But SRAM is much costlier than DRAM, and boosting cache beyond 256 KB yields a diminishing rate of return.
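The diminishing return can be sketched with a back-of-the-envelope effective-access-time calculation. The latencies and hit rates below are illustrative assumptions, not measured figures; the point is only that once the hit rate is already high, each further increment of cache buys less.

```python
# Effective memory access time for a CPU / SRAM-cache / DRAM hierarchy.
# Latencies and hit rates are illustrative assumptions, not vendor figures.

SRAM_NS = 15.0   # fast cache SRAM access time
DRAM_NS = 70.0   # generic page-mode DRAM access time

def effective_access_ns(hit_rate):
    """Average access time: hits are served by SRAM; misses pay the
    SRAM lookup and then fall through to DRAM."""
    return hit_rate * SRAM_NS + (1.0 - hit_rate) * (SRAM_NS + DRAM_NS)

# Growing the cache raises the hit rate, but each increment helps less:
for hit_rate in (0.80, 0.90, 0.95, 0.98):
    print(f"hit rate {hit_rate:.0%}: {effective_access_ns(hit_rate):.1f} ns average")
```

Going from an 80 percent to a 90 percent hit rate saves far more time per access than going from 95 to 98 percent, which is why piling on SRAM beyond a certain point stops paying for itself.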

To get around these limitations, several new technologies have been developed. Most of these technologies require new types of DRAMs, but two of them attack the memory interface problem. It's not clear at this point which will become the new DRAM standard. All of them cost about 15 percent more than generic DRAM, but even so, they can reduce the overall cost of a system by eliminating the SRAM cache and associated controller chips. This can also result in a smaller motherboard that consumes less power. These are important considerations for portable systems.

Enhanced DRAM. EDRAM, the brainchild of Ramtron (Colorado Springs, CO), is the only new DRAM that is now shipping in volume. It takes an evolutionary approach by integrating a small SRAM cache with a fast core of otherwise generic DRAM. Each EDRAM chip has 2 Kb of 15-nanosecond SRAM and 4 Mb of 35-ns DRAM.

Ramtron's benchmarks show an EDRAM-equipped machine outperforming a comparable system with generic DRAM and an SRAM cache, unless the application program fits completely inside the cache. In that case, EDRAM delivers about the same performance as the other system. Ramtron says it already has 44 EDRAM customers and that EDRAMs are going into everything from desktop PCs and workstations to laser printers and copiers. However, none of the first Pentium systems use EDRAM.

Rivals criticize EDRAM for being single-sourced. Ramtron says it is seeking second sources. Without the price competition and redundant supply fostered by multiple sources, some vendors are reluctant to adopt EDRAM.

Cache DRAM. CDRAM, invented by Mitsubishi, is similar to EDRAM. It integrates an SRAM cache with either 4 Mb or 16 Mb of DRAM. Although CDRAM's on-board cache is larger (16 Kb versus 2 Kb), the DRAM is slower (70 ns versus 35 ns). But CDRAM's on-board SRAM can be used as either a cache or a buffer, depending on whether the application requires serial or random access to the data.

When retrieving data serially — for example, to refresh a bit-mapped screen — CDRAM can prefetch the data from its DRAM core into the SRAM buffer and thus improve performance. In fact, Mitsubishi claims that CDRAM, which is single-ported, is faster for such applications than dual-ported VRAM (video RAM) is. The company says a CDRAM-based PC will run as fast as a comparable machine with DRAM and a 256-KB secondary SRAM cache.

Mitsubishi is the sole source for CDRAM, which is now being ramped up to volume production. However, Mitsubishi says chips will also be available from NEC and perhaps another company.

Synchronous DRAM. SDRAM is another evolutionary alternative, and it is attracting the widest support among semiconductor manufacturers. SDRAM chips are coming later this year or in 1994 from Mitsubishi, NEC, Samsung, Texas Instruments, and nearly every other major DRAM player. To ensure that SDRAM chips are interchangeable, a standard is being developed by JEDEC (the Joint Electron Device Engineering Council).

Unlike today's asynchronous DRAMs, SDRAMs exchange data with the CPU in sync to an external clock signal and are designed to run at the full speed of the CPU/memory bus without imposing wait states. For instance, TI's 16-Mb SDRAM, which the company will be sampling late this year, is rated for speeds of up to 100 MHz. That's fast enough for the 66-MHz Pentium, with enough headroom to accommodate even faster processors.
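The headroom claim follows from simple clock-times-width arithmetic, assuming one transfer per clock on a 64-bit bus. These are peak figures; sustained rates are always lower.

```python
# Peak transfer rates implied by bus clock and width.
# Assumes one transfer per clock cycle -- a simplification for illustration.

def peak_mbps(clock_mhz, bus_bits):
    """Peak bandwidth in MB/s: clock rate times bus width in bytes."""
    return clock_mhz * (bus_bits / 8)

pentium_bus = peak_mbps(66, 64)    # 66-MHz, 64-bit Pentium memory bus
sdram_100   = peak_mbps(100, 64)   # same-width bus driven at SDRAM's rated 100 MHz

print(f"66-MHz Pentium bus peak: {pentium_bus:.0f} MB/s")
print(f"100-MHz SDRAM ceiling:   {sdram_100:.0f} MB/s")
```

At 100 MHz, the memory can source data faster than a 66-MHz Pentium bus can consume it, which is the headroom TI is counting on for faster processors.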

SDRAM performs best when transferring data serially. TI says it's ideal for applications like word processing, spreadsheets, and multimedia. But for programs that depend heavily on random access (e.g., databases), a cache-type memory like CDRAM or EDRAM will probably outperform SDRAM.

Rambus DRAM. RDRAM, developed by Rambus (Mountain View, CA), takes a more revolutionary approach to the memory-bandwidth problem. In addition to introducing a new type of memory chip, Rambus has reinvented the interface to the CPU.

RDRAM chips are vertically packaged, with all pins on one side. They exchange data with the CPU over 28 wires that are no more than 12 centimeters long. The bus can address up to 320 RDRAM chips and is rated at 500 MBps, although 400 to 450 MBps is more realistic. That compares to about 33 MBps for asynchronous DRAM.
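One way to arrive at figures of this magnitude is shown below. The channel width and DRAM cycle time are assumptions for illustration, not Rambus's published parameters.

```python
# Rough arithmetic behind bandwidth figures of this magnitude.
# Channel width and DRAM cycle time are illustrative assumptions.

# A byte-wide channel transferring one byte every 2 ns (500 MHz effective rate):
rdram_mbps = (1 / 2e-9) * 1 / 1e6          # bytes per second -> MB/s

# Generic asynchronous DRAM: 4 bytes per ~120-ns random-access cycle:
dram_mbps = (1 / 120e-9) * 4 / 1e6

print(f"Narrow, fast channel peak: {rdram_mbps:.0f} MB/s")   # ~500
print(f"Async DRAM:                {dram_mbps:.0f} MB/s")    # ~33
```

The lesson is that a narrow bus clocked very fast can outrun a wide bus clocked slowly, which is the bet Rambus is making with its short 28-wire channel.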

RDRAM chips have no on-board SRAM, but each chip holds its most recently accessed page in its sense amplifiers, which serve as a de facto page cache. The controller employs a new type of I/O cell, and the bus requires no extra glue logic. The chips can be manufactured in the same plants that make generic DRAMs.

Rambus is now sampling 4-Mb RDRAMs to major system vendors and is planning for volume production this fall. A 16-Mb RDRAM is also in the works. Rambus has licensed its technology to Fujitsu, Hitachi, NEC, and Toshiba.

RamLink. This technology is the most revolutionary of all, but it is also the furthest from market. It concentrates on the CPU/memory interface rather than the internal architecture of the memory chips. RamLink is being developed within the IEEE, with many firms involved, including Apple, Hewlett-Packard, TI, and all the major DRAM makers.

The technology is an offshoot of a recently adopted IEEE standard known as SCI (Scalable Coherent Interface), which defines a system architecture that encompasses anywhere from one to 64,000 microprocessors. RamLink is a memory interface with point-to-point connections arranged in a ring, like a network. Traffic on the ring is managed by a memory controller that sends messages to the DRAM chips, which act as nodes. Other nodes can be ROM chips, flash ROMs, drives, or even additional RamLink rings.
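The ring arrangement can be sketched in a few lines. The node names and packet handling below are a hypothetical simplification for illustration; a real RamLink controller would pipeline many outstanding requests rather than deliver one packet at a time.

```python
# Sketch of ring-style addressing: a controller injects a request packet
# onto a unidirectional ring, and each node forwards packets not addressed
# to it. Node names and behavior are hypothetical, for illustration only.

def ring_deliver(nodes, target, payload):
    """Pass a packet node-by-node around the ring until the target consumes it.
    Returns the hop count and a record of what the target did."""
    hops = 0
    for node in nodes:
        hops += 1
        if node == target:
            return hops, f"{node} handled {payload}"
    raise ValueError("target not on ring")

# A ring mixing DRAM nodes with a flash ROM node:
ring = ["DRAM0", "DRAM1", "flashROM", "DRAM2"]
hops, result = ring_deliver(ring, "flashROM", "read block 7")
print(hops, result)
```

Because every link is a short point-to-point connection rather than a long multidrop bus, each hop can be clocked far faster, which is where the speed projections below come from.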

Hans Wiggers of HP Labs, who is chairman of the IEEE RamLink committee, says that RamLink could run as fast as 500 MHz or even 1 GHz. But RamLink is still years from reality. "Everybody says, 'Oh, this is very interesting,' but nobody has committed any designs to it yet," Wiggers says.

Even if you disregard RamLink and focus on the near-term contenders, it's unclear whether EDRAM, CDRAM, SDRAM, or RDRAM will become the new memory standard. "I wouldn't even touch predicting which of these will be the long-term winner," says Sherry Garber, an analyst at In-Stat (Scottsdale, AZ). "These things just haven't been out long enough. It takes a lot of momentum to replace a known product."

System makers are reluctant to adopt any of these alternatives until an obvious leader emerges. Nobody wants to build a computer with RAM that's not in wide production and that users can't readily find when they want to expand memory. "It's really hard for a clone maker to go out on a limb," notes Steven Przybylski, a consultant in San Jose, California, who specializes in system architectures.

Przybylski says that although the new DRAMs all command about the same 15 percent premium over existing DRAMs, prices could swing radically as production ramps up. "Volume is everything," he says. "What makes them so expensive is that no one is buying them."

The new DRAMs may filter slowly into the market by filling niches. They offer clear advantages for certain embedded applications and may replace VRAM on video cards. As volumes rise and prices fall, they could move gradually into main memory.

Another possibility is that an evolutionary design such as SDRAM will fill near-term needs for faster memory in Pentium-class systems. Later in the decade, as processor speeds approach 1 GHz, a revolutionary approach such as RDRAM or RamLink might rescue users from yet another memory bottleneck.

It's also possible that no single solution will prevail. In-Stat's Garber notes that the worldwide DRAM market was worth $8.5 billion last year. "That's big enough to support more than one DRAM architecture," she says.

Illustration: Comparing the relative speeds of microprocessors and memory chips necessarily requires some apples-and-oranges compromises, but no matter how these curves are plotted, the results are similar: CPU performance is rapidly outstripping DRAM performance. While the CPU curve climbs steeply, the DRAM curve remains almost flat.

Tom R. Halfhill is a BYTE senior news editor. You can reach him on BIX as "thalfhill."

Copyright 1994-1997 BYTE
