Phase change memory

 

The active material in PCM is a chalcogenide alloy, typically including at least one of the materials Ge, Sb and Te. The active material is placed between two electrically conducting electrodes, and resistance switching is induced by current flowing through the active material, which changes the structure of the material through Joule heating. Phase-change materials exhibit two metastable states: a (poly)crystalline phase of high electrical conductivity and an amorphous phase of low electrical conductivity. Switching to the amorphous phase (the RESET transition) is typically achieved in less than 50 ns but requires a relatively high current, whereas the transition to the crystalline phase (SET) is slower, on the order of 100 ns.

PCM scores well on most of the desirable attributes of a universal memory technology. In particular, it exhibits very good endurance, typically exceeding 100 million cycles, excellent retention, and superb scalability to sub-20-nm nodes and beyond. However, a number of technological challenges need to be addressed for PCM to become a universal memory. Apart from the necessary RESET current reduction and SET speed improvement mentioned above, a significant challenge of PCM technology is a phenomenon known as (short-term) resistance drift: the resistance of a cell is observed to drift upwards over time, with the amorphous state drifting more than its crystalline counterpart. Drift seriously affects the reliability of multi-level cell (MLC) storage in PCM, because MLC storage packs several resistance levels tightly together and therefore leaves only a small sensing margin between adjacent levels. Effective solutions to the drift issue are thus a key factor in the cost competitiveness of PCM technology [2010-2].
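To make the effect of drift on the sensing margin concrete, the following minimal sketch uses the power-law form R(t) = R0 (t / t0)^ν, a commonly used empirical description of resistance drift that is not stated in the text above. The function names, the drift exponent, the programmed resistance and the read threshold are all illustrative assumptions, not measured values from our cells.

```python
import math


def drifted_resistance(r0: float, t: float, nu: float, t0: float = 1.0) -> float:
    """Resistance at time t under the assumed power law R(t) = r0 * (t / t0) ** nu.

    r0 is the resistance measured at reference time t0 (same time unit as t);
    nu is the drift exponent (larger for more amorphous states).
    """
    if t <= 0 or t0 <= 0:
        raise ValueError("t and t0 must be positive")
    return r0 * (t / t0) ** nu


def time_to_cross(r0: float, threshold: float, nu: float, t0: float = 1.0) -> float:
    """Time at which a drifting level reaches a fixed read threshold.

    Obtained by solving r0 * (t / t0) ** nu = threshold for t.
    """
    if nu <= 0 or threshold <= r0:
        raise ValueError("need nu > 0 and threshold above the programmed level")
    return t0 * (threshold / r0) ** (1.0 / nu)


if __name__ == "__main__":
    # Hypothetical intermediate MLC level: 100 kOhm at t0 = 1 s, exponent 0.05,
    # with a fixed threshold to the next level at 150 kOhm.
    r0, threshold, nu = 100e3, 150e3, 0.05
    t_cross = time_to_cross(r0, threshold, nu)
    print(f"level misread after about {t_cross:.0f} s")
    print(f"resistance at that time: {drifted_resistance(r0, t_cross, nu):.0f} Ohm")
```

Under these assumptions, a level read against a fixed threshold is eventually misread as its neighbour. The narrower the spacing between levels, the sooner this happens, which is why drift matters far more for MLC storage than for single-bit cells.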

At IBM Research in Zurich we are working on various aspects of PCM technology, including PCM materials and memory cells with a focus on enabling MLC storage [2011-5]. In particular, we conduct fundamental research on phase change materials to understand their properties and to guide the design of new materials with improved characteristics. We also apply finite element model simulations to study the impact of electrical transport and other material characteristics on memory cells.

Furthermore, we engage in experimental characterization of PCM cells in various configurations, from single cells to large (multi-Mbit) cell arrays. Advanced characterization processes provide an abundance of data that serves as input for statistical modeling and for the definition of effective algorithms that enhance memory reliability.

We are conducting research into advanced signal processing and coding schemes that improve reliability and thereby enable higher storage capacity, longer data retention and higher endurance. Moreover, we are designing and implementing novel circuitry for PCM chips in order to program and read out the memory cell information reliably, with low latency and with a small implementation area.

At the device level, we have developed a PCM-based storage subsystem, which is connected to the host over the PCI-e bus. In our research prototype, the PCM chips are connected to custom-designed PCM channel controllers and attached to a mezzanine card, which is attached in turn to an FPGA board. Our custom PCM controller design employs a 2D channel configuration that allows the designer to trade off read performance for write performance, and vice versa, depending on the needs of their workloads.

At a system level, we have achieved a steady-state average latency of 35 μsec for random 4 kB reads and 61 μsec for random 4 kB writes. Most importantly, the latency is predictable and consistent: over several hours of sustained random writes, 99.9% of the requests were completed within 240 μsec and the highest latency observed was 2 msec. By contrast, for an MLC-based enterprise-class Flash PCI-e card that we put to the same test, the 99.9th percentile latency was 3 msec (i.e., 12× higher) and the highest observed latency was 14 msec (i.e., 7× higher). Moreover, a TLC-based Flash SSD we tested showed a 99.9th percentile latency of 66 msec (i.e., 275× higher) and a highest observed latency of 122 msec (i.e., 61× higher).
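The ratios quoted above follow directly from the reported latencies. The short sketch below reproduces the arithmetic; the dictionary layout and the function name are illustrative choices, and only the numbers themselves come from the measurements reported above.

```python
# Reported write latencies in microseconds (99.9th percentile and maximum).
PCM = {"p999_us": 240, "max_us": 2_000}
FLASH = {
    "MLC Flash PCI-e card": {"p999_us": 3_000, "max_us": 14_000},
    "TLC Flash SSD": {"p999_us": 66_000, "max_us": 122_000},
}


def latency_ratios(device: dict, baseline: dict) -> dict:
    """Return how many times higher each latency metric is than the baseline."""
    return {metric: device[metric] / baseline[metric] for metric in baseline}


if __name__ == "__main__":
    for name, figures in FLASH.items():
        ratios = latency_ratios(figures, PCM)
        print(f"{name}: 99.9th percentile {ratios['p999_us']:g}x, "
              f"maximum {ratios['max_us']:g}x the PCM prototype")
```

The MLC card's 99.9th percentile ratio works out to 12.5, which the text rounds down to 12×. The other three ratios (7, 275 and 61) are exact.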

As part of this research activity, we are exploring various ways of integrating PCM at a system level and at a cluster level, including use cases where PCM is deployed in the memory subsystem as well as in the storage subsystem. The eventual goal of this project is to integrate PCM at a cluster and datacenter level using low-latency networking and appropriate support from system software, thereby enabling new use cases for data-intensive applications.