Newsgroups: comp.sys.mac.hardware
Subject: Interleaved Memory on Centris 650 & Quadra 800 (long)
From: Dale_Adams@gateway.qm.apple.com (Dale Adams)
Date: 9 Feb 93 23:09:58 GMT
Organization: Apple Computer, Inc., Cupertino, CA

The new Macintosh Centris 650 and Quadra 800 computers introduced today
feature a newly designed memory controller which supports interleaved
memory. The following article explains how interleaved memory works on
these machines and how to configure the machines for maximum
performance.


Interleaved Memory on the Centris 650 and Quadra 800
----------------------------------------------------

The main memory subsystem of the Macintosh Centris 650 and Quadra 800
computers makes use of a memory access technique called "interleaved
memory". This memory organization serves to reduce the overall access
time of the 68040 processor into DRAM. The following description
illustrates how this memory organization works and why it results in
reduced memory access time.


Non-interleaved Memory System

In a non-interleaved memory system, all of the first bank of memory,
bank 0, is addressed before the first long word of the second bank of
memory, bank 1; all of bank 1 is addressed before the first long word
of bank 2; and so on. Figure 1 shows this organization for two banks of
N long words. (A long word is 4 bytes, or 32 bits, and is the natural
unit of memory for the 68040.)

      Bank 0                    Bank 1
 -----------------         -----------------
 |       0       |         |       N       |
 -----------------         -----------------
 |       1       |         |      N+1      |
 -----------------         -----------------
 |       2       |         |      N+2      |
 -----------------         -----------------
 ~               ~         ~               ~
 -----------------         -----------------
 |      N-2      |         |     2N-2      |
 -----------------         -----------------
 |      N-1      |         |     2N-1      |
 -----------------         -----------------
         ^                         ^
         |                         |
         ---------------------------
                      |
                      v
              -----------------
              |    Buffer     |
              -----------------
                      ^
                      |
                      v
               System Data Bus
 -----------------------------------------------------

        Figure 1.  Non-interleaved Memory Organization

The 68040 performs burst accesses (a single bus transaction that reads
or writes 16 bytes in 4 adjacent long words) to move data between its
caches and memory. All 16 bytes come from one bank of DRAM in a
non-interleaved memory system, so the time required to complete the
transfer depends directly on the access time of the DRAM. Figure 2
shows an example of such a burst access. The time needed to access the
2nd, 3rd, and 4th long words is shorter because a feature of the DRAMs
called "page-mode access" is used.

                  __    __    __    __    __    __    __    __    __    __
Clock          __|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|
               ______________________________________________________
DRAM Accesses  | 1st long word        | 2nd lwd | 3rd lwd | 4th lwd |
               ------------------------------------------------------

        Figure 2.  Non-interleaved Burst Access Timing


Interleaved Memory System

In an interleaved memory system, there are still two physical banks of
DRAM, but logically the system sees one bank of memory that is twice as
large. In the interleaved bank, the first long word of bank 0 is
followed by the first long word of bank 1, which is followed by the
second long word of bank 0, which is followed by the second long word
of bank 1, and so on. Figure 3 shows this organization for two physical
banks of N long words. All even long words of the logical bank are
located in physical bank 0 and all odd long words are located in
physical bank 1.

      Bank 0                    Bank 1
 -----------------         -----------------
 |       0       |         |       1       |
 -----------------         -----------------
 |       2       |         |       3       |
 -----------------         -----------------
 |       4       |         |       5       |
 -----------------         -----------------
 ~               ~         ~               ~
 -----------------         -----------------
 |     2N-4      |         |     2N-3      |
 -----------------         -----------------
 |     2N-2      |         |     2N-1      |
 -----------------         -----------------
         ^                         ^
         |                         |
         v                         v
 -----------------         -----------------
 |    Buffer     |         |    Buffer     |
 -----------------         -----------------
         ^                         ^
         |                         |
         v     System Data Bus     v
 -----------------------------------------------------

        Figure 3.  Interleaved Memory Organization

The interleaved memory configuration is designed to speed up 68040
burst accesses by as much as 30%. (The actual improvement depends on
the system clock speed and the DRAM access time.) Since the four long
words of a burst access are spread across two physical banks of DRAM,
the individual accesses can be overlapped to hide part, or all, of the
DRAM access time delay, as shown below in Figure 4.

                  __    __    __    __    __    __    __    __    __    __
Clock          __|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|
               _______________________________
               | 1st long word     | 3rd lwd |
               -------------------------------
DRAM Accesses
                     _______________________________
                     | 2nd long word     | 4th lwd |
                     -------------------------------

        Figure 4.  Interleaved Burst Access Timing


Centris 650 / Quadra 800 Memory Organization

Physically, the DRAM in a Centris 650 or Quadra 800 system is organized
as 10 banks of memory, where each bank is 32 bits wide and 4 or 16
MBytes deep. Logically, the DRAM is organized as 5 pairs of banks, any
of which may or may not be interleaved. At system boot time, each pair
of DRAM banks is examined; if they are the same size (4 or 16 MBytes)
the interleaved memory configuration for that bank pair will be
enabled. Otherwise, the bank pair will be left in the non-interleaved
configuration.

The memory controller in the C650/Q800 is capable of operating with
some bank pairs in the interleaved configuration and some bank pairs in
the non-interleaved configuration. The type of memory access which is
performed is determined dynamically at the start of each cycle based on
the value of an "interleave configuration register" within the memory
controller. ROM accesses cannot be interleaved since there is only a
single bank of ROM.

The C650/Q800 motherboard contains 4 or 8 MB of DRAM and 4 DRAM SIMM
sockets. Systems which contain 8 MB on the motherboard (all Q800s and
some C650s) already interleave the two 4 MB banks soldered to the
motherboard.
Systems which have only 4 MB soldered on the motherboard cannot
interleave the single soldered 4 MB DRAM bank, although DRAM on SIMMs
can still be interleaved.

Each DRAM SIMM can contain either one or two banks of DRAM. The C650
and Q800 use 72-pin DRAM SIMMs - these SIMMs have a 32-bit data path,
allowing memory upgrades to be performed with a single SIMM.
Single-sided SIMMs contain one DRAM bank; double-sided SIMMs contain
two DRAM banks. A double-sided SIMM cannot contain an interleaved bank
pair since there are not enough pins on the SIMM to accommodate the two
32-bit data buses required for interleaved memory. Interleaving can
only be done between DRAM SIMM pairs.

The motherboard contains banks 0 & 1, SIMM slot 1 contains banks 2 & 3
(remember, SIMMs can be double-sided and contain 2 banks of DRAM), slot
2 contains banks 4 & 5, slot 3 contains banks 6 & 7, and slot 4
contains banks 8 & 9. SIMM slot pairs 1-2 and 3-4 are interleaved
together whenever a bank pair is of the same size. For example, if 4 MB
SIMMs are placed in both SIMM slots 1 and 2, then that memory will be
interleaved (banks 2 & 4). If a double-sided 8 MB SIMM (i.e., a SIMM
with two 4 MB banks on it) is placed in slot 1, and a single-sided 4 MB
SIMM is placed in slot 2, then two of the banks will be interleaved
(banks 2 & 4) and one bank will not be interleaved (bank 3).

The gist of all this is that in order to maximally enable memory
interleaving, memory upgrades should be performed with a pair of SIMMs,
both of the same size. A single SIMM can be used for memory expansion,
but will result in a portion of memory being non-interleaved. The
system actually takes care of configuring everything automatically at
boot, regardless of what memory is installed. However, by physically
configuring DRAM in identically sized bank pairs, the fastest overall
memory access is achieved (i.e., the highest performance).
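
The pairing rule described above can be sketched in a few lines of
Python. This is only an illustrative model, not the boot ROM's actual
logic: the function and variable names are made up for the example, the
pairs (6, 8) and (7, 9) are assumed by analogy with the slot 1-2 pairs
named in the text, and the example assumes a machine with 8 MB soldered
on the motherboard.

```python
# Illustrative model only. The names below are hypothetical and do not
# come from Apple's ROM or memory controller documentation.

# Bank pairs that may interleave. (0, 1) and (2, 4) are stated in the
# article; (3, 5) follows from its example, and (6, 8) / (7, 9) are
# assumed by analogy for SIMM slots 3 and 4.
BANK_PAIRS = [(0, 1), (2, 4), (3, 5), (6, 8), (7, 9)]
VALID_BANK_SIZES_MB = (4, 16)


def configure_interleave(bank_sizes_mb):
    """Return {pair: True/False}, True where the pair can interleave.

    bank_sizes_mb maps a bank number (0-9) to its size in MB; empty
    banks are simply left out of the dictionary.
    """
    config = {}
    for a, b in BANK_PAIRS:
        size_a = bank_sizes_mb.get(a, 0)
        size_b = bank_sizes_mb.get(b, 0)
        # A pair interleaves only if both banks are present and equal.
        config[(a, b)] = size_a in VALID_BANK_SIZES_MB and size_a == size_b
    return config


def interleaved_mb(bank_sizes_mb):
    """Total MB of DRAM that ends up in interleaved bank pairs."""
    config = configure_interleave(bank_sizes_mb)
    return sum(
        bank_sizes_mb[a] + bank_sizes_mb[b]
        for (a, b), enabled in config.items()
        if enabled
    )


# Example from the text, assuming 8 MB on the motherboard (banks 0, 1):
# a double-sided 8 MB SIMM in slot 1 (banks 2 and 3) and a single-sided
# 4 MB SIMM in slot 2 (bank 4).
example = {0: 4, 1: 4, 2: 4, 3: 4, 4: 4}
config = configure_interleave(example)
for pair, enabled in config.items():
    print(pair, "interleaved" if enabled else "not interleaved")
print("interleaved MB:", interleaved_mb(example), "of", sum(example.values()))
```

In this model, bank 3 is the only installed bank left without a
same-sized partner, which matches the double-sided SIMM example in the
text; adding a second 4 MB bank in slot 2 (bank 5) would bring it into
an interleaved pair.
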

The actual performance delta between an interleaved and a
non-interleaved memory system depends on the application, and will vary
from one application to another.

- Dale Adams
  Apple Computer, Inc.