Piecewise Linear Models for Rsim

Russell Kao
Mark Horowitz
The Western Research Laboratory (WRL) is a computer systems research group that was founded by Digital Equipment Corporation in 1982. Our focus is computer science research relevant to the design and application of high performance scientific computers. We test our ideas by designing, building, and using real systems. The systems we build are research prototypes; they are not intended to become products.

There are two other research laboratories located in Palo Alto, the Network Systems Laboratory (NSL) and the Systems Research Center (SRC). Other Digital research groups are located in Paris (PRL) and in Cambridge, Massachusetts (CRL).

Our research is directed towards mainstream high-performance computer systems. Our prototypes are intended to foreshadow the future computing environments used by many Digital customers. The long-term goal of WRL is to aid and accelerate the development of high-performance uni- and multi-processors. The research projects within WRL will address various aspects of high-performance computing.

We believe that significant advances in computer systems do not come from any single technological advance. Technologies, both hardware and software, do not all advance at the same pace. System design is the art of composing systems which use each level of technology in an appropriate balance. A major advance in overall system performance will require reexamination of all aspects of the system.

We do work in the design, fabrication and packaging of hardware; language processing and scaling issues in system software design; and the exploration of new applications areas that are opening up with the advent of higher performance systems. Researchers at WRL cooperate closely and move freely among the various levels of system design. This allows us to explore a wide range of tradeoffs to meet system goals.

We publish the results of our work in a variety of journals, conferences, research reports, and technical notes. This document is a technical note. We use this form for rapid distribution of technical material. Usually this represents research in progress. Research reports are normally accounts of completed research and may include material from earlier technical notes.

Piecewise Linear Models for Rsim

Russell Kao
Digital Equipment Corporation Western Research Laboratory

Mark Horowitz
Stanford University

November, 1993

Abstract

Rsim is a switch-level simulator which can simulate large digital MOS integrated circuits with speedups of over 3 orders of magnitude over SPICE. Unfortunately, Rsim’s simple switched-resistor model renders it incapable of simulating certain CMOS and most BiCMOS and ECL digital circuits. We observe that the switched-resistor model is just one particular piecewise linear model and that Rsim’s simulation framework can accommodate more elaborate piecewise linear models. The resulting simulator, Mom, combines the efficiency of switch-level simulation with the ability to simulate a wider variety of circuits. We demonstrate Mom’s efficiency and flexibility on a variety of circuits.

This research was supported in part by DARPA contract N00039-91-C-1038.
1 Introduction

The high cost of semiconductor processing makes it desirable to verify the correctness of a large custom digital integrated circuit before it is fabricated. Although circuit simulators are usually used to analyze small pieces of the design, they can’t be used to simulate entire integrated circuits (having potentially millions of transistors) because their algorithms are inefficient and require execution times which grow superlinearly with circuit size. Nonetheless there is a need to verify at least the logical functionality of the entire design to confirm that no errors are made when the pieces are assembled.

To satisfy this need much work has gone into trying to accelerate circuit simulation[1, 4, 9], and speedups of two orders of magnitude over SPICE have been reported. In contrast, we attempt to increase the accuracy of a switch-level simulator, Rsim[10]. This approach is attractive because switch-level simulators achieve speedups of over 3 orders of magnitude for small circuits. For large circuits the speedups are virtually unbounded, being determined by the amount of latency in the circuit under simulation.

Limitations of the switch-level approach must be addressed. Although Rsim is useful for predicting the first order behavior of most digital MOS circuits, the simple switched-resistor model is inadequate for MOS circuits which are more “analog” in nature (for example RAM sense amplifiers) and for most BiCMOS and ECL circuits. We observe that Rsim’s switched-resistor model is just one particular piecewise linear model and that Rsim can be modified to allow other, more general piecewise linear models\(^1\). Multiple models of varying degrees of sophistication are provided, allowing the user to make different speed vs. accuracy tradeoffs for different parts of the integrated circuit. Comparisons with existing switch-level and circuit-level simulators reveal that our simulator, Mom[5], approaches the speed of switch-level simulators when the simplest transistor models are used. When more complex models are used, Mom is able to handle sophisticated CMOS, BiCMOS and ECL circuits, which are beyond the capabilities of existing switch-level simulators.

2 Rsim’s algorithm

Since Mom uses the same basic algorithm as Rsim we will briefly review it here. Rsim approximates the behavior of MOS transistors using the switched-resistor model (Figure 1). This consists of the series combination of a resistor and a voltage controlled switch. If the gate voltage, \( V_g \), of an NMOS (PMOS) transistor is at a logic level high (low) then the switch is closed and the transistor may be replaced by a resistor. Otherwise the switch is open and the transistor is an open circuit.
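The switched-resistor model can be written down in a few lines. The following Python sketch is illustrative only and is not taken from Rsim; the class name `SwitchedResistor`, the single logic threshold `v_switch`, and the example resistance are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SwitchedResistor:
    """Hypothetical switched-resistor transistor: a resistor in series with a switch."""
    kind: str        # "nmos" or "pmos"
    r_on: float      # resistance in ohms when the switch is closed (assumed value)
    v_switch: float  # assumed gate voltage separating logic low from logic high

    def is_on(self, v_gate: float) -> bool:
        # NMOS conducts when the gate is high, PMOS when the gate is low.
        if self.kind == "nmos":
            return v_gate > self.v_switch
        return v_gate < self.v_switch

    def resistance(self, v_gate: float) -> Optional[float]:
        # None stands for an open circuit between source and drain.
        return self.r_on if self.is_on(v_gate) else None


# Example values chosen only for illustration.
pulldown = SwitchedResistor(kind="nmos", r_on=10e3, v_switch=2.5)
pullup = SwitchedResistor(kind="pmos", r_on=20e3, v_switch=2.5)
```

With a 5 V gate, `pulldown.resistance(5.0)` yields the 10 kΩ resistor while `pullup.resistance(5.0)` yields an open circuit, which is the behavior the model prescribes for a CMOS inverter with a high input.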

Figure 1: Switched Resistor Model

Figure 2: Clusters.

\(^1\)Our approach was inspired by Pillage[7], who first suggested the combination of Asymptotic Waveform Evaluation with piecewise linear models as the basis of a new kind of circuit simulator. A circuit simulator built upon those principles demonstrates speedups over SPICE of a factor of 6[3]. We extend that approach by giving up the full generality of a circuit simulation in order to achieve greater efficiency.

When a node changes value (node \( \text{in} \) in Figure 2), all transistors with a gate attached to the node will switch and Rsim must determine the response of all subcircuits containing those transistors. The subcircuits, known as "stages" or "clusters", are identified by finding all nodes connected to the source or drain of the switching transistor along some path of "on" transistors. In Figure 2 clusters \( X \) and \( Y \) need to be analyzed as a result of node \( \text{in} \) changing (cluster \( Z \) will be analyzed when node \( A \) changes). Note that clusters \( X \), \( Y \), and \( Z \) can be analyzed independently of each other and of all other subcircuits in the integrated circuit. The switched-resistor model allows Rsim to partition the circuit and take advantage of latency.
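The cluster search described above is essentially a graph traversal over source/drain connections of conducting transistors. The Python sketch below is a hedged illustration, not Rsim's implementation: the `Transistor` record, the function name `find_cluster`, and the decision to stop the traversal at the supply rails (`"vdd"` and `"gnd"`) are assumptions made here.

```python
from collections import deque, namedtuple

# Hypothetical transistor record: terminal names plus its current switch state.
Transistor = namedtuple("Transistor", "gate source drain on")


def find_cluster(start_nodes, transistors, rails=("vdd", "gnd")):
    """Return all nodes reachable from start_nodes through 'on' transistors.

    Supply rails are treated as cluster boundaries (an assumption of this sketch).
    """
    # Build source/drain adjacency using conducting transistors only.
    adjacency = {}
    for t in transistors:
        if t.on:
            adjacency.setdefault(t.source, set()).add(t.drain)
            adjacency.setdefault(t.drain, set()).add(t.source)

    cluster = set()
    queue = deque(n for n in start_nodes if n not in rails)
    while queue:
        node = queue.popleft()
        if node in cluster:
            continue
        cluster.add(node)
        for neighbour in adjacency.get(node, ()):
            if neighbour not in rails and neighbour not in cluster:
                queue.append(neighbour)
    return cluster


# Illustrative netlist: an inverter driving node "a", a conducting pass
# transistor from "a" to "c", and a second inverter whose gate is "a".
netlist = [
    Transistor("in", "gnd", "a", on=True),    # NMOS pull-down, conducting
    Transistor("in", "vdd", "a", on=False),   # PMOS pull-up, off
    Transistor("en", "a", "c", on=True),      # pass transistor, conducting
    Transistor("a", "gnd", "b", on=False),    # next stage NMOS, off
    Transistor("a", "vdd", "b", on=True),     # next stage PMOS, conducting
]
cluster_of_a = find_cluster(["a"], netlist)
```

Because gates do not conduct in this model, node `b` is not pulled into the cluster of `a`; it forms its own cluster, which is what lets the circuit be partitioned.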

To compute the response of a cluster Rsim analyzes the circuit formed when "on" transistors are replaced by resistors and "off" transistors by open circuits (Figure 3). For a typical MOS logic gate the resulting circuit is an RC tree, that is a tree of resistors with the root node grounded and capacitors to ground at every other node. This is convenient because the step response of an RC tree is well approximated by an exponential with a time constant equal to the first moment. In turn, the first moment can be efficiently computed ($O(n)$ complexity) via a depth first traversal of the tree.
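The first moment at a node of an RC tree is often called the Elmore delay, and the linear-time traversal can be sketched as follows. This is a minimal illustration rather than Rsim's code; the dictionary-based tree representation and the function names are assumptions made here, and the recursion is only suitable for small example trees.

```python
import math


def elmore_delays(children, resistance, capacitance, root):
    """First moment (Elmore delay) at every node of an RC tree.

    children[n]    -> list of child nodes of n
    resistance[n]  -> resistor between n and its parent (ohms)
    capacitance[n] -> capacitor from n to ground (farads)
    """
    downstream = {}

    def accumulate(node):
        # Post-order pass: total capacitance at and below each node.
        total = capacitance.get(node, 0.0)
        for child in children.get(node, ()):
            total += accumulate(child)
        downstream[node] = total
        return total

    accumulate(root)

    delay = {root: 0.0}

    def propagate(node):
        # Pre-order pass: each resistor carries the charge of everything below it.
        for child in children.get(node, ()):
            delay[child] = delay[node] + resistance[child] * downstream[child]
            propagate(child)

    propagate(root)
    return delay


def single_exponential(v_initial, v_final, tau, t):
    """Step response approximated by one exponential with time constant tau."""
    return v_final + (v_initial - v_final) * math.exp(-t / tau)


# Illustrative two-node chain: root -- 1 kOhm -- a (1 pF) -- 2 kOhm -- b (2 pF).
tree_children = {"root": ["a"], "a": ["b"]}
tree_r = {"a": 1e3, "b": 2e3}
tree_c = {"a": 1e-12, "b": 2e-12}
delays = elmore_delays(tree_children, tree_r, tree_c, "root")
```

For this chain the delay at `b` is 1 kΩ × (1 pF + 2 pF) + 2 kΩ × 2 pF = 7 ns. Each node is visited once per pass, which is where the $O(n)$ cost comes from.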

Figure 3: Equivalent Circuit.

3 Piecewise linear models

Although Rsim’s algorithm was described assuming the use of the switched-resistor model, it can accommodate more general piecewise linear models. Rsim depends upon two characteristics of the switched-resistor model: 1) the unidirectional coupling from the gate to the source and drain allows Rsim to partition the circuit and 2) the simplicity of the model permits efficient timing analysis. However, more general piecewise linear models can be chosen to retain both characteristics. For example, models may have more than two regions of linearity. In that case events are associated not only with transitions between the two states: “on” and “off” but also with transitions between any two adjacent regions. In addition, the model needn’t be a resistor. Rsim’s tree analysis can be extended to handle device models which include dependent sources.

More general piecewise linear models can yield substantial improvements over the switched-resistor model. One model that has proven to be particularly useful is depicted in Figure 4. Three regions of operation are modeled. If the gate-source voltage, $V_{gs}$, is less than a threshold, $V_t$, the transistor is off. If $V_{gs} > V_t$ and the drain-source voltage is large then the transistor is saturated and the current is largely determined by the gate-source voltage (although the output conductance $g_o$ models channel length modulation). If $V_{gs} > V_t$ and the drain-source voltage is small then the transistor is linear and the current depends only on the drain-source voltage. We will refer to this model as Mom’s Level-1 MOS model.
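To make the three regions concrete, here is a small Python sketch of one possible piecewise linear model with this shape. It is an assumption-laden illustration, not the model of Figure 4: the specific current expressions, the parameter names `gm` and `g_lin`, the restriction to an NMOS device with non-negative drain-source voltage, and the rule that the linear/saturated boundary lies where the two linear expressions meet are all choices made here.

```python
def level1_region(vgs, vds, vt, gm, go, g_lin):
    """Classify the operating region of a hypothetical 3-region NMOS model (vds >= 0)."""
    if vgs <= vt:
        return "off"
    i_sat = gm * (vgs - vt) + go * vds   # assumed saturated-region current
    i_lin = g_lin * vds                  # assumed linear-region current
    # Assumed boundary: the device follows whichever segment carries less current.
    return "linear" if i_lin < i_sat else "saturated"


def level1_current(vgs, vds, vt, gm, go, g_lin):
    """Drain current of the same hypothetical model, linear within each region."""
    region = level1_region(vgs, vds, vt, gm, go, g_lin)
    if region == "off":
        return 0.0
    if region == "linear":
        return g_lin * vds
    return gm * (vgs - vt) + go * vds


# Illustrative parameters only; they are not fitted to any process.
params = dict(vt=0.7, gm=1e-3, go=1e-5, g_lin=2e-3)
```

Within each region the current is a linear function of the terminal voltages, so a cluster containing such devices is still a linear circuit between region changes. An event is then needed whenever a device crosses from one region into an adjacent one.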

Figure 5 shows the I-V characteristics of this model superimposed over the I-V characteristics
of a SPICE transistor. The match is good because, for modern short channel devices, velocity
saturation tends to linearize what would otherwise be a quadratic dependence of the current upon
the gate-source voltage.

Figure 6 shows Mom’s output for a CMOS ring oscillator using the switched-resistor model and Mom’s Level-1 MOS model. It can be seen that the Level-1 MOS model brings a substantial improvement in waveform accuracy\(^2\).

---

\(^2\)Also apparent from the figure is Mom’s use of voltages rather than Boolean values to represent the state of nodes. Because Rsim is targeted towards CMOS logic, it assumes that all signals swing rail-to-rail and only records whether a signal swings to the positive or negative power supply rail. However, Mom must also simulate circuits with signals which don’t swing rail-to-rail (for example memory sense amplifiers and ECL logic gates). Consequently, Mom uses voltages to represent node state.
4 Timing analysis

Piecewise linear models require the use of more sophisticated timing analysis techniques. In principle, Rsim’s first moment timing analysis could be used. Rsim’s techniques for computing moments from an RC tree are readily generalized to allow piecewise linear transistor models. The response of clusters can then be approximated by (piecewise) exponentials. When we did this we discovered that the use of more accurate piecewise linear models yielded timing estimates that were worse rather than better. Although an exponential is a good approximation of the step response of an RC tree it is not necessarily a good approximation of the response of more sophisticated circuits. Consequently Mom employs a more general moments matching technique[2, 8]. Instead of using just the first moment to create a waveform approximation consisting of a single exponential, Mom also uses additional higher order moments to create a waveform approximation consisting of the sum of exponentials.
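As an illustration of how higher order moments lead to a sum of exponentials, the sketch below performs a textbook two-pole moment match. It is not Mom's procedure, which is not detailed here. The assumptions are: the moments $m_0 \ldots m_3$ are the Taylor coefficients of a transfer function $H(s) = m_0 + m_1 s + m_2 s^2 + m_3 s^3 + \ldots$ about $s = 0$ (other sign and scaling conventions exist), the approximation has the form $k_1/(s - p_1) + k_2/(s - p_2)$, and the function names are hypothetical.

```python
import cmath


def two_pole_match(m0, m1, m2, m3):
    """Match four moments with two poles; returns [(pole, residue), (pole, residue)]."""
    det = m0 * m2 - m1 * m1
    if det == 0:
        raise ValueError("moments do not determine a two-pole model")
    # With x_i = 1/p_i, the moments obey m[j+2] + a1*m[j+1] + a0*m[j] = 0.
    # Solve the 2x2 system for a0 and a1 by Cramer's rule.
    a0 = (m1 * m3 - m2 * m2) / det
    a1 = (m1 * m2 - m0 * m3) / det
    # Reciprocal poles are the roots of x^2 + a1*x + a0 = 0.
    root = cmath.sqrt(a1 * a1 - 4.0 * a0)
    x1 = (-a1 + root) / 2.0
    x2 = (-a1 - root) / 2.0
    if x1 == x2 or x1 == 0 or x2 == 0:
        raise ValueError("degenerate poles; this sketch does not handle them")
    # m0 = c1 + c2 and m1 = c1*x1 + c2*x2, where c_i = -k_i * x_i.
    c1 = (m1 - m0 * x2) / (x1 - x2)
    c2 = m0 - c1
    return [(1.0 / x1, -c1 / x1), (1.0 / x2, -c2 / x2)]


def impulse_response(pole_residue_pairs, t):
    """Evaluate the sum of exponentials k_i * exp(p_i * t)."""
    return sum(k * cmath.exp(p * t) for p, k in pole_residue_pairs).real


# Moments of the known function 1/(s + 1) + 2/(s + 3), used as a self-check.
example = two_pole_match(5.0 / 3.0, -11.0 / 9.0, 29.0 / 27.0, -83.0 / 81.0)
```

Matching two poles requires four moments, and the pole computation involves a polynomial root solve whose cost grows with the number of poles, whereas each additional moment only adds another pass over the tree.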

Note that the increased computational cost of this timing analysis technique relative to Rsim’s is mostly due to the estimation of multiple poles rather than the computation of additional moments. If the circuit has a tree topology (and most do) then the cost of moment computation rises only linearly with the size of the cluster and the number of moments. However, the cost of computing poles rises superlinearly with the number of poles. For this reason Mom restricts waveform approximations to three or fewer poles.

5 Demonstration

We will now demonstrate the flexibility of Mom on a number of CMOS, ECL, and BiCMOS circuits which are beyond the capabilities of conventional switch-level simulators.

The dynamic RAM is an interesting example because, although Rsim can simulate most of the circuits in the RAM, it has problems with the sense amplifier (Figure 7). The sensing phase (Figure 8) begins with the two bit lines, *bit* and $\overline{\text{bit}}$, charged to slightly different voltages. A rising transition on *sense* turns on the sense amplifier to magnify this voltage difference. When $T3$ turns on, $s$ begins to fall, which will cause either $T1$ or $T2$ to turn on, depending upon which bit line is higher. For example, if *bit* is higher then $T2$ will turn on, $\overline{\text{bit}}$ will be pulled to ground, and *bit* will be pulled to Vdd.

Figure 7: Dynamic RAM Cell and Sense Amplifier

Figure 8: DRAM Bit Lines: Read followed by Precharge

Note that $T1$ or $T2$ is turned on by pulling the source terminal low. Because the switched-resistor model can only be turned on by pulling the gate terminal high, it is incapable of modeling the behavior of those two transistors. However, if those transistors are simulated using Mom’s Level-1 MOS model, the correct circuit behavior can be obtained. Figure 8 shows plots of the bit line waveforms generated by SPICE and Mom for a read, sense, precharge sequence. For Mom’s simulation the Level-1 model was used only for $T1$, $T2$, and $T3$. Everywhere else switched-resistor models were employed. Although Mom’s response differs from SPICE’s, it is adequate for a first order verification of the entire DRAM. Table 1 shows that for this example Mom is 250 times faster than SPICE.

<table>
<thead>
<tr>
<th>Circuit Type</th>
<th>SPICE</th>
<th>Mom</th>
<th>SPICE/Mom</th>
</tr>
</thead>
<tbody>
<tr>
<td>DRAM Cell</td>
<td>18.8</td>
<td>.075</td>
<td>250</td>
</tr>
<tr>
<td>ECL RAM Cell</td>
<td>4.2</td>
<td>.102</td>
<td>41</td>
</tr>
<tr>
<td>BiCMOS Buffer</td>
<td>5.1</td>
<td>.008</td>
<td>650</td>
</tr>
<tr>
<td>DRAM (21k devices)</td>
<td>—</td>
<td>13.900</td>
<td>&gt;1890</td>
</tr>
</tbody>
</table>

Table 1: Execution Time of Example Circuits (seconds).

The ECL switch-level simulator, Bisim[6], is based upon tracing paths through current steering networks formed by bipolar transistors (Figure 9). Negative current can be thought of as originating from the current source at the bottom of the network and rising towards the top. When the current encounters a node with multiple emitters attached, it is steered through the transistor with the highest base. If the current encounters a resistor then it has reached an output and the resulting voltage drop causes the output to fall. Thus a simple path tracing algorithm is sufficient to determine the behavior of textbook ECL logic gates.

However, under certain circumstances current isn’t simply switched between one transistor or another but rather is shared. For example, in an ECL RAM (Figure 10) current from a single current source is divided between all cells attached to the bottom word line. Simple path tracing algorithms can’t determine how current should be shared. Figure 11 shows the output of SPICE and Mom during a cell write operation. For this example Mom is 41 times faster than SPICE.

An increasing number of circuit designs utilize both bipolar and MOS transistors on the same chip. Unfortunately, neither Rsim nor Bisim can handle these new BiCMOS designs. Figure 12 shows one variation of the BiCMOS buffer. For this example we trade off waveform accuracy for simulation efficiency by selecting the switched-resistor model for the MOS transistors. The outputs of SPICE and Mom are compared in Figure 13. A comparison of execution times reveals that Mom is 650 times faster than SPICE.

Figure 12: BiCMOS Buffer.

Figure 13: BiCMOS Buffer Response.

The preceding benchmarks understate the potential benefit of switch-level simulation because they are small and have little latency. The last circuit is a complete 16k bit DRAM (about 21,000 transistors), including the entire array, address decoders, data multiplexors, and control. Because Mom partitions the circuit and takes advantage of latency, the circuit can be simulated efficiently. Although the complete DRAM is 1400 times larger than the single cell DRAM circuit (in terms of transistor count), Mom needs only 185 times as much CPU time (Table 1) to simulate it\(^3\). For this example, Mom’s execution time grows sublinearly with the size of the circuit. By comparison, SPICE’s computational requirements tend to grow superlinearly with circuit size; the complete DRAM was too large to simulate using SPICE. However, even if SPICE’s execution times were to scale linearly with circuit size, Mom would still be over 1800 times faster.
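The scaling figures quoted above follow directly from the Table 1 entries. The short Python check below reproduces the arithmetic; the variable names are introduced here, and the linear extrapolation of SPICE's run time is the hypothetical best case discussed in the text, not a measurement.

```python
# Execution times in seconds, taken from Table 1.
spice_cell_s = 18.8    # SPICE, single cell DRAM circuit
mom_cell_s = 0.075     # Mom, single cell DRAM circuit
mom_full_s = 13.9      # Mom, complete DRAM (21k devices)
size_ratio = 1400      # complete DRAM vs. single cell circuit, by transistor count

# How much more CPU time Mom needs for the complete DRAM.
mom_cpu_ratio = mom_full_s / mom_cell_s

# Hypothetical SPICE time if its cost grew only linearly with circuit size.
spice_full_linear_s = spice_cell_s * size_ratio

# Lower bound on Mom's speedup over SPICE for the complete DRAM.
speedup_lower_bound = spice_full_linear_s / mom_full_s
```

Here `mom_cpu_ratio` is about 185 and `speedup_lower_bound` is a little above 1890, consistent with the ">1890" entry in the table.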

6 Performance

The previous section illustrated the additional flexibility obtained by incorporating piecewise linear models into the switch-level framework. However, increased generality usually comes at the expense of decreased efficiency. To investigate this issue a number of ring oscillators were simulated using SPICE-3d2, Mom, Irsim, and Bisim\(^4\). Ring oscillators were built using a CMOS inverter, CMOS NAND gate, ECL inverter, and BiCMOS buffer at each stage. In addition the two CMOS ring oscillators were simulated using both the switched-resistor model (“CMOS0”) and Mom’s Level-1 MOS model (“CMOS1”). Many periods of oscillation were simulated in order to wash out the effects of simulator initialization. The relative efficiencies of the simulators were compared based on the amount of CPU time required to simulate identical numbers of oscillations. Table 2 shows the results. The first two columns give the speedup of Mom over SPICE-3d2 and the degradation of Mom relative to the switch-level simulators. The last column reports the percentage error in Mom’s prediction of the period of oscillation relative to SPICE.

<table>
<thead>
<tr>
<th rowspan="2">Circuit</th>
<th colspan="2">Ratio of CPU times</th>
<th rowspan="2">Period Error (%)</th>
</tr>
<tr>
<th>SPICE / Mom</th>
<th>Mom / (Bi/Ir)sim</th>
</tr>
</thead>
<tbody>
<tr>
<td>CMOS0 Inverter Ring</td>
<td>1600</td>
<td>2.7</td>
<td></td>
</tr>
<tr>
<td>CMOS1 Inverter Ring</td>
<td>80</td>
<td>53.2</td>
<td></td>
</tr>
<tr>
<td>CMOS0 NAND Ring</td>
<td>2300</td>
<td>1.5</td>
<td></td>
</tr>
<tr>
<td>CMOS1 NAND Ring</td>
<td>81</td>
<td>42.0</td>
<td></td>
</tr>
<tr>
<td>ECL Ring</td>
<td>230</td>
<td>3.3</td>
<td></td>
</tr>
<tr>
<td>BiCMOS Buffer Ring</td>
<td>1400</td>
<td>—</td>
<td></td>
</tr>
</tbody>
</table>

Table 2: Simulator Performance on Ring Oscillators.

The efficiency of switch-level simulation is evident from the table. Bisim and Irsim are from 760 to 4200 times faster than SPICE. In addition, Mom’s increased generality exacts only a moderate performance penalty when switch-level models are employed. For circuits using the MOS switched-resistor model (“CMOS0 Inverter Ring” and “CMOS0 NAND Ring”) Mom is between 1.5 and 2.7 times slower than Irsim. For the ECL ring Mom is 3.3 times slower than Bisim. Note that for these models the accuracy of Mom is comparable to that of the switch-level simulators.

\(^3\)The single cell DRAM circuit consists of one column whereas the complete DRAM has 128 columns. Since an access activates all 128 columns, the simulation of the complete DRAM involves at least 128 times as much work as the simulation of a single column.

\(^4\)SPICE-3d2 is a derivative of the circuit simulator SPICE, and Irsim is a derivative of the MOS switch level simulator Rsim.

As more accurate models are used Mom’s efficiency decreases rapidly. When MOS Level-1 models are used in the CMOS ring (“CMOS1 Inverter Ring”) Mom slows down by a factor of 20. In return, Mom achieves increased accuracy. For this pair of circuits the period estimated by Mom is off by only 2.6% and 1.2% relative to SPICE. Such low errors are generally beyond the capabilities of Irsim and Bisim.

Execution profiles revealed the source of the speed degradation. When Mom simulated the CMOS inverter ring using the switched-resistor models, 68% of the execution time was spent in timing analysis and 18% of the time was spent rescheduling transistors. However, when Mom’s Level-1 MOS models were used, 25% of the time was spent in timing analysis and 69% of the time was spent rescheduling transistors.

A couple of factors contribute to the increased cost of rescheduling. Devices with greater numbers of regions require more checks for region changes. If the switched-resistor model is off, it is only necessary to check whether the model will turn on. In contrast, two checks must be made for the Level-1 MOS model. For example, if the model is in the linear region it is necessary to check whether the model will enter the saturated region or the off region. Additional expense is incurred because waveforms consist of sums of exponentials rather than just single exponentials. The root of an equation which has a single exponential can be found explicitly. The root of an equation which has the sum of three exponentials must be found iteratively.
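The difference between the two root-finding problems can be seen in a short sketch. The code below is illustrative and is not Mom's scheduler: the function names are hypothetical, the waveforms are assumed to have the form $v(t) = v_f + \sum_i a_i e^{p_i t}$, and bisection is used as the iterative method simply because it is easy to verify, since the method Mom actually uses is not stated here.

```python
import math


def crossing_single(v0, vf, tau, v_th):
    """Explicit crossing time for v(t) = vf + (v0 - vf) * exp(-t / tau)."""
    if v_th == vf:
        return None                      # approached only asymptotically
    ratio = (v0 - vf) / (v_th - vf)
    if ratio < 1.0:
        return None                      # threshold is not between v0 and vf
    return tau * math.log(ratio)


def crossing_sum(vf, terms, v_th, t_max, iterations=100):
    """Bisection for v(t) = vf + sum(a * exp(p * t)) crossing v_th in [0, t_max].

    terms is a list of (a, p) pairs. Returns None unless the endpoints bracket
    a crossing; multiple crossings inside the interval are not distinguished.
    """
    def f(t):
        return vf + sum(a * math.exp(p * t) for a, p in terms) - v_th

    lo, hi = 0.0, t_max
    f_lo = f(lo)
    if f_lo == 0.0:
        return 0.0
    if f_lo * f(hi) > 0.0:
        return None
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Illustrative waveforms: a 5 V node discharging towards 0 V.
t_single = crossing_single(5.0, 0.0, 1e-9, 2.5)
three_terms = [(3.0, -1e9), (1.5, -3e9), (0.5, -1e10)]
t_three = crossing_sum(0.0, three_terms, 2.5, 1e-8)
```

The single exponential needs one logarithm, while the three-term waveform needs repeated evaluation of three exponentials per step. A device with more regions multiplies this cost because every candidate region boundary requires its own crossing check.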

7 Conclusion

We have shown that Rsim’s basic switch-level simulation framework can accommodate more general piecewise linear transistor models along with the original switched-resistor model. These more general models can be incorporated without seriously impairing the simulator’s efficiency for the simplest cases. That is, when the simplest switch-level models are used, our simulator, Mom, achieves speeds and accuracies comparable to those of dedicated switch-level simulators. In addition, the more general models give the simulator greater flexibility. Mom can simulate circuits that can’t be simulated by Rsim or Bisim, with substantial speedups over SPICE.

This approach is particularly well suited for simulating circuits that are just beyond the capabilities of switch-level simulation. Frequently most of a circuit can be simulated using switch-level models and only small portions require more accurate models. Because Mom has been structured such that the additional generality is paid for only where it is used it can simulate those circuits with only a minor degradation of efficiency.

Our experiments uncovered some limitations of the approach. Benchmarks show that the cost of rescheduling devices rises rapidly as the complexity of transistor models is increased. Unless this cost is ameliorated, the approach could lose its speed advantage when models approaching the accuracy and generality of SPICE’s nonlinear models are used.

---

\(^5\)The low error of the CMOS0 inverter ring is not representative. It occurred only because that circuit was used to calibrate the switch-level model.
8 Acknowledgements

Jeffrey Mogul and the anonymous referees provided many helpful comments on an early draft of this paper. This research was supported in part by DARPA contract N00039-91-C-1038.

References

