## 18.1 A 20GS/s 8b ADC with a 1MB Memory in 0.18μm CMOS

Ken Poulton, Robert Neff, Brian Setterberg, Bernd Wuppermann, Tom Kopley, Robert Jewett, Jorge Pernillo, Charles Tan, Allen Montijo<sup>1</sup>

Agilent Laboratories, Palo Alto, CA; <sup>1</sup>Agilent Technologies, Colorado Springs, CO

Recent work [1] brought CMOS into the world of ADCs for real-time high-bandwidth oscilloscopes. This work extends the reach of CMOS ADCs by a factor of 5 to a sample rate of 20GS/s and a bandwidth of 6GHz, using standard digital 0.18μm CMOS to create the world's fastest 8-bit ADC. This chip also solves the problem of capturing the resulting 20GB/s data stream by including a 1MB on-chip sample memory.

The architecture of the ADC is shown in Fig. 18.1.1. The analog input goes to a BiCMOS buffer chip which drives the 4pF input capacitance of the CMOS chip. The ADC is organized in 80 slices, each consisting of an input track and hold (T/H), a transconductor to convert voltage to current (V/I), a reduced-radix current-mode pipeline ADC, and a radix converter, all running at 250MS/s. Each slice is clocked 50 ps after the previous one. The 1GHz input clock drives a delay-locked loop (DLL), interpolators, and dividers to create 80 250-MHz clocks. The data are stored into a 1MB on-chip memory at up to 20GS/s. The decimator can subsample the data to allow longer time records in the memory. When an acquisition is finished, the data are read out from the memory at 250MS/s.
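The interleaving numbers in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes that "1MB" means 2**20 one-byte samples and that the decimator keeps every N-th sample.

```python
# Back-of-the-envelope check of the interleaving numbers quoted in the text.
N_SLICES = 80            # time-interleaved pipeline slices
SLICE_RATE_HZ = 250e6    # conversion rate of each slice
MEMORY_BYTES = 2**20     # assumption: "1MB" read as 2**20 one-byte samples


def aggregate_rate_hz(n_slices=N_SLICES, slice_rate_hz=SLICE_RATE_HZ):
    """Combined sample rate of all slices."""
    return n_slices * slice_rate_hz


def slice_offset_s(n_slices=N_SLICES, slice_rate_hz=SLICE_RATE_HZ):
    """Clock offset between adjacent slices."""
    return 1.0 / (n_slices * slice_rate_hz)


def record_length_s(decimation=1, memory_bytes=MEMORY_BYTES):
    """Time span held by the memory when every `decimation`-th sample is kept."""
    return memory_bytes * decimation / aggregate_rate_hz()


def readout_time_s(memory_bytes=MEMORY_BYTES, readout_rate_hz=250e6):
    """Time needed to read the whole memory at the 250MS/s readout rate."""
    return memory_bytes / readout_rate_hz
```

Under these assumptions the full-rate record is roughly 52μs long, and reading the whole memory back at 250MS/s takes a little over 4ms.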

The key function of the input buffer chip is to provide a well-matched $50\Omega$ input termination while driving a 4pF load on the CMOS chip through bond wires, with a gain response that is flat from DC to 6GHz.

The simplified schematic of the input buffer chip is shown in Fig. 18.1.2. Differential inputs are terminated with a combination of fixed resistors and linear-region FETs used as electrically-variable resistors. The parasitic reactances are neutralized by incorporating them into an L-C-L transmission line consisting of the bond wire, the pad and input transistor capacitances, and an on-chip inductor. Emitter follower input buffers drive a Cherry-Hooper differential amplifier [2].

It is important to control the peaking of the output RLC circuit across process variations. This is done by adjusting the current in the output followers: the current sets their $g_m$, which in turn controls the damping of the RLC circuit. The buffer chip is built in a 40GHz-$f_T$ SiGe BiCMOS process and dissipates 1W.

On the CMOS chip, all 80 T/H circuits are directly connected to the input pads. Each differential T/H is a pair of NMOS FET pass-gate circuits, followed by a differential transconductor stage to drive the current-mode pipeline ADC.

Each pipeline ADC uses 12 stages with radix 1.6 to provide tolerance for large mismatch errors, allowing the use of small devices to save power. A digital radix converter turns the 12b radix-1.6 data into 8b binary. The coefficient registers in the radix converter are programmed by calibration software to correct the per-stage variations in the pipelines.
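A minimal sketch of what such a radix converter computes, assuming each stage decision contributes a signed weight and the weighted sum is then rescaled to 8 bits. The nominal weights, the ±1 normalisation, and the function names are assumptions made for this illustration; on the chip the calibrated coefficient registers would take the place of `nominal_weights()`.

```python
RADIX = 1.6
N_STAGES = 12


def nominal_weights(radix=RADIX, n_stages=N_STAGES):
    """Weight of each stage decision, first stage first, for a +/-1 full scale."""
    # Assumption: ideal stages, so the weights shrink by exactly 1/radix per stage.
    return [(radix - 1.0) / radix ** (k + 1) for k in range(n_stages)]


def radix_to_binary(bits, weights, n_out_bits=8):
    """Map one pipeline word (list of 0/1 decisions) to an unsigned binary code."""
    # Each decision contributes +w or -w, so the sum lies roughly in (-1, +1).
    value = sum(w if b else -w for b, w in zip(bits, weights))
    levels = 2 ** n_out_bits
    code = int((value + 1.0) / 2.0 * levels)
    return max(0, min(levels - 1, code))      # clamp to the output range
```

With these nominal weights the smallest one is about 0.002 of half-scale, finer than the 8b step of 2/256 ≈ 0.008, which is why 12 radix-1.6 decisions are enough for an 8b result.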

The simplified schematic of one stage of the current-mode pipeline ADC [1] is shown in Fig. 18.1.3. The differential input current goes to a pair of 1.6x current mirrors, with NMOS pass gates inserted to operate as T/H circuits. The comparator senses the input polarity and causes a 1b DAC current to be added to or subtracted from the output current. This pipeline has several advantages: (i) low power: 57mW per slice, including transconductor and radix converter; (ii) low area: 0.12mm<sup>2</sup> per slice; (iii) compatibility with digital CMOS: no linear resistors or capacitors are required. At 250MS/s, this pipeline runs twice as fast as the 0.35μm pipeline in [1], with 25% less power and 60% less area. It achieves a power figure of merit of 2.3pJ/conversion/level.
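The stage behaviour described above (multiply by 1.6, then add or subtract a 1b DAC current according to the comparator) can be modelled in a few lines. This is a behavioural sketch, not the circuit: the DAC step of `radix - 1` in normalised units and the `comparator_offsets` parameter are assumptions chosen to show why a radix below 2 tolerates comparator errors.

```python
def pipeline_convert(x, radix=1.6, n_stages=12, comparator_offsets=None):
    """Behavioural model; x is the input current normalised to +/-1 full scale."""
    offsets = comparator_offsets or [0.0] * n_stages
    dac = radix - 1.0          # assumption: DAC step that keeps |residue| <= 1
    bits, residue = [], x
    for k in range(n_stages):
        # The comparator senses polarity, possibly with an offset error.
        bit = 1 if residue >= offsets[k] else 0
        # Mirror gain of `radix`, then subtract or add the 1b DAC current.
        residue = radix * residue - (dac if bit else -dac)
        bits.append(bit)
    return bits, residue


def reconstruct(bits, radix=1.6):
    """Weighted sum of the decisions; the error is the final residue / radix**N."""
    dac = radix - 1.0
    return sum((dac if b else -dac) / radix ** (k + 1) for k, b in enumerate(bits))
```

In this model a comparator offset of up to 0.25 of full scale still leaves the residue inside ±1, so the reconstruction error stays below 1.6<sup>-12</sup> of full scale. Gain mismatch between stages is not modelled here; that is what the calibrated radix-converter coefficients address.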

The clock system must create 80 250MHz clocks, each offset by 50ps from its neighbors with an error of less than 1ps. Starting with a single 1GHz input clock, the ADC uses a DLL to generate 5 differential clock phases. Interpolation is used to get 20 single-ended phases. These are divided by 4 to provide the 80 time-interleaved 250MHz clocks, each of which has a digital timing adjustment circuit.
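The phase arithmetic works out as follows: 5 differential phases give 10 edges per 1ns clock period (100ps apart), interpolation doubles that to 20 phases (50ps apart), and dividing each phase by 4 spreads them over a 4ns frame. The sketch below enumerates the nominal edge times; the ordering of divider states relative to slices is an assumption made for illustration.

```python
CLOCK_PERIOD_PS = 1000     # 1GHz input clock
N_PHASES = 20              # single-ended phases after interpolation
DIVIDE_BY = 4              # each phase is divided down to 250MHz


def slice_clock_edges_ps():
    """Nominal sampling instant of each of the 80 slice clocks in one 4ns frame."""
    step = CLOCK_PERIOD_PS // N_PHASES            # 50ps between adjacent phases
    edges = []
    for divider_state in range(DIVIDE_BY):        # which 1GHz cycle of the frame
        for phase in range(N_PHASES):             # which interpolated phase
            edges.append(divider_state * CLOCK_PERIOD_PS + phase * step)
    return edges
```

The 80 edges land on a uniform 50ps grid from 0 to 3950ps; the per-clock digital timing adjustment then trims each edge toward that grid.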

The memory system (Fig. 18.1.4) achieves a 20GB/s write rate with a minimum of digital noise generation. It is divided into 8 groups of SRAM blocks, each with its own controller. To reduce bit-line charging power, the 1MB total of SRAM is broken into small 4KB blocks to keep the internal lines short. To allow adequate timing margin, the 4ns data stream from the slices is demultiplexed by 2. Each row of SRAMs has 16 active blocks and one spare; the controller performs self-test and repair of the SRAM arrays.

Variations in chip temperature degrade calibration accuracy, so dummy writes are used to keep the write rate (and power) constant at all times except during memory readout. The 6mm-long data lines are complementary and precharged on each cycle to keep their power consumption independent of data patterns. To spread out the clock-related supply-current spikes at the 4ns write rate, the 8 memory group clocks are staggered at 0.5ns intervals.
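The memory organization and clock stagger can be tallied with simple arithmetic. In this sketch "1MB" and "4KB" are read as binary sizes, the spare blocks are assumed not to count toward the 1MB, and rows are assumed to be split evenly across the 8 groups; none of these details is stated explicitly in the text.

```python
MEMORY_BYTES = 2**20        # assumption: "1MB" read as 2**20 bytes
BLOCK_BYTES = 4 * 2**10     # assumption: "4KB" read as 4096 bytes
ACTIVE_BLOCKS_PER_ROW = 16
N_GROUPS = 8
WRITE_PERIOD_NS = 4.0
STAGGER_NS = 0.5


def memory_layout():
    """Block and row counts implied by the sizes quoted in the text."""
    active_blocks = MEMORY_BYTES // BLOCK_BYTES
    rows = active_blocks // ACTIVE_BLOCKS_PER_ROW
    return {
        "active_blocks": active_blocks,
        "rows": rows,
        "spare_blocks": rows,                 # one spare per row
        "rows_per_group": rows // N_GROUPS,   # assumption: rows split evenly
    }


def group_clock_offsets_ns():
    """Write-clock offset of each memory group within one write period."""
    return [g * STAGGER_NS for g in range(N_GROUPS)]
```

Eight groups at 0.5ns spacing fill the 4ns write period exactly, so the supply-current spikes are spread evenly across each cycle.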

The two chips are packaged in a custom 438-ball BGA (ball-grid array) package (Fig. 18.1.5). Direct chip-to-chip wirebonding minimizes the inductance between the chips. Total power is 10W.

Calibration is performed in software. Using a sawtooth calibration waveform, per-slice gain and offset values are measured and corrected with 160 on-chip DACs. Then gain coefficients for each stage of each pipeline are calculated and loaded into the radix converter, so that corrected 8b binary data comes out. Next, Fourier analysis of a pulse waveform is used to set the on-chip per-slice timing adjustments.
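The first calibration step, estimating per-slice gain and offset from a known ramp, could be sketched as a per-slice least-squares line fit. The paper does not describe the actual software, so the fitting method, the function names, and the assumption that sample `i` comes from slice `i % 80` are all illustrative.

```python
def fit_gain_offset(ideal, measured):
    """Least-squares line measured ~= gain * ideal + offset for one slice."""
    n = len(ideal)
    mean_x = sum(ideal) / n
    mean_y = sum(measured) / n
    sxx = sum((x - mean_x) ** 2 for x in ideal)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ideal, measured))
    gain = sxy / sxx
    return gain, mean_y - gain * mean_x


def calibrate_slices(samples, ramp, n_slices=80):
    """Fit each slice using only the samples that slice took (every n_slices-th)."""
    return [fit_gain_offset(ramp[s::n_slices], samples[s::n_slices])
            for s in range(n_slices)]
```

Each fitted (gain, offset) pair would then be turned into settings for that slice's correction DACs; 160 DACs is consistent with two per slice.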

The differential nonlinearity (DNL) of the ADC is $\pm 0.3$ LSBs; the intrinsic integral nonlinearity (INL) is $\pm 1.7$ LSBs, mainly due to third-harmonic distortion (HD3) in the open-loop transconductor circuit. With an external HD3 post-correction, INL is $\pm 0.4$ LSBs.

As shown in Fig. 18.1.6, ADC accuracy reaches 6.5 effective bits at input frequencies up to 500MHz, mainly limited by thermal noise. It achieves 4.6 effective bits for a full-scale input at 6GHz, limited by jitter. Total jitter is 0.7ps rms, including external clock jitter, on-chip thermal jitter and residual timing misalignment. Misalignment after calibration is less than 0.4 ps rms. The ADC achieves a full-scale input signal bandwidth of 6.6GHz.
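The jitter-limited figure can be sanity-checked with the standard aperture-jitter relation SNR = -20·log10(2π·f·σ) and the usual conversion ENOB = (SNDR - 1.76)/6.02. The sketch below is a rough model, not the authors' analysis; it treats the low-frequency 6.5-bit result as a jitter-free noise floor, which is an assumption.

```python
import math


def jitter_limited_snr_db(f_in_hz, jitter_rms_s):
    """SNR of a full-scale sine when aperture jitter is the only error source."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_rms_s)


def enob_from_sndr_db(sndr_db):
    """Effective bits from a signal-to-noise-and-distortion ratio in dB."""
    return (sndr_db - 1.76) / 6.02


def sndr_from_enob_db(enob):
    """Inverse of enob_from_sndr_db."""
    return 6.02 * enob + 1.76


def combine_db(*snr_db):
    """Combine independent noise sources given as SNRs in dB."""
    return -10.0 * math.log10(sum(10.0 ** (-s / 10.0) for s in snr_db))
```

For 0.7ps rms at 6GHz this gives a little under 5 effective bits from jitter alone, and slightly less once the 6.5-bit floor is added. That is somewhat above the measured 4.6 bits, which suggests other contributions at 6GHz that this simple model leaves out.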

Figure 18.1.7 shows the major results. The sample rate of this ADC is twice that of any other ADC of 6 or more bits [4] and five times that of any other CMOS ADC [1].

## Acknowledgments

The authors thank Robert Saponas, Ken Rush, John Kerley, Bob Warren and Darrin Walraven for their contributions.

## References

[1] Ken Poulton, Robert Neff, Art Muto, Wei Liu, Andy Burstein, Mehrdad Heshami, "4 GSample/s 8b ADC in 0.35µm CMOS", *ISSCC Digest of Technical Papers*, pp. 166-167, Feb. 2002.

[2] E. Cherry and D. Hooper, "The Design of Wideband Transistor Feedback Amplifiers," Proc. *IEEE*, vol. 110, pp. 375-389, Feb. 1963.

[3] P. Scholtens, et al., "6-bit 1.6-GSa/s Flash ADC in 0.18-μm CMOS Using Averaging Termination", *ISSCC Digest of Technical Papers*, Feb. 2002.

[4] "10Gbps Single-chip A-D Converter", *Nikkei Electronics*, pp. 22-23, Mar. 11, 2002.




| Parameter                      | Value                 |
|--------------------------------|-----------------------|
| Sample Rate                    | 20 GSa/s              |
| Resolution                     | 8 bits                |
| INL, intrinsic                 | ±1.7 LSBs             |
| INL, with linearity correction | ±0.4 LSBs             |
| DNL                            | ±0.3 LSBs             |
| Bandwidth                      | 6.6 GHz               |
| Accuracy @ 500 MHz input       | 6.5 effective bits    |
| Accuracy @ 6 GHz input         | 4.6 effective bits    |
| Jitter                         | 0.7 ps rms            |
| Input Range                    | 0.25 Vpk differential |
| Package                        | 438-ball BGA          |

| Parameter         | Buffer Chip        | ADC Chip     |
|-------------------|--------------------|--------------|
| Input Capacitance | 0.2 pF             | 4 pF         |
| Power             | 1 W                | 9 W          |
| Chip Size         | 1.2 x 2.6 mm       | 14 x 14 mm   |
| Technology        | 40-GHz SiGe BiCMOS | 0.18-μm CMOS |
| Transistors       | 1000               | 50M          |

## Figure 18.1.7: ADC results.

DIGEST OF TECHNICAL PAPERS
