## UC Irvine Electronic Theses and Dissertations

### Title

Transceiver Design for Mobile Networks: Tackling mm-Wave High-Speed Link Challenges and Sub-6GHz Mobile Terminal Blocking Problems

Permalink https://escholarship.org/uc/item/70s1t9ng

**Author** Wang, Huan

**Publication Date** 2020

Peer reviewed|Thesis/dissertation

# UNIVERSITY OF CALIFORNIA, IRVINE

Transceiver Design for Mobile Networks: Tackling mm-Wave High-Speed Link Challenges and Sub-6GHz Mobile Terminal Blocking Problems

#### DISSERTATION

submitted in partial satisfaction of the requirements for the degree of

#### DOCTOR OF PHILOSOPHY

in Electrical Engineering

by

Huan Wang

Dissertation Committee: Professor Payam Heydari, Chair; Professor A. Lee Swindlehurst; Professor Ozdal Boyraz; Assistant Professor Hamidreza Aghasi

© 2019 IEEE © 2020 Huan Wang

# DEDICATION

To my family.

# TABLE OF CONTENTS

| Section | Title |
|---------|-------|
|         | LIST OF FIGURES |
|         | LIST OF TABLES |
|         | ACKNOWLEDGMENTS |
|         | VITA |
|         | ABSTRACT OF THE DISSERTATION |
| 1       | Introduction to Transceiver Design Challenges in Mobile Networks |
| 2       | High-Order QAM Direct Modulation Transmitter for mm-Wave High-Speed Point-to-Point Wireless Links |
| 2.1     | Background Introduction |
| 2.2     | High-Speed DAC Design Challenges |
| 2.2.1   | Speed and Resolution |
| 2.4     | 16QAM Direct Modulation TX Prototype Design |
| 3       | LO Leakage Suppression in Wideband, Blocker-Tolerant, Highly Selective and Widely Tunable Receivers |
| 3.1     | Introduction and Motivation |
| 3.2     | LO Leakage Bottleneck |
| 3.3     | Band-Pass Common-Gate Structure |
| 3.3.1   | High-Q Input Selectivity |
| 3.3.2   | LO Leakage Analysis |
| 3.3.3   | Parasitic Effects |
| 3.3.4   | Stability Analysis |
| 3.4     | Receiver Design |
| 3.4.1   | Noise and Linearity Consideration |
| 3.4.2   | Common-Gate Down-Conversion Path |
| 3.4.3   | 8-Phase LO Generation |
| 3.4.4   | Complete RX Implementation |
| 3.5     | Measurements |
| 3.6     | Conclusion |
|         | Bibliography |

# LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| 1.1  | Communications between base-stations and backbone network. |
| 1.2  | Mobile terminal interference blocking problems. |
| 2.1  | Atmospheric attenuation in clear air. |
| 2.2  | (a) DAC-based direct quadrature up-conversion TX. (b) DAC-based IF up-conversion TX. |
| 2.3  | TX chain frequency response equivalent BB model. |
| 2.4  | Eye-diagrams of symbol-rate and $2\times$ symbol-rate DACs with FFE. |
| 2.5  | 4-bit and 8-bit quantization for PAM8. |
| 2.6  | Current-steering DAC topology. |
| 2.7  | DAC non-linearity impact on a BB PAM8 eye-diagram. |
| 2.8  | (a) Generation of 16QAM constellation by adding two QPSKs with an amplitude ratio of 2. (b) Error vector distribution in each individual QPSK. (c) Generalization to $4^{M}$-QAM constellation. |
| 2.9  | Simplified block diagram of the proposed TX. |
| 2.10 | (a) Settling trajectory of QPSK signal from symbol $S_2$ to $S_1$. (b) QPSK constellation errors due to TX BW limitation. |
| 2.11 | (a) Illustration of QPSK amplitude mismatch effect. (b) EVM of 16QAM as a function of amplitude mismatch. |
| 2.12 | (a) Surface plot of 64QAM EVM with amplitude mismatch. (b) Contours of 64QAM EVM surface plot. |
| 2.13 | (a) Phase mismatch effect on 16QAM. (b) EVM of 16QAM as a function of phase mismatch. |
| 2.14 | (a) Surface plot of 64QAM EVM with amplitude mismatch. (b) Contours of 64QAM EVM surface plot. |
| 2.15 | Conventional and proposed $4^{M}$-QAM modulator. |
| 2.16 | Full architecture of the proposed direct modulation TX implementing 16QAM constellation. |
| 2.17 | The four-stage PA schematic. |
| 2.18 | (a) PA load-pull contour. (b) Small-signal model of neutralized differential common-emitter stage. (c) Simulated $G_{max}$ with ideal and dummy BJT ca… |
| 2.19 | (a) PA first stage layout. (b) PA output stage layout. (c) PA simulation results. |

| Figure | Caption |
|--------|---------|
| 2.20 | PA simulation results of: (a) Output power versus input power for 3 different frequencies; (b) Group delay of PA small signal response. |
| 2.21 | (a) Wilkinson combiner schematic. (b) Layout. |
| 2.22 | (a) Simulated differential mode S-parameters. (b) Simulated differential output magnitude and phase error, CM / DM conversion ratios. |
| 2.23 | Schematic of the QPSK modulator. |
| 2.24 | QPSK modulator simulation results with LO frequencies of 110 GHz (black) and 115 GHz (blue) for (a) output power versus LO power, (b) output power and LO leakage for different bias codes. |
| 2.25 | QPSK modulator simulation results of: (a) Magnitude response. (b) Internal quadrature errors. |
| 2.26 | 10 GBaud output waveform in transition between symbols with 180° phase difference. |
| 2.27 | (a) LO tripler schematic. (b) LO doubler schematic. Layouts of (c) 0° phase shifter, (d) 45° phase shifter. |
| 2.28 | Simulation results of: (a) Output power versus input power for LO generation circuitry. (b) Phase shifter tuning range. (c) Output power versus frequency. (d) SSB phase noise of LO output at 110 GHz. |
| 2.29 | Chip micrograph of the prototype TX. |
| 2.30 | Power consumption and active area breakdown. |
| 2.31 | (a) Measurement setup. (b) Photo of laboratory measurement bench. |
| 2.32 | Measured (a) small signal frequency response, (b) TX output power frequency sweep with different bias codes. |
| 2.33 | Measured (a) error in amplitude ratio, (b) LO leakage. |
| 2.34 | Individual QPSK and combined 16QAM constellation for (a) 4 GBaud and (b) 5 GBaud. |
| 2.35 | Down-converted spectrum of (a) 5 GBaud QPSK$_2$ and (b) 5 GBaud 16QAM. |
| 2.36 | EVM vs. data rate for (a) QPSK signal and (b) combined 16QAM signal. |
| 3.1  | (a) LO leakage without circuit mismatch. (b) LO leakage with amplitude and phase mismatch. |
| 3.2  | LO leakage without circuit mismatch (b) LO leakage with amplitude and phase mismatch. |
| 3.3  | LO leakage with amplitude and phase mismatch. |
| 3.4  | (a) Generation of frequency selective impedance. (b) Regulated common-gate to modulate input resistance. |
| 3.5  | (a) Circuit realization of BPCG structure. (b) Circuit model for analysis. |
| 3.6  | BPCG input impedance with different N-path implementation and gain. |
| 3.7  | (a) N-path notch filter induced LO leakage from SWB1. (b) SWB2 LO leakage contribution. |
| 3.8  | RX input mean LO leakage and OB input impedance at 1 GHz with different switch sizes and gain in $A_1$. |
| 3.9  | (a) Circuit model of BPCG with parasitic capacitance. (b) In-band input impedance model and phasor illustration. |

| Figure | Caption |
|--------|---------|
| 3.10 | (a) N-path notch filter with feed-forward phase correction. (b) Schematic of programmable gain amplifier (PGA). |
| 3.11 | Measured input $S_{11}$ variation at $f_{LO} = 1.6$ GHz with varying PGA gain. |
| 3.12 | (a) BPCG loop gain. (b) Equivalent model for stability analysis and an exemplary circuit realization of $A_1$. |
| 3.13 | BPCG noise model when configured as: (a) BPF in parallel with a RX; (b) LNA stage in series with a RX chain. |
| 3.14 | IM3 generation in BPCG structure. |
| 3.15 | Noise and IM3 canceling RX architecture. |
| 3.16 | CS path trans-conductor design. |
| 3.17 | CG down-conversion path with blocker sink and frequency translational feedback. |
| 3.18 | 8-phase 12.5% duty cycle non-overlap LO generation. |
| 3.19 | Block diagram of complete RX prototype. |
| 3.20 | (a) Die micrograph of the RX prototype. (b) Photo of PCB for measurement. |
| 3.21 | (a) Measured $S_{11}$ with $f_{LO}$ swept from 0.2 to 2 GHz. (b) Input impedance on Smith chart for 1 GHz $f_{LO}$. |
| 3.22 | (a) RX-band LO leakage from 3 samples with $f_{LO}$ varying from 0.2 to 2.0 GHz. (b) Harmonic LO leakage with 0.4 GHz $f_{LO}$. (c) Harmonic LO leakage with 1.0 GHz $f_{LO}$. |
| 3.23 | (a) Measured NF versus BB frequency with 1.0 GHz $f_{LO}$. (b) NF at 2 MHz BB offset for different LO frequencies. |
| 3.24 | Measured IIP2 and IIP3 at different offset frequencies. |
| 3.25 | (a) Blocker NF measurement setup. (b) NF as a function of blocker power at 80 MHz offset. |

## LIST OF TABLES

| Table | Caption |
|-------|---------|
| 2.1 | High-performance DAC Summary |
| 2.2 | Comparison with State-of-the-art High-speed Transmitters |
| 3.1 | Comparison with State-of-the-art Wideband Receiver Designs |

## ACKNOWLEDGMENTS

I would like to express my deepest appreciation to my advisor, Professor Payam Heydari, for his continuous and patient guidance and support during my Ph.D. study. I would also like to thank all the current and former group members for the helpful and inspirational discussions I had with them. Last but not least, I would like to acknowledge STMicroelectronics, GLOBALFOUNDRIES and TowerJazz for their generous support in chip fabrication, and Keysight Technologies for assisting with experimental measurements.

## VITA

## Huan Wang

## EDUCATION

| Degree | Institution | Year | Location |
|--------|-------------|------|----------|
| Doctor of Philosophy in Electrical Engineering | University of California, Irvine | 2020 | Irvine, California |
| Master of Science in Electrical Engineering | The University of Texas at Austin | 2013 | Austin, Texas |
| Bachelor of Science in Electrical Engineering | Zhejiang University | 2011 | Hangzhou, China |

## EXPERIENCE

| Position | Organization | Period | Location |
|----------|--------------|--------|----------|
| Research Assistant | University of California, Irvine | 2015–2020 | Irvine, California |
| Engineering Intern | Qualcomm Inc. | 06/2019–09/2019 | San Diego, California |
| Engineering Intern | Qualcomm Inc. | 06/2016–12/2016 | San Diego, California |
| Analog Design Engineer | Cirrus Logic Inc. | 2013–2015 | Austin, Texas |
| Engineering Intern | Qualcomm Inc. | 05/2012–08/2012 | San Diego, California |

## PUBLICATIONS

**H. Wang**, H. Mohammadnezhad and P. Heydari, "Analysis and Design of High-Order QAM Direct-Modulation Transmitter for High-Speed Point-to-Point mm-Wave Wireless Links," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 54, no. 11, pp. 3161–3179, Nov 2019.

H. Mohammadnezhad, **H. Wang** and P. Heydari, "A 115-135-GHz 8PSK Receiver Using Multi-Phase RF-Correlation-Based Direct-Demodulation Method," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 54, no. 9, pp. 2435–2448, Sept 2019.

**H. Wang**, Z. Wang and P. Heydari, "A Wideband Blocker-Tolerant Receiver with High-Q RF-Input Selectivity and < -80dBm LO Leakage," in *IEEE International Solid-State Circuits Conference (ISSCC)*, Feb 2019, pp. 450–452.

**H. Wang**, H. Mohammadnezhad, D. Dimlioglu and P. Heydari, "A 100–120GHz 20Gbps Bits-to-RF 16QAM Transmitter Using 1-bit Digital-to-Analog Interface," in *IEEE Custom Integrated Circuits Conference (CICC)*, Apr 2019, pp. 1–4.

H. Mohammadnezhad, **H. Wang**, A. Cathelin and P. Heydari, "A Single-Channel RF-to-Bits 36Gbps 8PSK RX with Direct Demodulation in RF Domain," in *IEEE Custom Integrated Circuits Conference (CICC)*, Apr 2019, pp. 1–4.

H. Mohammadnezhad, **H. Wang** and P. Heydari, "Analysis and Design of Wideband, Balun-Based, Differential Power Splitter at mm-Wave," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 65, no. 11, pp. 1629–1633, Nov 2018.

L. Zhang, **H. Wang**, Y. Cheng and C. Larsen, "Amplifier with Adjustable Ramp Up/Down Gain for Minimizing or Eliminating Pop Noise," U.S. Patent 20160118949 A1.

**H. Wang**, Y. Pan, X.-l. Yan and R.-h. Huan, "Behavioral Modeling of Direct Sampling Mixer," in *IEEE International Symposium of Circuits and Systems (ISCAS)*, May 2011, pp. 1892–1895.

## ABSTRACT OF THE DISSERTATION

Transceiver Design for Mobile Networks: Tackling mm-Wave High-Speed Link Challenges and Sub-6GHz Mobile Terminal Blocking Problems

By

Huan Wang

Doctor of Philosophy in Electrical Engineering

University of California, Irvine, 2020

Professor Payam Heydari, Chair

Wireless mobile networks are expected to become progressively more prevalent, connecting billions of people across the globe and an even larger number of intelligent devices scattered throughout the environment. Moreover, the data rate supported on mobile devices keeps increasing with the deployment of next-generation network infrastructure, e.g., 5G networks, and with upgrades to existing infrastructure. Consequently, enormous amounts of data traffic will be generated on a daily basis, and data exchange between base stations and the backbone network through conventional backhaul links will quickly become a bottleneck. On the other hand, mobile terminals, the elemental building blocks of all mobile networks, continue to be hindered by ever-increasing interference blocking problems in an increasingly congested environment, both spectrally and spatially. This dissertation studies potential solutions to these two major issues in mobile networks from a transceiver circuit design perspective.

In the first part of this dissertation, a direct-modulation high-order QAM transmitter architecture is proposed and analyzed for mm-wave high-speed wireless links, targeting applications such as wireless backhaul. The daunting and costly task of designing an integrated high-speed, high-resolution digital-to-analog interface and a complicated digital back-end, as required in conventional architectures, is completely avoided. Link performance, cost and level of integration are greatly improved as a result. A prototype transmitter has been designed, fabricated and measured to verify the proposed concept. Operating at a 115-GHz carrier frequency, the transmitter achieves a 20 Gbps data rate in a short-range wireless link using 16QAM modulation. The error vector magnitude (EVM) was measured to be -15.8 dB at a modulated output power of +1 dBm. The transmitter consumes 520 mW of power and occupies 3.17 mm<sup>2</sup> of active area in a 180-nm SiGe BiCMOS process.

In the second part, the mobile terminal blocking problem is addressed. While retaining the superior linearity, blocker tolerance and high-Q programmable selectivity offered by the N-path filtering technique, the proposed receiver design significantly mitigates the elevated local oscillator (LO) leakage common in prior work, in order to comply with cellular standards. The design of the proposed receiver is analyzed in great detail. The prototype receiver was measured to be highly linear with low noise figure and LO leakage. The out-of-band $2^{nd}$- and $3^{rd}$-order input-referred intercept points (IIP2 and IIP3) reach +60 dBm and +14 dBm, respectively. The small-signal noise figure was measured to be below 2.5 dB and degrades by 4.5 dB in the presence of a 0 dBm blocker at 80 MHz offset. The LO leakage was kept under -80 dBm up to 2 GHz.

# Chapter 1

# Introduction to Transceiver Design Challenges in Mobile Networks

Moore's law [1] is known as the underlying driving force of the entire semiconductor industry over the past decades, with computational speed doubling every 18 months. Interestingly, wired and wireless communication speeds have also witnessed a similar doubling roughly every 18 months, a trend formulated as Edholm's law [2]. As the market for communication electronics continues to prosper thanks to the super-linear growth of the world's population and more accessible networks, Edholm's law has started to serve as another major driving force of the industry.

Among various kinds of communication networks, the mobile network poses the toughest challenges in terms of hardware design and circuit implementation, for the following reasons. First, its channel environment is the most hostile compared to wired networks such as synchronous optical networking (SONET) using optical fibers and local-area networks (LAN) built with twisted-pair cables. The wireless channel, unlike its wired counterpart, suffers from higher loss, stronger interference and susceptibility to weather and terrestrial conditions. Second, the mobile network mainly serves the general public, which makes it more cost-sensitive. Specifically, mobile terminal (user equipment) design is constantly driven by cost-down requirements in order to capture a bigger market share, and base station installation and maintenance continue to take up the majority of service providers' annual budgets. Apart from these difficulties, recent trends in next-generation network infrastructure development further complicate the problem.

Figure 1.1: Communications between base-stations and backbone network.

A simplified illustration of today's mobile network structure is shown in Fig. 1.1 [3]. Mobile terminals can already support a maximum data rate in excess of 1 gigabit per second (Gbps) with Gigabit Long-Term Evolution (LTE) deployment. With the popularization of internet-of-things (IoT) devices and automotive networks, enormous amounts of data traffic must flow seamlessly and unhindered within the network. The front/back-haul communication between base stations and the backbone network can quickly become a bottleneck, as the aggregate throughput requirement will be daunting given the growth rate of communication speed [2] and the number of devices connected to the network. Although the throughput requirement can be relaxed by shrinking the base-station cell size, the cost of infrastructure installation and maintenance grows proportionally. This is especially true in densely populated urban areas, the primary application scenario of mobile networks. Highly directional high-speed wireless point-to-point links operating at mm-wave frequencies offer a number of advantages compared to solutions based on optical fibers. The installation and long-term maintenance costs can be greatly reduced since buried installation of optical fiber cables is no longer necessary. Moreover, the network topology can be made quite flexible by altering the direction of the wireless links, facilitating more efficient data traffic management. However, realizing low-cost and highly integrated high-speed wireless links still faces some intimidating challenges. In Chapter 2 of this dissertation, the bottleneck of the conventional high-speed wireless transmitter (TX) architecture is identified and a novel architecture is proposed to improve system performance and efficiency while also enabling a highly integrated, low-cost solution. The proposed architecture is analyzed in detail and a prototype transmitter design is presented.

Figure 1.2: Mobile terminal interference blocking problems.

Mobile terminals are the basic building blocks of the entire network and their performance largely determines the overall network quality. Aside from the wireless standards shown in Fig. 1.2, numerous other legacy wireless standards may also exist in the same region using closely spaced frequency bands. Despite the abundance of bandwidth in the mm-wave band, its range of coverage is much smaller than that of traditional radio frequency (RF) bands at lower frequencies. Therefore, most mobile terminals will continue to operate in the sub-6GHz frequency bands for the foreseeable future. Multi-standard co-existence leads to congestion in both the spectral and spatial domains. As shown in Fig. 1.2, multiple interferers from nearby links usually accompany the desired signal, and receiver (RX) desensitization can occur due to inter-modulation or compression. This problem is more severe in mobile terminals than in base stations due to cost, volume and power consumption constraints. In Chapter 3 of this dissertation, a new topology is proposed to address the challenges in mobile terminal RX design. A programmable wideband RX with embedded high-Q filtering is demonstrated with high linearity and blocker tolerance while maintaining a low noise figure and LO leakage level for mobile applications. One such RX can potentially replace multiple narrow-band RX+filter combinations and bring substantial benefits to mobile terminal design.

# Chapter 2

# High-Order QAM Direct Modulation Transmitter for mm-Wave High-Speed Point-to-Point Wireless Links

## 2.1 Background Introduction

High-speed point-to-point wireless links with data rates comparable to those of wireline links are essential to a more connected future society. High-data-rate mobile devices are becoming more prevalent with the deployment of 5G networks. The wireless backhaul/fronthaul links have to deliver at least tens of Gbps of throughput for such networks [4]. The demand for computing resources is also growing as an increasing number of data centers are built for massive cloud services. Data center networking faces severe over-subscription and hotspot problems when conventional solutions using fixed wireline links [5,6] are employed. Moreover, wireline links incur high design, deployment and maintenance costs and lead to inefficient space utilization and cooling due to restricted airflow [7]. Flexible wireless links with on-demand configuration greatly alleviate these problems, assuming their data rate and latency are similar to those provided by wireline solutions.

Figure 2.1: Atmospheric attenuation in clear air.

Wide bandwidth (BW) operation is at the core of high-speed communication. As illustrated in Fig. 2.1, the 60 GHz industrial-scientific-medical (ISM) band and the E-band supply only limited BW of 7 GHz and 5 GHz, respectively. Besides, the 60 GHz band is ill-suited for long-range applications due to oxygen absorption. The above-100-GHz spectrum, however, is very attractive for its abundant BW and relatively low loss. Going to very high frequencies can be tempting due to the available BW. However, the transistor $f_{MAX}$ will ultimately limit the practical range of operating frequencies. $f_{MAX}$ is usually extrapolated from low-frequency measurement data assuming a 6 dB per octave slope. At least 6 dB of unilateral gain is necessary for a wideband amplifier with only 2-3 dB of gain per stage, when taking gain flatness and passive loss into account [8]. This forces the center frequency to be lower than $f_{MAX}/2$. Operating at a frequency close to or even higher than $f_{MAX}$ often results in dismal system efficiency and link budget [9, 10]. Although III-V technologies offering $f_{MAX}>1$ THz have been reported [8] and excellent transceiver performance has been published [9–27], their limited yield and production volume, along with high cost, render them unsuitable for many mass-produced commercial applications. Silicon-Germanium (SiGe) heterojunction-bipolar-transistor (HBT) technology with 720 GHz $f_{MAX}$ has been demonstrated [28] in an experimental setup. Commercially available advanced silicon-based processes typically have an $f_{MAX}$ of 300–500 GHz [29–33]. This makes the F-band (90-140 GHz) and D-band (110-170 GHz) particularly suitable for high-speed wireless communications. In general, a lower operating frequency is preferable due to inherently better circuit performance, e.g., gain, efficiency, output power and noise figure.
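The $f_{MAX}/2$ limit follows directly from the 6 dB per octave extrapolation. The short Python sketch below works through that arithmetic under a simplifying assumption: the unilateral gain is modeled as an ideal 20 dB/decade roll-off that reaches 0 dB exactly at $f_{MAX}$. The function names are hypothetical and chosen only for illustration.

```python
import math


def unilateral_gain_db(f_ghz: float, fmax_ghz: float) -> float:
    """Extrapolated unilateral gain in dB at f_ghz.

    Assumes an ideal 6 dB/octave (20 dB/decade) roll-off that crosses
    0 dB at f_MAX; real devices deviate from this idealized slope.
    """
    return 20.0 * math.log10(fmax_ghz / f_ghz)


def max_center_frequency_ghz(fmax_ghz: float, required_gain_db: float = 6.0) -> float:
    """Highest frequency at which the extrapolated gain still meets the target."""
    return fmax_ghz / (10.0 ** (required_gain_db / 20.0))


# f_MAX values taken from the 300-500 GHz range quoted for silicon processes.
for fmax in (300.0, 500.0):
    fc = max_center_frequency_ghz(fmax)
    print(f"f_MAX = {fmax:.0f} GHz: 6 dB of gain available up to about {fc:.0f} GHz")
```

Under this model, a 6 dB gain requirement places the upper frequency limit at roughly half of $f_{MAX}$, which is consistent with the F-band and D-band being a good match for processes with an $f_{MAX}$ of 300–500 GHz.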

Spectral efficiency is another key to achieving high-speed communication. Transceivers (TRXs) using simple forms of modulation such as on-off keying (OOK), amplitude shift keying (ASK), quadrature phase shift keying (QPSK) or pulse amplitude modulation (PAM) [11–14, 26, 34–36] offer the advantages of low complexity and good system power efficiency. However, their spectral efficiency is very limited. The work in [37] utilized antenna polarization to achieve the same spectral efficiency as 16QAM using a simple ASK modulator. Nevertheless, the signal-to-noise ratio (SNR) requirement is 9 dB higher than that of coherent 16QAM due to non-linearity and non-coherent SNR loss [37]. Moreover, antenna alignment introduces extra overhead. The high-spectral-efficiency modulations reported thus far all rely on power-hungry high-speed, high-resolution digital-to-analog converters (DACs) [9,10,16–25,38–43]. However, such DACs are extremely difficult and costly to implement, significantly limiting the achievable data rate in high-speed wireless applications.
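To make the spectral-efficiency argument concrete, the following Python sketch computes the raw bit rate of several constellations at a common symbol rate. It assumes uncoded transmission, so the bit rate is simply bits per symbol times the baud rate. The 5 GBaud value matches the symbol rate listed for the 16QAM measurements, and the helper names are illustrative only.

```python
def bits_per_symbol(constellation_size: int) -> int:
    """Bits carried by one symbol of an M-point constellation (M a power of two)."""
    if constellation_size < 2 or constellation_size & (constellation_size - 1):
        raise ValueError("constellation size must be a power of two >= 2")
    return constellation_size.bit_length() - 1


def data_rate_gbps(constellation_size: int, baud_gbaud: float) -> float:
    """Raw (uncoded) bit rate in Gbps for a given symbol rate in GBaud."""
    return bits_per_symbol(constellation_size) * baud_gbaud


# Compare modulation formats at the same 5 GBaud symbol rate.
for name, m in (("OOK", 2), ("QPSK", 4), ("16QAM", 16), ("64QAM", 64)):
    rate = data_rate_gbps(m, 5.0)
    print(f"{name}: {bits_per_symbol(m)} bit/symbol, {rate:.0f} Gbps at 5 GBaud")
```

At the same symbol rate, and hence roughly the same occupied bandwidth, 16QAM carries twice the data of QPSK and four times that of OOK, which is why high-order QAM is attractive when BW is limited.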

## 2.2 High-Speed DAC Design Challenges

#### 2.2.1 Speed and Resolution

Two classic DAC-based TX architectures are illustrated in Fig. 2.2. The images above half the DAC sampling rate ( $f_{DAC}/2$ ) can be filtered by the zero-order-hold (ZOH) and



Figure 2.2: (a) DAC-based direct quadrature up-conversion TX. (b) DAC-based IF up-conversion TX.

reconstruction filter. However, filtering is incapable of removing aliased components, and signal quality can be permanently compromised by aliasing. The minimum $f_{DAC}$ for direct conversion TXs [16, 17, 38] is twice the baud rate ($f_B$). Assuming the I/Q baseband (BB) DACs generate rectangular pulse-shaped PAM signals, an $f_{DAC}$ of $2f_B$ folds the (2k-1)<sup>th</sup> sidelobes onto the main lobe (Fig. 2.2(a)), where k is an integer. Alternatively, local oscillator (LO) I/Q generation can be omitted in a heterodyne TX [9, 10, 18, 19] at the cost of a heavier burden on the DAC and DSP designs. For a rectangular pulse-shaped QAM signal at an intermediate frequency (IF) of $f_{IF}$, $f_{IF}$ must be higher than $f_B$ so that the main lobe stays above DC (Fig. 2.2(b)). To keep the images of the main lobes from aliasing, the minimum required $f_{DAC}$ is $4f_B$. A higher $f_{DAC}$ obviously helps reduce aliasing. Additionally, more spectrally compact pulses (e.g., root-raised-cosine (RRC)) also reduce the amount of aliasing by lowering the amplitudes of the aliasing side-lobes and the width of the main lobe, at the cost of a more complex DSP. Nonetheless, the minimum $f_{DAC}$ requirement stays the same regardless of the pulse shape being used. This is because at least 2 samples per symbol are necessary to synthesize any pulse shape in a direct conversion TX. The DAC in a heterodyne



Figure 2.3: TX chain frequency response equivalent BB model.



Figure 2.4: Eye-diagrams of symbol-rate and 2×symbol-rate DACs with FFE.

TX must generate the pulse envelope at an IF carrier with varying phases. A minimum  $f_{DAC}$  of  $4f_B$  is required to accurately represent both the amplitude and the phase.

Sampling exactly at baud rate is a special case where aliasing happens constructively and has been reported in wireline applications [44–47]. However, its application in wireless is limited. Shown in Fig. 2.3 is a typical TX signal path which is comprised of low-pass (LP) and band-pass (BP) stages. The cascade of these filtering stages, if symmetric around the



Figure 2.5: 4-bit and 8-bit quantization for PAM8.

center frequency $f_c$, is equivalent to a simple all-pole LP response $(H_{LP,eq})$ [48] once referred back to BB. A 6 dB attenuation is assumed at $f_B/2$ to fully utilize the signal path BW. The corresponding eye diagrams are plotted in Fig. 2.4 with feed-forward equalization (FFE) and $H_{LP,eq}$ filtering for two different DAC sampling rates. The floating-point 3-tap FFE coefficients are determined with a least-mean-squared (LMS) algorithm. Compared to the case where $f_{DAC}=2f_B$, the eye opening of the symbol-rate DAC ($f_{DAC}=f_B$) TX is much smaller for PAM4 (equivalent to 16QAM after combining the I/Q paths). The eye is fully closed in the PAM8 case (64QAM constellation at the TX output). This difference can be traced to the inherently wider equalization BW of fractionally-spaced FFE [49] compared to symbol-spaced FFE. Therefore, the DACs must sample above $2f_B$ for better equalization in high-speed, high-modulation-order wireless applications.

A minimum DAC resolution of M-bit seems sufficient in a direct conversion TX with  $4^{M}$ -QAM since each I/Q DAC only generates PAM- $2^{M}$  signal after all. However, FFE coefficients quantization will degrade the I/Q eye diagram opening as shown in Fig. 2.5 and a resolution higher than M-bit is mandated. Prior works [44–47] typically achieve 6-bit effective number of bits (ENOB) for PAM4 wireline applications. Higher resolution is necessary for more complex modulations. Note that the DACs in direct conversion TX only produce amplitude modulated signals. Meanwhile, the DAC output in a heterodyne TX represents the equalized amplitude and phase information and thus a higher resolution is required. In practice, some oversampling margin is needed to relax the reconstruction filter steepness. As will be



Figure 2.6: Current-steering DAC topology.

discussed in the next section, DAC linearity becomes a foremost concern in high-speed applications. Digital pre-distortion can be adopted as a remedy for DAC non-linearity. However, an even higher $f_{DAC}$ is required due to the nonlinear BW expansion. Moreover, a high-speed memory interface becomes a necessity to realize the look-up table in digital pre-distortion, significantly complicating the overall design.

#### 2.2.2 Transistor-level Implementation Challenges

A generic topology of the segmented current-steering DAC typically found in high-speed applications is shown in Fig. 2.6. The N input bits are separated into B bits of binary least-significant-bit (LSB) and N - B bits of thermometer most-significant-bit (MSB). A 50 $\Omega$ load resistance $R_L$ is typically used to maximize BW and provide good output matching to a standard transmission line. A shunt peaking inductor $L_p$ can further improve BW [46]. The design typically starts with static metrics such as integral-non-linearity (INL) and differential-non-linearity (DNL). The size of the unit current source I<sub>0</sub> is typically determined by the INL specification. The normalized standard deviation of I<sub>0</sub> must satisfy [50]:

$$\sigma_u \le \frac{1}{2\sqrt{2^N}\,F^{-1}\left(0.75 + \frac{INL_{Yield}}{4}\right)}$$
(2.1)

where $F^{-1}(x)$ is the inverse cumulative distribution function of a normal distribution $\mathcal{N}(0, 1)$, N is the number of bits and $INL_{Yield}$ is the probability of the DAC INL being less than 0.5 LSB. Using Pelgrom's random mismatch model [51], the required minimum gate area is calculated to be:

$$A_{gate,min} = WL = \frac{1}{2\sigma_u^2} \left[ A_\beta^2 + \frac{4A_{V_{TH}}^2}{(V_{GS} - V_{TH})^2} \right]$$
(2.2)

where  $A_{\beta}$  and  $A_{V_{TH}}$  represent mismatch parameters of current-factor and device threshold voltage in a given technology [51]. The transistor (M<sub>c</sub> in Fig. 2.6) channel length (L) and width (W) can be derived using long-channel model approximation as:

$$L = \frac{1}{2\sigma_u} \sqrt{\frac{(2^N - 1)\mu_n C_{ox} [A_\beta^2 (V_{GS} - V_{TH})^2 + 4A_{V_{TH}}^2]}{I_{FS}}},$$

$$W = \frac{A_\beta^2}{4\sigma_u A_{V_{TH}}^2} \sqrt{\frac{I_{FS} [A_\beta^2 (V_{GS} - V_{TH})^2 + 4A_{V_{TH}}^2]}{(2^N - 1)\mu_n C_{ox}}}$$
(2.3)

where $I_{FS}$ is the full-scale current. The necessary $I_{FS}$ can be large to make up for the mixer conversion loss [52] and a low power amplifier (PA) gain. In an exemplary 65nm CMOS technology with 8-bit resolution and 99.7% INL yield as design targets, the estimated transistor size of M<sub>c</sub> is $L = 1.07\mu$m, W = 225 nm assuming $V_{GS} - V_{TH} = 400$ mV and $I_{FS} = 10$ mA. Symmetric layout and smart unit-element switching sequences greatly improve static linearity [53–59]. Nevertheless, these techniques are not suitable for high-speed applications due to the complexity of the routing and decoding logic.
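The sizing flow of Eq. (2.1) and (2.2) can be sketched numerically. The snippet below is a rough illustration only: the Pelgrom coefficients `A_BETA` and `A_VTH` are hypothetical placeholders (they are not the values of the 65nm process used in the text), so the resulting area is not expected to match the quoted transistor size.

```python
from statistics import NormalDist

def sigma_unit_max(n_bits, inl_yield):
    """Eq. (2.1): upper bound on the normalized std. dev. of the unit current."""
    c = NormalDist().inv_cdf(0.75 + inl_yield / 4.0)
    return 1.0 / (2.0 * (2 ** n_bits) ** 0.5 * c)

def gate_area_min(sigma_u, a_beta, a_vth, v_ov):
    """Eq. (2.2): minimum gate area W*L in m^2 from Pelgrom's mismatch model.
    a_beta is in m (fraction * m), a_vth in V*m, v_ov = V_GS - V_TH in V."""
    return (a_beta ** 2 + 4.0 * a_vth ** 2 / v_ov ** 2) / (2.0 * sigma_u ** 2)

# Design targets from the text: 8 bits, 99.7% INL yield, 400 mV overdrive
sigma_u = sigma_unit_max(8, 0.997)

# Hypothetical mismatch coefficients, for illustration only
A_BETA = 0.01 * 1e-6    # 1 %*um expressed in SI units
A_VTH = 3.5e-3 * 1e-6   # 3.5 mV*um expressed in SI units

area = gate_area_min(sigma_u, A_BETA, A_VTH, 0.4)
print(f"sigma_u <= {sigma_u:.3%}, W*L >= {area * 1e12:.2f} um^2")
```

Because $\sigma_u$ scales with $1/\sqrt{2^N}$, calling `sigma_unit_max` with one more bit doubles the area returned by `gate_area_min`, whatever coefficients are assumed.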

The size of  $I_0$  limited by random mismatch, as estimated previously, is generally too large for high sample rate switching. Every additional bit doubles the area of  $M_c$  in each  $I_0$  and quadruples the total DAC area. Large occupied area makes high sample rate switching progressively harder due to large parasitic capacitance. The output impedance of  $I_0$  ( $Z_0$  in Fig. 2.6) also limits high frequency linearity.  $Z_0$  and  $3^{rd}$  harmonic distortion (HD3) can be



Figure 2.7: DAC non-linearity impact on a BB PAM8 eye-diagram.

related as follows [60]:

$$|Z_0| \ge \frac{R_L(2^N - 1)}{4\sqrt{HD_3}}.$$
(2.4)

Assuming 8-bit resolution and -36 dB HD3,  $Z_0$  must be greater than 25 k $\Omega$ . Shunt capacitor  $C_p$  typically dominates high-frequency impedance and has to be lower than 2.4 fF for only a 2.5 GHz output. This level of parasitic capacitance is impossible to achieve even in advanced CMOS processes considering routing parasitics. Techniques proposed in [60, 61] can be employed to alleviate this problem at the cost of lower output BW. As plotted in Fig. 2.7, top and bottom PAM8 eyes slightly close even with a -60 dB HD3. The top and bottom eyes are completely shut with -50 dB HD3. High-speed DACs are usually characterized using spurious-free-dynamic-range (SFDR) and SFDR can be approximated as -HD3 in a typical fully differential implementation.
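The impedance and capacitance figures above can be checked with a short calculation. This is a first-order sketch that assumes HD3 in Eq. (2.4) is an amplitude ratio (so the dB value is converted with a factor of 20) and that $Z_0$ is purely capacitive at the output frequency; the helper names are illustrative.

```python
import math

def z0_min(r_load, n_bits, hd3_db):
    """Eq. (2.4): minimum unit-cell output impedance magnitude for a target HD3.
    hd3_db is negative and is converted to a linear amplitude ratio (assumption)."""
    hd3_lin = 10.0 ** (hd3_db / 20.0)
    return r_load * (2 ** n_bits - 1) / (4.0 * math.sqrt(hd3_lin))

def cp_max(z0, f_out):
    """Largest shunt capacitance whose reactance magnitude equals z0 at f_out."""
    return 1.0 / (2.0 * math.pi * f_out * z0)

z0 = z0_min(50.0, 8, -36.0)   # 50 Ohm load, 8 bits, -36 dB HD3
cp = cp_max(z0, 2.5e9)        # 2.5 GHz output
print(f"|Z0| >= {z0 / 1e3:.1f} kOhm, Cp <= {cp * 1e15:.2f} fF")
```

With these assumptions the impedance bound lands at roughly 25 k$\Omega$ and the capacitance limit is close to the value quoted in the text; the small difference depends on rounding and on how the impedance is modeled.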

Kickback to the tail node of the current-steering switch is another major cause of poor DAC SFDR at high output frequencies. Techniques in [62–64] using an optimum switching threshold and quad-switching can reduce the kickback effect. However, additional timing control hardware with extra parasitic capacitance is required and the technique is less effective at higher frequencies due to inherently stronger capacitive coupling. Using small transistors with a calibration mechanism is a more viable solution for $f_{DAC}>10$ GS/s. Unfortunately, the achieved SFDR after calibration is only 45 dB for a 2.5 GHz output [45–47]. What's more, precise

| Ref  | Speed, Resolution | SFDR (dB)   | BW (GHz)       | Power (mW) | Technology |
|------|-------------------|-------------|----------------|------------|------------|
| [46] | 18 GS/s, 8b       | 44 @ 9 GHz  | >9             | 84         | 28nm CMOS  |
| [45] | 56 GS/s, 6b       | 24 @ 28 GHz | >40 (on wafer) | 750        | 65nm CMOS  |
| [47] | 64 GS/s, 8b       | 41 @ 14 GHz | 20             | 620        | 20nm CMOS  |
| [65] | 65 GS/s, 8b       | 41 @ 8 GHz  | ≥12            | 750        | 40nm CMOS  |

Table 2.1: High-performance DAC Summary

timing control (<0.3 ps skew) is necessary for the reduction of major carry glitches. The performance metrics of state-of-the-art DAC designs are summarized in Table 2.1. The phase-locked loop (PLL) jitter also affects DAC performance at high $f_{DAC}$, and the jitter requirement can be very stringent. The entire integrated DAC system, including all the necessary building blocks at high sample rate, is very costly to design and dissipates high power [65]. Complicated calibration and high-frequency operation demand an advanced CMOS technology for fabrication. However, $f_{MAX}$ can actually drop in advanced CMOS processes since the gate resistance increases with smaller channel length. This poses a strict constraint on the mm-wave front-end performance if the DAC and mm-wave front-end were to be integrated together. Therefore, a multi-chip solution using different technologies is more common, and the chip-to-package interface introduces extra limitations on BW and greatly increases design cost.

## 2.3 Direct Modulation 4<sup>M</sup>-QAM Transmitter

#### 2.3.1 Core Idea

To better explain the proposed $4^{M}$-QAM direct modulation TX architecture, we can start with the 16QAM scheme depicted in Fig. 2.8(a). The 16QAM constellation can be divided into four quadrants, with a QPSK sub-constellation in each quadrant. The four centers of the QPSK sub-constellations (indicated with dashed squares) constitute a QPSK constellation by themselves. This larger QPSK is centered at the origin and has an amplitude twice as large as that of the QPSK sub-constellations in each quadrant. Consequently, vectorial addition of two QPSKs with an amplitude ratio of 2 [66, 67] constructs a 16QAM constellation. As illustrated in Fig. 2.8(c), a high-order $4^{M}$-QAM constellation can be progressively constructed



Figure 2.8: (a) Generation of 16QAM constellation by adding two QPSKs with an amplitude ratio of 2. (b) Error vector distribution in each individual QPSK. (c) Generalization to $4^{\text{M}}$-QAM constellation.

in a similar way using M QPSK signals with a common amplitude ratio of 2.
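The construction can be verified by brute force. The sketch below assumes ideal QPSK points with $d = 1$ and checks that summing M QPSK signals with amplitudes 1, 2, 4, ... reproduces the square $4^{M}$-QAM grid; `qam_from_qpsk` and `square_qam` are illustrative helper names.

```python
from itertools import product

QPSK = (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)  # unit QPSK points, d = 1

def qam_from_qpsk(m):
    """Sum m QPSK signals with amplitudes 1, 2, 4, ... (common ratio of 2)."""
    points = set()
    for symbols in product(QPSK, repeat=m):
        points.add(sum(s * 2 ** k for k, s in enumerate(symbols)))
    return points

def square_qam(m):
    """Reference 4^m-QAM grid with odd-integer I/Q coordinates."""
    levels = range(-(2 ** m - 1), 2 ** m, 2)
    return {complex(i, q) for i in levels for q in levels}

for m in (1, 2, 3):
    assert qam_from_qpsk(m) == square_qam(m)
    print(f"M = {m}: {len(qam_from_qpsk(m))} constellation points")
```

Each combination of QPSK symbols maps to a distinct grid point, so M QPSK paths carry exactly 2M bits per symbol.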

As shown in Fig. 2.8(b), the error vectors (EVs) of each individual QPSK constellation should follow a 2-D Gaussian distribution with no correlation. EVM in this dissertation is defined as the ratio between the root-mean-square (rms) error magnitude and the average symbol magnitude. The mean powers of the EVs can be directly summed due to the independence among the QPSK signals. The total error vector power for the 4<sup>M</sup>-QAM constellation can be written as:

$$\overline{|\text{Total Error Vector}|^2} = 2\sum_{k=1}^M EVM_k^2 \times 4^{k-1} d^2$$
(2.5)

where  $EVM_k$  represents the k<sup>th</sup> QPSK's EVM as illustrated in Fig. 2.8(c). The 4<sup>M</sup>-QAM EVM is thus derived as:

$$EVM_{4^{M}-QAM} = \sqrt{\frac{\overline{|\text{Total Error Vector}|^2}}{\text{Average Symbol Power}}} = \sqrt{\frac{3\sum_{k=1}^{M} EVM_k^2 \times 4^{k-1}}{4^M - 1}}.$$
(2.6)

Eq. (2.6) indicates that the 4<sup>M</sup>-QAM EVM can be viewed as a weighted average of the constituent QPSK signals' EVM. QPSK signals with larger amplitudes bear more weights. The 16QAM EVM can be readily obtained as:

$$EVM_{16QAM} = \sqrt{\frac{EVM_1^2 + 4EVM_2^2}{5}}.$$
(2.7)

It is reasonable to assume all QPSKs have the same EVM due to their similar generation mechanism. Under this assumption, the $4^{M}$-QAM EVM is equal to the individual QPSK EVM. This feature offers a few unique benefits to the proposed method. QPSK signals only require symbol-rate timing in the digital-to-analog (D/A) interface since BB PAM2 eye diagrams show negligible difference between sample rates, as shown in Fig. 2.4. Moreover, amplitude linearity is less crucial for QPSK generation and the DAC linearity requirement can be greatly relaxed. If the signal path BW is wide enough to render equalization dispensable, a 1-bit D/A interface will suffice, greatly simplifying the integrated TX design. In summary, the advantages of the proposed method come from the observation that low-EVM QPSK signals are much easier to generate than high-order QAM signals. Reported experimental results in [68,69] support this argument.
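The weighting in Eq. (2.6) is easy to explore numerically. The following minimal sketch evaluates the formula for a list of linear (not dB) QPSK EVM values ordered from the smallest to the largest QPSK; the example EVM values are arbitrary.

```python
import math

def evm_qam(evm_qpsk):
    """Eq. (2.6): EVM of a 4^M-QAM built from M QPSK signals.
    evm_qpsk[k-1] is the linear EVM of the k-th QPSK, smallest amplitude first."""
    m = len(evm_qpsk)
    weighted = sum(e ** 2 * 4 ** k for k, e in enumerate(evm_qpsk))
    return math.sqrt(3.0 * weighted / (4 ** m - 1))

equal = evm_qam([0.03, 0.03, 0.03])   # identical QPSK EVMs
small_bad = evm_qam([0.06, 0.03])     # only the small QPSK is degraded
large_bad = evm_qam([0.03, 0.06])     # only the large QPSK is degraded
print(f"equal: {equal:.4f}, small degraded: {small_bad:.4f}, "
      f"large degraded: {large_bad:.4f}")
```

With identical inputs the result collapses to the QPSK EVM, and degrading the larger QPSK costs more than degrading the smaller one, as Eq. (2.7) suggests.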

The required amplitude ratio of 2 can be adjusted with good accuracy by tuning the DC bias current of each QPSK generator. This is distinct from trimming the DAC switching current sources as typically done for high-speed designs. Trimming in a DAC only improves static metrics such as DNL and INL, while the high-frequency linearity bottlenecks due to kickback and output impedance modulation still exist. In contrast, each QPSK path in the proposed method is less sensitive to non-linear distortion. With properly tuned QPSK amplitudes, the final 4<sup>M</sup>-QAM constellation does not suffer from non-linearity as long as the process of vectorial addition is linear.

The modulation format can be flexibly reprogrammed with the proposed method from $4^{k}$-QAM ($1 < k \leq M$) down to QPSK. For example, high spectral efficiency 64QAM can be used when the link budget is high, while 16QAM can be adopted once the link quality degrades and the SNR drops. The reprogramming is achieved by choosing the number of QPSK paths that are turned on. The QPSK paths must be well isolated and provide constant termination whether they are on or off. This guarantees that the vectorial addition is insensitive to the modulation format chosen at the moment. If only QPSK is affordable, the proposed method can be turned into an M-way QPSK power combiner by turning all QPSK paths on and driving them with the same inputs.



Figure 2.9: Simplified block diagram of the proposed TX.

#### 2.3.2 Bandwidth Limitation

Shown in Fig. 2.9 is a simplified block diagram of the proposed TX. The signal path comprises several BP stages such as the QPSK modulator, combiner and PA. The BW limitation of these building blocks introduces EVM degradation, especially in high data-rate scenarios. The BP filtering of the combiner and PA applied to the 4<sup>M</sup>-QAM signal can be referred to each QPSK output, after which the filtered QPSK signals are linearly combined. This holds true since the combining operation is linear and time-invariant. The BP responses can be translated to BB and modeled as an all-pole LP equivalent ($H_{LP,TX}$) assuming the BP responses are symmetric around the carrier frequency $f_c$ [48]. The BP response of each stage is assumed to come from a simple RLC tank (2<sup>nd</sup>-order BP), so its LP equivalent becomes a one-pole system. Therefore, $H_{LP,TX}$ is approximated as follows:

$$H_{LP,TX}(s) = H_{LP,QPSK}(s) \times H_{LP,Comb}(s) \times H_{LP,PA}(s) = \frac{1}{(1 + \tau_{QPSK}s)(1 + \tau_{Comb}s)(1 + \tau_{PA}s)},$$
(2.8)

Figure 2.10: (a) Settling trajectory of QPSK signal from symbol $S_2$ to $S_1$. (b) QPSK constellation errors due to TX BW limitation.

where $\tau_{QPSK}$, $\tau_{Comb}$ and $\tau_{PA}$ represent the time constants of the single-pole LP equivalents of the QPSK modulator, linear combiner and PA. The 3-dB BW of the complete TX is conservatively approximated as [70]:

$$BW_{TX} \approx \frac{1}{2\pi [\tau_{QPSK} + \tau_{Comb} + \tau_{PA}]} = \frac{1}{2\pi \tau_{TX}} = \frac{1}{1/BW_{QPSK} + 1/BW_{Comb} + 1/BW_{PA}}.$$
(2.9)

The QPSK LP equivalent signal considering both in-phase and quadrature components is a complex-valued time-domain function. This complex-valued signal is filtered by $H_{LP,TX}$, leading to a settling trajectory ($\gamma_{2-1}(t)$) from symbol S<sub>2</sub> to S<sub>1</sub> in the Q-t plane, as exemplified in Fig. 2.10(a). The ideal constellation points are shown in green. Because the BW of the PA is usually the bottleneck of the entire TX due to its higher gain and output power requirements, the response is treated as a single-pole system with time constant $\tau_{TX}$ and trajectory $\gamma_{2-1}(t)$ is approximated as:

$$\mathbf{Im}\{\gamma_{2-1}(t)\} = d - 2d \times e^{-t/\tau_{TX}}.$$
(2.10)

Incomplete settling results in an EV when  $\gamma_{2-1}(t)$  is sampled at the end of each symbol period  $(t = T_S = 1/f_B)$ . As shown in Fig. 2.10(b), orthogonal trajectories,  $\gamma_O$ , (e.g., S<sub>2</sub> to S<sub>1</sub> or S<sub>1</sub> to S<sub>4</sub>) and diagonal trajectories,  $\gamma_D$ , (e.g., S<sub>3</sub> to S<sub>1</sub>) can be clearly identified. The EV magnitude  $\varepsilon$  shown in Fig. 2.10(b) is:

$$\varepsilon = 2d \times e^{-T_S/\tau_{TX}}.$$
(2.11)

The settling of diagonal trajectories causes the same EV magnitude $\varepsilon$ in both the real and imaginary parts of the LP equivalent signal. As shown in Fig. 2.10(b), the reference symbol point of S<sub>1</sub> with incomplete settling moves from its ideal location at (d, d) to a new location at $(d - \varepsilon/2, d - \varepsilon/2)$ and the distribution boundary is defined by the worst and best case points. Symbols S<sub>2</sub>, S<sub>3</sub> and S<sub>4</sub> have the same EV distribution due to the symmetry of the constellation. The filtered QPSK symbol energy now becomes $2d^2 - 2d\varepsilon + \varepsilon^2/2$. The red and blue points in Fig. 2.10(b) are associated with the worst case scenarios while the green dot represents the best case, a complete settling after a very long sequence of identical symbols. The EVs are assumed to follow a uniform distribution within the boundary indicated as a gray box in Fig. 2.10(b). This assumption is based on the observation that all possible symbol patterns that result in any point inside the gray box are equally likely if the data input is truly random. The probability density function of the EV magnitude $(P_{|EV|}(x))$ is:

$$P_{|EV_{I}|}(x) = P_{|EV_{Q}|}(x) = \begin{cases} 1/\varepsilon, & -\varepsilon/2 < x < \varepsilon/2 \\ 0, & \text{otherwise,} \end{cases}$$
(2.12)

where  $|EV_I|$  and  $|EV_Q|$  are the in-phase and quadrature EV components. The average power of the EV is thus derived as:

$$\overline{|EV|^2} = \mathbf{E}\{|EV_I|^2 + |EV_Q|^2\} = 2\int_{-\frac{\varepsilon}{2}}^{\frac{\varepsilon}{2}} x^2 \frac{1}{\varepsilon} dx = \varepsilon^2/6.$$
(2.13)

Using Eq. (2.9), (2.11) and (2.13),  $BW_{TX}$  and the QPSK EVM can be related as:

$$EVM_{QPSK,BW} = \frac{\alpha}{\sqrt{3}(1-\alpha)}, \quad \alpha = e^{-2\pi BW_{TX}T_S}.$$
(2.14)



Figure 2.11: (a) Illustration of QPSK amplitude mismatch effect. (b) EVM of 16QAM as a function of amplitude mismatch

A quick estimation on the EVM degradation due to front-end BW bottleneck can be obtained using Eq. (2.14). For instance, for a 6-dB attenuation at  $f_B/2$  as discussed in Section 2.2.1, the EVM floor due to BW limitation is -23 dB (7%). This estimated EVM is sufficient for 16QAM operation. However, the front-end circuit BW must be improved to support 64QAM constellation with an acceptable bit-error-rate (BER).
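Eq. (2.14) can be tabulated directly. The sketch below takes the product $BW_{TX}T_S$ (the TX 3-dB BW normalized to the baud rate) as its only input; the listed ratios are illustrative sweep points, not design values from this work.

```python
import math

def evm_qpsk_bw(bw_tx_over_fb):
    """Eq. (2.14): QPSK EVM floor (linear) caused by incomplete settling.
    bw_tx_over_fb is BW_TX * T_S, the TX 3-dB BW normalized to the baud rate."""
    alpha = math.exp(-2.0 * math.pi * bw_tx_over_fb)
    return alpha / (math.sqrt(3.0) * (1.0 - alpha))

for ratio in (0.3, 0.4, 0.5, 0.7, 1.0):   # illustrative BW_TX / f_B values
    evm = evm_qpsk_bw(ratio)
    print(f"BW_TX/f_B = {ratio:.1f}: EVM = {20.0 * math.log10(evm):6.1f} dB")
```

Since $\alpha$ decays exponentially with $BW_{TX}T_S$, modest BW improvements translate into a rapidly falling EVM floor in this model.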

#### 2.3.3 QPSK Amplitude Mismatch

The mismatch between QPSK paths inflicts EVM degradation due to the deviation of the amplitude ratio from the ideal value of 2. 16QAM is the lowest-order constellation that can be affected by this amplitude mismatch. As shown in Fig. 2.11(a), the ideal symbols of the largest QPSK are plotted as dashed blue circles and the ideal constellation points of 16QAM are indicated as black filled circles. The amplitude ratio with the effect of mismatch can be written as $2 \div (1 + \Delta_1)$. The green and red points in Fig. 2.11(a) indicate the cases where $\Delta_1 > 0$ and $\Delta_1 < 0$. The EV magnitude is denoted as $\rho = \sqrt{2}d\Delta_1$. Therefore, the EVM floor limited by the amplitude mismatch effect for a 16QAM constellation is calculated to be:

$$EVM_{16QAM,\Delta} = \sqrt{\frac{|EV_{16QAM,\Delta}|^2}{16QAM \text{ Average Symbol Power}}} = \sqrt{\frac{\rho^2}{10d^2}} = \frac{|\Delta_1|}{\sqrt{5}}.$$
 (2.15)

As shown in Fig. 2.11(b), 10% of mismatch can be tolerated for a -27 dB EVM floor target. This level of mismatch is readily achievable in circuit implementation.
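Eq. (2.15) can be cross-checked against a brute-force constellation calculation. This sketch assumes ideal QPSK points with $d = 1$, applies the mismatch $(1 + \Delta_1)$ to the smaller QPSK only, and uses the rms definition of EVM given earlier; the function name is illustrative.

```python
import math
from itertools import product

QPSK = (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)  # unit QPSK points, d = 1

def evm_16qam_amp(delta1):
    """Brute-force 16QAM EVM when the small QPSK amplitude is (1 + delta1)."""
    err = ref = 0.0
    for big, small in product(QPSK, repeat=2):
        ideal = 2 * big + small
        actual = 2 * big + (1 + delta1) * small
        err += abs(actual - ideal) ** 2
        ref += abs(ideal) ** 2
    return math.sqrt(err / ref)

delta1 = 0.10
numeric = evm_16qam_amp(delta1)
closed_form = abs(delta1) / math.sqrt(5.0)   # Eq. (2.15)
print(f"numeric: {20.0 * math.log10(numeric):.2f} dB, "
      f"Eq. (2.15): {20.0 * math.log10(closed_form):.2f} dB")
```

Both values agree and sit at about -27 dB for a 10% mismatch, in line with Fig. 2.11(b).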

For the 64QAM case, QPSK<sub>3</sub>, QPSK<sub>2</sub> and QPSK<sub>1</sub> have amplitudes of $4\sqrt{2}d$, $2\sqrt{2}d(1 + \Delta_2)$ and $\sqrt{2}d(1 + \Delta_1)$, respectively. The amplitude ratio between QPSK<sub>2</sub> and QPSK<sub>1</sub> can be rearranged as:

$$2(1 + \Delta_2) \div (1 + \Delta_1) = 2 \div (1 + \frac{\Delta_1 - \Delta_2}{1 + \Delta_2}) = 2 \div (1 + \Lambda_1).$$
(2.16)

Upon the combination of QPSK<sub>2</sub> and QPSK<sub>1</sub>, the 16QAM constellation shown in Fig. 2.11(a) now appears in each quadrant of the 64QAM constellation. However, the dashed blue circles for the 64QAM case are affected by an EV due to the mismatch between QPSK<sub>3</sub> and QPSK<sub>2</sub>. The EVs induced by the QPSK<sub>3</sub>-QPSK<sub>2</sub> mismatch are denoted as $\overrightarrow{\mathbf{EV}_2}$ and the EVs caused by the QPSK<sub>2</sub>-QPSK<sub>1</sub> mismatch are written as $\overrightarrow{\mathbf{EV}_1}$. Their possible combinations are:

$$\overrightarrow{\mathbf{EV}_{1}} = \sqrt{2}d\Lambda_{1}\{ \angle 45^{o}, \angle -45^{o}, \angle 135^{o}, \angle -135^{o} \},$$

$$\overrightarrow{\mathbf{EV}_{2}} = 2\sqrt{2}d\Delta_{2}\{ \angle 45^{o}, \angle -45^{o}, \angle 135^{o}, \angle -135^{o} \}.$$
(2.17)

The EVs in the 16QAM sub-constellation are simply vector summation of  $\overrightarrow{\mathbf{EV_1}}$  and  $\overrightarrow{\mathbf{EV_2}}$ . The same argument holds true for all four quadrants in a 64QAM constellation.

Figure 2.12: (a) Surface plot of 64QAM EVM with amplitude mismatch. (b) Contours of 64QAM EVM surface plot.

As can be seen from Fig. 2.11(a) and Eq. (2.17), $\overrightarrow{\mathbf{EV_1}}$ and $\overrightarrow{\mathbf{EV_2}}$ have half of their elements parallel to each other and the other half perpendicular. Therefore, the average power of the EV combining $\overrightarrow{\mathbf{EV_1}}$ and $\overrightarrow{\mathbf{EV_2}}$ is derived to be:

$$\overline{|EV_{64QAM,\Delta}|^2} = \frac{32\left|\overline{\mathbf{E}\mathbf{V}_{\parallel}}\right|^2 + 32\left|\overline{\mathbf{E}\mathbf{V}_{\perp}}\right|^2}{64} = \frac{(\sqrt{2}d\Lambda_1 + 2\sqrt{2}d\Delta_2)^2 + [(\sqrt{2}d\Lambda_1)^2 + (2\sqrt{2}d\Delta_2)^2]}{2}$$
$$= 2d^2 \left[ \left(\frac{\Delta_1 + \Delta_2 + 2\Delta_2^2}{1 + \Delta_2}\right)^2 - 2\frac{\Delta_1\Delta_2 - \Delta_2^2}{1 + \Delta_2} \right].$$
(2.18)

Consequently, 64QAM EVM solely due to amplitude mismatch is:

$$EVM_{64QAM,\Delta} = \sqrt{\frac{|EV_{64QAM,\Delta}|^2}{64QAM \text{ Average Symbol Power}}}$$

$$= \sqrt{\frac{\left[\left(\frac{\Delta_1 + \Delta_2 + 2\Delta_2^2}{1 + \Delta_2}\right)^2 - 2\frac{\Delta_1\Delta_2 - \Delta_2^2}{1 + \Delta_2}\right]}{21}}.$$
(2.19)

Fig. 2.12(a) shows the EVM surface plot for the 64QAM constellation with the mismatch effect. The EVM contours are plotted in Fig. 2.12(b). The elliptically-shaped contours imply that the mismatch of the larger-amplitude QPSK ($\Delta_2$) has more impact on EVM degradation. For a -30 dB EVM floor, $\Delta_1$ as large as ±14% can be tolerated, while $\Delta_2$ must stay within ±8%. Fortunately, the level of amplitude mismatch can be controlled very well in circuit realization with accurate calibration.
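The tolerance figures can be reproduced by evaluating Eq. (2.19) on the two axes of the contour plot. The sketch below writes the bracket in terms of $\Lambda_1$ from Eq. (2.16), which is algebraically the same as the expanded form in Eq. (2.19); function names are illustrative.

```python
import math

def evm_64qam_amp(delta1, delta2):
    """Eq. (2.19): 64QAM EVM floor (linear) from QPSK amplitude mismatch."""
    lam1 = (delta1 - delta2) / (1.0 + delta2)            # Eq. (2.16)
    bracket = (lam1 + 2.0 * delta2) ** 2 - 2.0 * lam1 * delta2
    return math.sqrt(bracket / 21.0)

def to_db(x):
    """Convert a linear EVM ratio to dB."""
    return 20.0 * math.log10(x)

evm_d1 = evm_64qam_amp(0.14, 0.0)   # only the smallest QPSK is mismatched
evm_d2 = evm_64qam_amp(0.0, 0.08)   # only the middle QPSK is mismatched
print(f"delta1 = 14%: {to_db(evm_d1):.1f} dB, delta2 = 8%: {to_db(evm_d2):.1f} dB")
```

Both cases land slightly below -30 dB, consistent with the contour readings quoted above.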

In 4<sup>M</sup>-QAM, the amplitudes of QPSK<sub>M</sub>, QPSK<sub>M-1</sub>, ..., QPSK<sub>1</sub> are $2^{M-1}\sqrt{2}d$, $2^{M-2}\sqrt{2}d(1 + \Delta_{M-1})$, ..., $\sqrt{2}d(1 + \Delta_1)$ when mismatch is considered. The exact equation quickly becomes tedious for modulation orders higher than 64QAM since vector additions at arbitrary angles are involved. Eq. (2.6) already shows that larger QPSK signals have more impact on EVM. Thus, only the largest 3 QPSK signals need to be used to approximate the 4<sup>M</sup>-QAM EVM for M>3. The results in Eq. (2.18) and (2.19) can be directly applied to arrive at the following expressions:

$$EVM_{4^{M}-QAM,\Delta} = \sqrt{\frac{4^{M-3} \left[ (\Lambda_{M-2} + 2\Delta_{M-1})^{2} - 2\Lambda_{M-2}\Delta_{M-1} \right]}{(4^{M} - 1)/3}}$$

$$\Lambda_{M-2} = \frac{\Delta_{M-2} - \Delta_{M-1}}{1 + \Delta_{M-1}}.$$
(2.20)

#### 2.3.4 QPSK Phase Mismatch

As also illustrated in Fig. 2.9, the phases of the QPSK signals exhibit mismatch due to delay skews. Without loss of generality, the phase of the largest QPSK ( $\theta_{ref}$ ) is regarded as the reference and set to zero for convenience.

In the 16QAM case, shown in the top-left quadrant of Fig. 2.13(a), the phase deviation of QPSK<sub>1</sub> ( $\theta_1$ ) shows up as a rotation of the QPSK sub-constellations. The EVM for this case is:

$$EVM_{16QAM,\theta} = \sqrt{\frac{|EV_{16QAM,\theta}|^2}{16QAM \text{ Average Symbol Power}}} = \sqrt{\frac{(2\sqrt{2}d \cdot \sin\frac{\theta_1}{2})^2}{10d^2}}$$
$$= \sqrt{\frac{4}{5}} \times \left|\sin\frac{\theta_1}{2}\right|.$$
(2.21)

Using Eq. (2.21), the 16QAM EVM as a function of phase mismatch is plotted in Fig. 2.13(b). The achievable EVM is limited to -28 dB with only 5° of phase mismatch.

Figure 2.13: (a) Phase mismatch effect on 16QAM. (b) EVM of 16QAM as a function of phase mismatch.
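The -28 dB figure follows directly from Eq. (2.21). A minimal sketch, with the swept angles chosen only for illustration:

```python
import math

def evm_16qam_phase(theta1_deg):
    """Eq. (2.21): 16QAM EVM floor (linear) for a phase offset theta1 (degrees)
    applied to the smaller QPSK."""
    half_angle = math.radians(theta1_deg) / 2.0
    return math.sqrt(4.0 / 5.0) * abs(math.sin(half_angle))

for theta in (1.0, 2.0, 5.0, 10.0):   # illustrative phase offsets
    evm_db = 20.0 * math.log10(evm_16qam_phase(theta))
    print(f"theta1 = {theta:4.1f} deg: EVM floor = {evm_db:6.1f} dB")
```

For small angles the EVM floor scales almost linearly with $\theta_1$, so halving the phase error buys roughly 6 dB.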

The first quadrant of Fig. 2.13(a) demonstrates the effect of phase mismatch in each quadrant of the 64QAM constellation. Similar to the case of amplitude mismatch, the dashed blue points representing QPSK<sub>2</sub> are rotated by an angle equal to the phase mismatch $\theta_2$. The phase of QPSK<sub>1</sub> is referred to the phase of QPSK<sub>2</sub> so that the result in Eq. (2.21) can be applied. The rotational error is similar to the 16QAM case except that the angle of rotation is now $\theta_1 - \theta_2$. The EVs in the 64QAM constellation are a vectorial summation of the EVs due to the $\theta_2$ and $\theta_1 - \theta_2$ phase mismatches. The exact derivation can be tedious even for 64QAM due to arbitrary-angle vector addition. As shown in Fig. 2.13(a), the symbol displacements due to $\theta_2$ (or $\theta_1 - \theta_2$) are along the circle with a radius of $2\sqrt{2}d$ centered at the origin (or a radius of $\sqrt{2}d$ centered at the dashed blue points). Assuming $\theta_1, \theta_2 \ll 180^\circ$, the EV directions are along the tangent lines of these circles, so half of the EVs are parallel and the other half are perpendicular in the vector summation for the 64QAM EV calculation. The combined EVM is:

$$EVM_{64QAM,\theta} = \sqrt{\frac{|EV_{64QAM,\theta}|^2}{64QAM \text{ Average Symbol Power}}}$$
$$= \sqrt{\frac{4\left[\left(2\sin\frac{\theta_2}{2} + \sin\frac{\theta_1 - \theta_2}{2}\right)^2 - 2\sin\frac{\theta_2}{2} \cdot \sin\frac{\theta_1 - \theta_2}{2}\right]}{21}}.$$
(2.22)

Similar to the approach in amplitude mismatch analysis, the largest 3 QPSKs suffice to approximate the  $4^{M}$ -QAM EVM (M>3), as follows:

$$EVM_{4^{M}-QAM,\theta} \approx \sqrt{\frac{4^{M-2} \left[ \left( 2\sin \frac{\theta_{M-1}}{2} + \sin \frac{\theta_{\Delta}}{2} \right)^{2} - 2\sin \frac{\theta_{M-1}}{2} \cdot \sin \frac{\theta_{\Delta}}{2} \right]}{(4^{M} - 1)/3}},$$
(2.23)

where  $\theta_{\Delta} = \theta_{M-2} - \theta_{M-1}$ . Shown in Fig. 2.14 is the 64QAM EVM versus phase mismatch. Not surprisingly, phase mismatch induced by larger amplitude QPSK ( $\theta_2$ ) affects the EVM more than smaller QPSK phase deviation ( $\theta_1$ ) does. A -30 dB EVM contour corresponds to a phase mismatch of only  $\pm 5^{\circ}$ , a more challenging task compared to amplitude mismatch control from a circuit realization perspective.
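Eq. (2.22) can be evaluated in the same way to compare the sensitivity to $\theta_1$ and $\theta_2$. The sketch below implements the small-angle expression as written; the test angles are illustrative.

```python
import math

def evm_64qam_phase(theta1_deg, theta2_deg):
    """Eq. (2.22): 64QAM EVM floor (linear) from QPSK phase mismatch.
    Angles are in degrees and referenced to the largest QPSK."""
    s2 = math.sin(math.radians(theta2_deg) / 2.0)
    s1 = math.sin(math.radians(theta1_deg - theta2_deg) / 2.0)
    bracket = (2.0 * s2 + s1) ** 2 - 2.0 * s2 * s1
    return math.sqrt(4.0 * bracket / 21.0)

def to_db(x):
    """Convert a linear EVM ratio to dB."""
    return 20.0 * math.log10(x)

only_theta1 = evm_64qam_phase(5.0, 0.0)   # smallest QPSK skewed by 5 degrees
only_theta2 = evm_64qam_phase(0.0, 5.0)   # middle QPSK skewed by 5 degrees
print(f"theta1 = 5 deg: {to_db(only_theta1):.1f} dB, "
      f"theta2 = 5 deg: {to_db(only_theta2):.1f} dB")
```

A 5° skew on the middle QPSK alone already brings the floor to about -30 dB, while the same skew on the smallest QPSK is several dB more benign.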



Figure 2.14: (a) Surface plot of 64QAM EVM with phase mismatch. (b) Contours of 64QAM EVM surface plot.

#### 2.3.5 LO Phase Noise

The generation of low phase noise (PN) LO signals at high mm-wave / sub-THz frequencies is beyond the scope of this dissertation. However, it is still of interest to investigate how LO PN affects signal quality in the proposed TX architecture. A comparison between I/Q direct up-conversion and the proposed 4<sup>M</sup>-QAM generation method is shown in Fig. 2.15 along with exemplary waveforms at different locations. All circuit blocks in a conventional direct up-conversion TX must maintain good linearity since amplitude-varying modulated signals are being processed. The DAC design poses significant challenges as previously discussed. The mixer and RF gain stages further exacerbate the problem and degrade EVM. On the other hand, the linearity requirement in the proposed TX architecture is much relaxed for circuit blocks before the linear combiner. If a PA is used after the linear combiner, its non-linearity limits the output EVM in the same way as in conventional TXs.

The LO generation in the proposed TX is slightly different from that in a conventional TX. The I/Q LO signals must be distributed to M QPSK modulators. Some calibration circuitry may be needed to minimize phase mismatch between different QPSK modulators if exactly symmetrical layout in LO distribution is not possible. The k<sup>th</sup> QPSK signal is written as:

$$QPSK_k(t) = e^{j\frac{\pi}{4}m_k(t)}2^{k-\frac{1}{2}}d \cdot A_c(t)A_{u,k}(t)e^{j\left[\phi_c(t)+\phi_{u,k}(t)\right]}$$
(2.24)



Figure 2.15: Conventional and proposed 4<sup>M</sup>-QAM modulator.

where $m_k(t)$ is the time-domain symbol sequence taking random transitions among the integers -3, -1, +1 and +3. $\phi_c(t)$ and $A_c(t)$ are the fully correlated parts of the phase and amplitude perturbations in the LO signals among different QPSK paths. This full correlation originates from the fact that the quadrature LO signals for each QPSK modulator are typically produced in a shared frequency synthesizer first and then routed to multiple QPSK modulators. The noise that leads to the amplitude and phase perturbations is ideally identical in a noiseless LO distribution network. Nonetheless, a practical LO distribution network inevitably introduces independent and therefore uncorrelated LO noise, whose amplitude and phase components are denoted by $A_{u,k}(t)$ and $\phi_{u,k}(t)$. The 4<sup>M</sup>-QAM time-domain signal generated by the proposed TX can be expressed as:

$$4^{M} - QAM(t) = \sum_{k=1}^{M} QPSK_{k}(t) = \left[ d \sum_{k=1}^{M} 2^{k-\frac{1}{2}} e^{j\frac{\pi}{4}m_{k}(t)} A_{u,k}(t) e^{j\phi_{u,k}(t)} \right] \times A_{c}(t) e^{j\phi_{c}(t)}.$$
(2.25)

The term within the bracket represents the ideal symbol points of 4<sup>M</sup>-QAM under the influence of uncorrelated LO noise from each QPSK path. It is multiplied by the fully correlated LO perturbation from the synthesizer. This process is similar to that in a conventional direct up-conversion TX where the complex-valued BB equivalent signal is multiplied by the LO signal's perturbation phasor.
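The structure of Eq. (2.24) and Eq. (2.25) can be illustrated with a minimal sketch that sums ideal QPSK symbols with binary-weighted amplitudes. All perturbation terms are set to their ideal values ($A_c = A_{u,k} = 1$, $\phi_{u,k} = 0$), and the function names and the choice $d = 1$ are assumptions made only for illustration.

```python
import cmath
import itertools
import math


def qpsk_symbol(m, k, d=1.0):
    """Ideal k-th QPSK symbol of Eq. (2.24) with all LO perturbations removed."""
    return cmath.exp(1j * math.pi / 4 * m) * 2 ** (k - 0.5) * d


def qam_points(num_qpsk, d=1.0):
    """All 4^M ideal points obtained by summing M weighted QPSK symbols."""
    phase_indices = (-3, -1, 1, 3)
    points = set()
    for ms in itertools.product(phase_indices, repeat=num_qpsk):
        s = sum(qpsk_symbol(m, k, d) for k, m in enumerate(ms, start=1))
        # Round away floating-point residue so that equal points coincide.
        points.add((round(s.real, 9), round(s.imag, 9)))
    return points


def apply_common_lo_phase(point, phi_c):
    """Multiply a point by the fully correlated phase term exp(j*phi_c) of Eq. (2.25)."""
    return complex(*point) * cmath.exp(1j * phi_c)


points_16qam = qam_points(2)  # M = 2 gives the 16QAM grid {-3,-1,+1,+3}^2 for d = 1
```

With $M = 2$ the sum reproduces the square 16QAM grid, and the common term only rotates every point by the same angle, which is why correlated LO noise acts on the combined constellation as it would in a conventional TX.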

If the uncorrelated part of the LO noise is negligible, the LO PN impact on the proposed TX is exactly the same as that in a conventional TX. Assuming the uncorrelated part of the LO noise has the same power spectral density (PSD) profile  $\mathscr{L}_u(\Delta f)$  in every path, the 4<sup>M</sup>-QAM constellation after linear combining sees an effective LO PN  $10\log_{10}M$  dB higher than  $\mathscr{L}_u(\Delta f)$ . The practical level of  $\mathscr{L}_u(\Delta f)$  is typically much lower than that coming from the synthesizer since it originates from passive loss and LO buffers in the distribution network. A purely passive divider, for example, only adds a -174 dBm/Hz noise floor to the incoming LO signal. A typical 0 dBm LO signal with superior far-out PN provided by a test instrument can have a noise floor from -150 to -140 dBm/Hz, a much higher level than that contributed by the distribution network. Integrated synthesizer solutions typically have worse PN performance, rendering the uncorrelated noise contribution from the LO distribution network negligible.

Using the estimation in [71], PN has to be -94 dBc/Hz and -104 dBc/Hz at 1 MHz and 10 MHz offset (integration BW is 10 MHz) to secure a -30 dB EVM floor. The results in [72] imply that a -30 dB EVM floor requires the far-out LO noise floor to be less than -125 dBc/Hz. State-of-the-art PLLs [73] can achieve this level of PN with careful design. More analytical and experimental results relating LO signal quality and transceiver performance can be found in [16, 72, 74–76].

## 2.4 16QAM Direct Modulation TX Prototype Design

A proof-of-concept 16QAM TX prototype operating at 115 GHz center frequency was designed and implemented as shown in Fig. 2.16. Two modulators with programmable bias current generate QPSK signals with 1X (QPSK1) and 2X (QPSK2) amplitudes.

Figure 2.16: Full architecture of the proposed direct modulation TX implementing 16QAM constellation.

A passive Wilkinson combiner linearly adds the two QPSK signals, and the interaction between the two QPSK paths is minimized by the port-to-port isolation of the Wilkinson combiner. Buffers between the QPSK modulators and the Wilkinson combiner compensate for the loss during power combining. The PA then amplifies the combined 16QAM signal. Quadrature LO signals are derived from an 18.3 GHz external input using the multiplier-based LO distribution network. A  $2^7-1$  PRBS generator employing a full-rate clock with current mode logic (CML) design was implemented on the same chip to facilitate testing. The output bits, B<sub>0</sub>-B<sub>3</sub> in Fig. 2.16, are delayed versions of each other so that all possible symbol states can be traversed [13]. A C-R high-pass coupling network was used for biasing purposes. Around 0.5 ps of delay mismatch between B<sub>0</sub>-B<sub>3</sub> was observed due to the layout spacing between the QPSK modulators (200  $\mu$m). This delay spread is much less than the typical symbol period of this design and thus will not affect output EVM.
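The idea of driving four bit inputs with delayed copies of one PRBS can be sketched in software. The polynomial $x^7 + x^6 + 1$ and the delays of 0 to 3 bits used below are assumptions for illustration only; the text specifies the sequence length $2^7-1$ but not the feedback taps or the actual on-chip delays.

```python
def prbs7(length=127, seed=0x7F):
    """Return `length` bits from a 7-bit LFSR with polynomial x^7 + x^6 + 1 (assumed)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(length):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of register stages 7 and 6
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


def symbol_states(bits, delays=(0, 1, 2, 3)):
    """Distinct (B0, B1, B2, B3) tuples formed by cyclically delayed copies (hypothetical delays)."""
    n = len(bits)
    return {tuple(bits[(i - d) % n] for d in delays) for i in range(n)}


sequence = prbs7()
states = symbol_states(sequence)  # 16 distinct tuples: every 16QAM symbol state is visited
```

Because a maximal-length sequence contains every non-zero 7-bit window once per period, any four consecutive bits take all 16 values, so a single generator suffices to exercise the full constellation.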

#### 2.4.1 Power Amplifier Design

The class-A PA shown in Fig. 2.17 employs a four-stage design with transmission-line-based matching networks. The simulated PA load-pull contours are plotted in Fig. 2.18(a) with a 110 GHz continuous wave (CW) output. The contours correspond to power levels from -2 to +6 dBm in steps of 2 dB. The differential common-emitter topology in the first three



Figure 2.17: The four-stage PA schematic.



Figure 2.18: (a) PA load-pull contour. (b) Small-signal model of neutralized differential common-emitter stage. (c) Simulated  $G_{max}$  with ideal and dummy BJT capacitors.

stages is capacitively neutralized [77] to gain better stability and BW at high mm-wave frequencies. The small-signal model of the neutralized differential common-emitter structure is shown in Fig. 2.18(b). The  $Y_{12}$  after neutralization is derived to be:

$$Y_{12,n} = \frac{-j\omega(C_{\mu} - C_n) + \omega^2 R_n C_n C_{\mu}}{1 - \omega^2 r_b R_n C_n C_i + j\omega[(R_n + r_b)C_n + r_b C_i]},$$
(2.26)

where  $C_n$  and  $R_n$  are the neutralization capacitance and its associated series resistance, respectively.  $C_{\mu}$  and  $C_{\pi}$  are the BJT's intrinsic junction capacitances and  $C_i = C_{\mu} + C_{\pi}$ .  $R_n$  gives rise to a non-zero  $Y_{12}$  even with  $C_{\mu} = C_n$  and also induces resistive loss. MOSFETs are typically used as dummy neutralization capacitors [12] in CMOS processes to obtain more robust circuit performance over process-voltage-temperature (PVT) variations. However, as indicated in Fig. 2.18(c), dummy BJT devices introduce an extra 3 dB loss in maximum available gain ( $G_{MAX}$ ) due to their high base resistance.
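Eq. (2.26) can be evaluated numerically to see how the series resistance limits neutralization. The sketch below is a direct transcription of the equation; the device values (10 fF, 40 fF, 20 Ω, 5 Ω) are hypothetical and are not taken from the design.

```python
import math


def y12_neutralized(freq_hz, c_mu, c_pi, c_n, r_n, r_b):
    """Y12 of the neutralized differential common-emitter stage, per Eq. (2.26)."""
    w = 2 * math.pi * freq_hz
    c_i = c_mu + c_pi
    numerator = -1j * w * (c_mu - c_n) + w**2 * r_n * c_n * c_mu
    denominator = (1 - w**2 * r_b * r_n * c_n * c_i) + 1j * w * ((r_n + r_b) * c_n + r_b * c_i)
    return numerator / denominator


# Hypothetical device values, chosen only to exercise the formula.
F0 = 110e9
C_MU, C_PI, R_B = 10e-15, 40e-15, 20.0

y12_unneutralized = y12_neutralized(F0, C_MU, C_PI, c_n=0.0, r_n=0.0, r_b=R_B)
y12_ideal = y12_neutralized(F0, C_MU, C_PI, c_n=C_MU, r_n=0.0, r_b=R_B)
y12_lossy = y12_neutralized(F0, C_MU, C_PI, c_n=C_MU, r_n=5.0, r_b=R_B)
```

With $C_n = C_{\mu}$ and $R_n = 0$ the numerator vanishes, whereas a finite $R_n$ leaves a residual term proportional to $\omega^2 R_n C_n C_{\mu}$, in line with the observation above.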

The layout view of the PA first stage is shown in Fig. 2.19(a). The neutralization capacitors are designed to optimize the gain-BW product and are realized with custom-designed high-quality-factor metal-oxide-metal (MOM) capacitors. Only high-level metal layers are used to minimize parasitic capacitance to the substrate. All parasitic effects are captured in electromagnetic (EM) simulation. Grounded-coplanar-waveguide (GCPW) structures are adopted



Figure 2.19: (a) PA first stage layout. (b) PA output stage layout. (c) PA simulation results.

to implement transmission line matching networks. 45°-mitered bends and T-junctions minimize reflections due to impedance discontinuities and improve BW. High-frequency decoupling capacitors (de-caps) are MOM capacitors with a self-resonance frequency (SRF) close to the carrier frequency. Metal-insulator-metal (MIM) de-caps with larger capacitance but lower SRF are placed next to the high-frequency de-caps to reduce supply ripples at lower frequencies. The PA output stage shown in Fig. 2.19(b) employs a cascode topology to increase output power. The 3-D structure of the output ground-signal-ground (GSG) pad is carefully EM simulated and serves as part of the output matching network, which also contains a balun and a 45 fF capacitor. The output matching network transforms a standard 50  $\Omega$  load to the load-pull impedance for +7 dBm saturated output power. The output balun is designed in the top-most two thick metal layers with a diameter of 52  $\mu$m and a trace width of 10  $\mu$m. The PA frequency response is stagger-tuned to improve BW. Simulation results of small-signal S-parameters and output power at 6 dB back-off are shown in Fig. 2.19(c). The stagger-tuned characteristic with roughly 3 dB ripple can be clearly observed from the plots. PA output power at three different frequencies is plotted in Fig. 2.20(a) along with the large-signal group delay shown in Fig. 2.20(b). The 1-dB compression points at the PA output (OP1dB) for 105 GHz, 110 GHz and 115 GHz are 2.9 dBm, 2.2 dBm and 1.3 dBm, respectively.



Figure 2.20: PA simulation results of: (a) Output power versus input power for 3 different frequencies; (b) Group delay of PA small signal response.

#### 2.4.2 Wilkinson Power Combiner

The linear combiner is realized with a lumped differential Wilkinson combiner. A balanced-to-balanced (differential) Gysel topology is also a potential option [78–81]. However, its BW is narrower due to several sections of transmission lines, so the Wilkinson combiner is preferred for high-speed, wide-BW applications. A differential Wilkinson combiner typically has to incorporate cross-overs, which can severely deteriorate BW and differential signaling at very high frequencies due to impedance discontinuity and cross coupling [12, 82]. The cross-overs can be avoided as shown in Fig. 2.21(a) and 2.21(b). The differential QPSK signals are converted to single-ended form using baluns first. The single-ended signals are



Figure 2.21: (a) Wilkinson combiner schematic. (b) Layout.

then combined in a lumped element Wilkinson realization using inductors and capacitors. Finally, an output balun transforms the combined signal back to differential form for power amplification. The intermediate port impedance for the lumped Wilkinson is chosen to be 70  $\Omega$  to strike a balance between loss and BW. This is because a low impedance transformation ratio enables wider BW.
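As a rough illustration of a lumped Wilkinson realization, the sketch below computes element values from the textbook pi-network equivalent of a quarter-wave line. This is an assumption: the text does not give the actual network topology or component values, and an equal-split combiner with identical 70 Ω port impedances is assumed here.

```python
import math


def lumped_wilkinson(f0_hz, z_port_ohm):
    """Element values for an equal-split Wilkinson whose quarter-wave branches
    are replaced by C-L-C pi networks (textbook equivalent, assumed here)."""
    w0 = 2 * math.pi * f0_hz
    z_branch = math.sqrt(2) * z_port_ohm      # characteristic impedance of each branch
    return {
        "z_branch_ohm": z_branch,
        "l_series_h": z_branch / w0,          # series inductor of each pi network
        "c_shunt_f": 1 / (w0 * z_branch),     # shunt capacitor at each end of the pi network
        "r_isolation_ohm": 2 * z_port_ohm,    # isolation resistor between the two inputs
    }


values = lumped_wilkinson(110e9, 70.0)
# Roughly 143 pH and 14.6 fF per branch for a 70-ohm port impedance at 110 GHz.
```

Element values of this order are practical on chip at 110 GHz, although the actual design would be refined through EM simulation as described for the other passive structures.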

The balun design is similar to that described in [82]. S-parameter simulation results are shown in Fig. 2.22(a). The insertion loss from input to output is around 6 dB at 110 GHz. The frequency response remains flat from 90 to 130 GHz. Both input and output port reflection coefficients are satisfactory and the port-to-port isolation is better than 20 dB in the frequency range of interest. Therefore, the interaction between the two QPSK paths is negligible. If direct current combining were adopted as in conventional TXs to circumvent the inherent 3 dB loss due to out-of-phase power combining, load modulation would tend to degrade the combined 16QAM EVM; it is therefore not implemented here. Conventional TXs operating at high mm-wave frequencies face similar challenges in combining the I/Q signals. The imperfections in differential signaling are also shown in Fig. 2.22(b). A 1 dB magnitude error and a 1.7° phase error can be observed at 110 GHz.

The input-to-output common-mode (CM)/differential-mode (DM) conversion ratios (Fig. 2.22(b)) are simulated by feeding one port and terminating the other with a 100  $\Omega$  resistance. The CM output due to DM input undergoes more than 30 dB attenuation and thus will not disturb the operation of the PA following the Wilkinson combiner. At the same time, part of the CM input will be transformed into DM output due to parasitic capacitive coupling within the balun. This undesired DM signal contamination is minimized by the fully symmetric design of the circuits driving the combiner.



Figure 2.22: (a) Simulated differential mode S-parameters. (b) Simulated differential output magnitude and phase error, CM / DM conversion ratios.

#### 2.4.3 QPSK Modulator

The schematic of the QPSK modulator is shown in Fig. 2.23, which employs a double-balanced active mixer design. The in-phase and quadrature parts are summed at the output nodes. The shunt inductance resonates out the parasitic capacitance at the collector output and the series routing inductance is canceled at the carrier frequency by a series capacitance of 26 fF, which also provides DC isolation between stages. The output amplitude is a function of tail



Figure 2.23: Schematic of the QPSK modulator.

bias current and can be accurately tuned. A current mirror with  $\beta$  helper provides a 4-bit tunable input current to program the tail current  $I_B$  from 0.5 to 4 mA. The parasitic effect is more prominent at high mm-wave frequencies. As shown in the red box in Fig. 2.23, the input impedance looking into the LO driving port with emitter degeneration can be derived as:

$$Z_{in} = \frac{1 + g_m R_E}{j\omega C_\pi - \omega^2 C_\pi C_E R_E} + \frac{R_E (C_\pi + C_E)}{C_\pi + j\omega C_\pi C_E R_E} = \frac{R_E (C_\pi - g_m R_E C_E)}{C_\pi + C_\pi \omega^2 C_E^2 R_E^2} + \frac{1}{j\omega C_\pi \frac{1 + \omega^2 C_E^2 R_E^2}{1 + g_m R_E + \omega^2 R_E^2 C_E (C_\pi + C_E)}}.$$
(2.27)

A negative resistive part can emerge if  $g_m R_E C_E > C_{\pi}$ , leading to potential instability. This effect is mitigated by careful layout and a resistive source impedance driving the LO port.
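Eq. (2.27) and the instability condition can be cross-checked numerically. The sketch below compares the closed form against a direct evaluation under an assumed simplified small-signal model (only $C_{\pi}$ and $g_m$, with the emitter loaded by $R_E$ in parallel with $C_E$); the element values are hypothetical.

```python
import math


def z_in_direct(freq_hz, gm, r_e, c_e, c_pi):
    """Z_in from the assumed model: C_pi in series with the emitter impedance
    boosted by the (1 + gm * Z_pi) factor."""
    w = 2 * math.pi * freq_hz
    z_e = r_e / (1 + 1j * w * r_e * c_e)   # R_E in parallel with C_E
    z_pi = 1 / (1j * w * c_pi)
    return z_pi + z_e * (1 + gm * z_pi)


def z_in_closed_form(freq_hz, gm, r_e, c_e, c_pi):
    """Resistive part plus effective series capacitance, per the right-hand side of Eq. (2.27)."""
    w = 2 * math.pi * freq_hz
    d = 1 + (w * r_e * c_e) ** 2
    r_part = r_e * (c_pi - gm * r_e * c_e) / (c_pi * d)
    c_eff = c_pi * d / (1 + gm * r_e + w**2 * r_e**2 * c_e * (c_pi + c_e))
    return r_part + 1 / (1j * w * c_eff)


# Hypothetical values: gm*R_E*C_E = 20 fF exceeds C_pi = 15 fF, so Re(Z_in) < 0.
unstable = z_in_closed_form(110e9, gm=40e-3, r_e=10.0, c_e=50e-15, c_pi=15e-15)
# With C_E = 10 fF, gm*R_E*C_E = 4 fF is below C_pi, so Re(Z_in) > 0.
stable = z_in_closed_form(110e9, gm=40e-3, r_e=10.0, c_e=10e-15, c_pi=15e-15)
```

The two functions agree to numerical precision, and the sign of the real part follows the condition $g_m R_E C_E > C_{\pi}$ stated above.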



Figure 2.24: QPSK modulator simulation results with LO frequencies of 110 GHz (black) and 115 GHz (blue) for (a) output power versus LO power, (b) output power and LO leakage for different bias codes.

The peak swing of the BB CML input  $(V_{BB})$  must be sufficient to fully steer the tail currents. Fig. 2.24(a) plots the simulated modulator output power versus LO port input power for various BB swings. The tail current is set to its maximum value of 4 mA in the simulation.  $V_{BB}$ is designed for a peak swing of 200 mV since the modulator output only improves marginally for  $V_{BB}$  greater than 150 mV. Simulations for both 110 GHz (black) and 115 GHz (blue) are shown in Fig. 2.24(a), and OP1dB reaches -11.6 dBm and -11.3 dBm, respectively. The total output power combining the lower and upper side-bands around the carrier frequency with 200 mV peak  $V_{BB}$  is plotted in Fig. 2.24(b) for different bias current codes. The amplitude tuning range is 16 dB. Code values of 15 and 5 achieve amplitude ratios of 5.93 dB and 5.64 dB for 110 GHz and 115 GHz signals, respectively, resulting in 0.09 dB (1%) and 0.38 dB (4.5%) amplitude mismatch relative to the ideal 6.02 dB (2:1) ratio. The EVM limited by this level of amplitude mismatch is much better than -30 dB based on the results in Eq. (2.15) and Fig. 2.11(b), a sufficiently low EVM floor for 16QAM. Amplitude accuracy can be improved further with finer current tuning resolution. The LO leakages for 110 GHz and 115 GHz LO signals are plotted in Fig. 2.24(b) for different bias codes. The LO port driving power is 0 dBm in the simulation. The LO leakage arising from parasitic coupling would be perfectly canceled in an ideal differential setup. However, device mismatch, LO imbalance and large-signal  $C_{\mu}$  non-linearity introduce asymmetry in the LO signal coupling path and lead to finite LO leakage at the modulator output. The LO leakage is minimized by placing all transistors close to each other and employing a fully symmetric layout for the differential LO paths. The BB DC offset can also cause LO leakage due to frequency up-conversion. Nevertheless, the BB inputs are driven into saturation to fully steer the bias current, rendering the DC offset concern insignificant.
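The quoted mismatch figures follow from comparing each simulated amplitude ratio with the ideal 2:1 ratio, $20\log_{10}2 \approx 6.02$ dB. The short sketch below reproduces this arithmetic; the function name is an assumption for illustration.

```python
import math

IDEAL_RATIO_DB = 20 * math.log10(2)  # ideal 2:1 amplitude ratio, about 6.02 dB


def amplitude_mismatch(ratio_db):
    """Return (error in dB, error in percent of amplitude) relative to the ideal 2:1 ratio."""
    error_db = IDEAL_RATIO_DB - ratio_db
    error_pct = (10 ** (error_db / 20) - 1) * 100
    return error_db, error_pct


mismatch_110ghz = amplitude_mismatch(5.93)  # about 0.09 dB, about 1 %
mismatch_115ghz = amplitude_mismatch(5.64)  # about 0.38 dB, about 4.5 %
```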

Fig. 2.25(a) shows the small-signal up-converted frequency response for different bias codes. The frequency dependence remains consistent for low amplitude and high amplitude bias codes. An amplitude ratio close to 1-to-2 is maintained in a wide range of frequencies for codes 15 and 5. The quadrature phase error of QPSK modulator is shown in Fig. 2.25(b)



Figure 2.25: QPSK modulator simulation results of: (a) Magnitude response. (b) Internal quadrature errors.

for the bias codes 15 and 5. Using the full bias current (code 15), the quadrature phase error stays below 4° and only causes a minimal EVM degradation. The smaller amplitude case (code 5) shows a larger phase error but carries less weight in determining the combined 16QAM EVM according to Eq. (2.7). The transient waveform of a 10 GBaud signal with alternating transitions between 180° out-of-phase symbols is shown in Fig. 2.26. Transient simulation captures both the amplitude and phase response of the QPSK modulator and shows a worst-case symbol-to-symbol transition of 0.32 unit interval (UI). This demonstrates that the QPSK modulator design is capable of processing wideband signals without inducing significant ISI.



Figure 2.26: 10 GBaud output waveform in transition between symbols with 180° phase difference.



Figure 2.27: (a) LO tripler schematic. (b) LO doubler schematic. Layouts of (c)  $0^{\circ}$  phase shifter (d)  $45^{\circ}$  phase shifter.

#### 2.4.4 LO Generation and Distribution

The LO generation and distribution network shown in Fig. 2.16 multiplies the input LO frequency by 6 and delivers differential quadrature LO signals to the QPSK modulators. The tripler in Fig. 2.27(a) triples the input LO first and its output is then split by a differential Wilkinson power divider. The split outputs experience phase shifts with a 45° phase difference and are then amplified by capacitively neutralized buffers similar to the PA stages. The doubler shown in Fig. 2.27(b) then doubles the LO to around 110 GHz and the 45° phase difference is also doubled to 90° for quadrature operation. A balun converts the single-ended doubler output to differential form for the following differential buffers. The buffers are driven into saturation to minimize amplitude mismatch between the I/Q LO signals. Four baluns are used to divide the I/Q LO signals to feed the two QPSK modulators. The phase

shifters are based on all-pass filter topology [83]. Their layouts and crucial design parameters are shown in Fig. 2.27. Varactor voltages control the phase difference.
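The frequency and phase plan of the multiplier chain can be summarized in a few lines. The sketch below uses the 18.3 GHz input and the 45° split from the text; the $20\log_{10}N$ phase-noise penalty of an ideal multiply-by-$N$ chain is a standard result added here for context and is not a figure reported for this design.

```python
import math


def lo_chain_plan(f_in_ghz=18.3, phase_split_deg=45.0):
    """Frequency and phase bookkeeping for a tripler followed by a doubler."""
    f_tripler_ghz = 3 * f_in_ghz               # phase shifters operate at this frequency
    f_out_ghz = 2 * f_tripler_ghz              # LO delivered to the QPSK modulators
    quadrature_deg = 2 * phase_split_deg       # the doubler also doubles the phase difference
    pn_penalty_db = 20 * math.log10(3 * 2)     # ideal multiplier phase-noise scaling
    return {
        "f_tripler_ghz": f_tripler_ghz,
        "f_out_ghz": f_out_ghz,
        "quadrature_deg": quadrature_deg,
        "pn_penalty_db": pn_penalty_db,
    }


plan = lo_chain_plan()
# f_out is about 109.8 GHz, the I/Q split is 90 degrees, and the ideal PN penalty is about 15.6 dB.
```

Placing the phase shifters before the doubler means that only half of the final quadrature phase needs to be generated and tuned at the lower frequency.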

The input-output relationships for the tripler, the doubler and the complete LO chain are plotted in Fig. 2.28(a). The output power of the LO chain that feeds the QPSK modulators saturates to 0 dBm with -10 dBm external input power. As shown in Fig. 2.28(b), the phase shifters' outputs can be tuned to have a phase difference from 14.7° to 80.5°. This tuning range is sufficient to compensate for the quadrature phase error generated along the LO chain. The LO chain frequency response is shown in Fig. 2.28(c) and the single side-band (SSB) PN profile at 110 GHz is shown in Fig. 2.28(d). With a noiseless external source, the PN solely due to the multiplier chain is below -130 dBc/Hz at 1 MHz frequency offset and flattens to -138



Figure 2.28: Simulation results of: (a) Output power versus input power for LO generation circuitry. (b) Phase shifter tuning range. (c) Output power versus frequency. (d) SSB phase noise of LO output at 110 GHz.

dBc/Hz at far-out frequencies. Simulation with an external PN profile from a realistic signal generator (Agilent E8257D/567) is also plotted in Fig. 2.28(d) for comparison. The close-in PN is mainly limited by the external source (or an on-chip synthesizer in a fully integrated solution) while the on-chip LO chain mainly affects far-out PN. As pointed out by [72], far-out white PN can be a major source of EVM degradation in wideband communication systems. The EVM floor limited by a -138 dBc/Hz white PN floor is below -30 dB as shown by the experimental results in [72] and is sufficient for 16QAM modulation in this prototype. The above PN simulation results were obtained at a 0 dBm LO chain output power. The highest-power undesired harmonics are the 5<sup>th</sup> (-38 dBm) and the 7<sup>th</sup> (-34 dBm). These LO harmonics are separated from the desired LO harmonic (6<sup>th</sup>) by at least 18.3 GHz. The LO chain frequency response shown in Fig. 2.28(c) attenuates the undesired harmonics, and the hard-switching of the QPSK modulator's LO port ensures negligible impact on EVM due to nearby LO harmonics.
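A rough cross-check of the white-PN-limited EVM floor can be made by integrating a flat SSB PN floor over both sidebands of the signal bandwidth and applying the small-angle approximation (EVM roughly equal to the rms phase error in radians). This is a simplified estimate and not the method of [72]; the 5 GHz noise bandwidth is a hypothetical value chosen to match a 5 GBaud signal, and amplitude noise and carrier tracking are ignored.

```python
import math


def evm_floor_from_white_pn(pn_floor_dbc_hz, rf_bandwidth_hz):
    """EVM floor in dB for a flat SSB phase-noise floor.

    The double-sideband phase variance is 2 * L * (BW / 2) = L * BW,
    so in dB the floor is L_dBc/Hz + 10*log10(BW).
    """
    return pn_floor_dbc_hz + 10 * math.log10(rf_bandwidth_hz)


evm_floor_db = evm_floor_from_white_pn(-138.0, 5e9)  # about -41 dB under these assumptions
```

Under these assumptions the estimate falls well below -30 dB, consistent with the statement that the multiplier chain's PN floor does not limit 16QAM operation in this prototype.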

## 2.5 Measurement Results

The 16QAM prototype TX was fabricated in TowerJazz SBC18H3, a 180 nm SiGe BiCMOS process. The die micrograph is shown in Fig. 2.29. The TX occupies 3.17 mm<sup>2</sup> active area



Figure 2.29: Chip micrograph of the prototype TX.



Figure 2.30: Power consumption and active area breakdown.

and dissipates a total of 520 mW DC power from 1.6 V, 2.5 V and 3.3 V supply voltages. The power consumption breakdown is shown in Fig. 2.30(a), in which the PA and the LO buffers consume most of the power owing to the limited transistor  $G_{MAX}$  around the 110 GHz carrier frequency. As can be seen from Fig. 2.30(b) and Fig. 2.29, the quadrature LO generation part occupies most of the die area. Note that the distribution of the quadrature LO to two QPSK modulators, which is specific to the proposed TX, does not incur much area overhead. The total die area can be reduced with a better floor plan and transformer-based, instead of transmission-line-based, matching networks.

Shown in Fig. 2.31 are the measurement setup and the test bench used during chip characterization. The external input of the LO chain is delivered through a coaxial GSG probe from the internal source of a vector network analyzer (VNA), which was used as a signal generator. The TX output is first characterized with a waveguide (WR 8.0) GSG probe and a wideband down-conversion mixer. The IF port BW of the mixer is 20 GHz and the TX output is down-converted to an IF frequency of 10 GHz. The LO drive for the mixer is provided by a frequency extension module for signal generators. Wireless measurement was also performed to verify the link performance. The rest of the setup is the same for both wired and wireless measurements, while the waveguide sections in the wired setup are replaced with a 25 dBi horn antenna. The maximum wireless measurement distance of 20 cm was limited by the lab environment.



Figure 2.31: (a) Measurement setup. (b) Photo of laboratory measurement bench.

We are interested in the frequency transfer from the TX LO port to the output pads since the QPSK modulator can be treated as an LO buffer with a cascode tail current source when the BB inputs are not switching. The frequency response from the modulator output to the TX output pad can be measured by applying a 200 mV DC voltage to the BB input of one QPSK modulator while completely turning off the other modulator, and then sweeping the LO frequency across the frequency range of interest. The actual LO power feeding into the QPSK modulator is estimated by simulation results shown in Fig. 2.31 since the internal modulator LO port cannot be accessed externally. The small-signal frequency response, stimulated with an LO power of -30 dBm, is shown in Fig. 2.32(a) and exhibits a clear stagger-tuned signature. The two peaks are around 105 GHz and 115 GHz.

One QPSK modulator is turned off while the other is fed with a BB input of a 0.5 GHz square wave to measure the large-signal up-conversion response. The TX output power is plotted versus LO frequency for different bias codes in Fig. 2.32(b). The output power tuning range is 14 dB at 110 GHz, very close to that predicted by simulation in Fig. 2.24(b). The maximum TX output using only one QPSK modulator is 3 dBm at 120 GHz.



Figure 2.32: Measured (a) small signal frequency response, (b) TX output power frequency sweep with different bias codes.



Figure 2.33: Measured (a) error in amplitude ratio, (b) LO leakage.

The amplitude error (between code 15 and code 5) relative to the ideal 6 dB ratio is shown in Fig. 2.33(a). This error is kept below 1 dB over a 20 GHz BW. The EVM limited by this level of amplitude mismatch is below -27 dB according to Eq. (2.15) and Fig. 2.11(b). The 0.5 GHz square waves at the I and Q BB inputs are in-phase and therefore both lower and higher side-band tones are strong, as shown in the down-converted spectrum in Fig. 2.33(b). The total up-converted output power is obtained by combining the power of the lower and higher side-bands. The LO leakage at the maximum output power level is measured to be -25.7 dBc.

The constellations obtained from the wireless measurement setup are shown in Fig. 2.34. An off-line equalizer implemented in MATLAB with a 40-tap DFE was used to equalize the



Figure 2.34: Individual QPSK and combined 16QAM constellation for (a) 4GBaud and (b) 5GBaud.

frequency response of the TX and the interconnections involved in the measurement setup. Both QPSK constellations with different amplitudes were captured separately by turning on only one of them at a time. The final combined 16QAM EVM is very close to the EVM of the QPSK signals, with  $\approx$  1 dB extra degradation. This result verifies the analytical conclusion drawn in Section 2.3. Residual amplitude/phase mismatch and the BB data correlation due to finite PRBS length contribute to this extra degradation. The EVMs for 4 GBaud (16 Gbps) and 5 GBaud (20 Gbps) 16QAM signals are -16.7 dB and -15.8 dB, respectively. The total modulated output power after de-embedding the measurement setup loss is +1 dBm.

The unequalized IF spectrum of TX output is shown in Fig. 2.35 for a 5 GBaud QPSK and 16QAM signal. The EVM is plotted versus data rate as shown in Fig. 2.36. The PRBS design in this prototype limits the maximum achievable data rate and can be improved in future designs. The EVM degradation from QPSK to 16QAM is smaller at lower speeds, signifying a



Figure 2.35: Down-converted spectrum of (a) 5 GBaud QPSK<sub>2</sub> and (b) 5 GBaud 16QAM.

better amplitude and phase matching at lower symbol rates. The EVM discrepancy between a 2 Gbps QPSK and a 4 Gbps 16QAM signal is only 0.1 dB, implying that the amplitude non-linearity of the signal path does not limit the measured EVM. The BW limitation and the PA's AM-to-PM distortion ultimately set the lower bound of EVM at high data rates. To be more specific, magnitude flatness and phase distortion/group delay variation are the dominant factors in determining the maximum data rate.

The mixer and the oscilloscope in the measurement setup essentially form an RX with poor conversion gain and noise figure. The -15.8 dB EVM obtained from the wireless setup for a 20 Gbps 16QAM signal includes all the impairments from the prototype TX and the



Figure 2.36: EVM vs. data rate for (a) QPSK signal and (b) combined 16QAM signal.

| Reference | Architecture                | Modulation<br>(Baud Rate) | D/A<br>Interface     | D/A<br>Speed | Data Rate <sup>I</sup><br>(Gbps) | EVM<br>(dB) | Carrier<br>Frequency<br>(GHz) | Output<br>Power<br>(dBm) | Power<br>Dissipation <sup>I</sup><br>(mW) | Tech             |
|-----------|-----------------------------|---------------------------|----------------------|--------------|----------------------------------|-------------|-------------------------------|--------------------------|-------------------------------------------|------------------|
| [9]       | Direct<br>Modulation        | QPSK<br>(24 GBaud)        | Integrated<br>1-bit  | 24 GS/s      | 48<sup>II</sup>                  | -10<sup>IV</sup> | 144                      | +9<sup>III</sup>         | 165                                       | 250nm<br>InP     |
| [5]       | BB I/Q DAC<br>Up-conversion | 8PSK<br>(21.33 GBaud)     | External<br>DAC [59] | 64 GS/s      | 64                               | -13.8       | 240                           | -4.5                     | N/A                                       | 35nm<br>GaAs     |
| [10]      | IF DAC<br>Up-conversion     | 16QAM<br>(25 GBaud)       | External<br>AWG      | >50 GS/s     | 100                              | -16.5       | 287                           | +9.5<sup>III</sup>       | N/A                                       | 80nm<br>InP      |
| [14]      | Direct<br>Modulation        | OOK<br>(10 GBaud)         | Integrated<br>1-bit  | 10 GS/s      | 10<sup>V,II</sup>                | N/A         | 210                           | +5.13<sup>VI</sup>       | 240                                       | 32nm<br>CMOS SOI |
| [13]      | Direct<br>Modulation        | OOK<br>(12.5 GBaud)       | Integrated<br>1-bit  | 12.5 GS/s    | 12.5<sup>II</sup>                | -10<sup>IV</sup> | 115                      | +9.3                     | 59                                        | 55nm<br>SiGe     |
| [15]      | Direct<br>Modulation        | QPSK<br>(8 GBaud)         | Integrated<br>1-bit  | 8 GS/s       | 16<sup>II</sup>                  | N/A         | 240                           | +1<sup>VII</sup>         | 220                                       | 65nm<br>CMOS     |
| [20]      | BB I/Q DAC<br>Up-conversion | 64QAM<br>(3.52 GBaud)     | External<br>AWG      | >7.04 GS/s   | 21.12                            | -22         | 59.4<br>63.72                 | +7                       | 272                                       | 65nm<br>CMOS     |
| [21]      | BB I/Q DAC<br>Up-conversion | 64QAM<br>(7.04 GBaud)     | External<br>AWG      | >14.08 GS/s  | 42.24                            | -21.1       | 61.56                         | N/A                      | 169                                       | 65nm<br>CMOS     |
| [18]      | IF DAC<br>Up-conversion     | 32QAM<br>(21 GBaud)       | External<br>AWG      | >84 GS/s     | 105<sup>VIII</sup>               | -21         | 289-311                       | -5.5<sup>III</sup>       | 1400                                      | 40nm<br>CMOS     |
| [23]      | IF DAC<br>Up-conversion     | 16QAM<br>(15 GBaud)       | External<br>AWG      | >60 GS/s     | 60                               | -21.1       | 87.5-105                      | -9.4                     | 60<sup>IX</sup>                           | 65nm<br>CMOS     |
| [24]      | Direct<br>Modulation        | BPSK<br>(50 GBaud)        | Integrated<br>1-bit  | 50 GS/s      | 50                               | N/A         | 190                           | -6                       | 32                                        | 130nm<br>SiGe    |
| [25]      | BB I/Q DAC<br>Up-conversion | 16QAM<br>(25 GBaud)       | External<br>AWG      | >50 GS/s     | 100                              | -15.4       | 220-255                       | +1.5                     | 1410                                      | 130nm<br>SiGe    |
| This Work | Direct<br>Modulation        | 16QAM<br>(5 GBaud)        | Integrated<br>1-bit  | 5 GS/s       | 20                               | -15.8       | 115                           | +1                       | 520                                       | 180nm<br>SiGe    |

<sup>I</sup> Data rate / power dissipation in a single TX up-conversion channel.

<sup>II</sup> Measured without equalization.

<sup>III</sup> Measured with CW signal.

<sup>IV</sup> Measured from reported BER.

<sup>VI</sup> EIRP with 4.5 dBi antenna gain.

<sup>VIII</sup> Measured in a wired setup.

<sup>IX</sup> Total power consumption of two TX up-conversion channels divided by 2.

Table 2.2: Comparison with State-of-the-art High-speed Transmitters

measurement RX. A BER of  $2 \times 10^{-3}$  is inferred, which can be further improved to  $< 10^{-15}$ (error free) at the cost of 7% overhead in forward error correction (FEC) implementation [84].
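The inferred BER can be reproduced approximately from the measured EVM using the standard expression for Gray-coded square QAM in additive white Gaussian noise. This is a sketch under assumptions: the EVM is taken to be normalized to the average constellation power (so that SNR equals the inverse of the squared EVM), and all impairments are treated as Gaussian noise. The text does not state which conversion was actually used.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def ber_from_evm(evm_db, order=16):
    """Approximate BER of Gray-coded square QAM, assuming SNR = 1 / EVM^2."""
    snr = 10 ** (-evm_db / 10)
    bits_per_symbol = math.log2(order)
    scale = (4 / bits_per_symbol) * (1 - 1 / math.sqrt(order))
    return scale * q_function(math.sqrt(3 * snr / (order - 1)))


ber_20gbps = ber_from_evm(-15.8)  # on the order of 2e-3 for the 5 GBaud 16QAM case
ber_16gbps = ber_from_evm(-16.7)  # lower BER for the 4 GBaud case
```

For an EVM of -15.8 dB this estimate lands close to the $2 \times 10^{-3}$ figure quoted above.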

Table 2.2 compares the TX prototype with state-of-the-art high-speed TXs. BB data bits are directly converted into a high-speed 16QAM signal using the proposed method, circumventing high-speed-resolution DAC design. The total TX power consumption is 520 mW, while a high-speed-resolution DAC system by itself can already dissipate 750 mW of power [65]. The proposed architecture demonstrates a promising pathway toward fully integrated high-speed TX design as it achieves the highest modulation order among previously reported direct modulation TXs and a competitive 20 Gbps data rate. The data rate can be further improved with optimized front-end and BB circuit design.

## 2.6 Conclusion

This chapter proposed an innovative TX architecture for high-speed point-to-point wireless link applications. High-order  $4^{M}$-QAM constellations can be directly generated using the direct modulation method, completely removing the challenging D/A interface design. An analytical study of the proposed architecture explained the underlying advantages over conventional TX architectures and quantitatively characterized impairments due to circuit imperfections. An experimental wireless link at 110-115 GHz center frequency was demonstrated using 16QAM modulation. A 20 Gbps data rate was achieved at an EVM of -15.8 dB.

## Chapter 3

# LO Leakage Suppression in Wideband, Blocker-Tolerant, Highly Selective and Widely Tunable Receivers

## 3.1 Introduction and Motivation

With the proliferation of wireless standards, cellular systems typically have to incorporate multiple narrow-band receivers (RXs) to cover the entire operating frequency range and therefore require bulky and costly fixed-frequency filter banks. As shown in Fig. 3.1(a), it is highly desirable to have one programmable wideband RX replace the filter and RX bank. System cost and hardware overhead can be significantly reduced as a result. However, excellent RX linearity, blocker tolerance and noise performance are demanded due to the absence of dedicated off-chip filters.

Nevertheless, the lack of an LNA between the antenna and the passive switching devices immediately poses a few challenges. First, the impedance matching network typically embedded in the LNA is



Figure 3.1: (a) LO leakage without circuit mismatch. (b) LO leakage with amplitude and phase mismatch.

missing. This issue can be tackled with techniques proposed in [85] even when both the real and imaginary parts of antenna impedance are varying. In addition, the RX noise figure (NF) can be very limited when a passive lossy network precedes active gain stages at baseband (BB). Nevertheless, NF as low as 2 to 3 dB can be achieved with noise cancellation [86], positive resistive feedback [87] or simply large switch size followed by low-noise BB amplifiers [85,88]. Furthermore, the strong LO leakage through switch parasitic capacitance (shown in Fig. 3.1(b)) not only leads to RX desensitization due to self mixing but can also manifest itself as interference to nearby RXs. Cellular radio standards typically specify it as conducted spurious emission [89–91] and require an LO leakage as low as -79 dBm in the RX band. The method based on mixer switch calibration (MxDAC) [88] adds minimal overhead but is limited by random mismatch and only achieved -62 dBm LO leakage at 1.5 GHz. Another solution [92] employs multiple digital-to-analog converter (DAC) at BB to cancel the LO leakage at RF through up-conversion. It suppresses the LO leakage down to -80 dBm at 0.7 GHz. However, the DACs occupy a large die area due to mismatch requirement and inevitably inject extra noise into the signal path. [93] proposes a randomized LO sequence to spread the LO leakage spectrum and reduce its power spectrum density. However, only -60 dBm LO leakage suppression is achieved up to 1.4 GHz at the cost of extra switch in series and unwanted spurs around the LO frequency due to pseudo-random sequence clock. Various other techniques have also emerged to further improve filter roll-off and linearity [94–96], realize compact high-Q channel selection at RF [97–99], enhance harmonic blocker tolerance [100, 101], etc. However, high-Q selectivity at RF input has always been accompanied by an LO leakage of -67 dBm or higher. 
This chapter of the dissertation proposes a wideband RX that simultaneously realizes high-Q selectivity at RF input and low LO leakage of -80 dBm up to 2 GHz.

## 3.2 LO Leakage Bottleneck

Figure 3.2: (a) LO leakage without circuit mismatch. (b) LO leakage with amplitude and phase mismatch.

Fig. 3.2 illustrates a typical N-path structure driven by non-overlapping LO signals with a 1/N duty cycle. The realization of the low-pass impedance $Z_{LP}$ ranges from a simple 1<sup>st</sup> order RC network to a more complicated high-order active network [102]. The falling edge of the k<sup>th</sup> LO signal LO<sub>k</sub> comes first and couples through the parasitic capacitance $C_0$ to induce a negative pulse. After the non-overlap time, $LO_{k+1}$ turns on and causes a positive pulse on the source (antenna) impedance $R_s$. This doublet waveform repeats itself every $T_{LO}/N$ in a perfectly matched circuit setup $(T_{LO} = 1/f_{LO})$. The spectrum of this leakage waveform thus consists of fundamental and harmonic terms at integer multiples of $Nf_{LO}$. These leakage tones are far away from the RX band around $f_{LO}$ and have a more relaxed spurious emission requirement of -57 dBm [89–91]. Therefore, LO leakage poses very little concern when circuit mismatch is neglected. We can define a repetitive doublet function d(t) in the time domain that originates from the sharp falling and rising edges of the LO signals and repeats every $T_{LO}$. The full LO leakage waveform x(t) is written as:

$$x(t) = \sum_{k=1}^{N} d\left(t - (k-1)\frac{T_{LO}}{N}\right).$$
(3.1)

The spectrum of the LO leakage is

$$X(f) = D(f) \sum_{k=1}^{N} e^{-j2\pi f T_{LO} \frac{k-1}{N}}.$$
(3.2)

D(f) is the Fourier transform of d(t). It can be easily proven with Eq. (3.3) that the leakage spectrum evaluates to zero at  $f_{LO}$ , agreeing with the previous qualitative analysis.

$$\sum_{k=1}^{N} e^{-j2\pi \frac{k-1}{N}} = 0 \tag{3.3}$$
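As a quick numerical check of Eq. (3.2) and (3.3), the sketch below evaluates the phasor sum $X(f)/D(f)$ at $f = m f_{LO}$ for a perfectly matched N-path circuit. The function name `phasor_sum` and the chosen values of N are illustrative assumptions, not part of the original analysis.

```python
import cmath


def phasor_sum(n_paths: int, harmonic: int = 1) -> complex:
    """Return sum_{k=1..N} exp(-j*2*pi*m*(k-1)/N), i.e. X(f)/D(f) at f = m*f_LO."""
    return sum(
        cmath.exp(-2j * cmath.pi * harmonic * k / n_paths)
        for k in range(n_paths)  # k here plays the role of (k-1) in Eq. (3.2)
    )


if __name__ == "__main__":
    for n in (4, 8):  # example path counts (assumed)
        at_flo = abs(phasor_sum(n, harmonic=1))   # leakage factor at f_LO
        at_nflo = abs(phasor_sum(n, harmonic=n))  # leakage factor at N*f_LO
        print(f"N={n}: |sum| at f_LO = {at_flo:.2e}, at N*f_LO = {at_nflo:.2f}")
```

The sum vanishes (to floating-point precision) at $f_{LO}$ and at every harmonic that is not a multiple of $N f_{LO}$, while it equals N at $N f_{LO}$, in line with the qualitative argument above.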

Figure 3.3: LO leakage with amplitude and phase mismatch.

The identity in Eq. (3.3) can be graphically illustrated in Fig. 3.2, where N unit vectors with equally spaced angles between 0 and $2\pi$ sum to zero [103]. However, mismatch is an integral part of all circuit design and will lead to LO leakage at $f_{LO}$. Note that the waveform d(t) can be derived from an ideal pulse s(t) filtered by $C_0$ and $R_s$ (Fig. 3.3). The pulse width of s(t) is the non-overlap time of the LO signals. D(f) is:

$$D(f) = S(f) \frac{j2\pi f C_0 R_s}{1 + j2\pi f C_0 R_s}$$
(3.4)

where S(f) is the Fourier transform of s(t). Both  $R_s$  (50  $\Omega$  typical) and  $C_0$  are low in practical implementations such that we can safely assume  $1/R_sC_0 \gg f_{LO}$ . Therefore,  $D(f_{LO})$  can be approximated as:

$$D(f_{LO}) \approx j2\pi f_{LO} C_0 R_s S(f_{LO}). \tag{3.5}$$

As shown in Fig. 3.3, device mismatch leads to a random variation in parasitic coupling capacitance. The LO leakage at  $f_{LO}$  with this mismatch effect can be expressed as:

$$\begin{aligned}
X_m(f_{LO}) &\approx \sum_{k=1}^N j 2\pi f_{LO} \left( C_0 + \Delta C_k \right) R_s S(f_{LO}) e^{-j2\pi \frac{k-1}{N}} \\
&= j 2\pi f_{LO} R_s S(f_{LO}) \sum_{k=1}^N \Delta C_k e^{-j2\pi \frac{k-1}{N}}.
\end{aligned} \tag{3.6}$$

$\Delta C_k$ are independent random variables following a Gaussian distribution $\mathcal{N}(0, \sigma_{\Delta C}^2)$. As illustrated in Fig. 3.3, this random deviation in coupling capacitance leads to a slightly different doublet waveform every time a different switch turns on. The exact same waveform only appears every $T_{LO}$, when the same switch is turned on again, thus giving rise to a spectral component at $f_{LO}$. It is of interest to quantify the RX band LO leakage power: $|X_m(f_{LO})|^2/R_s$. Notice that the magnitude squared of the summation term in Eq. (3.6) is

$$\left|\sum_{k=1}^{N} \Delta C_{k} e^{-j2\pi \frac{k-1}{N}}\right|^{2} = \left|\sum_{k=1}^{N} \Delta C_{k} \cos \frac{2\pi (k-1)}{N} - j \sum_{k=1}^{N} \Delta C_{k} \sin \frac{2\pi (k-1)}{N}\right|^{2} = \sum_{k=1}^{N} \Delta C_{k}^{2}.$$
(3.7)

It can be shown that the summation of squared Gaussian variables in Eq. (3.7) follows a Gamma distribution $\Gamma(N/2, 2\sigma_{\Delta C}^2)$ with a mean and variance of $N\sigma_{\Delta C}^2$ and $2N\sigma_{\Delta C}^4$, respectively. Consequently, $|X_m(f_{LO})|^2$ follows a scaled Gamma distribution $\Gamma(N/2, 2\eta\sigma_{\Delta C}^2)$, where $\eta = (2\pi f_{LO}R_s)^2 |S(f_{LO})|^2$. Therefore, the mean RX band LO leakage power is

$$P_{Lkg}(f_{LO}) = \mathbf{E} \left\{ |X_m(f_{LO})|^2 / R_s \right\} = 4\pi^2 f_{LO}^2 R_s \left| S(f_{LO}) \right|^2 N \sigma_{\Delta C}^2.$$
(3.8)
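The mean value in Eq. (3.8) can be sanity-checked with a small Monte Carlo experiment. The sketch below only verifies the mean of the mismatch sum, $\mathbf{E}\{|\sum_k \Delta C_k e^{-j2\pi(k-1)/N}|^2\} = N\sigma_{\Delta C}^2$, which holds because the cross terms between independent, zero-mean $\Delta C_k$ average to zero. The normalized $\sigma_{\Delta C} = 1$, the trial count and the function name are assumptions made for illustration.

```python
import cmath
import random


def mean_mismatch_sum_sq(n_paths: int, sigma: float,
                         trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of E{|sum_k dC_k * exp(-j*2*pi*(k-1)/N)|^2}."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    rotations = [cmath.exp(-2j * cmath.pi * k / n_paths) for k in range(n_paths)]
    total = 0.0
    for _ in range(trials):
        # One random mismatch realization: independent Gaussian dC_k per path.
        s = sum(rng.gauss(0.0, sigma) * r for r in rotations)
        total += abs(s) ** 2
    return total / trials


if __name__ == "__main__":
    n, sigma = 8, 1.0  # normalized example values (assumed)
    estimate = mean_mismatch_sum_sq(n, sigma)
    print(f"estimated mean = {estimate:.3f}, N*sigma^2 = {n * sigma**2:.3f}")
```

Multiplying the estimated mean by $4\pi^2 f_{LO}^2 R_s |S(f_{LO})|^2$ gives the mean leakage power of Eq. (3.8).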

The relative mismatch of parasitic capacitance in MOS transistors can be written as [104]:

$$\frac{\sigma_{\Delta C}^2}{C_0^2} = \sigma_u^2 = A_m^2 / WL \tag{3.9}$$

where  $A_m$  is a process dependent mismatch parameter, W and L are the width and length of the switching device. Note that  $C_0$  is proportional to the switch size and  $C_0 = C_u W L$ . As a result, the mean RX band LO leakage power can be expressed by the practical design parameters as:

$$P_{Lkg}(f_{LO}) = 4\pi^2 f_{LO}^2 R_s \left| S(f_{LO}) \right|^2 A_m^2 C_u^2 NWL.$$
(3.10)
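Eq. (3.10) implies that, with $f_{LO}$, $R_s$, $S(f_{LO})$ and the process parameters held fixed, the mean leakage power scales linearly with the product $N \cdot WL$. A minimal sketch of this scaling in dB is given below; the function name and the example ratios are hypothetical.

```python
import math


def leakage_change_db(area_ratio: float, path_ratio: float = 1.0) -> float:
    """Change in mean LO leakage power (dB) per Eq. (3.10).

    area_ratio: new W*L divided by old W*L.
    path_ratio: new N divided by old N.
    All other factors in Eq. (3.10) are assumed to stay constant.
    """
    return 10.0 * math.log10(area_ratio * path_ratio)


if __name__ == "__main__":
    # Doubling the switch area raises the mean leakage power by about 3 dB.
    print(f"{leakage_change_db(2.0):.2f} dB")
    # Quadrupling the area while also going from 4 to 8 paths.
    print(f"{leakage_change_db(4.0, 2.0):.2f} dB")
```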

Despite the improved relative mismatch with a larger switch size, the overall LO leakage power becomes higher due to stronger capacitive coupling. To make matters worse, the N-phase LO signals also exhibit phase mismatch, which leads to non-zero LO leakage power at $f_{LO}$. The random variation in the LO path buffers can be mitigated by sharp rising and falling edges at the cost of higher switching power consumption. The remainder of the LO phase mismatch mostly comes from routing skews due to asymmetry. Regardless of its origin, the phase-mismatch-induced LO leakage at $f_{LO}$ directly couples to the RX input and is proportional to the switch size. However, a switch resistance much lower than $R_s$ (50 $\Omega$ typical) is crucial to the performance of N-path based RXs. Switch sizes as large as $180\mu m/30$nm [96] can be found in the literature for the purpose of low NF, high linearity and blocker attenuation. Although feedback can be introduced to reduce the necessary switch size [97], the resulting LO leakage is still too high for some sensitive applications.

A common remedy is to postpone N-path filtering to a wideband LNA's output [101,105–107] and suppress the leakage to RX input with LNA's reverse isolation. This typically reduces the RX band LO leakage to an acceptable level. Nevertheless, the wideband LNA's input is unprotected and sees large blockers without any filtering. Significant NF degradation can arise due to input compression by high power blockers.

## 3.3 Band-Pass Common-Gate Structure

As previously discussed, the inevitable direct switch connection to the low-impedance antenna interface is the root cause of the strong LO leakage in N-path based RXs. It eventually boils down to the fact that the BP impedance around $f_{LO}$ comes from the up-conversion of an LP (typically capacitive) impedance around DC (Fig. 3.4(a)), and switches are indispensable to any frequency translation. Alternatively, as shown in Fig. 3.4(a), modulating a resistance with a frequency-selective function serves the same purpose of filtering blockers while avoiding a direct switch connection to the antenna.



Figure 3.4: (a) Generation of frequency selective impedance. (b) Regulated common-gate to modulate input resistance.

### 3.3.1 High-Q Input Selectivity

Fig. 3.4(b) depicts a method of modulating input resistance in a regulated common-gate structure. Instead of biasing the gate of  $M_{in}$  with a fixed voltage, the gate of  $M_{in}$  senses an inverted and notch-filtered input voltage. The input resistance of this structure can be derived as:

$$R_{in}(f) = \frac{1}{g_m(1+A(f))}.$$
(3.11)

Figure 3.5: (a) Circuit realization of BPCG structure. (b) Circuit model for analysis.

The notch-shaped response A(f) is turned into a BP response in the input resistance $R_{in}$ through feedback. The notch filter has a low gain at $f_{LO}$ ( $A(f_{LO}) \ll 1$ ) and $R_{in}$ is simply $1/g_m$, the same as that in a conventional common-gate LNA. The notch filter should have a high gain ( $A_{OB}$ ) at the out-of-band (OB) frequency $f_{OB}$. Accordingly, the input resistance at $f_{OB}$ is reduced to $1/[g_m(1 + A_{OB})]$, which greatly suppresses the blocker voltage swing at the antenna interface. A circuit realization of this band-pass common-gate (BPCG) structure is shown in Fig. 3.5(a) with an N-path notch filter [108] in the feedback loop. N capacitors of $C_N$ are used in the notch filter. Two voltage gain stages with gains of $A_1$ and $A_2$ precede and succeed the N-path notch filter, respectively. The required high loop gain at OB frequencies is achieved by $A_{OB} = A_1 A_2$, assuming the notch filter is lossless in its pass-band. The loop gain at $f_{LO}$ is notched to a low value regardless of the gain of $A_1$ and $A_2$.

The circuit in Fig. 3.5(a) can be analyzed as in Fig. 3.5(b) to the first order to gain some design insights.  $R_1$ ,  $R_2$  model the output impedance of gain stages while  $R_L$  represents the load impedance for the N-path notch filter. The transfer characteristic of the N-path notch filter around  $f_{LO}$  can be accurately modeled as a parallel resonant tank. The component values  $R_p$ ,  $C_p$  and  $L_p$  are derived in [108] and repeated in Eq. (3.12).  $R_{on}$  in series with the resonant tank models the notch filter loss due to switch resistance.

$$R_{p} = \frac{N^{2} \sin^{2}(\pi/N)}{\pi^{2} - N^{2} \sin^{2}(\pi/N)} R_{T} = \alpha_{N} R_{T}$$

$$R_{T} = R_{1} + R_{on} + R_{L}$$

$$C_{p} = \frac{\pi^{2}}{2N \sin^{2}(\pi/N)} C_{N} = \beta_{N} C_{N}$$

$$L_{p} = \frac{1}{4\pi^{2} f_{LO}^{2} C_{p}}$$
(3.12)
The input impedance of BPCG can be derived as:

$$\begin{aligned}
Z_{in}(s) &= \frac{1}{g_m} \times \frac{s^2 L_p C_p R_p R_T + s L_p (R_p + R_T) + R_p R_T}{s^2 L_p C_p R_p (R_T + A_1 A_2 R_L) + s L_p (R_p + R_T + A_1 A_2 R_L) + R_p (R_T + A_1 A_2 R_L)} \\
&= \frac{1}{g_m} \times H_{in}(s)
\end{aligned} \tag{3.13}$$

where a resistance of  $1/g_m$  is multiplied by a frequency selective function  $H_{in}(s)$ .  $H_{in}(s)$  can be rearranged into:

$$H_{in}(s) = \frac{1}{1 + \frac{R_L}{R_T/(A_1 A_2)}} \times \frac{s^2 / \omega_{LO}^2 + s / (\omega_{LO} Q_n) + 1}{s^2 / \omega_{LO}^2 + s / (\omega_{LO} Q_d) + 1}$$

$$Q_n = (R_p / / R_T) \, \omega_{LO} C_p$$

$$Q_d = [R_p / / (R_T + A_1 A_2 R_L)] \, \omega_{LO} C_p$$
(3.14)

where  $\omega_{LO} = 2\pi f_{LO}$ . The pole pair of the 2<sup>nd</sup> order denominator mainly determines the BP response of  $H_{in}(s)$  and must be complex, which entails  $Q_d > 0.5$ . This condition is not hard to satisfy given the typical GHz range of LO frequencies and larger capacitance can always be used to increase the quality factor  $Q_d$ . When the same N-path network is directly used to interface with an antenna impedance of  $R_s$ , the overall network quality factor is

$$Q_0 = \frac{R_s}{1 + 1/\alpha_N} \omega_{LO} C_p \approx R_s \omega_{LO} C_p.$$
(3.15)

The approximation in Eq. (3.15) introduces little error for typical values of N, e.g.  $\alpha_4 = 4.28, \alpha_6 = 10.35, \alpha_8 = 18.86$ . For a low loss notch implementation  $(R_1 + R_{on} \ll R_L), Q_d$  can be simplified to

$$Q_d \approx [(1 + A_1 A_2) / / \alpha_N] R_L \omega_{LO} C_p.$$
(3.16)
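The $\alpha_N$ values quoted above follow directly from Eq. (3.12). The sketch below reproduces them and also evaluates the approximate $Q_d$ of Eq. (3.16) for a fixed total capacitance split over N paths. The operating point used in the example ($A_1A_2 = 20$, $R_L = 250\ \Omega$, $f_{LO} = 1$ GHz, 6 pF total) is an assumption chosen for illustration, and the function names are hypothetical.

```python
import math


def alpha(n: int) -> float:
    """alpha_N = R_p / R_T from Eq. (3.12)."""
    x = (n * math.sin(math.pi / n)) ** 2
    return x / (math.pi ** 2 - x)


def beta(n: int) -> float:
    """beta_N = C_p / C_N from Eq. (3.12)."""
    return math.pi ** 2 / (2 * n * math.sin(math.pi / n) ** 2)


def q_d(n: int, a1a2: float, r_l: float, f_lo: float, c_total: float) -> float:
    """Approximate Q_d from Eq. (3.16), with C_N = c_total / n per path."""
    c_p = beta(n) * c_total / n
    r_factor = 1.0 / (1.0 / (1.0 + a1a2) + 1.0 / alpha(n))  # (1 + A1A2) // alpha_N
    return r_factor * r_l * 2 * math.pi * f_lo * c_p


if __name__ == "__main__":
    for n in (4, 6, 8):
        print(f"N={n}: alpha={alpha(n):.2f}, beta={beta(n):.3f}")
    for n in (4, 8):
        # Assumed example operating point, see the lead-in above.
        print(f"N={n}: Q_d ~ {q_d(n, 20.0, 250.0, 1e9, 6e-12):.1f}")
```

Under these assumed values, moving from 4 to 8 paths raises $Q_d$ even though $C_p$ shrinks slightly for a fixed total capacitance, because the $\alpha_N$ term grows much faster.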

Comparing the results in Eq. (3.15) and (3.16), and noting the relationship between $C_p$ and the physical capacitance $C_N$ in Eq. (3.12), we can conclude that a larger N in the N-path implementation, in conjunction with a higher $A_1A_2$, helps increase the network quality factor and thus improves the front-end selectivity for a given center frequency and capacitance. The zero pair of the numerator in Eq. (3.14) has the same natural frequency of $\omega_{LO}$ but a much lower quality factor $Q_n$ compared to the denominator. Therefore, the zero pair negligibly affects the frequency response close to $f_{LO}$ and is mainly responsible for the OB response. Thus, the 3-dB RF bandwidth can be simply approximated by $f_{LO}/Q_d$.

At OB frequencies far away from  $\omega_{LO}$ , the input impedance approaches a constant resistance of

$$Z_{in}(OB) = \frac{1}{g_m} \times \frac{1}{1 + \frac{R_L}{(R_1 + R_{on} + R_L)/(A_1 A_2)}}.$$
(3.17)

Note that the term $R_1 + R_{on}$ is divided by $A_1A_2$. Although a non-zero $R_1 + R_{on}$ increases the pass-band loss of the notch filter and degrades OB attenuation, its effect is reduced by a factor of $A_1A_2$ in the feedback loop. Therefore, a moderate $R_1 + R_{on}$ can be tolerated to facilitate transistor-level implementation. The input impedance at $f_{LO}$ is calculated as:

$$Z_{in}(f_{LO}) = \frac{1}{g_m} \times \frac{1}{1 + \frac{A_1 A_2 R_L}{R_p + R_T}} = \frac{1}{g_m} \times \frac{1}{1 + \frac{A_1 A_2 A_n}{(1 + \alpha_N)}}.$$
(3.18)

Figure 3.6: BPCG input impedance with different N-path implementation and gain.

$A_n$ equals $R_L/R_T$ and represents the pass-band gain of the N-path notch filter. Although a higher $A_1A_2$ helps OB attenuation, it lowers the in-band input impedance and worsens the input S<sub>11</sub>. Fortunately, a higher N in the N-path implementation alleviates the problem, so that the input impedance drops only slightly. Simulation results of the BPCG input impedance with varying N and $A_1A_2$ are shown in Fig. 3.6. The following parameters are used in the simulation: $g_m = 20$ mS, $R_1 = 50\ \Omega$, $R_{on} = 20\ \Omega$, $R_L = 250\ \Omega$. The total capacitance used for both the N = 4 and N = 8 cases is 6 pF. Both selectivity and OB impedance are clearly improved with higher $A_1A_2$. A larger in-band resistance drop is observed for smaller N due to the smaller notch depth [108]. Therefore, higher gain in $A_1A_2$ and more paths in the N-path implementation are preferred for the sake of improving selectivity and OB attenuation while also maintaining input matching. This constructive effect of active gain helping improve selectivity is similar to the band-pass Miller effect [97] and the gain-boosted N-path [99] reported in prior work. However, the distinction of the proposed BPCG structure is that the N-path filter is embedded in the loop and not exposed to the antenna interface. The LO leakage can therefore be greatly reduced, as will be discussed in the following section.
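The trade-off described above can be illustrated with the first-order expressions in Eq. (3.17) and (3.18), using the component values listed for the simulation. The loop gain $A_1A_2 = 20$ used here is an assumed example value (the gains swept in Fig. 3.6 are not restated in the text), and the function names are hypothetical; this is a rough sketch, not a substitute for the simulated curves.

```python
import math


def alpha(n: int) -> float:
    """alpha_N = R_p / R_T from Eq. (3.12)."""
    x = (n * math.sin(math.pi / n)) ** 2
    return x / (math.pi ** 2 - x)


def z_in_flo(g_m: float, a1a2: float, r1: float, r_on: float,
             r_l: float, n: int) -> float:
    """In-band input impedance at f_LO, Eq. (3.18)."""
    r_t = r1 + r_on + r_l
    a_n = r_l / r_t  # pass-band gain of the notch filter
    return (1.0 / g_m) / (1.0 + a1a2 * a_n / (1.0 + alpha(n)))


def z_in_ob(g_m: float, a1a2: float, r1: float, r_on: float, r_l: float) -> float:
    """Out-of-band input impedance, Eq. (3.17)."""
    r_t = r1 + r_on + r_l
    return (1.0 / g_m) / (1.0 + a1a2 * r_l / r_t)


if __name__ == "__main__":
    # g_m, R_1, R_on and R_L are taken from the text; A1A2 is an assumption.
    params = dict(g_m=20e-3, a1a2=20.0, r1=50.0, r_on=20.0, r_l=250.0)
    for n in (4, 8):
        print(f"N={n}: Z_in(f_LO) ~ {z_in_flo(n=n, **params):.1f} ohm")
    print(f"Z_in(OB) ~ {z_in_ob(**params):.2f} ohm")
```

With these values the OB impedance is set by the loop gain alone, while the in-band impedance stays closer to $1/g_m$ for N = 8 than for N = 4, matching the qualitative trend stated above.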

### 3.3.2 LO Leakage Analysis

Figure 3.7: (a) N-path notch filter induced LO leakage from SWB1. (b) SWB2 LO leakage contribution.

Although switches are no longer directly connected to the antenna, the LO signals driving the N-path notch filter will inevitably leak to the RX input through different paths. As shown in Fig. 3.7, the N-path notch response is realized with switch-bank1 (SWB1) on the left and switch-bank2 (SWB2) on the right of the capacitors. Although either one of the switch banks suffices to synthesize the notch response, both are used so that the parasitic capacitance on both sides of the capacitors does not introduce low-frequency poles [97]. As will be discussed later, poles at a low frequency relative to $f_{LO}$ incur input impedance variation and loop stability issues. SWB1 and SWB2 contribute to the LO leakage at the RX input in different ways, and proper design can in fact keep the LO leakage magnitude within an acceptable level.

The LO signals driving SWB1 will couple to the output of $A_1$ and the left plate of the N-path capacitors as shown in Fig. 3.7(a). The leakage traveling left is attenuated by the reverse isolation of $A_1$ before appearing across the antenna impedance $R_s$. For the LO leakage from SWB1 to appear at the input of $A_2$ and then travel to $R_s$, some DC mismatch must be induced on the right plates of the N-path capacitors. This hypothetical DC mismatch would be up-converted to $f_{LO}$ via the periodic commutation among the right plates of the capacitors and manifest itself as LO leakage. However, DC signals cannot pass through the N-path capacitors, and therefore there will ideally be no LO leakage at the input of $A_2$ that is due to SWB1. In practice, higher harmonics of the LO signal (around $2f_{LO}, 3f_{LO}, \ldots$) will couple through the N-path capacitors and re-appear across $R_L$ after being frequency translated by SWB2 to $f_{LO}$. However, this contribution is low since higher harmonics have lower energy and will be heavily attenuated by the parasitic capacitance to ground on both sides of the capacitors. In short, the LO leakage contributed by SWB1 will be low.

Similar analysis can be carried out for the LO leakage originating from SWB2, shown in Fig. 3.7(b). The leakage signal traveling toward the left side is blocked by the N-path capacitors plus the reverse isolation of $A_1$ and can be ignored. The LO signal coupled to $R_L$ will experience a gain of $A_2$ and then reach the gate of M<sub>in</sub>. From the gate of M<sub>in</sub> to $R_s$, the leakage signal undergoes a gain of $g_m R_s/(1 + g_m R_s)$ to the first order. In a well-matched scenario ($g_m R_s = 1$), this amounts to a 6 dB attenuation of the LO leakage from SWB2. This implies that $A_2$ should be minimized to reduce LO leakage, in conflict with the high $A_1A_2$ requirement discussed in Section 3.3.1. Therefore, $A_1$ should be designed for a high gain while $A_2$ is kept low to mitigate LO leakage.

Figure 3.8: RX input mean LO leakage and OB input impedance at 1 GHz with different switch sizes and gain in $A_1$.

Fig. 3.8 plots the mean LO leakage power contributed by SWB1 and SWB2 individually, obtained from a Monte Carlo simulation with $f_{LO} = 1$ GHz. The LO leakage contributions from SWB1 and SWB2 are separated by replacing one of the switch banks with ideal switches. The LO leakage generally increases with larger switch size. However, SWB1's contribution is much lower than that of SWB2, and the combined LO leakage is dominated by the switch size of SWB2, in accordance with the previous analysis. Therefore, a size ratio of 4 to 1 between SWB1 and SWB2 was chosen in the proposed design.

One consequence of limiting the switch size to control LO leakage is that the pass-band notch filter loss will increase. This is reflected in an elevated OB input impedance, as shown in Fig. 3.8 with red dotted lines. Nevertheless, a larger switch size yields diminishing returns in OB rejection while the LO leakage continues to rise. Therefore, a moderate switch size based on the LO leakage target should be chosen to implement the N-path notch filter. An SWB2 size of $5\mu m/40$nm is chosen with a transmission gate design (both NMOS and PMOS width $2.5\mu m$) to handle potentially large voltage swings due to blockers. The complementary switch design also helps reduce the LO leakage amplitude. As for the degraded OB attenuation, higher gain in $A_1$ comes to the rescue, as shown in Fig. 3.8.

### 3.3.3 Parasitic Effects

A more comprehensive model is shown in Fig. 3.9(a) to include parasitic capacitors. $C_1$ models the output capacitance of $A_1$ plus the capacitance from the switches in SWB1, while $C_2$ models the output capacitance of $A_2$. The parasitic capacitances from SWB2 and the input of $A_2$ are lumped into $C_L$. At in-band (IB) frequencies around $f_{LO}$, $C_p$ and $L_p$ are in resonance and present an infinite impedance. The transfer function from node X to node Y can be approximated as:

$$A_{IB}(s) \approx \frac{-A_1 A_2 R_L}{R_p + R_T} \frac{1}{(1 + sR_1 C_1)(1 + sR_L C_L)(1 + sR_2 C_2)}.$$
(3.19)

Figure 3.9: (a) Circuit model of BPCG with parasitic capacitance. (b) In-band input impedance model and phasor illustration.

The three poles in Eq. (3.19) should be much higher than $f_{LO}$ so that they do not degrade the overall loop gain. Therefore, Eq. (3.19) can be further written as:

$$\begin{aligned}
A_{IB}(f_{LO}) &\approx \frac{-A_1 A_2 R_L}{R_p + R_T} e^{-j\theta_0} \approx \frac{-A_1 A_2 A_n}{1 + \alpha_N} (1 - j\theta_0) \\
\theta_0 &= 2\pi f_{LO} \left( R_1 C_1 + R_L C_L + R_2 C_2 \right).
\end{aligned} \tag{3.20}$$

The IB input impedance can be derived as:

$$\begin{aligned}
Z_{in}(f_{LO}) &= \frac{1}{g_m \left(1 - A_{IB}(f_{LO})\right)} = \frac{1}{g_m} \times \frac{1}{1 + \frac{A_1 A_2 A_n}{1 + \alpha_N} - j\theta_0 \frac{A_1 A_2 A_n}{1 + \alpha_N}} \\
&= \frac{1/g_m}{(1 + K_N)} / / \frac{j}{g_m \theta_0 K_N} \\
K_N &= A_1 A_2 A_n / (1 + \alpha_N).
\end{aligned} \tag{3.21}$$

The equivalent circuit suggested by Eq. (3.21) is depicted in Fig. 3.9(b) along with a graphical explanation. The second parallel impedance in Eq. (3.21) represents a frequency-dependent shunt inductance expressed as:

$$L_{IB}(f_{LO}) = \frac{1}{2\pi f_{LO} g_m \theta_0 K_N}.$$
(3.22)

This effect can be understood via the phasor illustration in Fig. 3.9(b). The phasor $\dot{V}_Y$ experiences a phase delay of $\theta_0$, so $\dot{V}_{sg}$ of M<sub>in</sub> is no longer in phase with $\dot{V}_X$. The phase of the resulting port current $\dot{I}_X$ lags that of $\dot{V}_X$, which signifies an inductive behavior of $Z_{in}$. This shunt inductive impedance can be driven to a very high value with a large N such that its impact on the input impedance is negligible, while $A_1A_2$ can be kept high for good OB attenuation and selectivity. However, a larger number of phases in the N-path filter complicates LO generation and distribution with proportionally higher power consumption. Typical values of N range from 4 [96] to 16 [109], and an 8-path design is adopted here.

Figure 3.10: (a) N-path notch filter with feed-forward phase correction. (b) Schematic of programmable gain amplifier (PGA).

Figure 3.11: Measured input $S_{11}$ variation at $f_{LO} = 1.6$ GHz with varying PGA gain.

An 8-path notch filter with $A_1A_2 = 20$ and a phase delay of $15^{\circ}$ generates a 190 $\Omega$ inductive impedance in shunt with the input node, which leads to a center-frequency shift. As shown in Fig. 3.10(a), feed-forward phase correction similar to [97] is implemented to compensate for the parasitic phase delay. One eighth of the capacitance is used in the feed-forward path to save area, and a PGA (Fig. 3.10(b)) adjusts the amount of phase compensation. The complementary PGA design minimizes gain variation over a large input dynamic range. The bandwidth of the PGA is limited by the output RC to less than 10 MHz so that it is only effective around IB frequencies. The measurement results in Fig. 3.11 demonstrate the effect of phase delay correction. An S<sub>11</sub> shift of 4 MHz around an $f_{LO}$ of 1.6 GHz can be corrected.
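The 190 $\Omega$ figure quoted above is consistent with the shunt term $j/(g_m\theta_0 K_N)$ in Eq. (3.21) if one assumes $g_m = 20$ mS (the value used in Section 3.3.1) and a lossless notch pass-band, $A_n \approx 1$. Both of these are assumptions made here to reproduce the number; the function names are hypothetical.

```python
import math


def alpha(n: int) -> float:
    """alpha_N = R_p / R_T from Eq. (3.12)."""
    x = (n * math.sin(math.pi / n)) ** 2
    return x / (math.pi ** 2 - x)


def shunt_inductive_reactance(g_m: float, a1a2: float, a_n: float,
                              n: int, theta0_deg: float) -> float:
    """Magnitude of the shunt inductive impedance 1/(g_m*theta0*K_N), Eq. (3.21)."""
    k_n = a1a2 * a_n / (1.0 + alpha(n))
    return 1.0 / (g_m * math.radians(theta0_deg) * k_n)


if __name__ == "__main__":
    # g_m = 20 mS and A_n = 1 are assumed; N, A1A2 and theta0 are from the text.
    x_l = shunt_inductive_reactance(g_m=20e-3, a1a2=20.0, a_n=1.0,
                                    n=8, theta0_deg=15.0)
    print(f"shunt inductive impedance ~ {x_l:.0f} ohm")
```

Dividing this reactance by $2\pi f_{LO}$ gives the equivalent inductance $L_{IB}$ of Eq. (3.22).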

### 3.3.4 Stability Analysis

Figure 3.12: (a) BPCG loop gain. (b) Equivalent model for stability analysis and an exemplary circuit realization of $A_1$.

Stability is always a concern when any form of feedback is involved. Stability metrics such as phase margin and gain margin are measured at a frequency where the loop gain drops to unity or below due to the roll-off of multiple poles. The addition of transmission zeros due to the notch filter reduces the loop gain within a narrow bandwidth around integer multiples of $f_{LO}$. This harmonic response effect [108, 110] is shown in the simulation results in Fig. 3.12(a) for $f_{LO} = 1$ GHz. The equivalent RLC resonators for the N-path notch at higher harmonics have lower Q and therefore the notch depth degrades at higher harmonics. This can be intuitively understood by noticing that increasing the input frequency by an integer factor m is equivalent to reducing the number of paths by a factor of m. For example, when a 2 GHz tone is fed to an 8-path notch filter clocked at 1 GHz, each capacitor sees 1/4 instead of 1/8 of the sinusoidal waveform when it is on. This operation is equivalent to a 4-path notch filter clocked at 2 GHz. As indicated by Eq. (3.12), $R_p$ will drop from 18.86$R_T$ to only 4.28$R_T$. Due to the narrow-band drop in loop gain, stability is less of a concern around integer multiples of $f_{LO}$. For OB frequencies, which occupy the majority of the spectrum, the notch filter acts as a resistive loss due to the on-resistance of SWB1 and SWB2. The loop gain magnitude is much higher at OB frequencies, which should be considered the worst-case scenario. Moreover, the notch harmonics around the unity-gain frequency have very little effect. The simulated loop gain without the notch response (N-path switches always on) is also plotted in Fig. 3.12(a), showing a close resemblance to the true loop gain when the notch filter is enabled. Thus, the model in Fig. 3.12(b) can be readily applied with good accuracy while avoiding a high-Q resonator in the stability analysis.

The dominant pole $(p_D)$ is assumed to be placed inside $A_1$ since it should provide the majority of the gain, while $A_2$ can only have a low gain due to the LO leakage concern discussed in Section 3.3.2. The loop gain can be written as follows:

$$LG(s) = -A_1 A_2 A_n \frac{g_m R_s}{1 + g_m R_s} \times \frac{1}{(1 + s/p_D) (1 + s/p_1) (1 + s/p_L) (1 + s/p_2) (1 + s/p_s)}$$
(3.23)

Four major parasitic poles exist in the BPCG loop due to four independent capacitors. The pole frequencies can be estimated using the open-circuit time constant method to provide some design guidelines. The poles $p_1$, $p_L$, $p_2$ and $p_s$ are associated with capacitors $C_1$, $C_L$, $C_2$ and $C_s$, respectively. Their estimated values are:

$$p_{1} = \frac{1}{[R_{1} / / (R_{L} + R_{on})] C_{1}} \approx \frac{1}{[R_{1} / / R_{L}] C_{1}}$$

$$p_{L} = \frac{1}{[R_{L} / / (R_{1} + R_{on})] C_{L}} \approx \frac{1}{[R_{1} / / R_{L}] C_{L}}$$

$$p_{2} = \frac{1}{R_{2}C_{2}}$$

$$p_{s} = \frac{1}{(1/g_{m} / / R_{s}) C_{s}}.$$
(3.24)

Assuming the dominant pole dictates loop gain roll off before the unity gain frequency, the unity gain frequency  $\omega_u$  can be approximated by the gain bandwidth product as:

$$\omega_u \approx p_D A_1 A_2 A_n \frac{g_m R_s}{1 + g_m R_s}.$$
(3.25)

In a well-matched condition, $p_s$ is at $2/(R_sC_s)$, a very high frequency. $p_2$ can also be easily set to a high frequency due to the low gain requirement of $A_2$. $p_1$ and $p_L$ are determined by roughly the same resistance if $R_{on}$ is low. The capacitor $C_1$ is typically larger than $C_L$. Recall that $C_1$ includes the output capacitance of $A_1$ and the parasitic capacitance from SWB1. $A_1$ usually involves large devices due to the high gain requirement, and SWB1 is sized larger than SWB2 due to its weaker LO leakage contribution. In contrast, $C_L$ can be much smaller due to the low gain in $A_2$ and the smaller size of SWB2, which is needed to control its LO leakage contribution. Therefore, $p_1$ is typically the first non-dominant pole and sets the stability margin. Once $p_1$ is set based on the available device cutoff frequency, OB attenuation, power consumption and LO leakage considerations, the dominant pole $p_D$ can be determined from Eq. (3.25). For example, for a 45° phase margin in a two-pole system, $\omega_u = p_1$ and $p_D \approx 2p_1/(A_1A_2A_n)$. The frequency of $p_D$ sets an upper limit for the LO frequency range. Operating at an $f_{LO}$ close to $p_D$ causes too much IB impedance variation due to the excessive phase delay introduced by $p_D$, which can no longer be corrected with the technique described in Section 3.3.3.
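The pole-placement rule above can be explored numerically with a simple all-pole loop-gain model in the spirit of Eq. (3.23). The sketch below finds the unity-gain frequency by bisection and returns the phase margin. The normalized example (DC loop gain of 100, $p_D = 1$, $p_1 = 100$) is hypothetical and chosen so that the gain-bandwidth product equals $p_1$; the function names are also assumptions.

```python
import math


def loop_gain_mag(w: float, lg0: float, poles: list) -> float:
    """|LG(jw)| for an all-pole loop gain with DC magnitude lg0."""
    mag = lg0
    for p in poles:
        mag /= math.hypot(1.0, w / p)
    return mag


def phase_margin_deg(lg0: float, poles: list) -> tuple:
    """Return (phase margin in degrees, unity-gain frequency)."""
    if lg0 <= 1.0:
        raise ValueError("DC loop gain must exceed unity for a crossover to exist")
    lo, hi = min(poles) * 1e-3, max(poles) * 1e3
    # |LG| decreases monotonically with w, so bisect on a log-frequency axis.
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if loop_gain_mag(mid, lg0, poles) > 1.0:
            lo = mid
        else:
            hi = mid
    w_u = math.sqrt(lo * hi)
    lag = sum(math.atan(w_u / p) for p in poles)  # total phase lag in radians
    return 180.0 - math.degrees(lag), w_u


if __name__ == "__main__":
    # Normalized two-pole example (assumed): lg0 * p_D = p_1.
    pm, w_u = phase_margin_deg(100.0, [1.0, 100.0])
    print(f"unity-gain frequency = {w_u:.1f}, phase margin = {pm:.1f} deg")
```

In this exact two-pole evaluation the crossover falls somewhat below $p_1$, so the margin comes out a few degrees above the 45° given by the asymptotic gain-bandwidth approximation. Additional parasitic poles such as $p_L$, $p_2$ and $p_s$ can be appended to the pole list to see how they erode the margin.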

An exemplary implementation of $A_1$ is also shown in Fig. 3.12(b). Two stages of CMOS class-AB amplifiers loaded by a low resistance $R_1$ realize a non-inverting gain with low output impedance. The dominant pole is set at the output of $M_{n0}$ and $M_{p0}$. $R_b$ is a high-impedance bias resistor, while the feedback resistor $R_f$ helps increase the dominant pole frequency and reduce the output impedance of $M_{n1}$ and $M_{p1}$. The amplifier can be treated as a simplified trans-conductor ( $G_m$ ) followed by a trans-impedance amplifier (TIA). As indicated in Fig. 3.12(a), the dominant pole is set at around 2.7 GHz with a unity-gain frequency around 19 GHz. The typical phase margin is 26° and is kept above 20° over PVT. The phase margin can be improved at the cost of a lower $p_D$ and thus a lower $f_{LO}$ upper limit. However, a relatively low phase margin is a minor issue here since the input signal frequency range is below $p_D$ and far below $\omega_u$. The BPCG response in the desired $f_{LO}$ range is negligibly affected by the low phase margin. This is in sharp contrast to most feedback circuitry, which processes input signals at frequencies up to $\omega_u$, roughly the closed-loop 3-dB BW. Moreover, the RX settling behavior in response to a step input when switching channel frequency is a secondary concern, since the settling time is usually dominated by the LO synthesizer, whose loop BW is much lower than $\omega_u$ of the BPCG.

## 3.4 Receiver Design

### 3.4.1 Noise and Linearity Consideration

The BPCG structure can be employed as a BPF in shunt with a wideband RX. The noise contributed by $A_1$, $R_1$ and $R_{on}$ at $f_{LO}$ is low since it can barely pass the notch filter and is thus not included in the noise analysis model in Fig. 3.13(a). The noise from $R_L$ and $A_2$ is combined as $\overline{i_{n,A_2}^2} = 4kT\gamma g_{m,A_2} + 4kTR_L g_{m,A_2}^2$, where $\gamma$ is the transistor thermal noise factor and $g_{m,A_2}$ is the trans-conductance in $A_2$. $A_2$ is realized as a CMOS class-AB stage loaded by $R_2$ so that $A_2 = g_{m,A_2}R_2$. $\overline{v_{n,RX}^2}$ represents the input-referred noise of a wideband RX with high input impedance. The noise factor can be expressed as:

$$F_{BPF} = 1 + \frac{\overline{v_{n,RX}^2}(1 + g_m R_s)^2}{4kTR_s} + \gamma g_m R_s + g_m^2 R_2 R_s + g_m^2 R_s \gamma A_2 R_2 + g_m^2 R_s A_2^2 R_L.$$
(3.26)

Figure 3.13: BPCG noise model when configured as: (a) BPF in parallel with a RX; (b) LNA stage in series with a RX chain.

Alternatively, the BPCG structure can be used in a way similar to a common-gate LNA stage followed by a down-converter, as shown in Fig. 3.13(b). A current-mode passive mixer followed by a BB TIA is used here for its exceptional linearity. $\overline{i_{n,mix}^2}$ models the input-referred noise current of the down-conversion path. The noise factor for this case is:

$$F_{LNA} = 1 + \frac{\overline{i_{n,mix}^2} (1 + g_m R_s)^2}{4kTR_s g_m^2} + \frac{\gamma}{g_m R_s} + \frac{R_2}{R_s} + \frac{\gamma A_2 R_2}{R_s} + \frac{A_2^2 R_L}{R_s}.$$
(3.27)

The 2<sup>nd</sup> through the 6<sup>th</sup> terms in Eqs. (3.26) and (3.27) represent the noise contributions from the rest of the RX chain, $M_{in}$, $R_2$, the CMOS class-AB stage in $A_2$, and $R_L$, respectively. In a well-matched design ($g_m R_s = 1$) with an ideal noiseless RX and down-converter, the noise factors in the above two cases converge to the same value:

$$F_{BPCG} = 1 + \gamma + \frac{R_2}{R_s} + \frac{\gamma A_2 R_2}{R_s} + \frac{A_2^2 R_L}{R_s}.$$
(3.28)

For a low  $A_2$  of 0 dB and a  $\gamma$  close to 1 in short channel devices, the noise factor is limited to  $2 + (2R_2 + R_L)/R_s$ . Although  $R_2$  can be very small for a low  $A_2$  implementation,  $R_L$  must be relatively large to minimize notch filter loss and results in an unacceptably high noise factor for many applications. This implies that a common-gate (CG) plus common-source (CS) noise cancellation topology [111,112] has to be employed for a low noise RX design.
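The convergence of Eqs. (3.26) and (3.27) to Eq. (3.28) can be checked numerically. The Python sketch below evaluates the three expressions for a matched design ($g_m R_s = 1$) with a noiseless RX and down-converter. The component values ($R_2 = 5\ \Omega$, $R_L = 100\ \Omega$) are illustrative assumptions and are not taken from the prototype.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]
T0 = 290.0          # reference temperature [K]


def f_bpf(gm, rs, r2, rl, a2, gamma, vn_rx_sq=0.0):
    """Eq. (3.26): BPCG used as a BPF in shunt with a wideband RX."""
    return (1.0
            + vn_rx_sq * (1.0 + gm * rs) ** 2 / (4.0 * K_B * T0 * rs)
            + gamma * gm * rs
            + gm ** 2 * r2 * rs
            + gm ** 2 * rs * gamma * a2 * r2
            + gm ** 2 * rs * a2 ** 2 * rl)


def f_lna(gm, rs, r2, rl, a2, gamma, in_mix_sq=0.0):
    """Eq. (3.27): BPCG used as an LNA stage ahead of the down-converter."""
    return (1.0
            + in_mix_sq * (1.0 + gm * rs) ** 2 / (4.0 * K_B * T0 * rs * gm ** 2)
            + gamma / (gm * rs)
            + r2 / rs
            + gamma * a2 * r2 / rs
            + a2 ** 2 * rl / rs)


def f_bpcg(rs, r2, rl, a2, gamma):
    """Eq. (3.28): matched design with noiseless RX and down-converter."""
    return 1.0 + gamma + r2 / rs + gamma * a2 * r2 / rs + a2 ** 2 * rl / rs


# Illustrative (assumed) values: 50-ohm source, A2 = 0 dB, gamma = 1.
rs, r2, rl, a2, gamma = 50.0, 5.0, 100.0, 1.0, 1.0
gm = 1.0 / rs  # input matching condition g_m * R_s = 1

f1 = f_bpf(gm, rs, r2, rl, a2, gamma)
f2 = f_lna(gm, rs, r2, rl, a2, gamma)
f3 = f_bpcg(rs, r2, rl, a2, gamma)
print(f"F_BPF = {f1:.3f}, F_LNA = {f2:.3f}, F_BPCG = {f3:.3f}")
print(f"NF = {10.0 * math.log10(f3):.2f} dB")
```

With these assumed values, the $R_L$ term alone contributes $R_L/R_s = 2$ to the noise factor, which illustrates why a large $R_L$ dominates the noise of the stand-alone BPCG structure.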

The linearity impact of including the BPCG structure in an RX design can be analyzed as in Fig. 3.14.

Figure 3.14: IM3 generation in BPCG structure.

Two interference tones located at OB frequencies of $f_1$ and $f_2$ see a low impedance from BPCG and their amplitudes are suppressed at the input. $A_1$ amplifies the interference tones at its output and generates a 3<sup>rd</sup> order intermodulation product (IM3) due to its non-linearity. Nevertheless, this IM3 generated by $A_1$ falls in-band and will be significantly attenuated by the notch filter. Therefore, the IM3 at the gate of M<sub>in</sub> is mainly produced by $A_2$ and is lower in magnitude than that generated by $A_1$ due to the low gain of $A_2$. The two OB interference tones also incur a large V<sub>gs</sub> on M<sub>in</sub> due to feedback and induce an IM3 current in the channel of M<sub>in</sub>. Dividing this IM3 current by $g_m$, we can group the IM3 contribution from both $A_2$ and M<sub>in</sub> into an IM3 voltage source in series with M<sub>in</sub>'s gate.

Similarly, all the noise contributions within BPCG can be combined together into one noise voltage source $\overline{v_{n,BPCG}^2}$ at the gate of M<sub>in</sub>. A noise and IM3 canceling RX architecture can be employed as shown in Fig. 3.15 to overcome the noise and linearity limitations set by the BPCG structure. The CG path is the same as that in Fig. 3.13(b) while the CS path is comprised of a trans-conductance $G_{m,CS}$ followed by a current mode passive mixer. The final output is taken as the difference between the CS and CG path outputs. Since the IM3 and noise sources are inserted at the same place within the BPCG structure, their transfer functions to the BB output are exactly the same.

Figure 3.15: Noise and IM3 canceling RX architecture.

Simultaneous noise and IM3 cancellation can be achieved if the following condition is satisfied:

$$\frac{g_m R_s}{1 + g_m R_s} G_{m,CS} G_{CS} R_{CS} = \frac{g_m}{1 + g_m R_s} G_{CG} R_{CG}$$
(3.29)

The left-hand side of the equality is the transfer function from the BPCG noise/IM3 source to the output of the CS path, $V_{o,CS}$, while the right-hand side is the corresponding transfer function in the CG path. $G_{CS}$ and $G_{CG}$ are the current conversion gains of the passive mixers in the CS and CG paths:

$$G_{CS} = \frac{Z_{o,CS}}{Z_{o,CS} + Z_{mix,CS}} \times a_{-1} \times e^{j\phi_{CS}}$$

$$G_{CG} = \frac{Z_{o,CG}}{Z_{o,CG} + Z_{mix,CG}} \times a_{-1} \times e^{j\phi_{CG}}$$

$$a_{-1} = \frac{\sin(\frac{\pi}{N})}{\pi} e^{-j\frac{\pi}{N}}$$
(3.30)

where  $a_{-1}$  represents the ideal N-phase mixer down-conversion gain. The terms  $e^{j\phi_{CS}}$  and  $e^{j\phi_{CG}}$  model the distinct LO phases in CS and CG down-conversion paths. The mixers' finite input impedance is denoted as  $Z_{mix,CS}$  and  $Z_{mix,CG}$  while the output impedance from transconductors is denoted as  $Z_{o,CS}$  and  $Z_{o,CG}$ . The condition in Eq. (3.29) must be satisfied in both magnitude and phase for perfect noise/IM3 cancellation. The mismatch between the impedance involved in Eq. (3.30) introduces both magnitude and phase imbalance. The magnitude mismatch is corrected by tuning the ratio between  $R_{CS}$  and  $R_{CG}$  while the phase mismatch can be compensated by tuning the phase difference,  $\phi_{CS} - \phi_{CG}$ , between the LO signals for CS and CG paths.
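The magnitude and phase tuning implied by Eqs. (3.29) and (3.30) can be illustrated with a short Python sketch. It computes the $R_{CS}/R_{CG}$ ratio and the LO phase difference $\phi_{CS} - \phi_{CG}$ that satisfy Eq. (3.29) for a given set of impedances. All impedance values and $R_{CG}$ below are hypothetical placeholders chosen only for illustration; $G_{m,CS}$ = 130 mS and $g_m$ = 20 mS follow the values quoted in this chapter.

```python
import cmath
import math


def a_minus1(n_phases):
    """Ideal N-phase mixer down-conversion gain a_{-1} of Eq. (3.30)."""
    return (math.sin(math.pi / n_phases) / math.pi
            * cmath.exp(-1j * math.pi / n_phases))


def mixer_gain(z_o, z_mix, n_phases, lo_phase=0.0):
    """Current conversion gain G_CS or G_CG of Eq. (3.30)."""
    return z_o / (z_o + z_mix) * a_minus1(n_phases) * cmath.exp(1j * lo_phase)


def cancellation_tuning(rs, gm_cs, z_o_cs, z_mix_cs, z_o_cg, z_mix_cg, n_phases=8):
    """Return (R_CS / R_CG, phi_CS - phi_CG) that satisfy Eq. (3.29)."""
    g_cs0 = mixer_gain(z_o_cs, z_mix_cs, n_phases)  # zero LO phase
    g_cg0 = mixer_gain(z_o_cg, z_mix_cg, n_phases)  # zero LO phase
    ratio = g_cg0 / (rs * gm_cs * g_cs0)
    return abs(ratio), cmath.phase(ratio)


# Hypothetical impedances [ohm]; only gm and gm_cs come from the text.
rs, gm, gm_cs = 50.0, 20e-3, 130e-3
z_o_cs, z_mix_cs = 150.0 - 60.0j, 20.0 + 0.0j
z_o_cg, z_mix_cg = 400.0 - 120.0j, 200.0 + 0.0j

r_ratio, dphi = cancellation_tuning(rs, gm_cs, z_o_cs, z_mix_cs, z_o_cg, z_mix_cg)

# Verify both sides of Eq. (3.29) with the tuned values.
r_cg = 1e3                 # assumed CG TIA feedback resistance [ohm]
r_cs = r_ratio * r_cg
lhs = gm * rs / (1 + gm * rs) * gm_cs * mixer_gain(z_o_cs, z_mix_cs, 8, dphi) * r_cs
rhs = gm / (1 + gm * rs) * mixer_gain(z_o_cg, z_mix_cg, 8, 0.0) * r_cg

print(f"|a_-1| for N = 8: {abs(a_minus1(8)):.4f}")
print(f"R_CS / R_CG = {r_ratio:.4f}, phi_CS - phi_CG = {math.degrees(dphi):.2f} deg")
print(f"residual |lhs - rhs| / |rhs| = {abs(lhs - rhs) / abs(rhs):.2e}")
```

The sketch treats the magnitude and phase corrections as independent knobs, which matches the calibration approach described above: the resistor ratio absorbs the magnitude mismatch and the LO delay absorbs the phase mismatch.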

With proper magnitude and phase tuning, the noise and IM3 induced by BPCG are perfectly canceled and the consequent noise factor can be derived as:

$$F = 1 + \frac{\overline{v_{n,CS}^2} (G_{m,CS} R_{CS})^2 |G_{CS}|^2 + \overline{i_{n,CG}^2} R_{CG}^2 |G_{CG}|^2}{4kT R_s |A_{RX}|^2}$$
(3.31)

where  $\overline{v_{n,CS}^2}$  is the input referred noise voltage of CS path,  $\overline{i_{n,CG}^2}$  is the input referred noise current of CG down-conversion path.  $A_{RX}$  represents the voltage conversion gain of the entire RX:

$$A_{RX} = \frac{G_{m,CS}R_{CS}G_{CS} + g_m R_{CG}G_{CG}}{1 + g_m R_s}$$
(3.32)

Applying the condition in Eq. (3.29) to Eqs. (3.31) and (3.32), the noise factor with ideal cancellation simplifies to:

$$F = 1 + \frac{\overline{i_{n,CG}^2}R_s^2 + \overline{v_{n,CS}^2}}{4kTR_s}.$$
(3.33)
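The simplification can be verified in two steps, using only Eqs. (3.29), (3.31) and (3.32). Cancelling the common factor $g_m/(1 + g_m R_s)$ in Eq. (3.29) gives $G_{CG} R_{CG} = R_s G_{m,CS} G_{CS} R_{CS}$, so that

$$A_{RX} = \frac{G_{m,CS} R_{CS} G_{CS} (1 + g_m R_s)}{1 + g_m R_s} = G_{m,CS} R_{CS} G_{CS}, \qquad R_{CG}^2 |G_{CG}|^2 = R_s^2 |A_{RX}|^2.$$

Substituting both results into Eq. (3.31), the factor $|A_{RX}|^2$ cancels between the numerator and the denominator, which leaves Eq. (3.33).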

Both noise contributors $\overline{v_{n,CS}^2}$ and $\overline{i_{n,CG}^2}$ should be minimized. The CS path trans-conductor design is shown in Fig. 3.16. The DC bias at the drain is set by the feedback circuit in a replica biasing block with one tenth of the core transistor size. The total core trans-conductance is 130 mS for low noise operation. The bias voltages of the core NMOS and PMOS are determined such that the large signal trans-conductance remains flat over a wide range of input voltage [86] for optimum linearity. A large resistor brings out the drain DC level for process monitoring and bias tuning.



Figure 3.16: CS path trans-conductor design.

Note that the CG path current noise $\overline{i_{n,CG}^2}$ appears in Eq. (3.33) without front-end attenuation. This is because the common-gate device M<sub>in</sub> only acts as a current buffer without any current gain. Therefore, the CG down-conversion path operates like a "mixer-first" RX path from a noise perspective. However, $\overline{i_{n,CG}^2}$ cannot be directly canceled by the RX architecture in Fig. 3.15, because it cannot appear at the input node and be sensed by $G_{m,CS}$ due to the reverse isolation of M<sub>in</sub>. Although the noise contribution from $\overline{i_{n,CG}^2}$ can be reduced by increasing the mixer switch size and employing a low noise TIA design at the cost of power consumption, linearity and blocker tolerance concerns compel us to look for an alternative CG down-conversion path.

# 3.4.2 Common-Gate Down-Conversion Path

As shown in Fig. 3.17, a high power blocker can accompany the weak desired signal and cause RX compression and saturation. The blocker voltage swing at RX input is suppressed by the low OB input impedance of BPCG. Therefore, the linearity of CS path is greatly improved.

However, the consequence of the blocker voltage suppression is that an enlarged blocker current must flow through BPCG.

Figure 3.17: CG down-conversion path with blocker sink and frequency translational feedback.

For example, a 0 dBm blocker with a BPCG OB impedance of 5 $\Omega$ will incur a peak current of 11.5 mA. The common-gate device must be able to handle such a large current without compression. Otherwise, the blocker-induced voltage swing seen by $G_{m,CS}$ will still be high. Due to the large $G_{m,CS}$ required for low noise operation, such a swing can easily saturate the CS path, leading to significant degradation in NF. Therefore, the common-gate transistor size is determined such that it can carry 11.5 mA of current when operating at the verge of the saturation region with the maximum V<sub>GS</sub> swing provided by BPCG feedback. A complementary design is also adopted as shown in Fig. 3.17. M<sub>in,N</sub> and M<sub>in,P</sub> handle the positive and negative peaks of the blocker current, respectively. To realize the desired small-signal trans-conductance of 20 mS for good input matching, the DC bias voltage must be low and the transistors work near the sub-threshold region. This biasing scheme helps improve trans-conductance linearity in a complementary design. One side effect is the reduced transistor cutoff frequency ($f_T$) with a low DC bias. However, this is not a major concern for most advanced CMOS technologies as their intrinsic $f_T$ is much higher than typical RF frequencies.
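The 11.5 mA figure quoted above can be reproduced with a simple source model. The Python sketch below assumes a 50 $\Omega$ source whose available power equals the blocker power and a purely resistive OB input impedance; both are simplifying assumptions made for this check.

```python
import math


def blocker_peak_current(p_dbm, z_in, rs=50.0):
    """Peak current and input voltage for a blocker of available power p_dbm.

    Assumes a resistive source rs driving a resistive input impedance z_in.
    """
    p_avail = 1e-3 * 10.0 ** (p_dbm / 10.0)     # available power [W]
    v_src_peak = math.sqrt(8.0 * rs * p_avail)  # open-circuit source peak voltage [V]
    i_peak = v_src_peak / (rs + z_in)           # peak current into the RX input [A]
    v_in_peak = i_peak * z_in                   # peak voltage at the RX input [V]
    return i_peak, v_in_peak


i_ob, v_ob = blocker_peak_current(0.0, 5.0)          # 5-ohm OB impedance
i_match, v_match = blocker_peak_current(0.0, 50.0)   # matched 50-ohm input

print(f"5 ohm : I_peak = {i_ob * 1e3:.1f} mA, V_peak = {v_ob * 1e3:.1f} mV")
print(f"50 ohm: I_peak = {i_match * 1e3:.1f} mA, V_peak = {v_match * 1e3:.1f} mV")
```

Comparing the two cases shows the trade-off discussed in this section: lowering the OB impedance reduces the voltage swing at the RX input but nearly doubles the current that the common-gate device has to carry.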

The large blocker current at the drain of $M_{in,N}$ and $M_{in,P}$ eventually has to be steered to a low impedance node and separated from the signal path. The same mixer and TIA topology used in the CS path could be directly connected to the drain of $M_{in,N}$ and $M_{in,P}$ to perform down-conversion. The blocker current would then be sunk by large BB capacitors to ground, and excessive voltage swing would be avoided. However, the mixer switches would still have to pass the blocker current. Both 2<sup>nd</sup> and 3<sup>rd</sup> order mixer non-linearity can generate IM tones that desensitize the RX. Furthermore, the non-zero mixer switch impedance is flat for all frequencies and will induce a large voltage swing that modulates the $g_{ds}$ of $M_{in,N}$ and $M_{in,P}$. This non-linear modulation leads to a higher IM3 if the cancellation in Fig. 3.15 is not perfect.

The CG down-conversion path shown in Fig. 3.17 is proposed and implemented to tackle the aforementioned challenges. A blocker sink circuit comprised of an active N-path [97,99] design provides a low-impedance path for the blocker current while maintaining a high IB impedance to minimize signal loss. The signal path input impedance should be high at blocker frequencies to prevent blocker current from entering the signal path. Meanwhile, a low IB impedance is desired to attract signal current. As a result of this BP impedance in parallel with a band-stop impedance, the blocker and signal currents are separated at the drain of $M_{in,N}$ and $M_{in,P}$. The blocker current is ultimately absorbed by the output of the amplifier in the blocker sink circuit. The voltage swing at the output of the blocker sink can be high. However, it does not affect RX linearity since it is not on the signal path.

The band-stop input impedance of the signal path is synthesized with frequency translational feedback [113]. A large feedback resistor ($R_{fb}$) with a high loop gain realizes a low IB impedance of around 200 $\Omega$ with a negligible noise penalty. The OB input impedance becomes high and approaches $R_{fb}$ because the bandwidth limitation in the BB amplifiers essentially breaks the feedback at large offset frequencies from $f_{LO}$. Together with the 20 mS trans-conductance of the common-gate stage, the 200 $\Omega$ impedance provides 12 dB of voltage gain for IB signals to suppress the noise contribution from later stages. A trans-conductor $G_{m,CG}$ further amplifies the signal and greatly relaxes the down-converter noise concern indicated in Eq. (3.33). $G_{m,CG}$ shares the same topology as $G_{m,CS}$ shown in Fig. 3.16 with one third of the trans-conductance. Note that the inclusion of $G_{m,CG}$ inverts the polarity of both signal and noise/IM3 in the CG path. The noise/IM3 cancellation should therefore be performed with a summation of the CS and CG path outputs instead of the subtraction in Fig. 3.15.
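The band-stop behavior and the 12 dB gain figure can be illustrated with a first-order model. The Python sketch below assumes a shunt-feedback input impedance $Z_{in} = R_{fb}/(1 + A)$ with a single-pole loop gain $A$, where the BB frequency maps directly to the offset from $f_{LO}$. The values of $R_{fb}$, the loop gain, and the BB bandwidth are hypothetical, chosen only so that the IB impedance equals the 200 $\Omega$ quoted above.

```python
import math

GM_IN = 20e-3  # common-gate trans-conductance quoted in this section [S]

# Hypothetical values (assumptions), chosen so that R_FB / (1 + A0) = 200 ohm.
R_FB = 5e3     # feedback resistor [ohm]
A0 = 24.0      # low-frequency loop gain of the BB amplifier
F_BB = 5e6     # BB amplifier bandwidth [Hz]


def z_in(offset_hz):
    """Signal path input impedance at a given offset from f_LO."""
    loop_gain = A0 / (1.0 + 1j * offset_hz / F_BB)
    return R_FB / (1.0 + loop_gain)


ib_gain_db = 20.0 * math.log10(GM_IN * abs(z_in(0.0)))
print(f"IB impedance   : {abs(z_in(0.0)):.0f} ohm")
print(f"IB voltage gain: {ib_gain_db:.2f} dB")
for offset in (20e6, 80e6, 1e9):
    print(f"|Z_in| at {offset / 1e6:6.0f} MHz offset: {abs(z_in(offset)):.0f} ohm")
```

In this simplified model, the input impedance rises toward $R_{fb}$ once the offset frequency exceeds the BB amplifier bandwidth, which is the mechanism that keeps blocker current out of the signal path.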

# 3.4.3 8-Phase LO Generation

The 8-phase 12.5% duty cycle non-overlap LO signals required by the 8-path notch filter, the CS and CG down-conversion mixers and the blocker sink are generated as shown in Fig. 3.18. A single-ended input at $8 \times f_{LO}$ is first amplified by high-speed buffers to sharpen the rising and falling edges. The buffered signal is then divided down to $f_{LO}$ by a cascade of three true single-phase clock (TSPC) $\div 2$ dividers. The divided signal at $f_{LO}$ is delayed by a tunable delay cell controlled by 4-bit coarse tuning plus varactor-based fine tuning. The delayed signal at $f_{LO}$ feeds a chain of TSPC D-flip-flops (DFFs) to generate 8-phase 50% duty cycle signals, i.e., CLK<sub>0</sub> - CLK<sub>7</sub>.

Figure 3.18: 8-phase 12.5% duty cycle non-overlap LO generation.

The rising and falling edges of these 50% duty cycle clock signals are triggered by the buffered high frequency ($8f_{LO}$) input signal. The additional phase noise on top of the original $8f_{LO}$ input LO mainly comes from the CMOS buffer and the clock-to-Q delay of the DFFs. The buffer is sized large enough so that its phase noise contribution is negligible compared to the source. The TSPC design of the DFFs reduces capacitive loading and sharpens the clock edges to minimize phase noise degradation. The clock edges are then combined by AND gates as shown in Fig. 3.18 to produce 12.5% duty cycle LO signals. The non-overlap time is guaranteed by skewing the Q and $\overline{Q}$ outputs of the DFFs to have a slightly slower rising edge. The necessary LO phase for noise and IM3 cancellation is achieved by tuning the relative delay between the CS and CG path LO generation. The LO signals for the N-path notch filter and the blocker sink are insensitive to their relative phase and are directly routed from the CS and CG path LO signals. The dynamic power consumption for driving all the switches and routing in the RX is around 15 mW per GHz.
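The divide, delay-chain and AND-combine sequence described above can be mimicked with a small behavioral model. The Python sketch below samples all signals once per period of the $8f_{LO}$ input clock. The specific gate mapping, LO<sub>k</sub> = CLK<sub>k</sub> AND NOT CLK<sub>k+1</sub>, is an assumption made for illustration, and the model uses ideal edges, so the non-overlap guard time obtained by edge skewing is not represented.

```python
N_PHASES = 8  # number of LO phases; the input clock runs at N_PHASES * f_LO


def divided_clock(n):
    """50% duty-cycle f_LO square wave, sampled once per input-clock period."""
    return 1 if n % N_PHASES < N_PHASES // 2 else 0


def generate_lo(n_steps):
    """Return the 50% duty-cycle clocks and the 12.5% duty-cycle LO phases."""
    # DFF chain: CLK_k is the divided clock delayed by k input-clock periods.
    clk = [[divided_clock(n - k) for n in range(n_steps)]
           for k in range(N_PHASES)]
    # Assumed AND combination: LO_k = CLK_k AND (NOT CLK_{k+1}).
    lo = [[clk[k][n] & (1 - clk[(k + 1) % N_PHASES][n]) for n in range(n_steps)]
          for k in range(N_PHASES)]
    return clk, lo


n_steps = 4 * N_PHASES  # simulate four LO periods
clk, lo = generate_lo(n_steps)
for k, phase in enumerate(lo):
    print(f"LO{k}: {''.join(str(bit) for bit in phase)}")
```

Each printed row is high for one input-clock period out of eight, and exactly one row is high at any time step, which corresponds to the 12.5% duty cycle and the mutually exclusive conduction windows that the N-path circuits rely on.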

# 3.4.4 Complete RX Implementation

The complete RX block diagram is shown in Fig. 3.19. A high-density stacked capacitor comprised of MOS, metal-oxide-metal (MOM) and metal-insulator-metal (MIM) capacitors in front of the BB TIA further suppresses the blocker voltage swing and improves RX linearity with little area overhead. 8-phase down-conversion with $3^{rd}$ and $5^{th}$ harmonic rejection (HR) is implemented. A resistance ratio of $29/12 \approx 2.417$ approximates the ideal ratio of $1 + \sqrt{2} \approx 2.414$ in an 8-phase implementation. Noise canceling (NC) summation is realized with an Op-Amp-based adder, which also provides additional gain and filtering.
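The impact of approximating $1 + \sqrt{2}$ by 29/12 can be estimated with a short calculation. The Python sketch below assumes that the eight BB phases are weighted with the sign pattern of $\cos(k\pi/4 - \pi/8)$ using two weight magnitudes in the ratio 29:12, noting that $\cos(22.5°)/\cos(67.5°) = 1 + \sqrt{2}$. This weight assignment is an assumption for illustration and may differ from the implemented network. The result only captures the limit set by ratio quantization and excludes device mismatch and the intrinsic roll-off of the LO harmonics.

```python
import cmath
import math


def harmonic_gain(weights, m):
    """Weighted sum of the N equally spaced LO phases at the m-th harmonic."""
    n = len(weights)
    return sum(w * cmath.exp(-2j * math.pi * m * k / n)
               for k, w in enumerate(weights))


def rejection_db(a, b, m):
    """Rejection of harmonic m relative to the fundamental for weights a:b."""
    # Assumed sign pattern of cos(k*pi/4 - pi/8) for k = 0..7.
    weights = [a, a, b, -b, -a, -a, -b, b]
    h1 = abs(harmonic_gain(weights, 1))
    hm = abs(harmonic_gain(weights, m))
    return 20.0 * math.log10(h1 / hm)


ideal = 1.0 + math.sqrt(2.0)
ratio_error = abs(29.0 / 12.0 - ideal) / ideal
hr3_db = rejection_db(29.0, 12.0, 3)
hr5_db = rejection_db(29.0, 12.0, 5)

print(f"relative ratio error: {ratio_error * 100:.3f} %")
print(f"3rd harmonic rejection limit: {hr3_db:.1f} dB")
print(f"5th harmonic rejection limit: {hr5_db:.1f} dB")
```

Under these assumptions, the error of the 29/12 approximation is about 0.1%, so the achievable harmonic rejection in practice would be set by component mismatch rather than by the choice of integer ratio.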



Figure 3.19: Block diagram of complete RX prototype.



Figure 3.20: (a) Die micrograph of the RX prototype. (b) Photo of PCB for measurement.

# 3.5 Measurements

The RX prototype has been fabricated in a 45 nm CMOS RF-SOI technology and occupies 1.05 mm<sup>2</sup> of active area. The die photo is shown in Fig. 3.20(a). The die is directly bonded to the PCB pads (Fig. 3.20(b)) using a chip-on-board package, which results in a 0.5 mm bond wire length for the RF input signal. An SPI interface sends control bits to the RX chip for tuning and calibration.

Figure 3.21: (a) Measured S<sub>11</sub> with $f_{LO}$ swept from 0.2 to 2 GHz. (b) Input impedance on Smith chart for 1 GHz $f_{LO}$.

The measured $S_{11}$ after de-embedding the PCB trace is shown in Fig. 3.21(a) with $f_{LO}$ varying from 0.2 to 2 GHz. High-Q input selectivity can be observed as a deep notch in $S_{11}$ around each LO frequency. The Smith chart in Fig. 3.21(b) demonstrates the impedance trajectory in a 200 MHz bandwidth around 1 GHz. The IB impedance at cursor 1 is well matched with a small capacitive component due to various parasitics. The OB impedance drops to around 8 $\Omega$ at 100 MHz offset from $f_{LO}$.

Figure 3.22: (a) RX-band LO leakage from 3 samples with $f_{LO}$ varying from 0.2 to 2.0 GHz. (b) Harmonic LO leakage with 0.4 GHz $f_{LO}$. (c) Harmonic LO leakage with 1.0 GHz $f_{LO}$.

The RX-band LO leakage at $f_{LO}$ is measured across the same LO frequency range and plotted in Fig. 3.22(a). It rises with increasing LO frequency and stays below -80 dBm up to 2.0 GHz, satisfying the requirements of most sensitive cellular applications. The harmonic LO leakage at integer multiples of $f_{LO}$ is also of interest in a wideband RX, since it may fall into the RX band of other nearby mobile devices operating with a different standard. The measured leakage spectrum from the fundamental to the 8<sup>th</sup> harmonic is shown in Fig. 3.22(b) and 3.22(c) for $f_{LO}$ of 0.4 and 1.0 GHz. Since some of the harmonic leakage frequencies are beyond the nominal LO frequency range, the impedance of a regular inductor-based RF choke starts to decrease significantly due to its limited self-resonance frequency. In order to avoid underestimating harmonic LO leakage, a wideband RF choke (Mini-Circuits<sup>®</sup> ADCH-80A+) working up to 10 GHz is used to measure harmonic leakage. The 8<sup>th</sup> harmonic leakage has the highest power since the leakage waveform should have an $8f_{LO}$ fundamental if mismatch does not exist. The 0.4 GHz $f_{LO}$ introduces a -67.3 dBm leakage interference for some LTE bands around 3.2 GHz. However, LTE usually has a more relaxed LO leakage requirement [114] due to its larger channel bandwidth. The 8<sup>th</sup> harmonic leakage of the 1 GHz $f_{LO}$, although it reaches -53.3 dBm, is well beyond typical RF bands and no longer poses any concern.



Figure 3.23: (a) Measured NF versus BB frequency with 1.0 GHz  $f_{LO}$ . (b) NF at 2 MHz BB offset for different LO frequencies.



Figure 3.24: Measured IIP2 and IIP3 at different offset frequencies.



Figure 3.25: (a) Blocker NF measurement setup. (b) NF as a function of blocker power at 80 MHz offset.

The small-signal NF measurement is shown in Fig. 3.23(a). The flicker noise corner is around 100 kHz and can be improved with a larger BB TIA input pair. The spot NF is taken at 2 MHz BB offset and plotted for varying LO frequencies as shown in Fig. 3.23(b). The NF from 0.2 to 2 GHz is below 2.5 dB. The noise canceling gain and phase calibration is performed by tuning the CS path BB TIA gain and LO delay while the CG path is fixed. The same procedure is carried out for each LO frequency and the overall RX gain is set to 40 dB in all NF measurements. Using the same gain and phase matching setup acquired for minimum NF, IIP2 and IIP3 are measured as shown in Fig. 3.24 with an $f_{LO}$ of 1 GHz. The OB-IIP3 reaches +14 dBm at 100 MHz offset frequency while the OB-IIP2 is +60 dBm.

| Reference | JSSC 2012<br>D. Murphy | JSSC 2014<br>J. Park | TMTT 2016<br>C. Wu | JSSC 2019<br>YC. Lien | JSSC 2011<br>J. Borremans | JSSC 2015<br>H. Hedayati | TMTT 2018<br>Y. Zhang | This Work |
|---|---|---|---|---|---|---|---|---|
| Architecture | Mixer-First<br>Noise Canceling | Mixer-First<br>Miller Bandpass | Mixer-First<br>MxDAC Calibration | Mixer-First<br>Bottom-Plate Mix | LNA-First<br>w/ N-path Load | LNA-First<br>Blk. Filtering Mixer | LNA-First<br>IM3 Canceling | BPCG<br>Noise Canceling |
| RF Input Selectivity | Yes | Yes | Yes | Yes | No | No | No | Yes |
| LO Leakage [dBm] | -65@2GHz | -67@2GHz | -62@1.5GHz | NR | NR | -82* | -88@2GHz | -80@2GHz |
| Frequency Range [MHz] | 80 - 2700 | 50 - 2500 | 400 - 3500 | 100 - 2000 | 400 - 6000 | 100 - 2800 | 500 - 2500 | 200 - 2000 |
| Gain [dB] | 72 | 38 | 35 | 16 | 70 | 50 | 40 | 40 |
| NF [dB] | 1.5 ~ 1.9 | 2.9 | 2.4 ~ 2.6 | 4.1 ~ 10.3 | 3.0 ~ 7.4 | 1.5 ~ 2.2 | 3.2 ~ 5.3 | 2.1 ~ 2.5 |
| 0dBm Blocker NF [dB] | 4.1 (Δf=80MHz) | 5.1 (Δf=20MHz) | 6.5 (Δf=50MHz) | 8.1 (Δf=80MHz) | 13 (Δf=20MHz) | 14 (Δf=50MHz) | NR | 6.7 (Δf=80MHz) |
| OB-IIP3 [dBm] | +13.5 | +10 | +16 | +44 | +10 | +5 | +32.5 | +14 |
| OB-IIP2 [dBm] | +54 | +52 | +60 | +90 | +70 | +50 | +54 | +60 |
| Supply Voltage [V] | 1.3 | 1.2 | 1.1 / 1.5 | 1.0 / 1.2 | 1.1 / 2.5 | 1.1 | 1.2 | 1.2 / 1.6 |
| RX Power [mW] | 35.1 ~ 78 | 20 @ 2GHz | 38 ~ 75 | 36 ~ 96 | 30 ~ 55 | 27 ~ 40 | 70* | 68 ~ 95 |
| Active Area [mm<sup>2</sup>] | 1.2 | 0.82 | 0.23 | 0.49 | 2 | 0.8 | 0.84 | 1.05 |
| Technology | 40nm | 65nm | 28nm | 28nm | 40nm | 40nm | 65nm | 45nm SOI |

Table 3.1: Comparison with State-of-the-art Wideband Receiver Designs

NR: Not Reported \*: Frequency not specified

The blocker NF is measured with the setup illustrated in Fig. 3.25(a). The blocker frequency is set to 892 MHz, the highest pass-band frequency of the filter available for our measurement. The blocker signal generator's far-out noise floor above 10 MHz offset is only around -140 dBc/Hz. A 0 dBm blocker therefore implies a -140 dBm/Hz noise floor feeding into the RX under test, which would significantly affect the NF measurement. The filter rejection 20 MHz away from the blocker tone is only around 25 dB, which is insufficient for blocker noise floor suppression. Therefore, the signal frequency is placed at an 80 MHz offset, where the filter provides more than 60 dB of attenuation. The NF versus blocker power is then measured and plotted in Fig. 3.25(b). The NF degrades by 4.5 dB in the presence of a 0 dBm blocker.
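The choice of the 80 MHz offset can be motivated with a simple noise budget. The Python sketch below estimates the apparent NF when the generator noise that leaks through the filter adds to the RX noise. It assumes a -174 dBm/Hz thermal floor at 290 K and takes the 6.7 dB blocker NF from Table 3.1 as the true value; the function name and the model itself are illustrative and are not part of the measurement procedure.

```python
import math

THERMAL_DBM_HZ = -174.0  # kT at 290 K [dBm/Hz]


def apparent_nf_db(nf_true_db, blocker_dbm, floor_dbc_hz, filter_rej_db):
    """Apparent NF when filtered generator noise adds to the RX noise."""
    # Generator noise density reaching the RX input at the signal frequency.
    leak_dbm_hz = blocker_dbm + floor_dbc_hz - filter_rej_db
    # Input-referred noise density of the RX itself, including source noise.
    rx_dbm_hz = THERMAL_DBM_HZ + nf_true_db
    total_mw_hz = 10.0 ** (rx_dbm_hz / 10.0) + 10.0 ** (leak_dbm_hz / 10.0)
    return 10.0 * math.log10(total_mw_hz) - THERMAL_DBM_HZ


nf_true = 6.7  # 0 dBm blocker NF reported in Table 3.1 [dB]
nf_20mhz = apparent_nf_db(nf_true, 0.0, -140.0, 25.0)  # ~25 dB filter rejection
nf_80mhz = apparent_nf_db(nf_true, 0.0, -140.0, 60.0)  # >60 dB filter rejection

print(f"apparent NF with 25 dB rejection: {nf_20mhz:.2f} dB")
print(f"apparent NF with 60 dB rejection: {nf_80mhz:.2f} dB")
```

With 25 dB of rejection, the leaked generator noise sits above the thermal floor and inflates the reading by several dB, whereas 60 dB of rejection pushes it far enough below the floor that the measurement error becomes negligible.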

The performance of the proposed RX is summarized in Table 3.1 and compared with state-of-the-art wideband RX designs. Compared to mixer-first designs, the LO leakage is improved by more than 13 dB while excellent NF, linearity and blocker tolerance are maintained. On the other hand, the proposed RX achieves an LO leakage level similar to that of LNA-first designs while the input compression problem is greatly alleviated and the 0 dBm blocker NF is much improved.

# 3.6 Conclusion

The design of a wideband RX with high-Q selectivity at the RF input and a low LO leakage level is described in this chapter. The BPCG structure at its core is proposed and analyzed, and a complete RX prototype is implemented and characterized. High-Q input selectivity facilitates a highly linear RX design that achieves +14 dBm OB-IIP3 and +60 dBm OB-IIP2 while the LO leakage is below -80 dBm up to 2 GHz. The NF is kept below 2.5 dB with a dual-path noise-canceling RX architecture. A 0 dBm blocker at 80 MHz offset only degrades the NF by 4.5 dB, demonstrating excellent blocker tolerance.

# Bibliography

- [1] R. Keyes, "Moore's law today," *IEEE Circuits and Systems Magazine*, vol. 8, no. 2, pp. 53–54, 2008.
- [2] S. Cherry, "Edholm's law of bandwidth," *IEEE Spectrum*, vol. 41, pp. 58–60, Jul 2004.
- [3] Qualcomm, "The essential role of Gigabit LTE & LTE Advanced Pro in a 5G world [Online] Available at: https://www.qualcomm.com/documents/essential-role-gigabitlte-lte-advanced-pro-5g-world."
- [4] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, "5G backhaul challenges and emerging research directions: A survey," *IEEE Access*, vol. 4, pp. 1743–1766, 2016.
- [5] S. Kandula, J. Padhye, and V. Bahl, "Flyways to de-congest data center networks," *Proc. of Hot Nets*, Aug 2009.
- [6] T. Benson, A. Anand, A. Akella, and M. Zhang, "Understanding data center traffic characteristics," *ACM SIGCOMM Computer Communication Review*, vol. 40, pp. 65–72, Jan 2010.
- [7] A. S. Hamza, J. S. Deogun, and D. R. Alexander, "Wireless communication in data centers: A survey," *IEEE Communications Surveys & Tutorials*, vol. 18, no. 3, pp. 1572–1595, 2016.
- [8] M. Urteaga, Z. Griffith, M. Seo, J. Hacker, and M. J. W. Rodwell, "InP HBT technologies for THz integrated circuits," *Proc. IEEE*, vol. 105, pp. 1051–1067, Jun 2017.
- [9] K. Takano, S. Amakawa, K. Katayama, S. Hara, R. Dong, A. Kasamatsu, I. Hosako, K. Mizuno, K. Takahashi, T. Yoshida, and M. Fujishima, "A 105Gb/s 300GHz CMOS transmitter," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2017.
- [10] K. Katayama, K. Takano, S. Amakawa, S. Hara, A. Kasamatsu, K. Mizuno, K. Takahashi, T. Yoshida, and M. Fujishima, "A 300 GHz CMOS transmitter with 32-QAM 17.5 Gb/s/ch capability over six channels," *IEEE J. Solid-State Circuits*, vol. 51, pp. 3037–3048, Dec 2016.
- [11] N. Dolatsha, B. Grave, M. Sawaby, C. Chen, A. Babveyh, S. Kananian, A. Bisognin, C. Luxey, F. Gianesello, J. Costa, C. Fernandes, and A. Arbabian, "A compact 130GHz fully packaged point-to-point wireless system with 3D-printed 26dBi lens antenna achieving 12.5Gb/s at 1.55pJ/b/m," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2017.

- [12] Z. Wang, P.-Y. Chiang, P. Nazari, C.-C. Wang, Z. Chen, and P. Heydari, "A CMOS 210-GHz fundamental transceiver with OOK modulation," *IEEE J. Solid-State Circuits*, vol. 49, pp. 564–580, Mar 2014.
- [13] S. Kang, S. V. Thyagarajan, and A. M. Niknejad, "A 240 GHz fully integrated wideband QPSK transmitter in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 50, pp. 2256–2267, Oct 2015.
- [14] Y. Kim, B. Hu, Y. Du, A. Tang, H.-N. Chen, C. Jou, J. Cong, T. Itoh, and M.-C. F. Chang, "A 20Gb/s 79.5mW 127GHz CMOS transceiver with digitally pre-distorted PAM-4 modulation for contactless communications," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2018.
- [15] S. Hara, K. Katayama, K. Takano, R. Dong, I. Watanabe, N. Sekine, A. Kasamatsu, T. Yoshida, S. Amakawa, and M. Fujishima, "A 32Gbit/s 16QAM CMOS receiver in 300GHz band," in *IEEE MTT-S Int. Micrw. Symp. Dig.*, Jun 2017.
- [16] R. Wu, R. Minami, Y. Tsukui, S. Kawai, Y. Seo, S. Sato, K. Kimura, S. Kondo, T. Ueno, N. Fajri, S. Maki, N. Nagashima, Y. Takeuchi, T. Yamaguchi, A. Musa, K. K. Tokgoz, T. Siriburanon, B. Liu, Y. Wang, J. Pang, N. Li, M. Miyahara, K. Okada, and A. Matsuzawa, "64-QAM 60-GHz CMOS transceivers for IEEE 802.11ad/ay," *IEEE J. Solid-State Circuits*, vol. 52, pp. 2871–2891, Nov 2017.
- [17] J. Pang, S. Maki, S. Kawai, N. Nagashima, Y. Seo, M. Dome, H. Kato, M. Katsuragi, K. Kimura, S. Kondo, Y. Terashima, H. Liu, T. Siriburanon, A. T. Narayanan, N. Fajri, T. Kaneko, T. Yoshioka, B. Liu, Y. Wang, R. Wu, N. Li, K. K. Tokgoz, M. Miyahara, K. Okada, and A. Matsuzawa, "A 128-QAM 60GHz CMOS transceiver for IEEE 802.11ay with calibration of LO feedthrough and I/Q imbalance," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2017.
- [18] K. K. Tokgoz, S. Maki, S. Kawai, N. Nagashima, J. Emmei, M. Dome, H. Kato, J. Pang, Y. Kawano, T. Suzuki, T. Iwai, Y. Seo, K. Lim, S. Sato, L. Ning, K. Nakata, K. Okada, and A. Matsuzawa, "A 56Gb/s W-band CMOS wireless transceiver," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2016.
- [19] K. K. Tokgoz, S. Maki, J. Pang, N. Nagashima, I. Abdo, S. Kawai, T. Fujimura, Y. Kawano, T. Suzuki, T. Iwai, K. Okada, and A. Matsuzawa, "A 120Gb/s 16QAM CMOS millimeter-wave wireless transceiver," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2018.
- [20] D. Fritsche, P. Starke, C. Carta, and F. Ellinger, "A low-power SiGe BiCMOS 190-GHz transceiver chipset with demonstrated data rates up to 50 Gbit/s using on-chip antennas," *IEEE Trans. Microw. Theory Techn.*, vol. 65, pp. 3312–3323, Sep 2017.

- [21] P. Rodriguez-Vazquez, J. Grzyb, B. Heinemann, and U. R. Pfeiffer, "A 16-QAM 100-Gb/s 1-m wireless link with an EVM of 17% at 230 GHz in an SiGe technology," *IEEE Microw. Wireless Compon. Lett.*, pp. 1–3, 2019.
- [22] N. Sarmah, J. Grzyb, K. Statnikov, S. Malz, P. R. Vazquez, W. Foerster, B. Heinemann, and U. R. Pfeiffer, "A fully integrated 240-GHz direct-conversion quadrature transmitter and receiver chipset in SiGe technology," *IEEE Trans. Microw. Theory Techn.*, vol. 64, pp. 562–574, Feb 2016.
- [23] J. Grzyb, P. R. Vazquez, N. Sarmah, B. Heinemann, and U. R. Pfeiffer, "A 240 GHz high-speed transmission link with highly-integrated transmitter and receiver modules in SiGe HBT technology," in *IEEE 42nd Int. Conf. Infrared Milli. Terahz. Waves*, Aug 2017.
- [24] P. Rodriguez-Vazquez, J. Grzyb, B. Heinemann, and U. R. Pfeiffer, "Performance evaluation of a 32-QAM 1-meter wireless link operating at 220-260 GHz with a datarate of 90 Gbps," in *IEEE Asia-Pacific Microw. Conf.*, Nov 2018.
- [25] P. Rodriguez-Vazquez, J. Grzyb, N. Sarmah, B. Heinemann, and U. R. Pfeiffer, "Towards 100 Gbps: A fully electronic 90 Gbps one meter wireless link at 230 GHz," in *IEEE 48th Eur. Microw. Conf.*, Sep 2018.
- [26] M. Elkhouly, Y. Mao, C. Meliani, F. Ellinger, and C. Schyett, "A 245 GHz ASK modulator and demodulator with 40 Gbits/sec data rate in 0.13 μm SiGe BiCMOS technology," in *IEEE MTT-S Int. Micrw. Symp. Dig.*, Jun 2013.
- [27] S. Lee, R. Dong, T. Yoshida, S. Amakawa, S. Hara, A. Kasamatsu, J. Sato, and M. Fujishima, "An 80Gb/s 300GHz-band single-chip CMOS transceiver," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2019.
- [28] B. Heinemann, H. Rucker, R. Barth, F. Barwolf, J. Drews, G. G. Fischer, A. Fox, O. Fursenko, T. Grabolla, F. Herzel, J. Katzer, J. Korn, A. Kruger, P. Kulse, T. Lenke, M. Lisker, S. Marschmeyer, A. Scheit, D. Schmidt, J. Schmidt, M. A. Schubert, A. Trusch, C. Wipf, and D. Wolansky, "SiGe HBT with ft/fmax of 505 GHz/720 GHz," in *IEEE Int. Electro Devices Meeting*, Dec 2016.
- [29] GlobalFoundries, "45RFSOI: Advanced 45nm RF SOI technology [Online] Available at: https://www.globalfoundries.com/sites/default/files/product-briefs/pb-45rfsoi.pdf."
- [30] TOWERjazz, "Towerjazz high performance SiGe BiCMOS processes [Online] Available at: https://towersemi.com/technology/rf-and-hpa/sige-bicmos-platform/."
- [31] STMicroelectronics, "BiCMOS055 technology offer. [Online] Available at: http://cmp.imag.fr/img/pdf/04-kt-bicmos055-overview2016-3.pdf."
- [32] GlobalFoundries, "SiGe 9HP: 90nm SiGe BiCMOS technology. [Online] Available at: https://www.globalfoundries.com/sites/default/files/product-briefs/sige-9hpproduct-brief-113018.pdf."

- [33] IHP, "Low-volume & multi-project service. [Online] Available at: https://www.ihpmicroelectronics.com/en/services/mpw-prototyping/sigec-bicmos-technologies.html," 2019.
- [34] H. Takahashi, T. Kosugi, A. Hirata, J. Takeuchi, K. Murata, and N. Kukutsu, "120-GHz-band fully integrated wireless link using QSPK for realtime 10-Gbit/s transmission," *IEEE Trans. Microw. Theory Techn.*, vol. 61, pp. 4745–4753, Dec 2013.
- [35] H. Takahashi, A. Hirata, J. Takeuchi, N. Kukutsu, T. Kosugi, and K. Murata, "120-GHz-band 20-Gbit/s transmitter and receiver MMICs using quadrature phase shift keying," in *IEEE 7th Eur. Microw. Integr. Circuit Conf.*, pp. 313–316, Oct 2012.
- [36] H.-J. Song, J.-Y. Kim, K. Ajito, N. Kukutsu, and M. Yaita, "50-Gb/s direct conversion QPSK modulator and demodulator MMICs for terahertz communications at 300 GHz," *IEEE Trans. Microw. Theory Techn.*, vol. 62, pp. 600–609, Mar 2014.
- [37] C. Jiang, A. Cathelin, and E. Afshari, "A high-speed efficient 220-GHz spatial-orthogonal ASK transmitter in 130-nm SiGe BiCMOS," *IEEE J. Solid-State Circuits*, vol. 52, pp. 2321–2334, Sep 2017.
- [38] K. Okada, R. Minami, Y. Tsukui, S. Kawai, Y. Seo, S. Sato, S. Kondo, T. Ueno, Y. Takeuchi, T. Yamaguchi, A. Musa, R. Wu, M. Miyahara, and A. Matsuzawa, "A 64-QAM 60GHz CMOS transceiver with 4-channel bonding," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2014.
- [39] I. Kallfass, J. Antes, T. Schneider, F. Kurz, D. Lopez-Diaz, S. Diebold, H. Massler, A. Leuther, and A. Tessmann, "All active MMIC-based wireless communication at 220 GHz," *IEEE Trans. Terahertz Science and Technology*, vol. 1, pp. 477–487, Nov 2011.
- [40] I. Kallfass, F. Boes, T. Messinger, J. Antes, A. Inam, U. Lewark, A. Tessmann, and R. Henneberger, "64 Gbit/s transmission over 850 m fixed wireless link at 240 GHz carrier frequency," *J. Infrared Milli. Terahz. Waves*, vol. 36, pp. 221–233, Jan 2015.
- [41] I. Kallfass, I. Dan, S. Rey, P. Harati, J. Antes, A. Tessmann, S. Wagner, M. Kuri, R. Weber, H. Massler, A. Leuther, T. Merkle, and T. Kürner, "Towards MMIC-based 300GHz indoor wireless communication systems," *IEICE Trans. Electron.*, vol. E98.C, pp. 1081–1090, Dec 2015.
- [42] S. Carpenter, D. Nopchinda, M. Abbasi, Z. S. He, M. Bao, T. Eriksson, and H. Zirath, "A D-Band 48-Gbit/s 64-QAM/QPSK direct-conversion I/Q transceiver chipset," *IEEE Trans. Microw. Theory Techn.*, vol. 64, pp. 1285–1296, Apr 2016.
- [43] H. Hamada, T. Fujimura, I. Abdo, K. Okada, H.-J. Song, H. Sugiyama, H. Matsuzaki, and H. Nosaka, "300-GHz. 100-Gb/s InP-HEMT wireless transceiver using a 300-GHz fundamental mixer," in *IEEE MTT-S Int. Microw. Symp. Dig.*, Jun 2018.
- [44] J. Savoj, A. Abbasfar, A. Amirkhany, M. Jeeradit, and B. W. Garlepp, "A 12-GS/s phase-calibrated CMOS digital-to-analog converter for backplane communications," *IEEE J. Solid-State Circuits*, vol. 43, pp. 1207–1216, May 2008.

- [45] Y. M. Greshishchev, D. Pollex, S.-C. Wang, M. Besson, P. Flemeke, S. Szilagyi, J. Aguirre, C. Falt, N. Ben-Hamida, R. Gibbins, and P. Schvan, "A 56GS/s 6b DAC in 65nm CMOS with 256x6b memory," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2011.
- [46] A. Nazemi, K. Hu, B. Catli, D. Cui, U. Singh, T. He, Z. Huang, B. Zhang, A. Momtaz, and J. Cao, "A 36Gb/s PAM4 transmitter using an 8b 18GS/s DAC in 28nm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2015.
- [47] J. Cao, D. Cui, A. Nazemi, T. He, G. Li, B. Catli, M. Khanpour, K. Hu, T. Ali, H. Zhang, H. Yu, B. Rhew, S. Sheng, Y. Shim, B. Zhang, and A. Momtaz, "A transmitter and receiver for 100Gb/s coherent networks with integrated 4x64GS/s 8b ADCs and DACs in 20nm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2017.
- [48] J. G. Proakis and M. Salehi, *Digital Communications*. McGraw Hill, 5th ed., 2008.
- [49] A. Momtaz and M. M. Green, "An 80 mW 40 Gb/s 7-tap T/2-spaced feed-forward equalizer in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 45, pp. 629–639, Mar 2010.
- [50] A. V. den Bosch, M. Steyaert, and W. Sansen, "An accurate statistical yield model for CMOS current-steering D/A converters," in *IEEE ISCAS*, May 2000.
- [51] M. Pelgrom, A. Duinmaijer, and A. Welbers, "Matching properties of MOS transistors," *IEEE J. Solid-State Circuits*, vol. 24, pp. 1433–1439, Oct 1989.
- [52] VirginiaDiodes, "Spectrum/signal analyzer extension modules operational manual."
- [53] T. Miki, Y. Nakamura, M. Nakaya, S. Asai, Y. Akasaka, and Y. Horiba, "An 80-MHz 8-bit CMOS D/A converter," *IEEE J. Solid-State Circuits*, vol. 21, pp. 983–988, Dec 1986.
- [54] Y. Nakamura, T. Miki, A. Maeda, H. Kondoh, and N. Yazawa, "A 10-b 70-MS/s CMOS D/A converter," *IEEE J. Solid-State Circuits*, vol. 26, pp. 637–642, Apr 1991.
- [55] A. V. den Bosch, M. Borremans, J. Vandenbussche, G. V. der Plas, A. Marques, J. Bastos, M. Steyaert, G. Gielen, and W. Sansen, "A 12 bit 200 MHz low glitch CMOS D/A converter," in *IEEE Proc. CICC*, Apr 1998.
- [56] J. Bastos, A. Marques, M. Steyaert, and W. Sansen, "A 12-bit intrinsic accuracy high-speed CMOS DAC," *IEEE J. Solid-State Circuits*, vol. 33, pp. 1959–1969, Dec 1998.
- [57] C.-H. Lin and K. Bult, "A 10-b, 500-MSample/s CMOS DAC in 0.6 mm<sup>2</sup>," *IEEE J. Solid-State Circuits*, vol. 33, pp. 1948–1958, Dec 1998.
- [58] A. van den Bosch, M. Borremans, M. Steyaert, and W. Sansen, "A 10-bit 1-GSample/s Nyquist current-steering CMOS D/A converter," *IEEE J. Solid-State Circuits*, vol. 36, pp. 315–324, Mar 2001.

- [59] G. V. D. Plas, J. Vandenbussche, W. Sansen, M. Steyaert, and G. Gielen, "A 14-bit intrinsic accuracy Q<sup>2</sup> random walk CMOS DAC," *IEEE J. Solid-State Circuits*, vol. 34, pp. 1708–1718, Dec 1999.
- [60] C.-H. Lin, F. M. L. van der Goes, J. R. Westra, J. Mulder, Y. Lin, E. Arslan, E. Ayranci, X. Liu, and K. Bult, "A 12 bit 2.9 GS/s DAC with IM3 $\ll$ -60 dBc beyond 1 GHz in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 44, pp. 3285–3293, Dec 2009.
- [61] J. Xiao, B. Chen, T. Y. Kim, N. Wang, X. Chen, T. Chih, K. Raviprakash, H. Chen, R. Gomez, and J. Y. C. Chang, "A 13-bit 9GS/s RF DAC-based broadband transmitter in 28nm CMOS," in *IEEE Symp. VLSI Circuits*, pp. 262–263, Jun 2013.
- [62] S. Park, G. Kim, S.-C. Park, and W. Kim, "A digital-to-analog converter based on differential-quad switching," *IEEE J. Solid-State Circuits*, vol. 37, pp. 1335–1338, Oct 2002.
- [63] B. Schafferer and R. Adams, "A 3V CMOS 400mW 14b 1.4GS/s DAC for multi-carrier applications," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2004.
- [64] G. Engel, S. Kuo, and S. Rose, "A 14b 3/6GHz current-steering RF DAC in 0.18μm CMOS with 66dB ACLR at 2.9GHz," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2012.
- [65] FujitsuSemiconductor, "Factsheet LEIA 55 65 GSa/s 8-bit DAC."
- [66] H. Wang, H. Mohammadnezhad, D. Dimlioglu, and P. Heydari, "A 100-120GHz 20Gbps bits-to-RF 16QAM transmitter using 1-bit digital-to-analog interface," in *IEEE CICC*, to appear Apr 2019.
- [67] H. Wang, H. Mohammadnezhad, and P. Heydari, "Analysis and design of high-order QAM direct-modulation transmitter for high-speed point-to-point mm-wave wireless links," *IEEE J. Solid-State Circuits*, vol. 54, pp. 3161–3179, Nov 2019.
- [68] S. Shopov, O. D. Gurbuz, G. M. Rebeiz, and S. P. Voinigescu, "A D-Band digital transmitter with 64-QAM and OFDM free-space constellation formation," *IEEE J. Solid-State Circuits*, vol. 53, pp. 2012–2022, Jul 2018.
- [69] H. Al-Rubaye and G. M. Rebeiz, "W-band direct-modulation >20-Gb/s transmit and receive building blocks in 32-nm SOI CMOS," *IEEE J. Solid-State Circuits*, vol. 52, pp. 2277–2291, Sep 2017.
- [70] T. H. Lee, *The Design of CMOS Radio-Frequency Integrated Circuits*. Cambridge University Press, 2006.
- [71] A. Georgiadis, "Gain, phase imbalance, and phase noise effects on error vector magnitude," *IEEE Trans. Veh. Technol.*, vol. 53, pp. 443–449, Mar 2004.
- [72] J. Chen, D. Kuylenstierna, S. E. Gunnarsson, Z. S. He, T. Eriksson, T. Swahn, and H. Zirath, "Influence of white LO noise on wideband communication," *IEEE Trans. Microw. Theory Techn.*, vol. 66, pp. 3349–3359, Jul 2018.

- [73] S. Kang, J.-C. Chien, and A. M. Niknejad, "A W-Band low-noise PLL with a fundamental VCO in SiGe for millimeter-wave applications," *IEEE Trans. Microw. Theory Techn.*, vol. 62, pp. 2390–2404, Oct 2014.
- [74] T. Siriburanon, S. Kondo, M. Katsuragi, H. Liu, K. Kimura, W. Deng, K. Okada, and A. Matsuzawa, "A low-power low-noise mm-wave subsampling PLL using dual-step-mixing ILFD and tail-coupling quadrature injection-locked oscillator for IEEE 802.11ad," *IEEE J. Solid-State Circuits*, vol. 51, pp. 1246–1260, May 2016.
- [75] J. Antes and I. Kallfass, "Performance estimation for broadband multi-gigabit millimeter- and sub-millimeter-wave wireless communication links," *IEEE Trans. Microw. Theory Techn.*, vol. 63, pp. 3288–3299, Oct 2015.
- [76] M. R. Khanzadi, D. Kuylenstierna, A. Panahi, T. Eriksson, and H. Zirath, "Calculation of the performance of communication systems from measured oscillator phase noise," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 61, pp. 1553–1565, May 2014.
- [77] P. Heydari, "Neutralization techniques for high-frequency amplifiers: An overview," *IEEE Solid-State Circuits Magazine*, vol. 9, pp. 82–89, Nov 2017.
- [78] A. Natarajan, S. K. Reynolds, M.-D. Tsai, S. T. Nicolson, J.-H. C. Zhan, D. G. Kam, D. Liu, Y.-L. O. Huang, A. Valdes-Garcia, and B. A. Floyd, "A fully-integrated 16-element phased-array receiver in SiGe BiCMOS for 60-GHz communications," *IEEE J. Solid-State Circuits*, vol. 46, pp. 1059–1075, May 2011.
- [79] B. Xia, L.-S. Wu, S.-W. Ren, and J.-F. Mao, "A balanced-to-balanced power divider with arbitrary power division," *IEEE Trans. Microw. Theory Techn.*, vol. 61, pp. 2831–2840, Aug 2013.
- [80] L.-S. Wu, Y.-X. Guo, and J.-F. Mao, "Balanced-to-balanced Gysel power divider with bandpass filtering response," *IEEE Trans. Microw. Theory Techn.*, vol. 61, pp. 4052–4062, Dec 2013.
- [81] M. Luo, X. Xu, X.-H. Tang, and Y.-H. Zhang, "A compact balanced-to-balanced filtering Gysel power divider using $\lambda_g/2$ resonators and short-stub-loaded resonator," *IEEE Microw. Wireless Compon. Lett.*, vol. 27, pp. 645–647, Jul 2017.
- [82] H. Mohammadnezhad, H. Wang, and P. Heydari, "Analysis and design of a wideband, balun-based, differential power splitter at mm-Wave," *IEEE Trans. Circuits Syst. II, Express Briefs*, vol. 65, pp. 1629–1633, Nov 2018.
- [83] D. Adler and R. Popovich, "Broadband switched-bit phase shifter using all-pass networks," in *IEEE MTT-S Int. Microw. Symp. Dig.*, July 1991.
- [84] F. Chang, K. Onohara, and T. Mizuochi, "Forward error correction for 100 G transport networks," *IEEE Communications Magazine*, vol. 48, pp. S48–S55, Mar 2010.

- [85] C. Andrews and A. C. Molnar, "A passive mixer-first receiver with digitally controlled and widely tunable RF interface," *IEEE J. Solid-State Circuits*, vol. 45, pp. 2696–2708, Dec 2010.
- [86] D. Murphy, H. Darabi, A. Abidi, A. A. Hafez, A. Mirzaei, M. Mikhemar, and M.-C. F. Chang, "A blocker-tolerant, noise-cancelling receiver suitable for wideband wireless applications," *IEEE J. Solid-State Circuits*, vol. 47, pp. 2943–2963, Dec 2012.
- [87] C. Izquierdo, A. Kaiser, F. Montaudon, and P. Cathelin, "Reconfigurable wide-band receiver with positive feed-back translational loop," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, pp. 1–4, Jun 2011.
- [88] C. Wu, Y. Wang, B. Nikolic, and C. Hull, "An interference-resilient wideband mixer-first receiver with LO leakage suppression and I/Q correlated orthogonal calibration," *IEEE Trans. Microw. Theory Techn.*, vol. 64, pp. 1088–1101, Apr 2016.
- [89] "Recommended minimum performance standards for CDMA2000 spread spectrum mobile stations (3GPP2 C.S0011-E Version 2.0), Mar 2014."
- [90] "Digital cellular telecommunication system (Phase2+); Radio transmission and reception (3GPP TS 45.005 version 15.0.0 Release 15), Jul 2018."
- [91] "Universal mobile telecommunication system; User equipment radio transmission and reception (3GPP TS 25.101 version 10.1.0 Release 10), May 2011."
- [92] S. Jayasuriya, D. Yang, and A. Molnar, "A baseband technique for automated LO leakage suppression achieving -80dBm in wideband passive mixer-first receivers," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, pp. 450–452, Sep 2014.
- [93] C. M. Thomas, V. W. Leung, and L. E. Larson, "A pseudorandom clocking scheme for a CMOS n-path bandpass filter with 10-to-15 dB spurious leakage improvement," in *IEEE Radio and Wireless Symp. (RWS)*, Jan 2015.
- [94] M. Darvishi, R. van der Zee, and B. Nauta, "Design of active N-path filters," *IEEE J. Solid-State Circuits*, vol. 48, pp. 2962–2976, Dec 2013.
- [95] Y.-C. Lien, E. A. M. Klumperink, B. Tenbroek, J. Strange, and B. Nauta, "Enhanced-selectivity high-linearity low-noise mixer-first receiver with complex pole pair due to capacitive positive feedback," *IEEE J. Solid-State Circuits*, vol. 53, pp. 1348–1360, May 2018.
- [96] Y.-C. Lien, E. A. M. Klumperink, B. Tenbroek, J. Strange, and B. Nauta, "High-linearity bottom-plate mixing technique with switch sharing for n-path filters/mixers," *IEEE J. Solid-State Circuits*, vol. 54, pp. 323–335, Feb 2019.
- [97] J. W. Park and B. Razavi, "Channel selection at RF using Miller bandpass filters," *IEEE J. Solid-State Circuits*, vol. 49, pp. 3063–3078, Dec 2014.
- [98] S. Youssef, R. van der Zee, and B. Nauta, "Active feedback technique for RF channel selection in front-end receivers," *IEEE J. Solid-State Circuits*, vol. 47, pp. 3130–3144, Dec 2012.
- [99] Z. Lin, P.-I. Mak, and R. P. Martins, "A sub-GHz multi-ISM-band ZigBee receiver using function-reuse and gain-boosted n-path techniques for IoT applications," *IEEE J. Solid-State Circuits*, vol. 49, pp. 2990–3004, Dec 2014.
- [100] D. Murphy, H. Darabi, and H. Xu, "A noise-cancelling receiver resilient to large harmonic blockers," *IEEE J. Solid-State Circuits*, vol. 50, pp. 1336–1350, Jun 2015.
- [101] Y. Xu, J. Zhu, and P. R. Kinget, "A blocker-tolerant RF front end with harmonic-rejecting N-Path filter," *IEEE J. Solid-State Circuits*, vol. 53, pp. 327–339, Feb 2018.
- [102] R. Chen and H. Hashemi, "Dual-carrier aggregation receiver with reconfigurable front-end RF signal conditioning," *IEEE J. Solid-State Circuits*, vol. 50, pp. 1874–1888, Aug 2015.
- [103] A. Oppenheim, *Discrete-Time Signal Processing*. Upper Saddle River: Pearson, 2010.
- [104] M. Pelgrom, A. Duinmaijer, and A. Welbers, "Matching properties of MOS transistors," *IEEE J. Solid-State Circuits*, vol. 24, pp. 1433–1439, Oct 1989.
- [105] J. Borremans, G. Mandal, V. Giannini, B. Debaillie, M. Ingels, T. Sano, B. Verbruggen, and J. Craninckx, "A 40 nm CMOS 0.4-6 GHz receiver resilient to out-of-band blockers," *IEEE J. Solid-State Circuits*, vol. 46, pp. 1659–1671, Jul 2011.
- [106] H. Hedayati, W.-F. A. Lau, N. Kim, V. Aparin, and K. Entesari, "A 1.8 dB NF blocker-filtering noise-canceling wideband receiver with shared TIA in 40 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 50, pp. 1148–1164, May 2015.
- [107] Y. Zhang, J. Zhu, and P. R. Kinget, "An out-of-band IM3 cancellation technique using a baseband auxiliary path in wideband LNTA-based receivers," *IEEE Trans. Microw. Theory Techn.*, vol. 66, pp. 2580–2591, Jun 2018.
- [108] A. Ghaffari, E. A. M. Klumperink, and B. Nauta, "Tunable n-path notch filters for blocker suppression: Modeling and verification," *IEEE J. Solid-State Circuits*, vol. 48, pp. 1370–1382, Jun 2013.
- [109] C.-K. Luo and J. F. Buckwalter, "A 0.25-to-2.25 GHz, 27 dBm IIP3, 16-path tunable bandpass filter," *IEEE Microw. Wireless Compon. Lett.*, vol. 24, pp. 866–868, Dec 2014.
- [110] A. Ghaffari, E. A. M. Klumperink, M. C. M. Soer, and B. Nauta, "Tunable high-Q N-path band-pass filters: Modeling and verification," *IEEE J. Solid-State Circuits*, vol. 46, pp. 998–1010, May 2011.
- [111] S. C. Blaakmeer, E. A. M. Klumperink, D. M. W. Leenaerts, and B. Nauta, "The Blixer, a wideband balun-LNA-I/Q-mixer topology," *IEEE J. Solid-State Circuits*, vol. 43, pp. 2706–2715, Dec 2008.

- [112] S. C. Blaakmeer, E. A. M. Klumperink, D. M. W. Leenaerts, and B. Nauta, "Wideband Balun-LNA with simultaneous output balancing, noise-canceling and distortion-canceling," *IEEE J. Solid-State Circuits*, vol. 43, pp. 1341–1350, Jun 2008.
- [113] X. He and H. Kundur, "A compact SAW-less multiband WCDMA/GPS receiver front-end with translational loop for input matching," in *IEEE ISSCC Dig. Tech. Papers*, Feb 2011.
- [114] "LTE; Evolved Universal Terrestrial Radio Access (E-UTRA); User equipment radio transmission and reception (3GPP TS 36.101 version 12.5.0 Release 12), Nov 2014."