#### **UC Berkeley**

#### **UC Berkeley Electronic Theses and Dissertations**

#### **Title**

Design Techniques for Fully Integrated Switched-Capacitor Voltage Regulators

#### **Permalink**

https://escholarship.org/uc/item/9fx1b70t

#### **Author**

LE, HANH-PHUC

#### **Publication Date**

2013

Peer reviewed|Thesis/dissertation

#### **Design Techniques for Fully Integrated Switched-Capacitor Voltage Regulators**

by

#### Hanh Phuc Le

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Engineering – Electrical Engineering and Computer Sciences in the **Graduate Division** of the University of California, Berkeley

Committee in charge:

Professor Elad Alon, Co-Chair

Professor Seth R. Sanders, Co-Chair

Professor Clark T.-C. Nguyen

Professor Paul K. Wright

Fall 2013

Copyright 2013 by Hanh Phuc Le

#### **Abstract**

As parallelism increases the number of cores integrated onto a chip, there is a clear need for fully integrated DC-DC converters to enable efficient on-die power management. Due to the availability of high-density, low-series-resistance capacitors in existing CMOS processes, switched-capacitor DC-DC converters have recently gained significant interest as a cost-effective means of enabling such power management functionality. This thesis describes design techniques for implementing fully integrated switched-capacitor DC-DC converters with high power density and efficiency.
The area required by a fully integrated switched-capacitor DC-DC converter to deliver a certain level of power to the load has direct implications for both cost and efficiency. Chapter 2 therefore presents a methodology to predict and minimize the losses of such a converter operating at a given power density. Chapter 3 introduces gate driver and level shifter circuit design strategies that enable topology reconfiguration and hence efficient generation of a wider range of output voltages. To demonstrate the possibility of replacing all off-chip PMICs, Chapter 4 presents a battery-connected switched-capacitor DC-DC converter that uses cascode switches and intermediate voltage rails to convert the wide input voltage range of a Li-ion battery to an output regulated at ~1V. The SC converter in Chapter 4 also employs a fast control loop to regulate the output with sub-ns response times.

Measured results from the converters presented in Chapters 3 and 4 match the analytical predictions and thus confirm the design methodology presented in Chapter 2. The 32nm SOI prototype presented in Chapter 3 achieves ~80% efficiency at a power density of ~0.5-1W/mm² for a 2:1 step-down converter operating from a 2V input and utilizing only standard MOS capacitors. Reconfiguration of the converter's topology enables it to maintain greater than 70% efficiency for most of the output voltage range from 0.7V to ~1.15V. The 65nm Bulk CMOS prototype discussed in Chapter 4 also utilizes only standard MOS capacitors to regulate the output voltage at ~1V from a ~2.9V-4V input. It achieves ~73% efficiency at 0.19 W/mm² output power density and maintains efficiency above 72% over the whole range of target power density. The sub-ns response control loop keeps the voltage droop below 76 mV on a 1V regulated output under a full load step of $0 \rightarrow 0.253 \text{ A/mm}^2$ in 50ps.
Given that these results were achieved in a standard CMOS process with no modifications or additions, they illustrate that fully integrated switched-capacitor converters are indeed a promising candidate for low-cost but efficient power management on a per-core or per-functional-unit basis. They could potentially replace all off-chip PMICs and passive components, freeing up significant PCB area for new functions in next-generation mobile devices.

To Viet-Dung and Ha-Linh, my love

To My Family, my dedication

To My Advisors and Friends, my sincere thanks

To Lions, a toast

## **Contents**

| Section | Title |
|---------|-------|
| | Contents |
| | List of Figures |
| | List of Tables |
| 1. | Introduction |
| 1.1. | The Need for Fully Integrated Power Management |
| 1.2. | Design Challenges in Full Integration and Choice of Power Transfer Elements |
| 1.2.1. | Switched-Inductor Converter |
| 1.2.2. | Switched-Capacitor Converter |
| 1.3. | Organization of Dissertation |
| 2. | Switched-Capacitor DC-DC Converter Fundamentals |
| 2.1. | Switched Capacitor DC-DC Converter and Power Density Requirements |
| 2.2. | Operation of a Sample Switched Capacitor DC-DC Converter |
| 2.3. | Loss Analysis |
| 2.4. | Loss Optimization |
| 2.5. | Output Voltage Range Considerations |
| 2.6. | Summary |
| 3. | High-Performance Switched-Capacitor DC-DC Converter Prototype |
| 3.1. | Reconfigurable Topology |
| 3.2. | Circuit Techniques |
| 3.2.1. | Switch Drivers |
| 3.2.2. | Active Level Shifter |
| 3.3. | Experimental Verification |
| 3.3.1. | Test structure |
| 3.3.2. | Measurement Results and Discussion |
| | Conclusion |
| 4. | Battery-Connected Switched-Capacitor Regulator Prototype |
| 4.1. | Introduction |
| 4.2. | |
| 4.3. | |
| 4.3.1. | Body Biasing to Reduce Parasitic Capacitance |
| 4.3.2. | Switch Drivers |
| 4.3.3. | Intermediate Voltage Rail Auxiliary Converters |
| | Clock Level Shifter |
| | Sub-ns Response Regulation |
| 4.4. | Measurement Results and Discussions |
| 4.5. | Chapter Summary |
| 5. | Conclusion |
| | Bibliography |

# **List of Figures**

| Figure | Caption |
|--------|---------|
| Figure 1.1. | Boards of iPhone 3GS (a), iPhone 4S (b) and iPhone 5 (c) |
| Figure 1.2. | Power distribution using local DC-DC converters |
| Figure 1.3. | Power management in today's SoC using (a) power gating and (b) linear regulators |
| Figure 1.4. | Examples of switching converters: (a) buck converter and (b) 2-to-1 switched capacitor (SC) converter |
| Figure 2.1. | (a) A 2:1 step-down SC DC-DC converter and (b) its operational waveforms |
| Figure 2.2. | A 2:1 step-down SC DC-DC converter voltage and current waveforms |
| Figure 2.3. | A sample 4-phase interleaved SC converter (a) and flying capacitor operation (b) |
| Figure 2.4. | Flying capacitor voltage (a) and its effect on output voltage and current ripple (b) of the 4-phase interleaved converter in Figure 2.3 |
| Figure 2.5. | Achievable efficiency of an example 2:1 SC converter as a function of the number of interleaved phases |
| Figure 2.6. | SC converter simplified model for loss calculations |
| Figure 2.7. | Analytical prediction of optimized power density vs. efficiency tradeoff curves for a 2:1 SC converter. The switch characteristics of a 32nm CMOS technology (i.e., $R_{ON} = 130 \ \Omega \cdot \mu m$, $C_{gate} = 3 \ fF/\mu m^2$, $V_{sw} = 1V$) were used to generate these curves, which also highlight the impact of +/-30% process variations (modeled as shifts in $R_{ON}$) |
| Figure 2.8. | Predicted efficiency vs. Vo with 3 available topologies with two capacitor technologies. For both types of capacitors, the load is adjusted so that the converter is supplying 0.1 W/mm² with a Vo of 0.95 V. These curves assume that the load is equivalent to a CMOS ring oscillator – i.e., that $R_L$ varies along with the output voltage in the same manner as a ring oscillator |
| Figure 3.1. | Standard cell (a) and reconfigurable converter unit (b) |
| Figure 3.2. | Converter power switch control circuits and timing diagram. The converter operates off of 2 non-overlapping clocks c1 and c2 with a controllable dead-time (DT) |
| Figure 3.3. | Switch M5 and operation of the flying inverter INV5 |
| Figure 3.4. | Illustration of the weak "Low" input to INV5 in the 2/3 mode when M5 is intended to connect the two flying capacitors in series |
| Figure 3.5. | Illustration of the weak "Low" input to INV5 in the 2/3 mode when M5 is intended to connect the two flying capacitors in series |
| Figure 3.6. | Active level shifter implementation |
| Figure 3.7. | 32nm SOI SC converter prototype die photo |
| Figure 3.8. | Measured converter efficiency (a) and optimal switching frequency (b) versus power density in the 1/2 mode with Vi = 2V and Vo ≈ 0.88V |
| Figure 3.9. | Measured converter efficiency and switching frequency across Vo and topologies with Vi = 2V and the load circuits set to $R_L \approx 0.9\Omega$ at Vo = 0.88V |
| Figure 3.10. | Measured converter efficiency (a) and optimal switching frequency (b) versus power density in the 2/3 and 1/3 modes with Vi = 2V and Vo ≈ 1.1V and 0.6V, respectively |
| Figure 4.1. | Battery voltage versus circuit supply |
| Figure 4.2. | Direct battery connected power distribution using IVRs |
| Figure 4.3. | Proposed reconfigurable SC converter topologies to support ~1V output across a Li-ion battery input voltage range |
| Figure 4.4. | Two phase operation of traditional 2/5 and 3/5 series-parallel SC converters |
| Figure 4.5. | The two phase operation of a 3/5 mode |
| Figure 4.6. | Operation of traditional series-to-parallel topology |
| Figure 4.7. | Operation of partial series-to-parallel topology using 2 groups of capacitors |
| Figure 4.8. | The operation of partial series-to-parallel topology using k groups of capacitors |
| Figure 4.9. | Flying capacitors using Bulk CMOS transistors with (a) traditional biasing and (b) proposed biasing scheme |
| Figure 4.10. | Proposed SC converter with power switches and gate driver circuitry |
| Figure 4.11. | Auxiliary SC converters for intermediate voltage rail generations |
| Figure 4.12. | The proposed converter unit for auxiliary SC converters and timing diagram |
| Figure 4.13. | (a) Break and (b) Make operations of the proposed auxiliary SC converters |
| Figure 4.14. | Clock level shifter |
| Figure 4.15. | Simple closed-loop controller |
| Figure 4.16. | Complete closed-loop controller with fast transient response |
| Figure 4.17. | Transient diagram for large and small positive load steps |
| Figure 4.18. | Transient diagram for large and small unloading transients |
| Figure 4.19. | Die photo |
| Figure 4.20. | Measured efficiency with Vo regulated at 1V and (a) Vi=3.6V, (b) Vi=3V-4V, Pout=0.19W/mm² |
| Figure 4.21. | SC regulator transient response at full current (a) step-up and (b) step-down |
| Figure 4.22. | SC regulator performance in (a) load regulation and (b) line regulation |
| Figure 5.1. | Performance of fully integrated converters in production CMOS and in processes where extra steps are allowed |

## **List of Tables**

| Table | Caption |
|-------|---------|
| Table 3.1. | Comparison of recently published fully integrated SC converters |
| Table 4.1. | Node voltages in operation |
| Table 4.2. | Comparison with prior art |

# Chapter 1 Introduction

#### 1.1. The Need for Fully Integrated Power Management

The rapid growth of the integrated circuit industry over the last 60 years has been fostered primarily by technology scaling [1], which has allowed exponential growth of transistor density and an accompanying increase in logic circuit speed. In the past decade, the pace of integration has become even faster. Designs have evolved from a simple central processing unit (CPU) to a system on chip (SoC) that integrates the CPU with a graphics processing unit (GPU), memory, interface controllers (USB, PCI, display, data ports) and analog blocks including wireless communication functions (Wi-Fi, 3G, 4G LTE, etc.). This system integration yields two significant benefits that directly affect end users.
First, it improves the system speed and reduces the system power consumption by shortening the physical distance between functional blocks, which mitigates interconnect parasitic losses. Second, it enables significant area reduction in board implementation. This is particularly important in mobile applications such as smartphones, where users constantly demand improved functionality along with thinner and lighter form factors. This trend is illustrated in Figure 1.1 by three generations of iPhone PCB implementations. While additional functionality continues to be added in order to provide customers more productivity, fewer individual chips and passive components are seen on the board.

Figure 1.1. Boards of iPhone 3GS (a), iPhone 4S (b) and iPhone 5 (c)

While nearly all other functions are being integrated into SoCs, power management units, highlighted by red rectangles in Figure 1.1, have resisted integration and in fact have barely shrunk in form factor. The power management block usually consists of a chip (i.e., a power management IC or PMIC) with integrated power switches and controls, along with many off-chip passive components for energy storage. These off-chip passive components, especially power inductors and capacitors, have appreciable footprints and are not as scalable as integrated circuits (i.e., they do not follow Moore's Law scaling). Therefore, the relative area for off-chip power management has increased, rather than decreased, from one generation to the next, as shown in Figure 1.1. Because its development has not kept pace with that of other functions, power management has gradually become one of the main limiting factors in system size reduction.

The challenge to power management scalability has become even greater as the required number of voltage domains has increased.
The reason for this increase is that power consumption limitations prevent further increases in logic speeds, and thus designers have utilized parallelism to increase throughput within a strict power constraint. As parallelism increases the number of cores integrated onto a chip, there is increasing need for, and potential benefit from, an independent power supply for each core in order to optimize total chip power and circuit performance [2-4]. Simply adding off-chip supplies will incur not only significant degradation of supply impedance, due to split package power planes and a limited number of pins, but also additional cost due to increased motherboard size and package complexity.

In order to meet these challenging requirements on the number of voltage domains and on supply impedance, fully integrated voltage conversion appears to be the ultimate solution. As shown in Figure 1.2, the supply voltage of each load circuit (i.e., a digital core or other functional block) can be generated by a fully integrated voltage regulator (IVR), with all of the IVRs sharing the same global input supply. In this architecture, IVRs not only occupy area that can scale with the load circuit, but can also maintain a tight regulation loop because of their physical proximity to the load, i.e., at the point of load. In addition, since IVRs typically down-convert the input voltage to a lower voltage, the requirements on the input supply impedance (relative to a system without IVRs) can be relaxed.

The worst-case output impedance requirements are associated with situations in which the load circuits are operating at their maximum load or undergo many large-scale load transients. In a non-IVR solution using a separate off-chip supply for each of the load circuits, the probability of a circuit being exposed to the worst-case supply voltage transient is set by the probability that any individual load circuit is operating in its own individual worst-case conditions.
Since this probability is high, the requirements for each of the individual supply voltages are very demanding. As such, this approach dictates the need for a much larger number of total power supply bumps/pins than would have been necessary in a design with a single (shared) global supply voltage. In the architecture shown in Figure 1.2, the worst case for global supply variations occurs only when all of the load circuits hit their critical operating conditions at the same time. Since the probability of this aggregated worst case is substantially lower, this architecture relaxes the requirement for the global supply impedance shared by all the IVRs on the die.

Figure 1.2. Power distribution using local DC-DC converters

The key challenge in realizing this power management architecture is to implement IVRs that are both area efficient and power efficient. There are many options for implementing IVRs, and designers must take into account all available technologies and their associated costs, as well as design techniques, in order to arrive at an optimal design. These considerations will be discussed further in the next section.

#### 1.2. Design Challenges in Full Integration and Choice of Power Transfer Elements

In today's SoCs, on-die power management is typically limited to power gating for digital circuits [5] and linear regulators for analog circuits, as shown in Figure 1.3. Although power gating can efficiently suppress leakage current while the underlying circuit is off, it often requires additional circuits to retain the state of the logic elements [6-8]. The power-gating switches, of course, introduce area and power overhead (due both to IR drop and to switching losses from actuating the switches), and do not directly provide the capability to support dynamic voltage and frequency scaling (DVFS). Linear regulator technology (e.g., the low-drop-out regulator or LDO) is mature and largely meets the objectives of DVFS while providing a low-noise implementation [9].
However, linear regulation is usually not used for DVFS and for power domains requiring high current in SoCs, because the associated efficiency is fundamentally limited to the ratio of output voltage to input voltage. Instead, LDOs are usually used for low-power blocks such as analog circuits that require good supply noise rejection and can afford relatively low efficiency from the regulator. In these applications, the regulator power consumption is a relatively small fraction of the total.

Figure 1.3. Power management in today's SoC using (a) power gating and (b) linear regulators

Figure 1.4. Examples of switching converters: (a) buck converter and (b) 2-to-1 switched capacitor (SC) converter

In order to provide a wide range of output voltages without necessarily sacrificing efficiency, switching regulators are the only option. In switching regulators, charge is pulled from the input, stored in one or more passive components (typically inductors or capacitors), and then transferred to the output. Two typical examples of step-down switching converters, a buck converter using an inductor and a switched-capacitor (SC) converter, are shown in Figure 1.4. While the buck converter transfers charge from the input to the output in the form of current in the inductor L, the SC converter moves charge in the form of voltage over the flying capacitor $C_{\rm fly}$. Theoretically, with lossless inductors and capacitors, there is no fundamental limit on the efficiency of these switching converters. However, depending upon application and implementation requirements, one approach may be more attractive than the other for implementing a step-down converter. The next section will discuss the pros and cons of each converter type in various contexts, e.g., using off-chip passives or fully integrated components.

#### 1.2.1. Switched-Inductor Converter

In inductor-based switching converters, the output voltage can be conveniently regulated to the required level by controlling the amount of charge stored and transferred in the form of current in the inductor during each switching cycle [10]. Off-chip inductors can achieve high quality factors, i.e., high inductance and small equivalent series resistance (ESR), which reduces loss and leads to high achievable efficiency [11-12]. Output (input) voltage ripple is simply handled by large decoupling capacitors at the output (input). The relative simplicity of achieving high efficiency across a wide range of output voltages has made the inductor-based converter dominant in off-chip moderate- to high-power (>100mW) implementations.

However, in the context of full integration, inductor-based switching converters face substantial challenges. First, power inductors are typically large in value and physical footprint, and thus difficult to integrate. Recent efforts to co-package the inductor and reduce its size [13-14] have brought inductor-based converters closer to complete integration. Unfortunately, fully integrated DC-DC converters based on CMOS inductors either require costly additional fabrication steps [15-17], such as thick metals or integrated magnetic materials to improve inductor Q, or suffer from the high series resistance and low energy density of standard on-die inductors [18]. In a very recent effort, in-package inductors have been used to implement a fully integrated voltage regulator (FIVR) in Intel's Haswell microprocessors. Although they serve this function, these inductors require large area and many package routing layers, which raises concerns about cost and complexity in other applications. Second, inductor-based converters always require decoupling capacitors, leading to additional area consumption and increased cost. Therefore, although it is still relatively straightforward to achieve a wide range of conversion ratios,
the performance and cost of integrated inductor-based designs are substantially degraded. Note that this type of converter has other undesirable issues, including electromagnetic interference (EMI) with surrounding devices, and degraded efficiency at high input voltages due to the need for high switch voltage blocking [19-20]. These issues with integrated switched-inductor converters suggest a re-examination of an alternative converter type using switched capacitors.

#### 1.2.2. Switched-Capacitor Converter

Historically, SC converters have been used in integrated circuits [47] to provide programmable voltages to memories, but have mostly been limited to low-power (i.e., <100mW) applications. In other settings, SC converters are typically implemented using off-chip capacitors, which limits the practicality of interleaved implementations. Therefore, large output filtering capacitors are required to suppress voltage noise. As will be shown in Chapter 2, a switched-capacitor converter can only achieve high efficiency in a range of output voltages close to the ideal conversion ratio determined by its topology. This limitation in efficiency versus conversion ratio adds another significant obstacle to the acceptance of this converter type.

Although these issues have limited the attractiveness of SC converters in implementations with off-chip passives, the SC converter has several key advantages that make it a good alternative to switched-inductor designs in fully integrated implementations. First, benefiting from technology scaling, integrated capacitors can achieve high capacitance density and low series resistance, enabling SC converters to support high output power. Second and most importantly, they can be used to implement DC-DC converters in current CMOS processes without additional fabrication steps. In addition, SC converters theoretically have lower intrinsic conduction loss than inductor-based converters for a given total rating (e.g.
V-A product) of switches in certain applications [22]. These advantages of SC converters have motivated the research described in this thesis.

#### 1.3. Organization of Dissertation

Given the advantages discussed earlier, fully integrated switched-capacitor (SC) converters have recently received increased attention from both academic and industrial researchers. For example, both [23] and [24] investigated multiphase interleaving to reduce the output ripple of fully integrated SC voltage doublers, with [23] demonstrating high efficiency (82%) but low power density (0.67mW/mm²), and [24] achieving high power density (1.123W/mm²) but low efficiency (60%). The need for high efficiency is perhaps self-evident, but high power density is also critical since it directly sets the area overhead (and hence cost) of the converter relative to the on-chip circuitry to which it is supplying power.

In order to explore the boundaries of their capabilities, Chapter 2 will describe the fundamental operation of SC converters along with the associated conventional issues and possible solutions. This chapter will include a design methodology for loss optimization aiming to maximize the achievable efficiency versus power density of fully integrated SC converters. Chapter 3 presents a first design prototype to verify the design methodology and demonstrate a high-performance converter that employs topology reconfiguration and output impedance control in order to enable a wide output voltage range from a fixed input supply. Chapter 4 presents a second design prototype that can interface directly with the wide range of input voltages from a Li-ion battery while regulating the output voltage at 1V with a sub-ns closed-loop response. Details of various circuit techniques and measurement results for both prototype converters will be discussed. Finally, Chapter 5 concludes the thesis.
# Chapter 2 Switched-Capacitor DC-DC Converter Fundamentals

As discussed in Chapter 1, due to its compatibility with commercial processes, the switched-capacitor (SC) converter is emerging from the shadow of switched-inductor converters, and has drawn significant recent attention from the research community [22-27]. The goal of this chapter is to develop a simple design methodology for fully integrated SC converters. Measurement data presented in Chapters 3 and 4 will verify the methodology. By developing this methodology, we hope to accelerate the adoption of SC converters for power management in a broad variety of applications. The design methodology in this chapter focuses on loss mechanisms and optimization to achieve the optimal tradeoff between power density and efficiency. Analytical calculations show that SC converters can actually achieve relatively high power density (~1W/mm²) while achieving reasonably high efficiency (~80%).

#### 2.1. Switched Capacitor DC-DC Converter and Power Density Requirements

Unlike off-chip implementations, where efficiency is often considered the first and most important parameter for evaluating a converter design, fully integrated converters should be evaluated by both efficiency and power density (W/mm²)<sup>1</sup>. The need for high efficiency is perhaps self-evident, but high power density is also critical since it directly sets the silicon area requirement and thus the cost overhead of the converter.

<sup>1</sup> A 1-W/mm² converter can deliver 1 W to the output while only consuming a total implementation area of 1 mm². In certain comparisons, current density (A/mm²) or conductance density (S/mm²) can be used instead of power density (W/mm²).

In order to implement a fully integrated SC converter that can achieve high (~80%) efficiency at high (~1W/mm²) power density, designers need to first choose the right topology.
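
The power-density metric introduced above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the original design methodology: the function names and the 1 W load are hypothetical, while the two power densities are the figures quoted for [23] and [24] in Chapter 1. It also shows how power density converts to the current-density and conductance-density metrics mentioned in the footnote, assuming a known output voltage.

```python
def required_area_mm2(p_out_w, power_density_w_per_mm2):
    """Silicon area needed to deliver p_out_w at the given power density."""
    return p_out_w / power_density_w_per_mm2


def current_density_a_per_mm2(power_density_w_per_mm2, v_out):
    """Current density J = I/A = (P/A) / Vo."""
    return power_density_w_per_mm2 / v_out


def conductance_density_s_per_mm2(power_density_w_per_mm2, v_out):
    """Conductance density G/A = (I/Vo)/A = (P/A) / Vo^2."""
    return power_density_w_per_mm2 / v_out ** 2


if __name__ == "__main__":
    p_load_w = 1.0  # hypothetical load power, chosen only for illustration
    # Power densities quoted in Chapter 1 for the converters in [23] and [24]
    for ref, density in [("[23]", 0.67e-3), ("[24]", 1.123)]:
        area = required_area_mm2(p_load_w, density)
        print(f"{ref}: {area:.2f} mm^2 of converter area for a {p_load_w} W load")

    # 0.19 W/mm^2 at a 1 V output (the Chapter 4 operating point in the abstract)
    print(current_density_a_per_mm2(0.19, 1.0), "A/mm^2")
    print(conductance_density_s_per_mm2(0.19, 1.0), "S/mm^2")
```

Under these assumptions, a converter at the power density of [23] would need on the order of a thousand mm² to supply 1 W, whereas one at the power density of [24] would need less than 1 mm², which is why efficiency alone is not a sufficient figure of merit for a fully integrated design.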
Switched-capacitor DC-DC converters have five common topologies: ladder, Dickson, Fibonacci, doubler, and Series-Parallel [19]. The ladder and Dickson topologies [47] provide regularity to the power switches and their drivers. The Fibonacci and doubler topologies give high voltage gain from the same number of switching elements (i.e., switches and capacitors), while the Series-Parallel topology has the best capacitor utilization (i.e., it requires the smallest total capacitance for the same performance). In today's CMOS logic processes, fully integrated SC converter implementations are typically dominated by the capacitor area and performance. Therefore, Series-Parallel is the most favorable topology and will be used as the default topology in this research.

Note that although the analysis in this chapter can be applied to both step-up and step-down conversion, the focus of this research is on step-down conversion. This choice was made because in the target application (i.e., power management for SoCs), down conversion is often required to interface digital circuits that use supplies of ~1V or below with a higher input voltage (e.g., 2.8V-4.2V from a Li-ion battery).

The operation of a sample 2:1 step-down converter will be presented in Section 2.2. Except for the 1:1 mode, where an SC converter operates as a resistor [28-29], 2:1 is the fundamental conversion ratio, as all the topologies converge (i.e., become identical) at this conversion ratio. Furthermore, the 2:1 topology is the simplest to explain and contains all the key components of any SC converter. Based on an understanding of the operation of this converter, analysis of the loss mechanisms and development of an optimization framework will follow in Section 2.3. This analysis will lead to design equations for the switching frequency and switch width that minimize the total loss for a given technology and power density.
Since a single-topology SC converter is only efficient when generating an output voltage within a limited range, Section 2.5 will describe a simple design strategy for enabling reconfigurable topologies as well as predicting the overall efficiency versus output voltage.

#### 2.2. Operation of a Sample Switched Capacitor DC-DC Converter

The circuit and operation of a 2:1 step-down SC converter are shown in Figures 2.1(a) and 2.1(b), respectively. Switched-capacitor DC-DC converters typically operate in two phases, each of which ideally has a 50% duty cycle. While it is possible to operate SC DC-DC converters at a fixed switching frequency and use a variable duty cycle to adjust the output impedance of the converter [30-31], as will be shown later in our analysis, maximum efficiency can only be achieved by optimizing the switching frequency and operating with a 50% duty cycle to maximize the charge transfer duty cycle.

Figure 2.1. (a) A 2:1 step-down SC DC-DC converter and (b) its operational waveforms

With the 50% duty cycle operation shown in Figure 2.2, during phase $\phi$1, the flying capacitor $C_{fly}$ is connected between the input node Vi and the output node Vo. The charge drawn from Vi through $C_{fly}$ charges this capacitor up and flows to the load. In phase $\phi$2, $C_{fly}$ is connected between Vo and GND, and thus the charge previously stored on the flying capacitor is transferred to the output. Since the switching cycle is typically much shorter than the charge/discharge time constant (which is set by $R_L C_{fly}$), the ramp rate of the voltage across the capacitor is relatively constant, and hence the load can be approximated as a current source. As will be detailed later, in order to maximize efficiency, it is desirable to utilize all available capacitance within the converter itself. Therefore, no explicit filtering capacitor is assumed at the output.
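
The two-phase operation described above can be checked numerically. The following is a minimal sketch under stated assumptions, not the model used in this thesis: switches are ideal (zero on-resistance), parasitic capacitances are ignored, the load is a plain resistor, and there is no output filter capacitor. The function name and all component values are hypothetical. Within each half period the output decays exponentially with time constant $R_L C_{fly}$; when the half period is much shorter than that time constant, the decay is nearly a linear ramp and the ripple approaches the load current times the half period divided by $C_{fly}$.

```python
import math


def simulate_2to1(vi, c_fly, r_load, f_sw, cycles=200):
    """Idealized 2:1 SC converter: returns steady-state (v_min, v_max) of Vo.

    Assumes zero switch resistance, a resistive load, and no output capacitor,
    so Vo is set directly by the flying-capacitor voltage in each phase.
    """
    half_period = 0.5 / f_sw
    decay = math.exp(-half_period / (r_load * c_fly))
    vc = vi / 2.0  # initial flying-capacitor voltage (arbitrary starting guess)
    for _ in range(cycles):
        # Phase 1: C_fly sits between Vi and Vo, so Vo = Vi - Vc.
        # The load current charges C_fly and Vo decays toward zero.
        vo_start_1 = vi - vc
        vo_end_1 = vo_start_1 * decay
        vc = vi - vo_end_1
        # Phase 2: C_fly sits between Vo and GND, so Vo = Vc.
        # The load current discharges C_fly and Vo decays again.
        vo_start_2 = vc
        vo_end_2 = vo_start_2 * decay
        vc = vo_end_2
    return min(vo_end_1, vo_end_2), max(vo_start_1, vo_start_2)


if __name__ == "__main__":
    # Hypothetical values, chosen so that T/2 is much shorter than R_L * C_fly
    vi, c_fly, r_load, f_sw = 2.0, 10e-9, 2.0, 500e6
    v_min, v_max = simulate_2to1(vi, c_fly, r_load, f_sw)
    ripple = v_max - v_min
    ramp_estimate = (v_min / r_load) * (0.5 / f_sw) / c_fly  # I_L * (T/2) / C_fly
    print(f"Vo swings between {v_min:.4f} V and {v_max:.4f} V")
    print(f"ripple = {ripple * 1e3:.2f} mV, constant-ramp estimate = {ramp_estimate * 1e3:.2f} mV")
```

In this idealized model the output settles into a sawtooth centered on Vi/2, and the simulated ripple agrees with the constant-ramp estimate to within a few percent for the values chosen. The agreement degrades as the half period approaches $R_L C_{fly}$, which is when the current-source approximation of the load stops being reasonable.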
In the case of the simple SC converter described so far, this makes the peak-to-peak voltage ripple across the capacitor and the converter's output equal, as shown in Figure 2.2. This voltage ripple has a direct implication on the loss of the converter, and hence on the achievable efficiency, that will be discussed in more detail in the later section on loss analysis.

#### 2.3. Loss Analysis

The voltage ripple across the capacitor scales with the load current, and as will be described shortly, will therefore appear as a form of series loss similar to the switch conduction losses. In addition, any SC converter will also have shunt losses that are independent of the load current, including gate and bottom plate capacitor switching losses. Note that the control circuitry for an SC converter will also contribute to shunt loss, but will be neglected here since this loss is typically small in comparison to the other losses at the power levels that are the focus of this work. These losses can be modeled as shown in Figure 2.6, where the series losses are represented by an equivalent output resistance $R_o$ [22], [32], the shunt losses by the parallel resistor $R_p$, and the transformer represents the ideal voltage conversion ratio.

Figure 2.2. A 2:1 step-down SC DC-DC converter voltage and current waveforms

In order to elucidate the relationship between voltage ripple across the capacitor and loss, it is important to recall that most fully integrated switched-capacitor converters will be delivering power to synchronous digital circuitry. The performance of circuits in synchronous digital systems is determined by the operating frequency, which in turn is set by the minimum average voltage over a clock period. Since the clock period of most digital circuits will be short in comparison to the SC converter's switching period, the performance of these circuits is typically simply set by the minimum voltage $V_{min}$ of the supply rail [33].
In this case, the efficiency of the converter should be calculated relative to the power that would have been consumed by the load if it were constantly operating at exactly $V_{min}$ [33]. In other words, the ideal power consumed by the load is:

$$P_{L-min} = V_{min}I_L \tag{1}$$

where $I_L = V_{min}/R_L$. However, due to the voltage ripple from the converter, assuming that this ripple is relatively small compared to the nominal voltage, the average power dissipated by the load over one switching cycle of the converter is approximately:

$$P_{L-tot} \approx \left(V_{min} + \frac{\Delta V}{2}\right) \left(I_L + \frac{\Delta I}{2}\right) \tag{2}$$

where $\Delta V$ is the output voltage ripple (due to the operation of the converter) and $\Delta I = \Delta V/R_L$. Although $P_{L-tot}$ is indeed dissipated by the load, any power consumed beyond $P_{L-min}$ should be counted as loss since this additional power does not contribute to an increase in performance. In order to quantify this loss, we need to calculate $V_{min}$ and $\Delta V$; as shown in Figure 2.2, for the 2:1 converter considered here, $V_{min}$ is lower than the ideal output voltage Vi/2 by $\Delta V/2$:

$$V_{min} = \frac{V_i}{2} - \frac{\Delta V}{2}, \tag{3}$$

and the voltage ripple $\Delta V$ is set by:

$$\Delta V = \frac{I_L}{C_{fly}} \cdot \frac{T}{2} = \frac{I_L}{2C_{fly}f_{sw}}, \tag{4}$$

where T is the switching period and $f_{sw} = 1/T$ is the switching frequency. As should be clear from equation (2), the loss caused by the operation of the converter is due to both the ripple in the voltage across the capacitor $\Delta V$ as well as due to the excess current flowing in the load $\Delta I$. The loss due to the voltage ripple $\Delta V$ is unavoidable because the voltage drop (i.e.
$V_{drop} = \Delta V/2$) in equation (3) is inherent to the fact that charge (power) is being delivered through a capacitor; that is, this voltage drop $V_{drop}$ is fundamental to the operation of SC converters. However, the current ripple $\Delta I$ can be eliminated if the ripple in the output voltage above $V_{min}$ is minimized. Fortunately, the ripple in the output voltage, and hence the load current ripple, can be reduced by multiphase interleaving. As described in [23], [24], and [34], multiphase interleaving is implemented by partitioning the converter into small units and switching each one of these units on a different clock phase. Figure 2.3 depicts a sample 4-phase interleaved converter and the operation of the flying capacitors in clock phase 0 (clk<sub>0</sub>) and clock phase 1 (clk<sub>1</sub>). Each unit in this converter uses 1/4 of the total capacitance and operates off of a clock that is 45° phase-shifted from its neighbor.

Figure 2.3. A sample 4-phase interleaved SC converter (a) and flying capacitor operation (b)

Figure 2.4. Flying capacitor voltage (a) and its effect on output voltage and current ripple (b) of the 4-phase interleaved converter in Figure 2.3

Figure 2.5. Achievable efficiency of an example 2:1 SC converter as a function of the number of interleaved phases

The total charge (per switching cycle) required by the output is the same as that in the converter without interleaving, but is equally divided among the sub-units. Thus, the charge flowing through each unit flying capacitor scales down by the same factor as its capacitance, and the charge per unit of capacitance in the interleaved design is the same as it would have been in the original design. As illustrated in Figure 2.4(a), the voltage ripple on each unit capacitor required to deliver that charge is, therefore, essentially identical to the previous $\Delta V$ from (4). As a result, $V_{min}$ is unchanged.
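As a rough numerical illustration of equations (1) through (4) and of the interleaving argument above, the following Python sketch evaluates the ripple, $V_{min}$, and the excess load power of a 2:1 converter, and checks that a unit cell with $1/k$ of the capacitance and $1/k$ of the load current sees the same ripple. The function names and the operating point (2 V input, 1 A load, 10 nF, 500 MHz) are illustrative assumptions, not values taken from this work.

```python
def ripple_2to1(i_load, c_fly, f_sw):
    """Peak-to-peak ripple on the flying capacitor of a 2:1 converter, eq. (4)."""
    return i_load / (2.0 * c_fly * f_sw)


def load_power_2to1(v_in, i_load, c_fly, f_sw):
    """Return (V_min, P_L-min, P_L-tot) from eqs. (1)-(3), without interleaving."""
    delta_v = ripple_2to1(i_load, c_fly, f_sw)
    v_min = v_in / 2.0 - delta_v / 2.0                 # eq. (3)
    r_load = v_min / i_load                            # from I_L = V_min / R_L
    delta_i = delta_v / r_load                         # ripple in the load current
    p_min = v_min * i_load                             # eq. (1)
    p_tot = (v_min + delta_v / 2.0) * (i_load + delta_i / 2.0)  # eq. (2)
    return v_min, p_min, p_tot


# Hypothetical operating point (assumed values, for illustration only).
V_IN, I_LOAD, C_FLY, F_SW = 2.0, 1.0, 10e-9, 500e6

v_min, p_min, p_tot = load_power_2to1(V_IN, I_LOAD, C_FLY, F_SW)
excess = p_tot - p_min  # power beyond P_L-min, counted as loss

# Interleaving by K: each unit has C_fly/K and carries I_L/K.
K = 4
unit_ripple = ripple_2to1(I_LOAD / K, C_FLY / K, F_SW)
full_ripple = ripple_2to1(I_LOAD, C_FLY, F_SW)

print(f"ripple = {full_ripple:.3f} V, unit ripple = {unit_ripple:.3f} V")
print(f"V_min = {v_min:.3f} V, excess load power = {excess * 1e3:.1f} mW")
```

Because the unit ripple equals the ripple of the non-interleaved converter, $V_{min}$ comes out the same for any interleaving factor.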
However, because the charge delivered to the output is divided more finely, the output voltage and current ripple are reduced by the interleaving factor ($k_{interleave} = 4$), as shown in Figure 2.4(b). This leads to a reduction in the loss associated with the current ripple:

$$P_{L-tot} \approx \left(V_{min} + \frac{\Delta V}{2}\right) \left(I_L + \frac{1}{k_{interleave}} \cdot \frac{\Delta I}{2}\right) \tag{5}$$

As shown in Figure 2.5, interleaving the converter by roughly a factor of 10 (which is relatively simple to achieve in an integrated design) is sufficient to almost completely eliminate the efficiency penalty due to load current ripple. In other words, extreme levels of interleaving are generally not required, especially if they would result in significant control circuitry overhead.

Assuming a sufficiently interleaved design (i.e., $k_{interleave} > \sim 10$) we can generally ignore the loss caused by the output current ripple, resulting in the classic switched-capacitor loss [22] given by:

$$P_{C_{fly}} = I_L \cdot \frac{\Delta V}{2} = \frac{I_L^2}{M_{cap} C_{fly} f_{sw}} \tag{6}$$

where $M_{cap}$ is a constant related to the converter's output resistance and is determined by the converter's topology (e.g., $M_{cap} = 4$ for a 2:1 SC converter).

In addition to the intrinsic switched-capacitor loss, the finite conductance of the switches will lead to an additional series loss term. To simplify the notation, we will assume here that the characteristics for all of the switches (regardless of device type or precise gate overdrive) are identical, but it is straightforward to extend the analysis to handle differences in the characteristics of each individual switch.
The switch conductance loss $P_{R_{sw}}$ can therefore be expressed by:

$$P_{R_{sw}} = I_L^2 \frac{R_{on}}{W_{sw}} M_{sw} \tag{7}$$

where $R_{on}$ is the switch resistance density measured in $\Omega \cdot m$, $W_{sw}$ (m) is the total width of all switches, and $M_{sw}$ is a constant determined by the converter's topology. For the 2:1 converter in Figure 2.2, there are four switches, and thus each switch occupies 1/4 of the total switch area. During each half of a switching period, two of the four switches conduct the current flowing into the output, resulting in:

$$\begin{aligned} M_{sw} &= N_{switches,tot} \times \left(\frac{T_{ph1}}{T} \times N_{sw,on,ph1} + \frac{T_{ph2}}{T} \times N_{sw,on,ph2}\right) \\ &= 4 \times \left(\frac{1}{2} \times 2 + \frac{1}{2} \times 2\right) = 8 \end{aligned} \tag{8}$$

As shown in (6) and (7), the intrinsic switched-capacitor loss and switch conductance loss are both set by the load current, and can hence be modeled by the equivalent output resistance $R_o$ in Figure 2.6. The total series loss is therefore approximately set by:<sup>2</sup>

$$P_{s} = I_{L}^{2} R_{o} = P_{R_{sw}} + P_{C_{fly}} \tag{9}$$

Figure 2.6. SC converter simplified model for loss calculations

The other key portion of an SC converter's losses stems from the shunt losses arising from switching the parasitic capacitance of the flying capacitors and power switches. Any flying capacitor implementation, particularly a fully integrated one, will always have parasitic capacitance associated with both its top plate and its bottom plate. In steady-state operation, both of these plates experience approximately equal voltage swings.
Therefore, we will group both losses caused by the top-plate capacitor $C_{top-plate}$ and the bottom-plate capacitor $C_{bottom-plate}$ into one parasitic capacitor switching loss $P_{bott-cap}$, given by:

$$P_{bott-cap} = M_{bott} V_o^2 C_{bott} f_{sw} \tag{10}$$

where $M_{bott}$ is a constant determined by the converter's topology (e.g., $M_{bott} = 1$ for a 2:1 SC converter) and $C_{bott} = C_{bottom-plate} + C_{top-plate}$. Once again assuming that all of the switches have identical characteristics to simplify the notation, the gate parasitic capacitance switching loss $P_{gate-cap}$ is given by:

$$P_{gate-cap} = V_{sw}^2 W_{sw} C_{gate} f_{sw} \tag{11}$$

where $V_{sw}$ is the gate voltage swing and $C_{gate}$ is the gate capacitance density (F/m) of the switches.

<sup>2</sup> As discussed in [21], the series resistance of the switches can impact the amount of charge that flows through the capacitors, and hence a more accurate approximation for the series losses is $P_s = I_L^2 R_o = \sqrt{P_{R_{sw}}^2 + P_{C_{fly}}^2}$. We have nonetheless chosen to approximate the total loss with a linear addition of the individual loss terms since it enables us to obtain simple and meaningful analytical expressions without significantly altering the final results.

#### 2.4. Loss Optimization

In order for the converter to achieve the highest overall efficiency at a given power density we must minimize the total loss, which is set by the combination of the four previously discussed loss components:

$$\begin{aligned} P_{loss} &= \left(P_{C_{fly}} + P_{R_{sw}}\right) + \left(P_{bott-cap} + P_{gate-cap}\right) \\ &= \left(\frac{I_L^2}{M_{cap}C_{fly}f_{sw}} + I_L^2 \frac{R_{on}}{W_{sw}} M_{sw}\right) + \left(M_{bott}V_o^2 C_{bott}f_{sw} + V_{sw}^2 W_{sw} C_{gate}f_{sw}\right) \end{aligned} \tag{12}$$

For a given technology, $R_{on}$ and $C_{gate}$ are set by the available transistors, and hence are essentially fixed.
Similarly, the intrinsic switched-capacitor loss $P_{C_{fly}}$ (and hence the overall loss) will always be minimized by utilizing as large a flying capacitor $C_{fly}$ as possible given the implementation area constraint. Therefore, at a given power density, the only two variables that can be freely optimized in order to minimize the total losses are switch width $W_{sw}$ and switching frequency $f_{sw}$ . Increasing either switch width or switching frequency decreases the series losses at the cost of increasing the shunt loss. Therefore, minimizing the converter's total losses comes down to setting the values of $W_{sw}$ and $f_{sw}$ in order to appropriately balance series and shunt losses. As we will describe next, the (relative) power density required of the converter plays an important role in determining the most dominant loss components, and hence how $W_{sw}$ and $f_{sw}$ should be chosen to minimize loss. At high power densities (i.e., large $I_L$ or equivalently small $R_L$ , where $R_L = V_o/I_L$ ), $W_{sw}$ and $f_{sw}$ must both increase with the load current in order to suppress the series losses. Since $P_{gate-cap}$ is proportional to both width and frequency while $P_{bott-cap}$ scales only with switching frequency, beyond a certain load current the bottom plate loss becomes the least significant loss term. Therefore, in order to arrive at simple analytical equations for the optimal $f_{sw}$ and $W_{sw}$ in this regime, we can ignore the bottom plate portion of the shunt losses. 
In this case, the optimal $f_{sw}$ and $W_{sw}$ (which can be derived from (12) by taking the derivatives with respect to $f_{sw}$ and $W_{sw}$) will be:

$$f_{sw\_opt} = \frac{1}{\sqrt[3]{M_{cap}^2 M_{sw}}} \cdot \sqrt[3]{\frac{V_o^2}{V_{sw}^2} \times \frac{1}{R_{on} C_{gate} (R_L C_{fly})^2}} \tag{13}$$

$$W_{sw\_opt} = \sqrt[3]{M_{sw}^2 M_{cap}} \cdot \sqrt[3]{\frac{V_o^2}{V_{sw}^2} \frac{R_{on}^2 C_{fly}}{R_L^2 C_{gate}}} \tag{14}$$

Under these conditions and with the optimal $f_{sw}$ and $W_{sw}$, the minimum normalized loss (which sets the efficiency as $\eta = (1 + P_{loss}/P_L)^{-1}$) is approximately:

$$\frac{P_{loss}}{P_L} = 3 \sqrt[3]{\frac{M_{sw}}{M_{cap}}} \cdot \sqrt[3]{\frac{V_{sw}^2}{V_o^2} \frac{R_{on}C_{gate}}{R_L C_{fly}}} \tag{15}$$

Figure 2.7. Analytical prediction of optimized power density vs. efficiency tradeoff curves for a 2:1 SC converter. The switch characteristics of a 32nm CMOS technology (i.e., $R_{on} = 130~\Omega \cdot \mu m$, $C_{gate} = 3~fF/\mu m^2$, $V_{sw} = 1V$) were used to generate these curves, which also highlight the impact of +/-30% process variations (modeled as shifts in $R_{on}$).

This relative loss expression highlights the tradeoff between power density and efficiency (Figure 2.7). For a given technology and converter topology, increasing the power density by a factor of x at a given output voltage implies that $R_L$ also decreases by a factor of x, leading to an increase in the minimum normalized loss by a factor of $\sqrt[3]{x}$. Note that although we have so far focused on *power* density as an area (cost) metric against which efficiency trades off, it is actually the load *conductance* density (i.e., load conductance per unit area) that sets the efficiency.
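The closed-form results (13) through (15) can be checked numerically against the loss expression (12). The sketch below is one way to do so in Python, under stated assumptions: the `Converter` container and the function names are hypothetical, the bottom-plate capacitance is set to zero to mimic the high power density regime, $R_{on}$ is taken from the Figure 2.7 caption, and the gate capacitance density, flying capacitance, and load are placeholder values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Converter:
    v_o: float            # output voltage (V)
    v_sw: float           # gate voltage swing (V)
    r_load: float         # load resistance R_L (ohm)
    c_fly: float          # total flying capacitance (F)
    r_on: float           # switch resistance density (ohm*m)
    c_gate: float         # gate capacitance density (F/m)
    c_bott: float = 0.0   # parasitic plate capacitance (F); zero here by assumption
    m_cap: float = 4.0    # topology constants of the 2:1 converter
    m_sw: float = 8.0
    m_bott: float = 1.0


def total_loss(c, f_sw, w_sw):
    """Total loss of eq. (12), in watts."""
    i_load = c.v_o / c.r_load
    series = i_load**2 / (c.m_cap * c.c_fly * f_sw) + i_load**2 * c.r_on * c.m_sw / w_sw
    shunt = c.m_bott * c.v_o**2 * c.c_bott * f_sw + c.v_sw**2 * w_sw * c.c_gate * f_sw
    return series + shunt


def high_density_optimum(c):
    """Closed-form f_sw, W_sw and normalized loss from eqs. (13)-(15)."""
    ratio = (c.v_o / c.v_sw) ** 2
    f_opt = (ratio / (c.m_cap**2 * c.m_sw * c.r_on * c.c_gate
                      * (c.r_load * c.c_fly) ** 2)) ** (1 / 3)
    w_opt = (c.m_sw**2 * c.m_cap * ratio * c.r_on**2 * c.c_fly
             / (c.r_load**2 * c.c_gate)) ** (1 / 3)
    loss_norm = 3 * (c.m_sw / c.m_cap * c.r_on * c.c_gate
                     / (ratio * c.r_load * c.c_fly)) ** (1 / 3)
    return f_opt, w_opt, loss_norm


# Placeholder design point: 130 ohm*um switches, assumed 1 fF/um gates, 5 nF, 1 ohm load.
conv = Converter(v_o=0.9, v_sw=1.0, r_load=1.0, c_fly=5e-9, r_on=130e-6, c_gate=1e-9)
f_opt, w_opt, loss_norm = high_density_optimum(conv)
p_load = conv.v_o**2 / conv.r_load
numeric = total_loss(conv, f_opt, w_opt) / p_load   # eq. (12) evaluated at the optimum

print(f"f_opt = {f_opt / 1e9:.2f} GHz, W_opt = {w_opt * 1e3:.1f} mm")
print(f"normalized loss: closed form {loss_norm:.4f}, from eq. (12) {numeric:.4f}")
print(f"efficiency = {100 / (1 + loss_norm):.1f}%")
```

With the bottom-plate term removed, the three remaining loss terms are equal at the optimum, which is where the factor of 3 in (15) comes from. The load enters these expressions only through $R_L$ relative to $C_{fly}$, not through the output power directly.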
To make this more obvious, we can define $G_L = 1/R_L$ and rewrite (15) as:

$$\frac{P_{loss}}{P_L} = 3 \sqrt[3]{\frac{M_{sw}}{M_{cap}}} \cdot \sqrt[3]{\frac{V_{sw}^2}{V_o^2} R_{on} C_{gate} \frac{G_L}{C_{fly}}} \tag{16}$$

This relative loss expression also highlights that the most important technology metric guiding the selection of the switches is the product of gate voltage swing squared and intrinsic time constant (i.e., $V_{sw}^2 R_{on} C_{gate}$). Similarly, since the loss is set by the ratio of this switch metric to the load voltage squared multiplied by the effective time constant for charging/discharging the flying capacitors (i.e., $V_o^2 R_L C_{fly}$), increasing the density of the capacitors also directly improves efficiency at a given power density.

Although the previous analysis provides a clear intuitive picture of the relationship between power density and efficiency, it is only accurate at high power densities where the loss due to switching the "bottom-plate" parasitics of the flying capacitors is negligible compared to the other losses. Both the optimal switching frequency and the switch area scale down as load power decreases, and hence at low power densities, the loss due to driving the parasitic capacitance of the switches becomes much smaller than all of the other losses. Therefore, in this regime we can approximately find the optimum loss by ignoring the gate loss of the switches and finding the optimum switching frequency $f_{sw,opt}$:

$$f_{sw,opt} = \frac{1}{\sqrt{M_{cap}M_{bott}k_{bott}}} \frac{1}{C_{fly}R_L} \tag{17}$$

where $k_{bott} = C_{bott}/C_{fly}$ is the parasitic to flying capacitance ratio.
Although the switch gate losses were assumed to be small, we can still size the switches to minimize the total loss in (12) with the frequency found in (17):

$$W_{sw} = \sqrt{\frac{V_o^2}{V_{sw}^2} \cdot \frac{R_{on}C_{fly}}{R_L C_{gate}} \cdot \sqrt{M_{sw}^2 M_{cap} M_{bott} k_{bott}}} \tag{18}$$

Combining the results from (12), (17) and (18), the normalized loss in the low power density regime is:

$$\frac{P_{loss,opt}}{P_L} = 2\sqrt{\frac{M_{bott}}{M_{cap}}k_{bott}} + 2\sqrt{\frac{M_{sw}}{\sqrt{M_{cap}M_{bott}}} \cdot \frac{1}{\sqrt{k_{bott}}}} \cdot \sqrt{\frac{V_{sw}^2}{V_o^2} \cdot \frac{R_{on}C_{gate}}{R_LC_{fly}}} \tag{19}$$

This result highlights a key fundamental limit on the efficiency of a switched-capacitor DC-DC converter. Specifically, even in very light load conditions (i.e., $R_L \to \infty$), the maximum efficiency of the converter is limited by the bottom-plate capacitance ratio $k_{bott}$ and the converter's topology (i.e., by the first term in (19)). For example, with a bottom-plate capacitor ratio of 1% (or 0.5%), the efficiency of a 2:1 converter is limited to 90.9% (or 93.4%). Of course, any non-zero load will decrease the efficiency of the converter, but for sufficiently light loads, the efficiency will still be dominated by bottom-plate losses. Note that for an ideal capacitor with $k_{bott} = 0$, the assumption of switch gate losses being negligible in comparison to all of the other loss terms is violated. In this case, (15) and not (19) should be used to predict the minimum loss.

To illustrate these effects, Figure 2.7 shows the efficiency vs. power density characteristics of two optimized converter designs with different flying capacitor characteristics. One converter employs capacitors with a higher capacitance density but also a higher bottom-plate capacitor ratio (e.g., MOS capacitor), while the other employs capacitors with lower density but also lower parasitics (e.g., a MIM or MOM capacitor).
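The light-load limit quoted above follows directly from the first term of (19). A minimal Python sketch of equations (17) through (19) is given below; the function names are hypothetical, the topology constants default to the 2:1 values from the text, and the device values passed in the last call are illustrative assumptions rather than data from this work.

```python
import math


def low_density_optimum(v_o, v_sw, r_load, c_fly, r_on, c_gate, k_bott,
                        m_cap=4.0, m_sw=8.0, m_bott=1.0):
    """Return (f_sw, W_sw, P_loss/P_L) from eqs. (17)-(19)."""
    root = math.sqrt(m_cap * m_bott * k_bott)
    f_opt = 1.0 / (root * c_fly * r_load)                        # eq. (17)
    w_opt = math.sqrt((v_o / v_sw) ** 2 * r_on * c_fly
                      / (r_load * c_gate) * m_sw * root)         # eq. (18)
    bottom_term = 2.0 * math.sqrt(m_bott * k_bott / m_cap)
    switch_term = 2.0 * math.sqrt(m_sw / root * (v_sw / v_o) ** 2
                                  * r_on * c_gate / (r_load * c_fly))
    return f_opt, w_opt, bottom_term + switch_term               # eq. (19)


def light_load_efficiency_limit(k_bott, m_cap=4.0, m_bott=1.0):
    """Efficiency ceiling as R_L grows without bound (first term of eq. (19))."""
    return 1.0 / (1.0 + 2.0 * math.sqrt(m_bott * k_bott / m_cap))


for k_bott in (0.01, 0.005):
    limit = light_load_efficiency_limit(k_bott)
    print(f"k_bott = {k_bott:.3f}: efficiency limit = {100 * limit:.1f}%")

# With an assumed, very light load the full expression approaches the same ceiling.
_, _, loss = low_density_optimum(v_o=0.9, v_sw=1.0, r_load=1e9, c_fly=5e-9,
                                 r_on=130e-6, c_gate=1e-9, k_bott=0.01)
print(f"efficiency at R_L = 1e9 ohm: {100 / (1 + loss):.1f}%")
```

For a 2:1 converter the two ratios evaluate to 90.9% and 93.4%, matching the figures stated in the text.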
At high power densities (where (16) accurately predicts the minimum loss), high capacitance density directly translates into higher efficiency. However, at low power densities (where (19) is more accurate) the flying capacitors should have as low parasitic capacitance as possible in order to maximize peak efficiency. Beyond illustrating the importance of selecting an appropriate capacitor given the target power density of the converter, Figure 2.7 also predicts that a 2:1 SC converter using currently available CMOS technology can achieve an efficiency in the range of 79.4% - 80.7% at a power density of $\sim 1~\text{W/mm}^2$. While this performance is substantially better than previous predictions or demonstrations of fully integrated DC-DC converters [15,18,23,24,35], it is only achievable at a single output voltage.

#### 2.5. Output Voltage Range Considerations

Unlike in inductor-based converters, where charge is saved and transferred in the form of current in the inductors (which enables efficient control of the output voltage by modulating the DC voltage applied to one side of the inductor), SC converters save and transfer charge in the form of voltage across the flying capacitors. Therefore, the output voltage of an SC converter is determined by its topology. To efficiently achieve a wider output voltage range, SC converters require reconfigurable topologies that can support multiple conversion ratios [35, 37]. By employing a certain number of reconfigurable topologies, an SC converter can support the same number of discrete open-circuit voltage levels. The intermediate voltages between these discrete voltage levels can then be obtained by controlling the converter's output impedance $R_o$, which is equivalent to linear regulation off of the open-circuit voltages.

Figure 2.8. Predicted efficiency vs. Vo with 3 available topologies with two capacitor technologies.
For both types of capacitors, the load is adjusted so that the converter is supplying 0.1 W/mm² with a Vo of 0.95 V. These curves assume that the load is equivalent to a CMOS ring oscillator, i.e., that $R_L$ varies along with the output voltage in the same manner as a ring oscillator.

As discussed in [22] and shown here in (9) and (12), the converter's output impedance $R_o$ can be adjusted by controlling one or a combination of the following parameters: switching frequency $f_{sw}$ [26,37], switch sizing $W_{sw}$, and effective flying capacitance $C_{fly}$ [38]. Figure 2.8 shows the resulting efficiency vs. output voltage for a converter operating off a 2V input and allowing reconfiguration into one of three possible topologies with conversion ratios of 1/3, 1/2, and 2/3. Even with linear regulation performed only by adjusting $f_{sw}$ (which is slightly sub-optimal from an efficiency standpoint), Figure 2.8 predicts that such a converter could achieve above 70% efficiency for most output voltages spanning from ~0.5V up to ~1.2V.

Before proceeding further, it is important to note that the previously described optimization methodology does not directly include the implications of closed-loop regulation mechanisms (such as switching frequency control [36,19], duty cycle control [30-31], or switched capacitance modulation [38]) needed to control the output voltage under load current variations. However, for typical digital circuit load current profiles (e.g., periods of relatively constant current, on average across a typical DC-DC converter's clock cycle, and more infrequent load current "steps" due to enabling or disabling of blocks), utilizing closed-loop control for the converter should not force drastic deviations from the results of the optimization.
This is because the high-frequency transient response of a fully integrated SC converter will retain nearly the same characteristics (for the interleaved series-parallel converters described in this work, to within a factor of 2) as the on-die decoupling capacitors (de-cap) which would have been required in any case. This equivalence holds because the load always sees some fraction (typically at least one-half) of the flying capacitance as decoupling capacitance to ground. Thus, if the SC converter can achieve sufficiently high power density, these decoupling capacitors can simply be re-purposed to implement the converter.<sup>3</sup>

#### 2.6. Summary

As discussed above, the area required by a fully integrated switched-capacitor DC-DC converter in order to deliver a certain level of power to the load has direct implications on both cost and efficiency. Hence, this chapter describes a methodology to predict and minimize the losses of such a converter operating at a given power density. This chapter also discusses design techniques to reduce output voltage and current ripple and support a wider output voltage range while still maintaining reasonably high (i.e., >70%) efficiency. In the next chapter, we will present a research prototype to verify the methodology described in this chapter. Circuit techniques to enable low output voltage ripple and a wide output voltage range will also be discussed.

<sup>3</sup> This is unlike an inductor-based converter where an output filtering capacitor is required, and where the transient response is inherently limited by (Vin-Vout)/L (or Vout/L). In contrast, an SC converter's response is largely limited only by the latency of the feedback control circuitry in terms of, for example, increasing or decreasing the switching frequency. Furthermore, this latency requirement is typically not stringent with respect to the performance available in modern CMOS technologies.
For example, at 1V output and 1W/mm² power density, a typical fully integrated SC converter could tolerate up to ∼500ps (20-40 fanout-of-four inverter delays) of response latency under a full load current step while maintaining <10% voltage drop.

# Chapter 3 High-Performance Switched-Capacitor DC-DC Converter Prototype

This chapter describes circuit techniques to implement the efficiency/power density design methodology described in Chapter 2. A prototype SC converter was designed and fabricated in a 32nm SOI test-chip in collaboration with AMD [26]. Although proper selection of the flying capacitor, switch width, and switching frequency (as outlined in the previous chapter) is critical to achieving a converter with high efficiency and minimal area overhead, the need to support reconfigurable topologies (in this design, 2/3, 1/2, and 1/3) and multiple output voltages results in several circuit design challenges that must be overcome as well.

The first challenge, which is addressed in Section 3.1, is to find a simple physical design approach that enables the reconfigurable topologies. Next, the wide variation in output and flying capacitor voltage levels across the different topologies makes efficiently driving the power transistors (for which one would like to use thin-oxide transistors in order to maximize efficiency) particularly challenging. Section 3.2, therefore, proposes and describes circuit techniques to address these challenges. These methods are verified by a proof-of-concept converter prototype, presented in Section 3.3, implemented in 0.374 mm² of a 32 nm SOI process. The 32-phase interleaved converter can be configured into three topologies to support output voltages of 0.5V to 1.2V from a 2V input supply, and achieves 79.76% efficiency at an output power density of 0.86W/mm². Section 3.4 concludes the chapter.

#### 3.1. Reconfigurable topologies

Figure 3.1.
Standard cell (a) and reconfigurable converter unit (b)

As with any custom designed VLSI structure, a physical design strategy that enables one to construct larger SC converter blocks by arraying identical sub-converter unit cells is highly desirable. In order to achieve this goal while supporting topology reconfiguration, we therefore propose to partition the converter into unit cells, each consisting of one flying capacitor and five switches, as shown in Figure 3.1(a). Conceptually, each standard cell can be configured to operate in series or in parallel with the rest of the cells, leading to a simple physical design strategy that supports multiple conversion ratios. As depicted in Figure 3.1(b), for this specific prototype converter we have grouped together two standard cells in order to form a converter unit supporting three topologies with conversion ratios of 1/3, 1/2, and 2/3 (approximately 0.67V, 1V, and 1.33V at the output with a 2V input). For simplicity, the intermediate voltage levels are generated by controlling $f_{sw}$.

#### 3.2. Circuit techniques

#### 3.2.1. Switch drivers

As previously mentioned in Chapter 2, since they have the lowest $V_{sw}^2 R_{on} C_{gate}$ metric, it is desirable to utilize thin-oxide devices to maximize the converter's efficiency. However, in a step-down converter intended to generate output voltages near the nominal process voltage, the breakdown voltage of these switches will most likely be substantially smaller than the input voltage Vi, and hence appropriate switch driving strategies are required.

As shown in Figure 3.2, each converter unit operates in two non-overlapping clock phases, c1 and c2, with controllable dead-time. Using versions of c1 and c2 that swing either between Vo and GND or between Vi and Vo (denoted by \_h) and topology control signals that swing between Vi and GND (denoted by \_fs), it is fairly straightforward to drive the gates of all of the switches except for M4, M5, and M7.
In the 1/3 (2/3) mode, switch M4 (M7) should always be off. However, since the source of M4 (M7) is driven above (below) the logic "High" (logic "Low") rail of a standard inverter driver in one phase of operation, M4 (M7) could turn on unintentionally. Dual-rail power gating and voltage clamps (M4a, M7a, D4, and D7) are therefore added to INV4 and INV7 in order to ensure these switches remain off, as illustrated in Figure 3.2. In the other two conversion modes, the small switch M4a (M7a) always has $V_{GS} \ge 0$ ($V_{GS} \le 0$), turning off the current path through the voltage clamps, while the dual-rail power gating transistors are turned on and thus INV4 (INV7) functions off the voltage rails Vo-GND (Vi-Vo) as normal. Note that these power gating transistors are statically configured based on the operating mode, and thus they can be sized to minimize the additional resistance they add to INV4 and INV7. Therefore, other than the addition of some parasitic capacitance, these added elements have no significant effect on the operation of the converter in the modes in which they are not used.

Figure 3.2. Converter power switch control circuits and timing diagram. The converter operates off of 2 non-overlapping clocks c1 and c2 with a controllable dead-time (DT).

Driving switch M5 is more challenging since its source and drain voltages can reside anywhere among Vi, (Vi/2+Vo/2), Vo, and (Vo/2) in the three modes of operation. Fortunately, however, the physical/logical ordering of the standard cells can be leveraged to realize a relatively simple driver design. Specifically, since C1 is always intended to be stacked above C2, C1\_pos and C1\_neg can be used as virtual rails for a "flying" inverter INV5, as also shown in Figure 3.2.

Figure 3.3. Switch M5 and operation of the flying inverter INV5

The operation of this "flying" inverter INV5 is illustrated in Figure 3.3.
In one phase, when the capacitors are connected in parallel by directly controlling the other switches in the converter, the input of INV5 receives a logic "High" signal, which makes its output logic "Low." Thus, as intended, M5 stays off when the capacitors are placed into a parallel configuration. In the other phase, switch M5 should be turned on to connect the capacitors in series. In this phase, the other switches in the converter connect C1\_pos to a high voltage HighV and C2\_neg to a low voltage LowV. Since the difference between HighV and LowV is twice the voltage across the flying capacitors, the voltage difference between C1\_neg and C2\_pos is small. Therefore, the input of INV5 is Low, making its output High and turning on switch M5 as necessary. Notice that in both cases, M5 is automatically controlled by the actions of the other switches on the two flying capacitors.

In the 1/2 and 1/3 modes of operation, the flying inverter INV5 functions exactly as just described. However, due to the converter's output resistance $R_o$, Vo drops by $\Delta V_o$ from the ideal output voltage. As we will describe next, in the 2/3 mode of operation this voltage drop can potentially cause INV5 to malfunction.

Figure 3.4. Illustration of the weak "Low" input to INV5 in the 2/3 mode when M5 is intended to connect the two flying capacitors in series

This situation is illustrated in Figure 3.4. In phase 1, when the capacitors are connected in parallel between Vi and Vo, C1 and C2 are charged to (Vi - Vo), which is larger than the ideal value Vi/3 by $\Delta V_o$. Consequently, in phase 2 when C1\_pos is connected to Vo and C2\_neg to GND, the voltage at C1\_neg is initially lower than C2\_pos by $3\Delta V_o$. Since the voltage supply of INV5 is only $V_{C1} = Vi/3 + \Delta V_o$, this voltage difference $3\Delta V_o$ could be close to the switching threshold of INV5 (i.e., $V_{C1}/2$ for a symmetric inverter).
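The node voltages in this argument can be worked through numerically. The Python sketch below does so under simplifying assumptions that are not part of the original analysis: ideal switches, no charge redistribution at the start of phase 2, the INV5 input taken as C2\_pos measured against the inverter's low rail C1\_neg, and a symmetric inverter that trips at half its supply. The function name and the droop values are hypothetical.

```python
def inv5_input_2_3_mode(v_in, delta_vo):
    """Return (INV5 input, INV5 trip point) at the start of phase 2 in the 2/3 mode."""
    v_out = 2.0 * v_in / 3.0 - delta_vo   # output sags below the ideal 2*Vi/3
    v_cap = v_in - v_out                  # phase 1 charges each capacitor to Vi/3 + dVo
    c1_pos = v_out                        # phase 2: C1_pos is tied to Vo
    c1_neg = c1_pos - v_cap
    c2_neg = 0.0                          # phase 2: C2_neg is tied to GND
    c2_pos = c2_neg + v_cap
    v_input = c2_pos - c1_neg             # equals 3*dVo, measured from the low rail
    v_trip = v_cap / 2.0                  # symmetric inverter supplied by V_C1
    return v_input, v_trip


V_IN = 2.0
for delta_vo in (0.02, 0.05, V_IN / 15.0):
    v_input, v_trip = inv5_input_2_3_mode(V_IN, delta_vo)
    print(f"dVo = {delta_vo:.3f} V: input = {v_input:.3f} V, trip = {v_trip:.3f} V")
```

Under these assumptions, setting $3\Delta V_o = (Vi/3 + \Delta V_o)/2$ shows that the input reaches the trip point once $\Delta V_o$ approaches Vi/15 (about 0.13 V for a 2 V input), so a large enough output droop leaves INV5 without a clean logic Low.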
In other words, instead of receiving a strong Low signal, the input to INV5 may lie in the metastable region or even be interpreted as a "High". Moreover, even if INV5 properly passes a High to its output, the $V_{GS}$ of switch M5 will be $(Vi/3 - \Delta V_o/2)$, which may degrade the $R_{on}$ of M5 and hence reduce the converter's efficiency.

To resolve this issue, a helper circuit that is only activated in the 2/3 mode is added to INV5, as shown in Figure 3.5. With this modification, in the 2/3 mode, T32\_h is set to Vi, turning off the current path through M53. The pull-up of this gate driver is then handled by M54 and M55 through the control of clock c2\_h. M55 is stacked with M54 to avoid voltage overstress. In the other operation modes, T32\_h is Low, bypassing the additional helper circuit and returning INV5 to its normal operation. Therefore, although this helper circuit does somewhat increase driver losses due to stacked transistors in the 2/3 mode of operation, in the other modes of operation it only introduces some additional parasitic capacitance.

Figure 3.5. Improved gate driver circuit for switch M5

Figure 3.6. Active level shifter implementation

#### 3.2.2. Active level shifter

The wide variation in output voltage levels also makes level-shifting the clock signals from Vo-GND to Vi-Vo challenging. Using conventional static level-shifter designs can result in Vo-dependent unbalanced duty cycles in the shifted signals, causing timing mismatch between c1, c2 and c1\_h, c2\_h, and consequently leading to short-circuit currents. Although increasing dead-time can mitigate these issues, this comes at the penalty of reduced efficiency due to a lower effective power-transfer duty cycle. To minimize level-shifter-induced clock mismatch, the design shown in Figure 3.6 DC-biases Iv3 at its trip point through R1 and AC-couples the Vo-GND input clock through a capacitor $C_{ls1}$.
In order to enable low minimum $f_{sw}$ without excessive area, R1 is implemented by minimum-sized pass-gates in series with two pairs of back-to-back diodes. These diodes reduce the voltage swing across the pass-gates and hence increase their equivalent resistance. To further reinforce matched timing, $C_{ls2}$ is added to couple Iv4 and Iv5, and fewer than three levels of logic are used after the level shifter. The level shifter is experimentally verified to work at a clock frequency as low as 10MHz.

#### 3.3. Experimental Verification

Figure 3.7. 32nm SOI SC converter prototype die photo

A die photo of the implemented SC converter employing the previously described design optimization and circuit techniques is shown in Figure 3.7. In order to maximize efficiency at high power densities and mitigate the losses due to current ripple, this design utilizes standard thin-oxide MOS transistors to implement $C_{fly}$ as well as 32-way interleaving. This level of interleaving was chosen because even at high power densities, the optimal switching frequency of the converter is low in comparison to the intrinsic speed of the transistors in this technology. Thus, obtaining the 32 interleaved clock phases from a multi-stage on-chip ring oscillator incurs almost no overhead, and is beneficial in terms of reducing the nominal ripple as well as the sensitivity to random variations in the exact phase spacing. Since this converter is intended to be co-integrated with the load, measuring the converter's performance requires a proper testing strategy. We will therefore first describe the load structure and its characterization strategy, followed by measured results verifying the design methodology and proposed design techniques.

#### 3.3.1. Test structures

In order to obtain correct I-V measurements of the on-die loading circuits and thus the power efficiency of the converter, the on-die load, which was implemented with a variable-width PMOS device, must be pre-characterized.
Four-wire sensing was used to measure the power consumption across the load in order to avoid inaccuracies due to drops in solder bumps, package pins, and PCB traces. Characterization of the load was carried out by gating the clock of the converter in order to disable it and then driving the output node Vo of the converter from an off-chip power supply. For each load current (i.e., PMOS transistor width) setting, the voltage supply Vo is swept and the current consumption is measured. Utilizing this data, the power consumed by the load circuit while the converter is in its normal operation can then be extracted simply by measuring Vo. #### 3.3.2. Measurement Results and Discussion Figure 3.8 shows the converter's measured efficiency and optimal switching frequencies in the 1/2 mode while supplying the on-die load circuits. For simplicity and in order to obtain optimal efficiency in this demonstration, the switching frequency was adjusted by externally controlling the supply of an on-chip ring oscillator. However, any one of a broad variety of techniques to control switching frequency [39] could be utilized. The measured converter achieved an efficiency of 79.76% at 0.86W/mm<sup>2</sup>. The experimental data matches the analytical predictions to within 1.3% across the range of measured power density (0.24W/mm<sup>2</sup> to 0.86W/mm<sup>2</sup>). Note that the performance quoted here is better than that reported in [26] due to the availability of new test-chips fabricated in a nearly production version of the process, rather than the developmental process used to obtain the original results. Figure 3.8. Measured converter efficiency (a) and optimal switching frequency (b) versus power density in the 1/2 mode with Vi = 2V and $Vo \approx 0.88V$ Figure 3.9. Measured converter efficiency and switching frequency across Vo and topologies with Vi = 2V and the load circuits set to $R_L\approx 0.9\Omega$ at Vo = 0.88V Figure 3.10. 
Measured converter efficiency (a) and optimal switching frequency (b) versus power density in the 2/3 and 1/3 modes with Vi = 2V and $Vo \approx 1.1V$ and 0.6V, respectively. Figures 3.9 and 3.10 show the converter's efficiency vs. output voltage and power density in the three operating modes, verifying that the converter functions correctly in each of the three reconfigurable topologies. The measurements in the 2/3 and 1/2 modes match very well with the analytical predictions. However, the measured efficiency in the 1/3 mode is substantially lower. The cause of this significant discrepancy appears to be un-modeled leakage from Vi and Vo due to over-voltage-stress ( $\approx 1.4V$ ) on switches M1, M2, M4, M6, M7 and M9 powered off of the Vi-Vo rails. Therefore, a practical implementation that uses the 1/3 conversion ratio would likely require a lower input voltage ( $\sim 1.8V$ ), higher voltage devices, and/or switch cascoding. Despite this issue with the 1/3 conversion ratio, the two reconfigurable topologies enable the converter to maintain an efficiency of over 70% for most of the output voltage range from 0.7V to $\sim 1.15V$ . Note that although Figure 3.10 shows efficiency vs. power density, one should recall that it is actually the load conductance density that sets the area required for the converter to achieve a given efficiency. Thus, even though the 2/3 topology appears superior to the 1/3 topology in terms of power density, in the limit of negligible bottom-plate losses, the performance of these two topologies in terms of conductance density is identical. The converter's performance is summarized and compared with other work in Table 3.1. This prototype experimentally verifies that by following the design methodology and techniques proposed in this thesis, both boundaries in efficiency and power density of the previous works in [23] and [24] can be achieved, with an implementation in a commercial process. 
At 79.76% efficiency and 0.86W/mm², the proposed design could potentially be integrated into the same space as that already required for decoupling capacitors (as well as serve the same function) in a processor targeting mobile applications where the load operates at ~100mW/mm².

| Work | [23] | [24] | [40] | This work |
|--------------------------|-----------------------------|-------------------------|-------------------------|--------------------------------------------------|
| Technology | 130nm Bulk | 32nm Bulk | 45nm SOI | 32nm SOI |
| Topology | 2/1 step-up | 2/1 step-up | 1/2 step-down | 2/3, 1/2, 1/3 step-down |
| Capacitor technology | MIM | Metal finger | Deep trench | CMOS oxide |
| Interleaved phases | 16 | 32 | 1 | 32 |
| C<sub>out</sub> | 400pF (= C<sub>fly</sub>) | 0 | Yes | 0 |
| Converter area | 2.25 mm<sup>2</sup> | 6678 μm<sup>2</sup> | 1200 μm<sup>2</sup> | <b>0.378 mm<sup>2</sup></b> (1.4% used for load) |
| Quoted efficiency (η) | 82% | 60% | 90% | 79.76% (in 1/2 step-down) |
| Power density @ η | 0.67 mW/mm<sup>2</sup> | 1.123 W/mm<sup>2</sup> | 2.185 W/mm<sup>2</sup> | 0.86 W/mm<sup>2</sup> |
| Conductance density @ η | 0.2 mS/mm<sup>2</sup> | 0.5 S/mm<sup>2</sup> | 2.421 S/mm<sup>2</sup> | 1.11 S/mm<sup>2</sup> |

Table 3.1. Comparison of recently published fully integrated SC converters

In order to expand the applicability of SC converters to even higher performance processors operating at ~1W/mm², the work reported in [40] utilizes ~200fF/µm² deep trench capacitors and achieves 90% efficiency at a power density of 2.185W/mm². This further experimentally verifies the benefit of high density capacitors in increasing efficiency and power density, as also predicted in equation (16) and Figure 2.7. In fact, the analysis from Chapter 2 predicts that with 200fF/µm² deep trench capacitors and modern CMOS switches, an optimized SC design may achieve over 88% efficiency for power densities up to 10W/mm².
Thus, the application of the techniques outlined in this thesis along with existing high-density capacitor technologies appears promising in enabling the broad adoption of fully integrated SC converters for on-die power distribution and management.

#### 3.4. Conclusion

As discussed in Chapters 1 and 2, the area required by a fully integrated switched-capacitor DC-DC converter in order to deliver a certain level of power to the load has direct implications on both cost and efficiency. Applying the design methodology presented in Chapter 2 to predict and minimize the losses of such a converter operating at a given power density, this chapter further introduces gate driver and level shifter circuit design strategies to enable topology reconfiguration and hence efficient generation of a wider range of output voltages. Measured results from a 32nm SOI prototype converter confirm the methodology's predictions of ~80% efficiency at a power density of ~0.5-1W/mm² for a 2:1 step-down converter operating from a 2V input and utilizing only standard MOS capacitors. Reconfiguration of the converter's topology enables it to maintain greater than 70% efficiency for most of the output voltage range from 0.7V to ~1.15V. Given that they were achieved in a standard CMOS process with no modifications or additions, these results illustrate that fully integrated switched-capacitor converters are indeed a promising candidate for low-cost but efficient power management on a per-core or per-functional-unit basis.

# Chapter 4 Battery-Connected Switched-Capacitor Regulator Prototype

With the high-performance converter prototype described in the previous chapter, we have explored the SC converter's capability in power density and efficiency in a standard CMOS process. The high-performance converter could also support a wide output voltage range while maintaining greater than 70% efficiency. However, this converter could only support a fixed input voltage of ~2V.
This limitation at the input would require another stage of voltage conversion in mobile applications where a Li-ion battery, typically ~2.9V-4.2V, is the input. The series conversion would degrade the overall efficiency to 64% even if each stage achieves 80%. It is therefore desirable to be able to support this conversion in one stage.

This chapter describes a fully integrated switched-capacitor voltage regulator prototype enabling integrated circuits to interface directly with a Li-ion battery. Circuit design techniques are proposed to reduce the parasitics of MOS capacitors and simplify gate drivers while supporting multiple topologies (and hence input voltages). The circuits are verified by measurements of a proof-of-concept converter prototype implemented in 0.64 mm<sup>2</sup> of a 65nm bulk LP CMOS process. The converter uses two reconfigurable topologies (1/3 and 2/5) to support the Li-ion battery voltage range, and achieves >73% efficiency at 0.19 W/mm<sup>2</sup> output power density. A sub-ns response control loop maintains <76 mV voltage droop out of a 1V regulated output under a full load step of 0 $\rightarrow$ 0.253 A/mm<sup>2</sup> in 50ps.

#### 4.1. Introduction

Over the three decades since its invention [41], owing to its high energy density, the Lithium-ion battery has remained the dominant power source in mobile devices. As illustrated in Figure 4.1, while the voltage of this popular battery has remained $\sim$ 2.9V-4.2V (nominally $\sim$ 3.6V), the supply voltage required for both digital and analog circuits in processors and SoCs has scaled down to $\sim$ 1V and below in order to save power while improving processing performance [1]. To bridge the difference between the battery voltage and the circuit supplies, off-chip power management ICs (PMICs) are typically required.

Figure 4.1. Battery voltage versus circuit supply

In order to support the high input voltage of a Li-ion battery, the converters published in [37,42-45] would require another voltage conversion stage. For example, the design in [37] as well as other similar designs [42-44] support only ~2V inputs, and hence a complete system would require another series conversion stage from the battery, which would seriously degrade the overall achievable efficiency. The design reported in [45] supports the battery voltage range, but with only a single 2:1 step-down topology and a power density of ~0.05W/mm², it has limited (<55%) overall efficiency in providing the ~1V output required by many of today's digital circuits. There is therefore strong motivation for efficient, fully integrated voltage regulators (IVRs) that interface directly with the battery while supporting multiple separate on-chip supplies, as shown in Figure 4.2.

In order to extend these previous designs, a prototype of a battery-connected switching regulator will be discussed in this chapter. This design supports high power density with a 1V-regulated output from a Li-ion battery using a multi-topology SC converter. Section 4.2 presents the reconfigurable topologies (in this design, 1/3 and 2/5) with a detailed discussion of a proposed partial series-to-parallel topology supporting the 2/5 mode. Section 4.3 presents detailed circuit techniques implemented in the design, including body biasing to reduce parasitic capacitance for flying capacitors, switch drivers to support reconfigurable topologies, and auxiliary converters to generate intermediate voltage rails for switch drivers. To complete a regulator design, Section 4.4 discusses a closed-loop regulation control with a sub-ns response to maintain sufficiently low voltage droop under worst-case load transient conditions.
Measurement results from the prototype regulator verifying the predicted performance and proposed design techniques are presented in Section 4.5, and the chapter is finally summarized in Section 4.6.

Figure 4.2. Direct battery connected power distribution using IVRs

#### 4.2. Partial Series-Parallel Topology

In order to efficiently support the wide input voltage range of a Li-ion battery, topology reconfiguration is once again critical for this converter design. As shown in Figure 4.3, a 2/5 topology is proposed to efficiently support the lower range of battery voltage (Vi<3.35V), while the 1/3 topology [37] supports the higher battery voltage range (Vi>3.35V). The converter core combines 18 converter units working in 18 interleaved phases to reduce output voltage ripple and improve efficiency [37]. Each converter unit, modified from 2 converter units described in Chapter 3 and [37], consists of 4 capacitors and 15 switches, and can reconfigurably implement either of these topologies.

As illustrated in Figure 4.3, the converter operates in two phases. In the 1/3 mode, the bridge switch M8 is always OFF and the two halves of the converter unit work in parallel. In the first phase $\Phi$ 1, the capacitors C11 (C21) and C12 (C22) are connected in series by switches M11, M14 and M17 (M21, M24 and M27) between the input Vi and output Vo. In the second phase $\Phi$ 2, all the capacitors are connected in parallel between Vo and GND. In this conversion mode, the voltage over all capacitors is ideally Vo, making the ideal conversion ratio from Vi to Vo equal to 1/3. When supplying a regulated output of 1V from a 3.6V input, the 1/3 operating mode for this converter has a theoretical peak efficiency of 83.33% (i.e. the ideal output voltage is 1.2V). Since the converter will always have a finite output impedance associated with it, this voltage drop from the ideal output to the required voltage (i.e. 0.2V in this case) is used for load regulation.
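The efficiency bound and regulation headroom of the 1/3 mode can be illustrated numerically. The following is a minimal Python sketch, assuming an otherwise lossless converter in which the only loss is the drop from the ideal (unloaded) output to the regulated output; the function names are illustrative and do not come from the thesis.

```python
from fractions import Fraction


def ideal_output(vi, ratio):
    """Unloaded output voltage for a given input voltage and conversion ratio."""
    return vi * float(ratio)


def peak_efficiency(vi, vo, ratio):
    """Upper bound on efficiency when regulating to vo: vo / (ratio * vi).

    A value above 1.0 means the target output cannot be reached at all.
    """
    return vo / ideal_output(vi, ratio)


def headroom(vi, vo, ratio):
    """Voltage drop available for load regulation (negative: out of regulation)."""
    return ideal_output(vi, ratio) - vo


ONE_THIRD = Fraction(1, 3)
VO_TARGET = 1.0  # regulated output in volts

# Sweep a few battery voltages across the Li-ion range.
for vi in (4.2, 3.6, 3.35, 3.0, 2.9):
    eta = peak_efficiency(vi, VO_TARGET, ONE_THIRD)
    margin_mv = headroom(vi, VO_TARGET, ONE_THIRD) * 1e3
    print(f"Vi = {vi:.2f} V: eta_max = {eta:.2%}, headroom = {margin_mv:.0f} mV")
```

At Vi = 3.6V the sketch reproduces the 83.33% bound and the 0.2V of headroom quoted above, and it shows the headroom shrinking toward zero as the battery voltage approaches 3V.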
When the battery voltage drops close to 3V, the converter cannot tolerate any voltage drop from the ideal, and thus the converter would go out of regulation if only the 1/3 mode is utilized. Therefore, another conversion mode is needed. Figure 4.3. Proposed reconfigurable SC converter topologies to support ~1V output across a Li-ion battery input voltage range As discussed in Chapters 2 and 3, the series-parallel topology is typically the best choice for integrated implementations since it provides the lowest intrinsic output impedance for a given total capacitance (i.e. it has the best capacitance utilization). Therefore, since capacitors dominate the area, it usually results in the most compact implementation for a given efficiency (i.e., highest power density). In order to maintain the optimal capacitance utilization, during one of the phases of operation (of a non-interleaved design) all of the capacitors should be connected in parallel to either the output or between the input and the output. The next series-parallel conversion ratio (i.e., topology) with this characteristic after 1/3 would be 1/2. If the 1/2 mode is used when Vi = 3V—making the ideal output voltage 1.5V—the converter with the output regulated at 1V will have the efficiency theoretically limited to 66.67%. In order to provide intermediate conversion ratios more efficiently, a 2/5 conversion ratio would be required, enabling a theoretical peak efficiency of 83.33% at Vi = 3V and Vo = 1V. To implement the 2/5 conversion, a traditional series-parallel converter includes 6 equally-sized capacitors, operated in 2 phases, as shown in Figure 4.4. To characterize the capacitor utilization of this structure, let us assume a two-way interleaved design where half of the capacitors are configured as in $\Phi$ 1 and the other half as in $\Phi$ 2. In this case, the capacitance seen at Vo is $13C_{total}/72$ , comprised of the sum of $C_{total}/18$ to Vi and $C_{total}/8$ to GND. 
Compared to the 1/2 and 1/3 series-parallel configurations, in which there is at least $C_{total}/2$ connected to either Vi or GND, the traditional 2/5 topology suffers a large degradation in capacitance utilization. As also discussed in Chapter 2, this degradation has a direct impact on the output impedance and efficiency of the converter and thus could negate the purpose of adding this additional topology.

Figure 4.4. Two phase operation of traditional 2/5 and 3/5 series-parallel SC converters

In order to enable intermediate conversion ratios with better capacitance utilization, a modified version of the series-parallel topology, which we have named partial series-to-parallel (PS2P), is proposed. As shown in Figure 4.3, the 2/5 converter also operates in two phases. The first phase Φ1 in this mode is similar to that in the 1/3 mode, where the capacitors C11 (C21) and C12 (C22) are connected in series by switches M11, M14 and M17 (M21, M24 and M27) between Vi and Vo. However, in the second phase Φ2, while the two top capacitors C11 and C21 are connected in parallel between Vo and GND (as in the 1/3 mode), the two bottom capacitors C12 and C22 are connected in series between Vo and GND by switches M25, M16 and the bridge switch M8. Therefore, while the voltage over C11 and C21 is Vo, C12 and C22 have Vo/2 across them. The conversion ratio can be calculated using the phase Φ1 configuration in Figure 4.3. Since C12 (C22), blocking Vo/2, is in series with C11 (C21), blocking Vo, and this stack sits on top of Vo when connected to Vi, we have Vi = Vo + Vo + Vo/2 = 5Vo/2, so the conversion ratio from Vi to Vo is 2/5. M15 and M26 are always kept off in this mode.

During the operation of this 2/5 mode, the four capacitors can be considered as grouped into two capacitor groups: $C_{top}$ (including C11 and C21), and $C_{bottom}$ (including C12 and C22). The two capacitor groups are connected in series in $\Phi$ 1 and in parallel in $\Phi$ 2.
While the two top capacitors of the $C_{top}$ group are always in parallel, the two bottom capacitors C12 and C22 are changed from a parallel connection in $\Phi$ 1 to series in $\Phi$ 2. This is the fundamental difference in the operation of the partial series-to-parallel topology, compared with the traditional series-to-parallel topology. The complementary conversion of the 2/5 mode is a 3/5 mode that can also be implemented using two capacitor groups (as shown in Figure 4.5). However, the two capacitor groups are connected in parallel in phase $\Phi 1$ and in series in phase $\Phi 2$ , while the top group capacitors (C21 and C11) partially switch from being connected in series in $\Phi 1$ to parallel in $\Phi 2$ .

Figure 4.5. The two phase operation of a 3/5 mode

Once again assuming a two-way interleaved design to elucidate the capacitor utilization of the 2/5 PS2P mode, the capacitance seen at Vo is $7C_{total}/16$ , i.e., $C_{total}/8$ to Vi combined with $5C_{total}/16$ to GND. This is >2.42X better than the traditional series-to-parallel (TS2P) topology in capacitance utilization. The same result applies to the comparison between the complementary conversions of the PS2P and TS2P topologies. Therefore, to optimize for capacitance utilization and performance, TS2P can be used for conversion ratios of 1/2, 1/3, 2/3, 1/4, 3/4, etc., while PS2P can be employed for all the intermediate conversion ratios.

Figure 4.6. Operation of traditional series-to-parallel topology

The operation and conversion ratio of the traditional series-to-parallel (TS2P) topology and the new partial series-to-parallel (PS2P) topology can be generalized. With m capacitors (capacitor groups), a TS2P converter has all the capacitors (capacitor groups) connected in series in phase 1 and then connected in parallel in phase 2.
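The capacitance-utilization comparison between the PS2P and TS2P implementations of the 2/5 ratio, given earlier in this section, can be checked with exact rational arithmetic. The sketch below only reproduces the sums stated in the text (it does not derive them from the circuit), and the variable names are illustrative.

```python
from fractions import Fraction

# Capacitance seen at Vo as a fraction of C_total, for a two-way
# interleaved 2/5 converter. The individual terms are taken from the text.
ts2p = Fraction(1, 18) + Fraction(1, 8)   # traditional: to Vi + to GND
ps2p = Fraction(1, 8) + Fraction(5, 16)   # partial:     to Vi + to GND

# Ratio of the two utilizations, kept as an exact fraction.
improvement = ps2p / ts2p

print(f"TS2P: {ts2p} of C_total")
print(f"PS2P: {ps2p} of C_total")
print(f"PS2P / TS2P = {improvement} ~ {float(improvement):.3f}")
```

The exact ratio is 63/26, slightly above the 2.42X figure quoted in the text.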
This TS2P operation results in conversion ratios of 1/(m+1) or m/(m+1) (depending on whether the series configuration is connected between Vi-Vo or Vo-GND in phase 1), as shown in Figure 4.6.

Figure 4.7. Operation of partial series-to-parallel topology using 2 groups of capacitors

While the traditional series-to-parallel configuration is favorable in supporting all conversion ratios of 1/(m+1) and m/(m+1), where m is an integer, the partial series-to-parallel topology is preferred in supporting intermediate conversion ratios. The operation of the PS2P topology using two groups of capacitors (Cx1 and Cx2) is generalized as shown in Figure 4.7. Depending on the connections to Vi-Vo or Vo-GND in the two phases of operation, a conversion ratio of m/(2m+1) or (m+1)/(2m+1) can be achieved. As m increases, the conversion ratios will approach 1/2 from 1/3 or 2/3. We can also extend the same operation to k groups of capacitors, where (k-1) groups are operated in TS2P and one group is operated in PS2P, as shown in Figure 4.8. The obtainable conversion ratio is m/(km+1) or (m+1)/(km+1). Similarly, as m increases, the conversion ratios will approach 1/k from 1/(k+1) or 2/(k+1). As a result, fine conversion ratio steps can be achieved. However, the complexity in switch implementation and capacitor utilization will limit the efficiency of a converter with high k and/or high m. Therefore, a careful analysis is necessary to decide if additional topologies can efficiently cover a wide and fine range of conversion ratios.

Figure 4.8. The operation of partial series-to-parallel topology using k groups of capacitors

In this design prototype for mobile applications, the converter is designed with two configurable topologies (1/3 and 2/5) as mentioned above. Intermediate conversion ratios will be generated using linear regulation through controlling the converter switching frequency, as described in Chapter 2.

#### 4.3. Circuit Techniques

As shown in Chapter 3, in order to verify the design methodology discussed in Chapter 2, the first converter prototype was implemented in an SOI technology with the benefit of low (2%-3%) parasitic bottom-plate capacitance. While SOI has been utilized for several generations of desktop/server microprocessors, chips designed for the mobile market have utilized almost exclusively bulk processes. Therefore, TSMC's 65nm bulk LP CMOS process was chosen to implement this SC converter prototype that aims to verify the concept of the battery-connected converter with the new reconfigurable topology [26]. In this design, it is also critical to optimize flying capacitors, switch width, and switching frequency to achieve high efficiency and minimal area overhead. This optimization was done by applying the methodology discussed in Chapter 2.

The key challenges that need to be overcome in this design lie in the change in process from SOI to bulk, and in the wide and high range of input voltages from the Li-ion battery. The key to the first challenge is to reduce the effect of the parasitic capacitance of the flying capacitors implemented using CMOS transistors in bulk CMOS technology. Next, the high and wide input voltage range and the wide variation in flying capacitor voltage levels across the different topologies make it challenging to efficiently drive the power transistors, for which one would like to use thin-oxide transistors in order to maximize efficiency. In addition, handling a stringent load step of 5 A/mm²/ns makes the design of the converter control loop another serious challenge. This section therefore proposes and describes circuit techniques to address these challenges.

#### 4.3.1. Body Biasing to Reduce Parasitic Capacitance

In this design that uses a commercial CMOS process, the flying capacitors are implemented using MOS transistors since they have the highest available capacitance density.
The issue with a bulk transistor is its undesirable parasitic capacitance while its body is tightly biased. The vertical structure and the capacitance model of a bulk PMOS transistor used to implement flying capacitors are shown in Figure 4.9. In operation, the transistor is biased with a negative Gate-Source/Drain voltage (i.e. $V_{GS} < V_{T,PMOS}$ ) so that it is in ON state. The capacitance $C_{\text{gate}}$ is the capacitance from the gate to the channel formed between source and drain, representing the flying capacitance used to transfer power. In a traditional implementation (Figure 4.9(a)), the N-well body of this transistor would be biased at a fixed voltage that is higher than the maximum voltage level (i.e. relative to GND) that the Source-Drain terminal voltage $V_{SD,GND}$ can reach in operation. In this design, it could be connected to the input voltage Vi, which is the highest voltage on the die. However, a fixed connection of the body will leave the capacitance $C_C$ between the channel and N-well to be the undesirable parasitic capacitance. In a normal bulk process, $C_C$ is often 10%-20% of $C_{gate}$ , seriously degrading the converter's overall efficiency, as discussed in Chapter 2, Equation 19. For example, a $C_C$ of 15% of $C_{gate}$ limits the maximum achievable efficiency of a 2:1 converter to 72.08%. Figure 4.9. Flying capacitors using Bulk CMOS transistors with (a) traditional biasing and (b) proposed biasing scheme As shown in the vertical structure of the PMOS capacitor in Figure 4.9(a), the parasitic capacitance $C_W$ from the N-well body to substrate is strongly connected to the fixed potentials Vi and GND. Since the substrate usually has a very low doping level compared with the transistor's channel, $C_W$ is >10 times smaller than $C_C$ . Therefore, it is more desirable to have $C_W$ than $C_C$ as the parasitic capacitor. In order to realize this, a big resistor R1 is introduced for the N-well bias, as shown in Figure 4.9(b). 
The value of R1 is designed to be large enough to effectively put $C_C$ and $C_W$ in series for AC operation, while still DC-biasing the N-well at Vi. As a result of the series connection, the effective parasitic capacitance of the flying capacitors is determined by the smaller capacitor $C_W$ . To make the circuit operate as intended, the value of R1 must be chosen within a certain range. The lower limit of this range is set by ensuring that the time constant ( $\tau = R1 \cdot (C_C + C_W)$ ) is significantly (e.g. >5X) larger than the converter's largest period (i.e. lowest frequency). The upper limit is set by considering the leakage via $C_C$ and $C_W$ . In this design, R1 = 800 k $\Omega$ for $C_C \approx 8.1$ pF (i.e. $\sim 15\%$ of $C_{gate} = 54$ pF), and the lowest frequency at which this converter can operate is 1MHz. Employing this biasing technique, an effective parasitic capacitance of 0.8% $C_{gate}$ is achieved at terminal SD in post-layout extraction, and the total parasitics, including fringe and connection wire parasitics, are $\sim 1.2\% C_{gate}$ , as confirmed later by our measured efficiency. Using the unsalicided poly resistor provided by TSMC, the implementation of R1 only occupies 0.74% of the converter area while enabling >10X reduction in parasitic capacitance. This also shows that there was no fundamental advantage in implementing the first prototype described in Chapter 3 in an SOI technology. With the minimal effort of biasing the well using this method, we can achieve a parasitic capacitance that is even lower than in the SOI CMOS technology (i.e. $1.2\% C_{gate} < \sim 3\% C_{gate}$ ) [37].

#### 4.3.2. Switch Drivers

As mentioned in Chapter 2, due to the fact that they offer the lowest $V_{sw}^2R_{on}C_{gate}$ metric, it is desirable to utilize native thin-oxide devices to maximize the converter's efficiency.
However, while stepping down to a regulated output of ~1V, the converter has a high input voltage range of 2.9V-4V from the Li-ion battery, leading to high voltage stress on nearly half of the switches (Figure 4.3). Table 4.1 shows the voltages of the converter switching nodes in the two phases of operation in the two modes (i.e. 1/3 and 2/5 conversions) and the switches connected to these nodes. In operation, switches M11, M12, M21 and M22 experience a voltage of (Vi-Vo), while M13 and M23 need to handle (Vi+Vo)/2 in the 1/3 mode and (Vi+Vo/2)/2 in the 2/5 mode. These voltages lie in the range of 2.05V (i.e. (Vi+Vo/2)/2) to 2.6V (i.e. (Vi-Vo)) for a nominal input of 3.6V, and are higher than the breakdown voltage of thin-oxide devices in this 65nm technology. To avoid breakdown without using thick-oxide devices (which would lead to higher conduction/switching loss and thus lower efficiency), these 6 switches are implemented utilizing cascoded thin-oxide devices as shown in Figure 4.10. The only thick-oxide (2.5V) switch in the entire converter is M8, which will be discussed in detail shortly.

Each converter unit operates in 2 non-overlapping clock phases $\Phi 1$ and $\Phi 2$ with controllable dead-time. Using intermediate voltage rails 2Vo and Vi-Vo, the generation of which will be described shortly, and versions of $\Phi 1$ and $\Phi 2$ that swing between Vo and GND, Vi and Vi-Vo (labeled \_h1), or 2Vo and Vo (labeled \_h2), it is straightforward to drive the gates of most of the switches. The drivers of M8, M15 and M26 utilize additional circuits (Figure 4.10) to ensure robust and efficient operation in the two configurable topologies. First, switches M15 and M26 should always be off in the 2/5 configuration. While that can be realized simply for M26's driver with a logical AND of the clock $\Phi$ 2 and signal T31 (which would be driven "High" in the 1/3 configuration), the implementation of the M15 driver requires more design effort.
Since the source of M15 is driven below the logic "Low" rail of a standard inverter driver in phase $\Phi 2$ of operation, M15 could turn on unintentionally. Therefore, dual-rail power gating and voltage clamps (M15a and D15) are added to IV15 in order to ensure these switches remain off, as illustrated in Figure 4.10. In the 1/3 conversion mode, the small switch M15a always has $V_{GS} \leq 0$ , turning off the current path through the voltage clamps, while the dual-rail power gating transistors are turned on and thus IV15 functions off the voltage rails 2Vo-Vo as normal.

| Node | n = 1/3, Φ1 | n = 1/3, Φ2 | n = 2/5, Φ1 | n = 2/5, Φ2 | Connected switches (the other terminal connection) |
|--------------|-----------|-----|-------------|------|-------------------------------------------------------|
| C11_Positive | Vi | Vo | Vi | Vo | M11 (Vi), M12 (Vo) |
| C11_Negative | (Vi+Vo)/2 | GND | (Vi+Vo/2)/2 | GND | M13 (GND), M14 (C12_Positive) |
| C12_Positive | (Vi+Vo)/2 | Vo | (Vi+Vo/2)/2 | Vo/2 | M15 (Vo), M14 (C11_Negative), M8 (C22_Negative) |
| C12_Negative | Vo | GND | Vo | GND | M16 (GND), M17 (Vo) |
| C21_Positive | Vi | Vo | Vi | Vo | M21 (Vi), M22 (Vo) |
| C21_Negative | (Vi+Vo)/2 | GND | (Vi+Vo/2)/2 | GND | M23 (GND), M24 (C22_Positive) |
| C22_Positive | (Vi+Vo)/2 | Vo | (Vi+Vo/2)/2 | Vo | M25 (Vo), M24 (C21_Negative) |
| C22_Negative | Vo | GND | Vo | Vo/2 | M26 (GND), M27 (Vo), M8 (C12_Positive) |

Table 4.1. Node voltages in operation

Figure 4.10. Proposed SC converter with power switches and gate driver circuitry

Next, switch M8 connected between the C12\_Positive and C22\_Negative terminals needs to stay off in the 1/3 mode while being active in the 2/5 configuration. The challenge for this switch driver is that C12\_Positive (C22\_Negative) is switching between (Vi+Vo)/2 and Vo (Vo and GND) in the 1/3 mode, but between (Vi+Vo/2)/2 and Vo/2 (Vo and Vo/2) in the 2/5 configuration, as shown in Table 4.1.
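To make the M8 driver problem concrete, the node-voltage expressions of Table 4.1 can be evaluated numerically. The following Python sketch assumes ideal node voltages, Vo = 1V, and example inputs of Vi = 3.6V for the 1/3 mode and Vi = 3.0V for the 2/5 mode; these operating points and the function name are chosen for illustration only.

```python
def m8_terminal_voltages(vi, vo, mode):
    """Ideal (phase 1, phase 2) voltages at the two M8 terminals, per Table 4.1.

    Returns a pair of tuples: (C12_Positive voltages, C22_Negative voltages).
    """
    if mode == "1/3":
        c12_positive = ((vi + vo) / 2, vo)
        c22_negative = (vo, 0.0)
    elif mode == "2/5":
        c12_positive = ((vi + vo / 2) / 2, vo / 2)
        c22_negative = (vo, vo / 2)
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return c12_positive, c22_negative


# Example operating points (assumed, not measured data).
for mode, vi in (("1/3", 3.6), ("2/5", 3.0)):
    c12_positive, c22_negative = m8_terminal_voltages(vi, 1.0, mode)
    for phase, (vp, vn) in enumerate(zip(c12_positive, c22_negative), start=1):
        print(f"mode {mode}, phase {phase}: C12_Positive = {vp:.2f} V, "
              f"C22_Negative = {vn:.2f} V, across M8 = {vp - vn:.2f} V")
```

Under these assumptions, both M8 terminals sit at Vo/2 in phase 2 of the 2/5 mode (M8 conducting), whereas in the 1/3 mode M8 must block a nonzero voltage in both phases while its terminals swing over a different range.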
If M8 is implemented using a thin-oxide device, its driver will need to switch the gate of M8 between Vo and GND to ensure it stays off without danger of breakdown in the 1/3 mode, but then between 2Vo and GND to operate M8 in the 2/5 mode. Consequently, the design of the M8 driver would be very complex, potentially negating the benefit of using a thin-oxide device. Therefore, to simplify this switch driver, M8 is implemented with a thick-oxide device. The use of this single thick-oxide transistor incurs an insignificant penalty in overall efficiency since it is responsible for only 6.7% of switch resistive losses in the 2/5 topology, which is active for less than 20% of a battery cycle (Vi<3.35V) [46]. As shown in Figure 4.10, the M8 driver is implemented with a thin-oxide inverter IV8 and dual-rail power gating. The gate of M8 is kept at GND in the 1/3 mode and switched in the 2Vo-Vo domain in the 2/5 configuration. The NOR gate h2 is added to set the logic LOW at the output of h2 when T31\_h2 is connected to 2Vo, avoiding possible breakdown for the PMOS transistor of IV8 in the 1/3 mode.

Since one of the biggest challenges in designing SC converters lies in switch driver design, it is often beneficial to simplify the drivers as much as possible. One good design technique is to enable as many switches as possible to be controlled automatically by the actions of the other switches. This technique of automatically sequencing switch operations can help minimize the complexity of switch drivers. In this design in particular, switches M14 and M24 are implemented using PMOS transistors with their gates always tied to Vo, and require no explicit switch drivers. The operation of these two switches is synchronized by the other switches. For example, in the 1/3 configuration, in $\Phi 1$ when C11_Positive (C12_Negative) is driven to Vi (Vo) by M11 (M17), C11_Negative and C12_Positive are pulled up to close to (Vi+Vo)/2.
As a result, a gate-to-source voltage of magnitude $|V_{GS,M14}| \approx Vo$ is applied and turns on M14, connecting C11 and C12 in series as intended. In $\Phi 2$, C11_Negative and C12_Positive are driven by M13 and M15 to GND and to Vo, respectively. Since neither of these voltages at its source and drain terminals exceeds the gate voltage Vo, the PMOS M14 is turned off as desired, allowing C11 and C12 to be connected in parallel between Vo and GND. Therefore, although it performs a similar function to the bridge switch M5 of the converter prototype reported in Chapter 3 [2], the M14 "driver" in this converter is dramatically simplified. The N-well body of M14 is biased at the intermediate voltage 2Vo (i.e. close to (Vi+Vo)/2) to avoid undesirable body effect when the switch is ON.

As mentioned above, intermediate voltage rails 2Vo and Vi-Vo are required in order to bias the cascode transistors (i.e. to avoid breakdown) and the N-wells of switches M14 and M24, and to power gate the drivers. In the next section, the design of the auxiliary converters used to generate these rails will be presented.

#### 4.3.3. Intermediate Voltage Rail Auxiliary Converters

In order to generate the intermediate voltage rails 2Vo and Vi-Vo, two auxiliary SC converters are used. The simplified versions of the auxiliary SC converter units are shown in Figure 4.11. As illustrated, both converters can utilize the same unit circuit with 1 flying capacitor and 4 switches; differences in the rail connections and operation of the switches define which converter is actually implemented. While switches ML1, ML2 and MH2 are connected to Vo, GND and the output of the auxiliary converter, respectively, in both converters, MH1 is connected to Vo in the 2Vo converter and to Vi in the Vi-Vo converter. The auxiliary converter output voltage is defined by the two-phase operation of the switches and capacitor.
The flying capacitor is connected between Vo and GND (Vi and Vo) in $\Phi 1$, and between the output of the 2Vo converter and Vo (the output of the Vi-Vo converter and GND) in $\Phi 2$. Since the capacitor is charged to Vo (Vi-Vo) in $\Phi 1$, it generates 2Vo (Vi-Vo) at the output in $\Phi 2$ of the 2Vo (Vi-Vo) converter. While thin-oxide devices can be used to implement the 2Vo converter, the flying capacitor in the Vi-Vo converter requires a thick-oxide device (which offers higher capacitance density than stacking two thin-oxide capacitors), as it needs to handle Vi-Vo across its two terminals. However, all of the switches can utilize low-voltage devices because they only need to block a voltage of Vo (i.e. $\sim 1$V).

Figure 4.11. Auxiliary SC converters for intermediate voltage rail generation

Figure 4.12. The proposed converter unit for auxiliary SC converters and timing diagram

Figure 4.13. (a) Break and (b) Make operations of the proposed auxiliary SC converters

Given the above-mentioned similarity between the two auxiliary converters, one converter unit can be used to implement both of them. As shown in Figure 4.12, the proposed converter unit has the high-side rail connections Vdd-H and Vss-H set to 2Vo and Vo (Vi and Vi-Vo) in the 2Vo (Vi-Vo) converter. The converter unit has three stages: power conversion, non-overlap control, and signal level shift. The power conversion stage uses transistors MH1-2, ML1-2 and capacitor $C_{\rm pwr}$ to generate the new rail. In order to minimize shoot-through current and achieve high efficiency, non-overlap is required in the operation of the power switches MH1-2 and ML1-2. This non-overlap operation is implemented using the "break-before-make" (BBM) technique, which will be detailed shortly. The signal level-shift stage includes capacitor $C_{\rm sig}$ to couple/level-shift the control signals and small transistors MF1-2 to ensure the rail-to-rail operation of the level-shifted signal at node H.
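The two-phase charge-and-stack operation that defines the auxiliary rails can be written out as a small worked example. The Python sketch below is an illustration only: the function names are hypothetical, the switches are treated as ideal, and the loaded case assumes the textbook slow-switching-limit output resistance $1/(f_{sw} C)$ of a single flying capacitor that transfers the full output charge once per cycle, which is an assumption made here rather than a value taken from this chapter.

```python
# Illustrative sketch (not part of the thesis): ideal auxiliary rail voltages.

def aux_rail_ideal(kind, vi, vo):
    """No-load output of the 2Vo or Vi-Vo auxiliary converter."""
    if kind == "2Vo":
        v_cap = vo - 0.0   # phase 1: capacitor charged between Vo and GND
        return vo + v_cap  # phase 2: capacitor stacked on top of Vo
    if kind == "Vi-Vo":
        v_cap = vi - vo    # phase 1: capacitor charged between Vi and Vo
        return 0.0 + v_cap # phase 2: capacitor placed between output and GND
    raise ValueError("kind must be '2Vo' or 'Vi-Vo'")

def aux_rail_loaded(kind, vi, vo, i_load, f_sw, c_fly):
    """Output under load, ASSUMING an output resistance of 1/(f_sw * c_fly)."""
    r_out = 1.0 / (f_sw * c_fly)
    return aux_rail_ideal(kind, vi, vo) - i_load * r_out

if __name__ == "__main__":
    print(aux_rail_ideal("2Vo", 3.6, 1.0))
    print(aux_rail_ideal("Vi-Vo", 3.6, 1.0))
    # Hypothetical load, frequency and capacitance, for illustration only.
    print(aux_rail_loaded("2Vo", 3.6, 1.0, i_load=1e-3, f_sw=100e6, c_fly=100e-12))
```

Under this simplified model the droop of each rail scales as $I/(f_{sw} C)$, so a gate-driver load current that grows in proportion to the switching frequency leaves the rail voltage roughly unchanged.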
To implement the BBM operation for the power conversion stage, the non-overlap stage is designed to ensure that all the power switches that were previously ON are turned OFF before the switches that should be active in the next phase are turned ON. As shown in the timing diagram of Figure 4.12 and in Figure 4.13a, power switch ML1 (ML2) is turned off by ML3 (ML8) right after the clock signal is received at node L, implementing the "Break" operation. The "Make" operation, i.e. turning on ML2 (ML1), is activated only when the clock signal arrives at node L1, after the delay of buffer BF2 from L, and turns on ML7 (ML4) in series with ML6 (ML5). As a result, the BBM operation is realized in the bottom circuit.

The "Break" operation of the top circuit is implemented similarly to that of the bottom circuit and is synchronized with it by the signal capacitor $C_{\rm sig}$ (i.e. node H transitions in the top voltage domain at the same time as node L in the low domain). However, it is desirable to delay the "Make" operation of the top circuit until after the "Make" operation of the bottom circuit has completed and $C_{\rm pwr}$ has already transitioned. To implement this, after the "Make" operation of the bottom circuit turns on ML2 (ML1), pulling $L_{\rm pwr}$ to GND (Vo) (i.e. the "Make" operation of the bottom circuit has finished), node $H_{\rm pwr}$ moves to Vi-Vo (Vi) and turns on the "Make" transistor of the top circuit MH7 (MH4), which then turns on MH2 (MH1). This sequence is designed to guarantee the BBM operation not only within each of the top and bottom circuits but also across the two circuits, avoiding potential timing mismatch and current shoot-through problems across all corners.

These two auxiliary converters, occupying 7.5% of the total circuit area, utilize 4-way interleaving.
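The ordering constraints of the break-before-make sequence described above can be captured in a small timing model. This Python sketch is purely illustrative: the delay names and their values are hypothetical placeholders rather than measured or simulated numbers, and each circuit action is collapsed into a single event time.

```python
# Illustrative sketch (not part of the thesis): event ordering of the BBM scheme.
from dataclasses import dataclass

@dataclass
class BbmDelays:
    t_break: float     # clock edge at L (or H) -> previously ON switch turned off
    t_bf2: float       # delay of buffer BF2 from node L to node L1
    t_make: float      # clock at L1 -> next bottom switch turned on
    t_top_make: float  # H_pwr settled -> next top switch turned on

def bbm_event_times(d, t_edge=0.0):
    """Return the time of each Break/Make event for one clock edge."""
    bottom_break = t_edge + d.t_break
    top_break = t_edge + d.t_break          # synchronized with the bottom via C_sig
    bottom_make = t_edge + d.t_bf2 + d.t_make
    top_make = bottom_make + d.t_top_make   # triggered by the H_pwr transition
    return {
        "bottom_break": bottom_break,
        "top_break": top_break,
        "bottom_make": bottom_make,
        "top_make": top_make,
    }

def is_break_before_make(events):
    """True if both Breaks precede the bottom Make, which precedes the top Make."""
    last_break = max(events["bottom_break"], events["top_break"])
    return last_break < events["bottom_make"] < events["top_make"]

if __name__ == "__main__":
    # Hypothetical delays in seconds, for illustration only.
    ok = BbmDelays(t_break=20e-12, t_bf2=60e-12, t_make=20e-12, t_top_make=30e-12)
    print(is_break_before_make(bbm_event_times(ok)))
```

In this model the sequence is safe whenever the BF2 delay plus the Make delay exceeds the Break delay, which is the role the buffer plays in the bottom circuit.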
Since the auxiliary converters are designed to support the drivers of the main SC converter switches, the clocks for these interleaving phases can be obtained from the 18-phase ring-oscillator clock generator of the main converter. By utilizing the same switching frequency as the main SC converter, the auxiliary converters' output impedance scales along with the current of the gate drivers (i.e. their loading circuits), and thus they do not require their own explicit closed-loop control circuits.

#### 4.3.5. Clock Level Shifter

Since all the control circuits are powered by the Vo-GND domain, level shifters are required to provide the clock signals to the other two voltage domains, 2Vo-Vo and Vi-(Vi-Vo). This level shifter is required to operate over a wide range of frequencies (~1MHz to 300MHz). The low limit in switching frequency is needed to provide high efficiency at light load. The clock level shifters described in Chapter 3 have an RC discharge effect that limits the lowest frequency at which they can function reliably (i.e. without crowbar current in the top inverters). Therefore, in this design a latch-based level shifter is used, as shown in Figure 4.14. The small feedback inverters ensure rail-to-rail operation of the level-shifted signals, allowing arbitrarily low clock frequency. This circuit works particularly well in this design, where all three domains ideally have the same effective supply voltage of ~1V.

Figure 4.14. Clock level shifter

#### 4.4. Sub-ns Response Regulation

In order to turn the SC converter into a regulator, closed-loop control is required. As discussed in Chapter 2 (i.e. Equations 9 and 6), the output impedance of the regulator can be set by controlling the converter's switching frequency. Such control can be implemented using the simple closed-loop architecture shown in Figure 4.15. The loop comprises a comparator Comp1, a charge-pump integrator, and a VCO, and regulates the steady-state output to the reference voltage $V_{ref}$.
In this architecture, the output voltage Vo is sampled and compared with reference $V_{ref}$ by comparator Comp1. The comparator output goes through a charge-pump integrator to control the input voltage of a 9-stage voltage-controlled ring oscillator (VCO) that generates 18 clock phases for the interleaved SC converter. As long as the comparator's offset is sufficiently low, this loop accurately regulates Vo to the target voltage in steady state. However, due to the use of an integrator in the main loop, and hence the relatively low loop bandwidth typically implied by this, the key challenge for such a design is its response to a $0 \rightarrow I_{max}$ load step.

Figure 4.15. Simple closed-loop controller

Figure 4.16. Complete closed-loop controller with fast transient response

Figure 4.17. Transient diagram for large and small positive load steps

In order to handle a $0 \rightarrow I_{max}$ load step, a fast increase in the converter switching frequency is necessary to keep the output voltage in regulation. Therefore, a fast comparator Comp2 and an analog mux are added to the main control loop, as shown in Figure 4.16. As illustrated by the black curves in the transient diagram in Figure 4.17, if a load transient causes Vo to drop below $V_{r_{low}}$ (which is set to $V_{ref}$-30mV), the fast comparator Comp2 will trigger the mux to bypass the main loop (by turning S3 off) and set the VCO to its maximum frequency $f_{max}$ (by turning on S1 and S2, forcing $V_{ctrl}$ and $V_{integ}$ to $V_{max}$). Due to the isolation from $C_{integ}$ gained by turning S3 off, $V_{ctrl}$ can be nearly instantly reset to $V_{max}$. The instant increase in switching frequency to $f_{max}$ leads to a decrease in the converter's output impedance to $R_{o,min}$ within one switching period $T_{min}$ (i.e. $1/f_{max}$), speeding up the transient response.
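The decision made by Comp2 and the analog mux can be summarized behaviorally. The Python sketch below is an illustration under assumptions, not the actual circuit: the names `MuxState` and `fast_loop_mux` are hypothetical, the comparator is treated as ideal with no delay or hysteresis, and only the 30mV threshold offset and the switch roles (S1, S2, S3) are taken from the text.

```python
# Illustrative sketch (not part of the thesis): behavior of Comp2 and the mux.
from dataclasses import dataclass

V_R_LOW_OFFSET = 0.030  # volts; the text sets V_r_low = V_ref - 30 mV

@dataclass
class MuxState:
    s1: bool        # forces V_ctrl to V_max when on
    s2: bool        # forces V_integ to V_max when on
    s3: bool        # connects the integrator to V_ctrl when on
    v_ctrl: float   # VCO control voltage
    v_integ: float  # integrator voltage

def fast_loop_mux(vo, v_ref, v_integ, v_max):
    """Return the mux state for the present output voltage (ideal comparator)."""
    if vo < v_ref - V_R_LOW_OFFSET:
        # Fast path: bypass the main loop and drive the VCO to f_max.
        return MuxState(s1=True, s2=True, s3=False, v_ctrl=v_max, v_integ=v_max)
    # Normal path: the slow integrating loop controls the VCO.
    return MuxState(s1=False, s2=False, s3=True, v_ctrl=v_integ, v_integ=v_integ)

if __name__ == "__main__":
    # Hypothetical voltages, for illustration only.
    print(fast_loop_mux(vo=0.95, v_ref=1.0, v_integ=0.4, v_max=1.0))
    print(fast_loop_mux(vo=0.99, v_ref=1.0, v_integ=0.4, v_max=1.0))
```

Because the fast path also forces the integrator voltage to $V_{max}$ in this model, the main loop resumes from the maximum frequency when control is handed back to it.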
The response time of the converter, which is defined as the duration between the time that Vo crosses $V_{r_{low}}$ and the time it starts to go back up, depends on the latency of the feedback circuitry, including comparator Comp2, the mux, the VCO, and the clock distribution network. In order to obtain a fast transient response, this latency needs to be short. In this prototype, the response time was designed to be less than 1ns, allowing the SC regulator to guarantee a voltage droop <10% of the nominal Vo under full load current steps.

The blue dotted curves in Figure 4.17 illustrate the converter's transient response to a load step that is smaller than $0 \rightarrow I_{max}$ but still large enough to cause Vo to drop below $V_{r_{low}}$ and trigger the frequency jump to $f_{max}$. Since the load current is not at $I_{max}$, jumping to $f_{max}$ causes Vo to rise very quickly and overshoot the nominal voltage. As soon as Vo goes back above $V_{r_{low}}$, the fast comparator Comp2 will turn switches S1 and S2 off and switch S3 on to return control to the main loop, which will then slowly bring down the switching frequency and thus Vo to its nominal value.

Output voltage overshoot also occurs when the load current steps down abruptly, as shown in Figure 4.18. When the load change is small and does not cause Vo to increase above the upper reference $V_{r_{high}}$ (set to $V_{ref}$+80mV), as illustrated by the blue dotted curves in the figure, the main slow loop will reduce the switching frequency and bring Vo back to its nominal value. However, when there is a large load down-step, as illustrated by the black curves in the figure, Vo can rise above $V_{r_{high}}$. In this case, the settling time, measured from the time Vo jumps up to the time it stabilizes back at its nominal value, is long (e.g. a few microseconds).
In order to shorten the settling and bring Vo down faster, comparator Comp3 and the current sink $i_{sink}$ connected at $V_{ctrl}$ were added. When Vo goes above the upper reference $V_{r_{high}}$, Comp3 turns on $i_{sink}$ in order to discharge $V_{ctrl}$ more quickly and reduce the switching frequency, thus returning Vo to its nominal value faster.

Figure 4.18. Transient diagram for large and small unloading transients

#### 4.5. Measurement Results and Discussions

A die photo of the SC converter implemented in a 65nm bulk LP CMOS process employing the previously described design optimization and circuit techniques is shown in Figure 4.19. The control circuitry and two auxiliary converters occupy ~1% and ~7.5% of the total converter area, respectively. The core converter, occupying an area of ~0.59 mm², has 9 blocks of two 180-degree-out-of-phase converter units. The converter blocks are interleaved to mitigate the losses due to current ripple, as discussed in Chapter 2.

To measure the converter's performance, we used the same on-die load structure, characterization strategy and four-wire sensing technique discussed in Chapter 3. For all of the subsequent measurements, the converter control loop was always active and regulated the output voltage to ~1V. Figure 4.20a shows the converter's measured efficiency across a power density range of ~0.01W/mm²-0.25W/mm². When the input is fixed at 3.6V, the converter efficiency remains above 72% over the full range of power density and peaks at 74.25% at ~50mW/mm². At a nominal operating point of 0.19W/mm² output power density, the converter achieved >73% efficiency. Figure 4.20b shows the measured efficiency with a variable input voltage at this nominal output power density of 0.19W/mm², verifying operation across the Li-ion battery voltage range. All measured efficiencies match the analytical predictions within 1%, further validating the design methodology described in Chapter 2.

Figure 4.19. Die photo

Figure 4.20. Measured efficiency with Vo regulated at 1V and (a) Vi=3.6V, (b) Vi=3V-4V, Pout=0.19W/mm²

Figure 4.21 shows the regulator's measured step response using an on-die load circuit with a rise/fall time of 50ps. The regulator loop is verified to stabilize the output and achieves a 76mV droop (7.6% of Vo) with a full load step from 0 to 0.253 A/mm² (i.e. a slope of ~5 A/mm²/ns) (Figure 4.21a). As mentioned in the previous section, since the low threshold $V_{r_{low}}$ that activates the fast loop is set around 30mV lower than $V_{ref}$, the converter response time is calculated from the time Vo drops below $V_{r_{low}}$ to the time Vo stops dropping and starts to recover. As indicated in Figure 4.21a, the response time of the converter to a full load step is <1ns.

The response of the converter to a full-scale unloading step from 0.253 A/mm² to 0 is shown in Figure 4.21b. This load current change in 50ps abruptly reduces the voltage drop over the output impedance of the SC converter to close to zero. As a result, Vo jumps close to the voltage set by the ideal conversion ratio. Vo does not reach the ideal output of 1.2V (i.e. 3.6V/3) only because of the quiescent current of the converter control circuitry and other parasitic leakage.
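The figures quoted for the load step can be cross-checked with simple arithmetic. The short Python sketch below is only a convenience for that check; the helper names are hypothetical, and the inputs are the values stated in the text (a 0.253 A/mm² step in 50ps, a 76mV droop on a 1V output, and a 3.6V input in the 1/3 configuration).

```python
# Illustrative sketch (not part of the thesis): sanity checks on quoted numbers.

def slope_a_per_mm2_per_ns(delta_j_a_per_mm2, t_rise_ps):
    """Load-current-density slope in A/mm^2/ns for a step completed in t_rise_ps."""
    return delta_j_a_per_mm2 / (t_rise_ps * 1e-3)

def droop_percent(droop_v, vo_nominal_v):
    """Voltage droop as a percentage of the nominal output voltage."""
    return 100.0 * droop_v / vo_nominal_v

def ideal_output(vi, ratio):
    """No-load output voltage for an ideal conversion ratio."""
    return vi * ratio

if __name__ == "__main__":
    print(slope_a_per_mm2_per_ns(0.253, 50))  # about 5 A/mm^2/ns
    print(droop_percent(0.076, 1.0))          # about 7.6 %
    print(ideal_output(3.6, 1 / 3))           # about 1.2 V
```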
In this measurement, it takes ~800ns for the output to settle to within 1% of the nominal Vo of ~1V. Simulation shows that the additional loop formed by Comp3 and the current sink $i_{sink}$ at the $V_{ctrl}$ node, described above (Figure 4.16), shortens the settling of Vo after the overshoot from ~2µs to ~800ns. Although overshoot exists in the transient operation of the converter, it is not as critical as voltage droop, which can cause setup-time violations in the digital loading circuits. The ideal output voltage (i.e. the output voltage at no load), which bounds the maximum overshoot, is chosen so that it does not cause device breakdown or reduce the system lifetime.

In order to test the regulation loop, the SC regulator's steady-state load and line regulation performance was measured, as shown in Figure 4.22. For load regulation (Figure 4.22a), the output load was swept while the nominal output was regulated at 1.005V with Vi = 3.6V. For line regulation (Figure 4.22b), the input voltage was swept over the range of ~2.9V-4V while the output was regulated at 1.005V. The regulator achieved ~4.2%/A/mm² (i.e. ~4.2mV per A/mm²) for load regulation and ~1.75%/V (i.e. ~1.76mV/V) for line regulation.

Figure 4.21. SC regulator transient response at full current (a) step-up and (b) step-down

Figure 4.22. SC regulator performance in (a) load regulation and (b) line regulation

Table 4.2 shows the comparison of the regulator in this work with prior art. While supporting a ~1.5X higher conversion ratio than the previous designs with Vin > 2V, this design achieves essentially identical power density and efficiency, and the highest transient current densities (>4X) and lowest latency reported to date.
With the fast transient response from the proposed control loop and 18-phase interleaving, the regulator requires no output filtering capacitor while being implemented with smaller total capacitance. The two reconfigurable topologies also enable this regulator to support a wide input voltage range while regulating the output at ~1V.

| Work | [50] | [45] | This work |
|------|------|------|-----------|
| Technology | 130nm Bulk | 90nm Bulk | 65nm Bulk |
| Topology | 3 Level | 1/2 SC | 1/3, 2/5 SC |
| Input (V) / Output (V) | 2.4 / (0.6–1.35) | 3.6 / 1.5, 3.0 / 1.3 | (3–4) / 1 |
| Interleave / $C_{fly}$ / $C_{out}$ | 4 / 18nF / 10nF | 10 / 2nF / 3.2nF | 18 / 3.88nF / 0 |
| Efficiency η / η<sub>peak</sub> (%) | 63 / 77 | 74 / 77 | 73 / 74.3 |
| Power density (W/mm²) @ η | 0.2 | 0.05 | 0.19 |
| Load step (mA/mm²) @ t<sub>rise</sub> | 58.6 @ 100ps (220→370mA @ 2.56mm²) | 9.25 @ 25ns (42→72mA @ 3.24mm²) | 253 @ 50ps (0→162mA @ 0.64mm²) |
| Droop | 7.5% (w/ shunt reg.) | 2.1% | 7.6% |

Table 4.2. Comparison with prior art

#### 4.6. Chapter Summary

In this chapter, we reported on a battery-connected DC-DC voltage regulator that can efficiently bridge the voltage gap between a Li-ion battery and modern digital circuit supplies. The converter can be reconfigured to the proposed PS2P (2/5) topology or the TS2P (1/3) topology to regulate the output voltage at ~1V from a ~2.9V-4V input.
We presented a detailed description of the proposed circuit techniques, including body biasing to reduce parasitic capacitance for flying capacitors, switch drivers to support reconfigurable topologies, auxiliary converters to generate intermediate voltage rails for switch drivers, and a sub-ns response control loop to maintain sufficiently low voltage droop under worst-case load transient conditions. The circuits, implemented in 0.64 mm² of a 65nm bulk LP CMOS process, were then verified experimentally. The converter with two reconfigurable topologies (1/3 and 2/5) achieves >73% efficiency at 0.19 W/mm² output power density and maintains efficiency above 72% over the whole range of target power density. The sub-ns response control loop maintains <76 mV voltage droop out of a 1V regulated output under a full load step of $0 \rightarrow 0.253$ A/mm² in 50ps.

The results of this demonstration, achieved in a standard Bulk CMOS process, further indicate that fully integrated switched-capacitor converters are indeed capable of enabling low-cost true per-core power management. Fully integrated switched-capacitor converters can also allow reduction and/or elimination of all external PMICs and all their area-consuming passive components.

## Chapter 5 Conclusions

Many functions have been integrated into SoCs to benefit from closer proximity in system communications and to reduce the total implementation size. However, power management units have remained separate, with a large number of off-chip passive components, and thus occupy ~25% of the total PCB area in today's smartphones. This area consumption is growing as the need for more voltage domains to efficiently manage on-die power of parallel systems increases. The ultimate solution to this problem is fully integrated switching converters.
Due to the availability of high-density and low-series-resistance capacitors in existing CMOS processes, switched-capacitor DC-DC converters have emerged as a promising candidate for low-cost but efficient implementations.

Since full integration is a must for power management, switched-capacitor DC-DC converters benefit from this trend. In co-integration, all the main power devices, including power switches and capacitors, are implemented using the same type of transistors as the load. As a result, the converter characteristics change with process-voltage-temperature (PVT) variations in the same way as those of the load. Therefore, the performance of the converter, which depends on the ratio rather than the absolute values of the load and converter characteristics, is actually independent of corner variations.

Converters using fully integrated capacitors also face certain important design tradeoffs that are not critical for designs using off-chip capacitors. In an off-chip implementation, the optimization focuses mainly on the capacitance density of the flying capacitors, while the associated parasitic capacitance is often ignored. This parasitic capacitance is insignificant (<<0.01%) compared with the associated flying capacitance for off-chip implementations, but is significantly bigger (>>0.1%) for on-chip components. While the capacitance density is still the key parameter for the achievable power density, the ratio of parasitic capacitance to flying capacitance directly sets the maximum efficiency of switched-capacitor DC-DC converters. Therefore, the type and implementation method of the flying capacitors should be chosen carefully to achieve the best performance within a certain range of operation. As pointed out in the thesis, high-density capacitors are more favorable in high power-density operation, while low-parasitic capacitors are the better choice for low power density.
Although the transistor is an important part of the converter, technological advances in transistors alone do not guarantee better performance. It is the combination of both the capacitor and switch characteristics that defines the performance. For example, other than the benefit of not needing to bias the body of transistors, an SOI process is not necessarily better than a Bulk process. As we showed with the two converter prototypes, properly designed capacitors implemented in a 65nm Bulk process can be better, in terms of parasitics and capacitance density, than those in a 32nm SOI process. These superior capacitors can offset the process's inferior transistors and yield better performance in the right setting.

In order to implement optimized fully integrated switched-capacitor DC-DC converters, a design methodology along with corresponding design techniques was developed in this thesis and experimentally verified on two prototypes. The first prototype demonstrated the capability of SC converters to deliver high efficiency at high power density, and the second prototype proved the possibility of replacing all off-chip PMICs and their area-consuming passive components with fully integrated SC regulators. Reconfigurable topologies with a newly proposed partial-series-to-parallel topology and multi-phase interleaving were employed to alleviate the traditional trade-off between efficiency and regulation and the issue of high output voltage ripple, respectively. The second prototype also showed that SC regulators, if co-integrated with loads, can push the limit of switching converter response times into the sub-nanosecond regime, significantly improving the supply impedance without the need to use linear regulators.
Given that these results were achieved in a standard CMOS process with no modifications or additions, fully integrated voltage regulators based on switched-capacitor topologies appear to be a very promising approach that is ripe for industrial adoption in specific applications. Nonetheless, adoption of fully integrated SC converters across a broad variety of applications will require solutions to several remaining technical challenges. For example, although reconfigurable topologies can enable a high overall peak efficiency for SC converters, there are still undesirable valleys of efficiency between the optimal operating points of adjacent topologies. This fundamental characteristic can significantly hinder the adoption of this approach in systems where flat efficiency over a wide range of conversion ratios is required.

In sum, the chart shown in Figure 5.1 and also discussed in [20] compares the experimental performance of published SC and inductor-based converter prototypes. In this chart, the first SC converter prototype in this thesis [37] sits on the performance "front" of all prototypes fabricated in standard CMOS processes with no special technology steps included. When extra process steps are allowed, it is not clear which converter type wins the new performance front. However, note that while high-density capacitors, such as deep trench capacitors or e-DRAM capacitors, are already available in production (although still limited to specific applications), magnetic materials and ultra-thick (~6µm) metal still face both technical and business challenges for mass production. Therefore, fully integrated SC converters currently have a significant advantage over their inductive counterparts for early and reliable production.

Figure 5.1.
Performance of fully integrated converters in production CMOS and in processes where extra steps are allowed

- [1] International Technology Roadmap for Semiconductors, available online at http://www.itrs.net/
- [2] T. Simunic, L. Benini, A. Acquaviva, P. Glynn, and G. D. Micheli, "Dynamic Voltage Scaling and Power Management for Portable Systems," Design Automation Conference, 2001.
- [3] B. H. Calhoun and A. P. Chandrakasan, "Ultra-dynamic voltage scaling (UDVS) using subthreshold operation and local voltage dithering," IEEE J. Solid-State Circuits, vol. 41, no. 1, pp. 238–245, Jan. 2006.
- [4] J. Donald and M. Martonosi, "Techniques for multicore thermal management: Classification and new exploration," Proc. of the 33rd annual Intl. symposium on Computer Architecture, 2006.
- [5] Jacob Leverich, Matteo Monchiero, Vanish Talwar, Parthasarathy Ranganathan, Christos Kozyrakis, "Power Management of Datacenter Workloads Using Per-Core Power Gating," IEEE Computer Architecture Letters, vol. 8, no. 2, pp. 48–51, July-Dec. 2009.
- [6] Suhwan Kim; Kosonocky, S.V.; Knebel, D.R.; Stawiasz, K., "Experimental Measurement of A Novel Power Gating Structure with Intermediate Power Saving Mode," Low Power Electronics and Design, Proceedings of the 2004 International Symposium on, ISLPED, pp. 20–25, Aug. 2004.
- [7] H. Mahmoodi-meimand and K. Roy, "Data-retention flip-flops for power-down applications," in ISCAS, 2004.
- [8] Li Li; Ken Choi; Ho Joon Lee, "Power efficient data retention logic design in the integration of power gating and clock gating," Circuits and Systems (MWSCAS), 2011 IEEE 54th International Midwest Symposium on, pp. 1–4, Aug. 2011.
- [9] G. Rincon-Mora and P. Allen, "A low-voltage, low quiescent current, low drop-out regulator," IEEE J. Solid-State Circuits, vol. 33, no. 1, pp. 36–44, Jan. 1998.
- [10] R. W. Erickson and D. Maksimovic, "Fundamentals of Power Electronics," Second Edition, Kluwer Academic Publisher, 2001.
- [11] Maxim Integrated, MAX16961 Data sheet Rev A, "3A, 2.2MHz, Synchronous Step-Down DC-DC Converter," June 2013, Document Ref.: 19-6520, http://www.maximintegrated.com/datasheet/index.mvp/id/7933
- [12] Texas Instruments, TWL6032 Data sheet Rev A., "Fully Integrated Power Management With Power Path and Battery Charger," Dec. 2011, Accessed Dec. 2012, http://www.ti.com/lit/ds/symlink/twl6032.pdf
- [13] Q. Li, Y. Dong, F. C. Lee, "High Density Low Profile Coupled Inductor Design for Integrated Point-of-Load Converter," *IEEE Applied Power Electronics Conference (APEC)*, pp. 79–85, 2010.
- [14] G. Schrom, et al., "A 100MHz Eight-Phase Buck Converter Delivering 12A in 25mm2 Using Air-Core Inductors," *APEC*, pp. 727–730, 2007.
- [15] J. Wibben and R. Harjani, "A High-Efficiency DC–DC Converter Using 2 nH Integrated Inductors," *IEEE J. of Solid-State Circuits*, Vol. 43, No. 4, pp. 844–854, 2008.
- [16] D. S. Gardner, G. Schrom, P. Hazucha, F. Paillet, T. Karnik, S. Borkar, "Integrated On-Chip Inductors With Magnetic Film," *IEEE Trans. Magnetics*, Vol. 43, No. 6, 2007.
- [17] Sturcken, N.; O'Sullivan, E.J.; Wang, N.; Herget, P.; Webb, B.C.; Romankiw, L.T.; Petracca, M.; Davies, R.; Fontana, R.E.; Decad, G.M.; Kymissis, I.; Peterchev, A.V.; Carloni, L.P.; Gallagher, W.J.; Shepard, K.L., "A 2.5D Integrated Voltage Regulator Using Coupled-Magnetic-Core Inductors on Silicon Interposer," Solid-State Circuits, IEEE Journal of, vol. 48, no. 1, pp. 244–254, Jan. 2013.
- [18] J. Lee, G. Hatcher, L. Vandenberghe, C.K. Yang, "Evaluation of Fully integrated Switching Regulators for CMOS Process Technologies," *IEEE Trans. VLSI*, pp. 1017–1117, 2007.
- [19] M. Seeman, "A Design Methodology for Switched-Capacitor DC-DC Converters", University of California, Berkeley, Technical Report No. UCB/EECS-2009-78, 2009.
- [20] S. Sanders, E. Alon, H.-P. Le, M. Seeman, M. John, V. Ng, "The Road to Fully Integrated DC-DC Conversion via the Switched-Capacitor Approach," *IEEE Transactions on Power Electronics*, Vol. 28, Iss. 9, pp. 4146–4155, Sept. 2013.
- [21] M. Seeman and S. R. Sanders, "Analysis and Optimization of Switched-Capacitor DC-DC Converters," *10th IEEE Workshop on Computers in Power Electronics (COMPEL)*, pp. 216–224, July 2006.
- [22] M. D. Seeman, S. R. Sanders, "Analysis and Optimization of Switched-Capacitor DC–DC Converters," *IEEE Trans. Power Electronics*, pp. 841–851, March 2008.
- [23] T. Van Breussegem and M. Steyaert, "A 82% Efficiency 0.5% Ripple 16-Phase Fully Integrated Capacitive Voltage Doubler," *IEEE Symp. VLSI Circuits*, pp. 198–199, June 2009.
- [24] D. Somasekhar, B. Srinivasan, G. Pandya, F. Hamzaoglu, M. Khellah, T. Karnik, K. Zhang, "Multiphase 1GHz Voltage Doubler Charge-Pump in 32nm logic process," *IEEE J. Solid-State Circuits*, Vol. 45, No. 4, pp. 751–758, 2010.
- [25] V. Ng, S. Sanders, "A 92%-Efficiency Wide-Input-Voltage-Range Switched-Capacitor DC-DC Converter," *ISSCC Dig. Tech. Papers*, Feb. 2012.
- [26] H.-P. Le, M. D. Seeman, S. R. Sanders, V. Sathe, S. Naffziger and E. Alon, "A 32nm Fully Integrated Reconfigurable Switched-Capacitor DC-DC Converter Delivering 0.55W/mm<sup>2</sup> at 81% Efficiency," *ISSCC Dig. Tech. Papers*, pp. 210–211, February 2010.
- [27] M. Seeman, V. W. Ng, H.-P. Le, M. John, E. Alon, S. R. Sanders, "A Comparative Analysis of Switched-Capacitor and Inductor-Based DC-DC Conversion Technologies," *COMPEL*, June 2010.
- [28] J. T. Caves, S. D. Rosenbaum, M. A. Copeland, C. F. Rahim, "Sampled analog filtering using switched capacitors as resistor equivalents," *IEEE Journal of Solid-State Circuits*, vol. 12, no. 6, pp. 592–599, Dec. 1977.
- [29] R. Jain, B. Geuskens, M. Khellah, S. Kim, J. Kulkarni, J. Tschanz, V. De, "A 0.45-1V Fully Integrated Reconfigurable Switched Capacitor Step-Down DC-DC Converter with High Density MIM Capacitor in 22nm Tri-Gate CMOS," *IEEE Symp. VLSI Circuits*, June 2013.
- [30] K. D. T. Ngo, R. Webster, "Steady-state analysis and design of a switched-capacitor DC-DC," *PESC*, pp. 378–385, Vol. 1, 1992.
- [31] B. R. Gregoire, "A compact Switched Capacitor Regulated Charge Pump Power Supply," *IEEE J. Solid-State Circuits*, Vol. 41, No. 8, pp. 1944–1953, 2006.
- [32] D. Maksimovic and S. Dhar, "Switched-capacitor DC-DC converters for low-power on-chip applications," in Proc. Power Electronics Specialists Conf. (PESC), 1999, vol. 1, pp. 54–59.
- [33] E. Alon and M. Horowitz, "Integrated Regulation for Energy-Efficient Digital Circuits," *IEEE J. of Solid-State Circuits*, pp. 1795–1807, August 2008.
- [34] D. Ma and F. Luo, "Robust Multiple-Phase Switched-Capacitor DC-DC Power Converter with Digital Interleaving Regulation," *IEEE Trans. VLSI Sys.*, Vol. 16, No. 6, 2008.
- [35] G. Patounakis, Y. Li and K. L. Shepard, "A Fully Integrated On-Chip DC-DC Conversion and Power Management System," *IEEE J. Solid-State Circuits*, Vol. 39, No. 3, pp. 443–451, 2004.
- [36] Y. K. Ramadass and A. P. Chandrakasan, "Voltage Scalable Switched Capacitor DC-DC Converter for Ultra-Low-Power On-Chip Applications," *IEEE Power Electronics Specialists Conference (PESC)*, pp. 2353–2359, June 2007.
- [37] H.-P. Le, S. Sanders and E. Alon, "Design Techniques for Fully Integrated Switched-Capacitor DC-DC Converters," *IEEE Journal of Solid-State Circuits* (JSSC), Vol. 46, Iss. 9, pp. 2120–2131, Sep. 2011.
- [38] Y. Ramadass, A. Fayed, A. Chandrakasan, "A Fully-Integrated Switched-Capacitor Step-Down DC-DC Converter With Digital Capacitance Modulation in 45 nm CMOS," *IEEE J. Solid-State Circuits*, Vol. 45, No. 12, pp. 2557–2565, 2010.
- [39] S. K. Enam, A. A. Abidi, "A 300-MHz CMOS voltage-controlled ring oscillator," *IEEE J. of Solid-State Circuits*, Vol. 25, No. 1, pp. 312–315, 1990.
- [40] L. Chang, R. Montoye, B. Ji, A. Weger, K. Stawiasz, R. Dennard, "A Fully integrated Switched-Capacitor 2:1 Voltage Converter with Regulation Capability and 90% Efficiency at 2.3A/mm<sup>2</sup>," *IEEE Symp. VLSI Circuits*, June 2010.
- [41] M. S. Whittingham, "Electrical Energy Storage and Intercalation Chemistry," Science, Vol. 192, no. 4244, pp. 1126–1127, June 1976.
- [42] G. Villa Pique, "A 41-phase switched-capacitor power converter with 3.8mV output ripple and 81% efficiency in baseline 90nm CMOS," *ISSCC Dig. Tech. Papers*, pp. 98–100, Feb. 2012.
- [43] H. Meyvaert, T. V. Breussegem, M. Steyaert, "A Monolithic 0.77W/mm<sup>2</sup> Power Dense Capacitive DC-DC Step-Down Converter in 90nm Bulk CMOS," *European Solid-State Circuits Conference*, 2011.
- [44] W. Kim, D. Brooks, G.-Y. Wei, "A Fully-Integrated 3-Level DC/DC Converter for Nanosecond-Scale DVFS," *IEEE Journal of Solid-State Circuits*, Jan. 2012.
- [45] T. Van Breussegem, M. Steyaert, "Monolithic Capacitive DC-DC Converter With Single Boundary–Multiphase Control and Voltage Domain Stacking in 90 nm CMOS," *IEEE Journal of Solid-State Circuits*, July 2011.
- [46] Sony Corporation, "Lithium Ion Rechargeable Batteries Technical Handbook," available at http://www.sony.com.cn/products/ed/battery/download.pdf
- [47] J. F. Dickson, "On-chip High-Voltage Generation in NMOS Integrated Circuits Using an Improved Voltage Multiplier Technique," *IEEE Journal of Solid-State Circuits*, Vol. 11, No. 6, pp. 374–378, June 1976.
- [48] T. M. Andersen, F. Krismer, and J. W. Kolar, "A 4.6 W/mm<sup>2</sup> power density 86% efficiency on-chip switched capacitor DC-DC converter in 32 nm SOI CMOS," in Proc. IEEE Appl. Power Electron. Conf., Mar. 2013.
- [49] J. T. Dibene, II, P. R. Morrow, C.-M. Park, H. W. Koertzen, P. Zou, F. Thenus, X. Li, S. W. Montgomery, E. Stanford, R. Fite, and P. Fischer, "A 400 amp fully integrated silicon voltage regulator with In-die magnetic coupled embedded inductors," in Proc. Appl. Power Electron. Conf., Special Session on On-Die Voltage Regulators, Palm Springs, CA, 2010.
- [50] W. Kim, D. Brooks, and G.-Y. Wei, "A fully-integrated 3-level DC–DC converter for nanosecond-scale DVFS," IEEE J. Solid-State Circuits, vol. 47, no. 1, pp. 206–219, Jan. 2012.