# Power Efficient Clock Distribution for Switched Capacitor DC-DC Converters

**A.S.R. Murthy<sup>1</sup>, Sridhar. T<sup>2</sup>** <sup>1</sup>Raja Rajeswari of Engineering College, VTU, Bangalore, India <sup>2</sup>Alliance University, Bangalore, India

# ABSTRACT

In various VLSI-based digital systems, on-chip interconnects have become the system bottleneck in state-of-the-art chips, limiting the performance of high-speed clock distribution and data communication in terms of propagation delay and power consumption. Increasing power requirements and power distribution to multi-core architectures also pose a challenge to power distribution networks in integrated circuits. Clock distribution for switched capacitor converters becomes a non-trivial task, and the increased interconnect lengths cause clock degradation and power dissipation. Therefore, this paper introduces low swing signaling schemes to decrease delay and power consumption. A comparative study of low voltage signaling schemes is presented in terms of delay, power consumption and power-delay product. A power efficient signaling topology for driving clocks over longer interconnect lengths is presented.

Keywords: VLSI, Switched capacitor converters, Interconnects

> Copyright © 2018 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:

A.S.R. Murthy Raja Rajeswari of Engineering College VTU, Bangalore, India Email: asr\_murthy2001@yahoo.co.in

# 1. INTRODUCTION

With VLSI technology scaling down, an increasing number of devices are being packed into the silicon real estate. Unprecedented levels of integration of devices and functionality have been achieved over the past few technology nodes. As the technology scales, the on-chip interconnect wires are becoming increasingly important, and the overall system performance is increasingly dominated by the on-chip interconnects [1]. Interconnects are becoming more important than the actual devices in terms of power, delay and packing density, which are the figures of merit during the selection and implementation of a device in chip fabrication. This technology is used in numerous electronic applications such as data communication networks, digital signal processing, automobiles, multimedia and medical applications due to its high speed, reliability and capability of size reduction in various electronic components.

With the advances in VLSI techniques, the number of transistors per chip is increasing rapidly. The growth in transistor count makes the interconnects more complex and increases the interconnect length, which in turn increases the die size. Consequently, global interconnects that span the length of the die do not scale down with the devices. This phenomenon, known as reverse scaling of interconnects, leads to increased power dissipation. Interconnects therefore greatly affect chip performance and robustness. The power consumption due to interconnects can be as high as 40% of the total power dissipation in high performance gate arrays [2].

Numerous techniques have been proposed in recent years to reduce power dissipation and propagation delay and to increase chip design efficiency, such as current-mode signaling [2-3], signal modulation [4], transmission line interconnects [5] and amplitude pre-emphasis techniques [6-7]. In [8], [9] and [10], a simple capacitively driven wire model is applied in the time domain and the frequency domain to enhance bandwidth. An effective way to reduce power dissipation and to increase the efficiency of interconnects is to reduce the voltage swing of the interconnect lines [11], [12]. However, this approach introduces additional power consumption and delay due to the presence of voltage converters [13]. Another simple way to enhance interconnect bandwidth is repeater insertion [14]. However, repeaters dissipate a large amount of power and need a large area [1, 15]. A few more approaches, namely modulated signaling [4] and pulsed current-mode signaling [1], attain a high amount of latency. However, due to the reduced voltage swing, these techniques are highly sensitive to parameter variations, suffer from low noise margin and require highly complex designs, which makes them difficult to use in industry. Crosstalk, substrate coupling, interconnect delay, transmission line effects and power supply integrity issues are the effects associated with low signal reliability and integrity.

Some essential factors in VLSI determine the overall efficiency, cost, feasibility and reliability of a design, namely power dissipation, propagation delay and power-delay product. The power consumption of interconnects can be optimized by controlling the power in drivers, interconnect segments, receivers and repeaters. The propagation delay can be reduced by placing repeaters or accelerators along the global RC wires. The energy consumption of the interconnect design model is specified by the power-delay product. Efficient use of interconnects reduces power dissipation and propagation delay and enhances the overall performance of the system. Therefore, there is a need for a technique that reduces power dissipation, delay and power-delay product to provide high efficiency and larger design throughput. To this end, we introduce a Power Efficient Clock Distribution Technique for Switched Capacitor DC-DC Converters.

In our model, voltage regulation in complex integrated circuit designs is performed at the individual modules. Hence, a clock distribution scheme is required for switching in the dc-dc converters. These clocks are routed to the different converter modules from a common source on the IC. The clock distribution scheme consumes a specific amount of energy in the interconnect wires and their associated circuitry. Reducing this energy significantly enhances power savings, since the last stage of a clock network has low switching capacitance. The basic idea adopted in our model to improve performance is to limit the voltage swing along the interconnect. The proposed clock distribution technique provides a better voltage swing and reduces noise to a large extent.

In this work, we also focus on bus architecture, its design, types, working and operation in VLSI design. The power dissipation in buses is up to 40% of the total power consumption of a VLSI chip. Therefore, a new architecture is presented that reduces the power consumption in the on-chip interconnection buses by reducing the voltage swing using our proposed method, which can be very helpful in future industrial applications. The available bus drivers and bus receivers and their drawbacks are discussed with a view to reducing power consumption. These bus design architectures can be further used in nanotechnology and in various VLSI applications over the next decade.

This paper is organized as follows. Section 2 reviews related work. Section 3 presents the test platform for evaluating the performance of the design topologies. Section 4 presents the two design topologies for the signaling schemes. Section 5 presents the results and analyses the performance of the two schemes. Section 6 presents the conclusion.

# 2. RELATED WORK

In recent years, the design of interconnects in VLSI architectures has gained immense popularity, and it is one of the fastest growing research topics due to its numerous applications in medical, multimedia and electronic devices such as computers, mobile phones and televisions. Due to the increasing popularity of micro- and nano-electronic devices, there is a need to shrink the size of interconnects to make VLSI chips smaller while maintaining minimum power consumption and delay. A low voltage swing technique is a promising option for reducing power dissipation and delay in VLSI applications. This technique can largely reduce the computational complexity present in high voltage swing techniques. Hence, it helps in shrinking the interconnect size in different VLSI applications by reducing noise. Designing low swing signaling schemes for driving long interconnects using a driver-receiver architecture is a very cumbersome task. Many researchers have done a significant amount of work in this field, as summarized below.

In [16], a mixed technique that combines low swing signaling with regenerators is introduced to reduce power consumption and delay and to increase the feasibility of global on-chip interconnects. However, the noise margin is very high in this technique, which leads to high power consumption. In [17], a novel LSDFF cell and a novel swing- and slew-aware CTS algorithm are presented to reduce a significant amount of noise and power consumption in clock networks. However, the approach generates high insertion delay and clock skew, which degrade VLSI chip performance. In [18], robust magnetic skyrmion low power sensors are used to reduce the power consumption in global interconnects. This technique provides high energy efficiency and reduces computational complexity. In [19], a buffered ST sensor based voltage swing technique is adopted to optimize total delay and energy consumption. This technique also reduces noise immunity. However, high computational complexity can affect the overall performance of the system. In [20], an LMS adaptive filter based on a distributed arithmetic architecture is adopted to reduce computational complexity and improve performance in critical paths. However, the memory requirement of the distributed arithmetic architecture is very high. In [21], a hardware oriented architecture based on fuzzy logic, block truncation coding and digital half-toning is adopted to reduce computational complexity and memory requirement for VLSI implementation. However, the compression ratio of this technique is very low. In [22], a 3-D NoC-bus hybrid architecture is presented to make routing more efficient in the bus architectures of VLSI designs through 3-D IC technology, which reduces power consumption and computational cost. However, due to high interconnect traffic, thermal issues can degrade the performance of the system. In [23], a wireless 3-D Network-on-Chip (NoC) technique based on inductive coupling is adopted to improve the performance of bus architectures. However, the number of chips and the power supply become limited in this model.

In this paper, we introduce a robust Power Efficient Clock Distribution Technique for Switched Capacitor DC-DC Converter architectures, which helps to overcome the drawbacks of the existing approaches. The architecture of our design is built in such a way that power consumption and delay in the interconnects are minimized for VLSI implementation. A clock distribution scheme is used to switch the DC-DC converters without compromising the performance of the system. The basic idea adopted in our model to improve performance is to limit the voltage swing along the interconnect. The proposed clock distribution technique provides a better voltage swing and reduces noise to a large extent. In addition, a new architecture is presented that reduces the power consumption in the on-chip interconnection buses by reducing the voltage swing using our proposed method, which can be very helpful in future industrial applications.

# 3. TEST ARCHITECTURE

The proposed driver-receiver circuits are tested using the test architecture presented in Figure 1 and Figure 2, which models the interconnect length and the parasitic capacitance from the interconnect to ground. The interconnect is fabricated in the Metal-3 layer and can be of varying length (it is varied from 1 mm to 10 mm in the simulations of Section 5). It is modeled as a $\pi 3$ distributed resistance-capacitance model. All circuits are simulated with a receiver output load capacitance of 25fF. The fan-out is modeled as an extra distributed capacitive load of around 250fF/mm of interconnect length. All the circuits are analyzed under identical loading conditions, power supply and gate-source voltage. The test conditions are listed in Table 1.



Figure 1. Test Architecture



Figure 2. Interconnect Model

Table 1. Test Conditions

| Parameter           | Symbol | Value    |
|---------------------|--------|----------|
| Power Supply        | Vddh   | 1.0V     |
| Gate Source Voltage | Vgs    | 0.54V    |
| Loading Condition   | CL     | 250fF/mm |
|                     | CLout  | 25fF     |
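As a rough illustration of how a $\pi 3$ distributed RC model of this kind can be evaluated, the following Python sketch computes a first-order (Elmore) delay estimate for a three-segment π ladder. It is not the simulation setup used in this work. The fan-out load (250fF/mm) and the receiver output load (25fF) are taken from Table 1, while the wire resistance and wire capacitance per millimetre, the driver resistance and the function name `elmore_delay_pi3` are assumptions introduced only for this example.

```python
def elmore_delay_pi3(length_mm, r_per_mm, c_wire_per_mm,
                     c_fanout_per_mm=250e-15, c_out=25e-15,
                     r_driver=0.0, n_seg=3):
    """First-order (Elmore) delay of an n-segment pi RC ladder, in seconds.

    r_per_mm and c_wire_per_mm are hypothetical wire parameters; they are
    not given in the paper. c_fanout_per_mm and c_out follow Table 1.
    """
    r_total = r_per_mm * length_mm
    c_total = (c_wire_per_mm + c_fanout_per_mm) * length_mm
    r_seg = r_total / n_seg
    c_seg = c_total / n_seg

    # Each pi segment puts half of its capacitance at each end, so the
    # internal nodes see a full c_seg and the two end nodes see c_seg / 2.
    node_caps = [c_seg / 2] + [c_seg] * (n_seg - 1) + [c_seg / 2 + c_out]

    delay = 0.0
    r_upstream = r_driver  # resistance between the source and the current node
    for i, cap in enumerate(node_caps):
        if i > 0:
            r_upstream += r_seg
        delay += r_upstream * cap
    return delay


if __name__ == "__main__":
    # Assumed wire parameters, for illustration only.
    for length in (1, 5, 10):
        d = elmore_delay_pi3(length, r_per_mm=100.0, c_wire_per_mm=200e-15)
        print(f"{length} mm: {d * 1e12:.1f} ps")
```

With an ideal driver and no receiver load, this estimate reduces to half the product of the total wire resistance and total capacitance, which is why the delay of an unbuffered RC line grows quadratically with its length.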

# 4. DESIGN TOPOLOGIES

This section describes two design schemes for the power efficient routing of clock signals for switched capacitor dc-dc converters and discusses the operation of the two design topologies.

### 4.1. Functioning of scheme-1

Scheme 1 works mainly on the idea of limiting the voltage swing along the interconnect to improve performance. Limiting the swing in this way avoids the need for an external power supply or reference voltage. The voltage swing limit is given by Equation 1, where $V_{tn}$ and $V_{tp}$ are the threshold voltages of the NMOS and PMOS transistors.

$$ {\sim}V_{tn} \le V_{s} \le V_{dd} - |{\sim}V_{tp}| \tag{1}$$

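Because the line voltage stays within these bounds, the swing on the line is approximately $V_{dd} - |V_{tp}| - V_{tn}$. Equation 2 gives the resulting energy saving ratio.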

$$\frac{E_{low}}{E_{tot}} = \frac{V_s}{V_{dd}} \cong \frac{V_{dd} - |V_{tp}| - V_{tn}}{V_{dd}} \tag{2}$$
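To make Equation 2 concrete, the following Python sketch evaluates the energy saving ratio for one set of values. The supply voltage of 1.0V is taken from Table 1. The threshold voltages are not stated in this paper, so the 0.3V values and the function name `low_swing_energy_ratio` are assumptions used only for illustration.

```python
def low_swing_energy_ratio(vdd, vtn, vtp_abs):
    """Energy ratio E_low / E_tot of Equation 2.

    vdd     : supply voltage in volts
    vtn     : NMOS threshold voltage in volts (assumed value)
    vtp_abs : magnitude of the PMOS threshold voltage in volts (assumed value)
    """
    swing = vdd - vtp_abs - vtn  # approximate voltage swing on the line
    if swing <= 0:
        raise ValueError("thresholds leave no usable swing for this supply")
    return swing / vdd


if __name__ == "__main__":
    # Vdd from Table 1; the thresholds are hypothetical.
    ratio = low_swing_energy_ratio(vdd=1.0, vtn=0.3, vtp_abs=0.3)
    print(f"E_low / E_tot = {ratio:.2f}")
```

Under these assumed thresholds, the low swing line would consume roughly 40% of the energy of a full swing line; the actual figure depends on the process thresholds.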

The schematic diagram of the driver-receiver configuration is shown in Figure 3. The driver and receiver end circuits are detailed in Figure 4 and Figure 5. The driver section of the proposed scheme operates in three different modes: active, diode connected and source follower. In the fully active mode, the driver provides full drive capability to charge or discharge the interconnect line. The voltage swing is limited by the diode-connected mode of the driver, which also offers lower impedance. Finally, the source follower mode provides better noise immunity. When the interconnect is driven in the opposite direction, the transistor finally turns off.

The overdrive beyond the switching limits improves the propagation delay and can be controlled by proper transistor sizing. The scheme gives higher drive strength for the same area as it has only one series transistor. In case the line is inactive for long periods, voltage level guards can be incorporated.

When the input is HIGH, transistors M3, M4 and M6 are ON and M1 (N driver), M2, M5 and M7 are OFF. When the input transitions from HIGH to LOW, M4, M3 and M8 (P driver) are turned OFF, while the gate of M1 is charged through M5-M6, fully activating the output transistor (mode 1). As the interconnect line discharges towards ground, M7, which is active, turns M6 OFF and turns M2 ON. The gate of M1 holds its charge while the line is discharging, until the line is low enough to activate M2. When M2 is active, the gate of M1 is driven to match the line (mode 2). Upon a LOW-to-HIGH transition of the input, the same events occur on the upper half of the circuit (M8 side).

The receiver section is a simple inverter with an Enable signal. A balanced inverter is selected because of its simplicity and faster performance for conditions when the driven line crosses Vdd/2 at every transition. Long interconnect lines can lead to transistor mismatch in driver and receiver transistors. The enable signal in receiver then turns off the receiver avoiding any bias current when the line is not used.

On the other end of the transmission line (receiver end), a simple inverter with an enable signal is used, as shown in Figure 5. The enable signal can be used to turn off the inverter when the interconnect line is not used.







Figure 4. Driver End Schematic for scheme-1



Figure 5. Receiver End Schematic for scheme 1

# 4.2. Functioning of scheme-2

The circuit schematic for the second design topology is shown in Figure 6.

Power Efficient Clock Distribution for Switched Capacitor DC-DC Converters (Dr. A.S.R. Murthy)



Figure 6. Driver-receiver configuration Schematic diagram, Scheme-II



Figure 7. Driver end schematic, Scheme-II



Figure 8. Receiver end schematic, Scheme-II


The circuit on the receiver end of the interconnect line is shown in Figure 8. The pass transistor M1 provides isolation from the previous stages. In the absence of this pass stage, the current from Vdd through M3 would flow towards the previous stage. With the internal node isolated, M4 pulls up the gate of M3 at the input. Transistors M13, M14 and M15 reduce the output pull-down transition time. Static power dissipation is eliminated by transistors M11 and M12, which do not allow static current to flow to ground when M2 is not fully ON. An additional pull-up device at the receiver output improves the low-to-high propagation delay.

# 5. SIMULATION RESULT

The simulation results obtained for Scheme-I and II are discussed in this section. The schemes are evaluated based on three performance metrics: delay, power consumption, power delay product. A comparative study of the two schemes is presented.

# 5.1. Delay

Figure 9 and Figure 10 show the propagation delay versus interconnect length for Scheme-I and Scheme-II. The interconnect length was varied from 1mm to 10mm. The propagation delay of Scheme-I follows a linear relationship with the interconnect length, while that of Scheme-II is quadratic. The propagation delay is lower for Scheme-II for the interconnect lengths investigated in this work. However, for longer interconnect lengths, the delay in Scheme-II can become appreciably higher than that in Scheme-I because of the quadratic relationship.
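The trade-off between a linear and a quadratic delay trend can be sketched as follows. The Python example below finds the interconnect length beyond which a quadratic delay curve overtakes a linear one. The fit coefficients and the function name `crossover_length` are hypothetical and chosen only so that the quadratic curve is lower over the 1mm to 10mm range; they are not fitted to the simulation results of this paper.

```python
import math


def crossover_length(a_lin, b_lin, a_quad, b_quad):
    """Length L at which a_quad*L^2 + b_quad overtakes a_lin*L + b_lin.

    Returns the largest positive root of
    a_quad*L^2 - a_lin*L + (b_quad - b_lin) = 0, or None if the curves
    never cross for L > 0. All coefficients are hypothetical fit values.
    """
    if a_quad <= 0:
        return None
    disc = a_lin ** 2 - 4.0 * a_quad * (b_quad - b_lin)
    if disc < 0:
        return None
    root = (a_lin + math.sqrt(disc)) / (2.0 * a_quad)
    return root if root > 0 else None


if __name__ == "__main__":
    # Hypothetical delay fits in picoseconds, with L in millimetres:
    #   linear-type scheme:    100*L + 50
    #   quadratic-type scheme: 5*L^2 + 20
    length = crossover_length(a_lin=100.0, b_lin=50.0, a_quad=5.0, b_quad=20.0)
    print(f"quadratic delay exceeds linear delay beyond about {length:.1f} mm")
```

With real data, the coefficients would be obtained by fitting the simulated delay points, and the resulting crossover length indicates where the scheme with the linear trend starts to be faster.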



Figure 9. Propagation delay versus interconnect length, Scheme-I



Figure 10. Propagation delay versus interconnect length, Scheme-II

# 5.2. Power Consumption

Figure 11 shows the power consumption of Scheme-I for interconnect lengths varying from 1mm to 10mm. Figure 12 shows the same result for Scheme-II. Scheme-I performs better than Scheme-II by an order of magnitude. Power consumption in Scheme-I shows a steady increase with interconnect length. In Scheme-II, the power rises and then drops around 8mm interconnect length. However, the minimum power consumption in Scheme-II is an order of magnitude more than the maximum consumption in Scheme-I.



Figure 11. Power consumption versus interconnect length, Scheme-I



Figure 12. Power consumption versus interconnect length, Scheme-II

# 5.3. Power-Delay Product

Figure 13 shows the power-delay product of Scheme-I for interconnect lengths varying from 1mm to 10mm. Figure 14 shows the same result for Scheme-II. As discussed in Sections 5.1 and 5.2, the delay performance is better for Scheme-II while the power performance is better for Scheme-I. In such a case, the power-delay product is a good performance criterion for evaluating the two schemes. As shown in Figures 13 and 14, the power-delay product of Scheme-I is an order of magnitude less than that of Scheme-II.
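The way the power-delay product resolves such a trade-off can be illustrated with a short Python sketch. The power and delay values below are hypothetical and are not the simulated results of this paper; they only reproduce the qualitative situation in which one scheme is faster and the other consumes about an order of magnitude less power.

```python
def power_delay_product(power_w, delay_s):
    """Power-delay product in joules: average power times propagation delay."""
    return power_w * delay_s


if __name__ == "__main__":
    # Hypothetical operating points (power in watts, delay in seconds).
    schemes = {
        "low-power, slower scheme": (10e-6, 400e-12),
        "faster, higher-power scheme": (100e-6, 200e-12),
    }
    for name, (power, delay) in schemes.items():
        pdp = power_delay_product(power, delay)
        print(f"{name}: PDP = {pdp * 1e15:.1f} fJ")
```

In this assumed example, the tenfold power advantage outweighs the twofold delay penalty, so the low-power scheme has the smaller power-delay product.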

Figure 15 shows the delay performance of the two schemes for varying loads. Both schemes show a steady increase in delay with load, as expected.



Figure 13. Power-delay product versus interconnect length, Scheme-I



Figure 14. Power-delay product versus interconnect length, Scheme-II



Figure 15. Propagation delay v/s Load capacitance

# 6. CONCLUSION

In this paper, two optimized low swing signaling schemes based on MOS current mode logic circuits for driving long on-chip global interconnects are discussed. Driver-receiver pair architectures at the two ends of an interconnect line using Scheme-I and Scheme-II are discussed and simulated. Scheme-II shows better performance than Scheme-I in terms of propagation delay, while Scheme-I is better in terms of power consumption. Scheme-I also performs better than Scheme-II in terms of the power-delay product. The simulation results indicate that this analysis can provide a valuable guideline for interconnect driver design in various VLSI applications. Since the target of the present work is to develop a power efficient scheme for signal routing through interconnects, it can be concluded that Scheme-I is the design of choice.

#### REFERENCES

- [1] M. Dave, M. Jain, M. S. Baghini, D. Sharma. A variation tolerant current-mode signaling scheme for on-chip interconnects. *IEEE Trans. VLSI Syst.* 2013; 21: 342–353.
- [2] S. K. Lee, S. H. Lee, D. Sylvester, D. Blaauw, J. Y. Sim. A 95 fJ/b current-mode transceiver for 10mm on-chip interconnect. IEEE Solid-State Circuits Conf. (ISSCC) Digest Technical Papers. 2013: 262–263.
- [3] S. Lee, S. Lee, B. Kim, H. Park, J. Sim. Current-mode transceiver for silicon Interposer Channel. *IEEE J. Solid-State Circuits*. 2014; 49: 2044–2053.
- [4] R. T. Chang, N. Talwalkar, C. P. Yue, S. S. Wong. Near speed-of-light signaling over on-chip electrical interconnects. *IEEE J. Solid-State Circuits*. 2003; 38: 834–838.
- [5] R Ambika, S Ramachandran, KR Kashwan. *Data security using serial commutative RSA CORE for multiple FPGA system*. IEEE, Devices, Circuits and Systems (ICDCS), 2014 2nd International Conference on, march 2014.

- [6] Y. Bai and S. S. Wong. Optimization of driver preemphasis for on-chip interconnects, *IEEE Trans. Circuits Syst. I: Regular Pap.* 2009; 56: 2033–2041.
- [7] L. Zhang, J. M. Wilson, R. Bashirullah, L. Luo, J. Xu and P. D. Franzon, A 32-Gb/s onchip bus with driver preemphasis signaling, *IEEE Trans. VLSI Syst.* 2009; 17: 1267–1274.
- [8] R. Ho, T. Ono, R. D. Hopkins, A. Chow, J. Schauer, F. Y. Liu, R. Drost. High speed and low energy capacitively driven on-chip wires. *IEEE J. Solid-State Circuits*. 2008; 43: 52–60.
- [9] E. Mensink, D. Schinkel, E. A. M. Klumperink, E. van Tuijl, B. Nauta, Power efficient gigabit communication over capacitively driven RC-Limited on-chip interconnects. *IEEE J. Solid-State Circuits*. 2010; 45: 447–457.
- [10] S. Devendra K. Verma, P. K. Barhai, V. Nath. QCA and CMOS Nanotechnology Based Design and Development of Nanoelectronic Security Devices with Encryption Schemes. *TELKOMNIKA Indonesian Journal of Electrical Engineering*. 2015; 14(2): 270 ~ 279 DOI: 10.11591/telkomnika.v14i2.7485.
- [11] Narasimhan, M. Kasotiya, R. Sridhar. A low-swing differential signaling scheme for on-chip global interconnects. IEEE International Conference on VLSI Design. 2005; 6: 634.
- [12] Wu Chenjian, Li Zhiqun, Yao Nan, Zhang Meng, Chen Liang, Cao Jia, CMOS Low Voltage Power Amplifier for WSN Application. *TELKOMNIKA Telecommunication Computing Electronics and Control*. 2013; 11(8): 4470~4476.
- [13] J. M. Rabaey. Low Power Design Essentials. Springer US. 2009.
- [14] P. Singh, J.S. Seo, D. Blaauw, D. Sylvester. Self-timed regenerators for high-speed and low-power on-chip global interconnect. *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.* 2008; 16(6): 673–677.
- [15] J.C. Montesdeoca, G.J. Montiel-Nelson, S. Nooshabadi, CMOS driver-receiver pair for low-swing signaling for low energy on-chip interconnects, *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.* 2009; 17(2): 311–316.
- [16] H. Rezaei, S. A. Moghaddam, A. Rahmati. High-speed low-power on-chip global interconnects using low-swing self-timed regenerators. *Microelectronics J.* 2016; 58: 76–82.
- [17] C. Sitik, W. Liu, B. Taskin, E. Salman. Design Methodology for Voltage-Scaled Clock Distribution Networks. in IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2016; 24(10): 3080-3093.
- [18] Z. A. Azim, M. C. Chen, K. Roy. Skyrmion Sensor-Based Low-Power Global Interconnects. *IEEE Transactions on Magnetics*. 2017; 53(1): 1-6.
- [19] Z. A. Azim, A. Sharma, K. Roy. Buffered Spin-Torque Sensors for Minimizing Delay and Energy Consumption in Global Interconnects. in *IEEE Magnetics Letters*, 2017; 8: 1-5.
- [20] M. T. Khan, S. R. Ahamed, F. Brewer. Low Complexity and Critical Path Based VLSI Architecture for LMS Adaptive Filter Using Distributed Arithmetic. 2017 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems (VLSID). Hyderabad. 2017: 127-132.
- [21] S. L. Chen, G. S. Wu. A Cost and Power Efficient Image Compressor VLSI Design with Fuzzy Decision and Block Partition for Wireless Sensor Networks. *IEEE Sensors Journal*. 99: 1-1.
- [22] J. Zheng, N. Wu, L. Zhou, Y. Ye, K. Sun. DFSB-Based Thermal Management Scheme for 3-D NoC-Bus Architectures. in *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*. 2016; 24(3): 920-931.
- [23] T. Kagami, H. Matsutani, M. Koibuchi, Y. Take, T. Kuroda, H. Amano. Efficient 3-D Bus Architectures for Inductive-Coupling ThruChip Interfaces. in *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, 2016; 24(2): 493-506.