# Optimizing timing closure and enhancing efficiency in RTL design: a focus on physical design tasks for I2C design blocks

## Madhura Ramegowda<sup>1</sup>, Krutthika Hirebasur Krishnappa<sup>2</sup>, Divyashree Yamadur Venkatesh<sup>3</sup>, Kokila Sreenivasa<sup>4</sup>

<sup>1</sup>Department of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, India
 <sup>2</sup>Department of Computer Science, Southern University and A&M College, Baton Rouge, United States
 <sup>3</sup>Department of Electronics and Communication Engineering, SJB Institute of Technology, Bangalore, India
 <sup>4</sup>Department of Electronics and Communication Engineering, Dayananda Sagar University Hosur Road, Bangalore, India

#### **Article Info**

## Article history:

Received Nov 23, 2024 Revised Apr 13, 2025 Accepted Jul 2, 2025

#### Keywords:

Application specific integrated circuit
Clock tree synthesis
Inter integrated circuit
Process voltage temperature
Resistor-capacitor
Synopsys design constraints

#### **ABSTRACT**

Achieving precise timing closure in integrated circuit (IC) design is a significant challenge, especially with today's rapid technology advancements and intricate design specifications. Even with intense post-synthesis optimization, timing violations persist, particularly in multi-corner, multi-mode designs. This research work emphasizes the necessity for power-efficient methods and streamlined approaches to improve timing closure and physical verification. Modern IC design relies on effective physical design optimization strategies, usually tackled top-down. Clock tree synthesis (CTS) plays a central role, as it addresses clock skew, latency, transition time, and insertion delay. This investigation mainly focuses on improving timing closure for inter integrated circuit (I2C) design blocks using custom-designed ccopt\_spec and mmmc.tcl files that support multi-corner, multi-mode settings, and it reduces register-to-register path violations from 80 to 0. Additionally, the development and use of the mmmc.tcl and global files are highlighted as critical components in the design process.

This is an open access article under the <u>CC BY-SA</u> license.




## Corresponding Author:

Madhura Ramegowda

Department of Electronics and Communication Engineering, Dayananda Sagar College of Engineering Bangalore, Karnataka, 560111, India

Email: madhura-ece@dayanandasagar.edu

## 1. INTRODUCTION

Integrating pre-designed gates and components from diverse designers into semiconductor devices demands meticulous attention to detail because the physical design activities have to strike a balance with the severe timing constraints. This process involves strategically placing buffers to control signal transitions, optimizing cell sizes to minimize delays, and fine-tuning wire lengths to ensure seamless integration and efficient performance. Critical stages in this process include floor planning, power distribution, component placement, and clock tree synthesis (CTS). The procedures outlined are necessary to enhance the performance and reliability of the whole design. There is an emphasis on precision and coordination so that each component will perfectly integrate to create an efficient and reliable semiconductor.

Floor planning involves strategically placing components, a task crucial in optimizing performance and minimizing interference. Power planning, on the other hand, involves efficient delivery and effective management of power throughout the design. This accurate placement of components reduces signal loss and delay. The CTS refers to the process of distributing clock signals uniformly to all sequential elements used in the design [1].

Journal homepage: http://ijeecs.iaescore.com

In CTS, clock signals should be distributed evenly to minimize clock skew, latency, and transition time. Clock skew is the difference between the arrival times of a clock signal at two different flip-flops, and clock latency is the delay from the clock origin to the clock pin of a flip-flop. Both are critical factors that must be handled with care to avoid timing-related issues in ASIC design. Buffers are strategically placed along the clock paths so that the clock delay remains consistent for all clock inputs, thereby distributing the clock signals with minimal differences in clock arrival times [2].
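To make the two definitions concrete, the following minimal sketch computes latency and skew from a set of clock arrival times. The flip-flop names and arrival values are hypothetical and are not taken from the I2C design.

```python
# Hypothetical clock arrival times (ns) at four flip-flop clock pins,
# measured from the clock origin. Values are illustrative only.
arrival_ns = {"ff_a": 0.42, "ff_b": 0.45, "ff_c": 0.40, "ff_d": 0.47}


def max_latency(arrivals):
    """Largest delay from the clock origin to any flip-flop clock pin."""
    return max(arrivals.values())


def global_skew(arrivals):
    """Largest arrival-time difference between any two flip-flops."""
    return max(arrivals.values()) - min(arrivals.values())


def pair_skew(arrivals, launch, capture):
    """Skew for one launch/capture pair: capture arrival minus launch arrival."""
    return arrivals[capture] - arrivals[launch]


if __name__ == "__main__":
    print(f"max latency: {max_latency(arrival_ns):.2f} ns")
    print(f"global skew: {global_skew(arrival_ns):.2f} ns")
    print(f"skew ff_a -> ff_b: {pair_skew(arrival_ns, 'ff_a', 'ff_b'):.2f} ns")
```

Balancing the clock tree with buffers corresponds to shrinking `global_skew` toward zero while keeping `max_latency` small.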

Global networks serve as the initial stages of CTS for clock signals, with clock propagation occurring only after necessary tunings are made through buffer and inverter addition. This technical method ensures the even propagation of clock signals to all relevant components, hence improving the overall effectiveness of the semiconductor design in meeting stringent timing requirements.

Significant contributions of this paper include:

- A thorough explanation of the physical design tasks for a particular netlist of inter integrated circuit (I2C) design blocks, covering power reduction and violating paths.
- Development of the ccopt\_spec file and the congestion report, comparison of the timing reports at each design step, and fixing of design violations.
- Development of the global files and mmmc.tcl for the design.

This research work presents a novel approach for I2C block timing optimization using customized physical design files and modified CTS methodologies for power-conscious timing closure and path violation fixing.

Designing an efficient and reliable very large-scale integration (VLSI) circuit is challenging because of design complexity, power consumption, design rule violations, and timing closure. An effective physical design strategy can address each of these challenges by providing smooth clock distribution, optimizing path delay, and integrating innovative engineering change order (ECO) methodologies. The literature review discusses various approaches developed to handle these constraints, focusing on recent developments in timing optimization, power efficiency, and layout precision to improve VLSI performance and scalability. The limitations of these approaches motivate the proposed work, which aims to build on the existing techniques and introduce new strategies to further improve design reliability and efficiency in complex VLSI systems.

Roy *et al.* [3] aimed to enhance the efficiency and performance of gated clock trees by accounting for the activity levels of different parts of the circuit and the placement of registers. The work addresses the challenges of designing clock trees that are both robust and efficient, particularly under varying process, voltage, and temperature (PVT) conditions, and introduces an advanced clock tree resynthesis approach to achieve timing closure across multiple operating conditions and modes. Lu *et al.* [4] established a gated CTS technique that incorporates power consumption and slew rate to enhance overall performance and efficiency in VLSI systems.

Liu *et al.* [5] introduced an approach to low-power gated clock tree design that aims to reduce clock tree power with less gating logic than prior clock gating research. Their proposed methodologies achieved a 30% reduction in clock divergence and improved skew variation across various corners, with a minimal 1% increase in wire length and a 2% rise in buffer area, as observed in multiple test scenarios. João *et al.* [6] investigated critical challenges in VLSI circuit design for testability, focusing on physical design methodologies to enhance IC test efficiency. Srivatsa *et al.* [7] reported higher fault coverage than the original designs. Additionally, they proposed a tweaked conventional boundary port CTS starting from the center of the block and a multi-source CTS with a symmetric H-tree, demonstrating reduced latency, skew, and power consumption.

Wu *et al.* [8] proposed a method for low clock skew applicable to the mainstream industry CTS design flow. Pranav and Hiremath [9] introduced a rail analysis for power grid quality, aiding in debugging IR drop issues and providing optimizations for lower technology nodes. Lu and Taskin [10] proposed a post-CTS clock delay insertion method that makes efficient use of limited delay insertion space, addresses timing violations, and optimizes the clock period. The papers [11]-[13] explore tunable clock buffers and inverters designed for clock skew optimization in the pre- and post-CTS stages. Lastly, the paper [14] discussed three major clustering algorithms for VLSI circuit partitioning, highlighting the K-medoid model's superior performance. Utility theory is suggested for decision-making under risk and uncertainty in the cell placement problem. Madhura and Jamuna [15] suggested a scan insertion method for defect analysis in the design synthesis process.

Stathis *et al.* [16] utilized a synchronous VLSI architecture to generate clock trees using the abutment approach. Static timing analysis (STA) was used to evaluate the scalable RCT built by abutment. The papers [16], [17] reported a decline in the chip's critical path latency and clock path standard deviation, a rise in the clock frequency, and growth in the clock tree structure. Based on this, the authors concluded that CTS is a difficult procedure, which justifies a mock evaluation of CTS that produces a pre-estimate using a heuristic approach without requiring a licensed EDA tool. Lopera *et al.* [18] discussed how traditional worst-case analysis techniques often underestimate design performance due to increased process variability in deep sub-micron technologies. For the chosen system on chip (SoC) architecture, 70% of violating paths were resolved by data path optimization, with the remaining 30% addressed by push-pull and ECO patches. Their proposed incremental ECO framework fixes 99.07% of violations in two rounds with only 30% manual effort, compared to the traditional flow's 96.43% fix rate and 100% manual effort.

Y. Kim and T. Kim [19], [20] proposed SuperFlow, which is customized for adiabatic quantum-flux-parametron (AQFP) superconducting circuits. During the design phase, SuperFlow respects the clocking and mixed cell-size limitations of AQFP circuits while concurrently optimizing wirelength and timing. Based on experimental results, SuperFlow performs better than other design tools for AQFP circuits in terms of time and wirelength, laying a solid foundation for future AQFP applications. Minnella *et al.* [21] discussed a universal RTL representation, the simple operator graph (SOG), that is utilized and modified across multi-stage machine learning models for area, power, worst negative slack (WNS), and total negative slack (TNS). Furthermore, they presented two data augmentation techniques that produce RTL designs closely resembling real-world data, thus resolving data scarcity problems. The papers [22]-[25] highlighted the synthesis process for slack time analysis and floor planning for high network communications.

## 2. METHOD

In our proposed research, we have utilized an I2C design at the RTL and conducted synthesis to generate a gate-level netlist. Subsequently, Synopsys design constraints (SDC) were generated to define the design's timing requirements, which served as inputs for subsequent physical design activities. These activities were performed to analyze timing performance for low-power PVT settings and thus provide insight into the setup and hold constraints.

The proposed methodology in Figure 1 illustrates the physical design process for an I2C RTL design. The first step is to synthesize the RTL to obtain a gate-level netlist and SDC, which are used in the physical design tasks to optimize the layout. Timing reports are then generated to verify that the design meets its timing constraints, and the results are fed back to tune the physical design tasks until the timing constraints are met.



Figure 1. Proposed architecture

The proposed architecture comprises five blocks, described as follows:

- a) I2C RTL design: the RTL for the I2C design is developed and used as the benchmark design.
- b) Synthesis: the Genus tool is used to carry out synthesis with SDC as input, producing a gate-level netlist and SDC as output.
- c) Gate-level netlist: the result of the synthesis process is a gate-level netlist, which is a textual depiction of the logic gates and connections in a circuit. It forms the basis for physical design and provides the information needed to understand the design and create the chip layout.
- d) Physical design tasks: these tasks are illustrated with the physical design flow.

Figure 2 outlines the physical design flow, starting with the design configuration and power planning. Physical-only cells are then integrated into the design to enhance signal integrity and aid power distribution. Next, the standard cells are placed for optimal performance, while CTS ensures the timing consistency of clocks. Lastly, the routing stage completes the design by connecting all the cells while meeting timing and design constraints. The proposed workflow encompasses the following steps: initialize the design setup, conduct floor planning, insert physical-only cells, implement power planning, perform placement, execute CTS, and carry out routing.



Figure 2. Physical design flow

## 2.1. Initialization of design setup

In the initial stage of our research, the RTL design is established, defining the digital circuit's behavior through hardware description languages such as Verilog. The primary objective is to optimize the RTL design early in the process, recognizing that downstream synthesis stages are less effective in addressing suboptimal RTL quality. However, assessing RTL quality directly from HDL code is challenging and time-consuming. Importing the design is a crucial step preceding the construction of the floorplan. This process requires supplying all essential inputs to the physical design implementation tool so that each subsequent stage can proceed.

Generating the MMMC file. The multi-mode multi-corner (MMMC) file is created so that static timing analysis can be conducted concurrently across various operating modes, process voltage temperature (PVT) corners, and interconnect parasitics. This file is generated using the Innovus implementation tool, with the inputs provided through the design import window. These inputs include the netlist (Netlist\_module\_i2c\_gln.v), the Library Exchange Format file (gsclib045.fixed2.lef), the view definition file (i2c.view), the power net (VDD), and the ground net (VSS). Lastly, a view file is created through a multidimensional decision-making process. This process involves finding a balance between the worst-case and best-case scenarios, and also entails adapting the design to meet operational realities. The right view files are obtained by navigating the different parameters in the MMMC browser.

The technical steps involved in the process of creating a view file through the MMMC browser are as follows:

- Choose the correct library set: the max (max.lib) or the min (min.lib) library set.
- Select the appropriate resistance-capacitance (RC) corners: max.rc or min.rc.
- Set the operating conditions: either max.op or min.op must be chosen, then select the delay corners by choosing max\_delay or min\_delay.
- Define the constraint modes, which involves determining which functional mode the design must satisfy.

The major consideration for the analysis views is the choice between the slow and the fast process. This choice has to be made for the setup analysis views and likewise for the hold analysis views, as it determines the delays used in each analysis. These choices allow users to adapt their view file creation to their design's unique needs and constraints, ensuring accurate and efficient analysis.
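The selection steps above amount to composing each analysis view from a library set, an RC corner, an operating condition, a delay corner, and a constraint mode. The sketch below models that composition in plain Python; it is not Innovus syntax. The view names `setup_view` and `hold_view` and the mode name `functional` are hypothetical placeholders, and pairing the setup view with the max corner and the hold view with the min corner is a common convention assumed here rather than something stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayCorner:
    name: str         # max_delay or min_delay
    library_set: str  # max.lib or min.lib
    rc_corner: str    # max.rc or min.rc
    op_cond: str      # max.op or min.op


@dataclass(frozen=True)
class AnalysisView:
    name: str             # hypothetical view name
    constraint_mode: str  # functional mode the design must satisfy
    delay_corner: DelayCorner
    check: str            # "setup" or "hold"


def build_views(mode):
    """Build one setup view (slow corner) and one hold view (fast corner)."""
    slow = DelayCorner("max_delay", "max.lib", "max.rc", "max.op")
    fast = DelayCorner("min_delay", "min.lib", "min.rc", "min.op")
    return [
        AnalysisView("setup_view", mode, slow, "setup"),
        AnalysisView("hold_view", mode, fast, "hold"),
    ]


def is_consistent(view):
    """Check that every ingredient of a view comes from the same corner family."""
    prefix = "max" if view.check == "setup" else "min"
    corner = view.delay_corner
    parts = (corner.name, corner.library_set, corner.rc_corner, corner.op_cond)
    return all(part.startswith(prefix) for part in parts)


if __name__ == "__main__":
    for view in build_views("functional"):
        print(view.name, view.delay_corner.name, is_consistent(view))
```

A consistency check of this kind can catch a view that accidentally mixes, for example, a max library set with a min RC corner.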

## 2.2. Floor planning phase

Our research involves a comprehensive floor planning process, determining the strategic placement of functional blocks within the chip area. This meticulous planning accounts for critical factors such as area, power, and signal routing, aiming to achieve an efficient layout that optimizes overall performance. Floor planning involves several crucial steps in the chip design development process.

In the first step, the netlist is associated with the physical library, which allows easy incorporation of essential design components into the layout. An initial core is created to act as the basic framework on which further development is built. The next step requires a strategic placement of input/output (IO) pins. Lastly, a pad ring is created to optimize the layout for effective connectivity and functionality.

The process continues with placing macros or standard cells and placement blockages. The specifications of power and ground nets are addressed, as these components are fundamental to the chip's functioning. Hence, the power and macro rings are accurately designed to ensure efficient energy distribution. The routing of power and ground nets is well organized to maintain coherence in organization and functionality in the design. Finally, a comprehensive validation process is performed to detect and rectify any deviations from design constraints, ensuring the fidelity and reliability of the overall chip design. This systematic approach underscores the significance of meticulous floor planning in driving the successful progression of the chip design process.

## 2.3. Incorporating physical-only cells

During the layout phase, physical-only cells, also referred to as filler cells, are inserted. These cells fill gaps and provide seamless connectivity, contributing to the uniformity and manufacturability of the layout. Cells that are absent from the design netlist are included purely as physical cells. These physical cells do not appear in timing path reports and are typically integrated at a later stage of chip development. Tap cells are non-logic cells equipped with substrate ties, well ties, or both. They are placed in the standard cell rows at predetermined intervals, following the guidelines in the design rule handbook. Tap cells become invaluable when most or all standard cells in the library lack substrate or well taps, ensuring the integrity and functionality of the chip design.

Another essential component in chip design is the end-cap cell, which is typically devoid of logic functions but serves various purposes, such as acting as a decoupling capacitor for the power rail. It is important to provide suitable end-cap cells, as the tool supports any standard cell intended for this purpose. The intentional use of proper end-cap cells helps designers control power distribution and maintain signal integrity in order to improve the overall resilience and reliability of a chip design.

## 2.4. Power planning phase

Our proposed methodology includes a robust power planning phase, which mainly focuses on distributing the power and ground networks. This step aims to minimize voltage drops and mitigate noise, eventually improving the overall performance of the IC. IC design involves a complex power distribution process with numerous critical elements that conduct power smoothly across the chip. The primary elements are the power rings, which create a loop for distributing VDD power around the chip and ensure that sufficient power is supplied to every section of the circuit. The rings are the main power distribution channels and help stabilize voltage levels by smoothing out variations in the circuit.

In addition to the power rings, power straps serve as intermediary connections that distribute ground (VSS) and supply (VDD) from the rings to other chip locations. These power straps also keep the power supply uniform for all functional blocks and standard cells, which reduces impedance and improves the overall performance of the chip. The final connections between the power supply and the cells are made using the power rails, which power the cells directly.

Electromigration (EM) and IR drop are essential considerations in power distribution system design. EM refers to the phenomenon by which electron flow through metallic traces leads to the gradual degradation of the material and, hence, calls for careful planning and dimensioning of power conductors to avoid the risks of high current densities. IR drop, on the other hand, is the voltage drop (current times resistance) caused by the resistance of the chip's power distribution network, which affects both the voltage delivered to the cells and the power dissipation.

The foundation of this chip architecture is a set of specialized components, including the core power ring and the vertical and horizontal power straps, which are essential for efficient power distribution to the core components. These distribute power evenly across the core area so that every functional unit receives adequate energy to perform optimally.

## 2.5. Placement phase

The placement phase assigns the logic cells to specific locations on the chip, with the goal of a strategic placement that minimizes wirelength and optimizes timing. Placement in chip design is a step of the place and route (PnR) methodology, where standard cells are placed optimally in the core area. Standard cells are essential building blocks defined logically in the netlist. Subsequently, they are placed in the physical design of the chip subject to the physical constraints defined in the Library Exchange Format (LEF) files.

The problem of cell placement is a very complex and challenging task since the quality of placement has a direct and crucial impact on the success of subsequent routing. Increased efficiency, lower power consumption, and reduced latency can be achieved only with an optimal placement. A design with optimal placement allows reliable signal routing paths, fewer interconnections, and lower parasitic capacitance and resistance. Furthermore, effective placement contributes toward fulfilling design constraints such as timing, area, and power needs. The placement quality has a significant effect on the success of the whole chip design; hence it is one of the most crucial steps in the PnR flow. Designers generally resort to complex algorithms and techniques to achieve optimum placement results. Several strategies are adopted to balance conflicting goals and optimize several metrics, including analytical placement, incremental placement, global placement, timing closure, congestion, and wirelength. Besides, hierarchical placement techniques are employed to handle the complexities of modern designs effectively. Therefore, skilled cell placement is critical in obtaining high performance, efficient area utilization, and manufacturability.
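Wirelength during placement is often estimated with the half-perimeter wirelength (HPWL) of each net's bounding box. The text does not state which estimate the tool uses, so the following is only an illustrative sketch; the cell names, net names, and coordinates are hypothetical.

```python
def hpwl(pins):
    """Half-perimeter wirelength of the bounding box around a net's pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(placement, nets):
    """Sum the HPWL over all nets for a given cell placement."""
    return sum(hpwl([placement[cell] for cell in cells]) for cells in nets.values())


# Hypothetical nets connecting four cells.
nets = {"n1": ["u1", "u2"], "n2": ["u2", "u3", "u4"], "n3": ["u1", "u4"]}

# Two hypothetical placements of the same cells (coordinates in micrometres).
spread_out = {"u1": (0, 0), "u2": (10, 0), "u3": (10, 8), "u4": (2, 6)}
compact = {"u1": (0, 0), "u2": (4, 0), "u3": (4, 3), "u4": (2, 2)}

if __name__ == "__main__":
    print("spread-out placement HPWL:", total_hpwl(spread_out, nets))
    print("compact placement HPWL:", total_hpwl(compact, nets))
```

Comparing the two totals shows how moving connected cells closer together lowers the estimated wirelength, which is the quantity the placement step tries to reduce alongside timing and congestion.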

## 2.6. CTS phase

The proposed approach performs CTS, which builds the clock network with a specific hierarchical level and distributes the clock signal to all the sequential elements with minimal skew for synchronous operation. CTS is one of the most fundamental steps in sequential circuit design flows. It aims to balance the connections between clock pins by inserting inverters or buffers. The ultimate purpose of CTS is to eliminate asymmetry and reduce the latency triggered by these additions to ensure uniform delivery of the clock signal across the chip. This process is critical to synchronizing the operation of sequential elements, thus enhancing the overall performance and reliability of the system. Typically, a single clock source drives all clock pins, and strategic planning is required to achieve clock balance and meet stringent design requirements.

Signals that violate design rule checks (DRCs), in particular the reset and scan enable signals with large fanouts, are buffered to improve signal integrity and reduce propagation delays. However, the clock signals, which are critical for timing synchronization, are not buffered in this way because of their vital role in keeping the timing accurate. Understanding timing constraints is essential to ensuring correct data capture in flip-flops. The setup time is the interval before the clock edge during which the data signal must be stable; it sets the timing requirement for data arrival at the destination flip-flop. The setup slack can be determined once the data arrival time is known, which is based on the delays of the launching flip-flop and a buffer cell. The required time, in turn, is obtained by subtracting the setup time of the capture flip-flop from the clock period.

The hold time represents the minimum duration a signal must remain stable after the clock edge for reliable data capture. Hold slack is calculated by comparing the data's arrival time with the capture flip-flop's required hold time. Equations (1)-(3) provide a framework for calculating setup slack, where $FF$ is the clk-to-q delay of the launch flip-flop, $T_C$ is the buffer cell delay, $C_p$ is the clock period, and $T_{su}$ is the setup time of the capture flip-flop. Equations (4)-(6) outline the methodology for determining hold slack, where $T_h$ represents the hold time. Positive slack values indicate adherence to timing constraints, whereas negative slack values signify timing violations that require optimization. Timing analysis tools use slack values to identify critical timing paths, enabling designers to refine the design iteratively and meet timing requirements. Thus, a comprehensive understanding of the setup and hold timing constraints is essential for ensuring robust and reliable chip designs.

Setup slack:

$$Arrival\ time = FF\ (clk - q\ delay\ of\ launch\ FF) + T_C\ (buffer\ cell\ delay) \tag{1}$$

$$Required\ time = C_p\ (clock\ period) - T_{su}\ (setup\ time\ of\ capture\ FF) \tag{2}$$

$$Setup\ slack = Required\ time - Arrival\ time \tag{3}$$

Hold slack:

$$Arrival\ time = FF\ (clk - q\ delay\ of\ launch\ FF) + T_C\ (buffer\ cell\ delay) \tag{4}$$

$$Required\ time = T_h\ (hold\ time\ of\ capture\ FF) \tag{5}$$

$$Hold\ slack = Arrival\ time - Required\ time \tag{6}$$
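As a worked illustration of (1)-(6), the sketch below evaluates both slacks for a single register-to-register path. The delay values are hypothetical and chosen only to show one passing and one failing case; they are not measurements from the I2C design.

```python
def setup_slack(clk_to_q, buffer_delay, clock_period, t_setup):
    """Setup slack following (1)-(3)."""
    arrival = clk_to_q + buffer_delay   # (1) arrival time
    required = clock_period - t_setup   # (2) required time
    return required - arrival           # (3) setup slack


def hold_slack(clk_to_q, buffer_delay, t_hold):
    """Hold slack following (4)-(6)."""
    arrival = clk_to_q + buffer_delay   # (4) arrival time
    required = t_hold                   # (5) required time
    return arrival - required           # (6) hold slack


if __name__ == "__main__":
    # Hypothetical values in nanoseconds.
    period, t_su, t_h, clk_q = 2.0, 0.10, 0.05, 0.20

    # Short data path: both checks pass (positive slack).
    print(setup_slack(clk_q, 0.30, period, t_su), hold_slack(clk_q, 0.30, t_h))

    # Long data path: the setup check fails (negative slack).
    print(setup_slack(clk_q, 1.80, period, t_su))
```

A negative result from `setup_slack` marks the path as violating, which is the condition the optimization steps aim to remove.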

## 2.7. Routing phase

Routing establishes the interconnections between placed cells through the metal layers, considering critical issues like timing constraints, congestion, and manufacturability. This is a crucial stage for achieving a robust and well-connected chip design. Following a set of pre-defined design rules, routing creates metal wires to connect pins carrying the same signal. The process uses a complex wiring network in the routing area to establish connections for all nets in the netlist. The routing process comprises several distinct steps: special routing, global routing, congestion analysis, detailed routing, minimization and optimization, timing optimization and analysis, and parasitic extraction.

Routing is implemented with the NanoRoute engine in the Innovus tool by executing commands in a specified order. The selected commands configure the routing operation and invoke the NanoRoute engine. The engine routes wires using a Manhattan approach, in which wires run mainly horizontally or vertically, to use the routing resources efficiently and shorten signal delay. With this routing methodology and the NanoRoute engine capabilities, designers can achieve routing solutions tailored to their specific design requirements, improving integrated circuit (IC) design performance and reliability.

Figure 3 shows the routing process in physical design, starting at NanoRoute initialization. It passes through the global and detailed routing phases for preliminary connections, followed by timing optimization to meet timing constraints. Finally, post-route optimization is performed before finalizing the design.



Figure 3. Detailed route in NanoRoute for wire and timing optimization

The commands in Figure 4 configure routing in the physical design process for timing and signal integrity. They enable timing-driven and signal integrity (SI)-driven routing and then initiate the routing process.

set\_db route\_design\_with\_timing\_driven true

set\_db route\_design\_with\_si\_driven true

route\_design

Figure 4. Commands for timing-driven and signal integrity-driven routing setup

The commands in Figure 5 temporarily turn off timing-driven routing, enable post-route wire spreading and high-effort multi-cut via usage, and rerun routing with wire optimization. Timing-driven routing is then turned back on, and post-route optimization is run for the setup and hold timing constraints.

```
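# Temporarily disable timing-driven routing
set_db route_design_with_timing_driven false
# Enable post-route wire spreading and high-effort multi-cut vias
set_db route_design_detail_post_route_spread_wire true
set_db route_design_detail_use_multi_cut_via_effort high
route_design -wire_opt
# Re-enable timing-driven routing, then optimize setup and hold after routing
set_db route_design_with_timing_driven true
opt_design -post_route -setup -hold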
```

Figure 5. Commands for post-route optimization and timing constraint adjustments

## 2.8. Timing reports

In physical design, timing reports are crucial for evaluating and improving a circuit's timing performance. These reports include clock skew, slack, setup and hold violations, and other crucial factors that influence timing closure. The main categories of physical design timing reports are the setup timing report, slack analysis, path analysis, data arrival time, and data required time.
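Slack figures from such reports are commonly condensed into a worst slack, a total negative slack (TNS), and a count of violating paths. The sketch below shows one way to compute these summaries; the path names and slack values are hypothetical and do not come from the reports of this design.

```python
def summarize_slacks(path_slacks):
    """Condense per-path slack values (ns) into three summary figures."""
    negative = [s for s in path_slacks.values() if s < 0]
    return {
        "worst_slack": min(path_slacks.values()),  # most critical path
        "tns": sum(negative),                      # total negative slack
        "violating_paths": len(negative),          # paths with slack < 0
    }


if __name__ == "__main__":
    # Hypothetical register-to-register paths and their setup slack in ns.
    slacks = {"p1": 0.12, "p2": -0.05, "p3": -0.20, "p4": 0.30}
    print(summarize_slacks(slacks))
```

Timing closure for a given check corresponds to `violating_paths` reaching zero, at which point `tns` is also zero.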

## 3. RESULTS AND DISCUSSION

The configuration of an I2C design requires the proper setting of a few critical inputs. One of the most vital inputs is the netlist module (\_i2C\_gln.v), which provides the logical description of the elements present in the design. A second input is the Library Exchange Format (LEF) file, gsclib045.fixed2.lef, which describes the physical parameters of the standard cells in the design library. The third input is the view definition file, i2c.view, which defines how the design elements will be described and viewed. Identifying the power and ground networks (designated as VDD and VSS, respectively) is imperative to ensure the correct functioning of the circuit.

At the floor planning stage, the goal is to arrange macros and obstructions efficiently so as to consolidate an area that makes sense for standard cell placement. This involves specifying various parameters such as aspect ratio (approximately 1), core utilization (targeting 40% of the available area), and core margins (including margins to the left, right, top, and bottom of the core area). Furthermore, in Figure 6, the pins are placed in metal layers 2 and 3, indicating the routing layers used to connect the circuit components. These detailed specifications ensure the floor planning process establishes a solid foundation for subsequent placement and routing stages in the chip design process.
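The relationship between these floorplan parameters can be illustrated with a short calculation. In the sketch below, only the 40% utilization and the aspect ratio of 1 come from the text; the total standard cell area and the core margin are hypothetical values, and the aspect ratio is assumed to mean core height divided by core width.

```python
import math


def core_dimensions(cell_area_um2, utilization, aspect_ratio):
    """Core width and height (um) for a target utilization and aspect ratio."""
    core_area = cell_area_um2 / utilization       # area needed at this utilization
    width = math.sqrt(core_area / aspect_ratio)   # aspect_ratio = height / width
    height = aspect_ratio * width
    return width, height


def die_dimensions(core_width, core_height, margin_um):
    """Die size when the same margin is applied on all four sides of the core."""
    return core_width + 2 * margin_um, core_height + 2 * margin_um


if __name__ == "__main__":
    # Hypothetical total standard cell area of 4000 um^2 and a 10 um margin.
    w, h = core_dimensions(4000.0, 0.40, 1.0)
    print(f"core: {w:.1f} um x {h:.1f} um")
    print("die :", die_dimensions(w, h, 10.0))
```

A lower utilization target enlarges the core for the same cell area, leaving more room for routing and for buffers added during CTS and optimization.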



Figure 6. Floor-plan structure with pins placed in metal layers 2 and 3

Figure 7 illustrates the physical cells only, namely the end cap and tap cells, together with the power planning configuration. This view of the layout shows the arrangement of the physical cells and the integration of the end cap and tap cells. It also depicts the distribution of power resources throughout the layout, highlighting the power planning that ensures the proper functioning and performance of the IC design.



Figure 7. Power planning layout with physical cells, end cap cells, and tap cells

## 3.1. DRC and connectivity check analysis with routing adjustments

Two connectivity violations were identified, along with routing violations, as illustrated in Figure 8. Routing adjustments were then implemented to address these issues. The routing corrections eliminated all DRC violations, as shown by the DRC results in Figure 9.

**Verify Connectivity**

```
Begin Summary
2 Problem(s) (IMPVFC-98): Net has no global routing and no special routing.
2 total info(s) created.
End Summary

End Time: Tue Sep 28 09:22:22 2021
Time Elapsed: 0:00:17.0

******** End: VERIFY CONNECTIVITY *******
Verification Complete : 2 Viols. 0 Wrngs.
(CPU Time: 0:00:01.0 MEM: 0.000M)
```

Figure 8. Window output for connectivity verification

The placement of standard cells is a crucial aspect of VLSI chip physical design. It aims to minimize the chip area and the lengths of the wires connecting its cells to ensure optimal performance, as depicted in Figure 10. The primary objective of the placement step is to position cells so that the overall wire length is minimized, which enhances the efficiency of the chip design.
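Wire length during placement is commonly estimated with the half-perimeter wirelength (HPWL) of each net's bounding box. The sketch below shows that estimate on a toy netlist. It illustrates the general metric and is not the cost function of the tool used in this work; the net names and pin coordinates are hypothetical.

```python
def hpwl(pins):
    """Half-perimeter of the bounding box around a net's pin locations."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_hpwl(nets):
    """Sum of the HPWL estimates over all nets of a placement."""
    return sum(hpwl(pins) for pins in nets.values())

# Hypothetical pin coordinates (um) for two candidate placements.
placement_a = {"n1": [(0, 0), (3, 4)], "n2": [(1, 1), (2, 5), (6, 2)]}
placement_b = {"n1": [(0, 0), (2, 2)], "n2": [(1, 1), (2, 3), (4, 2)]}

print("placement A:", total_hpwl(placement_a))
print("placement B:", total_hpwl(placement_b))
```

A placer would prefer the candidate with the smaller total, here placement B, subject to the area and legality constraints.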

**Verify DRC**

```
innovus 3> verify_drc
 *** Starting Verify DRC (MEM: 973.2) ***
  VERIFY DRC ..... Starting Verification
  VERIFY DRC ..... Initializing
  VERIFY DRC ..... Deleting Existing Violations
  VERIFY DRC ..... Creating Sub-Areas
  VERIFY DRC ..... Using new threading
  VERIFY DRC ..... Sub-Area: {0.000 0.000 55.680 53.760} 1 of 4
  VERIFY DRC ..... Sub-Area : 1 complete 0 Viols.
  VERIFY DRC ..... Sub-Area: {55.680 0.000 109.600 53.760} 2 of 4
  VERIFY DRC ..... Sub-Area : 2 complete 0 Viols.
  VERIFY DRC ..... Sub-Area: {0.000 53.760 55.680 107.350} 3 of 4
  VERIFY DRC ..... Sub-Area : 3 complete 0 Viols.
  VERIFY DRC ..... Sub-Area: {55.680 53.760 109.600 107.350} 4 of 4
  VERIFY DRC ..... Sub-Area : 4 complete 0 Viols.
  Verification Complete: 0 Viols.
 *** End Verify DRC (CPU: 0:00:00.1 ELAPSED TIME: 0.00 MEM: 5.0M) ***
```

Figure 9. DRC results and analysis
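When such checks are repeated after every routing adjustment, the violation count can be read from the log automatically. The sketch below is a minimal example that assumes the "Verification Complete" line format shown in Figures 8 and 9; other tool versions may print it differently, and the function name is hypothetical.

```python
import re

def parse_violations(log_text):
    """Return the violation count from a 'Verification Complete' line."""
    match = re.search(r"Verification Complete\s*:\s*(\d+)\s+Viols\.", log_text)
    if match is None:
        raise ValueError("no 'Verification Complete' line found")
    return int(match.group(1))

# Summary lines copied from the connectivity and DRC outputs above.
connectivity_log = "Verification Complete : 2 Viols. 0 Wrngs."
drc_log = "  Verification Complete: 0 Viols."

print("connectivity violations:", parse_violations(connectivity_log))
print("DRC violations:", parse_violations(drc_log))
```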



Figure 10. Standard cell placement analysis

## 3.2. Setup timing report

### 3.2.1. Views for PreCTS

The setup timing report provides insights into the performance characteristics of the design pre-CTS. Analysis views, which represent different perspectives of the physical layout of the chip or IC, play a crucial role throughout the design phase, aiding in various analyses and verifications. These views include the geometric view, focusing on the detailed physical layout; the abstract view, offering a simplified representation for early design exploration; and the connectivity view, which highlights the interconnections between components. Each analytical view serves a distinct purpose in ensuring that the physical design meets performance, power, and space criteria. The worst negative slack (WNS) and total negative slack (TNS) of the timing paths are important metrics, which are analyzed in the setup views of the slow and fast scenarios, as illustrated in Figure 11. Figure 12 provides a summary of the design information, showing the principal parameters and attributes of the design. A detailed summary is critical for understanding the design specifications and supports communication and decision-making within the design process.
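WNS, TNS, and the violating-path count are all derived from the per-path slack values. The sketch below shows the aggregation under the usual convention that WNS is the smallest slack and TNS is the sum of the negative slacks only. The slack values are hypothetical and the function name is illustrative.

```python
def summarize_slacks(slacks_ns):
    """Aggregate per-path setup slacks (ns) into WNS, TNS, and a path count."""
    if not slacks_ns:
        raise ValueError("at least one path slack is required")
    violating = [s for s in slacks_ns if s < 0]
    return {
        "wns": min(slacks_ns),        # worst slack; positive if nothing violates
        "tns": sum(violating),        # only negative slacks contribute
        "violating_paths": len(violating),
    }

# Hypothetical slacks for four timing paths.
summary = summarize_slacks([-0.72, -0.10, 0.05, 0.30])
print(summary)

# A clean path group: WNS is positive and TNS is zero.
clean = summarize_slacks([0.055, 0.20])
print(clean)
```

Under this convention a group with no violations reports a positive WNS together with a TNS of zero.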



Figure 11. Setup views for slow and fast scenarios



Figure 12. Design information overview

## 3.3. Clocks in the design

The clock signal is physically propagated to each successive cell in the chip layout as part of the design. Clock jitter and clock skew are two physical attributes of the clock signal that are significant considerations in clock distribution. Figure 13 shows the PCLK clock definition for both the fast and slow views, emphasizing the importance of clock choice in achieving maximum performance and timing accuracy.

Clock Descriptions

| Clock Name | Source | View | Period | Lead | Trail | Generated | Propagated |
|------------|--------|------|--------|------|-------|-----------|------------|
| PCLK       |        |      | 2.000  |      |       |           | n          |
| PCLK       |        |      | 2.000  |      |       |           | n          |

Figure 13. Clock distribution in RTL design
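The PCLK period reported above fixes the target clock frequency, and clock skew can be expressed as the spread of clock arrival times across the sinks. The sketch below illustrates both calculations. The sink names and latency values are hypothetical; only the 2.000 ns period is taken from the clock description.

```python
def clock_frequency_mhz(period_ns):
    """Convert a clock period in ns to a frequency in MHz."""
    return 1000.0 / period_ns

def global_skew(sink_latencies_ns):
    """Difference between the latest and earliest clock arrival at the sinks."""
    arrivals = sink_latencies_ns.values()
    return max(arrivals) - min(arrivals)

# PCLK period from the clock description table.
freq = clock_frequency_mhz(2.000)

# Hypothetical clock latencies (ns) at three register clock pins.
latencies = {"ff_a/CK": 0.412, "ff_b/CK": 0.398, "ff_c/CK": 0.431}
skew = global_skew(latencies)

print(f"PCLK frequency: {freq:.1f} MHz, global skew: {skew:.3f} ns")
```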

## 3.4. Placement optimization

Figure 14 shows the setup views after optimization and highlights the improvements in component placement within the design canvas. This optimization improves performance by reducing signal delays and helps the design meet its timing constraints, which improves the overall efficiency and reliability of the design.

timeDesign Summary (setup views; setup mode columns: all, reg2reg, default). Legible timing entries: WNS -0.479, TNS -21.312, and 72 violating paths.

| DRVs       | Real: Nr nets(terms) | Real: Worst Vio | Total: Nr nets(terms) |
|------------|----------------------|-----------------|-----------------------|
| max_cap    | 0 (0)                | 0.000           | 0 (0)                 |
| max_tran   | 0 (0)                | 0.000           |                       |
| max_fanout | 0 (0)                | 0               | 0 (0)                 |
| max_length | 0 (0)                | 0               | 0 (0)                 |

Figure 14. Setup views after optimization

## 3.5. Clock tree synthesis

Figure 15 shows the non-default rule (NDR) configuration, which is used to define the initial placement of the clock source, increase insertion delay, and make the design less susceptible to crosstalk. Figure 16 further outlines the CTS configuration, showing the hierarchical distribution of the clock signals in the design. Together, these figures provide essential views of the optimization techniques applied to enhance timing performance and reduce signal integrity issues in the design.

Figure 17 shows the CCOpt specification file, which contains the clock constraints used in the Innovus tool. These constraints guide CTS and ensure the proper functionality of the design with respect to timing and clock distribution. The configuration file shown in the figure is central to developing valid and efficient clocking schemes in chip design.

After completing CTS and optimization, the routing process entails physically connecting signal pins using metal layers. Figure 18 illustrates how this process defines the precise pathways for joining standard cells, macros, and I/O pins within the design.

```
# Non-default rule 2w2s: per-layer wire width and spacing for clock nets
add_ndr -width {Metal1 0.12 Metal2 0.14 Metal3 0.14 Metal4 0.14 Metal5 0.14 Metal6 0.14 Metal7 0.14 Metal8 0.14 Metal9 0.14} \
        -spacing {Metal1 0.12 Metal2 0.14 Metal3 0.14 Metal4 0.14 Metal5 0.14 Metal6 0.14 Metal7 0.14 Metal8 0.14 Metal9 0.14} \
        -name 2w2s
# Route type for clock nets, using the NDR on Metal5 and Metal6
create_route_type -name clkroute -non_default_rule 2w2s \
        -bottom_preferred_layer Metal5 -top_preferred_layer Metal6
set_ccopt_property route_type clkroute -net_type trunk
set_ccopt_property route_type clkroute -net_type leaf
# Cells that CTS may use for buffering, inversion, and clock gating
set_ccopt_property buffer_cells {CLKBUFX8 CLKBUFX12}
set_ccopt_property inverter_cells {CLKINVX8 CLKINVX12}
set_ccopt_property clock_gating_cells TLATNTSCA*
# Write the clock tree specification
create_ccopt_clock_tree_spec -file ccopt.spec
```

Figure 15. NDR setup for CTS



Figure 16. CTS configuration

```
# Clock tree setup script - dialect: Innovus
if { [get_ccopt_clock_trees] != {} } {
  error {Cannot run clock tree spec: clock trees are already defined.}
}
namespace eval ::ccopt {}
namespace eval ::ccopt::ilm {}
set ::ccopt::ilm::ccoptSpecRestoreData {}
# Start by checking for unflattened ILMs.
# Will flatten if so and then check the db sync.
if { [catch {ccopt_check_and_flatten_ilms_no_restore}] } {
  return -code error
}
# cache the value of the restore command output by the ILM flattening code
set ::ccopt::ilm::ccoptSpecRestoreData $::ccopt::ilm::ccoptRestoreILMState
# The following pins are clock sources
set_ccopt_property cts_is_sdc_clock_root -pin PCLK true
# Clocks present at pin PCLK
#   PCLK (period 2.000ns) in timing config func_mode([/StudentData/student162/pd_work/assignment/power.enc.dat/libs/mmmc/module_
create_ccopt_clock_tree -name PCLK -source PCLK -no_skew_group
set_ccopt_property source_output_max_trans -delay_corner max_delay -early -clock_tree
set_ccopt_property source_output_max_trans -delay_corner min_delay -early -clock_tree PCLK 0.200
set_ccopt_property source_output_max_trans -delay_corner max_delay -late -clock_tree PCLK 0.200
set_ccopt_property source_output_max_trans -delay_corner min_delay -late -clock_tree
# Clock period setting for source pin of PCLK
set_ccopt_property clock_period -pin PCLK 2
```

Figure 17. Clock constraints file

## 3.6. Comparison of Pre-CTS and Post-CTS optimization for setup views

The comparison between the Pre-CTS and Post-CTS stages after optimization reveals significant enhancements in timing performance. The magnitude of TNS decreased notably for all paths, demonstrating improved timing closure. Moreover, the reduction in the number of violating paths indicates the effectiveness of the optimization process in resolving timing violations. These findings underscore the importance of CTS and optimization in achieving robust timing integrity and overall design reliability.

As shown in Table 1, there has been a significant reduction in TNS for all paths, from -82.491 to -21.312. Specifically, the number of violating paths decreased from 80 to 0 for the reg2reg paths and from 119 to 72 for all paths.



Figure 18. Routing of design

Table 1. Comparison of Pre-CTS and Post-CTS optimization for setup views

| Setup mode      | PreCTS: all | PreCTS: reg2reg | PreCTS: default | PostCTS after opt: all | PostCTS after opt: reg2reg | PostCTS after opt: default |
|-----------------|-------------|-----------------|-----------------|------------------------|----------------------------|----------------------------|
| WNS             | -1.292      | -0.720          | -1.292          | -0.479                 | 0.055                      | -0.479                     |
| TNS             | -82.491     | -19.140         | -82.491         | -21.312                | 0.000                      | -21.312                    |
| Violating paths | 119         | 80              | 119             | 72                     | 0                          | 72                         |
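The relative improvement implied by Table 1 can be worked out directly from its entries. The sketch below computes the percentage reduction in magnitude for TNS and for the violating-path count, using only the values listed in the table. The helper name and the dictionary layout are illustrative.

```python
# Values copied from Table 1 (setup views).
table1 = {
    "all": {
        "pre":  {"wns": -1.292, "tns": -82.491, "paths": 119},
        "post": {"wns": -0.479, "tns": -21.312, "paths": 72},
    },
    "reg2reg": {
        "pre":  {"wns": -0.720, "tns": -19.140, "paths": 80},
        "post": {"wns": 0.055, "tns": 0.000, "paths": 0},
    },
}

def reduction_pct(before, after):
    """Percentage reduction in magnitude going from before to after."""
    return 100.0 * (abs(before) - abs(after)) / abs(before)

for group, stages in table1.items():
    tns_gain = reduction_pct(stages["pre"]["tns"], stages["post"]["tns"])
    path_gain = reduction_pct(stages["pre"]["paths"], stages["post"]["paths"])
    print(f"{group}: TNS reduced by {tns_gain:.1f}%, "
          f"violating paths reduced by {path_gain:.1f}%")
```

For the reg2reg group both reductions are 100%, which matches the positive post-CTS WNS of 0.055 in the table.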

### 4. CONCLUSION

In conclusion, this research paper showcases the efficient execution of physical design tasks. CTS successfully reduces the overall negative slack while addressing violations through the integration of buffers and inverters and adherence to clock source specifications. The generation of global and multi-mode multi-corner files for the design has been accomplished successfully. Comprehensive reports on area, power, and timing at each physical design level were also developed and compared. Furthermore, measures were taken to address any identified design defects, including generating the ccopt\_spec file and a congestion report.

## **ACKNOWLEDGEMENTS**

The authors would like to extend their sincere gratitude to the VLSI Design Laboratory team at Dayananda Sagar College of Engineering for their technical guidance and access to design tools used in the study. Their support has been instrumental in refining several aspects of the proposed physical design methodology.

## FUNDING INFORMATION

Authors state no funding involved.

## **AUTHOR CONTRIBUTIONS STATEMENT**

This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.

| Name of Author                 | C | M | So | Va | Fo | I | R | D | O | E | Vi | Su | P | Fu |
|--------------------------------|---|---|----|----|----|---|---|---|---|---|----|----|---|----|
| Madhura Ramegowda              | ✓ | ✓ | ✓  | ✓  | ✓  | ✓ |   | ✓ | ✓ | ✓ |    |    | ✓ |    |
| Krutthika Hirebasur Krishnappa |   | ✓ |    | ✓  |    | ✓ |   | ✓ | ✓ | ✓ | ✓  | ✓  |   |    |
| Divyashree Yamadur Venkatesh   | ✓ |   | ✓  | ✓  |    | ✓ |   |   | ✓ |   | ✓  |    | ✓ |    |
| Kokila Sreenivasa              |   | ✓ | ✓  |    |    |   |   |   |   |   |    |    |   |    |

Fo: Formal analysis E: Writing - Review & Editing

#### CONFLICT OF INTEREST STATEMENT

Authors state no conflict of interest.

#### INFORMED CONSENT

Not applicable. No personal or individual data was used that would require informed consent.

#### ETHICAL APPROVAL

Not applicable. The research presented does not involve human or animal subjects.

#### DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding author upon reasonable request. Timing reports, CCOpt specifications, and routing snapshots generated during physical design tasks are archived and will be shared to promote transparency and reproducibility upon inquiry.

#### REFERENCES

- [1] W. Shen, Y. Cai, X. Hong, and J. Hu, "An effective gated clock tree design based on activity and register aware placement," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 18, no. 12, pp. 1639–1648, 2010, doi: 10.1109/TVLSI.2009.2030156.
- [2] A. Rajaram and D. Z. Pan, "Robust chip-level clock tree synthesis," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 30, no. 6, pp. 877–890, Jun. 2011, doi: 10.1109/TCAD.2011.2106852.
- [3] S. Roy, P. M. Mattheakis, L. Masse-Navette, and D. Z. Pan, "Clock tree resynthesis for multi-corner multi-mode timing closure," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 34, no. 4, pp. 589–602, Apr. 2015, doi: 10.1109/TCAD.2015.2394310.
- [4] J. Lu, W. K. Chow, and C. W. Sham, "Fast power- and slew-aware gated clock tree synthesis," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 20, no. 11, pp. 2094–2103, Nov. 2012, doi: 10.1109/TVLSI.2011.2168834.
- [5] W. Liu, C. Sitik, E. Salman, B. Taskin, S. Sundareswaran, and B. Huang, "SLECTS: slew-driven clock tree synthesis," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 27, no. 4, pp. 864–874, Apr. 2019, doi: 10.1109/TVLSI.2018.2888958.
- [6] J. João, H. T. De Sousa, F. M. Gonçalves, and J. P. Teixeira, "Physical design of testable CMOS digital integrated circuits," *IEEE Journal of Solid-State Circuits*, vol. 26, no. 7, pp. 1064–1072, Jul. 1991, doi: 10.1109/4.92027.
- [7] V. G. Srivatsa, A. P. Chavan, and Di. Mourya, "Design of low power high performance multi source H-tree clock distribution network," in *Proceedings of 2nd International Conference on VLSI Device, Circuit and System, VLSI DCS 2020*, Jul. 2020, pp. 468–473, doi: 10.1109/VLSIDCS47293.2020.9179954.
- [8] G. Wu, S. Jia, Y. Wang, and G. Zhang, "An efficient clock tree synthesis method in physical design," in *2009 IEEE International Conference on Electron Devices and Solid-State Circuits, EDSSC 2009*, Dec. 2009, pp. 190–193, doi: 10.1109/EDSSC.2009.5394159.
- [9] G. S. Pranav and S. Hiremath, "Physical design flow for faster TAT in lower technology nodes," *International Journal of Engineering Research and Technology (IJERT)*, vol. 11, no. 6, 2022, doi: 10.17577/IJERTV11IS060299.
- [10] J. Lu and B. Taskin, "Post-CTS delay insertion," *Midwest Symposium on Circuits and Systems*, 2010.
- [11] M. R. Gowda and Jamuna, "Fault simulation for design for testability inserted designs," *Indonesian Journal of Electrical Engineering and Computer Science*, vol. 29, no. 2, pp. 658–668, Feb. 2023, doi: 10.11591/ijeecs.v29.i2.pp658-668.
- [12] R. Madhura and M. J. Shantiprasad, "Implementation of scan logic and pattern generation for RTL design," in *New Trends in Computational Vision and Bio-Inspired Computing Selected Works Presented at the ICCVBIC 2018*, Springer International Publishing, 2020, pp. 385–396.
- [13] N. Parthibhan, S. Ravi, and K. H. Mallikarjun, "Clock skew optimization in pre and post CTS," in *Proceedings 2012 International Conference on Advances in Computing and Communications, ICACC 2012*, Aug. 2012, pp. 146–149, doi: 10.1109/ICACC.2012.33.
- [14] M. Johann, "Recent advances and challenges in physical design automation," in *2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)*, Aug. 2013, pp. 223–223, doi: 10.1109/isvlsi.2013.6654613.
- [15] R. Madhura and S. Jamuna, "Optimised DFT architecture through scan based design," *14th International Conference on Advances in Computing, Control, and Telecommunication Technologies (ACT 2023)*, pp. 1093–1101, 2023.
- [16] D. Stathis, P. Chaourani, S. M. A. H. Jafri, and A. Hemani, "Clock tree generation by abutment in synchoros VLSI design," in *2021 IEEE Nordic Circuits and Systems Conference, NORCAS 2021 - Proceedings*, Oct. 2021, pp. 1–7, doi: 10.1109/NorCAS53631.2021.9599857.
- [17] N. Kwon and D. Park, "Shallow clock tree pre-estimation for designing clock tree synthesizable verilog RTLs," *Electronics (Switzerland)*, vol. 12, no. 20, p. 4340, Oct. 2023, doi: 10.3390/electronics12204340.
- [18] D. S. Lopera, R. N. Kunzelmann, E. Kaja, and W. Ecker, "Fake Timer: an engine for accurate timing estimation in register transfer level designs," in *Proceedings - International Symposium on Quality Electronic Design, ISQED*, Apr. 2024, pp. 1–8, doi: 10.1109/ISQED60706.2024.10528723.

- [19] Y. Kim and T. Kim, "Synthesis and exploration of clock spines," *IET Computers and Digital Techniques*, vol. 12, no. 5, pp. 241–248, Jul. 2018, doi: 10.1049/iet-cdt.2017.0234.

- [20] Y. Kim and T. Kim, "Algorithm for synthesis and exploration of clock spines," in *Proceedings of the Asia and South Pacific Design Automation Conference*, ASP-DAC, Jan. 2017, pp. 263–268, doi: 10.1109/ASPDAC.2017.7858330.
- [21] F. Minnella, J. Cortadella, M. R. Casu, M. T. Lazarescu, and L. Lavagno, "Mix & Latch: an optimization flow for high-performance designs with single-clock mixed-polarity latches and flip-flops," *IEEE Access*, vol. 11, pp. 35830–35840, 2023, doi: 10.1109/ACCESS.2023.3265809.
- [22] R. Madhura, K. H. Krishnappa, R. Manasa, and K. P. Yashaswini, "Slack time analysis for APB timer using genus synthesis tool," in *Lecture Notes in Networks and Systems*, vol. 754 LNNS, Springer Nature Singapore, 2023, pp. 207–217.
- [23] S. Yu, S. Du, and C. Yang, "A deep reinforcement learning floorplanning algorithm based on sequence pairs," *Applied Sciences (Switzerland)*, vol. 14, no. 7, p. 2905, Mar. 2024, doi: 10.3390/app14072905.
- [24] A. B. Chong, "Hybrid multisource clock tree synthesis," in *2021 28th IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2021 Proceedings*, Nov. 2021, pp. 1–6, doi: 10.1109/ICECS53924.2021.9665516.
- [25] J. Vygen, "Slack in static timing analysis," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 25, no. 9, pp. 1876–1885, Sep. 2006, doi: 10.1109/TCAD.2005.858348.

## **BIOGRAPHIES OF AUTHORS**



**Dr. Madhura Ramegowda** received the Ph.D. degree in VLSI verification and testing from VTU in 2024. She received the M.Tech. degree in VLSI Design and Embedded Systems from JSSATE, VTU in 2013. She has 14 years of teaching experience. Her research interests are VLSI design and digital hardware design on FPGA. She has published 25 papers in reputed journals and conferences. She can be contacted at email: madhura-ece@dayanandasagar.edu.





**Mrs. Divyashree Yamadur Venkatesh** is currently working as an Assistant Professor in the Department of Electronics and Communication Engineering at SJB Institute of Technology, Bangalore. She obtained her B.E. degree in Electronics and Communication Engineering from Visvesvaraya Technological University, Belgaum in 2008, and her M.Tech. degree in VLSI Design and Embedded System Design from Visvesvaraya Technological University, Belgaum in 2011. She has 13 years of teaching experience. She has published 13 papers in various journals and conferences. She can be contacted at email: divyashreeyy@gmail.com.



**Kokila Sreenivasa** is currently working as an Assistant Professor in the Department of ECE at Dayananda Sagar University. She completed her Bachelor's degree in 2008 from VTU and her M.Tech. degree, with specialisation in Digital Electronics and Communication, in 2015 from VTU. She has 10 years of teaching experience. She can be contacted at email: kokilakavana@gmail.com.