Placement
Traditionally, placement is the design stage that follows logic synthesis and precedes routing. In the VLSI design flow, logic synthesis generates a netlist, placement then determines the locations of the circuit modules in that netlist, and routing finally lays out the nets that connect them.
Placement is a critical step in the VLSI design process for the following reasons:
First, placement is a key factor in determining the performance of a circuit. Placement largely determines the length and, hence, the delay of interconnecting wires. As feature sizes shrink in advanced VLSI technologies, interconnect delay has become the determining factor of circuit performance: it can consume as much as 75% of the clock cycle in advanced designs. Therefore, an effective placement solution can substantially improve circuit performance.
Second, placement determines design routability. A well-constructed placement solution reduces routing demand (i.e., it yields a shorter total wirelength) and distributes that demand more evenly to prevent routing hot spots.
Third, placement determines heat distribution on a die surface. An uneven temperature profile can lead to reliability and timing problems.
Fourth, power consumption is also affected by placement. A well-designed placement solution can reduce the capacitive load due to wires (by having shorter wires and larger separations between adjacent wires), which in turn reduces switching power.
It has also become essential for the logic synthesis stage to incorporate placement techniques, that is, to perform physical-design-aware logic synthesis (physical synthesis). Without placement information, it is impossible to estimate the delay of interconnecting wires. Given the significance of interconnect delay, logic synthesis then has no meaningful timing information to guide it, and the synthesized netlist performs poorly after placement. For the same reason, consideration of placement information during architecture design is becoming more common.
One way to manage the complexity of the placement problem is to perform placement in several manageable steps. One common flow is as follows.
1. Global placement. Global placement aims at generating a rough placement solution that may violate some placement constraints (e.g., overlaps among modules) while maintaining a global view of the whole netlist.
2. Legalization. The legalization process renders the rough solution from global placement legal (i.e., there is no violation of placement constraints).
3. Detailed placement. Detailed placement further improves the legalized placement solution by rearranging a small group of modules in a local region. This is done while keeping all other modules fixed.
The global placement step is the most critical of the three. It has the most impact on placement solution quality and runtime, and it has been the focus of most prior research. After global placement, the placement solution is almost determined; legalization and detailed placement make only local changes to module locations. Therefore, this chapter focuses on the global placement step. The most commonly used global placement approaches are the partitioning-based approach, the simulated annealing approach, and the analytical approach. The analytical approach is presented in the most detail because it is currently the most effective approach in both quality and runtime.
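To make the analytical approach concrete, here is a minimal sketch of quadratic placement in one dimension. It assumes a hypothetical chain of movable cells between two fixed pads and minimizes the sum of squared wire lengths; setting the derivative to zero puts every cell at the average of its two neighbours, which the code solves by simple Gauss-Seidel sweeps. The function name and the example netlist are illustrative only, and real placers solve much larger sparse systems and add spreading forces to reduce overlap.

```python
def quadratic_place_chain(left_pad, right_pad, n_cells, sweeps=200):
    """Place n_cells on a line for the chain: left_pad - c0 - c1 - ... - right_pad.

    Minimizes the sum of squared two-pin wire lengths. At the optimum each
    cell sits at the average of its neighbours, so we iterate that update.
    """
    xs = [left_pad] * n_cells  # arbitrary starting positions
    for _ in range(sweeps):
        for i in range(n_cells):
            left = xs[i - 1] if i > 0 else left_pad
            right = xs[i + 1] if i < n_cells - 1 else right_pad
            xs[i] = (left + right) / 2.0
    return xs


# Three movable cells between pads at x = 0 and x = 8.
positions = quadratic_place_chain(0.0, 8.0, 3)
print([round(x, 3) for x in positions])
```

For this chain the positions converge toward an even spread (2, 4, 6). In a real netlist many cells would collapse toward the same spot, which is why the result of global placement still needs legalization.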
Objective of Placement
∙ Place the standard cells in legal locations
∙ Minimize all the critical net delays in the design
∙ Make sure the router can complete the routing with minimal DRC violations by reducing congestion
Placement Targets
∙ Timing, Power and Area optimization
∙ Minimum congestion
∙ Minimal cell density, pin density and congestion hot-spots
∙ Minimal Timing DRVs
∙ Total wire length
∙ Routability
∙ Performance
∙ Heat Dissipation
Inputs of Placement
∙ Gate Level Netlist
∙ Design Library files (.lib and .lef)
∙ Design Constraints (SDC)
∙ Technology file
∙ Floorplan and Power plan DEF file (Floorplanned Design)
Needs of Placement
There are four main reasons why placement matters:
1. Placement is a key factor in determining the performance of a circuit. Placement largely determines the length and, hence, the delay of interconnect wires. Interconnect delay can consume as much as 75% of the clock cycle in advanced designs. Therefore, a good placement solution can substantially improve the performance of a circuit.
2. Placement determines the routability of a design. A well-constructed placement solution has less routing demand (i.e., shorter total wire length) and distributes the routing demand more evenly to avoid routing hotspots.
3. Placement decides the distribution of heat on a die surface. An uneven temperature profile can lead to reliability and timing problems.
4. Placement affects power consumption. A well-designed placement solution reduces the capacitive load due to wires (shorter wires and larger separations between adjacent wires), which reduces switching power.
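Wire length is the quantity that ties the first two reasons together, and a common way to estimate it before routing is the half-perimeter wirelength (HPWL) of each net's bounding box. The sketch below is illustrative: the cell names, net names, and coordinates are made up, and every pin is assumed to sit at its cell's location.

```python
def hpwl(points):
    """Half-perimeter of the bounding box around a net's pin locations."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets, placement):
    """Sum HPWL over all nets; nets maps net name -> list of cell names."""
    return sum(hpwl([placement[c] for c in cells]) for cells in nets.values())


# Hypothetical netlist with three nets over three cells.
nets = {"n1": ["u1", "u2"], "n2": ["u1", "u3"], "n3": ["u1", "u2", "u3"]}
spread_out = {"u1": (0, 0), "u2": (10, 0), "u3": (0, 10)}
clustered = {"u1": (0, 0), "u2": (2, 0), "u3": (0, 3)}

print(total_hpwl(nets, spread_out))  # 40
print(total_hpwl(nets, clustered))   # 10
```

The clustered placement needs a quarter of the estimated wire, which means less routing demand and less wire capacitance to charge.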
Placement Steps
▪ Pre Placement
▪ Initial Placement / Coarse Placement / Global Placement
▪ Legalization
▪ HFNS (High Fanout Net Synthesis)
▪ Iteration for Congestion, Timing, DRV, and Power Optimization
▪ Multibit conversion
▪ Timing optimization iterations
▪ Scan-Chain Reorder
▪ Tie Cell insertion
(1) Pre Placement
Before placing the standard cells of the synthesized netlist, we need to place the physical-only cells, such as end-cap cells, well-tap cells, IO buffers, antenna diodes, and spare cells. The figure shows a typical view after pre-placement.
(2) Initial Placement / Global Placement / Coarse Placement
Once the pre-placement stage is complete, we can start placing the standard cells. Before that, however, we have to provide all the placement and optimization settings that the tool should apply. These settings include partial placement blockages or density screens, bounds or regions, cell/instance padding, path groups and effort levels, the early clock flow (ECF) in Innovus, the extreme flow, useful skew, global congestion effort, global timing effort, power effort, multibit flop conversion, and many more.
After providing these settings, we can call the placement command (place_opt_design in Innovus). The tool first performs global placement, in which it determines the approximate location of each cell according to the timing, congestion, and multi-voltage constraints (in Innovus, the GigaPlace engine is called in this step). Any pre-placed macros act as placement blockages. In this stage, the tool does not check for overlaps between instances. Figure-2 shows a typical global placement, in which the standard cells sit in approximate locations but have not yet been legalized.
(3) Legalization
Global placement leaves the instances overlapping. In this step, the tool moves the instances to nearby legal locations to remove the overlaps. It also aligns the power pins with the rails, so that the vdd pin of a standard cell sits on the vdd rail and the vss pin on the vss rail; if that requires flipping an instance, the tool flips it. This process is called legalization. After this step, every instance should be in a legal location and there should be no overlaps. This step is also called refine placement.
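The overlap-removal part of legalization can be sketched for a single row as follows. This is a deliberately simple greedy pass, not the algorithm any particular tool uses: it assumes cell widths are multiples of the site width, ignores the end of the row, and does not model flipping for power-rail alignment. The cell names and coordinates are hypothetical.

```python
def legalize_row(cells, site_width=1.0, row_start=0.0):
    """Greedy single-row legalization.

    cells: list of (name, x, width) from global placement, possibly overlapping.
    Returns {name: legal_x}, with cells snapped to sites and no overlaps.
    """
    legal = {}
    next_free = row_start  # leftmost x that is still unoccupied
    for name, x, width in sorted(cells, key=lambda c: c[1]):
        # Snap the desired position to the nearest placement site.
        snapped = row_start + round((x - row_start) / site_width) * site_width
        # Push right if the snapped site is already taken.
        pos = max(snapped, next_free)
        legal[name] = pos
        next_free = pos + width
    return legal


# Hypothetical global-placement result: "a", "b" and "c" overlap.
row = [("a", 0.2, 2.0), ("b", 1.1, 1.0), ("c", 1.4, 2.0), ("d", 7.6, 1.0)]
legal = legalize_row(row)
print(legal)  # {'a': 0.0, 'b': 2.0, 'c': 3.0, 'd': 8.0}
```

Each cell moves only a short distance from its global-placement location, which is why legalization changes the solution only locally.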
(4) HFNS (High Fanout Net Synthesis)
Initially, some nets have a very high fanout. Because the design has a maximum fanout constraint, the sinks of such nets must be distributed among several drivers. The process of adding buffers and splitting the fanout is called high fanout net synthesis (HFNS). In this step, all high fanout nets are synthesized.
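A rough feel for how many buffers HFNS adds can be had from the fanout arithmetic alone. The sketch below assumes a balanced buffer tree in which every buffer drives at most max_fanout loads; the function name is hypothetical, and real tools also weigh placement, slew, and capacitance when building the tree.

```python
import math


def buffer_levels(n_sinks, max_fanout):
    """Buffers needed per tree level, listed from the sinks toward the driver.

    An empty list means the driver can drive all sinks directly.
    """
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    levels = []
    loads = n_sinks
    while loads > max_fanout:
        loads = math.ceil(loads / max_fanout)  # buffers needed for this level
        levels.append(loads)
    return levels


print(buffer_levels(100, 16))   # [7]
print(buffer_levels(1000, 10))  # [100, 10]
print(buffer_levels(8, 16))     # []
```

For example, a net with 1000 sinks and a maximum fanout of 10 needs 100 buffers next to the sinks and 10 more above them, after which the original driver sees only 10 loads.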
(5) Iteration for Congestion, Timing, DRVs and Power Optimization
In this step, the tool first runs an early global route and estimates the routing overflow and congestion in the design, and it tries to minimize that congestion. Next, the tool runs RC extraction to calculate delays for setup analysis and tries to minimize the setup WNS (worst negative slack) and TNS (total negative slack). Similarly, the tool tries to minimize DRVs and power in this stage.
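The two timing metrics named here are simple to compute from a list of endpoint slacks: WNS is the smallest (most negative) slack and TNS is the sum of all negative slacks. The sketch below uses made-up slack values in nanoseconds; note that reporting conventions differ between tools when no endpoint is violating.

```python
def wns_tns(slacks):
    """Return (WNS, TNS) for a list of endpoint slacks.

    WNS is the worst (minimum) slack; TNS sums only the negative slacks.
    """
    wns = min(slacks)
    tns = sum(s for s in slacks if s < 0)
    return wns, tns


# Hypothetical setup slacks in ns for four endpoints.
wns, tns = wns_tns([0.12, -0.05, -0.30, 0.02])
print(round(wns, 2), round(tns, 2))  # -0.3 -0.35
```

WNS says how far the worst path is from meeting timing, while TNS indicates how widespread the violations are.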
(6) Multibit Conversion
If the user enables multibit flip-flop conversion in the flow, the tool first checks which multibit flops are available in the library. It then considers the timing criticality of each single-bit flop and the user constraints set for multibit conversion, and, based on those constraints, merges single-bit flops into multibit flops.
(7) Timing optimization iteration
This is a long step in which the tool tries to minimize the WNS and TNS of each path group over several iterations. The number of iterations needed depends on the effort level and the initial WNS. If the result is still not good after this stage, we can run further incremental timing optimization. Similarly, for congestion, we can run congestion repair followed by incremental optimization to get a better result. These additional steps increase the run time.
(8) Scan chain reorder
Scan chain stitching is done arbitrarily during synthesis. After placement and optimization, each scan flop has a location, so the chain needs to be reordered for better routability. The tool reorders the scan chain in this step, which benefits both timing and congestion.
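The benefit of reordering can be illustrated with a greedy nearest-neighbour ordering of the placed flops. This is only a sketch under simplifying assumptions (Manhattan distance between flop locations, a single chain, hypothetical flop names and coordinates); production tools work from SCANDEF data and respect chain partitions.

```python
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def chain_length(order, flops, start):
    """Total Manhattan length of the chain when visiting flops in this order."""
    length, current = 0, start
    for name in order:
        length += manhattan(current, flops[name])
        current = flops[name]
    return length


def reorder_scan_chain(flops, start):
    """Greedy reorder: always hop to the nearest flop not yet in the chain."""
    remaining = dict(flops)
    order, current = [], start
    while remaining:
        name = min(remaining, key=lambda n: manhattan(current, remaining[n]))
        current = remaining.pop(name)
        order.append(name)
    return order


# Hypothetical placed scan flops; the synthesis order is f0, f1, f2, f3.
flops = {"f0": (9, 0), "f1": (1, 0), "f2": (8, 1), "f3": (2, 1)}
scan_in = (0, 0)
new_order = reorder_scan_chain(flops, scan_in)
print(chain_length(list(flops), flops, scan_in))  # 31
print(new_order, chain_length(new_order, flops, scan_in))  # ['f1', 'f3', 'f2', 'f0'] 11
```

The reordered chain no longer zigzags across the block, which is what reduces the scan routing and the congestion it causes.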
(9) Tie cell insertion
Some unused inputs of logic gates in the netlist are tied to either vdd or vss. No input of a standard cell may be left floating; it must be tied to vdd or vss. Connecting the input of a logic cell, which is the gate of a transistor, directly to vdd or vss is not recommended, so the library provides tie-high and tie-low cells. In this step, the tool places tie-high and tie-low cells, which are essentially single-output logic cells, and connects them to the gate inputs that need to be tied to vdd or vss, respectively.
Attribute specification assigns an attribute declared earlier to a chosen named entity. The named entities that can be assigned attributes are: entity, architecture, configuration, procedure, function, package, type, subtype, constant, signal, variable, component, label, literal, units, group, or file. The named entities are enumerated in entity names list.
Make sure all macros have the fixed attribute. Command to check:
get_attribute [all_macro_cells] physical_status
If a macro is not fixed, fix it with either of the following:
set_attribute [all_macro_cells] physical_status fixed
set_dont_touch_placement [all_macro_cells]
Example:
∙ get_attribute [get_core_area] bbox
∙ get_attribute [get_die_area] bbox
∙ set_attribute [get_ports *] is_fixed true
∙ get_attribute [get_ports *] is_fixed
∙ set_attribute [all_macro_cells] is_fixed true
∙ get_attribute [all_macro_cells] is_fixed
Placement and Optimization attributes
| Typical attributes | Coarse Placement | Detailed Placement | Optimization |
|---|---|---|---|
| Fixed | Cannot move cells | Cannot move cells | Cannot move, rotate or resize cells |
| Imposed on clock buffer | Cannot move cells | Cannot move cells | Cannot move, rotate or resize cells |
| Soft Fixed | Cannot move cells | No restrictions | No restrictions |
| Size only | No restrictions | No restrictions | Can only resize cells |
| In place size only | Cannot move cells | No restrictions | Can resize cells only if there is room |
| Imposed on clock sinks | No restrictions | No restrictions | Can resize cells only if there is room |
| Don’t touch | No restrictions | No restrictions | Cannot move, rotate or resize cells |
Placement Optimization Techniques
Placement is performed in four optimization stages:
1) Pre Placement Optimization
2) In Placement Optimization
3) Post Placement Optimization (PPO) before Clock tree synthesis (CTS)
4) Post Placement Optimization (PPO) after Clock tree synthesis (CTS)
Pre-placement optimization optimizes the netlist before placement. HFNs are collapsed, and cells can also be downsized.
✔ Delay models must be removed
✔ Zero-RC (0-RC) Optimization
▪ Optimize the netlist without any delay models, which provides an optimal starting point for placement
▪ Timing during 0-RC optimization must match the timing seen during synthesis
▪ Otherwise, the mismatch indicates a problem in the technology file, timing library, constraint file, or overall design
▪ Logical restructuring and cell up/down sizing are the optimizations performed at the 0-RC stage
✔ Isolation Cell Insertion
✔ Multi Corner Multi Mode (MCMM) settings before std. cell placement
In-placement optimization re-optimizes the logic based on VR (virtual route) estimates. It can perform cell sizing, cell moving, cell bypassing, net splitting, gate duplication, buffer insertion, and area recovery. Optimization iterates over setup fixing, incremental timing, and congestion-driven placement.
✔ Scan Chain Reordering: The DFT tool flow makes a list of all scannable flops in the design and sorts them based on their hierarchy. In the APR tool, scan chains are reordered based on the placement of the flops and the Q-to-SI routing. Scan chain reordering helps reduce congestion and total wire length.
✔ After placement, report congestion, utilization, and timing
✔ Tie-off cell instances connect the tie-high and tie-low logical input pins of the netlist instances to power and ground
✔ Tie-off cells are placed after the placement of standard cells
✔ After placement, check the cell density
Post-placement optimization before CTS performs netlist optimization with ideal clocks. It can fix setup, hold, and max transition/capacitance violations, it can optimize placement based on global routing, and it redoes HFN synthesis.
✔ Cell Sizing:
⮚ Cells are sized up or down to optimize for timing and area
⮚ Upsizing gives a timing advantage; downsizing gives an area advantage.
✔ Static power optimization Techniques
VT swapping
⮚ To optimize for leakage power(HVT, RVT/SVT, LVT)
✔ Dynamic power optimization Techniques
Cloning
⮚ To reduce fanout
✔ Buffering
⮚ Long nets are buffered, or existing buffers are removed, to gain a timing advantage
✔ Re-buffering
⮚ To improve slews, reduce net capacitance and reduce fanout
✔ Logical Restructuring
⮚ To optimize timing and area without changing functionality of the design
⮚ Breaking complex cells into simpler cells or vice versa
✔ Pin swapping
Post placement optimization after CTS optimizes timing with propagated clock. It tries to preserve clock skew.
Don’t touch cell
set_dont_touch
Sets the dont_touch attribute on cells, nets, references, and designs in the current design, and on library cells, to prevent modification or replacement of these objects during optimization.
Setting the dont_touch attribute on a hierarchical cell implies “dont_touch” on all cells below it that do not have dont_touch set to false. Setting the dont_touch attribute on a library cell implies “dont_touch” on all instances of that cell.
Don’t use cell
-dont_use lib_cells
Specifies library cells that cannot be used for optimization even if they are inside the libraries specified by the library_list argument. You can specify the library cells by name or by collection in a space-separated list.
As the name suggests, "don’t use" cells are cells that are present in the library but that you do not want to use in your design.
Whenever you run optimization in a tool such as ICC, you have to provide a technology library. While mapping the logic of your design to cells, or while optimizing the design, the tool uses the different types of cells (standard cells, buffers, inverters, delay cells, filler cells, etc.) present in that library. Because these libraries are usually design independent, they contain many cells that a particular design does not require.
If you do not want to use particular cells for any reason (for example drive strength, fanout, size, a design-specific requirement, or because some type of cell is causing problems during timing closure), you can set an attribute on those cells to mark them as don’t use cells. The tool will not use those cells in your design for as long as the attribute is set.
Sometimes a designer marks a few cells as don’t use cells for one part of the design flow and removes the attribute later, or vice versa.
In short, don’t use cells are not a type of cell. Don’t use is an attribute set on cells, and EDA tools treat the cells that carry it as unavailable.
Tool commands and optimization
Checks before placement:
fc_shell>>check_design -checks physical_constraints
fc_shell>>check_design -checks pre_placement_stage
Detailed placement:
fc_shell>>legalize_placement
Placement Optimisation:
The place_opt command consists of the following stages:
1. Initial placement (initial_place): During this stage, the tool merges the clock-gating logic and performs coarse placement, which considers only the datapath timing requirements. If a block contains scan chains that were annotated by reading a SCANDEF file, the tool also performs scan chain optimization.
2. Initial DRC violation fixing (initial_drc): During this stage, the tool removes existing buffer trees and performs high-fanout-net synthesis and electrical DRC violation fixing.
3. Initial optimization (initial_opto): During this stage, the tool performs timing, area, congestion, and leakage-power optimization.
4. Final placement (final_place): During this stage, the tool performs incremental placement to improve timing and routability.
5. Final optimization (final_opto): During this stage, the tool performs further optimization.
6. Legalization: During this stage, the tool legalizes the placement.
refine_opt: If congestion is still a problem after placement and optimization, it can be improved incrementally with the refine_opt command.
If scan chains are present in design
fc_shell>>read_def <name.scandef>
fc_shell>>check_scan_chain
fc_shell>>report_scan_chains
ICG timing-driven, clock-aware placement is intended for designs with critical ICG enable timing. It occurs during the initial_opto stage of place_opt: the tool splits ICGs and places them closer to their register clusters to reduce setup violations.
Use the following if ICG cells are present in the design:
fc_shell>>set_app_options -name place_opt.flow.clock_aware_placement -value true
fc_shell>>set_app_options -name place_opt.flow.optimize_icgs -value true
How to control congestion:
Reduce the local cell density using partial placement blockages:
fc_shell>>create_placement_blockage -boundary {10 20 100 200} -type partial -blocked_percentage 40
(This blocks 40% of the area for standard-cell placement and leaves the remaining 60% available.)
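The effect of the blocked percentage is plain area arithmetic, sketched below for the rectangle used in the command. The helper name is hypothetical, and the boundary is assumed to be given as lower-left x, lower-left y, upper-right x, upper-right y.

```python
def partial_blockage_capacity(boundary, blocked_percentage):
    """Return (total_area, usable_area) for a partial placement blockage."""
    llx, lly, urx, ury = boundary
    total = (urx - llx) * (ury - lly)
    usable = total * (100 - blocked_percentage) / 100.0
    return total, usable


total, usable = partial_blockage_capacity((10, 20, 100, 200), 40)
print(total, usable)  # 16200 9720.0
```

The region covers 90 x 180 = 16200 square units, of which 9720 (60%) remain available to standard cells, so the local cell density in that region is capped at 60%.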
If pin density is high, it can be reduced by adding cell padding to the cells that cause the congestion. Cell padding is applied with the keepout margin command.
fc_shell>>create_keepout_margin -type soft -outer {left bottom right top} my_lib_macro
If the design is congested, we rerun place_opt with the congestion-driven and high-effort options. During congestion-driven placement, cells that sit close together and cause congestion are spread apart.
fc_shell>>place_opt -congestion_driven -effort high
Soft blockage: during optimization, timing-critical buffers and inverters can still be placed inside a soft blockage area.
fc_shell>>create_placement_blockage -boundary {10 20 100 200} -name pb1 -type soft
Note: if a hard blockage and a soft blockage cover the same area, the hard blockage takes priority over the soft placement blockage.
Placement bounds:
A placement bound is a constraint that controls the placement of groups of leaf cells and hierarchical cells. It lets you group cells to minimize wire length and place them at the most appropriate locations. When timing is critical during placement, we create a bound in the area where two communicating cells sit far from each other. A bound is a fixed region in which a set of cells is placed. It comprises one or more rectangular or rectilinear shapes, which can be abutted or disjoint. In general, we specify the cells and ports to be included in the bound. If a hierarchical cell is included, all cells in its sub-design belong to the bound.
Types of bounds:
- Soft move bound
- Hard move bound
- Exclusive move bound
Soft move bound:
The tool tries to place the cells of the move bound within the specified region; however, there is no guarantee that the cells end up inside the bound.
fc_shell>>create_bound -name b0 -type soft -boundary {10 10 20 20} instance_1
This defines a soft bound for instance_1 with its lower-left corner at (10 10) and its upper-right corner at (20 20).
Hard move bound:
The tool must place the cells of the move bound within the specified region.
fc_shell>>create_bound -name b1 -type hard -boundary {10 10 20 20} instance_2
Exclusive move bound:
The tool must place the cells of the move bound within the specified region, and no other cells may be placed inside that region.
fc_shell>>create_bound -name b2 -exclusive -boundary {10 10 20 20} instance_1
Density controls:
Density controls determine how tightly cells can be packed. We can control the overall placement density for the block or the cell density for specific regions; partial placement blockages are one way to control the cell density of a specific region.
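To see where density controls are needed, it helps to compute a coarse density map. The sketch below divides the core into square bins and assigns each cell's area to the bin that contains its centre; this is a simplification (real tools split a cell's area across every bin it overlaps), and the cell list and the 0.3 density target are made-up values.

```python
def bin_density(cells, bin_size, n_cols, n_rows):
    """Return a grid (rows of columns) of cell-area / bin-area ratios.

    cells: list of (x, y, width, height), with (x, y) the lower-left corner.
    """
    area = [[0.0] * n_cols for _ in range(n_rows)]
    for x, y, w, h in cells:
        # Locate the bin that contains the cell centre, clamped to the grid.
        col = min(int((x + w / 2) // bin_size), n_cols - 1)
        row = min(int((y + h / 2) // bin_size), n_rows - 1)
        area[row][col] += w * h
    bin_area = bin_size * bin_size
    return [[a / bin_area for a in r] for r in area]


cells = [(1, 1, 4, 5), (3, 3, 5, 4), (12, 2, 2, 5)]
density = bin_density(cells, bin_size=10, n_cols=2, n_rows=2)
hotspots = [(r, c) for r, row_vals in enumerate(density)
            for c, d in enumerate(row_vals) if d > 0.3]
print(density[0], hotspots)
```

Bins whose density exceeds the chosen target are candidates for a lower regional density setting or a partial placement blockage.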
During dynamic power-driven placement, the tool tries to improve both the timing and the power of the critical nets, improving the power QoR without degrading the timing QoR.
Magnet placement
To improve congestion in a complex floorplan, or to improve timing, we can use magnet placement: a fixed object is specified as a magnet, and the tool places all the standard cells connected to the magnet object close to it.
fc_shell>>magnet_placement
Placement commands:
- check_legality -verbose
- report_congestion
- report_qor
- report_utilization
- timing reports (report_timing -delay_type min/max)
- report_constraints -all_violators
- report_power
- report_timing -groups -transition -capacitance -physical -input_pins -nets -attributes
- report_timing -groups -max_paths
- set_attribute [get_edit_groups <cell_name*>] is_locked false
- set_fixed_objects [get_edit_groups <cell_name*>] -unfix
- set_attribute [get_edit_groups <cell_name*>] is_locked true
- add_buffer -lib_cell [get_lib_cells *buffer] -new_cell_names
- change_selection [get_timing_paths -groups ]
- check_lvs
Challenge 1 :
In a previous project, cells were placed in the narrow channels between macros, which caused heavy routing congestion and high cell density. Narrow-channel placement blockages resolved the issue.
How the narrow-channel placement blockages were applied:
● Created narrow-channel placement blockages to avoid congestion and improve timing
Narrow-channel placement blockages using set_app_options:
set_app_options -name plan.place.auto_create_blockage_channel_heights -value {5u 6u}
set_app_options -name plan.place.auto_create_blockage_channel_widths -value {2u 16u}
With these two app options set, the tool automatically places hard and soft placement blockages in the channels. The next step converts the soft placement blockages into partial placement blockages.
● Converted all soft placement blockages into partial blockages
Converting soft blockages into partial blockages and setting a blocked percentage:
set_attribute -objects [get_placement_blockages -filter "blockage_type == soft"] -name blockage_type -value partial
set_attribute -objects [get_placement_blockages -filter "blockage_type == partial"] -name blocked_percentage -value 30
Challenge 2:
Communicating cells were placed far from each other, which caused timing violations. Creating bounds helped reduce congestion and improve timing.
Soft move bound:
The tool tries to place the cells of the move bound within the specified region; however, there is no guarantee that the cells end up inside the bound.
fc_shell>>create_bound -name b0 -type soft -boundary {10 10 20 20} instance_1
Hard move bound:
The tool must place the cells of the move bound within the specified region.
fc_shell>>create_bound -name b1 -type hard -boundary {10 10 20 20} instance_2
Exclusive move bound:
The tool must place the cells of the move bound within the specified region, and no other cells may be placed inside that region.
fc_shell>>create_bound -name b2 -exclusive -boundary {10 10 20 20} instance_1
Challenge 3:
In a previous design, timing was violated after placement. Creating a path group helped resolve the violations.
fc_shell> group_path -name slct -weight 2.5 -critical_range 5 -from [get_cells slct_reg]
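The two numeric options in this command can be illustrated with a small sketch. It assumes that the critical range selects every path whose slack lies within that range of the group's worst slack, and that the weight simply scales the group's negative slack in the optimizer's cost; the tool's actual cost function is internal, and the path names and slack values below are made up.

```python
def paths_in_critical_range(slacks, critical_range):
    """Paths whose slack is within critical_range of the worst slack."""
    worst = min(slacks.values())
    return sorted(p for p, s in slacks.items() if s <= worst + critical_range)


def weighted_negative_slack(slacks, weight):
    """Group cost under the stated assumption: weight times total negative slack."""
    return weight * sum(s for s in slacks.values() if s < 0)


# Hypothetical slacks for paths in the "slct" group.
slct_slacks = {"p1": -8.0, "p2": -4.0, "p3": -1.0, "p4": 0.5}
targets = paths_in_critical_range(slct_slacks, critical_range=5)
cost = weighted_negative_slack(slct_slacks, weight=2.5)
print(targets, cost)  # ['p1', 'p2'] -32.5
```

With a range of 5, the optimizer would work on p1 and p2 rather than on p1 alone, and the weight of 2.5 makes violations in this group count more than those in groups with the default weight.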
Challenge 4:
In some parts of the design, congestion and cell density were high. Adding partial placement blockages resolved the problem by reducing the local cell density:
fc_shell>>create_placement_blockage -boundary {10 20 100 200} -type partial -blocked_percentage 40
(This blocks 40% of the area for standard-cell placement and leaves the remaining 60% available.)
Placement
Traditionally, placement is at the design stage after logic synthesis and before routing. In the VLSI design flow, Logic synthesis generates a netlist. Then, after placement, the locations of the circuit modules in the netlist are determined. After placement, routing is performed to layout the nets in the netlist.
Placement is a critical step in the VLSI design process for the following reasons:
First, placement is a key factor in determining performance a circuit. Placement largely determines the length and, hence, the delay of interconnecting wires. As the feature size in advanced VLSI technology reduces, interconnect delay has become the determining factor of circuit performance. Interconnect delay can consume 75% of clock cycles during design. Therefore, an effective placement solution can substantially improve circuit performance.
Second, placement determines design routability. A well-constructed placement solution will reduce routing demand (i.e., shorter total wirelength) and distribute routing demand more Evenly to prevent hot spots.
Third, placement determines heat distribution on a die surface. An uneven temperature profile can lead to reliability and timing problems.
Fourth, power consumption is also affected by placement. A well-designed placement solution can reduce capacitive load due to wires (by having shorter wires and larger separations between adjacent wires). Hence the power consumption of switching can be reduced. It has become essential for the logic synthesis stage to incorporate placement techniques to perform physical design-aware logic synthesis (i.e., physical synthesis). The reason is that without placement information, it is impossible to estimate the delay of interconnecting wires. Hence, given the significance of interconnect delay, logic synthesis will not have any meaningful timing information to guide the synthesis process. As a result, synthesized netlists will perform poorly after placement. For the same reason, consideration of placement information during architecture design is becoming more common.
One way to overcome complexity is to perform placements in several manageable steps. One common flow is as follows.
1. Global placement. Global placement aims at generating a rough placement solution that may violate some placement constraints (e.g., overlaps among modules) while maintaining a global view of the whole netlist.
2. Legalization. The legalization process renders the rough solution from global placement legal (i.e., there is no violation of placement constraints).
3. Detailed placement. Detailed placement further improves the legalized placement solution by rearranging a small group of modules in a local region. This is done while keeping all other modules fixed.
The global placement step is the most critical of the three. It has the most impact on placement solution quality and runtime, and has been the focus of most prior research works. After global placement, the placement solution is almost determined. Legalization and detailed placement will only make local module location changes. Therefore, this chapter focuses on the global placement step. The most commonly used global placement approaches are the partitioning-based approach, simulated annealing approach, and analytical approach. The analytical approach will be presented in the most detail, because it is currently the most effective approach in both quality and runtime.
Objective of Placement
∙ Place the standard cell in legalized location
∙ Minimize all the critical net delays in the design
∙ Make sure router can complete the routing with minimal DRC’s- By reducing the congestion
Placement Targets
∙ Timing, Power and Area optimization
∙ Minimum congestion
∙ Minimal cell density, pin density and congestion hot-spots
∙ Minimal Timing DRVs
∙ Total wire length
∙ Routability
∙ Performance
∙ Heat Dissipation
Inputs of Placement
∙ Gate Level Netlist
∙ Design Library files(.lib and .lef)
∙ Design Constraints (SDC)
∙ Technology file
∙ Floorplan and Power plan DEF file (Floorplanned Design)
Needs of Placement
Mainly four reason behind Placement
1. Placement is a key factor in determining the performance of a circuit. Placement largely determines the length and hence, the delay of interconnects wires. Interconnects delay can consumes as much as 75% of clock cycle in advance design. Therefore, a good placement solution can substantially improve the performance of a circuit.
2. Placement determines the routing ability of a design. A well-constructed placement solution will have less routing demand(i.e., shorter total wire
length) and will distributes the routing demand more evenly to avoid routing hotspots.
3. Placement decides the distribution of heat on a die surface. An uneven temperature profile can lead to reliability and timing problems.
4. Placement decides the distribution of heat on a die surface. An uneven temperature profile can lead to reliability and timing problems.
Placement Steps
▪ Pre Placement
▪ Initial Placement / Course Placement / Global Placement
▪ Legalization
▪ HFNS (Hign Fanout Net Synthesis)
▪ Iteration for Congestion, Timing, DRV, and Power Optimization ▪ Multibit conversion
▪ Timing optimization iterations
▪ Scan-Chain Reorder
▪ Tie Cell insertion
(1) Pre Placement
Before starting the actual placement of the standard cells present in the synthesized netlist, we need to place various physical only cells like end-cap cells, well-tap cells, IO buffers, antenna diodes, and spare cells. A typical view after preplacement has shown in figure.
(2) Initial Placement/ Global Placement/ Course Placement
Once the pre Placement stage has been completed, We can start the placement of standard cells but before that, we have to provide all the correct placement and optimization settings that we want to be applied while the tool does the placement and optimization. These settings could be like partial placement blockage or density screen setting, bound or region creation, cell/instance padding, path_groups and effort, enabling the early clock flow (ECF) in case of innovus, enabling the extreme flow, enabling the useful skew, global congestion effort, global timing effort, power effort, Multibit flop conversion and many more.
After providing all these placement settings we can call the placement command (place_opt_design in case of innovus). The tool first does the global placement in which the tool determines the approximate location of each cell according to the timing, congestion, and multi-voltage constraints (in the case of innovus Gigaplace engine is called in this step). Any pre-placed macros will work as a placement blockage. In this stage, the tool will not check any overlap of instances. A typical figure of global placement has shown in figure-2 where you can
see that the standard cells are placed in an approximate location but without legalization.
(3) Legalization
Global placement leaves instances overlapping. In this step the tool moves instances to nearby locations to remove the overlaps. It also aligns the power pins, so that the vdd pin of a standard cell sits on the vdd rail and the vss pin on the vss rail, flipping the instance if required. This process is called legalization. After this step every instance should be in a legal location with no overlaps. This step is also called refine placement.
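The overlap-removal part of legalization can be sketched for a single row as follows. This is a simplified greedy approach chosen for illustration, not the algorithm any particular tool uses: cells are snapped to placement sites and pushed right until they no longer overlap. Rail alignment and flipping are not modeled, and the cell names, site width, and row length are hypothetical.

```python
from typing import Dict, List, Tuple


def legalize_row(cells: List[Tuple[str, float, int]],
                 site_width: float = 1.0,
                 row_sites: int = 100) -> Dict[str, float]:
    """Greedy single-row legalizer.

    cells: (name, global_x, width_in_sites) from global placement.
    Returns {name: legal_x} with every cell on a site and no overlaps.
    """
    legal: Dict[str, float] = {}
    next_free = 0  # index of the first unoccupied site
    for name, x, width in sorted(cells, key=lambda c: c[1]):
        # Snap to the nearest site, but never onto an occupied one.
        site = max(round(x / site_width), next_free)
        if site + width > row_sites:
            raise ValueError(f"row overflow while placing {name}")
        legal[name] = site * site_width
        next_free = site + width
    return legal


# Hypothetical overlapping cells after global placement.
global_cells = [("a", 2.3, 3), ("b", 3.1, 2), ("c", 3.4, 4), ("d", 20.7, 2)]
legal_cells = legalize_row(global_cells)
print(legal_cells)
```

Cells `b` and `c` start inside the span of `a` and are shifted to the next free sites, while `d` only snaps to its nearest site.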
(4) HFNS (High Fanout Net Synthesis)
Initially, some nets have a very high fanout. Because there is a maximum-fanout constraint, the sinks on such nets must be distributed among different drivers. The process of adding buffers and splitting the fanout is called high fanout net synthesis (HFNS). In this step, all high-fanout nets are synthesized.
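The fanout-splitting idea can be illustrated with a small sketch. It groups sinks purely by list order and counts how many buffer levels are needed, which is a simplification: real tools also consider sink locations and timing. The sink names and the fanout limit are hypothetical.

```python
from typing import List


def split_fanout(sinks: List[str], max_fanout: int) -> List[List[str]]:
    """Group sinks so that no group exceeds max_fanout; one buffer per group."""
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    return [sinks[i:i + max_fanout] for i in range(0, len(sinks), max_fanout)]


def buffer_tree_levels(num_sinks: int, max_fanout: int) -> int:
    """Buffer levels needed so that every driver sees at most max_fanout loads."""
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    levels = 0
    loads = num_sinks
    while loads > max_fanout:
        loads = -(-loads // max_fanout)  # ceiling division: buffers at this level
        levels += 1
    return levels


# Hypothetical net with 100 flop sinks and a max fanout of 16.
sink_names = [f"ff{i}" for i in range(100)]
groups = split_fanout(sink_names, 16)
print(len(groups), buffer_tree_levels(len(sink_names), 16))
```

With these numbers the original driver ends up driving 7 buffers instead of 100 sinks, so one buffer level is enough.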
(5) Iteration for Congestion, Timing, DRVs and Power Optimization
In this step the tool first performs an early global route and estimates the routing overflow/congestion in the design. The tool initially tries to minimize congestion. Next, it runs RC extraction to calculate delays for setup analysis and tries to minimize the setup WNS and TNS. Similarly, the tool also tries to minimize DRVs and power in this stage.
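For readers new to the terms, WNS (worst negative slack) and TNS (total negative slack) can be computed from a list of endpoint slacks as sketched below. The convention of reporting a WNS of 0 when nothing violates is an assumption (some reports show the worst positive slack instead), and the slack values are hypothetical.

```python
from typing import List, Tuple


def wns_tns(endpoint_slacks: List[float]) -> Tuple[float, float]:
    """Return (WNS, TNS) for a list of endpoint slacks in ns.

    WNS is the most negative slack (0.0 if nothing violates, by assumption).
    TNS is the sum of all negative slacks.
    """
    negative = [s for s in endpoint_slacks if s < 0]
    wns = min(negative) if negative else 0.0
    tns = sum(negative)
    return wns, tns


# Hypothetical setup slacks at five endpoints.
slacks = [0.12, -0.05, -0.30, 0.00, -0.10]
wns, tns = wns_tns(slacks)
print(f"WNS = {wns:.2f} ns, TNS = {tns:.2f} ns")
```

WNS tells how bad the single worst path is, while TNS indicates how widespread the violations are, which is why the tool tracks both.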
(6) Multibit Conversion
If the user enables multibit flip-flop conversion in the flow, the tool first checks which multibit flops are available in the library. It then considers the timing criticality of each single-bit flop and the user constraints for multibit conversion, and converts single-bit flops into multibit flops accordingly.
(7) Timing optimization iteration
This is a long step in which the tool tries to minimize the WNS and TNS of each path group over several iterations. The number of iterations needed to reach a minimum WNS and TNS depends on the effort setting and the initial WNS. If the result is not good after this stage, we can run further incremental timing optimization. Similarly, for congestion, we can run congestion repair followed by incremental optimization to get a better result. These additional steps increase the run time.
(8) Scan chain reorder
Scan chain stitching is done arbitrarily in synthesis. After placement and optimization, each scan flop has a location, so the chain needs to be reordered for better routability. The tool reorders the scan chain in this step, which benefits both timing and congestion.
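The benefit of reordering can be seen with a small sketch that re-stitches a chain using a nearest-neighbor heuristic. The heuristic is an assumption chosen for simplicity, not the method a specific tool uses, and the flop names, coordinates, and scan-in location are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def manhattan(a: Point, b: Point) -> float:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def chain_length(order: List[str], loc: Dict[str, Point], start: Point) -> float:
    """Total Manhattan length of the scan path from scan-in through each flop."""
    total, prev = 0.0, start
    for name in order:
        total += manhattan(prev, loc[name])
        prev = loc[name]
    return total


def reorder_chain(flops: List[str], loc: Dict[str, Point], start: Point) -> List[str]:
    """Nearest-neighbor reorder: always hop to the closest unvisited flop."""
    remaining, order, current = list(flops), [], start
    while remaining:
        nxt = min(remaining, key=lambda f: manhattan(current, loc[f]))
        remaining.remove(nxt)
        order.append(nxt)
        current = loc[nxt]
    return order


# Hypothetical placed scan flops; the synthesis order ignores location.
loc = {"f0": (0, 0), "f1": (10, 10), "f2": (1, 0), "f3": (10, 9)}
synth_order = ["f0", "f1", "f2", "f3"]
scan_in = (0, 0)
placed_order = reorder_chain(synth_order, loc, scan_in)
print(chain_length(synth_order, loc, scan_in), chain_length(placed_order, loc, scan_in))
```

The synthesis order zigzags across the block, while the reordered chain visits neighboring flops first, so the scan wiring becomes much shorter.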
(9) Tie cell insertion
Some unused inputs of logic gates in the netlist are tied to either vdd or vss. No input of a standard cell can be left floating; it must be tied to either vdd or vss. Connecting a logic cell input, which is the gate of a transistor, directly to vdd or vss is not recommended, so the library provides tie-high and tie-low cells. In this step the tool places tie-high and tie-low cells, which are single-output logic cells, and connects them to the gate inputs that need to be tied to vdd or vss respectively.
Attribute specification assigns an attribute declared earlier to a chosen named entity. The named entities that can be assigned attributes are: entity, architecture, configuration, procedure, function, package, type, subtype, constant, signal, variable, component, label, literal, units, group, or file. The named entities are enumerated in entity names list.
Make sure all macros have a fixed attribute. Command to check:
get_attribute [all_macro_cells] physical_status
If a macro is not fixed, fix it with either of the following:
set_attribute [all_macro_cells] physical_status fixed
set_dont_touch_placement [all_macro_cells]
Example:
∙ get_attribute [get_core_area] bbox
∙ get_attribute [get_die_area] bbox
∙ set_attribute [get_ports *] is_fixed true
∙ get_attribute [get_ports *] is_fixed
∙ set_attribute [all_macro_cells] is_fixed true
∙ get_attribute [all_macro_cells] is_fixed
Placement and Optimization attributes
| Typical attributes | Coarse Placement | Detailed Placement | Optimization |
|---|---|---|---|
| Fixed | Cannot move cells | Cannot move cells | Cannot move, rotate or resize cells |
| Imposed on clock buffer | Cannot move cells | Cannot move cells | Cannot move, rotate or resize cells |
| Soft Fixed | Cannot move cells | No restrictions | No restrictions |
| Size only | No restrictions | No restrictions | Can only resize cells |
| In place size only | Cannot move cells | No restrictions | Can resize cells only if there is room |
| Imposed on clock sinks | No restrictions | No restrictions | Can resize cells only if there is room |
| Don’t touch | No restrictions | No restrictions | Cannot move, rotate or resize cells |
Placement Optimization Techniques
Placement is performed in four optimization stages:
1) Pre Placement Optimization
2) In Placement Optimization
3) Post Placement Optimization (PPO) before Clock tree synthesis (CTS)
4) Post Placement Optimization (PPO) after Clock tree synthesis (CTS)
Pre-placement optimization optimizes the netlist before placement; high-fanout nets (HFNs) are collapsed. It can also downsize cells.
✔ Delay models must be removed
✔ Zero-RC (0-RC) Optimization
▪ Optimizes the netlist without any delay models, thus providing an optimal starting point for placement
▪ Timing during 0-RC optimization must match the timing seen during synthesis
▪ A mismatch indicates problems in the technology file, timing library, constraint file, or overall design
▪ Logical restructuring and up/down sizing are the optimizations performed at the 0-RC stage
✔ Isolation Cell Insertion
✔ Multi Corner Multi Mode (MCMM) settings before std. cell placement
In-placement optimization re-optimizes the logic based on virtual routing (VR). It can perform cell sizing, cell moving, cell bypassing, net splitting, gate duplication, buffer insertion, and area recovery. Optimization iterates over setup fixing, incremental timing, and congestion-driven placement.
✔ Scan Chain Reordering: The DFT tool flow makes a list of all scannable flops in the design and sorts them based on their hierarchy. In the APR tool, scan chains are reordered based on the placement of the flops and Q-SI routing. Scan chain reordering helps reduce congestion and total wire length.
✔ After placement, report congestion, Utilization and Timing
✔ Tie-off cell instances connect the tie-high and tie-low logical input pins of netlist instances to power and ground
✔ Tie Off cells are placed after the placement of Std Cells
✔ After placement check the Cell Density
Post-placement optimization before CTS performs netlist optimization with ideal clocks. It can fix setup, hold, and max transition/capacitance violations. It can optimize placement based on global routing. It also redoes HFN synthesis.
✔ Cell Sizing:
⮚ Cells are sized up or down to optimize for timing and area
⮚ Up sizing gives a timing advantage and down sizing gives an area advantage.
✔ Static power optimization Techniques
VT swapping
⮚ To optimize for leakage power(HVT, RVT/SVT, LVT)
✔ Dynamic power optimization Techniques
Cloning
⮚ To reduce fanout
✔ Buffering
⮚ Long nets are buffered, or buffers are removed, to improve timing
✔ Re-buffering
⮚ To improve slews, reduce net capacitance and reduce fanout
✔ Logical Restructuring
⮚ To optimize timing and area without changing functionality of the design
⮚ Breaking complex cells into simpler cells or vice versa
✔ Pin swapping
Post placement optimization after CTS optimizes timing with propagated clock. It tries to preserve clock skew.
Don’t touch cell
set_dont_touch
Sets the dont_touch attribute on cells, nets, references, and designs in the current design, and on library cells, to prevent modification or replacement of these objects during optimization.
Setting the dont_touch attribute on a hierarchical cell implies “dont_touch” on all cells below it that do not have dont_touch set to false. Setting the dont_touch attribute on a library cell implies “dont_touch” on all instances of that cell.
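The inheritance rule described above can be sketched as a small lookup. This is an illustration of the rule as worded here, not the tool's implementation: a cell's own explicit setting wins, otherwise the cell is treated as dont_touch if any hierarchical ancestor is set to true. How an intermediate level set to false affects the cells below it is not covered by the text, so this sketch ignores that case. The hierarchy and cell names are hypothetical.

```python
from typing import Dict, Optional


def effective_dont_touch(cell: str,
                         parent: Dict[str, Optional[str]],
                         explicit: Dict[str, bool]) -> bool:
    """Resolve whether a cell is effectively dont_touch.

    parent:   maps each cell to its hierarchical parent (None at the top).
    explicit: dont_touch values that were set directly on cells.
    """
    if cell in explicit:
        return explicit[cell]  # the cell's own setting takes precedence
    node = parent.get(cell)
    while node is not None:
        if explicit.get(node) is True:
            return True  # implied by a dont_touch ancestor
        node = parent.get(node)
    return False


# Hypothetical hierarchy: core/alu holds U1 and U2, io holds U3.
parent_of = {
    "core": None, "core/alu": "core",
    "core/alu/U1": "core/alu", "core/alu/U2": "core/alu",
    "io": None, "io/U3": "io",
}
explicit_settings = {"core": True, "core/alu/U2": False}

for name in ("core/alu/U1", "core/alu/U2", "io/U3"):
    print(name, effective_dont_touch(name, parent_of, explicit_settings))
```

Here U1 inherits dont_touch from `core`, U2 opts out through its explicit false, and U3 is unaffected because nothing above it is marked.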
Don’t use cell
-dont_use lib_cells
Specifies library cells that cannot be used for optimization even if they are inside the libraries specified by the library_list argument. You can specify the library cells by name or by collection in a space-separated list.
As the name suggests, “don’t use” cells are cells that are present in the library but that you do not want used in your design.
Whenever you run optimization in a tool such as ICC, you have to provide a technology library. While mapping the logic of your design to cells, or while optimizing the design, the tool uses the different types of cells (standard cells, buffers, inverters, delay cells, filler cells, and so on) present in that library. Because libraries are usually design independent, they contain many cells that a particular design does not require.
If you do not want a particular cell to be used, for reasons such as drive strength, fanout, size, a design-specific requirement, or problems that the cell causes during timing closure, you can set an attribute on that cell to mark it as a don’t use cell. The tool will not use the cell for as long as the attribute is set.
Sometimes a designer marks a few cells as don’t use for part of the flow and removes the attribute later, or vice versa.
In short, don’t use cells are not a type of cell; don’t use is an attribute set on cells, and the EDA tool treats cells carrying it as don’t use cells.
Tool commands and optimization
Checks before placement:
fc_shell> check_design -checks physical_constraints
fc_shell> check_design -checks pre_placement_stage
Detail placement:
fc_shell> legalize_placement
Placement Optimisation:
The place_opt command consists of the following stages:
1. Initial placement (initial_place): During this stage, the tool merges the clock-gating logic and performs coarse placement, which considers only the datapath timing requirements. If a block contains scan chains that were annotated by reading a SCANDEF file, the tool also performs scan chain optimization.
2. Initial DRC violation fixing (initial_drc): During this stage, the tool removes existing buffer trees and performs high-fanout-net synthesis and electrical DRC violation fixing.
3. Initial optimization (initial_opto): During this stage, the tool performs timing, area, congestion, and leakage-power optimization.
4. Final placement (final_place): During this stage, the tool performs incremental placement to improve timing and routability.
5. Final optimization (final_opto): During this stage, the tool performs further optimization.
6. Legalization: During this stage, the tool legalizes the placement.
refine_opt: If congestion is still a problem after placement and optimization, it can be improved incrementally with the refine_opt command.
If scan chains are present in the design:
fc_shell> read_def <name.scandef>
fc_shell> check_scan_chain
fc_shell> report_scan_chains
# ICG timing-driven, clock-aware placement: for designs with critical ICG enable timing. It occurs during the initial_opto stage of place_opt; the tool splits ICGs and places them closer to their register clusters to improve setup timing.
Use if ICG cells are present in the design:
fc_shell> set_app_options -name place_opt.flow.clock_aware_placement -value true
fc_shell> set_app_options -name place_opt.flow.optimize_icgs -value true
How to control the congestion steps:
Reduce the local cell density using partial placement blockages:
fc_shell> create_placement_blockage -boundary {10 20 100 200} -type partial -blocked_percentage 40
(This means 40% of the area is blocked for standard cell placement and the remaining 60% is available.)
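The effect of the blocked percentage is simple arithmetic, sketched below for the boundary used in the command above. The helper name is hypothetical, and the boundary is assumed to be given as lower-left x, lower-left y, upper-right x, upper-right y.

```python
def partial_blockage_capacity(llx: float, lly: float, urx: float, ury: float,
                              blocked_percentage: float) -> float:
    """Area still available to standard cells inside a partial blockage."""
    area = (urx - llx) * (ury - lly)
    return area * (100 - blocked_percentage) / 100.0


# Boundary {10 20 100 200} with 40% blocked, as in the command above.
usable_area = partial_blockage_capacity(10, 20, 100, 200, 40)
print(usable_area)
```

The region measures 90 by 180 units, so 60% of its 16200 square units remains usable by standard cells.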
If pin density is high, it can be reduced by adding cell padding to the cells that cause congestion. Cell padding can be applied with the keepout margin command.
fc_shell> create_keepout_margin -type soft -outer {left bottom right top} my_lib_macro
If the design is congested, we rerun place_opt with congestion-driven placement and high effort enabled. During congestion-driven placement, cells that sit close together and cause congestion are spread apart.
fc_shell> place_opt -congestion_driven -effort high
Soft blockage: during optimization, timing-critical buffers and inverters can be placed in the blocked area.
fc_shell> create_placement_blockage -boundary {10 20 100 200} -name pb1 -type soft
Note: if hard and soft blockages are both present at the same place, the hard blockage takes priority over the soft placement blockage.
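The priority rule in the note can be sketched as follows. The sketch assumes that a hard blockage excludes all standard cells and that a soft blockage admits only cells added during optimization (such as buffers and inverters). The class, function names, and the second blockage `pb2` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Blockage:
    name: str
    kind: str  # "hard" or "soft"
    bbox: Tuple[float, float, float, float]  # (llx, lly, urx, ury)

    def covers(self, x: float, y: float) -> bool:
        llx, lly, urx, ury = self.bbox
        return llx <= x <= urx and lly <= y <= ury


def effective_kind(x: float, y: float, blockages: List[Blockage]) -> Optional[str]:
    """Blockage type that applies at (x, y); hard wins over soft."""
    kinds = {b.kind for b in blockages if b.covers(x, y)}
    if "hard" in kinds:
        return "hard"
    if "soft" in kinds:
        return "soft"
    return None


def can_place(x: float, y: float, blockages: List[Blockage],
              during_optimization: bool) -> bool:
    """True if a cell may be placed at (x, y) under the assumed rules."""
    kind = effective_kind(x, y, blockages)
    if kind == "hard":
        return False
    if kind == "soft":
        return during_optimization  # only optimization-inserted cells allowed
    return True


# pb1 matches the soft blockage above; pb2 is a hypothetical overlapping hard one.
blockage_list = [Blockage("pb1", "soft", (10, 20, 100, 200)),
                 Blockage("pb2", "hard", (50, 50, 80, 80))]
print(effective_kind(60, 60, blockage_list), effective_kind(20, 30, blockage_list))
```

At a point covered by both blockages the hard one decides the outcome, so nothing can be placed there even during optimization.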
Placement bounds:
A placement bound is a constraint that controls the placement of groups of leaf cells and hierarchical cells. It lets you group cells to minimize wire length and place them at the most appropriate locations. When timing is critical during placement, we create bounds in areas where two communicating cells sit far from each other. A move bound is a fixed region in which a set of cells is placed. It comprises one or more rectangular or rectilinear shapes, which can be abutted or disjoint. In general we specify the cells and ports to be included in the bound. If a hierarchical cell is included, all cells in the sub-design belong to the bound.
types of bounds:
- Soft move bound
- Hard move bound
- Exclusive move bound
Soft move bound:
With a soft move bound, the tool tries to place the cells of the bound within the specified region; however, there is no guarantee that the cells end up inside the bound.
fc_shell> create_bound -name b0 -type soft -boundary {10 10 20 20} instance_1
# Defines a soft bound for instance_1 with its lower-left corner at (10 10) and its upper-right corner at (20 20).
Hard move bound: With a hard move bound, the tool must place the cells of the bound within the specified region.
fc_shell> create_bound -name b1 -type hard -boundary {10 10 20 20} instance_2
Exclusive move bound:
With an exclusive move bound, the tool must place the cells of the bound within the specified region, and no other cells may be placed in that region.
fc_shell> create_bound -name b2 -exclusive -boundary {10 10 20 20} instance_1
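The practical difference between soft and hard move bounds can be sketched as a post-placement check. The helper names and the placed locations are hypothetical; the bound box reuses the {10 10 20 20} boundary from the commands above, read as lower-left and upper-right corners.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (llx, lly, urx, ury)


def cells_outside(bound: Box, placed: Dict[str, Tuple[float, float]]) -> List[str]:
    """Names of bound cells whose location falls outside the bound box."""
    llx, lly, urx, ury = bound
    return sorted(name for name, (x, y) in placed.items()
                  if not (llx <= x <= urx and lly <= y <= ury))


def check_move_bound(kind: str, bound: Box,
                     placed: Dict[str, Tuple[float, float]]) -> str:
    """Soft bounds tolerate cells outside the region; hard bounds do not."""
    outside = cells_outside(bound, placed)
    if not outside:
        return "ok"
    if kind == "hard":
        return "violation: " + ", ".join(outside)
    return "allowed (soft): " + ", ".join(outside)


# Hypothetical locations of two cells that belong to instance_1.
bound_box = (10, 10, 20, 20)
placed_cells = {"instance_1/U1": (12.0, 15.0), "instance_1/U2": (25.0, 18.0)}
print(check_move_bound("soft", bound_box, placed_cells))
print(check_move_bound("hard", bound_box, placed_cells))
```

The same placement passes as a soft bound but would be reported as a violation if the bound were hard.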
Density controls:
Density controls determine how tightly cells can be packed. We can control the overall placement density for the block or the cell density for specific regions. To control the cell density for specific regions, we can also use partial placement blockages.
During dynamic-power-driven placement, the tool tries to improve both the timing and the power of the critical nets, and to improve the power QoR without affecting the timing QoR.
Magnet placement
To improve congestion for a complex floorplan, or to improve timing for the design, we can use magnet placement: specify a fixed object as a magnet and have the tool place all standard cells connected to the magnet object close to it.
fc_shell> magnet_placement
Placement commands:
- check_legality -verbose
- report_congestion
- report_qor
- report_utilization
- timing reports (report_timing -delay_type min/max)
- report_constraints -all_violators
- report_power
- report_timing -groups -transition -capacitance -physical -input_pins -nets -attributes
- report_timing -groups -max_paths
- set_attribute [get_edit_groups <cell_name*>] is_locked false
- set_fixed_objects [get_edit_groups <cell_name*>] -unfix
- set_attribute [get_edit_groups <cell_name*>] is_locked true
- add_buffer -lib_cell [get_lib_cells *buffer] -new_cell_names
- change_selection [get_timing_paths -groups ]
- check_lvs
Challenge 1 :
In a previous project, cells were placed in the narrow channels between macros, which caused heavy routing congestion and increased cell density. Narrow-channel placement blockages helped resolve the issue.
How narrow-channel placement blockages were applied:
● Created narrow-channel placement blockages to avoid congestion and improve timing
Narrow-channel placement blockages using set_app_options:
set_app_options -name plan.place.auto_create_blockage_channel_heights -value {5u 6u}
set_app_options -name plan.place.auto_create_blockage_channel_widths -value {2u 16u}
With the two app options above, the tool automatically places hard and soft placement blockages. The part below describes how to convert soft placement blockages into partial placement blockages.
● Converted all soft placement blockages into partial blockages.
Converting soft blockages into partial blockages and setting a blocked percentage:
set_attribute -objects [get_placement_blockages -filter "blockage_type == soft"] -name blockage_type -value partial
set_attribute -objects [get_placement_blockages -filter "blockage_type == partial"] -name blocked_percentage -value 30
Challenge 2:
Timing was violated because communicating cells were placed far from each other. Creating bounds helps reduce congestion and improve timing.
Soft move bound:
With a soft move bound, the tool tries to place the cells of the bound within the specified region; however, there is no guarantee that the cells end up inside the bound.
fc_shell> create_bound -name b0 -type soft -boundary {10 10 20 20} instance_1
Hard move bound:
With a hard move bound, the tool must place the cells of the bound within the specified region.
fc_shell> create_bound -name b1 -type hard -boundary {10 10 20 20} instance_2
Exclusive move bound:
With an exclusive move bound, the tool must place the cells of the bound within the specified region, and no other cells may be placed in that region.
fc_shell> create_bound -name b2 -exclusive -boundary {10 10 20 20} instance_1
Challenge 3:
Previously, timing was violated after placement. Creating a path group helped resolve the violation.
fc_shell> group_path -name slct -weight 2.5 -critical_range 5 -from [get_cells slct_reg]
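The role of the critical range can be sketched as follows, under the common interpretation that optimization targets violating paths whose slack lies within the critical range of the group's worst slack. This is an illustrative model rather than the tool's cost function, and the `-weight` option is not modeled. The path names and slack values are hypothetical.

```python
from typing import Dict, List


def paths_in_critical_range(slacks: Dict[str, float],
                            critical_range: float) -> List[str]:
    """Violating paths whose slack is within critical_range of the worst slack."""
    worst = min(slacks.values())
    return sorted(path for path, slack in slacks.items()
                  if slack < 0 and slack <= worst + critical_range)


# Hypothetical slacks for paths starting at slct_reg, in library time units.
group_slacks = {"p1": -8.0, "p2": -4.5, "p3": -2.0, "p4": 1.0}
targeted = paths_in_critical_range(group_slacks, critical_range=5)
print(targeted)
```

With a worst slack of -8.0 and a range of 5, paths down to a slack of -3.0 are targeted, so the sketch selects `p1` and `p2` and leaves the milder violation `p3` for later iterations.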
Challenge 4:
In some parts of the design, congestion and cell density were high. Adding partial placement blockages helped resolve this.
Reduce the local cell density using partial placement blockages:
fc_shell> create_placement_blockage -boundary {10 20 100 200} -type partial -blocked_percentage 40
(This means 40% of the area is blocked for standard cell placement and the remaining 60% is available.)