Atmel
Rousset, France
www.atmel.com
ABSTRACT
This paper presents the implementation and usage of the Synopsys OCC in a complex SoC that uses Synopsys
DFTMAX Adaptive Scan Compression technology. A solution
designed at ATMEL Rousset for the SAMA5D3 product is proposed. It covers the main DFT
and ATPG aspects and explains the specificities of the design in terms of OCC test
implementation, focusing on the at-speed scan test of synchronous subsystems that run at
different frequencies.
Table of Contents
SNUG 2013
At-speed ATPG flow - Incremental flow for 95% stuck-at coverage
ATPG results
Early silicon results
10. Acknowledgements
11. Acronyms
12. References
Table of Figures
Figure 1 - General principle of at-speed test
Figure 2 - Synopsys OCC switch
Figure 3 - Synopsys OCC functional view
Figure 4 - Clock domains in our SoC
Figure 5 - System branch in our SoC
Figure 6 - Solution 1
Figure 7 - Solution 2
Figure 8 - Mux in APMC for clock selection
Figure 9 - Multi-pass at-speed scan mode: Mode 0
Figure 10 - Multi-pass at-speed scan mode: Mode 1
Figure 11 - Multi-pass at-speed scan mode: Mode 2
Figure 12 - Multi-pass at-speed scan mode: Mode 3
Figure 13 - OCC insertion point
Figure 14 - Blocking any data path from ATE clock
Figure 15 - STA clocks at top level
Figure 16 - STA clocks inside APMC
Figure 17 - I/O constraints
Table of Tables
Table 1 - OCC truth table
Table 2 - Synopsys OCC control
Table 3 - Parts caught by scan at-speed test modes
This fast clock is driven by a PLL at the frequency targeted in functional
mode. The clock paths in capture mode should be identical to those in functional mode, so as to
reflect the functional behavior exactly. The reference clock at the PLL input is driven by the
tester. This reference clock is named 'OCC reference clock'.
This slow clock is driven by the tester. It is referred to hereafter as 'ATE OCC slow clock'.
In at-speed scan, scan patterns are therefore loaded and unloaded at the slow rate (ATE rate), while launch
and capture are performed at the high-speed rate (PLL rate) with at least 2 clock pulses.
Switching between the high-speed clock and the ATE slow clock is performed automatically by the
OCC and is controlled by the scan enable signal (test_se).
The principle of the Synopsys OCC logic is illustrated in Figure 3, and its behavior is
detailed hereafter:
In functional mode (scan_test_mode = 0),
the OCC always selects the PLL clock (fast_clk) on its output (clk). (Caution: in test modes
other than scan mode, the PLL must be stopped.)
A bypass pin is provided (pll_bypass).
When it is asserted high, the OCC logic is bypassed and the selected output clock is the ATE slow
clock (slow_clk). This signal is asserted when normal basic scan is to be performed. Note
that the bypass pin has no effect when the device is in functional mode.
When the device is in scan test mode and the bypass pin is inactive, the selected clock at the
OCC output is:
ATE slow clock (slow_clk) in shift mode (test_se = 1)
PLL clock (fast_clk & fast_clk_en) in capture mode (test_se = 0)
Table 1 - OCC truth table
Test mode          Scan enable   Bypass         Output
(scan_test_mode)   (test_se)     (pll_bypass)   (clk)
0                  x             x              PLL clk (fast_clk)
1                  x             1              ATE slow clk (slow_clk)
1                  0             0              fast_clk_en & fast_clk
1                  1             0              ATE slow clk (slow_clk)
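The clock selection described above can be sketched as a small behavioral model. This is only an illustration of the truth table, not the Synopsys OCC implementation; the function name occ_output and the returned strings are assumptions made for this example.

```python
def occ_output(scan_test_mode: int, test_se: int, pll_bypass: int) -> str:
    """Return the clock source selected on the OCC output (clk).

    Behavioral sketch of the OCC truth table; names are illustrative only.
    """
    if scan_test_mode == 0:
        # Functional mode: PLL clock, the bypass pin has no effect.
        return "fast_clk"
    if pll_bypass == 1:
        # Basic scan: OCC bypassed, ATE slow clock for shift and capture.
        return "slow_clk"
    if test_se == 1:
        # At-speed scan, shift phase: ATE slow clock.
        return "slow_clk"
    # At-speed scan, capture phase: gated PLL clock.
    return "fast_clk & fast_clk_en"


# Walk through the four rows of the truth table.
for row in [(0, 0, 0), (1, 0, 1), (1, 0, 0), (1, 1, 0)]:
    print(row, "->", occ_output(*row))
```

The order of the tests mirrors the priority of the signals: test mode first, then bypass, then scan enable.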
b) Clock chain. This clock chain is a dedicated scan chain segment. It allows a per-pattern clock selection mechanism and is controlled by the ATPG. The clock chain is
created during OCC insertion and may be integrated into other scan chains of the design.
It is loaded as part of the regular scan load process. When test_se is 0, the content of the
clock chain is unchanged.
The clock bits are used to validate or invalidate the PLL clock pulses on each cycle of the cycle
counter.
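As a rough illustration of this per-pattern gating, the sketch below treats the clock chain as a list of bits, one per cycle of the cycle counter, where a 1 lets the PLL pulse through to the OCC output. The chain length of 4 and the bit ordering are assumptions made for the example, not properties of the SAMA5D3 OCC.

```python
def gate_fast_clock(clock_chain_bits):
    """Return, for each cycle of the cycle counter, whether the PLL pulse is passed."""
    return [bool(bit) for bit in clock_chain_bits]


def waveform(clock_chain_bits):
    """Render the gated capture clock: 'P' for a passed pulse, '_' for a blocked one."""
    return "".join("P" if enabled else "_" for enabled in gate_fast_clock(clock_chain_bits))


# A pattern that enables two consecutive pulses (launch, then capture).
print(waveform([1, 1, 0, 0]))
```

Because the bits are loaded with the regular scan data, the ATPG can choose a different pulse sequence for every pattern.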
The next section details how the Synopsys OCC is inserted in the SAMA5D3 design, which is divided into subsystems that run at different frequencies.
The specificity of our SoC is that it is divided into several subsystems that can run at different frequencies. Partitioning the design into several subsystems relaxes the constraints
on subsystems that do not require full performance. As a result, synthesis yields less area, lower
power and fewer timing closure issues.
Basically, the presented SoC has 4 synchronous subsystems:
ARM processor, which runs at up to 400 MHz,
LCD controller, which runs at 266 MHz,
system clock and 2/3 of the peripherals, which run at 133 MHz,
1/3 of the peripherals, which run at 66 MHz.
In functional mode, those clocks are generated by the APMC through dividers that are software programmable according to the user's needs. Dividers are cascaded as shown in Figure 5.
The picture shows the dividers programmed to provide the best performance of
the SoC, which corresponds to our scan at-speed target frequencies.
Figure 6 - Solution 1
In this case, all dividers have to keep their functionality in scan mode with the ratios shown in the
picture above, so 4 OCCs are needed. The drawbacks of this solution are:
All subsystems of OCC sys are synchronous even if their clock frequencies are different,
and many data paths exist between them. With solution (1), the data paths from one
sub-domain to another cannot be tested at-speed, as there is no way to synchronize the
OCCs provided by Synopsys. Therefore, during at-speed scan test, every subsystem would
be seen as an asynchronous block, which is not the case in the SAMA5D3 design. In our case
this would lead to a loss of 20% in terms of at-speed fault coverage.
This issue may be solved by developing a fully synchronous OCC, which is time consuming and a possible source of bugs (see solution (3) at the end of this chapter).
Dividers are not tested at-speed.
Figure 7 - Solution 2
Frequency partitioning is done in a second step, during ATPG, using multi-pass at-speed pattern
generation (see chapter 7). ATPG is run 4 times, once per frequency subsystem, in
different scan modes that apply either the OCC clock or the ATE tester clock according to the maximum frequency supported by each of the four clock domains. To do so, multiplexors are inserted into the APMC
on the clock paths to protect clock branches against unsupported clock rates.
The scan at-speed OCC mode selection is generated by a customized test mode controller through
external pins (not described here).
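The per-mode clock selection can be pictured with the following sketch. It assumes that, in a given mode, a domain receives the OCC clock only if it supports the PLL frequency of that mode, and the ATE clock otherwise. The domain names, the mode-to-frequency mapping and this selection rule are illustrative assumptions; the actual mux control is generated by the test mode controller.

```python
# Maximum supported frequency of each synchronous subsystem, in MHz.
DOMAIN_MAX_MHZ = {"arm": 400, "lcd": 266, "sys": 133, "periph": 66}

# Assumed PLL frequency applied in each multi-pass scan mode, in MHz.
MODE_PLL_MHZ = {0: 400, 1: 266, 2: 133, 3: 66}


def clock_sources(mode):
    """Return, per domain, which clock the APMC muxes would select in this mode."""
    pll_mhz = MODE_PLL_MHZ[mode]
    return {
        domain: "occ_clk" if max_mhz >= pll_mhz else "ate_clk"
        for domain, max_mhz in DOMAIN_MAX_MHZ.items()
    }


for mode in MODE_PLL_MHZ:
    print(mode, clock_sources(mode))
```

Under this assumption, each slower mode adds one more domain to the set clocked by the OCC, so paths between synchronous domains become testable at the frequency of the slower domain.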
Figure 8 shows the location of the multiplexors in the APMC that select the clock source
according to the selected scan mode.
Signal       Description
test_se      1: shift mode, 0: capture mode
fast_clk     PLL fast clock
reset        Reset (active low)
pll_bypass   OCC bypass (active high)
slow_clk     ATE slow clock
test_mode    Scan test mode
Table 2 - Synopsys OCC control
All signals listed in Table 2, except fast_clk, must be controlled from pads, through
hookup pins where needed.
The following subsections give some details on the OCC control signals:
test_se
This is the shift enable signal, also used for basic scan. test_se should be provided at the test
mode controller interface through a clock-tree pre-buffer for proper buffering.
fast_clk
This signal is to be connected to the output of the PLL. We have inserted muxes into the test mode controller in order to select fast_clk sources other than the PLL, for debug purposes, in case of
PLL lock issues. The OCC is inserted between test_mode_controller and the APMC, as close as
possible to the PLL so as not to impact the functional clock path. The location of the OCC insertion point
is shown in Figure 13:
slow_clk
This signal is the ATE slow clock used in shift mode or in bypass mode. It must be
driven by a dedicated pad and is available at the test mode controller output as a hookup pin for
OCC insertion. It is declared as a scan_clock or as an oscillator (always-running clock) depending on
the exercised mode.
The pad used to drive the slow clock input of the OCC must drive only this input, to avoid DRC issues:
This clock cannot drive any other asynchronous OCC. If the same clock were used to
drive the slow clock inputs of separate OCCs, these OCC outputs would, in basic scan mode, be considered by ATPG as belonging to the same clock domain. Data paths
between asynchronous clock domains would then be exercised by ATPG!
This clock cannot drive any other DFF inputs. In that case, DRC results in an unrecoverable error. In scan mode, paths from the ATE slow clock pad to functional logic must be cut,
as illustrated in Figure 14. So as not to penalize coverage of the functional logic, any pad
may be used to drive this path.
pll_bypass
In the Synopsys flow, this signal must be driven directly by the tester from one pad and cannot
result from the logical combination of several pads. In the STIL output file, the bypass signal
appears as a constant set to 0 in the at_speed section and to 1 in the occ_bypass section. It must
be available at the test mode controller interface to provide a hookup pin for the OCC insertion.
Also in the Synopsys flow, it is not possible to assign a separate bypass signal to each OCC: all
OCCs must share the same bypass signal. It is however possible to work around this constraint within
DC, once OCC insertion has been performed, by disconnecting the bypass signal from the OCC
and reconnecting another signal to the OCC input. This is what we have done in our flow.
reset
The Synopsys recommendation is that the OCC reset can be shared with other resets in scan
test mode.
In basic scan mode: when the OCC is bypassed, there is no requirement to keep the OCC reset constant, so the reset can be used to reset the functional logic.
In at_speed mode: the functional reset does not need to be tested at speed because it is
considered static; it is therefore tested with the stuck-at model only.
test_mode
The test mode input of the OCC is the scan test mode signal at the top level.
Unfortunately, in the Synopsys flow, this signal is expected to be driven directly by one pad of
the design. To avoid using an additional pad to drive that signal, a workaround for this
limitation is to declare an already used pad as the one that sets up the design in scan test mode.
top_ScanCompression.stil :
PatternExec ScanCompression_mode {
    PatternBurst ScanCompression_mode;
}
PatternExec ScanCompression_mode_occ_bypass {
    PatternBurst ScanCompression_mode_occ_bypass;
}
or, with compression:
set_drc $path_stil/top_ScanCompression_mode$i.stil
run_drc -patternexec ScanCompression_mode
5. STA Constraints
Timing closure of scan modes is verified with STA constraints. The reference methodology is
inspired by the approach explained in SolvNet, ref[4].
Two STA scenarios are constrained: one scan shift and one scan capture. Timing paths from the scan
enable test_se to the scan clocks must be analyzed. It is therefore strongly advised to have at least one
scenario in which no case analysis is set on test_se.
scan shift scenario
In the scan_shift scenario, test_se is forced to 1. All capture paths are thus removed
from the analysis. Only ATE clocks (either ATE basic scan clocks or ATE OCC slow clocks) must
be declared. Declaring fast clocks in shift mode makes no sense.
All ATE clocks are declared as belonging to the same clock domain, so that shift paths between
scan chains driven by separate scan clocks are analyzed.
In this scenario, there is nothing special regarding multi-frequency target modes.
scan capture scenario
In the scan_capture scenario, test_se is not constrained, so that the signal transition is analyzed. All (fast and slow) clocks are declared:
ATE basic scan clocks,
ATE OCC slow clocks,
OCC fast clocks.
Note that it is safe to declare ATE OCC slow clocks in the scan capture scenario in order to time the design in OCC bypass modes. This may be needed because the clock paths may differ between bypass
and at-speed modes, and the skew between clocks may then differ as well.
The ATE OCC slow clocks and ATE basic scan clocks are individually assigned to separate
clock domains so that paths between ATE clocks are not seen. This is equivalent to the
set_drc -nodisturb_clock_grouping command of ATPG, where the ATPG is asked not to activate paths between separate scan clocks. As a consequence, shift paths between scan chains driven by
distinct clocks are not analyzed, which is why a dedicated shift scenario is still required.
To manage the specificity of our design, which has synchronous subsystems running at different
frequencies, clock declarations are duplicated so as to handle the target frequency of each mode.
For example, the PLL frequency has to be 400 MHz in mode 0, 266 MHz in mode 1,
133 MHz in mode 2 and 66 MHz in mode 3. Therefore 4 clocks are created at the output of the PLL, as
shown in Figure 15. The same principle is applied to all scan capture clocks defined at STA level.
All clocks are declared in the same scan capture scenario. False paths are added between the clocks
of one mode and those of the others, because they are exclusive.
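To make the duplication concrete, the sketch below generates the kind of SDC lines this approach implies: one clock per mode on the same PLL output, plus false paths between every pair of mode clocks. The pin name i_pll/ck and the clock names occ_fast_mode0..3 are hypothetical, and the real constraint files of the design are not shown in this paper; only the standard SDC commands create_clock (with -add for additional clocks on the same source) and set_false_path are used.

```python
from itertools import permutations

# Target PLL frequency of each scan capture mode, in MHz.
MODE_MHZ = {0: 400, 1: 266, 2: 133, 3: 66}

# Hypothetical PLL output pin, chosen for illustration only.
PLL_PIN = "i_pll/ck"


def mode_clock_constraints(pin=PLL_PIN):
    """Return SDC lines declaring one clock per mode and false paths between modes."""
    lines = []
    for mode, mhz in MODE_MHZ.items():
        period_ns = round(1000.0 / mhz, 3)
        # -add keeps the clocks already defined on the same pin.
        add = "" if mode == 0 else " -add"
        lines.append(
            f"create_clock -name occ_fast_mode{mode} -period {period_ns}{add} [get_pins {pin}]"
        )
    # Clocks of different modes never coexist: cut every ordered pair.
    for src, dst in permutations(MODE_MHZ, 2):
        lines.append(
            f"set_false_path -from [get_clocks occ_fast_mode{src}] "
            f"-to [get_clocks occ_fast_mode{dst}]"
        )
    return lines


print("\n".join(mode_clock_constraints()))
```

With 4 modes this yields 4 clock declarations and 12 false paths; the same pattern would apply to the other scan capture clocks that are duplicated per mode.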
The following picture shows the location of clocks defined in STA. Note that multiple clocks are
defined at the output of the PLL to manage all different frequency modes.
In addition to the clocks defined in Figure 15, generated clocks are defined inside the APMC at the
output of the muxes in order to propagate the correct clock through the scan multiplexors according
to each frequency mode, as shown in Figure 16. For the same reason as explained previously, all clocks are duplicated to manage all the different frequency modes.
IO constraints
In order to constrain I/Os, virtual clocks are created according to the tester target timing set, as
shown in Figure 17.
at ATE rate. Consequently, these commands in ATPG are not equivalent to multi-cycle paths in
STA.
The recommended Synopsys flow is to read the SDC file in TetraMAX. These SDC files can specify
multi-cycle and false paths. Today, multi-cycle paths are interpreted as false paths by TetraMAX, which
directly impacts the at-speed fault coverage, because false paths are not tested at-speed. Fortunately, our multi-pass ATPG flow, explained in chapter 7, allows us to test those multi-cycle paths
in a lower frequency mode. For example, a multi-cycle path defined in mode 0 (400 MHz) will
not be tested at-speed in this mode, but it will be tested in mode 2 (133 MHz), as it is not defined as a
multi-cycle path in that mode.
Tcl procedures are available in the TetraMAX installation directory to generate SDC files for ATPG
(the pt2tmax.tcl file defines the write_delay_paths, write_exceptions_from_violations, write_exceptions
and write_exceptions_from_violations_from_to Tcl procedures).
For more information, refer to the read_sdc flow as described in the TetraMAX User
Guide, see ref[2].
Clock gating checks
Clock gating checks must be performed on the OCC output. The aim is to check that the clock
gating of the fast_clk input and the clock gating of the slow_clk input of the OCC are performed correctly.
Clock gating checks are not necessarily inferred by PrimeTime, depending on how the OCC is synthesized. In that case, a user constraint should force PT to perform the check. For instance:
# Pins of the OCC cell on which the clock gating check is forced
set OCC_cgc_list i_core/i_muxdebug/pll_controller_sys/U2/A2
lappend OCC_cgc_list i_core/i_muxdebug/pll_controller_sys/U2/A3
lappend OCC_cgc_list i_core/i_muxdebug/pll_controller_sys/U2/B2
# Apply the same setup and hold check value to every listed pin
foreach OCC_CGC_pin $OCC_cgc_list {
    set_clock_gating_check -setup 0.1 -hold 0.1 -high $OCC_CGC_pin
}
Note : Since DFT Compiler version D-2010.03-SP2, the synthesis of latch-based clock gating
cells can be forced by setting the test_occ_insert_clock_gating_cells variable to true. This option
has not been tested yet.
6. CTS Constraints
The aims of the CTS constraints regarding the OCC design are:
balancing all flip-flops in the OCC driven by the fast_clk OCC input.
balancing all flip-flops in the OCC driven by the slow_clk input. For now, there is only one
DFF driven by this clock in the design (slow_clk_enable_l_reg), but
this number may increase in later implementations.
balancing all flip-flops in the clock chain, namely in module top_DFT_clk_chain.
These DFFs are driven by the output of the OCC. Note that they do not need to be balanced with all
DFFs driven by the OCC output. The clock chain is chained with other chain segments of the
same domain through lockup latches.
tagging the nets from the pads (reference clock) to the PLL input.
tagging the nets from the PLL outputs to the OCC fast_clk input.
tagging the net from the ATE clock pad to the OCC slow_clk input.
tagging the net from the OCC output (clk) to the APMC scan multiplexor.
Note that all scan clocks should also be tagged.
CTS roots at top level
CTS roots must be declared on clock sources at top level.
For fast clocks, these sources are typically PLL outputs (ck), PLL divided outputs (out-div), or
the output of a custom divider on PLL outputs. A clock root must be declared on each potential
fast clock source. Declaring CTS roots on fast clock sources does not aim at balancing the OCC
inputs with other leaf pins of the same root, but at tagging the clock nets.
For the slow clock source, the root pin must be declared on the cin output pin of the pad that
drives the ATE slow clock. As shown in Figure 18, scan multiplexors were placed in the test
mode controller between the clock sources and the OCC inputs. They separate the functional and
scan paths of the clocks (fast and slow). In functional mode, the OCC clock inputs are tied to 1'b0.
Primary Outputs (POs) must not be used in at-speed mode, because the at-speed
scan rate is too fast to exercise paths from registers to pads. Only reg-to-reg paths must be exercised and tested in at-speed mode:
o add_po_mask -all
For the same reason, the patterns must not allow Primary Inputs (PIs) to change between the launch clock and the capture clock for clock launch transition fault
patterns or for any path delay fault patterns:
o set_delay -nopi_changes
o add_pi_constraints X
The patterns must not use values through bidirectional ports, since these paths may not be
functional and may take more than one at-speed cycle:
o add_slow_bidi
Disable inter-clock domain testing and avoid generating patterns that use one clock to
launch and another clock to capture:
o set_delay -common_launch_capture
Disable the parallel launch and capture of several clocks in the same tester cycle (even
when the clocks belong to separate groups). This parameter is a choice that could have
been different:
o set_delay -noallow_multiple_common_clocks
Read the timing exceptions generated during STA:
o read_sdc file.sdc
Set the DRC file:
o set_drc ./STIL/top_ScanCompression_mode0.stil -nodisturb_clock_grouping
Run DRC:
o run_drc -patternexec ScanCompression_mode
At-speed ATPG Flow - ATPG Step
The following parameters have been set for ATPG:
The fault model used is the transition fault model.
The width of the capture window must be large enough to enable proper resynchronization of
test_se, according to the skew between test_se and the clocks of the design at the FF side (this parameter can be reduced afterward according to the timing margin with respect to the scan
enable change):
o set_atpg -min_ateclock_cycles 7
Select the depth of the fast sequential algorithm:
o set_atpg -capture_cycles 3
Fault dictionary creation has been limited to at-speed domains; therefore, the command add_fault
-launch OCC_clock -capture OCC_clocks has been used instead of add_fault -all. This command has to be executed 4 times with different parameters in order to add all faults, as shown
in Figure 20.
D1 adds faults from FF to FF, both clocked by OCC_clock
D2 adds faults from FF clocked by OCC_clock to FF clocked by ATE clock
D3 adds faults from FF clocked by ATE clock to FF clocked by OCC_clock
D4 adds faults from FF clocked by OCC_clock to FF clocked by ATE clock, with a logic
cone impacted by some FF clocked by ATE clock
1. Build the dictionary of the 400 MHz domain as explained in the previous section,
2. run ATPG on mode 0,
3. write the mode 0 dictionary resulting from ATPG,
4. Build the dictionary of the 266 MHz domain. Remove from this dictionary the mode 0
fault dictionary,
5. run ATPG on mode 1,
6. write the mode 1 dictionary resulting from ATPG,
7. Build the dictionary of the 133 MHz domain. Remove from this dictionary the mode 1
fault dictionary,
8. run ATPG on mode 2,
9. write the mode 2 dictionary resulting from ATPG,
10. Build the dictionary of the 66 MHz domain. Remove from this dictionary the mode 2 fault
dictionary,
11. run ATPG on mode 3,
12. write the mode 3 dictionary resulting from ATPG (pass 3).
The multi-pass at-speed flow is illustrated in Figure 21. Notice that the loop starts at mode 1
because mode 0 has been previously run without the need to remove any dictionary.
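The bookkeeping of the steps above can be sketched with plain sets. This is a toy model, not the TetraMAX flow itself: fault names are arbitrary strings, run_atpg is a stand-in callable supplied by the caller, and the sketch assumes that each written dictionary carries forward the faults credited in earlier passes (modelled here as a running set).

```python
def multi_pass_atpg(domain_faults, run_atpg):
    """Run one ATPG pass per mode, fastest domain first.

    domain_faults: list of (mode, set_of_faults), ordered from mode 0 upward.
    run_atpg(mode, targets): returns the subset of targets detected in that mode.
    Returns a dict mapping each mode to the faults newly detected in that pass.
    """
    credited = set()      # faults already detected in earlier passes
    dictionaries = {}
    for mode, faults in domain_faults:
        # Build the dictionary of this domain, minus what earlier passes caught.
        targets = set(faults) - credited
        detected = run_atpg(mode, targets) & targets
        credited |= detected
        dictionaries[mode] = detected
    return dictionaries


def toy_atpg(mode, targets):
    """Stand-in ATPG: fault 'c' cannot be detected at the mode 0 frequency."""
    undetectable = {0: {"c"}}
    return targets - undetectable.get(mode, set())


passes = [(0, {"a", "b", "c"}), (1, {"a", "b", "c", "d"})]
print(multi_pass_atpg(passes, toy_atpg))
```

In this toy run, fault "c" is missed in mode 0, stays in the mode 1 dictionary and is detected there, while "a" and "b" are not targeted a second time.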
and all faults that remain undetected after pass 1 because of these timing exceptions should remain in the dictionary in order to get a chance to be detected at 133 MHz. This option should be selected if the number of timing exceptions significantly reduces the fault coverage. This option
has been used in the design.
At-speed ATPG Flow - Incremental flow for 95% stuck-at coverage
Once all at-speed ATPG passes are completed, a last incremental stuck-at pass is performed in order to catch stuck-at defects with a coverage of 95%. Incremental means here that the already
detected transition faults are used as the starting point of the last ATPG pass. All dictionaries
resulting from the at-speed passes are removed from the stuck-at dictionary after conversion of
transition faults into stuck-at faults. For this, we use the command:
update_faults -direct_credit -external mode$i.dict
The incremental flow used to generate 95% stuck-at coverage is depicted in Figure 22.
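The idea behind this crediting step can be illustrated as follows. A pattern that detects a slow-to-rise transition fault at a site also detects the stuck-at-0 fault at that site, and slow-to-fall likewise covers stuck-at-1. The sketch below is a toy set model of that conversion; the fault encoding (site, kind) and the function name are assumptions for the example and do not describe the internals of update_faults.

```python
# A detected transition fault credits the matching stuck-at fault at the same site.
TRANSITION_TO_STUCK = {"str": "sa0", "stf": "sa1"}  # slow-to-rise, slow-to-fall


def credit_stuck_at(stuck_at_faults, detected_transition_faults):
    """Return the stuck-at faults still to be targeted by the incremental pass."""
    credited = {
        (site, TRANSITION_TO_STUCK[kind])
        for site, kind in detected_transition_faults
    }
    return set(stuck_at_faults) - credited


stuck_at = {("u1/Z", "sa0"), ("u1/Z", "sa1"), ("u2/A", "sa0")}
detected = {("u1/Z", "str")}
print(sorted(credit_stuck_at(stuck_at, detected)))
```

Only the faults left after crediting need new patterns, which is what keeps the final stuck-at pass incremental.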
ATPG results
The following are the results, in terms of coverage and pattern count, of the multi-pass ATPG flow
versus the basic scan flow without OCC:
We can observe that the pattern count increases by a factor of 2.4 between the basic stuck-at fault model and the transition fault
model, which is what is generally seen when comparing stuck-at and transition pattern
counts. The transition fault model scan test time is still acceptable thanks to the compression factor of
20X used on the SAMA5D3 chip: in scan compression mode, the internal scan chains are
100 flip-flops long, which gives a scan test time of around 2 ms. Note that the test time
with the transition fault model would not have been acceptable without compression (it has
been estimated at a minimum of 50 ms, compared to 2 ms).
These results were obtained with the following releases:
DFT Compiler / Design Compiler : 2011.09-SP3
TetraMAX : 2012.06-SP5
% Loss transition versus stuck-at    Comments
0.4%                                 High frequency : timing critical
0%                                   Very small area of chip
0.8%                                 4/5 of the area of the digital
0%                                   Low frequency : result close to stuck-at
8. Pattern validation
We have used the TetraMAX MAXTestbench feature to create a Verilog testbench in order to validate
the ATPG patterns with fully back-annotated simulations. Please refer to ref[2] if you need
more details about the MAXTestbench flow. An alternative solution is to use STILDPV (usage of STIL
PLI), see ref[5].
10. Acknowledgements
We would like to thank Anne Lafage for her preliminary work on the OCC flow at
ATMEL, which served as the basis for this paper (see ref[0]), as well as all the ATMEL design team
and management involved, especially Fabrice Vigneron (ATMEL Design Team Manager) for his
help and advice, and our local Synopsys support team.
11. Acronyms
DFT : Design for Test
SPF : STIL Protocol File
FF : Flip-Flop
BB : Black Box
PnR : Place and Route
ATPG : Automatic Test Pattern Generation
ATE : Automatic Test Equipment
IP : Intellectual Property (design block)
DC : Design Compiler, Synopsys Synthesis Tool
DFT Compiler : DFT insertion tool embedded in DC
TetraMAX : Synopsys ATPG tool
SoC : System on a Chip
APMC : Advanced Power Management Controller
SDD : Small Delay Defect
12. References
[0] Scan At-Speed Guidelines V1.0, Anne Lafage, January 2012, ATMEL MCU Design
Group, ATMEL Internal Confidential Document
[1] Synopsys Tetramax 2012.06-SP5 User Guide,
https://solvnet.synopsys.com/dow_retrieve/latest/tmax/tmax_olh/Default_CSH.htm
[2] Synopsys DFT Compiler 2012.06-SP5 User Guide,
https://solvnet.synopsys.com/dow_retrieve/latest/dftxg1/dftxg1.html
[3] Synopsys Design Compiler 2012.06-SP5 User Guide,
https://solvnet.synopsys.com/dow_retrieve/latest/dcug/dcug.html
[4] Solvnet ID 022490, Static Timing Analysis Constraints for On-Chip,
https://solvnet.synopsys.com/retrieve/022490.html?otSearchResultSrc=advSearch&otSearchResultNumber=4&otPageNum=1
[5] Test Pattern Validation User Guide 2012.06-SP5
[6] On-Chip-Clock controller (OCC) : An Alternative Approach, SNUG Israel 2012
[7] Using Custom OCC with TetraMax for At-speed Transition Fault testing and Small
Delay Defect, SNUG France 2011
[8] How to Handle OCC (On Chip Clocking) by using DFT Compiler and TetraMax on
Sophisticated SoC Design, SNUG Singapore 2008
[9] Implementation of At-Speed Test using an On-Chip Clock Controller on an Analog Device Audio Processor, SNUG Europe 2008
[10] Automatic Insertion Flow of on Chip Controller for At-Speed Testing, SNUG Europe
2007
[11] ATMEL SAMA5D3 Product - The next Benchmark in Versatility,
http://www.atmel.com/Microsite/sama5d3/default.aspx