
Using at-speed testing with OCC (On Chip Clock Control) on a complex SoC, a user experience.

Bertrand Bruder : bertrand.bruder@atmel.com


Nicolas Graffet : nicolas.graffet@atmel.com
Vincent Chanel : vincent.chanel@atmel.com

Philippe Rossant : philippe.rossant@synopsys.com

Atmel
Rousset, France
www.atmel.com
ABSTRACT

This paper presents the implementation and usage of the Synopsys OCC in a complex SoC using
Synopsys DFTMAX Adaptive Scan Compression technology. A solution designed at ATMEL Rousset
for the SAMA5D3 product is proposed. It covers the main DFT and ATPG aspects and explains the
specificities of the design in terms of OCC test implementation, focusing on the at-speed scan
test of synchronous subsystems that run at different frequencies.

Table of Contents
1. Introduction and scope
2. Principles of at-speed Scan and On-Chip Clocking
   Basics on at-speed testing and OCC
   Synopsys OCC description
3. Scan At-Speed Complex SoC Design Integration
   OCC sys clock system analysis
   Scan at-speed OCC insertion
      a) First solution : One OCC per clock domain
      b) Second solution : Only one single OCC for all domains
      c) Third solution : use of custom synchronous OCC
4. Synthesis Flow and STIL generation
   OCC insertion
      test_se
      fast_clk
      slow_clk
      pll_bypass
      reset
      test_mode
   STIL files generation
      Specificity of our multi-pass ATPG flow
5. STA Constraints
   Scan shift scenario
   Scan capture scenario
   IO constraints
   SDC generation for ATPG
   Clock gating checks
6. CTS Constraints
   CTS roots at top level
   CTS constraints in OCC
7. Multi-Pass ATPG Flow
   At-speed ATPG flow - Build model
   At-speed ATPG flow - DRC step
   At-speed ATPG flow - ATPG step
   At-speed ATPG flow - Multi-pass pattern generation

SNUG 2013

Using at-speed testing on a complex Soc, a user experience

   At-speed ATPG flow - Incremental flow for 95% stuck-at coverage
   ATPG results
   Early silicon results
8. Pattern validation
9. Conclusion and Future Work
10. Acknowledgements
11. Acronyms
12. References

Table of Figures
Figure 1 - General Principle of at-speed Test
Figure 2 - Synopsys OCC switch
Figure 3 - Synopsys OCC functional view
Figure 4 - Clocks Domain in our SoC
Figure 5 - System Branch in our SoC
Figure 6 - Solution 1
Figure 7 - Solution 2
Figure 8 - Mux in APMC for clock selection
Figure 9 - Multi-pass at speed Scan mode : Mode 0
Figure 10 - Multi-pass at speed Scan mode : Mode 1
Figure 11 - Multi-pass at speed Scan mode : Mode 2
Figure 12 - Multi-pass at speed Scan mode : Mode 3
Figure 13 - OCC insertion point
Figure 14 - Blocking any data path from ATE clock
Figure 15 - STA clocks at top level
Figure 16 - STA clocks inside APMC
Figure 17 - I/O constraints
Figure 18 - CTS constraints example
Figure 19 - using the set_build instance_modify TIEX command
Figure 20 - add_faults command select exclusive fault sets
Figure 21 - Multi-Passes flow on our SoC using all Modes.
Figure 22 - Top-off stuck-at Incremental ATPG
Figure 23 - Wafer maps of stuck-at versus transition fault models

Table of Tables
Table 1 - OCC truth table
Table 2 - Synopsys OCC control
Table 3 - Parts caught by Scan At-speed test modes


1. Introduction and scope


This document aims at providing a user experience for inserting and validating scan logic for at-speed testing. The flow presented here is based on the Synopsys OCC design IP. This flow has
been successfully implemented on a complex SoC with multiple frequency domains (both in functional and in test mode), incorporating DFTMAX Adaptive Scan Compression logic with a CTL-based
hierarchical approach. The OCC-related aspects are detailed, focusing mainly on how
subsystems that run at different frequencies are handled.
Several SNUG papers (see ref[6], ref[7], ref[8], ref[9], ref[10]) have been written in the past explaining
how to implement the Synopsys OCC or a user-defined custom OCC within different design contexts,
and we encourage the reader to look at these very valuable references. Here we describe our own complex SoC context and the solution we have chosen, in order to share a user experience.
This document first presents the concepts of at-speed scan and describes the Synopsys OCC. It then
details the implementation of at-speed scan in the SAMA5D3 design, focusing on the specificity of multi-frequency subsystems, starting from synthesis and STIL generation, going through
STA and CTS constraints, and ending with ATPG generation.


2. Principles of at-speed Scan and On-Chip Clocking


In this section we present the main concepts around scan at-speed and On-Chip Clocking, and
present the Synopsys OCC implemented in SAMA5D3 design.
Basics on at-speed testing and OCC
At-speed scan is generally used to catch physical defects that could slow down
functional paths. These physical defects appear in small technology nodes (below 130nm). In this
case, the stuck-at fault model may not be fully appropriate because, even though the functionality is
correct, the timing of data and clocks may be affected. Therefore, the transition fault model is
used to complement the test coverage.
Note that transitions must be captured at-speed, and that the transition fault model is generally combined with scan compression, here Synopsys DFTMAX Adaptive Scan Compression technology for both DFT and ATPG. For more information about the transition fault model and Synopsys
DFTMAX Adaptive Scan Compression technology, please see ref[1] and ref[2].

Figure 1 - General Principle of at-speed Test


The Synopsys OCC acts as a switch between fast and slow clocks, as depicted in Figure 2. During capture cycles, the flip-flop clock source is the functional clock, generally provided by the functional PLL at the maximum target clock speed.
In scan mode, flip-flops are driven by an On-Chip Clocking (OCC) device which replaces the
functional clock source (a PLL or a division of its output). The clock tree from the OCC to the flip-flops is the same in functional and scan modes.

Figure 2 - Synopsys OCC switch


The clock driven by the OCC is (please refer to Figure 2) :

- A fast clock in CAPTURE mode (test_se = 0)

This fast clock is driven by a PLL at the frequency targeted in functional
mode. The clock paths in capture mode should be identical to those in functional mode, so as to
reflect the functional behavior exactly. The reference clock at the PLL input is driven by the
tester. This reference clock is named 'OCC reference clock'.

- A slow clock in SHIFT mode (test_se = 1)

This slow clock is driven by the tester. It is hereafter named 'ATE OCC slow clock'.
Thus, in at-speed scan, scan patterns are loaded and unloaded at a slow rate (ATE rate). Launch
and capture are done at a high-speed rate (PLL rate) through at least 2 clock pulses.
Switching from the high-speed clock to the ATE slow clock is performed automatically by the
OCC and is controlled by the scan enable signal (test_se).
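
The shift and capture sequence described above can be sketched as a list of clock pulses for one pattern. This is only an illustration: the function name, the tuple layout and the default of 2 capture pulses are assumptions made for the sketch, not part of the Synopsys flow.

```python
def at_speed_pattern_schedule(chain_length, capture_pulses=2):
    """List the clock pulses of one at-speed pattern as (phase, test_se, clock).

    Hypothetical helper: it only restates the sequence described in the text.
    """
    if capture_pulses < 2:
        raise ValueError("launch and capture need at least 2 fast pulses")
    # Load/unload: one ATE slow clock pulse per scan cell, with test_se = 1.
    shift = [("shift", 1, "ATE OCC slow clock")] * chain_length
    # Launch then capture: fast PLL pulses, with test_se = 0.
    capture = [("capture", 0, "PLL fast clock")] * capture_pulses
    return shift + capture


if __name__ == "__main__":
    for step in at_speed_pattern_schedule(chain_length=3):
        print(step)
```

The chain length of 3 is an arbitrary value chosen to keep the printout short.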


Synopsys OCC description


This section describes the OCC architecture automatically inserted by Synopsys DFT Compiler
and its functional behavior. It should be read as a complement to the On-Chip Clocking support chapter of the most up-to-date Synopsys DFT Compiler User Guide (see ref[1]).
The functional view of the Synopsys OCC is presented in Figure 3 :

Figure 3 - Synopsys OCC functional view


The principle of the Synopsys OCC logic is illustrated in Figure 3 and the behavior of this logic is
detailed hereafter :
- In functional mode (scan_test_mode = 0), the OCC always selects the PLL clock (fast_clk) on its output (clk). (Caution : in test modes other than scan mode, the PLL must be stopped.)
- A bypass pin is provided (pll_bypass). When it is asserted high, the OCC logic is bypassed and the selected output clock is the ATE slow clock (slow_clk). This signal is asserted when normal basic scan is to be performed. Note that the bypass pin has no effect when the device is in functional mode.
- When the device is in scan test mode and the bypass pin is inactive, the selected clock at the OCC output is :
  - the ATE slow clock (slow_clk) in shift mode (test_se = 1),
  - the PLL clock (fast_clk & fast_clk_en) in capture mode (test_se = 0).
Test mode          Scan enable   Bypass         Output
(scan_test_mode)   (test_se)     (pll_bypass)   (clk)
0                  x             x              PLL clk (fast_clk)
1                  x             1              ATE slow clk (slow_clk)
1                  0             0              fast_clk_en & fast_clk
1                  1             0              ATE slow clk (slow_clk)

Table 1 - OCC truth table
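
The truth table can be restated as a small behavioral model. The sketch below is only an illustration of Table 1 and not the Synopsys implementation; the function name and the returned string labels are hypothetical, and `None` stands for the don't-care value (x).

```python
from typing import Optional


def occ_output_source(scan_test_mode: int,
                      test_se: Optional[int],
                      pll_bypass: Optional[int]) -> str:
    """Return which clock reaches the OCC output (clk), following Table 1."""
    if scan_test_mode == 0:
        # Functional mode: PLL clock, test_se and pll_bypass are don't-care.
        return "fast_clk"
    if pll_bypass == 1:
        # Basic scan: OCC bypassed, test_se is don't-care.
        return "slow_clk"
    if test_se == 1:
        # At-speed scan, shift phase.
        return "slow_clk"
    # At-speed scan, capture phase: PLL pulses filtered by fast_clk_en.
    return "fast_clk_en & fast_clk"


if __name__ == "__main__":
    for row in [(0, None, None), (1, None, 1), (1, 0, 0), (1, 1, 0)]:
        print(row, "->", occ_output_source(*row))
```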


As Table 1 shows, not all pulses of the PLL clk are passed to the output clock when test_se = 0.
The fast_clk_en signal filters the pulses of the PLL clk and enables the launch and capture pulses
of the OCC output clock (clk). Enabling the PLL clk is based on :
a) A cycle counter clocked by the PLL clock (fast_clk). This counter is started after resynchronization of test_se = 0 on fast_clk. It counts the PLL cycles inside the timing window where test_se == 0. The size of this counter is determined at
synthesis time.


b) A clock chain. This clock chain is a dedicated scan chain segment. It allows for a per-pattern clock selection mechanism and is controlled by the ATPG. The clock chain is
created during OCC insertion and may be integrated into other scan chains of the design.
It is loaded as part of the regular scan load process. When test_se is 0, the content of the
clock chain is unchanged.
The clock bits are used to validate or invalidate the PLL clock pulses on each cycle of the cycle
counter.
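
The interaction between the cycle counter and the clock chain can be sketched as follows. This is a simplified model under stated assumptions: one clock-chain bit per counter cycle, a counter that starts at 0, and no modelling of the resynchronization latency of test_se. The function name is hypothetical.

```python
def gated_pulses(clock_chain_bits, fast_clk_cycles):
    """Return, for each fast_clk cycle of the capture window, 1 if a pulse reaches clk.

    clock_chain_bits: bits loaded by the ATPG during scan shift (assumed one per
    counter cycle); they stay unchanged while test_se is 0.
    fast_clk_cycles: number of PLL cycles inside the window where test_se == 0.
    """
    pulses = []
    for cycle in range(fast_clk_cycles):
        # The counter value selects a clock-chain bit; once the counter has
        # passed the last bit, no further pulse is let through.
        enabled = cycle < len(clock_chain_bits) and clock_chain_bits[cycle] == 1
        pulses.append(1 if enabled else 0)
    return pulses


if __name__ == "__main__":
    # Launch and capture pulses on the first two cycles of a 6-cycle window.
    print(gated_pulses([1, 1], 6))
    # Only the second cycle is enabled for this pattern.
    print(gated_pulses([0, 1], 6))
```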
The next section details how the Synopsys OCC is inserted in the SAMA5D3 design, which is divided into subsystems that run at different frequencies.


3. Scan At-Speed Complex SoC Design Integration


In this section, a solution is proposed for the implementation of at-speed scan in a complex SoC
design.
As a general rule, separate clock domains in a SoC must be addressed by separate Synopsys
OCC devices. The designer should therefore first identify the clock domains that must be tested at-speed.
These clock domains determine the number of OCCs that have to be inserted and their locations.
The OCCs should then be placed so as to minimize their impact on clock distribution. Indeed, the
latency of at-speed clocks in scan mode must be as close as possible to their latency in functional
mode. To handle this, we have designed what we call the APMC (Advanced Power Management
Controller) IP, which is able to deliver clocks in scan mode without modifying their latencies.
The clock distribution of all clocks delivered by the APMC is the same in scan and functional
modes.
As an example, Figure 4 illustrates the clock domain distribution in our complex SoC and the
OCC insertion.


Figure 4 - Clocks Domain in our SoC


OCC sys clock system analysis
As shown in Figure 4, our SoC has 2 main clock sources : one 800MHz PLL for the system and
one 480MHz PLL for the USB and Soft modem peripherals. For the sake of simplicity, the main
clock source from the 800MHz PLL is referred to as the OCC sys clock system. We will only
focus on the OCC sys part of the design in order to point out the specificity of inserting at-speed
scan in a synchronous system that is divided into multiple subsystems that run at different
frequencies (all subsystem clocks are synchronous). The PLL named PLL800MHz is a
generic block, programmed to run at a maximum frequency of 400 MHz in the SAMA5D3 design.
Note that the same at-speed scan OCC implementation has been used for the rest of the design,
clocked by the other PLL.
The OCC sys clock system drives most of the SoC's clocks, such as the ARM processor and all peripherals synchronous to the processor (ARM clock, APB and AHB peripheral clocks). One
specificity of our SoC is that it is divided into several subsystems that can run at different frequencies. Partitioning the design into several subsystems allows us to relax the constraints
on subsystems that don't require full performance. In this way, synthesis results in less area, less
power and fewer timing closure issues.
Basically, the presented SoC has 4 synchronous subsystems :
- ARM processor, which runs at up to 400 MHz,
- LCD controller, which runs at 266 MHz,
- system clock and 2/3 of the peripherals, which run at 133 MHz,
- 1/3 of the peripherals, which run at 66 MHz.
In functional mode, those clocks are generated by the APMC through dividers which are software programmable according to the user's needs. Dividers are cascaded as shown in Figure 5.
The figure shows the dividers programmed to provide the best performance of
the SoC, which gives our at-speed scan target frequencies.

Figure 5 - System Branch in our SoC


Scan at-speed OCC insertion
At-speed scan with OCC can be inserted using one of three approaches :
- use one OCC per target frequency, inserted at each output of the APMC,
- use only one OCC after the PLL and generate ATPG patterns in each frequency mode,
- use a custom OCC which is able to support synchronous clocks.


a) First solution : One OCC per clock domain


Figure 6 shows the location of the OCCs in this solution. They are placed before the outputs of
the clocks that go to each subsystem.

Figure 6 - Solution 1
In this case, all dividers have to keep their functionality in scan mode with the ratios shown in the
figure above, and 4 OCCs are needed. The drawbacks of this solution are :
- All subsystems of OCC sys are synchronous even if their clock frequencies are different,
and many data paths exist between them. With solution (1), the data paths from one
sub-domain to another cannot be tested at-speed, as there is no way to synchronize the
OCCs provided by Synopsys. Therefore, during at-speed scan test the subsystems would
be seen as asynchronous blocks, which is not the case in the SAMA5D3 design. In our case
this would lead to a loss of 20% in terms of at-speed fault coverage.
This issue may be solved by developing a fully synchronous OCC, which is time consuming and a possible source of bugs (see solution (3) at the end of this chapter).
- Dividers are not tested at-speed.


b) Second solution : Only one single OCC for all domains


The solution adopted in our SoC uses one single OCC placed at
the output of the PLL and implements several test modes to test the different target rates. Unlike
solution (1) with 4 OCCs, here each divider is bypassed in order to connect all subsystems to the
output of the OCC. This gives us the possibility to scan the dividers.
In this way, every subsystem is clocked by the same source, so inter-clock-domain paths can
be tested at-speed, which was our goal.
Figure 7 shows the APMC in scan mode configured for solution (2) : one OCC is connected to the
output of the PLL and the dividers are bypassed.

Figure 7 - Solution 2
Frequency partitioning is done in a second step during ATPG, using multi-pass at-speed pattern
generation (see chapter 7). ATPG is run 4 times, once per frequency subsystem, in
different scan modes that apply the OCC clock or the ATE tester clock according to the maximum frequency supported by each of the four clock domains. To do so, multiplexors are inserted into the APMC
on the clock paths to protect clock branches against unsupported clock rates.
The at-speed scan OCC mode selection is generated by a customized test mode controller through
external pins (not described here).


Figure 8 shows the location of the multiplexors in the APMC that select the clock source
according to the selected scan mode.

Figure 8 - Mux in APMC for clock selection


Each at-speed scan mode selection is now detailed in the following sections, Mode 0 to Mode 3.


Mode 0 : ARM scan mode


In this mode, armclock is connected to the OCC and all other clocks to the ATE tester clock, as shown
in Figure 9. The PLL is then programmed to lock at 400 MHz, which allows a 400 MHz at-speed scan
test on the ARM processor. The rest of the system is clocked by the ATE tester clock.

Figure 9 - Multi-pass at speed Scan mode : Mode 0


Mode 1 : ARM + LCD scan mode


In this mode, armclock and the LCD clock are connected to the OCC and all other clocks to the ATE tester
clock, as shown in Figure 10. The PLL is then programmed to lock at 266 MHz, which allows a
266 MHz at-speed scan test on the ARM processor and the LCD peripheral. The rest of the system is
clocked by the ATE tester clock.
All data paths between ARM and LCD are covered by the 266 MHz at-speed scan test.

Figure 10 - Multi-pass at speed Scan mode : Mode 1


Mode 2 : ARM + LCD + hclocks scan mode


In this mode, armclock, the LCD clock and hclocks are connected to the OCC, and pclocks to the ATE tester clock, as shown in Figure 11. The PLL is then programmed to lock at 133 MHz, which allows a
133 MHz at-speed scan test on the ARM processor, the LCD and the hclocks peripherals. The rest of the
system is clocked by the ATE tester clock.
All data paths between the ARM, LCD and hclocks peripherals are covered by the 133 MHz at-speed
scan test.

Figure 11 - Multi-pass at speed Scan mode : Mode 2


Mode 3 : slow peripherals mode


In this mode, all peripherals are clocked by the OCC, as shown in Figure 12. The PLL is then programmed to lock at 66 MHz, which allows a 66 MHz at-speed scan test on all peripherals.

Figure 12 - Multi-pass at speed Scan mode : Mode 3
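
The four modes follow one rule: a clock domain is connected to the OCC only when it supports the frequency the PLL is locked at, otherwise it stays on the ATE tester clock. The sketch below restates that rule with the frequencies given in the text. The dictionary keys (in particular `lcd_clock`), the function names, and the reading that in Mode 3 the ARM, LCD and hclocks domains also stay on the OCC are assumptions made for this illustration.

```python
# Maximum supported frequency per clock domain (MHz), from the subsystem list.
MAX_FREQ_MHZ = {"armclock": 400, "lcd_clock": 266, "hclocks": 133, "pclocks": 66}

# PLL lock frequency per scan mode (MHz), from the Mode 0 to Mode 3 descriptions.
PLL_FREQ_MHZ = {0: 400, 1: 266, 2: 133, 3: 66}


def clock_sources(mode):
    """Map each clock domain to 'OCC' or 'ATE' for the given scan mode."""
    pll = PLL_FREQ_MHZ[mode]
    # A domain is clocked by the OCC only if it supports the PLL rate;
    # otherwise the APMC multiplexor keeps it on the ATE tester clock.
    return {domain: ("OCC" if fmax >= pll else "ATE")
            for domain, fmax in MAX_FREQ_MHZ.items()}


def launch_to_capture_ns(mode):
    """Period between the launch and capture pulses, in nanoseconds."""
    return 1000.0 / PLL_FREQ_MHZ[mode]


if __name__ == "__main__":
    for mode in sorted(PLL_FREQ_MHZ):
        print(mode, clock_sources(mode), round(launch_to_capture_ns(mode), 2), "ns")
```

Each data path between two domains that are both on the OCC is exercised at the PLL rate of that mode, which is why the passes are cumulative from Mode 0 to Mode 3.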

c) Third solution : use of custom synchronous OCC


We may investigate this solution in the future. It is based on the design of a custom
OCC able to support synchronous clocks. In this case, we could insert as many OCCs as there are synchronous clock systems, as in solution (1), and generate ATPG patterns in one pass. Using several scan modes would no longer be necessary.
This solution would be the best one, but developing a synchronous OCC is time consuming and
error-prone, and the OCC has to be designed in such a way that it is recognized by the TetraMAX ATPG tool (the OCC
structure has to be described using specific STIL instructions and capture procedures, see
ref[2]). Given our planning constraints and risk assessment, the best trade-off
is solution (2).
The next step would be to study the feasibility of solution (3) for future developments.

4. Synthesis Flow and STIL generation


Once the scan test mode scenarios are defined, the designer should proceed to the design of the scan
test control. The test mode controller must provide the control of :
- OCC devices,
- scan mode selection in the multi-pass flow,
- PLL control in at-speed mode,
- APMC scan multiplexors.
This chapter is an overview of OCC insertion using DFT Compiler, with STIL generation for
later ATPG pattern creation. The scripts used are quite common, but it is always useful to share
hints and tricks to save time and debug effort.
OCC insertion
OCC insertion is controlled by the command set_dft_clock_controller, with the following parameters : chain_count for the number of clock chains, and the cycles per clock (the maximum number of
clock pulses the OCC can generate, which corresponds to the maximum sequential depth used by the
fast-sequential ATPG algorithm). The latter parameter can be found in the output STIL protocol file as
PLLCycles.
DFT Compiler commands are used to specify the OCC control signals and connections using options
of the set_dft_signal command (please refer to the DFT Compiler User Guide, ref[1]). The following
table shows all OCC control signals that have to be declared in the OCC insertion script :
Signal        Description
test_se       1: shift mode, 0: capture mode
fast_clk      PLL fast clock
reset         Reset (active low)
pll_bypass    OCC bypass (active high)
slow_clk      ATE slow clock
test_mode     Scan test mode

Table 2 - Synopsys OCC control


All signals listed in Table 2, except fast_clk, must be controlled from pads, possibly through
hookup pins.
The following subsections give some details on the OCC control signals :
test_se
This is the shift enable signal, also used for basic scan. test_se should be provided at the test
mode controller interface through a clock-tree pre-buffer for proper buffering.
fast_clk
This signal is to be connected to the output of the PLL. We have inserted muxes into the test mode controller in order to select fast_clk sources other than the PLL, for debug purposes, in case of
PLL lock issues. The OCC is inserted between the test_mode_controller and the APMC, as close as
possible to the PLL in order not to impact the functional clock path. The location of the OCC insertion point
is shown in Figure 13 :

Figure 13 - OCC insertion point


slow_clk
This signal is the ATE slow clock used in shift mode or in bypass mode. It must be
driven by a dedicated pad and is available at the test mode controller output as a hookup pin for
OCC insertion. It is declared as a scan_clock or as an oscillator (always-running clock) depending on
the exercised mode.
The pad used to drive the slow clock input of the OCC must drive only this input, to avoid DRC issues :
- This clock CANNOT drive any other asynchronous OCC. If the same clock were used to
drive the slow clock inputs of separate OCCs, the outputs of these OCCs would, in basic scan mode, be considered by ATPG as belonging to the same clock domain. Data paths
between asynchronous clock domains would then be exercised by ATPG !
- This clock CANNOT drive any other DFF input. In that case, DRC results in an unrecoverable error. In scan mode, paths from the ATE slow clock pad to functional logic must be cut
as illustrated in Figure 14. So as not to penalize coverage of the functional logic, any pad
may be used to drive this path.

Figure 14 - Blocking any data path from ATE clock


The OCC reference clock is declared using the ref clock keyword. If the period of the ref clock is
different from the test default period, it appears as a free-running clock in the output STIL file.
The low-cost tester used for the final production test of the SAMA5D3 chip does not allow asynchronous
clocks. Therefore, the OCC reference clock frequency has to be calculated in order to match the target
frequencies, and all other ATE clocks are synchronous to that clock.
The OCC fast clock is declared as an Oscillator. In RTL, the hookup pin of the fast clock must drive
the target clock before OCC insertion. The OCC hierarchical level depends on the location of the
fast_clk hookup pins.
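
The reference clock calculation mentioned above can be illustrated with a short sketch. It rests on assumptions that are not in the paper: the PLL output is taken as the reference frequency times an integer multiplication factor, the factor of 40 in the example is purely hypothetical, and "synchronous" is simplified to "the ATE clock period is an integer multiple of the reference period".

```python
from fractions import Fraction


def reference_period_ns(target_mhz, pll_multiplier):
    """Reference clock period (ns) such that f_ref * pll_multiplier == target.

    pll_multiplier is a hypothetical PLL multiplication factor.
    """
    return Fraction(1000 * pll_multiplier, target_mhz)


def is_synchronous(ate_period_ns, ref_period_ns):
    """True when the ATE period is an integer multiple of the reference period.

    Simplified criterion, assumed for this sketch only.
    """
    return (Fraction(ate_period_ns) / ref_period_ns).denominator == 1


if __name__ == "__main__":
    ref = reference_period_ns(target_mhz=400, pll_multiplier=40)
    print("reference period:", ref, "ns")
    print("100 ns ATE clock synchronous:", is_synchronous(100, ref))
    print("150 ns ATE clock synchronous:", is_synchronous(150, ref))
```

Exact fractions are used instead of floating point so that targets such as 266 MHz do not introduce rounding errors in the divisibility check.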

pll_bypass
In the Synopsys flow, this signal must be directly driven by the tester from one pad and cannot
result from the logical combination of several pads. The bypass signal appears in the STIL output file
as a constant set to 0 in the at_speed section and to 1 in the occ_bypass section. It must
be available at the test mode controller interface to provide a hookup pin for OCC insertion.
Also in the Synopsys flow, it is not possible to assign a separate bypass signal to each OCC : all
OCCs must share the same bypass signal. It is however possible to work around this constraint within
DC once OCC insertion has been performed, by disconnecting the bypass signal from the OCC
and reconnecting another signal to the OCC input. This is what we have done in our flow.
reset
According to the Synopsys recommendation, the OCC reset can be shared with other resets in scan
test mode.
- In basic scan mode : when the OCC is bypassed, there is no requirement to keep the OCC reset constant, so the reset can be used to reset the functional logic.
- In at_speed mode : the functional reset does not need to be tested at-speed because it is
considered static; it is then tested with the stuck-at model only.
test_mode
The test mode input of the OCC device is the scan test mode signal at the top level.
Unfortunately, in the Synopsys flow, this signal is expected to be directly driven by one pad of
the design. To avoid using an additional pad to drive that signal, a workaround to this
limitation is to declare a pad that is already used to set up the design in scan test mode.


STIL files generation


Two STIL files are generated by DC, one with compression and one without. In each file, two
procedures are described, one at-speed and one in basic scan mode (occ_bypass) :
top_Internal_Scan.stil :
PatternExec Internal_scan {
    PatternBurst Internal_scan;
}
PatternExec Internal_scan_occ_bypass {
    PatternBurst Internal_scan_occ_bypass;
}

top_ScanCompression.stil :
PatternExec ScanCompression_mode {
    PatternBurst ScanCompression_mode;
}
PatternExec ScanCompression_mode_occ_bypass {
    PatternBurst ScanCompression_mode_occ_bypass;
}

Specificity of our multi-pass ATPG flow :


We do not describe all frequency modes during OCC insertion: this is not needed for DFT insertion
and would only add complexity to the OCC insertion script. We therefore post-process the STIL
files in order to derive one STIL file per scan frequency mode. Basically,
only the signals that control the frequency modes are affected.
During ATPG, the STIL file corresponding to the frequency mode currently being generated
is used to pass the DRC step :
Without compression
set_drc $path_stil/top_Internal_Scan_mode$i.stil
run_drc -patternexec Internal_Scan_mode

or With compression
set_drc $path_stil/top_ScanCompression_mode$i.stil
run_drc -patternexec ScanCompression_mode
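
The STIL post-processing step can be sketched as follows. This is only a minimal illustration in Python, not our production script: the control signal names freq_sel1 and freq_sel0, their encoding of the four modes, and the assumption that each of them appears in the STIL file as a quoted "name"=value; assignment are all hypothetical.

```python
import re

# Hypothetical encoding of the four frequency modes on two control signals.
MODE_SIGNALS = {
    0: {"freq_sel1": "0", "freq_sel0": "0"},
    1: {"freq_sel1": "0", "freq_sel0": "1"},
    2: {"freq_sel1": "1", "freq_sel0": "0"},
    3: {"freq_sel1": "1", "freq_sel0": "1"},
}


def derive_mode_stil(stil_text, mode):
    """Return a copy of stil_text with the mode-control signals forced for `mode`."""
    out = stil_text
    for signal, value in MODE_SIGNALS[mode].items():
        # Rewrite every "signal"=<value> assignment; other signals are untouched.
        pattern = re.compile(r'("%s"\s*=\s*)[01XxNn]' % re.escape(signal))
        out = pattern.sub(lambda m, v=value: m.group(1) + v, out)
    return out


def derive_all_modes(stil_text, basename):
    """Map each per-mode file name to its derived STIL content."""
    return {
        "%s_mode%d.stil" % (basename, mode): derive_mode_stil(stil_text, mode)
        for mode in MODE_SIGNALS
    }
```

Calling derive_all_modes on the content of top_Internal_Scan.stil with the base name top_Internal_Scan yields the four contents to be written as top_Internal_Scan_mode0.stil to top_Internal_Scan_mode3.stil, which are the file names expected by the set_drc commands above.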


5. STA Constraints
Timing closure of the scan modes is verified with STA constraints. The reference methodology is
inspired by the approach described in Solvnet, see ref[4].
Two STA scenarios are constrained: one for scan shift and one for scan capture. Timing paths from the scan
enable test_se to the scan clocks must be analyzed. It is therefore strongly advised to have at least one
scenario in which no case analysis is set on test_se.
scan shift scenario
In the scan_shift scenario, test_se is forced to 1. All capture paths are then removed
from the analysis. Only ATE clocks (either ATE basic scan clocks or ATE OCC slow clocks) must
be declared. Declaring fast clocks in shift mode makes no sense.
All ATE clocks are declared as belonging to the same clock domain, so that shift paths between
scan chains driven by separate scan clocks are analyzed.
In this scenario, there is nothing special regarding multi-frequency target modes.
scan capture scenario
In the scan_capture scenario, test_se is not constrained, so that the transition of this signal is analyzed. All (fast and slow) clocks are declared :
ATE basic scan clocks,
ATE OCC slow clocks,
OCC fast clocks.
Note that it is safe to declare ATE OCC slow clocks in the scan capture scenario in order to time the design in OCC bypass modes. This may be needed because the clock paths may differ between bypass
and at-speed modes, so the skew between clocks may also differ.
The ATE OCC slow clocks and ATE basic scan clocks are individually assigned to separate
clock domains so that paths between ATE clocks are not seen. This is equivalent to the
set_drc -nodisturb_clock_grouping command of ATPG, where the ATPG is asked not to activate paths between separate scan clocks. As a consequence, shift paths between scan chains driven by
distinct clocks are not analyzed in this scenario, which is why a dedicated shift scenario is still required.


To manage the specificity of our design, which has synchronous subsystems running at different
frequencies, clock declarations are duplicated so as to handle the target frequency of each mode.
For example, the PLL frequency has to be 400 MHz in mode 0, 266 MHz in mode 1,
133 MHz in mode 2 and 66 MHz in mode 3. Therefore 4 clocks are created at the output of the PLL, as
shown in Figure 15. The same principle is applied to all scan capture clocks defined at STA level.
All clocks are declared in the same scan capture scenario. False paths are added between the clocks
of one mode and those of the other modes, because they are mutually exclusive.
The following picture shows the location of the clocks defined in STA. Note that multiple clocks are
defined at the output of the PLL to manage all the different frequency modes.

Figure 15 STA clocks at top level
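
The duplication of clocks per frequency mode lends itself to scripted generation. The sketch below, written in Python for illustration, emits one create_clock per mode on the same pin and the false paths between clocks of different modes. The pin name i_pll/ck and the clock naming scheme are hypothetical, and the periods are simply derived from the target frequencies quoted above; our actual constraints contain more clocks than this.

```python
from itertools import permutations

# Target PLL frequency (MHz) of each scan mode, as listed in the text.
MODE_FREQ_MHZ = {0: 400, 1: 266, 2: 133, 3: 66}


def clock_name(mode):
    return "pll_ck_mode%d" % mode  # hypothetical naming scheme


def sdc_for_pll(pin="i_pll/ck"):  # hypothetical PLL output pin
    """Return the SDC lines declaring one clock per mode on `pin`."""
    lines = []
    for mode, mhz in sorted(MODE_FREQ_MHZ.items()):
        period_ns = round(1000.0 / mhz, 3)
        # -add keeps the clocks already declared on the same pin.
        lines.append("create_clock -name %s -period %s -add [get_pins %s]"
                     % (clock_name(mode), period_ns, pin))
    # Clocks of different modes never coexist: cut paths in both directions.
    for src, dst in permutations(sorted(MODE_FREQ_MHZ), 2):
        lines.append("set_false_path -from [get_clocks %s] -to [get_clocks %s]"
                     % (clock_name(src), clock_name(dst)))
    return lines
```

With four modes, this produces 4 clock declarations and 12 false path statements for a single clock source, which illustrates why the number of constraints grows quickly when the same principle is applied to every scan capture clock.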


In addition to the clocks defined in Figure 15, generated clocks are defined inside APMC at the
output of the muxes in order to propagate the correct clock through the scan multiplexors according
to each frequency mode, as shown in Figure 16. For the same reason as explained previously, all clocks are duplicated to manage all the different frequency modes.

Figure 16 STA clocks inside APMC


Remark : in Figures 15 and 16, only 3 clocks are drawn at the output of the PLL, OCC and
muxes, in order to simplify the schematics. In our design, the number of duplicated clocks is 4.


IO constraints
In order to constrain I/Os, virtual clocks are created according to the tester target timing set, as
shown in Figure 17.

Figure 17 - I/O constraints

SDC generation for ATPG


In at-speed capture mode, all functional data paths are exercised at the nominal speed rate.
When multicycle paths are used in functional STA mode to relax some paths, they should therefore also
apply in at-speed scan mode, so as not to over-constrain the design in PnR. STA scripts in scan
test mode should then carry over all functional multicycle paths.
To see how to deal with multicycle paths in STA scan mode, we must first analyze how Tetramax understands and handles timing exceptions. Indeed, STA constraints in scan
mode must be aligned with the Tetramax scripts.
The Tetramax commands add_slow_path, add_slow_cell and add_slow_bidis can be used to tell
Tetramax that the specified paths or cells cannot be exercised at-speed. These paths won't be activated at speed but, to increase coverage, Tetramax tries to stimulate them at slow rate, i.e.
at ATE rate. Consequently, these ATPG commands are not equivalent to multicycle paths in
STA.
The recommended Synopsys flow is to read SDC files in Tetramax. These SDC files can specify
multicycle and false paths. Today, multicycle paths are interpreted as false paths by Tetramax, which
directly reduces the at-speed fault coverage, because false paths are not tested at-speed. Fortunately, our multi-pass ATPG flow, explained in chapter 7, allows us to test those multicycle paths
in a lower frequency mode. For example, a multicycle path defined in mode 0 (400 MHz) will
not be tested at-speed in this mode, but it will be tested in mode 2 (133 MHz) provided it is not defined as
a multicycle path in that mode.
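
The reasoning behind testing a multicycle path in a slower mode can be illustrated with simple arithmetic, using the mode frequencies listed earlier in this chapter (400, 266, 133 and 66 MHz for modes 0 to 3). The Python sketch below is a simplification under stated assumptions: it only compares the setup budget of an N-cycle path with the clock period of the slower modes, and ignores skew, margins and hold multipliers. The function names are ours and purely illustrative.

```python
# Target frequency (MHz) of each scan mode, as listed in the text.
MODE_FREQ_MHZ = {0: 400, 1: 266, 2: 133, 3: 66}


def period_ns(mode):
    return 1000.0 / MODE_FREQ_MHZ[mode]


def first_single_cycle_mode(declared_mode, multiplier):
    """First slower mode whose single-cycle period covers the multicycle budget.

    declared_mode: mode in which the path is declared as multicycle.
    multiplier:    setup multiplier N of the multicycle exception.
    Returns the mode number, or None if no slower mode is slow enough.
    """
    budget_ns = multiplier * period_ns(declared_mode)
    for mode in sorted(MODE_FREQ_MHZ):
        if mode > declared_mode and period_ns(mode) >= budget_ns:
            return mode
    return None
```

For instance, a 2-cycle path at 400 MHz has a budget of 5 ns. The 266 MHz period (about 3.76 ns) is too short, but the 133 MHz period (about 7.52 ns) covers it, so the path can be exercised as a normal single-cycle path in the 133 MHz mode.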
TCL procedures are available in Tetramax installation directory to generate SDC files for ATPG
(pt2tmax.tcl file defines write_delay_paths, write_exceptions_from_violations, write_exceptions
and write_exceptions_from_violations_from_to TCL procedures).
For more information on this section, refer to read_sdc flow as described in Tetramax User
Guide, see ref[2].
Clock gating checks
A clock gating check has to be performed on the OCC output. The aim is to check that the clock
gating of the fast_clk input and the clock gating of the slow_clk input of the OCC are correctly performed.
Depending on how the OCC is synthesized, the clock gating check is not necessarily inferred by PrimeTime. In that case, a user constraint should force PT to perform the check. For instance :
# Pins of the OCC clock gating logic on which the check is forced
set OCC_cgc_list i_core/i_muxdebug/pll_controller_sys/U2/A2
lappend OCC_cgc_list i_core/i_muxdebug/pll_controller_sys/U2/A3
lappend OCC_cgc_list i_core/i_muxdebug/pll_controller_sys/U2/B2
# Apply the same setup/hold clock gating check to each pin of the list
foreach OCC_CGC_pin $OCC_cgc_list {
    set_clock_gating_check -setup 0.1 -hold 0.1 -high $OCC_CGC_pin
}

Note : Since DFT Compiler version D-2010.03-SP2, the synthesis of latch-based clock gating
cells can be forced by setting the test_occ_insert_clock_gating_cells variable to true. This option
has not been tested yet.


6. CTS Constraints
The aims of CTS constraints regarding OCC design are :
balancing all flip flops in OCC driven by the fast_clk OCC input.
balancing all flip flops in OCC driven by the slow_clk input. For now, there is only one
DFF driven by this clock in the design (slow_clk_enable_l_reg), but one can imagine that
this number may increase in later implementations.
balancing all flip flops in the clock chain, namely in module top_DFT_clk_chain.
These DFFs are driven by the output of OCC. Note that they do not need to be balanced with all
DFFs driven by OCC output. The clock chain is chained with other chain segments of the
same domain, through lockup latches.
tagging the nets from pads (reference clock) to PLL input.
tagging the nets from PLL outputs to OCC fast_clk input.
tagging the net from ATE clock pad to OCC slow_clk input.
tagging the net OCC output (clk) to APMC scan multiplexor.
Note that all scan clocks should also be tagged.
CTS roots at top level
CTS roots must be declared on clock sources at top level.
For fast clocks, these sources are typically PLL outputs (ck), PLL divided outputs (out- div), or
the output of a custom divider on PLL outputs. A clock root must be declared on each potential
fast clock source. Declaring CTS roots on fast clock sources does not aim at balancing the OCC
inputs with other leaf pins of the same root, but at tagging the clock nets.
For the slow clock source, the root pin must be declared on the cin output pin of the pad that
drives the ATE slow clock. As shown in Figure 18, scan multiplexors were placed in the test
mode controller between the clock sources and the OCC inputs. They separate the functional and
scan paths of the clocks (fast and slow). In functional mode, the OCC clock inputs are stuck at 1'b0.


Figure 18 - CTS constraints example


Excluded pins are placed at the inputs of the multiplexors. The goal is to avoid balancing the DFFs
inside the OCC with other DFFs driven by the same clock sources in functional mode or in other test
modes. At the same time, the clock nets from the clock sources to the test mode controller are correctly tagged
as clock nets.
Then, root pins are declared on the outputs of these multiplexors to properly balance the DFFs on fast_clk
and the DFFs on slow_clk inputs.


CTS constraints in OCC


The input pins of the multiplexor in OCC that drives the clk output must be declared as excluded
pins. There is no need to balance these pins with DFFs in OCC. Besides, this declaration will
avoid clock reconvergence between fast_clk and slow_clk.
The multiplexor that drives the clk output must be declared as a root pin. This net drives all DFFs
of the clock chain that must be balanced together.
In APMC, the scan multiplexor input must be declared as an excluded pin. This avoids reconvergence between the OCC output and the functional clock paths in APMC.


7. Multi-Pass ATPG Flow


The specificity of our ATPG flow is related to the multi-frequency target modes of the SAMA5D3
design. A Multi-Pass flow from mode 0 to mode 3 has therefore been used in order to manage
all frequency targets correctly. This chapter explains the different steps of the Multi-Pass flow.
AT-speed ATPG Flow - Build Model
TetraMax expects the fast clock at the OCC input to be driven by a black box (PLL). If any
logic other than a black box drives the fast_clk input, DRC fails. We have therefore used the set_build
-instance_modify TIEX command on the output pin that drives the fast_clk input of the OCC, as shown
in Figure 19.

Figure 19 - using the set_build instance_modify TIEX command


AT-speed ATPG Flow - DRC Step
The following parameters have been used for the DRC step. There is nothing special regarding the Multi-pass flow, except that the STIL file used for the first DRC is the one for mode 0.
Use the system_clock launch mode : set_delay -launch_cycle system_clock
o Two system clock pulses are generated during capture : one to launch and one to
capture data on at-speed paths


Primary Outputs (POs) must not be used in at-speed mode, because the at-speed
scan rate is too fast to exercise paths from registers to pads. Only reg-to-reg paths must be exercised and tested in at-speed mode :
o add_po_mask -all
For the same reason, the patterns must not change Primary Inputs (PIs) between the launch clock and the capture clock, either for clock launch transition fault
patterns or for path delay fault patterns :
o set_delay -nopi_changes
o add_pi_constraints X
The patterns must not propagate values through bidirectional ports, since these paths may not be
functional and may take more than one at-speed cycle :
o add_slow_bidi
Disable inter-clock domain testing and avoid generating patterns that use one clock to
launch and another clock to capture :
o set_delay -common_launch_capture
Disable the parallel launch and capture of several clocks in the same tester cycle (even
when the clocks belong to separate groups). This parameter is a choice that could have
been different :
o set_delay -noallow_multiple_common_clocks
Read the timing exceptions generated during STA :
o read_sdc file.sdc
Set the DRC file :
o set_drc ./STIL/top_ScanCompression_mode0.stil -nodisturb_clock_grouping
DRC run :
o run_drc -patternexec ScanCompression_mode
AT-speed ATPG Flow - ATPG Step
The following parameters have been set for the ATPG generation :
The fault model used is the transition fault model
The width of the capture window must be large enough to enable proper resynchronization of
test_se, according to the skew between test_se and the clocks of the design at FF side (this parameter can be reduced afterward according to the timing margin with respect to the scan
enable change) :
o set_atpg -min_ateclock_cycles 7
Select the depth of the fast sequential algorithm :
o set_atpg -capture_cycles 3


Fault dictionary creation has been limited to at-speed domains; the command add_fault
-launch OCC_clock -capture OCC_clocks has therefore been used instead of add_fault -all. This command has to be executed 4 times with different parameters in order to add all faults, as shown
in Figure 20.
D1 adds faults from FF to FF both clocked by OCC_clock
D2 adds faults from FF clocked by OCC_clock to FF clocked by ATE clock
D3 adds faults from FF clocked by ATE clock to FF clocked by OCC_clock
D4 adds faults from FF clocked by OCC_clock to FF clocked by ATE clock, with logic
cone impacted by some FF clocked by ATE clock

Figure 20 - add_faults command select exclusive fault sets


AT-speed ATPG Flow - Multi-Pass pattern generation


This section focuses on the multi-pass ATPG incremental flow, which tests our system at different
clock rates. Four modes are available in the design :
Mode 0 to test the ARM processor at 400MHz
Mode 1 to test the LCD at 266 MHz
Mode 2 to test AHB and APB peripherals at 133 MHz
Mode 3 to test peripherals not running over 66 MHz.
These scan modes are executed in sequence, starting with the fastest one (mode 0). Faults that
are detected at 400 MHz do not need to be tested at 266, 133, or 66 MHz. So all faults detected
within a run can be removed from the dictionary of the next runs. Then the multi pass flow is :
1. Build the dictionary of the 400 MHz domain as explained in previous section,
2. run ATPG on mode 0,
3. write the mode 0 dictionary resulting from atpg,
4. Build the dictionary of the 266 MHz domain. Remove from this dictionary the mode 0
fault dictionary,
5. run ATPG on mode 1,
6. write the mode 1 dictionary resulting from atpg,
7. Build the dictionary of the 133 MHz domain. Remove from this dictionary the mode 1
fault dictionary,
8. run ATPG on mode 2,
9. write the mode 2 dictionary resulting from atpg,
10. Build the dictionary of the 66 MHz domain. Remove from this dictionary the mode 2 fault
dictionary,
11. run ATPG on mode 3,
12. write the mode 3 dictionary resulting from atpg (pass 3).
The multi-pass at-speed flow is illustrated in Figure 21. Notice that the loop starts at mode 1
because mode 0 has been previously run without the need to remove any dictionary.
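
The sequence of steps above can be modelled in a few lines. The Python sketch below is not a TetraMax script: the ATPG run is replaced by a stand-in set of faults that each pass is assumed to detect, faults are plain strings, and the codes "DT" (detected) and "ND" (not detected) are used only as illustrative labels. It assumes that only the faults detected in the previous pass are removed from the current dictionary.

```python
def run_multi_pass(domain_faults, detectable):
    """Model of the multi-pass flow, one pass per frequency mode.

    domain_faults[i]: set of faults built for the frequency domain of mode i.
    detectable[i]:    set of faults the ATPG run of mode i manages to detect
                      (a stand-in for the real ATPG engine).
    Returns one {fault: code} dictionary per mode, as written after each pass.
    """
    written = []
    for mode, faults in enumerate(domain_faults):
        dictionary = set(faults)
        if written:
            # Remove what the previous pass already detected (skipped for mode 0).
            previous = written[-1]
            dictionary -= {f for f, code in previous.items() if code == "DT"}
        # "Run ATPG" on this mode and record a code for each remaining fault.
        result = {f: ("DT" if f in detectable[mode] else "ND")
                  for f in sorted(dictionary)}
        written.append(result)
    return written
```

In this model, a fault that belongs to both the 400 MHz and the 266 MHz dictionaries and is detected in mode 0 never reaches the ATPG run of mode 1, which is the source of the pattern count savings of the incremental flow.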


Figure 21 - Multi-Passes flow on our SoC using all Modes.


To remove the dictionary resulting from the previous pass from the current dictionary, the
read_faults -delete command is used. Two options may be considered :
read_faults -delete :
o The whole dictionary from the previous pass is removed, independently of the status of the faults. Only the fault location information is used; the fault code is ignored. That is, even faults that haven't been detected in the previous pass are removed. The underlying argument is that a fault that wasn't detected on the previous
pass ($i-1) is unlikely to be detected with the same atpg settings, effort and
coverage on pass $i.
read_faults -delete -force_retain_code :
o Only faults that were detected on pass $(i-1) are removed from the dictionary of pass
$i. Faults that were not detected remain in the current dictionary and remain
candidates for detection.
The second option is more appropriate if timing exceptions are taken into account. Consider for instance a multicycle path at 266 MHz (mode 1). This path is not processed by ATPG in mode 1,
and all faults that remain undetected after pass 1 because of such timing exceptions should remain in the dictionary in order to get a chance to be detected at 133 MHz. This option should be selected if the timing exceptions significantly reduce the fault coverage. This option
has been used in the design.
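
The difference between the two options can be made concrete with a small model. The Python sketch below does not call TetraMax: it represents the dictionary written by the previous pass as a mapping from fault location to a code, with "DT" and "ND" as illustrative codes for detected and not detected faults, and the function name is ours.

```python
def remove_previous(current, previous, retain_codes):
    """Return the current fault set after removing the previous pass.

    current:      set of fault locations of the current pass.
    previous:     {fault location: code} written by the previous pass.
    retain_codes: False models the first option (locations only),
                  True models the second option (detected faults only).
    """
    if retain_codes:
        # Second option: only faults detected in the previous pass are dropped.
        dropped = {f for f, code in previous.items() if code == "DT"}
    else:
        # First option: every fault of the previous pass is dropped.
        dropped = set(previous)
    return current - dropped
```

Take a fault y that sits on a multicycle path and was therefore left undetected at 266 MHz. With the first option it disappears from the 133 MHz dictionary; with the second option it stays in that dictionary and can still be detected in the slower mode.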
AT-speed ATPG Flow - Incremental flow for 95% stuck-at coverage
Once all at-speed atpg passes are completed, a final incremental stuck-at pass is performed in order to catch stuck-at defects with a coverage of 95%. Incremental means here that the already
detected transition faults are used as the starting point of the last ATPG pass. All dictionaries
resulting from the at-speed passes are removed from the stuck-at dictionary after conversion of
the transition faults into stuck-at faults. For this, we use the command :
update_faults -direct_credit -external mode$i.dict
The incremental flow used to reach 95% stuck-at coverage is depicted in Figure 22.

Figure 22 - Top-off stuck-at Incremental ATPG



ATPG results
The following are the results, in terms of coverage and pattern count, of the Multi-Pass ATPG flow
versus the basic scan flow without OCC :

We can observe that the pattern count increases by a factor of 2.4 between the basic stuck-at and the transition fault
models, which is what is generally seen when comparing stuck-at and transition pattern
counts. The scan test time with the transition fault model is still acceptable thanks to the compression factor of
20X used on the SAMA5D3 chip : in scan compression mode, the internal scan chains are
100 flip-flops long, which gives us a scan test time of around 2 ms. Please note that the test time
with the transition fault model would not have been acceptable without compression (it has
been estimated at a minimum of 50 ms, compared to 2 ms).
These results were obtained with the following releases :
DFT compiler / Design Compiler : 2011.09-SP3
TetraMax : 2012.06-SP5

Early Silicon results


Statistics are being collected on the SAMA5D3 product in order to evaluate the usefulness of the transition fault model versus the stuck-at fault model. The first figures show that the transition fault model is
necessary: around 1% of the parts are caught by the at-speed scan test. The following table shows
the percentage of parts caught by the different scan modes. Those figures have been calculated
on 2 wafers only; they will have to be consolidated on more parts.
Mode                         % Loss transition versus stuck-at   Comments
Mode 0 : ARM 400 MHz         0.4%                                High frequency : timing critical
Mode 1 : LCD 266 MHz         0%                                  Very small area of chip
Mode 2 : hclocks 133 MHz     0.8%                                4/5 of the area of the digital
Mode 3 : pclocks 66 MHz      0%                                  Low frequency : result close to stuck-at

Table 3 - Parts caught by Scan At-speed test modes


We noticed that mode 0 catches a significant number of parts. The explanation is the following :
as the subsystem tested by this mode runs at the highest frequency of the chip, many timing critical paths are covered during scan at-speed test mode 0.
Then, we observed that mode 2 catches more parts than the other modes. This can be explained
by the fact that this mode tests a large part of the digital logic of the chip (around 4/5 of the digital area).
Mode 1 represents a very small amount of the digital area, which explains why no parts were
caught by this mode.
Mode 3 represents 1/5 of the digital area of the chip and its frequency is low; we can therefore
expect its results to be close to those of the stuck-at fault model.
The following figure shows an example of the comparison of a stuck-at wafer map with an at-speed wafer map made on the SAMA5D3 product. On the at-speed wafer map, we can see that 2
additional parts have been caught by the scan at-speed test on the edge of the wafer.

Figure 23 - Wafer maps of stuck-at versus transition fault models (panels : stuck-at wafer map, at-speed wafer map)


8. Pattern validation
We have used the TetraMax MAXTestbench feature to create a Verilog testbench in order to validate
the ATPG patterns with fully back-annotated simulations. Please refer to ref[2] in case you need
more details about the MAXTestbench flow. An alternate solution is to use STILDPV (usage of the STIL
PLI), see ref[5].

9. Conclusion and Future Work


Implementing OCC in a complex SoC can be a challenging task involving multiple implementation aspects :
Front-End related ones : DFT architecture design during synthesis, static timing analysis, timing exceptions handling, multi-pass ATPG, test coverage.
Back-End related ones : the OCC structure is quite a sensitive part of the design and special
care must also be taken during place and route.
Pattern validation is also required to check the proper functionality of the inserted OCC, and
especially that no glitch occurs.
We have successfully deployed the Synopsys OCC IP in our own DFT implementation flows at
ATMEL Rousset, leading to the ATMEL SAMA5D3 product. For more information please see :
http://www.atmel.com/Microsite/sama5d3/default.aspx.
It is a little early to share feedback from silicon returns (number of patterns vs. coverage vs. rejects), as the product was introduced on the market at the beginning of 2013.
For the future, we plan to implement a custom OCC to handle inter-domain at-speed coverage,
to improve the handling of functional timing exceptions in the flow, and to look at complementary fault models such as Small Delay Defect (SDD).


10. Acknowledgements
We would like to thank Anne Lafage for her preliminary work on the OCC flow at
ATMEL, which served as the basis for this paper, see ref[0]. We also thank all the ATMEL design team
and management involved, especially Fabrice Vigneron (ATMEL Design Team Manager) for his
help and advice, as well as our local Synopsys support team.

11. Acronyms
DFT : Design for Test
SPF : STIL Protocol File
FF : Flip-Flop
BB : Black Box
PnR : Place and Route
ATPG : Automatic Test Pattern Generation
ATE : Automatic Test Equipment
IP : Intellectual Property (design block)
DC : Design Compiler, Synopsys Synthesis Tool
DFT Compiler : DFT insertion tool embedded in DC
TetraMAX : Synopsys ATPG tool
SoC : System on a Chip
APMC : Advanced Power Management Controller
SDD : Small Delay Defect


12. References
[0] Scan At-Speed Guidelines V1.0, Anne Lafage, January 2012, ATMEL MCU Design
Group, ATMEL Internal Confidential Document
[1] Synopsys Tetramax 2012.06-SP5 User Guide,
https://solvnet.synopsys.com/dow_retrieve/latest/tmax/tmax_olh/Default_CSH.htm
[2] Synopsys DFT Compiler 2012.06-SP5 User Guide,
https://solvnet.synopsys.com/dow_retrieve/latest/dftxg1/dftxg1.html
[3] Synopsys Design Compiler 2012.06-SP5 User Guide,
https://solvnet.synopsys.com/dow_retrieve/latest/dcug/dcug.html
[4] Solvnet ID 022490 Static Timing Analysis Constraints for On-Chip,
https://solvnet.synopsys.com/retrieve/022490.html?otSearchResultSrc=advSearch&otSearchResultNumber=4&otPageNum=1
[5] Test Pattern Validation User Guide 2012.06-SP5
[6] On-Chip-Clock controller (OCC) : An Alternative Approach, SNUG Israel 2012
[7] Using Custom OCC with TetraMax for At-speed Transition Fault testing and Small
Delay Defect, SNUG France 2011
[8] How to Handle OCC (On Chip Clocking) by using DFT Compiler and TetraMax on
Sophisticated SoC Design, SNUG Singapore 2008
[9] Implementation of At-Speed Test using an On-Chip Clock Controller on an Analog Device Audio Processor, SNUG Europe 2008
[10] Automatic Insertion Flow of on Chip Controller for At-Speed Testing, SNUG Europe
2007
[11] ATMEL SAMA5D3 Product - The next Benchmark in Versatility,
http://www.atmel.com/Microsite/sama5d3/default.aspx
