
Interview Question Bank on ATPG & SCAN

Questions
1- What is ATPG?

2- What is Scan Insertion and Scan Chain?

3- What is Full and Partial Scan?

4- What is Combinational ATPG and Sequential ATPG? Which has fewer patterns? Why?

5- Why do we use combinational ATPG for full scan?

6- What is Fault Coverage and Test Coverage?

7- Explain the different fault models:
stuck-at fault, transition fault, IDDQ fault, path delay fault, bridging fault. Also explain the difference between stuck-at and transition faults.

8- How did you start analysing coverage?

9- What is a hierarchical report?

10- Why are some flops left non-scanned?

11- How did you get coverage on non-scanned paths?

12- What is 'no-faults'? How can you decide to 'no-fault' something? What about the coverage of that area?

13- How did you cover complex combinational-logic blocks? OR how did you get full controllability and observability on a complex combinational logic block?

14- How did you handle issues related to pin constraints to improve coverage?

15- Were there any non-transparent latches in your design? How did you handle issues with them?

16- How do you improve coverage? Summary.

17- What is a lock-up latch? Why do we use them?

18- What do you do while performing flop stitching when there are some negative-edge and some positive-edge flops? How will you handle this situation?

19. Explain what fault collapsing is, in terms of fault dominance and fault equivalence.

20. If scan was failing, and you slow down the clock and it starts to pass, what was the cause of the failure in the beginning: setup or hold time?

21. Give three important clock DRC rules and how to fix them.

22. What is a STIL procedure file? What does it contain?

23. What is scannability checking or scan integrity? How do you check it?

24. How important is scan chain balancing in DFT? How does it affect a design if the chains are not balanced?

25. Let's say there is a chain with 8 flops and one of them has a hold violation. Assuming you have enough data patterns to fill the other chains to find the capturing failure, how will you do it?
OR
How do you figure out which is the bad flop?

26. Which flops to avoid for scan?

27. How do we make sure that each flop is getting clock and reset? Is a separate test clock used, or is it the functional clock?

28. How to decide the number of chains?

29. What is the difference between a normal flop and a scan flop?

30. What is top-off ATPG?

31. Why don't we add buffers instead of lock-up latches?

32. What exactly is an at-speed test?

33. What is the difference between a defect, a fault and a failure?

34. How do we apply inputs during simulation and during ATE time?
OR
What is serial loading and parallel loading?

35. How does reset affect coverage?
OR
Why did we apply reset from the top level?

36. What is block level ATPG?

37. Explain fault classes. Mention the fault class hierarchy.

38. What is controllability and observability?

39. What is LOC and LOS?

40. What are the pros and cons of LOC & LOS?

41. Why, compared to LOC, does LOS have more coverage, fewer patterns and better controllability?

What are the differences between an ATPG library and a Verilog library?

How is block-level testing useful when we are also doing top-level testing? When doing block level we create patterns, but at top level we create patterns again, so what is the use of doing block-level testing?

1. In your design you have dual-port memories, each port working at a different frequency. What clock frequency do you use for testing (MBIST)?

Why do we go for MBIST?

What are wrapper chains?

What is the difference between a pre-DRC check and a post-DRC check in DFT Compiler?

Why is there a difference in pattern count and test coverage between the two methods, LOS and LOC?

How do you cover the faults on inter-clock-domain crossings?

What are basic patterns and simulated patterns?

How does technology impact DFT?

How do you get coverage on the reset pin?

Answers
ATPG (automatic test pattern generation) is a process which takes a gate-level netlist along with some I/O constraints, clock definitions and scan definitions, and generates test patterns which can be used to find manufacturing defects in real silicon.
It also produces a fault coverage report that tells you how good your test is, and which nets are covered and not covered by the test patterns.
The process of replacing an ordinary sequential element with a scan sequential element, for the sake of better controllability and observability, by adding scan signals (SE, SDI, SDO) and a mux to make it scannable, is called Scan Insertion. The series of scannable sequential elements stitched together is called a Scan Chain.
Full Scan - If all the sequential elements are converted into scannable elements, then the test architecture is known as full scan.
Partial Scan - If some non-scanned sequential elements are left in the design for some reason, then the test architecture is known as partial scan.
Combinational ATPG - The idea is to control and observe the values in all the sequential elements of a full-scan design, so the ATPG tool only sees the combinational logic between the sequential elements and generates combinational patterns.
This is also the reason why the number of patterns is lower for combinational ATPG.
Sequential ATPG - We use this for partial-scan designs, where between two scan flip-flops there are non-scan flip-flops along with the combinational logic.
Combinational patterns alone are not enough for them; we need sequential patterns, so the ATPG tool has to generate patterns with multiple clock pulses.
The pattern count and runtime are therefore much higher than for combinational ATPG.
If all the sequential elements in the design are converted into scannable sequential elements, then the design is effectively reduced to combinational-only sets of circuits surrounded by primary I/Os.
This simplification allows the combinational ATPG tool to be used in a more effective way.
Fault Coverage - A test pattern set should target every possible fault in the design, but at times it might not be possible to target every possible fault.
The ratio of faults detected to the total possible number of faults is called fault coverage.
Fault Coverage = Faults detected / Total no. of faults.
Test Coverage = Faults detected / Detectable faults.
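As a quick illustration (not tool output; the numbers below are made up), the two formulas differ only in whether provably undetectable faults are excluded from the denominator:

```python
# Hypothetical fault-list numbers, for illustration only.
total_faults = 10000   # every fault in the fault list
undetectable = 200     # UD class: provably untestable (tied/redundant logic)
detected     = 9500    # DT class: detected by the generated patterns

fault_coverage = detected / total_faults                   # 9500 / 10000 = 95.0%
test_coverage  = detected / (total_faults - undetectable)  # 9500 / 9800 ~ 96.9%

print(f"Fault coverage: {fault_coverage:.2%}")
print(f"Test coverage:  {test_coverage:.2%}")
```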
Stuck-At Fault - It is a static fault; as the name suggests, a particular value, either high (1) or low (0), is stuck on a node.
So we want to check that a particular node can toggle from 0 to 1 and from 1 to 0.
Transition Fault - Here the node and the toggle are the same; we still have to ensure that the 0-to-1 and 1-to-0 transitions happen, but this time there is a time constraint and the toggling must happen within that time.
If it does not happen within the given time, then there is a transition fault.
A stuck-at fault means the node behaves as if tied to ground or VDD, while for a transition fault, if the node does not toggle within the given time we say the node is slow to rise or slow to fall.
Path Delay Fault - This is useful for testing and characterising critical timing paths in the design.
It exercises the critical paths at speed (the full operating speed of the chip) to detect whether a path is too slow because of manufacturing defects or variations.
For example, incorrect field oxide thickness could lead to slower signal propagation, which could cause a transition along a critical path to arrive too late.
Bridging Fault - A bridge (or short) is a common semiconductor defect which causes two normally unconnected signal nets in a device to become electrically connected, for example due to incorrect etching.
Such defects can be detected if one of the nets causes the other net to take on a faulty value.
IDDQ Fault - This is a type of fault which occurs in CMOS circuits. To detect it we measure the amount of current drawn by a CMOS device in the quiescent state.
CMOS circuits draw almost no current in the quiescent state. Quiescent means the I/Os are stable and the circuit is inactive.
If the circuit is designed correctly, the quiescent current is extremely small; if a significant amount of current is drawn, it indicates the presence of one or more defects.

I started working on the given fault list. I picked up the list of fault classes, for example AU (ATPG untestable); AU is what brings down the coverage, so I picked up the AU list and started improving the coverage. I noticed one particular block and generated the hierarchical report for it.
From the hierarchical report I picked up the low-coverage modules.
I knew that if we improve coverage at block level, it will directly affect the top level.
While I was observing one particular block, it had many non-scanned flops.
So basically the tool was losing controllability and observability there, and that is the reason we had low coverage for that block.
It reports the coverage of each and every module hierarchically, so that from the report we can analyse which modules have low coverage.
There were certain issues with critical paths. Scan insertion adds extra mux delay in the data path, which brings down the functional frequency.
The paths were so critical that if we added the muxes, the design might not close timing at the required high-speed frequency. So we decided to leave those flops out of the scan chain.
I started analysing and then generated sequential patterns for them.
I increased the sequential depth up to 4 and got the maximum coverage for that particular module.
This confirmed that we were losing coverage because of the non-scanned sequential elements.
Then I moved to another block, and while analysing it I found that most of the block was occupied by memory instances.
So I applied 'no-faults' to them.
The tool removes those faults from the fault list when we apply 'no-faults', so basically it reduces the number of faults that are going to be targeted.
If we apply 'no-faults', those faults are not considered, so patterns won't be generated for that part. The conclusion is that our coverage increases, but that area is untested.
We can't leave such an area untested, but we know that memories are tested using MBIST patterns; likewise, if we apply 'no-faults' to the JTAG logic, we know we have separate JTAG patterns which will test it.

While I was analysing, I found a huge block of combinational logic that was not controllable and observable, because in that block the combinational logic was much larger than the sequential logic.
There were very few flops, so we could not build a scan chain there in a way that gives the whole logic controllability and observability.
So there was a need to insert test points. Basically there are two types of test points: 1) control points and 2) observe points.
I broke up the huge combinational logic and put flops in between, and those flops could be included in the scan chain. I also added some test points to achieve controllability and observability. Thus the coverage improved.
Yes, there were some issues with pin constraints. We had an active-low reset pin and we constrained that reset to 1.
Reset goes to each and every block, and since it was tied to 1, the whole logic fed by reset was uncovered for stuck-at-1 faults.
So, as a top-off ATPG run, we ran a separate ATPG where we defined reset as a clock. We did not add all the faults, only the undetected faults, and then started generating patterns. Thus we covered the points that were uncovered because of the tied reset.
Yes, non-transparent latches were the reason why some blocks were getting low coverage.
For the non-transparent latches, the clock was blocking them from being transparent. They were not getting a clock, and latches should be level sensitive (transparent during shift). To avoid this issue we controlled the clock from the top level.
Summary on coverage improvement:
By having controllability and observability on all the nets we can improve our coverage. The issues that stop us from achieving high coverage are listed below.
1) Non-scanned flops.
2) Non-transparent latches.
3) Complex combinational logic.
4) Pin constraints.
We discussed all these points in the previous answers.
If the launch and capture of data can happen on the same clock pulse, then we add a lock-up latch between the two flops.
Let's assume a scenario: there are two flops, FF1 and FF2, clocked by different clocks, clk1 and clk2, and there isn't much delay between FF1 and FF2, so there is a possibility that launch and capture happen on the same pulse.
If we add a lock-up latch between FF1 and FF2, clocked by the inverted clock of FF1, then the lock-up latch gives us some additional time, about half a clock cycle, so that the data is captured on the intended edge.
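A rough back-of-the-envelope sketch of why the half cycle helps, using made-up timing numbers (an illustration of the hold-slack arithmetic only, not a sign-off calculation):

```python
# Hypothetical numbers (ns) for the FF1 -> FF2 shift path.
period = 10.0   # shift clock period
skew   = 1.2    # clk2 arrives later than clk1 at FF2
t_cq   = 0.3    # clock-to-Q delay of FF1
t_path = 0.2    # wire/mux delay from FF1 (or the latch) to FF2
t_hold = 0.1    # hold requirement of FF2

# Without a lock-up latch, the newly launched data can reach FF2 before
# FF2's late clock edge, so the hold slack can go negative.
slack_plain = (t_cq + t_path) - (skew + t_hold)                # -0.8 ns -> violation
# A lock-up latch that is closed during the high phase of clk1 releases the
# new data only at the falling edge, adding roughly half a period of margin.
slack_latch = (t_cq + t_path + period / 2) - (skew + t_hold)   # +4.2 ns

print(f"hold slack without lock-up latch: {slack_plain:+.2f} ns")
print(f"hold slack with lock-up latch:    {slack_latch:+.2f} ns")
```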
In scenarios where flop stitching involves both negative-edge and positive-edge flops, we stitch all the negative-edge flops first and then the positive-edge flops. By doing this we can avoid issues related to data jumping through two flops in one shift cycle.
Fault Collapsing - It reduces the total number of faults to be targeted.
We generally classify the fault list in two ways: 1) uncollapsed faults and 2) collapsed faults.
Uncollapsed Faults - The total number of possible faults in the circuit. For example, a 2-input AND gate has 6 faults.
Collapsed Faults - The total number of faults after collapsing, which equals the sum of equivalence and dominance faults for the design.
For an AND gate, stuck-at-0 at any of the inputs is equivalent to stuck-at-0 at the output, so we can say there are 3 equivalence faults for the AND gate. And we have 1 dominance fault (stuck-at-1 at the output dominates stuck-at-1 at an input).
Thus, collapsed faults = equivalence faults + dominance faults = 3 + 1 = 4.
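A small sketch of the tally above for a 2-input AND gate (the collapsed set shown is one common choice of representatives; this is only an illustration):

```python
# Uncollapsed list: stuck-at-0 and stuck-at-1 on every pin of a 2-input AND.
uncollapsed = [(pin, sa) for pin in ("A", "B", "Y") for sa in (0, 1)]

# A-sa0, B-sa0 and Y-sa0 force Y to 0 for every input combination, so the
# three of them are equivalent and collapse to a single representative.
# The stuck-at-1 faults on the inputs and the output are related by
# dominance (any test for an input sa1 also detects the output sa1).
collapsed = [("Y", 0), ("A", 1), ("B", 1), ("Y", 1)]

print(len(uncollapsed), "uncollapsed faults")  # 6
print(len(collapsed), "collapsed faults")      # 4
```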
The cause of the violation was setup, absolutely. If you slow down the frequency, you give the signals more time to propagate through all the logic, so the signals which were failing to meet the setup arrival requirement now have enough time to arrive.
1. When all the clocks are in the off state, the latches should be transparent (or add logic to make them transparent).
2. A clock must not capture data into a level-sensitive (LS) port (latch or RAM); if it does, that data may be corrupted by newly captured data.
3. Clock not controllable from the top (use a mux to control it).
The STIL procedure file provides information about clock ports, scan chains and other controls.
The STIL procedure file can be generated from DFT Compiler, and we use it for DRC checks.
The test procedure file contains all the scan information of your test-ready netlist. Some of the other information it holds is listed below:
The number of scan cells in each scan chain.
The number of scan chains.
The shift clocks.
The capture clocks.
The timing of the different clocks.
The time for forcing the primary inputs, bidirectional inputs, scan inputs, etc.
The time to measure the primary outputs, scan outputs, etc.

The first pattern, pattern 0 in most ATPG tools, is called the chain test pattern. This pattern is used to check the integrity of the scan chains, i.e. to see whether the scan chains are shifting and loading properly. If the scan chains themselves have a fault, there is no point testing the full chip through those chains.
Generally, 99.99 percent of test time (on the tester) is spent loading the scan chains, and this is directly proportional to the length of the longest scan chain in your design. So the way to minimize test time is to minimize the length of your longest parallel chain.
Balancing the scan chains is critical. If you have 10 scan chains, and 9 chains have 10 flops each but the 10th chain has 100 flops, each shift has to be 100 clock pulses, and the tool unnecessarily has to pad the shorter chains with X values for 90 of those cycles. So your overall test time for one pattern will be 100 clock cycles to scan in, 100 to scan out and one capture cycle. It could have been 10 to shift in, 10 to shift out and 1 capture cycle. So your overall test time for one pattern is 201 cycles instead of 21. Now multiply this by the number of patterns; on average, in this scenario, your test time is roughly 10 times higher.
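The arithmetic above in a small sketch (hypothetical numbers; this simple estimate counts shift-in and shift-out separately, whereas real testers overlap them):

```python
def cycles_per_pattern(longest_chain):
    # shift-in + shift-out + one capture cycle
    return longest_chain + longest_chain + 1

patterns   = 1000
unbalanced = cycles_per_pattern(100)  # worst chain has 100 flops -> 201 cycles
balanced   = cycles_per_pattern(10)   # every chain has ~10 flops  -> 21 cycles

print("unbalanced:", unbalanced * patterns, "cycles")
print("balanced:  ", balanced * patterns, "cycles")
print("ratio: %.1fx" % (unbalanced / balanced))  # roughly 10x
```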
1. Shift in all 1s to initialize the chain.
2. Shift in 00001111 (right bit first, left bit last); the capture fails. The cause of failure should be in the first 4 flops, because the data in the second half is not supposed to change and should have no effect on the capture failure.
3. Shift in all 1s to initialize the chain.
4. Shift in 00111111; the capture passes. Only the 1st and 2nd flops changed value here and the capture still passed, so they are cleared; combined with step 2, the cause of failure should be in the 3rd or 4th flop.
5. Shift in all 1s to initialize the chain.
6. Shift in 00011111; the capture fails, so the cause of failure should be in the 3rd flop.
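The same narrowing idea can be written as a binary search over which flops are allowed to toggle away from the all-ones initialization. The sketch below is purely illustrative: capture_fails() is a hypothetical stand-in for loading the pattern, pulsing capture and comparing on the tester, stubbed here with an assumed bad flop at position 2.

```python
CHAIN_LEN = 8
BAD_FLOP  = 2   # hypothetical; in reality this is what we are trying to find

def capture_fails(pattern):
    # The failure shows up only when the bad flop's value toggles away from
    # the all-ones initialization, i.e. when it is loaded with a 0.
    return pattern[BAD_FLOP] == 0

def locate_bad_flop():
    lo, hi = 0, CHAIN_LEN            # invariant: bad flop index is in [lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Re-initialize with all 1s, then load 0s only into [lo, mid).
        pattern = [0 if lo <= i < mid else 1 for i in range(CHAIN_LEN)]
        if capture_fails(pattern):
            hi = mid                 # a zeroed flop caused the failure
        else:
            lo = mid                 # the untouched half must hold the bad flop
    return lo

print("suspect flop index:", locate_bad_flop())  # -> 2
```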


Shift registers are typically the only flops left out of the scan chain.
It may also make sense to leave some reset and metastability (synchronizer) flops off the scan chains.
NOTE - Any flops that are left out of the scan chain will need a separate test suite.
In order to be testable, every clock pin and every reset pin should be controllable from a primary input during test mode.
During design you should run DFT rule checks as part of your synthesis flow. For high fault coverage, you should fix every DFT warning/violation.
Typically, the more scan chains you have, the shorter your tester test time is: it takes less time to load 10,000 flops 30 at a time (30 scan chains) per shift clock than if you could only load 2 at a time (2 scan chains) per shift clock.
If you only have 2 seconds of tester time to do all of your testing, you may otherwise find that you run out of test time.
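A quick illustration, with assumed numbers, of how the per-pattern shift count drops as the chain count goes up:

```python
import math

flops, patterns = 10_000, 2_000  # hypothetical design and pattern count

for chains in (2, 30, 100):
    shifts = math.ceil(flops / chains)          # length of the longest chain
    print(f"{chains:>3} chains -> {shifts:>5} shifts/pattern, "
          f"{shifts * patterns:>9,} total shift cycles")
```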
A scan flip-flop is an ordinary flip-flop modified for use during DFT. It has an additional scan input and scan output for sending test inputs and receiving test outputs.
A normal flip-flop has D, Clk and Q.
A scan flop has D, SI (scan in), SE (scan enable), Clk, Q and/or SO (scan out).
During the scan shift operation (SE=1), data shifts in through the SI pin.
During the scan capture state (SE=0), data is captured into the scan flop via the D pin.
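A minimal behavioural sketch (Python, not RTL) of the mux in front of the flop that the SE pin controls:

```python
class ScanFlop:
    """Toy model of a mux-D scan flop: SE selects between D and SI."""

    def __init__(self):
        self.q = 0

    def clock(self, d, si, se):
        # One rising clock edge: shift (capture SI) when SE=1,
        # functional capture (capture D) when SE=0.
        self.q = si if se else d
        return self.q

ff = ScanFlop()
print(ff.clock(d=0, si=1, se=1))  # 1 -> shift mode, SI wins
print(ff.clock(d=1, si=0, se=0))  # 1 -> capture mode, D wins
```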
We generate combinational ATPG patterns for full scan. For partial scan there are still many combinational-only paths, so for them we first generate combinational ATPG patterns and then, for the remaining logic, we generate sequential ATPG patterns. This is called top-off ATPG.
If your skew is big, then you would need a lot of buffers or delay cells, which is undesirable for power, area, etc.
The frequency of operation is not as important during scan shifting; therefore, with data lock-up latches we can always slow down the frequency and/or modify the duty cycle to remove a hold-time problem.
Normally, test clocks have low frequencies. At these frequencies you can basically check the connectivity of the nets (e.g. shorts/opens).
However, you cannot see the real parasitic effects seen by a functional clock, which has a higher frequency than the test clocks.
At this point, by using an at-speed clock, which means running the circuit at its functional frequency, you can gather more information (e.g. path delay) and improve the coverage.
At-speed tests are used for analysing path-delay/transition-delay behaviour of the circuit. Normal tests are good for testing stuck-at faults, but they fail to test the timing behaviour of circuits.
Defect: an imperfection or flaw that occurs within the silicon.
Fault: a representation of a defect.
Failure: non-performance of the intended functions of the system.
For example, a physical short is considered a defect.
A physical short resulting in stuck-at behaviour might be modeled as a stuck-at-1/0 fault.
Non-performance of the system due to the resulting error is a failure.
Serial and parallel patterns are (and must be) the same for a given scan mode (internal or adaptive). The only difference is in the way they are applied to the design.
With serial patterns, all the patterns are applied through scan-in and observed through scan-out. The operation is similar to the tester environment.
Parallel patterns are applied directly to the internal registers, so there is no shift-in/shift-out, which reduces simulation time.
Direct access to the registers is possible only in a simulation environment, hence parallel patterns are used only in simulation and NOT on the tester.
The serial pattern reflects the real shift timing, while parallel patterns only verify the correctness of the logic, not the timing.
In our design we had an active-low reset pin, and we constrained that reset to 1.
Reset goes to each and every block, and since it was tied to 1, the whole logic fed by reset was not covered for stuck-at-1 faults.
So, as a top-off ATPG run, we ran a separate ATPG where we defined reset as a clock.
We did not add all the faults, only the undetected faults, and when we started generating patterns we covered those points that were uncovered because of the tied reset logic.
A block is one of the core blocks of the full-chip design. Suppose the full chip contains 4 core blocks; the approach of improving coverage for one particular block is called block-level ATPG.
If we improve coverage at block level, we are automatically going to have good coverage at chip level, so when we move from block level to top level we don't have to spend much time debugging at the top level.
Faults are assigned to classes indicating their current fault-detection or detectability status. A two-character code is used to specify a fault class.
Fault class hierarchy:
DT - Detected
PT - Possibly Detected
UD - Undetectable
AU - ATPG Untestable
ND - Not Detected
Basically, ATPG Untestable (AU) is the main reason behind low coverage.
Controllability: we can set internal nodes to desired values through the input stimulus.
Observability: we can monitor internal node values through the output interface (e.g. in the testbench).
The two most popular transition tests are LOS and LOC.
They are categorized by how they launch the transition: by launching on shift or by launching on capture.
LOS - Launch on Shift - When the last shift of the scan chain load is used to launch the transition, we call it a launch-on-shift operation.
Here the launch happens on the shift path and the capture on the functional path.
LOC - Launch on Capture - The launch of the transition is done in capture mode, when scan enable is 0.
Here both the launch and the capture happen on the functional path.
LOS
Advantages:
More coverage.
Fewer patterns.
Basic combinational ATPG used.
Disadvantage:
The scan enable (SE) signal has to be very fast. This might not be possible in every scenario; it requires an at-speed, clock-like scan enable so that the transition can be launched and captured quickly.

LOC
Advantage:
No need for the scan enable (SE) signal to be very fast.
Disadvantages:
Less coverage.
More patterns.
Sequential algorithm used.
LOS is a combinational-based algorithm, because there is only one clock pulse during the capture mode. No extra state needs to be stored by the ATPG tool to determine how the circuit will react after a second clock pulse.
That is the reason why it has fewer patterns and more coverage, with better controllability.
LOC is a sequential-based algorithm, because it is essentially a double capture: the ATPG tool needs to store the state of the circuit after the last shift and the first capture clock pulse, in order to know what is expected after the second capture pulse.
That is the reason why it has more patterns but less coverage compared to LOS.
Although controllability differs, observability is the same in both LOS and LOC.
LinkedIn Questions
It depends on the tool; for example, TetraMAX uses only a Verilog library, while some ATPG tools use their own library format. They describe the same cells, so there is no real difference, and ATPG libraries are vendor specific.

For a few reasons we have to do block-level pattern generation:
(1) To check what test coverage we have achieved at block level. If the coverage is low, we can add test points and increase the coverage; doing the same directly at top level might be very difficult.

We will consider the highest frequency among them to test the memory.
And if the dual-port memory works with asynchronous frequencies, then we can guarantee that, functionality-wise, both ports will work fine.

(1) If we check memories the ATPG way, we require lots of shift cycles to shift the address and data to the respective pins; as shift cycles increase, tester time increases and chip cost increases.
(2) ATPG patterns can't detect memory-related faults.

A wrapper chain is made up of wrapper cells. It is like an envelope to isolate a block or IP.
The wrapper chain is used to bypass that logic; it increases controllability and observability at the block boundary.
PRE DRC: It checks whether the clocks and asynchronous set/reset signals are controllable from the top level or not (S1 and S2 rule violations in DFT Advisor).
POST DRC: It checks that there is no blockage in the scan chain.

In LOS, the launch event happens on the shift path; here we have full controllability to load our desired values into the scan chains and then directly capture the response, so the coverage is higher.
LOC: Launch on capture means the launch event happens on the capture (functional) path, so the launch values come through the data path. Controllability is therefore lower, coverage is lower, and to cover a particular fault the tool has to try harder, so the pattern count is higher.

We set the inter-clock-domain crossing paths as false paths, so we don't test faults on those paths.

IDDQ testing becomes more difficult because the leakage gets higher.
The size of the chip will decrease.

Step (1): Define reset as a clock. (2) Load the patterns. (3) Make scan enable low. (4) Now pulse the reset. (5) Unload the patterns.

