Abstract

Scan compression technology combines the expected responses from multiple scan chains so that they can be observed at fewer scan outputs. As a result, unknowns (Xs) in the test response interfere with the good values that could otherwise be observed. Prior to this paper, Xs in the test response were treated as bad for compression, and solutions either removed, bypassed, or blocked the Xs from interfering with the other responses. In this paper we show that some Xs can be added to improve the quality of results of test compression. The improved observability due to simultaneous clocking of interacting clock domains is traded off against the reduced observability caused by the Xs that race conditions introduce into the response. We show that when the inter-clock-domain Xs are added but limited, the gains achieved by adding the Xs far exceed the losses from bringing the Xs together with other observed values in scan compression.

1. Introduction

The need for test data compression has been driven not only by the increasing transistor density on chips but also by the increasing use of test sets for multiple fault models [1]. In a span of a few years, a number of technologies that went under the name of scan compression were developed to add logic between the scan inputs/outputs and internal scan chains to reduce test data volume. These efforts leveraged the fact that a large portion of ATPG-generated tests had logic don't cares. These technologies relied on decoupling the scan inputs/outputs from the internal chains, such that a larger number of internal chains can be driven from a smaller interface.

Figure 1 shows the impact of decoupling the scan terminals from the internal scan chain. This figure also shows the relationship of chain length to test data volume and test application time. A reduction in chain length linearly reduces the test data volume and test application time. A ratio of 3x more internal chains than the scan interface translates to 3x shorter chains and a corresponding 3x reduction in test data volume and test application time. This is the fundamental mechanism behind the numerous research papers and the commercially available scan compression technologies of today.

[Figure 1 diagram: scan-in pins drive internal scan chains that are 3x shorter (÷3); X masking logic sits ahead of the scan-out pins.]
Test Data Volume = Patterns x Scan interface x chain length
Test Application Time = Patterns x chain length
Figure 1: Decoupling of the scan-interface from the internal scan chains to allow for reduction in test data volume and test application time.

Scan compression technology came about as a solution for the increased test data volume and test application time seen in scan testing. When no unknowns (Xs) exist in the test response, scan and scan compression behave quite similarly. However, when an X is observed in a scan design, the X has no negative impact on the scan solution. In scan compression the values from multiple scan cells are combined to create fewer values, and an X in the response masks off any values it is combined with. Depending upon the amount of compression being targeted, the Xs could mask a large number of good responses that the test patterns relied upon for fault detection. Thus Xs in the response have a significant negative impact on the quality of results (QoR) of compression. All solutions in scan compression today treat these Xs as harmful and proactively manage the Xs such that they do not appear in the test response. In the next section we provide an overview of the existing solutions to Xs in the response. In the following section (Section 3) we present a new kind of X that has never been considered in past research in compression. These Xs have a good and a bad aspect to them. On the good side, they appear as a result of increased observability caused by aggressive clocking, which translates to fewer patterns. On the bad side, in compression they interfere with the observability of other values, increasing the number of test patterns. In Section 4 we show how managing the number of Xs caused by aggressive clocking can result in improved test data volume and test application time in scan compression. We finally present our conclusions.
The Xs generated during response capture can be proactively blocked from reaching the scan cells by identifying the X-sources and then removing them, or by inserting additional DFT logic such as test points to fix the X-sources [2]. Another way to block the Xs from reaching the scan cells is careful test pattern generation, where the don't-care bits in the scan-in vector are set to controlling values that block the Xs from reaching the scan cell [3].

Efficient compactors have been proposed that provide good compression without loss in coverage due to error masking and X-masking. Error masking occurs when multiple errors cancel each other due to the compactor architecture, and X-masking happens when the Xs in the response prevent the error from propagating to the compactor output. The space compactors are combinational circuits, typically built of XOR gates, that compact the test response coming out of N scan chains in one shift-out cycle, {0,1,X}^N, into a compacted test response observable at Q outputs, {0,1,X}^Q, where N > Q and X denotes the unknown value. The space compactors are defined by parity check matrices of linear block codes as proposed in [3]. The ability of space compactors to tolerate unknown values in the test responses was addressed by the X-compact technique [4]. The analysis presented in [5] provides an estimate of the compaction ratio that space compactors can reach while tolerating a certain number of unknown values.

The convolutional compaction schemes presented in [6][7][8] are a class of finite state compactors that combine time and space compaction techniques. The convolutional compactors convert test responses shifting out of N scan chains in a finite number of s shift-out cycles into a compacted test response observable at Q outputs, where N > Q. Designed specifically to meet certain requirements of manufacturing test [6], the convolutional compactors are able to achieve much higher compaction ratios N/Q with Q scan-out channels. The ability of the convolutional compactors to tolerate unknown values was studied in [7].

By far the most popular technique to handle Xs captured by the scan cells is to insert a masking circuit between the scan chain output and the compactor input (see Figure 1).

2. Many configurations that block a sub-set of chains from entering the compressor.

The mask control signals could come from primary inputs, from the decompressor of the architecture, or from an independent scan chain that is not connected to the decompressor [9][10][11][12][13]. Furthermore, masking implementations in scan compression schemes can vary from per-pattern masking to per-shift masking, with intermediate possibilities [13]. Compression schemes that use MISRs need additional blocking to guarantee that 100% of the unknowns are kept from reaching the MISR.

As one can see, a lot of effort and DFT logic is put in place in scan compression to avoid logic-Xs. Logic-Xs in the response can impact the fault coverage of the scan compression scheme, and fault coverage typically cannot be compromised. When fault coverage is lost due to a logic-X interfering with the good values in the compressor, the coverage would need to be recovered in scan mode, which takes away from the efficiency achieved by the scan compression scheme. For example, take the case where 10X compression was targeted and the final implementation requires 10% of the patterns to be applied through the traditional scan mode as a result of logic-Xs. The final test application time is T = 0.9 x (t/10) + 0.1t = 0.19t, which represents only about 5X compression. Masking ensures that unknowns in the test response do not cause this situation to occur.

However, masking itself does not come without issues. Use of masking logic reduces the observability in the design, making the test pattern count increase for the same fault coverage. Now consider the case where twice the patterns are created because masking reduced the observability of the design to avoid unknowns. For a 10X scan compression scheme the final reduction in test application time would be 10/2 = 5X over that of scan. The result is poor QoR for the scan compression scheme.

For designs that are not X-clean, the number of Xs generated could be large, or the distribution of these Xs could be non-optimal with respect to the masking selection logic. When some scan cells always observe an X on capture, instead of invoking masking it is better to architect the scan chains to separate the static-X scan cells such that the Xs do not enter the compressor at all [14].
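The two worked cases above (10% of patterns falling back to scan mode, and masking doubling the pattern count) can be checked with a short Python sketch. The function names are hypothetical and the model is the simple one used in the text: time scales linearly with pattern count and with chain length.

```python
def effective_compression_with_scan_fallback(target, scan_fraction):
    """Effective compression when a fraction of patterns runs in plain scan mode.

    With scan time normalized to t = 1, the compressed patterns cost
    (1 - scan_fraction) / target and the fallback patterns cost scan_fraction.
    """
    compressed_time = (1.0 - scan_fraction) / target
    scan_time = scan_fraction
    return 1.0 / (compressed_time + scan_time)


def effective_compression_with_pattern_inflation(target, pattern_ratio):
    """Effective compression when masking inflates the pattern count by pattern_ratio."""
    return target / pattern_ratio


# Case 1 from the text: 10X target, 10% of patterns applied through scan mode.
fallback = effective_compression_with_scan_fallback(target=10, scan_fraction=0.10)
# Case 2 from the text: 10X target, twice as many patterns because of masking.
inflated = effective_compression_with_pattern_inflation(target=10, pattern_ratio=2)

print(f"scan fallback: {fallback:.2f}X")    # T = 0.19t, so roughly 5X
print(f"pattern inflation: {inflated:.2f}X")
```

Both effects roughly halve the targeted compression, which is why neither unmanaged Xs nor heavy masking is attractive on its own.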
3. Introducing Xs for Better Compression

Traditional ATPG operates in a zero-delay environment where no timing information is available. In such cases strict DRC rules are followed to ensure that the captured values predicted by the ATPG tool match the values seen when timing is taken into account. One of the rules ATPG follows is that only one clock can be pulsed at any given time. When the flip-flops (FFs) of two clock domains do not interact with each other, their clocks can be pulsed simultaneously without violating the zero-delay assumptions of ATPG. Figure 2 (a) shows a situation where the FFs from one clock domain are not connected to FFs from another clock domain. Since not pulsing a clock during capture makes a set of FFs unobservable for a test, ATPG is designed to simultaneously pulse all the clocks it is allowed to for maximum observability.

[Figure 2 (a) diagram: two clock domains, Clk 1 and Clk 2.]
Figure 2 (a): Independent clock domains where no FFs from one domain are connected to the other domain.

2. X (plain text): Represents a dynamic-X that does not come from the simultaneous clocking issue being discussed in this paper. A dynamic-X represents a scan cell that sometimes captures an unknown value and sometimes a good response.

3. X (italics): Represents a scan cell that captures a dynamic-X due to inter-clock-domain issues when clocks are simultaneously pulsed.
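The clocking rule described in this section, that clocks of non-interacting domains may be pulsed together, can be illustrated with a small Python sketch. The greedy grouping below is only an assumption made for illustration; it is not the algorithm of any particular ATPG tool, and the clock names and the interaction list are hypothetical.

```python
def group_clocks(clocks, interacting_pairs):
    """Greedily partition clocks into groups whose domains do not interact.

    Clocks in the same group could be pulsed simultaneously without
    creating inter-clock-domain Xs in a zero-delay model.
    """
    interacts = {frozenset(pair) for pair in interacting_pairs}
    groups = []
    for clk in clocks:
        for group in groups:
            # clk may join a group only if it interacts with none of its members.
            if all(frozenset((clk, other)) not in interacts for other in group):
                group.append(clk)
                break
        else:
            groups.append([clk])
    return groups


clocks = ["clk1", "clk2", "clk3"]

# Independent domains (as in Figure 2 (a)): every clock can be pulsed together.
independent = group_clocks(clocks, interacting_pairs=[])

# Hypothetical case where FFs in the clk1 and clk2 domains are connected.
with_interaction = group_clocks(clocks, interacting_pairs=[("clk1", "clk2")])

print(independent)        # one group holding all three clocks
print(with_interaction)   # clk2 is separated from clk1
```

Every extra group is an extra capture configuration in which some FFs are not observed, which is the observability cost that pulsing interacting clocks together tries to avoid.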
with mask value M_all = 111, chain sets {ch1, ch2, ch3} and {ch1, ch3, ch4} are observed at so1 and so2, respectively. However, with mask values {M1 = 001, M2 = 010}, chains {{ch1, ch3}, {ch2, ch4}} are observed at so1 and so2, respectively. For shift positions 1 through 4, since all the Xs are static Xs, the Ds have to be observed using the masks M1 and M2. Similarly, for shift positions 11 and 13, due to scan chains ch2 and ch3 being observed at multiple outputs (compressor redundancy) for mask M_all, all Ds can be observed without compromising chain observability.

[Figure 4 diagram: scan chains ch1 to ch4, clocked by Clk1, Clk2, and Clk3, feed combinational logic and are observed at the scan outputs across shift positions 1 to 15 with mask values 111, 001, and 010; annotations mark "Mask required", "Inter clock domain Xs", and "Compressor redundancy".]
Figure 4: Scan cells capturing Xs in scan compression architecture with multiple clocks being clocked separately.
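The masking example above can be sketched with a three-valued XOR compactor in Python. The chain sets observed under M_all come from the text; the reading that M1 routes ch1 to so1 and ch3 to so2, and that M2 routes ch2 to so1 and ch4 to so2, is an assumption about how the garbled example should be interpreted, and the captured values in `shift` are invented for illustration.

```python
X = "X"  # unknown value


def xor3(values):
    """Three-valued XOR: any X on an input makes the output X."""
    if X in values:
        return X
    result = 0
    for v in values:
        result ^= v
    return result


# Chains observed at each scan output for each mask value.
# M_all is taken from the text; the M1/M2 routing is an assumed interpretation.
OBSERVED = {
    "M_all": {"so1": ["ch1", "ch2", "ch3"], "so2": ["ch1", "ch3", "ch4"]},
    "M1": {"so1": ["ch1"], "so2": ["ch3"]},
    "M2": {"so1": ["ch2"], "so2": ["ch4"]},
}


def compact(shift_values, mask):
    """Compact one shift position of chain outputs under the given mask."""
    return {
        so: xor3([shift_values[ch] for ch in chains])
        for so, chains in OBSERVED[mask].items()
    }


# Hypothetical shift position: ch1 and ch4 hold detecting values, ch2 and ch3 hold Xs.
shift = {"ch1": 1, "ch2": X, "ch3": X, "ch4": 0}

unmasked = compact(shift, "M_all")   # both outputs are corrupted by Xs
with_m1 = compact(shift, "M1")       # ch1 becomes observable at so1
with_m2 = compact(shift, "M2")       # ch4 becomes observable at so2

print(unmasked, with_m1, with_m2)
```

In this sketch the two detecting values can only be recovered by spending one mask on each, which mirrors the "Mask required" region of the figure; when no Xs are present, M_all observes every chain in a single shift.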
4. Experimental Results
For our experiments, we used 17 large industrial designs,
observability can be obtained. For example, if the number of Xs could be reduced to three Xs per shift, even in the worst-case distribution, 360 chains can be observed. If those 360 chains happen to be where Ds are being observed, all chains can be observed without needing to invoke X masking. This implies that ATPG can leverage the compaction advantages by clocking fewer clocks together and by introducing fewer Xs so as not to adversely affect the compressor observability.

Figure 5 shows the results of the compression obtained for the default settings. Since we did not change the ATPG to dynamically monitor the inter-clock-domain Xs, we determined statistically that setting the maximum limit on the number of cells to be Xed out to SC_X = 75 gave significantly better results. If we look at Figure 5, we observe that there are two kinds of circuits: the ones on the left-hand side that have fewer inter-clock-domain Xs and the ones on the right with a large number of inter-clock-domain Xs. Clearly, all the circuits on the right-hand side show significant gains when the inter-clock-domain Xs are managed to stay within a limit that trades off the increased observability due to multiple clocks against the negative impact of the Xs. In fact, for circuits 13 and 17, improvements of 80% and 84% were obtained, respectively.

We also generated patterns with SC_X = 0 and compared them with SC_X = 75; the results are shown in Figure 6. The data in Figure 6 show that generating patterns by allowing a small number of clocks to pulse together is more efficient than pulsing each clock separately.

4.2 Impact of mask usage

To study the compressor performance as the number of dynamic Xs is reduced, we studied the mask usage for circuit 13 with tool default settings and with SC_X = 75. As shown in Figure 7, the maximum number of times X masking can be used is equal to the scan chain length, which is 320. Figure 7 also shows the significant reduction in the use of X masking from the default to the SC_X = 75 setting. In fact, mask usage is almost 2X lower for SC_X = 75, and that translates into the 80% better compression achieved.

5. Conclusions

We have studied the Xs generated during ATPG due to inter-clock-domain issues and have shown that not all Xs are bad for compression. We have shown that by allowing some inter-clock-domain Xs during ATPG, scan compression QoR can be significantly improved. Our results show that the gains obtained in compression by clocking multiple clocks together with a small number of cells allowed to be disturbed are significantly higher than with either a very large number of disturbed cells or none. It is observed that the increased observability obtained due to the clocking of multiple clocks simultaneously outweighs the observability lost due to the dynamic Xs generated during ATPG with SC_X = 75. Finally, the higher compression obtained with the proposed approach translates into lower test application time and lower test cost.

6. References

[1] E. J. McCluskey, D. Burek, B. Koenemann, S. Mitra, J. Patel, J. Rajski, J. Waicukauski, "Test data compression," IEEE Design & Test of Computers, vol. 20, pp. 76–87, March-April 2003.
[2] H. Tang et al., "On efficient X-handling using a selective compaction scheme to achieve high test response compaction ratios," in Proc. Int. Conf. on VLSI Design, pp. 59–64, 2005.
[3] C. Wang, S. M. Reddy, I. Pomeranz, J. Rajski and J. Tyszer, "On compacting test response data containing unknown values," in Proc. Int. Conf. on Computer Aided Design, pp. 855–862, 2003.
[4] S. Mitra, K. S. Kim, "X-Compact: An efficient response compaction technique for test cost reduction," in Proc. Int. Test Conf., pp. 311–320, 2002.
[5] P. Wohl and L. Huisman, "Analysis and design of optimal combinational compactors," in Proc. VLSI Test Symp., pp. 101–112, 2003.
[6] J. Rajski, J. Tyszer, C. Wang and S. M. Reddy, "Convolutional compaction of test responses," in Proc. Int. Test Conf., pp. 745–754, 2003.
[7] J. Rajski and J. Tyszer, "Synthesis of X-tolerant convolutional compactors," in Proc. VLSI Test Symp., pp. 114–119, 2005.
[8] Y. Han, Y. Xu, H. Li, X. Li and A. Chandra, "Test resource partitioning based on efficient response compaction for test time and tester channel reduction," in Proc. Asian Test Symp., pp. 440–445, 2003.
[9] I. Pomeranz, S. Kundu and S. M. Reddy, "On output response compression in the presence of unknown output values," in Proc. Design Auto. Conf., pp. 255–258, 2002.
[10] V. Chickermane, B. Foutz and B. Keller, "Channel masking synthesis for efficient on-chip test compression," in Proc. Int. Test Conf., pp. 452–461, 2004.
[11] M. Naruse, I. Pomeranz, S. M. Reddy and S. Kundu, "On-chip compression of output responses with unknown values using LFSR reseeding," Proc. Int. Test Conf., pp. 1060–1068, 2003.
[12] P. Wohl, J. A. Waicukauski, S. Patel and M. B. Amin, "X-tolerant compression and application of scan-ATPG patterns in a BIST architecture," in Proc. Int. Test Conf., pp. 727–736, 2003.
[13] A. Chandra and R. Kapur, "Interval based X-masking for scan compression architectures," Proc. IEEE Int. Symp. on Quality Electronic Design, pp. 821–826, 2008.
[14] A. Chandra, Y. Kanzawa, R. Kapur and T. W. Williams, "Adapting scan compression to designs," Proc. IEEE VLSI Design and Test, pp. 309–318, 2008.
[15] P. Wohl, J. Waicukauski, and S. Ramnath, "Fully X-tolerant combinational scan compression," Proc. Int. Test Conf., pp. 1–10, 2007.
[16] X. Lin, S. M. Reddy, and I. Pomeranz, "Test pattern reduction by simultaneously pulsing interacting clocks," Proc. IEEE VLSI Design and Test, pp. 301–308, 2008.
[17] DFT Compiler, Synopsys DFT synthesis solution, http://www.synopsys.com/products/test/dft_compiler_ds.pdf
[18] TetraMAX®, Synopsys ATPG solution, http://www.synopsys.com/products/test/tetramax_ds.pdf
[Figure 5 plot: x-axis Circuit (1 to 17), y-axis Compression (X); series SC_X < 1000 and SC_X = 75.]
Figure 5: Compression obtained using default tool setting and with number of disturb scan cells SC_X = 75.

[Figure 6 plot: x-axis Circuit (1 to 17), y-axis Improvement (%).]
Figure 6: Compression improvements of SC_X = 75 over SC_X = 0 ((SC_75 – SC_0) * 100 / SC_75).

[Figure 7 plot: x-axis Patterns, y-axis Number of shifts mask used; series Max, SC_X < 1000, and SC_X = 75.]
Figure 7: Number of shift cycles where mask was used with SC_X < 1000 vs SC_X = 75 for circuit 13.
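The improvement metric in the Figure 6 caption can be written as a one-line Python function. This sketch assumes the caption's scattered subscripts read (SC_75 - SC_0) * 100 / SC_75, where SC_75 and SC_0 are the compression values obtained with SC_X = 75 and SC_X = 0; the function name and the sample values are hypothetical and are not results from the paper.

```python
def compression_improvement_percent(sc_75, sc_0):
    """Improvement of SC_X = 75 over SC_X = 0, as read from the Figure 6 caption.

    Assumes the caption formula is (SC_75 - SC_0) * 100 / SC_75.
    """
    return (sc_75 - sc_0) * 100.0 / sc_75


# Hypothetical compression values for two circuits (not measured data).
gain = compression_improvement_percent(sc_75=40.0, sc_0=38.0)
loss = compression_improvement_percent(sc_75=38.0, sc_0=40.0)

print(f"{gain:.1f}%")   # positive: pulsing a few clocks together helped
print(f"{loss:.1f}%")   # negative: separate clocking was better for this circuit
```

A negative value corresponds to the bars below zero in Figure 6, where limiting disturbed cells to 75 did slightly worse than clocking each domain separately.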