You are on page 1of 7

Improved Fault Emulation for Synchronous Sequential Circuits

Jaan Raik, Peeter Ellervee, Valentin Tihhomirov, Raimund Ubar


Department of Computer Engineering
Tallinn University of Technology, Raja 15, 12618 Tallinn, Estonia,
e-mail: jaan@pld.ttu.ee, lrv@cc.ttu.ee, valentin@abelectron.com, raiub@pld.ttu.ee

Abstract by the availability of multi-million-gate FPGAs. For


academic purposes, cheaper devices with rather large
Current paper presents new alternatives for accelerating capacity, e.g., new Spartan devices, can be used.
the task of fault simulation for sequential circuits by The availability of large devices doesn’t allow merely
hardware emulation on FPGA. Fault simulation is an implementation of the circuit under test with fault models
important subtask in test pattern generation and it is but, additionally, to include test vector generation and
frequently used throughout the test generation process. output response analysis circuits on a single
The problems associated to fault emulation for sequential reconfigurable device. It is important to stress that current
circuits are explained and alternative implementations approach is not aiming at testing the FPGA itself neither
are discussed. An environment for hardware emulation of simulating any defects in it. (FPGA is assumed to be a
fault simulation is presented. It incorporates hardware priori 100 % tested by the manufacturer). The goal is to
support for fault dropping. The proposed approach speed up fault grading mainly for ASIC projects using
allows simulation speed-up of 40 to 500 times as FPGA simply as a faster model. Furthermore, in addition
compared to the state-of-the-art in fault simulation. to increasing the speed of fault simulation, the idea
Average speedup provided by the method is 250 that is proposed in this paper can be used for selecting optimal
about an order of magnitude higher than previously cited Built-In Self-Test (BIST) structures.
in the literature. Based on the experiments, we can Alternatives that can be considered as trade-offs in
conclude that it is beneficial to use emulation when large terms of required FPGA resources and fault grading
numbers of test vectors is required. accuracy are discussed in the paper. In addition, an
experimental environment for reconfigurable hardware
emulation of fault simulation is presented, which supports
1. Introduction the different trade-offs.
As a novel feature, the emulation environment
The increasing complexity of modern VLSI circuits incorporates hardware support for fault dropping, i.e., if a
has made test generation one of the most complicated and fault has been detected by a test vector then the remainder
time-consuming problems in the domain of digital design. of the test sequence will be canceled and the process will
As the sizes of circuits grow, so do the test costs [1]. Test continue with the next fault. Comparison with the golden
costs include not only the time and resources spent for device is used for implementing the fault dropping. The
testing a circuit but also time and resources spent to proposed approach allows simulation speed-up of 40 to
generate appropriate test vectors. The most important nearly 500 times as compared to the state-of-the-art in
sub-task of any test generation approach is the suitability fault simulation. Average speedup provided by the
analysis, or fault grading, of a given set of test vectors. method is 250 - an order of magnitude higher than
Fault simulation is the most often used way for fault previously cited in the literature.
grading. Efficient fault simulation algorithms for The paper is organized as follows. Section 2 provides
combinational circuits are known already for decades an overview of previous related works. The initial
(see, e.g. [2]). However, it is the large sequential designs emulation environment is introduced in Section 3.
whose fault grading run times could be measured in years Section 4 presents alternative approaches to sequential
that drive the need faster implementation, e.g. by fault emulation and the improved emulation environment
hardware emulation. At the same time, reconfigurable is described. In Section 5, the results of experiments are
hardware devices have been found useful as system presented and Section 6 is for conclusions.
modeling environments [3]. This has been made possible

Proceedings of the 2005 8th Euromicro conference on Digital System Design (DSD’05)
0-7695-2433-8/05 $20.00 © 2005 IEEE
2. Related Works of accelerator FPGA. In addition, the improved emulation
environment allows implementation of various fault-
A number of works on fault emulation for modeling scenarios. The environment is explained in the
combinational circuits has been published in the past. following Sections.
They rely either on fault injection (see, e.g., [4, 5]) or on
implementing specific fault simulation algorithms in
3. Emulation environment
hardware [6]. Recently, acceleration of combinational
circuit fault diagnosis using FPGAs has been proposed in
The emulation environment was designed to work
[7]. However, the need for hardware fault emulation has
together with Turbo Tester (TT) is a diagnostic software
been driven mainly by large sequential designs whose
package developed at the Department of Computer
fault grading run-times could extend to several years.
Engineering of Tallinn University of Technology [15,
In many of the papers for sequential circuits, faults
16]. The TT software (see Figure 1) consists of several
are injected either by modifying the configuration
test-related tools - test generation by different algorithms
bitstream while the latter is being loaded into the device
(deterministic, random and genetic), test program
[8] or by using partial reconfiguration [9, 10, 11]. This
optimization, fault simulation for combinational and
kind of approach is slow due to the run-time overhead
sequential circuits, testability analysis, and fault diagnosis
required by multiple reconfigurations. Other options for
- to name few of them. The main advantage of TT is that
fault injection are shift-registers and/or decoders (used in
different methods and algorithms for various test
this paper). A paper relying on the shift-register based
problems have been implemented and can be investigated
method was presented in [12]. Shift-registers are known
separately of each other or working together in different
to require slightly less hardware overhead than the
combinations.
decoders do. However, in [12] injection codes and test
Test Pattern Analysis includes single-fault simulation,
patterns are read remotely from a host PC.
parallel fault simulation, and critical path tracing fault
In addition to merely increasing the speed of fault
analysis methods implemented in the system. These
simulation, the idea proposed in current paper can be
competing approaches can be investigated and compared
used for selecting optimal Built-In Self-Test (BIST)
for circuits of different complexities and structures. For
structures. In an earlier paper [13] a fault emulation
this part the emulation approach was proposed. So far we
method to be used for evaluating the Circular Self-Test
have experimented only with "Fault Simulation" part
Path (CSTP) type BIST architectures has been presented.
(black in Figure 1). In future, also the other simulation-
Different from the current approach, no fault collapsing
related parts might be considered (gray in Figure 1).
was carried out and fault-injecting hardware was inserted
The initial emulation environment was created
to each logic gate of the circuit to be emulated.
keeping in mind that the main purpose was to evaluate
In [14], we proposed an efficient FPGA-based fault
the feasibility of replacing fault simulation with
emulation environment for sequential circuits. The main
emulation. Based on that, the main focus was put on how
feature of the emulation approach lied in implementing
to implement circuits to be tested on FPGAs. Less
multiplexers and a decoder for fault injection, which,
attention was paid how to organize data exchange
unlike the shift register based injection, allowed to insert
between hardware and TT. For the first series of
faults in arbitrary order. This feature is highly useful
experiments, we looked at combinational circuits only.
when applying the presented environment in emulating
Results of experiments with these circuits were presented
the test generation process. In addition, we used an on-
in [5].
chip input pattern generator as opposed to loading the
For sequential circuits, most of the solutions used for
simulation stimuli from the host computer. It is also
combinational circuits could be exploited. The main
important to note that the time spent for emulator
modification was an extra loop in the controller because
synthesis is included to the experimental results of, both,
sequential circuits require not a single input combination
[14] and current paper. However, average acceleration
but a sequence consisting of tens or even hundreds of
provided by the environment was still slightly below 20 -
input combinations. Also, instead of hard-coded test
similar to the best known results from the literature. In
sequence generators and output analyzers, loadable
addition, many of the important issues concerning fault
modules were introduced. The results of first experiments
emulation for sequential circuits remained without an
with sequential circuits were presented in [14]. Before
efficient solution.
building the experimental environment, we had first to
The fault emulation environment, presented in the
solve how to insert faults, how to generate test vectors,
paper, introduces hardware support to fault dropping that
how to analyze output data, and how to automate design
pushes the speed-up nearly an order of magnitude further.
flow. The solutions and discussions are presented below.
This is achieved without a significant penalty to the area

Proceedings of the 2005 8th Euromicro conference on Digital System Design (DSD’05)
0-7695-2433-8/05 $20.00 © 2005 IEEE
Algorithms:
Levels Deterministic Circuits:
Formats: Gate Random Combinational
Hazard
EDIF Macro Genetic Sequential Multivalued
AGM Simulation
Analysis
RTL
Data
Test Logic
Generation Simulation
Specifi- Desig Test
cation n BIST Set Fault Defect
Emulation Simulation Library

Methods: Fault models:


Faulty Design Error BILBO Test Set Fault Stuck-at faults
Area Diagnosis CSTP Optimization Table Physical defects
Hybrid

Figure 1. Turbo Tester environment

Fault insertion: The main problem here was how to Compared against shift-register based fault injection
represent non-logic features - faults - in such a way that approaches (see, e.g., [12] and Figure 3), the use of
they can be synthesized using standard logic synthesis multiplexers has both advantages and disadvantages. The
tools. Since most of the analysis is done using stuck-at- main disadvantage is small increase in both area and
one and stuck-at-zero fault models, the solution was delay of the circuit. Although the delay increase is only
obvious - use multiplexers at fault points to introduce few percents, execution time may increase significantly
logic one or zero, or pass through intact logic value. Also, for long test cycles. The main advantage is that any fault
since a single fault is analyzed at a time typically, can be selected in a single clock cycle, i.e., there is no
decoders were introduced to activate faults (see Figure 2). need to shift the code of a fault into the proper register.
Combining both approaches may be the best solution and
one direction of future work will go in that direction.

1
fault
point 0
select
fault
fault stuck-at-1 /
point
stuck-at-0
from previous to next fault
select fault fault points ff ff
points
or to next
level stuck-at-1 stuck-at-0
fault decoders
codes
Figure 3. Fault point insertion with shift-register

Figure 2. Fault point insertion with multiplexer


Test vector generation and output data analysis:
Here we relied on a well-known solution for BIST -
The extra multiplexers will increase the gate count Linear Feedback Shift Register (LFSR) is used both for
and will make the circuit slower (typically 5 to 10 times). input vector generation and output correctness analysis
It is not a problem for smaller circuits but may be too (see, e.g., [17]). LFSRs structures are thoroughly studied
prohibitive for larger designs - the circuit may not fit into and their implementation in hardware is very simple. This
target FPGA. A solution is to insert faults selectively. simplifies data exchange with the software part - only
Selection algorithm, essentially fault set partitioning, is a seed and feedback polynomial vectors are needed to get a
subject of future research. desired behavior. Output correctness analysis hardware

Proceedings of the 2005 8th Euromicro conference on Digital System Design (DSD’05)
0-7695-2433-8/05 $20.00 © 2005 IEEE
needs first to store the expected output signature and then x Three counters - one to count test vectors, one to
to report to the software part whether the modeled fault count test sequences, and one to count modeled faults
was detected or not. Figure 4 illustrates a stage of used (generic VHDL module);
LFSRs. The input 'coefficient' is used for feedback x Test bench with controller (FSM) to connect all sub-
polynomial. The input 'result' is used only for result modules, to initialize LFSRs and counters, and to
analysis and is connected to zero for input vector organize data exchange with the external interface; a
generation. generic VHDL module; algorithms implemented by
the FSM are depicted in Figure 6, and
x Interface to organize data exchange between the test
result seed coefficient bench (FPGA) and the software part (PC).
flip-flop to next The interface is currently implemented only in part as
stage further studies are needed to define data exchange
D protocols between hardware and software parts. In future,
E any suitable FPGA board can be used assuming that
C
supporting interfaces have been developed.
from
previous reset
stage enable 4. Hardware Support for Fault Dropping
clock

Figure 4. Single stage of LFSR Fault emulation approaches for sequential circuits
encounter problems that are not present in combinational
fault emulation. In the sequential fault emulation process,
Design flow automation was rather easy because of first, fault-free emulation of the circuit is performed.
the modular structure of the hardware part. All modules Then, faults are injected one-by-one and for each of them
are written in VHDL that allows to parameterize design the test stimuli set is emulated. This repetitive emulation
units (see also Figure 5): requires the circuit to be in an initial state at the
x CUT - circuit under test, generated by the fault beginning of every subsequent emulation of the test set.
insertion program; Furthermore, the environment presented in this paper
x CUT-pkg and CUT-top - parameters of CUT and can be configured to use signature analysis (e.g. a MISR)
wrapper for CUT to interface with the generic test for minimizing the FPGA resources required. In that case,
environment, generated by wrapper program; all the issues that are related to signature analysis in
x Two LFSRs - one for test vector generator and one sequential BIST apply to fault emulation too. In other
for output signature calculation (generic VHDL words, we have to solve the problem of compressing
module); don’t care responses to a meaningful signature. In the

FPGA input
vectors
Host PC interface
LFSR parameters
CUT-pkg
seed and
feedback FSM CUT-top
vectors generated
CUT
list of
counters modified
detected /
netlist
undetected
faults LFSR
Test bench output
analysis

Figure 5. Emulation environment structure

Proceedings of the 2005 8th Euromicro conference on Digital System Design (DSD’05)
0-7695-2433-8/05 $20.00 © 2005 IEEE
A similar approach has been used in many
(Sequential circuits) applications, including in the parallel fault simulation
reset_LFSRs; method PROOFS [18]. An obvious shortcoming is the
for every test_sequence {
reset_CUT; duplication of hardware necessary for emulation. The
for every test_vector emulate_CUT; main advantage lies in the fact that the fault emulation
} results directly match the software simulation.
store_signature;
for every fault {
reset_LFSRs; Table 1. Four-valued gate evaluation
for every test_sequence {
reset_CUT; Y0 Y1
for every test_vector emulate_CUT; AND A0 | B0 A1 & B1
}
compare_signature; send_report; OR A0 & B0 A1 | B1
} inverter A1 A0
XOR (A0&B0)|(A1&B1) (A0&B1)|(A1&B0)

Figure 6. Algorithm implemented by the FSM Hardware Support for Fault Dropping: Only
minimal modifications were needed to introduce
hardware support for fault dropping. First, the reference
following we present two alternative solutions to the unit - the golden device - had to be inserted. For that we
above problems that can be considered as trade-offs in used the original netlist of the CUT - the one without
terms of required FPGA resources and fault grading fault models. Few alternative solutions, e.g. context-
accuracy. switching like approach with one combinational part and
Fault emulation with all registers resettable: In the two sets of registers, were also considered but the used
general case, not all the registers in sequential designs solution appeared to be the cheapest (in the terms of
could be reset. There are two possibilities to solve this resources and performance). Second, the circuitry for on-
issue. First, (and this is the approach used in current fly comparison was needed to get the full benefit of the
paper), it is possible to alter the initial design to contain fault dropping approach. This was implemented using a
resettable registers only. This can be considered as a kind simple parallel comparator and modified emulation
of design-for-testability modification. However, this algorithm. Moreover, the emulation environment was
might not be desireable, as the circuit’s silicon area also simplified because the output signature analyzer is
would increase. not needed anymore.
Another option is to modify the circuit only when The modified structure (see Figure 7) makes use of
emulated on the FPGA without modifying the design to the new CUT-top module to include also the golden
be manufactured itself. The drawback here is that using device (GOLD). The LFSR for output analysis has been
this approach we cannot compare the results directly to replaced with comparator (CMP). In a similar manner,
the ones of software fault simulation. However, it is a the emulation algorithm (see Figure 8) has been modified
reasonable approach if fast fault grading is needed with - there is no need build the fault free signature but a
limited FPGA resources available. Furthermore, as our comparison is performed at the end of every emulation
preliminary experiments indicate the difference from the step.
‘real’ fault coverage is negligible when using the all-
resettable accelerator for designs containing non- 5. Experimental Results
resettable registers.
Encoded don’t-care value approach: An alternative For experiments, a powerful RC1000-PPE board with
solution for the problem is to rely on the encoded don’t- VirtexE chip (19200 slices) was used. Test circuits were
care value approach. It is based on coding the three- selected from ISCAS'89 and HLSynt'92 benchmark sets
valued logic: 0, 1, X. To code these values we need two to evaluate the speedup when replacing fault simulation
bits and thus, two instances of the emulating circuit. 0 is with emulation on FPGA. Results of some benchmarks
coded as (1,0), 1 as (0,1) and X as (0,0). (Additionally, it are presented in the paper to illustrate gains and losses of
is possible to code the third-state high impedance value Z our approach (see Table 2). Column "# of faults"
as (1,1)). Table 1 shows the logic used for evaluating the illustrates the complexity of the test circuits. The first set
basic gates AND, OR, inverter, and XOR. The gates have of columns (non-dropping) illustrates parameters of the
inputs A, B, and an output Y. The lower indices are for initial emulation environment while the second set of
indexing the coding bits. columns (fault dropping) illustrates the improved
environment. The columns "# of vectors" illustrate the

Proceedings of the 2005 8th Euromicro conference on Digital System Design (DSD’05)
0-7695-2433-8/05 $20.00 © 2005 IEEE
FPGA input
vectors
Host PC interface
LFSR parameters
CUT-pkg
seed and counters
feedback CUT-top
vectors generated
CUT
FSM
list of
modified
detected /
GOLD netlist
undetected
CMP
faults
unmodified
Test bench netlist

Figure 7. Modified emulation environment structure

complexity of tests. The column "SW" gives the fault wire to represent four different values. The logic was not
simulation time based on a parallel algorithm and "emul" always doubled because inputs and resettable/settable
emulation time for the same set of test vectors. Synthesis registers have two effective values and therefore the logic
times have been added for comparison ("synt"). Columns could be simplified.
"slices" give the size of the emulation environment in
FPGA.
For different benchmarks, the initial hardware (Sequential circuits & fault dropping)
emulation was in average 33 (ranging from 13 to 85) for every fault {
times faster than the software fault simulation. The reset_LFSR;
for every test_sequence {
improved environment was in average 250 times faster reset_CUT; reset_GOLD;
(from 44 to 517). It should be noted that when for every test_vector {
considering also synthesis times, it might not be useful to emulate_CUT; emulate_GOLD;
replace simulation with emulation, especially for smaller if (outputs_differ)
break test_sequence;
designs. Nevertheless, taking into account that sequential }
circuits, as opposed to combinational ones, have much }
longer test sequences, the use of emulation will pay off. send_report;
The second approach - encode don't-care values - }
does not improve the speed-up, in principle, but increases
50% the size of the emulation hardware in average. This
is caused by the need to use two wires for every original Figure 8. Algorithm for fault dropping

Table 2. Results of experiments

circuit # of non-dropping fault dropping


faults # of vectors SW HW # of HW
# of seq. simul slices MHz synt emul vectors slices MHz synt emul
seq. len.
s5378 5150 80 100 26.8" 3311 35 11.8' 1.18" 2896 3573 30 14.8' 0.50"
s15850 12314 200 200 15.6' 9939 15 79' 32.8" 25521 10131 15 85' 21.0"
GCD (16) 1634 80 50 5.28" 1094 40 2.8' 0.16" 510 1152 40 2.9' 0.02"
GCD (32) 3734 10 400 22.6" 2227 25 7.9' 0.60" 558 2456 25 9.0' 0.08"
prefetch (16) 1042 40 100 1.34" 776 75 1.2' 0.06" 181 701 75 1.3' 3 ms
prefetch (32) 2252 40 400 9.46" 1529 50 3.4' 0.72" 1904 1448 50 3.6' 0.09"
diff-eq (16) 10008 20 200 87.9" 7469 10 80' 4.00" 175 7710 10 82' 0.17"
TLC 468 40 100 2.69" 391 60 41" 0.03" 925 409 60 41" 0.01"

Proceedings of the 2005 8th Euromicro conference on Digital System Design (DSD’05)
0-7695-2433-8/05 $20.00 © 2005 IEEE
6. Conclusions [6] M. Abramovici, P. Menon, "Fault Simulation on
Reconfigurable Hardware." In Proc. of FPGAs for
The paper presents an improved version of fault Custom Computing Machines, pp.182-190, April
emulation of synchronous sequential circuits - a fully 1997.
resettable design approach has been modified to support [7] S.-K. Lu, J.-L. Chen, C.-W. Wu, W.-F. Chang, S.-
fault dropping. An FPGA-based emulation environment Y. Huang, "Combinational Circuit Fault Diagnosis
implementing the former has been implemented. Using Logic Emulation.” In Proc. of ISCAS'03,
Experiments with ISCAS'89 and HLSynth'92 benchmarks Vol.5, pp.549-552, 2003.
have been carried out. [8] M. Alderighi, S. D'Angelo, M. Mancini, G.R. Sechi,
The experiments showed that for circuits that require "A Fault Injection Tool for SRAM-based FPGAs."
large numbers of test vectors, e.g., sequential circuits, it is In Proc. of 9th IEEE International On-Line Testing
beneficial to replace simulation with emulation. Although Symposium (IOLTS'03), pp.129-133, July 2003.
even for combinational circuits the simulation speedup is
[9] R. Wieler, Z. Zhang, R. D. McLeod, "Simulating
significant, there exist rather large penalty caused by
Static and Dynamic Faults in BIST Structures with a
synthesis time. Based on that, it can be concluded that the
FPGA Based Emulator." In Proc. of FPL’94,
most useful application would be to explore test
Springer-Verlag, pp.240-250, Sept. 1994.
generation and analysis architectures based on easily
reprogrammed structures, e.g., LFSRs. This makes fault [10] K.-T. Cheng, S.-Y. Huang, W.-J. Dai, "Fault
emulation very useful to select the best emulation: a new approach to fault grading." In
generator/analyzer structures for BIST. The need to Proc. of ICCAD’95, pp.681-686, Nov. 1995.
simulate the same circuit many times with different seed- [11] L. Burgun, F. Reblewski, G. Fenelon, J. Barbier, O.
and feedback vectors reduces also the impact of the main Lepape, "Serial fault emulation." In Proc. DAC’96,
drawback of the approach – rather large synthesis times. pp.801-806, Las Vegas, USA, June 1996.
Another useful application of fault emulation would be [12] S.-A. Hwang, J.-H. Hong, C.-W. Wu, "Sequential
genetic algorithms of test pattern generation where also Circuit Fault Simulation Using Logic Emulation." In
large numbers of test vectors are analyzed. Future work IEEE Trans. on CAD of Int. Circuits and Systems,
includes development of more advanced on chip test Vol.17, No.8, pp.724-736, Aug. 1998.
vector generators and analyzers.
[13] R. Wieler, Z. Zhang, R. D. McLeod, "Emulating
Static Faults Using a Xilinx Based Emulator." In
Acknowledgments:
Proc. of IEEE Symposium on FPGAs for Custom
This work was supported partly by ESF grants No 5601
Computing Machines, pp.110-115, April 1995.
and 5637, Enterprise Estonia funded ELIKO Competence
Center and by European projects IST-2001-37592 "E- [14] Authors, "Evaluating Fault Emulation on FPGA."
VIKINGS II" and IST-2000-30193 "REASON". In Proc. of FPL'04, Antwerp, Belgium, pp.354-363,
Aug. 2004.
References [15] G. Jervan, A. Markus, P. Paomets, J. Raik, R. Ubar,
"Turbo Tester: A CAD System for Teaching Digital
[1] ITRS 2004 roadmap - URL: public.itrs.net Test." In Microelectronics Education, Kluwer
[2] P. McGeer, K. McMillan, A. Saldanha, A. Academic Publishers, pp.287-290, 1998.
Sangiovanni-Vincetelli, P. Scaglia, "Fast Discrete [16] "Turbo Tester" home page - URL:
Function Evaluation Using Decision Diagrams." In http://www.pld.ttu.ee/tt
Proc. of ICCAD''95, pp.402-407, Nov. 1995. [17] D.K. Pradhan, C. Liu, K. Chakrabarty, "EBIST: A
[3] "Axis Systems Uses World's Largest FPGAs from Novel Test Generator with Built in Fault Detection
Xilinx to Deliver Most Efficient Verification Capability." In Proc. of DATE'03, pp. 224-229,
System in the Industry." Xilinx Press Release #0273 Munich, Germany, March 2003.
- http://www.xilinx.com/ [18] T. M. Niermann, W.-T. Cheng, J. H. Patel,
[4] R. Sedaghat-Maman, E. Barke, "A New Approach "PROOFS: a Fast, Memory Efficient Sequential
to Fault Emulation." In Proc. of Rapid System Circuit Fault Simulator". Proc. DAC, pp. 535-540,
Prototyping, pp.173-179, June 1997. 1990.
[5] Authors, "Fault Emulation on FPGA: A Feasibility
Study." In Proc. of Norchip''03, Nov. 11-12, 2003.

Proceedings of the 2005 8th Euromicro conference on Digital System Design (DSD’05)
0-7695-2433-8/05 $20.00 © 2005 IEEE

You might also like