# Design for Reliability and Manufacturing Yield (Study and Modeling of Defects in Integrated Circuits for their Reliability Analysis)

G. Ait Abdelmalek and R. Ziani

Abstract: This paper proposes a design strategy that improves the robustness of logic CMOS circuits against manufacturing defects, and thus their reliability. To enable the use of future CMOS technology nodes, the strategy combines several design approaches: DFR (Design for Reliability), fault tolerance through hardware redundancy, namely TMR (Triple Modular Redundancy) for hard error tolerance, and DFT (Design for Testability). Results on the largest ISCAS and ITC benchmark circuits show that the approach considerably improves reliability by acting on its key factors, the area cost and the fault tolerance probability.

**Keywords**: Design for reliability, design for testability, fault tolerance, manufacturing yield.

## I. INTRODUCTION

**M**ODERN electronic systems are becoming increasingly vulnerable to errors and failures. Identifying these failures effectively through testing is therefore important. Testing integrated circuits makes it possible, on the one hand, to deliver working systems and, on the other hand, through diagnosis, to call the design procedures or the manufacturing process into question. Testing thus increases production efficiency and has been the subject of numerous studies: some address the types of defects, others propose testing techniques. However, these failures require the design methods of electronic systems to be constantly reexamined, since those methods must guarantee controlled behavior in critical environments. In particular, control of reliability is a key element in the success of technological change. This is why, for many years, the difficulties of designing integrated circuits (ICs), brought about by the constraints of future CMOS technology nodes and low supply voltages, have held the attention of industry and electronics designers [1].
To reduce the effect of failures, the first solution proposed by technologists, from the year 2000 onward, was to replace aluminum tracks with copper tracks, which speed up the interconnections between transistors, reduce losses and increase resistance to electromigration. Alongside these solutions, traditional fault tolerance methods (redundancy) and component hardening are used. The fault tolerance inherent in modular design also provides a new testing strategy when it is integrated into complex circuits. However, testing these structures requires methods and tools for predictive evaluation of their level of tolerance to faults induced by the manufacturing process or arising during use. We were interested in tools capable of performing fault injection within the design cycle. Such tools should make it possible, for example, to obtain an early assessment of the reliability of a circuit when only its high-level model is available.

This article is intended to meet the emerging demands of industry for methods that increase and ensure optimal reliability of integrated circuits early in the design flow, even in the presence of defects. We propose a design methodology capable of increasing the reliability of such circuits through increased manufacturing yield. The methodology, shown in Fig. 1, consists of two parts:

- A methodology for testing integrated circuits in order to detect their failures: fault simulation, ATPG (Automatic Test Pattern Generation) and DFT (Design for Testability), used to ensure the reliability of the circuits and to reduce their test cost while preserving their performance.
- A methodology for fault tolerance and improved reliability: triple modular redundancy (TMR), enhanced into double TMR and triple TMR by partitioning the redundant elements that make up the circuit, to ensure its operation while minimizing its area and therefore the cost of the overall circuit.

This method is then applied to the ISCAS85 and ITC99 benchmark circuits, which demonstrates its feasibility and effectiveness.

Fig. 1 Design methodology used

G. Ait Abdelmalek is with the Department of Electronics, Mouloud Mammeri University, 15 000 Tizi-ouzou, Algeria (phone: (+213) 026218642; fax: (+213) 026218642; e-mail: ghania\_79@yahoo.fr).

R. Ziani was with the Department of Electronics, Mouloud Mammeri University, 15 000 Tizi-ouzou, Algeria (phone: (+213) 026218642; fax: (+213) 026218642; e-mail: ziani\_r@yahoo.fr).

## II. TEST OF INTEGRATED CIRCUITS

The advent of integrated circuit technology has introduced electronics into many aspects of present-day life. As the use of electronic components increases, so does the expectation of lower cost, better accuracy and higher reliability. Lower cost and better accuracy are achieved by putting more transistors per unit area of silicon, using design automation, increasing device operating speed and reducing power consumption. However, these design steps cannot guarantee reliability. In fact, as circuit density increases, the probability of a manufacturing defect increases. The higher expectation of reliability can only be met by more thorough and comprehensive testing.

Classical fault models (stuck-at, stuck-open, stuck-on, ...) have proved efficient for the analysis of many of these faults, and the majority of test techniques are based on them [2]-[3]. However, it is well known that these fault models cover only part of the spectrum of real failures in today's integrated circuits, and functional test tends to be replaced by structural test [2]-[4].
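To make the stuck-at model concrete, the following Python sketch simulates single stuck-at faults on a tiny circuit and reports which fraction of them a given set of test vectors detects. The circuit `y = (a AND b) OR c`, the net names and the helper functions are hypothetical choices made for illustration; they are not taken from the benchmarks or tools used in this paper.

```python
from itertools import product

# Nets of the illustrative circuit y = (a AND b) OR c (hypothetical example).
NETS = ("a", "b", "c", "n1", "y")


def simulate(a, b, c, fault=None):
    """Evaluate the circuit; `fault` is None or (net_name, stuck_value)."""
    def observe(net, value):
        # A stuck-at fault forces the net to a fixed value regardless of logic.
        if fault is not None and fault[0] == net:
            return fault[1]
        return value

    a = observe("a", a)
    b = observe("b", b)
    c = observe("c", c)
    n1 = observe("n1", a & b)
    return observe("y", n1 | c)


def fault_coverage(tests):
    """Fraction of single stuck-at faults detected by the given test vectors."""
    faults = [(net, value) for net in NETS for value in (0, 1)]
    detected = [
        f for f in faults
        if any(simulate(*t) != simulate(*t, fault=f) for t in tests)
    ]
    return len(detected) / len(faults)


if __name__ == "__main__":
    exhaustive = list(product((0, 1), repeat=3))
    print("coverage, one vector (1,1,0):", fault_coverage([(1, 1, 0)]))
    print("coverage, exhaustive set:    ", fault_coverage(exhaustive))
```

A fault counts as detected when at least one vector makes the faulty output differ from the fault-free output, which is the usual criterion in structural test; real ATPG tools apply the same idea with far more efficient algorithms.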
The effectiveness of the test depends on the yield Y, the fault coverage FC and the defect level DL. These correspond, respectively, to the ratio of the number of circuits that pass the test to the total number of circuits, the ratio of the number of faults detected to the total number of faults, and the ratio of the number of faulty circuits that pass the test to the number of circuits that pass the test. As in (1), studies [4] showed that:

$$DL = \left[1 - Y^{1-FC}\right] \times 100\% \tag{1}$$

## III. MANUFACTURING YIELD AND TMR STRUCTURES

## A. General Principles

Fault-tolerant digital structures were mainly designed to tolerate faults appearing during circuit operation. Their principle is to use redundant resources to detect and correct faults. From a technology standpoint, there are several fault-tolerant structures, classified [5]–[6] according to the redundancy resources they use: hardware, information, time, software, or hybrid (a combination of several resources). These structures were designed to tolerate transient or temporary faults, but they can also tolerate manufacturing defects and thus increase the yield. Fig. 2 shows the simplest and most widely used of these structures, TMR (Triple Modular Redundancy). However, implementing such structures is very expensive. Designers are therefore motivated less by reaching the highest possible manufacturing yield than by determining the manufacturing process that produces reliable circuits at the lowest possible production cost.

A TMR structure consists of three identical modules M and a voter (Fig. 3). The outputs of the three identical modules are passed to a majority voter, which produces a single output. Equation (2) gives the reliability of the TMR structure [6].

Fig. 3 Voter bit by bit

## B. Calculation of the Manufacturing Yield of a TMR Structure

Let $A_C$ be the area of the original circuit without redundancy and $A_V$ the area of the voter. Neglecting the area of the interconnections, the area $A_{TMR}$ of the TMR structure is given by (3):

$$A_{TMR} = 3 A_C + A_V \tag{3}$$

The area cost $A_O$ (Area Overhead) of transforming a circuit into a TMR structure is then [7]–[8]:

$$A_O = \frac{3 A_C + A_V}{A_C} = 3 + \frac{A_V}{A_C} \tag{4}$$

Let $Y_{TMR}$ be the manufacturing yield of the TMR structure and $Y_C$ the manufacturing yield of the structure without redundancy. The TMR structure increases the yield, and therefore the reliability, under the following condition:

$$Y_{TMR} > A_O \times Y_C \tag{5}$$

Since $Y_{TMR} \leq 1$, this condition can only be met if $A_O \times Y_C < 1$, that is:

$$Y_C < \frac{1}{A_O} \tag{6}$$

## IV. APPLICATION

In our application we used combinational circuits from the ISCAS85 and ITC99 benchmarks [9] and transformed their architecture into TMR structures. Fig. 4 summarizes the principal phases of the test of the ISCAS85 and ITC99 benchmark circuits, simulated under multiple simultaneous stuck-at faults.

Fig. 4 Summary of the principal phases of the ATPG

For an effective test of TMR structures, new fault models should be introduced into the ATPG so that several manufacturing faults can be detected. Furthermore, to observe the behavior of the circuit in the presence of multiple simultaneous faults, we injected pairs of stuck-at faults (s@0, s@1) to verify testability at the Verilog level. The ATPG provides the fault coverage FC and the tolerance probability T, that is, the probability that a pair of faults is tolerated by the structure, defined by:

$$T = \frac{\text{number of pairs of faults tolerated}}{\text{total number of pairs of faults}}$$

To calculate the probability T, we used the simulation software TetraMAX [9]. Table I summarizes the characteristics of the circuits tested.
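The quantities introduced in Sections III and IV lend themselves to a quick numerical check. The Python sketch below is illustrative only: the function names and the example areas are hypothetical, the pair count assumes that pairs are drawn from three replicas of the single stuck-at fault list (an assumption that happens to reproduce several pair counts of Table I, such as 18336 pairs for 64 single faults), and the reliability function uses the textbook 2-out-of-3 expression, which is not necessarily the exact form of (2). Feeding a T value into that expression gives numbers close to the R column of Table I, but this correspondence is an observation, not something stated in the text.

```python
from math import comb


def area_overhead(a_c: float, a_v: float) -> float:
    """Eq. (4): area of the TMR structure relative to the original circuit."""
    return (3 * a_c + a_v) / a_c


def yield_threshold(a_o: float) -> float:
    """Eq. (6): TMR can only pay off if the original yield is below 1 / A_O."""
    return 1.0 / a_o


def fault_pairs(single_faults: int, copies: int = 3) -> int:
    """Unordered fault pairs, ASSUMING they are drawn from `copies` replicas
    of the single stuck-at fault list (an assumption, not stated in the text)."""
    return comb(copies * single_faults, 2)


def tolerance_probability(tolerated_pairs: int, total_pairs: int) -> float:
    """T = tolerated pairs / total pairs, as defined in the text."""
    return tolerated_pairs / total_pairs


def tmr_reliability(r_module: float, r_voter: float = 1.0) -> float:
    """Textbook 2-out-of-3 reliability; r_voter = 1.0 models an ideal voter."""
    return r_voter * (3 * r_module**2 - 2 * r_module**3)


if __name__ == "__main__":
    # Hypothetical areas in arbitrary units, chosen only for illustration.
    a_o = area_overhead(a_c=100.0, a_v=10.0)
    print(f"A_O = {a_o:.2f}, yield threshold = {yield_threshold(a_o):.3f}")
    print("pairs for 64 single faults:", fault_pairs(64))
    print(f"T for 40 tolerated pairs out of 100: {tolerance_probability(40, 100):.2f}")
    print(f"2-out-of-3 reliability at 0.40: {tmr_reliability(0.40):.3f}")
```

Because the voter adds area on top of the three module copies, $A_O$ is always above 3, so the threshold of (6) is always below one third: under this model, TMR can only raise the yield of circuits whose original yield is already poor.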
TABLE I CHARACTERISTICS OF THE CIRCUITS TESTED

| Benchmark | Simulated circuit | In/Out | Gates # | Stuck-at faults # | Pairs of faults # | Pairs of faults reduced by ATPG | $A_O$ (%) | T (%) | R (%) |
|-----------|-------------------|---------|---------|-------------------|-------------------|---------------------------------|-----------|-------|-------|
| ISCAS85 | c432 | 36/7 | 160 | 392 | 689725 | $75 \times 10^3$ | 3.10 | 40.00 | 35.2 |
| | c499 | 41/32 | 202 | 486 | 1062153 | $95 \times 10^3$ | 3.39 | 52.73 | 54.09 |
| | c1908 | 33/25 | 880 | 1826 | 15001503 | $1.3 \times 10^6$ | 3.08 | 56.45 | 59.62 |
| | c2670 | 233/140 | 1193 | 2852 | 36598290 | $1.87 \times 10^6$ | 3.33 | 75.96 | 85.44 |
| | c3540 | 50/22 | 1669 | 3438 | 53184141 | $4.91 \times 10^6$ | 3.04 | 54.09 | 56.12 |
| | c5315 | 178/123 | 2307 | 4970 | 11114659 | $3.4 \times 10^6$ | 3.16 | 93.20 | 98.67 |
| | c6288 | 32/32 | 2416 | 6250 | 17577187 | $18.2 \times 10^6$ | 3.03 | 38.03 | 32.38 |
| | c7552 | 207/108 | 3512 | 7438 | 24894614 | $7.40 \times 10^6$ | 3.09 | 84.93 | 93.87 |
| ITC99 | b02 | 1/1 | 25 | 64 | 18336 | $1.5 \times 10^3$ | 3.47 | 86.37 | 94.93 |
| | b03 | 4/4 | 150 | 382 | 656085 | $290 \times 10^3$ | 3.57 | 87.93 | 95.98 |
| | b04 | 11/8 | 480 | 1477 | 9814665 | $535 \times 10^3$ | 3.32 | 84.30 | 93.37 |
| | b05 | 1/36 | 608 | 2553 | 29326311 | $1.17 \times 10^6$ | 3.16 | 88.66 | 96.43 |
| | b06 | 2/6 | 66 | 155 | 107880 | $7.73 \times 10^3$ | 3.68 | 87.50 | 95.70 |
| | b07 | 1/8 | 382 | 1120 | 5643120 | $399 \times 10^3$ | 3.38 | 81.90 | 91.35 |
| | b09 | 1/1 | 131 | 417 | 781875 | $57.3 \times 10^3$ | 3.45 | 83.07 | 92.37 |
| | b10 | 11/6 | 172 | 468 | 984906 | $63.5 \times 10^3$ | 3.31 | 89.40 | 96.86 |
| | b11 | 7/6 | 366 | 1308 | 7696926 | $703 \times 10^3$ | 3.19 | 74.50 | 83.80 |
| | b12 | 5/6 | 1000 | 2777 | 34698615 | $857 \times 10^3$ | 3.30 | 95.46 | 99.40 |
| | b13 | 10/10 | 309 | 835 | 3136260 | $59.6 \times 10^3$ | 3.49 | 96.96 | 99.72 |

## A. Impact of Cost in Silicon Area on T Tolerance Probability

Fig. 5 shows that transforming the circuits into TMR structures does not increase their manufacturing yield, and thus does not increase their reliability. The cost in silicon area of implementing them is very high, and the tolerance is not sufficient to offset it. In other words, for a TMR structure to increase the yield of a circuit, the circuit must lie above the curve ($T > T_{min}$), which is not the case in Fig. 5.

Fig. 5 Fault tolerance of circuits as a function of the area cost ($A_O$)

## B. Optimization of TMR Tolerance Probability

For TMR structures to increase reliability through increased manufacturing yield, other solutions must be found. Two options are possible: either reduce the cost in silicon area, in other words use another type of structure, or use one of the DFT (Design for Testability) design methods, namely the partitioning shown in Fig. 6, to reinforce the redundancy of TMR structures [10]. We chose to implement the second solution.

Fig. 6 Principle of partitioning

## C. Partitioning Results

In this section, we examine the results for simple TMR, double TMR and triple TMR structures to determine whether their implementation increases the manufacturing yield and reliability of the circuits. The results were obtained with Sh-METIS and with a program we developed to transform simple TMR structures into double TMR and triple TMR structures.

## 1. Characteristics of partitioned TMR structures

Table II presents the characteristics of the partitioned TMR structures.
For example, the circuit c2670 is characterized by:

- 140 voters for the simple TMR structure.
- 160 voters for the double TMR structure, with an area cost of 1.40%; partitioning into two equivalent parts requires 20 cuts and thus 20 additional voters (140 + 20 = 160).
- 179 voters for the triple TMR structure, representing an area cost of 2.72% compared with the simple TMR structure; partitioning into three equivalent parts requires 39 cuts and thus 39 additional voters (179 - 140 = 39).

Finally, the approach also addresses aging phenomena and thus increases the expected lifetime of logic circuits.

Table II shows that the area overhead of partitioning a TMR structure is lower for larger circuits: less than 2% for the largest ISCAS85 circuits and less than 5% for the largest ITC99 circuits. This low area overhead is favorable for the probability $T_{min}$, which makes the required condition $T > T_{min}$ easier to satisfy. Indeed, several structures, in particular triple TMR, satisfy the conditions for increasing reliability and manufacturing yield.
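The voter arithmetic of the c2670 example and the condition $T > T_{min}$ can be sketched in a few lines of Python. The function names are hypothetical, the bitwise majority function is the standard 2-out-of-3 vote and is only assumed to correspond to the bit-by-bit voter of Fig. 3, and the T and $T_{min}$ figures in the usage lines are read from the triple TMR columns of Table II.

```python
def majority(a: int, b: int, c: int) -> int:
    """Standard bitwise 2-out-of-3 vote over three module output words."""
    return (a & b) | (a & c) | (b & c)


def voters_after_partitioning(simple_tmr_voters: int, cuts: int) -> int:
    """Each cut introduced by the partitioning adds one voter."""
    return simple_tmr_voters + cuts


def satisfies_tolerance(t_percent: float, t_min_percent: float) -> bool:
    """Condition T > T_min, with both values expressed in percent."""
    return t_percent > t_min_percent


if __name__ == "__main__":
    # One module output is corrupted; the vote still returns the majority word.
    print(bin(majority(0b1010, 0b1010, 0b0110)))

    # c2670: 140 voters for simple TMR, 20 cuts (double) and 39 cuts (triple).
    print(voters_after_partitioning(140, 20))
    print(voters_after_partitioning(140, 39))

    # Triple TMR with robust voters, values in percent from Table II.
    print("c2670:", satisfies_tolerance(93.90, 93.37))
    print("c6288:", satisfies_tolerance(84.14, 92.73))
```

With these figures c2670 meets the robust-voter condition while c6288 does not, which illustrates why the text speaks of several structures rather than all of them.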
TABLE II PARTITIONING RESULTS

| Simulated circuit | Simple TMR voters # | Double TMR voters # | Double TMR $A_O$ (%) | Triple TMR voters # | Triple TMR $A_O$ (%) | Triple TMR T (%) | Robust voters $T_{min}$ (%) | Robust voters R (%) | Non-robust voters $T_{min}$ (%) | Non-robust voters R (%) |
|-------|-----|-----|-------|-----|-------|-------|-------|-------|-------|-------|
| c432 | 7 | 29 | 10.55 | 38 | 14.86 | 95.53 | 93.59 | 99.41 | 99.61 | 88.48 |
| c499 | 32 | 49 | 6.14 | 59 | 9.75 | 95.11 | 93.79 | 99.30 | NA | 88.38 |
| c1908 | 25 | 53 | 3.03 | 68 | 4.65 | 96.50 | 93.05 | 99.64 | 96.29 | 88.68 |
| c2670 | 140 | 160 | 1.40 | 179 | 2.72 | 93.90 | 93.37 | 98.93 | 98.35 | 88.04 |
| c3540 | 22 | 56 | 1.90 | 95 | 4.09 | 93.79 | 92.91 | 98.89 | 95.36 | 88.01 |
| c5315 | 123 | 149 | 1.05 | 161 | 1.53 | 96.38 | 92.99 | 99.61 | 95.90 | 88.65 |
| c6288 | 32 | 49 | 0.58 | 64 | 1.10 | 84.14 | 92.73 | 93.25 | 93.93 | 82.99 |
| c7552 | 108 | 134 | 0.69 | 158 | 1.32 | 96.22 | 92.86 | 99.58 | 94.85 | 88.62 |
| b02 | 5 | 8 | 8.15 | 9 | 10.87 | 93.28 | 93.95 | 98.70 | NA | 87.84 |
| b03 | 34 | 43 | 4.25 | 47 | 6.14 | 96.75 | 93.88 | 99.68 | NA | 88.72 |
| b04 | 74 | 99 | 3.23 | 109 | 4.52 | 95.53 | 92.44 | 99.41 | 98.79 | 88.48 |
| b05 | 60 | 73 | 1.08 | 91 | 2.57 | 93.30 | 93.06 | 98.71 | 96.41 | 87.85 |
| b06 | 15 | 22 | 8.64 | 26 | 13.58 | 96.26 | 94.30 | 99.59 | NA | 88.63 |
| b07 | 57 | 80 | 4.58 | 84 | 5.38 | 94.11 | 92.59 | 99.00 | 99.66 | 88.11 |
| b09 | 29 | 40 | 4.96 | 44 | 6.77 | 97.13 | 93.75 | 99.75 | NA | 88.78 |
| b10 | 23 | 36 | 5.28 | 50 | 10.97 | 97.34 | 94.73 | 99.79 | NA | 88.81 |
| b11 | 37 | 76 | 6.14 | 94 | 8.97 | 93.91 | 93.40 | 98.93 | 98.77 | 88.04 |
| b12 | 127 | 139 | 0.85 | 167 | 2.84 | 97.79 | 93.37 | 99.85 | 98.05 | 88.87 |
| b13 | 63 | 66 | 0.67 | 70 | 1.56 | 97.83 | 93.54 | 99.86 | 99.40 | 88.88 |

## 2. Impact of cost in silicon area on T tolerance probability

Fig. 7 shows that partitioning techniques improve the tolerance of TMR structures: the majority of the tested circuits have characteristics lying between the curves corresponding to the two cases, non-robust voter and robust voter. If design effort is spent on making the voters as robust as possible, the real $T_{min}$ curve corresponding to that effort would lie between the two extreme cases presented in Fig. 7.

Fig. 7 Reliability and yield improved through the partitioning

## V. CONCLUSION

This paper analyzes the behavior of integrated circuits designed for testability (DFT) and reliability in the presence of manufacturing defects, and their ability to tolerate those defects. Circuits transformed into TMR structures provide a good compromise between manufacturing yield, reliability and area overhead. The impact of a fault-tolerant structure on manufacturing yield and reliability can be positive when two conditions are met: (i) the manufacturing yield must be less than $1/A_O$, and (ii) the probability T must be greater than a value $T_{min}$ that depends on the technological parameters of manufacture. Fault tolerance can be further optimized through a judicious choice of the partitioning and of the key locations within the circuit where voting is most appropriate. However, redundancy consumes more energy and increases the propagation delay through the circuit. The trade-off must therefore be analyzed, in particular by assessing, at system level, how much redundancy can be added without losing the benefits expected from new technologies.

## REFERENCES

- [1] International Technology Roadmap for Semiconductors (ITRS), 2007 Edition.
- [2] M. Cimino, "Design of circuits radio frequencies under constraints of extended reliability," doctoral thesis, University of Bordeaux I, 2007.
- [3] A. Machouat, "Development and application of a method of analysis of functional failures and contribution to the improvement of the use of the static and dynamic optical techniques," doctoral thesis, University of Bordeaux I, 2008.
- [4] A. Bounceur, "Platform CAD for the test of mixed circuits," doctoral thesis, Institut National Polytechnique of Grenoble.
- [5] J. Han and P. Jonker, "Toward hardware-redundant, fault-tolerant logic for nanoelectronics," *IEEE Design & Test of Computers*, vol. 22, no. 4, 2005.
- [6] M. Hafezparast, "Fault tolerant hardware designs and their reliability analysis," doctoral thesis, Brunel University of West London, 1990.
- [7] C. H. Stapper, "Yield model for fault clusters within integrated circuits," *IBM Journal of Research and Development*, vol. 28, no. 5, USA, 1984.
- [8] D. P. Siewiorek and R. S. Swarz, "Reliable Computer Systems: Design and Evaluation," Digital Press, 1992.
- [9] Web site, www.vtvt.ece.vt.edu/vlsidesign/cadtools.php
- [10] C. Edmond Bichot, "Development of a new metaheuristic for the airspace division," doctoral thesis, Institut National Polytechnique of Toulouse, 2007.