Defect Cause Modeling with Decision Tree and Regression Analysis
The main aim of this study is to identify the most
influential variables that cause defects in the items produced by a
casting company located in Turkey. To this end, one of the company's
products with a high percentage of defective items is selected.
Two approaches, regression analysis and decision trees, are
used to model the relationship between process parameters and
defect types. Although the logistic regression models failed, the decision
tree model gives meaningful results. Based on these results, it can be
claimed that the decision tree approach is a promising technique for
determining the most important process variables.
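
As a rough illustration of how a tree learner can rank process variables, the sketch below computes information gain, an entropy-based splitting criterion of the kind used by C4.5-style decision tree algorithms. It is not the study's model or data: the variable names (`pour_temp`, `sand_moisture`), the defect labels, and the records are all hypothetical, and the helper `rank_variables` is introduced only for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Drop in target entropy after splitting rows on one categorical feature."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def rank_variables(rows, features, target):
    """Return (gain, feature) pairs sorted from most to least informative."""
    return sorted(((information_gain(rows, f, target), f) for f in features), reverse=True)


# Hypothetical, made-up records; names and values are illustrative only.
rows = [
    {"pour_temp": "high", "sand_moisture": "low", "defect": "porosity"},
    {"pour_temp": "high", "sand_moisture": "high", "defect": "porosity"},
    {"pour_temp": "low", "sand_moisture": "low", "defect": "none"},
    {"pour_temp": "low", "sand_moisture": "high", "defect": "none"},
    {"pour_temp": "high", "sand_moisture": "low", "defect": "porosity"},
    {"pour_temp": "low", "sand_moisture": "high", "defect": "none"},
]

ranking = rank_variables(rows, ["pour_temp", "sand_moisture"], "defect")
for gain, name in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data the variable that separates the defect classes cleanly receives the highest gain, which is the sense in which a tree's top splits point to the most influential process variables.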
@article{"International Journal of Mechanical, Industrial and Aerospace Sciences:58290",
  author = "B. Bakır and İ. Batmaz and F. A. Güntürkün and İ. A. İpekçi and G. Köksal and N. E. Özdemirel",
  title = "Defect Cause Modeling with Decision Tree and Regression Analysis",
  abstract = "The main aim of this study is to identify the most influential variables that cause defects in the items produced by a casting company located in Turkey. To this end, one of the company's products with a high percentage of defective items is selected. Two approaches, regression analysis and decision trees, are used to model the relationship between process parameters and defect types. Although the logistic regression models failed, the decision tree model gives meaningful results. Based on these results, it can be claimed that the decision tree approach is a promising technique for determining the most important process variables.",
  keywords = "Casting industry, decision tree algorithm C5.0, logistic regression, quality improvement.",
  volume = "2",
  number = "12",
  pages = "1327-4",
}