Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Pulmonary Function Tests are important non-invasive diagnostic tests to assess respiratory impairments and provides quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients resulting in incomplete data sets. This paper presents, a nonlinear model built using Multivariate adaptive regression splines and Random forest regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N= 198 subjects. The ranked order of feature importance index calculated by the random forests model shows that the spirometric features FVC, FEF25, PEF, FEF25-75, FEF50 and the demographic parameter height are the important descriptors. A comparison of performance assessment of both models prove that, the prediction ability of MARS with the `top two ranked features namely the FVC and FEF25 is higher, yielding a model fit of R2= 0.96 and R2= 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects) with the aforementioned input features. It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, as well as yielding better prediction performance. This analysis can assist clinicians with a intelligence support system in the medical diagnosis and improvement of clinical care.

Estimation of Missing or Incomplete Data in Road Performance Measurement Systems

Modern management in most fields is performance based; both planning and implementation of maintenance and operational activities are driven by appropriately defined performance indicators. Continuous real-time data collection for management is becoming feasible due to technological advancements. Outdated and insufficient input data may result in incorrect decisions. When using deterministic models the uncertainty of the object state is not visible thus applying the deterministic models are more likely to give false diagnosis. Constructing structured probabilistic models of the performance indicators taking into consideration the surrounding indicator environment enables to estimate the trustworthiness of the indicator values. It also assists to fill gaps in data to improve the quality of the performance analysis and management decisions. In this paper authors discuss the application of probabilistic graphical models in the road performance measurement and propose a high-level conceptual model that enables analyzing and predicting more precisely future pavement deterioration based on road utilization.

EZW Coding System with Artificial Neural Networks

Image compression plays a vital role in today-s communication. The limitation in allocated bandwidth leads to slower communication. To exchange the rate of transmission in the limited bandwidth the Image data must be compressed before transmission. Basically there are two types of compressions, 1) LOSSY compression and 2) LOSSLESS compression. Lossy compression though gives more compression compared to lossless compression; the accuracy in retrievation is less in case of lossy compression as compared to lossless compression. JPEG, JPEG2000 image compression system follows huffman coding for image compression. JPEG 2000 coding system use wavelet transform, which decompose the image into different levels, where the coefficient in each sub band are uncorrelated from coefficient of other sub bands. Embedded Zero tree wavelet (EZW) coding exploits the multi-resolution properties of the wavelet transform to give a computationally simple algorithm with better performance compared to existing wavelet transforms. For further improvement of compression applications other coding methods were recently been suggested. An ANN base approach is one such method. Artificial Neural Network has been applied to many problems in image processing and has demonstrated their superiority over classical methods when dealing with noisy or incomplete data for image compression applications. The performance analysis of different images is proposed with an analysis of EZW coding system with Error Backpropagation algorithm. The implementation and analysis shows approximately 30% more accuracy in retrieved image compare to the existing EZW coding system.

Internal Force State Recognition of Jiujiang Bridge Based on Cable Force-displacement Relationship

The nearly 21-year-old Jiujiang Bridge, which is suffering from uneven line shape, constant great downwarping of the main beam and cracking of the box girder, needs reinforcement and cable adjustment. It has undergone cable adjustment for twice with incomplete data. Therefore, the initial internal force state of the Jiujiang Bridge is identified as the key for the cable adjustment project. Based on parameter identification by means of static force test data, this paper suggests determining the initial internal force state of the cable-stayed bridge according to the cable force-displacement relationship parameter identification method. That is, upon measuring the displacement and the change in cable forces for twice, one can identify the parameters concerned by means of optimization. This method is applied to the cable adjustment, replacement and reinforcement project for the Jiujiang Bridge as a guidance for the cable adjustment and reinforcement project of the bridge.

Heterogeneous Attribute Reduction in Noisy System based on a Generalized Neighborhood Rough Sets Model

Neighborhood Rough Sets (NRS) has been proven to be an efficient tool for heterogeneous attribute reduction. However, most of researches are focused on dealing with complete and noiseless data. Factually, most of the information systems are noisy, namely, filled with incomplete data and inconsistent data. In this paper, we introduce a generalized neighborhood rough sets model, called VPTNRS, to deal with the problem of heterogeneous attribute reduction in noisy system. We generalize classical NRS model with tolerance neighborhood relation and the probabilistic theory. Furthermore, we use the neighborhood dependency to evaluate the significance of a subset of heterogeneous attributes and construct a forward greedy algorithm for attribute reduction based on it. Experimental results show that the model is efficient to deal with noisy data.

Reliability Analysis of Press Unit using Vague Set

In conventional reliability assessment, the reliability data of system components are treated as crisp values. The collected data have some uncertainties due to errors by human beings/machines or any other sources. These uncertainty factors will limit the understanding of system component failure due to the reason of incomplete data. In these situations, we need to generalize classical methods to fuzzy environment for studying and analyzing the systems of interest. Fuzzy set theory has been proposed to handle such vagueness by generalizing the notion of membership in a set. Essentially, in a Fuzzy Set (FS) each element is associated with a point-value selected from the unit interval [0, 1], which is termed as the grade of membership in the set. A Vague Set (VS), as well as an Intuitionistic Fuzzy Set (IFS), is a further generalization of an FS. Instead of using point-based membership as in FS, interval-based membership is used in VS. The interval-based membership in VS is more expressive in capturing vagueness of data. In the present paper, vague set theory coupled with conventional Lambda-Tau method is presented for reliability analysis of repairable systems. The methodology uses Petri nets (PN) to model the system instead of fault tree because it allows efficient simultaneous generation of minimal cuts and path sets. The presented method is illustrated with the press unit of the paper mill.