Words Reordering based on Statistical Language Model

There are multiple reasons to expect that detecting word order errors in a text will be a difficult problem, and detection rates reported in the literature are in fact low. Although grammatical rules constructed by computational linguists improve the performance of grammar checkers in word order diagnosis, the repair task is still very difficult. This paper presents an approach for repairing word order errors in English text by reordering the words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of this method concerns the use of an efficient confusion matrix technique for reordering the words. The comparative advantage of this method is that it works with a large set of words and avoids the laborious and costly process of collecting word order errors for creating error patterns.
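
The selection criterion described above (keep the reordering with the most trigram hits) can be illustrated with a minimal sketch. It does not reproduce the paper's confusion matrix technique: it simply enumerates every permutation, which is only feasible for short sentences. The toy "language model" is an assumed set of trigrams taken from a single hypothetical corpus sentence, and all function names are illustrative.

```python
from itertools import permutations

def trigrams(words):
    """Return the consecutive word triples of a sentence."""
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]

def count_hits(words, known_trigrams):
    """Count how many trigrams of a candidate ordering occur in the model."""
    return sum(1 for t in trigrams(words) if t in known_trigrams)

def best_reordering(words, known_trigrams):
    """Brute force: try every permutation, keep the one with the most hits."""
    return max(permutations(words), key=lambda p: count_hits(p, known_trigrams))

# Hypothetical toy model: the set of trigrams observed in a tiny corpus.
corpus = ["the cat sat on the mat".split()]
model = {t for sentence in corpus for t in trigrams(sentence)}

scrambled = "cat the sat on mat the".split()
repaired = list(best_reordering(scrambled, model))
print(" ".join(repaired))
```

The exhaustive search grows factorially with sentence length, which is presumably why a more efficient reordering technique is needed in practice.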

Modern Vibration Signal Processing Techniques for Vehicle Gearbox Fault Diagnosis

This paper presents modern vibration signal processing techniques for vehicle gearbox fault diagnosis, using wavelet analysis and the Squared Envelope (SE) technique. Wavelet analysis is regarded as a powerful tool for the detection of sudden changes in non-stationary signals. The Squared Envelope (SE) technique has been extensively used for rolling bearing diagnostics. In the present work, a scheme using the Squared Envelope technique is proposed for early detection of gear tooth pitting. The pitting defect is manufactured on the tooth side of a fifth speed gear on the intermediate shaft of a vehicle gearbox. The objective is to supplement the current techniques of gearbox fault diagnosis based on the raw vibration and ordered signals. The test stand is equipped with three dynamometers; the input dynamometer serves as the internal combustion engine, and the output dynamometers introduce the load on the flanges of the output joint shafts. The gearbox used for experimental measurements is the type most commonly used in modern small to mid-sized passenger cars with a transversely mounted powertrain and front wheel drive: a five-speed gearbox with final drive gear and front wheel differential. The results show that the proposed methods are effective for detecting and diagnosing localized gear faults at an early stage under different operating conditions, and are more sensitive and robust than current gear diagnostic techniques.

An Intelligent Combined Method Based on Power Spectral Density, Decision Trees and Fuzzy Logic for Hydraulic Pumps Fault Diagnosis

Recently, machine condition monitoring and fault diagnosis as part of a maintenance system has become a global issue due to the potential advantages to be gained from reduced maintenance costs, improved productivity and increased machine availability. The aim of this work is to investigate the effectiveness of a new fault diagnosis method based on the power spectral density (PSD) of vibration signals in combination with decision trees and a fuzzy inference system (FIS). To this end, a series of studies was conducted on an external gear hydraulic pump. After a test under normal conditions, a number of different machine defect conditions were introduced for three working levels of pump speed (1000, 1500, and 2000 rpm), corresponding to (i) journal bearing with inner face wear (BIFW), (ii) gear with tooth face wear (GTFW), and (iii) journal bearing with inner face wear plus gear with tooth face wear (B&GW). Features of the PSD values of the vibration signals were extracted using descriptive statistical parameters. The J48 algorithm was used as a feature selection procedure to select pertinent features from the data set. The output of the J48 algorithm was employed to produce the crisp if-then rules and membership function sets. The structure of the FIS classifier was then defined based on the crisp sets. The proposed PSD-J48-FIS model was evaluated using the data sets obtained from vibration signals of the pump. Results showed that the total classification accuracies for the 1000, 1500, and 2000 rpm conditions were 96.42%, 100%, and 96.42%, respectively. The results indicate that the combined PSD-J48-FIS model has potential for fault diagnosis of hydraulic pumps.

Statistical Models of Network Traffic

Model-based approaches have been applied successfully to a wide range of tasks such as specification, simulation, testing, and diagnosis. But one bottleneck often prevents the introduction of these ideas: manual modeling is a non-trivial, time-consuming task. Automatically deriving models by observing and analyzing running systems is one possible way to relieve this bottleneck. To derive a model automatically, some a priori knowledge about the model structure, i.e. about the system, must exist. Such a model formalism would be used as follows: (i) by observing the network traffic, a model of the long-term system behavior could be generated automatically; (ii) test vectors could be generated from the model; (iii) while the system is running, the model could be used to diagnose non-normal system behavior. The main contribution of this paper is the introduction of a model formalism called 'probabilistic regression automaton' suitable for the tasks mentioned above.

Eclectic Rule-Extraction from Support Vector Machines

Support vector machines (SVMs) have shown superior performance compared to other machine learning techniques, especially in classification problems. Yet one limitation of SVMs is the lack of an explanation capability, which is crucial in some applications, e.g. in the medical and security domains. In this paper, a novel approach for eclectic rule-extraction from support vector machines is presented. This approach utilizes the knowledge acquired by the SVM and represented in its support vectors, as well as the parameters associated with them. The approach includes three stages: training, propositional rule-extraction and rule quality evaluation. Results from four different experiments have demonstrated the value of the approach for extracting comprehensible rules of high accuracy and fidelity.

Finding Approximate Tandem Repeats with the Burrows-Wheeler Transform

Approximate tandem repeats in a genomic sequence are two or more contiguous, similar copies of a pattern of nucleotides. They are used in DNA mapping, studying molecular evolution mechanisms, forensic analysis and research into the diagnosis of inherited diseases. Their functions are still under investigation and not yet well defined, but growing biological databases, together with tools for identifying these repeats, may lead to the discovery of their specific role or correlation with particular features. This paper presents a new approach for finding approximate tandem repeats in a given sequence, where the similarity between consecutive repeats is measured using the Hamming distance. It is an enhancement of a method for finding exact tandem repeats in DNA sequences based on the Burrows-Wheeler transform.
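
To make the similarity criterion concrete, the sketch below counts how many consecutive copies of a fixed-length block follow one another when each copy may differ from the previous one by a bounded Hamming distance. It is a naive direct scan, not the Burrows-Wheeler-based method of the paper; the function names, the fixed `period` argument and the example sequence are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def count_copies(seq, start, period, max_mismatches):
    """Count consecutive blocks of length `period`, beginning at `start`,
    where each block is within `max_mismatches` of the block before it."""
    copies = 1
    pos = start
    while pos + 2 * period <= len(seq):
        current = seq[pos:pos + period]
        following = seq[pos + period:pos + 2 * period]
        if hamming(current, following) > max_mismatches:
            break
        copies += 1
        pos += period
    return copies

# Hypothetical sequence: ACGT, ACGA, ACGT followed by an incomplete block.
sequence = "ACGTACGAACGTTT"
print(count_copies(sequence, 0, 4, 1))  # approximate repeat, one mismatch allowed
print(count_copies(sequence, 0, 4, 0))  # exact repeats only
```

A run counts as an approximate tandem repeat when the returned value is at least two.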

The First Prevalence Report of Direct Identification and Differentiation of B. abortus and B. melitensis using Real Time PCR in House Mouse of Iran

Brucellosis is a zoonotic disease; its symptoms and appearance are not specific in humans, and its traditional diagnosis is based on culture, serological methods and conventional PCR. For more sensitive and specific detection and differentiation of Brucella spp., the real time PCR method is recommended. This research was performed to determine the presence and prevalence of Brucella spp. and to differentiate Brucella abortus and Brucella melitensis in the house mouse (Mus musculus) in the west of Iran. A TaqMan analysis and single-step PCR were carried out on a total of 326 DNA samples from mouse spleens. Of the 326 samples, 128 (39.27%) gave positive results for Brucella spp. by conventional PCR, and 65 and 32 of the 128 specimens were positive for B. melitensis and B. abortus, respectively. These results indicate a high presence of this pathogen in this area and that real time PCR is considerably faster than current standard methods for identification and differentiation of Brucella species. To our knowledge, this study is the first prevalence report of direct identification and differentiation of B. abortus and B. melitensis by real time PCR in mouse tissue samples in Iran.

Genetic Polymorphism of the Acute Lymphoblastic Leukaemia and its Relation with Hyperhomocysteinemia for a Group of Children in the East of Algeria

Much recent research has addressed the relation between elevated homocysteinemia and certain kinds of cancer. Our study therefore investigated a possible relation between an increased plasma concentration of this amino acid and the occurrence of Acute Lymphoblastic Leukaemia in a group of Algerian children of Berber origin in the East of Algeria. The study was carried out on 47 patients with an average age of 9 ± 6 years, in whom the disease was diagnosed by blood and marrow examination in the blood diseases hospital of the CHU of Batna, and on 194 healthy controls of the same age. In both groups, the concentrations of homocysteine, vitamin B9 and vitamin B12 were measured, and specific polymorphisms of enzymes essential to the metabolism of this amino acid were studied by real time PCR (LightCycler) for the following enzymes: MS (C2756G), MSR (A66G), MTHFR1 (C677T) and MTHFR2 (A1298C).
The results revealed that the homozygous mutated genotype is the least frequent in both groups, and that for each enzyme there is at least one genotype whose percentage in the patient group markedly exceeds that of the same genotype in the healthy group, notably the mutated genotype GG of methionine synthase and the TT form of methylenetetrahydrofolate reductase. We observed a considerable number of genotypes in the patient group associated with a characteristic increase of this amino acid. This is attributed to the reduced biological activity of these enzymes, which become inefficient at converting homocysteine into methionine; as a consequence, the proportion of methyl groups in the DNA of the studied genes decreases, which leads to increased transcriptional activity and capacity. This is therefore probably one of the factors of the disease, especially since the vitamin levels are normal and similar in the two groups, which rules out the hypothesis of a vitamin deficiency. We also note that the heterozygous genotype is less frequent in the patient group, except for MTHFR2, and that the wild-type genotype is more frequent in the control group, except for MSR. Although these results are partial, they open a new avenue in the genetic diagnosis of this malignant disease, allowing an early diagnosis and the use of an effective and appropriate treatment at the same time.

Processing the Medical Sensors Signals Using Fuzzy Inference System

Sensors measure a variety of physical properties. Whether they are devices that convert a sensed signal into an electrical signal, chemical sensors or biosensors, all of these sensors can be considered an interface between physical and electrical equipment. The problem is the analysis of the multitude of recorded parameters used as input variables, which do not all have the same level of influence on the outputs. Identifying the most sensitive parameters, those that can guide users in gathering field information and in model calibration, requires a sensitivity analysis of the effect of each change made, and the mathematical models used for processing become very complex. In this paper a fuzzy rule-based system is proposed as a solution to this problem. The system collects the available signal information from the sensors. Moreover, the system allows the study of the influence of the various factors that take part in the decision system. Since its inception, fuzzy set theory has been regarded as a formalism suitable for dealing with the imprecision intrinsic to many problems. At the same time, fuzzy sets allow the use of symbolic models. In this study the approach is applied to an example involving a variety of physiological parameters that define the state of human health. The system was built as an aid to medical diagnosis. The inputs are signals expressing cardiovascular system parameters, blood pressure and respiratory system parameters. Once the system is built, it is able to predict the state of the patient for any input values.

A Trainable Neural Network Ensemble for ECG Beat Classification

This paper illustrates the use of a combined neural network model for classification of electrocardiogram (ECG) beats. We present a trainable neural network ensemble approach to develop a customized electrocardiogram beat classifier in an effort to further improve the performance of ECG processing and to offer individualized health care. We propose a three-stage technique for distinguishing premature ventricular contraction (PVC) from normal beats and other heart diseases. The method comprises denoising, feature extraction and classification. First, we investigate the application of the stationary wavelet transform (SWT) for noise reduction of the ECG signals. Then a feature extraction module extracts 10 ECG morphological features and one timing interval feature. Next, a number of multilayer perceptron (MLP) neural networks with different topologies are designed. The performance of the different combination methods, as well as the efficiency of the whole system, is presented. Among them, Stacked Generalization, the proposed trainable combined neural network model, achieves the highest recognition rate of around 95%. This network therefore proves to be a suitable candidate for ECG signal diagnosis systems. ECG samples attributed to the different ECG beat types were extracted from the MIT-BIH arrhythmia database for the study.

Emotion Classification by Incremental Association Language Features

Major Depressive Disorder is a burden on medical expenses in Taiwan, as it is around the world. Major Depressive Disorder can be divided into different categories according to previous human activities. Using machine learning, emotion can be classified in advance from well-formed text. This can help medical diagnosis recognize variations in Major Depressive Disorder automatically. Incremental association language features capture the characteristics and relationships used to discover words in a sentence. Classification, however, suffers from an overlapping-category problem. In this paper, we aim to improve classification performance on the principle that categories do not overlap. We present an approach, called Association Language Features by its Category (ALFC), that discovers words in a sentence that occur together with high frequency and do not overlap across categories. Experimental results show that ALFC distinguishes Major Depressive Disorder categories well and achieves better performance. We also compare the approach with a baseline and with mutual information, which use single words alone or a correlation measure.

Brain MRI Segmentation and Lesions Detection by EM Algorithm

In Multiple Sclerosis, pathological changes in the brain result in deviations in signal intensity on Magnetic Resonance Images (MRI). Quantitative analysis of these changes and their correlation with clinical findings provides important information for diagnosis. This constitutes the objective of our work. A new approach is developed. After enhancement of the image contrast and brain extraction by a mathematical morphology algorithm, we proceed to the brain segmentation. Our approach is based on building a statistical model of normal brain MRI from the data itself, including clustering by tissue type. We then detect signal abnormalities (MS lesions) as a rejection class containing voxels that are not explained by the built model. We validate the method on MR images of Multiple Sclerosis patients by comparing its results with those of human expert segmentation.

On The Analysis of a Compound Neural Network for Detecting Atrio Ventricular Heart Block (AVB) in an ECG Signal

Heart failure is the most common cause of death nowadays, but if medical help is given immediately, the patient's life may be saved in many cases. Numerous heart diseases can be detected by analyzing electrocardiograms (ECG). Artificial Neural Networks (ANN) are computer-based expert systems that have proved to be useful in pattern recognition tasks. ANN can be used in different phases of the decision-making process, from classification to diagnostic procedures. This work consists of a review followed by a novel method. The purpose of the review is to assess the evidence of healthcare benefits involving the application of artificial neural networks to the clinical functions of diagnosis, prognosis and survival analysis in ECG signals. The developed method is based on a compound neural network (CNN) that classifies ECGs as normal or carrying an AtrioVentricular heart Block (AVB). This method uses three different feed forward multilayer neural networks. A single output unit encodes the probability of AVB occurrence. A value between 0 and 0.1 is the desired output for a normal ECG; a value between 0.1 and 1 indicates the occurrence of an AVB. The results show that this compound network performs well in detecting AVBs, with a sensitivity of 90.7% and a specificity of 86.05%. The accuracy is 87.9%.
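
The decision rule and the three reported metrics can be sketched as follows. The network itself is not modelled: the output values and reference labels below are hypothetical, and treating an output of exactly 0.1 as normal is an assumption, since the abstract does not say which side the boundary belongs to.

```python
def classify(output, threshold=0.1):
    """Map the single output unit to a class (boundary handling is assumed)."""
    return "AVB" if output > threshold else "normal"

def evaluate(outputs, labels, threshold=0.1):
    """Return (sensitivity, specificity, accuracy) for the thresholded outputs."""
    tp = fn = tn = fp = 0
    for output, label in zip(outputs, labels):
        predicted = classify(output, threshold)
        if label == "AVB":
            if predicted == "AVB":
                tp += 1
            else:
                fn += 1
        else:
            if predicted == "normal":
                tn += 1
            else:
                fp += 1
    sensitivity = tp / (tp + fn)   # fraction of AVB cases detected
    specificity = tn / (tn + fp)   # fraction of normal cases recognised
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical network outputs and reference labels, for illustration only.
outputs = [0.05, 0.8, 0.3, 0.02, 0.09, 0.6, 0.07, 0.2]
labels = ["normal", "AVB", "AVB", "normal", "AVB", "AVB", "normal", "normal"]
print(evaluate(outputs, labels))
```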

Rigorous Electromagnetic Model of Fourier Transform Infrared (FT-IR) Spectroscopic Imaging Applied to Automated Histology of Prostate Tissue Specimens

Fourier transform infrared (FT-IR) spectroscopic imaging is an emerging technique that provides both chemically and spatially resolved information. The rich chemical content of data may be utilized for computer-aided determinations of structure and pathologic state (cancer diagnosis) in histological tissue sections for prostate cancer. FT-IR spectroscopic imaging of prostate tissue has shown that tissue type (histological) classification can be performed to a high degree of accuracy [1] and cancer diagnosis can be performed with an accuracy of about 80% [2] on a microscopic (≈ 6μm) length scale. In performing these analyses, it has been observed that there is large variability (more than 60%) between spectra from different points on tissue that is expected to consist of the same essential chemical constituents. Spectra at the edges of tissues are characteristically and consistently different from chemically similar tissue in the middle of the same sample. Here, we explain these differences using a rigorous electromagnetic model for light-sample interaction. Spectra from FT-IR spectroscopic imaging of chemically heterogeneous samples are different from bulk spectra of individual chemical constituents of the sample. This is because spectra not only depend on chemistry, but also on the shape of the sample. Using coupled wave analysis, we characterize and quantify the nature of spectral distortions at the edges of tissues. Furthermore, we present a method of performing histological classification of tissue samples. Since the mid-infrared spectrum is typically assumed to be a quantitative measure of chemical composition, classification results can vary widely due to spectral distortions. However, we demonstrate that the selection of localized metrics based on chemical information can make our data robust to the spectral distortions caused by scattering at the tissue boundary.

An Evaluation of Sputum Smear Conversion and Haematological Parameter Alteration in Early Detection Period of New Pulmonary Tuberculosis (PTB) Patients

Sputum smear conversion after one month of antituberculosis therapy in new smear positive pulmonary tuberculosis patients (PTB+) is a vital indicator of treatment success. The objective of this study is to determine the rate of sputum smear conversion in new PTB+ patients after one month of treatment at the National Institute of Diseases of the Chest and Hospital (NIDCH). Sputum smear conversion was analysed by clinical re-examination with a sputum smear microscopy test after one month. Socio-demographic and haematological parameters were evaluated to assess their correlation with disease status. Among all enrolled patients, only 33.33% were available for follow-up diagnosis, and of them only 42.86% converted to smear negative. This outcome is probably due to non-adherence to proper disease management. Low haemoglobin and packed cell volume levels were reported in 66.67% and 78.78% of patients, respectively, whereas 80% and 93.33% of patients showed elevated platelet count and erythrocyte sedimentation rate, respectively.

Analysis of a Mathematical Model for Dengue Disease in Pregnant Cases

Dengue fever is an important human arboviral disease. Outbreaks are now reported quite often from many parts of the world. The number of cases involving pregnant women and infants is increasing every year. The illness is often severe and complications may occur. Deaths often occur because of difficulties in early diagnosis and improper management of the disease. Dengue antibodies from pregnant women are passed on to infants, and this protects the infants from dengue infections. Antibodies from the mother are transferred to the fetus while it is still in the womb. In this study, we formulate a mathematical model to describe the transmission of this disease in pregnant women. The model is formulated by dividing the human population into pregnant women and non-pregnant humans (men and non-pregnant women). Each class is subdivided into susceptible (S), infectious (I) and recovered (R) subclasses. We apply standard dynamical analysis to our model. Conditions for the local stability of the equilibrium points are given. Numerical simulations are shown. The bifurcation diagrams of our model are discussed. The control of this disease in pregnant women is discussed in terms of the threshold conditions.
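
As background for the S, I, R subdivision, the sketch below integrates the basic single-population SIR compartment model with forward Euler steps. It is not the authors' model, which splits the human population into pregnant and non-pregnant classes; the parameter values, step size and function name are assumptions chosen for illustration. In this simple model the infection spreads initially when beta * S(0) / gamma exceeds 1, which is one example of a threshold condition.

```python
def simulate_sir(s, i, r, beta, gamma, dt, steps):
    """Forward-Euler integration of dS/dt = -beta*S*I,
    dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I (population fractions)."""
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((s, i, r))
    return history

# Hypothetical parameters: transmission rate 0.5, recovery rate 0.1 per unit time.
history = simulate_sir(0.99, 0.01, 0.0, beta=0.5, gamma=0.1, dt=0.1, steps=1000)
peak_infected = max(i for _, i, _ in history)
print(round(peak_infected, 3), [round(x, 3) for x in history[-1]])
```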

A Method for Quality Inspection of Motors by Detecting Abnormal Sound

Currently, the quality of motors is inspected by the human ear. In this paper, I propose two systems that use a speech recognition method to automate the inspection. The first system is based on linear processing, using K-means and the Nearest Neighbor method, and the second is based on non-linear processing, using neural networks. Applying these systems to motor sounds, I successfully recognized 86.67% of the motor sounds with the linear processing system and 97.78% with the non-linear processing system.

An Intelligent Fuzzy-Neural Diagnostic System for Osteoporosis Risk Assessment

In this article, we propose an Intelligent Medical Diagnostic System (IMDS), accessible through a common web-based interface, to perform initial online screening for osteoporosis. The fundamental approaches underlying the proposed system are mainly based on fuzzy-neural theory, which can exhibit superiority over other conventional technologies in many fields. In the diagnosis process, users simply answer a series of directed questions posed by the system, and they then immediately receive a list of results representing their degree of osteoporosis risk. Clinical testing results show that the proposed system can provide the general public, or even health care providers, with a convenient, reliable, inexpensive approach to osteoporosis risk assessment.

The Role of Velocity Map Quality in Estimation of Intravascular Pressure Distribution

Phase-Contrast MR imaging methods are widely used for measurement of blood flow velocity components. Other tools, such as CT and ultrasound, are also available for velocity map detection in intravascular studies. These data are used in deriving flow characteristics. Some clinical applications have been investigated that use the pressure distribution in the diagnosis of intravascular disorders such as vascular stenosis. In this paper, an approach is proposed to the problem of measuring the intravascular pressure field using the velocity field obtained from flow images. The method uses an algorithm to solve the nonlinear Navier-Stokes equations, treating blood as an incompressible Newtonian fluid. Flow images usually suffer from a lack of spatial resolution. Our aim is to examine the effect of spatial resolution on the pressure distribution estimated by this method. To this end, the velocity map of a numerical phantom is derived at six different spatial resolutions. To determine the effects of vascular stenoses on the pressure distribution, a stenotic phantom geometry is considered. A comparison between the pressure distribution obtained from the phantom and the pressure resulting from the algorithm is presented. In this regard, we also compare the effects of collocated and staggered computational grids on the pressure distribution resulting from this algorithm.

Evaluation of the Impact of Dataset Characteristics for Classification Problems in Biological Applications

The availability of high-dimensional biological datasets, such as those from gene expression, proteomic, and metabolic experiments, can be leveraged for the diagnosis and prognosis of diseases. Many classification methods in this area have been studied to predict disease states and to separate predefined classes, such as patients with a particular disease versus healthy controls. However, most of the existing research focuses only on a specific dataset. There is a lack of generic comparison between classifiers, which might provide a guideline for biologists or bioinformaticians to select the proper algorithm for new datasets. In this study, we compare the performance of popular classifiers, namely Support Vector Machine (SVM), Logistic Regression, k-Nearest Neighbor (k-NN), Naive Bayes, Decision Tree, and Random Forest, on mock datasets. We mimic common biological scenarios by simulating various proportions of real discriminating biomarkers and different effect sizes thereof. The results show that SVM performs quite stably and reaches a higher AUC than the other methods. This may be explained by the ability of SVM to minimize the probability of error. Moreover, Decision Tree, with its good applicability for diagnosis and prognosis, shows good performance in our experimental setup. Logistic Regression and Random Forest, however, strongly depend on the ratio of discriminators and perform better with a higher number of discriminators.
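
Since the comparison is reported in terms of AUC, a minimal sketch of that metric may be useful. It computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The scores and labels are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives (label 1)
    and negatives (label 0); ties contribute one half."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores for two patients (1) and two controls (0).
print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # perfectly separated classes
print(auc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0]))  # one positive ranked below a negative
```

The pairwise loop is quadratic in the number of samples; for large datasets a sort-based rank computation would be the usual choice.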