Data Mining Classification Methods Applied in Drug Design

Data mining comprises a group of statistical methods used to analyze a set of information, or data set. It relies on models and algorithms, which are powerful tools with great potential. Because they help reveal patterns in a body of information, data mining tools have a wide range of applications. For example, in theoretical chemistry, data mining tools can be used to predict molecular properties or to improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of this contribution is to create a classification model that can handle a huge data set with high accuracy. For this purpose, logistic regression, Bayesian logistic regression, and random forest models were built using R software. A Bayesian logistic regression model was also created in Latent GOLD software. These classification methods are supervised learning methods. It was necessary to reduce the dimension of the data matrix before constructing the models, and thus factor analysis (FA) was used. The models were applied to predict the biological activity of molecules that are potential new drug candidates.
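As an illustration of the simplest of the classifiers named above, the following is a minimal sketch of logistic regression fitted by batch gradient descent. It is not the authors' R or Latent GOLD workflow: the toy data, the two "descriptor scores" per molecule, and the function names are all hypothetical, and the factor-analysis step is assumed to have already produced the input scores.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def predict(w, x):
    """Return 1 (active) if the predicted probability is at least 0.5, else 0."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0

# Hypothetical toy data: two factor scores per molecule, label 1 = active
X = [[0.2, 0.1], [0.4, 0.3], [0.3, 0.2], [1.5, 1.2], [1.7, 1.6], [1.4, 1.8]]
y = [0, 0, 0, 1, 1, 1]
w = fit_logistic(X, y)
predictions = [predict(w, x) for x in X]
```

On real data the accuracy would be estimated on held-out molecules rather than on the training set used here.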

Multi-Font Farsi/Arabic Isolated Character Recognition Using Chain Codes

OCR systems now have many applications and are increasingly used in daily life. Much research has been done on the recognition of Latin, Japanese, and Chinese characters. However, very little investigation has been performed on Farsi/Arabic character recognition. The likely reasons are the difficulty and complexity of identifying those characters compared with the others, and the limited IT activity in Farsi- and Arabic-speaking countries. In this paper, a technique is employed to identify isolated Farsi/Arabic characters. A chain-code-based algorithm is used together with other significant features, such as the number and location of dots and auxiliary parts and the number of holes in the isolated character. Experimental results show relatively high accuracy when the method is tested on several standard Farsi fonts.
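To make the chain-code idea concrete, here is a minimal sketch of an 8-direction Freeman chain code computed from an already-traced closed contour. The abstract does not state which chain-code variant or features the authors use, so the direction numbering, the function names, and the histogram feature are assumptions for illustration only.

```python
# 8-direction Freeman codes: 0 = east, numbered counter-clockwise
# (in this sketch x grows to the right and y grows upward)
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(contour):
    """Freeman chain code of a closed contour given as ordered (x, y) pixels."""
    codes = []
    # Pair each point with its successor, wrapping around to close the contour
    for (x0, y0), (x1, y1) in zip(contour, contour[1:] + contour[:1]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError("consecutive contour points must be 8-neighbours")
        codes.append(DIRECTIONS[step])
    return codes

def direction_histogram(codes):
    """Relative frequency of each direction, a simple size-independent feature."""
    total = len(codes)
    return [codes.count(d) / total for d in range(8)]

# Hypothetical example: boundary of a small square traced counter-clockwise
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
codes = chain_code(square)
histogram = direction_histogram(codes)
```

In a full recognizer, such a feature vector would be combined with the dot, auxiliary-part, and hole counts mentioned in the abstract.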

Generalized Differential Quadrature Nonlinear Consolidation Analysis of Clay Layer with Time-Varied Drainage Conditions

In this article, nonlinear consolidation in a saturated, homogeneous clay layer is studied. Using a time-varied drainage model, the excess pore water pressure over the layer depth is calculated. The Generalized Differential Quadrature (GDQ) method is used for the modeling and numerical analysis. First, the domain of the independent variables (i.e., time and clay layer depth) is discretized by the Chebyshev-Gauss-Lobatto series; then the nonlinear system of equations obtained from the GDQ method is solved by the Newton-Raphson approach. The results indicate that the Generalized Differential Quadrature method, in addition to being simple to apply, achieves very high accuracy in the calculation of excess pore water pressure.
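The discretization step can be sketched as follows: Chebyshev-Gauss-Lobatto grid points on an interval, and the standard GDQ weighting coefficients for the first derivative built from Lagrange-polynomial products. This is a generic illustration under stated assumptions, not the authors' consolidation model; the unit interval, the grid size, and the function names are hypothetical.

```python
import math

def cgl_points(n, a=0.0, b=1.0):
    """n Chebyshev-Gauss-Lobatto points mapped onto [a, b], in increasing order."""
    return [a + (b - a) * 0.5 * (1.0 - math.cos(math.pi * i / (n - 1)))
            for i in range(n)]

def gdq_first_derivative_weights(x):
    """Matrix A such that sum_j A[i][j] * f(x_j) approximates f'(x_i)."""
    n = len(x)
    # M[i] = product over k != i of (x_i - x_k)
    M = [math.prod(x[i] - x[k] for k in range(n) if k != i) for i in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                A[i][j] = M[i] / ((x[i] - x[j]) * M[j])
        # Each row sums to zero, since the derivative of a constant is zero
        A[i][i] = -sum(A[i][j] for j in range(n) if j != i)
    return A

# Check on a polynomial: with 7 points, d/dx of x**3 is reproduced to rounding error
x = cgl_points(7)
A = gdq_first_derivative_weights(x)
f = [xi ** 3 for xi in x]
df = [sum(A[i][j] * f[j] for j in range(len(x))) for i in range(len(x))]
```

Applying such weights to the governing equation turns it into an algebraic system in the nodal pressures, which is the kind of nonlinear system the abstract solves by Newton-Raphson iteration.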

Complex Condition Monitoring System of Aircraft Gas Turbine Engine

Research shows that applying probabilistic-statistical methods is unfounded, especially at the early stage of diagnosing the technical condition of an aviation gas turbine engine (GTE), when the flight information is fuzzy, limited, and uncertain. Hence, the efficiency of applying Soft Computing technology at these diagnostic stages, using Fuzzy Logic and Neural Network methods, is considered. To this end, fuzzy multiple linear and non-linear models (fuzzy regression equations), obtained from statistical fuzzy data, are trained with high accuracy. To build a more adequate model of the GTE technical condition, the dynamics of changes in the skewness and kurtosis coefficients are analysed. Changes in the values of the skewness and kurtosis coefficients are studied for the distributions of GTE work and output parameters of the multiple linear and non-linear generalised models in the presence of measurement noise (using the new recursive Least Squares Method (LSM)). The developed GTE condition monitoring system provides stage-by-stage estimation of the engine's technical condition. As an application of the technique, the technical condition of a new operating aviation engine was estimated.
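The skewness and kurtosis coefficients referred to above can be tracked over time with a sliding window, as in the minimal sketch below. The abstract does not say which estimators the authors use, so the population-moment definitions, the excess-kurtosis convention, the window mechanism, and the function names are assumptions.

```python
def skewness_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("coefficients are undefined for a constant sample")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # zero for a normal distribution
    return skew, excess_kurtosis

def rolling_coefficients(series, window):
    """Coefficients over each consecutive window, to follow their change in time."""
    return [skewness_kurtosis(series[i:i + window])
            for i in range(len(series) - window + 1)]

# Hypothetical symmetric sample of a measured engine parameter
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
skew, kurt = skewness_kurtosis(sample)
```

Values of both coefficients near zero are consistent with an approximately normal distribution, while a sustained drift away from zero suggests that the distribution of the parameter is changing.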