Abstract: The use of advanced methods has been increasing day by day in the maritime sector, which is one of the sectors least affected by the COVID-19 pandemic. It is aimed to minimize accidents, especially by using advanced methods in the investigation of marine accidents. This research aimed to conduct an exploratory statistical analysis of particular ship accidents in the Transport Safety Investigation Center of Turkey database. 46 ship accidents, which occurred between 2010-2018, have been selected from the database. In addition to the availability of a reliable and comprehensive database, taking advantage of the robust statistical models for investigation is critical to improving the safety of ships. Thus, descriptive analysis has been used in the research to identify causes and conditional factors related to different types of ship accidents. The research outcomes underline the fact that environmental factors and day and night ratio have great influence on ship safety.
Abstract: The role and relative importance of intrinsic and extrinsic factors in the development of complex diseases such as cancer still remains a controversial issue. Determining the amount of variation explained by these factors needs experimental data and statistical models. These models are nevertheless based on the occurrence and accumulation of random mutational events during stem cell division, thus rendering cancer development a stochastic outcome. We demonstrate that not only individual genome sequencing is uninformative in determining cancer risk, but also assigning a unique genome sequence to any given individual (healthy or affected) is not meaningful. Current whole-genome sequencing approaches are therefore unlikely to realize the promise of personalized medicine. In conclusion, since genome sequence differs from cell to cell and changes over time, it seems that determining the risk factor of complex diseases based on genome sequence is somewhat unrealistic, and therefore, the resulting data are likely to be inherently uninformative.
Abstract: Monitoring of tool wear in milling operations is essential for achieving the desired dimensional accuracy and surface finish of a machined workpiece. Although there are numerous statistical models and artificial intelligence techniques available for monitoring the wear of cutting tools, these techniques cannot pin point which cutting edge of the tool, or which insert in the case of indexable tooling, is worn or broken. Currently, the task of monitoring the wear on the tool cutting edges is carried out by the operator who performs a manual inspection, causing undesirable stoppages of machine tools and consequently resulting in costs incurred from lost productivity. The present study is concerned with the development of a flute tracking system to segment signals related to each physical flute of a cutter with three flutes used in an end milling operation. The purpose of the system is to monitor the cutting condition for individual flutes separately in order to determine their progressive wear rates and to predict imminent tool failure. The results of this study clearly show that signals associated with each flute can be effectively segmented using the proposed flute tracking system. Furthermore, the results illustrate that by segmenting the sensor signal by flutes it is possible to investigate the wear in each physical cutting edge of the cutting tool. These findings are significant in that they facilitate the online condition monitoring of a cutting tool for each specific flute without the need for operators/engineers to perform manual inspections of the tool.
Abstract: This research study aims to present a retrospective
study about speech recognition systems and artificial intelligence.
Speech recognition has become one of the widely used technologies,
as it offers great opportunity to interact and communicate with
automated machines. Precisely, it can be affirmed that speech
recognition facilitates its users and helps them to perform their daily
routine tasks, in a more convenient and effective manner. This
research intends to present the illustration of recent technological
advancements, which are associated with artificial intelligence.
Recent researches have revealed the fact that speech recognition is
found to be the utmost issue, which affects the decoding of speech. In
order to overcome these issues, different statistical models were
developed by the researchers. Some of the most prominent statistical
models include acoustic model (AM), language model (LM), lexicon
model, and hidden Markov models (HMM). The research will help in
understanding all of these statistical models of speech recognition.
Researchers have also formulated different decoding methods, which
are being utilized for realistic decoding tasks and constrained
artificial languages. These decoding methods include pattern
recognition, acoustic phonetic, and artificial intelligence. It has been
recognized that artificial intelligence is the most efficient and reliable
methods, which are being used in speech recognition.
Abstract: This paper analyzes the conceptual framework of three
statistical methods, multiple regression, path analysis, and structural
equation models. When establishing research model of the statistical
modeling of complex social phenomenon, it is important to know the
strengths and limitations of three statistical models. This study
explored the character, strength, and limitation of each modeling and
suggested some strategies for accurate explaining or predicting the
causal relationships among variables. Especially, on the studying of
depression or mental health, the common mistakes of research
modeling were discussed.
Abstract: In this paper, a study of slope failures along the Alishan Highway is carried out. An innovative empirical model is developed based on 15-year records of rainfall-induced slope failures. The statistical models are intended for assessing the volume of landslide for slope failure along the Alishan Highway in the future. The rainfall data considered in the proposed models include the effective cumulative rainfall and the critical rainfall intensity. The effective cumulative rainfall is defined at the point when the curve of cumulative rainfall goes from steep to flat. Then, the rainfall thresholds of landslide are established for assessing the volume of landslide and issuing warning and/or closure for the Alishan Highway during a future extreme rainfall. Slope failures during Typhoon Saola in 2012 demonstrate that the new empirical model is effective and applicable to other cases with similar rainfall conditions.
Abstract: High Resolution NMR Spectroscopy offers unique screening capabilities for food quality and safety by combining non-targeted and targeted screening in one analysis.
The objective is to demonstrate, that due to its extreme reproducibility NMR can detect smallest changes in concentrations of many components in a mixture, which is best monitored by statistical evaluation however also delivers reliable quantification results.
The methodology typically uses a 400 MHz high resolution instrument under full automation after minimized sample preparation.
For example one fruit juice analysis in a push button operation takes at maximum 15 minutes and delivers a multitude of results, which are automatically summarized in a PDF report.
The method has been proven on fruit juices, where so far unknown frauds could be detected. In addition conventional targeted parameters are obtained in the same analysis. This technology has the advantage that NMR is completely quantitative and concentration calibration only has to be done once for all compounds. Since NMR is so reproducible, it is also transferable between different instruments (with same field strength) and laboratories. Based on strict SOP`s, statistical models developed once can be used on multiple instruments and strategies for compound identification and quantification are applicable as well across labs.
Abstract: Using a set of confidence intervals, we develop a
common approach, to construct a fuzzy set as an estimator for
unknown parameters in statistical models. We investigate a method
to derive the explicit and unique membership function of such fuzzy
estimators. The proposed method has been used to derive the fuzzy
estimators of the parameters of a Normal distribution and some
functions of parameters of two Normal distributions, as well as the
parameters of the Exponential and Poisson distributions.
Abstract: This paper proposes a novel approach that combines statistical models and support vector machines. A hybrid scheme which appropriately incorporates the advantages of both the generative and discriminant model paradigms is described and evaluated. Support vector machines (SVMs) are trained to divide the whole speakers' space into small subsets of speakers within a hierarchical tree structure. During testing a speech token is assigned to its corresponding group and evaluation using gaussian mixture models (GMMs) is then processed. Experimental results show that the proposed method can significantly improve the performance of text independent speaker identification task. We report improvements of up to 50% reduction in identification error rate compared to the baseline statistical model.
Abstract: Exclusive breastfeeding is the feeding of a baby on no other milk apart from breast milk. Exclusive breastfeeding during the first 6 months of life is very important as it supports optimal growth and development during infancy and reduces the risk of obliterating diseases and problems. Moreover, it helps to reduce the incidence and/or severity of diarrhea, lower respiratory infection and urinary tract infection. In this paper, we make a survey of the factors that influence exclusive breastfeeding and use two dispersed statistical models to analyze data. The models are the Generalized Poisson regression model and the Com-Poisson regression models.
Abstract: Wheat prediction was carried out using different meteorological variables together with agro meteorological indices in Ardebil district for the years 2004-2005 & 2005–2006. On the basis of correlation coefficients, standard error of estimate as well as relative deviation of predicted yield from actual yield using different statistical models, the best subset of agro meteorological indices were selected including daily minimum temperature (Tmin), accumulated difference of maximum & minimum temperatures (TD), growing degree days (GDD), accumulated water vapor pressure deficit (VPD), sunshine hours (SH) & potential evapotranspiration (PET). Yield prediction was done two months in advance before harvesting time which was coincide with commencement of reproductive stage of wheat (5th of June). It revealed that in the final statistical models, 83% of wheat yield variability was accounted for variation in above agro meteorological indices.
Abstract: The present work is concerned with the effect of turning process parameters (cutting speed, feed rate, and depth of cut) and distance from the center of work piece as input variables on the chip micro-hardness as response or output. Three experiments were conducted; they were used to investigate the chip micro-hardness behavior at diameter of work piece for 30[mm], 40[mm], and 50[mm]. Response surface methodology (R.S.M) is used to determine and present the cause and effect of the relationship between true mean response and input control variables influencing the response as a two or three dimensional hyper surface. R.S.M has been used for designing a three factor with five level central composite rotatable factors design in order to construct statistical models capable of accurate prediction of responses. The results obtained showed that the application of R.S.M can predict the effect of machining parameters on chip micro-hardness. The five level factorial designs can be employed easily for developing statistical models to predict chip micro-hardness by controllable machining parameters. Results obtained showed that the combined effect of cutting speed at it?s lower level, feed rate and depth of cut at their higher values, and larger work piece diameter can result increasing chi micro-hardness.
Abstract: The analysis of electromagnetic environment using
deterministic mathematical models is characterized by the
impossibility of analyzing a large number of interacting network
stations with a priori unknown parameters, and this is characteristic,
for example, of mobile wireless communication networks. One of the
tasks of the tools used in designing, planning and optimization of
mobile wireless network is to carry out simulation of electromagnetic
environment based on mathematical modelling methods, including
computer experiment, and to estimate its effect on radio
communication devices. This paper proposes the development of a
statistical model of electromagnetic environment of a mobile
wireless communication network by describing the parameters and
factors affecting it including the propagation channel and their
statistical models.
Abstract: Breastfeeding is an important concept in the maternal life of a woman. In this paper, we focus on exclusive breastfeeding. Exclusive breastfeeding is the feeding of a baby on no other milk apart from breast milk. This type of breastfeeding is very important during the first six months because it supports optimal growth and development during infancy and reduces the risk of obliterating diseases and problems. Moreover, in Mauritius, exclusive breastfeeding has decreased the incidence and/or severity of diarrhea, lower respiratory infection and urinary tract infection. In this paper, we give an overview of exclusive breastfeeding in Mauritius and the factors influencing it. We further analyze the local practices of exclusive breastfeeding using the Generalized Poisson regression model and the negative-binomial model since the data are over-dispersed.
Abstract: The purpose of this study is mainly to predict collision
frequency on the horizontal tangents combined with vertical curves
using artificial neural network methods. The proposed ANN models
are compared with existing regression models. First, the variables
that affect collision frequency were investigated. It was found that
only the annual average daily traffic, section length, access density,
the rate of vertical curvature, smaller curve radius before and after
the tangent were statistically significant according to related
combinations. Second, three statistical models (negative binomial,
zero inflated Poisson and zero inflated negative binomial) were
developed using the significant variables for three alignment
combinations. Third, ANN models are developed by applying the
same variables for each combination. The results clearly show that
the ANN models have the lowest mean square error value than those
of the statistical models. Similarly, the AIC values of the ANN
models are smaller to those of the regression models for all the
combinations. Consequently, the ANN models have better statistical
performances than statistical models for estimating collision
frequency. The ANN models presented in this paper are
recommended for evaluating the safety impacts 3D alignment
elements on horizontal tangents.
Abstract: The main goal in this paper is to quantify the quality of
different techniques for radiation treatment plans, a back-propagation
artificial neural network (ANN) combined with biomedicine theory
was used to model thirteen dosimetric parameters and to calculate
two dosimetric indices. The correlations between dosimetric indices
and quality of life were extracted as the features and used in the ANN
model to make decisions in the clinic. The simulation results show
that a trained multilayer back-propagation neural network model can
help a doctor accept or reject a plan efficiently. In addition, the
models are flexible and whenever a new treatment technique enters
the market, the feature variables simply need to be imported and the
model re-trained for it to be ready for use.
Abstract: Longitudinal data typically have the characteristics of
changes over time, nonlinear growth patterns, between-subjects
variability, and the within errors exhibiting heteroscedasticity and
dependence. The data exploration is more complicated than that of
cross-sectional data. The purpose of this paper is to organize/integrate
of various visual-graphical techniques to explore longitudinal data.
From the application of the proposed methods, investigators can
answer the research questions include characterizing or describing the
growth patterns at both group and individual level, identifying the time
points where important changes occur and unusual subjects, selecting
suitable statistical models, and suggesting possible within-error
variance.
Abstract: Model-based approaches have been applied successfully
to a wide range of tasks such as specification, simulation, testing, and
diagnosis. But one bottleneck often prevents the introduction of these
ideas: Manual modeling is a non-trivial, time-consuming task.
Automatically deriving models by observing and analyzing running
systems is one possible way to amend this bottleneck. To
derive a model automatically, some a-priori knowledge about the
model structure–i.e. about the system–must exist. Such a model
formalism would be used as follows: (i) By observing the network
traffic, a model of the long-term system behavior could be generated
automatically, (ii) Test vectors can be generated from the model,
(iii) While the system is running, the model could be used to diagnose
non-normal system behavior.
The main contribution of this paper is the introduction of a model
formalism called 'probabilistic regression automaton' suitable for the
tasks mentioned above.
Abstract: Parsing is important in Linguistics and Natural
Language Processing to understand the syntax and semantics of a
natural language grammar. Parsing natural language text is
challenging because of the problems like ambiguity and inefficiency.
Also the interpretation of natural language text depends on context
based techniques. A probabilistic component is essential to resolve
ambiguity in both syntax and semantics thereby increasing accuracy
and efficiency of the parser. Tamil language has some inherent
features which are more challenging. In order to obtain the solutions,
lexicalized and statistical approach is to be applied in the parsing
with the aid of a language model. Statistical models mainly focus on
semantics of the language which are suitable for large vocabulary
tasks where as structural methods focus on syntax which models
small vocabulary tasks. A statistical language model based on Trigram
for Tamil language with medium vocabulary of 5000 words has
been built. Though statistical parsing gives better performance
through tri-gram probabilities and large vocabulary size, it has some
disadvantages like focus on semantics rather than syntax, lack of
support in free ordering of words and long term relationship. To
overcome the disadvantages a structural component is to be
incorporated in statistical language models which leads to the
implementation of hybrid language models. This paper has attempted
to build phrase structured hybrid language model which resolves
above mentioned disadvantages. In the development of hybrid
language model, new part of speech tag set for Tamil language has
been developed with more than 500 tags which have the wider
coverage. A phrase structured Treebank has been developed with 326
Tamil sentences which covers more than 5000 words. A hybrid
language model has been trained with the phrase structured Treebank
using immediate head parsing technique. Lexicalized and statistical
parser which employs this hybrid language model and immediate
head parsing technique gives better results than pure grammar and
trigram based model.
Abstract: The production of a plant can be measured in terms of
seeds. The generation of seeds plays a critical role in our social and
daily life. The fruit production which generates seeds, depends on the
various parameters of the plant, such as shoot length, leaf number,
root length, root number, etc When the plant is growing, some leaves
may be lost and some new leaves may appear. It is very difficult to
use the number of leaves of the tree to calculate the growth of the
plant.. It is also cumbersome to measure the number of roots and
length of growth of root in several time instances continuously after
certain initial period of time, because roots grow deeper and deeper
under ground in course of time. On the contrary, the shoot length of
the tree grows in course of time which can be measured in different
time instances. So the growth of the plant can be measured using the
data of shoot length which are measured at different time instances
after plantation. The environmental parameters like temperature, rain
fall, humidity and pollution are also play some role in production of
yield. The soil, crop and distance management are taken care to
produce maximum amount of yields of plant. The data of the growth
of shoot length of some mustard plant at the initial stage (7,14,21 &
28 days after plantation) is available from the statistical survey by a
group of scientists under the supervision of Prof. Dilip De. In this
paper, initial shoot length of Ken( one type of mustard plant) has
been used as an initial data. The statistical models, the methods of
fuzzy logic and neural network have been tested on this mustard
plant and based on error analysis (calculation of average error) that
model with minimum error has been selected and can be used for the
assessment of shoot length at maturity. Finally, all these methods
have been tested with other type of mustard plants and the particular
soft computing model with the minimum error of all types has been
selected for calculating the predicted data of growth of shoot length.
The shoot length at the stage of maturity of all types of mustard
plants has been calculated using the statistical method on the
predicted data of shoot length.