Abstract: This paper explores the effectiveness of machine
learning techniques in detecting firms that issue fraudulent financial
statements (FFS) and deals with the identification of factors
associated with FFS. To this end, a number of experiments have been
conducted using representative learning algorithms, which were
trained using a data set of 164 fraud and non-fraud Greek firms in the
recent period 2001-2002. The decision of which particular method to
choose is a complicated problem. A good alternative to choosing
only one method is to create a hybrid forecasting system
incorporating a number of possible solution methods as components
(an ensemble of classifiers). For this purpose, we have implemented
a hybrid decision support system that combines the representative
algorithms using a stacking variant methodology and achieves better
performance than any examined simple and ensemble method. To
sum up, this study indicates that the investigation of financial
information can be used in the identification of FFS and underlines the
importance of financial ratios.
Abstract: Recent years have seen a growing trend towards the
integration of multiple information sources to support large-scale
prediction of protein-protein interaction (PPI) networks in model
organisms. Despite advances in computational approaches, the
combination of multiple “omic” datasets representing the same type
of data, e.g. different gene expression datasets, has not been
rigorously studied. Furthermore, there is a need to investigate
the inference capability of powerful approaches, such as fully connected
Bayesian networks, in the context of the prediction of PPI
networks. This paper addresses these limitations by proposing a
Bayesian approach to integrate multiple datasets, some of which
encode the same type of “omic” data, to support the identification of
PPI networks. The case study reported involved the combination of
three gene expression datasets relevant to human heart failure (HF).
In comparison with two traditional methods, Naive Bayesian and
maximum likelihood ratio approaches, the proposed technique can
accurately identify known PPI and can be applied to infer potentially
novel interactions.
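For orientation, the naive Bayesian baseline mentioned above is often formulated as multiplying per-dataset likelihood ratios into the prior odds of interaction, which assumes the datasets are conditionally independent. The following Python sketch illustrates only that baseline, not the approach proposed in the paper; the function names, the prior odds and the likelihood ratios are hypothetical values chosen for illustration.

```python
from math import prod


def posterior_odds(prior_odds, likelihood_ratios):
    """Combine evidence under the naive Bayes independence assumption.

    posterior odds = prior odds * product of per-dataset likelihood ratios
    """
    return prior_odds * prod(likelihood_ratios)


def odds_to_probability(odds):
    """Convert odds O into a probability O / (1 + O)."""
    return odds / (1.0 + odds)


# Hypothetical inputs: one interacting pair expected per 600 candidate pairs,
# and one likelihood ratio per gene expression dataset for a given protein pair.
prior = 1 / 600
ratios = [8.0, 5.0, 3.0]

odds = posterior_odds(prior, ratios)
probability = odds_to_probability(odds)
print(f"posterior odds = {odds:.3f}, probability = {probability:.3f}")
```

A fully connected Bayesian network drops the independence assumption by modelling the joint distribution of the evidence, which is why the two approaches can disagree when datasets of the same type are correlated.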
Abstract: Most Decision Support Systems (DSS) constructed for
waste management (WM) are not widely marketed and lack
practical applications. This is due to the number of variables and
complexity of the mathematical models which include the
assumptions and constraints required in decision making. The
approach taken by many researchers in DSS modelling is to isolate a
few key factors that have a significant influence on the DSS. This
segmented approach does not provide a thorough understanding of
the complex relationships of the many elements involved. The
various elements in constructing the DSS must be integrated and
optimized in order to produce a viable model that is marketable and
has practical application. The DSS model used in assisting decision
makers should be integrated with GIS, give robust predictions
despite the inherent uncertainties of waste generation and the plethora
of waste characteristics, and give an optimal allocation of waste streams
for recycling, incineration, landfill and composting.
Abstract: The main aim of the current study was to examine the
effect of emotional intelligence on retention. The study also aimed at
analyzing the role of job involvement, as a moderator, in the effect of
emotional intelligence on retention. Using data gathered from 241
employees working for hotels and tourism corporations listed on
the Amman Stock Exchange in Jordan, emotional intelligence, job
involvement and retention were measured. Hierarchical regression
analyses were used to test the three main hypotheses. Results
indicated that retention was related to emotional intelligence.
Moreover, the study yielded support for the claim that job
involvement had a moderating effect on the relationship between
emotional intelligence and retention.
Abstract: The one-class support vector machine “support vector
data description” (SVDD) is an ideal approach for anomaly or outlier
detection. However, ease of use is crucial for applying SVDD in
real-world settings. The results of SVDD are
strongly determined by the choice of the regularisation parameter C
and the kernel parameter of the widely used RBF kernel. While for
two-class SVMs the parameters can be tuned using cross-validation
based on the confusion matrix, for a one-class SVM this is not
possible, because only true positives and false negatives can occur
during training. This paper proposes an approach to find the optimal
set of parameters for SVDD solely based on a training set from
one class and without any user parameterisation. Results on artificial
and real data sets are presented, underpinning the usefulness of the
approach.
Abstract: Recently, X. Ge and J. Qian investigated some relations between higher mathematics scores and calculus scores (resp. linear algebra scores, probability statistics scores) for Chinese university students. Based on rough-set theory, they established an information system S = (U, C ∪ D, V, f). In this information system, the higher mathematics score was taken as the decision attribute, and the calculus score, linear algebra score and probability statistics score were taken as condition attributes. They investigated the importance of each condition attribute with respect to the decision attribute and the strength with which each condition attribute supports the decision attribute. In this paper, we investigate this issue further. Based on the above information system S = (U, C ∪ D, V, f), we analyze the decision rules between condition and decision granules. For each x ∈ U, we obtain the support (resp. strength, certainty factor, coverage factor) of the decision rule C →_x D, where C →_x D is the decision rule induced by x in S = (U, C ∪ D, V, f). The results of this paper give a new analysis of higher mathematics scores for Chinese university students, which can further help Chinese university students to raise their higher mathematics scores in the Chinese graduate student entrance examination.
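The four rule measures can be sketched in Python, assuming the standard rough-set definitions: support is |C(x) ∩ D(x)|, strength is support / |U|, certainty is support / |C(x)|, and coverage is support / |D(x)|, where C(x) and D(x) are the condition and decision granules containing x. The decision table, attribute names and values below are invented toy data, not results from the paper.

```python
from collections import Counter


def decision_rule_metrics(table, condition_attrs, decision_attr):
    """For each object x, compute the measures of the rule it induces."""
    n = len(table)

    def cond_key(row):
        # Objects with equal condition values fall into the same granule C(x).
        return tuple(row[a] for a in condition_attrs)

    c_sizes = Counter(cond_key(r) for r in table)                      # |C(x)|
    d_sizes = Counter(r[decision_attr] for r in table)                 # |D(x)|
    cd_sizes = Counter((cond_key(r), r[decision_attr]) for r in table)  # |C(x) ∩ D(x)|

    metrics = []
    for row in table:
        support = cd_sizes[(cond_key(row), row[decision_attr])]
        metrics.append({
            "support": support,
            "strength": support / n,
            "certainty": support / c_sizes[cond_key(row)],
            "coverage": support / d_sizes[row[decision_attr]],
        })
    return metrics


# Hypothetical discretised scores for four students.
conditions = ["calculus", "linear_algebra", "probability"]
table = [
    {"calculus": "high", "linear_algebra": "high", "probability": "high", "higher_math": "high"},
    {"calculus": "high", "linear_algebra": "high", "probability": "high", "higher_math": "high"},
    {"calculus": "high", "linear_algebra": "high", "probability": "high", "higher_math": "low"},
    {"calculus": "low", "linear_algebra": "low", "probability": "high", "higher_math": "low"},
]

metrics = decision_rule_metrics(table, conditions, "higher_math")
for i, m in enumerate(metrics):
    print(i, m)
```

In this toy table the first three students share one condition granule, so the rule induced by the first student is not fully certain even though it covers every "high" decision.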
Abstract: In networks, it is mainly small and medium-sized businesses that benefit from the knowledge, experience and solutions offered by experts from industry and science, or from exchange with practitioners. Associations which focus, among other things, on networking, information and knowledge transfer and which are interested in supporting such cooperation are especially well suited to provide such networks and the appropriate web platforms. Using METORA as an example – a project developed and run by the Federal Association for Information Economy, Telecommunications and New Media e.V. (BITKOM) for the Federal Ministry of Economics and Technology (BMWi) – this paper discusses how associations and other network organizations can achieve this task and what conditions they have to consider.
Abstract: The new framework in which Higher Education is
immersed involves a complete change in the way lecturers must
teach and students must learn. Whereas the lecturer was the main
character in traditional education, the essential goal now is to
increase the students' participation in the process. Thus, one of the
main tasks of lecturers in this new context is to design activities of
different nature in order to encourage such participation. Seminars
are one of the activities included in this environment. They are active
sessions that allow specific topics to be explored in depth in support of
other activities. They are characterized by features such as
favoring interaction between students and lecturers and improving
students' communication skills. Hence, planning and organizing strategic
seminars aimed at the acquisition of knowledge and abilities is indeed
a great challenge for lecturers. This paper proposes a method
using Artificial Intelligence techniques to obtain student profiles
from their marks and preferences. The goal of building such profiles
is twofold. First, it facilitates the task of splitting the students into
different groups, each group with similar preferences and learning
difficulties. Second, it makes it easy to select adequate candidate
topics for the seminars. The results obtained can either
confirm what the lecturers observed during the development
of the course or prompt them to reconsider the methodological strategies
used in certain topics.
Abstract: In the automotive industry, test drives are conducted
during the development of new vehicle models or as a part of
quality assurance of series-production vehicles. The communication
on the in-vehicle network, data from external sensors, or internal
data from the electronic control units is recorded by automotive
data loggers during the test drives. The recordings are used for fault
analysis. Since the resulting data volume is tremendous, manually
analysing each recording in great detail is not feasible.
This paper proposes to use machine learning to support domain experts
by sparing them from examining irrelevant data and
pointing them instead to the relevant parts in the recordings. The
underlying idea is to learn the normal behaviour from available
recordings, i.e. a training set, and then to autonomously detect
unexpected deviations and report them as anomalies.
The one-class support vector machine “support vector data description”
(SVDD) is utilised to calculate distances of feature vectors. SVDDSUBSEQ
is proposed as a novel approach that allows subsequences
in multivariate time series data to be classified. The approach allows
unexpected faults to be detected without modelling effort, as is shown by
experimental results on recordings from test drives.
Abstract: This study reports the implementation of Good
Manufacturing Practice (GMP) in a polycarbonate film processing
plant. The implementation of GMP took place with the creation of a
multidisciplinary team. It was carried out in four steps: conduct gap
assessment, create gap closure plan, close gaps, and follow up the
GMP implementation. The basis for the gap assessment is the
guideline for GMP for plastic materials and articles intended for Food
Contact Material (FCM), which was edited by Plastic Europe. The
results of the GMP implementation in this study showed
100% completion of the gap assessment. The key success factors for
implementing GMP in the production process are the commitment,
intention and support of top management.
Abstract: A key to the success of high-quality software development
is to define a valid and feasible requirements specification. We have
proposed a method of model-driven requirements analysis using
Unified Modeling Language (UML). The main feature of our method
is to automatically generate a Web user interface mock-up from a UML
requirements analysis model so that we can confirm the validity of
input/output data for each page and page transition on the system by
directly operating the mock-up. This paper proposes a support method
to check the validity of a data life cycle by using the model checking tool
“UPPAAL”, focusing on CRUD (Create, Read, Update and Delete).
Exhaustive checking improves the quality of requirements analysis
models, which are validated by the customers through the automatically
generated mock-up. The effectiveness of our method is discussed through a
case study of requirements modeling for two small projects: a
library management system and a sales support system for textbooks
in a university.
Abstract: Despite the widespread application of finite
mixture models in segmentation, finite mixture model selection is
still an important issue. In fact, the selection of an adequate number
of segments is a key issue in deriving latent segment structures, and
it is desirable that the selection criteria used for this end are effective.
In order to select among several information criteria that may
support the selection of the correct number of segments, we conduct a
simulation study. In particular, this study is intended to determine
which information criteria are more appropriate for mixture model
selection when considering data sets with only categorical
segmentation base variables. The generation of mixtures of
multinomial data supports the proposed analysis. As a result, we
establish a relationship between the level of measurement of
segmentation variables and the performance of some (eleven) information
criteria. The criterion AIC3 shows better performance (it
indicates the correct number of segments of the simulated structure
more often) when referring to mixtures of multinomial segmentation
base variables.
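As a small illustration of how such criteria are applied, the Python sketch below computes AIC, AIC3 and BIC from a fitted model's log-likelihood and picks the number of segments that minimises the chosen criterion. Only three of the eleven criteria are shown, using their usual definitions (a penalty of 2, 3 or ln n per free parameter); the log-likelihoods, parameter counts and sample size are hypothetical numbers, not results from the simulation study.

```python
from math import log


def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, AIC3 and BIC for one fitted mixture model (lower is better)."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2.0 * n_params,
        "AIC3": deviance + 3.0 * n_params,
        "BIC": deviance + n_params * log(n_obs),
    }


def select_n_segments(fits, n_obs, criterion="AIC3"):
    """Pick the number of segments with the smallest criterion value.

    fits maps number of segments -> (log-likelihood, number of free parameters).
    """
    scores = {
        s: information_criteria(ll, k, n_obs)[criterion]
        for s, (ll, k) in fits.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical fits of 1-, 2- and 3-segment multinomial mixtures to 500 cases.
fits = {1: (-1250.0, 4), 2: (-1180.0, 9), 3: (-1176.0, 14)}

best, scores = select_n_segments(fits, n_obs=500, criterion="AIC3")
print(best, scores)
```

With these made-up numbers the third segment improves the likelihood too little to offset the extra parameters, so the two-segment solution is selected.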
Abstract: It is well recognized that one feature that makes a
company successful is its ability to align its business goals
with its information and communication technologies platform.
Enterprise Resource Planning (ERP) systems contribute to better
performance by integrating various business functions and
providing support for information flows. However, the complexity of
these technological systems is known to prevent business users from
exploiting ERP systems efficiently.
This paper aims to investigate the role of training in improving the
usage of ERP systems. To this end, we have designed a survey
instrument for employees of a Norwegian multinational global provider of
technology solutions. Based on the analysis of the collected data, we
have delineated a training model that could be of high relevance for
both researchers and practitioners as a step towards a better
understanding of ERP system implementation.
Abstract: Participation in sporting activities can lead to injury.
Sport injuries have been widely studied in many sports including the
more extreme categories of aquatic board sports. Kitesurfing is a
relatively new water surface action sport, and has not yet been
widely studied in terms of injuries and stress on the body. The aim of
this study was to gather information about which injuries are most
common among kitesurfing participants, where they occur, and their
causes. Injuries were studied using an international open web
questionnaire (n=206).
The results showed that many respondents reported injuries,
251 in total, including injuries to the knee (24%), ankle (17%), trunk (16%) and
shoulders (10%), often sustained while doing jumps and tricks
(40%). Among the reported injuries were joint injuries (n=101),
muscle/tendon damages (n=47), wounds and cuts (n=36) and bone
fractures (n=28). Environmental factors and equipment can also
influence the risk of injury, or the extent of injury in a hazardous
situation. In conclusion, the information from this retrospective study
supports earlier studies in terms of prevalence and site of injuries.
This information should be used to build a
foundation of knowledge about the sport for the development of
applications for physical training and for product development.
Abstract: The research objective of the project and article “The impact of Structural Funds on the growth of competitiveness of Polish agriculture” is to assess the competitiveness of regions in Poland from the perspective of Polish agriculture by analysing the efficiency of the use of Structural Funds, the economic procedure of their distribution and the regulatory and organisational framework under the Rural Development Programme (RDP). It must be stressed that defining the scope of research in the above manner limits the analysis only to the part of Structural Funds directed to support Polish agriculture.
Abstract: Sickness absence represents a major economic and
social issue. Analysis of sick leave data is a recurrent challenge to analysts because of the complexity of the data structure which is
often time dependent, highly skewed and clumped at zero. Ignoring these features to make statistical inference is likely to be inefficient
and misguided. Traditional approaches do not address these problems. In this study, we discuss model methodologies in terms of statistical techniques for addressing the difficulties with sick leave data. We also introduce and demonstrate a new method by performing a longitudinal assessment of long-term absenteeism using
a large registration dataset as a working example available from the Helsinki Health Study for municipal employees from Finland during the period of 1990-1999. We present a comparative study on model
selection and a critical analysis of the temporal trends, the occurrence
and degree of long-term sickness absences among municipal employees. The strengths of this working example include the large
sample size and the long follow-up period, which provide strong evidence in support of the new model. Our main goal is to propose a way to
select an appropriate model and to introduce a new methodology for analysing sickness absence data as well as to demonstrate model
applicability to complicated longitudinal data.
Abstract: According to statistics, the prevalence of congenital hearing loss in Taiwan is approximately six per thousand; furthermore, one infant per thousand has severe hearing impairment. Hearing ability during infancy has a significant impact on the future development of children's oral expression, language maturity, cognitive performance, educational ability and social behavior. Although most children born with hearing impairment have sensorineural hearing loss, almost every child still retains some residual hearing. If provided in time with a hearing aid or cochlear implant (a bionic ear), in addition to hearing and speech training, even severely hearing-impaired children can still learn to talk. On the other hand, those who are not diagnosed, and are thus unable to begin hearing and speech rehabilitation in a timely manner, might lose an important opportunity to live a complete and healthy life. Eventually, the lack of hearing and speaking ability will affect the development of mental and physical functions, intelligence, and social adaptability. Not only will this problem result in lifelong, irreparable regret for the hearing-impaired child, but it will also create a heavy burden for the family and society. Therefore, it is necessary to establish a computer-assisted predictive model that can accurately detect and help diagnose newborn hearing loss, so that early interventions can be provided in time and waste of medical resources avoided. This study uses information from the neonatal database of the case hospital and adopts two analysis methods: using a support vector machine (SVM) alone for model prediction, and using logistic regression to screen factors prior to model prediction with the SVM. The results indicate that prediction accuracy is as high as 96.43% when the factors are screened and selected through logistic regression. Hence, the model constructed in this study can be of real help to physicians in clinical diagnosis and beneficial to early intervention for newborn hearing impairment.
Abstract: Data mining is the process of sifting through large
volumes of data, analyzing data from different perspectives and
summarizing it into useful information. One of the widely used
desktop applications for data mining is the Weka tool, a
collection of machine learning algorithms implemented
in Java and released as open source under the GNU General Public License (GPL). A
web service is a software system designed to support interoperable
machine to machine interaction over a network using SOAP
messages. Unlike a desktop application, a web service is easy to
upgrade, deliver and access and does not occupy any memory on the
system. Keeping in mind the advantages of a web service over a
desktop application, in this paper we demonstrate how this Java-based
desktop data mining application can be implemented as a web
service to support data mining across the internet.
Abstract: Project managers are ultimately responsible for the
overall characteristics of a project, i.e. they should deliver the project
on time with minimum cost and with maximum quality. It is vital for
any manager to decide on a trade-off between these conflicting
objectives, and managers would benefit from any scientific decision
support tool. Our work tries to determine a set of optimal solutions (rather
than a single optimal solution) from which the project manager can
select the preferred one to run the project. In this paper, the
project scheduling problem denoted as
(1,T|cpm,disc,mu|curve:quality,time,cost) is studied. The
problem is multi-objective, and the purpose is to find the Pareto
optimal front of time, cost and quality of a project
(curve:quality,time,cost) whose activities belong to a start-to-finish
activity relationship network (cpm) and can be done in different
possible modes (mu), which are non-continuous or discrete (disc);
each mode has a different cost, time and quality. The project is
constrained by a non-renewable resource, i.e. money (1,T). Because
the problem is NP-hard, a meta-heuristic is
developed to solve it, based on FastPGA, a version of the genetic algorithm
specially adapted to multi-objective problems. A sample project
with 30 activities is generated and then solved by the proposed
method.
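To make the notion of a Pareto optimal front concrete, the Python sketch below filters a handful of candidate schedules down to the non-dominated ones, with time and cost minimised and quality maximised. It is a brute-force illustration of Pareto dominance only, not FastPGA or the meta-heuristic developed in the paper; the function names and the candidate (time, cost, quality) values are hypothetical.

```python
def dominates(a, b):
    """True if schedule a dominates schedule b.

    Each schedule is a (time, cost, quality) tuple; time and cost are
    minimised, quality is maximised.
    """
    no_worse = a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[0] < b[0] or a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [
        s for s in solutions
        if not any(dominates(other, s) for other in solutions)
    ]


# Hypothetical candidate schedules: (duration, cost, quality score).
candidates = [
    (30, 100, 0.80),
    (25, 120, 0.85),
    (30, 110, 0.75),  # dominated by (30, 100, 0.80)
    (40, 90, 0.90),
    (26, 125, 0.85),  # dominated by (25, 120, 0.85)
]

front = pareto_front(candidates)
print(front)
```

The pairwise filter costs quadratic time in the number of candidates, which is acceptable for inspecting a final population but is one reason evolutionary methods use more specialised ranking schemes during the search.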
Abstract: One important objective in Precision Agriculture is to minimize the volume of herbicides that are applied to the fields through the use of site-specific weed management systems. In order to reach this goal, two major factors need to be considered: 1) the similarity in spectral signature, shape and texture between weeds and crops; 2) the irregular distribution of the weeds within the crop field. This paper outlines an automatic computer vision system for the detection and differential spraying of Avena sterilis, a noxious weed growing in cereal crops. The proposed system involves two processes: image segmentation and decision making. Image segmentation combines suitable basic image processing techniques in order to extract cells from the image as the low-level units. Each cell is described by two area-based attributes measuring the relations among the crops and the weeds. From these attributes, a hybrid decision making approach determines whether or not a cell must be sprayed. The hybrid approach uses the Support Vector Machines and the Fuzzy k-Means methods, combined through the fuzzy aggregation theory. This constitutes the main finding of this paper. The performance of the method is compared against other available strategies.