Abstract: Lung CT image segmentation is a prerequisite in lung
CT image analysis. Most of the conventional methods need a
post-processing to deal with the abnormal lung CT scans such as
lung nodules or other lesions. The simplest similarity measure in
the standard Graph Cuts Algorithm consists of directly comparing
the pixel values of the two neighboring regions, which is not
accurate because this kind of metrics is extremely sensitive to minor
transformations such as noise or other artifacts problems. In this work,
we propose an improved version of the standard graph cuts algorithm
based on the Patch-Based similarity metric. The boundary penalty
term in the graph cut algorithm is defined Based on Patch-Based
similarity measurement instead of the simple intensity measurement
in the standard method. The weights between each pixel and its
neighboring pixels are Based on the obtained new term. The graph
is then created using theses weights between its nodes. Finally,
the segmentation is completed with the minimum cut/Max-Flow
algorithm. Experimental results show that the proposed method is
very accurate and efficient, and can directly provide explicit lung
regions without any post-processing operations compared to the
standard method.
Abstract: This paper presents a method for improving object search accuracy using a deep learning model. A major limitation to provide accurate similarity with deep learning is the requirement of huge amount of data for training pairwise similarity scores (metrics), which is impractical to collect. Thus, similarity scores are usually trained with a relatively small dataset, which comes from a different domain, causing limited accuracy on measuring similarity. For this reason, this paper proposes a deep learning model that can be trained with a significantly small amount of data, a clustered data which of each cluster contains a set of visually similar images. In order to measure similarity distance with the proposed method, visual features of two images are extracted from intermediate layers of a convolutional neural network with various pooling methods, and the network is trained with pairwise similarity scores which is defined zero for images in identical cluster. The proposed method outperforms the state-of-the-art object similarity scoring techniques on evaluation for finding exact items. The proposed method achieves 86.5% of accuracy compared to the accuracy of the state-of-the-art technique, which is 59.9%. That is, an exact item can be found among four retrieved images with an accuracy of 86.5%, and the rest can possibly be similar products more than the accuracy. Therefore, the proposed method can greatly reduce the amount of training data with an order of magnitude as well as providing a reliable similarity metric.
Abstract: With the rapid development of urbanization process in China, its environmental protection pressure is severely tested. So, analyzing and optimizing the landscape pattern is an important measure to ease the pressure on the ecological environment. This paper takes Wuhan Urban Development Zone as the research object, and studies its landscape pattern evolution and quantitative optimization strategy. First, remote sensing image data from 1990 to 2015 were interpreted by using Erdas software. Next, the landscape pattern index of landscape level, class level, and patch level was studied based on Fragstats. Then five indicators of ecological environment based on National Environmental Protection Standard of China were selected to evaluate the impact of landscape pattern evolution on the ecological environment. Besides, the cost distance analysis of ArcGIS was applied to simulate wildlife migration thus indirectly measuring the improvement of ecological environment quality. The result shows that the area of land for construction increased 491%. But the bare land, sparse grassland, forest, farmland, water decreased 82%, 47%, 36%, 25% and 11% respectively. They were mainly converted into construction land. On landscape level, the change of landscape index all showed a downward trend. Number of patches (NP), Landscape shape index (LSI), Connection index (CONNECT), Shannon's diversity index (SHDI), Aggregation index (AI) separately decreased by 2778, 25.7, 0.042, 0.6, 29.2%, all of which indicated that the NP, the degree of aggregation and the landscape connectivity declined. On class level, the construction land and forest, CPLAND, TCA, AI and LSI ascended, but the Distribution Statistics Core Area (CORE_AM) decreased. As for farmland, water, sparse grassland, bare land, CPLAND, TCA and DIVISION, the Patch Density (PD) and LSI descended, yet the patch fragmentation and CORE_AM increased. On patch level, patch area, Patch perimeter, Shape index of water, farmland and bare land continued to decline. The three indexes of forest patches increased overall, sparse grassland decreased as a whole, and construction land increased. It is obvious that the urbanization greatly influenced the landscape evolution. Ecological diversity and landscape heterogeneity of ecological patches clearly dropped. The Habitat Quality Index continuously declined by 14%. Therefore, optimization strategy based on greenway network planning is raised for discussion. This paper contributes to the study of landscape pattern evolution in planning and design and to the research on spatial layout of urbanization.
Abstract: An online performance management system was evaluated, and recommendations were made to improve the system. The study shows the effects of not adhering to the established web design principles and conventions. Furthermore, the study indicates that if the online performance management system is not well designed, it may have negative effects on the overall usability of the system and these negative effects will have consequences for both the employer and employees. The evaluation was done in terms of the usability metrics of effectiveness, efficiency and user satisfaction. Effectiveness was measured in terms of the success rate with which users could execute prescribed tasks in a sandbox system. Efficiency was expressed in terms of the time it took participants to understand what is expected of them and to execute the tasks. Post-test questionnaires were used in order to determine the satisfaction of the participants. Recommendations were made to improve the usability of the online performance management system.
Abstract: Where human beings can easily learn and adopt pronunciation variations, machines need training before put into use. Also humans keep minimum vocabulary and their pronunciation variations are stored in front-end of their memory for ready reference, while machines keep the entire pronunciation dictionary for ready reference. Supervised methods are used for preparation of pronunciation dictionaries which take large amounts of manual effort, cost, time and are not suitable for real time use. This paper presents an unsupervised adaptation model for building agile and dynamic pronunciation dictionaries online. These methods mimic human approach in learning the new pronunciations in real time. A new algorithm for measuring sound distances called Dynamic Phone Warping is presented and tested. Performance of the system is measured using an adaptation model and the precision metrics is found to be better than 86 percent.
Abstract: The production and publication of scientific works have increased significantly in the last years, being the Internet the main factor of access and distribution of these works. Faced with this, there is a growing interest in understanding how scientific research has evolved, in order to explore this knowledge to encourage research groups to become more productive. Therefore, the objective of this work is to explore repositories containing data from scientific publications and to characterize keyword networks of these publications, in order to identify the most relevant keywords, and to highlight those that have the greatest impact on the network. To do this, each article in the study repository has its keywords extracted and in this way the network is characterized, after which several metrics for social network analysis are applied for the identification of the highlighted keywords.
Abstract: In order to survive on the market, companies must
constantly develop improved and new products. These products are
designed to serve the needs of their customers in the best possible
way. The creation of new products is also called innovation and is
primarily driven by a company’s internal research and development
department. However, a new approach has been taking place for some
years now, involving external knowledge in the innovation process.
This approach is called open innovation and identifies customer
knowledge as the most important source in the innovation process. This paper presents a concept of using social media posts as an external source to support the open innovation approach in its
initial phase, the Ideation phase. For this purpose, the social media
posts are semantically structured with the help of an ontology and
the authors are evaluated using graph-theoretical metrics such as
density. For the structuring and evaluation of relevant social media
posts, we also use the findings of Natural Language Processing, e.
g. Named Entity Recognition, specific dictionaries, Triple Tagger
and Part-of-Speech-Tagger. The selection and evaluation of the tools
used are discussed in this paper. Using our ontology and metrics
to structure social media posts enables users to semantically search
these posts for new product ideas and thus gain an improved insight
into the external sources such as customer needs.
Abstract: Crop yield prediction is a paramount issue in
agriculture. The main idea of this paper is to find out efficient
way to predict the yield of corn based meteorological records.
The prediction models used in this paper can be classified into
model-driven approaches and data-driven approaches, according to
the different modeling methodologies. The model-driven approaches are based on crop mechanistic
modeling. They describe crop growth in interaction with their
environment as dynamical systems. But the calibration process of
the dynamic system comes up with much difficulty, because it
turns out to be a multidimensional non-convex optimization problem.
An original contribution of this paper is to propose a statistical
methodology, Multi-Scenarios Parameters Estimation (MSPE), for the
parametrization of potentially complex mechanistic models from a
new type of datasets (climatic data, final yield in many situations).
It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction
is free of the complex biophysical process. But it has some strict
requirements about the dataset.
A second contribution of the paper is the comparison of these
model-driven methods with classical data-driven methods. For this
purpose, we consider two classes of regression methods, methods
derived from linear regression (Ridge and Lasso Regression, Principal
Components Regression or Partial Least Squares Regression) and
machine learning methods (Random Forest, k-Nearest Neighbor,
Artificial Neural Network and SVM regression).
The dataset consists of 720 records of corn yield at county scale
provided by the United States Department of Agriculture (USDA) and
the associated climatic data. A 5-folds cross-validation process and
two accuracy metrics: root mean square error of prediction(RMSEP),
mean absolute error of prediction(MAEP) were used to evaluate the
crop prediction capacity.
The results show that among the data-driven approaches, Random
Forest is the most robust and generally achieves the best prediction
error (MAEP 4.27%). It also outperforms our model-driven approach
(MAEP 6.11%). However, the method to calibrate the mechanistic
model from dataset easy to access offers several side-perspectives.
The mechanistic model can potentially help to underline the stresses
suffered by the crop or to identify the biological parameters of interest
for breeding purposes. For this reason, an interesting perspective is
to combine these two types of approaches.
Abstract: With recent trends in Big Data and advancements
in Information and Communication Technologies, the healthcare
industry is at the stage of its transition from clinician oriented to
technology oriented. Many people around the world die of cancer
because the diagnosis of disease was not done at an early stage.
Nowadays, the computational methods in the form of Machine
Learning (ML) are used to develop automated decision support
systems that can diagnose cancer with high confidence in a timely
manner. This paper aims to carry out the comparative evaluation
of a selected set of ML classifiers on two existing datasets: breast
cancer and cervical cancer. The ML classifiers compared in this study
are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest
Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and
Artificial Neural Networks (ANN). The evaluation is carried out based
on standard evaluation metrics Precision (P), Recall (R), F1-score and
Accuracy. The experimental results based on the evaluation metrics
show that ANN showed the highest-level accuracy (99.4%) when
tested with breast cancer dataset. On the other hand, when these
ML classifiers are tested with the cervical cancer dataset, Ensemble
(Bagged Tree) technique gave better accuracy (93.1%) in comparison
to other classifiers.
Abstract: Even though past, current and future trends suggest that multicore and cloud computing systems are increasingly prevalent/ubiquitous, this class of parallel systems is nonetheless underutilized, in general, and barely used for research on employing parallel Delaunay triangulation for parallel surface modeling and generation, in particular. The performances, of actual/physical and virtual/cloud multicore systems/machines, at executing various algorithms, which implement various parallelization strategies of the incremental insertion technique of the Delaunay triangulation algorithm, were evaluated. T-tests were run on the data collected, in order to determine whether various performance metrics differences (including execution time, speedup and efficiency) were statistically significant. Results show that the actual machine is approximately twice faster than the virtual machine at executing the same programs for the various parallelization strategies. Results, which furnish the scalability behaviors of the various parallelization strategies, also show that some of the differences between the performances of these systems, during different runs of the algorithms on the systems, were statistically significant. A few pseudo superlinear speedup results, which were computed from the raw data collected, are not true superlinear speedup values. These pseudo superlinear speedup values, which arise as a result of one way of computing speedups, disappear and give way to asymmetric speedups, which are the accurate kind of speedups that occur in the experiments performed.
Abstract: H-index has been widely used as a performance indicator of researchers around the world especially in Indonesia. The Government uses Scopus and Google scholar as indexing references in providing recognition and appreciation. However, those two indexing services yield to different H-index values. For that purpose, this paper evaluates the difference of the H-index from those services. Researchers indexed by Webometrics, are used as reference’s data in this paper. Currently, Webometrics only uses H-index from Google Scholar. This paper observed and compared corresponding researchers’ data from Scopus to get their H-index score. Subsequently, some researchers with huge differences in score are observed in more detail on their paper’s publisher. This paper shows that the H-index of researchers in Google Scholar is approximately 2.45 times of their Scopus H-Index. Most difference exists due to the existence of uncertified publishers, which is considered in Google Scholar but not in Scopus.
Abstract: Motor vehicle related pedestrian road traffic collisions are a major road safety challenge, since they are a leading cause of death and serious injury worldwide, contributing to a third of the global disease burden. The auto rickshaw, which is a common form of urban transport in many developing countries, plays a major transport role, both as a vehicle for hire and for private use. The most common auto rickshaws are quite unlike ‘typical’ four-wheel motor vehicle, being typically characterised by three wheels, a non-tilting sheet-metal body or open frame construction, a canvas roof and side curtains, a small drivers’ cabin, handlebar controls and a passenger space at the rear. Given the propensity, in developing countries, for auto rickshaws to be used in mixed cityscapes, where pedestrians and vehicles share the roadway, the potential for auto rickshaw impacts with pedestrians is relatively high. Whilst auto rickshaws are used in some Western countries, their limited number and spatial separation from pedestrian walkways, as a result of city planning, has not resulted in significant accident statistics. Thus, auto rickshaws have not been subject to the vehicle impact related pedestrian crash kinematic analyses and/or injury mechanics assessment, typically associated with motor vehicle development in Western Europe, North America and Japan. This study presents a parametric analysis of auto rickshaw related pedestrian impacts by computational simulation, using a Finite Element model of an auto rickshaw and an LS-DYNA 50th percentile male Hybrid III Anthropometric Test Device (dummy). Parametric variables include auto rickshaw impact velocity, auto rickshaw impact region (front, centre or offset) and relative pedestrian impact position (front, side and rear). The output data of each impact simulation was correlated against reported injury metrics, Head Injury Criterion (front, side and rear), Neck injury Criterion (front, side and rear), Abbreviated Injury Scale and reported risk level and adds greater understanding to the issue of auto rickshaw related pedestrian injury risk. The parametric analyses suggest that pedestrians are subject to a relatively high risk of injury during impacts with an auto rickshaw at velocities of 20 km/h or greater, which during some of the impact simulations may even risk fatalities. The present study provides valuable evidence for informing a series of recommendations and guidelines for making the auto rickshaw safer during collisions with pedestrians. Whilst it is acknowledged that the present research findings are based in the field of safety engineering and may over represent injury risk, compared to “Real World” accidents, many of the simulated interactions produced injury response values significantly greater than current threshold curves and thus, justify their inclusion in the study. To reduce the injury risk level and increase the safety of the auto rickshaw, there should be a reduction in the velocity of the auto rickshaw and, or, consideration of engineering solutions, such as retro fitting injury mitigation technologies to those auto rickshaw contact regions which are the subject of the greatest risk of producing pedestrian injury.
Abstract: This paper examines the forecasting performance of Autoregressive Integrated Moving Average (ARIMA) and Artificial Neural Networks (ANN) models with the published exchange rate obtained from South African Reserve Bank (SARB). ARIMA is one of the popular linear models in time series forecasting for the past decades. ARIMA and ANN models are often compared and literature revealed mixed results in terms of forecasting performance. The study used the MSE and MAE to measure the forecasting performance of the models. The empirical results obtained reveal the superiority of ARIMA model over ANN model. The findings further resolve and clarify the contradiction reported in literature over the superiority of ARIMA and ANN models.
Abstract: This paper presents the processing and analysis of ECG signals. The study is based on wavelet transform and uses exclusively the MATLAB environment. This study includes removing Baseline wander and further de-noising through wavelet transform and metrics such as signal-to noise ratio (SNR), Peak signal-to-noise ratio (PSNR) and the mean squared error (MSE) are used to assess the efficiency of the de-noising techniques. Feature extraction is subsequently performed whereby signal features such as heart rate, rise and fall levels are extracted and the QRS complex was detected which helped in classifying the ECG signal. The classification is the last step in the analysis of the ECG signals and it is shown that these are successfully classified as Normal rhythm or Abnormal rhythm. The final result proved the adequacy of using wavelet transform for the analysis of ECG signals.
Abstract: The production and publication of scientific works
have increased significantly in the last years, being the Internet
the main factor of access and diffusion of these. In view of this,
researchers from several areas of knowledge have carried out several
studies on scientific production data in order to analyze phenomena
and trends about science. The understanding of how research has
evolved can, for example, serve as a basis for building scientific
policies for further advances in science and stimulating research
groups to become more productive. In this context, the objective
of this work is to analyze the main research topics investigated
along the trajectory of the Brazilian science of researchers working
in the areas of engineering, in order to map scientific knowledge
and identify topics in highlights. To this end, studies are carried
out on the frequency and relationship of the keywords of the set of
scientific articles registered in the existing curricula in the Lattes
Platform of each one of the selected researchers, counting with the
aid of bibliometric analysis features.
Abstract: DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).
Abstract: Air Transport links markets and individuals, making regions more competitive and promoting social and economic development. The assessment of social contribution is the key objective of this paper, focusing on the definition of the components of social dimension and welfare metrics in the national scale. According to a top-down approach, the key dimensions that affect the social welfare are presented. Conventional wisdom is to provide estimations on added value to social issues caused by the air transport development and present the methodology framework for measuring the contribution of transport development in social value chain. Greece is the case study of this paper, providing results from the contribution of air transport infrastructures in national welfare. The application key findings are essential for managers and decision makers to support actions and plans towards economic recovery of an economy presenting strong seasonal characteristics (because of tourism) and suffering from recession.
Abstract: This paper presents the development of an event based Discrete Event Simulation (DES) for a recovery algorithm known Backward Recovery Global Preemptive Utility Accrual Scheduling (BR_GPUAS). This algorithm implements the Backward Recovery (BR) mechanism as a fault recovery solution under the existing Time/Utility Function/ Utility Accrual (TUF/UA) scheduling domain for multiprocessor environment. The BR mechanism attempts to take the faulty tasks back to its initial safe state and then proceeds to re-execute the affected section of the faulty tasks to enable recovery. Considering that faults may occur in the components of any system; a fault tolerance system that can nullify the erroneous effect is necessary to be developed. Current TUF/UA scheduling algorithm uses the abortion recovery mechanism and it simply aborts the erroneous task as their fault recovery solution. None of the existing algorithm in TUF/UA scheduling domain in multiprocessor scheduling environment have considered the transient fault and implement the BR mechanism as a fault recovery mechanism to nullify the erroneous effect and solve the recovery problem in this domain. The developed BR_GPUAS simulator has derived the set of parameter, events and performance metrics according to a detailed analysis of the base model. Simulation results revealed that BR_GPUAS algorithm can saved almost 20-30% of the accumulated utilities making it reliable and efficient for the real-time application in the multiprocessor scheduling environment.
Abstract: Energy production optimization has been traditionally very important for utilities in order to improve resource consumption. However, load forecasting is a challenging task, as there are a large number of relevant variables that must be considered, and several strategies have been used to deal with this complex problem. This is especially true also in microgrids where many elements have to adjust their performance depending on the future generation and consumption conditions. The goal of this paper is to present a solution for short-term load forecasting in microgrids, based on three machine learning experiments developed in R and web services built and deployed with different components of Cortana Intelligence Suite: Azure Machine Learning, a fully managed cloud service that enables to easily build, deploy, and share predictive analytics solutions; SQL database, a Microsoft database service for app developers; and PowerBI, a suite of business analytics tools to analyze data and share insights. Our results show that Boosted Decision Tree and Fast Forest Quantile regression methods can be very useful to predict hourly short-term consumption in microgrids; moreover, we found that for these types of forecasting models, weather data (temperature, wind, humidity and dew point) can play a crucial role in improving the accuracy of the forecasting solution. Data cleaning and feature engineering methods performed in R and different types of machine learning algorithms (Boosted Decision Tree, Fast Forest Quantile and ARIMA) will be presented, and results and performance metrics discussed.
Abstract: The aim of this paper is to present the QoE (Quality of Experience) IPTV SDN-based media streaming server enhanced architecture for configuring, controlling, management and provisioning the improved delivery of IPTV service application with low cost, low bandwidth, and high security. Furthermore, it is given a virtual QoE IPTV SDN-based topology to provide an improved IPTV service based on QoE Control and Management of multimedia services functionalities. Inside OpenFlow SDN Controller there are enabled in high flexibility and efficiency Service Load-Balancing Systems; based on the Loading-Balance module and based on GeoIP Service. This two Load-balancing system improve IPTV end-users Quality of Experience (QoE) with optimal management of resources greatly. Through the key functionalities of OpenFlow SDN controller, this approach produced several important features, opportunities for overcoming the critical QoE metrics for IPTV Service like achieving incredible Fast Zapping time (Channel Switching time) < 0.1 seconds. This approach enabled Easy and Powerful Transcoding system via FFMPEG encoder. It has the ability to customize streaming dimensions bitrates, latency management and maximum transfer rates ensuring delivering of IPTV streaming services (Audio and Video) in high flexibility, low bandwidth and required performance. This QoE IPTV SDN-based media streaming architecture unlike other architectures provides the possibility of Channel Exchanging between several IPTV service providers all over the word. This new functionality brings many benefits as increasing the number of TV channels received by end –users with low cost, decreasing stream failure time (Channel Failure time < 0.1 seconds) and improving the quality of streaming services.