Knowledge Discovery and Data Mining Techniques in Textile Industry

This paper addresses the issues and technique for textile industry using data mining techniques. Data mining has been applied to the stitching of garments products that were obtained from a textile company. Data mining techniques were applied to the data obtained from the CHAID algorithm, CART algorithm, Regression Analysis and, Artificial Neural Networks. Classification technique based analyses were used while data mining and decision model about the production per person and variables affecting about production were found by this method. In the study, the results show that as the daily working time increases, the production per person also decreases. In addition, the relationship between total daily working and production per person shows a negative result and the production per person show the highest and negative relationship.

CompPSA: A Component-Based Pairwise RNA Secondary Structure Alignment Algorithm

The biological function of an RNA molecule depends on its structure. The objective of the alignment is finding the homology between two or more RNA secondary structures. Knowing the common functionalities between two RNA structures allows a better understanding and a discovery of other relationships between them. Besides, identifying non-coding RNAs -that is not translated into a protein- is a popular application in which RNA structural alignment is the first step A few methods for RNA structure-to-structure alignment have been developed. Most of these methods are partial structure-to-structure, sequence-to-structure, or structure-to-sequence alignment. Less attention is given in the literature to the use of efficient RNA structure representation and the structure-to-structure alignment methods are lacking. In this paper, we introduce an O(N2) Component-based Pairwise RNA Structure Alignment (CompPSA) algorithm, where structures are given as a component-based representation and where N is the maximum number of components in the two structures. The proposed algorithm compares the two RNA secondary structures based on their weighted component features rather than on their base-pair details. Extensive experiments are conducted illustrating the efficiency of the CompPSA algorithm when compared to other approaches and on different real and simulated datasets. The CompPSA algorithm shows an accurate similarity measure between components. The algorithm gives the flexibility for the user to align the two RNA structures based on their weighted features (position, full length, and/or stem length). Moreover, the algorithm proves scalability and efficiency in time and memory performance.

Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Causal Relation Identification Using Convolutional Neural Networks and Knowledge Based Features

Causal relation identification is a crucial task in information extraction and knowledge discovery. In this work, we present two approaches to causal relation identification. The first is a classification model trained on a set of knowledge-based features. The second is a deep learning based approach training a model using convolutional neural networks to classify causal relations. We experiment with several different convolutional neural networks (CNN) models based on previous work on relation extraction as well as our own research. Our models are able to identify both explicit and implicit causal relations as well as the direction of the causal relation. The results of our experiments show a higher accuracy than previously achieved for causal relation identification tasks.

Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

C-LNRD: A Cross-Layered Neighbor Route Discovery for Effective Packet Communication in Wireless Sensor Network

One of the problems to be addressed in wireless sensor networks is the issues related to cross layer communication. Cross layer architecture shares the information across the layer, ensuring Quality of Services (QoS). With this shared information, MAC protocol adapts effective functionality maintenance such as route selection on changeable sensor network environment. However, time slot assignment and neighbour route selection time duration for cross layer have not been carried out. The time varying physical layer communication over cross layer causes high traffic load in the sensor network. Though, the traffic load was reduced using cross layer optimization procedure, the computational cost is high. To improve communication efficacy in the sensor network, a self-determined time slot based Cross-Layered Neighbour Route Discovery (C-LNRD) method is presented in this paper. In the presented work, the initial process is to discover the route in the sensor network using Dynamic Source Routing based Medium Access Control (MAC) sub layers. This process considers MAC layer operation with dynamic route neighbour table discovery. Then, the discovered route path for packet communication employs Broad Route Distributed Time Slot Assignment method on Cross-Layered Sensor Network system. Broad Route means time slotting on varying length of the route paths. During packet communication in this sensor network, transmission of packets is adjusted over the different time with varying ranges for controlling the traffic rate. Finally, Rayleigh fading model is developed in C-LNRD to identify the performance of the sensor network communication structure. The main task of Rayleigh Fading is to measure the power level of each communication under MAC sub layer. The minimized power level helps to easily reduce the computational cost of packet communication in the sensor network. Experiments are conducted on factors such as power factor, on packet communication, neighbour route discovery time, and information (i.e., packet) propagation speed.

Application of Data Mining Techniques for Tourism Knowledge Discovery

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Teaching Light Polarization by Putting Art and Physics Together

Light Polarization has many technological applications, and its discovery was crucial to reveal the transverse nature of the electromagnetic waves. However, despite its fundamental and practical importance, in high school, this property of light is often neglected. This is a pity not only for its conceptual relevance, but also because polarization gives the possibility to perform many brilliant experiments with low cost materials. Moreover, the treatment of this matter lends very well to an interdisciplinary approach between art, biology and technology, which usually makes things more interesting to students. For these reasons, we have developed, and in this work, we introduce a laboratory on light polarization for high school and undergraduate students. They can see beautiful pictures when birefringent materials are set between two crossed polarizing filters. Pupils are very fascinated and drawn into by what they observe. The colourful images remind them of those ones of abstract painting or alien landscapes. With this multidisciplinary teaching method, students are more engaged and participative, and also, the learning process of the respective physics concepts is more effective.

Rheological Characteristics of Ice Slurries Based on Propylene- and Ethylene-Glycol at High Ice Fractions

Ice slurries are considered as a promising phase-changing secondary fluids for air-conditioning, packaging or cooling industrial processes. An experimental study has been here carried out to measure the rheological characteristics of ice slurries. Ice slurries consist in a solid phase (flake ice crystals) and a liquid phase. The later is composed of a mixture of liquid water and an additive being here either (1) Propylene-Glycol (PG) or (2) Ethylene-Glycol (EG) used to lower the freezing point of water. Concentrations of 5%, 14% and 24% of both additives are investigated with ice mass fractions ranging from 5% to 85%. The rheological measurements are carried out using a Discovery HR-2 vane-concentric cylinder with four full-length blades. The experimental results show that the behavior of ice slurries is generally non-Newtonian with shear-thinning or shear-thickening behaviors depending on the experimental conditions. In order to determine the consistency and the flow index, the Herschel-Bulkley model is used to describe the behavior of ice slurries. The present results are finally validated against an experimental database found in the literature and the predictions of an Artificial Neural Network model.

Analyzing the Plausible Alternatives in Contracting the Societal Fissure Caused by Digital Divide in Sri Lanka

'Digital Divide' is a concept that has existed in this paradigm ever since the discovery of the first-generation technologies. Before the turn of the century, it was basically used to describe the gap between those with telephone communication access and those without it. At present, it is plainly descriptive in itself to illustrate the cavity among those with Internet access and those without. Though the concept of digital divide has been merely lying in sight for as long as time itself, the friction it caused has not yet been fully realized to solve major crisis situations. Unlike well-developed countries, Sri Lanka is still in the verge of moving farther away from a developing country in the race towards reaching a developed state. Access to technological resources varies from region to region, even within the island itself, with one region having a considerable percentage of its community exposed to the Internet and its related technologies, and the other unaware of such. Thus, this paper intends to analyze the roots for the still-extant gap instigated based on the concept of ‘Digital Divide’ and explores the plausible potentials that could be brought about by narrowing this prevailing percentage among the population, specifically entrenching the advantages reaped towards an economic augmentation and culture or lifestyle revolution on the path towards development.

The Use of Mobile Phone as Enhancement to Mark Multiple Choice Objectives English Grammar and Literature Examination: An Exploratory Case Study of Preliminary National Diploma Students, Abdu Gusau Polytechnic, Talata Mafara, Zamfara State, Nigeria

Most often, marking and assessment of multiple choice kinds of examinations have been opined by many as a cumbersome and herculean task to accomplished manually in Nigeria. Usually this may be in obvious nexus to the fact that mass numbers of candidates were known to take the same examination simultaneously. Eventually, marking such a mammoth number of booklets dared and dread even the fastest paid examiners who often undertake the job with the resulting consequences of stress and boredom. This paper explores the evolution, as well as the set aim to envision and transcend marking the Multiple Choice Objectives- type examination into a thing of creative recreation, or perhaps a more relaxing activity via the use of the mobile phone. A more “pragmatic” dimension method was employed to achieve this work, rather than the formal “in-depth research” based approach due to the “novelty” of the mobile-smartphone e-Marking Scheme discovery. Moreover, being an evolutionary scheme, no recent academic work shares a direct same topic concept with the ‘use of cell phone as an e-marking technique’ was found online; thus, the dearth of even miscellaneous citations in this work. Additional future advancements are what steered the anticipatory motive of this paper which laid the fundamental proposition. However, the paper introduces for the first time the concept of mobile-smart phone e-marking, the steps to achieve it, as well as the merits and demerits of the technique all spelt out in the subsequent pages.

Design of a Service-Enabled Dependable Integration Environment

The aim of information systems integration is to make all the data sources, applications and business flows integrated into the new environment so that unwanted redundancies are reduced and bottlenecks and mismatches are eliminated. Two issues have to be dealt with to meet such requirements: the software architecture that supports resource integration, and the adaptor development tool that help integration and migration of legacy applications. In this paper, a service-enabled dependable integration environment (SDIE), is presented, which has two key components, i.e., a dependable service integration platform and a legacy application integration tool. For the dependable platform for service integration, the service integration bus, the service management framework, the dependable engine for service composition, and the service registry and discovery components are described. For the legacy application integration tool, its basic organization, functionalities and dependable measures taken are presented. Due to its service-oriented integration model, the light-weight extensible container, the service component combination-oriented p-lattice structure, and other features, SDIE has advantages in openness, flexibility, performance-price ratio and feature support over commercial products, is better than most of the open source integration software in functionality, performance and dependability support.

The Nexus between Migration and Human Security: The Case of Ethiopian Female Migration to Sudan

International labor migration is an integral part of the modern globalized world. However, the phenomenon has its roots in some earlier periods in human history. This paper discusses the relatively new phenomenon of female migration in Africa. In the past, African women migrants were only spouses or dependent family members. But as modernity swept most African societies, with rising unemployment rates, there is evidence everywhere in Africa that women labor migration is a growing phenomenon that deserves to be understood in the context of human security research. This work explores these issues further, focusing on the experience of Ethiopian women labor migrants to Sudan. The migration of Ethiopian people to Sudan is historical; nevertheless, labor migration mainly started since the discovery and subsequent exploration of oil in the Sudan. While the paper is concerned with the human security aspect of the migrant workers, we need to be certain that the migration process will provide with a decent wage, good working conditions, the necessary social security coverage, and labor protection as a whole. However, migration to Sudan is not always safe and female migrants become subject to violence at the hands of brokers, employers and migration officials. For this matter, the paper argued that identifying the vulnerable stages and major problem facing female migrant workers at various stages of migration is a prerequisite to combat the problem and secure the lives of the migrant workers. The major problems female migrants face include extra degrees of gender-based violence, underpayment, various forms of abuse like verbal, physical and sexual and other forms of torture which include beating and slaps. This peculiar situation could be attributed to the fact that most of these women are irregular migrants and fall under the category of unskilled and/or illiterate migrants.

Metabolomics Profile Recognition for Cancer Diagnostics

Metabolomics has become a rising field of research for various diseases, particularly cancer. Increases or decreases in metabolite concentrations in the human body are indicative of various cancers. Further elucidation of metabolic pathways and their significance in cancer research may greatly spur medicinal discovery. We analyzed the metabolomics profiles of lung cancer. Thirty-three metabolites were selected as significant. These metabolites are involved in 37 metabolic pathways delivered by MetaboAnalyst software. The top pathways are glyoxylate and dicarboxylate pathway (its hubs are formic acid and glyoxylic acid) along with Citrate cycle pathway followed by Taurine and hypotaurine pathway (the hubs in the latter are taurine and sulfoacetaldehyde) and Glycine, serine, and threonine pathway (the hubs are glycine and L-serine). We studied interactions of the metabolites with the proteins involved in cancer-related signaling networks, and developed an approach to metabolomics biomarker use in cancer diagnostics. Our analysis showed that a significant part of lung-cancer-related metabolites interacts with main cancer-related signaling pathways present in this network: PI3K–mTOR–AKT pathway, RAS–RAF–ERK1/2 pathway, and NFKB pathway. These results can be employed for use of metabolomics profiles in elucidation of the related cancer proteins signaling networks.

Improving 99mTc-tetrofosmin Myocardial Perfusion Images by Time Subtraction Technique

Quantitative measurement of myocardium perfusion is possible with single photon emission computed tomography (SPECT) using a semiconductor detector. However, accumulation of 99mTc-tetrofosmin in the liver may make it difficult to assess that accurately in the inferior myocardium. Our idea is to reduce the high accumulation in the liver by using dynamic SPECT imaging and a technique called time subtraction. We evaluated the performance of a new SPECT system with a cadmium-zinc-telluride solid-state semi- conductor detector (Discovery NM 530c; GE Healthcare). Our system acquired list-mode raw data over 10 minutes for a typical patient. From the data, ten SPECT images were reconstructed, one for every minute of acquired data. Reconstruction with the semiconductor detector was based on an implementation of a 3-D iterative Bayesian reconstruction algorithm. We studied 20 patients with coronary artery disease (mean age 75.4 ± 12.1 years; range 42-86; 16 males and 4 females). In each subject, 259 MBq of 99mTc-tetrofosmin was injected intravenously. We performed both a phantom and a clinical study using dynamic SPECT. An approximation to a liver-only image is obtained by reconstructing an image from the early projections during which time the liver accumulation dominates (0.5~2.5 minutes SPECT image-5~10 minutes SPECT image). The extracted liver-only image is then subtracted from a later SPECT image that shows both the liver and the myocardial uptake (5~10 minutes SPECT image-liver-only image). The time subtraction of liver was possible in both a phantom and the clinical study. The visualization of the inferior myocardium was improved. In past reports, higher accumulation in the myocardium due to the overlap of the liver is un-diagnosable. Using our time subtraction method, the image quality of the 99mTc-tetorofosmin myocardial SPECT image is considerably improved.

Automatic Threshold Search for Heat Map Based Feature Selection: A Cancer Dataset Analysis

Public health is one of the most critical issues today; therefore, there is great interest to improve technologies in the area of diseases detection. With machine learning and feature selection, it has been possible to aid the diagnosis of several diseases such as cancer. In this work, we present an extension to the Heat Map Based Feature Selection algorithm, this modification allows automatic threshold parameter selection that helps to improve the generalization performance of high dimensional data such as mass spectrometry. We have performed a comparison analysis using multiple cancer datasets and compare against the well known Recursive Feature Elimination algorithm and our original proposal, the results show improved classification performance that is very competitive against current techniques.

Identifying a Drug Addict Person Using Artificial Neural Networks

Use and abuse of drugs by teens is very common and can have dangerous consequences. The drugs contribute to physical and sexual aggression such as assault or rape. Some teenagers regularly use drugs to compensate for depression, anxiety or a lack of positive social skills. Teen resort to smoking should not be minimized because it can be "gateway drugs" for other drugs (marijuana, cocaine, hallucinogens, inhalants, and heroin). The combination of teenagers' curiosity, risk taking behavior, and social pressure make it very difficult to say no. This leads most teenagers to the questions: "Will it hurt to try once?" Nowadays, technological advances are changing our lives very rapidly and adding a lot of technologies that help us to track the risk of drug abuse such as smart phones, Wireless Sensor Networks (WSNs), Internet of Things (IoT), etc. This technique may help us to early discovery of drug abuse in order to prevent an aggravation of the influence of drugs on the abuser. In this paper, we have developed a Decision Support System (DSS) for detecting the drug abuse using Artificial Neural Network (ANN); we used a Multilayer Perceptron (MLP) feed-forward neural network in developing the system. The input layer includes 50 variables while the output layer contains one neuron which indicates whether the person is a drug addict. An iterative process is used to determine the number of hidden layers and the number of neurons in each one. We used multiple experiment models that have been completed with Log-Sigmoid transfer function. Particularly, 10-fold cross validation schemes are used to access the generalization of the proposed system. The experiment results have obtained 98.42% classification accuracy for correct diagnosis in our system. The data had been taken from 184 cases in Jordan according to a set of questions compiled from Specialists, and data have been obtained through the families of drug abusers.

Improving Cryptographically Generated Address Algorithm in IPv6 Secure Neighbor Discovery Protocol through Trust Management

As transition to widespread use of IPv6 addresses has gained momentum, it has been shown to be vulnerable to certain security attacks such as those targeting Neighbor Discovery Protocol (NDP) which provides the address resolution functionality in IPv6. To protect this protocol, Secure Neighbor Discovery (SEND) is introduced. This protocol uses Cryptographically Generated Address (CGA) and asymmetric cryptography as a defense against threats on integrity and identity of NDP. Although SEND protects NDP against attacks, it is computationally intensive due to Hash2 condition in CGA. To improve the CGA computation speed, we parallelized CGA generation process and used the available resources in a trusted network. Furthermore, we focused on the influence of the existence of malicious nodes on the overall load of un-malicious ones in the network. According to the evaluation results, malicious nodes have adverse impacts on the average CGA generation time and on the average number of tries. We utilized a Trust Management that is capable of detecting and isolating the malicious node to remove possible incentives for malicious behavior. We have demonstrated the effectiveness of the Trust Management System in detecting the malicious nodes and hence improving the overall system performance.

Induction Melting as a Fabrication Route for Aluminum-Carbon Nanotubes Nanocomposite

Increasing demands of contemporary applications for high strength and lightweight materials prompted the development of metal-matrix composites (MMCs). After the discovery of carbon nanotubes (CNTs) in 1991 (revealing an excellent set of mechanical properties) became one of the most promising strengthening materials for MMC applications. Additionally, the relatively low density of the nanotubes imparted high specific strengths, making them perfect strengthening material to reinforce MMCs. In the present study, aluminum-multiwalled carbon nanotubes (Al-MWCNTs) composite was prepared in an air induction furnace. The dispersion of the nanotubes in molten aluminum was assisted by inherent string action of induction heating at 790°C. During the fabrication process, multifunctional fluxes were used to avoid oxidation of the nanotubes and molten aluminum. Subsequently, the melt was cast in to a copper mold and cold rolled to 0.5 mm thickness. During metallographic examination using a scanning electron microscope, it was observed that the nanotubes were effectively dispersed in the matrix. The mechanical properties of the composite were significantly increased as compared to pure aluminum specimen i.e. the yield strength from 65 to 115 MPa, the tensile strength from 82 to 125 MPa and hardness from 27 to 30 HV for pure aluminum and Al-CNTs composite, respectively. To recognize the associated strengthening mechanisms in the nanocomposites, three foremost strengthening models i.e. shear lag model, Orowan looping and Hall-Petch have been critically analyzed; experimental data were found to be closely satisfying the shear lag model.

A Distributed Cryptographically Generated Address Computing Algorithm for Secure Neighbor Discovery Protocol in IPv6

Due to shortage in IPv4 addresses, transition to IPv6 has gained significant momentum in recent years. Like Address Resolution Protocol (ARP) in IPv4, Neighbor Discovery Protocol (NDP) provides some functions like address resolution in IPv6. Besides functionality of NDP, it is vulnerable to some attacks. To mitigate these attacks, Internet Protocol Security (IPsec) was introduced, but it was not efficient due to its limitation. Therefore, SEND protocol is proposed to automatic protection of auto-configuration process. It is secure neighbor discovery and address resolution process. To defend against threats on NDP’s integrity and identity, Cryptographically Generated Address (CGA) and asymmetric cryptography are used by SEND. Besides advantages of SEND, its disadvantages like the computation process of CGA algorithm and sequentially of CGA generation algorithm are considerable. In this paper, we parallel this process between network resources in order to improve it. In addition, we compare the CGA generation time in self-computing and distributed-computing process. We focus on the impact of the malicious nodes on the CGA generation time in the network. According to the result, although malicious nodes participate in the generation process, CGA generation time is less than when it is computed in a one-way. By Trust Management System, detecting and insulating malicious nodes is easier.