Moving Data Mining Tools toward a Business Intelligence System

Data mining (DM) is the process of finding and extracting frequent patterns that describe the data or predict unknown or future values. These goals are achieved with various learning algorithms, each of which may produce a mining result completely different from the others, and some of which may find millions of patterns. Selecting appropriate models and interpreting the discovered knowledge is therefore a difficult job for data analysts. In this paper, we describe the framework of an intelligent and complete data mining system called SUT-Miner. Our system comprises a full complement of major DM algorithms together with pre-DM and post-DM functionalities. It is the post-DM packages that ease DM deployment for business intelligence applications.

Quality Properties of Fermented Mugworts and Rapid Pattern Analysis of Their Volatile Flavor Components by Electric Nose Based On SAW (Surface Acoustic Wave) Sensor in GC System

The changes in quality properties and nutritional components of two fermented mugworts (Artemisia capillaris Thunberg and Artemisiae asiaticae Nakai) were characterized, followed by rapid pattern analysis of their volatile flavor compounds using an Electric Nose based on a SAW (Surface Acoustic Wave) sensor in a GC system. Fermentation caused a remarkable decrease in pH and small changes in total soluble solids. The L (lightness) and b (yellowness) values in Hunter's color system decreased, whilst the a (redness) value increased after fermentation. HPLC analysis demonstrated that total amino acids increased in quantity and that the essential amino acid content was higher in A. asiaticae Nakai than in A. capillaris Thunberg. While the total polyphenol content was not affected by fermentation, the total sugar content decreased dramatically. Scopoletin was highly abundant in A. capillaris Thunberg but was not detected in A. asiaticae Nakai. Electric Nose analysis of the volatile flavor compounds showed that the intensity of several peaks increased considerably and that seven additional flavor peaks appeared after fermentation. The flavor differences between the two mugworts were clearly distinguished in the VaporPrint™ image patterns, which indicates that fermentation gives the two mugworts subtle flavor differences.

Cooling Turbine Blades using Exciting Boundary Layer

The present study is concerned with the effect of exciting the boundary layer on the cooling process in gas-turbine blades. The cooling process is investigated numerically using the k-ε turbulence model. Observations show that cooling the first row of moving or stationary blades increases their lifetime. The results show that the minimum temperature in a cooling line with an excited boundary layer is lower than in one without excitation. Placing a block in the cooling line of a turbine blade changes the flow pattern and the stability of the boundary layer, which increases the heat transfer coefficient. The results also show that the blade temperature decreases significantly at the location of the block.

Bandwidth Estimation Algorithms for the Dynamic Adaptation of Voice Codec

In recent years multimedia traffic, and VoIP services in particular, have been growing dramatically. We present a new algorithm that controls resource utilization and optimizes voice codec selection during SIP call setup based on the traffic conditions estimated on the network path. The most suitable methodologies and tools for real-time evaluation of the available bandwidth on a network path have been integrated with our proposed algorithm, which selects the best codec for a VoIP call as a function of the instantaneous available bandwidth on the path. The algorithm does not require any explicit feedback from the network, which makes it easily deployable over the Internet. We have also performed intensive tests in real network scenarios with a software prototype, verifying the algorithm's efficiency with different network topologies and traffic patterns between two SIP PBXs. The promising results obtained during the experimental validation of the algorithm are now the basis for extension to a larger set of multimedia services and for integration of our methodology with existing PBX appliances.
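The idea of choosing a codec as a function of the available bandwidth can be illustrated with a minimal sketch. This is not the algorithm of the paper: the codec table, the 20 ms packetization interval, the 40-byte IPv4/UDP/RTP header overhead, the `safety_margin` parameter and the function names are all assumptions made for illustration, and the bandwidth estimate is simply passed in as a number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Codec:
    name: str
    payload_kbps: float  # nominal codec bit rate, excluding packet headers


# Hypothetical candidate list, ordered from highest to lowest bit rate.
CODECS = [
    Codec("G.711", 64.0),
    Codec("G.726-32", 32.0),
    Codec("GSM-FR", 13.0),
    Codec("G.729", 8.0),
]


def required_kbps(codec: Codec, packet_ms: float = 20.0, header_bytes: int = 40) -> float:
    """Codec rate plus IPv4/UDP/RTP header overhead (assumed 40 bytes per packet)."""
    packets_per_second = 1000.0 / packet_ms
    overhead_kbps = header_bytes * 8 * packets_per_second / 1000.0
    return codec.payload_kbps + overhead_kbps


def select_codec(available_kbps: float, safety_margin: float = 0.8) -> Optional[Codec]:
    """Return the highest-rate codec that fits in a fraction of the estimated bandwidth."""
    budget = available_kbps * safety_margin
    for codec in CODECS:
        if required_kbps(codec) <= budget:
            return codec
    return None  # no candidate fits; the caller may reject or defer the call


choice = select_codec(available_kbps=50.0)
print(choice.name if choice else "no codec fits")
```

The safety margin is one possible way to absorb estimation error; a real deployment would obtain `available_kbps` from a bandwidth estimation tool at call setup.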

Measuring of Urban Sustainability in Town Planners Practice

Physical urban form is recognized as the medium for human transactions. It directly influences the travel demand of people in a specific urban area and the amount of energy used for transportation. Distorted, sprawling form often creates sustainability problems in urban areas. EU strategic planning documents declare that compact urban form and a mixed land-use pattern must be the main focus for achieving better sustainability in urban areas, but the methods for measuring and comparing these characteristics are still not clear. This paper presents simple methods for measuring the spatial characteristics of urban form by analyzing the location and distribution of objects in an urban environment. An extended CA (cellular automata) model is used to simulate urban development scenarios.

A Systematic Approach for Finding Hamiltonian Cycles with a Prescribed Edge in Crossed Cubes

The crossed cube is one of the most notable variations of the hypercube, and some of its properties are superior to those of the hypercube. For example, the diameter of the crossed cube is about half that of the hypercube. In this paper, we focus on the problem of embedding a Hamiltonian cycle through an arbitrary given edge in the crossed cube. We give a necessary and sufficient condition for determining whether a given permutation with n elements over Zn generates a Hamiltonian cycle pattern of the crossed cube. Moreover, we obtain a lower bound for the number of different Hamiltonian cycles passing through a given edge in an n-dimensional crossed cube. Our work extends some recently obtained results.

Classifying Bio-Chip Data using an Ant Colony System Algorithm

Bio-chips are used for experiments on genes and contain various kinds of information, such as genes and samples. Two-dimensional bio-chips, in which one axis represents genes and the other represents samples, are widely used these days. Bio-chips are used for biological experiments instead of experiments on real genes, which cost a lot of money and take a long time to produce results. Extracting data from bio-chips with high accuracy and finding patterns or useful information in such data is therefore very important. Bio-chip analysis systems extract data from various kinds of bio-chips and mine the data in order to obtain useful information. One of the commonly used mining methods is classification. The algorithm used to classify the data can vary depending on the data types, number characteristics and so on. Considering that bio-chip data is extremely large, an algorithm that imitates the ecosystem, such as the ant algorithm, is suitable for classification. This paper focuses on finding classification rules in bio-chip data using the Ant Colony algorithm. The developed system takes into consideration the accuracy of the discovered rules when it applies them to the bio-chip data in order to predict the classes.

Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown but useful and significant information from large volumes of data. Data mining is the stage of the KDD process that applies an algorithm to extract interesting patterns. Such algorithms usually generate a huge volume of patterns, which have to be evaluated using interestingness measures that reflect the user's requirements. Interestingness measures fall into two classes: (i) objective measures and (ii) subjective measures. Objective measures such as support and confidence extract meaningful patterns based on the structure of the patterns, while subjective measures such as unexpectedness and novelty reflect the user's perspective. In this report, we briefly review the more widespread and successful subjective measures and propose a new subjective measure of interestingness, namely shocking.
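As a brief illustration of the objective measures named above, the following sketch computes support and confidence for an association rule using their standard definitions. The transaction data and function names are invented for the example and do not come from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante > 0 else 0.0


# Made-up market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(transactions, {"bread", "milk"}))
print(confidence(transactions, {"bread"}, {"milk"}))
```

Both measures depend only on the pattern and the data, which is what makes them objective; a subjective measure would additionally need a model of what the user already believes.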

Optimal Sizing of a Hybrid Wind/PV Plant Considering Reliability Indices

The utilization of renewable energy sources in electric power systems is increasing quickly because of public concern about adverse environmental impacts and the rising energy costs associated with conventional energy sources. Although these energy sources can considerably reduce system fuel costs, they can also have a significant influence on system reliability. An appropriate balance between the level of the system reliability indices and the capital investment cost of the system is therefore vital. This paper presents a hybrid wind/photovoltaic plant designed to supply the load pattern of the IEEE reliability test system; the plant's capital investment cost is minimized by a hybrid particle swarm optimization (PSO) / harmony search (HS) approach, subject to the system fulfilling the appropriate level of reliability.

Analysis of Influenza Cases and Seasonal Index in Thailand

This study investigated the pattern and seasonal index of influenza cases in Thailand. Our results showed that southern Thailand had the highest influenza incidence among the four regions of Thailand (i.e. north, northeast, central and southern Thailand). The influenza pattern in southern Thailand was similar to that of northeastern Thailand. Seasonal index values of influenza cases in Thailand were higher in the hot season than in the wet season. Influenza cases started to increase at the beginning of the hot season (April), reached a maximum in August, rapidly declined in the middle of the wet season and reached the lowest value in December. Seasonal index values for northern Thailand differed from other regions of Thailand.
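The abstract does not state how the seasonal index was computed. One common definition, assumed here purely for illustration, expresses each month's average case count as a percentage of the overall monthly average, so that a value above 100 marks a month with more cases than usual. The function name and the input layout are hypothetical.

```python
def seasonal_index(monthly_cases_by_year):
    """Seasonal index per calendar month, as a percentage of the overall mean.

    `monthly_cases_by_year` is a list with one 12-element list of case
    counts per year (January first). This is the simple-averages method,
    not necessarily the method used in the study.
    """
    n_years = len(monthly_cases_by_year)
    month_means = [
        sum(year[m] for year in monthly_cases_by_year) / n_years
        for m in range(12)
    ]
    overall_mean = sum(month_means) / 12
    return [100.0 * mean / overall_mean for mean in month_means]


# Synthetic counts for two years, invented for the example.
cases = [
    [40, 45, 50, 70, 90, 110, 130, 150, 120, 80, 50, 35],
    [38, 42, 55, 75, 95, 105, 140, 160, 115, 85, 45, 30],
]
for month, index in enumerate(seasonal_index(cases), start=1):
    print(month, round(index, 1))
```

By construction the twelve index values average to 100, which makes regions with very different incidence levels directly comparable.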

Actionable Rules: Issues and New Directions

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown, hidden and interesting patterns from the huge amounts of data stored in databases. Data mining is the stage of the KDD process that selects and applies a particular data mining algorithm to extract interesting and useful knowledge. Data mining methods are expected to find patterns in databases that are interesting according to some measure, so it is vitally important to define good measures of interestingness that allow the system to discover only the useful patterns. Measures of interestingness are divided into objective and subjective measures. Objective measures depend only on the structure of a pattern and can be quantified using statistical methods, whereas subjective measures depend on the subjectivity and understanding of the user who examines the patterns. Subjective measures are further divided into actionable, unexpected and novel. A key issue facing the data mining community is how to act on the basis of discovered knowledge. For a pattern to be actionable, the user's subjectivity is captured by having the user provide background knowledge about the domain. Here, we consider the actionability of the discovered knowledge as a measure of interestingness and raise important issues that need to be addressed in order to discover actionable knowledge.

Technological Innovation Persistence: Organizational Innovation Matters

Organizational innovation favors technological innovation, but does it also influence technological innovation persistence? This article investigates empirically the pattern of technological innovation persistence and tests the potential impact of organizational innovation using firm-level data from three waves of the French Community Innovation Surveys. Evidence shows a positive effect of organizational innovation on technological innovation persistence, according to various measures of organizational innovation. Moreover, this impact is more significant for complex innovators (i.e., those who innovate in both products and processes). These results highlight the complexity of managing organizational practices with regard to the firm's technological innovation. They also add to comprehension of the drivers of innovation persistence, through a focus on an often forgotten dimension of innovation in a broader sense.

AGHAZ: An Expert System Based Approach for the Translation of English to Urdu

Machine Translation (MT) of English text into its Urdu equivalent is a difficult challenge. Many attempts have been made, but only a few limited solutions have been provided so far. We present a direct approach that uses an expert system to translate English text into its equivalent Urdu, using The Unicode Standard, Version 4.0 (ISBN 0-321-18578-1), range 0600–06FF. The expert system works with a knowledge base that contains grammatical patterns of English and Urdu, as well as a tense- and gender-aware dictionary of Urdu words (with their English equivalents).
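The range 0600–06FF mentioned above is the Unicode Arabic block, which contains the letters used to write Urdu. A small sketch, with hypothetical function names that are not part of AGHAZ, shows how output text could be checked against that range:

```python
ARABIC_BLOCK_START = 0x0600
ARABIC_BLOCK_END = 0x06FF


def in_arabic_block(ch: str) -> bool:
    """True if the character's code point lies in U+0600 to U+06FF."""
    return ARABIC_BLOCK_START <= ord(ch) <= ARABIC_BLOCK_END


def is_urdu_script(text: str) -> bool:
    """True if every non-whitespace character of `text` is in the Arabic block."""
    chars = [c for c in text if not c.isspace()]
    return bool(chars) and all(in_arabic_block(c) for c in chars)


# "\u0627\u0631\u062f\u0648" is the word "Urdu" written in Urdu script.
print(is_urdu_script("\u0627\u0631\u062f\u0648"))
print(is_urdu_script("Urdu"))
```

A check like this only validates the script of the output; it says nothing about whether the translation itself is correct.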

Mining Sequential Patterns Using Hybrid Evolutionary Algorithm

Mining sequential patterns in large databases has become an important data mining task with broad applications; it describes potential sequenced relationships among items in a database. Many different algorithms have been introduced for this task. Conventional algorithms can find the exact optimal sequential pattern rule, but they take a long time, particularly when applied to large databases. More recently, evolutionary algorithms such as Particle Swarm Optimization and Genetic Algorithm have been proposed and applied to this problem. This paper introduces a new hybrid evolutionary algorithm that combines Genetic Algorithm (GA) with Particle Swarm Optimization (PSO) to mine sequential patterns, in order to improve the convergence speed of evolutionary algorithms. This algorithm is referred to as SP-GAPSO.

Design and Simulation of a Concentrated Luneberg Antenna

The Luneberg lens belongs to a new generation of antennas that has been developed in the last few years and has established itself strongly in the areas of microwaves, communications and telescopes. The aim of this research is to improve the radiation pattern by decreasing the side lobes and increasing the main lobe. The new design is proposed to work in the X-band. The simulated results and analysis are presented.

Analysis of Metallothionein Gene MT1A (rs11076161) and MT2A (rs10636) Polymorphisms as a Molecular Marker in Type 2 Diabetes Mellitus among Malay Population

Type 2 diabetes mellitus (T2DM) is a complex metabolic disorder characterized by high blood glucose resulting from insulin resistance and insulin insufficiency due to deteriorating function of the β-cells of the islets of Langerhans. T2DM is commonly caused by a combination of inherited genetic variations and lifestyle. Metallothionein (MT) is a cysteine-rich protein that helps maintain zinc homeostasis, which is important in insulin signaling and secretion, and that protects the body from reactive oxygen species (ROS). MT scavenges ROS and free radicals, which are among the causes of T2DM and its complications. The objective of this study was to investigate the association of MT1A and MT2A polymorphisms with T2DM by comparing T2DM and control subjects in the Malay population. The study involved 150 T2DM patients and 120 healthy individuals of Malay ethnicity and mixed gender. Genomic DNA was extracted from buccal cells and amplified at the MT1A and MT2A loci by polymerase chain reaction (PCR), producing bands of 347 bp and 238 bp respectively. The PCR products were digested with the MluCI and Tsp45I restriction enzymes respectively, producing fragment lengths of 158/189/347 bp and 103/135/238 bp. An ANOVA test showed a significant difference between diabetic and control subjects for age, BMI, WHR, SBP, FPG, HbA1c, LDL, TG, TC and family history (P < 0.05). The genotype frequencies of AA, AG and GG for the MT1A polymorphism were 72.7%, 22.7% and 4.7% in cases and 15%, 55% and 30% in controls, respectively. For MT2A, the genotype frequencies of GG, GC and CC were 42.7%, 27.3% and 30% in cases and 5%, 40% and 55% in controls, respectively. Both polymorphisms showed a significant difference between the two groups (P = 0.000). A post hoc test showed a significant difference between the genotypes within each polymorphism (P = 0.000). The MT1A and MT2A polymorphisms are therefore believed to be reliable molecular markers for distinguishing T2DM subjects from healthy individuals in the Malay population.

Quality of Concrete of Recent Development Projects in Libya

Numerous concrete construction projects are currently under way in Libya as part of US$50 billion in government funding. The quality of the concrete used in 20 different construction projects was assessed, based mainly on the compressive strength achieved. The projects are scattered all over the country and are at various stages of completion. For most of these projects, the compressive strength was obtained from tests on standard 150 mm cubes. Statistical analysis of the collected compressive strengths reveals that the data generally follow a normal distribution. The study compares and assesses aspects of concrete quality such as quality control, strength range, standard deviation, data scatter, and the ratio of minimum strength to design strength. Site quality control for these projects ranged from very good to poor according to the ACI 214 criteria [1]. The range of strength (Rg = max. strength – min. strength) divided by the average strength varies from 34% to 160%. Data scatter, measured as the range (Rg) divided by the standard deviation (σ), is found to be 1.82 to 11.04, indicating that the range is ±3σ. International construction companies working in Libya follow different assessment criteria for concrete compressive strength in the absence of a unified national procedure. The study reveals that the assessments of concrete quality conducted by these construction companies usually meet their adopted (internal) standards, but sometimes fail to meet internationally known standard requirements. The assessment of concrete presented in this paper is based on ACI, British standards and proposed Libyan concrete strength assessment criteria.
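The quantities compared in this study (range over average strength, range over standard deviation, and minimum strength over design strength) are simple to compute from a set of cube results. The sketch below is an illustration only: the function name, the use of the sample standard deviation, and the strength values are assumptions and are not taken from the paper.

```python
import statistics


def strength_summary(strengths_mpa, design_strength_mpa):
    """Scatter statistics for a set of cube compressive strengths (MPa)."""
    mean = statistics.mean(strengths_mpa)
    sd = statistics.stdev(strengths_mpa)  # sample standard deviation (assumed)
    rg = max(strengths_mpa) - min(strengths_mpa)  # range Rg
    return {
        "mean": mean,
        "std_dev": sd,
        "range_over_mean_pct": 100.0 * rg / mean,
        "range_over_sd": rg / sd,
        "min_over_design": min(strengths_mpa) / design_strength_mpa,
    }


# Hypothetical cube results for one project with a 25 MPa design strength.
summary = strength_summary([22.5, 27.0, 31.5, 29.0, 25.5, 33.0], 25.0)
for key, value in summary.items():
    print(key, round(value, 2))
```

For normally distributed data, almost all results fall within three standard deviations of the mean, so a range-to-σ ratio far above 6 would point to outliers or mixed populations.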

Study of Electro-Optical Properties of ZnS Nanoparticles Prepared by Colloidal Particles Method

ZnS nanoparticles of different sizes have been synthesized using a colloidal particles method. The ZnS nanoparticles, prepared with a capping agent (mercaptoethanol), were then characterized using X-ray diffraction (XRD) and UV-Vis spectroscopy. The particle size calculated from the XRD patterns was found to be in the range 1.85-2.44 nm. Absorption spectra were obtained with a UV-Vis spectrophotometer to determine the optical band gap, and the values obtained were in the range 3.83-4.59 eV. It was also found that the energy band gap increases with increasing molar concentration of the capping agent solution.

Modelling of Soil Erosion by Non Conventional Methods

Soil erosion is among the most serious problems faced at both global and local levels, so planning soil conservation measures has become a prominent item on the agenda of water basin managers. Planning these measures requires information on soil erosion. The Universal Soil Loss Equation (USLE), the Revised Universal Soil Loss Equation 1 (RUSLE1 or RUSLE), the Modified Universal Soil Loss Equation (MUSLE), RUSLE 1.06, RUSLE 1.06c and RUSLE2 are the most widely used conventional erosion estimation methods. The essential drawback of the USLE and RUSLE1 equations is that they are based on average annual values of their parameters, so their applicability at small temporal scales is questionable. These equations also do not estimate runoff-generated soil erosion. Because the data used to formulate USLE and RUSLE1 were plot data, applying them at larger spatial scales requires scale correction factors to be introduced. MUSLE, on the other hand, is unsuitable for predicting the sediment yield of small and large events. Although the newer revisions of USLE, such as RUSLE 1.06, RUSLE 1.06c and RUSLE2, are land-use independent and have removed almost all the drawbacks of earlier versions such as USLE and RUSLE1, they are based on regional data from specific areas, and their applicability to areas with different climate, soil and land use is questionable. These conventional equations apply to sheet and rill erosion and cannot predict gully erosion or the spatial pattern of rills. Research has therefore focused on developing non-conventional methods of soil erosion estimation. When combined with GIS and RS, these non-conventional methods give the spatial distribution of soil erosion. This paper presents a review of the literature on non-conventional methods of soil erosion estimation supported by GIS and RS.
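For reference, the conventional USLE estimates average annual soil loss as the product A = R × K × LS × C × P of the rainfall erosivity, soil erodibility, slope length and steepness, cover management and support practice factors. The sketch below evaluates that product for a few grid cells; the function name and all factor values are hypothetical and chosen only to show the calculation.

```python
def usle_soil_loss(r, k, ls, c, p):
    """Average annual soil loss A = R * K * LS * C * P.

    r  : rainfall-runoff erosivity factor
    k  : soil erodibility factor
    ls : combined slope length and steepness factor (dimensionless)
    c  : cover management factor (dimensionless)
    p  : support practice factor (dimensionless)
    The units of A follow from the units chosen for R and K.
    """
    return r * k * ls * c * p


# Hypothetical factor values for three grid cells of a small catchment.
cells = [
    {"r": 100.0, "k": 0.30, "ls": 1.2, "c": 0.20, "p": 1.0},
    {"r": 100.0, "k": 0.25, "ls": 2.5, "c": 0.05, "p": 0.8},
    {"r": 100.0, "k": 0.40, "ls": 0.6, "c": 0.45, "p": 1.0},
]
for cell in cells:
    print(round(usle_soil_loss(**cell), 2))
```

Because every factor is an annual average, the product says nothing about individual storm events, which is the temporal-scale limitation noted above.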