Novelty as a Measure of Interestingness in Knowledge Discovery

Rule discovery is an important technique for mining knowledge from large databases. Using objective measures to discover interesting rules leads to another data mining problem, although one of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules and ultimately improve the overall efficiency of the KDD process. In this paper we study the novelty of discovered rules as a subjective measure of interestingness. We propose a hybrid approach based on both objective and subjective measures to quantify the novelty of discovered rules in terms of their deviations from the known rules (knowledge). We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to a user-specified threshold. We implement the proposed framework and experiment with some public datasets. The experimental results are promising.

Effects of Different Plant Densities on the Yield and Quality of Second Crop Sesame

Sesame is one of the oldest and most important oil crops, grown both as a main crop and as a second crop. This study was carried out to determine the effects of different inter- and intra-row spacings on the yield and yield components of second crop sesame; it was set up at the Antalya West Mediterranean Agricultural Research Institute in 2009. The Muganlı 57 sesame cultivar was used as plant material. The field experiment was set up in a split plot design; row spacings (30, 40, 50, 60 and 70 cm) were assigned to the main plots and intra-row spacings (5, 10, 20 and 30 cm) were assigned to the subplots. Seed yield, oil ratio, oil yield, protein ratio and protein yield were investigated. In general, wider inter-row and intra-row spacings resulted in decreased seed yield, oil yield and protein yield. The highest seed yield, oil yield and protein yield (1115.0 kg ha-1, 551.3 kg ha-1 and 224.7 kg ha-1, respectively) were obtained from the 30x5 cm plant density, while the lowest seed yield, oil yield and protein yield (677.0 kg ha-1, 327.0 kg ha-1 and 130.0 kg ha-1, respectively) were recorded from the 70x30 cm plant density. As a result, in terms of oil yield for second crop sesame agriculture, a 30 cm row spacing combined with a 5 cm intra-row spacing gives the most suitable plant density.
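To make the spacing notation concrete, the short sketch below converts a row spacing and an intra-row spacing into a theoretical stand density. It assumes one plant per grid position and no stand losses; the function name `plants_per_hectare` is hypothetical and is not taken from the study.

```python
def plants_per_hectare(row_spacing_cm, intra_row_spacing_cm):
    """Theoretical number of plants per hectare for a rectangular planting grid.

    Assumption (not from the study): exactly one plant occupies each
    row_spacing x intra_row_spacing rectangle, with no gaps or losses.
    """
    # Area occupied by one plant, converted from cm x cm to square metres.
    area_per_plant_m2 = (row_spacing_cm / 100.0) * (intra_row_spacing_cm / 100.0)
    # One hectare is 10,000 square metres.
    return 10_000.0 / area_per_plant_m2


if __name__ == "__main__":
    # The densest and sparsest treatments named in the abstract.
    for row_cm, intra_cm in [(30, 5), (70, 30)]:
        print(f"{row_cm}x{intra_cm} cm: {plants_per_hectare(row_cm, intra_cm):,.0f} plants/ha")
```

Under this assumption the 30x5 cm treatment corresponds to roughly 667,000 plants per hectare and the 70x30 cm treatment to roughly 47,600, a difference of about fourteenfold between the two extremes.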

Protein Secondary Structure Prediction

Protein structure determination and prediction have been a focal research subject in bioinformatics because of the importance of protein structure in understanding the biological and chemical activities of organisms. The experimental methods used by biotechnologists to determine protein structures demand sophisticated equipment and time. A host of computational methods have been developed to predict the location of secondary structure elements in proteins, either to complement experimental results or to provide insight into them. However, the prediction accuracies of these methods rarely exceed 70%.

Oscillation Effect of the Multi-stage Learning for the Layered Neural Networks and Its Analysis

This paper proposes an efficient learning method for layered neural networks based on the selection of training data and the input characteristics of an output layer unit. Compared with more recent neural networks (pulse neural networks, quantum neuro computation, etc.), the multilayer network is widely used because of its simple structure. When the learning objects are complicated, problems such as unsuccessful learning or a significant learning time remain unsolved. Focusing on the input data during the learning stage, we undertook an experiment to identify the data that produce large errors and interfere with the learning process. Our method divides the learning process into several stages. In general, the input characteristics of an output layer unit oscillate during the learning process for complicated problems. The multi-stage learning method proposed by the authors for function approximation problems classifies the learning data in a phased manner, focusing on their learnability before learning in the multilayered neural network, and this paper demonstrates the validity of the method. Specifically, computer experiments verify that both learning accuracy and learning time are improved with the BP method as the learning rule of the multi-stage learning method. Oscillatory phenomena of the learning curve play an important role in learning performance. The authors also discuss the mechanisms by which oscillatory phenomena occur in learning. Furthermore, by observing behavior during learning, the authors discuss why the errors of some data remain large even after learning.

Variance Based Component Analysis for Texture Segmentation

This paper presents a comparative analysis of a new unsupervised PCA-based technique for steel plate texture segmentation aimed at defect detection. The proposed scheme, called Variance Based Component Analysis (VBCA), employs PCA for feature extraction, applies a feature reduction algorithm based on the variance of eigenpictures, and classifies pixels as defective or normal. While classic PCA uses a clusterer such as K-means for pixel clustering, VBCA employs thresholding and some post-processing operations to label pixels as defective or normal. The experimental results show that the proposed VBCA algorithm is 12.46% more accurate and 78.85% faster than classic PCA.

Oil Refineries Emissions: Source and Impact: A Study using AERMOD

The main objectives of this paper are to measure pollutant concentrations in the oil refinery area in Kuwait over three periods during one year, obtain a recent emission inventory for the three refineries of Kuwait, use AERMOD and the emission inventory to predict pollutant concentrations and distribution, compare model predictions against measured data, and perform numerical experiments to determine the conditions under which emission rates and the resulting pollutant dispersion are below the maximum allowable limits.

Mechanical and Hydric Properties of High-Performance Concrete Containing Natural Zeolites

Mechanical and water transport properties of high performance concrete (HPC) containing natural zeolite as a partial replacement for Portland cement are studied. Experimental results show that, in the investigated mixes, the use of natural zeolite leads to an increase in porosity, a decrease in compressive strength, and an increase in moisture diffusivity and water vapor diffusion coefficient compared with the reference HPC. However, for replacement levels up to 20% of the mass of Portland cement, the concretes still maintain their high performance character and exhibit acceptable water transport properties. Therefore, natural zeolite can be considered an environmentally friendly binder with the potential to replace part of the Portland cement in concrete in the building industry.

Granulation using Clustering and Rough Set Theory and its Tree Representation

Granular computing deals with the representation of information in the form of aggregates and with related methods of transformation and analysis for problem solving. This paper presents a granulation scheme based on clustering and Rough Set Theory, with a focus on the structured conceptualization of information. Experiments with the proposed method on four labeled datasets show good results on the classification problem. The proposed granulation technique is semi-supervised and incorporates both global and local information granulation. A tree structure is proposed to represent the results of the attribute-oriented granulation.

Dynamic Action Induced By Walking Pedestrian

The main focus of this paper is on human-induced forces. Almost all existing force models for this type of load (defined either in the time or frequency domain) are developed from the assumption of perfect periodicity of the force and are based on force measurements conducted on rigid (i.e. high frequency) surfaces. To verify the conclusions of different authors, vertical pressure measurements during walking were performed using pressure gauges in various configurations. The obtained forces are analyzed using the Fourier transform. This load is often decisive in the design of footbridges. Design criteria and load models proposed by widely used standards and other researchers are introduced and compared.
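As a rough illustration of the Fourier analysis step mentioned above, the sketch below computes a one-sided amplitude spectrum of a sampled force record with a plain discrete Fourier transform. The function name `dft_amplitudes` and any signal passed to it are assumptions for illustration; the paper's actual measurement data and processing chain are not described in the abstract.

```python
import cmath
import math


def dft_amplitudes(samples, sample_rate_hz):
    """Return (frequency_hz, amplitude) pairs for a real-valued sampled signal.

    The mean (static load) is removed first, so only the dynamic part of
    the force contributes. This is a direct O(n^2) DFT, adequate for short
    records; it is a sketch, not the authors' processing code.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    spectrum = []
    for k in range(1, n // 2):  # positive frequencies below the Nyquist limit
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        # Factor 2/n converts the DFT coefficient to a sinusoid amplitude.
        spectrum.append((k * sample_rate_hz / n, 2.0 * abs(coeff) / n))
    return spectrum
```

For a perfectly periodic record the spectrum shows isolated peaks at the pacing frequency and its harmonics; departures from perfect periodicity, which the abstract questions, would appear as energy spread around those peaks.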

Performance Evaluation of the Post-Installed Anchor for Sign Structure

Numerous experimental tests of post-installed anchor systems drilled into hardened concrete were conducted to estimate pull-out and shear strength, accounting for uncertainties such as torque ratios, embedment depths and different diameters. In this study, the pull-out strength of the systems changed significantly under the effect of these three uncertainties, whereas the shear strength of the systems was not affected by torque ratios. Concrete cone failure or damage was generally observed during and after the pull-out tests, and in the shear strength tests the anchor systems mostly failed before the primary structural system. Furthermore, a 3D finite element model of the anchor systems was created in ABAQUS for numerical analysis. The finite element model matched the load-displacement relationship obtained from the experimental tests up to the failure points.

Investigating the Treatability of a Compost Leachate in a Hybrid Anaerobic Reactor: An Experimental Study

Compost manufacturing plants are among the units where wastewater is produced in significantly large amounts. Wastewater produced in these plants contains high amounts of substrate (organic loads) and is classified as stringent waste, which creates significant pollution when discharged into the environment without treatment. A compost production plant in one of Iran's provinces, treating 200 tons/day of waste, is one of the most important sources of environmental pollution in this zone. The main objectives of this paper are to investigate the treatability of compost wastewater in hybrid anaerobic reactors with an upflow-downflow arrangement, to determine the kinetic constants, and eventually to obtain an appropriate mathematical model. After start-up of the hybrid anaerobic reactor at the compost production plant, the average COD removal efficiency was 95%.

Influence of Static Pressure on Viability of Entomopathogenic Nematodes – Steinernema feltiae

The entomopathogenic nematodes Steinernema feltiae are components of many biological pesticides. These biological pesticides are applied by means of spraying machines. The influence of high pressure operating time on the viability of nematodes has been experimentally investigated in order to determine whether static pressure inside the sprayer installation is able to destroy nematodes. The pressure was 55 MPa and its maximum operating time was 3 hours. Changes were found in the viability of pressurized samples of nematodes mixed with water.

Segmentation of Breast Lesions in Ultrasound Images Using Spatial Fuzzy Clustering and Structure Tensors

Segmentation in ultrasound images is challenging because of interference from speckle noise and the fuzziness of boundaries. In this paper, a segmentation scheme using fuzzy c-means (FCM) clustering that incorporates both intensity and texture information is proposed to extract breast lesions in ultrasound images. First, the nonlinear structure tensor, which helps refine the edges detected by intensity, is used to extract speckle texture. Then, spatial FCM clustering is applied to the image feature space for segmentation. In experiments with simulated and clinical ultrasound images, spatial FCM clustering with both intensity and texture information gives more accurate results than conventional FCM or spatial FCM without texture information.
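For readers unfamiliar with the clustering step, the following is a minimal sketch of conventional fuzzy c-means on generic feature vectors. It is the standard algorithm, not the spatial variant proposed in the paper, and it does not compute structure tensors. The function names, the fuzzifier default `m=2.0`, and the idea of feeding it per-pixel (intensity, texture) tuples are assumptions for illustration.

```python
import math
import random


def _memberships(points, centers, m):
    """Standard FCM membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for p in points:
        # Clamp distances to avoid division by zero when a point sits on a center.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        table.append(
            [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
        )
    return table


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50, seed=0):
    """Cluster feature vectors (tuples of floats) with conventional FCM.

    Returns (centers, memberships), where memberships[k][i] is the degree
    to which point k belongs to cluster i. Each membership row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Initialise centers from randomly chosen data points (an assumption).
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    for _ in range(n_iter):
        u = _memberships(points, centers, m)
        for i in range(n_clusters):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            # Each center is the membership-weighted mean of all points.
            centers[i] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ]
    return centers, _memberships(points, centers, m)
```

In a segmentation setting, each pixel would contribute one feature vector, and the pixel would be assigned to the cluster with the highest membership. A spatial variant would additionally let neighbouring pixels influence each membership, which this sketch omits.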

Optimization of SAD Algorithm on VLIW DSP

The SAD (Sum of Absolute Differences) algorithm is heavily used in motion estimation, which is a computationally demanding process in motion picture encoding. To enhance the performance of motion picture encoding on a VLIW processor, an efficient implementation of the SAD algorithm on the VLIW processor is essential. The SAD algorithm is programmed as a nested loop with a conditional branch. In VLIW processors, loops are usually optimized by software pipelining, but research on optimal scheduling of software pipelining for nested loops, especially nested loops with conditional branches, is rare. In this paper, we propose an optimal scheduling and implementation of the SAD algorithm with a conditional branch on a VLIW DSP processor. The proposed optimal scheduling first transforms the nested loop with a conditional branch into a single loop with a conditional branch, with the aim of fully utilizing the ILP capability of the VLIW processor and enabling an earlier escape from the loop. Next, the proposed optimal scheduling applies a modulo scheduling technique developed for single loops. Based on this optimal scheduling strategy, an optimal implementation of the SAD algorithm on the TMS320C67x, a VLIW DSP, is presented. Experiments on the TMS320C6713 DSK show that an H.263 encoder with the proposed SAD implementation performs better than H.263 encoders with other SAD implementations, and that the code size of the optimal SAD implementation is small enough to be appropriate for embedded environments.
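The loop structure described above can be sketched in Python as follows. The first function is a nested loop with a conditional early escape; the second flattens it into a single loop that makes the same check at each row boundary. The function names, the `best_so_far` parameter, and the per-row placement of the check are assumptions for illustration; the sketch shows only the loop transformation and says nothing about the paper's modulo scheduling on the TMS320C67x.

```python
def sad_nested(cur, ref, best_so_far):
    """SAD of two equally sized 2-D blocks, nested-loop form.

    Escapes early once the partial sum reaches best_so_far (hypothetical
    parameter), since such a candidate cannot beat the current best match.
    """
    total = 0
    for row_cur, row_ref in zip(cur, ref):      # outer loop over rows
        for a, b in zip(row_cur, row_ref):      # inner loop over pixels
            total += abs(a - b)
        if total >= best_so_far:                # conditional branch: early escape
            return total
    return total


def sad_single_loop(cur, ref, best_so_far):
    """Same computation expressed as one flattened loop with one branch."""
    width = len(cur[0])
    flat_cur = [p for row in cur for p in row]
    flat_ref = [p for row in ref for p in row]
    total = 0
    for i, (a, b) in enumerate(zip(flat_cur, flat_ref)):
        total += abs(a - b)
        # Check at the end of each row, matching the nested version.
        if (i + 1) % width == 0 and total >= best_so_far:
            return total
    return total
```

Both functions return the same value for the same inputs. When the early escape triggers, the returned value is a partial sum that is at least `best_so_far`, which is enough for a motion search to discard the candidate block.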

Bio-Inspired Generalized Global Shape Approach for Writer Identification

Writer identification is an area of pattern recognition that attracts many researchers, particularly in forensic and biometric applications, where writing style can be used as a biometric feature for authenticating an identity. The challenging task in writer identification is the extraction of unique features, in which the individuality of handwriting styles can be adopted into a bio-inspired generalized global shape for writer identification. In this paper, the feasibility of the generalized global shape concept of complementary binding in Artificial Immune Systems (AIS) for writer identification is explored. An experiment based on the proposed framework has been conducted to prove the validity and feasibility of the proposed approach for off-line writer identification.

Automatic Camera Calibration for Images of Soccer Match

Camera calibration plays an important role in the analysis of sports video. In soccer video, the cross-points at the center of the soccer field that can be used for calibration are in most cases not sufficient, so this paper introduces a new automatic camera calibration algorithm that addresses this problem by using the properties of images of the center circle, the halfway line and a touch line. After the theoretical analysis, a practicable automatic algorithm is proposed. Although very little information is used, results of experiments with both synthetic and real data show that the algorithm is applicable.

Gluten-Free Cookies Enriched with Blueberry Pomace: Optimization of Baking Process

With the aim of improving the nutritional profile and antioxidant capacity of gluten-free cookies, blueberry pomace, a by-product of juice production, was processed into a new food ingredient by drying and grinding and used in a gluten-free cookie formulation. Since the quality of a baked product is highly influenced by the baking conditions, the objective of this work was to optimize the baking time and the thickness of dough pieces by applying Response Surface Methodology (RSM) in order to obtain the best technological quality of the cookies. The experiments were carried out according to a Central Composite Design (CCD), with dough thickness and baking time selected as independent variables, while hardness, color parameters (L*, a* and b* values), water activity, diameter and short/long ratio were the response variables. According to the results of the RSM analysis, a baking time of 13.74 min and a dough thickness of 4.08 mm were found to be optimal for the baking temperature of 170°C. As similar optimal parameters were obtained in a previously conducted experiment based on sensory analysis, RSM can be considered a suitable approach for optimizing the baking process.

Experimental Study of Upsetting and Die Forging with Controlled Impact

The results of experimental research on deformation by upsetting and die forging of lead specimens with controlled impact are presented. A laboratory setup for conducting the investigations, which uses a cold rocket engine operated with compressed air, is described. The results show that controlled impact achieves greater plastic deformation and consumes less impact energy than the ordinary impact deformation process.

The Contraction Point for Phan-Thien/Tanner Model of Tube-Tooling Wire-Coating Flow

The simulation of the extrusion process is widely studied in order to both increase production and improve quality, with broad application in wire coating. The annular tube-tooling extrusion was modeled by the Navier-Stokes equations together with a rheological model of differential form based on the single-mode exponential Phan-Thien/Tanner constitutive equation in a two-dimensional cylindrical coordinate system, in order to predict the contraction point of the polymer melt beyond the die. Numerical solutions are sought through a semi-implicit Taylor-Galerkin pressure-correction finite element scheme. The investigation focused on incompressible creeping flow with long relaxation times, in terms of Weissenberg numbers up to 200. The isothermal case was considered, with a surface tension effect on the free surface in the extrudate flow and no slip at the die wall. The Streamline Upwind Petrov-Galerkin method is proposed to stabilize the solution. The mesh structure after the die exit was adjusted following the prediction of both the top and bottom free surfaces so as to keep the location of the contraction point around one unit length, which is close to experimental results.

Mining Correlated Bicluster from Web Usage Data Using Discrete Firefly Algorithm Based Biclustering Approach

Over the past decade, biclustering has become a popular data mining technique, not only in biological data analysis but also in other applications with high-dimensional two-way datasets, such as text mining and market data analysis. Biclustering clusters both the rows and the columns of a dataset simultaneously, as opposed to traditional clustering, which clusters either rows or columns. It retrieves subgroups of objects that are similar in one subgroup of variables and different in the remaining variables. The Firefly Algorithm (FA) is a recently proposed metaheuristic inspired by the collective behavior of fireflies. This paper provides a preliminary assessment of a discrete version of FA (DFA) applied to the task of mining coherent, large-volume biclusters from web usage datasets. The experiments were conducted on two web usage datasets from a public dataset repository, and the performance of FA was compared with that of another population-based metaheuristic, binary Particle Swarm Optimization (PSO). The results demonstrate the usefulness of DFA for tackling the biclustering problem.