Meta Random Forests

Leo Breiman's Random Forests (RF) is a recent development in tree-based classifiers that has quickly proven to be one of the most important algorithms in the machine learning literature. It has shown robust and improved classification results on standard data sets. Ensemble learning algorithms such as AdaBoost and Bagging have been actively researched and have improved classification results on several benchmark data sets, mainly with decision trees as their base classifiers. In this paper we apply these meta-learning techniques to random forests. We study how ensembles of random forests perform on standard data sets from the UCI repository. We compare the original random forest algorithm with its ensemble counterparts and discuss the results.

Time Domain and Frequency Domain Analyses of Measured Metocean Data for Malaysian Waters

Wave height and wind speed data were collected from three existing oil fields in the South China Sea, offshore the Peninsular Malaysia, Sarawak and Sabah regions. The data were recorded from 1999 until 2008, and extreme values and other significant data were used for the analysis. The results show that offshore structures are susceptible to unacceptable motions initiated by wind and waves, with the worst structural impacts caused by extreme wave heights. To protect offshore structures from damage, there is a need to quantify descriptive statistics, determine the spectral envelope of wind speed and wave height, and ascertain the frequency content of each spectrum for offshore structures in the shallow waters of the South China Sea using measured time series. The results indicate that the process is nonstationary; it is converted to a stationary process by first differencing the time series. In the descriptive statistical analysis, both wind speed and wave height have a significant influence on the offshore structure during the northeast monsoon, with a high mean wind speed of 13.5195 knots (σ = 6.3566 knots) and a high mean wave height of 2.3597 m (σ = 0.8690 m). The spectra show no clear dominant peak, and the peaks fluctuate randomly. The wind speed spectrum and the wave height spectrum each have their own identifiable pattern. The wind speed spectrum tends to grow gradually in the lower frequency range and keeps increasing until it doubles in the higher frequency range, with a mean peak frequency range of 0.4104 Hz to 0.4721 Hz. The wave height spectrum tends to grow drastically in the low frequency range, then fluctuates and decreases slightly in the high frequency range, with a mean peak frequency range of 0.2911 Hz to 0.3425 Hz.
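
The first-differencing step mentioned in the abstract can be illustrated with a minimal sketch. The function name and the sample values below are hypothetical and are not taken from the measured data; the sketch only shows how a series with a trend becomes a series of increments.

```python
def first_difference(series):
    """Return the first-differenced series d[t] = x[t+1] - x[t]."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical wave heights in metres with a steady upward trend
# (illustrative values only, not measured data).
wave_height = [1.0, 1.5, 2.0, 2.5, 3.0]

# The differenced series has one fewer sample and no trend left.
differenced = first_difference(wave_height)
```

For a real record, the differenced series would then be checked for stationarity before the spectra are estimated.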

Monitoring and Fault-Recovery Capacity with Waveguide Grating-based Optical Switch over WDM/OCDMA-PON

To provide both flexibility and survivability over a passive optical network (PON), a new automatic random fault-recovery mechanism with an arrayed-waveguide-grating-based (AWG-based) optical switch (OSW) is presented. First, a wavelength-division-multiplexing and optical code-division multiple-access (WDM/OCDMA) scheme is configured to meet the requirements of the various geographical locations between the optical network unit (ONU) and the optical line terminal (OLT). The AWG-based optical switch is designed and viewed as a central star-mesh topology in order to avoid or reduce duplicated redundant elements such as fibers and transceivers. Hence, with a simple monitoring and routing switch algorithm, random fault-recovery capacity is achieved over the bi-directional (upstream/downstream) WDM/OCDMA scheme. When a fault occurs on a distribution fiber (DF), or the bit-error rate (BER) exceeds the 10^-9 requirement, the primary/slave AWG-based OSWs are adjusted and controlled dynamically to restore the affected ONU groups immediately via the other working DFs.

Improving Protein-Protein Interaction Prediction by Using Encoding Strategies and Random Indices

New features are extracted and compared to improve the prediction of protein-protein interactions. The basic idea is to select and use the best set of features from the tensor matrices that are produced by the frequency vectors of the protein sequences. Three sets of features are compared: the first set is based on the indices that are the most common in the interacting proteins, the second set is based on the indices that tend to be common in both the interacting and non-interacting proteins, and the third set is constructed by using random indices. Moreover, three encoding strategies are compared, based on the polarity, structure, and chemical properties of the amino acids. The experimental results indicate that the highest accuracy can be obtained by using random indices with the chemical-properties encoding strategy and a support vector machine.

Automation of Heat Exchanger using Neural Network

This paper discusses the development of a heat exchanger as a pilot plant for educational purposes and presents the use of a neural network (NN) for controlling the process. The aim of the study is to highlight the need for a specific Pseudo Random Binary Sequence (PRBS) to excite a process under control. As the neural network is a data-driven technique, the method of data generation plays an important role, and a careful experimental procedure for data generation was therefore a crucial task. Heat exchange is a complex process, which has a capacity and a time lag as process elements. The proposed system is a typical pipe-in-pipe heat exchanger. The complexity of the system demands careful selection, proper installation and commissioning. The temperature, flow, and pressure sensors play a vital role in the control performance. The final control element is a pneumatically operated control valve. The experiments on the heat exchanger followed a well-drafted procedure that gave the utmost attention to the safety of the system. The results are encouraging. They show that feedback systems are suitable when the process parameters are completely known and the utilities are well stabilized, whereas the neural network control paradigm is useful for processes that are nonlinear and less well understood. The implementation of NN control reinforces the concepts of process control and the NN control paradigm. The results also underline the importance of an excitation signal tailored to the process. Data acquisition, processing, and presentation in a typical format are the most important considerations when validating the results.
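
As an illustration of the kind of excitation signal discussed above, the following minimal sketch generates a PRBS with a 7-bit linear feedback shift register. The abstract does not state which PRBS was used; the PRBS7 polynomial x^7 + x^6 + 1, the seed, and the function name are assumptions made for this example.

```python
def prbs7(n_bits, seed=0x02):
    """Generate n_bits of a PRBS7 sequence (polynomial x^7 + x^6 + 1).

    The seed is an assumed starting state; any non-zero 7-bit value works.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be a non-zero 7-bit value")
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of the two highest register bits (taps 7 and 6).
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        # Shift left by one and feed the new bit in at the low end.
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


# Two full periods of the sequence (one period is 2**7 - 1 = 127 bits).
sequence = prbs7(254)
```

In an identification experiment each bit would be mapped to one of two input levels (for example two valve openings) and held for a chosen switching time; those levels and the switching time depend on the process and are not specified here.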

Optimization of Communication Protocols by stochastic Delay Mechanisms

The paper is concerned with developing stochastic delay mechanisms for efficient multicast protocols and for smooth mobile handover processes that are capable of preserving a given Quality of Service (QoS). In both applications the participating entities (receiver nodes or subscribers) sample a stochastic timer and generate load after a random delay. In this way, the load on the networking resources is evenly distributed, which helps to maintain QoS communication. The optimal timer distributions have been sought in different p.d.f. families (e.g. exponential, power law and radial basis function), and the optimal parameters have been found in a recursive manner. Detailed simulations have demonstrated the improvement in performance in both the multicast and the mobile handover applications.
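
The effect of a random timer on the load can be sketched as follows. This is an illustrative simulation only: the exponential timer, the mean delay, the slot length and the function name are assumptions, and the sketch does not reproduce the paper's optimization of the timer distribution.

```python
import random
from collections import Counter


def load_per_slot(n_nodes, mean_delay, slot=1.0, seed=0):
    """Count how many nodes generate load in each time slot.

    Every node samples an exponential timer with the given mean delay
    and transmits when the timer expires.
    """
    rng = random.Random(seed)
    delays = [rng.expovariate(1.0 / mean_delay) for _ in range(n_nodes)]
    return Counter(int(delay // slot) for delay in delays)


n_nodes = 1000
load = load_per_slot(n_nodes, mean_delay=5.0)

# Without a timer all nodes would respond in the same slot (peak = n_nodes);
# with the timer the same total load is spread over many slots.
peak_load = max(load.values())
total_load = sum(load.values())
```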

Analysis of Socio-Cultural Obstacles for Dissemination of Nanotechnology from Iran's Agricultural Experts Perspective

The main purpose of this research was to analyze the socio-cultural obstacles to the dissemination of nanotechnology in Iran's agricultural sector. One hundred twenty-eight out of a total of 190 researchers with different levels of expertise in and familiarity with nanotechnology were randomly selected and completed questionnaires. Face validity was established through experts' suggestions and corrections, and reliability was assessed using Cronbach's alpha. The results of a factor analysis showed different percentages of variation for the factors: cultural factors 19.475 percent, management 13.139 percent, information factor 11.277 percent, production factor 9.703 percent, social factor 9.267 percent, and attitude factor 8.947 percent. The results also indicated that socio-cultural factors were the most important obstacle to nanotechnology dissemination in the agricultural sector in Iran.
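
The reliability statistic named above can be computed with the standard formula for Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is illustrative: the function name and the questionnaire answers are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])
    # Sample variance of each questionnaire item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of five respondents to four Likert-scale items.
answers = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(answers)
```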

Wind Load Characteristics in Libya

Recent trends in building construction in Libya lean toward tall (high-rise) building projects. As a consequence, better estimation of the lateral loading in the design process is becoming the focus of a safe and cost-effective building industry. By and large, Libya is not considered an earthquake-prone zone, which makes wind the dominant lateral design load. Current design practice in the country estimates wind speeds on a largely random basis, applying a certain factor of safety to the chosen wind speed. The need for a more accurate estimation of wind speeds in Libya was therefore the motivation behind this study. Records of wind speed data were collected from 22 meteorological stations in Libya and were statistically analysed. The analysis of more than four decades of wind speed records suggests that the country can be divided into four zones of distinct wind speeds. A computer "survey" program was used to draw a contour map of design wind speeds for the state of Libya. The paper presents the statistical analysis of Libya's recorded wind speed data and proposes design wind speed values for a 50-year return period that cover the entire country.

Simulation of Sample Paths of Non Gaussian Stationary Random Fields

Mathematical justifications are given for a simulation technique for multivariate non-Gaussian random processes and fields based on Rosenblatt's transformation of Gaussian processes. Different types of convergence are given for the approximating sequence. Moreover, an original numerical method is proposed to solve the functional equation that yields the autocorrelation function of the underlying Gaussian process.
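
The general idea of obtaining a non-Gaussian path by transforming a Gaussian one can be sketched in the scalar case as follows. This is a simplified illustration and not the paper's method: the AR(1) Gaussian process, the exponential target marginal, and all names and parameter values are assumptions, and the sketch does not solve for the underlying Gaussian autocorrelation function.

```python
import math
import random
from statistics import NormalDist


def simulate_exponential_path(n, rho=0.8, rate=1.0, seed=0):
    """Simulate a stationary path with exponential marginals.

    A stationary Gaussian AR(1) process g (unit variance, lag-one
    correlation rho) is mapped pointwise through the standard normal CDF
    and then through the inverse CDF of the exponential distribution.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    g = rng.gauss(0.0, 1.0)
    path = []
    for _ in range(n):
        # Probability integral transform: Gaussian value -> uniform value.
        u = std_normal.cdf(g)
        # Keep u strictly inside (0, 1) so the logarithm stays finite.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        # Inverse exponential CDF: uniform value -> exponential value.
        path.append(-math.log(1.0 - u) / rate)
        # Advance the Gaussian AR(1) process, preserving unit variance.
        g = rho * g + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return path


path = simulate_exponential_path(5000)
sample_mean = sum(path) / len(path)
```

The pointwise transformation changes the correlation structure, which is why a target autocorrelation for the non-Gaussian process requires choosing the Gaussian autocorrelation accordingly.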

Directional Drilling Optimization by Non-Rotating Stabilizer

The Non-Rotating Adjustable Stabilizer / Directional Solution (NAS/DS) is a simulation of a directional drilling operation that responds mathematically and graphically to data and decisions, so that the best conditions can be chosen in comparison with the previous mode. The NAS/DS Auto Guide rotary steerable tool is undergoing final field trials. The point-the-bit tool can use any bit, work at any rotating speed, and work with any MWD/LWD system, and there is no pressure drop through the tool. It is a fully closed-loop system that automatically maintains a specified curvature rate. The Non-Rotating Adjustable Stabilizer (NAS) controls the curvature rate by exact positioning, and it can be run with the optimum bit, use the most effective weight on bit (WOB) and rotary speed (RPM), and apply all of the available hydraulic energy to the bit. The directional simulator, called the Directional Solution (DS), allows the user to specify the size of the curvature-rate performance errors of the NAS tool and the magnitude of the random errors in the survey measurements. The combination of these technologies (NAS/DS) will provide smoother boreholes, reduced drilling time, reduced drilling cost and highly precise targeting. The simulator controls the curvature rate by precisely adjusting the radial extension of the stabilizer blades on a near-bit non-rotating stabilizer, and the control process corrects for the secondary effects caused by formation characteristics, bit and tool wear, and manufacturing tolerances.

The Appropriate Time Required for Newborn Calf Camel to Get Optimal Amount of Colostrums Immunoglobulin (IgG) with Relation to Levels of Cortisol and Thyroxin

A major challenge in camel productivity is the high mortality rate of camel calves in the early stage of life due to a lack of colostrum. This study investigates the time required for the calves to obtain the optimum amount of immunoglobulin (IgG). Eleven pregnant female camels (Camelus dromedarius), varying in age and gestation, were selected randomly. After delivery, 7 calves were obtained and used for this investigation. Colostrum samples were collected from the mothers immediately after parturition. Blood samples were obtained from the calves as follows: at day 0 (before suckling); at 24, 48, 72, 96, 120 and 144 hours; and in the 2nd, 3rd, and 4th weeks post suckling. Blood serum and colostrum whey were separated and used to determine the IgG concentration, total protein, and the concentrations of cortisol and thyroxin. The results showed high levels of IgG in camel colostrum (328.8 ± 4.5 mg/ml). The IgG concentration in the serum of calves was highest within the first 24 h after suckling (140.75 mg/ml) and then declined gradually, reaching a lower level at 144 h (41.97 mg/ml). The average turnover rate (t 1/2) of serum IgG across all cases was 3.22 days. The turnover ranged from 2.56 days for calves with above-average IgG values to 7.7 days for those with below-average values. Despite very high levels of thyroxin in the sera of the newborns, the results showed no correlation of cortisol or thyroxin with IgG levels.

Experimental and Theoretical Investigation of Rough Rice Drying in Infrared-assisted Hot Air Dryer Using Artificial Neural Network

The drying characteristics of rough rice (variety Lenjan) with an initial moisture content of 25% dry basis (db) were studied in a hot air dryer assisted by infrared heating. Three inlet air temperatures (30, 40 and 50 °C), four infrared radiation intensities (0, 0.2, 0.4 and 0.6 W/cm²) and three inlet air speeds (0.1, 0.15 and 0.2 m/s) were studied. The bending strength of the brown rice kernel, the percentage of cracked kernels and the drying time were measured and evaluated. The results showed that increasing the inlet air temperature and the infrared radiation intensity decreased the drying time. High bending strength and a low percentage of cracked kernels were obtained when the paddy was dried in the infrared-assisted hot air dryer. There were significant differences among these factors and their interactions (p

Evaluating some Feature Selection Methods for an Improved SVM Classifier

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory requirements can be prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. Four feature selection methods are evaluated: Random Selection, Information Gain (IG), Support Vector Machine (called SVM_FS) and Genetic Algorithm with SVM (GA_FS). We show that the SVM_FS and GA_FS methods give the best results with a relatively small feature vector, whereas the IG method needs longer vectors to reach quite similar classification accuracies. We also present a novel method to better correlate the SVM kernel's parameters (polynomial or Gaussian kernel).
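
Of the four methods listed, Information Gain is the simplest to illustrate. The sketch below scores a term by how much knowing its presence or absence reduces the entropy of the class labels, using the standard definition. The toy documents, labels and function names are hypothetical, and the paper's own preprocessing and thresholds are not reproduced.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on term presence."""
    with_term = [y for doc, y in zip(docs, labels) if term in doc]
    without_term = [y for doc, y in zip(docs, labels) if term not in doc]
    gain = entropy(labels)
    for subset in (with_term, without_term):
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


# Hypothetical toy corpus: each document is a set of terms.
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "match"}]
labels = ["sport", "sport", "politics", "politics"]

# Rank the vocabulary by information gain, best first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
```

Keeping only the top-ranked terms shortens the document vectors; in this toy corpus "goal" and "vote" separate the classes perfectly, while "match" carries no information.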

Objectivity, Reliability and Validity of the 90º Push-Ups Test Protocol Among Male and Female Students of Sports Science Program

This study was conducted to determine the objectivity, reliability and validity of the 90º push-ups test protocol among male and female students of the Sports Science Program, Faculty of Sports Science and Coaching, Sultan Idris University of Education. A sample of 300 students, consisting of 168 males and 132 females, was randomly selected for this study. The researchers administered the 90º push-ups test to the sample twice, following a test and re-test protocol, as well as the bench press test. The Pearson product-moment correlation method was used to determine the objectivity, reliability and validity values. The findings showed that the 90º push-ups test protocol had high consistency between the two testers, with a value of r = .99. Likewise, the reliability value between test and re-test for the 90º push-ups test was high for both the male (r = .93) and female (r = .93) students. The correlation between the 90º push-ups test and the bench press test was r = .64 for males and r = .28 for females. This finding indicates that the 90º push-ups test has higher validity as a test of upper-body muscular strength and endurance for male students than for female students.
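
The Pearson product-moment correlation used for these values can be computed as in the following minimal sketch. The function name and the push-up counts are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


# Hypothetical push-up counts for six students on test and re-test.
test_scores = [20, 25, 31, 18, 40, 27]
retest_scores = [22, 24, 33, 17, 41, 29]
reliability = pearson_r(test_scores, retest_scores)
```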

Structural Characteristics of Three-Dimensional Random Packing of Aggregates with Wide Size Distribution

The mechanical properties of granular solids are dependent on the flow of stresses from one particle to another through inter-particle contact. Although some experimental methods have been used to study the inter-particle contacts in the past, preliminary work with these techniques indicated that they do not have the necessary resolution to distinguish between those contacts that transmit the load and those that do not, especially for systems with a wide distribution of particle sizes. In this research, computer simulations are used to study the nature and distribution of contacts in a compact with wide particle size distribution, representative of aggregate size distribution used in asphalt pavement construction. The packing fraction, the mean number of contacts and the distribution of contacts were studied for different scenarios. A methodology to distinguish and compute the fraction of load-bearing particles and the fraction of space-filling particles (particles that do not transmit any force) is needed for further investigation.

Utilization Juice Wastes as Corn Replacement in the Broiler Diet

An experiment was conducted with 80 unsexed broilers of the Arbor Acres strain to determine the capability of a mixture of carrot and fruit juice wastes (carrot, apple, mango, avocado, orange, melon and Dutch eggplant, in equal proportions) to replace corn in the broiler diet. The study used a completely randomized design (CRD) with 5 treatments (0, 5, 10, 15, and 20% juice wastes mixture in the diet) and 4 replicates per treatment. Diets were isonitrogenous (22% crude protein) and isocaloric (3000 kcal/kg diet). The measured variables were feed consumption, average daily gain, feed conversion, and the percentages of abdominal fat pad, carcass, digestive organs (liver, pancreas and gizzard), and heart. Data were analyzed by analysis of variance for a CRD. Increasing the level of juice wastes mixture in the diet increased feed consumption (P
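
The random allocation behind a completely randomized design with 5 treatments and 4 replicates can be sketched as follows. The function name, the seed and the numbering of experimental units are assumptions; the abstract does not describe how the birds were grouped into units.

```python
import random
from collections import Counter


def assign_crd(treatments, replicates, seed=0):
    """Randomly assign experimental units to treatments in a CRD.

    Returns a dict mapping unit index to treatment label, with each
    treatment appearing exactly `replicates` times.
    """
    # One label per (treatment, replicate) pair.
    labels = [t for t in treatments for _ in range(replicates)]
    rng = random.Random(seed)
    # Shuffling gives every unit the same chance of any treatment.
    rng.shuffle(labels)
    return dict(enumerate(labels))


# Juice wastes mixture levels (% of diet) from the abstract.
levels = [0, 5, 10, 15, 20]
layout = assign_crd(levels, replicates=4)
units_per_level = Counter(layout.values())
```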

Detecting Email Forgery using Random Forests and Naïve Bayes Classifiers

As email communications have no consistent authentication procedure to ensure authenticity, we present an investigative analysis approach for detecting forged emails based on Random Forests and Naïve Bayes classifiers. Instead of investigating the email headers, we use the body content to extract a unique writing style for each of the possible suspects. Our approach consists of four main steps: (1) the cybercrime investigator extracts different effective features, including structural, lexical, linguistic, and syntactic evidence, from previous emails of all the possible suspects; (2) the extracted feature vectors are normalized to increase the accuracy rate; (3) the normalized features are then used to train the learning engine; and (4) upon receiving the anonymous email (M), we apply the feature extraction process to produce a feature vector. Finally, using the machine learning classifiers, the email is assigned to the suspect whose writing style most closely matches M. Experimental results on real data sets show the improved performance of the proposed method and its ability to identify the authors with a very limited number of features.

Expert Witness Testimony in the Battered Woman Syndrome

Expert witness testimony (EWT) is a kind of information given to the jury by an expert specialized in the field (here, BWS) in order to help the court better understand the case. EWT does not always work in favor of battered women. Two main decision-making models are discussed in the paper: the Mathematical model and the Explanation model. In the first model, the jurors calculate "the importance and strength of each piece of evidence", whereas in the second model they try to integrate the EWT with the evidence and create a coherent story that would describe the crime. The jury often misunderstands and misjudges battered women for their action (or in this case inaction). Jurors assume that these women are masochists who accept being mistreated, reasoning that if a man constantly abuses a woman, she should and could divorce him or simply leave at any time. Research in the domain has found that expert witness testimony does have a powerful influence on jurors' decisions, so its quality needs to be explored further. One of the important factors that needs further study is a bias called the dispositionist worldview (a belief that what happens to people is of their own doing). This kind of attributional bias represents a tendency to think that a person's behavior is due to his or her disposition, even when the behavior is clearly attributable to the situation. Hypothesis: The hypothesis of this paper is that if a juror has a dispositionist worldview, then he or she will blame the rape victim for triggering the assault. The juror would therefore commit the fundamental attribution error and believe that the victim's disposition, and not the situation she was in, caused the rape. Methods: The subjects in the study were 500 randomly sampled undergraduate students from McGill, Concordia, Université de Montréal and UQAM. Dispositional worldview was scored on the Dispositionist Worldview Questionnaire. After reading the Rape Scenarios, each student was asked to play the role of a juror and answer a questionnaire consisting of 7 questions about the responsibility, causality and fault of the victim. Results: The results confirm the hypothesis that if a juror has a dispositionist worldview, then he or she will blame the rape victim for triggering the assault. By doing so, the juror commits the fundamental attribution error by believing that the victim's disposition, and not the constraints or opportunities of the situation, caused the rape scenario.

Investigation of Phytoextraction Coefficient Different Combination of Heavy Metals in Barley and Alfalfa

Two separate experiments, one with barley and one with alfalfa, were conducted in a 2×8 factorial completely randomised design with four replicates. The factors were inoculation with Glomus mosseae (M) or no inoculation (M0), and seven levels of contaminants (Co, Cd, Pb and their combinations) plus an uncontaminated control treatment (C). Heavy metals in plant tissues and soil were quantified by Inductively Coupled Plasma Optical Emission Spectrometer (ICP-OES) (Varian Liberty 150AX Turbo). The phytoextraction coefficient of each contaminant was calculated as the concentration of the heavy metal in the shoot (mg kg⁻¹) divided by its concentration in the soil (mg kg⁻¹). In barley, the highest phytoextraction coefficients of Pb, Cd and Co were found in M0Pb, M0PbCoCd and MCo, respectively (P
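
The phytoextraction coefficient defined above is a simple ratio, sketched below. The function name and the concentrations in the example are hypothetical and are not values from the experiments.

```python
def phytoextraction_coefficient(shoot_mg_per_kg, soil_mg_per_kg):
    """Ratio of the metal concentration in the shoot to that in the soil.

    Both concentrations must be in the same unit (mg/kg), so the
    coefficient is dimensionless.
    """
    if soil_mg_per_kg <= 0:
        raise ValueError("soil concentration must be positive")
    return shoot_mg_per_kg / soil_mg_per_kg


# Hypothetical example: 30 mg/kg Pb in the shoot, 60 mg/kg Pb in the soil.
pb_coefficient = phytoextraction_coefficient(30.0, 60.0)
```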

Stabilizer Fillet Weld Strength under Multiaxial Loading (Effect of Force, Size and Residual Stress)

In this paper, the strength of a stabilizer is determined under static and fatigue multiaxial loading. The stabilizer is a part of the suspension system of a heavy truck that stabilizes the cabin against road-induced vibration; it consists of a thin-walled tube joined to a forged component by a fillet weld. The component is loaded by a non-proportional random sequence of torsion and bending. The residual stress of the welding process is considered for static loading. The static loading, combined with road irregularities, is applied in this study as the fatigue case that can affect the fillet-welded area of this part. The stresses in the welded structure are calculated using FEA. In addition, fatigue under multiaxial loading in the fillet weld is investigated, and the critical zone of the stabilizer is identified and presented in graphs. Residual stresses resulting from the thermal forces are considered in the FEA. The critical point of the component is found by increasing the applied force.