Compression of Semistructured Documents

EGOTHOR is a search engine that indexes the Web and lets users search Web documents. Its hit list contains the URL and title of each hit, together with a snippet that briefly shows the match. The snippet can almost always be assembled by an algorithm that has full knowledge of the original document (usually an HTML page). This implies that the search engine must store the full text of the documents as part of the index. That requirement calls for an appropriate compression algorithm to reduce the space demand. One solution is to use common compression methods such as gzip or bzip2, but it may be preferable to develop a new method that takes advantage of the document structure or, rather, the textual character of the documents. Specialized text compression algorithms and methods for compressing XML documents already exist. The aim of this paper is to integrate the two approaches to achieve an optimal compression ratio.
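As a rough illustration of the baseline mentioned in the abstract, the sketch below measures how well two general-purpose compressors from the Python standard library shrink a small HTML-like string. It is not the method proposed in the paper. The sample page and the function name `compression_ratios` are made up for the example, and `zlib` stands in for gzip because both use the DEFLATE algorithm.

```python
import bz2
import zlib


def compression_ratios(text: str) -> dict:
    """Return compressed size / original size for two general-purpose codecs."""
    raw = text.encode("utf-8")
    return {
        "zlib": len(zlib.compress(raw, 9)) / len(raw),  # DEFLATE, as used by gzip
        "bz2": len(bz2.compress(raw, 9)) / len(raw),    # Burrows-Wheeler based
    }


# Hypothetical sample document: repeated markup, as is typical of HTML pages.
sample_page = (
    "<html><body>"
    + "<p>Semistructured documents repeat markup.</p>" * 200
    + "</body></html>"
)

ratios = compression_ratios(sample_page)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f} of original size")
```

A structure-aware method would aim to beat these ratios by modeling markup and text separately. The numbers printed here depend entirely on the sample input.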

Methane and Other Hydrocarbon Gas Emissions Resulting from Flaring in Kuwait Oilfields

Air pollution is a major environmental health problem affecting developed and developing countries around the world. Increasing amounts of potentially harmful gases and particulate matter are being emitted into the atmosphere on a global scale, resulting in damage to human health and the environment. Petroleum-related air pollutants can have a wide variety of adverse environmental impacts. In the crude oil production sector, there is a strong need for thorough knowledge of the gaseous emissions that result from the daily flaring of associated gas of known composition under various operating conditions. Such knowledge can help control gaseous emissions from flares and thus protect their immediate and distant surroundings against environmental degradation. The impacts of methane and non-methane hydrocarbon emissions from flaring activities at oil production facilities in the Kuwait Oilfields have been assessed through a screening study using records of flaring operations taken at the gas and oil production sites, and by analyzing available meteorological and air quality data measured at stations located near anthropogenic sources. In the present study, the Industrial Source Complex (ISCST3) Dispersion Model is used to calculate the ground-level concentrations of methane and non-methane hydrocarbons emitted by flaring across the Kuwait Oilfields. Simulation of real hourly air quality in and around oil production facilities in the State of Kuwait for the year 2006, with the respective source emission data fed into the ISCST3 software, indicates that the levels of non-methane hydrocarbons from the flaring activities exceed the allowable ambient air standard set by Kuwait EPA. There is therefore a strong need to address this acute problem and minimize the impact of methane and non-methane hydrocarbons released from flaring activities over the urban area of Kuwait.

A Materialized Approach to the Integration of XML Documents: the OSIX System

The data exchanged on the Web differ in nature from those handled by classical database management systems. These data are called semi-structured because they do not have the regular, static structure of data in a relational database: their schema is dynamic and may contain missing data or types. This has created a need for further techniques and algorithms to exploit and integrate such data and to extract information relevant to the user. In this paper we present the OSIX system (Osiris based System for Integration of XML Sources). This system has a data warehouse model designed for the integration of semi-structured data, and more precisely for the integration of XML documents. The architecture of OSIX relies on the Osiris system, a DL-based model designed for the representation and management of databases and knowledge bases. Osiris is a view-based data model whose indexing system supports semantic query optimization. We show that query processing on an XML source is optimized by the indexing approach proposed by Osiris.

Dynamic Models versus Frailty Models for Recurrent Event Data

Recurrent event data are a special type of multivariate survival data. Dynamic models and frailty models are two of the approaches that deal with this kind of data. The two models are compared using the empirical standard deviation of the standardized martingale residual processes as a way of assessing their fit, based on the Aalen additive regression model. We found that both approaches take heterogeneity into account and produce residual standard deviations close to each other, both in the simulation study and in the real data set.

Teaching Approach and Self-Confidence Effect Model Consistency between Taiwan and Singapore Multi-Group HLM

This study compared models for two countries, Taiwan and Singapore, using the TIMSS database. The researchers used Multi-Group Hierarchical Linear Modeling techniques to compare the effects in the two country models, and tested the hypotheses on 4,046 Taiwan students and 4,599 Singapore students in 2007 at two levels: the class level and the student (individual) level. Design quality is a class-level variable. The student-level variables are achievement and self-confidence. The results challenge the widely held view that retention has a positive impact on self-confidence. Suggestions for future research are discussed.

Dengue Disease Mapping with Standardized Morbidity Ratio and Poisson-gamma Model: An Analysis of Dengue Disease in Perak, Malaysia

Dengue is an infectious vector-borne viral disease that is commonly found in tropical and sub-tropical regions around the world, including Malaysia, especially in urban and semi-urban areas. There is currently no available vaccine or chemotherapy for the prevention or treatment of dengue disease. Prevention and treatment of the disease therefore depend on vector surveillance and control measures. Disease risk mapping has been recognized as an important tool in prevention and control strategies for diseases. The choice of statistical model used for relative risk estimation is important, because a good model will in turn produce a good disease risk map. The aim of this study is therefore to estimate the relative risk for dengue disease, based initially on the most common statistic used in disease mapping, the Standardized Morbidity Ratio (SMR), and on one of the earliest applications of Bayesian methodology, the Poisson-gamma model. This paper begins with a review of the SMR method, which we then apply to dengue data from Perak, Malaysia. We then fit an extension of the SMR method, the Poisson-gamma model. Both sets of results are displayed and compared using graphs, tables and maps. The analysis shows that the latter method gives better relative risk estimates than the SMR. The Poisson-gamma model is shown to overcome the problem that arises with the SMR when no dengue cases are observed in certain regions. However, covariate adjustment in this model is difficult, and it cannot allow for spatial correlation between risks in adjacent areas. These drawbacks have motivated many researchers to propose alternative methods for estimating the risk.
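The two estimators named in the abstract can be sketched in a few lines using their textbook forms. The SMR for region i is the observed count divided by the expected count, O_i / E_i. Under a Poisson-gamma model with O_i ~ Poisson(E_i θ_i) and a Gamma(a, b) prior on the relative risk θ_i (shape a, rate b), the posterior mean is (a + O_i) / (b + E_i). The region counts and the hyperparameters a = b = 1 below are invented for illustration. How the paper chooses or estimates its hyperparameters is not stated in the abstract.

```python
def expected_cases(populations, observed):
    """Expected count per region under a common overall rate (internal standardization)."""
    overall_rate = sum(observed) / sum(populations)
    return [n * overall_rate for n in populations]


def smr(observed, expected):
    """Standardized Morbidity Ratio: observed / expected, per region."""
    return [o / e for o, e in zip(observed, expected)]


def poisson_gamma_mean(observed, expected, a, b):
    """Posterior mean relative risk under a Gamma(a, b) prior (shape a, rate b)."""
    return [(a + o) / (b + e) for o, e in zip(observed, expected)]


# Hypothetical data: three regions, the first with no observed cases.
populations = [1000, 2000, 1000]
observed = [0, 6, 2]

expected = expected_cases(populations, observed)
smr_values = smr(observed, expected)
pg_values = poisson_gamma_mean(observed, expected, a=1.0, b=1.0)  # a, b assumed

for region, (s, p) in enumerate(zip(smr_values, pg_values), start=1):
    print(f"region {region}: SMR = {s:.3f}, Poisson-gamma mean = {p:.3f}")
```

The first region shows the behavior the abstract describes: with zero observed cases the SMR collapses to zero, while the Poisson-gamma estimate is pulled toward the prior mean and stays positive.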

Relationship between Level of Physical Activity and Exercise Imagery among Klang Valley Citizens

This study investigated the relationship between exercise imagery use and level of physical activity within a wide range of exercisers in the Klang Valley, Malaysia. One hundred and twenty-four respondents (Mage = 28.92, SD = 9.34) completed two questionnaires (the Exercise Imagery Inventory and the Leisure-Time Exercise Questionnaire) that measure participants' use of imagery and exercise frequency. Exercise imagery was found to be significantly correlated with level of physical activity. In addition, variables such as gender, age and ethnicity that may affect the use of imagery and exercise frequency were also assessed. Among these variables, only ethnicity showed a significant difference in level of physical activity (p < 0.05). The findings suggest that further investigation should be done on other variables, such as socioeconomic status, educational level, and self-efficacy, that may affect imagery use and frequency of physical activity among exercisers.

How to Build and Evaluate a Solution Method: An Illustration for the Vehicle Routing Problem

The vehicle routing problem (VRP) is a famous combinatorial optimization problem. Because of its well-known difficulty, metaheuristics are the most appropriate methods to tackle large and realistic instances. The goal of this paper is to highlight the key ideas for designing VRP metaheuristics according to the following criteria: efficiency, speed, robustness, and ability to take advantage of the problem structure. Such elements can obviously be used to build solution methods for other combinatorial optimization problems, at least in the deterministic field.

Numerical Study of Cyclic Behavior of Shallow Foundations on Sand Reinforced with Geogrid and Grid-Anchor

When the foundations of structures are subjected to cyclic loading with amplitudes less than their permissible load, there is often concern about the amount of uniform and non-uniform settlement of such structures. Storage tank foundations subjected to numerous filling and discharging cycles and railway ballast courses under repeated transportation loads are examples of such conditions. This paper deals with the effects of using a new generation of reinforcement, Grid-Anchor, to reduce the permanent settlement of these foundations under different proportions of the ultimate load. Other factors, such as the type and number of reinforcements and the number of loading cycles, are studied numerically. Numerical models were built using the Plaxis3D Tunnel finite element code. The results show that by using Grid-Anchor and increasing the number of reinforcement layers in the same proportion as the applied cyclic load, the permanent settlement decreases by up to 42% relative to the unreinforced condition, depending on the number of reinforcement layers and the percentage of applied load, and the number of loading cycles needed to reach a constant value of dimensionless settlement decreases by up to 20% relative to the unreinforced condition.

Ensemble Learning with Decision Tree for Remote Sensing Classification

In recent years, a number of works proposing the combination of multiple classifiers to produce a single classification have been reported in the remote sensing literature. The resulting classifier, referred to as an ensemble classifier, is generally found to be more accurate than any of the individual classifiers making up the ensemble. As accuracy is the primary concern, much of the research in the field of land cover classification is focused on improving classification accuracy. This study compares the performance of four ensemble approaches (boosting, bagging, DECORATE and random subspace) with a univariate decision tree as the base classifier. Two training datasets, one without any noise and the other with 20 percent noise, were used to judge the performance of the different ensemble approaches. Results with the noise-free data set suggest an improvement of about 4% in classification accuracy with all ensemble approaches in comparison to the results provided by the univariate decision tree classifier. The highest classification accuracy, 87.43%, was achieved by the boosted decision tree. A comparison of results with the noisy data set suggests that the bagging, DECORATE and random subspace approaches work well with these data, whereas the performance of the boosted decision tree degrades: it achieves a classification accuracy of 79.7%, which is even lower than the 80.02% achieved by the unboosted decision tree classifier.

Evaluation of a New Method for Detection of Kidney Stone during Laparoscopy Using 3D Conceptual Modeling

Minimally invasive surgery (MIS) is now widely used as the preferred choice for various types of operations. The need to detect various tactile properties justifies the key role of tactile sensing, which is currently missing in MIS. Laparoscopy is one of the methods of minimally invasive surgery that can be used in kidney stone removal. At present, determining the exact location of the stone during laparoscopy is a limitation of this method for which no scientific solution has yet been found. Artificial tactile sensing is a new method for obtaining the characteristics of a hard object embedded in soft tissue. Artificial palpation is an important application of artificial tactile sensing that can be used in different types of surgery. In this study, a new method for determining the exact location of a stone during laparoscopy is presented. The effects of the presence of a stone on the surface of the kidney were investigated using a conceptual 3D model of a kidney containing a simulated stone. By imitating palpation and modeling it conceptually, the indications of the stone's presence that appear on the surface of the kidney were determined. A number of different cases were created and solved by the software. Using stress distribution contours and stress graphs, it is shown that the stress patterns created on the surface of the kidney reveal not only the presence of a stone inside, but also its exact location. Three-dimensional analysis thus leads to a novel method of predicting the exact location of the stone and can be directly applied to the incorporation of tactile sensing in artificial palpation, helping surgeons in non-invasive procedures.

State Feedback Speed Controller for Turbocharged Diesel Engine and Its Robustness

In this paper, full state feedback controllers capable of regulating and tracking the speed trajectory are presented. A previously published fourth-order nonlinear mean value model of a 448 kW turbocharged diesel engine is used for the purpose. To design the controllers, the nonlinear model is linearized and represented in state-space form. Full state feedback controllers capable of meeting the varying speed demands of drivers are presented. The main focus is to investigate the sensitivity of the controller to perturbations in the parameters of the original nonlinear model. The suggested controller is shown to be highly insensitive to the parameter variations. This indicates that the controller is likely to perform with the same accuracy even after significant wear and tear of the engine from years of use.

Simulation and Parameterization by the Finite Element Method of a C Shape Electromagnet for Application in the Characterization of Magnetic Properties of Materials

This article presents the simulation, parameterization and optimization of a C-shaped electromagnet intended for the study of the magnetic properties of materials. The electromagnet consists of a C-shaped yoke, which provides self-shielding to minimize losses of magnetic flux density, two poles of high magnetic permeability, and power coils wound on the poles. The main physical variable studied was the static magnetic flux density in a column within the gap between the poles, with a square cross section of 4 cm² and a length of 5 cm. The goal was to find a set of parameters that achieves a uniform magnetic flux density of 1×10⁴ Gauss or more in the column when the system operates at room temperature with a current consumption not exceeding 5 A. By means of a magnetostatic analysis using the finite element method, the magnetic flux density and the distribution of the magnetic field lines were visualized and quantified. From the results obtained by simulating an initial configuration of the electromagnet, a structural optimization of the geometry of the adjustable caps for the ends of the poles was performed. The effect of the magnetic permeability of the soft magnetic materials used in the pole system, such as low-carbon steel (0.08% C), Permalloy (45% Ni, 54.7% Fe) and Mumetal (21.2% Fe, 78.5% Ni), was also evaluated. The intensity and uniformity of the magnetic field in the gap showed a strong dependence on the factors described above. The magnetic field achieved in the column was uniform, and its magnitude ranged between 1.5×10⁴ Gauss and 1.9×10⁴ Gauss depending on the pole material, with the possibility of increasing the magnetic field by choosing a suitable cap geometry, introducing a cooling system for the coils and adjusting the spacing between the poles. This makes the device a versatile and scalable tool for generating the magnetic field needed to perform magnetic characterization of materials by techniques such as vibrating sample magnetometry (VSM), Hall-effect and Kerr-effect magnetometry, among others. Additionally, a CAD design of the modules of the electromagnet is presented in order to facilitate the construction and scaling of the physical device.

Comparative Study on Recent Integer DCTs

This paper presents a comparative study of recent integer DCTs and a new method for constructing a low-sensitivity structure of the integer DCT for colored input signals. The method uses the sensitivity of the multiplier coefficients to finite word length as an indicator of how word length truncation affects the quality of the output signal. The sensitivity is also evaluated theoretically as a function of the auto-correlation and covariance matrix of the input signal. The structure of the integer DCT algorithm is optimized by combining the lower-sensitivity lifting structure types of the IRT. It is evaluated by the sensitivity of the multiplier coefficients to finite word length, expressed as a function of the covariance matrix of the input signal. The effectiveness of the optimum combination of IRT in the integer DCT algorithm is confirmed by the quality improvement compared with the existing case. As a result, the optimum combination of IRT in each integer DCT algorithm evidently improves output signal quality while remaining compatible with the existing one.
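To make the idea of a lifting structure concrete, the sketch below shows one common way to turn a plane rotation into an exactly invertible integer-to-integer map. It assumes that IRT stands for an integer rotation transform, which the abstract does not spell out. It is a generic three-step lifting factorization and is not taken from the paper. The rotation by θ is written as three shears with multipliers tan(θ/2), sin θ and tan(θ/2), and each shear output is rounded. The function names are hypothetical.

```python
import math


def lifting_rotate(x: int, y: int, theta: float) -> tuple:
    """Approximate rotation of integer point (x, y) by theta using three rounded lifting steps."""
    t = math.tan(theta / 2.0)
    s = math.sin(theta)
    x = x - round(t * y)  # first shear
    y = y + round(s * x)  # second shear
    x = x - round(t * y)  # third shear
    return x, y


def lifting_rotate_inverse(x: int, y: int, theta: float) -> tuple:
    """Undo lifting_rotate exactly by reversing the steps with opposite signs."""
    t = math.tan(theta / 2.0)
    s = math.sin(theta)
    x = x + round(t * y)
    y = y - round(s * x)
    x = x + round(t * y)
    return x, y


theta = math.pi / 4
rotated = lifting_rotate(100, 0, theta)
restored = lifting_rotate_inverse(*rotated, theta)
print(rotated, restored)
```

Because each step adds a rounded function of the other coordinate, the inverse subtracts the same rounded value and recovers the input exactly. The rounding of the multiplier products is where finite word length affects output quality, which is the kind of sensitivity the abstract discusses.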

Shape-Based Image Retrieval Using Shape Matrix

Retrieving images by shape similarity, given a template shape, is particularly challenging, owing to the difficulty of deriving a similarity measure that closely conforms to the common human perception of similarity. In this paper, a new method for the representation and comparison of shapes is presented, based on the shape matrix and the snake model. It is invariant to scaling, rotation and translation, and it can retrieve shape images with some missing or occluded parts. In this method, the deformation the template undergoes to match the shape images and the matching degree are used to evaluate the similarity between them.

Analysis of Explosive Shock Wave and its Application in Snow Avalanche Release

Avalanche velocity (from start zone to track zone) has been estimated in the present model for an avalanche triggered artificially by an explosive device. The model was initially developed from the concepts of micro-continuum theories [1], underwater explosions [2] and fracture mechanics [3], with appropriate changes for the present model. The model has been computed for different values of slab depth R, slope angle θ, snow density ρ, viscosity μ, eddy viscosity η* and couple stress parameter η. The applicability of the present model to avalanche forecasting is highlighted.

Semantic Mobility Channel (SMC): Ubiquitous and Mobile Computing Meets the Semantic Web

With the advent of emerging personal computing paradigms such as ubiquitous and mobile computing, Web contents are becoming accessible from a wide range of mobile devices. Since these devices do not have the same rendering capabilities, Web contents need to be adapted for transparent access from a variety of client agents. Such content adaptation is applied to either an individual element or a set of consecutive elements in a Web document and results in better rendering and faster delivery to the client device. Nevertheless, Web content adaptation sets new challenges for semantic markup. This paper presents an advanced component platform, called SMC, that enables the development of mobility applications and services according to a channel model based on the principles of Service-Oriented Architecture (SOA). It then describes the potential for integration with the Semantic Web through a novel framework of external semantic annotation that prescribes a scheme for representing semantic markup files and a way of associating Web documents with these external annotations. The role of semantic annotation in this framework is to describe the contents of the individual documents themselves, ensuring that the semantics are preserved while content rendering is adapted. Semantic Web content adaptation is a way of adding value to Web contents and facilitates repurposing of Web contents (enhanced browsing, Web Services location and access, etc.).

Solving Part Type Selection and Loading Problem in Flexible Manufacturing System Using Real Coded Genetic Algorithms – Part II: Optimization

This paper presents the modeling and optimization of two NP-hard problems in flexible manufacturing systems (FMS): the part type selection problem and the loading problem. Due to the complexity and extent of the problems, the paper is split into two parts. The first part discussed the modeling of the problems and showed how real coded genetic algorithms (RCGA) can be applied to solve them. This second part discusses the effectiveness of the RCGA, which uses an array of real numbers as its chromosome representation. The proposed chromosome representation produces only feasible solutions, which minimizes the computational time the GA needs to push its population toward the feasible search space or to repair infeasible chromosomes. The proposed RCGA improves FMS performance by considering two objectives: maximizing system throughput and maintaining the balance of the system (minimizing system unbalance). The resulting objective values are compared to the optimum values produced by the branch-and-bound method. The experiments show that the proposed RCGA can reach near-optimum solutions in a reasonable amount of time.

On Formalizing Predefined OCL Properties

The ability of UML to handle the modeling of complex industrial software applications has increased its popularity to the point of becoming the de facto design language. Although its rich graphical notation, naturally oriented towards object-oriented concepts, facilitates understanding, it hardly succeeds in capturing all domain-specific aspects in a satisfactory way. OCL, as the standard language for expressing additional constraints on UML models, has great potential to help improve expressiveness. Unfortunately, it suffers from weak formalization due to its poorly defined semantics, which creates many obstacles to building tool support and thus to its application in industry. For this reason, many research efforts have sought to formalize OCL expressions using a more rigorous approach. Our contribution complements this work, since it focuses specifically on OCL predefined properties, which are an important part of the construction of OCL expressions. Using formal methods, we mainly succeed in rigorously expressing OCL predefined functions.

One-Class Support Vector Machines for Aerial Images Segmentation

Interpretation of aerial images is an important task in various applications. Image segmentation can be viewed as the essential step for extracting information from aerial images. Among the many segmentation methods that have been developed, clustering has been extensively investigated and used. However, determining the number of clusters in an image is inherently difficult, especially when a priori information on the aerial image is unavailable. This study proposes a support vector machine approach for clustering aerial images. Three cluster validity indices, a distance-based index, the Davies-Bouldin index, and the Xie-Beni index, are used as quantitative measures of the quality of the clustering results. The effectiveness of these indices and of various parameter settings of the proposed method is compared. Experimental results are provided to illustrate the feasibility of the proposed approach.
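Of the validity indices listed in the abstract, the Davies-Bouldin index is simple enough to sketch directly from its standard definition: for each cluster, take the worst-case ratio of summed within-cluster scatter to centroid separation, then average over clusters. Lower values indicate more compact, better separated clusters. The code below is a generic illustration with made-up 2-D points. It is not the authors' implementation and uses Euclidean distance as an assumption.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points)."""
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance from each point to its own cluster centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical toy data: the same two clusters, far apart and close together.
far_apart = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
close_together = [[(0.0, 0.0), (0.0, 2.0)], [(4.0, 0.0), (4.0, 2.0)]]

db_far = davies_bouldin(far_apart)
db_close = davies_bouldin(close_together)
print(db_far, db_close)
```

When the number of clusters is unknown, as in the setting the abstract describes, one would typically compute such an index for several candidate clusterings and prefer the one with the lowest value.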