Extraction of Data from Web Pages: A Vision Based Approach

With the explosive growth of information sources available on the World Wide Web, it has become increasingly difficult to identify the relevant pieces of information, since web pages are often cluttered with irrelevant content such as advertisements, navigation panels and copyright notices surrounding the main content of the page. Hence, tools for mining data regions, data records and data items need to be developed in order to provide value-added services. Currently available automatic techniques for mining data regions from web pages are still unsatisfactory because of their poor performance and tag dependence. In this paper, a novel method to extract data items from web pages automatically is proposed. It comprises two steps: (1) identification and extraction of the data regions based on visual clues; (2) identification of data records and extraction of data items from a data region. For step 1, a novel and more effective method is proposed that uses visual clues to find the data regions formed by all types of tags. For step 2, a more effective method, namely Extraction of Data Items from web Pages (EDIP), is adopted to mine data items. The EDIP technique is a list-based approach in which the list is a linear data structure. The proposed technique is able to mine non-contiguous data records and can correctly identify data regions irrespective of the type of tag in which they are bound. Our experimental results show that the proposed technique performs better than the existing techniques.

The Status Info Processing and Keeping System for Production Equipment

In a globalized production and logistics environment, manufacturers need to reduce product development intervals and lead times, respond faster to orders, conform to quality standards, ensure fair tracking, boost information exchange with customers and partners, and cope with changes in the management environment. They are therefore in dire need of an information management system for their manufacturing environments. Many information systems have been designed to manage the condition or operation of equipment in the field, but existing systems have a decentralized architecture that is not unified. Moreover, these systems cannot effectively handle the status data extraction process when they encounter a problem related to protocols or to changes in the equipment or its settings. This paper therefore introduces a system for processing and saving the status information of production equipment, which uses standard representation formats to enable flexible responses to, and support for, variations in the field equipment. The system can be used in a variety of manufacturing and equipment settings and is capable of interacting with higher-tier systems such as MES.

Improving Convergence of Parameter Tuning Process of the Additive Fuzzy System by New Learning Strategy

An additive fuzzy system comprising m rules with n inputs and p outputs in each rule has at least t = m(2n + 2p + 1) parameters that need to be tuned. The system consists of a large number of if-then fuzzy rules and takes a long time to tune its parameters, especially when the number of training data samples is large. In this paper, a new learning strategy is investigated to cope with this obstacle. Parameters that tend toward constant values during the learning process are fixed early and are not tuned again for the rest of the learning time. Experiments based on applications of the additive fuzzy system in function approximation demonstrate that the proposed approach reduces the learning time and hence improves convergence speed considerably.
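
As a rough illustration of the parameter count quoted above, the following Python sketch evaluates the lower bound m(2n + 2p + 1) and shows one possible way to decide that a parameter has settled. The function names, the window length and the tolerance are assumptions made for illustration only; the abstract does not state the criterion the authors actually use to fix a parameter.

```python
def min_tunable_parameters(m: int, n: int, p: int) -> int:
    """Lower bound m(2n + 2p + 1) on the number of tunable parameters
    for m rules with n inputs and p outputs per rule."""
    if min(m, n, p) < 1:
        raise ValueError("m, n and p must all be positive integers")
    return m * (2 * n + 2 * p + 1)


def is_settled(values, window=5, tol=1e-3):
    """Hypothetical freezing test: a parameter counts as settled when its
    last `window` recorded values span a range smaller than `tol`."""
    if len(values) < window:
        return False
    recent = values[-window:]
    return max(recent) - min(recent) < tol


if __name__ == "__main__":
    # 10 rules, 2 inputs, 1 output: 10 * (4 + 2 + 1) parameters.
    print(min_tunable_parameters(10, 2, 1))
    # A trajectory that flattens out near 0.5 would be frozen.
    print(is_settled([1.0, 0.5, 0.5001, 0.5002, 0.5001, 0.5, 0.5001]))
```

Under this assumed test, a parameter flagged as settled would simply be skipped in later update steps, which is where the saving in learning time would come from.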

An Environmental Impact Tool to Assess National Energy Scenarios

The Long-range Energy and Alternatives Planning (LEAP) energy planning system has been developed for South Africa, for the 2005 base year and a limited number of plausible future scenarios that may have significant implications (negative or positive) in terms of environmental impacts. The system quantifies the national energy demand for the domestic, commercial, transport, industry and agriculture sectors, the supply of electricity and liquid fuels, and the resulting emissions. The South African National Energy Research Institute (SANERI) identified the need to develop an environmental assessment tool, based on the LEAP energy planning system, to provide decision-makers and stakeholders with the necessary understanding of the environmental impacts associated with different energy scenarios. A comprehensive analysis of indicators that are used internationally and in South Africa was done and the available data was accessed to select a reasonable number of indicators that could be utilized in energy planning. A consultative process was followed to determine the needs of different stakeholders on the required indicators and also the most suitable form of reporting. This paper demonstrates the application of Energy Environmental Sustainability Indicators (EESIs) as part of the developed tool, which assists with the identification of the environmental consequences of energy generation and use scenarios and thereby promotes sustainability, since environmental considerations can then be integrated into the preparation and adoption of policies, plans, programs and projects. Recommendations are made to refine the tool further for South Africa.

Communicative Competence in Technical Oral Presentation: That “Magic” Perceived by ESL Educators versus Content Experts

To date, English as a Second Language (ESL) educators involved in teaching language and communication to engineering students face an uphill task in developing graduate communicative competency. This challenge is accentuated by the apparent lack of English for Specific Purposes (ESP) materials for engineering students in the engineering curriculum. As such, most ESL educators are forced to play multiple roles: they act as curriculum designers, material writers and teachers, with limited knowledge of the disciplinary content. Previous research indicates that prospective professional engineers should possess several subsets of competency: technical, linguistic oral immediacy, meta-cognitive and rhetorical explanatory competence. Another study revealed that engineering students need to be equipped with technical and linguistic oral immediacy competence. However, little is known about whether these competency needs are in line with the educators' perceptions of communicative competence. This paper examines the best mix of communicative competence subsets that create the magic for engineering students in technical oral presentations. For the purpose of this study, two groups of educators were interviewed: language and communication lecturers involved in teaching a speaking course, and content experts who assess students' technical oral presentations at tertiary level. The findings indicate that these two groups differ in their perceptions

Design and Simulation of a New Self-Learning Expert System for Mobile Robot

In this paper, we present a novel technique called Self-Learning Expert System (SLES). Unlike a conventional expert system, which needs an expert to impart experience and knowledge in order to create the knowledge base, this technique tries to acquire the experience and knowledge automatically. To show the technique at work, a simulation of a mobile robot navigating through an environment with obstacles is implemented in Visual Basic. The mobile robot moves through this area without colliding with any obstacle and saves the path that it took. If the mobile robot has to go through a similar environment again, it applies this experience to move through more quickly without having to check for collisions.

Hazard Contributing Factors Classification for Petrol Fuel Station

A petrol fuel station (PFS) poses potential hazards to the people, assets, environment and reputation of an operating company. Fire hazards, static electricity and air pollution evoked by aliphatic and aromatic organic compounds are major causes of accidents and incidents at fuel stations. Factors such as carelessness, maintenance, housekeeping, slips, trips and falls, transportation hazards, major and minor injuries, robbery and snake bites have the potential to create unsafe conditions. The level of risk of these hazards varies according to location and country, and the emphasis that governments place on safety considerations varies around the world. The safety records of developed countries are much better than those of developing countries. There is no significant approach available to highlight unsafe acts and unsafe conditions during the operation and maintenance of fuel stations. Fuel stations are among the most commonly available facilities that contain flammable and hazardous materials. Because they operate continuously, they pose various hazards to the people, environment and assets of an organization. To control these hazards, a specific approach is needed. PFS operation is unique compared with other businesses: for smooth operation it demands the involvement of the operating company, the contractor and the operator group. This study addresses the hazard contributing factors that have the potential to make PFS operation risky. One year of data was collected, 902 activities were analyzed, and comparisons were made to highlight significant contributing factors. The study will help PFS outlet marketing companies to make their fuel station operations safer. It will also help health, safety and environment (HSE) professionals to close the existing gaps in safety matters at PFS.

Biological Data Integration using SOA

Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data access, analysis and visualization tools. This research suggests the use of Service Oriented Architecture (SOA) to integrate biological data from different data sources. This work shows that SOA can solve the problems facing the integration process and examines whether biologists can thereby access biological data more easily. There are several methods to implement SOA, but web services are the most popular. The Microsoft .NET Framework is used to implement the proposed architecture.

Effective Methodology for Security Risk Assessment of Computer Systems

Today, computer systems are increasingly complex and face growing security risks. Security managers need effective security risk assessment methodologies that model the increasing complexity of current computer systems well while keeping the complexity of the assessment procedure low. This paper provides a brief analysis of common security risk assessment methodologies, leading to the selection of a methodology that fulfills these requirements. A detailed analysis of the most effective methodology is then carried out, with numerical examples that demonstrate how easy it is to use.

A Distributed Cognition Framework to Compare E-Commerce Websites Using Data Envelopment Analysis

This paper presents an approach based on the adoption of a distributed cognition framework and a non-parametric multicriteria evaluation methodology, Data Envelopment Analysis (DEA), designed specifically to compare e-commerce websites from the consumer/user viewpoint. In particular, the framework considers a website's relative efficiency as a measure of its quality and usability. A website is modelled as a black box capable of providing the consumer/user with a set of functionalities. When the consumer/user interacts with the website to perform a task, he/she is involved in a cognitive activity, sustaining a cognitive cost to search, interpret and process information, and experiencing a sense of satisfaction. The degree of ambiguity and uncertainty he/she perceives and the search time needed determine the size of the effort, and hence the cognitive cost, that he/she has to sustain to perform the task. Conversely, performing the task and achieving the result induce a sense of gratification, satisfaction and usefulness. In total, 9 variables are measured, classified into 3 website macro-dimensions (user experience, site navigability and structure). The framework is implemented to compare 40 websites of businesses performing electronic commerce in the information technology market. A questionnaire to collect subjective judgements for the websites in the sample was purposely designed and administered to 85 university students enrolled in computer science and information systems engineering undergraduate courses.

Investigating Intrusion Detection Systems in MANET and Comparing IDSs for Detecting Misbehaving Nodes

As mobile ad hoc networks (MANETs) have different characteristics from wired networks and even from standard wireless networks, there are new security challenges that need to be addressed. Because of their unique features, such as their open nature, lack of infrastructure and central management, node mobility and dynamically changing topology, attack prevention methods alone are not enough. Intrusion detection is therefore one possible way of recognizing an attack before the system is penetrated. However, techniques for intrusion detection in traditional wireless networks are not suitable for MANETs. In this paper, we classify the architectures of the intrusion detection systems that have so far been introduced for MANETs, and then present and compare existing intrusion detection techniques for MANETs. We then indicate important future research directions.

Review of Surface Electromyogram Signals: Its Analysis and Applications

Electromyography (EMG) is the study of muscle function through analysis of the electrical activity produced by muscles. This electrical activity, which is displayed in the form of a signal, is the result of the neuromuscular activation associated with muscle contraction. The most common techniques of EMG signal recording use surface electrodes and needle/wire electrodes, where the latter are usually used when deep muscle is of interest. This paper focuses on the surface electromyogram (SEMG) signal. During SEMG recording, several problems have to be countered, such as noise, motion artifacts and signal instability. Various signal processing techniques have therefore been implemented to produce a reliable signal for analysis. The SEMG signal finds broad application, particularly in the biomedical field. It has been analyzed and studied for various purposes, such as neuromuscular disease, enhancement of muscular function and human-computer interfaces.

Adaptive Hierarchical Key Structure Generation for Key Management in Wireless Sensor Networks using A*

Wireless sensor networks have a wide spectrum of civil and military applications that call for secure communication, such as terrorist tracking and target surveillance in hostile environments. For secure communication in these application areas, we propose a method for generating a hierarchical key structure for efficient group key management. In this paper, we apply the A* algorithm to generate a hierarchical key structure by considering historical data on the ratio of additions and evictions of sensor nodes in the location where the sensor nodes are deployed. The key tree structure generated in this way provides an energy-efficient way of managing the group key when addition and eviction events occur. The A* algorithm tries to minimize the number of messages needed for group key management based on the historical data. Experiments with the tree show the efficiency of the proposed method.
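
For readers unfamiliar with A*, the following Python sketch shows the generic search on a small weighted graph. It is only an illustration of the algorithm named in the abstract: how the authors encode candidate key trees as search states and rekeying messages as edge costs is not described here, so the toy graph, the zero heuristic and the function name `a_star` are all assumptions.

```python
import heapq


def a_star(graph, start, goal, heuristic):
    """Generic A* search.

    graph: dict mapping a node to a list of (neighbor, edge_cost) pairs.
    heuristic: function giving an admissible estimate of the remaining cost.
    Returns (total_cost, path) or None when the goal is unreachable.
    """
    # Heap entries are (estimated total cost, cost so far, node, path).
    open_heap = [(heuristic(start), 0, start, [start])]
    best_cost = {start: 0}
    while open_heap:
        _, cost, node, path = heapq.heappop(open_heap)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # a cheaper route to this node was already expanded
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                heapq.heappush(
                    open_heap,
                    (new_cost + heuristic(neighbor), new_cost,
                     neighbor, path + [neighbor]),
                )
    return None


if __name__ == "__main__":
    # Hypothetical toy graph; in a key-management setting an edge cost
    # could stand for the number of messages a structural change requires.
    toy_graph = {
        "A": [("B", 1), ("C", 4)],
        "B": [("C", 2), ("D", 5)],
        "C": [("D", 1)],
        "D": [],
    }
    print(a_star(toy_graph, "A", "D", lambda node: 0))
```

With a heuristic that always returns zero the search reduces to Dijkstra's algorithm; a more informed but still admissible estimate would expand fewer states.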

The Catalytic Effects of Potassium Dichromate on the Pyrolysis of Polymeric Mixtures Part II: Hazelnut Shell and Ultra-high Molecular Weight Polyethylene and their Blend Cases

Renewable energy sources have gained the utmost urgency because the environment must be preserved for sustainable development. Pyrolysis is a highly promising process for recycling and for recovering valuable chemicals from wastes. Here, the co-pyrolysis of hazelnut shell with ultra-high molecular weight polyethylene was carried out catalytically and non-catalytically at 500 and 650 °C. Potassium dichromate was added in certain amounts to act as a catalyst. The quantities of liquid, solid and gas products were determined by gravimetry. As the main result, remarkable increases in gasification were observed with this catalyst for the pure components and their blends, especially at 650 °C. The increase in gas product quantity was offset mainly by decreases in the solid products and, in some cases, additionally in the liquid products. These observations may stem mainly from the activation of carbon-carbon bonds rather than carbon-hydrogen bonds by potassium dichromate. The catalytic effects of potassium dichromate on HS: PEO and HS: UHMWPE co-pyrolysis were also compared.

Health Post: A Sustainable Prototype for the Third World

This paper concerns the study of sustainable construction materials applied to the "Health Post", a prototype for primary health care in isolated areas of the world. It is suited to the social and climatic context of Sub-Saharan Africa; however, it could be transferred to other countries with similarly urgent needs. The idea is to create a Health Post with local construction materials that have a low environmental impact and to employ the local workforce, allowing the reuse of traditional building techniques and lowering production and transport costs. The aim of the Primary Health Care Centre is to be a flexible and expandable structure based on a modular form that can be repeated several times to expand its existing functions. In this way it could be not only a health care centre but also a socio-cultural facility.

Deterministic Random Number Generators for Online Applications

Cryptography, image watermarking and e-banking are filled with apparent oxymora and paradoxes. Random sequences are used as keys to encrypt the information used as a watermark during embedding and also to extract the watermark during detection. Keys are likewise heavily used in 24x7x365 banking operations. A deterministic random sequence is therefore very useful for online applications: to obtain the same random sequence, the same seed must be supplied to the generator. Many researchers have used Deterministic Random Number Generators (DRNGs) for cryptographic applications and Pseudo Noise random sequences (PNs) for watermarking. Even though PN sequences have some weaknesses against attacks, the research community has mostly used them in digital watermarking. On the other hand, DRNGs have not been widely used in online watermarking because of their computational complexity and lack of robustness. We have therefore devised a new DRNG design based on Pi-series to make it useful for online cryptographic, digital watermarking and banking applications.
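
The seed property mentioned above (the same seed reproduces the same sequence) can be demonstrated with a short Python sketch. This is not the Pi-series design proposed in the paper: it uses Python's standard `random.Random` (a Mersenne Twister), which is deterministic but not cryptographically secure, and the function name `keyed_sequence` is a hypothetical choice for illustration.

```python
import random


def keyed_sequence(seed: int, length: int) -> list[int]:
    """Return `length` pseudo-random bytes derived deterministically from `seed`.

    Illustration only: random.Random is reproducible but must not be used
    where cryptographic strength is required.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.getrandbits(8) for _ in range(length)]


if __name__ == "__main__":
    embed_key = keyed_sequence(2024, 16)   # used when embedding a watermark
    detect_key = keyed_sequence(2024, 16)  # regenerated later for detection
    print(embed_key == detect_key)         # same seed, same sequence
```

Because the detector only needs the seed, the sequence itself never has to be stored or transmitted, which is what makes deterministic generators attractive for online use.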

A Survey of Job Scheduling and Resource Management in Grid Computing

Grid computing is a form of distributed computing that involves coordinating and sharing computational power, data storage and network resources across dynamic and geographically dispersed organizations. Scheduling onto the Grid is NP-complete, so there is no best scheduling algorithm for all grid computing systems. An alternative is to select an appropriate scheduling algorithm for a given grid environment based on the characteristics of the tasks, machines and network connectivity. Job and resource scheduling is one of the key research areas in grid computing. The goal of scheduling is to achieve the highest possible system throughput and to match application needs with the available computing resources. The motivation of this survey is to encourage novice researchers in the field of grid computing, so that they can easily understand the concept of scheduling and contribute to developing more efficient scheduling algorithms. This will help interested researchers to carry out further work in this thrust area of research.

Primer Design with Specific PCR Product using Particle Swarm Optimization

Before performing polymerase chain reactions (PCR), a feasible primer set is required. Many primer design methods have been proposed to design a feasible primer set. However, the majority of these methods require a relatively long time to obtain an optimal solution, since large quantities of template DNA need to be analyzed. Furthermore, the designed primer sets usually do not provide a specific PCR product. In recent years, evolutionary computation has been applied to PCR primer design and has yielded promising results. In this paper, a particle swarm optimization (PSO) algorithm is proposed to solve primer design problems associated with providing a specific product for PCR experiments. A test set of the gene CYP1A1, which is associated with a heightened lung cancer risk, was analyzed, and accuracy and running time were compared with those of the genetic algorithm (GA) and the memetic algorithm (MA). The comparison indicated that the proposed PSO method for primer design finds optimal or near-optimal primer sets and effective PCR products in a relatively short time.
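
To make the optimization method concrete, here is a minimal sketch of a standard global-best PSO in Python. It is not the authors' primer design algorithm: the abstract does not give their particle encoding or fitness function, so a simple continuous objective (the sum of squares) stands in for a primer fitness score, and the parameter values `w`, `c1`, `c2` and the function name `pso_minimize` are assumptions.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; returns (best_position, best_value, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm-wide best
    history = [gbest_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Placeholder objective standing in for a primer fitness function."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    best, value, trace = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
    print(best, value)
```

In a primer design setting each particle would instead encode candidate primer positions and lengths, and the objective would penalize violations of constraints such as melting temperature, GC content and product specificity; those details would have to come from the full paper.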

Hybrid Energy Supply with Dominantly Renewable Option for Small Industrial Complex

The power deficit relative to electricity demand has reached almost 30% for consumers in the last few years. This is reflected in the continually increasing price of electricity; today the price for small industry is almost 110 Euro/MWh. The high price is an additional problem for owners during the economic crisis and is reflected in higher prices of goods. The paper analyzes the energy needs of a real agro complex in Macedonia, a private winery with a capacity of over 2 million liters a year and its own grape and fruit fields. The existing power supply is from the grid through a 10/0.4 kV transformer. The geographical and meteorological conditions at the winery location give an opportunity to include renewables as a power supply option for the winery complex. After observation of the monthly energy needs of the winery, the base scenario is the existing power supply from the distribution grid. The electricity bill for small industry has three components: electricity in the high and low tariffs, in kWh, and the power engaged for the technological process of production, in kW. These three components make up the total electricity bill, which is over 110 Euro/MWh, a price at which renewable options are nearly competitive. On the other hand, investment costs for renewables, especially photovoltaics (PV), are tending to decrease, with a price of nearly 1.5 Euro/W. This means that PV can be a real power supply option for small industrial capacities (under 500 kW installed power). Therefore, the other scenarios include the PV option, and the last one also includes wind. The paper presents the following scenarios for the power supply of the winery:
• Base scenario of existing conventional power supply from the grid
• Scenario with implementation of photovoltaic renewable power
• Scenario with implementation of photovoltaic and wind renewable power
The total power installed in the winery is nearly 570 kW, but the maximum demand is around 250 kW. At the end of the full paper, some of the results from the scenarios will be presented. The paper also covers the environmental impacts of the renewable scenarios, as well as the investment needs and the revenues from renewables.

Introduction of Open-Source e-Learning Environment and Resources: A Novel Approach for Secondary Schools in Tanzania

The concept of e-Learning is now emerging in Sub-Saharan African countries like Tanzania. Due to economic constraints and other social and cultural factors faced by these countries, the use of Information and Communication Technology (ICT) is increasing at a very slow pace. The threat of the digital divide propelled the Government of Tanzania to put in place the national ICT Policy in 2003, which defines the direction of all ICT activities nationally. Among the main focus areas is the use of ICT in education, since the development of any country requires the creation of a knowledge-based society. This paper discusses the initiatives made so far to introduce the use of ICT tools to some secondary schools using open source software in e-content development to facilitate a self-learning environment