Abstract: The majority of today's IR systems base the IR task on two main processes: indexing and searching. There exists a special group of dynamic IR systems where both processes happen simultaneously; such a system discards obsolete information and inserts new information while still answering user queries. In these dynamic, time-critical text document databases, it is often important to modify index structures quickly as documents arrive. This paper presents a method for dynamization which may be used for this task. Experimental results show that the dynamization process is feasible and that it guarantees the response time for the query operation and for index updates.
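The abstract does not say which dynamization method is used. As an illustration only, the sketch below shows one classical technique, the logarithmic method (Bentley-Saxe), applied to a toy inverted index: static sub-indexes holding 1, 2, 4, ... documents are merged on insertion, obsolete documents are hidden by a tombstone set, and queries consult every occupied level. The class and function names are hypothetical, and document ids are assumed to be unique.

```python
from collections import defaultdict


def build_static_index(docs):
    """Build an immutable inverted index {term: sorted doc ids} from {doc_id: text}."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


class LogarithmicIndex:
    """Dynamic index made of static sub-indexes (illustrative, not the paper's method)."""

    def __init__(self):
        # levels[i] is None or a (docs, index) pair holding 2**i documents.
        self.levels = []
        # Tombstones for obsolete documents; they are filtered out at query time.
        self.deleted = set()

    def insert(self, doc_id, text):
        carry = {doc_id: text}
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append(None)
            if self.levels[i] is None:
                # Free slot: rebuild a static index over the carried documents.
                self.levels[i] = (carry, build_static_index(carry))
                return
            # Occupied slot: merge its documents into the carry and move up a level.
            carry = {**self.levels[i][0], **carry}
            self.levels[i] = None
            i += 1

    def delete(self, doc_id):
        self.deleted.add(doc_id)

    def search(self, term):
        term = term.lower()
        hits = []
        for level in self.levels:
            if level is not None:
                hits.extend(level[1].get(term, []))
        return sorted(d for d in hits if d not in self.deleted)
```

With this layout each document is rebuilt into a larger sub-index only a logarithmic number of times, which is what keeps insertion cost bounded while queries remain answerable between insertions.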
Abstract: This paper presents the results of preliminary
assessment of water quality along the coastal areas in the vicinity of
Left Bank Outfall Drainage (LBOD) and Tidal Link Drain (TLD) in
Sindh province after cyclone 2A struck in 1999. Water
samples were collected from various RDs of the Tidal Link Drain and
lakes from September 2001 to April 2002 and were analysed for
salinity, nitrite, phosphate, ammonia, silicate and suspended material
in water. The results of the study showed considerable variations in
water quality depending upon the location along the coast in the
vicinity of LBOD and RDs. The salinity ranged from 4.39 to 65.25
ppt in Tidal Link Drain samples and from 2.4 to 38.05 ppt in samples
collected from lakes. The values of suspended material at various
RDs of Tidal Link Drain ranged between 56.6–2134 ppm and at the
lakes between 68–297 ppm. The data of continuous monitoring at
RD–93 showed the range of PO4 (8.6–25.2 μg/l), SiO3 (554.96–1462
μg/l), NO2 (0.557.2–25.2 μg/l) and NH3 (9.38–23.62 μg/l). The
concentration of nutrients in water samples collected from different
RDs was found in the range of PO4 (10.85 to 11.47 μg/l), SiO3 (1624
to 2635.08 μg/l), NO2 (20.38 to 44.8 μg/l) and NH3 (24.08 to 26.6
μg/l). The Sindh coastal areas, which are situated at the north-western
boundary of the Arabian Sea, are highly vulnerable to flood damage
due to flash floods during the SW monsoon or the impact of sea level rise
and storm surges coupled with cyclones passing through the Arabian Sea
along the Pakistan coast. It is hoped that the data obtained in this study
would act as a database for future investigations and monitoring of
LBOD and Tidal Link Drain coastal waters.
Abstract: In the current age, retrieving relevant information
from massive amounts of data is a challenging job. Over the years,
precise and relevant retrieval of information has attained high
significance. There is a growing need in the market to build systems
that can retrieve multimedia information that precisely meets the
user's current needs. In this paper, we introduce a framework
for refining query results before showing them to the user, using ambient
intelligence, user profile, group profile, user location, time, day, user
device type and extracted features. A prototype tool was also
developed to demonstrate the efficiency of the proposed approach.
Abstract: The need for efficient information retrieval has
increased more than ever in recent years because of the frequent use of
digital information in our lives. A lot of work exists in the area of
textual information, but multimedia information has seen much less
progress. For text-based information, newer technologies such as data
mining and data marts are now in use; they grew out of the
basic concept of the database, which dates from around 1960.
In image search, and especially in image identification,
computerized systems are at a very early stage. Image
search has not progressed as far as text-based
search techniques. One main reason for this is the widespread roots
of image search, in which many areas such as artificial intelligence,
statistics, image processing and pattern recognition play a role. Even
human psychology, perception and cultural diversity have
their share in the design of a good and efficient image recognition
and retrieval system.
A new object-based search technique is presented in this paper,
in which objects in the image are identified on the basis of their
geometrical shapes and other features such as color and texture, and
object co-relation augments the search process.
To focus on object identification, simple images are
selected for this work to reduce the role of segmentation in the overall
process; however, the same technique can also be applied to other
images.
Abstract: The leisure boatbuilding industry has tight profit margins that demand that boats are built to a high quality but at low cost. This requirement means that reduced design times combined with increased use of design for production can lead to large benefits. The evolutionary nature of the boatbuilding industry can lead to heavy reuse of previous vessels in new designs. With the increase in automated tools for concurrent engineering within structural design, it is important that these tools can reuse this information and feed it to designers. The ability to gather this material and parts data accurately is also a key component of these tools. This paper therefore aims to develop an architecture made up of neural networks and databases to feed information effectively to designers based on previous design experience.
Abstract: Data mining and knowledge engineering have become difficult tasks due to the large amount of data now available on the web. The validity and reliability of data have also become a central debate in knowledge acquisition. In addition, acquiring knowledge from different languages has become another concern. Many language translators and corpora have been developed, but their function is usually limited to certain languages and domains. Furthermore, search results from engines using the traditional 'keyword' approach are no longer satisfactory. More intelligent knowledge engineering agents are needed. To address these problems, a system known as the Multilingual Word Semantic Network is proposed. This system adapts a semantic network to organize words according to concepts and relations. The system also adopts open source as its development philosophy to enable native speakers and experts to contribute their knowledge to the system. The contributed words are then defined and linked using lexical and semantic relations. Thus, related words and derivatives can be identified and linked. The outcome of the system implementation contributes to the development of the semantic web and knowledge engineering.
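To make the idea of linking words through lexical and semantic relations concrete, here is a minimal sketch of a multilingual word graph with typed edges. It is not the proposed system: the class name, the relation labels ("translation", "hypernym") and the example words are assumptions chosen purely for illustration.

```python
from collections import defaultdict


class WordNetwork:
    """Toy semantic network: nodes are (language, word) pairs, edges carry a relation."""

    def __init__(self):
        # node -> set of (relation, target node)
        self.edges = defaultdict(set)

    def link(self, source, relation, target, symmetric=False):
        """Add a typed edge; symmetric relations are stored in both directions."""
        self.edges[source].add((relation, target))
        if symmetric:
            self.edges[target].add((relation, source))

    def related(self, node, relation):
        """Return the nodes reachable from `node` through one edge of `relation`."""
        return sorted(t for r, t in self.edges.get(node, ()) if r == relation)


net = WordNetwork()
net.link(("en", "dog"), "translation", ("ms", "anjing"), symmetric=True)
net.link(("en", "dog"), "hypernym", ("en", "animal"))
```

Storing the language as part of each node lets contributors add words in any language without changing the structure; only new edges are needed to connect them to existing concepts.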
Abstract: Databases have become ubiquitous. Almost all IT applications store information in and retrieve information from databases. Retrieving information from a database requires knowledge of technical languages such as Structured Query Language (SQL). However, the majority of users who interact with databases do not have a technical background and are intimidated by the idea of using languages such as SQL. This has led to the development of a few Natural Language Database Interfaces (NLDBIs). An NLDBI allows the user to query the database in a natural language. This paper presents the architecture of a new NLDBI system and its implementation, and discusses the results obtained. In most typical NLDBI systems, the natural language statement is converted into an internal representation based on the syntactic and semantic knowledge of the natural language. This representation is then converted into queries using a representation converter. A natural language query is thus translated into an equivalent SQL query after passing through various stages. The system has been tested on primitive database queries with certain constraints.
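The pipeline described above (natural language statement, internal representation, SQL query) can be sketched with a deliberately tiny rule-based translator. This is an assumption-laden toy, not the paper's system: the single supported sentence pattern, the function names and the example table are all hypothetical, and real NLDBIs rely on far richer syntactic and semantic analysis.

```python
import re
import sqlite3

# One hypothetical sentence pattern: "show <column> of <table> [where <field> is <value>]".
PATTERN = re.compile(
    r"^show (?P<column>\w+) of (?P<table>\w+)"
    r"(?: where (?P<field>\w+) is (?P<value>\w+))?$",
    re.IGNORECASE,
)


def parse(question):
    """Stage 1: turn the question into an internal representation (a dict)."""
    match = PATTERN.match(question.strip().rstrip("?."))
    if match is None:
        raise ValueError("unsupported question")
    return match.groupdict()


def to_sql(rep):
    """Stage 2: convert the internal representation into SQL plus parameters."""
    # Identifiers match \w+ only, so quoting them is safe; the value is bound as a parameter.
    sql = f'SELECT "{rep["column"]}" FROM "{rep["table"]}"'
    params = ()
    if rep["field"] is not None:
        sql += f' WHERE "{rep["field"]}" = ?'
        params = (rep["value"],)
    return sql, params


def answer(conn, question):
    """Run the translated query on an open sqlite3 connection."""
    sql, params = to_sql(parse(question))
    return conn.execute(sql, params).fetchall()
```

For example, `to_sql(parse("Show name of employees where city is Pune?"))` yields the query text `SELECT "name" FROM "employees" WHERE "city" = ?` with `("Pune",)` as its parameters. Keeping the internal representation separate from SQL generation mirrors the representation-converter stage mentioned in the abstract.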
Abstract: One of the major problems in the genomic field is performing sequence comparison on DNA and protein sequences. Executing sequence comparison on DNA and protein data is a computationally intensive task. Sequence comparison is the basic step for all algorithms that assess protein sequence similarity. Parallel computing is an attractive way to provide the computational power needed to speed up the lengthy process of sequence comparison. Our main research aim is to enhance the protein sequence algorithm using the dynamic programming method. In our approach, we parallelize the dynamic programming algorithm using a multithreaded program to perform the sequence comparison, and we also developed a protein database distributed among many PCs using Remote Method Invocation (RMI). As a result, we show how different sizes of protein sequence data and the computation of the scoring matrix of these protein sequences on different numbers of processors affect the processing time and speed, as opposed to sequential processing.
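As a rough illustration of the approach, the sketch below scores a query against several database sequences with a dynamic programming global alignment and distributes the comparisons over a thread pool. The scoring values (match +1, mismatch -1, gap -1), the function names and the use of global rather than local alignment are assumptions; the abstract does not specify them. In CPython, threads mainly illustrate the structure, since CPU-bound work is limited by the global interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor


def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score by dynamic programming, keeping one matrix row at a time."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: align the two characters, gap in b, gap in a.
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def score_database(query, database, workers=4):
    """Compare the query against every database sequence using a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda seq: alignment_score(query, seq), database))
    return dict(zip(database, scores))
```

Each comparison is independent of the others, which is what makes the database scan straightforward to split across threads or machines.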
Abstract: Distinguishing missense nucleotide
substitutions that contribute to harmful effects from those that do not
is a difficult problem, usually addressed through functional in
vivo analyses. In this study, instead of current biochemical methods, the
effects of missense mutations upon protein structure and function
were assayed by means of computational methods and information
from databases. To this end, the effects of new missense
mutations in exon 5 of the PTEN gene upon protein structure and
function were examined. The gene coding for PTEN was identified
and localized on chromosome region 10q23.3 as a tumor
suppressor gene. These methods showed that the
c.319G>A and c.341T>G missense mutations, which were recognized in
patients with breast cancer and Cowden disease, could be pathogenic.
This method could be used for the analysis of missense mutations in other
genes.
Abstract: Clustering large populations is an important problem
when the data contain noise and clusters of different shapes. A good clustering
algorithm should be efficient and sensitive enough to detect such
clusters. Besides space complexity, time complexity also gains
importance as the data size grows. Using hierarchies, we developed a new
algorithm that splits attributes according to their values,
choosing the splitting dimension so as to divide the database
into parts that are as nearly equal as possible. At each node we
calculate certain descriptive statistical features of the data
residing there, and by pruning we generate the natural clusters with a
complexity of O(n).
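The splitting step can be sketched as follows, under stated assumptions: each node stores the count and per-dimension mean of its data, and the splitting dimension is the one whose median cut divides the points most evenly. The function name, the choice of statistics and the `min_size` stopping rule are hypothetical; the pruning step that produces the final clusters is not described in the abstract and is omitted, and this sketch makes no claim to the stated O(n) complexity.

```python
from statistics import mean, median


def build_tree(points, min_size=2):
    """Recursively split a non-empty list of equal-length tuples into balanced halves."""
    node = {
        "count": len(points),
        "mean": [mean(column) for column in zip(*points)],
        "children": None,
    }
    if len(points) <= min_size:
        return node

    best = None
    for dim in range(len(points[0])):
        cut = median(p[dim] for p in points)
        left = [p for p in points if p[dim] <= cut]
        right = [p for p in points if p[dim] > cut]
        imbalance = abs(len(left) - len(right))
        # Skip dimensions where every value is at or below the median (no real split).
        if right and (best is None or imbalance < best[0]):
            best = (imbalance, dim, cut, left, right)

    if best is None:
        return node  # all points coincide, so the node stays a leaf

    _, dim, cut, left, right = best
    node.update(
        split_dim=dim,
        split_value=cut,
        children=(build_tree(left, min_size), build_tree(right, min_size)),
    )
    return node
```

The per-node statistics are what a later pruning pass would inspect, for instance to merge sibling nodes whose summaries are similar.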
Abstract: There is an urgent need to develop novel
Mycobacterium tuberculosis (Mtb) drugs that are active against
drug-resistant bacteria but, more importantly, kill persistent bacteria. Our
study is based on an integrated analysis of metabolic pathways,
small molecule screening and similarity search in the PubChem
database. Metabolic analysis approaches based on Unified weighted
were used for potent target selection. Our results suggest that pantothenate
synthetase (panC) and 3-methyl-2-oxobutanoate
hydroxymethyltransferase (panB) are appropriate drug targets. In our study, we
used pantothenate synthetase because inhibitors of it already exist. We
report the discovery of new antitubercular compounds
through ligand-based approaches using computational tools.
Abstract: This paper illustrates the use of a combined neural
network model for classification of electrocardiogram (ECG) beats.
We present a trainable neural network ensemble approach to develop
a customized electrocardiogram beat classifier in an effort to further
improve the performance of ECG processing and to offer
individualized health care.
We propose a three-stage technique for distinguishing premature
ventricular contraction (PVC) from normal beats and other heart
diseases. The method comprises denoising, feature extraction and
classification. First, we investigate the application of the stationary
wavelet transform (SWT) for noise reduction of the
electrocardiogram (ECG) signals. Then a feature extraction module
extracts 10 ECG morphological features and one timing interval
feature. Finally, a number of multilayer perceptron (MLP) neural
networks with different topologies are designed.
The performance of the different combination methods as well as
the efficiency of the whole system is presented. Among them,
Stacked Generalization as a proposed trainable combined neural
network model possesses the highest recognition rate of around 95%.
Therefore, this network proves to be a suitable candidate for ECG
signal diagnosis systems. ECG samples attributed to the different
ECG beat types were extracted from the MIT-BIH arrhythmia
database for the study.
Abstract: Recently, many researchers have been attracted to retrieval
from multimedia databases using impression words and their values.
Ikezoe's research is representative and uses eight pairs of
opposite impression words. We modified its retrieval interface and
proposed '2D-RIB'. In 2D-RIB, after the user selects a
single basic piece of music, the system visually shows other pieces
around the basic one according to their relative positions. The user can select one of
them that fits his/her intention as the retrieval result. The purpose of
this paper is to improve the user's level of satisfaction with the retrieval result
in 2D-RIB. One of our extensions is to define and introduce the
following two measures: 'melody goodness' and 'general acceptance'.
We implement them in five different combinations. According to an
evaluation experiment, both measures can contribute to
the improvement. Another extension is three types of customization.
We have implemented them and clarified which customization is
effective.
Abstract: Images of the human iris contain specular highlights due
to the reflective properties of the cornea. This corneal reflection
causes many errors, not only in iris and pupil center estimation but
also in locating iris and pupil boundaries, especially for methods that
use active contours. Each iris recognition system has four steps:
segmentation, normalization, encoding and matching. In order to
address the corneal reflection, a novel reflection removal method is
proposed in this paper. Comparative experiments with two existing
reflection removal methods are conducted on the CASIA iris
image database V3. The experimental results reveal that the
proposed algorithm provides higher performance in reflection
removal.
Abstract: Tool Tracker is a client-server application. It is essentially a catalogue of various network monitoring and management tools that are available online. A database maintained on the server side contains the information about the various tools. Several clients can access and use this information simultaneously. The categories of tools considered for the development of this application are packet sniffers, port mappers, port scanners, encryption tools, vulnerability scanners, etc. The application provides a front end through which the user can invoke any tool from a central repository for the purpose of packet sniffing, port scanning, network analysis, etc. Apart from the tool itself, its description and the help files associated with it are also stored in the central repository. This facility enables the user to view the documentation pertaining to a tool without having to download and install it. The application updates the central repository with the latest versions of the tools. It also informs the user when a newer version of the tool currently in use is available and gives the user the choice of installing it. Thus Tool Tracker provides any network administrator with much-needed abstraction and ease of use with respect to the tools that can be used to monitor a network efficiently.
Abstract: The data exchanged on the Web are of different nature
from those treated by the classical database management systems;
these data are called semi-structured data since they do not have a
regular and static structure like data found in a relational database;
their schema is dynamic and may contain missing data or types.
Therefore, the need has arisen to develop further techniques and
algorithms to exploit and integrate such data and to extract relevant
information for the user. In this paper we present
the system OSIX (Osiris based System for Integration of XML
Sources). This system has a Data Warehouse model designed for the
integration of semi-structured data and more precisely for the
integration of XML documents. The architecture of OSIX relies on
the Osiris system, a DL-based model designed for the representation
and management of databases and knowledge bases. Osiris is a view-based
data model whose indexing system supports semantic query
optimization. We show that query processing on an
XML source is optimized by the indexing approach proposed by
Osiris.
Abstract: With the rapid development in the field of life
sciences and the flooding of genomic information, the need for faster
and scalable searching methods has become urgent. One of the
approaches that has been investigated is indexing. Indexing methods
have been grouped into three categories: length-based
index algorithms, transformation-based algorithms and mixed
techniques-based algorithms. In this research, we focused on the
transformation-based methods. We embedded the N-gram method
into the transformation-based method to build an inverted index
table. We then applied parallel methods to speed up index
building and to reduce the overall retrieval time when querying
the genomic database. Our experiments show that the N-gram
transformation algorithm is an economical solution; it saves both time and
space. The results show that the index is smaller than
the dataset when the N-gram size is 5 or 6. The
results of the parallel N-gram transformation algorithm indicate that
parallel programming with large datasets is promising and
can be improved further.
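A minimal sketch of an N-gram inverted index over sequences is given below. It only illustrates the general idea of mapping every length-n substring to the places where it occurs; the transformation step and the parallelization used in the study are not reproduced, and the function names and sample sequences are assumptions.

```python
from collections import defaultdict


def build_ngram_index(sequences, n=5):
    """Map each n-gram to the (sequence id, position) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in sequences.items():
        for pos in range(len(seq) - n + 1):
            index[seq[pos:pos + n]].append((seq_id, pos))
    return dict(index)


def find_candidates(index, query, n=5):
    """Return ids of sequences that contain every n-gram of the query.

    This is a necessary (not sufficient) condition for containing the query
    itself, so the result is a candidate set that still needs verification.
    """
    grams = {query[i:i + n] for i in range(len(query) - n + 1)}
    if not grams:
        return set()  # query shorter than n: the index cannot help
    candidates = None
    for gram in grams:
        ids = {seq_id for seq_id, _ in index.get(gram, [])}
        candidates = ids if candidates is None else candidates & ids
    return candidates


sequences = {"s1": "ACGTACGT", "s2": "TTTTTACG"}
index = build_ngram_index(sequences, n=3)
```

Because each sequence is indexed independently, index construction can be split across workers by partitioning the sequences and merging the resulting posting lists, which is one plausible way to parallelize it.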
Abstract: This study was conducted to compare models for two
countries, Taiwan and Singapore, using the
TIMSS database. The researchers used Multi-Group Hierarchical
Linear Modeling techniques to compare the effects of the two
country models and tested the hypotheses on 4,046 Taiwanese
students and 4,599 Singaporean students in 2007 at two levels: the class
level and the student (individual) level. Design quality is a class-level
variable. Student level variables are achievement and self-confidence.
The results challenge the widely held view that retention has a positive
impact on self-confidence. Suggestions for future research are
discussed.
Abstract: Cameron Highlands is a mountainous area subjected
to torrential tropical showers. It extracts 5.8 million liters of water
per day for drinking supply from its rivers at several intake points.
The water quality of rivers in Cameron Highlands, however, has
deteriorated significantly due to land clearing for agriculture,
excessive usage of pesticides and fertilizers as well as construction
activities in rapidly developing urban areas. These
pollution sources, known as non-point pollution sources, are diverse
and hard to identify, and their contributions are therefore difficult to estimate.
Hence, a Geographical Information System (GIS) was used to provide
an extensive approach to evaluating land use and other mapping
characteristics to explain the spatial distribution of non-point sources
of contamination in Cameron Highlands. The method for assessing
pollution sources was developed using the Cameron Highlands
Master Plan (2006-2010) to integrate GIS, databases and
pollution loads in the area of study. The results show that the highest annual
runoff is generated by forest (3.56 × 10^8 m^3/yr), followed by urban
development (1.46 × 10^8 m^3/yr). Furthermore, urban development
causes the highest BOD load (1.31 × 10^6 kgBOD/yr), while agricultural
activities and forest contribute the highest annual loads of
phosphorus (6.91 × 10^4 kgP/yr) and nitrogen (2.50 × 10^5 kgN/yr),
respectively. Therefore, best management practices (BMPs) are
recommended to reduce the pollution level in the area.
Abstract: This paper presents a system overview of Mobile to Server Face Recognition, a face recognition application developed specifically for mobile phones. Images taken with mobile phone cameras lack quality due to the low resolution of the cameras. Thus, a prototype was developed to experiment with the chosen method. However, this paper shows the result of the system backbone without the face recognition functionality. The result demonstrated in this paper indicates that the interaction between the mobile phones and the server works successfully. The result is shown before the database is completely ready. System testing is currently under way using real images and a mock-up database to test the functionality of the face recognition algorithm used in this system. An overview of the whole system, including screenshots and a system flowchart, is presented in this paper. This paper also presents the motivation and justification for developing this system.