Bioinformatic Analysis of Retroelement-Associated Sequences in Human and Mouse Promoters

Mammalian genomes contain a large number of retroelements (SINEs, LINEs and LTRs) that could affect the expression of protein-coding genes through associated transcription factor binding sites (TFBS). The activity of retroelement-associated TFBS has been confirmed experimentally in many genes, but their global functional impact remains unclear. Human SINEs (Alu repeats) and mouse SINEs (B1 and B2 repeats) are known to cluster in GC-rich, gene-rich genome segments, consistent with the view that they can contribute to the regulation of gene expression. We have shown earlier that Alu elements are involved in the formation of cis-regulatory modules (clusters of TFBS) in human promoters, and other authors have reported that Alu elements located near promoter CpG islands have an increased frequency of CpG dinucleotides, suggesting that these Alu elements are undermethylated. Human Alu and mouse B1/B2 elements have an internal bipartite promoter for RNA polymerase III containing a conserved sequence motif called the B-box, which can bind the basal transcription complex TFIIIC. It has recently been shown that TFIIIC binding to the B-box leads to the formation of a boundary that limits the spread of repressive chromatin modifications in S. pombe. SINE-associated B-boxes may have a similar function, but the conservation of TFIIIC binding sites in SINEs located near mammalian promoters has not been studied previously. Here we analysed the abundance and distribution of retroelements (SINEs, LINEs and LTRs) in annotated sequences of the Database of mammalian transcription start sites (DBTSS). The fractions of SINEs in human and mouse promoters are slightly lower than in the genome as a whole, but >40% of human and mouse promoters contain Alu or B1/B2 elements within the -1000 to +200 bp interval relative to the transcription start site (TSS). Most of these SINEs are associated with distal segments of promoters (-1000 to -200 bp relative to the TSS), indicating that their insertion at distances >200 bp upstream of the TSS is tolerated during evolution.
The distribution of SINEs in promoters correlates negatively with the distribution of CpG sequences. By analysing the abundance of 12-mer motifs from the B1 and Alu consensus sequences in the genome and in DBTSS, we confirmed that some subsegments of Alu and B1 elements are poorly conserved, which depends in part on the presence of CpG dinucleotides. One of these CpG-containing subsegments in B1 elements overlaps with the SINE-associated B-box, and it shows better conservation in DBTSS than in genomic sequences. We also studied the conservation, in DBTSS and in the genome, of the B-box-containing segments of old (AluJ, AluS) and young (AluY) Alu repeats and found that the CpG sequence of the B-box of old Alu elements is better conserved in DBTSS than in the genome. This indicates that B-box-associated CpGs in promoters are better protected from methylation and mutation than B-box-associated CpGs in genomic SINEs. These results are consistent with the view that potential TFIIIC binding motifs in SINEs associated with human and mouse promoters may be functionally important. These motifs may protect promoters from repressive histone modifications that spread from adjacent sequences. This could explain the well-known clustering of SINEs in GC-rich, gene-rich genome compartments and the existence of unmethylated CpG islands.
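
The 12-mer abundance comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, matching is exact (no mismatches allowed), and the sequences in the usage note are toy strings rather than real Alu or B1 consensus sequences. Each consensus 12-mer is counted in a set of sequences and flagged when it contains a CpG dinucleotide.

```python
from collections import Counter


def count_kmers(sequence, k=12):
    """Count every overlapping k-mer in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def motif_abundance(consensus, sequences, k=12):
    """For each k-mer of the consensus, count its exact occurrences in `sequences`.

    Returns a list of (offset, kmer, count, has_cpg) tuples, one per
    consensus offset, so that low-count windows can be compared with
    the presence of CpG dinucleotides.
    """
    totals = Counter()
    for seq in sequences:
        totals.update(count_kmers(seq, k))

    consensus = consensus.upper()
    rows = []
    for offset in range(len(consensus) - k + 1):
        kmer = consensus[offset:offset + k]
        rows.append((offset, kmer, totals[kmer], "CG" in kmer))
    return rows
```

Running `motif_abundance` once on promoter sequences and once on genomic sequences, then comparing the counts per offset, gives the kind of per-window profile the abstract refers to. A real analysis would also need to normalise for the very different sizes of the two sequence sets.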

Computing a Time Based Effective Radius-of-Curvature for Roadways

The radius-of-curvature (ROC) defines the degree of curvature along the centerline of a roadway that a travelling vehicle must follow. Roadway designs must account for ROC to limit the cost of earthwork associated with construction while still allowing vehicles to travel at the maximum allowable design speed. Thus, a road will tend to follow the natural topography where possible, but its curvature must also be optimized to permit fast but safe vehicle speeds. The more severe the curvature of the road, the slower the permissible vehicle speed. For route planning, whether for urban settings, emergency operations, or even parcel delivery, ROC is a necessary attribute of road arcs for computing travel time. It is extremely rare for a geo-spatial database to contain ROC. This paper presents a procedure and mathematical algorithm to calculate and assign ROC to a segment pair and/or polyline.
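
One common geometric way to attach a radius to a segment pair is the radius of the circle through three consecutive polyline vertices. The Python sketch below shows that calculation only; it is not the time-based effective ROC procedure of the paper, which the abstract does not specify. It assumes planar (projected) coordinates in consistent units, and the function names are hypothetical.

```python
import math


def circumradius(p1, p2, p3):
    """Radius of the circle through three planar points.

    Uses R = a*b*c / (4*K), where a, b, c are the side lengths and K is
    the triangle area. Returns math.inf for collinear points (a straight road).
    """
    a = math.dist(p1, p2)
    b = math.dist(p2, p3)
    c = math.dist(p1, p3)
    # The cross product of the two edge vectors equals twice the triangle area.
    twice_area = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                     - (p2[1] - p1[1]) * (p3[0] - p1[0]))
    if twice_area == 0:
        return math.inf
    return a * b * c / (2 * twice_area)


def polyline_roc(points):
    """ROC at each interior vertex of a polyline, one value per segment pair."""
    return [circumradius(points[i - 1], points[i], points[i + 1])
            for i in range(1, len(points) - 1)]
```

For geographic coordinates, the points would first have to be projected to a planar system. Nearly collinear vertices give very large radii, which can be treated as straight segments when computing travel time.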

ECG Analysis using Nature Inspired Algorithm

This paper presents an algorithm, based on wavelet decomposition, for feature extraction from the ECG signal and recognition of three types of ventricular arrhythmia using neural networks. A set of Discrete Wavelet Transform (DWT) coefficients that contain the maximum information about the arrhythmias is selected from the wavelet decomposition. A novel clustering algorithm based on a nature-inspired algorithm (Ant Colony Optimization) is then developed for classifying arrhythmia types. The algorithm is applied to ECG recordings from the MIT-BIH arrhythmia and malignant ventricular arrhythmia databases. We used the Daubechies 4 wavelet in our algorithm. The wavelet decomposition enabled us to perform the task efficiently and produced reliable results.

A Detailed Timber Harvest Simulator Coupled with 3-D Visualization

In today's world, the efficient utilization of wood resources is increasingly on the minds of forest owners. Ensuring an efficient harvest of wood resources is a very complex challenge. This is one of the areas that the project “Virtual Forest II” addresses. Its core is a database with data about forests containing approximately 260 million trees located in North Rhine-Westphalia (NRW). Based on this data, tree growth simulations and wood mobilization simulations can be conducted. This paper focuses on the latter. It describes a discrete-event simulation with an attached 3-D real-time visualization, which simulates timber harvest using trees from the database with different crop resources. The simulation can be displayed in 3-D to show the progress of the wood crop. All the data gathered during the simulation is presented afterwards as a detailed summary. This summary includes cost-benefit calculations and can be compared with those of previous runs to optimize the financial outcome of the timber harvest by exchanging crop resources or modifying their parameters.

The Implementation of Remote Automation Execution Agent over ACL on QOS POLICY Based System

This paper presents the implementation of a QoS policy based system that uses Access Control List (ACL) rules on a Layer 3 (L3) switch. It also presents the architecture of that implementation, the tools used and the results gathered. The system architecture is able to control the ACL rules installed in an external L3 switch. The ACL rules specify how access control is executed for all traffic passing through that switch. The main advantage of this approach is that a single point of failure can be avoided when ACL rules inside L3 switches change. Another advantage is that the agent can apply ACL rules automatically and immediately in response to changes in the policy database, without the rules having to be configured one by one. Furthermore, when the QoS policy based system is deployed in a distributed environment, the monitoring process can be synchronized easily thanks to the automated process run by the agent over external policy devices.

Incorporating Semantic Similarity Measure in Genetic Algorithm : An Approach for Searching the Gene Ontology Terms

The most important property of the Gene Ontology is its terms. These controlled vocabularies are defined to provide consistent descriptions of gene products that are shareable and computationally accessible by humans, software agents, or other machine-readable meta-data. Each term is associated with information such as a definition, synonyms, database references, amino acid sequences, and relationships to other terms. This information has led to the Gene Ontology being broadly applied in microarray and proteomic analysis. However, searching the terms is still carried out using the traditional approach based on keyword matching. The weaknesses of this approach are that it ignores semantic relationships between terms and depends heavily on a specialist to find similar terms. Therefore, this study combines a semantic similarity measure and a genetic algorithm to improve the retrieval of semantically similar terms. The semantic similarity measure is used to compute the strength of similarity between two terms. The genetic algorithm is then employed to perform batch retrievals and to handle the large search space of the Gene Ontology graph. Computational results are presented to show the effectiveness of the proposed algorithm.

DD Models for Reports Building

In general, reports are a form of representing data in such a way that the user gets the information he needs. They can be built in various ways, from the simplest (“select from”) to the most complex (results derived from different sources/tables with complex formulas applied). Furthermore, the rules of calculation can be hard-coded in a program or stored in the database to be used by dynamic code. This paper introduces two types of reports defined in the DB structure. The main goal is to manage calculations in an optimal way, keeping the maintenance of reports as simple and smooth as possible.

Image Indexing Using a Color Similarity Metric based on the Human Visual System

The novelty proposed in this study is twofold: the development of a new color similarity metric based on the human visual system, and a new color indexing method based on a textual approach. The proposed color similarity metric is based on the color perception of the human visual system. Consequently, the results returned by the indexing system can meet the user's expectations as far as possible. We developed a web application to collect users' judgments about the similarities between colors; the results are used to estimate the metric proposed in this study. To index an image's colors, we used a text indexing engine, which facilitates the integration of visual features into a database of text documents. The textual signature is built by weighting the image's colors according to their occurrence in the image. The use of a textual indexing engine provides a simple, fast and robust solution for indexing images. A typical use of the proposed system is the development of applications whose data is both visual and textual. To evaluate the proposed method, we chose a price comparison engine as a case study, collecting a series of commercial offers, each containing a textual description and an image representing the offer.

Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences

In data mining, association rules are used to search for relations among the items of a transaction database. Once the data has been collected and stored, valuable rules can be found through association rule mining, helping managers to carry out marketing strategy and plan the market framework. In this paper, we apply fuzzy partition methods and determine the membership functions of the quantitative values of each transaction item. In addition, managers can express the importance of items as linguistic terms, which are transformed into fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth (FWFP-Growth) is used to complete the data mining process. The method is expected to be more efficient than the Apriori algorithm in mining the complete set of association rules. An example is given to illustrate the proposed approach.
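
The fuzzy partition step can be sketched in Python as follows. This is an illustration under stated assumptions, not the FWFP-Growth algorithm itself: it assumes evenly spaced triangular membership functions over a value range, and it uses the common definition of fuzzy support as the mean membership over all transactions. The function names, the labels and the value range are all hypothetical.

```python
def fuzzy_partition(x, lo, hi, labels):
    """Membership of quantity x in evenly spaced triangular fuzzy regions.

    The peaks of the regions are spread evenly over [lo, hi]; neighbouring
    triangles overlap so that the memberships of any x sum to 1.
    """
    if len(labels) < 2:
        raise ValueError("at least two linguistic labels are required")
    step = (hi - lo) / (len(labels) - 1)
    x = min(max(x, lo), hi)  # clamp values outside the range to its ends
    memberships = {}
    for i, label in enumerate(labels):
        peak = lo + i * step
        memberships[label] = max(0.0, 1.0 - abs(x - peak) / step)
    return memberships


def fuzzy_support(values, lo, hi, labels):
    """Mean membership of each fuzzy region over all transaction values."""
    totals = {label: 0.0 for label in labels}
    for value in values:
        for label, degree in fuzzy_partition(value, lo, hi, labels).items():
            totals[label] += degree
    return {label: total / len(values) for label, total in totals.items()}
```

For example, with labels `["low", "middle", "high"]` over the range 0 to 20, a purchased quantity of 5 belongs half to "low" and half to "middle". A weighted variant would multiply each membership by the item's weight before averaging.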

How Efficiency of Password Attack Based on a Keyboard

At present, the dictionary attack is the basic tool for recovering key passwords. To evade dictionary attacks, users purposely choose other character strings as passwords. According to statistics, about 14% of users choose keys on a keyboard (Kkey, for short) as passwords. This paper develops a framework system to attack passwords chosen from Kkeys and analyzes its efficiency. Within this system, we build keyboard rules from the adjacent and parallel relationships among Kkeys and then use these Kkey rules to generate password databases by depth-first search. The experimental results show that the key space of the databases derived from these Kkey rules can be far smaller than that of the password databases generated in a brute-force attack, thus effectively narrowing the scope of the attack. Taking one general Kkey rule as an example, namely the combinations of all printable characters (94 types) with Kkey adjacent and parallel relationships, the derived key space is about 2^40 times smaller than that of a brute-force attack. In addition, we demonstrate the method's practicality and value by successfully cracking the access passwords to UNIX and PC systems using the password databases created.
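
The idea of generating candidate passwords from keyboard relationships by depth-first search can be sketched in Python. The abstract does not define the Kkey rules, so the layout and the adjacency rule here are assumptions: four unshifted QWERTY rows treated as a grid, with a key adjacent to its left and right neighbours and to the key in the same column of the row above or below. All names are hypothetical.

```python
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_adjacency(rows):
    """Map each key to its grid neighbours (left, right, up, down)."""
    adjacency = {}
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            neighbours = []
            for dr, dc in ((0, -1), (0, 1), (-1, 0), (1, 0)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < len(rows) and 0 <= cc < len(rows[rr]):
                    neighbours.append(rows[rr][cc])
            adjacency[key] = neighbours
    return adjacency


def keyboard_walks(adjacency, length):
    """Yield every string of `length` keys in which consecutive keys are adjacent.

    Candidates are enumerated by depth-first search from every start key.
    """
    if length < 1:
        return

    def dfs(prefix):
        if len(prefix) == length:
            yield "".join(prefix)
            return
        for nxt in adjacency[prefix[-1]]:
            prefix.append(nxt)
            yield from dfs(prefix)
            prefix.pop()

    for start in adjacency:
        yield from dfs([start])
```

Because each key has at most four neighbours under this rule, the number of candidates grows far more slowly with password length than the brute-force space over the same keys, which is the effect the abstract describes.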

Automata Theory Approach for Solving Frequent Pattern Discovery Problems

The various types of frequent pattern discovery problem, namely the frequent itemset, sequence and graph mining problems, are solved in different ways that are nevertheless similar in certain respects. The main approaches to discovering such patterns fall into two classes: level-wise methods and database projection-based methods. Level-wise algorithms generally use clever indexing structures to discover the patterns. In this paper, a new approach based on the level-wise scheme is proposed for discovering frequent sequences and tree-like patterns efficiently. Because level-wise algorithms spend a lot of time on the subpattern testing problem, the new approach introduces the idea of using automata theory to solve this problem.
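
To make the subpattern testing problem concrete, the Python sketch below builds a simple subsequence automaton for each data sequence and uses it to test candidate patterns. The abstract does not describe the authors' automaton, so this is a generic construction chosen for illustration. It assumes sequences of single items (not itemsets), and the function names are hypothetical.

```python
def build_subsequence_automaton(sequence):
    """Build a next-occurrence automaton for one data sequence.

    State i means "the first i symbols have been passed"; transitions[i][x]
    is the state reached after the next occurrence of x at or after position i.
    """
    n = len(sequence)
    transitions = [dict() for _ in range(n + 1)]
    next_occurrence = {}
    # Scan from right to left so each state sees the nearest later occurrence.
    for i in range(n - 1, -1, -1):
        next_occurrence[sequence[i]] = i + 1
        transitions[i] = dict(next_occurrence)
    return transitions


def accepts(transitions, pattern):
    """True if `pattern` is a subsequence of the sequence behind `transitions`."""
    state = 0
    for symbol in pattern:
        state = transitions[state].get(symbol)
        if state is None:
            return False
    return True


def support(automata, pattern):
    """Number of data sequences that contain `pattern` as a subsequence."""
    return sum(accepts(transitions, pattern) for transitions in automata)
```

Once the automata are built, each candidate generated at a level is tested in time proportional to its own length per data sequence, regardless of how long the data sequences are.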

Implementation of Geo-knowledge Based Geographic Information System for Estimating Earthquake Hazard Potential at a Metropolitan Area, Gwangju, in Korea

In this study, an inland metropolitan area, Gwangju, in Korea was selected to assess the amplification potential of earthquake motion and to provide information for regional seismic countermeasures. A geographic information system-based expert system was implemented for reliably predicting the spatial geotechnical layers in the entire region of interest by building a geo-knowledge database. In particular, the database consists of existing boring data gathered from prior geotechnical projects and surface geo-knowledge data acquired from site visits. For practical application of the geo-knowledge database to estimating the earthquake hazard potential related to site amplification effects in the study area, seismic zoning maps of geotechnical parameters, such as the bedrock depth and the site period, were created within a GIS framework. In addition, seismic zonation of site classification was performed to determine the site amplification coefficients for seismic design at any site in the study area. Keywords: earthquake hazard, geo-knowledge, geographic information system, seismic zonation, site period.

Elections Management Information Communication System Voter Ballot

The present work deals with a new scope of application of information and communication technologies for the improvement of the election process in a biased environment. We introduce a new concept for the construction of an information-communication system for the election participant. It consists of four main components: Software, Physical Infrastructure, Structured Information and Trained Staff. The Structured Information is the basis of the whole system; it is the collection of all possible events (irregularities among them) at the polling stations, structured in special templates and forms and integrated into mobile devices. The software is a package of analytic modules that operates on the dynamic database. The application of modern communication technologies facilitates the immediate exchange of information and relevant documents between the polling stations and the server of the participant. No less important is the training of the staff for the proper functioning of the system; an e-training system with various modules should be applied for this purpose. The presented methodology is primarily focused on election processes in countries that are emerging democracies. It can be regarded as a tool for the monitoring of the election process by political organization(s) and as one of the instruments for fostering the spread of democracy in these countries.

Experimental teaching, Perceived usefulness, Ease of use, Learning Interest and Science Achievement of Taiwan 8th Graders in TIMSS 2007 Database

The data of Taiwanese 8th graders in the 4th cycle of the Trends in International Mathematics and Science Study (TIMSS) are analyzed to examine the influence of science teachers' preference for experimental teaching on the relationships between the affective variables (the perceived usefulness of science, ease of using science and science learning interest) and academic achievement in science. After the missing data were dealt with, data from 3711 students and 145 science teachers were analyzed using Hierarchical Linear Modeling. The major objective of this study was to determine whether experimental teaching moderates the relationship between perceived usefulness and achievement.

A method for Music Classification Based On Perceived Mood Detection for Indian Bollywood Music

A lot of research has been done in the past decade in the field of audio content analysis on extracting various kinds of information from audio signals. One significant kind of information is the "perceived mood" or the "emotions" related to a music or audio clip. This information is extremely useful in applications such as creating or adapting a play-list based on the mood of the listener. It could also help in better classification of a music database. In this paper we present a method that classifies music based not just on the meta-data of the audio clip but also on the "mood" factor, to help improve music classification. We propose an automated and efficient way of classifying music samples based on mood detection from the audio data. In particular, we try to classify Indian Bollywood music based on mood. The proposed method tries to address the following problem: genre information (usually part of the audio meta-data) alone does not lead to better music classification. For example, the acoustic version of the song "Nothing Else Matters" by Metallica can be classified as melodic music, so a person in a relaxed or chill-out mood might want to listen to this track. But more often than not this track is associated with the metal / heavy rock genre, and a listener who built a play-list for his current mood from genre information alone would miss out on this track. Methods currently exist to detect mood in western or similar kinds of music. Our paper tries to solve the issue for Indian Bollywood music from an Indian cultural context.

A Novel Adaptive E-Learning Model Based on Developed Learner's Styles

Adaptive e-learning today gives the student a central role in his own learning process. It allows learners to try things out, participate in courses like never before, and get more out of learning than before. In this paper, an adaptive e-learning model for logic design, simplification of Boolean functions and related fields is presented. The model presents suitable courses for each student in a dynamic and adaptive manner using existing database and workflow technologies. The main objective of this research work is to provide an adaptive e-learning model based on learners' personality using explicit and implicit feedback. To characterize learners, we develop dimensions that determine each individual's learning style, in order to accommodate the different abilities of users and to develop vital skills. Thus, the proposed model becomes more powerful, user friendly and easy to use and interpret. Finally, it suggests a learning strategy and appropriate electronic media that match the learner's preference.

A Novel Approach to Iris Localization for Iris Biometric Processing

Iris-based biometric systems are gaining importance in several applications. However, processing iris biometrics is a challenging and time-consuming task. Detecting the iris in an eye image poses a number of challenges, such as inferior image quality and occlusion by eyelids and eyelashes. Because of these problems, it is not possible to achieve a 100% accuracy rate in any iris-based biometric authentication system. Further, iris detection is a computationally intensive task in the overall iris biometric processing. In this paper, we address these two problems and propose a technique to localize the iris efficiently and accurately. For pupil boundary detection, we propose scaling and a color level transform followed by thresholding and finding pupil boundary points; for iris boundary detection, we propose dilation, thresholding, vertical edge detection and removal of unnecessary edges present in the eye images. Scaling reduces the search space significantly, and the intensity level transform is helpful for image thresholding. Experimental results show that our approach is comparable with existing approaches. With our approach it is possible to detect the iris with 95-99% accuracy, as substantiated by our experiments on the CASIA Ver-3.0, ICE 2005, UBIRIS, Bath and MMU iris image databases.

A Programming Solution for Moving Mobile Transaction

In this paper, our concern is the management of mobile transactions in the area shared among many servers when the mobile user moves from one cell to another in an online, partially replicated distributed mobile database environment. We define the concept of a transaction and classify the different types of transactions. Based on this analysis, we propose an algorithm that handles the disconnection caused by moving among sites.

Rapid Finite-Element Based Airport Pavement Moduli Solutions using Neural Networks

This paper describes the use of artificial neural networks (ANN) for predicting non-linear layer moduli of flexible airfield pavements subjected to new generation aircraft (NGA) loading, based on the deflection profiles obtained from Heavy Weight Deflectometer (HWD) test data. The HWD test is one of the most widely used tests for routinely assessing the structural integrity of airport pavements in a non-destructive manner. The elastic moduli of the individual pavement layers backcalculated from the HWD deflection profiles are effective indicators of layer condition and are used for estimating the remaining life of the pavement. HWD tests were periodically conducted at the Federal Aviation Administration's (FAA's) National Airport Pavement Test Facility (NAPTF) to monitor the effect of Boeing 777 (B777) and Boeing 747 (B747) test gear trafficking on the structural condition of flexible pavement sections. In this study, a multi-layer, feed-forward network that uses an error-backpropagation algorithm was trained to approximate the HWD backcalculation function. A synthetic database generated using an advanced non-linear pavement finite-element program was used to train the ANN, to overcome the limitations associated with conventional pavement moduli backcalculation. The changes in ANN-based backcalculated pavement moduli with trafficking were used to compare the relative severity effects of the aircraft landing gears on the NAPTF test pavements.

MONPAR - A Page Replacement Algorithm for a Spatiotemporal Database

For a spatiotemporal database management system, the I/O cost of queries and other operations is an important performance criterion. To optimize this cost, intense research on designing robust index structures has been carried out in the past decade. Beyond these major considerations, other design issues deserve attention because of their direct impact on the I/O cost. In particular, an efficient buffer management strategy plays a key role in reducing redundant disk accesses. In this paper, we propose an efficient buffer strategy for a spatiotemporal database index structure, specifically one that indexes objects moving over a network of roads. The proposed strategy, named MONPAR, is based on the data type (i.e. spatiotemporal data) and the structure of the index. For the experimental evaluation, we set up a simulation environment that counts the number of disk accesses while executing a number of spatiotemporal range queries over the index. We repeated the simulations with query sets of different distributions, such as uniform and skewed query distributions. Based on a comparison of our strategy with well-known page-replacement techniques, such as LRU-based and priority-based buffers, we conclude that MONPAR behaves better than its competitors for small and medium-size buffers under all the query distributions used.
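
The kind of simulation described here, counting disk accesses for a given buffer size, can be sketched in Python for the LRU baseline. This is not MONPAR, whose replacement rule the abstract does not give. The class and function names are hypothetical, and the page requests would in practice come from replaying range queries over the index.

```python
from collections import OrderedDict


class LRUBuffer:
    """Fixed-size page buffer that evicts the least recently used page."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.pages = OrderedDict()  # page ids, oldest access first
        self.disk_accesses = 0

    def access(self, page_id):
        """Request a page; a miss counts as one disk access."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return
        self.disk_accesses += 1
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        self.pages[page_id] = True


def count_disk_accesses(page_requests, capacity):
    """Replay a sequence of page requests and return the number of misses."""
    buffer = LRUBuffer(capacity)
    for page_id in page_requests:
        buffer.access(page_id)
    return buffer.disk_accesses
```

Replaying the same request trace against several buffer sizes and replacement policies gives the comparison the abstract reports. A competing policy only needs to replace the eviction line in `access`.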