Abstract: The internet keeps growing and has become the most popular platform for people to share their opinions on topics of interest. We choose the education domain, specifically comparing Malaysian universities against each other. This comparison produces a benchmark based on different criteria shared by online users across various online resources, including Twitter, Facebook, and web pages. The comparison is accomplished using an opinion mining framework that extracts and processes the unstructured text and classifies the result as positive, negative, or neutral (polarity). Hence, we divide our framework into three main stages: opinion collection (extraction), unstructured text processing, and polarity classification. The extraction stage includes web crawling, HTML parsing, sentence segmentation for punctuation classification, and Part-of-Speech (POS) tagging; the second stage processes the unstructured text with stemming and stop-word removal and finally prepares the raw text for classification using Named Entity Recognition (NER). The last phase classifies the polarity and presents the overall result of the comparison among the Malaysian universities. The final result is useful for those interested in studying in Malaysia, as our output declares clear winners based on public opinion across the web.
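A minimal sketch of such a pipeline using NLTK for sentence segmentation, POS tagging, stop-word removal, stemming, and NER; the tiny sentiment lexicon and the sample sentence are illustrative placeholders, not the paper's actual resources or classifier:

```python
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

# Requires one-time downloads: punkt, averaged_perceptron_tagger,
# stopwords, maxent_ne_chunker, words (via nltk.download(...)).

# Illustrative lexicon only; the paper's polarity classifier is not specified here.
POSITIVE = {"excellent", "good", "great", "friendly"}
NEGATIVE = {"bad", "poor", "slow", "expensive"}

def polarity(text: str) -> str:
    stemmer = PorterStemmer()
    stop = set(stopwords.words("english"))
    score = 0
    for sentence in nltk.sent_tokenize(text):        # sentence segmentation
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))  # POS tagging
        nltk.ne_chunk(tagged)   # NER; entities could tie opinions to universities
        for word, _tag in tagged:
            w = word.lower()
            if w in stop:                            # stop-word removal
                continue
            stemmer.stem(w)                          # stemming (normalization)
            score += (w in POSITIVE) - (w in NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(polarity("The lecturers are friendly but the facilities are poor."))
```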
Abstract: Success is a European project that will implement several clean transport offers in three European cities and evaluate their environmental impacts. The goal of these measures is to improve urban mobility, i.e. the movement of residents within cities; examples include park-and-ride, electric vehicles, hybrid buses, and bike sharing. A list of 28 criteria and 60 measures has been established for the evaluation of these transport projects. The evaluation criteria can be grouped into transport, environment, social, economic, and fuel consumption. This article proposes a decision support system that encapsulates a hybrid approach based on fuzzy logic, multicriteria analysis, and belief theory for evaluating the impacts of urban mobility solutions. A web-based tool called DeSSIA (Decision Support System for Impacts Assessment) has been developed to handle the complex data involved. The tool offers several functionalities, starting from data integration (import of data), through evaluation of projects, and finishing with graphical display of results. The tool's development is based on the MVC (Model, View, Controller) concept, a design pattern for building software that enforces separation between data, its processing, and its presentation. Effort has been put into the ergonomic aspects of the application. Its code complies with current standards (XHTML, CSS) and has been validated by the W3C (World Wide Web Consortium). The main ergonomic focus is on the usability of the application and its ease of learning and adoption. Through the use of technologies such as AJAX (Asynchronous JavaScript and XML), the application is faster and more user-friendly. The strengths of our approach are that it handles heterogeneous data (qualitative, quantitative) from various information sources (human experts, surveys, sensors, models, etc.).
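As a rough illustration of the fuzzy multicriteria idea only (the actual DeSSIA criteria, membership functions, and belief-theory combination are not reproduced here; all names and numbers below are invented for the example):

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular fuzzy membership: rises from a to peak b, falls to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical indicator values for one mobility measure (e.g. bike sharing),
# each mapped to a degree of membership in the fuzzy set "good impact".
scores = {
    "transport":        triangular(0.7, 0.0, 1.0, 2.0),  # e.g. modal-shift indicator
    "environment":      triangular(0.9, 0.0, 1.0, 2.0),  # e.g. CO2-reduction indicator
    "fuel_consumption": triangular(0.4, 0.0, 1.0, 2.0),
}
weights = {"transport": 0.4, "environment": 0.4, "fuel_consumption": 0.2}

# Weighted aggregation across criteria: one simple multicriteria operator.
overall = sum(weights[c] * scores[c] for c in scores)
print(f"aggregated impact score: {overall:.2f}")
```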
Abstract: In this study, a fuzzy similarity approach for Arabic web page classification is presented. The approach uses a fuzzy term-category relation, manipulating the membership degree for the training data and the degree value for a test web page. Six measures are used and compared in this study: the Einstein, Algebraic, Hamacher, MinMax, Special case fuzzy, and Bounded Difference approaches. These measures are applied and compared using 50 different Arabic web pages. The Einstein measure gave the best performance among the measures. An analysis of these measures and concluding remarks are drawn in this study.
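The measure names above correspond to standard fuzzy conjunction (t-norm) operators. A small sketch of five of them, applied to a pair of membership degrees, is given below; the pairing of names to formulas is the conventional one and is assumed here, and the paper's "Special case fuzzy" measure is its own construction, not reproduced:

```python
# Standard t-norms often associated with these names; t(a, b) combines two
# membership degrees a, b in [0, 1] (e.g. a term's degree in a category
# and its degree in the test web page).
def algebraic(a, b):
    return a * b

def einstein(a, b):
    return (a * b) / (2 - (a + b - a * b))

def hamacher(a, b):
    return 0.0 if a == b == 0 else (a * b) / (a + b - a * b)

def min_max(a, b):   # the "MinMax" measure is assumed to use min here
    return min(a, b)

def bounded_difference(a, b):
    return max(0.0, a + b - 1)

# A page-category similarity could sum t(a, b) over the terms they share.
a, b = 0.8, 0.6
for t in (algebraic, einstein, hamacher, min_max, bounded_difference):
    print(f"{t.__name__:18s} {t(a, b):.3f}")
```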
Abstract: SIP (Session Initiation Protocol), whose HTTP-like, text-based call control messaging is simple and efficient, has increasingly been adopted in VoIP networks. For authentication and authorization purposes, there are many approaches and considerations for securing SIP so as to eliminate forgery and protect the integrity of SIP messages. Elliptic Curve Cryptography (ECC), meanwhile, has significant advantages over other Public Key Cryptography (PKC) systems, such as smaller key sizes and faster computations, which make data transmission more secure and efficient. In this work a new approach is proposed for secure SIP authentication, using a public key exchange mechanism based on ECC. By adopting an elliptic-curve-based key exchange mechanism, the total execution times and memory requirements of the proposed scheme are improved in comparison with non-elliptic approaches.
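A minimal sketch of the kind of ECC key exchange (ECDH) such a scheme could build on, using the Python cryptography package; the curve choice and the use of the derived key for SIP authentication are illustrative assumptions, not the paper's exact protocol:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each SIP endpoint generates an ephemeral elliptic-curve key pair.
ua_private = ec.generate_private_key(ec.SECP256R1())
proxy_private = ec.generate_private_key(ec.SECP256R1())

# After exchanging public keys, both sides compute the same shared secret.
ua_secret = ua_private.exchange(ec.ECDH(), proxy_private.public_key())
proxy_secret = proxy_private.exchange(ec.ECDH(), ua_private.public_key())
assert ua_secret == proxy_secret

# Derive a symmetric key that could authenticate subsequent SIP messages.
auth_key = HKDF(algorithm=hashes.SHA256(), length=32,
                salt=None, info=b"sip-authentication").derive(ua_secret)
print(auth_key.hex())
```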
Abstract: Rapid strides made in the field of Information and Communication Technology (ICT) have facilitated the development of teaching and learning methods and prepared them to serve the needs of diverse educational institutions. In other words, the information age has redefined the fundamentals and transformed institutions and their methods of service delivery forever. The vision articulates a desire to transform the methods of teaching and learning through e-learning, which is commonly understood as the use of networked information and communications technology in teaching and learning practice. This paper deals with the general aspects of e-learning, along with the issues, developments, opportunities, and challenges that higher-education institutions face.
Abstract: The increasing growth of information volume on the internet creates an increasing need to develop new (semi-)automatic methods for retrieving documents and ranking them according to their relevance to the user query. In this paper, after a brief review of ranking models, a new ontology-based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical, and linguistic methods, a combination that preserves the precision of ranking without losing speed. Our approach exploits natural language processing techniques for extracting phrases and stemming words. Then an ontology-based conceptual method is used to annotate documents and expand the query. To expand a query, the spread activation algorithm is improved so that the expansion can be done along various aspects. The annotated documents and the expanded query are then processed to compute the relevance degree using statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical, and linguistic features of documents, (2) expanding the query with its related concepts before comparing it to documents, (3) extracting and using both words and phrases to compute the relevance degree, (4) improving the spread activation algorithm to perform the expansion based on a weighted combination of different conceptual relationships, and (5) allowing variable document vector dimensions. A ranking system called ORank has been developed to implement and test the proposed model. The test results are included at the end of the paper.
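A rough sketch of weighted spread activation for query expansion over an ontology graph; the toy concepts, relation weights, decay factor, and threshold below are invented for illustration and do not reflect ORank's actual ontology or parameters:

```python
# Ontology as a weighted graph: concept -> [(related concept, relation weight)].
# Different relation types (is-a, part-of, ...) would carry different weights.
ONTOLOGY = {
    "vehicle": [("car", 0.9), ("transport", 0.6)],
    "car":     [("engine", 0.7), ("vehicle", 0.9)],
    "engine":  [("fuel", 0.5)],
}

def spread_activation(seeds, decay=0.5, threshold=0.2):
    """Propagate activation from query concepts through weighted relations."""
    activation = {c: 1.0 for c in seeds}
    frontier = list(seeds)
    while frontier:
        concept = frontier.pop()
        for neighbor, weight in ONTOLOGY.get(concept, []):
            new = activation[concept] * weight * decay
            if new > activation.get(neighbor, 0.0) and new >= threshold:
                activation[neighbor] = new
                frontier.append(neighbor)
    return activation

# Expanded query: the original terms plus related concepts above the threshold.
print(spread_activation({"vehicle"}))
```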
Abstract: Evaluation of educational portals is an important subject area that needs more attention from researchers. A university whose educational portal is difficult for teachers, students, or management staff to use and interact with can see its standing and reputation reduced. Therefore, it is important to be able to evaluate the quality of the e-services the university provides, so as to improve them over time.
The present study evaluates the usability of the Information Technology Faculty portal at the University of Benghazi. Two evaluation methods were used: a questionnaire-based method and an online automated tool-based method. The first method was used to measure the portal's external usability attributes (information, content and organization of the portal; navigation, links, and accessibility; aesthetic and visual appeal; performance and effectiveness; and educational purpose) from the users' perspective, while the second method was used to measure the portal's internal usability attributes (number and size of HTML files, number and size of images, load time, HTML check errors, browser compatibility problems, number of bad and broken links), which cannot be perceived by the users. The study showed that some of the usability aspects were found to be at an acceptable level of performance and quality, while others were not. In general, it was concluded that the usability of the IT Faculty educational portal is generally acceptable. Recommendations and suggestions to address the weaknesses and improve the quality of the portal's usability are presented in this study.
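A small sketch of the automated side of such an evaluation, measuring a few internal attributes (load time, page size, broken links) with the Python standard library; the URL is a placeholder, and real tools such as link checkers and W3C validators do considerably more:

```python
import time
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect href targets of anchor tags from an HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

portal = "https://example.edu/it-faculty/"   # placeholder URL
start = time.time()
html = urllib.request.urlopen(portal, timeout=10).read()
print(f"load time: {time.time() - start:.2f}s, page size: {len(html)} bytes")

collector = LinkCollector()
collector.feed(html.decode("utf-8", errors="replace"))
for link in collector.links:                 # flag bad and broken links
    try:
        status = urllib.request.urlopen(urljoin(portal, link), timeout=10).status
    except Exception as exc:
        status = exc
    print(link, "->", status)
```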
Abstract: Demand for web services is growing with the increasing number of Web users. Web services are delivered through Web applications, whose size is affected by their users' requirements and interests; differences in requirements and interests lead to growth in Web application size. An efficient way to free storage space for more data and information is to implement algorithms that compress the contents of Web application documents. This paper introduces an algorithm to reduce Web application size based on reducing the contents of HTML files. It removes unimportant content regardless of the HTML file size, and the removal does not discard any character that is needed in the HTML building process.
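As a rough illustration of content-level HTML reduction (the paper's actual removal rules are not specified here; this sketch only strips comments and collapses redundant whitespace, which browsers ignore when rendering):

```python
import re

def reduce_html(html: str) -> str:
    # Note: a real reducer must respect whitespace-sensitive elements
    # such as <pre>; this sketch deliberately ignores that complication.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)  # drop comments
    html = re.sub(r">\s+<", "><", html)                      # drop inter-tag whitespace
    html = re.sub(r"\s{2,}", " ", html)                      # collapse runs of spaces
    return html.strip()

page = """<html>  <!-- banner markup -->
  <body>
    <p>  Hello,   world!  </p>
  </body>   </html>"""
small = reduce_html(page)
print(f"{len(page)} -> {len(small)} bytes")
print(small)
```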
Abstract: This research is designed to help a WAP-based mobile phone user analyze logistics in a traffic area, by designing and applying the processes through which a mobile user accesses server databases. The design comprises a MySQL 4.1.8-nt database system as the server, with three sub-databases: traffic-light timings at intersections during different periods of the day, distances along the roads of the area blocks into which the main sample area is divided, and speeds of sample vehicles (motorcycle, personal car, and truck) during different periods of the day. For interconnection between the server and the user, PHP is used to calculate distances and travelling times from the starting point to the destination, while XHTML is applied for receiving, sending, and displaying data from PHP on the user's mobile. In this research, the main sample area is the Huakwang-Ratchada area of Bangkok, Thailand, a habitually congested point, together with the 6.25 km2 surrounding area, which is split into 25 blocks of 0.25 km2 each. To simulate the results, the designed server database and all communication models of this research have been uploaded to www.utccengineering.com/m4tg, and a mobile phone supporting a WAP 2.0 XHTML/HTML multimode browser was used to observe values and displayed pictures. According to the simulated results, the user can check pictures of the route from the requested starting point to the destination, along with analyzed travel times for the sample vehicles in various periods of the day.
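A toy version of the server-side travel-time calculation such a PHP backend might perform, written here in Python; the block distances, speeds, and signal delays are invented sample values, not the project's database contents:

```python
# Hypothetical route as (block_distance_km, signal_wait_s) segments,
# and period-dependent average speeds (km/h) per sample vehicle type.
ROUTE = [(0.5, 45), (0.5, 30), (0.25, 0), (0.5, 60)]
SPEEDS = {"motorcycle": {"morning": 25, "evening": 18},
          "car":        {"morning": 20, "evening": 12},
          "truck":      {"morning": 15, "evening": 10}}

def travel_time_minutes(vehicle: str, period: str) -> float:
    speed = SPEEDS[vehicle][period]
    driving = sum(dist / speed * 60 for dist, _ in ROUTE)  # minutes moving
    waiting = sum(wait for _, wait in ROUTE) / 60          # minutes at lights
    return driving + waiting

for vehicle in SPEEDS:
    print(vehicle, f"{travel_time_minutes(vehicle, 'evening'):.1f} min")
```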
Abstract: SeqWord Gene Island Sniffer, a new program for the identification of mobile genetic elements in sequences of bacterial chromosomes, is presented. The program is based on the analysis of oligonucleotide usage variations in DNA sequences. 3,518 mobile genetic elements were identified in 637 bacterial genomes and further analyzed by sequence similarity and the functionality of the encoded proteins. The results of this study are stored in an open database (http://anjie.bi.up.ac.za/geidb/geidbhome.php). The computer program and the database provide information valuable for further investigation of the distribution of mobile genetic elements and virulence factors among bacteria. The program is available for download at www.bi.up.ac.za/SeqWord/sniffer/index.html.
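The underlying signal, oligonucleotide usage variation, can be illustrated by counting tetranucleotide frequencies in sliding windows and comparing them to the genome-wide profile (windows with a deviant profile are candidate horizontally acquired islands); the toy sequence, window size, and threshold below are not SeqWord's actual parameters:

```python
from collections import Counter

def tetranucleotide_profile(seq: str) -> Counter:
    """Relative frequencies of overlapping 4-mers in a DNA sequence."""
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts.values())
    return Counter({k: v / total for k, v in counts.items()})

def deviation(window: Counter, genome: Counter) -> float:
    """Sum of absolute frequency differences between two 4-mer profiles."""
    return sum(abs(window[k] - genome[k]) for k in set(window) | set(genome))

# Toy genome with an AT-rich insert mimicking an acquired island.
genome_seq = ("ATGCGT" * 300) + ("AAAATTTT" * 100) + ("ATGCGT" * 300)
genome_profile = tetranucleotide_profile(genome_seq)
window, step = 400, 200
for start in range(0, len(genome_seq) - window + 1, step):
    d = deviation(tetranucleotide_profile(genome_seq[start:start + window]),
                  genome_profile)
    if d > 1.0:   # arbitrary threshold for this toy example
        print(f"deviant window at {start}-{start + window}: {d:.2f}")
```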
Abstract: In this paper, we propose a fixed formatting method for PPX (Pretty Printer for XML). PPX is a query language for XML databases with extensive formatting capability that produces HTML as the result of a query. The fixed formatting method completely specifies the combination of variables and layout specification operators within the layout expression of the GENERATE clause of PPX. In our experiment, a quick comparison shows that PPX requires far less description than XSLT or XQuery programs performing the same tasks.
Abstract: Text mining is an important step of the knowledge discovery process. It is used to extract hidden information from non-structured or semi-structured data. This aspect is fundamental because much of the Web's information is semi-structured due to the nested structure of HTML code, much of it is linked, and much of it is redundant. Web text mining supports the whole knowledge mining process through the mining, extraction, and integration of useful data, information, and knowledge from Web page contents.
In this paper, we present a Web text mining process able to discover knowledge in a distributed and heterogeneous multi-organization environment. The process is based on a flexible architecture and is implemented in four steps that examine web content and extract useful hidden information through mining techniques. Our Web text mining prototype starts from the retrieval of Web job offers, from which a text mining process draws out information useful for their fast classification; this information essentially comprises the job offer's location and required skills.
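A toy sketch of the classification-oriented extraction step, pulling a location and skill keywords out of a job-offer text with simple dictionary matching; the vocabularies and the sample offer are invented, and a real pipeline would use proper mining techniques rather than exact keyword lookup:

```python
import re

SKILLS = {"java", "sql", "html", "python", "linux"}   # toy skill vocabulary
PLACES = {"milan", "rome", "turin", "naples"}         # toy place vocabulary

def classify_offer(text: str) -> dict:
    """Extract place and skill mentions used to classify a job offer."""
    tokens = set(re.findall(r"[a-z+#]+", text.lower()))
    return {"place": sorted(tokens & PLACES),
            "skills": sorted(tokens & SKILLS)}

offer = "Web developer wanted in Milan. Required: HTML, SQL and Java."
print(classify_offer(offer))
# {'place': ['milan'], 'skills': ['html', 'java', 'sql']}
```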
Abstract: The internet has become an attractive avenue for global e-business, e-learning, knowledge sharing, etc. Due to the continuous increase in the volume of web content, it is not practically possible for a user to extract information by browsing and integrating data from the huge number of web sources retrieved by existing search engines. Semantic web technology enables advances in information extraction by providing a suite of tools to integrate data from different sources. To take full advantage of the semantic web, it is necessary to annotate existing web pages into semantic web pages. This research develops a tool, named OWIE (Ontology-based Web Information Extraction), for semantic web annotation using domain-specific ontologies. The tool automatically extracts information from HTML pages with the help of pre-defined ontologies and gives it a semantic representation. Two case studies have been conducted to analyze the accuracy of OWIE.
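A highly simplified sketch of ontology-driven annotation: stripping an HTML page to text and matching ontology labels to emit class-instance annotations. The mini ontology and the sample page are placeholders; OWIE's actual extraction rules are not reproduced here:

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Strip tags, keeping only the text content of an HTML page."""
    def __init__(self):
        super().__init__()
        self.chunks = []
    def handle_data(self, data):
        self.chunks.append(data)

# Hypothetical domain ontology: class label -> pattern capturing an instance.
ONTOLOGY = {
    "Course":   re.compile(r"course\s+([A-Z]{2,4}\s?\d{3})"),
    "Lecturer": re.compile(r"taught by\s+(Dr\.?\s+\w+)"),
}

page = "<html><body><p>The course CS 101 is taught by Dr. Smith.</p></body></html>"
extractor = TextExtractor()
extractor.feed(page)
text = " ".join(extractor.chunks)

# Emit simple (class, instance) annotations for each ontology match.
for cls, pattern in ONTOLOGY.items():
    for match in pattern.finditer(text):
        print(f"({cls}, '{match.group(1)}')")
```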
Abstract: Today's technology is heavily dependent on web applications, which users are accepting at a very rapid pace. They have made our work efficient, and they include webmail, online retail, online gaming, wikis, and train and flight departure and arrival boards; the list is very long. They are developed in different languages such as PHP, Python, C#, and ASP.NET, together with scripting and markup technologies such as HTML and JavaScript. Attackers develop tools and techniques to exploit web applications and legitimate websites, which has led to the rise of web application security, broadly classified into declarative security and program security. The most common attacks on applications are SQL injection and XSS, which can give unauthorized users access that lets them damage or destroy the system entirely. This paper presents a detailed literature review and analysis of web application security, examples of attacks, and steps to mitigate the vulnerabilities.
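As a standard illustration of the mitigation side (not drawn from the paper itself): SQL injection is avoided by using parameterized queries instead of string concatenation, and reflected XSS by escaping output, sketched here with Python's sqlite3 and html modules:

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

user_input = "alice' OR '1'='1"   # classic injection payload

# Vulnerable: attacker-controlled input concatenated into the SQL string.
rows = conn.execute(
    "SELECT secret FROM users WHERE name = '" + user_input + "'").fetchall()
print("vulnerable query leaked:", rows)

# Safe: placeholder binding treats the input strictly as data.
rows = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (user_input,)).fetchall()
print("parameterized query returned:", rows)

# XSS mitigation: escape user input before echoing it into HTML.
print("<p>Hello, " + html.escape("<script>alert(1)</script>") + "</p>")
```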
Abstract: This paper presents a new approach for automatic document categorization. Exploiting the logical structure of the document, our approach assigns an HTML document to one or more categories (thesis, paper, call for papers, email, ...). Using a set of training documents, our approach generates a set of rules used to categorize new documents. The approach's flexibility comes from associating each rule with a weight representing its importance in discriminating between possible categories; this weight is dynamically modified at each new document categorization. Experiments with the proposed approach provide satisfactory results.
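A toy sketch of weighted rule-based categorization with dynamic weight updates; the rules, categories, and multiplicative update factor are invented for illustration and do not correspond to the paper's learned rules:

```python
# Hypothetical rules: (predicate over a document, category, weight).
RULES = [
    (lambda d: "call for papers" in d.lower(), "cfp",   1.0),
    (lambda d: "abstract"        in d.lower(), "paper", 1.0),
    (lambda d: "dear"            in d.lower(), "email", 1.0),
]

def categorize(doc: str) -> str:
    """Sum the weights of the rules that fire, per category."""
    scores = {}
    for predicate, category, weight in RULES:
        if predicate(doc):
            scores[category] = scores.get(category, 0.0) + weight
    return max(scores, key=scores.get) if scores else "unknown"

def feedback(doc: str, true_category: str, factor: float = 1.1):
    """Dynamically reweight the rules after each categorized document."""
    for i, (predicate, category, weight) in enumerate(RULES):
        if predicate(doc):
            adjusted = weight * (factor if category == true_category else 1 / factor)
            RULES[i] = (predicate, category, adjusted)

doc = "Call for Papers: submit your abstract by June."
print(categorize(doc))   # both the 'cfp' and 'paper' rules fire
feedback(doc, "cfp")     # strengthens the 'cfp' rule, weakens 'paper'
```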
Abstract: Phishing, the stealing of sensitive information on the web, has dealt a major blow to Internet security in recent times. Most existing anti-phishing solutions fail to handle the fuzziness involved in phish detection, leading to a large number of false positives. This fuzziness is attributed to the use of the highly flexible and, at the same time, highly ambiguous HTML language. We introduce a new perspective on phishing that tries to systematically establish whether a given page is phished or not, using the corresponding original page as the basis of comparison. It analyzes the layout of the pages under consideration to determine the percentage distortion between them, indicative of any form of malicious alteration. The system design represents an intelligent system employing dynamic assessment, which accurately identifies brand-new phishing attacks and will prove effective in reducing the number of false positives. This framework could potentially also serve as a knowledge base for educating internet users against phishing.
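One simple way to approximate a layout-distortion percentage (an assumption for illustration; the paper's actual layout analysis is not specified here) is to compare the tag sequences of the two pages with a standard sequence matcher:

```python
import difflib
from html.parser import HTMLParser

class TagSequence(HTMLParser):
    """Collect a page's tag sequence as a crude layout signature."""
    def __init__(self):
        super().__init__()
        self.tags = []
    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def layout_distortion(original_html: str, suspect_html: str) -> float:
    signatures = []
    for page in (original_html, suspect_html):
        parser = TagSequence()
        parser.feed(page)
        signatures.append(parser.tags)
    similarity = difflib.SequenceMatcher(None, *signatures).ratio()
    return (1 - similarity) * 100   # percent layout distortion

original = "<html><body><form><input><input><button></button></form></body></html>"
suspect = "<html><body><form><input><input><input><iframe></iframe></form></body></html>"
print(f"distortion: {layout_distortion(original, suspect):.1f}%")
```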
Abstract: MicroRNAs (miRNAs) are a class of non-coding RNAs that hybridize to mRNAs and induce either translation repression or mRNA cleavage. Recently, it has been reported that miRNAs could play an important role in human diseases. By integrating information on miRNA target genes, cancer genes, and miRNA and mRNA expression profiles, a database has been developed to link miRNAs to cancer target genes. The database provides experimentally verified human miRNA target gene information, including oncogenes and tumor suppressor genes. In addition, fragile-site information for miRNAs and the strength of the correlation between each miRNA's expression level and that of its target mRNA across nine tissue types are computed; these serve as indicators that a miRNA could play a role in human cancer. The database is freely accessible at http://ppi.bioinfo.asia.edu.tw/mirna_target/index.html.
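The expression-correlation indicator can be illustrated as a Pearson correlation between a miRNA's expression vector and its target mRNA's expression vector across tissues; the nine values below are fabricated for the example, and a strongly negative correlation is what repression of the target would suggest:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Fabricated expression levels across nine tissue types.
mirna_expr = [5.1, 4.8, 6.0, 2.1, 3.3, 5.5, 4.0, 6.2, 2.5]
target_mrna = [1.2, 1.5, 0.9, 4.8, 3.9, 1.1, 2.7, 0.8, 4.2]

r = pearson(mirna_expr, target_mrna)
print(f"r = {r:.2f}")   # a strongly negative r is consistent with repression
```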
Abstract: EGOTHOR is a search engine that indexes the Web and allows us to search Web documents. Its hit list contains the URL and title of each hit, along with a snippet that tries to show the match briefly. The snippet can almost always be assembled by an algorithm that has full knowledge of the original document (mostly an HTML page), which implies that the search engine is required to store the full text of the documents as part of the index.
Such a requirement leads us to pick an appropriate compression algorithm to reduce the space demand. One solution would be to use common compression methods, for instance gzip or bzip2, but it might be preferable to develop a new method that takes advantage of the document structure, or rather, the textual character of the documents.
There already exist special text compression algorithms and methods for the compression of XML documents. The aim of this paper is an integration of the two approaches to achieve an optimal compression ratio.
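The common-method baseline the paper compares against can be sketched directly with Python's standard gzip and bz2 modules; the sample document is a placeholder, and a structure-aware method would additionally model the tag/text split:

```python
import bz2
import gzip

# Placeholder stored document; a real index entry would be a full HTML page.
document = ("<html><head><title>Example</title></head><body>"
            + "<p>Some repeated indexable text. </p>" * 200
            + "</body></html>").encode("utf-8")

for name, compress in (("gzip", gzip.compress), ("bzip2", bz2.compress)):
    packed = compress(document)
    ratio = len(packed) / len(document)
    print(f"{name:6s} {len(document)} -> {len(packed)} bytes ({ratio:.1%})")
```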