Abstract: In recent years, SQL injection attacks have been identified as being prevalent against web applications. They affect network security and user data, which leads to a considerable loss of money and data every year. This paper presents the use of classification algorithms in machine learning using a method to classify the login data filtering inputs into "SQLi" or "Non-SQLi,” thus increasing the reliability and accuracy of results in terms of deciding whether an operation is an attack or a valid operation. A method as a Web-App is developed for auto-generated data replication to provide a twin of the targeted data structure. Shielding against SQLi attacks (WebAppShield) that verifies all users and prevents attackers (SQLi attacks) from entering and or accessing the database, which the machine learning module predicts as "Non-SQLi", has been developed. A special login form has been developed with a special instance of the data validation; this verification process secures the web application from its early stages. The system has been tested and validated, and up to 99% of SQLi attacks have been prevented.
Abstract: Multimedia Indexing and Retrieval is generally de-signed and implemented by employing feature graphs. These graphs typically contain a significant number of nodes and edges to reflect the level of detail in feature detection. A higher level of detail increases the effectiveness of the results but also leads to more complex graph structures. However, graph-traversal-based algorithms for similarity are quite inefficient and computation intensive, espe-cially for large data structures. To deliver fast and effective retrieval, an efficient similarity algorithm, particularly for large graphs, is mandatory. Hence, in this paper, we define a graph-projection into a 2D space (Graph Code) as well as the corresponding algorithms for indexing and retrieval. We show that calculations in this space can be performed more efficiently than graph-traversals due to a simpler processing model and a high level of parallelisation. In consequence, we prove that the effectiveness of retrieval also increases substantially, as Graph Codes facilitate more levels of detail in feature fusion. Thus, Graph Codes provide a significant increase in efficiency and effectiveness (especially for Multimedia indexing and retrieval) and can be applied to images, videos, audio, and text information.
Abstract: The Internet of Things (IoT) will lead to the development of advanced Smart Home services that are pervasive, cost-effective, and can be accessed by home occupants from anywhere and at any time. However, advanced smart home applications will introduce grand security challenges due to the increase in the attack surface. Current approaches do not handle cybersecurity from a holistic point of view; hence, a systematic cybersecurity mechanism needs to be adopted when designing smart home applications. In this paper, we present a generic intrusion detection methodology to detect and mitigate the anomaly behaviors happened in Smart Home Systems (SHS). By utilizing our Smart Home Context Data Structure, the heterogeneous information and services acquired from SHS are mapped in context attributes which can describe the context of smart home operation precisely and accurately. Runtime models for describing usage patterns of home assets are developed based on characterization functions. A threat-aware action management methodology, used to efficiently mitigate anomaly behaviors, is proposed at the end. Our preliminary experimental results show that our methodology can be used to detect and mitigate known and unknown threats, as well as to protect SHS premises and services.
Abstract: This paper presents a general approach to implement
efficient queries’ interpreters in a functional programming language.
Indeed, most of the standard tools actually available use an imperative
and/or object-oriented language for the implementation (e.g. Java for
Jena-Fuseki) but other paradigms are possible with, maybe, better
performances. To proceed, the paper first explains how to model
data structures and queries in a functional point of view. Then, it
proposes a general methodology to get performances (i.e. number of
computation steps to answer a query) then it explains how to integrate
some optimization techniques (short-cut fusion and, more important,
data transformations). It then compares the functional server proposed
to a standard tool (Fuseki) demonstrating that the first one can be
twice to ten times faster to answer queries.
Abstract: This paper proposes an Internet of Things (IoT) based virtualization platform for providing smart tourism services. The virtualization platform provides a consistent access interface to various types of data by naming IoT devices and legacy information systems as pathnames in a virtual file system. In the other words, the IoT virtualization platform functions as a middleware which uses the metadata for underlying collected data. The proposed platform makes it easy to provide customized tourism information by using tourist locations collected by IoT devices and additionally enables to create new interactive smart tourism services focused on the tourist locations. The proposed platform is very efficient so that the provided tourism services are isolated from changes in raw data and the services can be modified or expanded without changing the underlying data structure.
Abstract: A practical and simple self-indexing data structure, Partitioned Elias-Fano (PEF) - Compressed Suffix Arrays (CSA), is built in linear time for the CSA based on PEF indexes. Moreover, the PEF-CSA is compared with two classical compressed indexing methods, Ferragina and Manzini implementation (FMI) and Sad-CSA on different type and size files in Pizza & Chili. The PEF-CSA performs better on the existing data in terms of the compression ratio, count, and locates time except for the evenly distributed data such as proteins data. The observations of the experiments are that the distribution of the φ is more important than the alphabet size on the compression ratio. Unevenly distributed data φ makes better compression effect, and the larger the size of the hit counts, the longer the count and locate time.
Abstract: Automatic program generation saves time, human resources, and allows receiving syntactically clear and logically correct modules. The 4-th generation programming languages are related to drawing the data and the processes of the subject area, as well as, to obtain a frame of the respective information system. The application can be separated in interface and business logic. That means, for an interactive generation of the needed system to be used an already existing toolkit or to be created a new one.
Abstract: Recent progress in the next generation of automobile
technology is geared towards incorporating information technology
into cars. Collectively called smart cars are bringing intelligence to
cars that provides comfort, convenience and safety. A branch of smart
cars is connected-car system. The key concept in connected-cars is the
sharing of driving information among cars through decentralized
manner enabling collective intelligence. This paper proposes a
foundation of the information model that is necessary to define the
driving information for smart-cars. Road conditions are modeled
through a unique data structure that unambiguously represent the time
variant traffics in the streets. Additionally, the modeled data structure
is exemplified in a navigational scenario and usage using UML.
Optimal driving route searching is also discussed using the proposed
data structure in a dynamically changing road conditions.
Abstract: The increasing availability of information about earth
surface elevation (Digital Elevation Models DEM) generated from
different sources (remote sensing, Aerial Images, Lidar) poses the
question about how to integrate and make available to the most than
possible audience this huge amount of data. In order to exploit the potential of 3D elevation representation the
quality of data management plays a fundamental role. Due to the high
acquisition costs and the huge amount of generated data, highresolution
terrain surveys tend to be small or medium sized and
available on limited portion of earth. Here comes the need to merge
large-scale height maps that typically are made available for free at
worldwide level, with very specific high resolute datasets. One the
other hand, the third dimension increases the user experience and the
data representation quality, unlocking new possibilities in data
analysis for civil protection, real estate, urban planning, environment
monitoring, etc. The open-source 3D virtual globes, which are
trending topics in Geovisual Analytics, aim at improving the
visualization of geographical data provided by standard web services
or with proprietary formats. Typically, 3D Virtual globes like do not
offer an open-source tool that allows the generation of a terrain
elevation data structure starting from heterogeneous-resolution terrain
datasets. This paper describes a technological solution aimed to set
up a so-called “Terrain Builder”. This tool is able to merge
heterogeneous-resolution datasets, and to provide a multi-resolution
worldwide terrain services fully compatible with CesiumJS and
therefore accessible via web using traditional browser without any
additional plug-in.
Abstract: A large amount of data is typically stored in relational
databases (DB). The latter can efficiently handle user queries which
intend to elicit the appropriate information from data sources.
However, direct access and use of this data requires the end users to
have an adequate technical background, while they should also cope
with the internal data structure and values presented. Consequently
the information retrieval is a quite difficult process even for IT or DB
experts, taking into account the limited contributions of relational
databases from the conceptual point of view. Ontologies enable users
to formally describe a domain of knowledge in terms of concepts and
relations among them and hence they can be used for unambiguously
specifying the information captured by the relational database.
However, accessing information residing in a database using
ontologies is feasible, provided that the users are keen on using
semantic web technologies. For enabling users form different
disciplines to retrieve the appropriate data, the design of a Graphical
User Interface is necessary. In this work, we will present an
interactive, ontology-based, semantically enable web tool that can be
used for information retrieval purposes. The tool is totally based on
the ontological representation of underlying database schema while it
provides a user friendly environment through which the users can
graphically form and execute their queries.
Abstract: This paper presents a real-time visualization technique
and filtering of classified LiDAR point clouds. The visualization is
capable of displaying filtered information organized in layers by the
classification attribute saved within LiDAR datasets. We explain the
used data structure and data management, which enables real-time
presentation of layered LiDAR data. Real-time visualization is
achieved with LOD optimization based on the distance from the
observer without loss of quality. The filtering process is done in two
steps and is entirely executed on the GPU and implemented using
programmable shaders.
Abstract: The advances in technology in the last five years
allowed an improvement in the educational area, as the increasing in
the development of educational software. One of the techniques that
emerged in this lapse is called Gamification, which is the utilization of
video game mechanics outside its bounds. Recent studies involving
this technique provided positive results in the application of these
concepts in many areas as marketing, health and education. In the last
area there are studies that covers from elementary to higher education,
with many variations to adequate to the educators methodologies.
Among higher education, focusing on IT courses, data structures are
an important subject taught in many of these courses, as they are
base for many systems. Based on the exposed this paper exposes
the development of an interactive web learning environment, called
DSLEP (Data Structure Learning Platform), to aid students in higher
education IT courses. The system includes basic concepts seen on
this subject such as stacks, queues, lists, arrays, trees and was
implemented to ease the insertion of new structures. It was also
implemented with gamification concepts, such as points, levels, and
leader boards, to engage students in the search for knowledge and
stimulate self-learning.
Abstract: Due to the widespread of mobile sensing, there is a strong need to handle trails of moving objects, and trajectories. This paper proposes three visual analytics approaches for higher order information of trajectory datasets based on the higher order Voronoi diagram data structure. Proposed approaches reveal geometrical, topological, and directional information. Experimental resultsdemonstrate the applicability and usefulness of proposed three approaches.
Abstract: Reducing energy consumption of embedded systems requires careful memory management. It has been shown that Scratch- Pad Memories (SPMs) are low size, low cost, efficient (i.e. energy saving) data structures directly managed at the software level. In this paper, the focus is on heuristic methods for SPMs management. A method is efficient if the number of accesses to SPM is as large as possible and if all available space (i.e. bits) is used. A Tabu Search (TS) approach for memory management is proposed which is, to the best of our knowledge, a new original alternative to the best known existing heuristic (BEH). In fact, experimentations performed on benchmarks show that the Tabu Search method is as efficient as BEH (in terms of energy consumption) but BEH requires a sorting which can be computationally expensive for a large amount of data. TS is easy to implement and since no sorting is necessary, unlike BEH, the corresponding sorting time is saved. In addition to that, in a dynamic perspective where the maximum capacity of the SPM is not known in advance, the TS heuristic will perform better than BEH.
Abstract: In recent years, new product development became more and more competitive and globalized, and the designing phase is critical for the product success. The concept of modularity can provide the necessary foundation for organizations to design products that can respond rapidly to market needs. The paper describes data structures and algorithms of intelligent Web-based system for modular design taking into account modules compatibility relationship and given design requirements. The system intelligence is realized by developed algorithms for choice of modules reflecting all system restrictions and requirements. The proposed data structure and algorithms are illustrated by case study of personal computer configuration. The applicability of the proposed approach is tested through a prototype of Web-based system.
Abstract: Sonogram images of normal and lymphocyte thyroid tissues have considerable overlap which makes it difficult to interpret and distinguish. Classification from sonogram images of thyroid gland is tackled in semiautomatic way. While making manual diagnosis from images, some relevant information need not to be recognized by human visual system. Quantitative image analysis could be helpful to manual diagnostic process so far done by physician. Two classes are considered: normal tissue and chronic lymphocyte thyroid (Hashimoto's Thyroid). Data structure is analyzed using K-nearest-neighbors classification. This paper is mentioned that unlike the wavelet sub bands' energy, histograms and Haralick features are not appropriate to distinguish between normal tissue and Hashimoto's thyroid.
Abstract: Bloom filter is a probabilistic and memory efficient
data structure designed to answer rapidly whether an element is
present in a set. It tells that the element is definitely not in the set but
its presence is with certain probability. The trade-off to use Bloom
filter is a certain configurable risk of false positives. The odds of a
false positive can be made very low if the number of hash function is
sufficiently large. For spam detection, weight is attached to each set
of elements. The spam weight for a word is a measure used to rate the
e-mail. Each word is assigned to a Bloom filter based on its weight.
The proposed work introduces an enhanced concept in Bloom filter
called Bin Bloom Filter (BBF). The performance of BBF over
conventional Bloom filter is evaluated under various optimization
techniques. Real time data set and synthetic data sets are used for
experimental analysis and the results are demonstrated for bin sizes 4,
5, 6 and 7. Finally analyzing the results, it is found that the BBF
which uses heuristic techniques performs better than the traditional
Bloom filter in spam detection.
Abstract: In this paper, we propose an improvement of pattern
growth-based PrefixSpan algorithm, called I-PrefixSpan. The general idea of I-PrefixSpan is to use sufficient data structure for Seq-Tree
framework and separator database to reduce the execution time and
memory usage. Thus, with I-PrefixSpan there is no in-memory database stored after index set is constructed. The experimental result
shows that using Java 2, this method improves the speed of PrefixSpan up to almost two orders of magnitude as well as the memory usage to more than one order of magnitude.
Abstract: This paper presented a collaborative education model,
which consists four parts: collaborative teaching, collaborative
working, collaborative training and interaction. Supported by an
e-learning platform, collaborative education was practiced in a data
structure e-learning course. Data collected shows that most of students
accept collaborative education. This paper goes one step attempting to
determine which aspects appear to be most important or helpful in
collaborative education.