Abstract: Digital investigators often have a hard time spotting evidence in digital information. It has become hard to determine which source of proof relates to a specific investigation. A growing concern is that the processes, technologies, and specific procedures used in digital investigation are not keeping up with criminal developments. Criminals are therefore taking advantage of these weaknesses to commit further crimes. In digital forensics investigations, artificial intelligence (AI) is invaluable in identifying crime. The goal of digital forensics and digital investigation is to provide objective data and conduct an assessment, which assists in developing a plausible theory that can be presented as evidence in court. This paper aims to develop a multiagent framework for digital investigations using specific intelligent software agents (ISAs). The agents communicate to address particular tasks jointly and keep the same objectives in mind during each task. The rules and knowledge contained within each agent depend on the investigation type. A criminal investigation is classified quickly and efficiently using the case-based reasoning (CBR) technique. The proposed framework is implemented using the Java Agent Development Framework, Eclipse, a Postgres repository, and a rule engine for agent reasoning. The proposed framework was tested using the Lone Wolf image files and datasets. Experiments were conducted using various sets of ISAs and VMs. There was a significant reduction in the time taken for the Hash Set Agent to execute. As a result of loading the agents, 5% of the time was lost, as the File Path Agent prescribed deleting 1,510, while the Timeline Agent found multiple executable files. In comparison, the integrity check carried out on the Lone Wolf image file using a digital forensic tool kit took approximately 48 minutes (2,880 s), whereas the MADIK framework accomplished this in 16 minutes (960 s).
The framework is integrated with Python, allowing for further integration of other digital forensic tools, such as AccessData Forensic Toolkit (FTK), Wireshark, Volatility, and Scapy.
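As an illustration of the kind of integrity check mentioned above, the following is a minimal sketch in Python of hashing an image file in chunks and comparing the digest with a reference value. The function names `file_digest` and `verify_image`, the choice of SHA-256, and the chunk size are assumptions made for this example; they are not taken from the framework itself.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so a large image never sits fully in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_image(path, expected_hex, algorithm="sha256"):
    """Return True when the file's digest matches the reference digest (case-insensitive)."""
    return file_digest(path, algorithm) == expected_hex.lower()
```

Reading in chunks keeps memory use constant regardless of image size, which matters for multi-gigabyte disk images.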
Abstract: The aim of this paper is to propose pedagogical design for learning management systems (LMS) that offers greater inclusion for students based on a number of theoretical perspectives and delineated through an example. Considering the impact of COVID-19, including on student mental health, the research suggesting the importance of student sense of belonging on retention, success, and student well-being, the author describes intentional LMS design incorporating theoretically based practices informed by critical theory, feminist theory, indigenous theory and practices, and new materiality. This article considers important aspects of these theories and practices which attend to inclusion, identities, and socially just learning environments. Additionally, increasing student sense of belonging and mental health through LMS design influenced by adult learning theory and the community of inquiry model are described. The process of thinking through LMS pedagogical design with inclusion intentionally in mind affords the opportunity to allow LMS to go beyond course use as a repository of documents, to an intentional community of practice that facilitates belonging and connection, something much needed in our times. In virtual learning environments it has been harder to discern how students are doing, especially in feeling connected to their courses, their faculty, and their student peers. Increasingly at the forefront of public universities is addressing the needs of students with multiple and intersecting identities and the multiplicity of needs and accommodations. Education in 2020, and moving forward, calls for embedding critical theories and inclusive ideals and pedagogies to the ways instructors design and teach in online platforms. Through utilization of critical theoretical frameworks and instructional practices, students may experience the LMS as a welcoming place with intentional plans for welcoming diversity in identities.
Abstract: Neural network quantization is a highly desirable
procedure to perform before running neural networks on mobile
devices. Quantization without fine-tuning leads to an accuracy drop in
the model, whereas the commonly used training with quantization is done
on the full set of labeled data and is therefore both time- and
resource-consuming. Real-life applications require simplification and
acceleration of the quantization procedure that maintain the accuracy of
the full-precision neural network, especially for modern mobile neural
network architectures like MobileNet-v1, MobileNet-v2 and MNAS. Here we present a method to significantly optimize the
training-with-quantization procedure by introducing trained scale factors for
discretization thresholds that are separate for each filter. Using the
proposed technique, we quantize the modern mobile architectures of
neural networks with a training set of only ∼10% of the
total ImageNet 2012 sample. Such a reduction of the training dataset size and
the small number of trainable parameters allow the network to be fine-tuned
within several hours while maintaining the high accuracy of the quantized
model (accuracy drop was less than 0.5%). Ready-for-use models and
code are available in the GitHub repository.
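The idea of per-filter scale factors on discretization thresholds can be sketched as follows. This is an illustrative simplification, not the authors' released code: it assumes symmetric quantization, takes each filter's base threshold to be its maximum absolute weight, and treats the scale factors `alphas` as already trained values. The function names are hypothetical.

```python
def quantize_per_filter(filters, alphas, num_bits=8):
    """Quantize each filter with its own threshold, scaled by a per-filter factor.

    filters: list of flat weight lists, one per filter (assumed layout).
    alphas:  per-filter scale factors applied to the base threshold (assumed trained).
    """
    levels = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    quantized, steps = [], []
    for weights, alpha in zip(filters, alphas):
        # Base threshold: largest absolute weight in this filter (an assumption).
        threshold = max(alpha * max(abs(w) for w in weights), 1e-12)
        step = threshold / levels
        # Round to the nearest level and clip values beyond the scaled threshold.
        codes = [max(-levels, min(levels, round(w / step))) for w in weights]
        quantized.append(codes)
        steps.append(step)
    return quantized, steps


def dequantize(quantized, steps):
    """Map integer codes back to approximate floating-point weights."""
    return [[c * s for c in codes] for codes, s in zip(quantized, steps)]
```

With a scale factor of 1 nothing is clipped and the rounding error is at most half a step; a factor below 1 narrows the threshold, trading clipping of outliers for finer resolution of the remaining weights.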
Abstract: In this work, a training algorithm for probabilistic neural networks (PNN) is presented. The algorithm addresses one of the major drawbacks of PNN, which is the size of the hidden layer in the network. By using a cross-validation training algorithm, the number of hidden neurons is shrunk to a smaller number consisting of the most representative samples of the training set. This is done without affecting the overall architecture of the network. Performance of the network is compared against performance of standard PNN for different databases from the UCI database repository. Results show an important gain in network size and performance.
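To make the hidden-layer size issue concrete, here is a minimal sketch of a standard PNN decision rule in Python. It shows the baseline network, not the reduction algorithm described in the abstract; the function name `pnn_classify` and the smoothing parameter `sigma` are assumptions for the example.

```python
import math


def pnn_classify(x, train_points, train_labels, sigma=0.5):
    """Classify x with a standard PNN: one Gaussian pattern unit per training sample."""
    scores = {}
    counts = {}
    for point, label in zip(train_points, train_labels):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, point))
        # Pattern-layer activation of the hidden neuron stored for this sample.
        activation = math.exp(-sq_dist / (2.0 * sigma ** 2))
        scores[label] = scores.get(label, 0.0) + activation
        counts[label] = counts.get(label, 0) + 1
    # Summation layer averages per class; the output layer picks the largest average.
    return max(scores, key=lambda label: scores[label] / counts[label])
```

Because every training sample becomes a hidden neuron, evaluation cost grows linearly with the training set, which is why keeping only the most representative samples shrinks the network.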
Abstract: In telemedicine, the image repository service is important for increasing the accuracy of the diagnostic support given to medical personnel. This study compares two routing algorithms with regard to quality of service (QoS), in order to analyze the optimal performance at the time of loading and/or downloading medical images. The study focuses on comparing the performance of Tabu Search with other heuristic and metaheuristic algorithms that improve QoS in telemedicine services in Colombia. For this purpose, the Tabu Search and Simulated Annealing heuristic algorithms were chosen for their high usability in this type of application; the QoS is measured using the following metrics: Delay, Throughput, Jitter and Latency. In addition, routing tests were carried out on ten 40 MB images in Digital Imaging and Communications in Medicine (DICOM) format. These tests were carried out for ten minutes under different traffic conditions, reaching a total of 25 tests, from a server at Universidad Militar Nueva Granada (UMNG) in Bogotá, Colombia, to a remote user at Universidad de Santiago de Chile (USACH), Chile. The results show that Tabu Search presents better QoS performance than Simulated Annealing, managing to optimize the routing of medical images, a basic requirement for offering diagnostic image services in telemedicine.
Abstract: Feature selection and attribute reduction are crucial
problems, and widely used techniques in the field of machine
learning, data mining and pattern recognition to overcome the
well-known phenomenon of the Curse of Dimensionality. This paper
presents a feature selection method that efficiently carries out attribute
reduction, thereby selecting the most informative features of a dataset.
It consists of two components: 1) a measure for feature subset
evaluation, and 2) a search strategy. For the evaluation measure,
we have employed the fuzzy-rough dependency degree (FRDD)
of the lower approximation-based fuzzy-rough feature selection
(L-FRFS) due to its effectiveness in feature selection. As for the
search strategy, a modified version of a binary shuffled frog leaping
algorithm is proposed (B-SFLA). The proposed feature selection
method is obtained by hybridizing the B-SFLA with the FRDD. Nine
classifiers have been employed to compare the proposed approach
with several existing methods over twenty-two datasets, including
nine high dimensional and large ones, from the UCI repository.
The experimental results demonstrate that the B-SFLA approach
significantly outperforms other metaheuristic methods in terms of the
number of selected features and the classification accuracy.
Abstract: Missing values in real-world datasets are a common
problem. Many algorithms have been developed to deal with this
problem; most of them replace the missing values with a fixed
value computed from the observed values. In
our work, we used a distance function based on the Bhattacharyya
distance to measure the distance between objects with missing
values. The Bhattacharyya distance measures the similarity of
two probability distributions. The proposed distance distinguishes
between known and unknown values: the distance between
two known values is the Mahalanobis distance, whereas when
one of them is missing, the distance is computed based on the
distribution of the known values for the coordinate that contains
the missing value. This method was integrated with Wikaya, a
digital health company developing a platform that helps to improve
prevention of chronic diseases such as diabetes and cancer. For
Wikaya’s recommendation system to work, the distance between users
needs to be measured. Since there are missing values in the collected
data, a distance function is needed that measures distances between
incomplete user profiles. To evaluate the accuracy of the proposed
distance function in reflecting the actual similarity between different
objects, when some of them contain missing values, we integrated it
within the framework of the k-nearest neighbors (kNN) classifier, since
its computation is based only on the similarity between objects. To
validate this, we ran the algorithm over diabetes and breast cancer
datasets, standard benchmark datasets from the UCI repository. Our
experiments show that kNN classifier using our proposed distance
function outperforms the kNN using other existing methods.
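For reference, the Bhattacharyya distance between two discrete probability distributions can be computed as in the sketch below. This shows only the textbook definition for discrete distributions; how the abstract's method combines it with the Mahalanobis distance for partially observed objects is not reproduced here, and the function name is an assumption.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions over the same bins."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    # Bhattacharyya coefficient: overlap between the two distributions, in [0, 1].
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    coefficient = min(coefficient, 1.0)  # guard against floating-point overshoot
    if coefficient == 0.0:
        return math.inf  # no overlap at all
    return -math.log(coefficient)
```

Identical distributions give a distance of 0, and the distance grows as the overlap between the distributions shrinks.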
Abstract: The production and publication of scientific works have increased significantly in recent years, with the Internet being the main channel of access to and distribution of these works. Consequently, there is a growing interest in understanding how scientific research has evolved, in order to use this knowledge to encourage research groups to become more productive. Therefore, the objective of this work is to explore repositories containing data from scientific publications and to characterize the keyword networks of these publications, in order to identify the most relevant keywords and to highlight those that have the greatest impact on the network. To do this, the keywords of each article in the study repository are extracted and used to characterize the network, after which several social network analysis metrics are applied to identify the highlighted keywords.
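A keyword network of this kind can be sketched in a few lines of Python. The example below assumes that two keywords are linked when they appear in the same article and uses degree centrality as one possible social network analysis metric; the abstract does not state which metrics or construction rules were actually used, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_keyword_network(articles):
    """Return {(kw_a, kw_b): co-occurrence count} from per-article keyword lists."""
    edges = defaultdict(int)
    for keywords in articles:
        # Normalize case and whitespace, and ignore duplicates within one article.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return dict(edges)


def degree_centrality(edges):
    """Fraction of the other keywords each keyword is directly linked to."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}
```

Keywords with high centrality co-occur with many others and are candidates for the "highlighted" keywords; edge weights additionally show how often each pair appears together.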
Abstract: The Canadian Used Fuel Container (UFC) is a mid-size, hemispherical-headed, copper-coated steel container measuring 2.5 meters in length and 0.5 meters in diameter and containing 48 used fuel bundles. The contained used fuel produces significant gamma radiation, requiring automated processes to complete the assembly. The design throughput of 2,500 UFCs per year places constraints on equipment and hot cell design for repeatability, speed of processing, robustness and recovery from upset conditions. After UFC assembly, the UFC is inserted into a Buffer Box (BB). The BB is made from adequately pre-shaped blocks (lower and upper block) and Highly Compacted Bentonite (HCB) material. After assembly, the blocks effectively ‘sandwich’ the UFC between them. This paper identifies one possible approach for the BB automatic assembly cell and processes. Automation of the BB assembly will have a significant positive impact on nuclear safety, quality, productivity, and reliability.
Abstract: Heritage trees are large, natural, individual trees of exceptional value due to their association with age, events, or distinguished people. In Malaysia, there is an abundance of tropical heritage trees throughout the country. It is essential to set up a repository of heritage trees to prevent valuable trees from being cut down. In this cross-domain study, a web-based online expert system, namely the Heritage Tree Expert Assessment and Classification (HTEAC), is developed and deployed for the public to nominate potential heritage trees. Based on the nomination, tree care experts or arborists evaluate and verify the nominated trees as heritage trees. The expert system automatically rates the approved heritage trees according to pre-defined grades via the Delphi technique. The features and a usability test of the expert system are presented. The preliminary results are promising for the system to be used as a full-scale public system.
Abstract: The newest Canadian Used Fuel Container (UFC), also called “Mark II”, modifies the design approach for its Assembly Robotic Cell (ARC) in the Canadian Used (Nuclear) Fuel Packing Plant (UFPP). Some of the robotic design solutions are presented in this paper. The design indicates that robots and manipulators are expected to be used in the Canadian UFPP. As is normal, the UFPP design will incorporate redundancy of all equipment to allow expedient recovery from any postulated upset conditions. Overall, this paper suggests that robot usage will have a significant positive impact on nuclear safety, quality, productivity, and reliability.
Abstract: Modern industrial automation relies on service-oriented concepts of Internet of Things (IoT) device modeling in order to provide a flexible and extendable environment for a service meta-repository. However, state-of-the-art meta-modeling techniques prefer design-time modeling, which results in heavy use of class-based, and sometimes unnecessary, static subtyping. Although this approach benefits from clear-cut object-oriented design principles, it also seals the model repository against further dynamic extensions. In this paper, a dynamic multi-level modeling approach is introduced that enables dynamic subtyping through a more relaxed partial instantiation mechanism. The approach is demonstrated on a simple sensor network example.
Abstract: This web-based project focuses on continuing corporate education and improving workers' skills in Brazilian radioactive facilities throughout the country. The potential of Information and Communication Technologies (ICTs) can help improve communication across this very large country, where it is a major challenge to deliver high-quality professional information to as many people as possible. The main objective of this system is to provide Brazilian radioactive facilities with a complete web-based repository, in Portuguese, for research, consultation and information, offering conditions for learning and improving professional and personal skills. UNIPRORAD is a web-based system that offers unified programs and inter-related information about radiological protection programs. The content includes the best practices for radioactive facilities in order to meet both national standards and international recommendations published by different organizations over the past decades: the International Commission on Radiological Protection (ICRP), the International Atomic Energy Agency (IAEA) and the National Nuclear Energy Commission (CNEN). The website includes concepts, definitions and theory about optimization and ionizing radiation monitoring procedures. Moreover, the content presents further discussions related to some national and international recommendations, such as potential exposure, which is currently one of the most important research fields in radiological protection. Only two ICRP publications address the issue explicitly, and there is still a lack of knowledge of failure probabilities, since it remains uncertain how to quantify probabilistically the occurrence of potential exposures and the probability of reaching a certain dose level. To respond to this challenge, this project discusses and introduces potential exposures in a more quantitative way than national and international recommendations.
Drawing together valid ICRP and IAEA recommendations and official reports, in addition to scientific papers published in major international congresses, the website discusses and suggests a number of effective actions towards safety which can be incorporated into labor practice. The web platform was created according to the needs of its corporate public, taking into account the development of a robust but flexible system that can be easily adapted to future demands. ICTs provide a vast array of new communication capabilities and allow information to be spread to as many people as possible at low cost and with high-quality communication. This initiative shall provide opportunities for employees to increase their professional skills, stimulating development in this large country where it is an enormous challenge to deliver effective and updated information to geographically distant facilities while minimizing costs and optimizing results.
Abstract: The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the algorithm often converges to local minima for the reason that it is sensitive to the initial cluster centers. In this paper, an algorithm for selecting initial cluster centers on the basis of minimum spanning tree (MST) is presented. The set of vertices in MST with same degree are regarded as a whole which is used to find the skeleton data points. Furthermore, a distance measure between the skeleton data points with consideration of degree and Euclidean distance is presented. Finally, MST-based initialization method for the k-means algorithm is presented, and the corresponding time complexity is analyzed as well. The presented algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate the effectiveness of the presented algorithm compared to three existing initialization methods.
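The first ingredient of such an approach, building the MST over the data points and reading off each vertex's degree, can be sketched as follows. This uses Prim's algorithm on Euclidean distances as an assumed construction; the skeleton-point selection and the combined degree/distance measure described in the abstract are not reproduced, and the function names are hypothetical.

```python
import math


def minimum_spanning_tree(points):
    """Return MST edges (as index pairs) over points using Prim's algorithm, O(n^2)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection of each vertex to the tree
    best_parent = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges


def vertex_degrees(n, edges):
    """Count how many MST edges touch each vertex."""
    degrees = [0] * n
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees
```

Vertices can then be grouped by equal degree, which is the starting point the abstract describes for locating skeleton data points.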
Abstract: Petri nets are the first standard for business
process modeling. Most probably, this is one of the core reasons why
all new standards created afterwards have to be reformed so as to
reach the stage of mapping the new standard onto Petri nets. The paper presents a business process repository based on a
universal database. The repository allows the data
about a given process to be stored in three different ways. The business
process repository is developed with regard to the reformation of a
given model to a Petri net so that it can be easily simulated. Two different techniques for business process simulation based on
Petri nets, Yasper and Woflan, are discussed. Their advantages and
drawbacks are outlined. The way of simulating business process
models stored in the business process repository is shown.
Abstract: Communicating users' needs, goals and problems helps
designers and developers overcome challenges faced by end users.
Personas are used to represent end users’ needs. In our research,
creating personas allowed the following questions to be answered:
Who are the potential user groups? What do they want to achieve by
using the service? What are the problems that users face? What
should the service provide to them? To develop realistic personas, we
conducted a focus group discussion with undergraduate and graduate
students and also interviewed a university librarian. The personas
were created to help evaluate the Institutional Repository that is
based on the DSpace system. The profiles helped to communicate
users' needs, abilities, tasks, and problems, and the task scenarios
used in the heuristic evaluation were based on these personas. Four
personas resulted from a focus group discussion with undergraduate and
graduate students and from an interview with a university librarian. We
then used these personas to create focused task-scenarios for a
heuristic evaluation on the system interface to ensure that it met
users' needs, goals, problems and desires. In this paper, we present
the process that we used to create the personas that led to devising the
task scenarios used in the heuristic evaluation as a follow up study of
the DSpace university repository.
Abstract: Ontology validation is an important part of web
applications’ development, where knowledge integration and
ontological reasoning play a fundamental role. It aims to ensure the
consistency and correctness of ontological knowledge and to
guarantee that ontological reasoning is carried out in a meaningful
way. Existing approaches to ontology validation address more or less
specific validation issues, but the overall process of validating web
ontologies has not been formally established yet. As the size and the
number of web ontologies continue to grow, more web applications’
developers will rely on the existing repository of ontologies rather
than develop ontologies from scratch. If an application utilizes
multiple independently created ontologies, their consistency must be
validated and eventually adjusted to ensure proper interoperability
between them. This paper presents a validation technique intended to
test the consistency of independent ontologies utilized by a common
application.
Abstract: This paper describes the tradeoffs and the design from
scratch of a self-contained, easy-to-use health dashboard software
system that provides customizable data tracking for patients in smart
homes. The system is made up of different software modules and
comprises a front-end and a back-end component. Built with HTML,
CSS, and JavaScript, the front-end allows adding users, logging into
the system, selecting metrics, and specifying health goals. The backend
consists of a NoSQL Mongo database, a Python script, and a
SimpleHTTPServer written in Python. The database stores user
profiles and health data in JSON format. The Python script makes use
of the PyMongo driver library to query the database and displays
formatted data as a daily snapshot of user health metrics against
target goals. Any number of standard and custom metrics can be
added to the system, and corresponding health data can be fed
automatically, via sensor APIs or manually, as text or picture data
files. A real-time METAR request API permits correlating weather
data with patient health, and an advanced query system is
implemented to allow trend analysis of selected health metrics over
custom time intervals. Available on the GitHub repository system,
the project is free to use for academic purposes of learning and
experimenting, or practical purposes by building on it.
Abstract: In the field of Knowledge and Data Engineering, the relational
database is the best repository for storing real-world data. It has
been in use around the world for several decades. Normalization
is the most important process for the analysis and design of relational
databases. It aims at creating a set of relational tables with minimum
data redundancy that preserve consistency and facilitate correct
insertion, deletion, and modification. Normalization is a major task in
the design of relational databases. Despite its importance, very few
algorithms have been developed for use in the design of
commercial automatic normalization tools. It is also rare for
normalization to be done automatically rather than manually. Moreover, for
today's large and complex databases, manual normalization is even harder.
This paper presents a new, complete, automated relational database
normalization method. It first produces the directed graph and spanning
tree. It then proceeds with generating the 2NF, 3NF and also
BCNF normal forms. The benefit of this new algorithm is that it can
cope with a large set of complex functional dependencies.
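A basic building block of any automated normalization procedure is the attribute closure under a set of functional dependencies. The sketch below shows the textbook closure computation and a simple BCNF check built on it; it is not the graph and spanning-tree method of the abstract, and the function names and the representation of dependencies as pairs of attribute sets are assumptions.

```python
def attribute_closure(attributes, dependencies):
    """Return all attributes functionally determined by `attributes`.

    dependencies: iterable of (lhs, rhs) pairs, each a set of attribute names.
    """
    closure = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            # If the closure already covers lhs, it must also include rhs.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def bcnf_violations(relation, dependencies):
    """List the given non-trivial dependencies whose left side is not a superkey."""
    relation = set(relation)
    violations = []
    for lhs, rhs in dependencies:
        trivial = set(rhs) <= set(lhs)
        is_superkey = attribute_closure(lhs, dependencies) >= relation
        if not trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations
```

For a relation R(A, B, C) with A → B and B → C, the closure of {A} is {A, B, C}, so A is a key, while B → C is reported as a BCNF violation because {B} does not determine A.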
Abstract: Because only a small number of complex systems in artificial immune systems (AIS) solve nonlinear problems, nonlinear AIS approaches, among the well-known solution techniques, need to be developed. The Gaussian function is commonly used for similarity estimation in classification and pattern recognition problems. In this study, diagnosis of breast cancer, the second most widespread type of cancer in women, was performed with three distance calculation functions (Euclidean, Gaussian, and a Gaussian-Euclidean hybrid) in the clonal selection model of classical AIS on the Wisconsin Breast Cancer Dataset (WBCD), which was taken from the University of California, Irvine Machine Learning Repository. We used 3-fold cross-validation to train and test on the dataset. According to the results, the maximum test classification accuracy was 97.35%, obtained using the Gaussian-Euclidean hybrid function for fold 3. In addition, the mean test classification accuracies were 94.78%, 94.45% and 95.31% for the Euclidean, Gaussian and Gaussian-Euclidean functions, respectively. With these results, the Gaussian-Euclidean hybrid function appears to be a promising distance calculation method, and it may be considered as an alternative distance calculation method for hard nonlinear classification problems.