Data Migration Methodology from Relational to NoSQL Databases

Data migration is currently a very active topic. As the number of applications has grown rapidly, the ever-increasing volume of collected data has driven an architectural migration from Relational Database Management Systems (RDBMS) to NoSQL (Not Only SQL) databases. This recent technology has become important in the field of database management. The main aim of this paper is to present a methodology for data migration from an RDBMS to a NoSQL database. To illustrate this methodology, we implement a software prototype using MySQL as the RDBMS and MongoDB as the NoSQL database. Although this is a hard engineering task, our results show that the proposed methodology can successfully accomplish the goal of this study.
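
As a minimal illustration of the kind of transformation such a migration involves, the sketch below reads a parent table and its child rows and embeds the children in one document per parent. It is not the paper's methodology: the table and column names are hypothetical, the standard-library sqlite3 module stands in for MySQL, and plain Python dictionaries stand in for MongoDB documents.

```python
import sqlite3


def build_demo_database():
    """Create a tiny in-memory relational database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(id),
            total REAL
        );
        INSERT INTO customer VALUES (1, 'Ada'), (2, 'Lin');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 5.5);
        """
    )
    return conn


def rows_to_documents(conn):
    """Turn each customer row into one document with its orders embedded."""
    conn.row_factory = sqlite3.Row  # allows access to columns by name
    documents = []
    for customer in conn.execute("SELECT id, name FROM customer ORDER BY id"):
        # The foreign-key relationship becomes a nested list in the document.
        orders = conn.execute(
            "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
            (customer["id"],),
        ).fetchall()
        documents.append(
            {
                "_id": customer["id"],
                "name": customer["name"],
                "orders": [
                    {"order_id": o["id"], "total": o["total"]} for o in orders
                ],
            }
        )
    return documents


if __name__ == "__main__":
    for doc in rows_to_documents(build_demo_database()):
        print(doc)
```

Embedding child rows is only one possible mapping; referencing them by identifier from a separate collection is the usual alternative when child sets are large or shared.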

Breast Cancer Survivability Prediction via Classifier Ensemble

This paper presents a classifier ensemble approach for predicting the survivability of breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components: a feature selection component and a classifier ensemble component. The feature selection component divides the features in the SEER database into four groups. It then tries to find the most important features among the four groups that maximize the weighted average F-score of a given classification algorithm. The ensemble component uses three different classifiers, each of which models a different set of SEER features chosen by the feature selection module. On top of them, another classifier gives the final decision based on the output decisions and confidence scores of the underlying classifiers. Different classification algorithms have been examined; the best setup found uses the decision tree, Bayesian network, and Naïve Bayes algorithms for the underlying classifiers and Naïve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same SEER data (period of 1973-2002). It gives an 87.39% weighted average F-score compared to 85.82% and 81.34% for the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score rises to 92.4% on the held-out unseen test set.
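
The evaluation metric named in this abstract can be sketched as follows. The code assumes the common definition of the weighted average F-score, in which the per-class F1 scores are averaged with weights proportional to each class's share of the true labels; the abstract does not state the exact definition used, and the labels in the example are invented.

```python
from collections import Counter


def weighted_f_score(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (assumed definition)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        true_pos = sum(1 for t, p in pairs if t == label and p == label)
        false_pos = sum(1 for t, p in pairs if t != label and p == label)
        false_neg = count - true_pos
        denominator = 2 * true_pos + false_pos + false_neg
        f1 = 2 * true_pos / denominator if denominator else 0.0
        score += (count / total) * f1  # weight by the class's share of samples
    return score


if __name__ == "__main__":
    truth = ["survived", "survived", "survived", "not survived"]
    guess = ["survived", "survived", "not survived", "not survived"]
    print(round(weighted_f_score(truth, guess), 4))
```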

Performance Analysis of Reconstruction Algorithms in Diffuse Optical Tomography

Diffuse Optical Tomography (DOT) is a non-invasive imaging modality used in clinical diagnosis for earlier detection of carcinoma cells in brain tissue. It is a form of optical tomography that produces a reconstructed image of human soft tissue using near-infrared light. It comprises two steps, called the forward model and the inverse model. The forward model describes light propagation in a biological medium. The inverse model uses the scattered light to recover the optical parameters of human tissue. DOT suffers from severe ill-posedness due to its incomplete measurement data, so accurate analysis with this modality is very complicated. To overcome this problem, optical properties of the soft tissue such as the absorption coefficient, scattering coefficient, and optical flux are processed by the standard regularization technique called Levenberg-Marquardt regularization. Reconstruction algorithms such as the Split Bregman and Gradient Projection for Sparse Reconstruction (GPSR) methods are used to reconstruct the image of human soft tissue for tumour detection. Of these algorithms, the Split Bregman method performs better than the GPSR algorithm. Parameters such as signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), relative error (RE), and CPU time for reconstructing images are analyzed to evaluate performance.

Image Enhancement Algorithm of Photoacoustic Tomography Using Active Contour Filtering

The photoacoustic images are obtained from a custom-developed linear-array photoacoustic tomography system. Phantom tests are conducted to imitate biological specimens and retrieve a fully functional photoacoustic image. The acquired image undergoes region-based active contour filtering to remove noise and accurately segment the object area for further processing. The universal back-projection method is used as the image reconstruction algorithm. The active contour filtering is analyzed by evaluating the signal-to-noise ratio and comparing it with that of other filtering methods.

The Characterisation of TLC NAND Flash Memory, Leading to a Definable Endurance/Retention Trade-Off

Triple-Level Cell (TLC) NAND Flash memory at and below 20 nm is still largely unexplored by researchers, and with Flash becoming ever more commonplace in consumer and enterprise applications, there is a need for such gaps in knowledge to be filled. At the time of writing, there was little published data or literature on TLC, and more specifically on reliability testing with an emphasis on both endurance and retention. This paper gives an introduction to NAND Flash memory, followed by an overview of the relevant current research on the reliability of Flash memory, along with the planned future work that will provide results to help characterise the reliability of TLC memory.

Investigating Polynomial Interpolation Functions for Zooming Low Resolution Digital Medical Images

Digital medical images usually have low resolution because of the nature of their acquisition. Therefore, this paper focuses on zooming these images to obtain a better level of information, as required for medical diagnosis. For this purpose, a strategy for selecting pixels in the zooming operation is proposed. It is based on the principle of an analog clock and combines point and neighborhood image processing. In this approach, the hour hand of the clock covers the portion of the image to be processed. For alignment, the center of the clock points at the middle pixel of the selected portion of the image. The minute hand is longer and is used to gain information about pixels of the surrounding area, called the neighborhood pixel region. This information is used to zoom the selected portion of the image. The proposed algorithm is implemented and its performance is evaluated on many medical images obtained from various sources such as X-ray, Computerized Tomography (CT), and Magnetic Resonance Imaging (MRI). For illustration and simplicity, however, only the results obtained from a CT image of the head are presented. The performance of the algorithm is compared with that of various traditional algorithms in terms of peak signal-to-noise ratio (PSNR), maximum error, SSIM index, mutual information, and processing time. The results show that the proposed algorithm performs better than the traditional algorithms.
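
Two of the evaluation measures listed here, PSNR and maximum error, have standard definitions that can be sketched briefly. The code below assumes 8-bit grayscale images stored as equally sized nested lists; the function names and the sample pixel values are illustrative and do not come from the paper.

```python
import math


def _pixel_pairs(reference, test):
    """Yield corresponding pixel values from two equally sized images."""
    for ref_row, test_row in zip(reference, test):
        for ref_pixel, test_pixel in zip(ref_row, test_row):
            yield ref_pixel, test_pixel


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    squared_errors = [(r - t) ** 2 for r, t in _pixel_pairs(reference, test)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def max_error(reference, test):
    """Largest absolute pixel difference between the two images."""
    return max(abs(r - t) for r, t in _pixel_pairs(reference, test))


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    zoomed = [[10, 20], [30, 50]]
    print(psnr(original, zoomed), max_error(original, zoomed))
```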

A Two Level Load Balancing Approach for Cloud Environment

Cloud computing is the outcome of the rapid growth of the internet. Due to the elastic nature of cloud computing and the unpredictable behavior of users, load balancing is a major issue in the cloud computing paradigm. An efficient load balancing technique can improve performance in terms of resource utilization and customer satisfaction. Load balancing can be implemented through task scheduling, resource allocation, and task migration. The parameters used to analyze the performance of a load balancing approach include response time, cost, data processing time, and throughput. This paper demonstrates a two-level load balancing approach that combines the join-idle-queue and join-shortest-queue approaches. The authors used the CloudAnalyst simulator to test the proposed two-level load balancer. The results are analyzed and compared with existing algorithms, and the proposed approach is observed to perform better than the existing techniques.
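
One plausible way to combine the two policies is sketched below: a task first goes to a server that has reported itself idle (join idle queue), and only when no idle server is known does the balancer fall back to the server with the fewest pending tasks (join shortest queue). The abstract does not describe how the paper combines the two levels, so this ordering, the class name, and the method names are all assumptions.

```python
from collections import deque


class TwoLevelBalancer:
    """Hypothetical two-level dispatcher: idle queue first, then shortest queue."""

    def __init__(self, server_count):
        self.queue_lengths = [0] * server_count
        self.idle = deque(range(server_count))  # servers known to be idle

    def dispatch(self):
        """Assign one task and return the index of the chosen server."""
        if self.idle:
            server = self.idle.popleft()  # level 1: join idle queue
        else:
            # Level 2: join shortest queue (ties go to the lowest index).
            server = min(
                range(len(self.queue_lengths)),
                key=self.queue_lengths.__getitem__,
            )
        self.queue_lengths[server] += 1
        return server

    def complete(self, server):
        """Record that a server finished one task."""
        if self.queue_lengths[server] == 0:
            raise ValueError("server has no pending task")
        self.queue_lengths[server] -= 1
        if self.queue_lengths[server] == 0:
            self.idle.append(server)  # the server reports itself idle again


if __name__ == "__main__":
    balancer = TwoLevelBalancer(server_count=2)
    print([balancer.dispatch() for _ in range(4)])
```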

Applications for Accounting of Inherited Object-Oriented Class Members

A class in an Object-Oriented (OO) system is the basic unit of design, and it encapsulates a set of attributes and methods. In OO systems, instead of redefining the attributes and methods that are included in other classes, a class can inherit these attributes and methods and implement only its unique attributes and methods, which reduces code redundancy and improves code testability and maintainability. This mechanism is called class inheritance. However, some software engineering applications may require accounting for all the inherited class members (i.e., attributes and methods). This paper explains how to account for inherited class members and discusses the software engineering applications that require such consideration.
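
To make the idea of accounting for inherited members concrete, the following sketch separates the members a Python class defines itself from those it inherits by walking the method resolution order. It illustrates the concept only and is not the procedure described in the paper; the helper names and the example classes are hypothetical, and attributes created on instances at run time are not counted.

```python
def member_origins(cls):
    """Map each member name visible on cls to the class that defines it."""
    origins = {}
    for klass in cls.__mro__:
        if klass is object:
            continue
        for name in vars(klass):
            if name.startswith("__") and name.endswith("__"):
                continue  # skip special names such as __init__ and __module__
            origins.setdefault(name, klass)  # the first class in the MRO wins
    return origins


def split_members(cls):
    """Return (own, inherited) member names of cls, each sorted."""
    origins = member_origins(cls)
    own = sorted(name for name, owner in origins.items() if owner is cls)
    inherited = sorted(name for name, owner in origins.items() if owner is not cls)
    return own, inherited


class Shape:
    sides = 0

    def area(self):
        return 0.0

    def describe(self):
        return f"{type(self).__name__} with {self.sides} sides"


class Square(Shape):
    sides = 4

    def __init__(self, edge):
        self.edge = edge

    def area(self):
        return self.edge ** 2


if __name__ == "__main__":
    print(split_members(Square))
```

Here Square overrides area and sides, so only describe is reported as inherited; a metric that ignores inheritance would miss describe entirely.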

Active Surface Tracking Algorithm for All-Fiber Common-Path Fourier-Domain Optical Coherence Tomography

A conventional optical coherence tomography (OCT) system has a limited imaging depth of 1-2 mm and suffers from unwanted noise such as speckle noise. A motorized-stage-based OCT system using a common-path Fourier-domain optical coherence tomography (CP-FD-OCT) configuration provides enhanced imaging depth and less noise, overcoming these limitations. Using this OCT system, OCT images were obtained from an onion, and their subsurface structure was observed. The images obtained using the developed motorized-stage-based system showed greater imaging depth than those from the conventional system, owing to real-time, accurate depth tracking. Consequently, the developed CP-FD-OCT systems and algorithms have good potential for the further development of endoscopic OCT for microsurgery.

Synthesis of Dispersion-Compensating Triangular Lattice Index-Guiding Photonic Crystal Fibers Using the Directed Tabu Search Method

In this paper, triangular lattice index-guiding photonic crystal fibers (PCFs) are synthesized using the directed tabu search algorithm to compensate the chromatic dispersion of a single-mode fiber (SMF-28) for an 80 km optical link operating at 1.55 µm. The hole-to-hole distance, circular air-hole diameter, solid-core diameter, ring number, and PCF length are optimized for this purpose. Three synthesized PCFs with different physical parameters are compared in terms of their objective function values, residual dispersions, and compensation ratios.
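
The compensation condition behind such a design can be written out as a short worked relation. It uses the usual textbook form of accumulated dispersion for two fibers in series, with D denoting the dispersion coefficient in ps/(nm·km) and L the fiber length in km; the symbols are introduced here for illustration, and the paper's exact objective function and its definition of residual dispersion are not given in the abstract.

```latex
% Accumulated (residual) dispersion of the SMF link followed by the PCF
D_{\mathrm{res}}(\lambda) = D_{\mathrm{SMF}}(\lambda)\,L_{\mathrm{SMF}}
                          + D_{\mathrm{PCF}}(\lambda)\,L_{\mathrm{PCF}}

% Full compensation at the operating wavelength \lambda_0 = 1.55\,\mu\mathrm{m}
% with L_{\mathrm{SMF}} = 80\,\mathrm{km} requires D_{\mathrm{res}}(\lambda_0) = 0, i.e.
L_{\mathrm{PCF}} = -\,\frac{D_{\mathrm{SMF}}(\lambda_0)}{D_{\mathrm{PCF}}(\lambda_0)}\,L_{\mathrm{SMF}}
```

Because the length of the PCF must be positive, its dispersion coefficient has to be of opposite sign to that of the SMF, and a larger magnitude allows a shorter compensating fiber.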

Discriminant Analysis as a Function of Predictive Learning to Select Evolutionary Algorithms in Intelligent Transportation System

In this paper, we present the use of discriminant analysis to select the evolutionary algorithms that best solve instances of the vehicle routing problem with time windows. We use indicators as independent variables to obtain the classification criteria, and the best algorithm among the generic genetic algorithm (GA), random search (RS), steady-state genetic algorithm (SSGA), and sexual genetic algorithm (SXGA) as the dependent variable for the classification. The discriminant classifier was trained with classic instances of the vehicle routing problem with time windows obtained from the Solomon benchmark. The discriminant analysis achieved a classification rate of 66.7%.

The Application of Bayesian Heuristic for Scheduling in Real-Time Private Clouds

The emergence of Cloud data centers has revolutionized the IT industry. Private Clouds in particular provide Cloud services for a certain group of customers or businesses. In a real-time private Cloud, each task given to the system has a deadline that ideally should not be violated. Task scheduling in a real-time private Cloud determines how the available resources in the system are shared among incoming tasks. The aim of the scheduling policy is to optimize the system outcome, which for a real-time private Cloud can include energy consumption, deadline violations, execution time, and the number of host switches. Different scheduling policies can be used, but each leads to a sub-optimal outcome in certain settings of the system. A Bayesian scheduling strategy is proposed to further improve the system outcome. The Bayesian strategy was shown to outperform all selected policies. It is also flexible in dealing with complex patterns of incoming tasks and is able to adapt.

An Efficient Implementation of High Speed Vedic Multiplier Using Compressors for Image Processing Applications

Digital signal processors, image signal processors, and FIR filters have multipliers as an important part of their design. Vedic multipliers, which are based on Vedic mathematics, have proven to be very fast multipliers. One image processing application is edge detection. This research presents a small-area, high-speed 8-bit Vedic multiplier system built from compressor-based adders, which results in faster edge detection. The architecture was tested on a Xilinx Virtex-4 FPGA board, and simulations were carried out using the Xilinx synthesis tool. Comparisons show that this system is smaller in area and faster (lower propagation delay). The compressor-based Vedic multiplier is 1.1 times faster than a typical Vedic multiplier and 2 times faster than a ‘simple’ multiplier.
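
The arithmetic structure of a Vedic multiplier can be modelled in software as follows. The sketch assumes the Urdhva Tiryagbhyam ("vertically and crosswise") decomposition that Vedic multiplier designs commonly use, which the abstract does not name explicitly, and it builds the 8-bit product from 4-bit and 2-bit blocks. Ordinary integer addition stands in for the compressor-based adders, so the code shows the partial-product structure only and says nothing about hardware speed or area.

```python
def vedic_2x2(a, b):
    """Multiply two 2-bit numbers using vertical and crosswise bit products."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    vertical_low = a0 & b0                 # weight 1
    crosswise = (a1 & b0) + (a0 & b1)      # weight 2
    vertical_high = a1 & b1                # weight 4
    return vertical_low + (crosswise << 1) + (vertical_high << 2)


def vedic_multiply(a, b, bits=8):
    """Multiply two unsigned `bits`-wide numbers; bits must be 2, 4, 8, ..."""
    if bits == 2:
        return vedic_2x2(a, b)
    half = bits // 2
    mask = (1 << half) - 1
    a_low, a_high = a & mask, a >> half
    b_low, b_high = b & mask, b >> half
    # Four half-width products, combined the same way as in the 2-bit case.
    low = vedic_multiply(a_low, b_low, half)
    cross = (vedic_multiply(a_high, b_low, half)
             + vedic_multiply(a_low, b_high, half))
    high = vedic_multiply(a_high, b_high, half)
    return low + (cross << half) + (high << bits)


if __name__ == "__main__":
    print(vedic_multiply(200, 123), 200 * 123)
```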

A Recognition Method for Spatio-Temporal Background in Korean Historical Novels

The most important elements of a novel are the characters, events, and background. The background represents the time, place, and situation in which the characters appear, and conveys events and atmosphere more realistically. If readers have proper knowledge of a novel's background, it can help them understand the atmosphere of the novel and choose a novel they want to read. In this paper, we target Korean historical novels because, among the genres of Korean novels, the spatio-temporal background plays an especially important role in historical novels. To the best of our knowledge, no previous study has been aimed at Korean novels. In this paper, we build a Korean historical national dictionary. Our dictionary contains historical places and the temple names of kings over many generations, as well as currently existing spatial and temporal words in Korean history. We also present a method for recognizing the spatio-temporal background based on patterns of phrasal words in Korean sentences. Our rules use postpositions for spatial background recognition and temple names for temporal background recognition. Knowledge of the recognized background can help readers understand the flow of events and the atmosphere, and can be used to visualize the elements of novels.

A Hybrid P2P Storage Scheme Based on Erasure Coding and Replication

A peer-to-peer storage system faces challenges such as peer availability, data protection, and churn rate. To address these challenges, different redundancy, replacement, and repair schemes are used. This paper presents a hybrid redundancy scheme using replication and erasure coding. We calculate the storage, access, and maintenance costs of our proposed scheme and compare them with those of existing redundancy schemes. To model realistic peer behaviour, a trace of a live peer-to-peer system is used. The effects of different replication and repair schemes are also shown. The proposed hybrid scheme performs better than the existing double-coding hybrid scheme in all metrics and has a better maintenance cost than hierarchical codes.
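
The storage trade-off between the two redundancy techniques can be illustrated with a small sketch. It compares the storage overhead of r-fold replication with that of an (n, k) erasure code and uses a single XOR parity fragment, the simplest erasure code, to show how a lost fragment is rebuilt. None of this reproduces the paper's hybrid scheme or its cost model; the function names and parameters are chosen for illustration.

```python
from functools import reduce


def replication_overhead(copies):
    """Bytes stored per byte of data when keeping `copies` full replicas."""
    return float(copies)


def erasure_overhead(n, k):
    """Bytes stored per byte of data for an (n, k) code: n fragments, any k suffice."""
    return n / k


def xor_bytes(left, right):
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(left, right))


def encode_with_parity(fragments):
    """Append one XOR parity fragment, giving a (k + 1, k) erasure code."""
    return list(fragments) + [reduce(xor_bytes, fragments)]


def recover_missing(stored, missing_index):
    """Rebuild the fragment at missing_index from all the other fragments."""
    survivors = [f for i, f in enumerate(stored) if i != missing_index]
    return reduce(xor_bytes, survivors)


if __name__ == "__main__":
    stored = encode_with_parity([b"abcd", b"efgh", b"ijkl"])
    print(recover_missing(stored, 1))
    print(replication_overhead(3), erasure_overhead(4, 3))
```

In this example both options survive the loss of at least one peer, but three-fold replication stores three bytes per data byte while the (4, 3) code stores about 1.33, at the price of contacting several peers to repair a fragment.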

Solution of Logistics Center Selection Problem Using the Axiomatic Design Method

Logistics centers are areas in which all national and international logistics and logistics-related activities can be carried out by various businesses. Logistics centers are of key importance in joining the transport stream and transport system operations. Therefore, where these centers are positioned is important for them to be effective and efficient and to deliver the expected performance. In this study, the location selection problem for positioning a logistics center is discussed. Alternative centers are evaluated according to certain criteria. The most appropriate center is identified using the axiomatic design method.

Triangular Geometric Feature for Offline Signature Verification

The handwritten signature is widely accepted as a biometric characteristic for personal authentication. The use of appropriate features plays an important role in determining the accuracy of signature verification; therefore, this paper presents a feature based on a geometrical concept. Triangle attributes are exploited to design the new feature, since the triangle possesses orientation, angle, and transformation properties that can improve accuracy. The proposed feature uses a triangulation geometric set comprising the sides, angles, and perimeter of a triangle derived from the center of gravity of a signature image. For classification, a Euclidean classifier along with a voting-based classifier is used to assess whether a signature is forged. The classification process is tested using the triangular geometric feature and selected global features. In an experiment validated on the Grupo de Senales 960 (GPDS-960) signature database, the proposed triangular geometric feature achieves a lower Average Error Rate (AER) of 34%, compared with 43% for the selected global features. In conclusion, the proposed triangular geometric feature proves to be a more reliable feature for accurate signature verification.
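
The geometric quantities named in this abstract can be computed as in the sketch below, which finds the center of gravity of a binary image and derives the sides, angles, and perimeter of a triangle from three points. How the paper chooses the triangle's vertices from the center of gravity is not stated in the abstract, so the vertex selection is left to the caller, and the function names and the toy data are hypothetical.

```python
import math


def center_of_gravity(image):
    """Mean (x, y) position of the non-zero pixels of a binary image."""
    points = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("image contains no foreground pixels")
    count = len(points)
    return (sum(x for x, _ in points) / count,
            sum(y for _, y in points) / count)


def triangle_features(p1, p2, p3):
    """Sides, interior angles (degrees), and perimeter of a non-degenerate triangle."""
    a = math.dist(p2, p3)  # side opposite p1
    b = math.dist(p1, p3)  # side opposite p2
    c = math.dist(p1, p2)  # side opposite p3

    def angle(opposite, side1, side2):
        # Law of cosines; clamp to [-1, 1] to absorb rounding error.
        cosine = (side1 ** 2 + side2 ** 2 - opposite ** 2) / (2 * side1 * side2)
        return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))

    return {
        "sides": (a, b, c),
        "angles": (angle(a, b, c), angle(b, a, c), angle(c, a, b)),
        "perimeter": a + b + c,
    }


if __name__ == "__main__":
    print(center_of_gravity([[0, 1], [0, 1]]))
    print(triangle_features((0, 0), (3, 0), (0, 4)))
```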

Formal Specification and Description Language and Message Sequence Chart to Model and Validate Session Initiation Protocol Services

Session Initiation Protocol (SIP) is a signaling-layer protocol for establishing, modifying, and ending sessions among participants, including Internet conferences, telephone calls, and multimedia distribution. SIP facilitates user mobility by proxying and forwarding requests to the current location of the user. In this paper, we use the formal Specification and Description Language (SDL) and Message Sequence Charts (MSC) to model and define the Internet Engineering Task Force (IETF) SIP protocol and sample services derived from the informal SIP specification. We create an “Abstract User Interface” using case analysis that can be applied to identify SIP services more explicitly. The sample SIP features are then used as case scenarios; they are revised in MSC format and validated against their corresponding SDL models.

Threshold Based Region Incrementing Secret Sharing Scheme for Color Images

In this era of online communication, which transacts data in 0s and 1s, confidentiality is a prized commodity. Ensuring the safe transmission of encrypted data and its uncorrupted recovery is a matter of prime concern. Among the several techniques for secure sharing of images, this paper proposes a k-out-of-n region incrementing image sharing scheme for color images. The highlight of this scheme is the use of simple Boolean and arithmetic operations for generating shares and the Lagrange interpolation polynomial for authenticating shares. Additionally, this scheme addresses problems faced by existing algorithms, such as color reversal and pixel expansion. The proposed scheme regenerates the original secret image, whereas existing systems regenerate only the halftoned secret image.
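
For readers unfamiliar with k-out-of-n thresholds and Lagrange interpolation, the sketch below shows textbook Shamir secret sharing applied to a single pixel value. It is a different, simpler construction than the scheme proposed in the paper, which generates shares with Boolean and arithmetic operations and uses Lagrange interpolation for authentication. The prime 257 and the function names are choices made for this illustration; with this prime a share value can be 256, which does not fit in one byte.

```python
import secrets

PRIME = 257  # smallest prime above the 8-bit pixel range 0..255


def make_shares(secret, k, n):
    """Split `secret` into n shares such that any k of them recover it."""
    if not (0 <= secret < PRIME and 1 <= k <= n < PRIME):
        raise ValueError("invalid secret or threshold parameters")
    # Random polynomial of degree k - 1 whose constant term is the secret.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def evaluate(x):
        result = 0
        for coefficient in reversed(coefficients):  # Horner's rule
            result = (result * x + coefficient) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the field of integers mod PRIME."""
    secret = 0
    for i, (x_i, y_i) in enumerate(shares):
        numerator, denominator = 1, 1
        for j, (x_j, _) in enumerate(shares):
            if i != j:
                numerator = (numerator * -x_j) % PRIME
                denominator = (denominator * (x_i - x_j)) % PRIME
        # pow(..., -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + y_i * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    pixel_shares = make_shares(200, k=3, n=5)
    print(recover_secret(pixel_shares[:3]), recover_secret(pixel_shares[2:]))
```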

Business-Intelligence Mining of Large Decentralized Multimedia Datasets with a Distributed Multi-Agent System

The rapid generation of a high volume and broad variety of data from the application of new technologies poses challenges for the generation of business intelligence. Most organizations and business owners need to extract data from multiple sources and apply analytical methods in order to develop their business. As a result, the recently decentralized data management environment relies on a distributed computing paradigm. Because data are stored in highly distributed systems, implementing distributed data-mining techniques is a challenge. The aim of such techniques is to gather knowledge from every domain and from all the datasets stemming from distributed resources. As agent technologies contribute significantly to managing the complexity of distributed systems, we consider them for next-generation data-mining processes. To demonstrate agent-based business intelligence operations, we use agent-oriented modeling techniques to develop a new artifact for mining massive datasets.