Automatic Detection of Proliferative Cells in Immunohistochemically Images of Meningioma Using Fuzzy C-Means Clustering and HSV Color Space

Visual search and identification of immunohistochemically stained tissue of meningioma was performed manually in pathologic laboratories to detect and diagnose the cancers type of meningioma. This task is very tedious and time-consuming. Moreover, because of cell's complex nature, it still remains a challenging task to segment cells from its background and analyze them automatically. In this paper, we develop and test a computerized scheme that can automatically identify cells in microscopic images of meningioma and classify them into positive (proliferative) and negative (normal) cells. Dataset including 150 images are used to test the scheme. The scheme uses Fuzzy C-means algorithm as a color clustering method based on perceptually uniform hue, saturation, value (HSV) color space. Since the cells are distinguishable by the human eye, the accuracy and stability of the algorithm are quantitatively compared through application to a wide variety of real images.

A Mixing Matrix Estimation Algorithm for Speech Signals under the Under-Determined Blind Source Separation Model

The separation of speech signals has become a research hotspot in the field of signal processing in recent years. It has many applications and influences in teleconferencing, hearing aids, speech recognition of machines and so on. The sounds received are usually noisy. The issue of identifying the sounds of interest and obtaining clear sounds in such an environment becomes a problem worth exploring, that is, the problem of blind source separation. This paper focuses on the under-determined blind source separation (UBSS). Sparse component analysis is generally used for the problem of under-determined blind source separation. The method is mainly divided into two parts. Firstly, the clustering algorithm is used to estimate the mixing matrix according to the observed signals. Then the signal is separated based on the known mixing matrix. In this paper, the problem of mixing matrix estimation is studied. This paper proposes an improved algorithm to estimate the mixing matrix for speech signals in the UBSS model. The traditional potential algorithm is not accurate for the mixing matrix estimation, especially for low signal-to noise ratio (SNR).In response to this problem, this paper considers the idea of an improved potential function method to estimate the mixing matrix. The algorithm not only avoids the inuence of insufficient prior information in traditional clustering algorithm, but also improves the estimation accuracy of mixing matrix. This paper takes the mixing of four speech signals into two channels as an example. The results of simulations show that the approach in this paper not only improves the accuracy of estimation, but also applies to any mixing matrix.

Maximization of Lifetime for Wireless Sensor Networks Based on Energy Efficient Clustering Algorithm

Since last decade, wireless sensor networks (WSNs) have been used in many areas like health care, agriculture, defense, military, disaster hit areas and so on. Wireless Sensor Networks consist of a Base Station (BS) and more number of wireless sensors in order to monitor temperature, pressure, motion in different environment conditions. The key parameter that plays a major role in designing a protocol for Wireless Sensor Networks is energy efficiency which is a scarcest resource of sensor nodes and it determines the lifetime of sensor nodes. Maximizing sensor node’s lifetime is an important issue in the design of applications and protocols for Wireless Sensor Networks. Clustering sensor nodes mechanism is an effective topology control approach for helping to achieve the goal of this research. In this paper, the researcher presents an energy efficiency protocol to prolong the network lifetime based on Energy efficient clustering algorithm. The Low Energy Adaptive Clustering Hierarchy (LEACH) is a routing protocol for clusters which is used to lower the energy consumption and also to improve the lifetime of the Wireless Sensor Networks. Maximizing energy dissipation and network lifetime are important matters in the design of applications and protocols for wireless sensor networks. Proposed system is to maximize the lifetime of the Wireless Sensor Networks by choosing the farthest cluster head (CH) instead of the closest CH and forming the cluster by considering the following parameter metrics such as Node’s density, residual-energy and distance between clusters (inter-cluster distance). In this paper, comparisons between the proposed protocol and comparative protocols in different scenarios have been done and the simulation results showed that the proposed protocol performs well over other comparative protocols in various scenarios.

Consumer Load Profile Determination with Entropy-Based K-Means Algorithm

With the continuous increment of smart meter installations across the globe, the need for processing of the load data is evident. Clustering-based load profiling is built upon the utilization of unsupervised machine learning tools for the purpose of formulating the typical load curves or load profiles. The most commonly used algorithm in the load profiling literature is the K-means. While the algorithm has been successfully tested in a variety of applications, its drawback is the strong dependence in the initialization phase. This paper proposes a novel modified form of the K-means that addresses the aforementioned problem. Simulation results indicate the superiority of the proposed algorithm compared to the K-means.

A Two-Stage Expert System for Diagnosis of Leukemia Based on Type-2 Fuzzy Logic

Diagnosis and deciding about diseases in medical fields is facing innate uncertainty which can affect the whole process of treatment. This decision is made based on expert knowledge and the way in which an expert interprets the patient's condition, and the interpretation of the various experts from the patient's condition may be different. Fuzzy logic can provide mathematical modeling for many concepts, variables, and systems that are unclear and ambiguous and also it can provide a framework for reasoning, inference, control, and decision making in conditions of uncertainty. In systems with high uncertainty and high complexity, fuzzy logic is a suitable method for modeling. In this paper, we use type-2 fuzzy logic for uncertainty modeling that is in diagnosis of leukemia. The proposed system uses an indirect-direct approach and consists of two stages: In the first stage, the inference of blood test state is determined. In this step, we use an indirect approach where the rules are extracted automatically by implementing a clustering approach. In the second stage, signs of leukemia, duration of disease until its progress and the output of the first stage are combined and the final diagnosis of the system is obtained. In this stage, the system uses a direct approach and final diagnosis is determined by the expert. The obtained results show that the type-2 fuzzy expert system can diagnose leukemia with the average accuracy about 97%.

Localization of Geospatial Events and Hoax Prediction in the UFO Database

Unidentified Flying Objects (UFOs) have been an interesting topic for most enthusiasts and hence people all over the United States report such findings online at the National UFO Report Center (NUFORC). Some of these reports are a hoax and among those that seem legitimate, our task is not to establish that these events confirm that they indeed are events related to flying objects from aliens in outer space. Rather, we intend to identify if the report was a hoax as was identified by the UFO database team with their existing curation criterion. However, the database provides a wealth of information that can be exploited to provide various analyses and insights such as social reporting, identifying real-time spatial events and much more. We perform analysis to localize these time-series geospatial events and correlate with known real-time events. This paper does not confirm any legitimacy of alien activity, but rather attempts to gather information from likely legitimate reports of UFOs by studying the online reports. These events happen in geospatial clusters and also are time-based. We look at cluster density and data visualization to search the space of various cluster realizations to decide best probable clusters that provide us information about the proximity of such activity. A random forest classifier is also presented that is used to identify true events and hoax events, using the best possible features available such as region, week, time-period and duration. Lastly, we show the performance of the scheme on various days and correlate with real-time events where one of the UFO reports strongly correlates to a missile test conducted in the United States.

On the Development of a Homogenized Earthquake Catalogue for Northern Algeria

Regions with a significant percentage of non-seismically designed buildings and reduced urban planning are particularly vulnerable to natural hazards. In this context, the project ‘Improved Tools for Disaster Risk Mitigation in Algeria’ (ITERATE) aims at seismic risk mitigation in Algeria. Past earthquakes in North Algeria caused extensive damages, e.g. the El Asnam 1980 moment magnitude (Mw) 7.1 and Boumerdes 2003 Mw 6.8 earthquakes. This paper will address a number of proposed developments and considerations made towards a further improvement of the component of seismic hazard. In specific, an updated earthquake catalog (until year 2018) is compiled, and new conversion equations to moment magnitude are introduced. Furthermore, a network-based method for the estimation of the spatial and temporal distribution of the minimum magnitude of completeness is applied. We found relatively large values for Mc, due to the sparse network, and a nonlinear trend between Mw and body wave (mb) or local magnitude (ML), which are the most common scales reported in the region. Lastly, the resulting b-value of the Gutenberg-Richter distribution is sensitive to the declustering method.

A Geospatial Consumer Marketing Campaign Optimization Strategy: Case of Fuzzy Approach in Nigeria Mobile Market

Getting the consumer marketing strategy right is a crucial and complex task for firms with a large customer base such as mobile operators in a competitive mobile market. While empirical studies have made efforts to identify key constructs, no geospatial model has been developed to comprehensively assess the viability and interdependency of ground realities regarding the customer, competition, channel and the network quality of mobile operators. With this research, a geo-analytic framework is proposed for strategy formulation and allocation for mobile operators. Firstly, a fuzzy analytic network using a self-organizing feature map clustering technique based on inputs from managers and literature, which depicts the interrelationships amongst ground realities is developed. The model is tested with a mobile operator in the Nigeria mobile market. As a result, a customer-centric geospatial and visualization solution is developed. This provides a consolidated and integrated insight that serves as a transparent, logical and practical guide for strategic, tactical and operational decision making.

Analysis of Cooperative Learning Behavior Based on the Data of Students' Movement

The purpose of this paper is to analyze the cooperative learning behavior pattern based on the data of students' movement. The study firstly reviewed the cooperative learning theory and its research status, and briefly introduced the k-means clustering algorithm. Then, it used clustering algorithm and mathematical statistics theory to analyze the activity rhythm of individual student and groups in different functional areas, according to the movement data provided by 10 first-year graduate students. It also focused on the analysis of students' behavior in the learning area and explored the law of cooperative learning behavior. The research result showed that the cooperative learning behavior analysis method based on movement data proposed in this paper is feasible. From the results of data analysis, the characteristics of behavior of students and their cooperative learning behavior patterns could be found.

Optical Flow Based System for Cross Traffic Alert

This document describes an advanced system and methodology for Cross Traffic Alert (CTA), able to detect vehicles that move into the vehicle driving path from the left or right side. The camera is supposed to be not only on a vehicle still, e.g. at a traffic light or at an intersection, but also moving slowly, e.g. in a car park. In all of the aforementioned conditions, a driver’s short loss of concentration or distraction can easily lead to a serious accident. A valid support to avoid these kinds of car crashes is represented by the proposed system. It is an extension of our previous work, related to a clustering system, which only works on fixed cameras. Just a vanish point calculation and simple optical flow filtering, to eliminate motion vectors due to the car relative movement, is performed to let the system achieve high performances with different scenarios, cameras and resolutions. The proposed system just uses as input the optical flow, which is hardware implemented in the proposed platform and since the elaboration of the whole system is really speed and power consumption, it is inserted directly in the camera framework, allowing to execute all the processing in real-time.

An Improved K-Means Algorithm for Gene Expression Data Clustering

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

CoP-Networks: Virtual Spaces for New Faculty’s Professional Development in the 21st Higher Education

The 21st century higher education and globalization challenge new faculty members to build effective professional networks and partnership with industry in order to accelerate their growth and success. This creates the need for community of practice (CoP)-oriented development approaches that focus on cognitive apprenticeship while considering individual predisposition and future career needs. This work adopts data mining, clustering analysis, and social networking technologies to present the CoP-Network as a virtual space that connects together similar career-aspiration individuals who are socially influenced to join and engage in a process for domain-related knowledge and practice acquisitions. The CoP-Network model can be integrated into higher education to extend traditional graduate and professional development programs.

Web Proxy Detection via Bipartite Graphs and One-Mode Projections

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

Automatic Landmark Selection Based on Feature Clustering for Visual Autonomous Unmanned Aerial Vehicle Navigation

The selection of specific landmarks for an Unmanned Aerial Vehicles’ Visual Navigation systems based on Automatic Landmark Recognition has significant influence on the precision of the system’s estimated position. At the same time, manual selection of the landmarks does not guarantee a high recognition rate, which would also result on a poor precision. This work aims to develop an automatic landmark selection that will take the image of the flight area and identify the best landmarks to be recognized by the Visual Navigation Landmark Recognition System. The criterion to select a landmark is based on features detected by ORB or AKAZE and edges information on each possible landmark. Results have shown that disposition of possible landmarks is quite different from the human perception.

A Construction Management Tool: Determining a Project Schedule Typical Behaviors Using Cluster Analysis

Delays in the construction industry are a global phenomenon. Many construction projects experience extensive delays exceeding the initially estimated completion time. The main purpose of this study is to identify construction projects typical behaviors in order to develop a prognosis and management tool. Being able to know a construction projects schedule tendency will enable evidence-based decision-making to allow resolutions to be made before delays occur. This study presents an innovative approach that uses Cluster Analysis Method to support predictions during Earned Value Analyses. A clustering analysis was used to predict future scheduling, Earned Value Management (EVM), and Earned Schedule (ES) principal Indexes behaviors in construction projects. The analysis was made using a database with 90 different construction projects. It was validated with additional data extracted from literature and with another 15 contrasting projects. For all projects, planned and executed schedules were collected and the EVM and ES principal indexes were calculated. A complete linkage classification method was used. In this way, the cluster analysis made considers that the distance (or similarity) between two clusters must be measured by its most disparate elements, i.e. that the distance is given by the maximum span among its components. Finally, through the use of EVM and ES Indexes and Tukey and Fisher Pairwise Comparisons, the statistical dissimilarity was verified and four clusters were obtained. It can be said that construction projects show an average delay of 35% of its planned completion time. Furthermore, four typical behaviors were found and for each of the obtained clusters, the interim milestones and the necessary rhythms of construction were identified. In general, detected typical behaviors are: (1) Projects that perform a 5% of work advance in the first two tenths and maintain a constant rhythm until completion (greater than 10% for each remaining tenth), being able to finish on the initially estimated time. (2) Projects that start with an adequate construction rate but suffer minor delays culminating with a total delay of almost 27% of the planned time. (3) Projects which start with a performance below the planned rate and end up with an average delay of 64%, and (4) projects that begin with a poor performance, suffer great delays and end up with an average delay of a 120% of the planned completion time. The obtained clusters compose a tool to identify the behavior of new construction projects by comparing their current work performance to the validated database, thus allowing the correction of initial estimations towards more accurate completion schedules.

Identification of Disease Causing DNA Motifs in Human DNA Using Clustering Approach

Studying DNA (deoxyribonucleic acid) sequence is useful in biological processes and it is applied in the fields such as diagnostic and forensic research. DNA is the hereditary information in human and almost all other organisms. It is passed to their generations. Earlier stage detection of defective DNA sequence may lead to many developments in the field of Bioinformatics. Nowadays various tedious techniques are used to identify defective DNA. The proposed work is to analyze and identify the cancer-causing DNA motif in a given sequence. Initially the human DNA sequence is separated as k-mers using k-mer separation rule. The separated k-mers are clustered using Self Organizing Map (SOM). Using Levenshtein distance measure, cancer associated DNA motif is identified from the k-mer clusters. Experimental results of this work indicate the presence or absence of cancer causing DNA motif. If the cancer associated DNA motif is found in DNA, it is declared as the cancer disease causing DNA sequence. Otherwise the input human DNA is declared as normal sequence. Finally, elapsed time is calculated for finding the presence of cancer causing DNA motif using clustering formation. It is compared with normal process of finding cancer causing DNA motif. Locating cancer associated motif is easier in cluster formation process than the other one. The proposed work will be an initiative aid for finding genetic disease related research.

Comparative Study of Ad Hoc Routing Protocols in Vehicular Ad-Hoc Networks for Smart City

In this paper, we perform the investigation of some routing protocols in Vehicular Ad-Hoc Network (VANET) context. Indeed, we study the efficiency of protocols like Dynamic Source Routing (DSR), Ad hoc On-demand Distance Vector Routing (AODV), Destination Sequenced Distance Vector (DSDV), Optimized Link State Routing convention (OLSR) and Vehicular Multi-hop algorithm for Stable Clustering (VMASC) in terms of packet delivery ratio (PDR) and throughput. The performance evaluation and comparison between the studied protocols shows that the VMASC is the best protocols regarding fast data transmission and link stability in VANETs. The validation of all results is done by the NS3 simulator.

Implementation of an IoT Sensor Data Collection and Analysis Library

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

A Comparative Study on Fuzzy and Neuro-Fuzzy Enabled Cluster Based Routing Protocols for Wireless Sensor Networks

Dynamic Routing in Wireless Sensor Networks (WSNs) has played a significant task in research for the recent years. Energy consumption and data delivery in time are the major parameters with the usage of sensor nodes that are significant criteria for these networks. The location of sensor nodes must not be prearranged. Clustering in WSN is a key methodology which is used to enlarge the life-time of a sensor network. It consists of numerous real-time applications. The features of WSNs are minimized the consumption of energy. Soft computing techniques can be included to accomplish improved performance. This paper surveys the modern trends in routing enclose fuzzy logic and Neuro-fuzzy logic based on the clustering techniques and implements a comparative study of the numerous related methodologies.

Method of Cluster Based Cross-Domain Knowledge Acquisition for Biologically Inspired Design

Biologically inspired design inspires inventions and new technologies in the field of engineering by mimicking functions, principles, and structures in the biological domain. To deal with the obstacles of cross-domain knowledge acquisition in the existing biologically inspired design process, functional semantic clustering based on functional feature semantic correlation and environmental constraint clustering composition based on environmental characteristic constraining adaptability are proposed. A knowledge cell clustering algorithm and the corresponding prototype system is developed. Finally, the effectiveness of the method is verified by the visual prosthetic device design.