Localization of Geospatial Events and Hoax Prediction in the UFO Database

Unidentified Flying Objects (UFOs) have been an interesting topic for most enthusiasts and hence people all over the United States report such findings online at the National UFO Report Center (NUFORC). Some of these reports are a hoax and among those that seem legitimate, our task is not to establish that these events confirm that they indeed are events related to flying objects from aliens in outer space. Rather, we intend to identify if the report was a hoax as was identified by the UFO database team with their existing curation criterion. However, the database provides a wealth of information that can be exploited to provide various analyses and insights such as social reporting, identifying real-time spatial events and much more. We perform analysis to localize these time-series geospatial events and correlate with known real-time events. This paper does not confirm any legitimacy of alien activity, but rather attempts to gather information from likely legitimate reports of UFOs by studying the online reports. These events happen in geospatial clusters and also are time-based. We look at cluster density and data visualization to search the space of various cluster realizations to decide best probable clusters that provide us information about the proximity of such activity. A random forest classifier is also presented that is used to identify true events and hoax events, using the best possible features available such as region, week, time-period and duration. Lastly, we show the performance of the scheme on various days and correlate with real-time events where one of the UFO reports strongly correlates to a missile test conducted in the United States.

Effects of Grape Seed Oil on Postharvest Life and Quality of Some Grape Cultivars

Table grapes (Vitis vinifera L.) are an important crop worldwide. Postharvest problems like berry shattering, decay and stem dehydration are some of the important factors that limit the marketing of table grapes. Edible coatings are an alternative for increasing shelf-life of fruits, protecting fruits from humidity and oxygen effects, thus retarding their deterioration. This study aimed to compare different grape seed oil applications (GSO, 0.5 g L-1, 1 g L-1, 2 g L-1) and SO2 generating pads effects (SO2-1, SO2-2). Treated grapes with GSO and generating pads were packaged into polyethylene trays and stored at 0 ± 1°C and 85-95% moisture. Effects of the applications were investigated by some quality and sensory evaluations with intervals of 15 days. SO2 applications were determined the most effective treatments for minimizing weight loss and changes in TA, pH, color and appearance value. Grape seed oil applications were determined as a good alternative for grape preservation, improving weight losses and °Brix, TA, the color values and sensory analysis. Commercially, ‘Alphonse Lavallée’ clusters were stored for 75 days and ‘Antep Karası’ clusters for 60 days. The data obtained from GSO indicated that it had a similar quality result to SO2 for up to 40 days storage.

Comparison of Real-Time PCR and FTIR with Chemometrics Technique in Analysing Halal Supplement Capsules

Halal authentication and verification in supplement capsules are highly required as the gelatine available in the market can be from halal or non-halal sources. It is an obligation for Muslim to consume and use the halal consumer goods. At present, real-time polymerase chain reaction (RT-PCR) is the most common technique being used for the detection of porcine and bovine DNA in gelatine due to high sensitivity of the technique and higher stability of DNA compared to protein. In this study, twenty samples of supplements capsules from different products with different Halal logos were analyzed for porcine and bovine DNA using RT-PCR. Standard bovine and porcine gelatine from eurofins at a range of concentration from 10-1 to 10-5 ng/µl were used to determine the linearity range, limit of detection and specificity on RT-PCR (SYBR Green method). RT-PCR detected porcine (two samples), bovine (four samples) and mixture of porcine and bovine (six samples). The samples were also tested using FT-IR technique where normalized peak of IR spectra were pre-processed using Savitsky Golay method before Principal Components Analysis (PCA) was performed on the database. Scores plot of PCA shows three clusters of samples; bovine, porcine and mixture (bovine and porcine). The RT-PCR and FT-IR with chemometrics technique were found to give same results for porcine gelatine samples which can be used for Halal authentication.

Specific Frequency of Globular Clusters in Different Galaxy Types

Globular clusters (GC) are important objects for tracing the early evolution of a galaxy. We study the correlation between the cluster population and the global properties of the host galaxy. We found that the correlation between cluster population (NGC) and the baryonic mass (Mb) of the host galaxy are best described as 10 −5.6038Mb. In order to understand the origin of the U -shape relation between the GC specific frequency (SN) and Mb (caused by the high value of SN for dwarfs galaxies and giant ellipticals and a minimum SN for intermediate mass galaxies≈ 1010M), we derive a theoretical model for the specific frequency (SNth). The theoretical model for SNth is based on the slope of the power-law embedded cluster mass function (β) and different time scale (Δt) of the forming galaxy. Our results show a good agreement between the observation and the model at a certain β and Δt. The model seems able to reproduce higher value of SNth of β = 1.5 at the midst formation time scale.

Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damages to health and the environment, economic losses and material damages. Studies about traditional road traffic accidents in urban zones represents very high inversion of time and money, additionally, the result are not current. However, nowadays in many countries, the crowdsourced GPS based traffic and navigation apps have emerged as an important source of information to low cost to studies of road traffic accidents and urban congestion caused by them. In this article we identified the zones, roads and specific time in the CDMX in which the largest number of road traffic accidents are concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Discovery of knowledge in the database (KDD) for the discovery of patterns in the accidents reports. Furthermore, using data mining techniques with the help of Weka. The selected algorithms was the Maximization of Expectations (EM) to obtain the number ideal of clusters for the data and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.

Evaluation of Buckwheat Genotypes to Different Planting Geometries and Fertility Levels in Northern Transition Zone of Karnataka

Buckwheat (Fagopyrum esculentum Moench) is an annual crop belongs to family Poligonaceae. The cultivated buckwheat species are notable for their exceptional nutritive values. It is an important source of carbohydrates, fibre, macro, and microelements such as K, Ca, Mg, Na and Mn, Zn, Se, and Cu. It also contains rutin, flavonoids, riboflavin, pyridoxine and many amino acids which have beneficial effects on human health, including lowering both blood lipid and sugar levels. Rutin, quercetin and some other polyphenols are potent carcinogens against colon and other cancers. Buckwheat has significant nutritive value and plenty of uses. Cultivation of buckwheat in Sothern part of India is very meager. Hence, a study was planned with an objective to know the performance of buckwheat genotypes to different planting geometries and fertility levels. The field experiment was conducted at Main Agriculture Research Station, University of Agriculture Sciences, Dharwad, India, during 2017 Kharif. The experiment was laid-out in split-plot design with three replications having three planting geometries as main plots, two genotypes as sub plots and three fertility levels as sub-sub plot treatments. The soil of the experimental site was vertisol. The standard procedures are followed to record the observations. The planting geometry of 30*10 cm was recorded significantly higher seed yield (893 kg/ha⁻¹), stover yield (1507 kg ha⁻¹), clusters plant⁻¹ (7.4), seeds clusters⁻¹ (7.9) and 1000 seed weight (26.1 g) as compared to 40*10 cm and 20*10 cm planting geometries. Between the genotypes, significantly higher seed yield (943 kg ha⁻¹) and harvest index (45.1) was observed with genotype IC-79147 as compared to PRB-1 genotype (687 kg ha⁻¹ and 34.2, respectively). However, the genotype PRB-1 recorded significantly higher stover yield (1344 kg ha⁻¹) as compared to genotype IC-79147 (1173 kg ha⁻¹). The genotype IC-79147 was recorded significantly higher clusters plant⁻¹ (7.1), seeds clusters⁻¹ (7.9) and 1000 seed weight (24.5 g) as compared PRB-1 (5.4, 5.8 and 22.3 g, respectively). Among the fertility levels tried, the fertility level of 60:30 NP kg ha⁻¹ recorded significantly higher seed yield (845 kg ha-1) and stover yield (1359 kg ha⁻¹) as compared to 40:20 NP kg ha-1 (808 and 1259 kg ha⁻¹ respectively) and 20:10 NP kg ha-1 (793 and 1144 kg ha⁻¹ respectively). Within the treatment combinations, IC 79147 genotype having 30*10 cm planting geometry with 60:30 NP kg ha⁻¹ recorded significantly higher seed yield (1070 kg ha⁻¹), clusters plant⁻¹ (10.3), seeds clusters⁻¹ (9.9) and 1000 seed weight (27.3 g) compared to other treatment combinations.

An Improved K-Means Algorithm for Gene Expression Data Clustering

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Web Proxy Detection via Bipartite Graphs and One-Mode Projections

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

A Construction Management Tool: Determining a Project Schedule Typical Behaviors Using Cluster Analysis

Delays in the construction industry are a global phenomenon. Many construction projects experience extensive delays exceeding the initially estimated completion time. The main purpose of this study is to identify construction projects typical behaviors in order to develop a prognosis and management tool. Being able to know a construction projects schedule tendency will enable evidence-based decision-making to allow resolutions to be made before delays occur. This study presents an innovative approach that uses Cluster Analysis Method to support predictions during Earned Value Analyses. A clustering analysis was used to predict future scheduling, Earned Value Management (EVM), and Earned Schedule (ES) principal Indexes behaviors in construction projects. The analysis was made using a database with 90 different construction projects. It was validated with additional data extracted from literature and with another 15 contrasting projects. For all projects, planned and executed schedules were collected and the EVM and ES principal indexes were calculated. A complete linkage classification method was used. In this way, the cluster analysis made considers that the distance (or similarity) between two clusters must be measured by its most disparate elements, i.e. that the distance is given by the maximum span among its components. Finally, through the use of EVM and ES Indexes and Tukey and Fisher Pairwise Comparisons, the statistical dissimilarity was verified and four clusters were obtained. It can be said that construction projects show an average delay of 35% of its planned completion time. Furthermore, four typical behaviors were found and for each of the obtained clusters, the interim milestones and the necessary rhythms of construction were identified. In general, detected typical behaviors are: (1) Projects that perform a 5% of work advance in the first two tenths and maintain a constant rhythm until completion (greater than 10% for each remaining tenth), being able to finish on the initially estimated time. (2) Projects that start with an adequate construction rate but suffer minor delays culminating with a total delay of almost 27% of the planned time. (3) Projects which start with a performance below the planned rate and end up with an average delay of 64%, and (4) projects that begin with a poor performance, suffer great delays and end up with an average delay of a 120% of the planned completion time. The obtained clusters compose a tool to identify the behavior of new construction projects by comparing their current work performance to the validated database, thus allowing the correction of initial estimations towards more accurate completion schedules.

Identification of Disease Causing DNA Motifs in Human DNA Using Clustering Approach

Studying DNA (deoxyribonucleic acid) sequence is useful in biological processes and it is applied in the fields such as diagnostic and forensic research. DNA is the hereditary information in human and almost all other organisms. It is passed to their generations. Earlier stage detection of defective DNA sequence may lead to many developments in the field of Bioinformatics. Nowadays various tedious techniques are used to identify defective DNA. The proposed work is to analyze and identify the cancer-causing DNA motif in a given sequence. Initially the human DNA sequence is separated as k-mers using k-mer separation rule. The separated k-mers are clustered using Self Organizing Map (SOM). Using Levenshtein distance measure, cancer associated DNA motif is identified from the k-mer clusters. Experimental results of this work indicate the presence or absence of cancer causing DNA motif. If the cancer associated DNA motif is found in DNA, it is declared as the cancer disease causing DNA sequence. Otherwise the input human DNA is declared as normal sequence. Finally, elapsed time is calculated for finding the presence of cancer causing DNA motif using clustering formation. It is compared with normal process of finding cancer causing DNA motif. Locating cancer associated motif is easier in cluster formation process than the other one. The proposed work will be an initiative aid for finding genetic disease related research.

Solar-Inducted Cluster Head Relocation Algorithm

A special area in the study of Wireless Sensor Networks (WSNs) is how to move sensor nodes, as it expands the scope of application of wireless sensors and provides new opportunities to improve network performance. On the other side, it opens a set of new problems, especially if complete clusters are mobile. Node mobility can prolong the network lifetime. In such WSN, some nodes are possibly moveable or nomadic (relocated periodically), while others are static. This paper presents an idea of mobile, solar-powered CHs that relocate themselves inside clusters in such a way that the total energy consumption in the network reduces, and the lifetime of the network extends. Positioning of CHs is made in each round based on selfish herd hypothesis, where leader retreats to the center of gravity. Based on this idea, an algorithm, together with its modified version, has been presented and tested in this paper. Simulation results show that both algorithms have benefits in network lifetime, and prolongation of network stability period duration.

Networks in the Tourism Sector in Brazil: Proposal of a Management Model Applied to Tourism Clusters

Companies in the tourism sector need to achieve competitive advantages for their survival in the market. In this way, the models based on association, cooperation, complementarity, distribution, exchange and mutual assistance arise as a possibility of organizational development, taking as reference the concept of networks. Many companies seek to partner in local networks as clusters to act together and associate. The main objective of the present research is to identify the specificities of management and the practices of cooperation in the tourist destination of São Paulo - Brazil, and to propose a new management model with possible cluster of tourism. The empirical analysis was carried out in three phases. As a first phase, a research was made by the companies, associations and tourism organizations existing in São Paulo, analyzing the characteristics of their business. In the second phase, the management specificities and cooperation practice used in the tourist destination. And in the third phase, identifying the possible strengths and weaknesses that potential or potential tourist cluster could have, proposing the development of the management model of the same adapted to the needs of the companies, associations and organizations. As a main result, it has been identified that companies, associations and organizations could be looking for synergies with each other and collaborate through a Hiperred organizational structure, in which they share their knowledge, try to make the most of the collaboration and to benefit from three concepts: flexibility, learning and collaboration. Finally, it is concluded that, the proposed tourism cluster management model is viable for the development of tourism destinations because it makes it possible to strategically address agents which are responsible for public policies, as well as public and private companies and organizations in their strategies competitiveness and cooperation.

Ensuring Uniform Energy Consumption in Non-Deterministic Wireless Sensor Network to Protract Networks Lifetime

Wireless sensor networks have enticed much of the spotlight from researchers all around the world, owing to its extensive applicability in agricultural, industrial and military fields. Energy conservation node deployment stratagems play a notable role for active implementation of Wireless Sensor Networks. Clustering is the approach in wireless sensor networks which improves energy efficiency in the network. The clustering algorithm needs to have an optimum size and number of clusters, as clustering, if not implemented properly, cannot effectively increase the life of the network. In this paper, an algorithm has been proposed to address connectivity issues with the aim of ensuring the uniform energy consumption of nodes in every part of the network. The results obtained after simulation showed that the proposed algorithm has an edge over existing algorithms in terms of throughput and networks lifetime.

Digital Forensics Compute Cluster: A High Speed Distributed Computing Capability for Digital Forensics

We have developed a distributed computing capability, Digital Forensics Compute Cluster (DFORC2) to speed up the ingestion and processing of digital evidence that is resident on computer hard drives. DFORC2 parallelizes evidence ingestion and file processing steps. It can be run on a standalone computer cluster or in the Amazon Web Services (AWS) cloud. When running in a virtualized computing environment, its cluster resources can be dynamically scaled up or down using Kubernetes. DFORC2 is an open source project that uses Autopsy, Apache Spark and Kafka, and other open source software packages. It extends the proven open source digital forensics capabilities of Autopsy to compute clusters and cloud architectures, so digital forensics tasks can be accomplished efficiently by a scalable array of cluster compute nodes. In this paper, we describe DFORC2 and compare it with a standalone version of Autopsy when both are used to process evidence from hard drives of different sizes.

Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform

Given a fixed fund, purchasing fewer hosts of higher capability or inversely more of lower capability is a must-be-made trade-off in practices for building a Hadoop big data platform. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing is with SQL queries of aggregate, join, and space-time condition selections executed upon massive data from more than 10 million housing units. In HBDP, an empirical formula was introduced to predict the performance of host clusters potential for the intended typical big data computing, and it was shaped via a regression approach. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem HDFS+Hive+Spark. A proper metric was raised to measure the performance of Hadoop clusters in HBDP, which was tested and compared with its predicted counterpart, on executing three kinds of typical SQL query tasks. Tests were conducted with respect to factors of CPU benchmark, memory size, virtual host division, and the number of element physical host in cluster. The research has been applied to practical cluster procurement for housing big data computing.

Evaluation of Groundwater Quality and Its Suitability for Drinking and Agricultural Purposes Using Self-Organizing Maps

In the present study, the self-organizing map (SOM) clustering technique was applied to identify homogeneous clusters of hydrochemical parameters in El Milia plain, Algeria, to assess the quality of groundwater for potable and agricultural purposes. The visualization of SOM-analysis indicated that 35 groundwater samples collected in the study area were classified into three clusters, which showed progressive increase in electrical conductivity from cluster one to cluster three. Samples belonging to cluster one are mostly located in the recharge zone showing hard fresh water type, however, water type gradually changed to hard-brackish type in the discharge zone, including clusters two and three. Ionic ratio studies indicated the role of carbonate rock dissolution in increases on groundwater hardness, especially in cluster one. However, evaporation and evapotranspiration are the main processes increasing salinity in cluster two and three.

Performance Analysis of Deterministic Stable Election Protocol Using Fuzzy Logic in Wireless Sensor Network

In Wireless Sensor Network (WSN), the sensor containing motes (nodes) incorporate batteries that can lament at some extent. To upgrade the energy utilization, clustering is one of the prototypical approaches for split sensor motes into a number of clusters where one mote (also called as node) proceeds as a Cluster Head (CH). CH selection is one of the optimization techniques for enlarging stability and network lifespan. Deterministic Stable Election Protocol (DSEP) is an effectual clustering protocol that makes use of three kinds of nodes with dissimilar residual energy for CH election. Fuzzy Logic technology is used to expand energy level of DSEP protocol by using fuzzy inference system. This paper presents protocol DSEP using Fuzzy Logic (DSEP-FL) CH by taking into account four linguistic variables such as energy, concentration, centrality and distance to base station. Simulation results show that our proposed method gives more effective results in term of a lifespan of network and stability as compared to the performance of other clustering protocols.

Assessment of Multi-Domain Energy Systems Modelling Methods

Emissions are a consequence of electricity generation. A major option for low carbon generation, local energy systems featuring Combined Heat and Power with solar PV (CHPV) has significant potential to increase energy performance, increase resilience, and offer greater control of local energy prices while complementing the UK’s emissions standards and targets. Recent advances in dynamic modelling and simulation of buildings and clusters of buildings using the IDEAS framework have successfully validated a novel multi-vector (simultaneous control of both heat and electricity) approach to integrating the wide range of primary and secondary plant typical of local energy systems designs including CHP, solar PV, gas boilers, absorption chillers and thermal energy storage, and associated electrical and hot water networks, all operating under a single unified control strategy. Results from this work indicate through simulation that integrated control of thermal storage can have a pivotal role in optimizing system performance well beyond the present expectations. Environmental impact analysis and reporting of all energy systems including CHPV LES presently employ a static annual average carbon emissions intensity for grid supplied electricity. This paper focuses on establishing and validating CHPV environmental performance against conventional emissions values and assessment benchmarks to analyze emissions performance without and with an active thermal store in a notional group of non-domestic buildings. Results of this analysis are presented and discussed in context of performance validation and quantifying the reduced environmental impact of CHPV systems with active energy storage in comparison with conventional LES designs.

Structure-Activity Relationship of Gold Catalysts on Alumina Supported Cu-Ce Oxides for CO and Volatile Organic Compound Oxidation

The catalytic oxidation of CO and volatile organic compounds (VOCs) is considered as one of the most efficient ways to reduce harmful emissions from various chemical industries. The effectiveness of gold-based catalysts for many reactions of environmental significance was proven during the past three decades. The aim of this work was to combine the favorable features of Au and Cu-Ce mixed oxides in the design of new catalytic materials of improved efficiency and economic viability for removal of air pollutants in waste gases from formaldehyde production. Supported oxides of copper and cerium with Cu: Ce molar ratio 2:1 and 1:5 were prepared by wet impregnation of g-alumina. Gold (2 wt.%) catalysts were synthesized by a deposition-precipitation method. Catalysts characterization was carried out by texture measurements, powder X-ray diffraction, temperature programmed reduction and electron paramagnetic resonance spectroscopy. The catalytic activity in the oxidation of CO, CH3OH and (CH3)2O was measured using continuous flow equipment with fixed bed reactor. Both Cu-Ce/alumina samples demonstrated similar catalytic behavior. The addition of gold caused significant enhancement of CO and methanol oxidation activity (100 % degree of CO and CH3OH conversion at about 60 and 140 oC, respectively). The composition of Cu-Ce mixed oxides affected the performance of gold-based samples considerably. Gold catalyst on Cu-Ce/γ-Al2O3 1:5 exhibited higher activity for CO and CH3OH oxidation in comparison with Au on Cu-Ce/γ-Al2O3 2:1. The better performance of Au/Cu-Ce 1:5 was related to the availability of highly dispersed gold particles and copper oxide clusters in close contact with ceria.

Identifying Factors for Evaluating Livability Potential within a Metropolis: A Case of Kolkata

Livability is a holistic concept whose factors include many complex characteristics and levels of interrelationships among them. It has been considered as people’s need for public amenities and is recognized as a major element to create social welfare. The concept and principles of livability are essential for recognizing the significance of community well-being. The attributes and dimensions of livability are also important aspects to measure the overall quality of environment. Livability potential is mainly considered as the capacity to develop into the overall well-being of an urban area in future. The intent of the present study is to identify the prime factors to evaluate livability potential within a metropolis. For ground level case study, the paper has selected Kolkata Metropolitan Area (KMA) as it has wide physical, social, and economic variations within it. The initial part of the study deals with detailed literature review on livability and its significance of evaluating its potential within a metropolis. The next segment is dedicated for identifying the primary factors which would evaluate livability potential within a metropolis. In pursuit of identifying primary factors, which have a direct impact on urban livability, this study delineates the metropolitan area into various clusters, having their distinct livability potential. As a final outcome of the study, variations of livability potential of those selected clusters are highlighted to explain the complexity of the metropolitan development.