Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Parallel hybrid storage systems consist of a hierarchy of storage devices that differ in read performance. Read speed increases toward the top of the hierarchy. Thus, migrating an application’s important data, i.e. the data that it will access in the near future, to the uppermost level reduces the application’s I/O waiting time and hence its elapsed execution time. In this research, we implement a trace-driven, two-level parallel hybrid storage system prototype consisting of HDDs and SSDs. The prototype uses data mining techniques to classify the application’s data and thereby determine its near-future data accesses in parallel with its on-demand requests. The important data are continuously migrated to the uppermost level of the hierarchy. Our simulation results, obtained with a variety of traces, show that the data migration approach integrated with data mining techniques reduces the application’s elapsed execution time by at least 22%.

Application of Building Information Modeling in Energy Management of Individual Departments Occupying University Facilities

To assist individual departments within universities in their energy management tasks, this study explores the application of Building Information Modeling in establishing the ‘BIM-based Energy Management Support System’ (BIM-EMSS). The BIM-EMSS consists of six components: (1) sensors installed for each occupant and each piece of equipment, (2) electricity sub-meters (continuously logging the lighting, HVAC, and socket electricity consumption of each room), (3) BIM models of all rooms within individual departments’ facilities, (4) a data warehouse (for storing occupancy status and logged electricity consumption data), (5) a building energy management system that provides energy managers with various energy management functions, and (6) an energy simulation tool (such as eQuest) that generates real-time 'standard energy consumption' data against which 'actual energy consumption' data are compared and energy efficiency is evaluated. Through the building energy management system, the energy manager is able to (a) view a 3D visualization (BIM model) of each room, in which the occupancy and equipment status detected by the sensors and the logged electricity consumption data are displayed continuously; (b) perform real-time energy consumption analysis to compare the actual and standard energy consumption profiles of a space; (c) obtain energy consumption anomaly warnings for specific rooms so that corrective energy management actions can be taken (a data mining technique is employed to analyze the relation between the space occupancy pattern and the current equipment settings to flag an anomaly, such as appliances turning on without occupancy); and (d) perform historical energy consumption analysis to review monthly and annual energy consumption profiles and compare them against historical energy profiles.
The BIM-EMSS was further implemented in a research lab in the Department of Architecture of NTUST in Taiwan, and the implementation results are presented to illustrate how it can be used to assist individual departments within universities in their energy management tasks.

Applying Hybrid Graph Drawing and Clustering Methods on Stock Investment Analysis

Stock investment decisions are often made based on current events in the global economy and the analysis of historical data. Visual representation can help investors gain a deeper understanding of, and better insight into, stock market trends more efficiently. Trend analysis is based on long-term data collection. This study adopts a hybrid method that combines a clustering algorithm with a force-directed algorithm to overcome the scalability problem of visualizing large data. The method reveals the potential relationships between stocks and determines the strength and connectivity of those relationships, giving investors an additional view of stock relationships for reference. Information derived from the visualization will also help them make informed decisions. The experimental results show that the proposed method is able to produce aesthetically pleasing visualizations that provide clearer views of connectivity and edge weights.

Performance Evaluation of Data Mining Techniques for Predicting Software Reliability

Accurate software reliability prediction not only enables developers to improve the quality of software but also provides useful information to help them plan valuable resources. This paper examines the performance of three well-known data mining techniques (CART, TreeNet and Random Forest) for predicting software reliability. We evaluate and compare the performance of the proposed models with a Cascade Correlation Neural Network (CCNN) using sixteen empirical databases from the Data and Analysis Center for Software. The goal of our study is to help project managers concentrate their testing efforts so as to minimize software failures and improve the reliability of software systems. Two performance measures, Normalized Root Mean Squared Error (NRMSE) and Mean Absolute Error (MAE), show that the CART model is more accurate than the models built using Random Forest, TreeNet and CCNN on all datasets used in our study. Finally, we conclude that such methods can help in reliability prediction using real-life failure datasets.
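The two error measures named above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the abstract does not state which NRMSE normalization was used, so the version below assumes one common convention (root of the summed squared errors divided by the summed squared actual values), and the sample numbers are invented.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def nrmse(actual, predicted):
    """Normalized RMSE.

    ASSUMPTION: normalization by the sum of squared actual values;
    other conventions (range or mean of the actuals) also exist.
    """
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    squared_actual = sum(a ** 2 for a in actual)
    return math.sqrt(squared_error / squared_actual)

# Invented illustration values, e.g. observed vs. predicted failure counts.
observed = [10, 20, 30]
forecast = [12, 18, 33]
mae_value = mae(observed, forecast)
nrmse_value = nrmse(observed, forecast)
```

Lower values of either measure indicate a closer fit, which is the sense in which one model is called more accurate than another.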

Effects of Alternative Opportunities and Compensation on Turnover Intention of Singapore PMET

In Singapore, talent retention is one of the most persistent and real issues companies have to grapple with due to the tight labour market. Being resource-scarce, Singapore depends solely on its talented pool of high-quality human resources to sustain its competitive advantage in the global economy. But the complex and multifaceted nature of the turnover phenomenon makes prescribing effective talent retention strategies in such a competitive labour market very challenging. This is especially true of monetary incentives, where companies struggle to answer the question “How much is enough?” By examining the interactive effects of perceived alternative employment opportunities, annual salary and satisfaction with compensation on the turnover intention of 102 Singapore Professionals, Managers, Executives and Technicians (PMET) through correlation analyses and multiple regressions, important insights into the psyche of the Singapore talent pool can be drawn. It is found that annual salary influences turnover intention indirectly, through mediation and moderation effects involving PMET’s satisfaction with compensation. PMET are also found to be heavily swayed by better external opportunities. This implies that talent retention strategies should not adopt a purely monetary blanket approach but rather a comprehensive and holistic one that considers the dynamics of prevailing market conditions.

Design of Wireless Readout System for Resonant Gas Sensors

This paper presents the design of a wireless readout system for tracking the frequency shift of a polymer-coated piezoelectric micro-electromechanical resonator due to gas absorption. The magnitude of this frequency shift indicates the percentage of a particular gas to which the sensor is exposed. It is measured using an oscillator and an FPGA-based frequency counter, with the resonator serving as the frequency-determining element of the oscillator. The system consists of a Gas Sensing Wireless Readout (GSWR) and a USB Wireless Transceiver (UWT). The GSWR consists of an oscillator based on a trans-impedance sustaining amplifier, an FPGA-based frequency readout, a sub-1 GHz wireless transceiver and a microcontroller. The UWT plugs into a computer's USB port and functions as a wireless module that transfers gas sensor data from the GSWR to the computer. A GUI program running on the computer periodically polls for sensor data over the UWT-GSWR wireless link; the response from the GSWR is logged to a file for post-processing and displayed on screen.

Development of Tensile Stress-Strain Relationship for High-Strength Steel Fiber Reinforced Concrete

This paper provides a tensile stress-strain (σ-ε) relationship for High-Strength Steel Fiber Reinforced Concrete (HSFRC). The load-deflection (P-δ) behavior of HSFRC beams tested under four-point flexural load was used with inverse analysis to calculate the tensile σ-ε relationship for the tested concrete grades (70 and 90 MPa) containing 60 kg/m³ (0.76%) of hook-end steel fibers. A first estimate of the tensile σ-ε relationship is obtained using RILEM TC 162-TDF and other methods available in the literature that are frequently used to determine the tensile σ-ε relationship of Normal-Strength Concrete (NSC). The Non-Linear Finite Element Analysis (NLFEA) package ABAQUS® is used to model the beam’s P-δ behavior. The results show that an element-size-dependent tensile σ-ε relationship for HSFRC can be successfully generated and adopted for further analyses involving HSFRC structures.

Analysis of Diverse Cluster Ensemble Techniques

Data mining is the process of discovering interesting patterns in large amounts of data. Clustering is one of the most important supporting processes for accessing such data quickly. Clustering identifies similarities between data objects according to their characteristics and groups related objects into clusters. A cluster ensemble combines multiple runs of different clustering algorithms to obtain a consensus partition of the original dataset, consolidating the results of a collection of individual clusterings. The performance of a cluster ensemble is mainly affected by two principal factors: diversity and quality. This paper presents an overview of the different cluster ensemble algorithms and of the methods that the cluster ensemble literature uses to improve diversity and quality. It also provides a comparative analysis of different cluster ensembles and summarizes the various cluster ensemble methods. This analysis should be useful to clustering practitioners and help them choose the most appropriate method for the problem at hand.

Upgraded Rough Clustering and Outlier Detection Method on Yeast Dataset by Entropy Rough K-Means Method

Rough set theory is used to handle uncertainty and incomplete information by means of two precise sets, the lower approximation and the upper approximation. In this paper, rough clustering algorithms are improved by adopting Similarity, Dissimilarity-Similarity and Entropy based initial centroid selection methods. Three clustering algorithms, namely Entropy based Rough K-Means (ERKM), Similarity based Rough K-Means (SRKM) and Dissimilarity-Similarity based Rough K-Means (DSRKM), were developed and run on the yeast dataset. The rough clustering algorithms are validated with cluster validity indexes, namely the Rand and Adjusted Rand indexes. The experimental results show that the ERKM clustering algorithm performs effectively and delivers better results than the other clustering methods. Outlier detection, the identification of objects that differ markedly from the rest of the objects in the clusters, is an important task in data mining. The Entropy based Rough Outlier Factor (EROF) method appears to detect outliers effectively for the yeast dataset. In the rough K-Means (RKM) method, tuning the epsilon (ε) value from 0.8 to 1.08 makes it possible to detect outliers in the boundary region, and the RKM algorithm delivers better results when epsilon (ε) is chosen in this range. The experimental results show that the EROF method applied to the clustering algorithm performs very well and is suitable for detecting outliers effectively for all datasets. Further, the experimental readings show that the ERKM clustering method outperforms the other methods.
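As an illustration of the validation step, the Rand index can be computed by counting the object pairs on which two labelings agree. The sketch below is a generic textbook formulation, not the authors' code, and it assumes each object carries a single hard label; how the paper maps rough (lower/upper approximation) memberships to such labels is not stated in the abstract.

```python
from itertools import combinations

def rand_index(labels_true, labels_pred):
    """Fraction of object pairs on which two hard labelings agree.

    A pair counts as an agreement when both labelings put the two
    objects in the same cluster, or both put them in different clusters.
    """
    agreements = 0
    pairs = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_in_true = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        if same_in_true == same_in_pred:
            agreements += 1
        pairs += 1
    return agreements / pairs

# Invented toy labelings: cluster names differ but the partitions are identical.
reference = [0, 0, 1, 1]
clustering = [1, 1, 0, 0]
score = rand_index(reference, clustering)
```

The index depends only on the partition, not on the cluster names, so a relabeled but otherwise identical clustering scores 1.0. The Adjusted Rand index additionally corrects this score for chance agreement.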

A Strategic Sustainability Analysis of Electric Vehicles in EU Today and Towards 2050

Ambitions within the EU for moving towards sustainable transport include major emission reductions for fossil fuel road vehicles, especially for buses, trucks, and cars. The electric driveline seems to be an attractive solution for such development. This study first applied the Framework for Strategic Sustainable Development to compare the sustainability effects of today’s fossil fuel vehicles with those of electric vehicles that have batteries or hydrogen fuel cells. The study then addressed a scenario in which electric vehicles might be in the majority in Europe by 2050. The methodology called Strategic Lifecycle Assessment was used first, in which each life cycle phase was assessed for violations of sustainability principles. This indicates where further analysis could be done to quantify the magnitude of each violation, and later to create alternative strategies and actions that lead towards sustainability. A Life Cycle Assessment of combustion engine cars, plug-in hybrid cars, battery electric cars and hydrogen fuel cell cars was then conducted to compare and quantify environmental impacts. The authors found major violations of sustainability principles, such as the use of fossil fuels, which contributes to emission-related impacts such as climate change, acidification, eutrophication, ozone depletion, and particulate matter. Other violations were found, such as the use of scarce materials for batteries and fuel cells, and, in most life cycle phases for all vehicles, the use of fossil fuel vehicles for mining, production and transport. Still, the current battery and hydrogen fuel cell cars studied have less severe violations than fossil fuel cars. The life cycle assessment revealed that fossil fuel cars have considerably higher overall environmental impacts than electric cars, as long as the latter are powered by renewable electricity.
By 2050, there will likely be even more sustainable alternatives than the studied electric vehicles, since by then the EU electricity mix should stem mainly from renewable sources, batteries should be recycled, fuel cells should be a mature technology for use in vehicles (containing no scarce materials), and electric drivelines should have replaced combustion engines in other sectors. An uncertainty for fuel cells in 2050 is whether the production of hydrogen will have had time to switch to renewable resources. If so, that would contribute even more to sustainable development. Besides being adopted in the GreenCharge roadmap, the results can, the authors suggest, contribute to planning in the upcoming decades for a sustainable increase of EVs in Europe, and potentially serve as an inspiration for other smaller or larger regions. Further studies could map the environmental effects in the LCA in more detail and include other road vehicles to obtain a more precise picture of how much they could affect sustainable development.

Field Trial of Resin-Based Composite Materials for the Treatment of Surface Collapses Associated with Former Shallow Coal Mining

Effective treatment of ground instability is essential when managing the impacts associated with historic mining. A field trial was undertaken by the Coal Authority to investigate the geotechnical performance and potential use of composite materials comprising resin and fill or stone to safely treat surface collapses, such as crown-holes, associated with shallow mining. Test pits were loosely filled with various granular fill materials. The fill material was injected with commercially available silicate and polyurethane resin foam products. In situ and laboratory testing was undertaken to assess the geotechnical properties of the resultant composite materials. The test pits were subsequently excavated to assess resin permeation. Drilling and resin injection were easiest through clean limestone fill materials. Recycled building waste fill material proved difficult to inject with resin; this material is thus considered unsuitable for use in resin composites. Incomplete resin permeation in several of the test pits created irregular ‘blocks’ of composite. Injected resin foams significantly improve the stiffness and resistance (strength) of the uncompacted fill material. The stiffness of the treated fill material appears to be a function of the stone particle size, its associated compaction characteristics (under loose tipping) and the proportion of resin foam matrix. The type of fill material is more critical than the type of resin to the geotechnical properties of the composite materials. Resin composites can effectively support typical design imposed loads. Compared to traditional treatment options, such as cement grouting, the use of resin composites is potentially less disruptive, particularly for sites with limited access, and thus likely to achieve significant reinstatement cost savings. The use of resin composites is considered a suitable option for the future treatment of shallow mining collapses.

The Characteristics of Static Plantar Loading in the First-Division College Sprint Athletes

Background: Plantar pressure measurement is an effective method for assessing plantar loading and can be applied to evaluating the movement performance of the foot. The purpose of this study is to explore sprint athletes’ plantar loading characteristics and pain profiles in static standing. Methods: Experiments were undertaken on 80 first-division college sprint athletes and 85 healthy non-sprinters. The ‘JC Mat’ optical plantar pressure measurement system was used to examine the differences between the two groups in the arch index (AI), three regional and six distinct sub-regional plantar pressure distributions (PPD), and footprint characteristics. Pain assessment and self-reported health status in the sprint athletes were examined to identify their common pain areas. Results: In the control group, the males’ AI fell into the normal range, whereas the females’ AI was classified as the high-arch type. AI values of the sprint group were significantly lower than those of the control group. In the sprint group, the males in particular, PPD were higher at the medial metatarsal bone of both feet and the lateral heel of the right foot, and lower at the medial and lateral longitudinal arches of both feet. Footprint characteristics tended to support the AI and PPD results and reflected the corresponding pressure profiles. For the sprint athletes, the lateral knee joint and biceps femoris were the most common sites of musculoskeletal pain. Conclusions: The sprint athletes’ AI were generally classified as high arches, and their PPD fell between the features of runners and those of high-arched runners. These findings also correspond to the plantar pressure profiles associated with patellofemoral pain syndrome (PFPS). The pain profiles appeared to correspond to the symptoms of high-arched runners and PFPS. The findings point to a possible link between high arches and PFPS.
The correlation between high-arched runners and PFPS development warrants further study.

Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa

Water is a very scarce natural resource in South Africa. Ga-Selati River is used for both domestic and industrial purposes. This study was carried out to assess the water quality of Ga-Selati River in a mining area of Phalaborwa, Limpopo Province. The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter, while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river across eight sampling sites was 8.00 in the wet season and 9.38 in the dry season. EC values were higher in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in the dry season (929.29 mg/L) than in the wet season (640.72 mg/L). These values exceeded the South Africa Department of Water Affairs and Forestry (DWAF) recommended guideline for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mg/L), respectively. Turbidity ranged over 1.78-5.20 NTU in the wet season and 0.95-2.37 NTU in the dry season. Total hardness, expressed as the concentration of CaCO3, was computed as 312.50 mg/L in the wet season and 297.75 mg/L in the dry season, and the river water was categorised as very hard. Mean concentrations of the metals studied in the wet and dry seasons, respectively, are: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L).
Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.

Critical Success Factors Influencing Construction Project Performance for Different Objectives: Procurement Phase

Critical success factors (CSFs) and the criteria to measure project success have received much attention over the decades and are among the most widely researched topics in project management. However, despite extensive studies on the subject by different researchers, there has to date been little agreement on the CSFs. The aim of this study is to identify the CSFs that influence the performance of construction projects, and to determine their relative importance for different objectives across five stages in the project life cycle. An extensive literature review was conducted that resulted in the identification of 179 individual factors. These factors were then grouped into nine major categories. A questionnaire survey was used to collect data from three groups of respondents: client representatives, consultants, and contractors. Out of 164 questionnaires distributed, 93 were returned, yielding a response rate of 56.7%. Using the mean score, relative importance index, and weighted average method, the top 10 critical factors for each category were identified. The agreement of survey respondents on those categorised factors was analysed using Spearman’s rank correlation. A one-way analysis of variance was then performed to determine whether the mean scores differed significantly among the groups of respondents. The findings indicate that the most critical success factors in each category in the procurement phase are: proper procurement programming of materials (time), stability in the price of materials (cost), and determining quality in the construction (quality).
They are then followed by safety equipment acquisition and maintenance (health and safety), budgeting allowed in a contractual arrangement for implementing environmental management activities (environment), completeness of drawing documents (productivity), accurate measurement and pricing of bill of quantities (risk management), adequate communication among the project team (human resource), and adequate cost control measures (client satisfaction). An understanding of CSFs would help all interested parties in the construction industry to improve project performance. Furthermore, the results of this study would help construction professionals and practitioners take proactive measures for effective project management.
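The ranking step can be illustrated with a relative importance index (RII) calculation. The abstract does not give the formula or the rating scale, so this sketch assumes the commonly used form RII = ΣW / (A × N), where W are the ratings, A is the highest possible rating (a 5-point scale is assumed here) and N is the number of respondents. The factor names and ratings below are invented; only the response-rate figures come from the abstract.

```python
def relative_importance_index(ratings, max_rating=5):
    """RII = sum of ratings / (highest possible rating * number of ratings).

    ASSUMPTION: a 5-point Likert scale; the result lies in (0, 1].
    """
    return sum(ratings) / (max_rating * len(ratings))

def response_rate(returned, distributed):
    """Survey response rate as a percentage."""
    return 100.0 * returned / distributed

# Hypothetical ratings for two factors from five respondents each.
factor_ratings = {
    "factor_a": [5, 4, 4, 3, 5],
    "factor_b": [3, 3, 4, 2, 3],
}

# Rank factors from most to least important by RII.
ranked = sorted(
    factor_ratings,
    key=lambda name: relative_importance_index(factor_ratings[name]),
    reverse=True,
)

# Figures reported in the abstract: 93 of 164 questionnaires returned.
rate = response_rate(93, 164)
```

Rankings produced this way for each respondent group are the kind of input on which a rank correlation such as Spearman’s can then measure agreement.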

Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Mining big data is a major challenge today. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunications industry, mining big data is like mining for gold: it represents a big opportunity to maximize the industry's revenue streams. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, the mining of big data in telecommunications, and the benefits and opportunities gained from it.

Models of State Organization and Influence over Collective Identity and Nationalism in Spain

The main objective of this paper is to establish the relationship between models of state organization and the various types of collective identity expressed by Spaniards. The question of nationalism and identity ascription in Spain has always been a topic of special importance because the country includes territories whose populations express nationalist sentiments very different from those in the rest of Spain. Catalonia's current sovereignty challenge to the central government exemplifies the importance of the subject. To analyze this interrelation, we mine secondary data by applying the multiple correspondence analysis (MCA) technique. The main result is a typology of four types of expression of collective identity based on models of State organization, which are connected with party positions on this issue.

Effective Work Roll Cooling toward Stand Reduction in Hot Strip Process

The maintenance of work rolls in hot strip processing has been a lengthy and difficult task for hot strip manufacturers because heavy work rolls have to be taken out of the production line, which can take hours. One way to increase the time between maintenance operations is to improve the effectiveness of the work roll cooling system so that wear and tear occurs more slowly while the operating cost is kept low. Therefore, this study aims to improve the work roll cooling system by providing manufacturers with the relationship between the work-roll temperature reduction achieved by cooling and the water flow, which can help them determine a more effective water flow for the cooling system. The relationship is found by simulation with a systematic process adjustment so that satisfactory product quality is achieved. The results suggest that the manufacturer could reduce the water flow by 9% with roughly the same performance. With the same process adjustment, the feasibility of reducing the number of finishing mill stands is also investigated. The results suggest that this is possible.

Evaluation of Residual Stresses in Human Face as a Function of Growth

Growth and remodeling of biological structures have gained much attention over the past decades. Determining the response of living tissues to mechanical loads is necessary for a wide range of developing fields such as prosthetics design and computer-assisted surgical interventions. It is well known that biological structures are never stress-free, even when externally unloaded. The exact origin of these residual stresses is not clear, but theoretically, growth is one of the main sources. Extracting an organ’s shape from medical imaging does not produce any information about the residual stresses existing in that organ. The simplest cause of such stresses is gravity, since an organ grows under its influence from birth. Ignoring such residual stresses might cause erroneous results in numerical simulations. Accounting for residual stresses due to tissue growth can improve the accuracy of mechanical analysis results. This paper presents an original computational framework based on gradual growth to determine the residual stresses due to growth. To illustrate the method, we apply it to a finite element model of a healthy human face reconstructed from medical images. We compute the distribution of residual stress in facial tissues; this stress can counteract the effect of gravity and maintain tissue firmness. Our assumption is that tissue wrinkles caused by aging could be a consequence of decreasing residual stress, which then no longer counteracts gravity. Taking these stresses into account therefore seems extremely important in maxillofacial surgery, as it would help surgeons estimate tissue changes after surgery.

Development of Map of Gridded Basin Flash Flood Potential Index: GBFFPI Map of QuangNam, QuangNgai, DaNang, Hue Provinces

Flash floods occur after short rainfall intervals, from 1 to 12 hours, in small and medium basins. They typically have two characteristics: large water flow and high flow velocity. A flash flood occurs at a hill valley site (a strip of lowland terrain) in a catchment with a sufficiently large contributing area, a steep basin slope, and heavy rainfall. The risk of flash floods is determined through the Gridded Basin Flash Flood Potential Index (GBFFPI). The Flash Flood Potential Index (FFPI) is determined from the terrain slope, soil erosion, land cover, land use, and rainfall flash flood indices. To determine the GBFFPI, each cell in a map is treated as the outlet of a water accumulation basin. The GBFFPI of the cell is the basin-average value of FFPI over the corresponding water accumulation basin. Based on GIS, a tool is developed to compute GBFFPI using the ArcObjects SDK for .NET. Two types of GBFFPI maps are built: GBFFPI including the rainfall flash flood index (for real-time flash flood warning) and GBFFPI excluding it. The GBFFPI tool can be used to identify sites of high flash flood potential in a large region as quickly as possible. The GBFFPI improves on the conventional FFPI: it takes into account the basin response (the interaction of cells) and identifies flash flood sites (strips of lowland terrain) more accurately, whereas the conventional FFPI considers each cell in isolation and ignores the interaction between cells. The GBFFPI map of QuangNam, QuangNgai, DaNang, and Hue is built and exported to Google Earth. The resulting map supports the scientific basis of GBFFPI.
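The basin-averaging idea can be sketched in a few lines. This is a simplified illustration in Python rather than the authors' ArcObjects tool: it assumes each cell drains to at most one downstream cell (a single-flow-direction model), and the cell names, flow links and FFPI values are invented.

```python
def gridded_basin_ffpi(ffpi, downstream):
    """Basin-average FFPI for every cell, treating each cell as an outlet.

    ffpi:       dict mapping cell id -> FFPI value of that cell
    downstream: dict mapping cell id -> id of the cell it drains to
                (missing or None means the cell drains off the map)
    A cell's basin is the cell itself plus every cell that drains through it.
    """
    total = {cell: 0.0 for cell in ffpi}
    count = {cell: 0 for cell in ffpi}
    for cell, value in ffpi.items():
        # Walk downstream, adding this cell's FFPI to every basin it belongs to.
        node = cell
        visited = set()  # guards against accidental loops in the flow links
        while node in total and node not in visited:
            visited.add(node)
            total[node] += value
            count[node] += 1
            node = downstream.get(node)
    return {cell: total[cell] / count[cell] for cell in ffpi}

# Invented toy network: a and b both drain to c, and c drains to d.
cell_ffpi = {"a": 1.0, "b": 3.0, "c": 5.0, "d": 7.0}
flow_to = {"a": "c", "b": "c", "c": "d", "d": None}
gbffpi = gridded_basin_ffpi(cell_ffpi, flow_to)
```

In this toy network the headwater cells keep their own FFPI, while the value at `c` averages over `a`, `b` and `c`, and the value at `d` averages over all four cells. That is how the index reflects upstream conditions rather than a single cell.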

Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory

Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach for multimedia data that detects events in Twitter using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. Two types of data are fused. The first is text features extracted with the bag-of-words method and weighted using term frequency-inverse document frequency (TF-IDF). The second is visual features extracted by applying the scale-invariant feature transform (SIFT). The Dempster-Shafer theory of evidence is applied to fuse the information from these two sources. Our experiments indicate that, compared with approaches using an individual data source, the proposed data fusion approach can increase the prediction accuracy of event detection. The experimental results show that the proposed method achieved a high accuracy of 0.97, compared with 0.93 with text only and 0.86 with images only.
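The fusion step can be illustrated with Dempster's rule of combination applied to two mass functions, one per source. This is a generic sketch of the rule, not the authors' pipeline: the two-hypothesis frame and all mass values are invented, and how the paper derives masses from the TF-IDF and SIFT features is not described in the abstract.

```python
def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Products of masses whose focal sets do not intersect count as
    conflict, and the remaining mass is renormalized by (1 - conflict).
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}

EVENT = frozenset({"event"})
NO_EVENT = frozenset({"no_event"})
UNKNOWN = EVENT | NO_EVENT  # mass assigned to the whole frame (ignorance)

# Invented masses standing in for the text-based and image-based evidence.
text_mass = {EVENT: 0.6, NO_EVENT: 0.1, UNKNOWN: 0.3}
image_mass = {EVENT: 0.5, NO_EVENT: 0.2, UNKNOWN: 0.3}
fused = combine_dempster(text_mass, image_mass)
```

When both sources lean toward the same hypothesis, the fused mass on it exceeds either source's individual mass. This is the intuition behind fusing text and image evidence rather than relying on one source.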