Improving University Operations with Data Mining: Predicting Student Performance

The purpose of this paper is to develop models that predict student success. These models could improve the allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. To collect data, an anonymous survey was carried out among final-year undergraduate students using random sampling. Several decision trees were created, of which the two most successful in predicting student success were chosen, based on two criteria: Grade Point Average (GPA) and the time a student needs to finish the undergraduate program (time-to-degree). Decision trees proved to be a good method for classifying student success, and they could be improved further by increasing the survey sample and developing specialized decision trees for each type of college. Methods of this type have considerable potential for use in decision support systems.
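
As an illustration of how a decision tree chooses a split, the following is a minimal sketch that searches for the threshold on one numeric attribute that minimizes weighted Gini impurity. The attribute (weekly study hours), the labels and the function names are hypothetical and are not taken from the survey described above.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    best = (None, gini(labels))  # start from the impurity of the unsplit node
    for t in sorted(set(values))[:-1]:  # the largest value cannot split anything
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical data: weekly study hours and whether GPA exceeded some cut-off.
hours = [2, 4, 5, 8, 10, 12]
high_gpa = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_threshold(hours, high_gpa)
print(threshold, impurity)
```

A full tree repeats this search over every attribute and then recurses into each branch; the sketch shows only the single-node step.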

Digital Social Networks: Examining the Knowledge Characteristics

In today's information age, many organizations are still debating how to capitalize on the value of Information Technology (IT) and Knowledge Management (KM) so that individuals can benefit and effective communication among them can be established. IT enables improved communication among knowledge workers (k-workers) through a number of social network technology domains in the workplace. The acceptance of digital discourse for sharing knowledge and facilitating knowledge and information flows in most organizations imposes a culture of knowledge sharing in Digital Social Networks (DSN). Therefore, this study examines whether k-workers with an IT background have an effect on the three knowledge characteristics: conceptual, contextual, and operational. Derived from these three knowledge characteristics, five potential factors will be examined for their effects on knowledge exchange, with the e-mail domain as the chosen query. It is expected that the results could provide a parameter for exploring how DSN contributes to supporting k-workers' virtues, performance and qualities, as well as revealing the mutual point between IT and KM.

Skin Lesion Segmentation Using Color Channel Optimization and Clustering-based Histogram Thresholding

Automatic segmentation of skin lesions is the first step towards the automated analysis of malignant melanoma. Although numerous segmentation methods have been developed, few studies have focused on determining the most effective color space for melanoma applications. This paper proposes an automatic segmentation algorithm based on color space analysis and clustering-based histogram thresholding, a process which is able to determine the optimal color channel for detecting the borders in dermoscopy images. The algorithm is tested on a set of 30 high-resolution dermoscopy images. A comprehensive evaluation of the results is provided, in which borders manually drawn by four dermatologists are compared with automated borders detected by the proposed algorithm, applying three previously used metrics (accuracy, sensitivity, and specificity) and a new metric of similarity. By performing ROC analysis and ranking the metrics, it is demonstrated that the best results are obtained with the X and XoYoR color channels, resulting in an accuracy of approximately 97%. The proposed method is also compared with two state-of-the-art skin lesion segmentation methods.
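
The three previously used metrics can be computed pixel by pixel from a manual and an automated border mask. The sketch below assumes both masks are flattened to equal-length sequences of 0/1 labels (1 = lesion); the function name and the tiny example masks are hypothetical, and the paper's new similarity metric is not reproduced because the abstract does not define it.

```python
def segmentation_metrics(predicted, reference):
    """Accuracy, sensitivity and specificity of a binary mask against a reference mask."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # lesion pixels found
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # background pixels kept
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # background marked as lesion
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # lesion pixels missed

    def ratio(numerator, denominator):
        # Guard against empty classes in very small or degenerate masks.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical 8-pixel masks: manual border (reference) and automated border.
manual = [1, 1, 1, 0, 0, 0, 0, 0]
automated = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = segmentation_metrics(automated, manual)
print(metrics)
```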

Knowledge Management Criteria among Malaysian Organizations: An ANOVA Approach

Knowledge Management (KM) criteria are an essential foundation for evaluating KM outcomes. Different sets of criteria have been developed and tailored by many researchers to determine the results of KM initiatives. However, the literature review revealed that existing sets of criteria for evaluating KM outcomes are incomplete. Hence, this paper addresses the problem of determining the criteria for measuring knowledge management outcomes among different types of Malaysian organizations. In turn, it aims to develop widely accepted criteria for measuring the success of knowledge management efforts in Malaysian organizations. Our analysis is based on the ANOVA procedure, used to compare a set of criteria among different types of organizations. This set of criteria was drawn from the literature review. It is hoped that this study gives different types of Malaysian organizations a better picture for establishing a comprehensive set of criteria to measure the results of KM programs.
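
For readers unfamiliar with the procedure, a one-way ANOVA compares the variation between group means with the variation within groups. The following is a minimal sketch of the F statistic; the ratings and the three organization types are hypothetical and do not come from the study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numeric scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical importance ratings of one KM criterion by three organization types.
ratings = [
    [4, 5, 6],   # e.g. public sector
    [6, 7, 8],   # e.g. private sector
    [8, 9, 10],  # e.g. multinational
]
f_stat, df_b, df_w = one_way_anova_f(ratings)
print(f_stat, df_b, df_w)
```

The F value is then compared with the critical value of the F distribution for the two degrees of freedom to decide whether the groups differ.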

A Semantic Recommendation Procedure for Electronic Product Catalog

To overcome the product overload faced by Internet shoppers, we introduce a semantic recommendation procedure that is more efficient when applied to Internet shopping malls. The suggested procedure recommends semantic products to customers and is based on Web usage mining, product classification, association rule mining, and frequent purchasing. We applied the procedure to the MovieLens data set for performance evaluation, and some experimental results are provided. The experimental results show superior performance in terms of coverage and precision.

Transferring Route Plan over Time

The travelling salesman problem (TSP) is a combinatorial optimization problem, and its solution approaches have been applied to many real-world problems. Pure TSP assumes that the cities to visit are fixed in time, so solutions are constructed to find the shortest path through these points. In practice, however, some points are cancelled over time. If the problem is not time-critical, recomputing the route plan is not a concern; but if the points change rapidly and the time available to decide on a new route plan is limited, a new approach is needed. We developed a route plan transfer method based on transfer learning and achieved high performance compared with determining a new model from scratch at every change.
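
The idea of reusing an existing plan rather than re-solving can be illustrated with a much simpler baseline than the transfer-learning method of the paper, which the abstract does not specify. In this sketch the old visiting order is kept and cancelled cities are skipped; the city names and coordinates are hypothetical.

```python
import math


def tour_length(tour, coords):
    """Length of the closed tour that visits the cities in the given order."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def transfer_route(tour, cancelled):
    """Reuse an existing tour: keep the visiting order, skip cancelled cities."""
    return [city for city in tour if city not in cancelled]


# Hypothetical cities on a plane and a previously computed tour.
coords = {"A": (0, 0), "B": (0, 1), "C": (1, 1), "D": (1, 0), "E": (2, 0.5)}
old_tour = ["A", "B", "C", "E", "D"]

new_tour = transfer_route(old_tour, cancelled={"E"})
old_len = tour_length(old_tour, coords)
new_len = tour_length(new_tour, coords)
print(new_tour, old_len, new_len)
```

With Euclidean distances, skipping a city never lengthens the tour (triangle inequality), so the transferred plan is immediately usable, although it is not guaranteed to be optimal for the remaining cities.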

Urban Flood Control and Management - An Integrated Approach

Flood management is one of the important fields in urban storm water management. Floods are influenced by the increase in large storm events or by improper planning of the area. This study addresses flood protection in four stages: planning, flood event, response and evaluation. However, flood protection is most effective when it is considered in the planning/design and evaluation stages, since both stages represent the land development of the area. Structural adjustments are often more reliable than nonstructural adjustments in providing flood protection; however, structural adjustments are constrained by numerous factors such as political constraints and cost. Therefore, it is important to balance both kinds of adjustment according to the situation. The technical decisions provided have to be approved by the authorities who have the power to decide on the final solution. Cost, however, is the biggest factor in determining the final decision. Therefore, this study recommends that flood protection be integrated and enforced more strongly in the early stages (planning and design) as part of the storm water management plan. Factors that constrain the technical decisions should be kept as low as possible to avoid a reduction in the expected performance of the proposed adjustments.

Role of Association Rule Mining in Numerical Data Analysis

Numerical analysis naturally finds applications in all fields of engineering and the physical sciences, but in the 21st century the life sciences and even the arts have adopted elements of scientific computation. Numerical data analysis has become a key process in research and development in all fields [6]. In this paper we attempt to analyze specified numerical patterns using association rule mining techniques, with minimum confidence and minimum support as mining criteria. The extracted rules and analyzed results are demonstrated graphically. Association rules are a simple but very useful form of data mining that describe the probabilistic co-occurrence of certain events within a database [7]. They were originally designed to analyze market-basket data, in which the likelihood of items being purchased together within the same transaction is analyzed.
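
The two mining criteria can be made concrete with a small market-basket example. This is a minimal sketch using the standard definitions of support and confidence; the transactions, item names and threshold values are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies
    return support(transactions, set(antecedent) | set(consequent)) / base


def rule_passes(transactions, antecedent, consequent, min_support, min_confidence):
    """Keep a rule only if it meets both the minimum support and confidence."""
    both = set(antecedent) | set(consequent)
    return (
        support(transactions, both) >= min_support
        and confidence(transactions, antecedent, consequent) >= min_confidence
    )


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
s = support(baskets, {"bread", "milk"})
c = confidence(baskets, {"bread"}, {"milk"})
keep = rule_passes(baskets, {"bread"}, {"milk"}, min_support=0.5, min_confidence=0.7)
print(s, c, keep)
```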

Discovering Complex Regularities: from Tree to Semi-Lattice Classifications

Data mining uses a variety of techniques, each of which is useful for some particular task. It is important to have a deep understanding of each technique and to be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user to find a strategy to optimize classification by adding, moving or deleting a neuron in order to change the number of classes. The tool can automatically suggest a strategy to optimize the number of classes, and it also supports both tree classifications and semi-lattice organizations of the classes, giving users the possibility of passing from one class to others with which it has some aspects in common. Examples of using tree and semi-lattice classifications are given to illustrate advantages and problems. The tool is applied to classify macroeconomic data that report the imports and exports of the most developed countries. It is possible to classify the countries based on their economic behaviour and to use the tool to characterize the commercial behaviour of a country in a selected class by analysing the positive and negative features that contribute to class formation. Possible interrelationships between the classes and their meaning are also discussed.

Animated Versus Static User Interfaces: A Study of Mathsigner™

In this paper we report a study aimed at determining the effects of animation on usability and appeal of educational software user interfaces. Specifically, the study compares 3 interfaces developed for the Mathsigner™ program: a static interface, an interface with highlighting/sound feedback, and an interface that incorporates five Disney animation principles. The main objectives of the comparative study were to: (1) determine which interface is the most effective for the target users of Mathsigner™ (e.g., children ages 5-11), and (2) identify any Gender and Age differences in using the three interfaces. To accomplish these goals we have designed an experiment consisting of a cognitive walkthrough and a survey with rating questions. Sixteen children ages 7-11 participated in the study, ten males and six females. Results showed no significant interface effect on user task performance (e.g., task completion time and number of errors); however, interface differences were seen in rating of appeal, with the animated interface rated more 'likeable' than the other two. Task performance and rating of appeal were not affected significantly by Gender or Age of the subjects.

A Testbed for the Experiments Performed in Missing Value Treatments

The occurrence of missing values in databases is a serious problem for data mining tasks, degrading data quality and the accuracy of analyses. The area lacks standardized experiments for treating missing values, which makes it difficult to evaluate and compare different studies because no common parameters are used. This paper proposes a testbed intended to facilitate the implementation of experiments and to provide unbiased parameters, using available datasets and suitable performance metrics, in order to improve the evaluation and comparison of state-of-the-art missing value treatments.
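
One way such a testbed can work is to remove known values from a complete dataset, apply a treatment, and score the treatment on exactly the removed positions. The sketch below assumes a single numeric column, mean imputation as the treatment and RMSE as the metric; all of these choices and all names are illustrative assumptions, not the components of the proposed testbed.

```python
import math
import random


def inject_missing(values, rate, seed=0):
    """Replace a fixed fraction of values with None; return the data and the hidden indices."""
    rng = random.Random(seed)  # fixed seed so every treatment sees the same gaps
    n_missing = round(rate * len(values))
    hidden = set(rng.sample(range(len(values)), n_missing))
    data = [None if i in hidden else v for i, v in enumerate(values)]
    return data, sorted(hidden)


def mean_impute(values):
    """Fill every None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def rmse_on_hidden(original, imputed, hidden):
    """Root mean squared error measured only on the positions that were hidden."""
    return math.sqrt(sum((original[i] - imputed[i]) ** 2 for i in hidden) / len(hidden))


# Hypothetical complete column of measurements.
complete = [5.1, 4.9, 6.2, 5.8, 5.5, 6.0, 4.7, 5.3, 5.9, 6.1]
with_gaps, hidden = inject_missing(complete, rate=0.2)
filled = mean_impute(with_gaps)
error = rmse_on_hidden(complete, filled, hidden)
print(hidden, error)
```

Because the seed and the missing rate are fixed, different treatments can be swapped in for `mean_impute` and compared on identical gaps.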

Effect of the Rise/Span Ratio of a Spherical Cap Shell on the Buckling Load

The rise/span ratio has been mentioned as one of the factors contributing to a buckling load lower than that given by the Classical theory, but this ratio has not been quantified in the equation. The purpose of this study was to determine a more realistic buckling load by quantifying the effect of the rise/span ratio, because experiments have shown that the Classical theory overestimates the load. The buckling load equation was derived from the theorem of work done and strain energy. Thereafter, finite element modeling and simulation using ABAQUS were carried out to determine the variables that govern the constant in the derived equation. The rise/span ratio was found to be the determining factor of the constant in the buckling load equation. The derived buckling load correlates closely with the load obtained from experiments.
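
For reference, the classical value that the study compares against can be computed from the cap geometry. This sketch uses the classical linear result for a complete thin sphere under uniform external pressure, p = 2E(t/R)^2 / sqrt(3(1 - nu^2)), together with the geometric relation between rise, span and radius. It does not reproduce the equation derived in the paper, which the abstract does not give; the material and dimension values are hypothetical.

```python
import math


def sphere_radius(span, rise):
    """Radius of the sphere that a cap with the given base span and rise belongs to."""
    half_span = span / 2.0
    return (half_span ** 2 + rise ** 2) / (2.0 * rise)


def classical_buckling_pressure(youngs_modulus, poisson_ratio, thickness, radius):
    """Classical critical external pressure of a thin complete spherical shell."""
    factor = 2.0 / math.sqrt(3.0 * (1.0 - poisson_ratio ** 2))
    return factor * youngs_modulus * (thickness / radius) ** 2


# Hypothetical steel cap: span 2.0 m, rise 0.2 m, thickness 5 mm.
E = 210e9   # Pa
nu = 0.3
t = 0.005   # m
R = sphere_radius(span=2.0, rise=0.2)
p_classical = classical_buckling_pressure(E, nu, t, R)
print(R, p_classical)
```

For a fixed span and thickness, a lower rise gives a larger radius and therefore a lower classical pressure, which is the baseline against which any rise/span correction would be measured.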

Multi-Dimensional Concerns Mining for Web Applications via Concept-Analysis

Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering). The scientific community has therefore focused attention on Web application design, development, analysis, and testing by studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web applications, based on concept analysis, impact analysis, and token-based concern identification. This approach lets the user analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, in order to document, understand and test Web applications. This technique was developed in the context of the WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support this technique is currently under development.

Internal Force State Recognition of Jiujiang Bridge Based on Cable Force-displacement Relationship

The nearly 21-year-old Jiujiang Bridge, which suffers from an uneven line shape, persistent large downwarping of the main beam and cracking of the box girder, needs reinforcement and cable adjustment. It has undergone cable adjustment twice, with incomplete data. Therefore, identifying the initial internal force state of the Jiujiang Bridge is the key to the cable adjustment project. Based on parameter identification from static force test data, this paper suggests determining the initial internal force state of the cable-stayed bridge using a parameter identification method built on the cable force-displacement relationship. That is, after measuring the displacement and the change in cable forces twice, one can identify the parameters concerned by means of optimization. This method is applied to the cable adjustment, replacement and reinforcement project for the Jiujiang Bridge, where it serves as guidance for the cable adjustment and reinforcement work.

Selection of Plants as Possible Rhizoremediators for Restoration of Oil Contaminated Soil

In studying the possibility of using plants as rhizoremediators, barley and a grass mixture, which showed resistance to various concentrations of oil, were selected. Oil was shown to have a minimal inhibitory effect on these plants in terms of morphological parameters such as plant survival and the length and biomass of shoots and roots, compared with the control. In determining physiological parameters, a slight decrease in the amount of chlorophyll a and b in the leaves was noted. A study of root surface adsorption showed differences in the ratio of total root surface to working root surface when the plants were grown in soil with oil.

From Separatism to Coalition: Variants in Language Politics and Leadership Pattern in Dravidian Movement

This paper describes the evolution of language politics and the part played by political leaders with reference to the Dravidian parties in Tamil Nadu. It explores the evolution from separatism to coalition in sustaining the values of parliamentary democracy and federalism. The appropriation of language politics appears to be fully ascribed to the DMK leadership under Annadurai and Karunanidhi. For them, the Tamil language is a self-determining power, a terrain of nationhood, and a perennial source of social and political power. The DMK remains a symbol of a Tamil nationalist party playing language politics in the interest of the Tamils. Though electoral alliances largely determine success, language politics still has significant space in the politics of Tamil Nadu. Ironically, the DMK has moved from the periphery to the centre to gain national recognition for the Tamils as well as to maximize its own power. The evolution can be seen in two major phases, language politics for party building and language politics for state building, with three successive political processes: language politics in the process of separatism, representative politics, and coalition. The much pronounced Dravidian Movement is radical enough to democratize the party ideology in order to survive in the spirit of parliamentary democracy. This has secured its own rewards in terms of political power, which provides the means to achieve the social and political goals of the party. Language politics and leadership pattern actualized this trend even though the movement has shifted from separatism to coalition.

Finding Authoritative Researchers on Academic Web Sites

In this paper, we present a methodology for finding authoritative researchers by analyzing academic Web sites. We show a case study in which we concentrate on a set of Czech computer science departments' Web sites. We analyze the relations between them via hyperlinks and find the most important ones using several common ranking algorithms. We then examine the contents of the research papers present on these sites and determine the most authoritative Czech authors.
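
The abstract does not name the ranking algorithms used. As one widely known example of ranking sites by their hyperlinks, the following is a minimal PageRank sketch using power iteration; the site names and the link graph are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Rank pages by power iteration; links maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical hyperlink graph between department Web sites.
site_links = {
    "dept-a.example": ["dept-b.example", "dept-c.example"],
    "dept-b.example": ["dept-c.example"],
    "dept-c.example": ["dept-a.example"],
    "dept-d.example": ["dept-c.example"],
}
ranks = pagerank(site_links)
top_site = max(ranks, key=ranks.get)
print(top_site, ranks)
```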

Defect Cause Modeling with Decision Tree and Regression Analysis

The main aim of this study is to identify the most influential variables that cause defects in the items produced by a casting company located in Turkey. To this end, one of the company's items with a high defect rate is selected. Two approaches, regression analysis and decision trees, are used to model the relationship between process parameters and defect types. Although the logistic regression models failed, the decision tree model gives meaningful results. Based on these results, it can be claimed that the decision tree approach is a promising technique for determining the most important process variables.

Application of Artificial Neural Network to Classification Surface Water Quality

Water quality is a subject of ongoing concern. Deterioration of water quality has initiated serious management efforts in many countries. This study endeavors to classify water quality automatically. The water quality classes are evaluated using six factor indices: pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (TColiform). The methodology involves applying data mining techniques using multilayer perceptron (MLP) neural network models. The data cover 11 canal sites in Dusit district in Bangkok, Thailand, and were obtained from the Department of Drainage and Sewerage, Bangkok Metropolitan Administration, for 2007-2011. The multilayer perceptron neural network achieves a high accuracy rate of 96.52% in classifying the water quality of the Dusit district canals in Bangkok. This encouraging result could be applied to the planning and management of water quality.

Performance Optimization of Data Mining Application Using Radial Basis Function Classifier

Text data mining is a process of exploratory data analysis. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. This paper describes a proposed radial basis function classifier that performs comparative cross-validation against an existing radial basis function classifier. The feasibility and benefits of the proposed approach are demonstrated on a data mining problem: direct marketing, which has become an important application field of data mining. Comparative cross-validation involves estimating accuracy by either stratified k-fold cross-validation or equivalent repeated random subsampling. The proposed method may have high bias, and its performance (accuracy estimation in our case) may be poor due to high variance. Accordingly, the accuracy with the proposed radial basis function classifier was lower than with the existing one. However, there is a smaller improvement in runtime and a larger improvement in precision and recall. In the proposed method, classification accuracy and prediction accuracy are both determined, and the prediction accuracy is comparatively high.
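
Stratified k-fold cross-validation splits the data so that each fold keeps roughly the class proportions of the whole dataset. The following is a minimal sketch of the splitting step only, without shuffling and without any classifier; the labels are hypothetical direct-marketing responses.

```python
from collections import defaultdict


def stratified_k_fold(labels, k):
    """Return k (train_indices, test_indices) pairs with class proportions preserved."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Deal the indices of each class round-robin across the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    splits = []
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        splits.append((train, test))
    return splits


# Hypothetical responses to a direct-marketing campaign: 8 "no", 4 "yes".
responses = ["no"] * 8 + ["yes"] * 4
splits = stratified_k_fold(responses, k=4)
for train, test in splits:
    print(test, [responses[i] for i in test])
```

Accuracy is then estimated by training on each `train` set, scoring on the matching `test` set, and averaging the k scores.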