Research of Data Cleaning Methods Based on Dependency Rules

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and outlines the key steps of a typical cleaning process. It then proposes a scalable and versatile data cleaning framework. For data whose attributes are related by dependencies, it designs several violation-discovery algorithms, stated as formal formulas, that can find the inconsistent data in every target column that depends on condition attributes, whether the data is structured (SQL) or unstructured (NoSQL). Finally, it gives six data cleaning methods based on these algorithms.
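
As a rough illustration of dependency-based violation discovery, the sketch below groups records by their condition attributes and reports every group whose target attribute takes more than one value. It is a minimal sketch under stated assumptions, not the algorithms of the paper: the function name `find_violations`, the dict-per-record representation (chosen so that SQL rows and NoSQL documents look alike), and the sample `zip`/`city` data are all hypothetical.

```python
from collections import defaultdict


def find_violations(rows, condition_attrs, target_attr):
    """Return groups of rows that agree on condition_attrs but disagree on target_attr.

    Hypothetical helper: each row is a dict, so a missing field (common in
    schemaless documents) is simply read as None.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row.get(attr) for attr in condition_attrs)
        groups[key].append(row)

    violations = {}
    for key, members in groups.items():
        distinct_targets = {member.get(target_attr) for member in members}
        if len(distinct_targets) > 1:  # the dependency is violated in this group
            violations[key] = members
    return violations


# Illustrative data only: the rule "zip determines city" is an assumption.
rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "94105", "city": "San Francisco"},
]
violations = find_violations(rows, ["zip"], "city")
```

Here the two records sharing `zip` 10001 are flagged because they disagree on `city`. Deciding which of the conflicting values to keep is the job of a separate cleaning step.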

Model-Based Software Regression Test Suite Reduction

In this paper, we present a model-based regression test suite reduction approach that uses extended finite state machine (EFSM) model dependence analysis and a probability-driven greedy algorithm to reduce software regression test suites. The approach automatically identifies the difference between the original model and the modified model as a set of elementary model modifications. EFSM dependence analysis is performed for each elementary modification to reduce the regression test suite, and the probability-driven greedy algorithm is then applied to select, from the reduced regression test suite, the minimum set of test cases that covers all interaction patterns. Our initial experience shows that the approach may significantly reduce the size of regression test suites.
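
The selection step can be pictured as a set-cover problem. The sketch below uses a plain greedy set-cover heuristic, not the probability-driven variant described above, and it assumes the interaction patterns covered by each test case are already known. The function name `greedy_reduce`, the test names, and the pattern labels are hypothetical.

```python
def greedy_reduce(coverage):
    """Pick test cases until every interaction pattern is covered.

    coverage maps a test-case name to the set of patterns it exercises.
    Plain greedy set cover: at each step take the test that covers the
    most still-uncovered patterns (ties broken by name).
    """
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


# Illustrative coverage data only.
coverage = {
    "t1": {"p1", "p2", "p3"},
    "t2": {"p3", "p4"},
    "t3": {"p4"},
    "t4": {"p1"},
}
selected = greedy_reduce(coverage)
```

A greedy choice like this yields a small covering subset but does not guarantee the smallest possible one in general.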

Multilevel Fuzzy Decision Support Model for China's Urban Rail Transit Planning Schemes

This paper develops a multilevel fuzzy decision support model for urban rail transit planning schemes in China, which is currently experiencing unprecedented construction of urban rail transit. The model is built on the multilevel fuzzy comprehensive evaluation method. The decision process considers the following objectives: traveler attraction, environmental protection, project feasibility, and operation. In addition, the consistent matrix analysis method is used to determine the weights among the objectives and among each objective's sub-indicators, which reduces the work caused by repeatedly re-establishing the decision matrix while ensuring its consistency. The application results show that the multilevel fuzzy decision model deals well with multivariable, multilevel decision processes, which makes it particularly useful for multilevel decision-making problems in urban rail transit planning.
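
To make the two-level evaluation concrete, the sketch below first combines each objective's sub-indicator memberships into one objective-level vector, then combines the objective vectors into an overall rating for one scheme. It assumes the weighted-average composition operator and three rating grades. Every weight and membership value is invented for illustration, the weights are simply given rather than derived by the consistent matrix analysis method, and the function name `fuzzy_evaluate` is hypothetical.

```python
def fuzzy_evaluate(weights, membership):
    """Weighted-average composition B = W . R.

    weights: one weight per factor (assumed to sum to 1).
    membership: one row per factor, one column per rating grade.
    Returns one aggregated membership degree per grade.
    """
    n_grades = len(membership[0])
    return [
        sum(w * row[j] for w, row in zip(weights, membership))
        for j in range(n_grades)
    ]


# Level 1: (sub-indicator weights, sub-indicator membership rows) per objective.
# All numbers are made up; grades are assumed to be (good, fair, poor).
objectives = {
    "traveler attraction": ([0.6, 0.4], [[0.5, 0.3, 0.2], [0.2, 0.5, 0.3]]),
    "environmental protection": ([0.5, 0.5], [[0.4, 0.4, 0.2], [0.6, 0.3, 0.1]]),
    "project feasibility": ([0.7, 0.3], [[0.3, 0.5, 0.2], [0.1, 0.4, 0.5]]),
    "operation": ([1.0], [[0.2, 0.6, 0.2]]),
}
objective_weights = [0.3, 0.2, 0.3, 0.2]  # assumed, same order as the dict

# The level-1 results become the membership rows of the level-2 evaluation.
level1 = [fuzzy_evaluate(w, r) for w, r in objectives.values()]
final = fuzzy_evaluate(objective_weights, level1)
best_grade = max(range(len(final)), key=final.__getitem__)
```

Under the maximum-membership principle, `best_grade` is the index of the grade assigned to the scheme. Competing schemes can then be compared by their `final` vectors.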