Abstract: Measuring semantic similarity between texts means calculating their semantic relatedness using various techniques. Our web application, Measuring Relatedness of Concepts (MRC), allows a user to input two text corpora and obtain a semantic similarity percentage between them using WordNet. The application computes semantic relatedness in five stages: Preprocessing (extracts keywords from the content), Feature Extraction (classifies words into parts of speech), Synonym Extraction (retrieves synonyms for each keyword), Similarity Measurement (measures similarity using keywords and synonyms) and Visualization (graphical representation of the similarity measure). Hence the user can also measure similarity on the basis of features. The end result is a percentage score together with the word(s) that form the basis of similarity between the two texts, produced with different tools on a single platform. In future work we look forward to a Web-as-a-live-corpus application that provides a simpler and more user-friendly tool to compare documents and extract useful information.
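The keyword and synonym stages described above can be illustrated with a minimal Python sketch. This is not the MRC implementation: the stop-word list, the `SYNONYMS` table (a hand-written stand-in for WordNet) and the scoring formula are all assumptions made for illustration, and the part-of-speech and visualization stages are omitted.

```python
import re

# Hypothetical stand-ins: a real system would use a full stop-word list and
# WordNet synsets instead of these tiny hand-written tables.
STOP_WORDS = {"the", "a", "an", "is", "on", "in", "of", "and", "to"}
SYNONYMS = {
    "car": {"automobile", "auto"},
    "automobile": {"car", "auto"},
    "fast": {"quick", "rapid"},
    "quick": {"fast", "rapid"},
}

def keywords(text):
    """Preprocessing: lower-case, tokenize, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}

def expand(words):
    """Synonym extraction: add every known synonym of each keyword."""
    expanded = set(words)
    for w in words:
        expanded |= SYNONYMS.get(w, set())
    return expanded

def similarity(text_a, text_b):
    """Return (percentage score, words that form the basis of the match)."""
    kw_a, kw_b = keywords(text_a), keywords(text_b)
    # A keyword counts as matched if it, or one of its synonyms,
    # occurs among the keywords of the other text.
    matched_a = {w for w in kw_a if expand({w}) & kw_b}
    matched_b = {w for w in kw_b if expand({w}) & kw_a}
    total = len(kw_a) + len(kw_b)
    if total == 0:
        return 0.0, set()
    # Assumed scoring rule: share of all keywords that found a match.
    score = 100.0 * (len(matched_a) + len(matched_b)) / total
    return score, matched_a | matched_b

score, basis = similarity("The car is fast", "A quick automobile")
print(f"{score:.1f}% similar, based on {sorted(basis)}")
```

Returning the matched words alongside the score mirrors the abstract's point that the result includes the words on which the similarity is based.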
Abstract: In recent years, interest in ecogenetic and biomedical problems related to the effects of radon and its daughter decay products on the population has increased significantly. Of particular interest is the assessment of the consequences of irradiation in radon-hazardous areas, which include the Almaty region because of the large number of tectonic faults that enhance radon emanation. Accordingly, the purpose of this work was to study the genetic effects of exposure to supernormal radon doses using an alpha-radiation model. Irradiation does not affect the growth of the cell, but rather its ability to differentiate. In addition, irradiation can lead to somatic mutations, morphoses and modifications. This damage most likely arises from changes in the composition of the substances of the cell. Such changes are epigenetic, since they affect the regulatory processes of ontogenesis. Variability in the expression of regulatory genes refers to conditional mutations that modify the formation of traits of intraspecific similarity. Characteristic features of these conditional mutations are the dominant type of their manifestation, phenotypic asymmetry and their instability across generations. Currently, the terms “morphosis” and “modification” are used to describe epigenetic variability, which is maintained in Drosophila melanogaster cultures using linked X chromosomes, with the mutant X chromosome transmitted along the paternal line. In this paper, we investigated the epigenetic effects of alpha particles, whose main natural source is radon and its daughter decay products. In the experiment, the isotope plutonium-238 (Pu-238), which emits alpha particles with an energy of about 5.5 MeV, was used as the source of alpha particles. In the first generation (F1), deformities or morphoses were found, which can be called "radiation syndromes" or mutations, and whose manifestation resembles the pleiotropic action of genes.
The proportion of morphoses was 1.8% in the experiment and 0.4% in the control. In this experiment, the morphoses in flies of the first and second generations appeared as black spots, or melanomas, on different parts of the imago body; "generalized" melanomas; curled or curved wings; a shortened wing; a bubble on one wing; absence of one wing; deformation of the thorax; interruption and disruption of tergite patterns; disruption of the distribution of ocular facets and bristles; and absence of pigmentation of the second and third legs. Statistical analysis by the chi-square method showed that the difference between experiment and control was significant at P ≤ 0.01. On this basis, it can be concluded that alpha particles, which in the environment are mainly generated by radon and its isotopes, have a mutagenic effect that manifests itself mainly in the formation of morphoses or deformities.
Abstract: This paper presents a method for improving object search accuracy using a deep learning model. A major obstacle to measuring similarity accurately with deep learning is that training pairwise similarity scores (metrics) requires a huge amount of data, which is impractical to collect. Similarity scores are therefore usually trained on a relatively small dataset from a different domain, which limits the accuracy of the similarity measure. For this reason, this paper proposes a deep learning model that can be trained with a significantly smaller amount of data, namely clustered data in which each cluster contains a set of visually similar images. To measure similarity distance with the proposed method, visual features of two images are extracted from intermediate layers of a convolutional neural network with various pooling methods, and the network is trained with pairwise similarity scores, which are defined as zero for images in the same cluster. The proposed method outperforms state-of-the-art object similarity scoring techniques when evaluated on finding exact items: it achieves an accuracy of 86.5%, compared with 59.9% for the state-of-the-art technique. That is, an exact item can be found among four retrieved images with an accuracy of 86.5%, and the remaining retrieved images may still be similar products. Therefore, the proposed method can reduce the amount of training data by an order of magnitude while providing a reliable similarity metric.
Abstract: In this paper we present a quick technique to measure the similarity between binary images. The technique is based on a probabilistic mapping approach and is fast because only a minute percentage of the image pixels need to be compared to measure the similarity, and not the whole image. We exploit the power of the Probabilistic Matching Model for Binary Images (PMMBI) to arrive at an estimate of the similarity. We show that the estimate is a good approximation of the actual value, and the quality of the estimate can be improved further with increased image mappings. Furthermore, the technique is image size invariant; the similarity between big images can be measured as fast as that for small images. Examples of trials conducted on real images are presented.
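The idea of comparing only a small fraction of the pixels can be illustrated with a minimal Python sketch. It is not the PMMBI model itself: it assumes plain uniform random sampling of pixel positions, and the function names and the `samples` and `seed` parameters are hypothetical.

```python
import random

def exact_similarity(img_a, img_b):
    """Fraction of positions at which two equal-sized binary images agree."""
    total = len(img_a) * len(img_a[0])
    same = sum(a == b
               for row_a, row_b in zip(img_a, img_b)
               for a, b in zip(row_a, row_b))
    return same / total

def estimated_similarity(img_a, img_b, samples=500, seed=0):
    """Estimate the agreement fraction from randomly sampled pixel pairs.

    The cost depends on `samples`, not on the image size, which is why
    large images take no longer to compare than small ones.
    """
    rng = random.Random(seed)
    height, width = len(img_a), len(img_a[0])
    hits = 0
    for _ in range(samples):
        r, c = rng.randrange(height), rng.randrange(width)
        hits += img_a[r][c] == img_b[r][c]
    return hits / samples

# Two 200x200 images that differ in the left quarter of their columns.
img_a = [[0] * 200 for _ in range(200)]
img_b = [[1 if c < 50 else 0 for c in range(200)] for _ in range(200)]
print("exact:   ", exact_similarity(img_a, img_b))
print("estimate:", estimated_similarity(img_a, img_b, samples=2000))
```

Raising `samples` narrows the gap between the estimate and the exact value, which corresponds to the abstract's remark that more mappings improve the estimate.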
Abstract: The degree of membership of an element in an uncertain concept has been determined in many ways, using equivalence and symmetry relations in information systems. In the case of similarity, these methods did not take into account the degree of symmetry between elements. In this paper, we use a new definition of membership based on the degree of symmetry. We provide an example to clarify the suggested method and compare it with previous methods. This method opens the door to more accurate decisions in information systems.
Abstract: Turkey’s immigration policy is a controversial issue in its legal, economic, social, political and human rights dimensions. The formulation of an immigration policy goes hand in hand with political processes, in which natives’ attitudes play a significant role. Moreover, as was the case in Turkey, radical changes in immigration policy, or policies lacking transparency, may provoke severe reactions from the host society. This discussion paper aims to analyze qualitatively the effects of the existing ‘open door’ immigration policy on the economic integration of Syrian refugees in Turkey and on the native population’s perception of refugees. For the analysis, semi-structured in-depth interviews and focus group interviews were conducted. After the introduction, a literature review is provided, followed by the theoretical background for explaining natives’ attitudes towards immigrants. In the next section, a qualitative analysis of natives’ attitudes towards Syrian refugees is presented under the subtopics of (i) awareness, general opinions and expectations, (ii) open-door policy and management of the migration process, (iii) perception of positive and negative impacts of immigration, (iv) economic integration, and (v) cultural similarity. Results indicate that natives concurrently have social, economic and security concerns regarding refugees, with difficulties regarding security and the economic integration of refugees standing out. Socio-economic characteristics of the respondents, such as educational level and employment status, are not sufficient to explain overall attitudes towards refugees, although they can be used to explain the respondents’ awareness and the priority of their concerns.
Abstract: Closed die forging is a very complex process, and measuring the actual forces for the real material is difficult and time-consuming. Physical modelling therefore offers the advantage of carrying out experiments with a suitable model material that requires lower forces and relatively low temperatures. The results of experiments on the model material may then be correlated with the actual material using the theory of similarity. Several methods are available for resolving the complexity of the closed die forging process. The Finite Element Method (FEM) and Finite Difference Method (FDM) are relatively difficult compared with the slab method. The slab method is very popular and widely used by shop-floor practitioners because it is relatively easy to apply and reasonably accurate for most common forging load computations.
Abstract: With advancements in science and technology, the concept of the Internet of Things (IoT) has gradually developed. The development of the intelligent environment adds intelligence to objects in the living space by using the IoT. In the smart environment, when multiple users share the living space, if different service requirements from different users arise, then the context-aware system will have conflicting situations for making decisions about providing services. Therefore, the purpose of establishing a communication and negotiation mechanism among objects in the intelligent environment is to resolve those service conflicts among users. This study proposes developing a decision-making methodology that uses “Event Agents” as its core. When the sensor system receives information, it evaluates a user’s current events and conditions; analyses object, location, time, and environmental information; calculates the priority of the object; and provides the user services based on the event. Moreover, when the event is not single but overlaps with another, conflicts arise. This study adopts the “Multiple Events Correlation Matrix” in order to calculate the degree values of incidents and support values for each object. The matrix uses these values as the basis for making inferences for system service, and to further determine appropriate services when there is a conflict.
Abstract: In this paper, we determine the similarity of two HTML web applications. We use a genetic algorithm to determine the most significant web pages of each application, rather than using every web page of a site. Using these significant web pages, we compute the similarity value between the two applications. The algorithm is efficient because it uses a reduced number of web pages for comparison, but it returns only an approximate value of the similarity. Binary trees are used to store the tags from the significant pages. The algorithm was implemented in Java.
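One way to compare two pages by their tags can be illustrated with a minimal Python sketch (Python is used here for brevity, although the paper's implementation is in Java). The abstract does not state the page-level similarity measure, so this sketch assumes a multiset Jaccard index over tag counts. It also leaves out the genetic algorithm that selects significant pages and the binary trees used for storage. `TagCollector`, `tag_counts` and `page_similarity` are hypothetical names.

```python
from collections import Counter
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Counts how often each opening tag occurs in a page."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1

def tag_counts(html):
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags

def page_similarity(html_a, html_b):
    """Multiset Jaccard index of the tag counts of two pages (assumed measure)."""
    a, b = tag_counts(html_a), tag_counts(html_b)
    union = sum((a | b).values())          # per-tag maximum counts
    if union == 0:
        return 1.0                         # two tag-free pages count as identical
    return sum((a & b).values()) / union   # per-tag minimum counts

page_a = "<div><p>a</p><p>b</p></div>"
page_b = "<div><p>a</p><span>c</span></div>"
print(page_similarity(page_a, page_b))  # 0.5
```

An application-level score could then be obtained, for example, by averaging this value over pairs of selected pages; that aggregation step is likewise an assumption rather than something the abstract specifies.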
Abstract: This paper elaborates on risk shifting in the debt financing system as the ultimate cause of the global financial crisis. In contrast, risk sharing in equity financing instruments such as sukuk helps the economic system to be better sustained. Nevertheless, some types of sukuk are haunted by the issue of imitating bonds. Criticism of this imitation has raised doubts not only about the ability of sukuk to diminish risk-shifting behavior but also about the ability of this Islamic financial instrument to ensure better future financial stability. Accordingly, this paper discusses the possibility that sukuk induce risk shifting and how equity financing may help sukuk to be free from it. The paper is important because sukuk are in significant demand from investors throughout the world. For this instrument to support future economic stability, the issue of imitation needs to be identified and addressed. Furthermore, criticism should focus not only on debt and its ability to gauge financial flux but also on sukuk, because of their structural similarity.
Abstract: In this paper, we propose a new method for three-dimensional object indexing based on the D.A.M.C-S.H.C descriptor (Direct and Analytical Method for Calculating the Spherical Harmonics Coefficients). To this end, we propose a direct calculation of the spherical harmonics coefficients with perfect precision. The method aims to minimize both the processing time on the 3D object database and the time needed to search for objects similar to a query object. We first define the new descriptor using a new division of the 3D object within a sphere. We then define a new distance, which is tested and proves its efficiency in searching for similar objects in a database containing objects of widely varying and large sizes.
Abstract: Applications of the Hausdorff space and its mappings into tangent spaces are outlined, including their fractal dimensions and self-similarities. The paper details this theoretical set-up and further describes the virtualization and atomization of manufacturing processes. It demonstrates novel concurrency principles that will guide the configuration of manufacturing processes and resources. Moreover, varying levels of detail may be produced by folding up and breaking down newly introduced generic models. This choice of layered generic models for units and systems, organized along specific aspects, allows research work to proceed in parallel with other disciplines that share the same focus at all levels of detail. Outside disciplines are given more credit and easier access so that they can enrich the manufacturing field. Specific mappings and the layers hint at opportunities for interdisciplinary outcomes and may highlight more details for interoperability standards, which are already being worked on at the international level. New rules are described, which require additional properties of all involved entities for defining distributed decision cycles, again on the basis of self-similarity. All properties are further detailed and assigned to a maturity scale, eventually displaying the smartness maturity of an entire shop floor or factory. The paper contributes to the intensive ongoing discussion in the field of intelligent distributed manufacturing and promotes solid concepts for implementing Cyber-Physical Systems and the Internet of Things in the manufacturing industry, such as Industry 4.0, as discussed in German-speaking countries.
Abstract: The aim of this work was to study the genetic basis of oil accumulation in olive fruit by tracking the DGAT2 (diacylglycerol acyltransferase type 2) gene in three olive cultivars of Egyptian origin, namely Toffahi, Hamed and Maraki, using molecular marker techniques and bioinformatics tools. The results show that, first, a specific genomic band of the Maraki cultivar was identified as DGAT2 and was identical to this gene in Olea europaea, with 100% similarity. Second, a differential genomic band of the Maraki cultivar, produced by the RAPD fingerprinting technique, yielded a predicted distinct sequence identified as DGAT2 in Fragaria vesca subsp. vesca, with 76% sequence similarity. Third, a specific genomic band of the Hamed cultivar was identified as two fragments: (1) Olea europaea cultivar Koroneiki diacylglycerol acyltransferase type 2 mRNA, complete cds, with two matching regions at 99%, or (2) predicted Fragaria vesca subsp. vesca diacylglycerol O-acyltransferase 2-like (LOC101313050) mRNA, with 86% similarity.
Abstract: This paper presents a preliminary attempt to apply time series classification using meta-clusters in order to improve the quality of regression models. Here, clustering was used to obtain normally distributed subgroups of time series data from wastewater treatment plant inflow data, which are composed of several groups differing in mean value. Two simple algorithms, K-means and EM, were chosen as clustering methods. The Rand index was used to measure similarity. After simple meta-clustering, a regression model was fitted for each subgroup. The final model was the sum of the subgroup models. The quality of the resulting model was compared with that of a regression model built with the same explanatory variables but without clustering the data. Results were compared using the coefficient of determination (R2), a measure of prediction accuracy, the mean absolute percentage error (MAPE), and a comparison on a line chart. Preliminary results indicate the potential of the presented technique.
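The two evaluation measures named above follow standard definitions, shown here as a minimal Python sketch. The function names and the toy inputs are illustrative and are not taken from the paper.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Share of item pairs on which two clusterings agree.

    A pair agrees when both clusterings put the two items in the same
    cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) < 2:
        raise ValueError("need two labelings of the same length (>= 2 items)")
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Cluster labels are arbitrary names, so a relabeled copy scores 1.0.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(mape([100.0, 200.0], [110.0, 180.0]))
```

Because the Rand index compares pairs rather than label values, it can be used to compare the K-means and EM results even though the two algorithms number their clusters independently.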
Abstract: Missing values in data are common in real-world applications. Since the performance of many data mining algorithms depends critically on being given a good metric over the input space, in this paper we define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. Under this distance, the distance between two points with no missing attribute values is simply the Mahalanobis distance. When, on the other hand, one of the coordinates is missing, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because the performance of the k-nearest-neighbor (kNN) classifier depends strongly on the chosen distance measure, we used it to evaluate the ability of our distance to reflect object similarity accurately. We experimented on standard numerical datasets from different fields in the UCI repository. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance with three other basic methods. Our experiments show that kNN using our distance function outperforms kNN using the other methods. Moreover, the runtime of our method is only slightly higher than that of the other methods.
Abstract: Motion capture devices have been used to produce various kinds of content, such as movies and video games. However, since motion capture devices are expensive and inconvenient to use, motions segmented from captured data are recycled and synthesized for use in other content, and these motions have generally been segmented manually by content producers. Automatic motion segmentation has therefore recently been attracting a lot of attention. Previous approaches are divided into on-line and off-line approaches: on-line approaches segment motions based on similarities between neighboring frames, whereas off-line approaches segment motions by capturing global characteristics in feature space. In this paper, we propose a graph-based high-level motion segmentation method. Since high-level motions consist of several repeated frames within temporal distances, we consider the similarities among all frames within the temporal distance. This is achieved by constructing a graph in which each vertex represents a frame and the edges between frames are weighted by their similarity. The normalized cuts algorithm is then used to partition the constructed graph into several sub-graphs by globally finding minimum cuts. In the experiments, the proposed method performed better than the on-line PCA-based method and the off-line GMM-based method, because it segments motions globally from a graph constructed from similarities between neighboring frames as well as similarities among all frames within temporal distances.
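The graph construction and the normalized-cut criterion can be illustrated with a minimal Python sketch. Several parts are assumptions rather than the paper's method: the Gaussian edge weight, the `window` and `sigma` parameters, and above all the search itself, which tries every single temporal cut point exhaustively instead of running the spectral normalized cuts algorithm.

```python
import math

def build_graph(frames, window, sigma=1.0):
    """Affinity matrix: frames at most `window` steps apart are connected,
    weighted by a Gaussian of their squared feature distance (assumed form)."""
    n = len(frames)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, min(n, i + window + 1)):
            d2 = sum((a - b) ** 2 for a, b in zip(frames[i], frames[j]))
            w[i][j] = w[j][i] = math.exp(-d2 / (2 * sigma ** 2))
    return w

def ncut_value(w, split):
    """Normalized cut of the partition A = [0, split), B = [split, n):
    cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)."""
    n = len(w)
    cut = sum(w[i][j] for i in range(split) for j in range(split, n))
    assoc_a = sum(w[i][j] for i in range(split) for j in range(n))
    assoc_b = sum(w[i][j] for i in range(split, n) for j in range(n))
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")
    return cut / assoc_a + cut / assoc_b

def best_split(frames, window, sigma=1.0):
    """Index of the first frame of the second segment (brute-force search)."""
    w = build_graph(frames, window, sigma)
    return min(range(1, len(frames)), key=lambda s: ncut_value(w, s))

# Toy sequence: five frames of one pose followed by five of another.
frames = [(0.0,)] * 5 + [(10.0,)] * 5
print(best_split(frames, window=3))  # 5
```

Normalizing the cut by each side's total association keeps the criterion from favoring the removal of one or two isolated frames, which a plain minimum cut would tend to do.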
Abstract: Most research on conventional flocking simulation has focused on flocks of a single species. While single-species flocking does occur in nature, flocking behavior is frequently observed among multiple species. This paper studies flocking simulation for heterogeneous agents. To simulate flocks of heterogeneous agents, the conventional method uses a flock identifier, whereas the proposed method defines a feature vector for each agent and computes the similarity between agents by comparing their feature vectors. Based on this similarity, the paper proposes an attractive force and a repulsive force and runs the simulation by applying the two forces. The simulation results show that flock formation with heterogeneous agents is very natural in both cases. In addition, they show that, unlike the existing method, the proposed method can not only control the density of the flocks but also allow two different groups of agents to flock close to each other if they are highly similar.
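How a feature-vector similarity might drive attraction and repulsion can be illustrated with a minimal Python sketch. The abstract does not give the force equations, so everything here is assumed: cosine similarity as the measure, a hypothetical `threshold` that separates attraction from repulsion, and a force magnitude equal to the similarity minus that threshold.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def pair_force(pos_i, pos_j, feat_i, feat_j, threshold=0.5):
    """2D force exerted on agent i by agent j.

    Similarity above `threshold` pulls i towards j (attraction);
    similarity below it pushes i away from j (repulsion).
    """
    dx, dy = pos_j[0] - pos_i[0], pos_j[1] - pos_i[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return (0.0, 0.0)
    strength = cosine_similarity(feat_i, feat_j) - threshold
    return (strength * dx / dist, strength * dy / dist)

# Agent j sits to the right of agent i.
print(pair_force((0, 0), (1, 0), (1, 0), (1, 0)))  # similar: pulled right
print(pair_force((0, 0), (1, 0), (1, 0), (0, 1)))  # dissimilar: pushed left
```

With a graded similarity instead of a flock identifier, two groups whose feature vectors are close attract each other weakly rather than ignoring or repelling each other, which is the behavior the abstract reports for highly similar groups.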
Abstract: Chinese idioms are a type of traditional Chinese idiomatic expression with specific meanings and stereotyped structures; they are widely used in classical Chinese and are still common in vernacular written and spoken Chinese today. Currently, Chinese idioms are retrieved from glossaries by key character or key word through a morphology or pronunciation index, which cannot meet the need for semantic search. OCIRS is proposed to find the desired idiom when users know only its meaning, without any key character or key word. The user's request, given as a sentence or phrase, is first analyzed grammatically through word segmentation, key word extraction and semantic similarity computation, and can thus be mapped to the idiom domain ontology, which is constructed to provide ample semantic relations and to facilitate description-logic-based reasoning for idiom retrieval. The experimental evaluation shows that OCIRS supports searching for idioms by their semantics, achieving preliminary results that meet users' requests.
Abstract: An important structuring mechanism for knowledge bases is building clusters based on the content of their knowledge objects. The objects are clustered on the principle of maximizing intraclass similarity and minimizing interclass similarity. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together. Hierarchical representation allows us to manage the complexity of knowledge easily, to view the knowledge at different levels of detail, and to focus our attention on the interesting aspects only. One such efficient and easy-to-understand system is the Hierarchical Production Rule (HPR) system. An HPR, a standard production rule augmented with generality and specificity information, is of the following form: Decision If <condition> Generality Specificity. HPR systems are capable of handling the taxonomical structures inherent in knowledge about the real world. In this paper, a set of related HPRs is called a cluster and is represented by an HPR-tree. This paper discusses an algorithm based on a cumulative learning scenario for the dynamic structuring of clusters. The proposed scheme incrementally incorporates new knowledge into the set of clusters from previous episodes and also maintains a summary of the clusters as a Synopsis to be used in future episodes. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested incremental structuring of clusters would be useful in mining data streams.