Web Search Engine Based Naming Procedure for Independent Topic

In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.

A Watermarking Signature Scheme with Hidden Watermarks and Constraint Functions in the Symmetric Key Setting

To claim the ownership for an executable program is a non-trivial task. An emerging direction is to add a watermark to the program such that the watermarked program preserves the original program’s functionality and removing the watermark would heavily destroy the functionality of the watermarked program. In this paper, the first watermarking signature scheme with the watermark and the constraint function hidden in the symmetric key setting is constructed. The scheme uses well-known techniques of lattice trapdoors and a lattice evaluation. The watermarking signature scheme is unforgeable under the Short Integer Solution (SIS) assumption and satisfies other security requirements such as the unremovability security property.

Variable vs. Fixed Window Width Code Correlation Reference Waveform Receivers for Multipath Mitigation in Global Navigation Satellite Systems with Binary Offset Carrier and Multiplexed Binary Offset Carrier Signals

This paper compares the multipath mitigation performance of code correlation reference waveform receivers with variable and fixed window width, for binary offset carrier and multiplexed binary offset carrier signals typically used in global navigation satellite systems. In the variable window width method, such width is iteratively reduced until the distortion on the discriminator with multipath is eliminated. This distortion is measured as the Euclidean distance between the actual discriminator (obtained with the incoming signal), and the local discriminator (generated with a local copy of the signal). The variable window width have shown better performance compared to the fixed window width. In particular, the former yields zero error for all delays for the BOC and MBOC signals considered, while the latter gives rather large nonzero errors for small delays in all cases. Due to its computational simplicity, the variable window width method is perfectly suitable for implementation in low-cost receivers.

Constant Factor Approximation Algorithm for p-Median Network Design Problem with Multiple Cable Types

This research presents the first constant approximation algorithm to the p-median network design problem with multiple cable types. This problem was addressed with a single cable type and there is a bifactor approximation algorithm for the problem. To the best of our knowledge, the algorithm proposed in this paper is the first constant approximation algorithm for the p-median network design with multiple cable types. The addressed problem is a combination of two well studied problems which are p-median problem and network design problem. The introduced algorithm is a random sampling approximation algorithm of constant factor which is conceived by using some random sampling techniques form the literature. It is based on a redistribution Lemma from the literature and a steiner tree problem as a subproblem. This algorithm is simple, and it relies on the notions of random sampling and probability. The proposed approach gives an approximation solution with one constant ratio without violating any of the constraints, in contrast to the one proposed in the literature. This paper provides a (21 + 2)-approximation algorithm for the p-median network design problem with multiple cable types using random sampling techniques.

DocPro: A Framework for Processing Semantic and Layout Information in Business Documents

With the recent advance of the deep neural network, we observe new applications of NLP (natural language processing) and CV (computer vision) powered by deep neural networks for processing business documents. However, creating a real-world document processing system needs to integrate several NLP and CV tasks, rather than treating them separately. There is a need to have a unified approach for processing documents containing textual and graphical elements with rich formats, diverse layout arrangement, and distinct semantics. In this paper, a framework that fulfills this unified approach is presented. The framework includes a representation model definition for holding the information generated by various tasks and specifications defining the coordination between these tasks. The framework is a blueprint for building a system that can process documents with rich formats, styles, and multiple types of elements. The flexible and lightweight design of the framework can help build a system for diverse business scenarios, such as contract monitoring and reviewing.

Digital Transformation as the Subject of the Knowledge Model of the Discursive Space

Due to the development of the current civilization, one must create suitable models of its pervasive massive phenomena. Such a phenomenon is the digital transformation, which has a substantial number of disciplined, methodical interpretations forming the diversified reflection. This reflection could be understood pragmatically as the current temporal, a local differential state of knowledge. The model of the discursive space is proposed as a model for the analysis and description of this knowledge. Discursive space is understood as an autonomous multidimensional space where separate discourses traverse specific trajectories of what can be presented in multidimensional parallel coordinate system. Discursive space built on the world of facts preserves the complex character of that world. Digital transformation as a discursive space has a relativistic character that means that at the same time, it is created by the dynamic discourses and these discourses are molded by the shape of this space.

Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race

In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.

Kalman Filter Gain Elimination in Linear Estimation

In linear estimation, the traditional Kalman filter uses the Kalman filter gain in order to produce estimation and prediction of the n-dimensional state vector using the m-dimensional measurement vector. The computation of the Kalman filter gain requires the inversion of an m x m matrix in every iteration. In this paper, a variation of the Kalman filter eliminating the Kalman filter gain is proposed. In the time varying case, the elimination of the Kalman filter gain requires the inversion of an n x n matrix and the inversion of an m x m matrix in every iteration. In the time invariant case, the elimination of the Kalman filter gain requires the inversion of an n x n matrix in every iteration. The proposed Kalman filter gain elimination algorithm may be faster than the conventional Kalman filter, depending on the model dimensions.