Abstract: Recently, the system of quick response (QR) code is getting popular. Many companies introduce new QR code payment services and the services are competing with each other to increase the number of users. For increasing the number of users, we should grasp the difference of feature of the demographic information, usage information, and value of users between services. In this study, we conduct an analysis of real-world data provided by Nomura Research Institute including the demographic data of users and information of users’ usages of two services; LINE Pay, and PayPay. For analyzing such data and interpret the feature of them, Nonnegative Matrix Factorization (NMF) is widely used; however, in case of the target data, there is a problem of the missing data. EM-algorithm NMF (EMNMF) to complete unknown values for understanding the feature of the given data presented by matrix shape. Moreover, for comparing the result of the NMF analysis of two matrices, there is Discriminant NMF (DNMF) shows the difference of users features between two matrices. In this study, we combine EMNMF and DNMF and also analyze the target data. As the interpretation, we show the difference of the features of users between LINE Pay and Paypay.
Abstract: In this paper, we propose a framework to help users to search and retrieve the portions in the lecture video of their interest. This is achieved by temporally segmenting and indexing the lecture video using the topic keywords. We use transcribed text from the video and documents relevant to the video topic extracted from the web for this purpose. The keywords for indexing are found by applying the non-negative matrix factorization (NMF) topic modeling techniques on the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end time of the portions of the video for a particular topic. This time information is stored in the index table along with the topic keyword which is used to retrieve the specific portions of the video for the query provided by the users.
Abstract: Non-negative matrix factorization (NMF) is a useful computational method to find basis information of multivariate nonnegative data. A popular approach to solve the NMF problem is the multiplicative update (MU) algorithm. But, it has some defects. So the columnwisely alternating gradient (cAG) algorithm was proposed. In this paper, we analyze convergence of the cAG algorithm and show advantages over the MU algorithm. The stability of the equilibrium point is used to prove the convergence of the cAG algorithm. A classic model is used to obtain the equilibrium point and the invariant sets are constructed to guarantee the integrity of the stability. Finally, the convergence conditions of the cAG algorithm are obtained, which help reducing the evaluation time and is confirmed in the experiments. By using the same method, the MU algorithm has zero divisor and is convergent at zero has been verified. In addition, the convergence conditions of the MU algorithm at zero are similar to that of the cAG algorithm at non-zero. However, it is meaningless to discuss the convergence at zero, which is not always the result that we want for NMF. Thus, we theoretically illustrate the advantages of the cAG algorithm.
Abstract: We have proposed an information filtering system
using index word selection from a document set based on the
topics included in a set of documents. This method narrows
down the particularly characteristic words in a document set
and the topics are obtained by Sparse Non-negative Matrix
Factorization. In information filtering, a document is often
represented with the vector in which the elements correspond
to the weight of the index words, and the dimension of the
vector becomes larger as the number of documents is
increased. Therefore, it is possible that useless words as index
words for the information filtering are included. In order to
address the problem, the dimension needs to be reduced. Our
proposal reduces the dimension by selecting index words
based on the topics included in a document set. We have
applied the Sparse Non-negative Matrix Factorization to the
document set to obtain these topics. The filtering is carried out
based on a centroid of the learning document set. The centroid
is regarded as the user-s interest. In addition, the centroid is
represented with a document vector whose elements consist of
the weight of the selected index words. Using the English test
collection MEDLINE, thus, we confirm the effectiveness of
our proposal. Hence, our proposed selection can confirm the
improvement of the recommendation accuracy from the other
previous methods when selecting the appropriate number of
index words. In addition, we discussed the selected index
words by our proposal and we found our proposal was able to
select the index words covered some minor topics included in
the document set.
Abstract: In this paper three different approaches for person
verification and identification, i.e. by means of fingerprints, face and
voice recognition, are studied. Face recognition uses parts-based
representation methods and a manifold learning approach. The
assessment criterion is recognition accuracy. The techniques under
investigation are: a) Local Non-negative Matrix Factorization
(LNMF); b) Independent Components Analysis (ICA); c) NMF with
sparse constraints (NMFsc); d) Locality Preserving Projections
(Laplacianfaces). Fingerprint detection was approached by classical
minutiae (small graphical patterns) matching through image
segmentation by using a structural approach and a neural network as
decision block. As to voice / speaker recognition, melodic cepstral
and delta delta mel cepstral analysis were used as main methods, in
order to construct a supervised speaker-dependent voice recognition
system. The final decision (e.g. “accept-reject" for a verification
task) is taken by using a majority voting technique applied to the
three biometrics. The preliminary results, obtained for medium
databases of fingerprints, faces and voice recordings, indicate the
feasibility of our study and an overall recognition precision (about
92%) permitting the utilization of our system for a future complex
biometric card.
Abstract: In this paper, an automated algorithm to estimate and remove the continuous baseline from measured spectra containing both continuous and discontinuous bands is proposed. The algorithm uses previous information contained in a Continuous Database Spectra (CDBS) to obtain a linear basis, with minimum number of sampled vectors, capable of representing a continuous baseline. The proposed algorithm was tested by using a CDBS of flame spectra where Principal Components Analysis and Non-negative Matrix Factorization were used to obtain linear bases. Thus, the radical emissions of natural gas, oil and bio-oil flames spectra at different combustion conditions were obtained. In order to validate the performance in the baseline estimation process, the Goodness-of-fit Coefficient and the Root Mean-squared Error quality metrics were evaluated between the estimated and the real spectra in absence of discontinuous emission. The achieved results make the proposed method a key element in the development of automatic monitoring processes strategies involving discontinuous spectral bands.
Abstract: In the last few years, three multivariate spectral
analysis techniques namely, Principal Component Analysis (PCA),
Independent Component Analysis (ICA) and Non-negative Matrix
Factorization (NMF) have emerged as effective tools for oscillation
detection and isolation. While the first method is used in determining
the number of oscillatory sources, the latter two methods
are used to identify source signatures by formulating the detection
problem as a source identification problem in the spectral domain.
In this paper, we present a critical drawback of the underlying linear
(mixing) model which strongly limits the ability of the associated
source separation methods to determine the number of sources
and/or identify the physical source signatures. It is shown that the
assumed mixing model is only valid if each unit of the process gives
equal weighting (all-pass filter) to all oscillatory components in its
inputs. This is in contrast to the fact that each unit, in general, acts
as a filter with non-uniform frequency response. Thus, the model
can only facilitate correct identification of a source with a single
frequency component, which is again unrealistic. To overcome
this deficiency, an iterative post-processing algorithm that correctly
identifies the physical source(s) is developed. An additional issue
with the existing methods is that they lack a procedure to pre-screen
non-oscillatory/noisy measurements which obscure the identification
of oscillatory sources. In this regard, a pre-screening procedure
is prescribed based on the notion of sparseness index to eliminate
the noisy and non-oscillatory measurements from the data set used
for analysis.