Abstract: Generative Adversarial Net (GAN) has proved to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions include both categorical and continuous variables with different auxiliary information. Our simulation studies show that CGAN has the capability to learn different types of normal and heavy-tailed distributions, as well as dependent structures of different time series. It also has the capability to generate conditional predictive distributions consistent with training data distributions. We also provide an in-depth discussion on the rationale behind GAN and the neural networks as hierarchical splines to establish a clear connection with existing statistical methods of distribution generation. In practice, CGAN has a wide range of applications in market risk and counterparty risk analysis: it can be applied to learn historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES), and it can also predict the movement of the market risk factors. We present a real data analysis including a backtesting to demonstrate that CGAN can outperform Historical Simulation (HS), a popular method in market risk analysis to calculate VaR. CGAN can also be applied in economic time series modeling and forecasting. In this regard, we have included an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN at the end of the paper.
Abstract: Traditionally in sensor networks and recently in the
Internet of Things, numerous heterogeneous sensors are deployed
in distributed manner to monitor a phenomenon that often can be
model by an underlying stochastic process. The big time-series
data collected by the sensors must be analyzed to detect change
in the stochastic process as quickly as possible with tolerable
false alarm rate. However, sensors may have different accuracy
and sensitivity range, and they decay along time. As a result,
the big time-series data collected by the sensors will contain
uncertainties and sometimes they are conflicting. In this study, we
present a framework to take advantage of Evidence Theory (a.k.a.
Dempster-Shafer and Dezert-Smarandache Theories) capabilities of
representing and managing uncertainty and conflict to fast change
detection and effectively deal with complementary hypotheses.
Specifically, Kullback-Leibler divergence is used as the similarity
metric to calculate the distances between the estimated current
distribution with the pre- and post-change distributions. Then mass
functions are calculated and related combination rules are applied to
combine the mass values among all sensors. Furthermore, we applied
the method to estimate the minimum number of sensors needed to
combine, so computational efficiency could be improved. Cumulative
sum test is then applied on the ratio of pignistic probability to detect
and declare the change for decision making purpose. Simulation
results using both synthetic data and real data from experimental
setup demonstrate the effectiveness of the presented schemes.
Abstract: Inferring the network structure from time series data
is a hard problem, especially if the time series is short and noisy.
DNA microarray is a technology allowing to monitor the mRNA
concentration of thousands of genes simultaneously that produces
data of these characteristics. In this study we try to investigate the
influence of the experimental design on the quality of the result.
More precisely, we investigate the influence of two different types of
random single gene perturbations on the inference of genetic networks
from time series data. To obtain an objective quality measure for
this influence we simulate gene expression values with a biologically
plausible model of a known network structure. Within this framework
we study the influence of single gene knock-outs in opposite to
linearly controlled expression for single genes on the quality of the
infered network structure.
Abstract: In the automotive industry test drives are being conducted
during the development of new vehicle models or as a part of
quality assurance of series-production vehicles. The communication
on the in-vehicle network, data from external sensors, or internal
data from the electronic control units is recorded by automotive
data loggers during the test drives. The recordings are used for fault
analysis. Since the resulting data volume is tremendous, manually
analysing each recording in great detail is not feasible.
This paper proposes to use machine learning to support domainexperts
by preventing them from contemplating irrelevant data and
rather pointing them to the relevant parts in the recordings. The
underlying idea is to learn the normal behaviour from available
recordings, i.e. a training set, and then to autonomously detect
unexpected deviations and report them as anomalies.
The one-class support vector machine “support vector data description”
is utilised to calculate distances of feature vectors. SVDDSUBSEQ
is proposed as a novel approach, allowing to classify subsequences
in multivariate time series data. The approach allows to
detect unexpected faults without modelling effort as is shown with
experimental results on recordings from test drives.
Abstract: This paper proposed a novel model for short term load
forecast (STLF) in the electricity market. The prior electricity
demand data are treated as time series. The model is composed of
several neural networks whose data are processed using a wavelet
technique. The model is created in the form of a simulation program
written with MATLAB. The load data are treated as time series data.
They are decomposed into several wavelet coefficient series using
the wavelet transform technique known as Non-decimated Wavelet
Transform (NWT). The reason for using this technique is the belief
in the possibility of extracting hidden patterns from the time series
data. The wavelet coefficient series are used to train the neural
networks (NNs) and used as the inputs to the NNs for electricity load
prediction. The Scale Conjugate Gradient (SCG) algorithm is used as
the learning algorithm for the NNs. To get the final forecast data, the
outputs from the NNs are recombined using the same wavelet
technique. The model was evaluated with the electricity load data of
Electronic Engineering Department in Mandalay Technological
University in Myanmar. The simulation results showed that the
model was capable of producing a reasonable forecasting accuracy in
STLF.