A Hybrid System of Hidden Markov Models and Recurrent Neural Networks for Learning Deterministic Finite State Automata

In this paper, we present a learning algorithm built on a hybrid architecture that combines two of the most widely used sequence-recognition models: Recurrent Neural Networks (RNNs) and Hidden Markov Models (HMMs). To improve sequence- and pattern-classification performance through a hybrid neural-symbolic approach, we develop a gradient-descent learning algorithm that uses Real-Time Recurrent Learning (RTRL) in an RNN to process the knowledge represented in trained HMMs. The hybrid algorithm is applied to a test bed drawn from automata theory, and its performance is demonstrated and evaluated on the task of learning deterministic finite state automata.
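
As a concrete illustration of the RTRL component of the approach, the sketch below trains a small fully recurrent network, by gradient descent with Real-Time Recurrent Learning, to accept or reject strings generated by a toy deterministic finite state automaton (accept iff the string contains an even number of 1s). This is a minimal sketch under stated assumptions rather than the paper's implementation: the network size, learning rate, readout unit, and target automaton are illustrative choices, and the step that injects knowledge from trained HMMs is omitted.

import numpy as np

# Minimal RTRL sketch (hypothetical configuration): a fully recurrent
# sigmoid network learns to classify strings from a toy DFA that accepts
# exactly the binary strings containing an even number of 1s.
rng = np.random.default_rng(0)
N_HID = 4                        # recurrent units (assumed size)
N_Z = N_HID + 2                  # recurrent state + one input + bias
W = rng.normal(0.0, 0.5, (N_HID, N_Z))
ETA = 0.5                        # learning rate (assumed)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

for epoch in range(5000):
    bits = rng.integers(0, 2, int(rng.integers(1, 8))).tolist()
    target = 1.0 if sum(bits) % 2 == 0 else 0.0   # DFA acceptance label

    y = np.zeros(N_HID)                  # network state
    P = np.zeros((N_HID, N_HID, N_Z))    # sensitivities p[k,i,j] = dy_k/dW_ij

    for x in bits:
        z = np.concatenate([y, [x, 1.0]])          # state + input + bias
        y_new = sigmoid(W @ z)
        fp = y_new * (1.0 - y_new)                 # sigmoid derivative
        # RTRL recursion:
        #   p_k,ij(t+1) = f'(s_k) * (sum_l W_kl p_l,ij(t) + delta_ki z_j(t))
        P_new = np.einsum('kl,lij->kij', W[:, :N_HID], P)
        for i in range(N_HID):
            P_new[i, i, :] += z
        P = fp[:, None, None] * P_new
        y = y_new

    # Unit 0 reads out acceptance at the end of the string; take one
    # gradient-descent step on the squared error via the sensitivities.
    err = target - y[0]
    W += ETA * err * P[0]

# Check the trained network on one held-out string.
test = [1, 0, 1, 1]
y = np.zeros(N_HID)
for x in test:
    y = sigmoid(W @ np.concatenate([y, [x, 1.0]]))
print("acceptance output:", round(float(y[0]), 3), "target:", sum(test) % 2 == 0)

The recursion maintains the full sensitivity tensor online, so each time step costs O(n^4) operations for n recurrent units; this is what restricts RTRL to the small networks typically used in grammatical-inference experiments, and parity in particular may need longer training and careful initialization before the outputs separate cleanly.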