Real-Time Recognition of Dynamic Hand Postures on a Neuromorphic System

To explore how the brain may recognise objects in its
general, accurate and energy-efficient manner, this paper proposes the
use of a neuromorphic hardware system formed from a Dynamic
Vision Sensor (DVS) silicon retina in concert with the SpiNNaker
real-time Spiking Neural Network (SNN) simulator. As a first step
in the exploration on this platform, a recognition system for dynamic
hand postures is developed, enabling the study of the methods used
in the visual pathways of the brain. Inspired by the behaviours of
the primary visual cortex, Convolutional Neural Networks (CNNs)
are modelled using both linear perceptrons and spiking Leaky
Integrate-and-Fire (LIF) neurons.
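
As an illustration of the spiking neuron model named above, the following is a minimal sketch of a single LIF neuron integrated with a forward-Euler step. It is not the SpiNNaker implementation used in this study: the function name `simulate_lif` and every parameter value (membrane time constant, threshold, reset and resting potentials, refractory period) are assumptions chosen only to make the behaviour visible.

```python
def simulate_lif(input_drive, dt=1.0, tau_m=20.0, v_rest=-65.0,
                 v_reset=-65.0, v_thresh=-50.0, t_refrac=2.0):
    """Simulate one leaky integrate-and-fire neuron.

    input_drive: sequence of input values in mV (input current already
                 multiplied by membrane resistance), one per time step.
    Returns (spike_times_ms, membrane_trace_mV).
    All default parameters are illustrative assumptions.
    """
    v = v_rest
    refrac_left = 0.0
    spike_times = []
    trace = []
    for step, drive in enumerate(input_drive):
        if refrac_left > 0.0:
            # Hold the membrane at reset while refractory.
            refrac_left -= dt
            v = v_reset
        else:
            # Leak towards v_rest plus the input drive (forward Euler).
            v += (-(v - v_rest) + drive) * dt / tau_m
            if v >= v_thresh:
                # Threshold crossing: emit a spike and reset.
                spike_times.append(step * dt)
                v = v_reset
                refrac_left = t_refrac
        trace.append(v)
    return spike_times, trace


# A weak drive settles below threshold; stronger drives make the neuron fire.
weak_spikes, _ = simulate_lif([10.0] * 200)
medium_spikes, _ = simulate_lif([20.0] * 200)
strong_spikes, _ = simulate_lif([40.0] * 200)
```

In this sketch the firing rate grows with the input drive, which is the property that lets a spiking neuron stand in for a rate-based unit such as a perceptron.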
In this study’s largest configuration using these approaches, a
network of 74,210 neurons and 15,216,512 synapses is created and
operated in real-time using 290 SpiNNaker processor cores in parallel,
achieving 93.0% accuracy. A smaller network using only one tenth of the
resources is also created, again operating in real-time, and it is able
to recognise the postures with an accuracy of around 86.4%, only
6.6 percentage points lower than the much larger system. The recognition rate of the
smaller network developed on this neuromorphic system is sufficient
for a successful hand posture recognition system, and demonstrates
a much improved cost-to-performance trade-off in its approach.
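
The figures quoted above can be put in proportion with a few lines of arithmetic. The sketch below uses only the numbers stated in this abstract; the derived quantities are plain averages, and the assumption that neurons are spread evenly across cores is made here for illustration only and says nothing about the actual mapping.

```python
# Figures taken from the abstract.
NEURONS = 74_210
SYNAPSES = 15_216_512
CORES = 290
ACC_LARGE = 93.0   # percent, largest configuration
ACC_SMALL = 86.4   # percent, network using one tenth of the resources

# Simple averages (assumes an even spread, which is an illustration only).
neurons_per_core = NEURONS / CORES
synapses_per_neuron = SYNAPSES / NEURONS

# Accuracy gap in percentage points, and relative to the larger network.
gap_points = ACC_LARGE - ACC_SMALL
relative_drop_percent = gap_points / ACC_LARGE * 100

print(f"neurons per core (average):   {neurons_per_core:.1f}")
print(f"synapses per neuron (average): {synapses_per_neuron:.1f}")
print(f"accuracy gap:                  {gap_points:.1f} percentage points")
print(f"relative accuracy drop:        {relative_drop_percent:.1f}%")
```

On these averages the largest configuration carries roughly 256 neurons per core and about 205 synapses per neuron, and the smaller network gives up about 7% of the larger network's accuracy in relative terms.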




