Deep Learning-Based, End-to-End Metaphor Detection in Greek with Recurrent and Convolutional Neural Networks

This paper presents and benchmarks a number of
end-to-end Deep Learning based models for metaphor detection in
Greek. We bring Convolutional Neural Networks and Recurrent
Neural Networks, combined with representation learning, to bear on the
metaphor detection problem for the Greek language. The models presented
achieve high accuracy scores, significantly improving on the
previous state-of-the-art results, which had reached an accuracy
of 0.82. Furthermore, no special preprocessing, feature engineering or
linguistic knowledge is used in this work. The methods presented
achieve an accuracy of 0.92 and an F-score of 0.92 with Convolutional
Neural Networks (CNNs) and bidirectional Long Short-Term Memory
networks (LSTMs). Comparable results, 0.91 accuracy and 0.91
F-score, are also achieved with bidirectional Gated Recurrent Units
(GRUs) and Convolutional Recurrent Neural Networks (CRNNs). The
models are trained and evaluated solely on training tuples consisting
of the sentences and their labels. The outcome is a state-of-the-art
collection of metaphor detection models, trained on limited labelled
resources, which can be extended to other languages and similar
tasks.
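
To make the training setup concrete, the following is a minimal sketch of one of the model families described above: a bidirectional LSTM sentence classifier trained directly on (sentence, label) tuples. It is written in PyTorch for illustration only; the vocabulary size, embedding dimension, hidden size and dropout rate are assumed values, not the settings reported in the paper.

    # Minimal sketch (assumed hyper-parameters) of a bidirectional LSTM
    # sentence classifier for binary metaphor detection.
    import torch
    import torch.nn as nn

    class BiLSTMMetaphorClassifier(nn.Module):
        def __init__(self, vocab_size, embed_dim=300, hidden_dim=128, dropout=0.5):
            super().__init__()
            # Learned token embeddings; in practice these can be initialised
            # from pre-trained word vectors (representation learning).
            self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.lstm = nn.LSTM(embed_dim, hidden_dim,
                                batch_first=True, bidirectional=True)
            self.dropout = nn.Dropout(dropout)
            # Two output classes: literal vs. metaphorical.
            self.fc = nn.Linear(2 * hidden_dim, 2)

        def forward(self, token_ids):
            # token_ids: (batch, seq_len) integer-encoded sentences
            embedded = self.embedding(token_ids)
            _, (hidden, _) = self.lstm(embedded)
            # Concatenate the final forward and backward hidden states
            # into a single sentence representation.
            sentence_repr = torch.cat([hidden[-2], hidden[-1]], dim=-1)
            return self.fc(self.dropout(sentence_repr))

    if __name__ == "__main__":
        model = BiLSTMMetaphorClassifier(vocab_size=10000)
        batch = torch.randint(1, 10000, (4, 25))   # 4 dummy sentences, 25 tokens each
        logits = model(batch)                      # shape: (4, 2)
        print(logits.shape)

Training such a model end-to-end with cross-entropy loss and an optimiser such as Adam requires nothing beyond the labelled sentences themselves, which mirrors the claim above that no feature engineering or external linguistic knowledge is needed.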



