A Survey of Response Generation of Dialogue Systems

An essential task in the field of artificial intelligence is
to allow computers to interact with people through natural language.
Research on virtual assistants and dialogue systems has therefore
received widespread attention from industry and academia.
Response generation plays a crucial role in dialogue systems, so to
push forward the research on this topic, this paper surveys various
methods for response generation. We sort these methods into
three categories. The first includes finite state machine methods,
framework methods, and instance methods. The second contains
full-text indexing methods, ontology methods, vast knowledge base
methods, and some other methods. The third covers retrieval methods
and generative methods. We also discuss some hybrid methods based
on knowledge and deep learning. We compare their advantages and
disadvantages and point out ways in which these studies can be improved
further. Our discussion covers studies published in recent years in
leading conferences such as IJCAI and AAAI.



