NEURAL SEQUENCE-TO-SEQUENCE MODELING WITH ATTENTION BY LEVERAGING DEEP LEARNING ARCHITECTURES FOR ENHANCED CONTEXTUAL UNDERSTANDING IN ABSTRACTIVE TEXT SUMMARIZATION
Keywords:
Abstractive Text Summarization, Neural Sequence-to-Sequence Model, Word Sense Disambiguation, Semantic Content Generalization, Machine Learning Techniques, Natural Language Processing (NLP)

Abstract
Automatic text summarization (TS) plays a pivotal role in condensing large volumes of information into concise, coherent summaries, facilitating efficient information retrieval and comprehension. This paper presents a novel framework for abstractive TS of single documents that integrates three dominant aspects: structural, semantic, and neural-based approaches. The proposed framework merges machine learning and knowledge-based techniques into a unified methodology. It consists of three main phases: pre-processing, machine learning, and post-processing. In the pre-processing phase, a knowledge-based Word Sense Disambiguation (WSD) technique is employed to generalize ambiguous words, enhancing content generalization. Semantic content generalization is then performed to address out-of-vocabulary (OOV) or rare words, ensuring comprehensive coverage of the input document. Subsequently, the generalized text is transformed into a continuous vector space using neural language processing techniques, and a deep sequence-to-sequence (seq2seq) model with an attention mechanism predicts a generalized summary from the vector representation. In the post-processing phase, heuristic algorithms and text similarity metrics refine the generated summary: concepts from the generalized summary are matched with specific entities, enhancing coherence and readability. Experimental evaluations on prominent datasets, including Gigaword, DUC 2004, and CNN/DailyMail, demonstrate the effectiveness of the proposed framework. Results indicate significant improvements in handling rare and OOV words, outperforming existing state-of-the-art deep learning techniques. The proposed framework thus offers a comprehensive and unified approach to abstractive TS, combining the strengths of structural, semantic, and neural-based methodologies.
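The attention step of the seq2seq decoder described above can be sketched minimally: at each decoding step, the decoder state scores every encoder state, the scores are normalized with a softmax, and the resulting weights form a context vector. This is a generic dot-product-attention illustration, not the paper's actual implementation; all names and the toy vectors are illustrative.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_context(decoder_state, encoder_states):
    """Dot-product attention: score each encoder state against the
    current decoder state, normalize the scores to weights, and return
    the weighted sum of encoder states (the context vector)."""
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    weights = softmax(scores)
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(len(decoder_state))]
    return context, weights

# Toy example: three 2-d encoder states, one decoder state.
enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dec = [1.0, 0.0]
ctx, w = attention_context(dec, enc)
```

Here the decoder state aligns most with the first and third encoder states (both score 1.0), so they receive equal, larger weights than the second (score 0.0), and the context vector leans toward their average.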
Copyright (c) 2024 Bhavith Chandra Challagundla, Chakradhar Reddy Peddavenkatagari

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.