Social media plays a leading role in our lives due to the rapid advance of the internet and smart technology. It is a primary channel for news, advertising, exchanging opinions, and expressing feelings. Posts and the comments beneath them shape public opinion on important issues, making social media's role in public life crucial. It has been observed that opinions expressed through social networks are more direct and representative than those expressed in face-to-face communication. Data shared on social media is a cornerstone of research because patterns of social behavior can be extracted from it and used to inform government, social, and business decisions. When an event breaks, social networks are flooded with posts and comments that no one can realistically read in full, so a system that generates summaries of social media content is necessary. In recent years, abstractive summarization combined with transfer learning and transformers has achieved excellent results in the field of text summarization, producing more human-like summaries. This paper first presents text summarization methods and reviews existing text summarization systems, and then describes a system based on the pre-trained T5 model that generates summaries from user comments on social media.
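As a rough illustration of the approach the abstract describes, pooled user comments can be fed to a pre-trained T5 checkpoint via the Hugging Face `transformers` library. The checkpoint name, length limits, and decoding settings below are assumptions for the sketch, not values reported in the paper:

```python
def build_input(comments: list[str]) -> str:
    """Pool comments into a single T5 input.

    T5 is a text-to-text model: summarization is requested by
    prefixing the input with the "summarize: " task prefix.
    """
    return "summarize: " + " ".join(c.strip() for c in comments)

def summarize(comments: list[str], model_name: str = "t5-base",
              max_summary_len: int = 60) -> str:
    # Imported lazily so build_input stays usable without the
    # (heavy, optional) transformers dependency installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    inputs = tokenizer(build_input(comments), return_tensors="pt",
                       truncation=True, max_length=512)
    # Beam search settings here are illustrative defaults.
    ids = model.generate(inputs["input_ids"], max_length=max_summary_len,
                         num_beams=4, early_stopping=True)
    return tokenizer.decode(ids[0], skip_special_tokens=True)

# Example (requires the transformers package and a model download):
# print(summarize(["Great news for the city!", "I doubt this report."]))
```

This is a minimal sketch of the general T5 summarization recipe; the paper's actual preprocessing and fine-tuning details are not reproduced here.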
Published in | Automation, Control and Intelligent Systems (Volume 12, Issue 3) |
DOI | 10.11648/j.acis.20241203.11 |
Page(s) | 48-59 |
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright | Copyright © The Author(s), 2024. Published by Science Publishing Group |
Keywords | Social Media, Text Summarization, Transformers, Abstractive Summarization |
| | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| Precision (P) | 0.520 | 0.514 | 0.519 |
| Recall (R) | 0.854 | 0.849 | 0.854 |
| F-measure (F) | 0.646 | 0.640 | 0.645 |
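The P/R/F values above are ROUGE overlap scores between generated and reference summaries. A minimal pure-Python ROUGE-N sketch (whitespace tokenization only; real implementations such as the `rouge-score` package also apply lowercasing, stemming, and sentence splitting, so numbers will differ slightly):

```python
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-token windows of the sequence.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n(candidate: str, reference: str, n: int = 1):
    """Return (precision, recall, F1) for n-gram overlap."""
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    # Clipped overlap: each n-gram counts at most as often
    # as it appears in the reference.
    overlap = sum((cand & ref).values())
    p = overlap / max(sum(cand.values()), 1)
    r = overlap / max(sum(ref.values()), 1)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Candidate covers half the reference unigrams exactly:
# P = 1.0, R = 0.5, F = 2/3.
print(rouge_n("the cat sat", "the cat sat on the mat", n=1))
```

ROUGE-L, also reported above, instead scores the longest common subsequence between candidate and reference; it is omitted here for brevity.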
NLP | Natural Language Processing |
RNN | Recurrent Neural Network |
CNN | Convolutional Neural Network |
PR | Phrase Reinforcement |
TF-IDF | Term Frequency-Inverse Document Frequency |
DTM | Decomposition Topic Model |
GDTM | Gaussian Decomposition Topic Model |
LSTM | Long Short-Term Memory |
BERT | Bidirectional Encoder Representations from Transformers |
BART | Bidirectional and Auto-Regressive Transformer |
T5 | Text-to-Text Transfer Transformer |
LCSTS | Large-scale Chinese Short Text Summarization |
LLM | Large Language Model |
PLM | Prompt Learning Model |
MLM | Masked Language Model |
LM | Language Model |
Seq2Seq | Sequence to Sequence |
PTLM | Pre-Trained Language Model |
UniLM | Unified Language Model |
GPT | Generative Pre-trained Transformer |
ROUGE | Recall-Oriented Understudy for Gisting Evaluation |
APA Style
Papagiannopoulou, A., Angeli, C. (2024). Encoder-Decoder Transformers for Textual Summaries on Social Media Content. Automation, Control and Intelligent Systems, 12(3), 48-59. https://doi.org/10.11648/j.acis.20241203.11