Research Article

Evaluation-Focused Multidimensional Score for Turkish Abstractive Text Summarization

Year 2024, Volume: 7 Issue: 3, 346 - 360
https://doi.org/10.35377/saucis...1504388

Abstract

Despite the inherent complexity of abstractive text summarization, widely acknowledged as one of the most challenging tasks in natural language processing, transformer-based models have emerged as an effective solution capable of producing accurate and coherent summaries. In this study, we investigate the effectiveness of transformer-based text summarization models for the Turkish language. For this purpose, we use BERTurk, mT5 and mBART as transformer-based encoder-decoder models. Each model was trained separately on the MLSUM, TR-News, WikiLingua and Fırat_DS datasets. During experimentation, we applied various optimizations to the models' summary generation functions. Our study makes an important contribution to the limited Turkish text summarization literature by comparing the performance of different language models on existing Turkish datasets. We first evaluate the ROUGE, BERTScore, FastText-based cosine similarity and novelty rate metrics separately for each model and dataset, then normalize and combine the resulting scores into a single multidimensional score. We validate this approach by comparing the generated summaries against human evaluation results.
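The combination step described in the abstract — score each system on several metrics, normalize each metric across systems, then merge into one multidimensional score — can be sketched in a few lines. This is an illustrative sketch only: the paper's exact normalization scheme and weighting are not reproduced here, so the snippet assumes simple min-max normalization per metric, an unweighted mean across metrics, and invented example values.

```python
def novelty_rate(source_tokens, summary_tokens, n=1):
    """Fraction of summary n-grams that never appear in the source text."""
    def ngrams(toks):
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}
    summary_ngrams = ngrams(summary_tokens)
    if not summary_ngrams:
        return 0.0
    return len(summary_ngrams - ngrams(source_tokens)) / len(summary_ngrams)

def min_max_normalize(values):
    """Rescale a metric's scores across systems to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # degenerate case: all systems tie
    return [(v - lo) / (hi - lo) for v in values]

def multidimensional_score(metric_table):
    """metric_table maps metric name -> list of scores, one per system.
    Returns one combined score per system (mean of normalized metrics)."""
    normalized = [min_max_normalize(scores) for scores in metric_table.values()]
    return [sum(column) / len(normalized) for column in zip(*normalized)]

# Hypothetical scores for three systems on four metrics (values invented).
table = {
    "rouge_l":   [0.31, 0.28, 0.35],
    "bertscore": [0.87, 0.85, 0.88],
    "cosine":    [0.74, 0.70, 0.79],
    "novelty":   [0.22, 0.30, 0.18],
}
print(multidimensional_score(table))
```

Because each metric is rescaled before averaging, no single metric's raw range (e.g. BERTScore clustering near 0.85+) dominates the combined ranking.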

References

  • M. Zhang, G. Zhou, W. Yu, N. Huang, & W. Liu (2022). A comprehensive survey of abstractive text summarization based on deep learning. Computational Intelligence and Neuroscience, 2022(1), 7132226.
  • I. Akhmetov, S. Nurlybayeva, I. Ualiyeva, A. Pak, & A. Gelbukh (2023). A Comprehensive Review on ATS. Computación y Sistemas, 27(4), 1203-1240.
  • I. Mani & M. T. Maybury (Eds.). (1999). Advances in Automatic Text Summarization. MIT Press.
  • D. Jain, M. D. Borah, & A. Biswas (2021). Summarization of legal documents: Where are we now and the way forward. Computer Science Review, 40, 100388.
  • D. Suleiman and A. Awajan (2020). Deep learning based abstractive text summarization: Approaches, datasets, evaluation measures, and challenges. Mathematical Problems in Engineering, 2020, 1-29. https://doi.org/10.1155/2020/9365340.
  • M. Allahyari, S. Pouriyeh, M. Assefi, S. Safaei, E. Trippe, J. Gutiérrez … & K. Kochut, (2017). Text summarization techniques: A brief survey. https://doi.org/10.48550/arxiv.1707.02268.
  • S. Gehrmann, Z. Ziegler, & G. Rushton, (2019). Generating abstractive summaries with fine-tuned language models. https://doi.org/10.18653/v1/w19-8665.
  • A. See, P. J. Liu & C. D. Manning (2017). Get to the point: Summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368.
  • D. Bahdanau, K. Cho, & Y. Bengio (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
  • A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, ... & I. Polosukhin (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
  • J. Devlin, M. W. Chang, K. Lee, & K. Toutanova (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  • T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, ... & D. Amodei. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33, 1877-1901.
  • W. S. El-Kassas, C. R. Salama, A. A. Rafea, & H. K. Mohamed (2021). Automatic text summarization: A comprehensive survey. Expert Systems with Applications, 165, 113679.
  • B. Baykara & T. Güngör (2023). Turkish abstractive text summarization using pre-trained sequence-to-sequence models. Natural Language Engineering, 29(5), 1275-1304.
  • M. Ülker, & A.B. Özer (2021, June). TTSD: A novel dataset for Turkish Text Summarization. In 2021 9th International Symposium on Digital Forensics and Security (ISDFS) (pp. 1-6). IEEE.
  • F. B. Fikri, K. Oflazer, & B. Yanikoglu (2021, August). Semantic similarity based evaluation for abstractive news summarization. In Proceedings of the 1st workshop on natural language generation, evaluation, and metrics (GEM 2021) (pp. 24-33).
  • B. Baykara & T. Güngör (2022). Abstractive text summarization and new large-scale datasets for agglutinative languages Turkish and Hungarian. Language Resources and Evaluation, 56(3), 973-1007.
  • A. Safaya, E. Kurtuluş, A. Göktoğan, & D. Yuret (2022). Mukayese: Turkish NLP strikes back. arXiv preprint arXiv:2203.01215.
  • R. Bech, F. Sahin, & M. F. Amasyali (2022, September). Improving Abstractive Summarization for the Turkish Language. In 2022 Innovations in Intelligent Systems and Applications Conference (ASYU) (pp. 1-6). IEEE.
  • B. Ay, F. Ertam, G. Fidan, & G. Aydin (2023). Turkish abstractive text document summarization using text-to-text transfer transformer. Alexandria Engineering Journal, 68, 1-13.
  • B. Baykara & T. Güngör (2023, June). Morphosyntactic Evaluation for Text Summarization in Morphologically Rich Languages: A Case Study for Turkish. In International Conference on Applications of Natural Language to Information Systems (pp. 201-214). Cham: Springer Nature Switzerland.
  • Y. Yüksel & Y. Çebi (2021, October). TR-SUM: An ATS Tool for Turkish. In International Conference on Artificial Intelligence and Applied Mathematics in Engineering (pp. 271-284). Cham: Springer International Publishing.
  • S. Hochreiter, & J. Schmidhuber. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
  • M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, ... & L. Zettlemoyer (2019). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461.
  • C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, ... & P. J. Liu, (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140), 1-67.
  • S. Rothe, S. Narayan, & A. Severyn (2020). Leveraging pre-trained checkpoints for sequence generation tasks. Transactions of the Association for Computational Linguistics, 8, 264-280.
  • S. Schweter (2020, April). BERTurk - BERT models for Turkish. Zenodo. https://doi.org/10.5281/zenodo.3770924.
  • L. Xue, N. Constant, A. Roberts, M. Kale, R. Al-Rfou, A. Siddhant, ... & C. Raffel, (2020). mT5: A massively multilingual pre-trained text-to-text transformer. arXiv preprint arXiv:2010.11934.
  • Y. Liu. (2020). Multilingual denoising pre-training for neural machine translation. arXiv preprint arXiv:2001.08210.
  • T. Scialom, P. A. Dray, S. Lamprier, B. Piwowarski, & J. Staiano, (2020). MLSUM: The multilingual summarization corpus. arXiv preprint arXiv:2004.14900.
  • F. Ladhak, E. Durmus, C. Cardie, & K. McKeown, (2020). WikiLingua: A new benchmark dataset for cross-lingual abstractive summarization. arXiv preprint arXiv:2010.03093.
  • C. Y. Lin (2004, July). ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out (pp. 74-81).
  • T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, & Y. Artzi (2019). BERTScore: Evaluating text generation with BERT. arXiv preprint arXiv:1904.09675.
  • P. Bojanowski, E. Grave, A. Joulin, & T. Mikolov, (2017). Enriching word vectors with subword information. Transactions of the association for computational linguistics, 5, 135-146.
There are 33 citations in total.

Details

Primary Language English
Subjects Software Engineering (Other)
Journal Section Research Article
Authors

Nihal Zuhal Kayalı 0000-0002-6545-173X

Sevinç İlhan Omurca 0000-0003-1214-9235

Early Pub Date October 30, 2024
Publication Date
Submission Date June 25, 2024
Acceptance Date October 11, 2024
Published in Issue Year 2024, Volume: 7 Issue: 3

Cite

IEEE N. Z. Kayalı and S. İlhan Omurca, “Evaluation-Focused Multidimensional Score for Turkish Abstractive Text Summarization”, SAUCIS, vol. 7, no. 3, pp. 346–360, 2024, doi: 10.35377/saucis...1504388.

The papers in this journal are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License