Despite the inherent complexity of abstractive text summarization, widely acknowledged as one of the most challenging tasks in natural language processing, transformer-based models have emerged as an effective solution capable of delivering accurate and coherent summaries. In this study, we investigate the effectiveness of transformer-based text summarization models for the Turkish language. For this purpose, we utilize BERTurk, mT5, and mBART as transformer-based encoder-decoder models. Each model was trained separately on the MLSUM, TR-News, WikiLingua, and Fırat_DS datasets. During experimentation, various optimizations were applied to the models' summary generation functions. Our study makes an important contribution to the limited Turkish text summarization literature by comparing the performance of different language models on existing Turkish datasets. We first evaluate the ROUGE, BERTScore, FastText-based cosine similarity, and Novelty Rate metrics separately for each model and dataset, then normalize and combine these scores into a single multidimensional score. We validate this approach by comparing the generated summaries against human evaluation results.
Keywords: Natural language processing, Abstractive summarization, Transformers, Evaluation metrics, ROUGE
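As a rough illustration of the score-combination step described in the abstract, the sketch below min-max normalizes each metric across systems and takes a weighted average. The normalization scheme, the equal weights, the assumption that every metric (including Novelty Rate) is higher-is-better, and all numbers are illustrative assumptions, not the paper's actual procedure.

```python
# Minimal sketch of combining per-metric scores into one multidimensional
# score. Min-max normalization and equal weights are assumptions; the paper
# may use a different formulation.
from typing import Dict, List, Optional


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale a list of metric scores to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_score(per_metric: Dict[str, List[float]],
                   weights: Optional[Dict[str, float]] = None) -> List[float]:
    """Normalize each metric across systems, then combine per system.

    Assumes every metric is higher-is-better; a real pipeline would
    flip or rescale metrics with the opposite orientation.
    """
    metrics = list(per_metric)
    weights = weights or {m: 1.0 / len(metrics) for m in metrics}
    normalized = {m: min_max_normalize(per_metric[m]) for m in metrics}
    n_systems = len(next(iter(per_metric.values())))
    return [sum(weights[m] * normalized[m][i] for m in metrics)
            for i in range(n_systems)]


# Hypothetical scores for three systems on the four metrics named in the
# abstract (values are made up for illustration only).
scores = {
    "rouge_l":      [0.31, 0.36, 0.29],
    "bertscore":    [0.71, 0.74, 0.69],
    "ft_cosine":    [0.82, 0.85, 0.80],
    "novelty_rate": [0.12, 0.18, 0.10],
}
print(combined_score(scores))  # one combined score per system
```

Normalizing before combining keeps any one metric's scale (e.g., ROUGE in the low decimals vs. cosine similarity near 1.0) from dominating the aggregate.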
| Primary Language | English |
|---|---|
| Subjects | Software Engineering (Other) |
| Section | Research Article |
| Authors | |
| Early View Date | October 30, 2024 |
| Publication Date | |
| Submission Date | June 25, 2024 |
| Acceptance Date | October 11, 2024 |
| Published Issue | Year 2024, Volume 7, Issue 3 |
The papers in this journal are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.