Research Article

Fine-tuning Large Language Models for Turkish Flutter Code Generation

Year 2025, Volume: 8 Issue: 4, 637 - 650
https://doi.org/10.35377/saucis...1722643

Abstract

The rapid advancement of large language models (LLMs) for code generation has largely centered on English programming queries. This paper targets a low-resource language scenario, Turkish, in the context of Flutter mobile app development. In this study, two representative LLMs (a 4B-parameter multilingual model and a 3B-parameter code-specialized model) are fine-tuned on a new Turkish question-and-answer dataset for Flutter/Dart. Fine-tuning with parameter-efficient techniques yields dramatic improvements in code generation quality: Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation (ROUGE-L), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Bidirectional Encoder Representations from Transformers Score (BERTScore), and CodeBLEU scores all increase significantly. The rate of correct solutions rises from roughly 30–70% for the base models to 80–90% after fine-tuning. An analysis of the performance trade-offs between the two models shows that the multilingual model slightly outperforms the code-focused model in accuracy after fine-tuning, while the code-focused model offers faster inference. These results demonstrate that, even with very limited non-English training data, customizing LLMs can bridge the code generation gap and give Turkish developers high-quality assistance comparable to that available in English. The dataset has been released on GitHub to facilitate further research in multilingual code generation.
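To make the parameter-efficient fine-tuning described above concrete, the following is a minimal sketch of LoRA-based supervised fine-tuning with the Hugging Face transformers, peft, and datasets libraries. The model identifier, dataset file name, prompt template, and hyperparameters are illustrative placeholders and are not the exact models or configuration used in the paper.

# Minimal LoRA fine-tuning sketch for a Turkish Flutter/Dart Q&A dataset.
# All names below (model id, data file, prompt format, hyperparameters) are
# assumptions for illustration, not the paper's actual setup.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "example-org/multilingual-4b-instruct"  # hypothetical 4B multilingual model
DATA_FILE = "turkish_flutter_qa.jsonl"               # hypothetical local copy of the dataset

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")

# LoRA trains small low-rank adapter matrices while the base weights stay frozen.
lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of all parameters

# Each record is assumed to hold a Turkish question and a Dart/Flutter answer.
def format_and_tokenize(example):
    text = (f"### Soru:\n{example['question']}\n\n"
            f"### Cevap:\n{example['answer']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=1024)

dataset = load_dataset("json", data_files=DATA_FILE, split="train")
dataset = dataset.map(format_and_tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    args=TrainingArguments(
        output_dir="flutter-tr-lora",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=3,
        logging_steps=10,
    ),
)
trainer.train()
model.save_pretrained("flutter-tr-lora")  # saves only the small LoRA adapter weights

After training, the adapter can be loaded onto the frozen base model for inference, and the generated answers can be scored against reference solutions with metrics such as BLEU, ROUGE-L, METEOR, BERTScore, and CodeBLEU, as reported in the paper.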

References

  • J. He, C. Zhou, X. Ma, T. Berg-Kirkpatrick, and G. Neubig, "Towards a unified view of parameter-efficient transfer learning", 2021. https://doi.org/10.48550/arxiv.2110.04366
  • N. Houlsby, A. Giurgiu, S. Jastrzȩbski, B. Morrone, Q. Laroussilhe, A. Gesmundo et al., "Parameter-efficient transfer learning for NLP", 2019. https://doi.org/10.48550/arxiv.1902.00751
  • X. Liu, P. He, W. Chen, and J. Gao, "Multi-task deep neural networks for natural language understanding", 2019. https://doi.org/10.18653/v1/p19-1441
  • M. Anschütz, D. Lozano, and G. Groh, "This is not correct! Negation-aware evaluation of language generation systems", 2023. https://doi.org/10.18653/v1/2023.inlg-main.12
  • A. Lodha, G. Belapurkar, S. Chalkapurkar, Y. Tao, R. Ghosh, S. Basu et al., "On surgical fine-tuning for language encoders", 2023. https://doi.org/10.18653/v1/2023.findings-emnlp.204
  • J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang et al., "LoRA: Low-rank adaptation of large language models", 2021. https://doi.org/10.48550/arxiv.2106.09685
  • Y. Hu, Y. Xie, T. Wang, M. Chen, and Z. Pan, "Structure-aware low-rank adaptation for parameter-efficient fine-tuning", Mathematics, vol. 11, no. 20, p. 4317, 2023. https://doi.org/10.3390/math11204317
  • N. Dhinagar, S. Ozarkar, K. Buwa, S. Thomopoulos, C. Owens‐Walton, E. Laltoo et al., "Parameter efficient fine-tuning of transformer-based masked autoencoder enhances resource constrained neuroimage analysis", 2025. https://doi.org/10.1101/2025.02.15.638442
  • H. Wu, "Large language models capsule: a research analysis of in-context learning (ICL) and parameter-efficient fine-tuning (PEFT) methods", Applied and Computational Engineering, vol. 43, no. 1, pp. 327-331, 2024. https://doi.org/10.54254/2755-2721/43/20230858
  • N. Sulaiman and F. Hamzah, "Optimizing LLaMA 7B for medical question answering: a study on fine-tuning strategies and performance on the MultiMedQA dataset", 2024. https://doi.org/10.31219/osf.io/g5aes
  • J. Bogaert, E. Jean, C. Bodt, and F. Standaert, "Fine-tuning is not (always) overfitting artifacts", 2023. https://doi.org/10.14428/esann/2023.es2023-152
  • G. Wiedemann, S. Yimam, and C. Biemann, "UHH-LT at SemEval-2020 Task 12: Fine-tuning of pre-trained transformer networks for offensive language detection", pp. 1638-1644, 2020. https://doi.org/10.18653/v1/2020.semeval-1.213
  • A. Aghajanyan, S. Gupta, and L. Zettlemoyer, "Intrinsic dimensionality explains the effectiveness of language model fine-tuning", 2021. https://doi.org/10.18653/v1/2021.acl-long.568
  • L. Feng, Y. Yang, M. Tan, T. Zeng, Z. Li, H. Tang et al., "Adaptive multi-source domain collaborative fine-tuning for transfer learning", 2023. https://doi.org/10.20944/preprints202311.0124.v1
  • F. Ullah, U. Azam, A. Faheem, F. Kamiran, and A. Karim, "Comparing prompt-based and standard fine-tuning for Urdu text classification", pp. 6747-6754, 2023. https://doi.org/10.18653/v1/2023.findings-emnlp.449
  • M. Mosbach, M. Andriushchenko, and D. Klakow, "On the stability of fine-tuning BERT: misconceptions, explanations, and strong baselines", 2020. https://doi.org/10.48550/arxiv.2006.04884
  • X. Li and P. Liang, "Prefix-tuning: Optimizing continuous prompts for generation", 2021. https://doi.org/10.18653/v1/2021.acl-long.353
  • X. Ma, C. Santos, and A. Arnold, "Contrastive fine-tuning improves robustness for neural rankers", 2021. https://doi.org/10.18653/v1/2021.findings-acl.51
  • L. Pan, C. Hang, A. Sil, and S. Potdar, "Improved text classification via contrastive adversarial training", 2021. https://doi.org/10.48550/arxiv.2107.10137
  • M. Chen, J. Tworek, H. Jun, J. Kaplan, Q. Yuan, and E. Zarinelli, "Evaluating large language models trained on code", arXiv preprint arXiv:2107.03374, 2021. https://doi.org/10.48550/arXiv.2107.03374
  • X. Xu, P. Sharma, J. F. Kinne, M. O’Neill, K. Mazaitis, and S. Bhatia, "A systematic evaluation of large language models of code", Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pp. 662-678, 2022. https://doi.org/10.48550/arXiv.2202.13169
  • Z. Wang, G. Cuenca, S. Zhou, T. Chen, B. Lin, and Y. Matsuo, "MCoNaLa: A benchmark for code generation from multiple natural languages", Findings of the Association for Computational Linguistics: EACL 2023, pp. 265-273, 2023. https://doi.org/10.48550/arXiv.2203.08388
  • F. Cassano, J. Gouwar, D. Nguyen, M. Bartolo, S. Serrano, and A. Sabour, "MultiPL-E: A scalable and extensible approach to benchmarking neural code generation", arXiv preprint arXiv:2208.08227, 2022. https://doi.org/10.48550/arXiv.2208.08227
There are 23 references in total.

Details

Primary Language English
Subjects Computer Software, Software Engineering (Other)
Journal Section Research Article
Authors

Bugra Uluırmak (ORCID: 0009-0000-3077-673X)

Rifat Kurban (ORCID: 0000-0002-0277-2210)

Early Pub Date October 13, 2025
Publication Date October 15, 2025
Submission Date June 18, 2025
Acceptance Date July 14, 2025
Published in Issue Year 2025 Volume: 8 Issue: 4

Cite

APA Uluırmak, B., & Kurban, R. (2025). Fine-tuning Large Language Models for Turkish Flutter Code Generation. Sakarya University Journal of Computer and Information Sciences, 8(4), 637-650. https://doi.org/10.35377/saucis...1722643
AMA Uluırmak B, Kurban R. Fine-tuning Large Language Models for Turkish Flutter Code Generation. SAUCIS. October 2025;8(4):637-650. doi:10.35377/saucis.1722643
Chicago Uluırmak, Bugra, and Rifat Kurban. “Fine-Tuning Large Language Models for Turkish Flutter Code Generation”. Sakarya University Journal of Computer and Information Sciences 8, no. 4 (October 2025): 637-50. https://doi.org/10.35377/saucis...1722643.
EndNote Uluırmak B, Kurban R (October 1, 2025) Fine-tuning Large Language Models for Turkish Flutter Code Generation. Sakarya University Journal of Computer and Information Sciences 8 4 637–650.
IEEE B. Uluırmak and R. Kurban, “Fine-tuning Large Language Models for Turkish Flutter Code Generation”, SAUCIS, vol. 8, no. 4, pp. 637–650, 2025, doi: 10.35377/saucis...1722643.
ISNAD Uluırmak, Bugra - Kurban, Rifat. “Fine-Tuning Large Language Models for Turkish Flutter Code Generation”. Sakarya University Journal of Computer and Information Sciences 8/4 (October 2025), 637-650. https://doi.org/10.35377/saucis...1722643.
JAMA Uluırmak B, Kurban R. Fine-tuning Large Language Models for Turkish Flutter Code Generation. SAUCIS. 2025;8:637–650.
MLA Uluırmak, Bugra and Rifat Kurban. “Fine-Tuning Large Language Models for Turkish Flutter Code Generation”. Sakarya University Journal of Computer and Information Sciences, vol. 8, no. 4, 2025, pp. 637-50, doi:10.35377/saucis...1722643.
Vancouver Uluırmak B, Kurban R. Fine-tuning Large Language Models for Turkish Flutter Code Generation. SAUCIS. 2025;8(4):637-50.



The papers in this journal are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.