A Comprehensive Framework for Quality Assurance of Generative AI Text
Keywords:
Generative AI, Quality Assurance (QA), Text Generation, ChatGPT, Relevance Assessment, Lexical Diversity, Natural Language Processing, TF-IDF, NLTK
Abstract
This paper presents a comprehensive framework for the quality assurance (QA) of text generated by artificial intelligence (AI) models. The framework evaluates generated text on three metrics: grammar and spelling correctness, relevance to the prompt, and lexical diversity. It uses a Python library for grammatical error detection, TF-IDF vectorization with cosine similarity for relevance assessment, and NLTK for measuring lexical diversity. Combining these metrics gives a single mechanism for checking that generated text meets the desired quality standards. The approach is demonstrated through a sample implementation in Python that can be extended and customized for other generative AI applications.
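As a rough illustration of two of the metrics named in the abstract, the sketch below scores relevance with TF-IDF weights and cosine similarity, and scores lexical diversity as a type-token ratio. It is not the authors' implementation. To stay dependency-free it assumes a regex tokenizer in place of NLTK and a hand-written smoothed TF-IDF, and it omits the grammar check, which needs an external tool. All function names (`tokenize`, `tfidf_vectors`, `relevance`, `lexical_diversity`, `quality_report`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (assumption: a simple stand-in for NLTK)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using smoothed IDF."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance(prompt, response):
    """Relevance of a response to its prompt, in the range [0, 1]."""
    prompt_vec, response_vec = tfidf_vectors([prompt, response])
    return cosine_similarity(prompt_vec, response_vec)


def lexical_diversity(text):
    """Type-token ratio: unique tokens divided by total tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def quality_report(prompt, response):
    """Collect both scores in one dict (grammar check intentionally omitted)."""
    return {
        "relevance": relevance(prompt, response),
        "lexical_diversity": lexical_diversity(response),
    }


if __name__ == "__main__":
    report = quality_report(
        "Explain photosynthesis in plants",
        "Photosynthesis lets plants convert light into chemical energy.",
    )
    print(report)
```

Both scores fall between 0 and 1. Any thresholds for accepting or rejecting a response would have to be chosen per application; none are given in the abstract.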
Copyright (c) 2025 Saba Salman, Yasmin Makki Mohialden, Nadia Mahmood Hussien (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.