WER2025 - 28th Workshop on Requirements Engineering


From Elicitation Interviews to Software Requirements: Evaluating LLM Performance in Requirement Generation

Camila Almeida; Isaque Copque; Alvaro Oliveira; Murilo Arouca; Adriano Barbosa; Sávio Freire; Manoel Mendonça; Julio Cesar Leite

DOI: 10.29327/1588952.28-12


Abstract

Recent advancements in artificial intelligence (AI), particularly in large language models (LLMs), offer new possibilities for automating requirements generation from elicitation interviews. This study compares the performance of ChatGPT-4 and DeepSeek-V3 in generating software requirements from transcribed stakeholder interviews. In two case studies, the LLMs were tasked with identifying functional and non-functional requirements. The results indicate that ChatGPT-4 performed better at extracting precise requirements, particularly non-functional ones, while DeepSeek-V3 demonstrated advantages in efficiency. However, both models exhibited limitations in handling ambiguity and in categorizing requirements correctly. This study highlights the potential of LLMs in Requirements Engineering while emphasizing the need for improved prompting and dialogue techniques and for human supervision. Future research should explore hybrid AI-human approaches and domain-specific fine-tuning to enhance the accuracy of requirement extraction.

Keywords: Requirements engineering; Large Language Models; Requirement generation.