Icing on the Cake: Automatic Code Summarization at Ericsson
2024 (English)
In: Proceedings - 2024 IEEE International Conference on Software Maintenance and Evolution, ICSME 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 689-700
Conference paper, Published paper (Refereed)
Abstract [en]
This paper presents our findings on the automatic summarization of Java methods within Ericsson, a global telecommunications company. We evaluate the performance of an approach called Automatic Semantic Augmentation of Prompts (ASAP), which uses a Large Language Model (LLM) to generate leading summary comments (Javadocs) for Java methods. ASAP enhances the LLM's prompt context by integrating static program analysis and information retrieval techniques to identify similar exemplar methods along with their developer-written Javadocs, and serves as the baseline in our study.
In contrast, we explore and compare the performance of four simpler approaches that do not require static program analysis, information retrieval, or the presence of exemplars as in the ASAP method. Our methods rely solely on the Java method body as input, making them lightweight and more suitable for rapid deployment in commercial software development environments.
We conducted experiments on an Ericsson software project and replicated the study on two widely used open-source Java projects, Guava and Elasticsearch, to ensure the reliability of our results. Performance was measured across eight metrics that capture various aspects of similarity. Notably, one of our simpler approaches performed as well as or better than the ASAP method on both the Ericsson project and the open-source projects.
Additionally, we performed an ablation study to examine the impact of method names on Javadoc summary generation across our four proposed approaches and the ASAP method. By masking the method names and observing the generated summaries, we found that our approaches were statistically significantly less influenced by the absence of method names compared to the baseline. This suggests that our methods are more robust to variations in method names and may derive summaries more comprehensively from the method body than the ASAP approach.
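The masking step of the ablation can be illustrated with a minimal sketch. It is not the procedure used in the paper, whose details the abstract does not give. The placeholder `maskedMethod`, the helper `mask_method_name`, and the regex heuristic are all assumptions made for illustration, and the input is assumed to be a single Java method that starts at its signature, with no leading annotations or comments.

```python
import re

# Hypothetical placeholder name; the masking token used in the study is not stated.
MASK = "maskedMethod"

# Heuristic: the first identifier directly followed by "(" is taken to be the
# method name. This assumes the text starts at the signature (no annotations,
# no Javadoc) and is not a full Java parser.
_DECLARATION = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def mask_method_name(java_method: str, mask: str = MASK) -> str:
    """Replace the method's name (and any recursive calls to it) with a placeholder."""
    match = _DECLARATION.search(java_method)
    if match is None:
        # No parameter list found; return the input unchanged.
        return java_method
    name = match.group(1)
    # Replace whole-word occurrences of the name that are followed by "(",
    # which covers the declaration itself and direct recursive calls.
    call_pattern = re.compile(rf"\b{re.escape(name)}\b(?=\s*\()")
    return call_pattern.sub(mask, java_method)


if __name__ == "__main__":
    source = "public int add(int a, int b) { return a + b; }"
    print(mask_method_name(source))
```

The masked text would then be passed to the summarization approach in place of the original method, so that any drop in summary quality can be attributed to the missing name.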
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. p. 689-700
Keywords [en]
Automated Code Summarization, Generative AI, Industry Study, Large Language Models, Program Comprehension, Software Maintenance, Computer software maintenance, Java programming language, Online searching, Open source software, Problem oriented languages, Search engines, Software reliability, Automated code, Ericsson, Java methods, Language model, Large language model, Semantic augmentations, Semantics
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-27417
DOI: 10.1109/ICSME58944.2024.00073
ISI: 001417197700062
Scopus ID: 2-s2.0-85215507252
ISBN: 9798350395686 (print)
OAI: oai:DiVA.org:bth-27417
DiVA, id: diva2:1933449
Conference
40th IEEE International Conference on Software Maintenance and Evolution, ICSME 2024, Flagstaff, Oct 6-11, 2024
2025-01-31 2025-01-31 2025-09-30 Bibliographically approved