BR-Hunter: Detect Information Types of Bug Reports From Online Community Discussions

Abstract

In community-based software development, live-chat services are increasingly used to discuss bugs encountered during development. Many methods have emerged to identify bugs and produce bug reports, further improving the efficiency of software development. However, previous methods still face two challenges: understanding complex conversational structures (Challenge 1) and classifying the sentences that make up a bug report (Challenge 2), since entertaining or meaningless utterances often lower the quality of the constructed bug reports. To address these challenges, we propose BR-Hunter, a method comprising four components. The data preprocessing component disentangles and denoises the live chats, and the utterance embedding component extracts the semantic features of each utterance in a conversation. The bug report identification component then models each conversation as a feature graph and uses Graph Neural Networks to identify conversations that contain bug reports, thereby addressing Challenge 1. Finally, the bug report synthesis (BRS) component addresses Challenge 2 by classifying and reassembling sentences from conversations that contain bug reports, leveraging fine-tuned BERT and prompt learning techniques. Extensive experiments conducted on eight open-source projects demonstrate that BR-Hunter achieves high accuracy in identifying bug reports. Compared to baseline methods, BR-Hunter improves the average F1 score by 36.41%, 24.80%, 68.92%, 46.77%, 52.84%, 25.80%, 25.25%, and 4.19%, respectively. BR-Hunter also achieves an average improvement of 10.34% on the BRS task compared with the state-of-the-art method.

Department(s)

Information Technology and Cybersecurity

Document Type

Article

DOI

10.1109/TR.2025.3615161

Keywords

Bug report, community-based software development, graph neural network (GNN), prompt learning

Publication Date

1-1-2025

Journal Title

IEEE Transactions on Reliability
