Using Consensual Biterms from Text Structures of Requirements and Code to Improve IR-Based Traceability Recovery
by Hui Gao, Hongyu Kuang, Kexin Sun, Xiaoxing Ma, Alexander Egyed, Patrick Mäder, Guoping Rong, Dong Shao, He Zhang
Abstract:
Traceability approves trace links among software artifacts based on whether two artifacts are related by system functionalities. The traces are valuable for software development, but are difficult to obtain manually. To cope with costly and fallible manual recovery, automated approaches have been proposed to recover traces through textual similarities among software artifacts, such as those based on Information Retrieval (IR). However, the low quality and quantity of artifact texts negatively affect the calculated IR values, thus greatly hindering the performance of IR-based approaches. In this study, we propose to extract co-occurring word pairs from the text structures of both requirements and code (i.e., consensual biterms) to improve IR-based traceability recovery. We first collect a set of biterms based on the part-of-speech of requirement texts, and then filter them through the code texts. We then use these consensual biterms both to enrich the input corpus for IR techniques and to enhance the calculation of IR values. An evaluation on nine systems shows that, in general, when used alone to enhance IR techniques, our approach outperforms pure IR-based approaches and another baseline by 21.9% and 21.8% in AP, and by 9.3% and 7.2% in MAP, respectively. Moreover, when combined with another enhancing strategy that works from a different perspective, it outperforms this baseline by 5.9% in AP and 4.8% in MAP.
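The extract-then-filter idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the part-of-speech patterns, the filtering rule, or the way biterms enter the corpus, so every choice below is an assumption made for illustration. In particular, `TOY_POS` is a hypothetical hand-written lexicon standing in for a real POS tagger, biterms are taken to be unordered noun/verb pairs within one sentence, a biterm counts as "consensual" when both words also appear together in a single code identifier or comment, and enrichment simply appends a joined token. The enhancement of the IR value calculation is not covered.

```python
import re
from itertools import combinations

# Hypothetical toy POS lexicon (assumption); a real pipeline would use a POS tagger.
TOY_POS = {
    "user": "NOUN", "password": "NOUN", "account": "NOUN",
    "reset": "VERB", "create": "VERB",
    "shall": "AUX", "the": "DET", "a": "DET",
}


def tokenize(text):
    """Split text into lowercase words, also breaking camelCase identifiers."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", text)
    return [p.lower() for p in parts]


def requirement_biterms(req_text):
    """Collect unordered noun/verb pairs that co-occur in one requirement sentence."""
    biterms = set()
    for sentence in re.split(r"[.!?]", req_text):
        content = {w for w in tokenize(sentence) if TOY_POS.get(w) in {"NOUN", "VERB"}}
        biterms.update(combinations(sorted(content), 2))
    return biterms


def consensual_biterms(req_biterms, code_texts):
    """Keep biterms whose two words also co-occur in a single code text unit."""
    code_units = [set(tokenize(unit)) for unit in code_texts]
    return {
        (a, b) for a, b in req_biterms
        if any(a in unit and b in unit for unit in code_units)
    }


def enrich(tokens, biterms):
    """Append one joined token per biterm whose words both occur in the document."""
    present = set(tokens)
    extra = [f"{a}_{b}" for a, b in sorted(biterms) if a in present and b in present]
    return tokens + extra


requirement = "The user shall reset a password."
code_texts = ["resetPassword", "UserAccount", "// create a user account"]

candidates = requirement_biterms(requirement)
consensual = consensual_biterms(candidates, code_texts)
enriched = enrich(tokenize(requirement), consensual)
print(sorted(consensual))
print(enriched)
```

In this toy input, three candidate biterms come from the requirement, but only the pair that also appears together in the identifier `resetPassword` survives the code-side filter. The enriched token list could then be fed to any IR model (for example, a vector space model) in place of the raw requirement text.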
Reference:
Using Consensual Biterms from Text Structures of Requirements and Code to Improve IR-Based Traceability Recovery (Hui Gao, Hongyu Kuang, Kexin Sun, Xiaoxing Ma, Alexander Egyed, Patrick Mäder, Guoping Rong, Dong Shao, He Zhang), In 37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022, Rochester, MI, USA, October 10-14, 2022, ACM, 2022.
Bibtex Entry:
@Conference{DBLP:conf/kbse/GaoKSMEMRSZ22,
  author     = {Hui Gao and Hongyu Kuang and Kexin Sun and Xiaoxing Ma and Alexander Egyed and Patrick Mäder and Guoping Rong and Dong Shao and He Zhang},
  booktitle  = {37th {IEEE/ACM} International Conference on Automated Software Engineering, {ASE} 2022, Rochester, MI, USA, October 10-14, 2022},
  title      = {Using Consensual Biterms from Text Structures of Requirements and Code to Improve IR-Based Traceability Recovery},
  year       = {2022},
  pages      = {114:1},
  publisher  = {{ACM}},
  abstract   = {Traceability approves trace links among software artifacts based on whether two artifacts are related by system functionalities. The traces are valuable for software development, but are difficult to obtain manually. To cope with the costly and fallible manual recovery, automated approaches are proposed to recover traces through textual similarities among software artifacts, such as those based on Information Retrieval (IR). However, the low quality & quantity of artifact texts negatively impact the calculated IR values, thus greatly hindering the performance of IR-based approaches. In this study, we propose to extract co-occurred word pairs from the text structures of both requirements and code (i.e., consensual biterms) to improve IR-based traceability recovery. We first collect a set of biterms based on the part-of-speech of requirement texts, and then filter them through the code texts. We then use these consensual biterms to both enrich the input corpus for IR techniques and enhance the calculations of IR values. A nine-system-based evaluation shows that in general, when solely used to enhance IR techniques, our approach can outperform pure IR-based approaches and another baseline by 21.9% & 21.8% in AP, and 9.3% & 7.2% in MAP, respectively. Moreover, when used to collaborate with another enhancing strategy from different perspectives, it can outperform this baseline by 5.9% in AP and 4.8% in MAP.},
  doi        = {10.1145/3551349.3556948},
  url        = {https://doi.org/10.1145/3551349.3556948},
}