Automated Extraction of Security Concerns from Bug Reports
Abstract: Issue tracker repositories contain a wealth of textual information, including bug reports that capture, often implicitly, Information Security (IS) concerns and vulnerabilities associated with particular issues. An approach for extracting such security concerns from bug reports can benefit bug management tasks such as prioritization and triage. Existing research on Information Extraction (IE) from bug reports has mainly focused on supervised learning, which requires significant human effort to prepare a training corpus. In this paper, we explore a fully automated approach that extracts security concepts (tags) from bug reports without the need for manually prepared training data. The approach automatically identifies and classifies bug reports based on their security concepts and textual similarities. We further enrich these tags with meaningful and representative names derived from the security domain.
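To make the idea of tagging bug reports by textual similarity without labeled training data more concrete, the following is a minimal sketch and not the method of this paper. It assumes a small, hypothetical security vocabulary (`SECURITY_CONCEPTS`), swaps in plain TF-IDF weighting with cosine similarity as the similarity measure, and uses an arbitrary threshold of 0.1; all names, seed terms, and parameters are illustrative assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical security vocabulary: tag name -> seed terms (illustrative only,
# not taken from the paper).
SECURITY_CONCEPTS = {
    "authentication": "login password credential session token authentication",
    "injection": "sql injection query escape sanitize input",
    "memory-safety": "buffer overflow heap stack memory corruption crash",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency, so no weight is zero.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def tag_reports(reports, concepts=SECURITY_CONCEPTS, threshold=0.1):
    """Assign each report its most similar concept tag, or None if too dissimilar."""
    names = list(concepts)
    # Weight reports and concept descriptions in one shared vector space.
    vectors = tf_idf_vectors(list(reports) + [concepts[n] for n in names])
    report_vecs = vectors[: len(reports)]
    concept_vecs = vectors[len(reports):]
    tags = []
    for rv in report_vecs:
        scores = {name: cosine(rv, cv) for name, cv in zip(names, concept_vecs)}
        best = max(scores, key=scores.get)
        tags.append(best if scores[best] >= threshold else None)
    return tags


if __name__ == "__main__":
    sample_reports = [
        "Password is logged in plain text after login",
        "Crash caused by heap buffer overflow in parser",
        "Button label is misaligned on settings page",
    ]
    for report, tag in zip(sample_reports, tag_reports(sample_reports)):
        print(f"{tag!s:15} {report}")
```

In this sketch, a report that shares no terms with any concept description receives no tag, which loosely mirrors separating security-related reports from the rest. A realistic system would need a far richer vocabulary drawn from the security domain and a more robust similarity measure.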