Mining Open-source Repositories for Predictive Security Analytics

Security is an integral requirement and concern when developing a software system. Like other types of bugs, security vulnerabilities can be introduced in different phases of the software engineering lifecycle. These vulnerabilities can compromise the confidentiality, integrity and availability of a system. Such breaches may result in tremendous reputational and financial loss for organizations. Therefore, it is critical to detect and fix vulnerabilities as early as possible.

Traditionally, manual vulnerability identification and remediation require a significant amount of domain expertise and effort, owing to the high complexity and large scale of code and to the many types of security bugs, each subject to different attack patterns. To support developers and security experts in resolving vulnerabilities in the wild, CREST researchers focus on using data available in open-source repositories to develop high-performing and robust AI-enabled prediction models that automate the timely detection, characterization, and fix prioritization of vulnerabilities. We also study developers' security discussions on open-source repositories (e.g., GitHub) and Q&A websites (e.g., Stack Overflow and Security StackExchange) to understand current security challenges and risks, and to suggest how researchers, educators and practitioners can best leverage and contribute to the development of more secure software. Our vision is to integrate state-of-the-art technologies and up-to-date information into a recommender system that aids developers in vulnerability analytics on the fly.