The data science products that 42Analytics creates are delivered as Docker images. Models are accompanied by software to make them usable, so we are also shipping software. It is therefore important to take software security into consideration when delivering these data science products. This is especially true when relying on a variety of open source software libraries, as we like to do. Being aware of potential security issues and dealing with known vulnerabilities are an integral part of doing responsible data science.
Automated security scanning is a helpful tool for finding (and fixing) security issues. At 42Analytics, we have been using Trivy as our (open source!) tool of choice to perform automated security scanning. Recently, Google announced the release of OSV-Scanner, which is promoted as a free tool for open source developers. We decided to compare OSV-Scanner to Trivy to find out if it could help us improve security in our products even more.
We performed an "on-paper" and an "in-practice" comparison to see how OSV-Scanner would fit our current use case. The results of our comparisons can be found in this public repository. The main conclusion for us is that OSV-Scanner could not improve security for our use cases. In fact, we found that OSV-Scanner seems to miss vulnerabilities in language-specific libraries installed on Docker images! In our example, Django 4.1 is installed on top of a Python 3.10 Docker image. OSV-Scanner picks up the Django vulnerability if you scan the requirements file, but not if you scan the resulting Docker image. If you rely on OSV-Scanner to scan Docker images for vulnerabilities, we hope you are aware of this limitation! Either use Trivy instead, or make sure that you also scan the requirements file.
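To make that recommendation concrete, here is a minimal sketch of a script that runs both checks in one go: Trivy against the built image, and OSV-Scanner against the requirements file. The image name `my-django-app:latest`, the path `requirements.txt`, and the helper names are hypothetical. The `trivy image` subcommand and the `--lockfile` flag reflect the command-line interfaces as we understand them for the releases available around the time of writing; newer OSV-Scanner releases may expect a different invocation, so check `--help` for your installed version.

```python
import shutil
import subprocess


def build_scan_commands(image, requirements_path):
    """Return the scanner invocations to run, keyed by a short label.

    Assumption: `trivy image <name>` and `osv-scanner --lockfile=<path>`
    match the CLI of the versions you have installed.
    """
    return {
        "trivy-image": ["trivy", "image", image],
        "osv-requirements": ["osv-scanner", f"--lockfile={requirements_path}"],
    }


def run_scans(commands):
    """Run each command and collect its exit code (None if the tool is missing)."""
    results = {}
    for label, cmd in commands.items():
        if shutil.which(cmd[0]) is None:
            # Scanner is not installed on this machine; skip it.
            results[label] = None
            continue
        completed = subprocess.run(cmd, capture_output=True, text=True)
        print(completed.stdout)
        results[label] = completed.returncode
    return results


if __name__ == "__main__":
    # Hypothetical image name and requirements path; replace with your own.
    commands = build_scan_commands("my-django-app:latest", "requirements.txt")
    for label, code in run_scans(commands).items():
        print(f"{label}: exit code {code}")
```

What a given exit code means differs per tool and per configuration, so the script only reports the codes and leaves interpreting the findings to the scanners' own output.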