
This talk from IBM Research covers the use of AI and machine learning to defend against software supply chain attacks. The presenters point to eroding trust in software after major security breaches such as the XZ backdoor.
Here are the key takeaways:
- The Problem: There’s a “semantic gap” between what code is expected to do and what it actually does. Supply chain attacks exploit this gap by hiding malicious code in software updates or open-source projects.
- The Solution: The researchers introduce the “Code Genome Framework,” an open-source tool that creates “semantically meaningful fingerprints” of software. By comparing the “code genome” of a new piece of software to a trusted version, the framework can detect anomalies and potential threats.
- How it Works: The framework disassembles code, converts it to an intermediate representation, and then extracts a “code gene,” a feature vector that serves as the code’s fingerprint. Comparing the genes of two versions gives a differential analysis that can spot malicious changes.
- Applications: The Code Genome Framework can be used to detect backdoors, as demonstrated with the XZ backdoor example, and to improve the accuracy of Software Bills of Materials (SBOMs) by identifying software components at a granular level.
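To make the differential analysis concrete, here is a minimal sketch of comparing a trusted build against a new one. It is not the Code Genome Framework's API: the per-function feature vectors, the function names, the use of cosine similarity, and the 0.9 threshold are all assumptions chosen for illustration. In the real framework the vectors would come from the disassembly and intermediate-representation steps described above.

```python
from math import sqrt

# Hypothetical "genome": one feature vector ("gene") per function name.
# Real genes would be derived from disassembled code, not written by hand.
Genome = dict[str, list[float]]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def diff_genomes(trusted: Genome, candidate: Genome,
                 threshold: float = 0.9) -> dict[str, list[str]]:
    """Flag functions that were added, removed, or changed beyond the threshold."""
    report: dict[str, list[str]] = {"added": [], "removed": [], "changed": []}
    for name, gene in candidate.items():
        if name not in trusted:
            report["added"].append(name)
        elif cosine_similarity(trusted[name], gene) < threshold:
            report["changed"].append(name)
    report["removed"] = [name for name in trusted if name not in candidate]
    return report


# Toy data: invented function names and vectors, for illustration only.
trusted = {
    "decode_block": [1.0, 0.0, 2.0],
    "checksum": [0.0, 3.0, 1.0],
}
candidate = {
    "decode_block": [1.0, 0.0, 2.0],   # identical to the trusted gene
    "checksum": [3.0, 0.0, 0.0],       # very different from the trusted gene
    "resolve_hook": [0.5, 0.5, 0.5],   # absent from the trusted build
}

report = diff_genomes(trusted, candidate)
print(report)
```

In this toy run, `checksum` is reported as changed and `resolve_hook` as added, which is the kind of anomaly a reviewer would then inspect by hand. The threshold trades false alarms against missed changes and would need tuning on real data.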
The framework and related tools are open source to encourage collaboration on improving supply chain security.