This is the story of how I found a remote code execution vulnerability in BentoML and what it can teach you about securing your AI systems.
BentoML is a popular AI framework used to package and serve models. It makes it very simple for you to turn any model into a REST API.

As part of my usual routine, I began exploring the methods BentoML uses to serialize and deserialize objects, especially machine learning models. An AI model is essentially just an object in a program’s runtime. To store or transfer such an object, we convert it into a format that is easy to write to disk or send over a network; this is called serialization. The opposite, loading that data back into memory as a live object, is called deserialization. In Python, both are commonly done with `pickle`. That is how I discovered CVE-2024-2912, and in this post I cover how severe it is and why `pickle` was the culprit.
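To make the idea concrete, here is a minimal sketch of a `pickle` round trip, followed by a deliberately harmless illustration of why loading untrusted pickle data is risky. This is not the BentoML exploit. The `model` dictionary and the `Harmless` class are hypothetical stand-ins I made up for illustration, and the "payload" only calls the built-in `len`.

```python
import pickle

# A stand-in for a trained model: just an object living in memory.
# (Hypothetical example data, not a real BentoML model.)
model = {"weights": [0.1, 0.2, 0.3], "bias": 0.5}

# Serialization: object -> bytes that can be stored or sent over a network.
blob = pickle.dumps(model)

# Deserialization: bytes -> an equivalent object back in memory.
restored = pickle.loads(blob)


class Harmless:
    """Hypothetical class showing that a pickle can name a callable to run on load."""

    def __reduce__(self):
        # pickle records "call len('attacker-controlled')" instead of the object's state.
        # An attacker would pick a far more dangerous callable than len.
        return (len, ("attacker-controlled",))


# Loading the bytes calls len(...) and returns its result, not a Harmless instance.
result = pickle.loads(pickle.dumps(Harmless()))
```

The key point is that `pickle.loads` does not just rebuild data. It follows instructions embedded in the byte stream, and whoever produced those bytes chooses which callable gets invoked.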