Author here. If I compile a package which has 1000 transitive dependencies writt...

pornel · on Sept 26, 2024

Rust can't prevent crates from doing anything. It's not a sandbox language, and can't be made into one without losing its systems programming power and compatibility with the C/C++ way of working.

There are countless obscure holes in rustc, LLVM, and linkers, because they were never meant to be a security barrier against the code they compile. This doesn't affect normal programs, because the exploits are impossible to write by accident, but they are possible to write on purpose.

---

Secondly, it's not 1000 crates from 1000 people. Rust projects tend to split themselves into dozens of micro packages. It's almost like splitting code across multiple .c files, except they're visible in Cargo. Many packages are from a few prolific authors and rust-lang members.

The risk is there, but it's not as outsized as it seems.

Maintainers of your distro do not review code they pull in for security, and the libraries you link to have their own transitive dependencies from hundreds of people, but you usually just don't see them: https://wiki.alopex.li/LetsBeRealAboutDependencies

Rust has cargo-vet and cargo-crev for vetting of dependencies. It's actually much easier to review code of small single-purpose packages.

vlovich123 · on Sept 26, 2024

There are two different attack surfaces: compile time and runtime.

For compile time, there’s a big difference between requiring the attacker to exploit the compiler and letting them literally just use the standard API (both in terms of difficulty of implementation and ease of spotting what should look like fairly weird code). And there’s a big difference between runtime Rust and compile-time Rust - there’s no reason that cargo can’t sandbox build.rs execution (not what josephg brought up, but honestly my bigger concern).

There is a legitimate risk of runtime supply chain attacks, and I don’t see why you wouldn’t want facilities within Rust that let you contractually enforce what code is and isn’t able to do when you invoke it, as a way to enforce a top-level audit. That Rust doesn’t support this today doesn’t make it a bad idea, or one that can’t be elegantly integrated into today’s Rust.

pornel · on Sept 26, 2024

I agree there's a value in forcing exploits to be weirder and more complex, since that helps spotting them in code reviews.

But beyond that, if you don't review the code, then the rest matters very little. Sandboxed build.rs can still inject code that will escape as soon as you test your code (I don't believe people are diligent enough to always strictly isolate these environments despite the inconvenience). It can attack the linker, and people don't even file CVEs for linkers, because they're expected to get only trusted inputs.

Static access permissions per dependency are generally insufficient, because an untrusted dependency is very likely to find some gadget to use by combining trusted deps, e.g. use trusted serde to deserialize some other trusted type that will do I/O, and such indirection is very hard to stop without a fully capability-based sandbox. But in Rust there's no VM to mediate access between modules or the OS, and isolation purely at the source code level is evidently impossible to get right given the complexity of the type system, and LLVM's love for undefined behavior. The soundness holes are documented all over rustc and LLVM bug trackers, including some WONTFIXes. LLVM cares about performance and compatibility first, including concerns of non-Rust languages. "Just don't write weirdly broken code that insists on hitting a paradox in the optimizer" is a valid answer for LLVM, since it was never designed to be a security barrier against code that is both untrusted and expected to have maximum performance and direct low-level hardware access at the same time.
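
To make the gadget idea concrete, here is a minimal std-only sketch. Everything in it is hypothetical: `trusted_config`, `FileContents` and `untrusted` are made-up stand-ins for separate crates, and `FromStr` stands in for a generic deserialization API like serde's. The "untrusted" module never names `std::fs`, yet it still reads a file by asking a generic parsing API to build a trusted type.

```rust
/// Stand-in for a "trusted" crate that is allowed to touch the filesystem.
mod trusted_config {
    use std::{fs, io, str::FromStr};

    /// Parsing this type from a string treats the string as a path
    /// and reads that file. Plausible for a config-loading helper.
    pub struct FileContents(pub String);

    impl FromStr for FileContents {
        type Err = io::Error;

        fn from_str(path: &str) -> Result<Self, Self::Err> {
            fs::read_to_string(path).map(FileContents)
        }
    }
}

/// Stand-in for an "untrusted" crate that was granted no filesystem access.
/// It only uses the generic `str::parse` API and a type from the trusted crate.
mod untrusted {
    use super::trusted_config::FileContents;

    pub fn innocent_looking(input: &str) -> Option<String> {
        // No mention of std::fs here; the I/O happens inside the trusted impl.
        input.parse::<FileContents>().ok().map(|c| c.0)
    }
}
```

A per-crate permission check that only looks at which APIs `untrusted` names directly would pass this code, which is the indirection problem described above.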

And that's just for sandbox escapes. Malware in deps can do damage in the program without crossing any barriers. Anything auth-adjacent can let an attacker in. Parsers and serializers can manipulate data. Any data structure or string library could inject malicious data that will cross the boundaries and e.g. alter file paths or cause XSS.

josephg · on Sept 26, 2024

> the exploits are impossible to write by accident, but they are possible to write on purpose.

Can you give some examples? What ways are there to write safe rust code & do nasty things, affecting other parts of the binary?

Is there any reason bugs like this in LLVM / rustc couldn't be, simply, fixed as they're found?

steveklabnik · on Sept 26, 2024

https://github.com/Speykious/cve-rs

They can be fixed, but as always, there’s a lot of work to do. The bug that the above package relies on has never been seen in the wild, only from handcrafted code to invoke it, and so is less of a priority than other things.

And some fixes are harder than others. If a fix is going to be a lot of work, but is very obscure, it’s likely to exist for a long time.

josephg · on Sept 28, 2024

Yes, true. But as others have said, there’s probably still some value in making authors of malicious code jump through hoops, even if it will take some time to fix all these bugs.

And the bugs should simply get fixed.

bormaj · on Sept 26, 2024

Are there any attempts to address this at the package management level (not a cargo-specific question)? My first thought is that the package could declare in its config file the "scope" of access that it needs, but even then I'm sure this could be abused or has limitations.

Seems like awareness about this threat vector is becoming more widespread, but I don't hear much discussion trickling through the grapevine re: solutions.

josephg · on Sept 26, 2024

Not that I know of - hence talking about it in this blog post!

vlovich123 · on Sept 26, 2024

Package scope is typically too coarse - a package might export multiple different pieces of related functionality and you’d want to be able to use the “safe” parts you audited (eg no fs access) and never call the “dangerous” ones.

The harder bit is annotating things - while you can protect against std::fs, it’s likely harder to guarantee that malicious code doesn’t just call syscalls directly via assembly. There are too many possible escapes, which is why I suspect no one has particularly championed this idea.

josephg · on Sept 26, 2024

> it’s likely harder to guarantee that malicious code doesn’t just call syscalls directly via assembly.

Hence the requirement to also limit / ban `unsafe` in untrusted code. I mean, if you can poke raw memory, the game is up. But most utility crates don't need unsafe code.

> Package scope is typically too coarse - a package might export multiple different pieces of related functionality and you’d want to be able to use the “safe” parts you audited

Yeah; I'm imagining a combination: "I give these permissions to this package" in Cargo.toml, and then at compile time the compiler only checks the call tree of the functions I actually call. It's fine if a crate has utility methods that access std::fs, so long as they're never actually called by my program.

vlovich123 · on Sept 26, 2024

> Hence the requirement to also limit / ban `unsafe` in untrusted code

I think you’d be surprised by how much code has a transitive unsafe somewhere in the call chain. For example, RefCell and Mutex would need unsafe and I think you’d agree those are “safe constructs” that you would want available to “utility” code that shouldn’t have filesystem access. So now you have to go and reenable constructs that use unsafe that should be allowed anyway. It’s a massively difficult undertaking.

Having easier runtime mechanisms for dropping filesystem permissions would definitely be better. For example, you could be required to do filesystem access through an ownership token that determines what you can access; you could specify the “none” token for most code and even do a dynamic downgrade. There are some such facilities on Linux, but they’re quite primitive - they’re process-wide, and once dropped you can never regain that permission. That’s why the model is to isolate the different parts into separate processes, since that’s how OSes scope permissions, but it’s super hard and a lot of boilerplate for something that feels like it should be easy.
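
A rough sketch of what such a token-based API could look like, assuming a hypothetical `FsCap` type (not an existing std or crate API). Note this only shows the API shape: nothing in today's Rust stops a dependency from ignoring the token and calling `std::fs` directly, so real enforcement would need language or OS support.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Hypothetical capability token: code can only reach the filesystem
/// through whatever token it has been handed.
pub enum FsCap {
    /// The "none" token: every access is refused.
    None,
    /// Read-only access to files below the given root directory.
    ReadOnly(PathBuf),
}

fn denied(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::PermissionDenied, msg.to_string())
}

impl FsCap {
    /// Dynamic downgrade: consumes the token and returns the "none" token.
    pub fn downgrade(self) -> FsCap {
        FsCap::None
    }

    /// Maps a relative path onto the capability root, or refuses.
    fn resolve(&self, rel: &Path) -> io::Result<PathBuf> {
        let root = match self {
            FsCap::None => return Err(denied("no filesystem capability")),
            FsCap::ReadOnly(root) => root,
        };
        // Accept plain path segments only: rejects absolute paths and `..`.
        // (Symlinks inside the root could still escape; ignored in this sketch.)
        if rel.components().any(|c| !matches!(c, Component::Normal(_))) {
            return Err(denied("path escapes capability root"));
        }
        Ok(root.join(rel))
    }

    pub fn read_to_string(&self, rel: impl AsRef<Path>) -> io::Result<String> {
        fs::read_to_string(self.resolve(rel.as_ref())?)
    }
}
```

The caller would hand `FsCap::None` to most dependencies and a scoped `FsCap::ReadOnly(dir)` only to the code that was audited for it.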

josephg · on Sept 26, 2024

> I think you’d be surprised by how much code has a transitive unsafe somewhere in the call chain. For example, RefCell and Mutex would need unsafe and I think you’d agree those are “safe constructs” that you would want available to “utility” code that shouldn’t have filesystem access. So now you have to go and reenable constructs that use unsafe that should be allowed anyway. It’s a massively difficult undertaking.

RefCell and Mutex have safe wrappers. If you stick to the safe APIs of those types, it should be impossible to read / write to arbitrary memory.
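
As a small illustration, the existing `unsafe_code` lint already behaves this way: it only looks at the source it is applied to, not at how the types it uses are implemented. In this sketch (the `utility` module and `Counter` type are made up for the example) the module forbids `unsafe` and still uses RefCell and Mutex through their safe APIs:

```rust
// The lint applies to the code written inside this module only.
#[forbid(unsafe_code)]
mod utility {
    use std::cell::RefCell;
    use std::sync::Mutex;

    pub struct Counter {
        hits: RefCell<u32>,
        log: Mutex<Vec<String>>,
    }

    impl Counter {
        pub fn new() -> Self {
            Counter {
                hits: RefCell::new(0),
                log: Mutex::new(Vec::new()),
            }
        }

        /// Records a hit and returns the running total.
        pub fn hit(&self, label: &str) -> u32 {
            // Interior mutability via the safe API; the `unsafe` lives inside std.
            *self.hits.borrow_mut() += 1;
            self.log.lock().unwrap().push(label.to_string());
            *self.hits.borrow()
        }

        pub fn log_len(&self) -> usize {
            self.log.lock().unwrap().len()
        }
    }

    // Writing an `unsafe { ... }` block anywhere in this module
    // would be rejected at compile time by the lint above.
}
```

This is a lint that a crate opts into for itself, so on its own it is not something a consumer can impose on a dependency.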

I think we just don't want untrusted code itself using unsafe. We could easily allow a way to whitelist trusted crates, even when they appear deep in the call tree. This would also be useful for things like tokio, and maybe pin_project and others.