I am super new to Rust, and this is maybe not the best place to ask, but why would you want to use inline assembly in Rust at all? Does that not invalidate a lot of the safety features built into the language?
"Safely abstracting `unsafe`" is the important concept with respect to Rust and its use of `unsafe`.
`unsafe` occurs in Rust source code where the author needs to do things in a way that can't be directly proven to the compiler to be safe. These include things like calling C functions and ASM code (both cases where the compiler can't infer all the information necessary to ensure safety). The author of the unsafe code then provides a "safe" abstraction around the unsafety that ensures that when one uses the "safe" interface, no undefined behavior occurs.
At the lowest level there is always some unsafety: system calls, libc function invocations, asm, modifying various memory-mapped registers. What Rust provides vs. C or C++ is effective isolation of the unsafety.
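To make the "safe abstraction" idea concrete, here is a minimal sketch. The function name `split_pair` is made up for illustration (the standard library already offers `split_at_mut`, which does the same job). The borrow checker can't prove the two halves don't overlap, so the body uses raw pointers inside `unsafe`, while the bounds check keeps the public signature safe to call.

```rust
/// Splits a mutable slice into two non-overlapping mutable halves at `mid`.
/// Hypothetical helper for illustration; std's `split_at_mut` does the same.
pub fn split_pair(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    // This check is what makes the safe interface sound: callers cannot
    // request a split point outside the slice.
    assert!(mid <= len, "mid out of bounds");
    let ptr = values.as_mut_ptr();
    // SAFETY: `mid <= len`, so both ranges lie inside the original slice,
    // and they do not overlap: [0, mid) and [mid, len).
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

Callers never write `unsafe` themselves: `let (left, right) = split_pair(&mut data, 2);` hands back two independent mutable slices, and a bad `mid` panics instead of corrupting memory.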
There are some places where you have to use assembly instructions:
- Operating systems. For example, the voodoo that happens during early boot, where you need to walk the CPU through different modes. And also fundamental kernel stuff like mapping memory.
- Cryptography. Crypto code is often hand-written in assembly for efficiency. (Sometimes it's even the other way around: crypto algorithms may be designed with specific machine instructions in mind.) Also there are places where you have to guarantee that some operation happens in constant time, which is hard to do when a compiler may choose to insert branches or jumps without telling you.
To compete with C in those spaces, Rust will need inline assembly features.
As far as safety is concerned, yes, inline assembly is completely unsafe. In that sense, it's not so different from calling functions in the C standard library (or any other C library), which might do absolutely anything with your memory. In all of these cases, the Rust programmer has to use the `unsafe` keyword, and it's up to them to make sure that Rust's rules are still respected after the unsafe code has run. Doing this properly, and wrapping it all in a safe Rust API, means that other safe code can then use your library without the `unsafe` keyword and without any risk of triggering undefined behavior.
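As a rough sketch of what "wrapping it all in a safe Rust API" can look like for inline assembly, the example below hides a single x86-64 `add` instruction behind an ordinary function. The name `add_via_asm` is hypothetical, and the non-x86-64 fallback is an assumption added so the snippet compiles everywhere. A real use case would involve an instruction the compiler can't emit on its own.

```rust
/// Adds two integers with one x86-64 `add` instruction (wrapping on overflow).
/// Hypothetical example function; the asm is only compiled on x86-64.
#[cfg(target_arch = "x86_64")]
pub fn add_via_asm(a: u64, b: u64) -> u64 {
    let mut sum = a;
    // SAFETY: the instruction only reads and writes the two registers named
    // as operands; it touches no memory and does not use the stack.
    unsafe {
        std::arch::asm!(
            "add {sum}, {b}",
            sum = inout(reg) sum,
            b = in(reg) b,
            options(pure, nomem, nostack)
        );
    }
    sum
}

/// Portable fallback with the same observable behavior (assumption added
/// for illustration so the example builds on other architectures).
#[cfg(not(target_arch = "x86_64"))]
pub fn add_via_asm(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

The `unsafe` block stays private to the function. Code that calls `add_via_asm(40, 2)` is plain safe Rust.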
> - Operating systems. For example, the voodoo that happens during early boot,
Isn't every system call an assembly instruction? Not just the voodoo stuff?
My only experience with "real" assembly was my OS class in college, in which a project we had involved adding system calls to Linux, and they were all snippets of assembly code called in C.
Yes, generally speaking you need inline assembly to execute whatever architecture-specific instruction is used to enter the kernel (`int 0x80`, `syscall`, `swi`, etc.). And in the kernel you need inline assembly for various other architecture-specific instructions, so you can execute them as part of your regular code instead of having to write them in a separate assembly file.
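For a userspace illustration, here is a minimal sketch of a raw `getpid` system call on x86-64 Linux using the `syscall` instruction. The function name `raw_getpid` is made up, and the fallback for other platforms is an assumption added so the snippet builds everywhere. The syscall number 39 and the clobbered `rcx`/`r11` registers are specific to the x86-64 Linux ABI.

```rust
/// Calls `getpid` directly through the `syscall` instruction.
/// Hypothetical example; only valid on x86-64 Linux.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
pub fn raw_getpid() -> u32 {
    let ret: i64;
    // SAFETY: `getpid` takes no arguments and cannot fail. The kernel
    // overwrites rcx and r11 on `syscall`, so both are declared as clobbers.
    unsafe {
        std::arch::asm!(
            "syscall",
            inlateout("rax") 39i64 => ret, // 39 = SYS_getpid on x86-64 Linux
            out("rcx") _,
            out("r11") _,
            options(nostack)
        );
    }
    ret as u32
}

/// Fallback for other platforms (assumption added for illustration):
/// defer to the standard library.
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
pub fn raw_getpid() -> u32 {
    std::process::id()
}
```

In practice you would normally go through libc or the standard library; this only shows where the architecture-specific instruction sits.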
For the weirder cases, like right after boot or when handling an interrupt, you generally have to go full assembly, since you don't start out in a state where you can just run your typical C/Rust/etc. code. It depends heavily on your platform at that point, though.
Rust has lots of unsafe features; they all require explicit `unsafe {}` blocks.
As the post mentions, inline assembly comes in handy in a number of low level contexts (e.g. dealing with memory-mapped devices, working on microcontrollers, using processor features that aren't exposed by the kernel or standard library).
Moving into or out of a control register, such as to enable a processor feature. Disabling or enabling interrupts. Reading or writing model-specific registers. Saving registers, switching stacks, and calling another function, then switching back and restoring registers when the function returns. Initializing and using hardware virtualization features (e.g. Intel VT). Using features like SMAP (preventing accidental access from the kernel to user memory through a wild pointer, and temporarily permitting that access only in carefully written, dedicated routines like `copy_from_user` and `copy_to_user`). Making raw system calls from userspace.
Some processors implement "Instruction Level Parallelism" where you can do arithmetic on many values simultaneously with special assembly instructions.
The most widespread cases of this (SSE, AVX2) came from Intel trying to get lock-in in the high-performance computing market. So it took a while for AMD to sell chips that implemented these instructions and even Intel's catalog doesn't uniformly offer them.
It's also tough to get the compiler to emit these instructions where you want even if it knows they're available (unsurprisingly, since you're asking it to auto parallelize a computation), so in the high-performance space a lot of people just have to resort to inline assembly.
Just a minor terminology point. Instruction Level Parallelism (ILP) means something else.
What you are describing, with special instructions, is called Single Instruction Multiple Data (SIMD), or simply vector instructions.
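A small sketch of what SIMD looks like from Rust: one instruction adds four `f32` values at once. Note that this uses the `std::arch::x86_64` intrinsics rather than inline assembly, which is a swapped-in alternative to what the parent comment describes. The function name `add4` and the scalar fallback are assumptions for illustration.

```rust
/// Adds two groups of four floats with a single SSE `addps` instruction.
/// Hypothetical example using intrinsics instead of inline assembly.
#[cfg(target_arch = "x86_64")]
pub fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};
    let mut out = [0.0f32; 4];
    // SAFETY: SSE is part of the x86-64 baseline, and each pointer refers
    // to exactly four valid f32 values (unaligned loads/stores are used).
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        _mm_storeu_ps(out.as_mut_ptr(), _mm_add_ps(va, vb));
    }
    out
}

/// Scalar fallback for other architectures (assumption for illustration).
#[cfg(not(target_arch = "x86_64"))]
pub fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}
```

Newer extensions such as AVX2 are not guaranteed to be present, so code using them typically needs a runtime check like `is_x86_feature_detected!` before taking the fast path.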
ILP means the execution of multiple separate instructions in parallel per clock cycle, through techniques like superscalar, out-of-order, and speculative execution. ILP is sometimes used as a measure: how many instructions can be issued per cycle. It does not require special instructions; it's just the hardware being clever about running existing instructions faster.
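To illustrate the difference, here is a sketch of code written with ILP in mind, using only ordinary instructions. Both function names are hypothetical. The first version has one long dependency chain, since every add waits on the previous one. The second keeps four independent accumulators, which gives a superscalar CPU separate chains it may overlap. Whether that is actually faster depends on the compiler and the hardware, and an optimizer may well vectorize either loop on its own.

```rust
/// One accumulator: each addition depends on the result of the previous one.
pub fn sum_single(values: &[u64]) -> u64 {
    let mut acc = 0u64;
    for &v in values {
        acc = acc.wrapping_add(v);
    }
    acc
}

/// Four accumulators: the four additions per chunk are independent of each
/// other, so the hardware is free to execute them in parallel.
pub fn sum_four_chains(values: &[u64]) -> u64 {
    let mut acc = [0u64; 4];
    let mut chunks = values.chunks_exact(4);
    for chunk in &mut chunks {
        for (a, &v) in acc.iter_mut().zip(chunk) {
            *a = a.wrapping_add(v);
        }
    }
    // Combine the partial sums, then add any leftover elements.
    let mut total = acc.iter().fold(0u64, |t, &a| t.wrapping_add(a));
    for &v in chunks.remainder() {
        total = total.wrapping_add(v);
    }
    total
}
```

Both functions return the same result for any input; only the shape of the dependency chain differs.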
We talked about Instruction Level Parallelism vs. Thread Level Parallelism vs. Node Level Parallelism since generally if you are working on a problem with some data dependency you will really have to think about each separately to get the best performance.
Rust wants to be a serious contender for low level and embedded code. In those spaces sometimes you don't have much of a choice but to drop down into assembly.
If you have a large-ish project that is otherwise coded in Rust, but you have to do some low-level coding because of special I/O, or special intrinsic features for multimedia, or even just flat-out performance, then this allows you to do so without falling back on linking against ASM or C.
Because ultimately Rust wants to be a C++ replacement and you can't do that if you don't let people who know what they're doing opt into inline assembly and raw pointers.
Yes, but low-level programming implies direct access to memory and instructions.
Rust is not designed for memory safety only. If you only want that, there are other simpler options, like any functional, scripting or managed language. Instead, Rust is designed to bring as much memory safety as possible (but not more!) to the low-level and performance fields.
Embedded/systems programming is not possible without dropping to assembly at times. Also, it can unlock some hardware-specific performance optimizations.
Rust already has unsafe blocks (explicitly marked `unsafe`) for this kind of situation. Inline assembly is, obviously, allowed only in unsafe contexts and should be used very carefully.