Is AT&T syntax really that bad? In particular, it's way nicer for specifying ope...

dmytrish · on June 9, 2020

I subjectively like AT&T syntax more (it's more familiar to me and has more of a "machine" vibe).

However, I recognize that it's objectively worse for a human coder and contains some syntactic footguns which regularly shoot even an experienced coder:

- `number` (memory displacement/pointer) used instead of literal `$number`. It's an easy automatic mistake even for a person who knows this very well.

- the SIB clauses for x86 (memory addressing with constant Displacement and Scale and Index, Base in registers) look like `D(B, I, S)`: it's possible to remember this, but reading/writing it is not as obvious as `[D + B + I * S]`.

- Intel syntax in general is more similar to high-level languages, even though it's more verbose.

- AT&T syntax has syntactic redundancies like '%' before each register which make code much noisier than needed.
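The `D(B, I, S)` versus `[D + B + I * S]` comparison above can be made concrete with a small sketch. The `MemOperand` type and its `att`/`intel` methods are hypothetical names invented for this illustration, not part of any assembler or crate, and the sketch assumes a base register is always present.

```rust
/// Hypothetical model of an x86 memory operand: disp + base + index * scale.
/// Assumes a base register is always present (illustration only).
struct MemOperand<'a> {
    disp: i32,
    base: &'a str,
    index: Option<(&'a str, u8)>, // (index register, scale)
}

impl MemOperand<'_> {
    /// AT&T spelling: D(%B, %I, S), with the displacement outside the parentheses.
    fn att(&self) -> String {
        let disp = if self.disp == 0 {
            String::new()
        } else {
            self.disp.to_string()
        };
        match self.index {
            Some((idx, scale)) => format!("{}(%{}, %{}, {})", disp, self.base, idx, scale),
            None => format!("{}(%{})", disp, self.base),
        }
    }

    /// Intel spelling: [B + I*S + D], written out as an arithmetic expression.
    fn intel(&self) -> String {
        let mut s = self.base.to_string();
        if let Some((idx, scale)) = self.index {
            s.push_str(&format!(" + {}*{}", idx, scale));
        }
        if self.disp > 0 {
            s.push_str(&format!(" + {}", self.disp));
        } else if self.disp < 0 {
            s.push_str(&format!(" - {}", self.disp.unsigned_abs()));
        }
        format!("[{}]", s)
    }
}
```

For example, an operand with displacement 16, base `rbx`, index `rcx` and scale 8 comes out as `16(%rbx, %rcx, 8)` in one form and `[rbx + rcx*8 + 16]` in the other.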

hyperman1 · on June 9, 2020

The Intel syntax has some nice gotchas too, though, since the scale can be 1 and then either register could be the base. I believe [eax+ebp] uses the ds segment, but [ebp+eax] uses ss.
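A rough sketch of why operand order can matter here: in 32-bit addressing the default segment is SS when the base register is EBP or ESP, and DS otherwise. The helper names `default_segment` and `base_of` are made up for this illustration, and the sketch assumes an assembler that encodes the first register written as the base when the scale is 1; other assemblers may choose differently.

```rust
/// Default segment for a 32-bit effective address, given the register
/// that ended up encoded as the base.
fn default_segment(base: &str) -> &'static str {
    match base {
        "ebp" | "esp" => "ss",
        _ => "ds",
    }
}

/// Hypothetical helper: pick the first register inside "[a+b]" as the base.
/// This models the assumption that operand order is honored when scale is 1.
fn base_of(operand: &str) -> Option<&str> {
    operand
        .trim()
        .strip_prefix('[')?
        .strip_suffix(']')?
        .split('+')
        .next()
        .map(str::trim)
}
```

Under that assumption, `[eax+ebp]` resolves to `ds` and `[ebp+eax]` resolves to `ss`, even though both compute the same address.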

exmadscientist · on June 9, 2020

The really fun bit is when you explicitly want to force one form or other, for some reason. Usually this is related to instruction length, but occasionally you used to run into times when one form could help avoid a stall.

Keeping tricks like that stable and clearly documented is, uh, not for the faint of heart.

pwdisswordfish2 · on June 9, 2020

% before register names isn't redundant: it allows you to refer to symbols that would otherwise have the same spelling as register names.

dmytrish · on June 9, 2020

Yes, that's where AT&T is more logical.

In practice, though, naming a symbol with a register name is error-prone and confusing. I'd rather have a way to escape a symbol when it's really needed than to pay the price of noise for an admittedly bad idea.

cesarb · on June 9, 2020

New registers are added all the time (for instance, the %zmm registers), so even if a symbol name doesn't conflict now, it might conflict in the future. Having separate namespaces for symbols and registers is a good idea.

renox · on June 10, 2020

Good point, but I wonder if it shouldn't be the labels which should be prefixed instead of the registers?

burfog · on June 10, 2020

Yes, ideally the labels would be strings in the style of C source code. That includes using \0 to put NUL bytes in the middle. It also includes wide character strings.

This allows for unusual languages like LISP and FORTH, without mangling the symbols. Symbols could have commas and spaces.

fluffything · on June 9, 2020

The problem is not whether it's bad or not, but rather having to learn a completely different second syntax for one architecture.

Most Rust code is quite portable, targeting ARM, x86, MIPS, PPC, WASM, Sparc, s390x, riscv, ... That means that for many snippets of inline assembly, you might encounter ~8 of them, one for each architecture, all using different syntaxes.

Intel syntax is quite similar to that of other popular architectures `op dst, args...`.

Adding another second syntax for x86 just doesn't add that much value IMO, and adds quite a bit of cost: now everybody dealing with x86 assembly needs to learn 2 syntaxes... and everybody dealing with portable code now needs to be at least able to read 2 syntaxes for x86... Without talking about the cost of implementing a second syntax in the compiler, etc.

If you prefer AT&T, you can always write a proc macro that translates it to Intel, and use that in your projects.

If I ever need to deal with such code, I'd just expand the macro to read the actual Intel syntax, modify that, and either fork the project, or submit a patch with a fix using Intel syntax.

JoshTriplett · on June 9, 2020

> If you prefer AT&T, you can always write a proc macro that translates it to Intel, and use that in your projects.

If you prefer AT&T (or you have a large body of existing AT&T code you don't want to have to translate all at once), use asm!("...", options(att_syntax)) and it'll Just Work.
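As a minimal sketch of what that looks like, here is the same one-instruction snippet written both ways. The function names are invented for this example, and the plain-Rust fallbacks exist only so the snippet compiles on non-x86_64 targets; they are not something the `asm!` macro provides.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Adds 5 using the default (Intel) syntax: destination first, bare immediate.
#[cfg(target_arch = "x86_64")]
fn add_five_intel(x: u64) -> u64 {
    let mut v = x;
    unsafe { asm!("add {0}, 5", inout(reg) v) };
    v
}

/// Adds 5 using AT&T syntax: source first, `$` marks the immediate.
/// Without the `$`, `5` would be read as a memory address.
/// The register placeholder is substituted with its `%` prefix included.
#[cfg(target_arch = "x86_64")]
fn add_five_att(x: u64) -> u64 {
    let mut v = x;
    unsafe { asm!("addq $5, {0}", inout(reg) v, options(att_syntax)) };
    v
}

// Fallbacks (an assumption of this sketch) so it builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn add_five_intel(x: u64) -> u64 {
    x.wrapping_add(5)
}

#[cfg(not(target_arch = "x86_64"))]
fn add_five_att(x: u64) -> u64 {
    x.wrapping_add(5)
}
```

Both functions should return the same value for the same input; only the spelling of the instruction differs.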

amluto · on June 9, 2020

I have two big reasons for preferring Intel syntax:

1. It’s the syntax in the manual. The last thing I want to do when reading or writing asm is to mentally translate from the manual to AT&T syntax.

2. Addressing like (%rax) is tolerable. But the AT&T scale * index + offset syntax is inexcusable. Give me the verbose Intel addressing syntax any day, please.

(As a kernel programmer, I’m more familiar with AT&T. I still hate it. I’m morbidly curious how Intel syntax ought to handle things like SGDT. Maybe SGDT SIXBYTE PTR [address]? The fact that four bytes is called a DWORD isn’t great.)

jlebar · on June 9, 2020

> 1. It’s the syntax in the manual. The last thing I want to do when reading or writing asm is to mentally translate from the manual to AT&T syntax.

Right! I worked on C++ compilers for years and I don't even know where the canonical book of AT&T mnemonics is. At the rate that Intel is adding new instructions, using anything other than their official docs (and thus their official names) seems nuts.

amluto · on June 10, 2020

I'm quite confident that the canonical book of AT&T x86 assembler does not exist. There is the canonical book of Intel asm (the SDM), the canonical book of AMD asm (the APM), and the almost canonical book of upcoming Intel asm (the ISE). Sadly, these documents are not generally entirely consistent with each other, but they all agree that the Intel syntax is the syntax.

jfkebwjsbx · on June 9, 2020

AT&T syntax is actually the common one in most low-level programming if you count by architectures and most likely also by code size produced, if only because GCC has/had been the de facto compiler for new chip uarchs for more than a decade (helped by the fact that everyone wants their chips to run Linux).

Nowadays GCC and LLVM support both styles and archs pick whatever they prefer and nobody cares, really.

> either fork the project, or submit a patch with a fix using Intel syntax.

That sounds a bit extreme? Reading/writing in both styles is not an issue for anyone that has dealt with x86 professionally.

exmadscientist · on June 9, 2020

> AT&T syntax is actually the common one in most low-level programming

Is this actually true? Admittedly I've done mostly x86 and ARM for the past several years (almost entirely Cortex-M, so v6-M and v7-M profiles, using ARM's GCC builds for embedded) and the only toolchain that prefers AT&T syntax is x86 GCC and those explicitly trying to be compatible with it. All the ARM inline assembly I've written, targeting GCC backends, has been ARM syntax, and likewise for all the disassembly output.

The DSPs and DSP-likes are... always weird. So I try to stay away from them and make them someone else's problem. But I don't think they use AT&T syntax either. It doesn't work so well for truly strange processors anyway.

I'm one of those guys who tends to have the makefiles output the disassembly, and have it open on the other monitor while I'm working, so I'd notice if it were different....

oblio · on June 9, 2020

And apparently Rust already supports both.

rectang · on June 9, 2020

They're both terrible. :( I yearn for a more ergonomic assembler.

That's not to say that the Rust team should have tried to find (or create) an alternative; it's outside their core mission, and choosing between popular existing alternatives is the right framing.