Cake – A C23 compiler frond end written from scratch in C

thradams · on Sept 13, 2022

This is my hobby project of a C23 Front End.

One of its unique features is the transpiler that can generate readable code, preserving macros and formatting.

See it online: http://thradams.com/web3/playground.html

asveikau · on Sept 13, 2022

Tried this:

    #include <stdio.h>

    int main()
    {
       puts("hello world");
       return 0;
    }

Got:

    source:5:4: error: not found 'puts'

     5 |  puts("hello world");
       |

thradams · on Sept 13, 2022

The declaration of puts is missing from the stdio.h used by the web version. (If you output the preprocessed source you will see which declarations are there.)

I will add it. Thanks.

Meanwhile you can add:

    int puts( const char *str );
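A minimal sketch of that workaround, assuming the prototype is the only thing missing from the playground's stdio.h. The helper name `say_hello` is made up for illustration.

```c
/* Workaround sketch: declare puts by hand instead of relying on <stdio.h>. */
int puts(const char *str);

/* Hypothetical helper: prints the greeting and returns the result of puts,
   which is non-negative on success. */
int say_hello(void)
{
    return puts("hello world");
}
```

Calling `say_hello()` from `main` should then compile without the "not found 'puts'" error, since the declaration is visible before the call.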

rurban · on Sept 13, 2022

Caught a typo already :)

thradams · on Sept 13, 2022

thanks! :D

rch · on Sept 13, 2022

Typo in the title as well - although I like the idea that a programmer in the future will search for a contemporary feature called a 'compiler frond' and happen upon this discussion.

thradams · on Sept 13, 2022

Wow this is really bad!

This was my first topic here in hacker news.

I wasn't expecting all this traffic. Good to see and find more people interested in this kind of project.

I will be more careful in the future.

bee_rider · on Sept 13, 2022

Eh, users here just like to find things to complain about. If you didn't have some little typos, the thread would just be bikeshedding. And, too much care and you'll never share.

It is a hobby project anyway, it doesn't need to be perfect, and it was good to share either way. People are clearly interested!

rch · on Sept 13, 2022

Agreed!

gus_massa · on Sept 14, 2022

> Wow this is really bad!

People making comments, even about typos, means that people find the project interesting enough to read it and try to write a helpful comment. This is a great first topic on HN.

cygx · on Sept 13, 2022

While we're at it, it's "analysis", not "analisys"...

nilsb · on Sept 13, 2022

Also, there's "different" vs. "diferent" in the examples:

    source:2:2: error: _Static_assert failed "types are diferent"

     2 |static_assert(typeid(a) == typeid(double [10]), "types are diferent");
       |^

golem14 · on Sept 13, 2022

Wow, the jackals are out in full force tonight ;)

Seth Meyers must have been offline for a while ?

cryptonector · on Sept 13, 2022

Ohhhh, a transpiler is fantastic. It means I could use C23 + extensions and still use VC as a backend when I have to. (One C codebase I work on has to build on older VC, too, which means accepting obnoxious limitations.)

EDIT: I see it has an option to target C99, but I wonder if it's sufficiently constrained C99 to support VC's not-quite-C99-hah-hah dialect of C. I'll have to try it at some point.

cygx · on Sept 13, 2022

Note that while support for anything post C89 used to be spotty at best, the Visual Studio situation has improved in recent years, see e.g. [1].

[1] https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-...

cryptonector · on Sept 13, 2022

Yes, I'm aware. But a friend has to support very old VC and Windows. Don't ask. Point is that Microsoft's failure to support C99 a decade ago is still annoying.

pjmlp · on Sept 14, 2022

It wasn't a failure. The point of Visual C++ is to be good at supporting C++, and this was clearly communicated as a business decision.

https://herbsutter.com/2012/05/03/reader-qa-what-about-vc-an...

cryptonector · on Sept 14, 2022

But it was also required for, e.g., building supported kernel modules. It ended up imposing on others. It was cheap for MSFT, I get that, and this is life, but I also get to point out that this happened.

pjmlp · on Sept 14, 2022

Since Vista, a C++ subset has been allowed for kernel code (/kernel), and nowadays there are template libraries like WIL.

numeromancer · on Sept 13, 2022

You and this guy ought to get together:

https://github.com/libav/c99-to-c89

sfpotter · on Sept 13, 2022

I thought defer and lambdas weren't making it to C23...

https://thephd.dev/ever-closer-c23-improvements

thradams · on Sept 13, 2022

Everything that is not part of C23 I annotated with "extension". For instance "Extension Defer" or "Extension Lambda" are not part of C23.

whizzter · on Sept 13, 2022

IIRC Narcissus was a test-bed for future JS features, at a time when development was a bit slow, that allowed quick prototypes.

https://github.com/mozilla/narcissus https://wiki.mozilla.org/Narcissus

ape4 · on Sept 13, 2022

Cool - like Cake by the Ocean. https://www.youtube.com/watch?v=vWaRiD5ym74

Maybe you want to add some neat new features - like Circle C++ https://www.circle-lang.org/

rmatt2000 · on Sept 13, 2022

Or Cake by the Ocean Blue. https://www.youtube.com/watch?v=K9VfJqHwYKs

saagarjha · on Sept 13, 2022

I suspect this project is not quite as cool as sex on the beach.

aidenn0 · on Sept 13, 2022

I don't know, it has lots of uses and I don't get sand in any sensitive bits...

imachine1980_ · on Sept 13, 2022

this is really cool, i want to do something related, make a language like vala. what is the biggest difficulty in this type of transpiler in your short journey?

thradams · on Sept 13, 2022

Even with C/C++ experience, the C preprocessor and C declarations took a long time. It also took some time to get used to the grammar.

A language without a preprocessor and with a simple grammar saves a lot of time, and you can go directly to the funny parts.

junon · on Sept 13, 2022

German by chance? Hopefully you're okay with language corrections :)

I live in Germany now and I often hear "funny" (lustig) used where "fun" (Spaß, used in English as an adjective too - something like spaßig but that's still considered "funny") should instead be used. Just a tip!

This project is awesome, by the way. Thanks for posting it.

spyremeown · on Sept 13, 2022

Probably Brazilian, judging by the Portuguese comments.

thradams · on Sept 13, 2022

Yes! I am Brazilian.

synergy20 · on Sept 13, 2022

with skills like this, mind to push cello forward? https://github.com/orangeduck/Cello really like it but not skillful enough to do it myself.

SV_BubbleTime · on Sept 13, 2022

Help me out... Why do I want a transpiler to take C23 code and convert it down to C99 code?

If I was interested in using the C23 features, wouldn't I use a C23-ready compiler?

That said... I do wonder how long it will be until ARM and GCC have a C23 toolchain - or did I just answer my own question?

thradams · on Sept 13, 2022

My goal is not only the transpiler itself. To support it, my front end had to be built differently from a normal compiler: it preserves tokens that would otherwise be discarded during compilation. That is also useful for a tool that does refactoring, like renaming variables etc., so in any case it is useful.

The new C23 language has a lot of features that make your code fail to compile with previous C versions, like attributes, digit separators, etc. Someone may want to start a new project in C23 and soon regret it because users of the code may need C99. This would be one use case: you can write C23 code and have C99 versions of the same code base.

Unfortunately my transpiler is not "production ready" yet, and I don't have the IDE plugins etc. that are required to make the tool more productive.

The other advantage, if we had a production-ready transpiler with IDE support etc., is that we could use C23 and compile to C99 without having to wait for compilers like MSVC to implement the standard.

Also, some experimental features (like defer) can be used while you still distribute your code in standard C99. We have more freedom to use whatever we want and distribute "readable" C99 code.

By the way, most C transpilers or compilers generate C code only for immediate compilation. CFront was like that.

My transpiler has two modes: one is for direct compilation and the other is for distributing the generated code.

Each mode has advantages and disadvantages.
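To make the C23-to-C99 use case concrete, here is a hand-written sketch of what such a lowering could look like for a few C23 features. It is an assumption about the general idea, not actual Cake output, and all names in it are made up. The C23 spelling is shown in the comment above each C99 equivalent.

```c
#include <stddef.h>

/* Hand-written illustration of the idea, not actual Cake output. */

/* C23:  static const int million = 1'000'000;     (digit separators) */
static const int million = 1000000;

/* C23:  static const unsigned mask = 0b1010;      (binary literal) */
static const unsigned mask = 0xA;

/* C23:  [[nodiscard]] int twice(int x);           (attribute)
   C99 has no spelling for the attribute, so it is simply dropped. */
int twice(int x)
{
    return 2 * x;
}

/* C23:  return nullptr;                           (nullptr constant) */
void *no_object(void)
{
    return NULL;
}
```

The C99 version compiles with older compilers, at the cost of losing information such as the nodiscard diagnostic.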

SV_BubbleTime · on Sept 13, 2022

> The new C23 language has a lot of features that make your code fail to compile with previous C versions, like attributes, digit separators, etc. Someone may want to start a new project in C23 and soon regret it because users of the code may need C99.

Ah. I was not aware of this.

cryptonector · on Sept 14, 2022

> Why do I want a transpiler to take C23 code and convert it down to C99 code?

Because you are required to use Visual Studio or some compiler that barely even does C99/C11.

> If I was interested in using the C23 features, wouldn't I use a C23-ready compiler?

Yes, but maybe you can't guarantee a C23-capable compiler on all the platforms you care about, or maybe you are required to support VS/VC.

> That said... I do wonder how long it will be until ARM and GCC have a C23 toolchain - or did I just answer my own question?

You did :)

Transpiling is just a very useful idea. Sure, why even bother, the world should be perfect already! But the world isn't. And anyways, transpiling is a very cool idea. If you have time and funding (e.g., maybe you're a graduate student), then you might build it. If you don't have the time and funding you might wonder why even bother, except maybe you have the need and then you get the funding? Even if you don't have the time or funding, you might make an idea you care about into a hobby. So there's many reasons why one might want to build a transpiler.

In another thread we were talking about transpilers for SQL. There the motivation is much stronger and clearer than here, but it's of the same sort: portability. In the case of SQL there's so much variation across RDBMSes that if you must support more than one, you'll quickly wish you had a transpiler for some dialect, or even for an alternative query language.

Another thing is that a transpiler won't need to do much analysis or optimization, so it can be fairly simple compared to a full compiler. A transpiler can be much less ambitious than, and a great starting point for, constructing a compiler.

saagarjha · on Sept 13, 2022

Targeting a compiler with poor language support?

einpoklum · on Sept 13, 2022

> The compiler can be used to translate new versions of C (like C23) to C99.

So is it a compiler, a front-end (which emits IR for another compiler), or a transpiler?

thradams · on Sept 13, 2022

It's a front end. The backend implemented generates C code. I hope to have other backends in the future.

kreco · on Sept 13, 2022

It's all of them.

A transpiler is still a compiler. A front-end compiles to an intermediate state (C99 in this case).

an1sotropy · on Sept 13, 2022

This looks impressive!

I should spend some more time reading through what you have, but can you answer: what parts of this should I be looking at if I just want something to generate an AST for C99 (no transpiling needed)?

There’s some source analysis I’d like to do (on student C code) and right now I’m considering the (python-based) C parser in CFFI, but yours might be more complete?

thradams · on Sept 13, 2022

The part that is the "transpiler" is visit.c and visit.h.

You can use it as a reference to create different "visits". For instance, I am implementing a code formatter in "visitformat.c".

I have implemented a "naming convention checker" inside the parser, but it could also be a "visit".

Static analysis etc. can also be a visit.

Inside the visit context you will find

    struct ast ast;

That is the AST.

I would say the parser (syntactic analysis) is 100% complete for C23.

Semantic analysis is not 100% yet.
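As a rough illustration of the "visit" idea, the sketch below walks a tiny made-up AST and counts identifiers that break a lower_snake_case naming convention. The node types and function names here are hypothetical and far simpler than Cake's real `struct ast`; the real entry points are in visit.c and visit.h.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical node kinds; Cake's real AST is different and much richer. */
enum node_kind { NODE_IDENTIFIER, NODE_NUMBER, NODE_BINARY };

struct node {
    enum node_kind kind;
    const char *name;           /* used by NODE_IDENTIFIER */
    const struct node *left;    /* used by NODE_BINARY */
    const struct node *right;   /* used by NODE_BINARY */
};

/* Returns 1 if s contains only lowercase letters, digits and underscores. */
static int is_snake_case(const char *s)
{
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (!(islower(c) || isdigit(c) || c == '_'))
            return 0;
    }
    return 1;
}

/* One "visit": recursively walk the tree and count badly named identifiers. */
int visit_count_bad_names(const struct node *n)
{
    if (n == NULL)
        return 0;
    switch (n->kind) {
    case NODE_IDENTIFIER:
        return is_snake_case(n->name) ? 0 : 1;
    case NODE_BINARY:
        return visit_count_bad_names(n->left) + visit_count_bad_names(n->right);
    default:
        return 0;
    }
}
```

A formatter or a static analysis pass would follow the same shape: one recursive function per concern, each switching on the node kind.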

G4E · on Sept 14, 2022

You can also use pycparser[0]. It is fully C99 compatible, but be careful: it doesn't support GNU extensions (like attributes, #indent, asm() ...). You can however work around most of them by -D defining them as empty macros in the arguments.

[0] https://github.com/eliben/pycparser

an1sotropy · on Sept 14, 2022

Right, pycparser is what CFFI uses. I’ve seen some really cryptic error messages when it tries to process some of my C header files (since worked around), and I’m curious what else is out there. The ability to preserve info about formatting that the OP noted is especially interesting.

As long as we’re on this tangent, here’s the challenge I’m facing with automated analysis of student code: from foo.c make bar.c which is identical to foo.c except that comments have been turned into spaces. I think this is annoyingly non-trivial.

thradams · on Sept 14, 2022

To remove comments from source, all you need is a tokenizer. You don't need all the tokens, just the "preprocessor tokens", for instance string literals, pp-numbers ... Then /* comments */ can be replaced with 1 space and // comments with \n.

an1sotropy · on Sept 14, 2022

Multi-line comments would need to be turned into multiple blank lines, but yes, thank you for pointing out that I've been over-thinking this. I will look into what is the path of least resistance for this tokenizer-based transformation.
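A minimal sketch of the tokenizer-style approach discussed above, written for the foo.c to bar.c case: comments are overwritten with spaces in place, newlines inside block comments are kept so line numbers do not shift, and string and character literals are skipped. The function name `blank_comments` is made up, and this is a simplification rather than a full preprocessor tokenizer: trigraphs, backslash-newline splices outside literals, and C23 digit separators are not handled.

```c
#include <stddef.h>

/* Overwrite every comment in buf (length n) with spaces, keeping newlines.
   String and character literals are skipped so that comment markers inside
   them are left alone. Simplified: no trigraphs, no line splices outside
   literals, no C23 digit separators. */
void blank_comments(char *buf, size_t n)
{
    size_t i = 0;
    while (i < n) {
        char c = buf[i];
        if (c == '"' || c == '\'') {
            /* Skip a literal, honoring backslash escapes. */
            char quote = c;
            i++;
            while (i < n && buf[i] != quote && buf[i] != '\n') {
                if (buf[i] == '\\' && i + 1 < n)
                    i++;
                i++;
            }
            if (i < n)
                i++; /* closing quote, or the newline of a broken literal */
        } else if (c == '/' && i + 1 < n && buf[i + 1] == '/') {
            /* Line comment: blank up to, but not including, the newline. */
            while (i < n && buf[i] != '\n')
                buf[i++] = ' ';
        } else if (c == '/' && i + 1 < n && buf[i + 1] == '*') {
            /* Block comment: blank everything except newlines. */
            buf[i++] = ' ';
            buf[i++] = ' ';
            while (i < n && !(buf[i] == '*' && i + 1 < n && buf[i + 1] == '/')) {
                if (buf[i] != '\n')
                    buf[i] = ' ';
                i++;
            }
            if (i < n) { /* found the closing star-slash */
                buf[i++] = ' ';
                buf[i++] = ' ';
            }
        } else {
            i++;
        }
    }
}
```

Because the buffer length never changes, every remaining token in bar.c sits at the same line and column as in foo.c, which keeps later diagnostics comparable between the two files.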

dj_mc_merlin · on Sept 13, 2022

Love it. Seeing C being used in the browser to transpile C is.. weird. In the good way.

ncake · on Sept 13, 2022

Cool. I wish CMake included an option to use a custom preprocessor like this.

Matheus28 · on Sept 13, 2022

It isn't CMake's job to do this. Just have it pass a flag to your compiler to use a custom pre-processor (--no-integrated-cpp -B<path>). Should work... hopefully. Might also have to pipe through the preprocessor again since OP's most likely won't be processing #includes and such

tambourine_man · on Sept 13, 2022

That's a good name.

gpvos · on Sept 13, 2022

No lie.