Rust to C compiler – 95.9% test pass rate, odd platforms

cbmuser

I am still waiting for any of the alternative Rust front- or backends to allow me to bootstrap Rust on alpha, hppa, m68k and sh4 which are still lacking Rust support.

Originally, the rustc_codegen_gcc project made this promise but never fulfilled it.

Aurornis

> to allow me to bootstrap Rust on alpha, hppa, m68k and sh4

Do you actually use all four of those platforms, or is this an arbitrary threshold for what you consider a complete set of platform support?

im_down_w_otp

They're still common (except for alpha) platforms in some market segment specific corners of embedded development. So, maybe for those purposes?

Though, the trend I'm seeing a lot of is greenfield projects just migrating their MCUs to ARM.

Aurornis

> Though, the trend I'm seeing a lot of is greenfield projects just migrating their MCUs to ARM.

That’s what I would expect, too.

The Venn diagram of projects using an old architecture like alpha but also wanting to adopt a new programming language is nearly two separate circles.

The parent comment even included HPPA (PA-RISC) which almost makes me think they’re into either retro computing or they have some arbitrary completionist goal of covering all platforms.

shakna

"m68k-unknown-linux-gnu" was merged as a Tier-3 target for Rust, wasn't it? [0]

[0] https://github.com/rust-lang/compiler-team/issues/458

hedgehog

Did they abandon that goal? Last I heard it was still under development.

jedisct1

rust still doesn't even support OpenBSD on x86_64...

dralley

Rust has Tier 3 support for OpenBSD on x86_64

mrweasel

Do you mean x86 (as in 32-bit)? Because I'm fairly sure that there's a Rust package available on x86_64 (and aarch64, riscv64, sparc64 and powerpc64).

1vuio0pswjnm7

"Most components of std are about 95% working in .NET, and 80% working in C."

.NET

Core tests 1662 39 12 97.02%

C

Core tests 1419 294 82.83%

Missing from HN title: The "95%" pass rate only applies to .NET. For GCC/Clang it is only "80%".

cod1r

this fractalfir person is super talented. See them on the rust reddit all the time. I'm not knowledgeable on compilers at all but others seem to really like their work.

iaaan

Lots of interesting use cases for this. First one that comes to mind is better interop with other languages, like Python.

pornel

The interop is already great via PyO3, except when people want to build the Rust part from source, but are grumpy about having to install the Rust compiler.

This hack is a Rust compiler back-end. Backends get platform-specific instructions as an input, so non-trivial generated C code won't be portable. Users will need to either get pre-generated platform-specific source, or install the Rust compiler and this back-end to generate one themselves.

chrisrodrigue

They are grumpy about having to install the Rust compiler for a good reason. You can’t compile Rust on Windows without using MSVC via Visual Studio Build Tools, which has a restrictive license.

steveklabnik

You can use the GNU ABI instead, if you don't want to use the Visual Studio Build Tools.

estebank

https://rust-lang.github.io/rustup/installation/windows.html

> When targeting the MSVC ABI, Rust additionally requires an installation of Visual Studio so rustc can use its linker and libraries.

> When targeting the GNU ABI, no additional software is strictly required for basic use. However, many library crates will not be able to compile until the full MSYS2 with MinGW has been installed.

...

> Since the MSVC ABI provides the best interoperation with other Windows software it is recommended for most purposes. The GNU toolchain is always available, even if you don’t use it by default.

xmodem

What does this gain you that you can't already do with `extern "C"` functions from Rust?

nicce

`extern "C"` still uses Rust. You want to skip Rust and call C from other languages directly.

dcow

Rust doesn't have a runtime so it looks just like C in compiled form. cbindgen even spits out a C header. I’m not sure what skipping C practically means even if you can argue there’s a philosophical skip happening.

hypeatei

Not GP, but what is the point of touching Rust at all then?

dilawar

Is it LLVM IR --> C? Or Rust AST to C?

dilawar

Found the answer in the project readme.

> My representation of .NETs IR maps nicely to C, which means that I was able to add support for compiling Rust to C in 2-3K LOC. Almost all of the codebase is reused, with the C and .NET specific code only present in the very last stage of compilation

nickpsecurity

Which might also allow one to use tools that work on .NET bytecode. They include verification, optimization, debugging, and other transpilers. You might also get a grant or job offer from MS Research. :)

epage

It is a rustc backend, ie an alternative to llvm, gcc, or the cranelift backends.

It started as a .NET backend but they found that their approach could easily support C code generation as well so they added that. They do this by turning what rustc gives them into their own IR.

Krutonium

But does it carry the Rusty guarantees?

GolDDranks

If the transpilation itself is bug-free, why not? For static guarantees, provided we transpile Rust code that already compiles on a normal Rust compiler, the guarantees are already checked and in place, and the dynamic ones such as bounds checking can be implemented at runtime in C with no problems.
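As a rough illustration of a dynamic guarantee carried over into C, here is a minimal sketch of a bounds-checked slice access. The `slice_u32` struct and both helper names are hypothetical and chosen for this example; this is not what the project actually emits.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical C view of a Rust `&[u32]`: a pointer plus a length. */
typedef struct {
    const uint32_t *ptr;
    size_t len;
} slice_u32;

/* Non-panicking lookup, similar in spirit to `slice.get(i)`:
   returns 1 and writes *out on success, 0 if i is out of range. */
static int slice_try_get(slice_u32 s, size_t i, uint32_t *out) {
    if (i >= s.len) return 0;
    *out = s.ptr[i];
    return 1;
}

/* Panicking lookup, similar in spirit to `slice[i]`:
   aborts instead of reading out of bounds. */
static uint32_t slice_get(slice_u32 s, size_t i) {
    if (i >= s.len) abort();
    return s.ptr[i];
}
```

The check costs one comparison per access, which is the same trade-off the native Rust backends make when they cannot prove the index is in range.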

chii

This assumes the Rusty guarantees are transitive. There's no reason to believe they aren't, but it'd be nice to see some sort of proof, or at least an argument for it.

fpoling

Rust does not generate machine code itself. It uses LLVM to do that and there is no proof that the transformations done by that continue to keep the borrow checker guarantees. One just assumes with sufficient testing all bugs will be discovered.

Then the machine code generated by LLVM is not run directly by modern CPUs and is translated into internal representation first. And the future CPUs will behave like JIT-compilers with even more complex transformations.

The intermediate C code generated by this project just adds yet another transformation not fundamentally different from any of the above.

josephg

Should be. The rust borrow checker has no runtime component. It checks the code as-is before (or during) compilation.

Arguably it’s not the compiled binary that’s “safe”. It’s the code.

cryptonector

Why wouldn't it?

pornel

It could fail if the generated C code triggered Undefined Behavior.

For example, signed overflow is UB in C, but defined in Rust. Generated code can't simply use the + operator.
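To make the point concrete, here is a minimal sketch of how a wrapping signed add can be written in C without touching signed overflow. The function name is an assumption for this example, and the final conversion relies on the wrapping behavior that GCC and Clang document.

```c
#include <stdint.h>

/* Two's-complement wrapping add, like Rust's `i32::wrapping_add`.
   The addition is done in uint32_t, where wraparound is well-defined. */
static int32_t wrapping_add_i32(int32_t a, int32_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    /* Converting an out-of-range value back to int32_t is
       implementation-defined (not undefined); GCC and Clang wrap. */
    return (int32_t)sum;
}
```

Writing `a + b` directly on the `int32_t` operands would let the C compiler assume the overflow never happens.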

C has type-based alias analysis that makes some type casts illegal. Rust handles alias analysis through borrowing, so it's more forgiving about type casts.

Rust has an UnsafeCell wrapper type for hacks that break the safe memory model and would be UB otherwise. C doesn't have such a thing, so only uses of UnsafeCell that are already allowed by C are safe.

FractalFir

I have workarounds for all "simple" cases of UB in C (this is partially what the talk is about). The test code is running with `-fsanitize=undefined`, and triggers no UB checks.

There are also escape hatches for strict aliasing in the C standard - mainly using memcpy for all memory operations.
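A minimal sketch of the memcpy escape hatch, assuming `float` is a 32-bit IEEE 754 value (true on mainstream targets, but an assumption nonetheless). The helper names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Read the bits of a float as a u32, like Rust's `f32::to_bits`.
   Dereferencing a `float *` cast to `uint32_t *` would violate strict
   aliasing; copying the bytes with memcpy does not.
   Assumes sizeof(float) == sizeof(uint32_t). */
static uint32_t f32_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Load a u32 from an arbitrary, possibly unaligned, byte pointer. */
static uint32_t load_u32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Optimizing compilers typically turn these fixed-size memcpy calls into a single load or register move, so the workaround is usually free at runtime.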

cryptonector

> It could fail if the generated C code triggered Undefined Behavior.

> For example, signed overflow is UB in C, but defined in Rust. Generated code can't simply use the + operator.

Obviously, yes, but it could generate overflow checks.
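For illustration, a generated overflow check could look something like the sketch below. It tests the operands before adding, so the overflowing `+` is never executed. The function name is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Like Rust's `checked_add`, for `int`: stores the sum in *out and
   returns true, or returns false if a + b would overflow. */
static bool checked_add_int(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;   /* cannot overflow: the range was checked above */
    return true;
}
```

GCC and Clang also provide `__builtin_add_overflow`, which does the same job in one call, but it is a compiler extension rather than standard C.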

bregma

Wait until you find out how unsafe software written in the machine language that Rust usually transpiles to is.

claudiojulio

Very cool. C to Rust would be fantastic.

Aurornis

> C to Rust would be fantastic.

This would have to go into one big unsafe block for any nontrivial program. C doesn’t convey all of the explicit things you need to know about the code to make it even compile in Rust.

CryZe

I once implemented a WASM to Rust compiler that due to WASM's safety compiles to fully safe Rust. So I was able to compile C -> WASM -> Rust and ended up with fully safe code. Though of course, just like in WASM, the C code is still able to corrupt its own linear memory, just can't escape the "sandbox". Firefox has employed a similar strategy: https://hacks.mozilla.org/2020/02/securing-firefox-with-weba...

sitkack

I'd love to check that out. Did it unroll a wasm interpreter into wasm_op function calls?

JonChesterfield

If your translator is correct, the rust front end enforces the semantics of rust then C implements them. It's as safe as any other implementation.

If that feels uncomfortable, consider that x64 machine code has no approximation to rust safety checks, and you trust rust binaries running on x64.

"Correct" is doing some heavy lifting here but generally people seem willing to believe that their toolchain is bug free.

pests

They are discussing C to Rust, not the topic of the post. Rust would need to guess the semantics of the original C.

jeroenhd

Tools like those exist. The problem with them is that they use unsafe blocks a lot, and the code usually isn't very idiomatic. Translating global variable state machines into more idiomatic Rust state machines based on things like named enums, for instance, would be very difficult.

With the help of powerful enough AI we might be able to get a tool like this, but as AI still very much sucks at actually doing what it's supposed to do, I don't think we're quite ready yet. I imagine you'd also need enough memory to keep the entire C and Rust code base inside of your context window, which would quickly require very expensive hardware once your code grows beyond a certain threshold. If you don't, you end up like many code assisting LLMs, generating code independently that's incompatible with itself.

Still, if you're looking to take a C project and extend it in Rust, or perhaps slowly rewrite it piece by piece, https://c2rust.com/ is ready for action.

g-mork

Mark Russinovich recently gave a talk at a UK Rust conference that mentioned Microsoft's internal attempts at large scale C->Rust translation, https://www.youtube.com/watch?v=1VgptLwP588

pjmlp

Note the AI part of the tooling.

ndndjdnd

What benefit would you envision from this?

trentearl

There is a DARPA program called TRACTOR to pursue this:

https://www.darpa.mil/news/2024/memory-safety-vulnerabilitie...

IshKebab

1. It means you don't need C code & a C compiler in your project any more, which simplifies infrastructure. E.g. cross compiling is easier without any C.

2. You can do LTO between Rust and the C->Rust code so in theory you could get a smaller & faster executable.

3. In most cases it is the first step to a gradual rewrite in idiomatic Rust.

jedisct1

Nim to C compiler, 100% test pass rate.

jokoon

At first I read it as C to rust compiler.

What is the point of compiling rust to C?

drdeca

I think there are probably C compilers for more platforms than there are rust compilers. So, if you want to compile your rust project on some obscure platform that doesn’t have a rust compiler for it yet, you could compile to C and then compile the resulting C code for that platform?

Just a guess.

Someone

This project doesn’t have that as a goal. In fact, it doesn’t even have “Rust to C compiler” as a goal. https://github.com/FractalFir/rustc_codegen_clr:

“ The project aims to provide a way to easily use Rust libraries in .NET. It comes with a Rust/.NET interop layer, which allows you to easily interact with .NET code from Rust

[…]

While .NET is the main focus of my work, this project can also be used to compile Rust to C, by setting the C_MODE environment flag to 1.

This may seem like a strange and unrelated feature, but the project was written in such a way that this is not only possible, but relatively easy.”

It also doesn’t mention for which version of C it produces code. That may or may not hinder attempts to use this to run rust on obscure platforms.

FractalFir

The README is slightly out of date, sorry. Supporting old platforms is one of the goals.

Truth be told, the support for C was at first added as a proof-of-concept that a Rust to C compiler is possible. But it worked surprisingly well, so I just decided to roll with it, and see where it takes me.

My policy in regards to C version is: I want to be as close to ANSI C as possible. So, I avoid modern C features as much as I can. I don't know if full compatibility is achievable, but I certainly hope so. Only time will tell.

Some simpler pieces of Rust work just fine with ANSI C compilers, but more complex code breaks (e.g. due to unsupported intrinsics). If I am able to solve that (plus some potential soundness issues), then I'll be able to use ANSI C.

arka2147483647

The article mentions ANSI-C at places. So seems like the old c standard is targeted.

tetha

This is a fairly common technique in compiler construction and programming language research: Don't try to emit some machine code, instead emit C or an IR for clang or GCC. And suddenly your little research language (not that rust is one) is executable on many, many platforms, can rely on optimizations the compilers can do, has potential access to debug handling, ..

kvemkon

Vala [1] is, perhaps, the most prominent example of a programming language in practical use with such a compiler.

[1] https://en.wikipedia.org/wiki/Vala_(programming_language)

widforss

Regarding the other way, I guess a lot of (practically) legal C wouldn't compile to Rust at all due to the language's restrictions and C's laxness, while I think all Rust could be translated to C.

p0w3n3d

Exactly. By the way, the Rust toolchain is quite complicated, while code that has been transpiled to C might just as well be compiled for, e.g., the 6502.

arghwhat

Using C compiler infrastructure, taking Rust where rustc/llvm does not go. Proprietary platforms with proprietary compilers for example.

teo_zero

> What is the point of compiling rust to C?

To address platforms that don't support Rust. TFA mentions NonStop, whatever it is.

steveklabnik

Not only does NonStop not support Rust, but apparently they failed to port gcc to it, even. So compiling Rust straight to C itself is pretty much the only option there.

nickpsecurity

They are amazing machines designed for fault tolerance (99.999% reliability). The Wikipedia article below has design details for how many generations were made. HP bought them.

https://en.m.wikipedia.org/wiki/Tandem_Computers

I think it would be useful in open-source, fault tolerance to copy one of their designs with SiFive's RISC-V cores. They could use a 20 year old approach to dodge patent issues. Despite its age, the design would probably be competitive, maybe better, than FOSS clusters on modern hardware in fault tolerance.

One might also combine the architecture with one of the strong-consistency DBs, like FoundationDB or CockroachDB, with modifications to take advantage of its custom hardware. At the local site, the result would be easy scaling of a system whose nodes appeared to never fail. The administrator still has to do regular maintenance, though, as the system reports component failures which it works around.

jeroenhd

To use rust in places where you can only use C. I imagine there are quite a few obscure microcontrollers that would benefit greatly from this pipeline.

Hell, you might finally be able to get Rust into the Linux kernel. Just don't tell them the code was originally written in Rust to calm their nerves.

vblanco

Game consoles generally only offer clang as a possibility for compiler. If you can compile rust to C, then you can finally use rust for videogames that need to run everywhere.

koakuma-chan

Is Steam Deck a monopoly yet? I feel like if your game compiles to Linux, you can target pretty much every market out there.

dcow

I don’t think I’ve ever heard those two terms “video game” and “run everywhere” in the same sentence. Bravo.

oulipo

I guess it's to target platforms (like some microcontrollers) which don't yet have a native Rust compiler, but often do have a C compiler?

snvzz

Excellent.

Now we can quickly re-rustify projects by converting them to C.

pixelfarmer

If I see something like "At least on Linux, long and long long are both 64 bits in size." my skin starts to crawl. Not only that, but GCC defines __builtin_popcount() with unsigned int / long / long long, respectively, so the text should state the types correctly too (unless a different compiler uses signed types there ... ugh). The call is made with an unsigned value, using uint64_t as a type cast, but through a fixed __builtin_popcountl(), which takes unsigned long. There are systems where this will fail; the only safe bet here is __builtin_popcountll(), as it covers arguments at least 64 bits wide.
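A minimal sketch of the width-safe variant, assuming a GCC- or Clang-compatible compiler for the builtin. The wrapper names are invented for this example; the second function is a plain C fallback for compilers that lack the builtin.

```c
#include <stdint.h>

/* Population count of a 64-bit value via the GCC/Clang builtin.
   `unsigned long long` is at least 64 bits on every conforming C
   implementation, whereas `unsigned long` is only 32 bits on, for
   example, 64-bit Windows and most 32-bit targets. */
static int popcount_u64(uint64_t x) {
    return __builtin_popcountll((unsigned long long)x);
}

/* Portable fallback that needs no compiler builtin. */
static int popcount_u64_portable(uint64_t x) {
    int n = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```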

Also, if a * b overflows within the result type, it is undefined behavior according to the C standard, so this overflow check is not properly portable either, and the code shown for it is actually buggy because the last A1 has to be A0.

No idea why all that gets me so grumpy today ...

FractalFir

Correct me if I am wrong, but in C, unsigned overflow is well-defined - at least the GCC manual says so, but I'll have to check the standard.

https://www.gnu.org/software/c-intro-and-ref/manual/html_nod...

Since signed multiplication is bitwise-equivalent to unsigned multiplication, I use unsigned multiplication to emulate UB-free signed multiplication. The signed variant of this overflow check is a bit harder to read because of that, but it still works just fine.

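    bool i128_mul_ovf_check(__int128 A0, __int128 A1) {
    bb0:
        if ((A1) != (0)) goto bb1;
        return false;
    bb1:
        /* Multiply in unsigned arithmetic, where wraparound is well-defined,
           then divide the result back by A1 and compare it with A0. */
        return (((__int128)((__uint128_t)(A0) * (__uint128_t)(A1))) / (A1)) == (A0);
    }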
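The same multiply-as-unsigned idea can be sketched for 64-bit integers, which avoids the `__int128` extension. This is an illustrative variant, not the project's code: the function name and the return convention (true means overflow) are assumptions, and it special-cases `b == -1` because `INT64_MIN / -1` is itself undefined in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a * b does not fit in int64_t. */
static bool i64_mul_overflows(int64_t a, int64_t b) {
    if (b == 0) return false;
    /* With b == -1 the only overflowing input is INT64_MIN, and the
       division below would be undefined for exactly that input. */
    if (b == -1) return a == INT64_MIN;
    /* Wrapping multiply in unsigned arithmetic; the conversion back to
       int64_t is implementation-defined (GCC and Clang wrap). */
    int64_t wrapped = (int64_t)((uint64_t)a * (uint64_t)b);
    /* If the true product fit, dividing by b gives back a exactly. */
    return wrapped / b != a;
}
```

For example, `i64_mul_overflows(INT64_MAX, 2)` reports an overflow, while `i64_mul_overflows(-3, 4)` does not.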

As for using `__builtin_popcountll` instead - you are right, my mistake. Thanks for pointing that out :).

I did not use the word "unsigned" before long long for the sake of readability - I know that repeating a word so many times can make it harder to parse for some folk. The project itself uses the correct types in the code, I was just kind of loose with the language in the article itself. My bad, I'll fix that and be a bit more accurate.

Once again, thanks for the feedback!

tialaramex

Yes, the C and C++ unsigned types are analogous to Rust's Wrapping<u8> Wrapping<u16> Wrapping<u32> and so on, except that their size isn't nailed down by the ISO document.

dlahoda

Thanks for the PR. Very fast turnaround.
