Your game will actually likely be more efficient if written in C. The gcc compiler has become ridiculously optimized and probably knows more tricks than you do.
Especially these days. Current-gen x86 architecture has all kinds of insane optimizations and special instruction sets that the Pentium I never had (e.g. SSE). You really do need a higher-level compiler at your back to make the most of it. And even then, there are cases where you have to resort to inline ASM or processor-specific intrinsics to optimize to the level that Roller Coaster Tycoon is/was. (original system specs)
I might be wrong, but doesn’t SSE require you to explicitly use it in C/C++? Laying out your data as arrays and specifically calling the SIMD operations on them?
There’s absolutely nothing you can do in C that you can’t also do in assembly. Because assembly is just the bunch of bits that the compiler generates.
That said, you’d have to be insane to write a game featuring SIMD instructions these days in assembly.
Technically assembly is a human-readable, paper-thin abstraction of the machine code. It really only implements one additional feature over raw machine code and that’s labels, which prevents you from having to rewrite jump and goto instructions EVERY TIME you refactor upstream code to have a different number of instructions.
So not strictly the bunch of bits. But very close to it.
I think they meant the other way around, that if you wanted to use it in C/C++, you’d have to either use assembly or some specific SSE construct otherwise the compiler wouldn’t bother.
That probably was the case at one point, but I’d be surprised if it’s still the case. Though maybe that’s part of the reason why the Intel compiler can generate faster code. But I suspect it’s more of a case of better optimization by people who have a better understanding of how it works under the hood, and maybe better utilization of newer instruction set extensions.
SSE has been around for a long time and is present in most (all?) x86 chips these days and I’d be very surprised if gcc and other popular compilers don’t use it effectively today. Some of the other extensions might be different though.
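For reference, a minimal sketch of the kind of loop in question. On x86-64, SSE and SSE2 are part of the baseline instruction set, so a compiler can use them without extra flags. Built with something like gcc -O3, a loop like this is typically auto-vectorized (look for addps in the -S output). The function name add_arrays is made up for illustration, and whether a particular compiler version vectorizes it is something to check rather than a guarantee.

```c
#include <stddef.h>

/* Plain scalar loop, no intrinsics. `restrict` promises the compiler that
   the three arrays do not overlap, which makes vectorizing easier. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```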
If you want to use instructions from an extension (for example SIMD), you either provide 2 versions of the function, or it just won’t run on some CPUs. It would be weird for someone who doesn’t know about that to compile it for x86 and then have it not run on another x86 machine. I don’t think compilers use those instructions if you don’t tell them to.
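The "provide 2 versions of the function" approach can be sketched with GCC/Clang extensions: the target attribute lets a single function use AVX2 while the rest of the program stays on the baseline, and __builtin_cpu_supports picks a version at run time. The names scale, scale_baseline and scale_avx2 are hypothetical, and AVX2 is just an example extension. On other compilers or architectures this sketch falls back to the baseline version.

```c
#include <stddef.h>

/* Baseline version: runs on any CPU the program was compiled for. */
static void scale_baseline(float *v, size_t n, float k)
{
    for (size_t i = 0; i < n; i++) {
        v[i] *= k;
    }
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Same loop, but the compiler is allowed to emit AVX2 instructions
   inside this one function only. */
__attribute__((target("avx2")))
static void scale_avx2(float *v, size_t n, float k)
{
    for (size_t i = 0; i < n; i++) {
        v[i] *= k;
    }
}

/* Pick a version at run time based on what the CPU reports. */
void scale(float *v, size_t n, float k)
{
    if (__builtin_cpu_supports("avx2")) {
        scale_avx2(v, n, k);
    } else {
        scale_baseline(v, n, k);
    }
}
#else
/* No x86 feature detection available: always use the baseline. */
void scale(float *v, size_t n, float k)
{
    scale_baseline(v, n, k);
}
#endif
```

Both paths give the same result, so the binary still runs on x86 machines without AVX2. It just takes the slower path there.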
Anyway, the SIMD that compilers will do on their own is nowhere near what’s possible. If you manually use SIMD intrinsics or inline SIMD assembly, chances are it will be faster than what the compiler would do. Especially because you are reducing the % of CPUs your program can run on.
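As a rough illustration of the manual route, here is a sketch that sums floats four at a time with SSE intrinsics from xmmintrin.h. The function name sum_floats is made up. A float reduction like this is a case where compilers tend to hold back by default, because adding in a different order can change the rounded result; writing the intrinsics yourself means accepting that. Without SSE the sketch degrades to a plain scalar loop.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

float sum_floats(const float *v, size_t n)
{
    size_t i = 0;
    float total = 0.0f;
#if defined(__SSE__)
    /* Four running sums, one per SIMD lane. */
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        acc = _mm_add_ps(acc, _mm_loadu_ps(v + i)); /* unaligned load is fine */
    }
    /* Collapse the four lanes into one scalar. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    /* Leftover elements (or everything, if SSE is unavailable). */
    for (; i < n; i++) {
        total += v[i];
    }
    return total;
}
```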
Oh I see your point. Yeah, I think they meant that. And yes, there was a time you’d have to do trickery in C to force the use of SSE or whatever extensions you wanted to use.
Yep but not if you write sloppy C code. Gotta keep those nuts and bolts tight!
If you’re writing sloppy C code your assembly code probably won’t work either
Except everyone writing C is writing sloppy C. It’s like driving a car, there’s always a non-zero chance of an accident.
Even worse, in C the compiler is just waiting for you to trip up so it can do something weird. Think the risk of UB is overblown? I found this article from Raymond Chen enlightening: https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=633
I recently came across a Rust book on how pointers aren’t just ints, because of UB.
fn main() {
    a = &1
    b = &2
    a++
    if a == b {
        *a = 3
        print(b)
    }
}
This may either: not print anything, print 3 or print 2.
Depending on the compiler, since b isn’t changed at all, it might optimize the print for print(2) instead of print(b). Even though everyone can agree that it should either not print anything or 3, but never 2.
A compiler making assumptions like that about undefined behaviour sounds just like a bug. Maybe the bug is in the spec rather than the compiler, but I can’t think of any time it would be better to optimize that code out entirely because UB is detected rather than just throwing an error or warning and otherwise ignoring the edge cases where the behaviour might break. It sounds like the worst possible option exactly for the reasons listed in that blog.
The thing about UB is that many optimizations are possible precisely because the spec specified it as UB. And the spec did so in order to make these optimizations possible.
Codebases are not 6 lines long, they are hundreds of thousands. Without optimizations like those, many CPU cycles would be lost to unnecessary code being executed.
If you write C/C++, it is because you either hate yourself or the application’s performance is important, and these optimizations are needed.
The reason Rust is so impressive nowadays is that you can write high performing code without risking accidentally doing UB. And if you are going to write code that might result in UB, you have to explicitly state so with unsafe. But for C/C++, there’s no saving. If you want your compiler to optimize code in those languages, you are going to have loaded guns pointing at your feet all the time.
Write it in Rust, and it’ll never even leak memory.