Author |
Message |
- Joined: 09 Aug 2021
- Posts: 131
|
SDCC: best practices for designing the game in C
Posted: Wed Apr 17, 2024 11:48 am
|
You cannot write C for the Z80 the way you write C for a PC, or even for something like an Arduino. Well, you can, if that fits your needs, but it will not be efficient.
There are best practices. Broadly, there are two levels of optimization: the way you write the code itself and the way you design your program as a whole.
as for the code, here are a few recommendations:
1. global/static variables are generally faster than local ones
1.1 unless those are the few local variables that get allocated into CPU registers. if you only have a few (one or two), it is usually better to make them local so they are allocated into registers
1.2 global/static variables mean your functions lose reentrancy, which is fine in most cases
2. use a small number of arguments (1 or 2 at most), and plan them carefully.
2.1 make functions return integer results; that may improve performance
3. write simple functions; too much code in one function hurts register allocation within it
4. don't write functions that are too simple, because calls are expensive. use inline functions and macros for those.
5. unroll loops manually
6. arrays vs structs: you win with structs when you access one struct, you lose when you manipulate several structs. you win when you manipulate short arrays (byte index)
7. an indirect call is usually faster than a long switch
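A minimal sketch of points 1, 6 and 7, assuming a hypothetical actor system. Every name here (`actor_x`, `actor_state`, `MAX_ACTORS`, `update_actors`, the state handlers) is made up for illustration. It keeps actor data in global parallel arrays indexed by a byte and dispatches per-state logic through a function pointer table instead of a switch:

```c
#include <stdint.h>

#define MAX_ACTORS 8u   /* hypothetical limit; small enough for a byte index */

enum { STATE_IDLE, STATE_WALK, STATE_COUNT };

/* Parallel arrays instead of an array of structs: every field is
   reached with a single 8-bit index. They are globals, not locals. */
uint8_t actor_x[MAX_ACTORS];
uint8_t actor_state[MAX_ACTORS];

/* The current actor index is global too, so handlers take no arguments. */
static uint8_t cur;

static void actor_idle(void) { /* nothing moves while idle */ }
static void actor_walk(void) { actor_x[cur]++; }

/* Indirect call: one table lookup replaces a switch on the state. */
static void (*const state_handlers[STATE_COUNT])(void) = {
    actor_idle,   /* STATE_IDLE */
    actor_walk    /* STATE_WALK */
};

void update_actors(void)
{
    for (cur = 0; cur != MAX_ACTORS; cur++) {
        state_handlers[actor_state[cur]]();
    }
}
```

Whether this beats a struct-based version depends on the access pattern, so it is worth comparing the generated assembly for both.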
as for the game design:
1. interleave calculations: actor collision on the odd frames, projectile collisions on the even frames
2. optimize algorithms: simple iteration is not always the best solution
3. precalculate stuff
4. fake things: what you see on screen is not necessarily how it is implemented; that applies to almost any game AI
5. use interrupts for music and sound; lagging music annoys more than a lagging game.
6. optimize the hot path; other stuff, like asset loading when switching between program states, usually does not need it.
7. write the game itself first, then go into details and minor stuff like menus, etc. many projects were buried by poor planning.
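A rough sketch of points 1 and 3, under stated assumptions: the function names, the counters, and the `MAP_ROWS`/`ROW_BYTES` values are hypothetical and only make the example observable. Collision work is split across odd and even frames, and row offsets are computed once into a table instead of being multiplied out every time:

```c
#include <stdint.h>

static uint8_t frame_counter;   /* advanced once per frame */
uint8_t actor_checks;           /* counters exist only to make the sketch observable */
uint8_t projectile_checks;

static void check_actor_collisions(void)      { actor_checks++; }
static void check_projectile_collisions(void) { projectile_checks++; }

/* Call once per frame: each kind of collision runs every other frame. */
void update_collisions(void)
{
    if (frame_counter & 1u) {
        check_actor_collisions();        /* odd frames */
    } else {
        check_projectile_collisions();   /* even frames */
    }
    frame_counter++;
}

#define MAP_ROWS  28u   /* hypothetical map height in rows */
#define ROW_BYTES 64u   /* hypothetical bytes per row */

static uint16_t row_offset[MAP_ROWS];

/* Precalculate once at startup; later code reads row_offset[y]
   instead of computing y * ROW_BYTES. */
void init_row_offsets(void)
{
    uint16_t off = 0;
    uint8_t y;
    for (y = 0; y != MAP_ROWS; y++) {
        row_offset[y] = off;
        off += ROW_BYTES;
    }
}
```

Splitting the work halves the per-frame cost of each check at the price of detecting a collision up to one frame late, which is usually invisible to the player.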
assembly:
1. inline assembly is, in general, not a very good idea. if you write an assembly function, always write the whole function in assembly.
2. custom peephole (PH) rules are not a universal solution either; they do more harm than good, actually.
other:
1. don't reinvent the wheel; usually you can find better solutions made by others. study good open-source games.
2. profile stuff. modern emulators like Emulicious make that task very easy.
3. look at the generated assembly and at how changes in your C code change it. at some point you get a grip on what is good and what is bad when writing code.
|
|
|
- Joined: 05 Sep 2013
- Posts: 3828
- Location: Stockholm, Sweden
|
Posted: Wed Apr 17, 2024 1:54 pm
|
interesting list, I might add a few (minor) points
- favor 8-bit variables/constants over 16-bit (or larger!) ones
- unsigned types are faster than signed ones
- use == , != , < , >= wherever possible, as <= and > are slower
when you need an if/else if/else chain that distinguishes equal, less than and greater than, do this:
if (a==SOMEVAL) {
    // code to run when EQUAL
} else if (a<SOMEVAL) {
    // code to run when LESS THAN
} else {
    // code to run when MORE THAN
}
(if the a<SOMEVAL case is the more frequent outcome, you can put it first and leave a==SOMEVAL second)
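As an illustration of the `<=` versus `<` advice, here is a small sketch with a hypothetical `MAX_LIVES` constant and made-up function names. For an unsigned 8-bit value compared against a constant, `x <= K` can be rewritten as `x < K + 1` whenever `K + 1` still fits the type. The compiler may already do this for constants, so checking the generated assembly is the only way to know if it helps:

```c
#include <stdint.h>

#define MAX_LIVES 9u   /* hypothetical constant */

/* Written with <=, the form described above as slower. */
uint8_t can_gain_life_le(uint8_t lives)
{
    return lives <= (MAX_LIVES - 1u);
}

/* The same test written with <, using 8-bit unsigned operands. */
uint8_t can_gain_life_lt(uint8_t lives)
{
    return lives < MAX_LIVES;
}
```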
|
|
|
- Joined: 19 Oct 2023
- Posts: 140
|
Posted: Wed Apr 17, 2024 3:18 pm
|
Thanks for making this!
What about bitshift rather than multiplication/division?
I've read in various places that the compiler will optimise this itself in certain circumstances but haven't actually checked; I just use bitshifts whenever I can.
|
|
|
- Joined: 09 Aug 2021
- Posts: 131
|
Posted: Wed Apr 17, 2024 3:36 pm
|
badcomputer wrote What about bitshift rather than multiplication/division?
yes, but SDCC also does that for you, and it also turns some multiplications into an unrolled series of additions, like:
a * 12 == a * 8 + a * 4
add hl, hl
add hl, hl
ld d, h
ld e, l
add hl, hl
add hl, de
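For reference, the same decomposition written out in C. This is only a sketch of what the instruction sequence above computes, with a made-up function name `mul12`; you would normally just write `a * 12` and let SDCC generate it:

```c
#include <stdint.h>

/* a * 12 == a * 8 + a * 4, using only doublings and one addition. */
uint16_t mul12(uint16_t a)
{
    uint16_t a4 = (uint16_t)(a << 2);    /* two "add hl, hl": a * 4, copied to DE */
    return (uint16_t)((a4 << 1) + a4);   /* one more doubling gives a * 8, then add a * 4 */
}
```

The result wraps modulo 65536, exactly as the 16-bit register arithmetic does.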
|
|
|
- Joined: 05 Sep 2013
- Posts: 3828
- Location: Stockholm, Sweden
|
Posted: Wed Apr 17, 2024 4:36 pm
|
badcomputer wrote What about bitshift rather than multiplication/division?
as toxa said, multiplications/divisions will likely be optimized automatically as long as the multiplier/divisor is a constant, so if you write
a=b*4;
it's fine - but if you instead write
a=b*c;
even if c is 4, there won't be any optimization.
(it might have been obvious, but I once had a similar discussion with a game developer who was multiplying by a variable map_width whose value could be either 32 or 64 and who assumed the compiler would optimize that... no, it doesn't).
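One possible way around the map_width case, sketched under assumptions (the `tile_index` function and its layout are hypothetical, not taken from any real project): branch on the variable so that each multiplication is by a compile-time constant, which the compiler can then reduce to shifts or additions.

```c
#include <stdint.h>

uint8_t map_width = 32;   /* hypothetical: 32 or 64, set when a map is loaded */

/* Index of tile (x, y) in a row-major map. */
uint16_t tile_index(uint8_t x, uint8_t y)
{
    if (map_width == 32) {
        return (uint16_t)(y * 32u) + x;   /* constant multiplier */
    }
    return (uint16_t)(y * 64u) + x;       /* constant multiplier */
}
```

The branch costs a compare per call, so it is worth profiling against the plain `y * map_width` version before committing to it.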
|
|
|
- Joined: 19 Oct 2023
- Posts: 140
|
Posted: Wed Apr 17, 2024 6:32 pm
|
Is it worth using --opt-code-speed with SDCC? Or any other compiler options that might help?
|
|
|
- Joined: 06 Mar 2022
- Posts: 671
- Location: London, UK
|
Posted: Wed Apr 17, 2024 6:49 pm
|
badcomputer wrote Is it worth using --opt-code-speed with SDCC? Or any other compiler options that might help?
I always turn that on.
I keep meaning to run the experiment to see what difference it makes. I thought it caused more inlining (including of functions), but Toxa said at some point that functions are never automatically inlined, so I'm not sure.
|
|
|
- Joined: 09 Aug 2021
- Posts: 131
|
Posted: Wed Apr 17, 2024 7:06 pm
|
badcomputer wrote Is it worth using --opt-code-speed with SDCC? Or any other compiler options that might help?
I never use that; --max-allocs-per-node with some large value produces better results.
|
|
|
- Joined: 29 Mar 2012
- Posts: 886
- Location: Spain
|
Posted: Thu Apr 18, 2024 7:44 am
|
toxa wrote badcomputer wrote Is it worth using --opt-code-speed with SDCC? Or any other compiler options that might help?
I never use that; --max-allocs-per-node with some large value produces better results.
I haven't tried this, any advice about which value to use?
|
|
|
- Joined: 09 Aug 2021
- Posts: 131
|
Posted: Thu Apr 18, 2024 8:01 am
|
kusfo wrote I haven't tried this, any advice about which value to use?
50000 and above, but that may slow down compilation.
|
|
|
- Joined: 29 Mar 2012
- Posts: 886
- Location: Spain
|
Posted: Thu Apr 18, 2024 8:19 am
|
toxa wrote kusfo wrote I haven't tried this, any advice about which value to use?
50000 and above, but that may slow down compilation.
Thanks! I'll test it. I can enable it only for release builds.
|
|
|
- Joined: 05 Sep 2013
- Posts: 3828
- Location: Stockholm, Sweden
|
Posted: Thu Apr 18, 2024 9:42 am
|
I compile SMSlib for release using
--max-allocs-per-node 100000
but it really takes *a lot* of time and it hardly is much different from what I would get without it. Still, I thought it was worth it for a release build.
|
|
|
- Joined: 09 Aug 2021
- Posts: 131
|
Posted: Thu Apr 18, 2024 9:43 pm
|
sverx wrote I compile SMSlib for release using
SMSlib has very little, very specific C code, and thus not many opportunities to optimize anything
|
|
|
- Joined: 05 Sep 2013
- Posts: 3828
- Location: Stockholm, Sweden
|
Posted: Fri Apr 19, 2024 7:43 am
|
toxa wrote sverx wrote I compile SMSlib for release using
SMSlib has very little, very specific C code, and thus not many opportunities to optimize anything
True, and that's why I said
>> it hardly is much different from what I would get without it
still, I don't get exactly the same output without it, since there are still parts which aren't written in inline asm.
|
|
|