eruiz00 wrote

LITTLE QUESTION: Do those who play the ROMs on their Master System / Genesis (I suppose they have a cartridge with a memory card or something like that...) have SRAM support with those cartridges? The game for this year relies on saved state to advance.

Regards

I have support for SRAM and Flash saving in Gotris, so it can save on EverDrive and emulators on one side, and on the flash used in Doragasu's Frugal Mapper on the other side:

https://gitlab.com/TuxedoGameDevs/gotris/-/blob/master/src/libs/saves.c?ref_type=heads

eruiz00 wrote

It lacks a function to check SRAM availability... or am I wrong?

Yes, there's no function to check if SRAM is present. As a game is theoretically shipped with its own cartridge, it wouldn't make much sense to verify whether some hardware is present... but I see your point.

Anyway, as Maxim said, it's pretty easy. If you write to SRAM and you don't read back what you had just written then it means SRAM isn't there.

Depending on your game anyway I would consider different options:

If the game NEEDS SRAM:
- you could ASSUME it's there - the game will break if the hardware doesn't support it
- you could CHECK if it's there - and display a message that it's impossible to run the game without it

If instead the game uses SRAM only when available, you just have to check whether SRAM is present and use it only if it is.
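
The write-then-read-back check described above could be sketched roughly as follows. This is only a minimal sketch, not code from SMSlib or Gotris: `sram_is_present` is a hypothetical helper, and it assumes the caller has already mapped SRAM into the address the pointer refers to (and unmaps it afterwards).

```c
/* Hypothetical helper: returns 1 if the byte at 'probe' behaves like RAM.
   Assumes cartridge SRAM has already been mapped at that address. */
unsigned char sram_is_present(volatile unsigned char *probe)
{
    unsigned char saved = *probe;   /* keep whatever save data is there */
    unsigned char ok;

    *probe = 0x55;                  /* first test pattern */
    ok = (*probe == 0x55);
    *probe = 0xAA;                  /* inverted pattern rules out a lucky match */
    ok = ok && (*probe == 0xAA);

    *probe = saved;                 /* put the original byte back */
    return ok;
}
```

Using two complementary patterns avoids a false positive when the unmapped location happens to read back the first value, and restoring the byte keeps an existing save intact.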

Note that different hardware could even support different SRAM sizes: the Master EverDrive supports up to 32 KiB (two 16 KiB pages), but custom-made cartridges could have 16 KiB or even just 8 KiB (possibly mirrored across the 16 KiB window).

Well, custom hardware could have one bit of SRAM mirrored everywhere, but that’s not useful :) 8 KB was the most common size seen in original games, though.

It’s worth noting that some emulators won’t emulate SRAM unless the ROM is larger than 48KB, and some may never emulate it.

Maxim wrote

Well, custom hardware could have one bit of SRAM mirrored everywhere

Yes, my point was that it would be a mistake to presume there's a certain amount of SRAM just because some SRAM is present. Then again, as any game ROM already uses hardware presuming it's there (the SEGA mapper, for instance), it could be fair to just presume the hardware has a certain amount of SRAM. One should either ship a cartridge or write on the box "requires a cartridge that features a SEGA mapper with at least X KiB of SRAM", just like in the DOS era when they would write "requires at least 512 KiB RAM, EGA/VGA", for instance.

That’s true. If writing a defensive SRAM detector, you should probably do a full RAM check including size detection/mirroring checks. However on a simplistic level, you can mostly assume there is SRAM support and die ungracefully if it fails, because it’s there in 99% of the potential use cases (emulators, Everdrives).
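
A size/mirroring check of the kind mentioned here might look like the sketch below. It only distinguishes "no SRAM", "8 KiB mirrored twice" and "a full 16 KiB" inside one 16 KiB window; `sram_window_size_kib` is a hypothetical name, the function assumes SRAM is already mapped at `window`, and it does not attempt to detect a second 16 KiB page.

```c
/* Hypothetical helper: SRAM size in KiB visible through one 16 KiB window
   (0 = none, 8 = 8 KiB mirrored twice, 16 = full 16 KiB).
   Assumes SRAM has already been mapped at 'window'. */
unsigned char sram_window_size_kib(volatile unsigned char *window)
{
    volatile unsigned char *lo = window;           /* first byte of the window */
    volatile unsigned char *hi = window + 0x2000;  /* same offset, 8 KiB higher */
    unsigned char saved_lo = *lo;
    unsigned char saved_hi = *hi;
    unsigned char size;

    *lo = 0x55;
    *hi = 0xAA;               /* on a mirrored 8 KiB chip this overwrites *lo */

    if (*hi != 0xAA)
        size = 0;             /* the value did not stick: no SRAM */
    else if (*lo == 0xAA)
        size = 8;             /* low byte followed the high one: mirrored */
    else if (*lo == 0x55)
        size = 16;            /* two independent cells */
    else
        size = 0;

    *hi = saved_hi;           /* restore the original contents */
    *lo = saved_lo;
    return size;
}
```

A more defensive detector would probe several offsets rather than a single pair of bytes.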

eruiz00 wrote

I think it would be nice (if possible, and if it makes sense from a hardware point of view) to have a PSG format which has the info for all the channels in each frame.

I had missed this part. This isn't a good idea for many reasons - first of all the PSG chip has 11 registers so it means that to store all that data plus the frame terminator bytes we would need 12 bytes/frame. But let's pretend for a moment this doesn't matter... rewriting all the PSG registers every frame would make the noise channel continuously re-trigger and this would break the audio.
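
One way to arrive at the figure of 11 is to count the bytes a full register dump needs on the SN76489: three 10-bit tone periods take two bytes each, the noise register takes one byte, and the four volume registers take one byte each. The sketch below builds such a dump using the chip's latch/data byte layout; `psg_state` and `psg_build_full_frame` are made-up names for illustration and are not part of PSGlib or its file format.

```c
/* Hypothetical "dump everything" frame for the SN76489 PSG. */
typedef struct {
    unsigned int  tone[3];    /* 10-bit tone periods, channels 0-2 */
    unsigned char noise;      /* noise control, low 3 bits used */
    unsigned char volume[4];  /* 4-bit attenuation, channels 0-3 */
} psg_state;

/* Writes the full register dump into 'out' and returns its length in bytes. */
unsigned char psg_build_full_frame(const psg_state *s, unsigned char *out)
{
    unsigned char n = 0;
    unsigned char ch;

    for (ch = 0; ch < 3; ch++) {
        /* latch byte 1cc0dddd: channel plus low 4 bits of the period */
        out[n++] = (unsigned char)(0x80 | (ch << 5) | (s->tone[ch] & 0x0F));
        /* data byte 00dddddd: high 6 bits of the period */
        out[n++] = (unsigned char)((s->tone[ch] >> 4) & 0x3F);
    }

    /* noise register: this is the write that restarts the noise generator */
    out[n++] = (unsigned char)(0xE0 | (s->noise & 0x07));

    for (ch = 0; ch < 4; ch++) {
        /* latch byte 1cc1dddd: channel plus 4-bit attenuation */
        out[n++] = (unsigned char)(0x90 | (ch << 5) | (s->volume[ch] & 0x0F));
    }
    return n;   /* 6 tone bytes + 1 noise byte + 4 volume bytes */
}
```

The single noise byte is the problematic one: sending it every frame is what would keep re-triggering the noise channel.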

edit: of course one could create a PSGlib alternative where the format is different (so that it doesn't need the frame terminator byte) and the library keeps track of the status of the noise register, to avoid rewriting the value when not necessary - but then you would have the problem of re-triggering it when the tune requires it...

The discussion of pausing and resuming music is exactly what Phantasy Star does. The way it works is that it has twice as many per-channel structures as it needs; and then the music tracks have a marker which selects which ones are used at any one time. With PSGLib I guess it would require some code changes to make the "variables" it uses be accessed more indirectly, e.g. by putting it all into a struct and then accessing via one of the index registers rather than absolute addresses. Then you'd have a function to set the base address, to swap between them, along with the existing functions to implement fade out, pause and resume.

OK, I don't plan to do that.
What would be possible is probably to create some function to copy the player status into a separate memory space, so as to be able to restore this 'snapshot' later. But first the library needs to track the value in every register of the PSG chip, because currently it's doing it only partially...

I'm speaking without being very knowledgeable about sound and music, but maybe we could have a kind of keyframe (1 every 60 frames?) that writes those 12 bytes needed to set all the registers?

that wouldn't help, the same issues would apply

but if people really need a way to suspend a tune to play a different tune and be able to go back and resume the previous one at the very same point that was interrupted, I'll add this feature.

... I'd consider it a "nice to have" but nothing more...

I do not see many SMS homebrew productions released in the scene, and the ones I do see don't seem to need this feature in order to be considered finished.

Also, I found that the solution sverx gave, having 2-3 tracks of the same song for short songs, is acceptable. By the way, I haven't tried the save/restore status approach mentioned in another post...

I actually tried adding that feature very recently, just to fail miserably - but I learned the lesson so I will take a different approach next time.

So, hopefully this feature will be added soon, but I can't promise it.

(well, of course I went on a completely separate route at the end...)

In PSGlib, I just added support for SFXs using any audio channel(s) you might want to use, even all 4 of them at the same time, instead of just channels 2 and/or 3 (the noise channel).

So this means basically that:

- For example, you can use channel 0 and/or channel 1 for your SFXs, thus not disturbing your bass/drums/noise using channels 2 and 3 (noise). Of course you still have to decide which part(s) of your tunes will be affected, but this update puts a few more options at your disposal.
- If you want to interrupt (pause) the currently playing music to play a different (short?) tune and then resume exactly where you left off, you can do it by stopping (pausing) the music, triggering your 4-channel 'music SFX' while the music is stopped, and resuming the music as soon as this peculiar 'SFX' is over. Even if this isn't a solution for every problem (for instance, you can't trigger an SFX over this 'music SFX'), I think once more this means you've got a few more options.

This update is still in test phase, so you can find it in the PSGlib_dev branch.
The updated vgm2psg tool can be found here instead.

Once this is confirmed to be working fine, it'll be merged into the main branches of their respective repos.

I hope you enjoy this. Let me know your opinion or if you stumble upon any issue.

Oooh, this is good stuff. It finally makes complex multichannel sfx like the coin drops in Wonder Boy in Monsterland possible with PSGlib. Rudimentary speech effects, albeit probably unintelligible, might also be within reach.

Will this update be available for us assembly coders too, or just in connection with devkitSMS?

Kagesan wrote

Will this update be available for us assembly coders too, or just in connection with devkitSMS?

This update is already available in the assembly version too. Check the dev branch of PSGlib's repository :)

I have to use it in the new project... As you saw, it is perfect to avoid having to restart the music between effects, particularly in this kind of game!!!!

That's a nice update!

Just to make sure I am not mixing things up, PSGLib still supports one sound effect at a time, but the sound effect can now use more channels or different ones than previously available, am I correct?

eruiz00 wrote

It is perfect to avoid having to restart the music between effects, particularly in this kind of game!!!!

Yes, you can get something pretty close to what you needed.

armixer24 wrote

Just to make sure I am not mixing things up, PSGLib still supports one sound effect at a time, but the sound effect can now use more channels or different ones than previously available, am I correct?

100% correct. One sound effect, any channel / channels combinations.

vgm2psg usage now is:

vgm2psg inputfile.VGM outputfile.PSG [[0][1][2][3]]
[optional] when converting SFXs, the third parameter specifies which channel(s) should be active, examples:
0 means the SFX is using channel 0 only
1 means the SFX is using channel 1 only
2 means the SFX is using channel 2 only
3 means the SFX is using channel 3 (noise) only
23 means the SFX is using both channel 2 and channel 3 (noise)
123 means the SFX is using channels 1 and 2 and channel 3 (noise)
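
To make the meaning of that third parameter concrete, here is a small sketch that turns a channel list such as "23" into a bitmask. It is not taken from vgm2psg: `parse_sfx_channels` is a hypothetical helper, and the mask layout (bit n set for channel n) is an assumption of this example rather than PSGlib's own encoding.

```c
/* Hypothetical helper: converts a channel list like "23" into a bitmask
   where bit n is set when channel n is used. Returns 0 on invalid input. */
unsigned char parse_sfx_channels(const char *arg)
{
    unsigned char mask = 0;

    if (*arg == '\0')
        return 0;                          /* empty list */

    for (; *arg != '\0'; arg++) {
        if (*arg < '0' || *arg > '3')
            return 0;                      /* only channels 0-3 exist */
        mask |= (unsigned char)(1u << (*arg - '0'));
    }
    return mask;
}
```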

I hope this is not too off topic but are there any guides or articles on writing good C code for SDCC?

I found some bits and pieces on writing C for other cross compilers but I don't know how useful it is.

The source code for Astro Force (https://github.com/SteveProXNA/AstroForceSMS) is particularly interesting. Many of the main game variables and containers seem to be bunched together at global scope, along with all the most commonly used functions grouped together in the same way.

This is completely different to how I would organise my code, but I'm willing to adapt if the gains are worth it.

I would appreciate any hints or tips, or if this is off topic please split it into a new topic.

On this sort of system, heavy use of globals is good, use of structs and passing lots of variables to functions is bad. This is because dynamic allocation, stack use for data and indirection are all rather slow. I’d also advise to lean heavily on pointer maths rather than use array indexing, again to avoid indirection and multiplication. You can debug the generated assembly pretty well in Emulicious and your goal is to have as little assembly per line of code as possible :)

SDCC is also on Compiler Explorer which can be useful for small experiments.

What I noticed is that SDCC usually generates better code when functions are pretty small, so sometimes it's a good idea to break bigger functions into sub-functions, even if too many of them will mean adding call overhead.
Function parameters should be 2 at most to exploit the registers (fastcall), and local variables should be kept to a minimum - usually just a single iterator. As Maxim said, a pointer is often better than array indexing.
But in general I would say the compiler has become pretty good, and the slow parts of the code are often the ones interacting with the VDP and VRAM, so keeping the VRAM accesses sequential whenever possible is the number one rule to avoid wasting valuable cycles.
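
As an illustration of the advice above (globals, few parameters, a single iterator, pointer walking), here are two equivalent loops over a global array. The names are invented for this example, and which form SDCC compiles better is worth checking in the generated assembly rather than assuming.

```c
#define MAX_OBJECTS 8

/* Global: statically allocated, so it can be addressed directly. */
unsigned char object_x[MAX_OBJECTS];

/* Index-based version: the element address may be derived from i each time. */
void move_objects_indexed(unsigned char dx)
{
    unsigned char i;

    for (i = 0; i < MAX_OBJECTS; i++)
        object_x[i] += dx;
}

/* Pointer-based version: one pointer is advanced, no index arithmetic. */
void move_objects_pointer(unsigned char dx)
{
    unsigned char *p = object_x;
    unsigned char n = MAX_OBJECTS;

    do {
        *p += dx;
        p++;
    } while (--n);
}
```

Both functions take a single parameter and use at most two locals, which keeps them within what can reasonably live in registers.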

badcomputer wrote

I hope this is not too off topic but are there any guides or articles on writing good C code for SDCC?

I think maybe more pertinent is guides for writing good C code for Master System. As usual, context is king, and it's the constraints of the system more than SDCC itself as a compiler that determine the best approach.

But anyway, a guide sounds like a great idea - maybe you could start one!?

I'd generalise what Maxim says somewhat: focusing on static allocation is a good first step. The old adage of premature optimisation applies here of course, but deciding on your data storage strategy is a fundamental decision that has cumulative performance implications so you're better making it a policy from day 1.
The nature of these games is such that in any case you can easily model game state as a single static storage unit in most cases.

Static storage can be achieved in 3 main ways:

1. Global variables (which are implicitly static - regardless of whether or not you use the static keyword)
2. Static local variables - declared with the static keyword within a function
3. Effective static storage can also be achieved by declaring automatic variables within a function with lifetime greater than or equal to the data you're modeling. For instance, you can use non-static local variables within the "main" function. These will be allocated on the stack, but only once at the beginning of main, so are effectively static for your purposes.

So you see you don't always need global scope to use static allocation.

Remember also that if you use the const keyword within a function it can trigger the compiler to refer to data from your ROM, rather than automatically copying the data into RAM when the function runs.
I would advise using const liberally, although it does become tedious.
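
The three kinds of static storage listed above, plus const data, can be sketched in a few lines. All names here are hypothetical, and `play_session` is only a stand-in for the role `main` would play in a real game.

```c
/* 1. Global variable: static storage, visible everywhere. */
unsigned char lives = 3;

/* const data can stay in ROM instead of being copied into RAM. */
const unsigned char level_widths[3] = { 32, 64, 48 };

/* 2. Static local: static storage, visible only inside the function. */
unsigned char next_frame_counter(void)
{
    static unsigned char frame = 0;   /* keeps its value between calls */

    return ++frame;
}

/* 3. Automatic variable in a long-lived function (what main would be):
      'score' sits on the stack but is allocated only once per session. */
unsigned int play_session(unsigned char frames)
{
    unsigned int score = 0;

    while (frames--)
        score += level_widths[next_frame_counter() % 3];
    return score;
}
```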

In terms of structs - it depends. Certainly avoid passing structs as parameters by value, which will trigger copy semantics, so if you are passing structs it's almost always better to use a pointer. I think the premature optimisation mantra applies well here though - I personally wouldn't avoid structs entirely unless I could prove that they were hurting performance.

On the array indexing I'm not so sure - the compiler should be able to treat array indexing and pointer arithmetic as equivalent and I would expect that certainly for char arrays in SDCC that the two would be equivalent (note to self: run some basic tests!). Forcing the compiler to do a different type of pointer arithmetic than the array equivalent is kind of tricky as you need to cast to a different element size first.
But in any case, array indexing and pointer arithmetic have different developer experiences and I think there are some situations where using one over the other just makes more sense when reasoning about the code.

On that last point, generally I would say the benefit of using C over assembler for retro gaming is that it lets you get to where you want to get to quicker, but the downside is that the bigger the conceptual gap between your high level and low level code, the more opportunity there is for the compiler to just get it wrong; so my advice would be to keep thinking in assembler, and code in C in such a way that you feel some confidence it's going to compile down to roughly what you would write in assembler. As Maxim says, simply inspecting the compiled code is very useful to keep it on track.

The main high level unit of composition in C is the function, which maps pretty well to call in Z80 assembler. So in general using function calls liberally is like using subroutine calls liberally - i.e. it has some overhead. The compiler can optimise calls away though, and in particular it can inline code. If you set the compiler option to optimise for speed over size it will do this more readily. You can also try and persuade it to do this by using the inline modifier on the function. I find this particularly useful - it means you can keep your functional units in C small without necessarily feeling like you need to inline everything for performance.

One other area worth mentioning specifically - floating point maths is particularly expensive on the Z80 which has no hardware floating point unit. It's fun to see what C generates for floating point operations and in some cases you might find it does a good enough job, but I'd say in a game situation it's likely you'll need to code your own fixed point arithmetic.
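
A minimal sketch of the fixed point idea, assuming an 8.8 format (high byte for whole units, low byte for 1/256ths). The type and macro names are made up for this example, and the right shift is only shown for non-negative values, since shifting negative numbers is implementation-defined in C.

```c
typedef short fix16;                  /* 16 bits on SDCC and on most hosts */

#define FIX_SHIFT      8
#define FIX_ONE        (1 << FIX_SHIFT)
#define INT_TO_FIX(i)  ((fix16)((i) * FIX_ONE))
#define FIX_TO_INT(f)  ((f) >> FIX_SHIFT)      /* non-negative values only */

/* Addition and subtraction are plain integer operations. */
fix16 fix_add(fix16 a, fix16 b)
{
    return (fix16)(a + b);
}

/* Multiplication needs a wider intermediate, then a shift back down. */
fix16 fix_mul(fix16 a, fix16 b)
{
    return (fix16)(((long)a * (long)b) >> FIX_SHIFT);
}
```

For example, adding a velocity of 0x0040 (0.25) to a position four times moves it by exactly one whole unit. The 32-bit multiply is still comparatively costly on a Z80, so additions and shifts are the cheap part of this scheme.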

One last sounding of the bell around premature optimisation: your program only needs to be performant enough to run properly. So you could go around putting inline in front of every function you write, but there's little point if they aren't close to the critical path. Write your game in the way that makes most sense to you (but with static allocation, etc!), and only if it's running too slow go back and try to optimise.

Thanks everyone! This is really useful.

A summary of what's been mentioned so far along with some other obvious bits I gathered:

Quote

General
- avoid floating point maths, code your own fixed point arithmetic
- avoid multiplication and division, prefer bit shifts

Variables
- prefer static storage
-- global variables are implicitly static
-- keep local variables to a minimum, but declare them static if needed
- use const liberally, it can trigger the compiler to refer to data from the ROM rather than copying data into RAM

Functions
- avoid using more than 2 parameters
- avoid passing structs as parameters by value, always pass as pointer if needed
- prefer smaller functions, there is more overhead to call them but SDCC usually generates better code for each
- avoid local variables, e.g. use 1 iterator, re-use variables

Compiler
- set compiler option to optimise for speed rather than size

Assembly
- view generated assembly in Emulicious
- generally the fewer lines the better
- Compiler Explorer for SDCC experiments

Mixed opinion:
- avoid structs
- prefer pointer maths rather than array indexing
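
As a small illustration of the "prefer bit shifts" item: the Master System name table is 32 entries wide with 2 bytes per entry, so a tilemap offset can be built from shifts alone. The function names below are hypothetical, and a compiler may already perform this transformation itself when multiplying by a constant power of two.

```c
/* Byte offset of tile (x, y) in a 32-column name table, 2 bytes per entry.
   Equivalent to (y * 32 + x) * 2, written with shifts only. */
unsigned int tilemap_offset(unsigned char x, unsigned char y)
{
    return ((unsigned int)y << 6) + ((unsigned int)x << 1);
}

/* Pixel coordinate to tile coordinate: divide by 8 with a shift. */
unsigned char pixel_to_tile(unsigned char px)
{
    return px >> 3;
}
```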

Regarding avoiding structs, does this mean copying the data from your struct into a variable and using that, then copying it back, or does it mean avoiding structs (and struct arrays) altogether? For example only using arrays of bytes?

I wouldn't suggest avoiding structs at all costs - after all, the Z80 is great at accessing structs' fields.
Use structs when needed, but prefer byte arrays if you would have defined a struct just to process fields in their order.

edit: also, if you have one local variable that you use for example in a for loop, don't declare it as static: chances are that, if the function/loop is simple, SDCC will use one of the Z80 registers for it, and that would be the fastest option.

See here my unsigned char i variable:

;objects_draw.c:61: for (i=0;i<MAX_OBJECTS;i++)
; skipping iCode since result will be rematerialized
; genAssign
; genMove_o
ld c, #0x00

I suppose someone with good asm knowledge could compare the final code of using multiple arrays of chars/ints/pointers and passing a "counter" variable in function calls against using arrays of structs and passing pointers to those structs...

For me, I have to say I use the latter: structs of customized data everywhere... and pointers as parameters, and everything runs great.

The most CPU-time-consuming task in-game with devkitSMS is often the sprite-related procedures, which can be heavily optimized using 8x16 sprites and the add two/three adjacent sprites functions (four would be great!!!)...

One thing I regret from electronic dreams is having used 8x8 sprites to save as many tiles as possible... Now, several months later, I think I could use 8x16 with minimal changes and considerable performance improvements... Who knows... Maybe some day...

eruiz00 wrote

The most CPU-time-consuming task in-game with devkitSMS is often the sprite-related procedures, which can be heavily optimized using 8x16 sprites and the add two/three adjacent sprites functions (four would be great!!!).

Indeed, placing sprites on screen is a really time-consuming operation, but given that their positions often change every frame, I don't think there's a real practical alternative here.
If a function to draw four adjacent sprites can help, I have no problem adding it - I wasn't even sure anyone was using the two- or three- sprites functions, actually...

sverx wrote

Indeed, placing sprites on screen is a really time-consuming operation, but given that their positions often change every frame, I don't think there's a real practical alternative here.
If a function to draw four adjacent sprites can help, I have no problem adding it - I wasn't even sure anyone was using the two- or three- sprites functions, actually...

I always use SMS_addTwoAdjoiningSprites for 16x16px sprites when I can. I wonder if there's any better way to draw a quad of 4 tiles? I often use them for 16x16px objects that I don't want as sprites, but feel I'm doing it slowly using SMS_setTileatXY x 4?

badcomputer wrote

I wonder if there's any better way to draw a quad of 4 tiles? I often use them for 16x16px objects that I don't want as sprites, but feel I'm doing it slowly using SMS_setTileatXY x 4?

Create an array of unsigned ints with the four tiles you would write and use

SMS_loadTileMapArea (x, y, src, width, height)

which is faster.

In your case for instance you could just do:

const unsigned int BGdata[4] = {32, 33, 34, 35};

and later

SMS_loadTileMapArea (x, y, BGdata, 2, 2);

sverx wrote

badcomputer wrote

I wonder if there's any better way to draw a quad of 4 tiles? I often use them for 16x16px objects that I don't want as sprites, but feel I'm doing it slowly using SMS_setTileatXY x 4?

Create an array of unsigned ints with the four tiles you would write and use
SMS_loadTileMapArea (x, y, src, width, height)
which is faster.

In your case for instance you could just do:

const unsigned int BGdata[4] = {32, 33, 34, 35};

and later

SMS_loadTileMapArea (x, y, BGdata, 2, 2);

Excellent, thank you!

While I'm adding a function to add 4 adjacent sprites - and hoping nobody will need 5 then ;) - I was wondering how often zoomed sprites are used, given they're 'broken' on first revision SMSs (or better: not officially supported and thus not functioning correctly) and totally missing when running on Genesis/Mega Drive hardware.

All this because the adjacent sprites functions could be even a bit faster if support for zoomed sprites is carved away, likely using a compile define such as NO_SPRITE_ZOOM or similar.

Elaborating on this, how often do you need to use both 8×8 and 8×16 sprites? There might be more savings if code always uses just one of these modes.

Any feedback?

Delete the zoom feature dude !!!!!

(But keep the 8x16… I think it's a big performance step up from 8x8 in sprite-heavy games)

eruiz00 wrote

Delete the zoom feature dude !!!!!

LOL! :D :D

eruiz00 wrote

(But keep the 8x16… I think it's a big performance step up from 8x8 in sprite-heavy games)

I never meant to remove that!

In my current (not in C) project I have a need for both 8x16 and 8x8 sprites at different times.

same as Maxim.

I use both 8x8 and 8x16 sprites as well; even if my games are not "action" heavy, both can be convenient for various reasons.

Anything that moves is 8x16 as much as possible to make it faster to update positions.

Newbie question about the line interrupt handler.

I'm using it like this:

SMS_setLineInterruptHandler(&stage_update_palettes);
SMS_setLineCounter(192);
SMS_enableLineInterrupt();

start_new_game();

for ( ;; ) {
    // Game logic
    update();

    // Vblank
    SMS_waitForVBlank();
    SMS_copySpritestoSAT();
    SMS_initSprites();
}

And it's working fine, but if I increase the trigger line to say 216 it doesn't fire. I want it to be lower to avoid the palette cycling corrupting the screen at the bottom. What am I missing here?

The VDP's line interrupt counter is reset outside of the visible screen area, meaning that values of greater than 191 don't do anything useful for you here (assuming you're using a 192 line display!)

If you want to delay some operation to a specific portion of vblank then I guess you have to count cycles, although most of the time you should be fine to do whatever you like in vblank - palette artefacts should only be happening in the active display.

I'm also not sure what the value of setting it to 192 is, as in your example, because I think that's either equivalent to a vblank interrupt, or possibly to none at all (can't remember on which line the interrupts stop happening)

willbritton wrote

The VDP's line interrupt counter is reset outside of the visible screen area, meaning that values of greater than 191 don't do anything useful for you here (assuming you're using a 192 line display!)

If you want to delay some operation to a specific portion of vblank then I guess you have to count cycles, although most of the time you should be fine to do whatever you like in vblank - palette artefacts should only be happening in the active display.

I'm also not sure what the value of setting it to 192 is, as in your example, because I think that's either equivalent to a vblank interrupt, or possibly to none at all (can't remember on which line the interrupts stop happening)

Ah OK thank you Will.

I'm seeing the artifacts on line ~200 (with visible border in Emulicious), I assume that's because there's very little happening in V-blank at the moment. Once I have more in the game I'll try to ensure the palette changes are off screen.

badcomputer wrote

I'm seeing the artifacts on line ~200 (with visible border in Emulicious), I assume that's because there's very little happening in V-blank at the moment. Once I have more in the game I'll try to ensure the palette changes are off screen.

Ah yeah sorry, I forgot about the border area, you're right!
Now you mention it I'm sure I did something similar before, let me try and remember...

EDIT: oh yeah, it was here when I was hacking together some palette fading functions.
You can see I added a comment about pre-calculating some stuff which basically front-loaded some CPU work such that by the time the palettes started changing the border had already been drawn. A bit dodgy but not sure there's a more reliable way. I think there have been a few forum posts in this kind of area before.

Many commercial games have “CRAM dots” in the visible border area. If you have a deterministic amount of work each frame - eg uploading the sprite table - then you can do that before setting the palette to push the dots lower. If your vblank work is more variable then it’s harder to know when to do it - but you could consider checking the line counter multiple times during the vblank and upload the palette only when it’s large enough to hide the dots.

I'm currently diving into the keyboard function of SGlib to implement a proper prompt.
If I use the following snippet I'm getting the same status codes for all keyboard rows.

//return the number and the pressed keys
unsigned char SG_GetKeycode (unsigned int *keys, unsigned char max_keys) {
    unsigned char ret=0;
    unsigned int status;

    for(unsigned char row = 0; row < 8; ++row) {
        SC_PPI_C=row;

        status=~((SC_PPI_A << 8) | SC_PPI_B);
        for(unsigned int bit=0x8000; bit || !status; bit >>= 1) {
            if (status & bit) {
                if (ret < max_keys) {
                    keys[ret]=bit;
                    status -= bit;
                }
                else
                    return ret;
                ++ret;
            }
        }
    }
    return ret;
}

Is there an emulator which is able to test the keyboard functionality of the SC-3000?
I'm currently using Emulicious, which passes through only a small set of keys.

Maxim wrote

Many commercial games have “CRAM dots” in the visible border area. If you have a deterministic amount of work each frame - eg uploading the sprite table - then you can do that before setting the palette to push the dots lower. If your vblank work is more variable then it’s harder to know when to do it - but you could consider checking the line counter multiple times during the vblank and upload the palette only when it’s large enough to hide the dots.

I think it's going to be trial and error to find the right timing to change the palette. I'll keep tweaking it as the game grows, but checking the line counter sounds good too, thanks!

Lost somewhere in this forum (and I couldn't find it again so far) there is info regarding which lines are 'safe' for writing to CRAM without getting the dots. I think it's worth putting that in the wiki if we find that info again.

edit: found it! It's here for NTSC and this is for PAL

In short: lines 216-234 (inclusive) for NTSC and lines 240-258 (inclusive) for PAL

What I usually do is that I put a scanline poll at the end of my vblank routine and make it wait for a specific line if a flag is set that indicates a palette change. I then just write to cram when the line counter has reached the correct line.

The downside is that you’re wasting precious vblank time doing basically nothing, but then it only happens during frames with palette changes, which aren’t too many, usually.

The correct lines to trigger the palette change are different for NTSC and PAL, so I combine the scanline counter with a TV type value I check at startup.

Unfortunately, I’m at work atm, but I can look up working target scanlines when I get home. After all, it’s not necessary that everyone has to do trial and error only to arrive at the same result.

dark wrote

I'm currently diving into the keyboard function of SGlib to implement a proper prompt.
If I use the following snippet I'm getting the same status codes for all keyboard rows.

Here you can check how I read *some* keys - it's been tested working, so you can probably check what's different in your code.

edit: reading your snippet once again, I see it's going through all the keyboard rows but the values it returns when some keys are pressed don't have that information anywhere.
We could try to fix it, but I have no idea how the keyboard really works... if there are never more than 13 keys on a row you could use the upper three bits of the integer it returns to store the row number 0-7...

edit2: it might work, according to this, the upper 4 bits of PPI_B are never used, so you could start with bit=0x0800 and have

keys[ret]=bit | (row<<12);

edit3: of course you need to swap the order from

status=~((SC_PPI_A << 8) | SC_PPI_B);

to

status=~((SC_PPI_B << 8) | SC_PPI_A);

so that the unused 4 bits are the msb.

edit4: if all this works, you might want to switch to returning one unsigned char per key pressed, which you can do by returning the row number in 3 bits and the key number in 4 bits in a

0 r r r k k k k

form by having an additional counter k in the for loop, going from 0 to 11 - or the other way around if you prefer.
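
The one-byte `0rrrkkkk` form suggested in edit4 could be packed and unpacked as below. This is only a sketch of the bit layout; the helper names are hypothetical and none of this is part of SGlib.

```c
/* Packs a keyboard row (0-7) and a key position within that row (0-11)
   into one byte laid out as 0rrrkkkk. */
unsigned char key_pack(unsigned char row, unsigned char key)
{
    return (unsigned char)(((row & 0x07) << 4) | (key & 0x0F));
}

/* Extracts the row number (bits 4-6). */
unsigned char key_row(unsigned char code)
{
    return (code >> 4) & 0x07;
}

/* Extracts the key position within the row (bits 0-3). */
unsigned char key_index(unsigned char code)
{
    return code & 0x0F;
}
```

With this layout every key gets a distinct code, so the caller no longer has to guess which row a returned bit came from.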

Kagesan wrote

The correct lines to trigger the palette change are different for NTSC and PAL, so I combine the scanline counter with a TV type value I check at startup.

... or you could just check when the counter changes to a lower value, this happens at 0xDA (goes 'back' to 0xD5) on NTSC and at 0xF2 (goes 'back' to 0xBA) on PAL.
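
The "wait until the counter goes down" idea can be tried out on a host machine with a small simulation. The sketch below models the counter using only the jump values quoted above (0xDA to 0xD5 on NTSC, 0xF2 to 0xBA on PAL) and assumes 262 and 313 lines per frame respectively; both function names are hypothetical, and on real hardware the value would be read from the VDP's V counter instead of `vcounter_at`.

```c
/* Simulated 8-bit V counter value for a raw scanline number. */
unsigned char vcounter_at(unsigned int line, unsigned char is_pal)
{
    unsigned int last_before_jump = is_pal ? 0xF2 : 0xDA;
    unsigned int first_after_jump = is_pal ? 0xBA : 0xD5;

    if (line <= last_before_jump)
        return (unsigned char)line;
    return (unsigned char)(first_after_jump + (line - last_before_jump - 1));
}

/* Returns the first raw line after 'from_line' on which the counter reads
   lower than on the previous line, or 0 if there is none in this frame. */
unsigned int first_counter_drop(unsigned int from_line, unsigned char is_pal)
{
    unsigned int total = is_pal ? 313 : 262;
    unsigned int line = from_line;
    unsigned char prev = vcounter_at(line, is_pal);
    unsigned char cur;

    for (line++; line < total; line++) {
        cur = vcounter_at(line, is_pal);
        if (cur < prev)
            return line;      /* same test works for NTSC and PAL */
        prev = cur;
    }
    return 0;
}
```

The comparison `cur < prev` needs no TV type flag, which is the appeal of this approach over hard-coding a target line per region.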

sverx wrote

Kagesan wrote

The correct lines to trigger the palette change are different for NTSC and PAL, so I combine the scanline counter with a TV type value I check at startup.

... or you could just check when the counter changes to a lower value, this happens at 0xDA (goes 'back' to 0xD5) on NTSC and at 0xF2 (goes 'back' to 0xBA) on PAL.

That's actually pretty clever. Funny, 217 (NTSC) and 242 (PAL) are exactly the values I always use.

Kagesan wrote

That's actually pretty clever. Funny, 217 (NTSC) and 242 (PAL) are exactly the values I always use.

That's because it happens when the 'beam' moves back to the top of the screen, so it doesn't really draw any pixel. Anyway whatever the approach, the important thing to avoid CRAM dots is not to write to the palettes during the active part and while the VDP renders the top and bottom borders.

Hi there!

I'm trying to write a utility function that would copy 256 bytes to the VDP tiles based on the UNSAFE_SMS_VRAMmemcpy functions.

__sfr __at 0xBE VDPDataPort2;

#define SETVDPDATAPORT2 __asm ld c,#_VDPDataPort2 __endasm

void UNSAFE_SMS_VRAMmemcpy256 (unsigned int dst, const void *src)
{
    SMS_setAddr(0x4000|dst);
    SETVDPDATAPORT2;
    OUTI128(src);

    SMS_setAddr(0x4000|(dst + 128)); // or 256?
    SETVDPDATAPORT2;
    OUTI128(src + 64);
}

At the moment it's not working for me, but there are some parts I'm unsure of.

First, I needed access to the vdpdataport. The VDPDataPort is only declared internally, so I duplicated one locally above.

Second, I'm not sure if it's safe to do those three steps (setaddr, setvdpdataport, outi) repeatedly. So far it doesn't seem to work correctly. The first half seems to copy to the VDP but the second goes out into space somewhere. If I call UNSAFE_SMS_VRAMmemcpy128 () twice in a row with the correct offsets it works, but not if I do the above.

Any idea? Thanks!

What's the +64 for? Shouldn't that be +128?

Presumably you can simply do something like the following:

SMS_setAddr(0x4000|dst);
SETVDPDATAPORT2;
OUTI128(src);
OUTI128(src + 128);

Or even

UNSAFE_SMS_VRAMmemcpy128(dst, src);
OUTI128(src + 128);

If the compiler decides to inline the initial call to UNSAFE_SMS_VRAMmemcpy128 they should be equivalent.

EDIT: and to risk stating the obvious, the whole thing is inherently "unsafe" in that you won't be able to call it when the VDP is in active display, so you need to make sure you're calling it in the right place if you want it to work.

EDIT2: Also, calling OUTI128 again reloads the hl register, so in theory you could cut that overhead out by inlining some assembly which emits 128 more outi instructions after the call to UNSAFE_SMS_VRAMmemcpy128:

.rept 128
outi
.endm

Author	Message
kusfo Joined: 29 Mar 2012 Posts: 886 Location: Spain	Posted: Sun Jan 28, 2024 9:13 am
	eruiz00 wrote LITTLE QUESTION: Those who play the roms in their Master System / Genesis (I suppose they have a cartridge with a memory card or something like....) have SRAM support with those cartridges? the game for this year relies on save status to advance. Regards I've support for SRAM and Flash saving in Gotris, so it can save in Everdrive and Emulators on one side, and on the flash used in Doragasu's Frugal Mapper on the other side: https://gitlab.com/TuxedoGameDevs/gotris/-/blob/master/src/libs/saves.c?ref_type=heads

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Sun Jan 28, 2024 10:32 am
	eruiz00 wrote It lacks a function to check SRAM availibity.... or I am wrong? Yes, there's no function to check if SRAM is present. As a game is theoretically shipped with his own cartridge it wouldn't make much sense to verify if some hardware is present... but I see your point. Anyway, as Maxim said, it's pretty easy. If you write to SRAM and you don't read back what you had just written then it means SRAM isn't there. Depending on your game anyway I would consider different options: if the game NEEDS SRAM: - you could ASSUME it's there - game will break if hardware doesn't support it - you could CHECK if it's there - display a message that it's impossible to run the game without it if the game uses SRAM if available instead you just have to check if SRAM is available and use it only if it is present Note that different hardware could even support different SRAM sizes: the Master EverDrive supports up to 32 KiB (two 16 KiB pages) but custom made cartridges could have 16 KiB or even just 8 KiB (even mirrored in 16 KiB).

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14745 Location: London	Posted: Sun Jan 28, 2024 11:27 am
	Well, custom hardware could have one bit of SRAM mirrored everywhere, but that’s not useful :) but 8KB was the most common seen in original games. It’s worth noting that some emulators won’t emulate SRAM unless the ROM is larger than 48KB, and some may never emulate it.

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Sun Jan 28, 2024 11:35 am
	Maxim wrote Well, custom hardware could have one bit of SRAM mirrored everywhere Yes, my point was that to presume that there's a certain amount of SRAM just because some SRAM is present it would be a mistake. As any game ROM already uses hardware presuming it's there (the SEGA mapper, for instance) then it could be fair just to presume the hardware has a certain amount of SRAM. One should either ship a cartridge or write on the box "requires a cartridge that features a SEGA mapper with at least X KiB of SRAM" just like in the DOS era when they would write "requires at least 512 KiB RAM, EGA/VGA" for instance.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14745 Location: London	Posted: Sun Jan 28, 2024 1:06 pm
	That’s true. If writing a defensive SRAM detector, you should probably do a full RAM check including size detection/mirroring checks. However on a simplistic level, you can mostly assume there is SRAM support and die ungracefully if it fails, because it’s there in 99% of the potential use cases (emulators, Everdrives).
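A minimal sketch of the write/read-back test with a basic mirroring check, written so that it runs on a host. Everything here is an assumption for illustration: the callback types, the function names and the three simulated cartridges are hypothetical, and on real hardware the callbacks would access the SRAM window after it has been mapped in through the mapper. Only one 16 KiB window is probed, so a second page such as the one on the Master EverDrive is not detected.

```c
#include <stdint.h>

/* Bus access is abstracted so the sketch can run on a host. */
typedef uint8_t (*read_fn)(uint16_t offset);
typedef void (*write_fn)(uint16_t offset, uint8_t value);

/* Write two complementary patterns, read them back, restore the byte. */
static int byte_is_writable(read_fn rd, write_fn wr, uint16_t offset)
{
    uint8_t saved = rd(offset);
    int ok;

    wr(offset, 0x55);
    ok = (rd(offset) == 0x55);
    wr(offset, 0xAA);
    ok = ok && (rd(offset) == 0xAA);
    wr(offset, saved);
    return ok;
}

/* Returns 0 (no SRAM), 8 (8 KiB, possibly mirrored) or 16 (KiB)
   for a single 16 KiB window. */
static unsigned detect_sram_kib(read_fn rd, write_fn wr)
{
    uint8_t saved_lo, saved_hi;
    int mirrored;

    if (!byte_is_writable(rd, wr, 0x0000))
        return 0;
    if (!byte_is_writable(rd, wr, 0x2000))
        return 8;                        /* upper half not writable */

    saved_lo = rd(0x0000);
    saved_hi = rd(0x2000);
    wr(0x0000, 0x55);
    wr(0x2000, 0xAA);
    mirrored = (rd(0x0000) == 0xAA);     /* second write hit the same cell */
    wr(0x2000, saved_hi);
    wr(0x0000, saved_lo);
    return mirrored ? 8u : 16u;
}

/* Host-side stand-ins for three kinds of cartridge (illustration only). */
static uint8_t sim_mem[0x4000];
static uint8_t rd_full(uint16_t o)              { return sim_mem[o & 0x3FFF]; }
static void    wr_full(uint16_t o, uint8_t v)   { sim_mem[o & 0x3FFF] = v; }
static uint8_t rd_mirror(uint16_t o)            { return sim_mem[o & 0x1FFF]; }
static void    wr_mirror(uint16_t o, uint8_t v) { sim_mem[o & 0x1FFF] = v; }
static uint8_t rd_none(uint16_t o)              { (void)o; return 0xFF; }
static void    wr_none(uint16_t o, uint8_t v)   { (void)o; (void)v; }
```

Saving and restoring each probed byte keeps the check from destroying an existing save, which matters if it runs on every boot.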

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Mon Jan 29, 2024 8:58 am
	eruiz00 wrote I think it would be nice (if possible and has sense from a hardware point of view) to have a psg format which have the info for all the channels in each frame. I had missed this part. This isn't a good idea for many reasons - first of all the PSG chip has 11 registers so it means that to store all that data plus the frame terminator bytes we would need 12 bytes/frame. But let's pretend for a moment this doesn't matter... rewriting all the PSG registers every frame would make the noise channel continuously re-trigger and this would break the audio. edit: of course one could create an PSGlib alternative where the format is different (so that it doesn't need the frame terminator byte) and the library keeps track of what's the status of the noise register, to avoid rewriting the value when not necessary - but then you would have the problem of re-triggering it when the tune requires it...

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14745 Location: London	Posted: Mon Jan 29, 2024 9:20 am
	The discussion of pausing and resuming music is exactly what Phantasy Star does. The way it works is that it has twice as many per-channel structures as it needs; and then the music tracks have a marker which selects which ones are used at any one time. With PSGLib I guess it would require some code changes to make the "variables" it uses be accessed more indirectly, e.g. by putting it all into a struct and then accessing via one of the index registers rather than absolute addresses. Then you'd have a function to set the base address, to swap between them, along with the existing functions to implement fade out, pause and resume.

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Mon Jan 29, 2024 2:03 pm
	OK, I don't plan to do that. What would be possible is probably to create some function to copy the player status in a separate memory space, so that to be able to restore this 'snapshot' later. But first the library needs to track the value in every register of the PSG chip, because currently it's doing it only partially...

kusfo Joined: 29 Mar 2012 Posts: 886 Location: Spain	Posted: Mon Jan 29, 2024 2:54 pm
	I'm speaking without being very knowledgeable about sound and music, but maybe we could have some kind of keyframes (1 every 60 frames?) that write those 12 bytes needed to set all the registers?

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Mon Jan 29, 2024 3:07 pm
	That wouldn't help, the same issues would apply. But if people really need a way to suspend a tune, play a different one, and then resume the previous one at the very point where it was interrupted, I'll add this feature.

eruiz00 Joined: 28 Jan 2017 Posts: 556 Location: Málaga, Spain	Posted: Fri Feb 09, 2024 10:51 am
	... I'd consider it a "nice to have" but nothing more.... I do not see many SMS homebrew productions released in the scene, and those I do see don't seem to be missing only this feature to be considered finished. Also, I found that the solution Sverx gave, having 2-3 tracks of the same song for short songs, is acceptable; btw I have not tried the save-restore status that was mentioned in another post...

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Fri Feb 09, 2024 11:10 am
	I actually tried adding that feature very recently, just to fail miserably - but I learned the lesson so I will take a different approach next time. So, hopefully this feature will be added soon, but I can't promise it.

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Fri Feb 16, 2024 3:38 pm
	(well, of course I went on a completely separate route in the end...) In PSGlib, I just added support for SFXs using any audio channel(s) you might want to use, even all 4 of them at the same time, instead of just channels 2 and/or 3 (the noise channel). This basically means two things. First, you can for example use channel 0 and/or channel 1 for your SFXs, thus not disturbing your bass/drums/noise using channels 2 and 3 (noise). Of course you still have to decide which part(s) of your tunes will be affected, but this update puts a few more options at your disposal. Second, if you want to interrupt (pause) the currently playing music to play a different (short?) tune and then resume exactly where you left off, you can do it by stopping (pausing) the music, triggering your 4-channel 'music SFX' while the music is stopped, and resuming the music as soon as this peculiar 'SFX' is over. Even if this isn't a solution for every problem (for instance you can't trigger an SFX over this 'music SFX'), I think once more this means you've got a few more options. This update is still in the test phase, so you can find it in the PSGlib_dev branch. The updated vgm2psg tool can be found here instead. Once it is confirmed to be working fine, it'll be merged into the main branches of their respective repos. I hope you enjoy this. Let me know your opinion or if you stumble upon any issue.

Kagesan Joined: 01 Feb 2014 Posts: 878	Posted: Fri Feb 16, 2024 3:54 pm
	Oooh, this is good stuff. It finally makes complex multichannel sfx like the coin drops in Wonder Boy in Monsterland possible with PSGlib. Rudimentary speech effects, albeit probably unintelligible, might also be within reach. Will this update be available for us assembly coders too, or just in connection with devkitSMS?

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Fri Feb 16, 2024 3:58 pm
	Kagesan wrote Will this update be available for us assembly coders too, or just in connection with devkitSMS? This update is already available in the assembly version too. Check the dev branch of PSGlib's repository :)

eruiz00 Joined: 28 Jan 2017 Posts: 556 Location: Málaga, Spain	Posted: Fri Feb 16, 2024 4:10 pm
	I have to use it in the new project... As you saw, it is perfect to avoid having to start the music between effects, particularly in this kind of game!!!!

armixer24 Joined: 12 Aug 2021 Posts: 74	Posted: Fri Feb 16, 2024 4:41 pm
	That's a nice update! Just to make sure I am not mixing things up, PSGLib still supports one sound effect at a time, but the sound effect can now use more channels or different ones than previously available, am I correct?

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Fri Feb 16, 2024 5:14 pm
	eruiz00 wrote It is perfect to avoid having to start the music between effects, particularly in this kind of game!!!! Yes, you can get something pretty close to what you needed. armixer24 wrote Just to make sure I am not mixing things up, PSGLib still supports one sound effect at a time, but the sound effect can now use more channels or different ones than previously available, am I correct? 100% correct. One sound effect, any channel / channels combinations. vgm2psg usage now is:
vgm2psg inputfile.VGM outputfile.PSG [[0][1][2][3]]
[optional] when converting SFXs, the third parameter specifies which channel(s) should be active, examples:
0 means the SFX is using channel 0 only
1 means the SFX is using channel 1 only
2 means the SFX is using channel 2 only
3 means the SFX is using channel 3 (noise) only
23 means the SFX is using both channel 2 and channel 3 (noise)
123 means the SFX is using channels 1 and 2 and channel 3 (noise)

badcomputer Joined: 19 Oct 2023 Posts: 138	Posted: Mon Mar 04, 2024 5:20 pm
	I hope this is not too off topic but are there any guides or articles on writing good C code for SDCC? I found some bits and pieces on writing C for other cross compilers but I don't know how useful it is. The source code for Astro Force (https://github.com/SteveProXNA/AstroForceSMS) is particularly interesting. Many of the main game variables and containers seem to be bunched together at global scope, along with all the most commonly used functions grouped together in the same way. This is completely different to how I would organise my code, but I'm willing to adapt if the gains are worth it. I would appreciate any hints or tips, or if this is off topic please split it into a new topic.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14745 Location: London	Posted: Mon Mar 04, 2024 10:34 pm
	On this sort of system, heavy use of globals is good, use of structs and passing lots of variables to functions is bad. This is because dynamic allocation, stack use for data and indirection are all rather slow. I’d also advise to lean heavily on pointer maths rather than use array indexing, again to avoid indirection and multiplication. You can debug the generated assembly pretty well in Emulicious and your goal is to have as little assembly per line of code as possible :) SDCC is also on Compiler Explorer which can be useful for small experiments.

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Tue Mar 05, 2024 9:35 am
	What I noticed is that SDCC usually generates better code when functions are pretty small, so sometimes it's a good idea to break bigger functions into sub-functions, even if too many of them will add call overhead. Function parameters should be 2 at most to exploit the registers (fastcall) and local variables should be kept to a minimum - usually just a single iterator. As Maxim said, often a pointer is better than using array indexing. But in general I would say the compiler has become pretty good and the slow parts of the code are often the ones interacting with the VDP and VRAM, so keeping the VRAM accesses sequential whenever possible is the number one rule to avoid wasting valuable cycles.
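To make the "globals plus pointer walking" advice concrete, here is a small sketch in plain C. The struct layout, MAX_OBJECTS value and function names are made up for illustration. Both functions compute the same result; whether the pointer version really compiles to less Z80 code depends on the SDCC version and options, so it is worth checking the generated assembly rather than assuming.

```c
#include <stdint.h>

#define MAX_OBJECTS 8

typedef struct {
    uint8_t x;
    uint8_t y;
    uint8_t alive;
} object_t;

/* Global, statically allocated game state, as suggested above. */
static object_t objects[MAX_OBJECTS];

/* Indexed version: each objects[i] may cost an index-times-size computation. */
static uint8_t count_alive_indexed(void)
{
    uint8_t i, n = 0;

    for (i = 0; i < MAX_OBJECTS; i++)
        if (objects[i].alive)
            n++;
    return n;
}

/* Pointer version: the pointer simply advances by one element per step. */
static uint8_t count_alive_pointer(void)
{
    const object_t *p = objects;
    uint8_t i, n = 0;

    for (i = MAX_OBJECTS; i != 0; i--, p++)
        if (p->alive)
            n++;
    return n;
}
```

Neither function takes parameters and both use a single 8-bit loop counter, which follows the "few parameters, few locals" guidance from the posts above.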

willbritton Joined: 06 Mar 2022 Posts: 671 Location: London, UK	Posted: Tue Mar 05, 2024 9:36 am
	badcomputer wrote I hope this is not too off topic but are there any guides or articles on writing good C code for SDCC? I think maybe more pertinent is guides for writing good C code for Master System. As usual, context is king, and it's the constraints of the system more than SDCC itself as a compiler that determine the best approach. But anyway, a guide sounds like a great idea - maybe you could start one!? I'd generalise what Maxim says somewhat: focusing on static allocation is a good first step. The old adage of premature optimisation applies here of course, but deciding on your data storage strategy is a fundamental decision that has cumulative performance implications so you're better making it a policy from day 1. The nature of these games is such that in any case you can easily model game state as a single static storage unit in most cases. Static storage can be achieved in 3 main ways: 1. Global variables (which are implicitly static - regardless of whether or not you use the static keyword) 2. Static local variables - declared with the static keyword within a function 3. Effective static storage can also be achieved by declaring automatic variables within a function with lifetime greater than or equal to the data you're modeling. For instance, you can use non-static local variables within the "main" function. These will be allocated on the stack, but only once at the beginning of main, so are effectively static for your purposes. So you see you don't always need global scope to use static allocation. Remember also that if you use the const keyword within a function it can trigger the compiler to refer to data from your ROM, rather than automatically copying the data into RAM when the function runs. I would advise using const liberally, although it does become tedious. In terms of structs - it depends. 
Certainly avoid passing structs to parameters by value which will trigger copy semantics, so if you are passing structs it's almost always better to use a pointer. I think the premature optimisation mantra applies well here though - I personally wouldn't avoid structs entirely unless I could prove that they were hurting performance. On the array indexing I'm not so sure - the compiler should be able to treat array indexing and pointer arithmetic as equivalent and I would expect that certainly for char arrays in SDCC that the two would be equivalent (note to self: run some basic tests!). Forcing the compiler to do a different type of pointer arithmetic than the array equivalent is kind of tricky as you need to cast to a different element size first. But in any case, array indexing and pointer arithmetic have different developer experiences and I think there are some situations where using one over the other just makes more sense when reasoning about the code. On that last point, generally I would say the benefit of using C over assembler for retro gaming is that it lets you get to where you want to get to quicker, but the downside is that the bigger the conceptual gap between your high level and low level code, the more opportunity there is for the compiler to just get it wrong; so my advice would be to keep thinking in assembler, and code in C in such a way that you feel some confidence it's going to compile down to roughly what you would write in assembler. As Maxim says, simply inspecting the compiled code is very useful to keep it on track. The main high level unit of composition in C is the function, which maps pretty well to call in Z80 assembler. So in general using function calls liberally is like using subroutine calls liberally - i.e. it has some overhead. The compiler can optimise calls away though, and in particular it can inline code. If you set the compiler option to optimise for speed over size it will do this more readily. 
You can also try and persuade it to do this by using the inline modifier on the function. I find this particularly useful - it means you can keep your functional units in C small without necessarily feeling like you need to inline everything for performance. One other area worth mentioning specifically - floating point maths is particularly expensive on the Z80 which has no hardware floating point unit. It's fun to see what C generates for floating point operations and in some cases you might find it does a good enough job, but I'd say in a game situation it's likely you'll need to code your own fixed point arithmetic. One last sounding of the bell around premature optimisation: your program only needs to be performant enough to run properly. So you could go around putting inline in front of every function you write, but there's little point if they aren't close to the critical path. Write your game in the way that makes most sense to you (but with static allocation, etc!), and only if it's running too slow go back and try to optimise.
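As an illustration of the "code your own fixed point arithmetic" suggestion, here is a minimal 8.8 fixed point sketch (8 integer bits, 8 fractional bits). The type and function names are hypothetical, and the format is just one common choice for sub-pixel positions; nothing here comes from devkitSMS.

```c
#include <stdint.h>

typedef uint16_t pos8_8;   /* unsigned position: pixel in the high byte */
typedef int16_t  vel8_8;   /* signed velocity in 1/256ths of a pixel */

#define FIX_ONE 256        /* 1.0 in 8.8 format */

/* Advance a position by a signed velocity; wraps around at 256 pixels. */
static pos8_8 pos_step(pos8_8 pos, vel8_8 vel)
{
    return (pos8_8)(pos + (uint16_t)vel);
}

/* Whole-pixel part of a position: just the high byte. */
static uint8_t pos_pixel(pos8_8 pos)
{
    return (uint8_t)(pos >> 8);
}

/* Multiply two signed 8.8 values using a 32-bit intermediate. */
static int16_t fix_mul(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * b) / FIX_ONE);
}
```

Adding a velocity is a plain 16-bit addition and reading the pixel is a single byte access, which is why this format suits the Z80 well. The multiplication is the expensive part because of the 32-bit intermediate, so it is best kept out of inner loops.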

badcomputer Joined: 19 Oct 2023 Posts: 138	Posted: Tue Mar 05, 2024 4:43 pm
	Thanks everyone! This is really useful. A summary of what's been mentioned so far along with some other obvious bits I gathered:
General
- avoid floating point maths, code your own fixed point arithmetic
- avoid multiplication and division, prefer bit shifts
Variables
- prefer static storage
-- global variables are implicitly static
-- keep local variables to a minimum, but declare them static if needed
- use const liberally, it can trigger the compiler to refer to data from the ROM rather than copying data into RAM
Functions
- avoid using more than 2 parameters
- avoid passing structs as parameters by value, always pass as pointer if needed
- prefer smaller functions, there is more overhead to call them but SDCC usually generates better code for each
- avoid local variables, e.g. use 1 iterator, re-use variables
Compiler
- set compiler option to optimise for speed rather than size
Assembly
- view generated assembly in Emulicious
- generally the fewer lines the better
- Compiler Explorer for SDCC experiments
Mixed opinion:
- avoid structs
- prefer pointer maths rather than array indexing
Regarding avoiding structs, does this mean copying the data from your struct into a variable and using that, then copying it back, or does it mean avoiding structs (and struct arrays) altogether? For example only using arrays of bytes?

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Tue Mar 05, 2024 5:34 pm
	I wouldn't suggest avoiding structs at all costs - after all, the Z80 is great at accessing structs' fields. Use structs when needed, but prefer byte arrays if you would have defined a struct just to process fields in their order. edit: also if you have one local variable that you use for example in a for loop, don't declare it as static, as there are chances that if the function/loop is simple, SDCC will use one of the Z80 registers for that, and that would be the fastest option. See here my unsigned char i variable:
;objects_draw.c:61: for (i=0;i<MAX_OBJECTS;i++)
; skipping iCode since result will be rematerialized
; genAssign
; genMove_o
ld c, #0x00

eruiz00 Joined: 28 Jan 2017 Posts: 556 Location: Málaga, Spain	Posted: Wed Mar 06, 2024 5:58 am
	I suppose someone with good asm knowledge could compare the final code of using multiple arrays of chars/ints/pointers and passing a "counter" char in function calls against using arrays of structs and passing pointers to those structs... For my part, I use the latter: structs of customized data everywhere and pointers as parameters, and everything runs great. The most CPU time consuming task in-game with smsdevkit is often the sprite related procedures, which can be heavily optimized using 8x16 sprites and the add two/three adjacent sprites functions (four would be great!!!)... One thing I regret from electronic dreams is having used 8x8 sprites to save as many tiles as possible.... Now, several months later, I think I could use 8x16 with minimal changes and considerable performance improvements... Who knows.... Maybe some day....

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Wed Mar 06, 2024 12:47 pm
	eruiz00 wrote The most CPU time consuming task in-game with smsdevkit is often the sprite related procedures, which can be heavily optimized using 8x16 sprites and the add two/three adjacent sprites functions (four would be great!!!). Indeed, placing sprites on screen is a really time consuming operation, but given that their positions often change every frame, I don't think there's a real practical alternative here. If a function to draw four adjacent sprites can help, I have no problem adding it - I wasn't even sure anyone was using the two- or three-sprite functions, actually...

badcomputer Joined: 19 Oct 2023 Posts: 138	Posted: Wed Mar 06, 2024 5:07 pm
	sverx wrote Indeed, placing sprites on screen is a really time consuming operation, but given that their positions often change every frame, I don't think there's a real practical alternative here. If a function to draw four adjacent sprites can help, I have no problem adding it - I wasn't even sure anyone was using the two- or three-sprite functions, actually... I always use SMS_addTwoAdjoiningSprites for 16x16px sprites when I can. I wonder if there's any better way to draw a quad of 4 tiles? I often use them for 16x16px objects that I don't want as sprites, but feel I'm doing it slowly using SMS_setTileatXY x 4?

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Wed Mar 06, 2024 5:32 pm
	badcomputer wrote I wonder if there's any better way to draw a quad of 4 tiles? I often use them for 16x16px objects that I don't want as sprites, but feel I'm doing it slowly using SMS_setTileatXY x 4? Create an array of unsigned ints with the four tiles you would write and use SMS_loadTileMapArea (x, y, src, width, height) which is faster. In your case for instance you could just do: unsigned int const BGdata[]={32,33,34,35}; and later SMS_loadTileMapArea (x, y, BGdata, 2, 2);

badcomputer Joined: 19 Oct 2023 Posts: 138	Posted: Wed Mar 06, 2024 7:13 pm
	sverx wrote badcomputer wrote I wonder if there's any better way to draw a quad of 4 tiles? I often use them for 16x16px objects that I don't want as sprites, but feel I'm doing it slowly using SMS_setTileatXY x 4? Create an array of unsigned ints with the four tiles you would write and use SMS_loadTileMapArea (x, y, src, width, height) which is faster. In your case for instance you could just do: unsigned int const BGdata[]={32,33,34,35}; and later SMS_loadTileMapArea (x, y, BGdata, 2, 2); Excellent, thank you!
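To see why a single area load can beat four separate tile writes, it helps to look at where the entries land in VRAM. The sketch below is a host-side model only, not the library's implementation: it assumes the name table sits at 0x3800 with 32 two-byte entries per row, and `tilemap_addr`, `model_load_area` and the `vram` array are hypothetical names.

```c
#include <stdint.h>

#define PNT_BASE 0x3800u   /* assumed name table base address in VRAM */
#define PNT_COLS 32u       /* tilemap entries per row, 2 bytes each */

/* VRAM address of the tilemap entry at column x, row y. */
static uint16_t tilemap_addr(uint8_t x, uint8_t y)
{
    return (uint16_t)(PNT_BASE + ((uint16_t)y * PNT_COLS + x) * 2u);
}

/* Stand-in for the VDP's 16 KiB of video RAM. */
static uint8_t vram[0x4000];

/* Copy a width x height block of entries: one address computation per
   row, then sequential writes for every entry in that row. */
static void model_load_area(uint8_t x, uint8_t y, const uint16_t *src,
                            uint8_t width, uint8_t height)
{
    uint8_t row, col;

    for (row = 0; row < height; row++) {
        uint16_t addr = tilemap_addr(x, (uint8_t)(y + row));
        for (col = 0; col < width; col++) {
            vram[addr++] = (uint8_t)(*src & 0xFF);   /* low byte first */
            vram[addr++] = (uint8_t)(*src >> 8);
            src++;
        }
    }
}

static const uint16_t BGdata[] = {32, 33, 34, 35};
```

For a 2x2 quad this sets the address twice instead of four times, and the data within each row is written sequentially, which matches the earlier advice about keeping VRAM accesses sequential.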

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Fri Mar 08, 2024 12:57 pm Last edited by sverx on Fri Mar 08, 2024 1:00 pm; edited 1 time in total
	While I'm adding a function to add 4 adjacent sprites - and hoping nobody will need 5 then ;) - I was wondering how often zoomed sprites are used, given they're 'broken' on first revision SMSs (or better: not officially supported and thus not functioning correctly) and totally missing when running on a Genesis/MegaDrive hardware. All this because the adjacent sprites functions could be even a bit faster if support for zoomed sprites is carved away, likely using a compile define such as NO_SPRITE_ZOOM or similar. Elaborating on this, how often do you need to use both 8×8 and 8×16 sprites? There might be more savings if code always uses just one of these modes. Any feedback?

eruiz00 Joined: 28 Jan 2017 Posts: 556 Location: Málaga, Spain	Posted: Fri Mar 08, 2024 12:58 pm
	Delete the zoom feature dude !!!!! (But keep the 8x16… I think it is a big performance step up over 8x8 in sprite-heavy games)

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Fri Mar 08, 2024 1:00 pm
	eruiz00 wrote Delete the zoom feature dude !!!!! LOL! :D :D eruiz00 wrote (But keep the 8x16… I think it is a big performance step up over 8x8 in sprite-heavy games) I never meant to remove that!

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14745 Location: London	Posted: Fri Mar 08, 2024 1:28 pm
	In my current (not in C) project I have a need for both 8x16 and 8x8 sprites at different times.

ichigobankai Joined: 04 Jul 2010 Posts: 542 Location: Angers, France	Posted: Fri Mar 08, 2024 4:15 pm
	same as Maxim.

cireza Joined: 27 Feb 2023 Posts: 136 Location: France	Posted: Fri Mar 08, 2024 6:26 pm
	I use both 8x8 and 8x16 sprites as well. Even if my games are not "action" heavy, both can be convenient for various reasons. Anything that moves is 8x16 as much as possible to make it faster to update positions.

badcomputer Joined: 19 Oct 2023 Posts: 138	Posted: Sat Mar 09, 2024 12:17 pm
	Newbie question about the line interrupt handler. I'm using it like this:
SMS_setLineInterruptHandler(&stage_update_palettes);
SMS_setLineCounter(192);
SMS_enableLineInterrupt();
start_new_game();
for ( ;; )
{
    // Game logic
    update();
    // Vblank
    SMS_waitForVBlank();
    SMS_copySpritestoSAT();
    SMS_initSprites();
}
And it's working fine, but if I increase the trigger line to say 216 it doesn't fire. I want it to be lower to avoid the palette cycling corrupting the screen at the bottom. What am I missing here?

willbritton Joined: 06 Mar 2022 Posts: 671 Location: London, UK	Posted: Sat Mar 09, 2024 12:42 pm
	The VDP's line interrupt counter is reset outside of the visible screen area, meaning that values of greater than 191 don't do anything useful for you here (assuming you're using a 192 line display!) If you want to delay some operation to a specific portion of vblank then I guess you have to count cycles, although most of the time you should be fine to do whatever you like in vblank - palette artefacts should only be happening in the active display. I'm also not sure what the value of setting it to 192 is, as in your example, because I think that's either equivalent to a vblank interrupt, or possibly to none at all (can't remember on which line the interrupts stop happening)

badcomputer Joined: 19 Oct 2023 Posts: 138	Posted: Sat Mar 09, 2024 1:02 pm
	willbritton wrote The VDP's line interrupt counter is reset outside of the visible screen area, meaning that values of greater than 191 don't do anything useful for you here (assuming you're using a 192 line display!) If you want to delay some operation to a specific portion of vblank then I guess you have to count cycles, although most of the time you should be fine to do whatever you like in vblank - palette artefacts should only be happening in the active display. I'm also not sure what the value of setting it to 192 is, as in your example, because I think that's either equivalent to a vblank interrupt, or possibly to none at all (can't remember on which line the interrupts stop happening) Ah OK thank you Will. I'm seeing the artifacts on line ~200 (with visible border in Emulicious), I assume that's because there's very little happening in V-blank at the moment. Once I have more in the game I'll try to ensure the palette changes are off screen.

willbritton Joined: 06 Mar 2022 Posts: 671 Location: London, UK	Posted: Sat Mar 09, 2024 1:15 pm
	badcomputer wrote I'm seeing the artifacts on line ~200 (with visible border in Emulicious), I assume that's because there's very little happening in V-blank at the moment. Once I have more in the game I'll try to ensure the palette changes are off screen. Ah yeah sorry, I forgot about the border area, you're right! Now you mention it I'm sure I did something similar before, let me try and remember... EDIT: oh yeah, it was here when I was hacking together some palette fading functions. You can see I added a comment about pre-calculating some stuff which basically front-loaded some CPU work such that by the time the palettes started changing the border had already been drawn. A bit dodgy but not sure there's a more reliable way. I think there have been a few forum posts in this kind of area before.

Maxim Site Admin Joined: 19 Oct 1999 Posts: 14745 Location: London	Posted: Sat Mar 09, 2024 7:20 pm
	Many commercial games have “CRAM dots” in the visible border area. If you have a deterministic amount of work each frame - eg uploading the sprite table - then you can do that before setting the palette to push the dots lower. If your vblank work is more variable then it’s harder to know when to do it - but you could consider checking the line counter multiple times during the vblank and upload the palette only when it’s large enough to hide the dots.

dark Joined: 09 Jan 2012 Posts: 67 Location: Germany	Posted: Sun Mar 10, 2024 8:11 pm
	I'm currently diving into the keyboard function of SGlib to implement a proper prompt. If I use the following snippet I'm getting the same status codes for all keyboard rows.
//return the number and the pressed keys
unsigned char SG_GetKeycode (unsigned int *keys, unsigned char max_keys)
{
    unsigned char ret=0;
    unsigned int status;
    for(unsigned char row = 0; row < 8; ++row)
    {
        SC_PPI_C=row;
        status=~((SC_PPI_A << 8) | SC_PPI_B);
        for(unsigned int bit=0x8000; bit || !status; bit >>= 1)
        {
            if (status & bit)
            {
                if (ret < max_keys)
                {
                    keys[ret]=bit;
                    status -= bit;
                }
                else
                    return ret;
                ++ret;
            }
        }
    }
    return ret;
}
Is there an emulator which is able to test the keyboard functionality of the SC-3000? I'm currently using Emulicious, which passes through only a small set of keys.

badcomputer Joined: 19 Oct 2023 Posts: 138	Posted: Mon Mar 11, 2024 12:00 am
	Maxim wrote Many commercial games have “CRAM dots” in the visible border area. If you have a deterministic amount of work each frame - eg uploading the sprite table - then you can do that before setting the palette to push the dots lower. If your vblank work is more variable then it’s harder to know when to do it - but you could consider checking the line counter multiple times during the vblank and upload the palette only when it’s large enough to hide the dots. I think it's going to be trial and error to find the right timing to change the palette. I'll keep tweaking it as the game grows, but checking the line counter sounds good too, thanks!

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Mon Mar 11, 2024 9:21 am Last edited by sverx on Mon Mar 11, 2024 11:12 am; edited 4 times in total
	Lost somewhere in this forum (and I couldn't find it again so far) there is info regarding which lines are 'safe' for writing to CRAM without getting the dots. I think it's worth putting that in the wiki if we find that info again.

	edit: found it! It's here for NTSC and this is for PAL. In short: lines 216-234 (inclusive) for NTSC and lines 240-258 (inclusive) for PAL.
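The ranges above can be written down as a small predicate. This is only a minimal sketch: it assumes the line number is counted from the first active line (line 0) and that the TV type is already known. `tv_type` and `is_cram_safe_line` are made-up names for illustration, not part of SMSlib.

```c
#include <stdbool.h>

/* Hypothetical TV type flag, e.g. detected once at startup. */
typedef enum { TV_NTSC, TV_PAL } tv_type;

/* True if 'line' falls in the range quoted above as safe for CRAM writes:
   216-234 inclusive on NTSC, 240-258 inclusive on PAL. */
bool is_cram_safe_line(tv_type tv, unsigned int line) {
    if (tv == TV_PAL)
        return line >= 240 && line <= 258;
    return line >= 216 && line <= 234;
}
```

Note that this works on line numbers, not raw V counter values, which repeat part of their range later in the frame.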

Kagesan Joined: 01 Feb 2014 Posts: 878	Posted: Mon Mar 11, 2024 9:22 am
	What I usually do is that I put a scanline poll at the end of my vblank routine and make it wait for a specific line if a flag is set that indicates a palette change. I then just write to cram when the line counter has reached the correct line. The downside is that you’re wasting precious vblank time doing basically nothing, but then it only happens during frames with palette changes, which aren’t too many, usually.

	The correct lines to trigger the palette change are different for NTSC and PAL, so I combine the scanline counter with a TV type value I check at startup.

	Unfortunately, I’m at work atm, but I can look up working target scanlines when I get home. After all, it’s not necessary that everyone has to do trial and error only to arrive at the same result.

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Mon Mar 11, 2024 9:48 am
	dark wrote

	I'm currently diving into the keyboard function of SGlib to implement a proper prompt. If I use the following snippet I'm getting the same status codes for all keyboard rows.

	Here you can check how I read some keys - it's been tested working, so you can probably check what's different in your code.

	edit: reading your snippet once again, I see it's going through all the keyboard rows but the values it returns when some keys are pressed don't carry the row information anywhere. We could try to fix it, but I have no idea how the keyboard really works... if there are never more than 13 keys on a row you could use the upper three bits of the integer it returns to store the row number 0-7...

	edit2: it might work, according to this, the upper 4 bits of PPI_B are never used, so you could start with bit=0x0800 and have keys[ret]=bit | (row<<12);

	edit3: of course you need to swap the order from status=~((SC_PPI_A << 8) | SC_PPI_B); to status=~((SC_PPI_B << 8) | SC_PPI_A); so that the unused 4 bits are the msb.

	edit4: if all this works, you might want to switch to returning one unsigned char per key pressed, which you can do by returning the row number in 3 bits and the key number in 4 bits in a 0 r r r k k k k form by having an additional counter k in the for loop, going from 0 to 11 - or the other way around if you prefer.
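A minimal sketch of the one-byte-per-key packing (0 r r r k k k k) suggested in the post above. To keep it testable off hardware it takes the eight raw row readings as an array instead of touching the PPI; on an SC-3000 each entry would presumably be filled by selecting the row on PPI port C and combining ports B and A with B as the high byte. The function name and constants are hypothetical, and it assumes keys read active low in the low 12 bits of each row.

```c
#include <stdint.h>

#define KEY_ROWS     8   /* rows selectable through PPI port C (assumed) */
#define KEYS_PER_ROW 12  /* low 12 bits of (PPI_B << 8) | PPI_A (assumed) */

/* raw[row] is the value read for that row; a 0 bit means "key pressed".
   Stores up to max_keys codes in 0rrrkkkk form and returns how many were stored. */
uint8_t pack_pressed_keys(const uint16_t raw[KEY_ROWS], uint8_t *keys, uint8_t max_keys) {
    uint8_t count = 0;
    for (uint8_t row = 0; row < KEY_ROWS; ++row) {
        /* Invert so that 1 means pressed, and drop the 4 unused upper bits. */
        uint16_t pressed = (uint16_t)~raw[row] & 0x0FFF;
        /* Stop early once no pressed bits remain in this row. */
        for (uint8_t k = 0; k < KEYS_PER_ROW && pressed; ++k, pressed >>= 1) {
            if (pressed & 1) {
                if (count == max_keys)
                    return count;
                keys[count++] = (uint8_t)((row << 4) | k);
            }
        }
    }
    return count;
}
```

The row is then `code >> 4` and the key index within the row is `code & 0x0F`, so a lookup table indexed by the code can map straight to characters.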

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Mon Mar 11, 2024 4:49 pm
	Kagesan wrote

	The correct lines to trigger the palette change are different for NTSC and PAL, so I combine the scanline counter with a TV type value I check at startup.

	... or you could just check when the counter changes to a lower value, this happens at 0xDA (goes 'back' to 0xD5) on NTSC and at 0xF2 (goes 'back' to 0xBA) on PAL.
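The "counter goes backwards" check could look roughly like this. It is a sketch only: the reader is passed in as a function pointer so the code also runs on a desktop, `wait_for_vcount_jump` and the fake counter are invented for illustration, and on hardware the reader would wrap whatever V counter read your library offers. It assumes it is called during vblank before the jump; otherwise the first decrease it sees is the 0xFF to 0x00 wrap at the top of the frame.

```c
#include <stdint.h>

/* Any function that returns the current V counter value. */
typedef uint8_t (*vcount_reader)(void);

/* Polls until the counter value decreases; returns the first value read after the jump.
   Needs no TV type: the jump is 0xDA -> 0xD5 on NTSC and 0xF2 -> 0xBA on PAL. */
uint8_t wait_for_vcount_jump(vcount_reader read_vcount) {
    uint8_t prev = read_vcount();
    for (;;) {
        uint8_t cur = read_vcount();
        if (cur < prev)
            return cur;
        prev = cur;
    }
}

/* Desktop stand-in for the hardware counter (hypothetical): NTSC 192-line mode,
   262 lines per frame, advancing one line per read, starting at line 192. */
static uint16_t fake_line = 192;

uint8_t fake_ntsc_vcount(void) {
    uint16_t line = fake_line;
    fake_line = (uint16_t)((fake_line + 1) % 262);
    /* Lines 0..218 read 0x00..0xDA, lines 219..261 read 0xD5..0xFF. */
    return (uint8_t)(line <= 0xDA ? line : line - 6);
}
```

On real hardware the same value is read many times per line, which is harmless here since only a strictly lower value ends the loop.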

Kagesan Joined: 01 Feb 2014 Posts: 878	Posted: Mon Mar 11, 2024 5:08 pm
	sverx wrote

	Kagesan wrote

	The correct lines to trigger the palette change are different for NTSC and PAL, so I combine the scanline counter with a TV type value I check at startup.

	... or you could just check when the counter changes to a lower value, this happens at 0xDA (goes 'back' to 0xD5) on NTSC and at 0xF2 (goes 'back' to 0xBA) on PAL.

	That's actually pretty clever. Funny, 217 (NTSC) and 242 (PAL) are exactly the values I always use.

sverx Joined: 05 Sep 2013 Posts: 3828 Location: Stockholm, Sweden	Posted: Tue Mar 12, 2024 8:53 am
	Kagesan wrote

	That's actually pretty clever. Funny, 217 (NTSC) and 242 (PAL) are exactly the values I always use.

	That's because it happens when the 'beam' moves back to the top of the screen, so it doesn't really draw any pixel. Anyway, whatever the approach, the important thing to avoid CRAM dots is not to write to the palettes during the active part or while the VDP renders the top and bottom borders.

pw Joined: 10 Aug 2023 Posts: 33	Posted: Wed Mar 13, 2024 2:48 am
	Hi there! I'm trying to write a utility function that would copy 256 bytes to the VDP tiles based on the UNSAFE_SMS_VRAMmemcpy functions.

	    __sfr __at 0xBE VDPDataPort2;
	    #define SETVDPDATAPORT2 __asm ld c,#_VDPDataPort2 __endasm

	    void UNSAFE_SMS_VRAMmemcpy256 (unsigned int dst, const void *src) {
	      SMS_setAddr(0x4000|dst);
	      SETVDPDATAPORT2;
	      OUTI128(src);
	      SMS_setAddr(0x4000|(dst + 128)); // or 256?
	      SETVDPDATAPORT2;
	      OUTI128(src + 64);
	    }

	At the moment it's not working for me, but there are some parts I'm unsure of. First, I needed access to the VDP data port. The VDPDataPort is only declared internally, so I duplicated one locally above. Second, I'm not sure if it's safe to do those three steps (setaddr, setvdpdataport, outi) repeatedly.

	So far it doesn't seem to work correctly. The first half seems to copy to VDP but the second goes out into space somewhere. If I call two UNSAFE_SMS_VRAMmemcpy128 () with the correct offsets in a row it works, but not if I do the above. Any idea? Thanks!

willbritton Joined: 06 Mar 2022 Posts: 671 Location: London, UK	Posted: Wed Mar 13, 2024 8:42 am
	What's the +64 for? Shouldn't that be +128? Presumably you can simply do something like the following:

	    SMS_setAddr(0x4000|dst);
	    SETVDPDATAPORT2;
	    OUTI128(src);
	    OUTI128(src + 128);

	Or even

	    UNSAFE_SMS_VRAMmemcpy128(dst, src);
	    OUTI128(src + 128);

	If the compiler decides to inline the initial call to UNSAFE_SMS_VRAMmemcpy128 they should be equivalent.

	EDIT: and to risk stating the obvious, the whole thing is inherently "unsafe" in that you won't be able to call it when the VDP is in active display, so you need to make sure you're calling it in the right place if you want it to work.

	EDIT2: Also calling OUTI128 again does reload the hl register so in theory you could cut that overhead out by inlining some assembly which emitted 128 more outi's after the call to UNSAFE_SMS_VRAMmemcpy128:

	    .rept 128
	    outi
	    .endm
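The reason a second SMS_setAddr isn't needed is that the VDP increments its address after every data port write, so consecutive writes land in consecutive VRAM bytes. Below is a tiny desktop model of that behaviour, not SMSlib code: all names are hypothetical stand-ins, and `copy128` only mimics what a run of 128 outi instructions does.

```c
#include <stddef.h>
#include <stdint.h>

/* Desktop model of the VDP's 16 KiB VRAM and its write address register. */
uint8_t vram[0x4000];
static uint16_t vram_addr;

/* Stand-in for SMS_setAddr(0x4000|dst): keep only the 14 address bits. */
void model_set_addr(uint16_t cmd) {
    vram_addr = cmd & 0x3FFF;
}

/* One data port write: store the byte, then auto-increment the address. */
void model_write_data(uint8_t b) {
    vram[vram_addr] = b;
    vram_addr = (uint16_t)((vram_addr + 1) & 0x3FFF);
}

/* Mimics a run of 128 outi instructions reading from src. */
void copy128(const uint8_t *src) {
    for (size_t i = 0; i < 128; ++i)
        model_write_data(src[i]);
}

/* Set the address once, then stream both 128-byte halves back to back. */
void model_vram_copy256(uint16_t dst, const uint8_t *src) {
    model_set_addr(0x4000 | dst);
    copy128(src);
    copy128(src + 128);
}
```

The source offset for the second half is +128 bytes, matching the first half's length; the destination needs no adjustment at all.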


View topic - devkitSMS - develop your homebrew in C