View topic - devkitSMS - develop your homebrew in C

  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Wed Mar 13, 2024 9:13 am
if you're trying to avoid the overhead of calling UNSAFE_SMS_VRAMmemcpy128 twice to achieve the same, you could probably do something like:
void UNSAFE_SMS_VRAMmemcpy256 (unsigned int dst, const void *src) {
  SMS_setAddr(0x4000|dst);
  SETVDPDATAPORT;
  OUTI128(src);
  OUTI128((unsigned char *)src+128);
}

because you don't need to set the VDP address again, you just need to queue another 128 OUTIs.

edit: I see now that Will already replied with the same stuff.
Sorry mate I didn't see that before (and I have no idea why...)
  • Joined: 09 Aug 2021
  • Posts: 131
Post Posted: Thu Mar 14, 2024 7:36 am
i think if you declare:

const void * OUTI128(const void *src) __z88dk_fastcall;

then you can do:

  OUTI128(OUTI128(src));

because in the z88dk_fastcall calling convention the parameter and the return value are passed in the same registers.
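
A minimal sketch of the chaining idea, written so it also compiles on a desktop compiler. The names fake_OUTI128, fake_VRAMmemcpy256 and fake_vram are hypothetical stand-ins, not devkitSMS functions, and the loop only imitates what 128 OUTI instructions would do. It assumes, as described above, that the function hands back the advanced source pointer as its return value.

```c
#include <stddef.h>

/* Outside SDCC the calling-convention keyword does not exist, so define it
   away; this lets the sketch compile with a desktop compiler too. */
#ifndef __SDCC
#define __z88dk_fastcall
#endif

/* Hypothetical stand-in for the VDP data port: bytes "sent" land here. */
static unsigned char fake_vram[256];
static size_t fake_vram_pos = 0;

/* Stand-in for OUTI128: sends 128 bytes and returns the advanced pointer,
   imitating the source pointer being left just past the data. */
const void *fake_OUTI128(const void *src) __z88dk_fastcall
{
    const unsigned char *p = (const unsigned char *)src;
    unsigned char i;
    for (i = 0; i < 128; i++)
        fake_vram[fake_vram_pos++] = *p++;
    return p;
}

/* 256 bytes in one go: the outer call consumes what the inner one returned,
   so no pointer arithmetic is needed between the two halves. */
void fake_VRAMmemcpy256(const void *src)
{
    fake_vram_pos = 0;
    fake_OUTI128(fake_OUTI128(src));
}
```

Calling fake_VRAMmemcpy256 on a 256-byte buffer copies the whole buffer in order; on real hardware the address setup (SMS_setAddr and SETVDPDATAPORT in the earlier post) would still be needed before the first call.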
  • Joined: 09 Aug 2021
  • Posts: 131
Post Posted: Thu Mar 14, 2024 7:40 am
willbritton wrote
If the compiler decides to inline the initial call to UNSAFE_SMS_VRAMmemcpy128 they should be equivalent.

AFAIK, SDCC never inlines unless you explicitly tell it to, and only at the function level.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Thu Mar 14, 2024 10:06 am
eruiz00 wrote
The most cpu time consuming task in-game with smsdevkit often is the sprite related procedures, which can be heavily optimized using 8x16 sprites and the add two/three adjacent sprites functions (four would be great!!!)...


SMS_addFourAdjoiningSprites just added. I suppose if you need more than that, you probably want to switch to metasprites (see SMS_addMetaSprite)
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Fri Mar 15, 2024 1:09 pm
... and of course there was a stupid bug in the recently added SMS_addFourAdjoiningSprites.
Please update again, especially if you plan to use that.
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Sun Mar 17, 2024 3:49 am
Thanks for the help, everybody.

To answer a question from earlier, I use 8x16 sprites with no zoom. Also I use the adjoining sprites functions. A while back I needed a SMS_addFourAdjoiningSprites so I made one. I’ll use the new one once I update.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Sun Mar 17, 2024 10:10 am
pw wrote
I use 8x16 sprites with no zoom.


I'm considering adding a define to compile the whole library with a single sprite mode (either 8×8 or 8×16) so that every SMS_add*AdjoiningSprites is slightly faster as it doesn't have to adapt to multiple modes.
I still have to figure out if it's worth it; it probably is.

pw wrote
A while back I needed a SMS_addFourAdjoiningSprites so I made one. I’ll use the new one once I update.


Well, don't assume mine is faster - you should actually test both and in case yours is faster please let me know! :D
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Sun Mar 17, 2024 10:18 am
I also just pushed an update for those who prefer recompiling the library themselves with NO_SPRITE_CHECKS for faster sprite functions.
(and NO_SPRITE_ZOOM will save a few more cycles in the SMS_add*AdjoiningSprites in case you don't ever need to use any sprite zoom mode). Hopefully I didn't break anything...
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Sun Mar 17, 2024 3:10 pm
Can I ask what NO_SPRITE_CHECKS disables?

I mean... how do I know if I can omit these checks? (If the game is the kind of game that uses sprites in a way that does not need the checks...)
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Sun Mar 17, 2024 3:48 pm
To use the adjoining sprites functions, you have to pack the x and tile values into one parameter, which when passed sets them into registers. I saw that there have been changes to function call conventions in SDCC a while back. Is the packing still more efficient than passing all three Y, X, Tile parameters individually?
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Sun Mar 17, 2024 4:20 pm
eruiz00 wrote
Can I ask what NO_SPRITE_CHECKS disables?


SMS_addSprite and the SMS_add*AdjoiningSprites functions check if there are enough sprites still available, so that you don't end up trying to use more than 64 (which would overflow the arrays), and they make sure you don't place a sprite list terminator by mistake (a sprite whose Y is equal to 209 would turn into the sprite list terminator value 0xD0), which would cause all the subsequent sprites not to appear on screen. SMS_addSprite also returns the sprite 'id' (0 to 63) of the added sprite, or the values -1 (no more sprites available) or -2 (invalid Y coordinate) in case the sprite can't be added.

All the above are a sort of 'safety net'. But if you know for sure that you never use more than 64 sprites *and* your code already ensures that you'll never put a sprite at Y=209, and you don't care about the sprite 'id' either... you can remove that safety net and enjoy a performance boost, in case of the SMS_addSprite it's from 162 down to 116 cycles each call, but there's less savings with the other functions.
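
To make the 'safety net' concrete, here is a rough sketch of what a checked and an unchecked sprite add could look like. This is not the SMSlib source: the function and array names are hypothetical, and storing Y minus one is an assumption inferred from the statement that Y=209 becomes 0xD0.

```c
#define MAX_SPRITES 64
#define SPRITE_LIST_TERMINATOR 0xD0

/* Hypothetical shadow copies of the sprite attribute table. */
static unsigned char sprite_y[MAX_SPRITES];
static unsigned char sprite_xn[MAX_SPRITES * 2];
static unsigned char sprite_next_free = 0;

/* Checked variant: returns the sprite id (0 to 63), -1 when the table is
   full, or -2 when the stored Y would be read as the list terminator. */
signed char add_sprite_checked(unsigned char x, unsigned char y,
                               unsigned char tile)
{
    unsigned char stored_y = (unsigned char)(y - 1);  /* assumed offset */
    if (sprite_next_free >= MAX_SPRITES)
        return -1;
    if (stored_y == SPRITE_LIST_TERMINATOR)
        return -2;
    sprite_y[sprite_next_free] = stored_y;
    sprite_xn[sprite_next_free * 2] = x;
    sprite_xn[sprite_next_free * 2 + 1] = tile;
    return (signed char)sprite_next_free++;
}

/* Unchecked variant: same stores, no tests and no return value. The caller
   must guarantee fewer than 64 sprites and never pass y == 209. */
void add_sprite_unchecked(unsigned char x, unsigned char y,
                          unsigned char tile)
{
    sprite_y[sprite_next_free] = (unsigned char)(y - 1);
    sprite_xn[sprite_next_free * 2] = x;
    sprite_xn[sprite_next_free * 2 + 1] = tile;
    sprite_next_free++;
}
```

The two comparisons and the return value are the only difference between the variants, which is where the cycle savings would come from.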

pw wrote
To use the adjoining sprites functions, you have to pack the x and tile values into one parameter, which when passed sets them into registers. I saw that there have been changes to function call conventions in SDCC a while back. Is the packing still more efficient than passing all three Y, X, Tile parameters individually?


Unfortunately, SDCC currently still doesn't offer any calling convention that uses registers for more than 2 parameters - you can see that in the SDCC manual, at page 72 - so any additional parameter would end up on the stack. Since on the Z80 a 16 bit register is actually just a pair of two 8 bit registers, if you ensure that each one doesn't 'spill' into the other you can combine two chars into one int parameter with no additional cost. But of course I hope one day we'll have more options here and be able to pick a different fastcall convention that doesn't require our code to do these workarounds.
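
As an illustration of the packing described above, a small sketch follows. PACK_X_TILE and add_two_sprites_packed are hypothetical names, and which byte holds x and which holds the tile is an assumption made for the example rather than something taken from the library.

```c
/* Hypothetical helper: x in the high byte, tile in the low byte. The 0xFF
   masks keep each value from 'spilling' into the other half. */
#define PACK_X_TILE(x, tile) \
    ((unsigned int)((((x) & 0xFF) << 8) | ((tile) & 0xFF)))

/* Where the sketch records what it received, so the result can be checked. */
static unsigned char last_x, last_y, last_tile;

/* Two parameters only: an 8 bit Y and a 16 bit value carrying X and tile.
   Splitting the 16 bit value back is just reading its two halves. */
void add_two_sprites_packed(unsigned char y, unsigned int x_tile)
{
    last_y = y;
    last_x = (unsigned char)(x_tile >> 8);
    last_tile = (unsigned char)(x_tile & 0xFF);
}
```

A call such as add_two_sprites_packed(100, PACK_X_TILE(200, 42)) passes three logical values through two parameters.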

edit: actually I might have misunderstood your question. I hope you're using the SMS_add*AdjoiningSprites(x,y,tile) macros, not the SMS_add*AdjoiningSprites_f functions directly...
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Sun Mar 17, 2024 4:40 pm
I'm calling the adjoining sprite functions directly, using a macro to pack the x and tile that doesn't do the 0xFF masking because I already know they're 8 bit values.

Actually I'm using slightly customized versions of the adjoining sprite functions, which assume the sprites are 8 pixels wide.

I haven't compared them with the new devkitSMS versions yet.

You can see part of what I'm doing in a Twitter post I made about optimization progress.
https://twitter.com/pw_32x/status/1769393522682065057
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Sun Mar 17, 2024 4:55 pm
A mask operation with 0xFF on an 8 bit variable won't add any overhead, as SDCC does understand that that's not a real operation - I just needed that in the macro to ensure that, whatever the passed variable type was, only 8 bits would be considered - so you're probably not gaining anything by skipping the provided macros.

Regarding the 8 pixels wide sprite size, that's what I also recently added, with NO_SPRITE_ZOOM.
That's very little savings unfortunately, but hey...
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Post Posted: Sun Mar 17, 2024 5:00 pm
I can’t see any further Twitter threads, but I think it’d be useful to see the assembly generated for your problematic “self time” and see if there’s ways to improve it. For example, instead of a switch, in assembly you’d write code more like

index--;
if (index == 0) functionFor1();
index--;
if (index == 0) functionFor2();

…etc. The code also seems to be using pointers to structs which probably turns into index registers; this might also generate some large/slow assembly.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Sun Mar 17, 2024 5:09 pm
Maxim wrote
For example, instead of a switch, in assembly you’d write code more like

index--;
if (index == 0) functionFor1();
index--;
if (index == 0) functionFor2();

…etc.


Hard to tell when a switch starts to perform better/worse than a bunch of IFs - but, for what it's worth, SDCC often uses jump tables.

If the switch parameter is an unsigned char, then the best option is to have values 0-3 and, if it's impossible to have values other than those, have a default case instead of the last one:
switch (uc) {
  case 0: func0(); break;
  case 1: func1(); break;
  case 2: func2(); break;
  default: func3(); break;
}


This is because the compiler creates code that first verifies the min and max values, to build the smallest jump table possible, then creates labels for skipping to the end and to the default (if present), so in this case the jump table will have only 3 entries (0 to 2) and in every other case it'll just jump to default. Also, if an IF-ELSE-IF-ELSE-IF were faster, current SDCC *should* be able to create the same code instead of using a jump table.
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Sun Mar 17, 2024 5:35 pm
sverx wrote
A mask operation with 0xFF on an 8 bit variable won't add any overhead, as SDCC does understand that that's not a real operation - I just needed that in the macro to ensure that, whatever the passed variable type was, only 8 bits would be considered - so you're probably not gaining anything by skipping the provided macros.


Using the Emulicious profiler, it looked like it made a very slight improvement, so I kept the change. Could've been just the margin of error on the profiling, though.


Maxim wrote
I can’t see any further Twitter threads, but I think it’d be useful to see the assembly generated for your problematic “self time” and see if there’s ways to improve it.
The code also seems to be using pointers to structs which probably turns into index registers; this might also generate some large/slow assembly.



C version
https://www.dropbox.com/scl/fi/3sa60ksc3652b9xy6lql8/DrawUtils_DrawBatched.c?rlk...

Asm generated code
https://www.dropbox.com/scl/fi/9hq21ykwsyuc3ebc4b2f4/DrawUtils_DrawBatched_gener...

I wouldn't even begin to know how to rewrite it to better assembly.

The animation data is indeed stored in structs, as you'll see. I'm aware of the popular opinion that "structs are bad".

One idea would be to convert the BatchedAnimationSpriteStrip to a giant array, as the values in the struct type are 8bit each. A runner goes through the values, feeding the function call. But I'd have to increment the runner and dereference it every time I want to use the next value. I don't know if this would be better than using a pointer struct, in terms of performance. It would have to be tried.
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Sun Mar 17, 2024 5:43 pm
sverx wrote


If the switch parameter is an unsigned char, then the best option is to have values 0-3 and, if it's impossible to have values other than those, have a default case instead of the last one:
switch (uc) {
  case 0: func0(); break;
  case 1: func1(); break;
  case 2: func2(); break;
  default: func3(); break;
}



The switch value is an unsigned char, yes.

I've tried all kinds of variations, like adding a case 0, using 0-3 instead of 1-4, adding a default, etc and the current layout generates the fastest code so far. In the current code, a 0 signifies the end of the drawing. I've also tried to return at case 0 instead of break, but that seemed to worsen the performance by an unexpectedly large amount. I won't profess to be an expert on what the compiler is doing. I try variations and look at what the profiler gives me and I keep the best version.
  • Joined: 19 Oct 2023
  • Posts: 139
Post Posted: Sun Mar 17, 2024 6:10 pm
pw wrote
sverx wrote


If the switch parameter is an unsigned char, then the best option is to have values 0-3 and, if it's impossible to have values other than those, have a default case instead of the last one:
switch (uc) {
  case 0: func0(); break;
  case 1: func1(); break;
  case 2: func2(); break;
  default: func3(); break;
}



The switch value is an unsigned char, yes.

I've tried all kinds of variations, like adding a case 0, using 0-3 instead of 1-4, adding a default, etc and the current layout generates the fastest code so far. In the current code, a 0 signifies the end of the drawing. I've also tried to return at case 0 instead of break, but that seemed to worsen the performance by an unexpectedly large amount. I won't profess to be an expert on what the compiler is doing. I try variations and look at what the profiler gives me and I keep the best version.


https://github.com/pw32x/ninjagirl/blob/main/project/source/engine/animation_uti...

I noticed in that function you're creating 6 local variables every time. I've seen some decent improvements by removing as many local variables as possible, and changing the remaining (at most 2) to static variables.

Not sure what the runner->count is either, how often is that looping?
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Post Posted: Sun Mar 17, 2024 6:34 pm
I can see in the generated asm that the default handling is costing a bit of time, and it does a cp for each comparison instead of a dec (which is faster). However, the switch cases themselves are clearly much more code (more opcodes = more code = slower!). The code seems to be adding the pointed object’s x, y, n to some global base values for those, and then calling into the library; but I’m not sure why it’s taking some trouble to mask bits as it does so. I feel this could be better, maybe try expanding the operations to more lines of code (which can help tie lines to assembly) and check the result.

It’s also doing a decent job of walking through the pointed struct instead of indexing into it - but then it discards that when it is time to increment the pointer. If you rewrite it to walk through the pointer data after casting to unsigned char* it’d be uglier code but maybe faster.
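
A sketch of the "walk the data as unsigned char*" idea, under assumptions: the Strip layout below is hypothetical (it is not the BatchedAnimationSpriteStrip type from the linked code), a count of 0 ends the list, and record_strip stands in for the real adjoining-sprites call. Because every member is one byte wide, the struct has no padding and can be read as a plain byte stream.

```c
/* Hypothetical strip record: all members are 8 bit, so sizeof(Strip) == 4. */
typedef struct {
    unsigned char count;     /* adjoining sprites in the strip; 0 = end */
    signed char   x_offset;
    signed char   y_offset;
    unsigned char tile;
} Strip;

/* What the sketch "draws": a log of the calls it would have made. */
typedef struct { unsigned char count, x, y, tile; } DrawnStrip;
static DrawnStrip drawn[16];
static unsigned char drawn_n = 0;

static void record_strip(unsigned char count, unsigned char x,
                         unsigned char y, unsigned char tile)
{
    drawn[drawn_n].count = count;
    drawn[drawn_n].x = x;
    drawn[drawn_n].y = y;
    drawn[drawn_n].tile = tile;
    drawn_n++;
}

/* One pointer that only ever moves forward, instead of indexing members. */
void draw_strips_bytewise(const Strip *strips,
                          unsigned char base_x, unsigned char base_y)
{
    const unsigned char *p = (const unsigned char *)strips;
    unsigned char count;
    while ((count = *p++) != 0) {
        unsigned char x = (unsigned char)(base_x + (signed char)*p++);
        unsigned char y = (unsigned char)(base_y + (signed char)*p++);
        unsigned char tile = *p++;
        record_strip(count, x, y, tile);
    }
}
```

Whether this actually beats the struct-pointer version with SDCC would have to be measured in the profiler; the sketch only shows the shape of the rewrite.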
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Mon Mar 18, 2024 3:39 am
I don't want to pollute the general devkitSMS discussion, so I'd like to move the optimization discussion to my dedicated game thread here: https://www.smspower.org/forums/19979-PWsUntitledSMSHomebrewGame#130018
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Mon Mar 18, 2024 9:01 am
Partially keeping the discussion here because it might be useful to other devkitSMS users as well: since what you're doing here is basically using 'stripes' of adjacent sprites to create a metasprite, I would consider the alternative of using SMSlib's metasprites instead - it should be faster, as it would remove all that overhead you're experiencing here.

SMS_addMetaSprite(x,y,metasprite)


Quote
metasprites definition are a sequence of n*3 char values:
(signed) delta_x, (signed) delta_y, (unsigned) tile number
terminated by a METASPRITE_END terminator value (-128)


example:
#define SPRITES_START_TILE   40
unsigned char const metasprite4[]={
0,0,SPRITES_START_TILE+1,
12,11,SPRITES_START_TILE+2,
14,-8,SPRITES_START_TILE+2,
-3,-16,SPRITES_START_TILE+2,
-16,-2,SPRITES_START_TILE+2,
-7,15,SPRITES_START_TILE+2,
METASPRITE_END};
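
For readers new to the format, this is roughly how such a definition can be interpreted. It is a host-side sketch, not the SMSlib implementation: walk_metasprite and the placed array are hypothetical names, and only the triplet layout and the -128 terminator come from the quoted description.

```c
#define METASPRITE_END (-128)
#define MAX_PLACED 64

/* Hypothetical log of the sprites the walk would place on screen. */
typedef struct { unsigned char x, y, tile; } PlacedSprite;
static PlacedSprite placed[MAX_PLACED];
static unsigned char placed_count = 0;

/* Reads (delta_x, delta_y, tile) triplets until the terminator is found.
   The deltas are stored as bytes, so they are cast back to signed. */
void walk_metasprite(unsigned char x, unsigned char y,
                     const unsigned char *def)
{
    while ((signed char)*def != METASPRITE_END && placed_count < MAX_PLACED) {
        placed[placed_count].x = (unsigned char)(x + (signed char)def[0]);
        placed[placed_count].y = (unsigned char)(y + (signed char)def[1]);
        placed[placed_count].tile = def[2];
        placed_count++;
        def += 3;
    }
}
```

One consequence of the terminator check being done on the first byte of each triplet is that -128 cannot be used as a delta_x value.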


give it a try and let us know! :)
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Wed Mar 20, 2024 1:44 pm
I'm just about ready to try that out. In the meantime I noticed I'm almost filling up the first 32k of rom. I'll need to start looking at banking code.

One question I had was does the banked code feature in SDCC handle nested banked functions correctly?

FunctionA_bank1 calls FunctionB_bank4 calls FunctionC_bank6 and it all works and returns correctly? Or is it just FunctionA_bank0 calls FunctionB_bank4 and that's it?

Also I don't have the impression I'd be able to work with banked data in a banked function. So a function like one that updates sprite positions in the VDP would need to be in bank 0-1 since animation data would be in bank X.

If that's the case, then an object's update would need to be broken down into separate update (banked code) and animation update + draw (banked data) functions.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Wed Mar 20, 2024 2:09 pm
pw wrote
One question I had was does the banked code feature in SDCC handle nested banked functions correctly?


Yes, no issues.
Note that functions you call from another function in the same bank don't need to be banked, as they're 'local' to the caller.


pw wrote
Also I don't have the impression I'd be able to work with banked data in a banked function.


Banked code uses slot 1, banked data uses slot 2 so they're fully independent. You can map any banked data from any function.
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Thu Mar 21, 2024 1:17 am
I was implementing support for metasprites until I realized that the current version doesn't have a parameter for a tile vdp index. In my project sprite resources don't have a fixed location in VDP tile ram. Each level is packaged to only upload what it needs, which is different from one level to the next. Therefore sprite locations are always different.

Could you write a second tweaked version that takes a starting tile index?
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Thu Mar 21, 2024 8:30 am
oh, I didn't expect that.
I will consider adding a different version, but it won't anyway happen immediately.
You could anyway build metasprite definitions in RAM, of course depending on how much RAM you already use. With such a workaround you could see if the metasprites route gives you that needed boost...

edit: I might have a hackish solution, involving setting a global variable before calling the SMS_addMetaSprite function. Let me know if you want to test it.
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Thu Mar 21, 2024 12:44 pm
sverx wrote
oh, I didn't expect that.
I will consider adding a different version, but it won't anyway happen immediately.
You could anyway build metasprite definitions in RAM, of course depending on how much RAM you already use. With such a workaround you could see if the metasprites route gives you that needed boost...


I realized I could do that just when I was falling asleep last night.

Quote

edit: I might have a hackish solution, involving setting a global variable before calling the SMS_addMetaSprite function. Let me know if you want to test it.


Sure!

I tried hacking a version using a global variable but I couldn't figure out the magical incantation to make it work.

What I'll do is try out both versions. I've got more than enough ram left over to try out the original version. Only the player sprite has tons of frames, but everything should still fit. I've got about 2k left.

The player is 40 frames of about 6 sprites per frame, so (40 * 6 * 3) + 1 = 721 bytes.
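
The RAM workaround could look something like the sketch below. build_ram_metasprite is a hypothetical helper, not part of devkitSMS: it copies one definition into a RAM buffer while adding the level's base tile to every tile number. Note that, if each frame is kept as its own definition, each one needs its own terminator, so 40 frames of 6 sprites would come to 40 * (6 * 3 + 1) = 760 bytes rather than 721.

```c
#define METASPRITE_END (-128)

/* Copies one (delta_x, delta_y, tile) definition from src to dst, adding
   base_tile to each tile number. Returns the number of bytes written,
   terminator included, so the caller can place the next frame after it. */
unsigned int build_ram_metasprite(unsigned char *dst,
                                  const unsigned char *src,
                                  unsigned char base_tile)
{
    unsigned char *start = dst;
    while ((signed char)*src != METASPRITE_END) {
        *dst++ = *src++;                               /* delta_x */
        *dst++ = *src++;                               /* delta_y */
        *dst++ = (unsigned char)(*src++ + base_tile);  /* relocated tile */
    }
    *dst++ = (unsigned char)METASPRITE_END;
    return (unsigned int)(dst - start);
}
```

The rebuild only has to happen when a level loads, so its cost does not affect the per-frame drawing time.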
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Thu Mar 21, 2024 1:13 pm
OK here's the tweaked version.
You need to declare
extern unsigned char MetaSpriteBaseTile;
and set it to the 'base' tile ID for your metasprite, so that every tile ID in metasprite definitions will be used as offsets from the base (so if you leave the base to 0, it's actually working the 'regular' way).

Example usage:
MetaSpriteBaseTile=Player_Base_Tile;
SMS_addMetaSprite(Player_X, Player_Y, metasprites[Player_Frame]);


I'll see if I can create something better later.
SMSlib_NO_ZOOM_DELTA.lib.zip (10.69 KB)
SMSlib recompiled with NO ZOOM and 'delta' metasprites

  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Sun Mar 24, 2024 4:35 am
Finally found time to try it out. Got it working with the MetaSpriteBaseTile version and I definitely get a speed benefit. It's not super dramatic, but I'll take it.

I haven't tried the non-MetaSpriteBaseTile version. Maybe I should, for comparison's sake.

I'll switch over the regular objects to metasprites but keep the player as strips. Uploading the player sprite to vdp has less overhead if I can know the amount of sequential sprites I can copy all at once instead of one at a time.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Sun Mar 24, 2024 2:11 pm
pw wrote
Finally found time to try it out. Got it working with the MetaSpriteBaseTile version and I definitely get a speed benefit. It's not super dramatic, but I'll take it.


can you share some figures? I'm curious :)

pw wrote
I haven't tried the non-MetaSpriteBaseTile version. Maybe I should, for comparison's sake.


It'll be 21 cycles faster for each sprite placed on screen.
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Sun Mar 24, 2024 4:26 pm
I don't know how to measure it accurately, but comparing with the previous method, I'm saving about 7 scan lines for three enemies of 5 sprites each. That's the pink area, where I draw enemies.

It's definitely nothing to sneeze at.
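
For a rough sense of scale, the scanline figure can be turned into CPU cycles. This is back-of-the-envelope arithmetic only, and the 228 cycles per scanline constant is an assumption about the Master System's Z80 timing that does not come from this thread.

```c
/* Assumption: about 228 Z80 cycles elapse per scanline on the SMS. */
#define CYCLES_PER_SCANLINE 228u

/* Converts a number of scanlines saved into CPU cycles saved. */
unsigned int scanlines_to_cycles(unsigned int lines)
{
    return lines * CYCLES_PER_SCANLINE;
}

/* Average saving per sprite (integer division, so rounded down). */
unsigned int cycles_saved_per_sprite(unsigned int lines, unsigned int sprites)
{
    return scanlines_to_cycles(lines) / sprites;
}
```

With 7 scanlines and 3 enemies of 5 sprites each, that works out to 7 * 228 = 1596 cycles in total, or about 106 cycles per sprite.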

  • Joined: 09 Aug 2021
  • Posts: 131
Post Posted: Sun Mar 24, 2024 8:52 pm
pw wrote
I don't know how to measure it accurately

try procedure profiler in emulicious
  • Joined: 02 Mar 2024
  • Posts: 21
  • Location: Dullsville
Post Posted: Sun Mar 24, 2024 9:51 pm
i guess i won't have to learn ASM to make GG games, just have to learn C as my first programming language to do so, yet after that huge hurdle this should be easy peasy! TY
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Mon Mar 25, 2024 8:47 am
pw wrote
[...] comparing with the previous method, I'm saving about 7 scan lines for three enemies of 5 sprites each


this is using SMSlib metasprites compared to a set of adjoining sprites stripes, right?
I mean, this seems to be exactly the overhead you had before... too bad we couldn't make the compiler generate better code here.
  • Joined: 09 Aug 2021
  • Posts: 131
Post Posted: Mon Mar 25, 2024 9:36 am
sverx wrote
I mean, this seems to be exactly the overhead you had before... too bad we couldn't make the compiler generate better code here.

from the description it is not clear (for me) which method saves CPU in comparison to which. i think that metasprites are more performant than those SMS_add*AdjoiningSprites() function family, provide more flexibility and make overall code simpler and more performant. in case i am right, then making compiler "generate better code" for calling series of SMS_add*AdjoiningSprites() is just useless. and combine metasprites with those obsolete api calls is just waste of ROM space.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Mon Mar 25, 2024 9:47 am
toxa wrote
i think that metasprites are more performant than those SMS_add*AdjoiningSprites() function family, provide more flexibility and make overall code simpler and more performant. in case i am right, then making compiler "generate better code" for calling series of SMS_add*AdjoiningSprites() is just useless. and combine metasprites with those obsolete api calls is just waste of ROM space.


Metasprites are less performant than a horizontal strip of sprites, but they do offer greater flexibility, in exchange for a bit of additional work.
But here pw was actually creating 'own' metasprites using strips of sprites (SMS_add*AdjoiningSprites calls), which would have worked well, had the overhead been lower.
I do agree that they might consider dropping the previous approach entirely and just using metasprites (and also save a few hundred bytes of ROM by doing that) but that's their call.
  • Joined: 09 Aug 2021
  • Posts: 131
Post Posted: Mon Mar 25, 2024 11:24 am
sverx wrote
But here pw was actually creating 'own' metasprites using strips of sprites (SMS_add*AdjoiningSprites calls), which would have worked well, had the overhead been lower.

i don't believe that. own metasprite implementation in C over those SMS_add*AdjoiningSprites() functions can not work better than the single metasprite rendering call. And there is no sense to do that at all.
  • Joined: 10 Aug 2023
  • Posts: 33
Post Posted: Mon Mar 25, 2024 12:52 pm
I've switched over to using metasprites completely. Quite happy about that.

When I started last August, there was no metasprite support. It only appeared in December. Using the adjoining sprites functions was the best solution available to me at the time.

Of course a metasprite function is going to be faster than the original solution, or the adjoining sprites functions.

sverx, are you looking into integrating the MetaSpriteBaseTile version of SMS_addMetaSprite into devkitSMS soon?
  • Joined: 09 Aug 2021
  • Posts: 131
Post Posted: Mon Mar 25, 2024 12:54 pm
pw wrote
When I started last August, there was no metasprite support. It only appeared in December. Using the adjoining sprites functions was the best solution available to me at the time.

that's because you are using devkitSMS :)
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Mon Mar 25, 2024 1:03 pm
toxa wrote
i don't believe that. own metasprite implementation in C over those SMS_add*AdjoiningSprites() functions can not work better than the single metasprite rendering call. And there is no sense to do that at all.


I never said it could have worked better, I said it would have worked 'well' (enough) *IF* the overhead was lower.

Also, as an example, a single 16×16 enemy (made of 2 8×16 sprites) can be placed in a much simpler and quicker way using an SMS_addTwoAdjoiningSprites call than an SMS_addMetaSprite call.

pw wrote
are you looking into integrating the MetaSpriteBaseTile version of SMS_addMetaSprite into devkitSMS soon?


I will do that, under a conditional define.

toxa wrote
that's because you are using devkitSMS :)


and this is the devkitSMS topic also :)
  • Joined: 09 Aug 2021
  • Posts: 131
Post Posted: Mon Mar 25, 2024 2:02 pm
sverx wrote
Also, as an example, a single 16×16 enemy (made of 2 8×16 sprites) can be placed in a much simpler and quicker way using an SMS_addTwoAdjoiningSprites call than an SMS_addMetaSprite call.

And for such simple case the inline function might be even more efficient.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Mon Mar 25, 2024 4:59 pm
Pushed an update right now. A small speed improvement too.
'Delta' metasprites are activated compiling with the
METASPRITE_DELTA_TILES
define.
  • Joined: 30 Mar 2009
  • Posts: 296
Post Posted: Thu Mar 28, 2024 3:03 pm
Hello sverx.

I'm trying to fix a bug with my competition entry. I have music playing, using channels 0 and 1.
And when i play a specific SFX, on channel 2, the sfx plays, but i also get this awful high-pitched note after the sfx ends, until the next sfx plays, and so on.

I was doing tests, and i even added a whole bunch of "OFF" commands on the channel 2 on the music, no luck there, i also made sure the SFX was not marked as looping and also have an OFF command on channel 2 on the SFX itself.

So, i'm not sure what i'm doing wrong.
But, for the weird part, if i play the same PSG, but invoking with SFXCHANNEL3 on the code, it plays on channel 2 and no high-pitch permanent noise?

basically:
PSGSFXPlay(pickup_psg,SFX_CHANNEL3);

works, even if my psg only uses channel2, and was correctly converted using vgm2psg pickup.vgm pickup.psg 2

but this:
PSGSFXPlay(pickup_psg,SFX_CHANNEL2);

causes the permanent high-pitch noise between sfx.

It's probably me doing something wrong, so if you have a clue, i'd appreciate it.

thanks!
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Thu Mar 28, 2024 3:53 pm
@tibone - I'd like to investigate this but I don't get that note on your Compo entry using Emulicious.

If you want you can send me privately your tune and SFX and I'll test them and, as a temporary workaround, if your music always uses *only* channels 0 and 1, you can use SFX_CHANNELS2AND3 for any SFX as in
PSGSFXPlay(pickup_psg, SFX_CHANNELS2AND3);
  • Joined: 30 Mar 2009
  • Posts: 296
Post Posted: Thu Mar 28, 2024 4:34 pm
sverx wrote
@tibone - I'd like to investigate this but I don't get that note on your Compo entry using Emulicious.


I commented out that specific line in the version i sent to the compo, so no sfx plays, i didn't want to cause hearing damage to anyone. :)


sverx wrote

If you want, you can send me your tune and SFX privately and I'll test them. As a temporary workaround, if your music always uses *only* channels 0 and 1, you can use SFX_CHANNELS2AND3 for any SFX, as in
PSGSFXPlay(pickup_psg, SFX_CHANNELS2AND3);


Just tested it, and using SFX_CHANNELS2AND3 also causes the high-pitched note.

I'll send you the files over a DM. Also, the code is on GitHub if you want to look at it; the problematic line is line 926.


Thanks!
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Fri Mar 29, 2024 12:35 pm
I just posted an update to PSGlib to address this issue.
It should work correctly now! :)
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Thu Apr 04, 2024 9:22 am
In PSGlib I just added PSGSetSFXVolumeAttenuation to control SFX volume. It works exactly as PSGSetMusicVolumeAttenuation does for the background music.
(This update is available both in the PSGlib asm repository and in the C-wrapped devkitSMS repository.)
I hope you enjoy that!
  • Joined: 30 Mar 2009
  • Posts: 296
Post Posted: Thu Apr 04, 2024 5:06 pm
Heya, I know I'm doing something wrong, but I'm trying to implement SRAM in Raposa, just to learn how to do it (it's not really important to the gameplay of this specific game).

In the wiki, it says all you need to do is access the unsigned char array SMS_SRAM[], so I did this:


void checkHiScore(void) {
   if(score > hiscore) {
      hiscore = score;
      SMS_enableSRAM();
      SMS_SRAM[0] = hiscore;
      SMS_disableSRAM();
      return;
   }
   return;
}

void loadHiScore(void) {
   SMS_enableSRAM();
   hiscore = SMS_SRAM[0];
   SMS_disableSRAM();
}

void clearHiScore(void) {
   hiscore = 0;
   SMS_enableSRAM();
   SMS_SRAM[0] = hiscore;
   SMS_disableSRAM();
}


But it doesn't really work: if the hiscore is not set, or is low, it is just set to 139 (which can't really happen, since the score increments in multiples of 10). And if I get over 139, it sets the hiscore correctly, but upon reloading the ROM, it goes back to 139.

If there's any help, I'd appreciate it, but like I said, it's not the most important thing for this specific game; it's more for learning.

Thanks!
  • Joined: 06 Mar 2022
  • Posts: 671
  • Location: London, UK
Post Posted: Thu Apr 04, 2024 7:04 pm
If you're using Emulicious your cart must be at least 64KB to enable SRAM - could that be the problem?

(see this thread for some more hints)
  • Joined: 30 Mar 2009
  • Posts: 296
Post Posted: Thu Apr 04, 2024 7:57 pm
Right, my ROM is 48 KB at this point, and I was indeed using Emulicious. I just tested Fusion and it displays similar behaviour.

I'll do some more testing and check that thread, thanks.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Fri Apr 05, 2024 7:40 am
tibone wrote
In the wiki, it says all you need to do is access the unsigned char array SMS_SRAM[], so I did this:
[...]
But it doesn't really work [...]


I should change the wording in the wiki, it's not very clear indeed.
What I meant is that SRAM is mapped as an array of chars - of course you can cast that to whatever you need.
Here is a simple example of how to map your own struct definition onto SRAM.

Another approach could be to have your struct/data in RAM and simply memcpy() that to SRAM when you want to save.
Or you could copy every single variable from RAM to SRAM using a separate write operation, but that would probably get messy pretty quickly (and you would still have to cast every variable that is not an unsigned char...)
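
A minimal sketch of the memcpy() approach described above, not taken from the linked example. The save_data struct, the SAVE_MAGIC marker and the saveHiScore/loadHiScore helpers are hypothetical names made up for illustration. It assumes the hiscore is wider than 8 bits, in which case writing it to the single byte SMS_SRAM[0] would lose the upper bits. The three definitions at the top are host-side stand-ins so the sketch compiles on a PC; in a real devkitSMS project you would remove them and include SMSlib.h instead.

```c
#include <string.h>

/* Host-side stand-ins (assumption): in a real project, delete these three
   definitions and #include "SMSlib.h", which provides the real ones. */
static unsigned char SMS_SRAM[0x4000];
static void SMS_enableSRAM(void)  {}
static void SMS_disableSRAM(void) {}

/* Hypothetical marker used to tell a valid save from uninitialised SRAM */
#define SAVE_MAGIC 0x5241u

/* Hypothetical save layout: everything to be persisted lives in one struct */
typedef struct {
  unsigned int magic;
  unsigned int hiscore;
} save_data;

static save_data save;   /* working copy kept in RAM */

/* Copy the whole struct from RAM to SRAM in one go */
void saveHiScore(unsigned int hiscore) {
  save.magic = SAVE_MAGIC;
  save.hiscore = hiscore;
  SMS_enableSRAM();
  memcpy(SMS_SRAM, &save, sizeof(save));
  SMS_disableSRAM();
}

/* Copy the struct back from SRAM; fall back to 0 if no valid save is found */
unsigned int loadHiScore(void) {
  SMS_enableSRAM();
  memcpy(&save, SMS_SRAM, sizeof(save));
  SMS_disableSRAM();
  if (save.magic != SAVE_MAGIC) {
    save.hiscore = 0;
  }
  return save.hiscore;
}
```

The magic value is only one possible way to detect SRAM that has never been written; a checksum would serve the same purpose.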

edit: also - yes, your ROM needs to be at least 64 KiB for it to work correctly on (many) emulators.
You can pad it to that size by using the -pm option of ihx2sms/makesms.