View topic - z88dk bug?

  • Joined: 28 Jan 2017
  • Posts: 541
  • Location: Málaga, Spain
z88dk bug?
Post Posted: Fri Jun 02, 2017 11:30 pm
Hi!

This function works well in sdcc, but does not work in z88dk. I think the error is related to the operations on the char pointer c, but I have had no luck changing the code (it always fails the same way; I don't know if it's filling a pair of tiles 16 times, or reading an erroneous memory value).

I really need z88dk, at least for this game. It has a complex platformer engine that gets slow in sdcc (my fault, I have not done optimizations yet)... while in z88dk it runs well (except for this error).

The rest of the game seems to play well with both compilers. This is the only function that fails.

Does someone with more assembly/compiler knowledge than me have any insights?

int maptileposx,maptileposy;
unsigned char maptilewidth,maptileheight;
unsigned char *maptiles;

void UpdateMapRow(unsigned char r)
{
    unsigned char a;
    const unsigned char *c;
    unsigned char e;
    unsigned char z;

    // Change bank if needed
    changeBank(mapbank);

    // Check trams
    z=maptileposx%32;

    // Needed
    c=maptiles+((maptileposx>>1)+(((maptileposy+r)>>1)*maptilewidth));
    e=(maptileposx%2)+(((maptileposy+r)%2)<<1);

    // First position
    SMS_setAddr(XYtoADDR(z,(maptileposy+r)%28));

    // First tile
    SMS_setTile((c[16]<<2)+e+256);
    if(e%2==1)
    {
        c++;
        e--;
    }else e++;

    // Next ?
    if(z<31)
        for(a=0;a<31-z;a++)
        {
            SMS_setTile(((*c)<<2)+e+256);
            if(e%2==1)
            {
                c++;
                e--;
            }else e++;
        }

    // Finish row
    if(z>0)
    {
        SMS_setAddr(XYtoADDR(0,(maptileposy+r)%28));
        for(a=0;a<z;a++)
        {
            SMS_setTile(((*c)<<2)+e+256);
            if(e%2==1)
            {
                c++;
                e--;
            }else e++;
        }
    }

    // Change to default bank
    changeBank(FIXEDBANKSLOT);
}


Thanks!!!
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sat Jun 03, 2017 4:51 am
Hi eruiz,

I can't see anything wrong with the generated assembly. That's not to say there isn't something wrong - I just haven't been able to see it.

Is your version of z88dk up to date? There was a week there where I was chasing down a number of bugs but I think they've all been found now. Just in case maybe try a compile after updating. Grab a nightly build from http://nightly.z88dk.org/ (I think it was windows?), erase the old z88dk tree and unzip in the same place. The environment variables will be the same so you don't have to change those.

If the bug is still there, could you try a compile at -SO2 level instead of -SO3? The code won't be as good but at least it narrows down where the problem might be.

I did notice something while looking at the generated asm. There are a lot of divisions being generated because variables are signed.

For example, "z=maptileposx%32;" is being turned into a division:

   ld   hl,0x0020
   push   hl
   ld   hl,(_maptileposx)
   push   hl
   call   __modsint_callee


Normally a mod by a power of 2 can be done with a bitwise AND, but in this case "maptileposx" is a signed integer so this can't be done (the sign of the result depends on the sign of the operands, so a negative value would give a different answer than the AND).

Similarly, innocent-looking mods by 2 are being turned into divisions for the same reason:

"e=(maptileposx%2)+(((maptileposy+r)%2)<<1);"

   ld   hl,0x0002
   push   hl
   ld   hl,(_maptileposx)
   push   hl
   call   __modsint_callee
   push   hl
   ld   hl,0x0002
   push   hl
   ld   l,(ix-2)
   ld   h,(ix-1)
   push   hl
   call   __modsint_callee
   pop   bc
   ld   e,l
   sla   e
   ld   a, c
   add   a, e
   ld   d, a


Just letting you know because I think you are expecting much faster code to be generated.
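To make the signed-versus-unsigned point concrete, here is a small sketch in plain C. The helper names are made up for illustration and are not part of z88dk or devkitSMS, and it assumes a C99 compiler with two's complement integers. It only shows why the compiler cannot swap the remainder for an AND when the operand is signed.

```c
/* Hypothetical helpers for illustration only; not library functions. */

/* Remainder of a signed value: C99 truncates toward zero, so the
   result takes the sign of the dividend (-1 % 32 is -1). */
static int mod32_signed(int x)
{
    return x % 32;
}

/* Masking keeps the low five bits. This matches x % 32 only when
   x is non-negative (-1 & 31 is 31 on two's complement). */
static int mask32(int x)
{
    return x & 31;
}

/* With an unsigned operand the remainder and the mask always agree,
   so a compiler is free to emit the cheap AND. */
static unsigned int mod32_unsigned(unsigned int x)
{
    return x % 32u;
}
```

For a negative input such as -1 the first two helpers disagree (-1 versus 31), while for any unsigned input the remainder equals the masked value. That is why declaring the position variables as unsigned, or writing the mask by hand, can let the division call disappear.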

If you have a lot of divisions or multiplications in your code, you can configure z88dk to use fast integer arithmetic. This comes at the expense of something around 1k or so of memory but your divisions and multiplications will be much faster.

To configure the sms library you have to edit z88dk/libsrc/_DEVELOPMENT/target/sms/config_clib.m4

On line 59 change "define(`__CLIB_OPT_IMATH', 0)" to "define(`__CLIB_OPT_IMATH', 75)"

Then go to directory z88dk/libsrc/_DEVELOPMENT and rebuild the sms library by running "Winmake sms" (windows) or "make TARGET=sms" (non-windows).

You can undo that change by setting the define back to 0 and building the library again.
  • Joined: 28 Jan 2017
  • Posts: 541
  • Location: Málaga, Spain
Post Posted: Sat Jun 03, 2017 6:15 am
Here's the update:

Both compilers were downloaded just eight hours ago :)

I have changed maptileposx and maptileposy to unsigned char (my fault): no luck.

I have recompiled the library as you described and recompiled the game: no luck, but hey, it seems faster!

I compiled to asm and checked it (I know some x86 assembly from twenty years ago, so I'm trying), but no luck :(

Compiled using SO2: LUCK! (so SO3 is the evil param).

So -SO2 is such a good param that it even fixes my bugged code :) . I'm including the two asm files in case you need them to check the error. I can see differences in the related function.

Thanks for such a quick response. Thanks!!!

Kind Regards
cursedcastle.zip (38.95 KB)
Both asm files

  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Sat Jun 03, 2017 7:31 am
eruiz00 wrote
Hi!
I really need z88dk, at least for this game. It has a complex platformer engine that gets slow in sdcc (my fault, I have not done optimizations yet)... while in z88dk it runs well (except for this error).
Thanks!!!


Have you tried invoking SDCC with --opt-code-speed and a high value for --max-allocs-per-node, such as --max-allocs-per-node 100000 or --max-allocs-per-node 1000000?

Philipp

P.S.: I'd be interested to see the C and asm code of the functions for which the z88dk code is faster than the SDCC one.
  • Joined: 28 Jan 2017
  • Posts: 541
  • Location: Málaga, Spain
Post Posted: Sat Jun 03, 2017 8:11 am
Sorry PkK :)!

I suspect this is a library performance difference rather than a compiler code optimization one. I use sdcc for development, as z88dk's compile time is too long (maybe I need to check the compile params, as I only use one ultra-optimized configuration).

I am talking about a Flixel/Phaser-like platformer engine, so some lag is to be expected. Always at the limit :)
Don't forget I am only an amateur. My code has to get better, I know.

I have checked with --max-allocs-per-node 1000000 and I don't see big improvements.

The difference is visible. If you want, I can send you both compiled SMS ROMs.

After all, this thread was opened because of an "error", not a performance problem.
  • Joined: 05 Sep 2013
  • Posts: 3758
  • Location: Stockholm, Sweden
Post Posted: Sat Jun 03, 2017 10:59 am
There are still multiplications and modulus calls in your generated ASM code, and those will slow you down too much anyway.
I suggest you invest some time in finding a way to avoid them all. For instance, if you need to update a row of tiles on the tilemap, consider that the VRAM address of the tile at row r (unsigned char), line 0 is:
unsigned int addr=XYtoADDR(r,0);

(which is r shifted left by 1 + a constant)
and to go to the address of the tile on row r on the next line you just need to
addr+=64;


edit: uff, I always mess up, I meant column of course! :|
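
The stepping idea above can be sketched in plain C as follows. This is only an illustration under stated assumptions: TILEMAP_BASE, tile_addr and column_addrs are hypothetical stand-ins, not the devkitSMS XYtoADDR macro, and the base address used here is an assumed value. What it relies on is the layout described above: 2 bytes per tilemap entry and 32 entries per line, hence 64 bytes from one line to the next.

```c
/* Hypothetical stand-ins for illustration; not the devkitSMS definitions. */
#define TILEMAP_BASE   0x3800u   /* assumed tilemap base address in VRAM */
#define BYTES_PER_TILE 2u        /* each tilemap entry is 2 bytes */
#define TILES_PER_LINE 32u       /* 32 entries per tilemap line */
#define BYTES_PER_LINE (BYTES_PER_TILE * TILES_PER_LINE)   /* 64 */

/* Full address computation for column x, line y (shifts only). */
static unsigned int tile_addr(unsigned char x, unsigned char y)
{
    return TILEMAP_BASE + ((unsigned int)y << 6) + ((unsigned int)x << 1);
}

/* Store the addresses of column x on the first `lines` lines into out[].
   Only the first address is computed in full; every later one is
   obtained by adding 64 to the previous one. */
static void column_addrs(unsigned char x, unsigned char lines,
                         unsigned int *out)
{
    unsigned int addr = tile_addr(x, 0);
    unsigned char y;

    for (y = 0; y < lines; y++) {
        out[y] = addr;
        addr += BYTES_PER_LINE;
    }
}
```

In a real update loop the stored address would be passed to something like SMS_setAddr instead of being written to an array; the array is used here only so the stepped addresses can be compared against the fully computed ones.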
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sat Jun 03, 2017 2:35 pm
Last edited by Alcoholics Anonymous on Sat Jun 03, 2017 3:10 pm; edited 1 time in total
eruiz00 wrote

Compiled using SO2: LUCK! (so SO3 is the evil param).
So -SO2 is such a good param that it even fixes my bugged code :) .


More accurately the other way around :)

SO3 is using a lot of aggressive peephole rules that interact amongst themselves so once in a while a bug surfaces when a rule isn't properly qualified. When writing them it's not always easy to make sure they are qualified properly.

Changing to SO2 avoids using those rules.

Anyway I can see a bug in the SO3 compile. You've changed your code to eliminate the divisions so I can't reproduce here.

Would you mind posting the updated source?
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sat Jun 03, 2017 2:47 pm
PkK wrote

P.S.: I'd be interested to see the C and asm code of the functions for which the z88dk code is faster than the SDCC one.


The library code in z88dk is quite a bit faster so that could be it.

For C code the recommended compile ("-clib=sdcc_iy") uses "--reserve-regs-iy". sdcc overuses the iy register, which leads to larger and slower code. So we stop it from using iy and then fix up the code afterward (there are a number of bugs logged in the sdcc bug tracker related to using reserve-regs-iy, but we fix those up too in post-processing).

This is where we see that 5% code size reduction for the C portion. The code can also speed up as a result: if the code is improved inside loops, there can be a step up in performance. I think this is what is happening in the dhrystone benchmark: zsdcc is getting around 320 and sdcc around 275. sdcc inlines the string functions, so I don't believe the difference is down to the library in that one. I should take a closer look to make sure it's the C code generation that makes the difference.
  • Joined: 28 Jan 2017
  • Posts: 541
  • Location: Málaga, Spain
Post Posted: Sat Jun 03, 2017 4:04 pm
Only changed the int->unsigned char ones:

unsigned char mapbank;
unsigned char maptileposx;
unsigned char maptileposy;
unsigned char maptilewidth;
unsigned char maptileheight;

void UpdateMapRow(unsigned char r)
{
    unsigned char a;
    unsigned int c;
    unsigned char e;
    unsigned char z;

    // Change bank if needed
    changeBank(mapbank);

    // Check trams
    z=maptileposx%32;

    // Needed
    c=((maptileposx>>1)+(((maptileposy+r)>>1)*maptilewidth));
    e=(maptileposx%2)+(((maptileposy+r)%2)<<1);

    // First position
    SMS_setAddr(XYtoADDR(z,(maptileposy+r)%28));

    // First tile
    SMS_setTile((maptiles[c+16]<<2)+e+256);
    if(e%2==1)
    {
        c++;
        e--;
    }else e++;

    // Next ?
    if(z<31)
        for(a=0;a<31-z;a++)
        {
            SMS_setTile((maptiles[c]<<2)+e+256);
            if(e%2==1)
            {
                c++;
                e--;
            }else e++;
        }

    // Finish row
    if(z>0)
    {
        SMS_setAddr(XYtoADDR(0,(maptileposy+r)%28));
        for(a=0;a<z;a++)
        {
            SMS_setTile((maptiles[c]<<2)+e+256);
            if(e%2==1)
            {
                c++;
                e--;
            }else e++;
        }
    }

    // Change to default bank
    changeBank(FIXEDBANKSLOT);
}
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sun Jun 04, 2017 5:54 am
I've committed a fix:

https://drive.google.com/file/d/0B6XhJJ33xpOWZUIxSi1QQ2I2Sk0/view?usp=sharing

Copy those two files to z88dk/libsrc/_DEVELOPMENT replacing the two already there.

Let me know if it works.
  • Joined: 28 Jan 2017
  • Posts: 541
  • Location: Málaga, Spain
Post Posted: Sun Jun 04, 2017 7:19 am
Yeah! It works!

Now let's finish a game worthy of such good guys.

Thanks!