View topic - SMS Target for Z88DK's New C Library

  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
SMS Target for Z88DK's New C Library
Post Posted: Sun Apr 30, 2017 9:57 pm
Last edited by Alcoholics Anonymous on Sun May 07, 2017 4:38 pm; edited 1 time in total
A new +sms target has been added to z88dk's new c library. It will be included in the next release, 1.99C, which is due soon.

devkitSMS has been incorporated as a base api (thank you sverx) and many people from these forums have given us permission to include their devkitSMS projects as sample code. You can see these here along with a few of our own small examples. Compile instructions are included in the projects' directories along with compiled .sms files you can try if just browsing.

I have not forgotten about some of your suggestions, such as stripping GG code from one SMS example (I've decided to keep the conditional GG code because we may combine the GG and SMS targets into one) and 3D City for the classic lib, which I am still trying to get to work!

What is different about devkitSMS inside z88dk is that we've rewritten any c portions of the api in assembler and separated the c interface from the assembler implementation. This grants asm programmers an asm interface to devkitSMS without c overhead and allows us to make use of z88dk_callee as well as z88dk_fastcall linkage.
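For anyone who has not met these two linkages before, here is a minimal sketch of what they look like from the c side. The function names are made up for illustration and are not part of devkitSMS; the only real names are the `__z88dk_fastcall` and `__z88dk_callee` keywords. The macros at the top are an assumption of mine so that the same file also builds with an ordinary desktop compiler.

```c
/* Sketch only: hypothetical function names, not part of devkitSMS.
   On compilers other than sdcc/zsdcc the linkage keywords are defined
   away so the file still builds as ordinary C. */
#ifndef __SDCC
#define __z88dk_fastcall
#define __z88dk_callee
#endif

/* fastcall: the single argument is passed in a register rather than
   being pushed on the stack. */
unsigned int example_double(unsigned int n) __z88dk_fastcall
{
    return n * 2u;
}

/* callee: the called function removes its own arguments from the
   stack, so the caller needs no clean-up code after each call. */
unsigned int example_add(unsigned int a, unsigned int b) __z88dk_callee
{
    return a + b;
}
```

In the library the bodies are assembler routines and only the prototypes live in the c headers, which is where the saving comes from.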

Only the sms portion of devkitSMS is currently included, as the decision on whether the GG should be a separate target has not been made.

devkitSMS crts and interrupt routines are automatically provided.

What is Z88DK? It's a development environment for z80 machines that includes assembler, linker, librarian, two c compilers, and front end tools that guide compilation of any number or type of source files to a final output suitable for the destination target (an .sms file for the sms). It has a very large library written in assembly language that implements the c standard as well as many extensions for graphics, sound and so on.

People have been using SDCC with devkitSMS. SDCC is also included inside Z88DK and is called zsdcc there. The difference is that zsdcc fixes a number of bugs in sdcc and generates smaller and faster code. Within Z88DK, SDCC has access to a much larger and much more complete c library that is written in assembler, and it gains access to Z88DK's CRTs, which allows it to generate code targeting 50 different z80 machines out of the box. sdcc has accommodated some of z88dk's requirements in the past year or so to make this integration possible.

There are two c libraries in Z88DK as well. There is the classic one and the new one. The new c library is a re-write aiming for a subset of C11 compliance and to introduce an object oriented unix i/o model. This announcement is about the sms target with the new c library; the classic library's sms target is still there and exists in parallel.

The sms target in the new c library has the following features:

* CRTs already present and a build system that outputs .sms files.

* The Z88DK toolchain allows any mixture of asm, c, macros and so on on the compile line and can compile a complete project in one line.

* Memory banks are defined for BANK_02 through BANK_1F in hex so that assembly or c code and data can be easily placed in any bank.

* devkitSMS integration (both SMSlib and PSGlib but not GG or SC portions)

* a set of sms functions for copying to and from vram, cram and vdp rooted here (the asm implementation). This includes some code to scroll a rectangular area of tiles, as that was needed by the text terminal implementation.

* a means to configure libraries for the sms target, including placement of screen map, sprite attribute address and sprite pattern base address for devkitSMS.

* Text terminals are available so that printf can be used to print text to screen. These terminals are currently based on 8x8 1-bit fonts as used by many other z80 machines. Copying a 1-bit 8x8 font to vram is done via "copy_font_8x8_to_vram()" as shown in this example. The fonts in these examples are included in the z88dk library. As mentioned, z88dk's library is quite sophisticated and allows you to open any number of independently operating output terminals on screen with given rectangular dimensions and will handle all details like scrolling, pausing, etc.

* Many Z88DK features are controlled at compile time via pragmas. These pragmas can exist in a separate file for easy maintenance. The examples do exactly that and use the pragmas to populate the SEGA / SDSC headers as well as control the output text terminals' initial dimensions and settings.
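To give an idea of what copying a 1-bit 8x8 font to vram involves, here is a rough sketch of the expansion step in plain c. It is not the code behind "copy_font_8x8_to_vram()" (that is written in assembler); the function name `expand_glyph_1bpp` and its parameters are hypothetical. It assumes the usual sms tile layout of 32 bytes per tile, with four bitplane bytes for each of the eight rows.

```c
#include <stdint.h>

/* Expand one 8x8 1-bit glyph (8 bytes, one per row) into a 32-byte
   4-bitplane tile: every row becomes 4 bytes, one per bitplane.
   fg and bg are palette indices (0..15) for set and clear pixels. */
void expand_glyph_1bpp(const uint8_t glyph[8], uint8_t fg, uint8_t bg,
                       uint8_t tile[32])
{
    for (int row = 0; row < 8; ++row) {
        uint8_t bits = glyph[row];
        for (int plane = 0; plane < 4; ++plane) {
            uint8_t out = 0;
            if (fg & (1u << plane)) out |= bits;            /* set pixels   */
            if (bg & (1u << plane)) out |= (uint8_t)~bits;  /* clear pixels */
            tile[row * 4 + plane] = out;
        }
    }
}
```

With fg = 1 and bg = 0 only bitplane 0 receives the glyph bits, which is why a 1-bit font is cheap to store in ROM and expand on the way to vram.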

If anyone is interested, please give it a trial run and see if bugs are apparent. There will be a 1.99C release of Z88DK in the near future so it would be good to get rid of those before that happens :)
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Mon May 01, 2017 9:01 am
That's great!!! Now I can test programs in both (sdcc and z88dk) environments!
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Thu May 04, 2017 9:11 am
I feel you guys are empowering people to create amazing new stuff for this old machine - thus thank you so much.
Keep up the great work! :)
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sun May 07, 2017 2:15 am
sverx wrote
I feel you guys are empowering people to create amazing new stuff for this old machine - thus thank you so much.
Keep up the great work! :)


Well thanks to you especially for coming up with a good way to encapsulate the sms hardware.

The next time we come back to the sms I think we will look at doing a bitmapped display. There are a few interesting things in the z88dk library that may be fun to try including proportional fonts and a software sprite library based on tiles. The vdu only offers a thin pipe from cpu to vram but I think the software sprites may be fine to use in some situations. The particular engine is free running and does not need to synchronize with the raster to do flicker free graphics so it would fit well amongst hardware sprites if it can work.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Sun May 07, 2017 9:15 am
unfortunately there are not enough tiles to do a full screen bitmapped mode :|
  • Joined: 25 Feb 2006
  • Posts: 874
  • Location: Belo Horizonte, MG, Brazil
Post Posted: Sun May 07, 2017 2:43 pm
sverx wrote
unfortunately there are not enough tiles to do a full screen bitmapped mode :|


You can get some fake 4 color full screen bitmap mode through palette trickery, as mentioned in this post:

http://www.smspower.org/forums/9313-PseudoAPARoutinesForTheSegaMasterSystem#4238...
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sun May 07, 2017 4:36 pm
sverx wrote
unfortunately there are not enough tiles to do a full screen bitmapped mode :|


It wouldn't necessarily have to be fullscreen either. You could create a subset window of variable x*y size and then distribute the locations of those tiles around the screen where needed.
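The tile arithmetic behind this can be sketched in a few lines of c. The figures assume the standard 192-line mode 4 layout (16K of vram, 32 bytes per tile, a 32x28 screen map at 2 bytes per entry and a 256-byte sprite attribute table); the function names and the `reserved` parameter are hypothetical.

```c
#include <stdbool.h>

enum {
    VRAM_BYTES       = 16 * 1024,   /* sms video ram */
    TILE_BYTES       = 32,          /* 8x8 pixels, 4 bits per pixel */
    NAME_TABLE_BYTES = 32 * 28 * 2, /* screen map, 2 bytes per entry */
    SAT_BYTES        = 256,         /* sprite attribute table */
    SCREEN_TILES     = 32 * 24      /* visible 256x192 screen */
};

/* Tiles left once the screen map and sprite table are accounted for. */
int tile_budget(void)
{
    return (VRAM_BYTES - NAME_TABLE_BYTES - SAT_BYTES) / TILE_BYTES;
}

/* True if a w x h tile window can have a unique tile per cell while
   leaving `reserved` tiles for sprites, fonts and other graphics. */
bool window_fits(int w, int h, int reserved)
{
    return w * h <= tile_budget() - reserved;
}
```

A full screen needs SCREEN_TILES = 768 unique tiles against a budget of 448, so it cannot fit, while something like a 16x16 tile window (256 tiles) leaves plenty over for sprites and text.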

haroldoop wrote
You can get some fake 4 color full screen bitmap mode through palette trickery, as mentioned in this post:


It might be worthwhile to resurrect all these old ideas. Your example also reminds me of the svg lib in z88dk, although that one is only b&w.

There is also a b&w graphics lib in the classic library in z88dk but I have not ported it to the newlib because I think there's an opportunity to greatly improve it. The idea would be to hold a currently active "draw context" in static memory that would specify things like pen colour to get away from the b&w nature of the library while remaining cross-platform. Also introduce a graphics pipeline that would mainly consist of a clipping stage followed by a draw stage. By separating that the draw functions themselves can be made much faster because they won't need to be bounds checked and at the same time you get clipping to possibly irregularly shaped windows on screen. You can also make it possible to insert matrix transformation steps into the pipeline, and that should be done, but the main goal is to get at the clipping and keep it fast.
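A minimal sketch of that clip-then-draw split, in portable c, might look like the following. Nothing here exists in z88dk; the names, the tiny 32x24 surface and the horizontal-line primitive are all placeholders chosen to keep the example short, and the clip window is a plain rectangle rather than an irregular shape.

```c
#include <stdint.h>

#define FB_W 32   /* hypothetical surface size, in pixels */
#define FB_H 24

/* The "currently active draw context", kept in static storage. */
struct draw_context {
    uint8_t pen;            /* colour used by the draw stage */
    int x0, y0, x1, y1;     /* inclusive clip window */
};

struct draw_context ctx = { 1, 0, 0, FB_W - 1, FB_H - 1 };
uint8_t framebuffer[FB_H][FB_W];

void set_pen(uint8_t pen) { ctx.pen = pen; }

/* The window is clamped to the surface once, here, so that the draw
   stage never needs to check bounds again. */
void set_clip(int x0, int y0, int x1, int y1)
{
    ctx.x0 = x0 < 0 ? 0 : x0;
    ctx.y0 = y0 < 0 ? 0 : y0;
    ctx.x1 = x1 >= FB_W ? FB_W - 1 : x1;
    ctx.y1 = y1 >= FB_H ? FB_H - 1 : y1;
}

/* Draw stage: fast, no bounds checks. */
static void hline_unchecked(int xa, int xb, int y)
{
    for (int x = xa; x <= xb; ++x)
        framebuffer[y][x] = ctx.pen;
}

/* Clip stage: trim the span to the window, then pass it on.
   Returns the number of pixels drawn. */
int hline(int xa, int xb, int y)
{
    if (y < ctx.y0 || y > ctx.y1) return 0;
    if (xa < ctx.x0) xa = ctx.x0;
    if (xb > ctx.x1) xb = ctx.x1;
    if (xa > xb) return 0;
    hline_unchecked(xa, xb, y);
    return xb - xa + 1;
}
```

A transformation step would slot in ahead of the clip stage in the same way: each stage takes primitives in and hands primitives on, and only the last one touches the screen.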

There is only so much time you can commit to hobbies though.
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Wed May 10, 2017 1:22 pm
Alcoholics Anonymous wrote
There will be a 1.99C release of Z88DK in the near future so it would be good to get rid of those before that happens :)


SDCC has recently received lots of bug-fixes (especially since March), and this week some improvements¹ in handling of 8-bit parameters, switch statements and comparisons were committed, most of them Z80-specific. I'd thus suggest that for your 1.99C release you use SDCC 3.6.6 #9911 or later.

Philipp

¹ I mostly did those to reduce the code size of my game Io (http://www.smspower.org/forums/16598-CodingCompetition2017IoByPkK), to allow me to implement additional features while still staying within 48KB of ROM.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Wed May 10, 2017 2:01 pm
PkK wrote
[...] you use SDCC 3.6.6 #9911 or later.


Are you still working on this wave of improvements or is it done?
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Wed May 10, 2017 7:32 pm
sverx wrote
PkK wrote
[...] you use SDCC 3.6.6 #9911 or later.


Are you still working on this wave of improvements or is it done?


It is done. The last commit was yesterday afternoon. Today's commits were about improving DWARF support, which is currently only relevant to the hc08, s08 and stm8 backends.

Philipp
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Fri May 12, 2017 1:42 pm
@PkK: I just updated to SDCC 3.6.6 #9913 - the generated code seems to be smaller and faster, which is very good... unfortunately the compilation time for my project went from around 75 seconds to more than 2 minutes :|
Of course I can live with that, but I just wanted to let you know this (mmm... by chance has the default --max-allocs-per-node value changed?)
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Fri May 12, 2017 2:18 pm
sverx wrote
@PkK: I just updated to SDCC 3.6.6 #9913 - the generated code seems to be smaller and faster, which is very good... unfortunately the compilation time for my project went from around 75 seconds to more than 2 minutes :|
Of course I can live with that, but I just wanted to let you know this (mmm... by chance has the default --max-allocs-per-node value changed?)


I would not have expected such a large increase in compile time. The changes add some peephole rules, and allow more freedom in register allocation, so a small increase in compile-time is to be expected. But I would have guessed it to be in the 1% to 5% range.

Can you show me some compilable functions where the increase in compile time is particularly noticeable?

Philipp
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Fri May 12, 2017 2:41 pm
PkK wrote
I would not have expected such a large increase in compile time.


mmm... false alarm. Now it compiles in 1:33, which is just around 15 seconds more than with 3.6.0 - don't know what happened before.
Unfortunately I'm compiling a single large file (3K lines) with lots of includes, so I wouldn't know what exactly is now slower, sorry :|
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Fri May 12, 2017 7:03 pm
PkK wrote

SDCC has recently received lots of bug-fixes (especially since March), and this week some improvements¹ in handling of 8-bit parameters, switch statements and comparisons were committed, most of them Z80-specific. I'd thus suggest that for your 1.99C release you use SDCC 3.6.6 #9911 or later.


Yes we always go to the latest version for a release. I'm using 9913 now for testing.

I haven't noticed any big changes yet but we already fixed the peephole problems so any change there won't have an impact for z88dk.

Quote

¹ I mostly did those to reduce the code size of my game Io (http://www.smspower.org/forums/16598-CodingCompetition2017IoByPkK), to allow me to implement additional features while still staying within 48KB of ROM.


You may be able to save another 2k through z88dk :)

RTS games are kind of my thing. I haven't successfully figured out how to survive yet. Even on the easiest level it seems there is barely time to build a path and upgrade the mine. I did that once and my sulphur was hovering around 50 or so before it began dropping again.
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Fri May 12, 2017 7:11 pm
Alcoholics Anonymous wrote

RTS games are kind of my thing. I haven't successfully figured out how to survive yet. Even on the easiest level it seems there is barely time to build a path and upgrade the mine. I did that once and my sulphur was hovering around 50 or so before it began dropping again.


A path is just enough to move as much sulphur as is mined in the unupgraded source. Which is also just enough to supply one idle worker.

I recommend building either two paths or a railway; that way you can effectively use up the sulphur that accumulated at the source before the path / railway was completed.

The source node also does not supply sulphur while it is being upgraded. So I recommend building a second source node and connecting it to the HQ before upgrading the first source node.

Philipp

P.S.: Currently the ColecoVision version is a bit more advanced than the Sega one. But it has slightly simpler graphics. On the other hand, the ColecoVision version already works on hardware. I'm still working on making the Sega version work at least on my Mark III (at the moment there is a remaining issue with sprites).
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Fri May 12, 2017 7:21 pm
Alcoholics Anonymous wrote

Yes we always go to the latest version for a release. I'm using 9913 now for testing.


There is a gbz80 issue possibly introduced by the recent changes:


'ucgbz80': 4 failures, 9821 tests, 1823 test cases, 2867269 bytes, 19857619 ticks
   Failure: gen/ucgbz80/bitfields/bitfields
   Failure: gen/ucgbz80/muldiv/muldiv_storage_none_type_char_attr_volatile
   Failure: gen/ucgbz80/muldiv/muldiv_storage_static_type_char_attr_volatile


I'll try to look into it on Sunday.

Philipp
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Fri May 12, 2017 11:30 pm
I have tested a new project and found that the devkitSMS UNSAFE functions (UNSAFE_memcpy_64, 32...) do not exist. Is it planned to add these functions? They are VITAL for performance.

Also, I find the compiler a bit slow; maybe the settings (taken from the astroforce github instructions) should be used only for the final product. I'll try to disable them and see if the process is faster.
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Sat May 13, 2017 12:13 am
Ah, no, sorry. It was my fault. I forgot to change the smslib includes to those in the z88dk kit.

As I always have performance troubles, I have to say the z88dk-compiled version is way faster than the sdcc one (although it takes 5-10 mins to compile): an 8-direction tiled scroll with tile effects runs great with z88dk, while in sdcc I see slight slowdowns. But I found trouble stopping music. Some music channel remains playing (frozen, but playing the last sound). Please fix. It is awesome!!!
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sat May 13, 2017 12:53 am
eruiz00 wrote
Ah, no, sorry. It was my fault. I forgot to change the smslib includes to those in the z88dk kit.


Ah good. Yes you have to remember to change the includes to "<arch/sms/SMSlib.h>" and "<arch/sms/PSGlib.h>". The devkitSMS macros for the SMS/SDSC headers also have to be changed to pragmas for z88dk.

You can see the sdcc header for SMSlib.h online or maybe the plain header without the extra sdcc attributes is easier to read. That's what's supposed to be in there and working.

It is possible there are problems because that was quite a lot of code to incorporate and add c interfaces to.

Quote

As I always have performance troubles, I have to say the z88dk-compiled version is way faster than the sdcc one (although it takes 5-10 mins to compile): an 8-direction tiled scroll with tile effects runs great with z88dk, while in sdcc I see slight slowdowns.


You're still using sdcc but instead of using libraries implemented in C, zsdcc is using libraries implemented in asm which also allows for z88dk_fastcall / z88dk_callee linkage and register preservation info. The other source of speedup is we've tried to write a bunch of peephole rules to "fix" code that sdcc sometimes stumbles on. This is an ongoing process. The next time I look at this, I'll be inspecting the example programs you and others have offered to see if there's anything that can be improved there.

The long compile times are coming from the "max-allocs-per-node" number of 200000. The default is 3000 and reducing to that will speed up compile times quite a bit. The sky is the limit with "max-allocs-per-node" and I think PkK runs at 1000000 for benchmarks, but we chose 200000 as a reasonable upper bound for defining the aggressive peephole rules (SO3). So the combination "-clib=sdcc_iy -SO3 --max-allocs-per-node200000" (and optionally "--opt-code-size") should be the sweet spot for best code output. Diminishing returns is in action for this number so even going as low as 60000 maybe won't be terrible in comparison to 200000 and going to 1000000 isn't too much better than 200000.

I did notice that it was rare in the sms examples to use a max-allocs-per-node value at all which means the default of 3000 was being used in compiles.

Making use of a makefile to get at incremental compilation will speed up development times too.

There is another compiler in z88dk called sccz80. Some people use it to get speedy compiles while developing.

sccz80 is not as standard compliant as sdcc and it has a few important features missing (like no multi-dimensional arrays), so it cannot compile all the code that sdcc can. Currently, zsdcc is also generating faster and smaller code in most circumstances as well.

Not all the sms examples we got can be compiled with sccz80 for these reasons but "Moggy Master" can if you want to compare compile speed and end result in code size.

Quote

but found trouble stopping music. Some music channel remains playing (frozen, but playing the last sound). Please fix. It is awesome!!!


Do you know which function might be at fault?

PkK wrote

There is a gbz80 issue possibly introduced by the recent changes:


I have a pending error as well that looks like it may be in the compiler. It's difficult to track down so I'm not sure where the fault is yet.
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Sat May 13, 2017 7:08 am
Alcoholics Anonymous wrote

I have a pending error as well that looks like it may be in the compiler. It's difficult to track down so I'm not sure where the fault is yet.


Do you use the SDCC regression test suite for testing z88dk?

Philipp
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Sat May 13, 2017 7:14 am
An idea for reducing code size when integrating z88dk and devkitSMS: On the SMS, we have full control over the lower address range (unlike the ColecoVision, where there is always the BIOS ROM).

One could thus place __sdcc_enter_ix at a place where it can be reached by an rst instruction and add a peephole to replace the calls.

E.g. in Io, I see 119 calls to __sdcc_enter_ix (excluding those from the standard library and those from libcv / libcvu). So there would already be savings of 238 bytes at a very moderate cost of one cycle per call.
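The arithmetic behind that figure is simple enough to write down. A z80 CALL to an absolute address takes 3 bytes and an RST takes 1, so each replaced call site saves 2 bytes. The function name below is only for illustration.

```c
/* A z80 "call nn" is 3 bytes; an "rst p" is 1 byte. */
enum { CALL_BYTES = 3, RST_BYTES = 1 };

/* Bytes saved by turning every call site of one routine into an rst. */
unsigned rst_bytes_saved(unsigned call_sites)
{
    return call_sites * (CALL_BYTES - RST_BYTES);
}
```

For the 119 call sites counted in Io this gives the 238 bytes quoted above.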

Philipp
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Sat May 13, 2017 7:18 am
Alcoholics Anonymous wrote

You may be able to save another 2k through z88dk :)


I might look into that when looking at z88dk next time to decide which changes from z88dk should be merged into upstream SDCC. But I doubt z88dk will be able to reduce the code size by 2KB.

Philipp
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Sat May 13, 2017 8:24 am
Do you know which function might be at fault?

May be PSGStop. I explain. I call PSGStop to stop the music, but I continue calling PSGupdate every frame. With SDCC the music stops. In z88dk (I downloaded the z88dk nightly yesterday) I get random results: sometimes one channel remains frozen (a piiiiiip), sometimes the next time I call PSGPlay it does not play anything. With SDCC I get good results. If you want/need, I can send the source code/sms privately.

Maybe we have to use both SDCC / Z88DK suites, to get the best from both worlds!

Doing more testing, even if I call PSGFrame after PSGStop I obtain the piiiiip.
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Sat May 13, 2017 12:23 pm
I (slowly...) started working on rewriting (some) devkitSMS functions in native z80 ASM. This, plus a substantial set of rules for the peephole optimizer (taken from z88dk), would probably make devkitSMS/SDCC and z88dk output match almost exactly. Until then, I guess one should try both if he really wants to squeeze until the last cycle from our beloved machine :)
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sat May 13, 2017 4:00 pm
PkK wrote

Do you use the SDCC regression test suite for testing z88dk?


No but it would be useful especially to check the aggressive peephole set. We intend to set things up 'properly'. So we've set up a "zsdcc" repository at https://github.com/z88dk/zsdcc which contains one branch for unmodified sdcc, one branch for clean changes to sdcc and one branch for zsdcc as it is. sdcc's original svn repo is being copied to github by svn2github and then we fork from that.

The plan is to do a clean implementation of zsdcc (minus hackery) so that the second branch above can act as a source of patches for upstream sdcc. The main thing right now is the broken peepholer in sdcc which doesn't accurately match when registers are read/written.

zsdcc will be brought into z88dk by including it as a git submodule. Right now everything is separate and we have to compile zsdcc separately.

When this is properly done I would like to run the sdcc regressions to make sure zsdcc remains correct.

Quote

One could thus place __sdcc_enter_ix at a place where it can be
reached by an rst instruction and add a peephole to replace the calls.


That's a pretty good idea. There is a mechanism in z88dk to change calls to RST but only if a function name begins with "__RSTXXh_..." or something like that.

For this one the correct thing would be to qualify that with "--opt-code-size".

I'll take a look at it later.

Quote

But I doubt z88dk will be able to reduce the code size by 2KB.


We usually see about 5% code size reduction in pure C. The asm libraries are much smaller but you'd only see gains from that if the programmer uses the library. (~2k on printf, ~4k on float, etc).

Quote

May be PSGStop. I explain.


OK thanks I will have a look!
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Sat May 13, 2017 4:38 pm
Alcoholics Anonymous wrote

Quote

One could thus place __sdcc_enter_ix at a place where it can be
reached by an rst instruction and add a peephole to replace the calls.


That's a pretty good idea. There is a mechanism in z88dk to change calls to RST but only if a function name begins with "__RSTXXh_..." or something like that.

For this one the correct thing would be to qualify that with "--opt-code-size".


It shouldn't be necessary to make the transformation from a call to __sdcc_enter_ix into rst depend on --opt-code-size. After all, if --opt-code-size is not specified, SDCC will not emit calls to __sdcc_enter_ix in the first place.

Philipp
  • Joined: 07 Aug 2007
  • Posts: 220
  • Location: Yach, Germany
Post Posted: Sat May 13, 2017 4:46 pm
Alcoholics Anonymous wrote
PkK wrote

Do you use the SDCC regression test suite for testing z88dk?


No […]
When this is properly done I would like to run the sdcc regressions to make sure zsdcc remains correct.


Actually, it should be quite easy to just do regression testing with z88dk:
Assuming that z88dk can be invoked by "z88dk" and accepts the same parameters as SDCC (--fverbose-asm -DNO_VARARGS --nostdinc, -I, -L, --less-pedantic, -c, -o) you could just do a "make test-ucz80 SDCC=z88dk" in sdcc/support/regression. If parameters are a bit different, you might want to change sdcc/support/regression/ports/ucz80/spec.mk first.

Considering that these regression tests are very useful in avoiding bugs in SDCC, I assume they'd also be useful for z88dk.

Philipp
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sun May 14, 2017 3:38 am
eruiz00 wrote

May be PSGStop.


There was a bug there: a "jr z," instead of a "jr nz,". It should be fixed now in the current nightly build available for download.
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Sun May 14, 2017 9:46 am
It works great now!

Sorry, I don't want to bother you too much, but trying zsdcc I am getting errors about zsdasz80 not being found. I see sdzasz80.exe in sdcc, but cannot find zsdasz80 in the z88dk folder.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Post Posted: Sun May 14, 2017 11:17 am
For particular hot code paths it might be fun to post them as mini optimisation challenges.
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sun May 14, 2017 1:56 pm
eruiz00 wrote
Sorry, I don't want to bother you too much, but trying zsdcc I am getting errors about zsdasz80 not being found. I see sdzasz80.exe in sdcc, but cannot find zsdasz80 in the z88dk folder.


It's no problem :)

z88dk doesn't use sdcc's assembler and linker. The backend is done by z80asm in z88dk. zsdcc is only used to translate c to asm but as it is still sdcc, if you try to do something with zsdcc directly like make a binary, it will probably try to call sdcc's back end assembler asz80.

Is that what is happening here? Are you invoking zsdcc directly instead of going through zcc?

If you're trying to do one of the following, it can be done with zcc like this:

** Create object files instead of binary (-c)

Each input file will be compiled to an object file with file extension ".o". If an output filename is given ("-o name") then all the files will be compiled into a single consolidated object file "name.o".

Since "d.o" is already an object file nothing will be done with it except in the consolidated object file case.

zcc +sms -vn -c -clib=sdcc_iy -SO3 --max-allocs-per-node200000 a.c b.s c.asm d.o

** Translate C files to assembler (-a)

The input files will be translated to .asm files. zcc will only do this for .c or .s input files so in this case "c.asm" and "d.o" will be ignored. ".s" files are experimental -- they are asm files in asz80 syntax and zcc will translate those to standard zilog form for z80asm.

.asm files in this form can act as input to the compile process so you could translate to .asm, hand edit and then use the result in compiling. "--c-code-in-asm" can be removed to get rid of c code as comments in the output.

zcc +sms -vn -a -clib=sdcc_iy -SO3 --max-allocs-per-node200000 a.c b.s c.asm d.o --c-code-in-asm

** Make a library (-x)

This line will make a library out of the source files listed in file "lib.lst".

zcc +sms -vn -x -clib=sdcc_iy -SO3 --max-allocs-per-node200000 @lib.lst -o mylib

(There are many other options. For example, z88dk uses m4 as a macro preprocessor so you can have m4 macros in your c or asm code by adding a .m4 to the filename like "foo.asm.m4". Then -m4 will stop after the m4 step.)

With "clib=sdcc_iy", zsdcc is being used as compiler with the "--reserve-regs-iy" option and the IY version of the c library. In this scenario, zsdcc is assigned the IX register for frame pointer and the c library uses IY. We find this results in the best code because the aggressive peephole rules (SO3) find more opportunities to transform code with reserve-regs-iy.

With "clib=sdcc_ix", zsdcc is again the c compiler but without "reserve-regs-iy" and the c library shares IX with the compiler.

With "clib=new" the c compiler is sccz80 and you should get rid of the zsdcc-specific optimization settings SO, max-allocs-per-node, opt-code-size.

Optimization settings for sdcc are discussed here.

It is probably useful to change the "-vn" in the above to "-v" so you can see what zcc is doing with source files. This helps to understand what is going on.

When using zsdcc you will typically see .c go through:

* zsdcpp - sdcc's c preprocessor
* zpragma - z88dk's pragma extractor
* zsdcc - translation of c to asm
* copt with sdcc_opt.1 rules - translation of asz80 syntax to zilog syntax and change asz80 directives to z80asm directives
* copt with sdcc_opt.9 rules - RST substitution for CALLs to code at RST locations, fix sdcc critical sections to use library routines
* copt with sdcc_opt.2 rules - make sdcc's calls to its primitives use callee linkage

The output from the linker is binaries for the CODE (banks 0+1), DATA (ram) and BSS (ram) sections and the memory banks. If zcc is given the "-create-app" option, it will invoke appmake (another z88dk tool) to pick up those pieces and make a .sms file.

I'll mention this because the DATA section, in ROM environments like the sms, must be stored in ROM along with the code so that it can be copied into RAM at startup. This DATA section can be compressed by z88dk using zx7 (an lz77 compressor included in z88dk).

If you add "-pragma-define:CRT_MODEL=2" this will be done automatically. The compression of the DATA section must save at least ~80 bytes to be a win, since the z80 decompressor code is about that size. With this pragma defined on the compile line or in the pragma file, the compressed binary will be generated and a message will report how many bytes were saved, so you can check whether there is an advantage to using it. You can also try compressing the output "*_DATA.bin" directly with zx7 to see how much is saved.

AstroForce has a sizable DATA section, at least 600 bytes if I recall, but the compression only saved 44 bytes so it didn't make sense to use it in this case. The savings were small because AstroForce is already compressing that data with psgaiden, etc.
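
The break-even rule can be written down as a quick check. This is only a sketch: the 80-byte figure is the approximate decompressor size mentioned above, and the helper name is made up for illustration (it is not part of z88dk).

```c
#include <stddef.h>

/* Approximate size of the z80 zx7 decompressor, per the estimate above. */
#define DECOMPRESSOR_BYTES 80

/* Hypothetical helper: net ROM bytes gained (positive) or lost (negative)
   by storing the DATA section in compressed form. */
long data_compression_gain(size_t raw_bytes, size_t compressed_bytes)
{
    long saved = (long)raw_bytes - (long)compressed_bytes;
    return saved - DECOMPRESSOR_BYTES;
}
```

Taking AstroForce's DATA section as roughly 600 bytes with 44 saved, data_compression_gain(600, 556) comes out negative, which is why compression was left off there.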
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Post Posted: Sun May 14, 2017 4:09 pm
Maybe this is a dumb question but why must data be copied to RAM? I can see the necessity for global variables initialised to non default values, but I can't see why compressed graphics data needs to go there.
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sun May 14, 2017 7:03 pm
Last edited by Alcoholics Anonymous on Sun May 14, 2017 10:41 pm; edited 1 time in total
Maxim wrote
Maybe this is a dumb question but why must data be copied to RAM? I can see the necessity for global variables initialised to non default values, but I can't see why compressed graphics data needs to go there.


You're right, the DATA section is only non-const data that must be in ram. If you use "const" in the data declaration, that stuff is assigned to an RODATA (read only data) section that will be rolled into the CODE section, i.e. it will stay in rom.


// This stuff will be placed in BSS

int score;
unsigned char table[100];

// This stuff will be placed in DATA

int value = 100;
unsigned char lookup[] = {0, 1, 2, 3, 4, 5};

//  This stuff will be placed in RODATA (part of the CODE section)

const unsigned char graphics[] = {0, 1, 2, 3, 4, 5};
const int bonus = 500;

// No variables are created here.  Extern only informs the
// C compiler of their existence somewhere in another file or library

extern int a, b;
extern unsigned char more_graphics[];

// This asm will be placed in DATA (smc is a self-modifying code section)

SECTION smc_user
PUBLIC _oops

_oops:

   ld hl,0
   ld (_oops),hl
   ret



The BSS section will hold ram variables whose initial values are unspecified. At startup, the CRT will zero this region in ram.

The DATA section will hold non-const data or self-modifying code that must be in ram. The CRT needs a copy of this in ROM so that it can memcpy this block into ram before main is called. This is the section that can be stored in compressed form in rom.

The RODATA section is actually a subsection of CODE. Anything here will be stored in ROM. The key difference is the "const" keyword that indicates to the compiler this data will not change and therefore does not have to be in ram.
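
To make the startup step concrete, here is a small simulation of what the CRT does with BSS and DATA before main is called. It is a sketch under assumptions, not the actual z88dk crt code: the array names and crt_init_sketch are invented for illustration, the sizes match the example declarations above assuming a 2-byte little-endian int, and the uncompressed (plain memcpy) case is shown.

```c
#include <string.h>

/* Simulated regions; on real hardware the linker fixes their addresses.
   ROM copy of DATA: int value = 100 (2 bytes), then lookup[] = {0..5}. */
static const unsigned char rom_data_image[] = { 100, 0, 0, 1, 2, 3, 4, 5 };
static unsigned char ram_data[sizeof rom_data_image];

/* BSS: int score (2 bytes) + unsigned char table[100]. */
static unsigned char ram_bss[102];

/* Hypothetical stand-in for the CRT's work before main() runs. */
void crt_init_sketch(void)
{
    memset(ram_bss, 0, sizeof ram_bss);                       /* zero BSS */
    memcpy(ram_data, rom_data_image, sizeof rom_data_image);  /* DATA: rom -> ram */
}
```

RODATA needs no such step because it is read in place from rom.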


When you're using zsdcc's --constseg and --codeseg command line options, you're changing the name of the section that zsdcc assigns rodata and code to away from the "RODATA" and "CODE" equivalents used by default. This is how you can get the c compiler to place const data and code into different memory banks. sdcc is currently lacking equivalent --bssseg, --dataseg and --initseg for other important sections and this is something we're looking to add too. The --initseg is where zsdcc places user initialization code that is run before main is called.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Post Posted: Sun May 14, 2017 7:18 pm
Ah, a more extreme reason to use const than usual...
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Mon May 15, 2017 5:30 am
Yeah! While playing AstroForce compiled with z88dk I found some small errors, which were fixed by adding const to some data arrays. I have to upload the new fixed code!
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Mon May 15, 2017 11:54 am
eruiz00 wrote
[...] devkitSMS UNSAFE functions [...] these functions [...] are VITAL for performance.


I'm curious to know if you're using both UNSAFE VRAM copy functions and regular "safe" ones when you can't use the faster option. If so, you might notice a performance gain by updating devkitSMS as I've just rewritten SMS_VRAMmemcpy() and SMS_VRAMmemcpy_brief() [for short transfers 1-256 bytes] in ASM so that they're now as fast as possible, while keeping them VRAM safe. Also SMS_loadTileMap() is now a macro that calls SMS_VRAMmemcpy(), which also means that constant map address calculation has moved to compile time instead of at run time.
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Mon May 15, 2017 12:42 pm
Tonight I will update and test the new functions. I always use the unsafe functions to change tile sprites, and they are always called after waittovblank. (In the new game, which scrolls in 8 directions, the order is waittovblank->update hscroll and vscroll->change tile sprites with unsafe functions->fill side rows or columns when needed by scroll displacement.) It runs great this way, but if I change the order of any of these I begin to see artifacts.
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Mon May 15, 2017 12:56 pm
... I was thinking it won't be a big improvement for my tilemap functions. The fact is I am using a 16x16 tilemap size. Each tilemap number tn represents tiles (tn<<2), (tn<<2)+1, (tn<<2)+2, (tn<<2)+3. So I can save space and keep really big maps in one bank with fast collision checks. The row fill method is a loop of 16 iterations, and the column fill method is a loop of 14. I found the column function slower as I have to move the vram pointer 28 times, but it is fast enough.
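
A rough sketch of that scheme in C, for anyone following along. The function names are invented for illustration, and the layout of the four tiles inside each 16x16 block (top-left, top-right, bottom-left, bottom-right) is an assumption, since it is not stated above.

```c
/* Expand one 16x16 map entry into its four 8x8 tile numbers.
   Assumed layout: 0 = top-left, 1 = top-right,
                   2 = bottom-left, 3 = bottom-right. */
void metatile_to_tiles(unsigned char tn, unsigned int out[4])
{
    /* Parenthesise the shift: tn<<2+1 would parse as tn<<(2+1). */
    unsigned int base = (unsigned int)tn << 2;
    out[0] = base;
    out[1] = base + 1;
    out[2] = base + 2;
    out[3] = base + 3;
}

/* Row fill: 16 map entries become two rows of 32 tiles each. */
void expand_metatile_row(const unsigned char meta_row[16],
                         unsigned int top[32], unsigned int bottom[32])
{
    unsigned char i;
    unsigned int t[4];

    for (i = 0; i < 16; ++i) {
        metatile_to_tiles(meta_row[i], t);
        top[2 * i]        = t[0];
        top[2 * i + 1]    = t[1];
        bottom[2 * i]     = t[2];
        bottom[2 * i + 1] = t[3];
    }
}
```

A column fill would walk 14 entries the same way but produce 28 separate tile rows, which matches the 28 vram pointer moves mentioned above.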
  • Joined: 05 Sep 2013
  • Posts: 3828
  • Location: Stockholm, Sweden
Post Posted: Mon May 15, 2017 3:12 pm
eruiz00 wrote
[...] the order is waittovblank->update hscroll and vscroll->change tile sprites with unsafe functions->fill side rows or columns when needed by scroll displacement.


That's good. Unfortunately, sometimes vblank time isn't enough to do everything you need to, so you have to perform some operations during vdraw time. SMS_VRAMmemcpy_brief(), for instance, can copy a tile (32 bytes) to VRAM in roughly the time it takes the UNSAFE functions to copy *two* tiles. But it's still quite fast.
SMS_VRAMmemcpy() can copy 1536 bytes (32*24*2 bytes) in approx. 190 scanlines, which means you can update the visible part of the tilemap in less than a single (NTSC or PAL) frame. I'm using it to 'flip the screen' without turning it off and with no audio interruption.
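
The arithmetic behind that claim, as a tiny sketch. The 190-scanline figure is the approximate measurement quoted above; the macro and function names are only for illustration, and 262 and 313 are the usual NTSC and PAL scanline counts per frame.

```c
/* Scanlines per frame. */
#define NTSC_LINES 262
#define PAL_LINES  313

/* Visible tilemap: 32 columns x 24 rows, 2 bytes per entry. */
#define TILEMAP_BYTES (32 * 24 * 2)

/* Nonzero if a transfer lasting `lines` scanlines fits inside one frame. */
int fits_in_frame(int lines, int lines_per_frame)
{
    return lines < lines_per_frame;
}
```

With about 190 scanlines for the 1536 bytes, fits_in_frame() holds for both NTSC_LINES and PAL_LINES.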
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Tue May 16, 2017 6:50 am
eruiz, there was a bug in sdcc that was affecting the stage selection in AstroForce. I don't know if you noticed that the stage selector moved rather quickly in the z88dk compile. That's been fixed by PkK in #9916.

However a new bug has appeared in Baluba Balok that I am trying to track down. See attached .sms. The enemies are unable to jump onto platforms above.

I have checked the generated asm for:

gettilexyb
canmoveleft
canmoveright
doenemyhorizontalmovement
doenemiesmovement
doenemymovement
checkenemyjumpright
checkenemyjumpleft
doenemyverticalmovement
insideblock
fixy

It all looks correct to me but it's possible I missed something.
Offhand do you know in which function the bug may be?
bb.zip (10.66 KB)

  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Tue May 16, 2017 12:22 pm
Ummmm... some months ago I saw the screen selection in AstroForce failing on some Android emulator (the problem was the opposite: it was not possible to move the selector). I don't know why. The only thing I can think of is the key-released check that is done there.

About Baluba Balok, I have not tested it on z88dk but I will tonight.
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Tue May 16, 2017 12:47 pm
eruiz00 wrote

About Baluba Balok, I have not tested it on z88dk but I will tonight.


Ok cheers. If you're planning to compile, the bug is only recent and I think it's connected to the latest improvements in sdcc. I am getting the bug with sdcc #9916.

If you're on windows, I've placed a zsdcc.exe in https://drive.google.com/file/d/0B6XhJJ33xpOWNU1PeVdxaGx5emM/view?usp=sharing
just extract that to z88dk/bin.
  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Tue May 16, 2017 7:56 pm
Ummmm...

For some weird reason the last line (around 830) in doenemyverticalmovement does not work as expected. Enemy vertical speed does not get updated, and enemies only check for horizontal movement if vertical speed is 0.

I have changed the line (maybe the comparison of a signed char with a constant unsigned char (16) does not work?) and now it works ok.

The stranger thing is that after updating with today's upload of yours, even compiling with PkK's sdcc (v 9913) fails, when one hour ago (before updating zsdcc and zsdcpp) it worked well!
balubabalok-fixed.rar (73.45 KB)

  • Joined: 28 Jan 2017
  • Posts: 556
  • Location: Málaga, Spain
Post Posted: Tue May 16, 2017 8:08 pm
Btw, I see what looks like another bug...

The initial stage selection value is 0 if compiled with z88dk, while it is 1 if compiled with sdcc. It must be a zsdcc bug because the var gamestage is initialized with a value of 1 in function dopressinit.

Maybe this error is related to the AstroForce error. I see similar patterns (key status check before setting the var)...
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Wed May 17, 2017 4:22 am
eruiz00 wrote

For some weird reason the last line (around 830) in doenemyverticalmovement does not work as expected. Enemy vertical speed does not get updated, and enemies only check for horizontal movement if vertical speed is 0.

I have changed the line (maybe the comparison of a signed char with a constant unsigned char (16) does not work?) and now it works ok.


That was it - I just couldn't see it. The compiler is doing an unsigned comparison instead of a signed one:


   // Ahhhh la gravedad
   if(enemygravity[a]<16)enemygravity[a]+=1;

4035  1243              l_doenemyverticalmovement_00125:
4036  1243  E1             pop   hl
4037  1244  7E             ld    a,(hl)
4038  1245  E5             push  hl
4039  1246  FE 10          cp    a,0x10
4040  1248  30 04          jr    NC,l_doenemyverticalmovement_00128
4041  124A  3C             inc   a
4042  124B  E1             pop   hl
4043  124C  E5             push  hl
4044  124D  77             ld    (hl),a


If I change that line to:


//if((enemygravity[a]<16) || (enemygravity[a]>120))enemygravity[a]+=1;


it works. I'll report that at sdcc and see if PkK can fix it.
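
For anyone wondering why the wrong comparison matters, here is a small C model of the three cases. It assumes enemygravity is a signed char, as eruiz00 suggested; the function names are made up, and the reading of the workaround (the second test also being compiled as an unsigned compare, so it catches negative values) is my interpretation rather than something confirmed in the generated code.

```c
/* What the C source asks for: a signed comparison. */
int less_than_16_signed(signed char g)
{
    return g < 16;
}

/* What "cp a,0x10 / jr NC" does: compares the raw byte as unsigned,
   so a negative gravity such as -3 (0xFD = 253) is not "less than 16". */
int less_than_16_unsigned(signed char g)
{
    return (unsigned char)g < 16;
}

/* Model of the workaround when both tests are done unsigned:
   bytes above 120 cover the negative range 128..255. */
int less_than_16_workaround(signed char g)
{
    unsigned char u = (unsigned char)g;
    return (u < 16) || (u > 120);
}
```

The workaround model differs from the true signed test only for values 121..127, which this sketch assumes never occur as a gravity value.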

Quote

The stranger thing is that after updating with today's upload of yours, even compiling with PkK's sdcc (v 9913) fails, when one hour ago (before updating zsdcc and zsdcpp) it worked well!


I'm pretty sure this is a recent problem (like >= 9913). This program was working fine the last time I compiled it.


Quote

The initial stage selection value is 0 if compiled with z88dk, while it is 1 if compiled with sdcc. It must be a zsdcc bug because the var gamestage is initialized with a value of 1 in function dopressinit.


This:


5773  1D31  21 03 00       ld   hl,_gamestage
5774  1D34  2E 01          ld   l,0x01
5775  1D36  75             ld   (hl),l


will not work :) Thanks for finding that, now looking at what happened here.

Edit: Looks like our fault. An aggressive rule wants to do the above instead of:


ld hl,_gamestage
ld (hl),0x01
ld l,0x01


Fixed that one now.
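
To spell out why the bad sequence fails, here is a tiny C model of the HL register pair and memory. It is purely illustrative: the function names are invented and the address used in the usage note is a hypothetical ram location, not the real address of _gamestage.

```c
#include <stdint.h>

static uint8_t mem[0x10000];   /* simulated 64K address space */

/* Bad sequence: ld hl,addr / ld l,0x01 / ld (hl),l */
uint16_t store_buggy(uint16_t addr)
{
    uint16_t hl = addr;
    hl = (uint16_t)((hl & 0xFF00) | 0x01);  /* ld l,0x01 overwrites the low address byte */
    mem[hl] = (uint8_t)(hl & 0xFF);         /* ld (hl),l stores to the wrong address */
    return hl;                              /* address that was actually written */
}

/* Intended sequence: ld hl,addr / ld (hl),0x01 / ld l,0x01 */
uint16_t store_correct(uint16_t addr)
{
    uint16_t hl = addr;
    mem[hl] = 0x01;                         /* ld (hl),0x01 happens before L changes */
    return hl;                              /* address that was actually written */
}
```

With a variable at, say, 0xC003, store_buggy() writes the 1 to 0xC001 and leaves the variable at 0, which matches the stage selector starting at 0 instead of 1.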
  • Joined: 17 Nov 2015
  • Posts: 97
  • Location: Canada
Post Posted: Sat May 20, 2017 2:10 pm
PkK was very quick in fixing the bugs at sdcc and I think we've sorted ours out now too. Everything will be right again in the May 21 build.

Thanks again for your help eruiz.