View topic - VDP - Updating VRAM in display area

  • Joined: 19 Oct 2011
  • Posts: 4
  • Location: Sweden
VDP - Updating VRAM in display area
Post Posted: Wed Oct 19, 2011 2:38 pm
Hi all, new member here!
I've just started doing some coding for the SMS, and after reading here I have a couple of questions.

I'm doing a complete update of the "name table" (or whatever the correct term for it is) every frame, and this takes longer than the VBL period allows.
At first I thought it would be no problem, as I timed my updates so that I was ahead of the raster beam, i.e. in the VBL I finish updating the top of the screen, and while the top of the screen is being drawn, I update the lower part of the screen.
However, some threads on this forum have informed me that updating the VRAM during the display area needs to be carefully timed, with at least 29 cycles between VRAM writes.

Questions:

1) Do the 29 cycles include or exclude the VDP port write? That is, if I do an out ($be),a, do I need to waste 29 cycles before I do the next one, or is it enough to waste 29 - 11 (the time for the out instruction) = 18 cycles?

2) I only use the lower 8 bits of the patterns, so my code looks a bit like this

; First position on screen
; (some calculations, result in a)
out   (VDPData),a   ; low byte of the entry
xor   a
out   (VDPData),a   ; high byte, always 0
; Second position on screen
; (some calculations, result in a)
out   (VDPData),a
xor   a
out   (VDPData),a
...
; (repeated 32x24 times in total)

My calculations for the low 8-bits already waste a fair bit of cycles, so padding that up a bit wouldn't be a problem, but having to extend the second VDP-write to take 29 cycles as well will really kill my performance.
Is there a way to advance the VRAM-pointer 2 steps, without having to do the 29-cycles-wait-required write?

-edit-
I'm doing all my testing on a PAL (50Hz) Megadrive with an Everdrive MD. As I understand it, the MD VDP is much more forgiving with fast VRAM writes, so the fact that it works for me is no indication that it will work on a real SMS.
Because right now, my code works just fine, even with only 4 cycles between the two VRAM writes, but I want to make my code run fine on a real SMS as well!
  • Joined: 06 Apr 2011
  • Posts: 250
  • Location: Netherlands
Post Posted: Wed Oct 19, 2011 2:51 pm
Hi, welcome to this forum!

Good to see you interested in SMS programming (I recognised you from the MSX demo Invasion of the big pixels ;) ).

Updating the full PNT (pattern name table) is something that you should avoid if possible in SMS mode 4. I had the same issue when I started (only recently) coding for SMS.
Another 'feature' I found a bit odd compared to other systems is the fact that it is impossible to update the vertical offset outside VBLANK.

I have a real SMS (PAL with 50/60hz switch) here to test the VDP writes if you like.

Good luck!
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Post Posted: Wed Oct 19, 2011 3:32 pm
1. Including, but the official docs are a bit unclear on the subject. In the SMS docs:

Quote
There is a timing constraint for accessing the VDP chip.
The VDP chip cannot process data any faster than the following rates:
16 Z80A T-States during VBLANK
29 Z80A T-States during active video.
This means that you should never issue two consecutive OUT or IN instructions to the VDP; they should be separated by at least a NOP instruction.

...effectively saying you can't outi during VBlank, which many games do; and the GG docs say that you should put PUSH IX; POP IX between each OUT opcode. Experimentally, you seem to be able to get away with smaller pauses, and I'm sure there's a thread on this forum explaining exactly how that works in relation to the VDP access windows.
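
To make the "including" reading concrete, here is a small Python sketch of the arithmetic. It assumes the two limits quoted above (16 and 29 T-states), the 11 T-states of `out (n),a` mentioned in the question, and the standard 16 T-states of `outi`; the function name is only illustrative.

```python
# T-state limits between VDP data port accesses, as quoted from the SMS docs.
LIMIT_VBLANK = 16
LIMIT_ACTIVE = 29

# Standard Z80 instruction timings (T-states).
OUT_N_A = 11   # out (n),a
OUTI = 16      # outi


def padding_needed(limit, access_cost=OUT_N_A):
    """Extra T-states to insert between two accesses.

    The limit is measured from one access to the next, so the time
    taken by the access instruction itself counts towards it.
    """
    return max(0, limit - access_cost)


print(padding_needed(LIMIT_ACTIVE))        # 18
print(padding_needed(LIMIT_VBLANK))        # 5
print(padding_needed(LIMIT_VBLANK, OUTI))  # 0
```

On this reading, back-to-back `outi` sits exactly on the 16 T-state VBlank rate, which would fit with many games using it there.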

2. Not really - re-setting the address would incur the same wait requirement, I think. There may be a trick regarding reading, rather than writing, the byte you want to skip but I'm not sure how that works.

Some games do update the whole name table like this, but they normally do it with lower frame rates. I haven't examined them closely to see exactly what they do; having two name tables and swapping at 30Hz would be one option.
  • Site Admin
  • Joined: 08 Jul 2001
  • Posts: 8653
  • Location: Paris, France
Post Posted: Wed Oct 19, 2011 4:47 pm
Yes, I reckon you can read to advance the pointer.
  • Joined: 28 Sep 1999
  • Posts: 1197
Post Posted: Wed Oct 19, 2011 8:48 pm
After setting a complete 16-bit address, you can later write one byte to port $BF to update the lower 8 bits of the VRAM address, but be warned that this doesn't work on the Genesis with a Power Base Converter. And I think Sega must have advised against the practice as only Codemasters games do it.

I agree with Maxim and Bock here, you can definitely mix in reads with writes to advance the VRAM address as much as you need. Normally you need a bit of a delay to read valid data from VRAM, but just for purposes of incrementing the address you can read faster than the VDP can provide data. In that case you have to assume the data read is garbage, as it is seldom valid.

If you can afford to put the I/O port number in register C, you can write zero to VRAM quickly with "out (c), 0" and increment the VRAM address without trashing registers with "in f, (c)". The former will not work on the Game Gear however.
  • Joined: 23 Jan 2011
  • Posts: 65
  • Location: The Land of Enchantment
Post Posted: Wed Oct 19, 2011 11:31 pm
Be careful about emulators - NONE of them handle the timing for VRAM writes correctly. The only way to be sure is to try it on real hardware.

On that topic, I have both an SMS1 with MKIII Myth cart, and a Genesis Model 2 and CDX both with MD Myth carts for testing SMS games. All three are NTSC. I also have a Nomad with 50/60 Hz switch, SMS mod, and a MD Myth cart I could test either NTSC or PAL SMS games on.
  • Joined: 03 Oct 2011
  • Posts: 188
  • Location: New Zealand
Post Posted: Thu Oct 20, 2011 11:54 pm
Maxim wrote
and I'm sure there's a thread on this forum explaining exactly how that works in relation to the VDP access windows..


Is this the thread you were thinking of?

http://www.smspower.org/forums/viewtopic.php?t=2126
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Post Posted: Fri Oct 21, 2011 9:32 am
honestbob wrote
Is this the thread you were thinking of?

No - that one was written by some buffoon with no clue on the electronics side of things, but somewhere around the "cycle accurate emulation" parts of the dev forum there was something with some counting of the VRAM clock cycles and the necessary accesses by the VDP display circuitry that came up with some guesses for exactly how many accesses you could make and when.
  • Joined: 19 Oct 2011
  • Posts: 4
  • Location: Sweden
Post Posted: Fri Oct 21, 2011 2:46 pm
Thank you all for your answers!

Zipper: Hehe, nice that you've seen "Invasion of the big pixels", then you have an idea of what I'm trying to do on the SMS now! :)

Consensus seems to be that 16 in VBLANK/29 in display is safe, so my solution ended up being splitting the code into two routines.
In VBLANK I run the "fast" version and that handles the update of 17x32=544 positions.

ld    a,(table+offset)   ; 13  fetch the table value
add   a,c                ; 4   add c
out   (VDPData),a        ; 11  write low byte of the entry
ld    a,0                ; 7   high byte is always 0
out   (VDPData),a        ; 11  write high byte
...


So that's 28 cycles for the first write and 18 for the second, which should both be above the lower limit of 16.

Then when I enter the display area, I use a "slow" version that updates the final 7x32=224 positions.

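ld    a,(table+offset)   ; 13  fetch the table value
add   a,c                ; 4   add c
nop                      ; 4   padding
out   (VDPData),a        ; 11  write low byte of the entry
push  bc                 ; 11  padding
pop   bc                 ; 10  padding
xor   a                  ; 4   high byte is always 0
out   (VDPData),a        ; 11  write high byte
...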


There we have 32 cycles for the first write and 36 for the second, also well above the suggested 29.
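
As a rough sanity check on the split, the following Python sketch adds up the cost of the two routines above and compares it with the time available outside the active display. The scanline figures (228 Z80 T-states per line, 192 active lines, 313 lines per frame on PAL and 262 on NTSC) are assumptions brought in for illustration, not numbers from this thread, and the sketch treats every line outside the 192 active ones as safe for fast writes.

```python
# Assumed figures, not taken from the thread: 228 Z80 T-states per
# scanline, 192 active lines, 313 (PAL) or 262 (NTSC) lines per frame.
TSTATES_PER_LINE = 228
ACTIVE_LINES = 192
LINES_PER_FRAME = {"PAL": 313, "NTSC": 262}

# Per-entry cost of the two routines above (two writes per entry).
FAST_ENTRY = 28 + 18   # T-states
SLOW_ENTRY = 32 + 36   # T-states


def routine_cost(rows, per_entry, columns=32):
    """T-states needed to update `rows` full rows of the name table."""
    return rows * columns * per_entry


def blank_budget(system):
    """T-states available outside the 192 active lines."""
    return (LINES_PER_FRAME[system] - ACTIVE_LINES) * TSTATES_PER_LINE


fast = routine_cost(17, FAST_ENTRY)
slow = routine_cost(7, SLOW_ENTRY)
print(fast, slow)                                  # 25024 15232
print(blank_budget("PAL"), blank_budget("NTSC"))   # 27588 15960
```

Under these assumptions the 17-row fast pass fits into the PAL blanking period with roughly 2500 T-states to spare (before any interrupt or setup overhead), but it would not fit on an NTSC machine, where fewer rows could go through the fast routine.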

Now I just need someone to test it, who can also promise to keep my routines secret since I want to release it all in a nice demo that surprises everybody later on! :D
Zipper? Chilly Willy?

I also tried the "read-to-advance-pointer" trick, and had the following code for the "fast" version

ld    a,(table+offset)   ; 13  fetch the table value
add   a,c                ; 4   add c
out   (VDPData),a        ; 11  write low byte of the entry
in    a,(VDPData)        ; 11  dummy read to skip the high byte
...


But that led to severe visual corruption even on my Megadrive, so I guess you need the 16 cycle delay regardless of whether you are interested in the result of the read or not.
  • Site Admin
  • Joined: 19 Oct 1999
  • Posts: 14745
  • Location: London
Post Posted: Fri Oct 21, 2011 3:29 pm
Well, that'd explain why you want to update the whole name table. For comparison, you might want to check out the zoomer portion of Nine Pixels and some of the technical discussion as it's a similar sort of fat pixel effect.
  • Joined: 23 Jan 2011
  • Posts: 65
  • Location: The Land of Enchantment
Post Posted: Fri Oct 21, 2011 6:01 pm
There's no problem keeping things secret. You can even go so far as an NDA if need be. Just PM me and I can give you an email address to go through instead of trying to do everything via PM.
  • Joined: 19 Oct 2011
  • Posts: 4
  • Location: Sweden
Post Posted: Sat Oct 22, 2011 12:24 pm
Maxim: Ah, hadn't seen "Nine pixels" before, well, guess I won't be breaking any new ground on the SMS then, but it can still be fun to code some bigpixel effects.

OT: Nine pixels doesn't run properly on my Megadrive, I just get a black screen when the stretching part of the Amiga workbench-lookalike is about to start. I wonder what kind of hardware trick is used there that isn't MD-compatible?

Chilly willy: Haha, no NDA needed, but I just want to avoid having the whole demo as parts for everyone to see before it's finished. I'll PM you!
  • Joined: 06 Feb 2009
  • Posts: 110
  • Location: Toulouse, France
Post Posted: Sat Oct 22, 2011 4:55 pm
Maxim wrote
honestbob wrote
Is this the thread you were thinking of?

No - that one was written by some buffoon with no clue on the electronics side of things, but somewhere around the "cycle accurate emulation" parts of the dev forum there was something with some counting of the VRAM clock cycles and the necessary accesses by the VDP display circuitry that came up with some guesses for exactly how many accesses you could make and when.


I think you are talking about this one (look for asynchronous first message):
http://www.smspower.org/forums/viewtopic.php?t=11662

Now, if you compare with the TMS9918 timings, the assumptions in that post make a lot of sense and show how similar both designs (TMS9918 and SMS VDP) probably are.

Basically, during the active part of a line (256 pixels), every 32 pixels there is a CPU access window where data in the CPU buffer is written to / read from VRAM. If you write to the data port too fast, some bytes are going to be overwritten and lost. There are most likely a few additional access slots during HBLANK, but not much can be said about them.

32 pixels is exactly 16 VRAM accesses, which is also 64 MCLK (on a Master System with its 10.xx MHz master clock) or 64/3 = 21.3333 CPU clocks.
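
The conversion above can be written out as a short Python sketch. It uses only the ratios stated in this post (2 MCLK per pixel, 3 MCLK per CPU clock, 16 VRAM accesses per 32 pixels); the constant and function names are made up for illustration, and the access window model itself is speculation, as noted further down.

```python
from fractions import Fraction

# Ratios as described in the post; the master clock frequency itself
# is not needed, only the divisors.
MCLK_PER_PIXEL = 2        # 32 pixels = 64 MCLK
MCLK_PER_CPU_CLOCK = 3    # 64 MCLK = 64/3 CPU clocks
ACCESSES_PER_WINDOW = 16  # VRAM accesses per 32 pixels
PIXELS_PER_WINDOW = 32


def window_in_cpu_clocks(pixels=PIXELS_PER_WINDOW):
    """Length of one CPU access window in Z80 clocks, as an exact fraction."""
    return Fraction(pixels * MCLK_PER_PIXEL, MCLK_PER_CPU_CLOCK)


window = window_in_cpu_clocks()
mclk_per_access = PIXELS_PER_WINDOW * MCLK_PER_PIXEL // ACCESSES_PER_WINDOW
print(window)            # 64/3
print(mclk_per_access)   # 4
```

Purely within this model, two writes spaced at least 22 T-states apart would always have a window between them, which might explain why pauses shorter than 29 appear to work in experiments.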

I think Retrocopy is the only emulator that tries to emulate this behavior, but I don't know how accurate it is, since no real tests were made on the Master System VDP (like Nemesis recently did for the Mega Drive VDP). This remains pure speculation based on how the TMS9918 was designed, though it seems very logical this way.

For the record, the Mega Drive VDP has a 4-word FIFO instead of a simple buffer so, even in MS compatibility mode, you can probably write faster to the data port and all data will still be processed in time. In MD mode, the main CPU even waits when the FIFO is full (the VDP does not assert /DTACK), so you can't overwrite data. I don't know if the same thing happens when the Z80 is the main CPU (through the use of wait cycles, for example?).



Quote
OT: Nine pixels doesn't run properly on my Megadrive, I just get a blackscreen when the stretching part of the Amiga workbench-lookalike is about to start. I wonder what kind of hardware trick is used there that isn't MD-compatible?


The MD VDP does not support all features of the MS VDP; some register bits especially do not work as expected and would indeed black out the screen. I think it's quite well detailed in Charles's VDP documentation. If you are using the sprite zooming function, for example, it does not work on the MD VDP.