
View topic - Genesis Mode 4 VRAM Timing

  • Joined: 01 Jan 2017
  • Posts: 6
  • Location: SF Bay Area, USA
Genesis Mode 4 VRAM Timing
Posted: Sun Jan 01, 2017 6:02 pm
I originally posted this over on SpritesMind, but I figure it would be of interest to people here as well. I've recently been working on adding Mode 4 support to BlastEm (my Genesis emulator), so I decided to analyze VRAM traffic in order to get the timing right. This post is to share what I've found. First, some general info:


  • The VDP clocks 4 bytes out of VRAM for each slot, but only the first 2 are actually used. This means it takes 2 slots to read a single tile row.
  • There are no refresh cycles during active display (seems to rely on the fact that background table reads will cover all pages)
  • The SAT cache does not appear to be used, so the sprite table scan actually reads the sprite Y coordinates from VRAM
  • Address mapping is different, as Charles MacDonald observed previously: the row address represents A8-A1 rather than A9-A2; column address bit 0 remains A0, but bit 1 is now A9 rather than A1; column bits 5-2 remain A13-A10.
  • Background rendering starts 12 slots later than in Mode 5. This makes sense because it does not render 2 extra columns to support scrolling like Mode 5 does (an 8 slot savings) and there is only one background (4 slots).
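
For anyone decoding captures themselves, the Mode 4 mapping in the list above can be sketched as a pair of Python helpers. This is only an illustration: the function names are made up, and the bit order within the row address (row bit 0 = A1) is an assumption, since the description only gives the range A8-A1.

```python
def mode4_to_row_col(addr):
    """Split a 14-bit Mode 4 VRAM address into (row, column) values."""
    row = (addr >> 1) & 0xFF           # row bits 7-0 = A8-A1 (bit order assumed)
    col = addr & 1                     # column bit 0 = A0
    col |= ((addr >> 9) & 1) << 1      # column bit 1 = A9
    col |= ((addr >> 10) & 0xF) << 2   # column bits 5-2 = A13-A10
    return row, col


def row_col_to_mode4(row, col):
    """Inverse of mode4_to_row_col: rebuild the 14-bit address."""
    addr = col & 1                     # A0
    addr |= (row & 0xFF) << 1          # A8-A1
    addr |= ((col >> 1) & 1) << 9      # A9
    addr |= ((col >> 2) & 0xF) << 10   # A13-A10
    return addr
```

The inverse function is the one a capture-processing script would need, since the logic analyzer sees row and column values rather than the VDP's internal address.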


Since Mode 4 is simpler than Mode 5, the access patterns are also simpler. I'm lazy, so I'm not going to draw a nice diagram like Nemesis did for Mode 5. Hopefully, the simplicity of the pattern will make a textual description sufficient. I'll start off with a description of a few types of "blocks", by which I mean a sequence of several slots that is repeated multiple times in a line. Note: "active" sprites refers to sprites that will be drawn on the current line.

Sprite Render Block (rendering for 2 "active" sprites):
Sprite N X/Name Read
Sprite N+1 X/Name Read
Sprite N Tile read (1st word)
Sprite N Tile read (2nd word)
Sprite N+1 Tile Read (1st word)
Sprite N+1 Tile Read (2nd word)

Background Render Block (rendering for 4 columns):
Column N Name Table Read
External Slot
Column N Tile Read (1st word)
Column N Tile Read (2nd word)
Column N+1 Name Table Read
Sprite (16+N*1.5) Y Read (Reads Y of 2 sprites)
Column N+1 Tile Read (1st word)
Column N+1 Tile Read (2nd word)
Column N+2 Name Table Read
Sprite (16+N*1.5+2) Y Read (Reads Y of 2 sprites)
Column N+2 Tile Read (1st word)
Column N+2 Tile Read (2nd word)
Column N+3 Name Table Read
Sprite (16+N*1.5+4) Y Read (Reads Y of 2 sprites)
Column N+3 Tile Read (1st word)
Column N+3 Tile Read (2nd word)
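
To make the "16+N*1.5" indexing concrete, here is a small Python sketch that generates the 16 slot labels of one Background Render Block. The function name is hypothetical, and it assumes N is the first column of the block (0, 4, 8, ..., 28), which is how the formula yields whole sprite numbers.

```python
def background_block(n):
    """Return the 16 slot labels for the block starting at column n."""
    if n % 4:
        raise ValueError("blocks start at columns 0, 4, 8, ...")
    first_sprite = 16 + n * 3 // 2   # 16 + N*1.5, kept in integers
    slots = []
    for k in range(4):
        col = n + k
        slots.append(f"Column {col} Name Table Read")
        if k == 0:
            # The first column of each block gets the external slot.
            slots.append("External Slot")
        else:
            # The other three columns each read the Y of two sprites.
            s = first_sprite + 2 * (k - 1)
            slots.append(f"Sprite {s} & {s + 1} Y Read")
        slots.append(f"Column {col} Tile Read (1st word)")
        slots.append(f"Column {col} Tile Read (2nd word)")
    return slots
```

For example, `background_block(0)` contains Y reads for sprites 16 through 21, and `background_block(28)` ends with sprites 62 & 63, so eight blocks cover sprites 16-63.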


Okay, with those building blocks defined, the sequence for a single line is as follows:

2X Sprite Render Block ("active" sprites 0-3)
External Slot
External Slot
HSYNC low
External Slot
External Slot
External Slot
2X Sprite Render block ("active" sprites 4-7)
HSYNC goes high before the tile reads for "active" sprite 7
External Slot
External Slot
Sprite 0 & 1 Y Read
Sprite 2 & 3 Y Read
Sprite 4 & 5 Y Read
Sprite 6 & 7 Y Read
Sprite 8 & 9 Y Read
Sprite 10 & 11 Y Read
Sprite 12 & 13 Y Read
Sprite 14 & 15 Y Read
8X Background Render Block
External Slot
External Slot
External Slot
External Slot
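
As a sanity check on the listing above, the slots can be tallied in a few lines of Python. This is just bookkeeping over the sequence as written (the table layout and names are made up for the example); the HSYNC notes are not slots and are left out.

```python
# (description, total slots, external slots, sprite Y read slots)
LINE = [
    ('2X Sprite Render Block ("active" sprites 0-3)', 12, 0, 0),
    ("External Slots (before HSYNC goes low)", 2, 2, 0),
    ("External Slots (after HSYNC goes low)", 3, 3, 0),
    ('2X Sprite Render Block ("active" sprites 4-7)', 12, 0, 0),
    ("External Slots", 2, 2, 0),
    ("Sprite 0-15 Y Reads", 8, 0, 8),
    ("8X Background Render Block", 8 * 16, 8 * 1, 8 * 3),
    ("External Slots (end of line)", 4, 4, 0),
]

total_slots = sum(row[1] for row in LINE)
external_slots = sum(row[2] for row in LINE)
sprites_scanned = 2 * sum(row[3] for row in LINE)  # each Y read covers 2 sprites

print(total_slots, external_slots, sprites_scanned)  # 171 19 64
```

The total of 171 slots agrees with the per-line figure quoted in the follow-up post, and the Y reads account for all 64 sprites.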

I'd like to thank Charles MacDonald for his previous work on the behavior of Mode 4 on the Genesis as it has been quite helpful.
  • Joined: 01 Jan 2017
  • Posts: 6
  • Location: SF Bay Area, USA
Posted: Tue Jan 03, 2017 5:02 am
I received a couple of questions about this post via PM, but I think the answers might be of general interest so I'm going to answer them here. Hopefully that is okay.

The first question was looking for some additional context for understanding what was going on in this post. For that, I think the post Nemesis wrote on Mode 5 (look for a post titled "VDP VRAM Access Timing") might be helpful. My post is essentially the same info, but for Mode 4 and presented in a less digestible format (again due to my laziness). The TMS9918 Master Timing diagram hosted here on SMS Power might also be useful. Additionally, here's a bit more of an explanation:

The SMS and Genesis VDPs are line-oriented devices. On each line they need to fetch certain information from VRAM in order to perform rendering. My post attempts to explain the order in which those reads are done, as well as the points at which the VRAM bus is idle and available to service read/write requests from the CPU (an "external slot"). Here's a longer description of each of the items in the list:


  • Sprite X/Name Read - This "slot" reads the x-coordinate and tile name from the second half of the sprite attribute table. This read is only done for sprites that are "active" on the current line.
  • Sprite Tile Read - This slot reads two bytes of the appropriate row of the tile for an "active" sprite. Since a tile row is 4 bytes total, there will be two of these for each active sprite.
  • Column Name Table Read - This slot reads an entry from the background table for a single column. The entry contains the tile name (i.e. tile address/32), priority bit, palette bit and horizontal/vertical flip bits for the tile.
  • Column Tile Read - This slot reads two bytes of the appropriate row of the tile for a column. Since a tile row is 4 bytes total, there will be two of these for each column.
  • Sprite Y Read - This slot reads the Y coordinate for two sprites from the first half of sprite attribute table. This is done for all 64 sprites in the sprite table to determine which sprites will be "active" on the next line. Each time a sprite "matches" the current line number (y coordinate <= current line && y+sprite height > current line) some information is recorded in an internal buffer for use in rendering the next line (as long as the sprite limit hasn't been hit anyway).
  • External Slot - The VDP does no rendering related accesses in this slot, so it is available to service requests from the CPU. While the VDP can read two bytes in a slot for rendering (4 when in mode 5), it can only transfer a single byte for the CPU. This is due to the fact that external slots use the random access port of the VRAM whereas rendering slots use the serial port which requires sequential access.
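
The matching rule in the Sprite Y Read item can be sketched in Python as follows. This is a simplified illustration, not emulator code: the function name is hypothetical, the limit of 8 is inferred from the eight "active" sprite slots in the line listing, and it ignores both the $D0 terminator and the off-by-one in the Y coordinate that comes up later in this thread.

```python
SPRITE_LIMIT = 8  # inferred from "active" sprites 0-7 in the line sequence


def find_active_sprites(y_coords, line, sprite_height=8):
    """Return the indices of sprites matching `line`, in table order."""
    active = []
    for index, y in enumerate(y_coords):
        # Condition as quoted: y <= line && y + sprite height > line
        if y <= line and y + sprite_height > line:
            if len(active) < SPRITE_LIMIT:
                active.append(index)  # recorded for use when rendering
    return active
```

For example, `find_active_sprites([0, 10, 3, 200], 5)` returns `[0, 2]`, since only the sprites at Y 0 and Y 3 overlap line 5.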


Since there are 171 total slots in a line and I am rather lazy, I didn't list all of them directly. Instead, I called a pattern of repeated accesses a "block" and then referred to these blocks in my list.

The second question was how I acquired this information. I hooked up a logic analyzer (Open Bench Logic Sniffer) to the VRAM in a Genesis, ran a simple SMS demo I wrote with predictable access patterns and then performed a capture. I then wrote some Python code to process the capture to turn the values into a series of RAM access values. This is complicated by the fact that the VRAM uses multiplexed addresses (i.e. the address is split into a column part and a row part) and the mapping is not entirely straightforward. Fortunately, I knew how row and column addresses mapped to the VDP's internal addressing system from my work with Mode 5 and Charles provided a mapping between the Mode 5 addressing system and Mode 4.

I can provide the raw captures and the script if anyone is interested.
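
As a rough illustration of the demultiplexing step described above (this is not the script mentioned in the post; the sample format and function name are invented for the example), a capture can be turned into (row, column) pairs by latching the address pins on the falling edges of the DRAM strobes. It assumes conventional RAS/CAS behaviour and does not model the VRAM serial port.

```python
def decode_accesses(samples):
    """Turn capture samples into a list of (row, column) accesses.

    `samples` is an iterable of (ras_n, cas_n, addr) tuples, one per
    logic analyzer sample, with the strobes active low (hypothetical format).
    """
    accesses = []
    prev_ras, prev_cas = 1, 1
    row = None
    for ras_n, cas_n, addr in samples:
        if prev_ras == 1 and ras_n == 0:
            row = addr                      # falling RAS latches the row
        if prev_cas == 1 and cas_n == 0 and ras_n == 0 and row is not None:
            accesses.append((row, addr))    # falling CAS latches the column
        prev_ras, prev_cas = ras_n, cas_n
    return accesses
```

Several CAS pulses under one RAS pulse produce several accesses with the same row, which is what page-mode reads look like in a capture. The resulting pairs still need to be mapped back to VDP addresses afterwards.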

P.S. Sorry for the lack of links, seems my account is too new to use them.
  • Joined: 14 Aug 2000
  • Posts: 740
  • Location: Adelaide, Australia
Genesis Mode 4 VRAM Timing
Posted: Tue Jan 03, 2017 1:03 pm
Nice! Thanks for sharing your findings.
  • Joined: 01 Jan 2017
  • Posts: 6
  • Location: SF Bay Area, USA
Posted: Mon Jan 09, 2017 8:15 am
My anonymous PM correspondent had some additional questions. I was hoping he/she might ask them here so it doesn't seem like I'm talking to myself, but I guess they have their reasons for sticking to PM.

Quote
Unfortunately I don't know anything about MD but on SMS in Mode 4 without extended height a y value of $d0 for a sprite indicates the end of the SAT. All sprites after that won't be processed (I mean they don't get drawn, can't cause the overflow flag to be set and can't cause the collision flag to be set). In your post you say that it always reads all 64 sprites. Doesn't it stop reading them when it reaches the end of the SAT?

I did another test for this and the VDP always reads the Y coordinate for all 64 sprites even if it hits a $D0 terminator. AFAIK, that feature does work so presumably the results are ignored for any sprites read after $D0.
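
In emulator terms, that observation could be modelled roughly as below: every Y value is still fetched, but matches are dropped once the terminator has been seen. This is a sketch under assumptions (hypothetical function name, 192-line mode, a limit of 8 sprites per line, no off-by-one handling), not a description of what the hardware does internally.

```python
TERMINATOR = 0xD0  # ends the sprite list in Mode 4 without extended height


def scan_line(y_coords, line, sprite_height=8, limit=8):
    """Return (y_values_fetched, active_sprite_indices) for one line."""
    fetched = 0
    terminated = False
    active = []
    for index, y in enumerate(y_coords):
        fetched += 1              # the VRAM read happens regardless
        if y == TERMINATOR:
            terminated = True
        if terminated:
            continue              # results after $D0 are presumably ignored
        if y <= line and y + sprite_height > line and len(active) < limit:
            active.append(index)
    return fetched, active
```

With a table of `[0, 0xD0]` followed by 62 zeros, `scan_line(table, 0)` still reports 64 fetches but only sprite 0 as active.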

He/she also was interested in the relevant files. The most interesting capture is probably the last one which I believe is from the version of the test that contains some sprites. Examining the capture directly is rather tedious so I wrote a Python script. The test program that I captured is not very interesting. My main goal was just to get as many distinct tile addresses as I could in a predictable pattern to better identify which tile read was which. I've attached a ZIP file to this post with all of the above.

I also have some new miscellaneous observations:

  • Unused "active" sprite slots fetch sprite zero, including the relevant tile row, even if sprite zero is not visible on the current line. Assuming this behavior is carried over from the SMS VDP, it may explain why it's typically sprite zero tile data that impacts the blanked out first column.
  • Active sprites appear in slots in sprite table order; however, unused slots appear before any used slots.
  • Even in Genesis mode, pending interrupts must be cleared using the status port when in Mode 4. They are not cleared by a 68K interrupt acknowledge like in Mode 5 in Genesis mode.
  • Toggling TH does not seem to latch a new HCounter value when in Genesis mode, nor does toggling the M2 bit (controls HVC Latch in Mode 5). It's possible that externally toggling TH when it is set to an input would do the trick, but I haven't tested it.
  • At least when in Genesis mode, horizontal interrupts and the vcounter increment both appear to be 50 ticks of SC (190 master clock ticks) after !HSYNC goes low. Vertical interrupts are triggered about 96 SC ticks after !HSYNC goes low. This timing is different than Mode 5 (very different for HInt/Vcounter change, slightly different than Vint) which is somewhat surprising. Assuming that the H Counter progression is identical between Mode 4 and H32 Mode 5, that would put these events at roughly H Counter 249 and 4 respectively.

mode4_analysis.zip (24.87 KB)

  • Joined: 01 Jan 2014
  • Posts: 331
Posted: Mon Jan 09, 2017 9:15 am
Hi

Interesting stuff. Thought I might add that the contents of the active sprite slots appear to persist if the frame is interrupted, even over the vblank area.

You can see this in action in the demo I wrote. In the checker parallax section, I transition by turning off the VDP mid-frame going up from the bottom, then turning it back on in vblank. The VDP attempts to write active sprites on scanline 0.

The whole split fetch situation is a doozy for vertical multiplexing on SMS :( There is a ROM I wrote floating around (can't find it) that manipulates the split fetch cycle to multiplex in Alex Kidd on real hardware. On emulators you see Mario due to the single sprite fetch in hblank.

Behavior is identical on MD, ruling out use of the sprite cache. From memory, Y fetch timing is off by around 20 Z80 cycles between systems, making multiplexing more annoying if you want compatibility.

Have you done testing related to latching of the vscroll value? I would love a way to manipulate this beyond the single latch per frame.
  • Joined: 01 Jan 2017
  • Posts: 6
  • Location: SF Bay Area, USA
Posted: Tue Jan 10, 2017 10:05 pm
psidum wrote
Interesting stuff. Thought I might add that the contents of the active sprite slots appear to persist if the frame is interrupted, even over the vblank area.

Not too surprising, but good to have confirmation.

psidum wrote
You can see this in action in the demo I wrote. In the checker parallax section, I transition by turning off the VDP mid-frame going up from the bottom, then turning it back on in vblank. The VDP attempts to write active sprites on scanline 0.

Interesting, I'll have to check it out.

psidum wrote
The whole split fetch situation is a doozy for vertical multiplexing on SMS :(

Are you talking about the split between the fetch of the Y and X/Name parts of the table or the split between the first 4 sprites and the last 4 sprites for a line?

psidum wrote
There is a ROM I wrote floating around (can't find it) that manipulates the split fetch cycle to multiplex in Alex Kidd on real hardware. On emulators you see Mario due to the single sprite fetch in hblank.

I'd be very interested in getting a copy of this if you can find it or remember the name.

psidum wrote
Behavior is identical on MD, ruling out use of the sprite cache.

I can confirm that the sprite cache is only used in Mode 5.

psidum wrote
Have you done testing related to latching of the vscroll value?

I have not. It's a bit harder to investigate as it's not really observable externally like things fetched from VRAM are.
  • Joined: 31 Oct 2007
  • Posts: 853
  • Location: Estonia, Rapla city
Posted: Sun Jan 29, 2017 3:21 am
I snooped around on my SMS2 and it seems to behave the same way. In addition, blanking has the same timings on TMS99xx VDPs on SMS, and most likely the same function. The TMS does 64 refreshes for DRAMs in blanking during the first 256 pixels (and gives 64 access slots), and in the remainder of the line there are 42 access slots. MD should retain this behaviour for refresh too.
  • Joined: 01 Jan 2017
  • Posts: 6
  • Location: SF Bay Area, USA
Posted: Sun Jan 29, 2017 8:57 pm
TmEE wrote
I snooped around on my SMS2 and it seems to behave the same way.

The same way meaning that the relative timing of events in Mode 4 is the same, or something else?

TmEE wrote
In addition, blanking has the same timings on TMS99xx VDPs on SMS, and most likely the same function. The TMS does 64 refreshes for DRAMs in blanking during the first 256 pixels (and gives 64 access slots), and in the remainder of the line there are 42 access slots. MD should retain this behaviour for refresh too.

I assume by blanking you mean vertical blanking. Does this happen on every line starting with 192 until it hits zero, or only on certain ones?
  • Joined: 31 Oct 2007
  • Posts: 853
  • Location: Estonia, Rapla city
Posted: Mon Jan 30, 2017 7:56 am
The order of events seems the same, but I lack a logic analyzer to be 100% certain. VRAM access slots were certainly in the same places and the same order.

And yes, vertical blanking. It starts on the next line after active display and goes on until one line before active display. That one line between blanking and active display is used for sprite preparation and has the same structure as normal active display lines. MD has the same kind of behaviour too, so you have one less blanking line in total to do updates with.
  • Joined: 01 Jan 2017
  • Posts: 6
  • Location: SF Bay Area, USA
Posted: Tue Jan 31, 2017 11:14 pm
TmEE wrote
And yes, vertical blanking. It starts on the next line after active display

That makes sense. During the active display it can rely on name table fetches for refresh, but that obviously won't work during vertical blanking.

TmEE wrote
and goes on until one line before active display. That one line between blanking and active display is used for sprite preparation and has the same structure as normal active display lines. MD has the same kind of behaviour too, so you have one less blanking line in total to do updates with.

I thought Mode 4 couldn't display sprites on line 0 (Charles' doc says that sprites with a Y of zero are displayed on line 1). An extra "active" line shouldn't be needed in that case.
  • Joined: 31 Oct 2007
  • Posts: 853
  • Location: Estonia, Rapla city
Posted: Wed Feb 01, 2017 5:54 am
You can definitely show sprites on line 0, just that the Y coord is one off. 0 = 1, 255 = 0.
  • Joined: 05 Dec 2019
  • Posts: 56
  • Location: USA
Posted: Mon Dec 23, 2019 1:18 am
I made a diagram of this (see attachment below).

The diagram implies that the rule might be "A VRAM write will take effect within the next 22 Z80 cycles. Don't issue another while one is waiting." This would be just barely too slow for the OTIR instruction, but it's fine for any manipulation of the value between read and write.

Yet Sega documents allegedly specify 29 cycles, and hardware tests say 26. (Source: sverx's reply to "[FIXED] SMS-only VDP issue (works on MD/Genesis)") I guess there might be some sort of VDP-internal delay for scheduling the write.
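
The 22-cycle figure can be reproduced from the slot listing in the first post with a short Python sketch. It assumes 228 Z80 cycles per line (the usual SMS figure, not stated in this thread) spread evenly over the 171 slots, looks only at active display lines, and uses made-up function names.

```python
import math

Z80_CYCLES_PER_LINE = 228  # assumption: standard SMS line length in Z80 cycles


def line_layout():
    """One active line as a list of booleans; True marks an external slot."""
    sprite_block = [False] * 6
    background_block = [False, True] + [False] * 14
    return (
        sprite_block * 2
        + [True] * 2            # external slots before HSYNC goes low
        + [True] * 3            # external slots after HSYNC goes low
        + sprite_block * 2
        + [True] * 2
        + [False] * 8           # Y reads for sprites 0-15
        + background_block * 8
        + [True] * 4
    )


def worst_gap_in_slots(layout):
    """Largest distance between consecutive external slots, wrapping to the next line."""
    externals = [i for i, ext in enumerate(layout) if ext]
    n = len(layout)
    following = externals[1:] + externals[:1]
    return max((b - a) % n for a, b in zip(externals, following))


def worst_gap_in_z80_cycles(layout):
    return worst_gap_in_slots(layout) * Z80_CYCLES_PER_LINE / len(layout)
```

Under these assumptions the worst gap is 16 slots (between the external slots of consecutive background blocks), or about 21.3 Z80 cycles, which rounds up to 22 and is just over the 21 cycles of a repeating OTIR. It does not account for any internal VDP delay of the kind speculated about above.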
