Author | Message |
---|---|
|
When does an HBlank start?
Posted: Wed Jun 15, 2022 12:39 pm
|
I'm playing around with interrupts and want to make the most out of the HBlank period, so I started off with a simple palette change of the 0th index sprite palette, which corresponds to the border color.
Using the Event viewer in Emulicious, I wanted to see where the HBlank was starting. I thought that the HBlank interrupt would start as soon as the HBlank period starts, but it looks like it doesn't start until it gets right back to the left side border, meaning if I use an interrupt and want to write to the VDP, I don't have any time before the SMS is drawing to the screen again. Should I be executing code until I hit the next HBlank period, or am I doing something wrong?

Here is $0038:

;==============================================================
; Interrupt Handler
;==============================================================
.orga $0038
    ;Swap shadow registers and registers
    ex af, af'
    exx
    ;Get the status of the VDP
    in a,(VDPCommand)
    ld (VDPStatus), a
    ;Count the number of interrupts since VBlank
    ld hl, INTNumber
    ld a, (hl)
    inc a
    ld (hl), a
    ;Do specific scanline-based tasks
    call InterruptHandler
    ;Swap shadow registers and register back
    exx
    ex af, af'
    ei
    ;Leave
    reti

And the InterruptHandler:

;Get here after coming from $0038
InterruptHandler:
    ;Check if we are at VBlank, Bit 7 tells us that
    ld a, (VDPStatus)
    bit 7, a                ;Z is set if bit is 0
    jp nz, +
    ld a, (INTNumber)
    cp $01
    jp nz, Black
NotBlack:
    ld hl, $c010 | CRAMWrite
    call SetVDPAddress
    ; Next we send the VDP the palette data
    ld (hl), $39
    ld bc, $01
    call CopyToVDP
    ret
Black:
    ld hl, $c010 | CRAMWrite
    call SetVDPAddress
    ; Next we send the VDP the palette data
    ld hl, color
    ld bc, $01
    call CopyToVDP
    ;Return to end Interrupt
    ret
;If we are on the last scanline (VBlank)
+:
    ;Set IntNumber to zero
    ld hl, INTNumber
    ld (hl), $00
    ;Update frame count
    call UpdateFrameCount
    ;Check what scene we're on
    ld a, (sceneID)
    cp $02
    jp nz, ++
++:
    ret
|
|
Posted: Wed Jun 15, 2022 12:55 pm |
In general by the time you take the interrupt it’s some way into the line; games often have delay loops before pushing palette changes so the CRAM dots are offscreen. Other changes like HScroll are latched so you can write them and they take effect on the next line. | |
|
Posted: Wed Jun 15, 2022 1:09 pm |
Ahhh okay, that makes sense why I wasn't noticing this issue when doing HScroll then. Thanks! |
|
|
Posted: Wed Jun 15, 2022 1:31 pm Last edited by willbritton on Mon Jun 20, 2022 8:35 pm; edited 2 times in total |
This is a very interesting subject (if that's the kind of thing that floats your boat!) and I noted from earlier exploration of the docs on this site that the exact point of interrupt was uncertain.
Very quick and possibly flawed back of envelope reasoning:

1. Charles MacDonald provides the following rough guide of horizontal "pixel" duration here, for NTSC:

Pixels H.Cnt   Description
  256 : 00-7F : Active display
   15 : 80-87 : Right border
    8 : 87-8B : Right blanking
   26 : 8B-ED : Horizontal sync
    2 : ED-EE : Left blanking
   14 : EE-F5 : Color burst
    8 : F5-F9 : Left blanking
   13 : F9-FF : Left border

In NTSC we can estimate a clock cycle as being roughly 262 * 342 * 60 / 3.5x10^6 = ~1.5 "pixels" long.

2. I'm going to make the assumption that the VDP doesn't issue the interrupt until at least the start of HSYNC (in practice it could well be some time later).

3. Interrupt timing for mode 1 isn't in the Z80 data sheet, but there is a very handy treatment by Achim Flammenkamp here. Note in particular the minimum timing of 13 clock cycles to get to $0038 and also that the interrupt signal won't even be strobed until the current instruction is finished, so with the longest instruction cycles that might be as many as 23 cycles for the CPU to "catch" the interrupt before servicing it.

Given all this, I reckon the absolute soonest you could possibly respond to a horizontal interrupt would be around 13 * 1.5 = ~20 pixels after HSYNC, which is still in the HSYNC itself; and in the worst case (13 + 23) * 1.5 = ~54 pixels after HSYNC, which is around 4 pixels into the left border. Possibly that's where we're seeing it in the screenshot here? Of course this doesn't include any code after the jump to $0038, which will take more cycles on top.
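The estimate above can be sanity-checked with a short script. This is only a sketch: the region widths come from the table above, the 3,579,545 Hz CPU clock is the nominal NTSC figure (an assumption; the post rounds it to 3.5 MHz), and the helper name `region_after_hsync` is made up for illustration. It also keeps the post's assumption that the interrupt is raised exactly at the start of HSYNC.

```python
# Sketch: where on the scanline is the beam N CPU cycles after the start of HSYNC?
# Region widths (in pixels) follow the table above, listed in order starting from HSYNC.
REGIONS_FROM_HSYNC = [
    ("Horizontal sync", 26),
    ("Left blanking", 2),
    ("Color burst", 14),
    ("Left blanking", 8),
    ("Left border", 13),
    ("Active display", 256),
    ("Right border", 15),
    ("Right blanking", 8),
]

PIXELS_PER_LINE = 342
LINES_PER_FRAME = 262
FRAMES_PER_SECOND = 60
CPU_HZ = 3_579_545  # assumption: nominal NTSC Z80 clock

PIXELS_PER_CYCLE = PIXELS_PER_LINE * LINES_PER_FRAME * FRAMES_PER_SECOND / CPU_HZ


def region_after_hsync(cycles):
    """Return (region name, pixels into that region) `cycles` CPU cycles after HSYNC starts."""
    pixel = (cycles * PIXELS_PER_CYCLE) % PIXELS_PER_LINE
    for name, width in REGIONS_FROM_HSYNC:
        if pixel < width:
            return name, pixel
        pixel -= width
    raise AssertionError("region widths should sum to a full line")


best = region_after_hsync(13)        # minimum mode 1 interrupt response
worst = region_after_hsync(13 + 23)  # plus the longest instruction still in flight
print(f"{PIXELS_PER_CYCLE:.3f} pixels per CPU cycle")
print("best case: ", best)
print("worst case:", worst)
```

With these inputs the best case lands inside horizontal sync and the worst case about 4 pixels into the left border, which matches the hand calculation.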
|
|
Posted: Thu Jun 16, 2022 12:26 pm |
Ooo that's tough if you're doing writes to VDP in that time. That kind of variance could definitely mess up anything that was just barely fitting in the allotted HBlank timing. I think I'll play around with finding out what can and can't be squeezed into that window |
|
|
Posted: Thu Jun 16, 2022 12:35 pm |
... yet I'm quite sure there's enough time to change the H_scroll value in time for the next line to get the value, if you do it as quickly as possible.
I have to dig up some code, probably, now that I said that... edit: here!

    ld a,(_next_bg_x_value)   // needs to have the value at hand!
    out (#0xBF),a
    ld a,#0x88                // write to hscroll VDP register
    out (#0xBF),a
|
|
Posted: Thu Jun 16, 2022 2:46 pm |
So this would adjust the scroll speed for all 192 lines independently then? You could get some seriously smooth curves going on with that |
|
|
Posted: Thu Jun 16, 2022 3:14 pm |
Sure, Hang On does just that to render the curving road! For reference, here are the instructions Hang On runs from $0038:

; @ $0038:
    push af
    in a, (Port_VDPStatus)
    or a
    jp p, _RAM_C4D0_            ; condition met
; @ $C4D0:
    in a, (Port_VCounter)
    cp $5F                      ; decide whether we're far enough down the screen to render road
    jr c, _LABEL_3C8_           ; condition not met
    ld ($C4DA), a
    ld a, ($C500)
    out (Port_VDPAddress), a
    ld a, $88
    out (Port_VDPAddress), a    ; write the horizontal scroll value

Not sure how many cycles that is, but it obviously happens very soon after the interrupt.

Also, and I'm only guessing here based on how I think the VDP background processing would be designed, but I would imagine you have until perhaps 8 or 16 pixels before the left hand drawable edge of the screen to set the horizontal scroll value. That's a good way into the left border, but I doubt it would be "extended" by blanking the left hand column -- maybe though...
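As a rough answer to "how many cycles that is", the path quoted above can be totalled with the standard Z80 T-state counts. This is a sketch, not a measurement: it assumes no wait states, it follows the branches exactly as annotated in the listing (`jp p` taken, `jr c` not taken), and it reuses the 13-cycle minimum interrupt response and the ~1.5 pixels per cycle quoted earlier in the thread.

```python
# Sketch: T-state count for the Hang On path quoted above, using standard Z80 timings.
# Each entry is (instruction as quoted, T-states on the path described in the post).
HANG_ON_PATH = [
    ("push af", 11),
    ("in a, (Port_VDPStatus)", 11),
    ("or a", 4),
    ("jp p, _RAM_C4D0_", 10),          # jp cc costs 10 whether taken or not
    ("in a, (Port_VCounter)", 11),
    ("cp $5F", 7),
    ("jr c, _LABEL_3C8_", 7),          # not taken: 7 (it would be 12 if taken)
    ("ld ($C4DA), a", 13),
    ("ld a, ($C500)", 13),
    ("out (Port_VDPAddress), a", 11),
    ("ld a, $88", 7),
    ("out (Port_VDPAddress), a", 11),
]

INTERRUPT_RESPONSE = 13   # minimum mode 1 response quoted earlier in the thread
PIXELS_PER_CYCLE = 1.5    # approximate figure used in this thread

handler_cycles = sum(cycles for _, cycles in HANG_ON_PATH)
total_cycles = INTERRUPT_RESPONSE + handler_cycles

print(handler_cycles, "cycles in the handler up to the second out")
print(total_cycles, "cycles including the interrupt response")
print(total_cycles * PIXELS_PER_CYCLE, "pixels of scanline time, roughly")
```

Under those assumptions the scroll write completes 116 cycles into the handler, or 129 cycles after the interrupt is accepted, which is a little over half a scanline.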
|
|
Posted: Fri Jun 17, 2022 8:21 am |
provided you prepare the value for next_bg_x_value every frame and you set the vcounter register to 0... yes |
|
|
Posted: Fri Jun 17, 2022 8:21 am
|
@sverx were there any instructions between that code snippet and the interrupt vector? | |
|
Posted: Fri Jun 17, 2022 8:26 am |
@asynchronous sure, I'm using devkitSMS so there are quite a few instructions: https://github.com/sverx/devkitSMS/blob/master/SMSlib/src/SMSlib.c#L375 | |
|
Posted: Sat Jun 18, 2022 11:19 am
|
Ah OK, I thought for a second you were able to update H scroll for the same line and not the next line. Superhuman stuff. My bad. | |
|
Posted: Sat Jun 18, 2022 8:20 pm |
ahah no no, I meant you can do it in time to have the scroll changed on the next line. | |
|
Posted: Tue Jun 21, 2022 3:04 pm |
So I did some experimentation with Emulicious because I'm curious about the exact timings.
Not sure how representative of real hardware it is (for one thing having the debugger open does affect the results so does raise a question about CPU load maybe).

Experiment 1: Change the palette very soon after line interrupt

    di
    in a, (VDP_CMD)
    xor a
    out (VDP_CMD), a
    ld a, $c0
    out (VDP_CMD), a
    ld a, (backgroundColor)
    xor $ff
    ld (backgroundColor), a
    out (VDP_DATA), a
    reti

Results in "palette.png". The palette visually switches after the 150th pixel on the line, which is WAY further into the scanline than I'd expected, but sanity-checking: those instructions total 93 cycles, so at 1.5 pixels per cycle that's around 140 pixels, plus allowing the VDP some time to actually change the palette that would suggest the CPU receives the interrupt very close to the beginning of the line. Just goes to show how precious those cycles are.

UPDATE: I progressively added more NOPs into the interrupt service code until I got to a max of 185 cycles worth of instructions before things started going pear-shaped, not including the reti. Makes pretty good intuitive sense, as if a CPU cycle is worth ~1.5 pixels then that's ~278 pixels worth of scanline time to do some work.

Experiment 2: Update the horizontal scroll value based on a variable in RAM very soon after line interrupt

    di
    in a, (VDP_CMD)
    and $80
    jr nz, +
    ld a, (hScrollValue)
    inc a
    jr ++
+   xor a
++  out (VDP_CMD), a
    ld (hScrollValue), a
    ld a, $88
    out (VDP_CMD), a
    reti

Results in "scroll.png". This doesn't really prove much, except that the horizontal scroll is indeed latched all the way through the scanline: there are no signs of any discontinuities at any point down the screen. Also implicit here is the fact that you don't get a horizontal interrupt on scanline 0 until after it has rendered; the first scanline has no scroll because it was reset by the frame interrupt.

Experiment 3: Update the horizontal scroll value with the vcounter value very soon after horizontal interrupt

    di
    in a, (VDP_CMD)
    in a, (VCOUNTER)
    out (VDP_CMD), a
    ld (hScrollValue), a
    ld a, $88
    out (VDP_CMD), a
    reti

Results in "vcounter.png". This illustrates the fact that vcounter tracks the actual scanline, and is correct at least at the point that horizontal scroll is set (we presume somewhere similarly far through the scanline as the palette switched in Experiment 1). Also evident here (and if you zoom right in and measure the pixels you can confirm) is that vcounter is at 193 when the vertical interrupt takes place: that's the value of the scroll position apparent on scanline 0.
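The 185-cycle limit from the Experiment 1 update can be compared against the length of a whole scanline. This is a sketch under stated assumptions: 342 pixels per line from the timing table earlier in the thread, the approximate 1.5 pixels per CPU cycle, the 13-cycle minimum mode 1 response, and the standard 14 T-states for `reti`. It also assumes the thing that "goes pear-shaped" is the handler overrunning into the next line interrupt, which the experiment does not confirm.

```python
# Sketch: compare the 185-cycle limit found in Experiment 1 with the length of a scanline.
from fractions import Fraction

PIXELS_PER_LINE = 342                 # from the horizontal timing table earlier in the thread
PIXELS_PER_CYCLE = Fraction(3, 2)     # approximate ratio used throughout the thread

INTERRUPT_RESPONSE = 13               # minimum mode 1 response, as quoted earlier
RETI = 14                             # standard Z80 timing for reti
MAX_HANDLER_CYCLES = 185              # measured limit from the experiment, excluding reti

cycles_per_line = PIXELS_PER_LINE / PIXELS_PER_CYCLE
used = INTERRUPT_RESPONSE + MAX_HANDLER_CYCLES + RETI
slack = cycles_per_line - used

print("cycles per scanline:", cycles_per_line)
print("cycles used by the longest working handler:", used)
print("cycles left before the next line interrupt:", slack)
```

On these numbers a scanline is 228 CPU cycles and the longest working handler uses 212 of them, leaving 16 cycles of slack, which is in the same range as the time needed to finish an in-flight instruction before the interrupt is accepted.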
|
|
Posted: Tue Jun 21, 2022 7:49 pm |
Which events are affected by having the debugger open? I noticed a bug with the reported dot for CPU events such as interrupts and HALT. But other events all look stable to me. The 1.5 pixels per CPU cycle comes from the different clock rates. The master clock clocks both the CPU and the VDP but the CPU has a divider of 15 in between and the VDP has a divider of 5 and the VDP outputs a pixel every 2 clocks. |
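The divider arithmetic can be written out explicitly. A minimal sketch: the dividers and the two-clocks-per-pixel figure are taken from the post above, while the 53,693,175 Hz NTSC master clock is an assumption (the commonly cited value; PAL machines use a slightly different crystal). The ratio itself does not depend on the master frequency.

```python
# Sketch: derive the clocks and the pixels-per-cycle ratio from the dividers described above.
from fractions import Fraction

MASTER_HZ = 53_693_175      # assumption: commonly cited NTSC master clock
CPU_DIVIDER = 15            # from the post above
VDP_DIVIDER = 5             # from the post above
VDP_CLOCKS_PER_PIXEL = 2    # from the post above

cpu_hz = Fraction(MASTER_HZ, CPU_DIVIDER)
vdp_hz = Fraction(MASTER_HZ, VDP_DIVIDER)
pixel_hz = vdp_hz / VDP_CLOCKS_PER_PIXEL

# The master frequency cancels out: (1 / (5 * 2)) / (1 / 15) = 15 / 10 = 3 / 2.
pixels_per_cpu_cycle = pixel_hz / cpu_hz

print(f"CPU clock:   {float(cpu_hz) / 1e6:.6f} MHz")
print(f"VDP clock:   {float(vdp_hz) / 1e6:.6f} MHz")
print(f"pixel clock: {float(pixel_hz) / 1e6:.6f} MHz")
print("pixels per CPU cycle:", pixels_per_cpu_cycle)
```

With the assumed crystal this gives a CPU clock of about 3.58 MHz and a VDP clock of about 10.74 MHz, and exactly 3/2 pixels per CPU cycle.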
|
|
Posted: Tue Jun 21, 2022 8:49 pm |
This is something I have been curious about for many years. What is the clock of the SMS VDP? Is it the same as the CPU's? 3.59? Does it change between SMS 1, SMS 2 and GG? |
|
|
Posted: Wed Jun 22, 2022 8:20 am |
I think it's fair to presume that the line IRQ gets fired by the VDP as soon as possible into the new line, as I suspect the check happens as soon as the counter gets incremented. Of course the CPU will service that as soon as possible, which means when the current instruction is complete (of course provided that interrupts are enabled otherwise it can happen much later...) | |
|
Posted: Wed Jun 22, 2022 8:22 am |
It's the same clock, different dividers. The VDP gets a clock that's 3 times faster than the CPU, and it takes two clock cycles for each dot. PAL/NTSC devices have a slightly different crystal so the speed is a bit off. |
|
|
Posted: Wed Jun 22, 2022 10:03 am Last edited by segarule on Fri Jun 24, 2022 9:17 am; edited 1 time in total |
Thanks. Ah, 1.5 pixels makes total sense to me now (as Calindro already explained). I'm wondering whether pushing the master clock to its limits would affect both the CPU and the VDP. NESdev lists "PPU dots per CPU cycle = 3". This could explain why NES games "seem" smoother; alternatively, somebody explained on another forum that the 65c02 uses fewer cycles per instruction than the Z80. |
|
|
Posted: Wed Jun 22, 2022 11:28 am |
In a standard system (i.e. with the clock divider mentioned) if you increased the master clock I'm pretty sure you'd get out of sync with the display fairly quickly and you'd lose the picture, but in any case the VDP can only receive data relatively slowly so without modifying your code you'd start losing data on the bus between the CPU and the VDP. (Also it depends on the CPU, the current range of DIP CMOS Z80s I believe can be clocked up to 10MHz, but the one you find in your original SMS may well not be rated more than 4MHz.)
See here for a bit more detail.
Not sure specifically, only that the pattern captured on screen changes with the debugger running; basically it looks like the events (palette change and hscroll) happen somewhat later with the debugger open, so that they occur on the next line instead of the same one. Happy to help investigate if you want me to get you some more info, just let me know. Least I can do to pay you back for this incredible tool!
Yeah agreed, and I discuss the time taken for the CPU to respond a little further up; the thing I'm still wondering (and grappling with this for a project I will unveil very soon...) is whether the interrupt / counter is incremented on HSYNC or as soon as the right border starts on the previous line. Not that it particularly matters for game dev of course, only for hardware nerds like me! |
|
|
Posted: Wed Jun 22, 2022 4:18 pm |
I don't see how that could happen and I don't seem to be able to reproduce it. I've tried different scenes in different roms and debugger open vs. debugger closed always matched. I'd appreciate if you could help me reproduce it. |
|
|
Posted: Wed Jun 22, 2022 4:56 pm |
I mean my question assumed no hardware overclock. I had in mind code that makes full-time use of the master clock.
Cool! Thanks. So I can presume that our SMS TMS runs at 10.7 MHz, correct? |
|