Author | Message |
---|---|
|
CPU Questions
Posted: Sun Jun 22, 2008 9:35 pm
|
Please don't just redirect me to www.z80.info. I've been there already ;).
1. Apart from things like the memory map etc., is the Z80 different in any way from a normal one (incl. illegal opcodes)?
2. What are the exact MHz for NTSC/PAL?
3. Can someone please explain to me what is meant by all of this M/T cycle business? I understand that on a 6502 CPU, the ADC, Immediate opcode takes 2 clock cycles. But at the moment I am reading many Z80 documents and I am finding it all a little bit confusing. For instance, how long would opcode $8F (ADC A, A) take?
4. Does the Z80 read or write a memory location on every clock cycle like a 6502 does?
5. Can someone please give me a chart of what happens on every clock cycle of an instruction? e.g. http://en.wikipedia.org/wiki/Z80#Instruction_execution
6. What is the power-up state of each of the registers? I know for example that PC is always $0000. But I am not sure if the SP is set to $FFFE on reset like the Game Boy. |
|
|
Posted: Sun Jun 22, 2008 10:03 pm |
http://www.smspower.org/forums/viewtopic.php?t=11200 http://www.smspower.org/dev/docs/wiki/Z80/ClockRate http://www.smspower.org/dev/docs/wiki/Z80/InstructionSet Just to begin with. I'm too lazy to do more search job for you. http://www.smspower.org/dev/docs/wiki/Site/Search http://www.smspower.org/forums/search.php |
|
|
Posted: Mon Jun 23, 2008 1:22 am |
I can try to answer a few:
1. Different than what? Not sure what you're specifically looking for. Different than the 8080?
2. 50/60
3. I don't know if you need to worry too much about machine time and clock cycles. Someone correct me if I'm wrong on that. I'll try my best to explain. First off, ADC A, A takes 1 byte, 4 clock cycles. Now, about machine time: for each machine cycle (M1, M2 ...) there are four clock cycles (T1-T4). This is because the fetch aspect of an instruction takes three clock cycles, and the fourth (T4) is required to decode and execute the instruction after it has been fetched. All instructions therefore require at least this many. Luckily, it seems from what I've read that most important instructions like LD and such require only this many. As a special trick, the CU can sometimes, though not in all cases, overlap one machine cycle with another to save time. Therefore, some instructions, like LD SP, HL take more than 4 clock cycles. So, to avoid a problem, the CU makes sure that in cases where it is able, it starts M2 after T4 of M1, with an overlap occurring at T1 and T2 of M2. That's all you really need to know, I think. Let me know if you have any other questions, I'll try to answer.
4. Not sure, I'm not familiar with the 6502.
5. Hmmm, I know that Leventhal did a full chart of how many cycles each instruction takes, but I'm not sure you can find it online. The book, called Z80 Assembly Language Programming, can be found pretty cheap though.
6. SP is placed at $DFFF. |
|
|
Posted: Mon Jun 23, 2008 5:59 am |
Stan, please don't muddy the waters... most of what you said is wrong and will confuse matters. | |
|
Posted: Mon Jun 23, 2008 6:52 am |
The SMS CPU is a stock Z80. The memory map is not a CPU feature at all - the memory map is probably different on every Z80 computer.
As Tom said.
M cycles are mostly unimportant for an emulator. They tell you when the CPU is accessing memory. T cycles are "core clock cycles", ie. most opcodes take 4 clocks.
Only on the M cycles. It takes a memory read per opcode byte, of course, and these tend to be about every 4 clocks, but longer instructions can be entirely internal to the CPU.
I'm sure the information is out there somewhere, but I don't think you need to care. Instructions cannot be interrupted halfway through, with the important caveat that instructions like LDIR are implemented as multiple executions of the same instruction (and thus can be interrupted between repeats). So your CPU emulator just needs to count cycles, mostly.
That's somewhat unnecessary. Most should be considered as undefined, except that on a real system, the BIOS will usually have run before the game, and will have left things in a reasonable state; so if you want to run BIOS-free, you'll need to also set a reasonable state. SP = DFF0 is probably helpful, IM 1 probably helps; apart from that, nothing should be taken for granted. The initial VDP state is probably more significant. |
|
|
Posted: Mon Jun 23, 2008 9:24 am |
Excellent. So the CPU is just a stock Z80, and that we have 60Hz for NTSC and 50Hz for PAL. I know that the registers and RAM don't really have powerup values as such. So is the SP initialized to $DFFF on powerup by default, or is it something the programmer must do? Also is there a reset routine like on the 6502? Also for the ADC A, A opcode, I should increase the clock cycle counter by 4 then? | |
|
Posted: Mon Jun 23, 2008 11:00 am |
You really have to ignore what Stan said. 60Hz CPUs are not really practical. SP is not initialised, except by the BIOS, which probably leaves it somewhere around $dff0 (NOT $dfff) - but the program itself ought to be initialising it soon after booting. A typical startup is
    di
    im 1
    ld sp,$dff0
    jp main
and a large proportion of games look like that. The Z80 has a reset pin and software is able to execute jp $0000 or rst $00 if it wants to reset the game, but resetting to the BIOS is unheard of (if technically possible). If ADC A,A takes 4 cycles, and the docs seem to say that, then by all means add 4 to the cycle counter when executing it. |
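For an emulator that runs without the BIOS, the startup sequence above suggests a state to begin from. The following is only a minimal sketch: the `Z80State` struct, its field names and `z80_reset_bios_free` are made up for illustration, and the SP value is simply the one suggested in this thread, not something the hardware guarantees.

```c
#include <stdint.h>

/* Hypothetical register file; names are illustrative only. */
typedef struct {
    uint16_t pc, sp;
    uint8_t  im;          /* interrupt mode: 0, 1 or 2 */
    uint8_t  iff1, iff2;  /* interrupt enable flip-flops */
} Z80State;

/* Mimic the state a game reaches after "di / im 1 / ld sp,$dff0". */
void z80_reset_bios_free(Z80State *cpu)
{
    cpu->pc   = 0x0000;  /* execution starts at $0000 */
    cpu->sp   = 0xDFF0;  /* assumption: value suggested in the thread */
    cpu->im   = 1;       /* im 1 */
    cpu->iff1 = 0;       /* di */
    cpu->iff2 = 0;
}
```

A well-behaved game overwrites all of this within its first few instructions, so the exact values should only matter for software that forgets to initialise something.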
|
|
Posted: Mon Jun 23, 2008 1:52 pm |
I believe R is incremented on each M1 cycle, though as far as I'm aware not much SMS software uses R (it's used as a random number source in many TI calculator programs). I'm probably going mad here, but I seem to recall reading somewhere (though for the life of me I cannot remember where and cannot find it again) that the Game Gear's Z80 (in an ASIC) has a quirk where the undocumented out (c),0 actually outputs 255 rather than 0. Did I dream that? :-| |
|
|
Posted: Mon Jun 23, 2008 3:22 pm |
What the docs actually say varies, Maxim. Some say that ADC A, A takes 1 cycle, and others say that it takes 4 cycles. Now I think that they may be talking about M cycles and T states respectively. Either way, how much is subtracted from the 3579545 cycles per second that the NTSC CPU has? | |
|
Posted: Mon Jun 23, 2008 3:27 pm |
Something like this:
tstatesPerLine = (clockSpeedHz / fps / no_of_scanlines)
eg: tstatesPerLine = (3579545 / 60 / 262)
Round the value up to the nearest cycle. As Maxim says, T-States are the relevant measure. Check the JavaGear source code, or SMS Plus for reference. |
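The formula above can be written with integer arithmetic so that the rounding up is explicit. This is a sketch only; the function name `tstates_per_line` is invented here, and the NTSC figures are the ones quoted in this post.

```c
#include <stdint.h>

/* Integer ceiling of clock_hz / (fps * scanlines). */
uint32_t tstates_per_line(uint32_t clock_hz, uint32_t fps, uint32_t scanlines)
{
    uint32_t lines_per_second = fps * scanlines;
    return (clock_hz + lines_per_second - 1) / lines_per_second;
}
```

With the NTSC numbers, `tstates_per_line(3579545, 60, 262)` gives 228, since the exact quotient is about 227.7.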
|
|
Posted: Mon Jun 23, 2008 3:43 pm |
Right so the following would be correct?
void opcode8F()
{
    // ADC Code here
    PC++;
    cycles += 4;
} |
|
|
Posted: Mon Jun 23, 2008 3:54 pm |
No, it's true - but I chose not to mention it for now :) I'm not sure if anything actually uses it. |
|
|
Posted: Mon Jun 23, 2008 3:57 pm |
Aye, though I personally increment PC immediately after fetching a byte (so, Opcode = ReadMemory(PC++)). This makes life marginally easier when dealing with relative jumps. The Z80 user manual is probably the best place to look for timings. A number of resources claim that a failed conditional JP takes only 1 T state. I assume this is a typo (JP takes 10 T states) that has since been copied into other documents.
Ah, that's a relief! |
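To show why incrementing PC at fetch time helps with relative jumps, here is a small sketch. The `Cpu` struct, `memory` array, `fetch` and `step` are all hypothetical names, and only NOP and JR are handled; the point is that after the displacement byte has been fetched, PC already holds the address that the displacement is relative to.

```c
#include <stdint.h>

static uint8_t memory[0x10000];  /* hypothetical flat 64 KB address space */

typedef struct {
    uint16_t pc;
    uint32_t cycles;
} Cpu;

/* Read the byte at PC and advance PC past it. */
static uint8_t fetch(Cpu *cpu)
{
    return memory[cpu->pc++];
}

/* Execute one instruction; only NOP and JR e are handled in this sketch. */
void step(Cpu *cpu)
{
    uint8_t opcode = fetch(cpu);
    switch (opcode) {
    case 0x00:  /* NOP */
        cpu->cycles += 4;
        break;
    case 0x18: {  /* JR e: relative to the address after the operand */
        int8_t e = (int8_t)fetch(cpu);
        cpu->pc = (uint16_t)(cpu->pc + e);
        cpu->cycles += 12;
        break;
    }
    default:  /* placeholder for opcodes not covered here */
        cpu->cycles += 4;
        break;
    }
}
```

For example, the two bytes $18 $FE at address $0000 form a jump to itself: after both fetches PC is $0002, and adding -2 brings it back to $0000.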
|
|
Posted: Mon Jun 23, 2008 4:39 pm |
The PC is in fact automatically incremented after an opcode fetch. I've seen that an opcode fetch takes 4 T cycles and a memory read/write takes 3. How long does an internal operation take? Must the Z80 read or write on every cycle like the 6502? Are any of the IO ports read sensitive? | |
|
Posted: Mon Jun 23, 2008 5:39 pm |
These figures are for somewhat hand-waving accounting of the instruction length. A nop takes 4 cycles, so it's reasonable to say that getting the opcode takes 4 cycles (and 1 M cycle). But a simple operation like "inc a" takes the same number of cycles. "dec bc" takes 6 cycles, though. "ld a,(hl)" takes 7 cycles, so the 3 extra must be for the memory read - but again, "add a,(hl)" takes the same time. So it's not that simple.
It depends entirely on the operation. As you can see from the above figures, an 8-bit increment or add is small enough to happen within the 4 cycles that come from the opcode read, but a 16-bit decrement doesn't. You just don't need to care mostly.
No. It reads or writes on every M cycle, but only because an M cycle is defined as the times when it reads or writes memory.
I don't know what that means. In general, on the SMS, you can't read back the data written to an IO port - although some of the chips have read and write access on the same port number, for different data. |
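Since the totals do not follow from a simple per-access rule, an emulator that only counts cycles can just look them up. Below is a sketch of such a table holding only the figures quoted in this thread; the names `tstates` and `opcode_tstates` are invented, and every entry not listed is left at 0 rather than guessed.

```c
#include <stdint.h>

/* T-state totals quoted in the thread, indexed by opcode.
   Entries left at 0 are simply not filled in here. */
static const uint8_t tstates[256] = {
    [0x00] = 4,  /* nop        */
    [0x3C] = 4,  /* inc a      */
    [0x0B] = 6,  /* dec bc     */
    [0x7E] = 7,  /* ld a,(hl)  */
    [0x86] = 7,  /* add a,(hl) */
    [0x8F] = 4,  /* adc a,a    */
};

/* Look up the T-state total for an unprefixed opcode. */
uint8_t opcode_tstates(uint8_t opcode)
{
    return tstates[opcode];
}
```

A real core would fill all 256 entries from the Z80 user manual and keep separate tables for the CB, ED, DD and FD prefixes.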
|
|
Posted: Tue Jun 24, 2008 3:20 pm |
Ok thanks. I think that I understand most things now. However I still need a diagram similar to this one...
http://en.wikipedia.org/wiki/Z80#Instruction_execution ...so that I can break down my CPU core. This has been useful for me so far but I am now stuck with "ADC A,(IX+N)". It says it takes 19 T states and I believe, that means 4 for the initial opcode fetch, 4 for the prefix, and I have worked out that there are two more 4's and a 3. But I don't know in which order, and what happens during those M cycles. |
|
|
Posted: Tue Jun 24, 2008 3:38 pm |
My point was, you don't need to care about it and we don't actually know. The page you link to merely describes the "M cycle" breakdown, which really doesn't matter either. | |
|
Posted: Tue Jun 24, 2008 4:01 pm |
:) Well it matters to me. The reason I ask is I have come from 6502/NES emulation where you must take every cycle access into consideration. I'd like to bring this level of accuracy to SMS emulation, regardless of whether it is actually needed or not. | |
|
Posted: Tue Jun 24, 2008 4:28 pm |
Gee, I didn't think what I said was that wrong; based on the answers, it seems my explanation of clock and machine time was correct. At any rate, as I said, check to see if you can find Leventhal online. Wait, yes, found it! Here you go: http://www.msxarchive.nl/pub/msx/mirrors/hanso/datasheets/chipsz80leventhal1.pdf Pages 25-44, or however long it is, are what you're looking for. It gives the number of bytes for each operation as well as the cycles. | |
|
Posted: Tue Jun 24, 2008 4:50 pm |
Stan: the Z80 does not run at 60Hz, more like 3579545Hz. M cycles can take from 3 to 6 clocks, depending on the operation but can overlap a little (within a single opcode, there's no inter-opcode pipelining on the Z80) when the data paths allow it. But when you're programming, you don't care at all about that - you just care how many clock ticks an instruction takes, and sometimes how many bytes they take up. As I understand it, the M cycle breakdown is of more use to people putting together hardware.
The number of bytes, M cycles, M1 cycles, T states, etc of the opcodes are documented in many, many places. I don't know of anywhere that lists the per-M-cycle or per-T-state breakdown of the execution of the opcodes. There's a Powerpoint file on z80.info showing the full breakdown for a few opcodes, and you could probably use that to show how to deduce much of the rest. |
|
|
Posted: Tue Jun 24, 2008 6:39 pm |
That document claims that in the register pair A is the low byte and F is the high byte when paired (page 5). Surely that's not correct? |
|
|
CPU Questions
Posted: Wed Jun 25, 2008 3:22 am
|
It's an 8-bit CPU; A is the Accumulator (main register) and F is the flag register. You PUSH and POP them off and on the Stack together as a pair, similar to the other register pairs; BC, DE, HL. | |
|
Posted: Wed Jun 25, 2008 6:05 am |
Yes, but when you push HL you see in memory the contents of L, then the contents of H. When you push AF, which do you see first? | |
|
Posted: Wed Jun 25, 2008 9:44 am |
Yes, that document has them wrong.
Z80 undocumented |
|
|
Posted: Thu Jun 26, 2008 1:54 pm |
Here are some more of my questions.
1. Is R incremented after an opcode fetch, but before the opcode execution?
2. Some of the documents just have a ? for certain opcodes like ADC etc. when it comes down to setting the half carry flag for 16bit arithmetic. For instance, how would ADC HL, HL affect the half carry flag?
3. For 16bit arithmetic that affects the sign flag, do we just take the MSB like you do with 8bit arithmetic?
4. Is the following correct?
- If an EDxx instruction is not listed, it should operate as two NOPs.
- If a DDxx or FDxx instruction is not listed, it should operate as without the DD or FD prefix, and the DD or FD prefix itself should operate as a NOP.
5. I have seen some documents saying that after an opcode is executed, bit 2 becomes P or V. Can someone please explain to me what this means exactly? I know what overflow is, but not when they say P or V.
6. With an opcode that adds a byte to one of the index registers like ADC A, (IX + n), is n used as a signed data type? (i.e. -128 to 127)
7. Is there a one opcode delay after changing the interrupt disable flag using DI/EI? |
|
|
Posted: Thu Jun 26, 2008 2:57 pm |
I guess so: the sign bit of the MSB is the same bit as the sign bit of the whole word, anyway.
This one's too complicated to copy-pasta. Read the doc.
P means parity. Some instructions set it to the parity of the result, ie. 1 if there is an even number of 1s in the result. V means overflow. It captures a signed overflow (eg. a result outside -128 to +127 for an 8-bit operation), whereas the carry flag is for unsigned overflow (a result outside 0 to 255). Since there's no real use to detect overflow and parity at the same time (the parity of an arithmetic result is generally not useful, and overflow from a bitwise operation doesn't make sense), the same bit does double duty.
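The two meanings of bit 2 can be sketched as two small helpers. The function names are invented for this example; on the Z80 the flag is set for even parity, and the overflow test shown is the usual one for an 8-bit addition (both operands share a sign that the result does not).

```c
#include <stdbool.h>
#include <stdint.h>

/* P: true when the byte has an even number of 1 bits. */
bool parity_even(uint8_t v)
{
    v ^= v >> 4;  /* fold the high nibble into the low nibble */
    v ^= v >> 2;
    v ^= v >> 1;  /* bit 0 now holds the XOR of all eight bits */
    return (v & 1) == 0;
}

/* V for an 8-bit add: true when both operands have the same sign
   and the result has the opposite one. */
bool add_overflow(uint8_t a, uint8_t b, uint8_t result)
{
    return ((a ^ result) & (b ^ result) & 0x80) != 0;
}
```

For instance, $7F + $01 = $80 overflows in the signed sense (127 + 1 is not -128) without setting carry, while $FF + $01 = $00 sets carry without a signed overflow (-1 + 1 = 0).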
Yes.
Sort of. The ei opcode only enables interrupts one instruction after it is issued, so that the following code:
interrupt handler:
    ei
    reti
cannot overflow the stack when interrupts are being spammed. |
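One way an emulator might model that one-instruction delay is with a pending flag that is promoted after the next instruction completes. This is a sketch under that assumption, not a description of the real silicon; `IntState` and all the function names are hypothetical.

```c
#include <stdbool.h>

typedef struct {
    bool iff1;        /* interrupts may be accepted */
    bool ei_pending;  /* EI was the most recent instruction */
} IntState;

/* Call when EI executes: the enable takes effect one instruction later. */
void exec_ei(IntState *s)
{
    s->ei_pending = true;
}

/* Call when DI executes: takes effect immediately. */
void exec_di(IntState *s)
{
    s->iff1 = false;
    s->ei_pending = false;
}

/* Call after every instruction other than EI has finished. */
void after_instruction(IntState *s)
{
    if (s->ei_pending) {
        s->iff1 = true;
        s->ei_pending = false;
    }
}

/* Checked between instructions by the main loop. */
bool can_accept_interrupt(const IntState *s)
{
    return s->iff1;
}
```

With the handler above, the interrupt check that follows ei still sees interrupts disabled; they only become enabled once reti has finished and popped the return address.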
|
|
Posted: Thu Jun 26, 2008 3:39 pm |
I've just read the official Z80 document and it says the following;
ADC A, (IX+d): 3 bytes, 5 M cycles, 19 T-states (4, 4, 3, 5, 3)
So there's the breakdown that I've been looking for all along. Here is what I propose happens during every M cycle:
M1: Prefix fetch
M2: Opcode fetch
M3: Fetch byte
M4: Add IX + byte, read from (IX + byte)
M5: Add read data to A
Is that correct? Shouldn't there be a read/write on cycle M5? Does anyone else have a theory? |
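As a sanity check on the quoted figures, the M-cycle lengths can be summed and the effective address computed with a signed displacement. This is a sketch with invented names. One common reading of 4, 4, 3, 5, 3 (an assumption here, not something the quoted line states) is prefix fetch, opcode fetch, displacement read, internal address calculation, operand read, which would put the memory read in the last 3-clock cycle.

```c
#include <stdint.h>

/* M-cycle lengths for ADC A,(IX+d) as quoted from the Zilog manual. */
static const uint8_t adc_a_ixd_mcycles[5] = { 4, 4, 3, 5, 3 };

/* Sum the T states of a list of M cycles. */
unsigned total_tstates(const uint8_t *mcycles, unsigned count)
{
    unsigned total = 0;
    for (unsigned i = 0; i < count; i++)
        total += mcycles[i];
    return total;
}

/* Effective address of (IX+d); d is a signed 8-bit displacement. */
uint16_t indexed_address(uint16_t ix, uint8_t d)
{
    return (uint16_t)(ix + (int8_t)d);
}
```

The five cycles add up to the documented 19 T states, and a displacement byte of $FF addresses IX-1 rather than IX+255.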
|
|
CPU Questions
Posted: Fri Jun 27, 2008 3:31 pm
|
If "A is the low byte and F is the high byte when paired" (according to the source above) then I would expect to see A in memory first, ie A in address(n) and F in address(n+1), because the Z80 is little-endian. | |
|
Posted: Fri Jun 27, 2008 7:45 pm |
The pair is AF, so PUSH AF writes A at SP-1 and F at SP-2. In memory, you thus see F then A.
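The byte order described above can be sketched as follows. The `ram` array and `push_af` are hypothetical names; the function returns the new stack pointer.

```c
#include <stdint.h>

static uint8_t ram[0x10000];  /* hypothetical flat memory */

/* PUSH AF: the high byte (A) goes to SP-1, the low byte (F) to SP-2.
   Returns the updated stack pointer. */
uint16_t push_af(uint16_t sp, uint8_t a, uint8_t f)
{
    sp = (uint16_t)(sp - 1);
    ram[sp] = a;
    sp = (uint16_t)(sp - 1);
    ram[sp] = f;
    return sp;
}
```

Starting from SP = $DFF0, A ends up at $DFEF and F at $DFEE, so reading memory upwards from the new SP shows F first and then A, the same low-then-high order as L and H for PUSH HL.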
|
|
|
Posted: Fri Jun 27, 2008 9:46 pm |
We gotta stop meeting like this Blargg... :D (it's WedNESday btw from the nesdev forums) | |
hap
|
Posted: Sun Jun 29, 2008 1:32 pm |
Does anyone know the differences between Z80 models (clones), little quirks such as the above? We (#openmsx on freenode) have access to MSXes with NEC D780C, SGS Z8400, SHARP LH0080A, and ZiLOG Z8400APS, but haven't found any differences yet, though we haven't done thorough testing either. And hi @ nesdev guys =p |
|
|
Posted: Sun Jun 29, 2008 7:51 pm |
ZEXALL might be helpful, but it is a bit buggy and not a 100% tester. | |
Charles
|
Posted: Sun Jun 29, 2008 8:36 pm |
The only confirmed difference between any Z80 CPU that I know of is this:
A CMOS Z80 outputs $FF for the 'out (c), 0' instruction. An NMOS Z80 outputs $00 for the 'out (c), 0' instruction. I socketed the Z80 in my SMS and tested a bunch of NMOS Z80s from different manufacturers (NEC, Sharp, Zilog, ROHM, and everything I could yank out of my arcade collection), plus the CMOS Z80 in the Game Gear and the CMOS Z80 that Zilog makes. All results were consistent. I don't know the results for unlicensed Z80 clones, but I think it's safe to say that, being reverse-engineered, they may have a lot of differences. |
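In an emulator this difference could be a single configuration switch. A minimal sketch, with an invented function name and flag:

```c
#include <stdbool.h>
#include <stdint.h>

/* Value written by the undocumented "out (c),0" (ED 71),
   depending on the process of the emulated part. */
uint8_t out_c_0_value(bool is_cmos)
{
    return is_cmos ? 0xFF : 0x00;
}
```

Going by the results above, a Game Gear core would pass `true` and a core for an SMS fitted with an NMOS part would pass `false`.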
|
hap
|
Posted: Sun Jun 29, 2008 10:34 pm |
Yeah ZEXALL doesn't completely test the CPU, it passes on the NEC one btw (and of course the ZiLOG one). Another thing to look for is the existence of WZ' (use exx and bit 0,(hl)), it's mentioned in some technical documentation but doesn't exist on at least NEC, SHARP, and ZiLOG models.
Thanks for the info Charles! |
|
Charles
|
Posted: Mon Jun 30, 2008 5:28 pm |
I can't find any information on WZ', could you please explain what it is and how those two instructions relate to it? Sounds interesting! I guess some kind of comprehensive test for "memptr" is necessary too, though I don't know much about that either. |
|
hap
|
Posted: Mon Jun 30, 2008 9:22 pm |
Here's a source that claims WZ' exists: http://www.z80.info/z80arki.htm (that site also claims F is affected with exx, but that can't be true)
WZ is the register some people call memptr. bit 0,(hl) puts bit 3 and bit 5 of W into F. Here's how it can be tested:
    ld bc,0
    ld hl,0800h
    adc hl,bc           ; wz=0801h (16 bit hl addition does wz=hl+1 after)
    push bc
    pop af
    bit 0,(hl)          ; F and 28h == W and 28h
    push af             ; just to make sure that the above wz quirk exists
    ld hl,0
    adc hl,bc           ; wz=0001h
    exx                 ; if wz' exists, wz<->wz'
    ld hl,2000h
    ld bc,0
    adc hl,bc           ; wz=2001h
    bit 0,(hl)
    push af
    exx                 ; if wz' exists, wz=0001h now
    bit 0,(hl)
    push af
    ; then do something like this 3 times:
    pop bc
    ld a,c
    and 28h
    call show_on_screen
    ; 20 20 08 if wz' does not exist
    ; 00 20 08 if wz' does exist |
|
|
Posted: Thu Jul 03, 2008 9:40 am |
So if I were to perform ADC HL, HL with the carry set, is the carry added to the low bytes when they are added? Also, when adding the high bytes do you include the carry over from the low bytes at that time?
L = L + L + Carry;
H = H + H + Carryfromlowbytes; |
|
|
Posted: Thu Jul 03, 2008 10:16 am |
Yes. The result wouldn't make sense any other way. | |
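The byte-wise addition described above can be sketched like this, together with the flags asked about earlier in the thread. The `Adc16` struct and `adc16` are invented names; the half carry is taken here as the carry out of bit 11, i.e. out of bit 3 of the high-byte addition, and the sign flag as bit 15 of the result.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t result;
    bool carry;       /* carry out of bit 15 */
    bool half_carry;  /* carry out of bit 11 */
    bool sign;        /* bit 15 of the result */
    bool zero;
    bool overflow;    /* signed 16-bit overflow */
} Adc16;

/* ADC HL,ss performed as two 8-bit additions chained by the carry. */
Adc16 adc16(uint16_t hl, uint16_t operand, bool carry_in)
{
    /* Low bytes first, including the incoming carry flag. */
    unsigned lo = (hl & 0xFF) + (operand & 0xFF) + (carry_in ? 1u : 0u);
    /* High bytes plus the carry out of the low-byte addition. */
    unsigned hi = (hl >> 8) + (operand >> 8) + (lo >> 8);

    Adc16 r;
    r.result     = (uint16_t)(((hi & 0xFF) << 8) | (lo & 0xFF));
    r.carry      = (hi >> 8) != 0;
    r.half_carry = (((hl >> 8) ^ (operand >> 8) ^ hi) & 0x10) != 0;
    r.sign       = (r.result & 0x8000) != 0;
    r.zero       = r.result == 0;
    r.overflow   = ((hl ^ r.result) & (operand ^ r.result) & 0x8000) != 0;
    return r;
}
```

For ADC HL,HL the same value is passed for both `hl` and `operand`. The split into two bytes is only for illustration; adding the full 16-bit values plus the carry in one step gives the same result.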