z80 Optimizations
Basic Optimizations
Let's say you have two lines of code:

    call routine
    ret

These two lines can almost always be replaced with:

    jp routine

This saves a byte, speeds up execution, and saves stack space! The exception is a routine that inspects or modifies its own return address.
If you have:

    cp 0

you should replace it with:

    and a

or:

    or a

Either is acceptable. Both are a byte shorter and faster!
If you have:

    ld a, 0

you can usually replace it with:

    xor a

Unlike ld, xor a changes the flag register. However, if you don't need to preserve the contents of the flag register, use this! It is a byte shorter and faster!
Sometimes it is possible to use subtraction and decrements to avoid compares. For example:

    cp 4
    jr z, label1
    cp 5
    jr z, label2

can often be replaced with:

    sub 4
    jr z, label1
    dec a
    jr z, label2

This saves a byte and a few clocks. This method is especially useful if you have several compares in a row. Notice that applying this optimization destroys the value in register a. However, register a is always zero when one of the jumps is taken, which can sometimes be useful.
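As a rough sketch of the "several compares in a row" case, the fragment below dispatches on a value in register a that is assumed to lie in the range 0 to 3. The labels dispatch and handle0 through handle3 are hypothetical and exist only for illustration.

```asm
; Dispatch on a = 0, 1, 2 or 3 without a single cp instruction.
dispatch:
    or a                ; zero flag set if a = 0
    jr z, handle0
    dec a               ; a reaches 0 here only if it started as 1
    jr z, handle1
    dec a               ; ... started as 2
    jr z, handle2
    dec a               ; ... started as 3
    jr z, handle3
    ret                 ; any other value is ignored

; Hypothetical handlers. On entry to each one, a is 0.
handle0:
    ret
handle1:
    ret
handle2:
    ret
handle3:
    ret
```

Each test after the first costs one byte for the dec a instead of two for a cp, so the saving grows with the number of cases.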
It is not necessary to load register a with a value to compare it to certain values. For example:

    ld a, c
    cp 1

can sometimes be replaced with:

    dec c

This is useful if you don't need to preserve the contents of register c and you don't need its value in register a. The zero flag is set if register c was 1. Note that this method is also applicable to 255 (-1) by doing an increment instead of a decrement. Keep in mind that this does not work with the 16-bit registers, since the 16-bit inc/dec instructions do not affect the flag register.
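The two remarks above can be sketched as follows. The first routine tests register c for 255 with an increment. The second shows one common way to work around the 16-bit limitation, by OR-ing the two halves of the register pair together. All labels and the count of 1000 are hypothetical.

```asm
; Test c for 255: c wraps to 0 only if it held 255.
test_c:
    inc c
    jr z, c_was_255
    ret
c_was_255:
    ret

; 16-bit loop counter: dec bc leaves the flags unchanged,
; so the zero test is done on a copy of b OR-ed with c.
count_down:
    ld bc, 1000         ; arbitrary example count
count_loop:
    dec bc
    ld a, b
    or c                ; zero flag set only when b = 0 and c = 0
    jr nz, count_loop
    ret
```

Note that the 16-bit test destroys register a.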
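If you need to load register c indirectly with a value, rather than:

    ld a, (address)
    ld c, a

you can sometimes use:

    ld bc, (address)

This only works if you don't care about register b, which receives the byte at address + 1, and you don't need register a to be set to register c. The benefit is that register a is left untouched: the replacement is the same size and takes a few clocks longer. Similarly, it is possible to load register b instead of register c. For example:

    ld a, (address)
    ld b, a

can sometimes be replaced with:

    ld bc, (address - 1)

Here register c receives the byte at address - 1.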
If you have:

    xor $FF

replace it with:

    cpl

This saves space and clocks. Note that cpl, unlike xor, does not update the zero flag.
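If you have:

    ld de, constant
    or a
    sbc hl, de

replace it with:

    ld de, -constant
    add hl, de

This saves an extra instruction in most cases, since there is no carry flag to clear first. Note that add hl, de does not set the zero flag the way sbc hl, de does, so this only applies when you don't need to test the result for zero.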
Self-Modifying Code
Let's say you need a variable in RAM. You can use self-modifying code to speed up your program and make it smaller!

    variable:
        ld a, $00               ; The variable is stored in an instruction!
        ...
        xor a
        ld (variable + 1), a    ; Load the variable with zero

Note that you should put the instruction that holds the variable in the most time-intensive part of your program (if such a place exists). This technique only works when the code itself runs from RAM.
Tips
- Avoid loading data from memory; try to use registers whenever possible.
- Avoid using the ix and iy registers. The instructions associated with these registers are larger and slower than those for bc, de, and hl.
- Avoid using push and pop in routines that need speed. Try using different combinations of registers until you can get by without them. If the interrupts are disabled (or can be disabled), using the shadow registers can come in handy. Keep in mind that using push and pop is still faster than using indirect addressing.
- Use hl for indirect addressing where possible.
- Use bc (or just b) as a counter.
- Use de as a data register or extra index register.
- Use ROM calls and library routines as much as possible. These save space and save you some work. Also, they are usually optimized for speed.
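- Use jr as much as possible when trying to save space, since it is a byte shorter than jp. Use jp as much as possible when trying to save clocks, since a jump that is taken is faster with jp than with jr.
- When you use conditional jumps, try to make the condition that is normally false take the jump. A conditional jr that is not taken is faster than one that is.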
- If you don't need interrupts in a certain section of code and you need to save many values, use the shadow registers. Don't forget to use di first (and ei afterward if you need interrupts again)!
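Several of these tips can be combined in one short routine. The sketch below sums 16 bytes using hl for indirect addressing and b as the counter, and it keeps the caller's bc, de, and hl intact by working in the shadow registers instead of using push and pop. The label buffer and the length 16 are hypothetical, and buffer is assumed to be defined elsewhere in your program.

```asm
; Returns in a the sum (mod 256) of 16 bytes starting at buffer.
sum_buffer:
    di                  ; an interrupt handler may use the shadow registers
    exx                 ; swap bc, de, hl with bc', de', hl'
    ld hl, buffer       ; hl = pointer to the data
    ld b, 16            ; b = loop counter
    xor a               ; a = 0 (exx does not swap af)
sum_loop:
    add a, (hl)         ; indirect addressing through hl
    inc hl
    djnz sum_loop       ; decrement b, loop until it reaches 0
    exx                 ; bring back the caller's bc, de, hl
    ei                  ; only if interrupts should be on again
    ret
```

Whether the di and ei pair is needed depends on your platform: it matters only if an interrupt handler touches the shadow registers.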
Boolean Logic
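Clever usage of boolean logic can make your code much smaller and faster.

Let's say you have a loop and you want to call a routine on every other iteration. You could do:

    label:
        ld a, $00           ; this operand holds the toggle value
        cpl                 ; flip between $00 and $FF
        ld (label + 1), a   ; store the new value back into the ld above
        or a                ; cpl does not set the zero flag, so test a here
        call z, routine     ; taken on every second pass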
If you have a variable that you want to switch between 1 and 2 (or between 3 and 4) depending on its previous value, you could do:

    ld hl, address
    ld a, (hl)
    dec a           ; 1 or 2 becomes 0 or 1
    xor 1           ; flip the low bit
    inc a           ; back up to the range 1 to 2
    ld (hl), a
If you need to find a mod 16:

    and 15

This works because it keeps only the lower 4 binary digits of register a and masks off the rest. Since %1111 is binary for 15, you get the modulus of 16. The same idea works in decimal: if you want to find x mod 10, you simply take the digit in the ones place. Since binary is a lower base than decimal, the trick is more useful there. Just as it only works for 10, 100, 1000, ... in decimal, it only works for 2, 4, 8, ... in binary.
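The same mask works for any power of two. As a small sketch, the routine below advances an index into an 8-entry ring buffer and wraps it with and 7. The label index is hypothetical and is assumed to name a one-byte variable in RAM defined elsewhere.

```asm
; Advance a ring-buffer index in the range 0 to 7.
next_index:
    ld hl, index
    ld a, (hl)
    inc a
    and 7               ; a mod 8: 8 wraps back to 0
    ld (hl), a
    ret
```

For a buffer whose size is not a power of two, the mask does not apply and a compare is needed instead.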