-------------------------------------------------------------------------------
 notaz's SVP doc
 $Id: svpdoc.txt 349 2008-02-04 23:13:59Z notaz $
 Copyright 2008, Grazvydas Ignotas (notaz)
-------------------------------------------------------------------------------

If you use this, please credit me in your work or its documentation.
Tasco Deluxe should also be credited for his pioneering work on the subject.
Thanks.

Use monospace font and disable word wrap when reading this document.

-------------------------------------------------------------------------------
 Table of Contents
-------------------------------------------------------------------------------

 0. Introduction
 1. Overview
 2. The SSP160x DSP
   2.1. General registers
   2.2. External registers
   2.3. Pointer registers
   2.4. The instruction set
 3. Memory map
 4. Other notes


-------------------------------------------------------------------------------
 0. Introduction
-------------------------------------------------------------------------------

This document is an attempt to provide technical information needed to
emulate Sega's SVP chip. It is based on reverse engineering the Virtua Racing
game and on various internet sources.
None of the information provided here was verified on the real hardware,
so some things are likely to be inaccurate.

The following information sources were used while writing this document
and emulator implementation:

[1] SVP Reference Guide (annotated) and SVP Register Guide (annotated)
    by Tasco Deluxe < tasco.deluxe @ gmail.com >
    http://www.sharemation.com/TascoDLX/SVP%20Reference%20Guide%202007.02.11.txt
    http://www.sharemation.com/TascoDLX/SVP%20Register%20Guide%202007.02.11.txt
[2] SSP1610 disassembler
    written by Pierpaolo Prazzoli, MAME source code.
    http://mamedev.org/
[3] SSP1601 DSP datasheet
    http://notaz.gp2x.de/docs/SSP1601.pdf
[4] DSP page (with code samples) in Samsung Semiconductor website from 1997
    retrieved from Internet Archive: The Wayback Machine
    http://web.archive.org/web/19970607052826/www.sec.samsung.com/Products/dsp/dspcore.htm
[5] Sega's SVP Chip: The Road not Taken?
    Ken Horowitz, Sega-16
    http://sega-16.com/feature_page.php?id=37&title=Sega's%20SVP%20Chip:%20The%20Road%20not%20Taken?


-------------------------------------------------------------------------------
 1. Overview
-------------------------------------------------------------------------------

The only game released with SVP chip was Virtua Racing. There are at least 4
versions of the game: USA, Japan and 2 different European revisions.
Three of them share identical SSP160x code, one of the European revisions
has some differences.

From the software developer's point of view, the game cartridge contains
at least:

* Samsung SSP160x 16-bit DSP core, which includes [3]:
  * Two independent high-speed RAM banks, accessed in single clock cycle,
    256 words each.
  * 16 x 16 bit multiply unit.
  * 32-bit ALU, status register.
  * Hardware stack of 6 levels.
* 128KB of DRAM.
* 2KB of IRAM (instruction RAM).
* Memory controller with address mapping capability.
* 2MB of game ROM.

[5] claims there is also "2 Channels PWM" in the cartridge, but it's either
not used or not there at all.
Various sources claim that SSP160x is SSP1601 which is likely to be true,
because the code doesn't seem to use any SSP1605+ features.


-------------------------------------------------------------------------------
 2. The SSP160x DSP
-------------------------------------------------------------------------------

SSP160x is a 16-bit DSP, capable of performing multiplication + addition in
a single clock cycle [3]. It has 8 general, 8 external and 8 pointer registers.
There is a status register which has operation control bits and condition
flags. Condition flags are set/cleared during ALU (arithmetic, logic)
operations. It also has a 6-level hardware stack and 2 internal RAM banks,
RAM0 and RAM1, 256 words each.

The device is only capable of addressing 16-bit words, so all addresses refer
to words (a 16bit value in ROM, accessed by 68k through address 0x84, would be
accessed by SSP160x using address 0x42).

[3] mentions interrupt pins, but interrupts don't seem to be used by SVP code
(actually there are functions which look like interrupt handler routines, but
they don't seem to do anything important).

2.1. General registers
----------------------

There are 8 general registers: -, X, Y, A, ST, STACK, PC and P ([2] [4]).
Size is given in bits.

2.1.1. "-"
  Constant register with all bits set (0xffff). Also used for programming
  external registers (blind reads/writes, see 2.2).
  size: 16

2.1.2. "X"
  Generic register. Also acts as a multiplier 1 for P register.
  size: 16

2.1.3. "Y"
  Generic register. Also acts as a multiplier 2 for P register.
  size: 16

2.1.4. "A"
  Accumulator. Stores the result of all ALU (but not multiply) operations,
  status register is updated according to this. When directly accessed,
  only the upper word is read/written. The low word can be accessed by using
  AL (see 2.2.8).
  size: 32

2.1.5. "ST"
  STatus register. Bits 0-9 are CONTROL, others are FLAG [2]. Only some of
  them are actually used by SVP.
  Bits: fedc ba98 7654 3210
    210 - RPL    "Loop size". If non-zero, makes (rX+) and (rX-) respectively
                 modulo-increment and modulo-decrement (see 2.3). The value
                 shows which power of 2 to use, i.e. 4 means modulo by 16.
    43  - RB     Unknown. Not used by SVP code.
    5   - ST5    Affects behavior of external registers. See 2.2.
    6   - ST6    Affects behavior of external registers. See 2.2.
                 According to [3] (5,6) bits correspond to hardware pins.
    7   - IE     Interrupt enable? Not used by SVP code.
    8   - OP     Saturated value? Not used by SVP code.
    9   - MACS   MAC shift? Not used by SVP code.
    a   - GPI_0  Interrupt 0 enable/status? Not used by SVP code.
    b   - GPI_1  Interrupt 1 enable/status? Not used by SVP code.
    c   - L      L flag. Similar to carry? Not used by SVP code.
    d   - Z      Zero flag. Set after ALU operations, when all 32 accumulator
                 bits become zero.
    e   - OV     Overflow flag. Not used by SVP code.
    f   - N      Negative flag. Set after ALU operations, when bit31 in
                 accumulator is 1.
  size: 16

2.1.6. "STACK"
  Hardware stack of 6 levels [3]. Values are "pushed" by directly writing to
  it, or by "call" instruction. "Pop" is performed by directly reading the
  register or by "ret" instruction.
  size: 16

2.1.7. "PC"
  Program Counter. Can be written directly to perform a jump.
  It is not clear if it is possible to read it (SVP code never does).
  size: 16

2.1.8. "P"
  multiply Product - multiplication result register.
  Always contains 32-bit multiplication result of X, Y and 2 (P = X * Y * 2).
  X and Y are sign-extended before performing the multiplication.
  size: 32

2.2. External registers
-----------------------

The external registers, as the name says, are external to SSP160x, they are
hooked to memory controller in SVP, so by accessing them we actually program
the memory controller. They act as programmable memory access registers or
external status registers [1]. Some of them can act as both, depending on how
ST5 and ST6 bits are set in status register. After a register is programmed,
accessing it causes reads/writes from/to external memory (see section 3 for
the memory map). The access may also cause some additional effects, like
incrementing the address associated with the accessed register.
In this document and my emu, instead of using names EXT0-EXT7
from [4] I used different names for these registers. Those names are from
Tasco Deluxe's [1] doc.

All these registers can be blind-accessed (as said in [1]) by performing
(ld -, PMx) or (ld PMx, -). This programs them to access memory (except PMC,
where the effect is different).
All registers are 16-bit.

2.2.1. "PM0"
  If ST5 or ST6 is set, acts as Programmable Memory access register
  (see 2.2.7). Else it acts as status of XST (2.2.4). It is also mapped
  to a15004 on 68k side:
     ???????? ??????10
     0: set, when SSP160x has written something to XST
        (cleared when a15004 is read by 68k)
     1: set, when 68k has written something to a15000 or a15002
        (cleared on PM0 read by SSP160x)
  Note that this is likely to be incorrect, but such behavior is OK for
  emulation to work.

2.2.2. "PM1"
  Programmable Memory access register. Only accessed with ST bits set by
  SVP code.

2.2.3. "PM2"
  Same as PM1.

2.2.4. "XST"
  If ST5 or ST6 is set, acts as Programmable Memory access register
  (only used by memory test code). Else it acts as eXternal STatus
  register, which is also mapped to a15000 and a15002 on 68k side.
  Affects PM0 when written to.

2.2.5. "PM4"
  Programmable Memory access register. Not affected by ST5 and ST6 bits,
  always stays in PMAR mode.

2.2.6. "EXT5"
  Not used by SVP, so not covered by this document.

2.2.7. "PMC"
  Programmable Memory access Control. It is set using 2 16bit writes, first
  address, then mode word. After setting PMC, PMx should be blind accessed
  using (ld -, PMx) or (ld PMx, -) to program it for reading or writing
  external memory respectively. Every PMx register can be programmed to
  access its own memory location with its own mode. Registers are programmed
  separately for reading and writing.

  Reading PMC register also shifts its state (from "waiting for address" to
  "waiting for mode" and back). Reads always return address word related to
  last PMx register accessed, or last address word written to PMC (whichever
  event happened last before PMC read).

  The address word contains bits 0-15 of the memory word-address.
  The mode word format is as follows:
     dsnnnv?? ???aaaaa
     a: bits 16-20 of memory word-address.
     n: auto-increment value. If set, after every access of PMx, word-address
        value related to it will be incremented by (words):
          1 - 1    5 - 16
          2 - 2    6 - 32
          3 - 4    7 - 128
          4 - 8
     d: make auto-increment negative - decrement by count listed above.
     s: special-increment mode. If current address is even (when accessing
        programmed PMx), increment it by 1. Else, increment by 32.
        It is not clear what happens if d and n bits are also set (never
        done by SVP).
     v: over-write mode when writing, unknown when reading (not used).
        Over-write mode splits the word being written into 4 half-bytes and
        only writes those half-bytes that are not zero.
  When auto-increment is performed, it affects all 21 address bits.

2.2.8. "AL"
  This register acts more like a general register.
  If this register is blind-accessed, it is "dummy programmed", i.e. nothing
  happens and PMC is reset to "waiting for address" state.
  In all other cases, it is Accumulator Low, 16 least significant bits of
  accumulator. Normally, when reading the accumulator (ld X, A), you get the
  16 most significant bits, so this allows you to access the low word of the
  32bit accumulator.

2.3. Pointer registers
----------------------

There are 8 8-bit pointer registers rX, which are internal to SSP160x and are
used to access internal RAM banks RAM0 and RAM1, or program memory indirectly.
r0-r3 (ri) point to RAM0, r4-r7 (rj) point to RAM1. Each bank has 256 words of
RAM, so 8bit registers can fully address them. The registers can be accessed
directly, or 2 indirection levels can be used [ (rX), ((rX)) ]. They work
similarly to the * and ** operators in C, only they use different types of
memory and ((rX)) also performs post-increment. First indirection level (rX)
accesses a word in RAMx, second accesses program memory at address read from
(rX), and increments value in (rX).

Only r0,r1,r2,r4,r5,r6 can be directly modified (ldi r0, 5), or by using
modifiers. 3 modifiers can be applied when using first indirection level
(optional):
  + : post-increment (ld a, (r0+) ). Increment register value after operation.
      Can be made modulo-increment by setting RPL bits in status register
      (see 2.1.5).
  - : post-decrement.
      Also can be made modulo-decrement by using RPL bits in ST.
  +!: post-increment, unaffected by RPL (probably).
These are only used on 1st indirection level, so things like ( ld a, ((r0+)) )
and (ld X, r6-) are probably invalid.

r3 and r7 are special and can not be changed (at least Samsung samples [4] and
SVP code never do). They are fixed to the start of their RAM banks. (They are
probably changeable for ssp1605+, Samsung's old DSP page claims that).
1 of these 4 modifiers must be used on these registers (short form direct
addressing? [2]):
  |00: RAMx[0] The very first word in the RAM bank.
  |01: RAMx[1] Second word
  |10: RAMx[2] ...
  |11: RAMx[3]

2.4. The instruction set
------------------------

The Samsung SSP16 series assembler uses right-to-left notation ([2] [4]):
  ld X, Y
means value from Y should be copied to X.

Size of every instruction is word, some have extension words for immediate
values. When writing an interpreter, 7 most significant bits are usually
enough to determine which opcode it is.

encoding bits are marked as:
  rrrr - general or external register, in order specified in 2.1 and 2.2
         (0 is '-', 1 'X', ..., 8 is 'PM0', ..., 0xf is 'AL')
  dddd - same as above, as destination operand
  ssss - same as above, as source operand
  jpp  - pointer register index, 0-7
  j    - specifies RAM bank, i.e. RAM0 or RAM1
  i*   - immediate value bits
  a*   - offset in internal RAM bank
  mm   - modifier for pointer register, depending on register:
             r0-r2,r4-r6  r3,r7  examples
         0:  (none)       |00    ld  a, (r0)      cmp a, (r7|00)
         1:  +!           |01    ld  (r0+!), a    ld  (r7|01), a
         2:  -            |10    add a, (r0-)
         3:  +            |11
  cccc - encodes condition, only 3 used by SVP, see check_cond() below
  ooo  - operation to perform

Operation is written in C-style pseudo-code, where:
  program_memory[X]  - access program memory at address X
  RAMj[X]            - access internal RAM bank j=0,1 (RAM0 or RAM1), word
                       offset X
  RIJ[X]             - pointer register rX, X=0-7
  pr_modif_read(m,X) - read pointer register rX, applying modifier m:
                       if register is r3 or r7, return value m
                       else switch on value m:
                         0: return rX;
                         1: tmp = rX; rX++; return tmp; // rX+!
                         2: tmp = rX; modulo_decrement(rX); return tmp; // rX-
                         3: tmp = rX; modulo_increment(rX); return tmp; // rX+
                       the modulo value used (if used at all) depends on ST
                       RPL bits (see 2.1.5)
  check_cond(c,f)    - checks if a flag matches f bit:
                       switch (c) {
                         case 0: return true;
                         case 5: return (Z == f) ? true : false; // check Z flag
                         case 7: return (N == f) ? true : false; // check N flag
                       } // other conditions are possible, but they are not used
  update_flags()     - update ST flags according to last ALU operation.
  sign_extend(X)     - sign extend 16bit value X to 32bits.
  next_op_address()  - address of instruction after current instruction.

2.4.1. ALU instructions

All of these instructions update flags, which are set according to full 32bit
accumulator. The SVP code only checks N and Z flags, so it is not known when
exactly OV and L flags are set. Operations are performed on full A, so
(andi A, 0) would clear all 32 bits of A.

They share the same addressing modes.
The exact arithmetic operation is determined by the 3 most significant
(ooo) bits:
  001 - sub - subtract (OP -=)
  011 - cmp - compare (OP -, flags are updated according to result)
  100 - add - add (OP +=)
  101 - and - binary AND (OP &=)
  110 - or  - binary OR (OP |=)
  111 - eor - exclusive OR (OP ^=)

  syntax          encoding              operation
  OP  A, s        ooo0 0000 0000 rrrr   A OP r << 16;
  OP  A, (ri)     ooo0 001j 0000 mmpp   A OP RAMj[pr_modif_read(m,jpp)] << 16;
  OP  A, adr      ooo0 011j aaaa aaaa   A OP RAMj[a] << 16;
  OPi A, imm      ooo0 1000 0000 0000   A OP i << 16;
                  iiii iiii iiii iiii
  OP  A, ((ri))   ooo0 101j 0000 mmpp   tmp = pr_modif_read(m,jpp);
                                        A OP program_memory[RAMj[tmp]] << 16;
                                        RAMj[tmp]++;
  OP  A, ri       ooo1 001j 0000 00pp   A OP RIJ[jpp] << 16;
  OPi simm        ooo1 1000 iiii iiii   A OP i << 16;

There is also "perform operation on accumulator" instruction:

  syntax          encoding              operation
  mod cond, op    1001 000f cccc 0ooo   if (check_cond(c,f)) switch(o) {
                                          case 2: A >>= 1; break; // arithmetic shift
                                          case 3: A <<= 1; break;
                                          case 6: A = -A; break; // negate A
                                          case 7: A = abs(A); break; // absolute val.
                                        } // other operations are possible, but
                                          // they are not used by SVP.

2.4.2. Load (move) instructions

These instructions never affect flags (even ld A).
If destination is A, and source is 16bit, only the upper word is transferred
(the same happens in the opposite direction). If dest. is A, and source is P,
the whole 32bit value is transferred. It is not clear if P can be destination
operand (probably not, no code ever does this).
Writing to STACK pushes a value there, reading pops.
It is not known what happens on overflow/underflow (never happens in SVP
code).
ld -, - is used as a nop.

  syntax          encoding              operation
  ld  d, s        0000 0000 dddd ssss   d = s;
  ld  d, (ri)     0000 001j dddd mmpp   d = RAMj[pr_modif_read(m,jpp)];
  ld  (ri), s     0000 010j ssss mmpp   RAMj[pr_modif_read(m,jpp)] = s;
  ldi d, imm      0000 1000 dddd 0000   d = i;
                  iiii iiii iiii iiii
  ld  d, ((ri))   0000 101j dddd mmpp   tmp = pr_modif_read(m,jpp);
                                        d = program_memory[RAMj[tmp]];
                                        RAMj[tmp]++;
  ldi (ri), imm   0000 110j 0000 mmpp   RAMj[pr_modif_read(m,jpp)] = i;
                  iiii iiii iiii iiii
  ld  adr, a      0000 111j aaaa aaaa   RAMj[a] = A;
  ld  d, ri       0001 001j dddd 00pp   d = RIJ[jpp];
  ld  ri, s       0001 010j ssss 00pp   RIJ[jpp] = s;
  ldi ri, simm    0001 1jpp iiii iiii   RIJ[jpp] = i;
  ld  d, (a)      0100 1010 dddd 0000   d = program_memory[A[31:16]];
                                        // read a word from program memory. Offset
                                        // is the upper word in A.

2.4.3. Program control instructions

Only 3 instructions: call, ret (alias of ld PC, STACK) and branch. Indirect
jumps can be performed by simply writing to PC.

  syntax            encoding              operation
  call cond, addr   0100 100f cccc 0000   if (check_cond(c,f)) {
                    aaaa aaaa aaaa aaaa     STACK = next_op_address(); PC = a;
                                          }
  bra  cond, addr   0100 110f cccc 0000   if (check_cond(c,f)) PC = a;
                    aaaa aaaa aaaa aaaa
  ret               0000 0000 0110 0101   PC = STACK; // same as ld PC, STACK

2.4.4. Multiply-accumulate instructions

Not sure if (ri) and (rj) really get loaded into X and Y, but multiplication
result surely is loaded into P.
There is probably an optional 3rd operand (1, 0; encoded by bit16,
default 1), but it's not used by SVP code.

  syntax            encoding              operation
  mld  (rj), (ri)   1011 0111 nnjj mmii   A = 0; update_flags();
                                          X = RAM0[pr_modif_read(m,0ii)];
                                          Y = RAM1[pr_modif_read(n,1jj)];
                                          P = sign_extend(X) * sign_extend(Y) * 2
  mpya (rj), (ri)   1001 0111 nnjj mmii   A += P; update_flags();
                                          X = RAM0[pr_modif_read(m,0ii)];
                                          Y = RAM1[pr_modif_read(n,1jj)];
                                          P = sign_extend(X) * sign_extend(Y) * 2
  mpys (rj), (ri)   0011 0111 nnjj mmii   A -= P; update_flags();
                                          X = RAM0[pr_modif_read(m,0ii)];
                                          Y = RAM1[pr_modif_read(n,1jj)];
                                          P = sign_extend(X) * sign_extend(Y) * 2

-------------------------------------------------------------------------------
 3. Memory map
-------------------------------------------------------------------------------

The SSP160x can access its own program memory, and external memory through EXT
registers (see 2.2). Program memory is read-execute-only, the size of this
space is 64K words (this is how much 16bit PC can address):

  byte address  word address  name
      0-  7ff       0- 3ff    IRAM
    800-1ffff     400-ffff    ROM

There were reports that SVP has internal ROM, but fortunately they were wrong.
The location 800-1ffff is mapped from the same location in the 2MB game ROM.
The IRAM is read-only (as SSP160x doesn't have any means of writing to its
program memory), but it can be changed through external memory space, as it's
also mapped there.

The external memory space seems to match the one visible by 68k, with some
differences:

  68k space      SVP space      word address   name
       0-1fffff       0-1fffff       0- fffff  game ROM
  300000-31ffff  300000-31ffff  180000-18ffff  DRAM
  ?              390000-3907ff  1c8000-1c83ff  IRAM
  390000-39ffff  ?              ?              "cell arrange" 1
  3a0000-3affff  ?              ?              "cell arrange" 2
  a15000-a15009  n/a            n/a            Status/control registers

The external memory can be read/written by SSP160x (except game ROM, which can
only be read).

"cell arrange" 1 and 2 are similar to the one used in Sega CD, they map
300000-30ffff location to 390000-39ffff and 3a0000-3affff, where linear image
written to 300000 can be read as VDP patterns at 390000. Virtua Racing doesn't
seem to use this feature, it is only used by memory test code.

Here is the list of status/control registers (16bit size):
  a15000 - w/r command/result register. Visible as XST for SSP160x (2.2.4).
  a15002 - mirror of the above.
  a15004 - status of command/result register (see 2.2.1).
  a15006 - possibly halts the SVP. Before doing DMA from DRAM, 68k code writes
           0xa, and after it's finished, writes 0. This is probably done to
           prevent SVP accessing DRAM and avoid bus clashes.
  a15008 - possibly causes an interrupt. There is (unused?) code which writes
           0, 1, and again 0 in sequence.


-------------------------------------------------------------------------------
 4. Other notes
-------------------------------------------------------------------------------

The game has arcade-style memory self-check mode, which can be accessed by
pressing _all_ buttons (including directions) on 3-button controller. There was
probably some loopback plug for this.

SVP seems to have a DMA latency issue similar to the one in Sega CD, as the
code always sets the DMA source address value larger by 2 than intended for
copy. This is even true for DMAs from ROM, as it's probably hooked through
SVP's memory controller.

The entry point for the code seems to be at address 0x800 (word 0x400) in ROM,
but it is not clear where the address is fetched from when the system powers
up. The memory test code also sets up "ld PC, .." opcodes at 0x7f4, 0x7f8 and
0x7fc, which jump to some routines, possibly interrupt handlers.
This means
that mentioned addresses might be built-in interrupt vectors.

The SVP code doesn't seem to be timing sensitive, so it can be emulated without
knowing timing of the instructions or even how fast the chip is clocked.
Overclocking doesn't have any effect, underclocking causes slowdowns. Running
10-12M instructions/sec (or possibly less) is sufficient.
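
For emulator writers, the PMC programming described in 2.2.7 (address word,
then mode word, then per-access address update) can be sketched in C roughly
as below. This is only an illustration under the assumptions of this document,
not code from any existing emulator: the names pmx_state, pmx_program,
pmx_step and pmx_advance are made up for this example, and the choice to let
the "s" bit take priority over "d" and "n" is a guess, since that combination
is never used by SVP code.

```c
#include <stdint.h>

/* Hypothetical state of one programmed PMx register (illustrative names). */
typedef struct {
    uint32_t addr;  /* 21-bit external memory word-address */
    uint16_t mode;  /* mode word as written to PMC: dsnnnv?? ???aaaaa */
} pmx_state;

/* Combine the two PMC writes: address word first, then mode word.
 * Mode bits 0-4 supply bits 16-20 of the word-address. */
static pmx_state pmx_program(uint16_t addr_word, uint16_t mode_word)
{
    pmx_state s;
    s.addr = ((uint32_t)(mode_word & 0x001f) << 16) | addr_word;
    s.mode = mode_word;
    return s;
}

/* Auto-increment step in words for the nnn field (bits 11-13).
 * 0 means no auto-increment; note that 7 maps to 128, not 64. */
static uint32_t pmx_step(uint16_t mode_word)
{
    static const uint32_t step[8] = { 0, 1, 2, 4, 8, 16, 32, 128 };
    return step[(mode_word >> 11) & 7];
}

/* Update the address after one access of the programmed register. */
static void pmx_advance(pmx_state *s)
{
    uint32_t a = s->addr;

    if (s->mode & 0x4000)           /* s: special-increment mode */
        a += (a & 1) ? 32 : 1;      /* even: +1, odd: +32 (ASSUMPTION: */
                                    /* d and n are ignored here)       */
    else if (s->mode & 0x8000)      /* d: decrement by the step */
        a -= pmx_step(s->mode);
    else
        a += pmx_step(s->mode);

    s->addr = a & 0x1fffff;         /* all 21 address bits are affected */
}
```

An emulator would keep one such state per PMx register and per direction
(read and write), since the registers are programmed separately for reading
and writing.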