
14

An Overview of Motorola DSP563XX Processors

The Motorola DSP56300 family of programmable DSPs is deployed in applications such as wireless infrastructure, internet telephony, base transceiver stations, network interface cards, base station controllers and high-speed modem banks. The family includes processors such as the DSP56301, DSP56305, DSP56307, DSP56309 and DSP56311, which differ in their mix of on-chip memory, peripherals and coprocessors. All of them are built around a standardised DSP56300 core, which provides up to twice the performance of Motorola's popular DSP56000 core family while retaining code compatibility. Some details of the DSP56301 are presented first; its features are then compared with those of some of the other DSPs in this family. The block diagram of the DSP56301 is shown in Fig. 14.1. The DSP56300 core shown in this figure is common to all the DSP56300 family DSPs. The core is composed of the data arithmetic logic unit (Data ALU), address generation unit (AGU), program control unit (PCU), instruction cache controller, bus interface unit, direct memory access (DMA) controller, on-chip emulation (OnCE) module and a PLL-based clock oscillator.

DATA ALU

14.1

The Data ALU performs all the arithmetic and logical operations on data operands in the DSP56300 core. It consists of a pipelined 24 × 24-bit multiplier-accumulator (MAC) and a 56-bit barrel shifter. It has four 24-bit general-purpose input registers: X1, X0, Y1 and Y0. These registers can be used either individually or combined into two 48-bit registers called X and Y respectively. For single-precision operations X0, X1, Y0 and Y1 are used individually; for double-precision operations X and Y are used. The CPU gets its operands from two independent memory areas denoted X and Y. The X register holds the operand read from or written to the X memory; similarly, the Y register holds the operand read from or written to the Y memory. The Data ALU also has six accumulator registers: A2, A1, A0, B2, B1 and B0. These may be concatenated into two general-purpose 56-bit accumulators, A and B, as shown in Fig. 14.2. The 8-bit parts A2 and B2 are called extension registers. They extend the range of the accumulators to about ±256. The accumulators have both an integer part and a fractional part, as shown in Fig. 14.2; the integer part is contained in the A2 and B2 registers. The accumulators may be switched to saturation mode. In this mode A2 and B2 are not used, and A and B are limited to fractional values.
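The concatenation of the accumulator parts can be illustrated with a short Python sketch. The function names are hypothetical, and the placement of the binary point (47 fractional bits below the extension and sign bits) is an assumption chosen to be consistent with the 24-bit fractional operands and the range of about ±256 stated above; it is not taken from the device documentation.

```python
def pack_accumulator(a2, a1, a0):
    """Concatenate A2 (8 bits), A1 (24 bits) and A0 (24 bits) into one 56-bit word."""
    return ((a2 & 0xFF) << 48) | ((a1 & 0xFFFFFF) << 24) | (a0 & 0xFFFFFF)


def accumulator_value(word56):
    """Read a 56-bit word as two's complement with 47 fractional bits (assumed format)."""
    if word56 & (1 << 55):          # sign bit of the extension register is set
        word56 -= 1 << 56
    return word56 / float(1 << 47)


half = accumulator_value(pack_accumulator(0x00, 0x400000, 0x000000))
minus_one = accumulator_value(pack_accumulator(0xFF, 0x800000, 0x000000))
most_negative = accumulator_value(pack_accumulator(0x80, 0x000000, 0x000000))
```

Under these assumptions a value held only in A1 behaves as a 24-bit fraction, while a non-trivial A2 supplies the integer headroom of the accumulator.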

DigitalSignalProcessors
[Block diagram: the 24-bit DSP56300 core (Data ALU with a 24 × 24 + 56 → 56-bit MAC, two 56-bit accumulators and a 56-bit barrel shifter; address generation unit; six-channel DMA unit; program interrupt controller, program decode controller and program address generator; clock generator with PLL; JTAG and OnCE; external bus interface and I-cache control; internal and external data bus switches; power management; bootstrap ROM), connected over the address buses YAB, XAB, PAB and DAB and the data buses DDB, YDB, XDB, PDB and GDB to the peripheral expansion area (host interface, two ESSIs, SCI and triple timer, via PIO_EB) and the memory expansion area (program RAM, 4096 × 24 by default, via PM_EB; X data RAM, 2048 × 24 by default, via XM_EB; Y data RAM, 2048 × 24 by default, via YM_EB). External pins include the 24-bit address and 24-bit data buses, bus control, XTAL/EXTAL, RESET, PINIT/NMI, MODA/IRQA to MODD/IRQD and DE.]

Fig. 14.1 DSP56301 block diagram

The CPU can be made to operate in either 24-bit or 16-bit mode under software control. In the 16-bit mode, single-precision numbers have 16 bits and double-precision numbers have 32 bits. In this mode the Data ALU registers can be read or written over the X data bus (XDB) and the Y data bus (YDB) as 16- or 32-bit operands, and the source operands for the Data ALU, which can be 16, 32 or 40 bits wide, always originate from Data ALU registers. (In the 24-bit mode the corresponding widths are 24, 48 and 56 bits.) The results of all Data ALU operations are stored in an accumulator. All Data ALU operations take two clock cycles, but because they are pipelined a new instruction can be initiated on every clock, yielding an effective execution rate of one instruction per clock cycle. The destination of every arithmetic operation can be used as a source operand for the immediately following operation without penalty.

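[Register diagram: the 24-bit input registers X1, X0, Y1 and Y0 (bits 23 to 0 each), and the accumulators A and B, each formed from an 8-bit extension register (A2, B2) and two 24-bit registers (A1, A0 and B1, B0).]

Fig. 14.2 Data ALU registers of Motorola DSP56301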

MULTIPLIER-ACCUMULATOR (MAC)

14.2

The MAC unit is the main arithmetic processing unit of the DSP56300 core and performs all of the calculations on data operands. For arithmetic instructions, the unit accepts as many as three input operands and outputs one 56-bit result of the form Extension:Most Significant Product:Least Significant Product (EXT:MSP:LSP). The multiplier executes 24-bit × 24-bit parallel fractional multiplies between two's-complement signed, unsigned or mixed operands. The 48-bit product is right-justified and added to the 56-bit contents of either the A or the B accumulator. A 56-bit result can be stored as a 24-bit operand, in which case the LSP is either truncated or, if rounding is specified, rounded into the MSP. The inputs to the multiplier can come only from the X and Y registers. The output of the multiplier can be added to or subtracted from either of the accumulators. The contents of the accumulators can be moved to either the X or the Y memory area. The MAC unit performs a multiply-accumulate operation in two cycles. However, since it is pipelined, a multiply-accumulate operation can be initiated every cycle and one result is produced every clock cycle, giving an effective rate of one MAC per cycle.
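The multiply-accumulate data path described above can be sketched in Python for the signed case. This is a minimal model under stated assumptions, not the device's implementation: the helper names are hypothetical, the one-bit left shift that aligns the fractional product to 48 bits is an assumption, the accumulator simply wraps at 56 bits, and only truncation (not rounding) of the LSP is shown.

```python
MASK56 = (1 << 56) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mac(acc56, x24, y24):
    """Return acc + x*y for signed 24-bit fractional x and y, wrapped to 56 bits."""
    product = to_signed(x24, 24) * to_signed(y24, 24)
    product <<= 1                     # assumed fractional alignment of the 48-bit product
    return (to_signed(acc56, 56) + product) & MASK56


def msp(acc56):
    """Truncate the LSP and return the 24-bit most significant product."""
    return (acc56 >> 24) & 0xFFFFFF


HALF = 0x400000                       # 0.5 as a 24-bit fraction
acc = mac(0, HALF, HALF)              # 0.5 * 0.5
acc2 = mac(acc, HALF, HALF)           # accumulate a second product
neg = mac(0, 0xC00000, HALF)          # -0.5 * 0.5
```

Reading back `msp(acc)` gives the 24-bit fraction for 0.25, and accumulating the same product again gives 0.5, which mirrors how a sum of products builds up in A or B before it is stored as a 24-bit operand.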

ADDRESS GENERATION UNIT (AGU)

14.3

The block diagram of the DSP56300 AGU is shown in Fig. 14.3. The AGU performs the effective address calculations, using the integer arithmetic necessary to address data operands in memory, and contains the registers that generate the addresses. Several of its features are similar to those of the data address generation logic (DAGEN) block of the TI TMS320C54X. It has two address arithmetic logic units (address ALUs), which are similar to ARAU0 and ARAU1 of the 54X. The AGU operates in parallel with the main Data ALU, and all effective address calculations are done without using the Data ALU. It has eight 24-bit address registers, R0-R7, which are used either to specify an indirect address or to hold an operand for an instruction. Their operation is similar to that of the 16-bit auxiliary registers AR0-AR7 of the 54X. Addresses are computed using four modes: linear, modulo, reverse-carry and multiple wrap-around modulo. The first three modes are equivalent to the linear, circular and bit-reversed addressing modes of the 54X. The fourth mode is a special circular addressing mode.

The multiple wrap-around modulo arithmetic differs from the regular modulo arithmetic in the following respects. In the regular modulo M mode, M ranges from 2 to 32,768. Modulo M arithmetic causes the address register value to remain within an address range of size M, defined by a lower and an upper address boundary. If an offset, Nn, is used in the address calculations, the 24-bit absolute value, |Nn|, must be less than or equal to M for proper modulo addressing. If |Nn| > M, the result is data dependent and unpredictable, except for the special case where Nn = P × 2^k, a multiple of the block size, where P is a positive integer. For this special case, when using the (Rn)+Nn addressing mode, the pointer Rn will jump linearly to the same relative address in a new buffer, which is P blocks forward in memory. Similarly, for (Rn)-Nn, the pointer will jump P blocks backward in memory. This technique is useful in sequentially processing multiple tables or N-dimensional arrays.
[Block diagram: a low address ALU operating on the register triplet R0-R3, N0-N3, M0-M3 and a high address ALU operating on R4-R7, N4-N7, M4-M7, together with the EP register and a triple multiplexer driving XAB, YAB and PAB; the unit is connected to the global data bus and the program address bus.]

Fig. 14.3 Block diagram of the DSP56300 family address generation unit

In the multiple wrap-around addressing mode, the address modification is performed modulo M, where M may be any power of 2 in the range from 2^1 to 2^14. Modulo M arithmetic causes the address register value to remain within an address range of size M defined by a lower and an upper address boundary. The value M-1 is stored in the 15 least significant bits of the modifier register Mn, while the 16th bit (bit 15) is set to one and the eight most significant bits are ignored. The lower boundary (base address) must have zeroes in the k LSBs, where 2^k = M, and must therefore be a multiple of 2^k. The upper boundary is the lower boundary plus the modulo size minus one (base address plus M-1). The address pointer is not required to start at the lower address boundary and may begin anywhere within the defined modulo address range (between the lower and upper boundaries). If the address register pointer increments past the upper boundary of the buffer (base address plus M-1), it wraps around to the base address. If the address decrements past the lower boundary (base address), it wraps around to the base address plus M-1. If an offset Nn is used in the address calculations, it is not required to be less than or equal to M for proper modulo addressing, since multiple wrap-around is supported for the (Rn)+Nn, (Rn)-Nn and (Rn+Nn) address updates (multiple wrap-around cannot occur with the (Rn)+, (Rn)- and -(Rn) addressing modes). Like the dual-memory addressing mode of the 54X, the DSP56300 core allows two operands to be fetched from data memory simultaneously, and the addresses of the next operands in this mode can be computed simultaneously in the two address ALUs.
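The wrap-around rules above can be sketched in a few lines of Python. The function names are hypothetical and the code is only a behavioural model of the description in the text (base address aligned to 2^k = M, offsets of any size allowed), not a description of the hardware.

```python
def encode_modifier(m):
    """Mn value for multiple wrap-around modulo m: M-1 in the low bits, bit 15 set."""
    if m < 2 or m > (1 << 14) or m & (m - 1):
        raise ValueError("m must be a power of two from 2**1 to 2**14")
    return 0x8000 | (m - 1)


def wraparound_update(rn, nn, m):
    """Return Rn after an (Rn)+Nn update confined to a buffer of size m."""
    encode_modifier(m)                 # reuse the range check on m
    base = rn & ~(m - 1)               # lower boundary: k LSBs cleared, 2**k == m
    return base + (rn - base + nn) % m


step_past_top = wraparound_update(0x107, 1, 8)      # upper boundary + 1
step_below_base = wraparound_update(0x100, -1, 8)   # lower boundary - 1
large_offset = wraparound_update(0x103, 21, 8)      # offset larger than M
```

With an 8-word buffer based at 0x100, stepping past 0x107 returns to 0x100, stepping below 0x100 returns to 0x107, and an offset of 21 wraps several times before landing inside the same buffer.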


In addition to the address registers R0-R7, there are eight 24-bit modifier registers M0-M7, eight 24-bit offset registers N0-N7 and a 24-bit stack extension pointer (EP). The contents of the EP register point to the stack extension in data memory whenever the stack extension is enabled and move operations to or from the on-chip hardware stack are needed. The operands for one of the address ALUs come from the triplet (R0-R3, M0-M3, N0-N3); the operands for the other address ALU come from the triplet (R4-R7, M4-M7, N4-N7). The two address ALUs are identical. Each contains a 24-bit full adder (called an offset adder). A second full adder (called a modulo adder) adds the summed result of the first full adder to a modulo value that is stored in its respective modifier register. A third full adder (called a reverse-carry adder) is also provided. The offset adder and the reverse-carry adder operate in parallel and share common inputs; the only difference between them is that the carry propagates in opposite directions. Test logic determines which of the three summed results of the full adders is output. Each address ALU can update one address register from its respective address register file during one instruction cycle. The contents of the associated modifier register specify the type of arithmetic to be used in the address register update calculation. The modifier value is decoded in the address ALU.
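The effect of propagating the carry in the opposite direction can be shown with a small Python model of a reverse-carry adder. The helper names are hypothetical, and the model (reverse the bits, add normally, reverse back) is only one way to express the behaviour described above; the default width of 24 bits is an assumption matching the width of the address registers.

```python
def bit_reverse(value, width):
    """Reverse the order of the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(a, b, width=24):
    """Add a and b with the carry propagating from the MSB towards the LSB."""
    mask = (1 << width) - 1
    total = bit_reverse(a & mask, width) + bit_reverse(b & mask, width)
    return bit_reverse(total & mask, width)


# Walk an 8-point table by repeatedly adding half the table size (4) on 3 bits.
sequence = [0]
for _ in range(7):
    sequence.append(reverse_carry_add(sequence[-1], 4, width=3))
```

Repeatedly adding half the table size in this way visits the indices in bit-reversed order, which is why the reverse-carry mode corresponds to the bit-reversed addressing used for FFT data.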

PROGRAM CONTROL UNIT (PCU)

14.4

The PCU performs instruction prefetch, instruction decoding, hardware DO loop control and exception processing. It implements a seven-stage pipeline and controls the different processing states of the DSP56300 core. The PCU consists of three hardware blocks: the program decode controller (PDC), the program address generator (PAG) and the program interrupt controller (PIC). The PDC decodes the 24-bit instruction loaded into the instruction latch and generates all the signals necessary for pipeline control. The PAG contains all the hardware needed for program address generation, the system stack and loop control. The PIC arbitrates among all interrupt requests (internal interrupts, as well as the five external requests IRQA, IRQB, IRQC, IRQD and NMI) and generates the appropriate interrupt vector address. The PCU supports the following features: position-independent code support, addressing modes optimised for DSP applications (including immediate offsets), an on-chip instruction cache controller, an on-chip memory-expandable hardware stack, nested hardware DO loops and fast auto-return interrupts. The PCU contains a number of registers, such as the program counter (PC), status register (SR), loop address register (LA), loop counter (LC), vector base address register (VBA) and stack pointer. It also contains a hardware stack.

JTAG TAP AND ONCE MODULE

14.5

The DSP56300 core provides a dedicated user-accessible test access port (TAP) which is fully compatible with the IEEE 1149.1 Standard Test Access Port and Boundary Scan Architecture. This eases the problems associated with testing high density circuit boards. The OnCE module provides a means of interacting with the DSP56300 core and its peripherals nonintrusively so that a user can examine registers, memory or on-chip peripherals. This facilitates hardware and software development on the DSP56300 core processor. OnCE module functions are provided through the JTAG TAP signals.


ON-CHIP PERIPHERALS

14.6

14.6.1 Host Interface (HI32)
The HI32 is a 32-bit PCI/universal parallel port that can connect directly to the data bus of a host processor. The HI32 supports a variety of buses and connects to a number of industry-standard DSPs, microcomputers and microprocessors without requiring additional logic. The DSP core treats the HI32 as a memory-mapped peripheral occupying eight 24-bit words in data memory space, and can service it using either standard polled or interrupt programming techniques. Separate transmit and receive data registers are double-buffered to allow the DSP and the host processor to transfer data efficiently at high speed. Memory mapping allows the DSP core to communicate with the HI32 registers using standard instructions and addressing modes.

14.6.2 Enhanced Synchronous Serial Interface (ESSI)
The DSP56301 has two independent and identical ESSIs. Each ESSI has a full-duplex serial port for communication with a variety of serial devices, including one or more industry-standard codecs, other DSPs, microprocessors and peripherals that implement the Motorola SPI. The ESSI consists of independent transmitter and receiver sections and a common ESSI clock generator. The capabilities of the ESSI include the following:
• Independent (asynchronous) or shared (synchronous) transmit and receive sections with separate or shared internal/external clocks and frame syncs
• Normal mode operation using frame sync
• Network mode operation with as many as 32 time slots
• Programmable word length (8, 12 or 16 bits)
• Program options for frame synchronisation and clock generation

14.6.3 Serial Communications Interface (SCI)
The DSP56301 SCI provides a full-duplex port for serial communication with other DSPs, microprocessors or peripherals such as modems. The SCI interfaces without additional logic to peripherals that use TTL-level signals. With a small amount of additional logic, the SCI can connect to peripheral interfaces that have non-TTL-level signals, such as RS-232C and RS-422. This interface uses three dedicated signals: transmit data (TXD), receive data (RXD) and SCI serial clock (SCLK). It supports industry-standard asynchronous bit rates and protocols, as well as high-speed synchronous data transmission (up to 8.25 Mbps for a 66-MHz clock).

14.6.4 Timer Module
The triple timer module is composed of a common 21-bit prescaler and three independent and identical general-purpose 24-bit timer/event counters, each with its own memory-mapped register set. Each timer can use internal or external clocking and can interrupt the DSP after a specified number of events (clocks) or signal an external device after counting internal events. Each timer connects to the external world through a single bidirectional signal, which can function either as a GPIO signal or as a timer signal. When this signal is configured as an input, the timer can function as an external event counter or measure an external pulse width or signal period. When the signal is used as an output, the timer can function as a timer, a watchdog or a pulse width modulator (PWM).


ON-CHIP MEMORY

14.7

The memory space of the DSP56300 core is partitioned into program memory space, X data memory space and Y data memory space. The data memory space is divided into X data memory and Y data memory in order to work with the two address ALUs and to feed two operands simultaneously to the Data ALU. The memory space includes internal RAM and ROM and can be expanded off-chip under software control. The DSP56301 has 8K × 24-bit on-chip RAM, which can be configured under software control in the ways shown in Table 14.1. There is also an on-chip 192 × 24-bit bootstrap ROM.
Table 14.1 On-chip RAM configuration options for DSP56301

Program RAM    Instruction cache    X data RAM    Y data RAM
4096 × 24      0                    2048 × 24     2048 × 24
3072 × 24      1024 × 24            2048 × 24     2048 × 24
2048 × 24      0                    3072 × 24     3072 × 24
1024 × 24      1024 × 24            3072 × 24     3072 × 24
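As a quick consistency check on Table 14.1, the four configurations can be written down as data and summed; each should account for the full 8K words of on-chip RAM. The variable and function names in this Python sketch are illustrative only.

```python
# Rows of Table 14.1 in 24-bit words:
# (program RAM, instruction cache, X data RAM, Y data RAM)
RAM_CONFIGS = [
    (4096, 0, 2048, 2048),
    (3072, 1024, 2048, 2048),
    (2048, 0, 3072, 3072),
    (1024, 1024, 3072, 3072),
]

TOTAL_WORDS = 8 * 1024  # 8K x 24-bit on-chip RAM


def total_words(config):
    """Sum the words assigned to all four regions in one configuration."""
    return sum(config)


totals = [total_words(config) for config in RAM_CONFIGS]
all_consistent = all(total == TOTAL_WORDS for total in totals)
```

Every row adds up to 8192 words, so enabling the instruction cache or enlarging the data RAMs only redistributes the same physical memory.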

INTERNAL BUSES

14.8

The following buses, shown in Fig. 14.1, provide data exchange between the functional blocks of the core:
• Peripheral I/O expansion bus (PIO_EB) to peripherals
• Program memory expansion bus (PM_EB) to program RAM
• X memory expansion bus (XM_EB) to X memory
• Y memory expansion bus (YM_EB) to Y memory
• Global data bus (GDB) between the PCU and other core structures
• Program data bus (PDB) for carrying program data throughout the core
• X memory data bus (XDB) for carrying X data throughout the core
• Y memory data bus (YDB) for carrying Y data throughout the core
• Program address bus (PAB) for carrying program memory addresses throughout the core
• X memory address bus (XAB) for carrying X memory addresses throughout the core
• Y memory address bus (YAB) for carrying Y memory addresses throughout the core
All internal buses on the DSP56300 family members are 16-bit buses except the PDB, which is a 24-bit bus.

DIRECT MEMORY ACCESS (DMA)

14.9

The DMA block of the DSP56300 core has the following features:
• Six DMA channels supporting internal and external accesses
• One-, two- and three-dimensional transfers (including circular buffering)
• End-of-block-transfer interrupts
• Triggering from interrupt lines, all peripherals and DMA channels


INSTRUCTION SET OF DSP56300 FAMILY PROCESSORS

ADDRESSING MODES

14.10

The DSP56300 core provides four different addressing modes: register direct, address register indirect, PC relative and special.

14.10.1 Register Direct Mode
In this mode, the operand is in one (or more) of the 10 Data ALU registers, 24 address registers or 7 control registers.

14.10.2 Address Register Indirect Modes
This is similar to the indirect addressing mode used in the TI TMS320C54X. One or more address registers are used to specify the address of the operand. The register used to specify the operand may be modified either before or after the operand is fetched. The address modification is carried out using the address ALUs. There are nine ways in which instructions using indirect addressing may modify the register that specifies the operand address.

14.10.2.1 No Update (Rn)
The address of the operand is in the address register Rn. The contents of the Rn register are unchanged by executing the instruction.

14.10.2.2 Postincrement by 1 (Rn)+
The address of the operand is in the address register Rn. After the operand address is used, it is incremented by 1 and stored in the same address register. The type of arithmetic used to calculate the new value of Rn is determined by Mn. The Nn register is ignored.

14.10.2.3 Postdecrement by 1 (Rn)-
The address of the operand is in the address register Rn. After the operand address is used, it is decremented by 1 and stored in the same address register. The type of arithmetic used to calculate the new value of Rn is determined by Mn. The Nn register is ignored.

14.10.2.4 Postincrement by Offset Nn (Rn)+Nn
The address of the operand is in the address register Rn. After the operand address is used, it is incremented by the contents of the Nn register and stored in the same address register. The type of arithmetic used to calculate the new value of Rn is determined by Mn. The contents of the Nn register are unchanged.

14.10.2.5 Postdecrement by Offset Nn (Rn)-Nn
The address of the operand is in the address register Rn. After the operand address is used, it is decremented by the contents of the Nn register and stored in the same address register. The type of arithmetic used to calculate the new value of Rn is determined by Mn. The contents of the Nn register are unchanged.

14.10.2.6 Indexed by Offset Nn (Rn+Nn)
This is similar to the indexed addressing mode in the 54X. The address of the operand is the sum of the contents of the address register Rn and the contents of the address offset register Nn. The type of arithmetic used to calculate the effective address is determined by Mn. The contents of the Rn and Nn registers are unchanged.


14.10.2.7 Predecrement by 1 -(Rn)
The address of the operand is the contents of the address register Rn, decremented by 1. The contents of Rn are decremented and stored in the same address register before the operand is accessed. The type of arithmetic used to calculate the new value of Rn is determined by Mn. The Nn register is ignored.

14.10.2.8 Short Displacement (Rn+short displacement)
In this addressing mode the address of the operand is the sum of the contents of the address register Rn and a short displacement occupying seven bits in the instruction word. The displacement is first sign extended to 24 bits and then added to Rn to obtain the address of the operand. The contents of the Rn register are unchanged. The type of arithmetic used to calculate the effective address is determined by Mn. The Nn register is ignored. This reference is classified as a memory reference.

14.10.2.9 Long Displacement (Rn+long displacement)
This addressing mode requires one word (label) of instruction extension. The address of the operand is the sum of the contents of the address register Rn and the extension word. The contents of the Rn register are unchanged. The type of arithmetic used to calculate the effective address is determined by Mn. The Nn register is ignored. This reference is classified as a memory reference.

14.10.3 PC Relative Modes
In the PC relative addressing modes, the address of the operand is obtained by adding a displacement, represented in two's-complement format, to the value of the program counter (PC). The PC points to the address of the instruction's opcode word. The Nn and Mn registers are ignored, and the arithmetic used is always linear.

14.10.3.1 Short Displacement PC Relative
The short displacement occupies nine bits in the instruction operation word. The displacement is first sign extended to 24 bits and then added to the PC to obtain the address of the operand.

14.10.3.2 Long Displacement PC Relative
This addressing mode requires one word of instruction extension. The address of the operand is the sum of the contents of the PC and the extension word.

14.10.3.3 Address Register PC Relative
The address of the operand is the sum of the contents of the PC and the address register Rn. The Mn and Nn registers are ignored. The contents of the Rn register are unchanged.

14.10.4 Special Address Modes
The special address modes do not use an address register in specifying an effective address. These modes specify the operand or the address of the operand in a field of the instruction, or they implicitly reference an operand.

14.10.4.1 Immediate Data
This addressing mode requires one word of instruction extension. The immediate data is a word operand in the extension word of the instruction. This reference is classified as a program reference.


14.10.4.2 Immediate Short Data
The 8-bit or 12-bit operand is in the instruction operation word. The 8-bit operand is used for immediate moves to a register and for the ANDI and ORI instructions, and it is zero extended. The 12-bit operand is used for the DO and REP instructions, and it is zero extended. This reference is classified as a program reference.

14.10.4.3 Absolute Address
This addressing mode requires one word of instruction extension. The address of the operand is in the extension word. This reference is classified as a memory reference and a program reference.

14.10.4.4 Absolute Short Address
For the absolute short addressing mode, the address of the operand occupies six bits in the instruction operation word and is zero extended. This reference is classified as a memory reference.

14.10.4.5 Short Jump Address
The operand occupies 12 bits in the instruction operation word. The address is zero extended to 24 bits. This reference is classified as a program reference.

14.10.4.6 I/O Short Address
For the I/O short addressing mode, the address of the operand occupies six bits in the instruction operation word and is ones extended. I/O short addressing is used with the bit manipulation and move peripheral data instructions.

14.10.4.7 Implicit Reference
Some instructions make implicit reference to the program counter (PC), system stack (SSH, SSL), loop address register (LA), loop counter (LC) or status register (SR). The registers implied and their use are defined by the individual instruction descriptions.
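Several of the modes above differ only in how a short instruction field is widened to a 24-bit address. The Python sketch below models the three kinds of extension mentioned in the text (sign, zero and ones extension). The function names are hypothetical, and the final wrap of the computed addresses to 24 bits is an assumption of this model.

```python
ADDR_BITS = 24
ADDR_MASK = (1 << ADDR_BITS) - 1


def sign_extend(field, bits):
    """Sign-extend a `bits`-wide field to a 24-bit value."""
    field &= (1 << bits) - 1
    if field & (1 << (bits - 1)):
        field -= 1 << bits
    return field & ADDR_MASK


def zero_extend(field, bits):
    """Zero-extend a `bits`-wide field (upper address bits become 0)."""
    return field & ((1 << bits) - 1)


def ones_extend(field, bits):
    """Ones-extend a `bits`-wide field (upper address bits become 1)."""
    low_mask = (1 << bits) - 1
    return (field & low_mask) | (ADDR_MASK & ~low_mask)


def short_displacement_address(rn, disp7):
    """(Rn + short displacement): 7-bit displacement, sign extended."""
    return (rn + sign_extend(disp7, 7)) & ADDR_MASK


def short_pc_relative_address(pc, disp9):
    """Short displacement PC relative: 9-bit displacement, sign extended."""
    return (pc + sign_extend(disp9, 9)) & ADDR_MASK


back_one = short_displacement_address(0x001000, 0x7F)   # displacement of -1
branch_back = short_pc_relative_address(0x000200, 0x1F0)  # displacement of -16
io_top = ones_extend(0x3F, 6)
io_base = ones_extend(0x00, 6)
```

In this model a 6-bit I/O short address always lands in the top 64 locations of the address space, while a 6-bit absolute short address lands in the bottom 64.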

SUMMARY OF THE INSTRUCTION SET

14.11

The DSP56300 core supports a large number of instructions. To give an appreciation of the various types of instructions supported, they are listed in Table 14.2. For more details of these instructions, the DSP56300 family manual may be consulted. Some of the notations used in Table 14.2 are discussed next. The instruction set of the DSP56300 is designed to keep the Data ALU, AGU and PCU busy in each instruction cycle, achieving maximum speed and minimum program size. The arithmetic instructions perform all of the arithmetic operations within the Data ALU. These instructions may affect all of the CCR bits. Arithmetic instructions are register based (register direct addressing modes are used for the operands), so the Data ALU operation indicated by the instruction does not use the XDB, the YDB or the global data bus (GDB). Optional data transfers may be specified with most arithmetic instructions, which allows parallel data movement over the XDB and YDB or over the GDB during a Data ALU operation. This parallel movement allows new data to be prefetched for use in subsequent instructions and allows results calculated in previous instructions to be stored. In Table 14.2(a)-(f) the instructions which permit a parallel move are also indicated. In Table 14.2 the symbols S and D indicate the source of the operand and the destination of the result of an instruction. S1 and S2 denote the registers X0, X1, Y0 and Y1. S and D may refer to one of the accumulators. Rn denotes the address registers R0-R7. Some of the other symbols and their interpretation are as follows:


D[n]       Bit n of the destination operand register D
#n         Immediate short data (5 bits)
#xx        Immediate short data (8 bits)
#xxx       Immediate short data (12 bits)
#xxxxxx    Immediate data (24 bits)
CO         Control word offset
ea         Effective address
eax        Effective address for X bus
eay        Effective address for Y bus
xxxx       Absolute or long displacement address (24 bits)
xxx        Short jump address (12 bits)
xxx        Short displacement jump address (9 bits)
aaa        Short displacement address (7 bits, sign extended)
aa         Absolute short address (6 bits, zero extended)
pp         High I/O short address (6 bits, ones extended)
qq         Low I/O short address (6 bits)
Table 14.2(a) Arithmetic instructions

Mnemonic          Description                                               Syntax and instruction type
ABS               Absolute value                                            ABS D [parallel move]
ADC               Add long with carry                                       ADC S,D [parallel move]
ADD               Add                                                       ADD S,D [parallel move]
ADD (imm.)        Add (immediate operand)
ADDL              Shift left and add                                        ADDL S,D [parallel move]
ADDR              Shift right and add                                       ADDR S,D [parallel move]
ASL               Arithmetic shift left                                     ASL D [parallel move]
ASL (mb.)         Arithmetic shift left (multibit)
ASL (mb., imm.)   Arithmetic shift left (multibit, immediate operand)
ASR               Arithmetic shift right                                    ASR D [parallel move]
ASR (mb.)         Arithmetic shift right (multibit)
ASR (mb., imm.)   Arithmetic shift right (multibit, immediate operand)
CLR               Clear an operand                                          CLR D [parallel move]
CMP               Compare                                                   CMP S1,S2 [parallel move]
CMP (imm.)        Compare (immediate operand)
CMPM              Compare magnitude                                         CMPM S1,S2 [parallel move]
CMPU              Compare unsigned                                          CMPU S1,S2
DEC               Decrement accumulator                                     DEC D
DIV               Divide iteration                                          DIV S,D
DMAC              Double-precision multiply-accumulate                      DMAC S1,S2,D
INC               Increment accumulator                                     INC D
MAC               Signed multiply-accumulate                                MAC S1,S2,D [parallel move]
MAC (su,uu)       Mixed multiply-accumulate                                 MACsu (±)S1,S2,D
                                                                            MACuu (±)S1,S2,D
MACI              Signed multiply-accumulate (immediate operand)            MACI #xxxxxx,S,D
MACR              Signed multiply-accumulate and round                      MACR S1,S2,D [parallel move]
MACRI             Signed multiply-accumulate and round (immediate operand)  MACRI #xxxxxx,S,D
MAX               Transfer by signed value                                  MAX A,B [parallel move]
MAXM              Transfer by magnitude                                     MAXM A,B [parallel move]
MPY               Signed multiply                                           MPY S1,S2,D [parallel move]
MPY (su,uu)       Mixed multiply                                            MPYsu (±)S1,S2,D
                                                                            MPYuu (±)S1,S2,D
MPYI              Signed multiply (immediate operand)                       MPYI #xxxxxx,S,D
MPYR              Signed multiply and round                                 MPYR S1,S2,D [parallel move]
MPYRI             Signed multiply and round (immediate operand)             MPYRI #xxxxxx,S,D
NEG               Negate accumulator                                        NEG D [parallel move]
NORM              Normalise                                                 NORM Rn,D
NORMF             Fast accumulator normalise                                NORMF S,D
RND               Round                                                     RND D [parallel move]
SBC               Subtract long with carry                                  SBC S,D [parallel move]
SUB               Subtract                                                  SUB S,D [parallel move]
SUB (imm.)        Subtract (immediate operand)
SUBL              Shift left and subtract                                   SUBL S,D [parallel move]
SUBR              Shift right and subtract                                  SUBR S,D [parallel move]
Tcc               Transfer conditionally                                    Tcc S1,D1 [S2,D2]
TFR               Transfer data ALU register                                TFR S,D [parallel move]
TST               Test an operand                                           TST S [parallel move]

Table 14.2(b) Logical instructions

Mnemonic          Description                                               Syntax and instruction type
AND               Logical AND                                               AND S,D [parallel move]
AND (imm.)        Logical AND (immediate operand)
ANDI              AND immediate to control register                         ANDI #xx,D
CLB               Count leading bits                                        CLB S,D
EOR               Logical exclusive OR                                      EOR S,D [parallel move]
EOR (imm.)        Logical exclusive OR (immediate operand)
EXTRACT           Extract bit field                                         EXTRACT S1,S2,D
EXTRACT (imm.)    Extract bit field (immediate operand)                     EXTRACT #CO,S2,D
EXTRACTU          Extract unsigned bit field                                EXTRACTU S1,S2,D
EXTRACTU (imm.)   Extract unsigned bit field (immediate operand)            EXTRACTU #CO,S2,D
INSERT            Insert bit field                                          INSERT S1,S2,D
INSERT (imm.)     Insert bit field (immediate operand)                      INSERT #CO,S2,D
LSL               Logical shift left                                        LSL D [parallel move]
LSL (mb.)         Logical shift left (multibit)
LSL (mb., imm.)   Logical shift left (multibit, immediate operand)
LSR               Logical shift right                                       LSR D [parallel move]
LSR (mb.)         Logical shift right (multibit)
LSR (mb., imm.)   Logical shift right (multibit, immediate operand)
MERGE             Merge two half words                                      MERGE S,D
NOT               Logical complement                                        NOT D [parallel move]
OR                Logical inclusive OR                                      OR S,D [parallel move]
OR (imm.)         Logical inclusive OR (immediate operand)
ORI               OR immediate to control register                          ORI #xx,D
ROL               Rotate left                                               ROL D [parallel move]
ROR               Rotate right                                              ROR D [parallel move]

Table 14.2(c) Bit manipulation instructions

Mnemonic          Description                                               Syntax and instruction type
BCHG              Bit test and change                                       BCHG #n,D
BCLR              Bit test and clear                                        BCLR #n,D
BSET              Bit test and set                                          BSET #n,D
BTST              Bit test                                                  BTST #n,D

Table 14.2(d) Loop instructions

Mnemonic          Description                                               Syntax and instruction type
BRKcc             Conditionally break the current hardware loop             BRKcc
DO                Start hardware loop                                       Static:  DO #xxx,expr
                                                                            Dynamic: DO X:ea,expr
                                                                                     DO Y:ea,expr
                                                                                     DO S,expr
DOR               Start hardware loop to PC-related end-of-loop location    DOR [X or Y]:ea,label
DO FOREVER        Start forever hardware loop                               DO FOREVER,expr
DOR FOREVER       Start forever hardware loop to PC-related end-of-loop     DOR FOREVER,label
                  location
ENDDO             Abort and exit from hardware loop                         ENDDO

Table 14.2(e) Move instructions

Mnemonic   Description                 Syntax and instruction type
LUA        Load updated address        LUA ea,D
LRA        Load PC-relative address    LRA Rn,D; LRA xxxx,D
MOVE       Move data register          MOVE S,D
MOVEC      Move control register       MOVEC S,D
MOVEM      Move program memory         MOVE(M) S,P:ea; MOVE(M) P:ea,D
MOVEP      Move peripheral data        MOVEP S,X:<pp>|<qq>; MOVEP S,Y:<pp>|<qq>; MOVEP X:<pp>|<qq>,D; MOVEP Y:<pp>|<qq>,D
U MOVE     Update move

Table 14.2(f) Program control instructions

Mnemonic    Description                                          Syntax and instruction type
JCLR        Jump if bit clear                                    JCLR #n,D,xxxx
JSET        Jump if bit set                                      JSET #n,D,xxxx
JScc        Jump to subroutine conditionally                     JScc ea
IFcc.U      Execute conditionally and update CCR                 Opcode-operands IFcc.U
IFcc        Execute conditionally                                Opcode-operands IFcc
Bcc         Branch conditionally                                 Bcc xxxx; Bcc Rn
BRA         Branch always                                        BRA xxxx; BRA Rn
BRCLR       Branch if bit clear                                  BRCLR #n,[X or Y]:ea,xxxx
BRSET       Branch if bit set                                    BRSET #n,[X or Y]:ea,xxxx
BScc        Branch to subroutine conditionally                   BScc xxxx; BScc Rn
BSR         Branch to subroutine always                          BSR xxxx; BSR Rn
BSCLR       Branch to subroutine if bit clear                    BSCLR #n,[X or Y]:ea,xxxx
BSSET       Branch to subroutine if bit set                      BSSET #n,[X or Y]:ea,xxxx
DEBUGcc     Enter into the debug mode conditionally              DEBUGcc
DEBUG       Enter into the debug mode always                     DEBUG
Jcc         Jump conditionally                                   Jcc ea
JMP         Jump always                                          JMP ea
JSR         Jump to subroutine always                            JSR ea
JSCLR       Jump to subroutine if bit clear                      JSCLR #n,D,xxxx
JSSET       Jump to subroutine if bit set                        JSSET #n,D,xxxx
NOP         No operation                                         NOP
PLOCK       Lock program cache sector                            PLOCK ea
PUNLOCK     Unlock program cache sector                          PUNLOCK ea
PLOCKR      Lock PC-relative program cache sector                PLOCKR xxxx
PUNLOCKR    Unlock PC-relative program cache sector              PUNLOCKR xxxx
PFREE       Unlock all program cache locked sectors              PFREE
PFLUSH      Reset program cache state                            PFLUSH
PFLUSHUN    Reset program cache state to all unlocked sectors    PFLUSHUN
REP         Repeat next instruction                              REP #xxxx; REP X:ea; REP Y:ea; REP S
RESET       Reset on-chip peripheral devices                     RESET
RTI         Return from interrupt                                RTI
RTS         Return from subroutine                               RTS
STOP        Stop processing (low-power standby)                  STOP
TRAPcc      Trap conditionally                                   TRAPcc
TRAP        Trap always                                          TRAP
WAIT        Wait for interrupt (low-power standby)               WAIT

14.12 COMPARISON OF THE FEATURES OF THE DSP56300 FAMILY PROCESSORS

The members of the DSP56300 family differ in the amount of on-chip memory, the number and type of peripherals, the number of general-purpose I/O (GPIO) pins, the number of coprocessors, and the maximum clock frequency/MIPS rating. All the members have the same on-chip boot ROM of size 128 × 24 and an external addressing space of two 2M × 24 data memories and one 1M × 24 program memory. All of them have a triple timer, PCI, ESSI, a host port interface (HPI) and a DMA block. Table 14.3 lists the features of the various ICs.


14.12.1 On-Chip Coprocessors

The Cyclic-code Coprocessor (CCOP) executes cyclic code calculations for data ciphering and deciphering, as well as parity code generation and checking. The CCOP is fully programmable and not dedicated to a specific algorithm, but it is well suited to the GSM A5.1 and A5.2 data ciphering algorithms. It can generate mask sequences for data ciphering, supports Fire encoding and decoding for burst error correction, and generates the cyclic redundancy code (CRC) syndrome for any polynomial of degree up to 48.
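The CRC syndrome computation mentioned above amounts to polynomial division over GF(2), which can be sketched in software as a bit-serial shift register. The C function below is a minimal illustration under stated assumptions, not a model of the CCOP hardware or its programming interface: the name `crc_remainder`, the MSB-first bit order and the all-zero initial state are choices made for this example. The degree is limited to 48 to match the text, so the register fits in a `uint64_t`.

```c
#include <stdint.h>
#include <stddef.h>

/* Remainder of msg(x) * x^degree divided by the generator polynomial.
   `poly` holds the generator's coefficients below x^degree (the x^degree
   term is implicit). Assumes 1 <= degree <= 48. Bits are processed
   MSB first and the register starts at zero (both assumptions). */
uint64_t crc_remainder(const uint8_t *msg, size_t len,
                       uint64_t poly, unsigned degree)
{
    const uint64_t top  = 1ULL << (degree - 1);  /* highest register bit */
    const uint64_t mask = (top << 1) - 1;        /* keep `degree` bits   */
    uint64_t reg = 0;

    for (size_t i = 0; i < len; i++) {
        for (int b = 7; b >= 0; b--) {
            uint64_t in = (msg[i] >> b) & 1u;
            uint64_t fb = ((reg & top) ? 1u : 0u) ^ in;  /* feedback bit */
            reg = (reg << 1) & mask;
            if (fb)
                reg ^= poly & mask;
        }
    }
    return reg;
}
```

With these conventions, running the function over a message followed by its own remainder gives a zero syndrome, which is how a receiver checks a block for errors. Using the 16-bit polynomial 0x1021 on the ASCII string "123456789" gives 0x31C3, the commonly quoted check value for that parameter set.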
Table 14.3 Comparison of the features of some DSP56300 family processors

IC no.      On-chip RAM    On-chip ROM       Host port size in bits   No. of GPIO pins   MIPS rating   Coprocessors
DSP56301    8K × 24        192 × 24          32 (HI32)                42                 100           -
DSP56303    8K × 24        192 × 24          8 (HI08)                 34                 100           -
DSP56305    12.25K × 24    (9K+192) × 24     32 (HI32)                32                 80            FCOP, VCOP, CCOP
DSP56307    64K × 24       192 × 24          8 (HI08)                 34                 100           EFCOP
DSP56309    34K × 24       192 × 24          8 (HI08)                 34                 100           -
DSP56311    128K × 24      192 × 24          8 (HI08)                 34                 255           EFCOP

The Filter Coprocessor (FCOP) implements a wide variety of convolution and correlation filtering algorithms. In GSM applications, the FCOP cross-correlates the received training sequence with a known midamble sequence to estimate the channel impulse response, and then performs matched filtering of the received data symbols using coefficients derived from that estimated channel.

The Viterbi Coprocessor (VCOP) implements a Maximum Likelihood Sequence Estimation (MLSE) algorithm for channel decoding and equalisation (uplink) and channel convolutional coding (downlink). The VCOP supports constraint lengths (k) of 4, 5, 6 or 7 with 8, 16, 32 or 64 states, respectively; code rates of 1/2, 1/3, 1/4 or 1/6; and a trace-back trellis depth of 36.

The Enhanced Filtering Coprocessor (EFCOP) is a general-purpose, fully programmable coprocessor that performs filtering tasks concurrently with the DSP core, with minimum core overhead. The DSP core and the EFCOP can share data via an 8K-word shared data memory. DMA channels shuttle input and output data between the DSP core and the EFCOP. The EFCOP supports a variety of filter modes, some of which are optimised for cellular base station applications. It supports up to 4K taps and 4K coefficients in any combination of number and length of filters (e.g., eight filters of length 512, or 16 filters of length 256). It performs either 24-bit or 16-bit precision arithmetic with full support for saturation arithmetic.
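To make the EFCOP's 24-bit saturating arithmetic and its 4K-tap budget concrete, here is a small C sketch of one FIR output sample. It is a software illustration under stated assumptions, not a description of the EFCOP datapath: samples and coefficients are taken to be signed 24-bit fractions (Q23) stored in `int32_t`, the accumulator is a plain `int64_t`, and the names `fir_q23`, `saturate_q23` and `fits_tap_budget` are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

#define Q23_MAX   8388607     /* largest 24-bit fraction,  0x7FFFFF  */
#define Q23_MIN (-8388608)    /* smallest 24-bit fraction, -0x800000 */
#define TAP_BUDGET 4096       /* "up to 4K taps" from the text       */

/* Clamp a wide result into the signed 24-bit range instead of wrapping. */
static int32_t saturate_q23(int64_t v)
{
    if (v > Q23_MAX) return Q23_MAX;
    if (v < Q23_MIN) return Q23_MIN;
    return (int32_t)v;
}

/* One FIR output: sum of h[k] * x[k] over all taps, in Q23.
   Each product is Q46; the sum is scaled back to Q23 by dividing
   by 2^23 (truncating toward zero) and then saturated. */
int32_t fir_q23(const int32_t *h, const int32_t *x, size_t taps)
{
    int64_t acc = 0;
    for (size_t k = 0; k < taps; k++)
        acc += (int64_t)h[k] * x[k];
    return saturate_q23(acc / (INT64_C(1) << 23));
}

/* True if n_filters filters of the given length fit in the tap budget. */
bool fits_tap_budget(unsigned n_filters, unsigned length)
{
    return (uint64_t)n_filters * length <= TAP_BUDGET;
}
```

Both configurations quoted in the text (eight filters of length 512, or 16 filters of length 256) use exactly 4096 taps, so `fits_tap_budget` accepts them, while 16 filters of length 512 would exceed the budget. Saturation matters because a sum of near-full-scale products would otherwise wrap around and change sign.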

Review Questions
14.1 List the registers in the Data ALU of the DSP56300 core.
14.2 In the DSP56300 core, how many bits are allocated for representing the fractional part and the integer part individually?
14.3 What is meant by the saturation mode of the Data ALU of the DSP56300 core?
14.4 How many clock cycles are used to perform the operations in the Data ALU? How is an effective throughput of one instruction/clock cycle achieved in the DSP56300 core?
14.5 Compare the multiplier unit of the DSP56300 core with that of the TMS320C54X.
14.6 Compare the AGU of the DSP56300 core with the DAGEN unit of the TMS320C54X.
14.7 List the 25 registers in the AGU of the DSP56300 core.
14.8 List the registers in the PCU of the DSP56300 core.
14.9 How does the regular modulo addressing mode differ from the multiple wrap-around modulo addressing?
14.10 Compare the dual-memory indirect addressing mode of the TMS320C54X with the X, Y indirect addressing mode of the DSP56300 core.
14.11 What is the use of the TAP and OnCE blocks of the DSP56300 core?
14.12 What are the three peripherals which are present in all the DSP56300 family processors? What are their functions?
14.13 List the various ways in which the on-chip RAM may be configured in the DSP56301.
14.14 Compare the preincrement indirect addressing mode of the DSP56300 family processors with that of the TMS320C54X.
14.15 Compare the low power mode of the TMS320C54X with that of the DSP56300 core.
14.16 Compare the features of the various ICs in the DSP56300 family.
14.17 List the DSP56300 family ICs which have special-purpose coprocessors. What are the functions performed by these coprocessors?

Self Test Questions


14.1 The factor by which the Motorola DSP56300 core performance is superior compared to its predecessor DSP56000 family core is ____.
(a) 2 (b) 3 (c) 4 (d) 8
14.2 The multiplier in the DSP56300 core is of size ____.
(a) 16X16 (b) 17X17 (c) 18X18 (d) 24X24
14.3 The X, Y registers of the DSP56300 core are of size ____ bits each.
(a) 18 (b) 24 (c) 48 (d) 56
14.4 The accumulators A, B of the DSP56300 core are of size ____ bits each.
(a) 18 (b) 24 (c) 48 (d) 56
14.5 The no. of address ALUs in the DSP56300 core is ____ and the no. of indirect address registers is ____.
(a) 1, 8 (b) 1, 4 (c) 2, 4 (d) 2, 8
14.6 The memory in DSP56300 family DSPs is accessed as ____ bit data.
(a) 16 (b) 18 (c) 24 (d) 48
14.7 The no. of cycles required by the DSP56300 core for a MAC operation is ____ and, using pipelining, an effective rate of ____ cycle/MAC is achieved.
(a) 2, 1 (b) 3, 1 (c) 4, 2 (d) 6, 3
14.8 The DSP56300 family DSP which has the enhanced filter coprocessor is DSP ____.
(a) 56301 (b) 56305 (c) 56307 (d) 56309
14.9 The DSP56300 family DSP which has the Viterbi coprocessor is DSP ____.
(a) 56301 (b) 56305 (c) 56307 (d) 56309
14.10 The DSP56300 family DSP which has the cyclic code coprocessor is DSP ____.
(a) 56301 (b) 56305 (c) 56307 (d) 56309
