Separate program and data memories:
-Data memory and program memory occupy distinct memory spaces
-It requires two connections, which results in improved performance
-More memory wires
-Simultaneous program and data memory access

Shared program and data memory:
-Data and program words share the same memory space
-It results in a simple hardware connection to memory, since only one connection is required
-Fewer memory wires
-Program and data memory are accessed separately

Microcontroller:
-It consists of ADCs, timers and serial communication devices on the same IC
-It provides specialized instructions for common embedded system control operations, such as bit manipulations

DSP:
-It consists of ADCs, DACs, PWM, timers, counters and direct memory access controllers on the same IC
-It provides instructions that are central to DSPs, such as filtering and transformation
repeatedly. For example, a pager is always a pager. In contrast, a desktop system executes a variety of programs, like spreadsheets, word processors, and video games, with new programs added frequently.
2) Tightly constrained: All computing systems have constraints on design metrics, but those on embedded systems can be especially tight. A design metric is a measure of an implementation's features, such as cost, size, performance, and power. Embedded systems often must cost just a few dollars, must be sized to fit on a single chip, must perform fast enough to process data in real time, and must consume minimum power to extend battery life or prevent the necessity of a cooling fan.
3) Reactive and real-time: Many embedded systems must continually react to changes in the system's environment, and must compute certain results in real time without delay. For example, a car's cruise controller continually monitors and reacts to speed and brake sensors. It must compute acceleration or deceleration amounts repeatedly within a limited time; a delayed computation could result in a failure to maintain control of the car. In contrast, a desktop system typically focuses on computations, with relatively infrequent (from the computer's perspective) reactions to input devices. In addition, a delay in those computations, while perhaps inconvenient to the computer user, typically does not result in a system failure.
2) Compare GPP, SPP and ASIP along with their block diagrams and any two differences.
General purpose Processor(GPP)
[Figure: GPP block diagram showing the IR, PC, program memory (holding assembly code for a loop beginning Total=0) and data memory]
Nagaraj N.K. 1MS05EC048
Features
-Program memory
-General datapath with large register file and general ALU
User benefits
-Low time to market and NRE costs
-High flexibility
Example: the Pentium is the best known, but there are hundreds of others
Single purpose processor (SPP)
Digital circuit (hardware) designed to execute exactly one program
Ex: JPEG codec
Also known as a coprocessor, accelerator or peripheral
Features
-Contains only the components needed to execute a single program
-No program memory
Application specific instruction-set processor (ASIP)
Programmable processor optimized for a particular class of applications having common characteristics, such as embedded control, digital signal processing or telecommunications
Ex: microcontrollers and digital signal processors
Compromise between general purpose and single purpose processors
Features
-Program memory
-Optimized datapath
-Special functional units
Benefits
-Some flexibility, good performance, size and power
4) Explain the various metrics that need to be optimized while designing an embedded system?
A design metric is a measurable feature of a system's implementation. Commonly used metrics include:
a) NRE cost (non-recurring engineering cost): The one-time monetary cost of designing the system. Once the system is designed, any number of units can be manufactured without incurring any additional design cost; hence the term non-recurring.
b) Unit cost: The monetary cost of manufacturing each copy of the system, excluding NRE cost.
c) Size: The physical space required by the system, often measured in bytes for software, and gates or transistors for hardware.
d) Performance: The execution time of the system.
e) Power: The amount of power consumed by the system, which may determine the lifetime of a battery, or the cooling requirements of the IC, since more power means more heat.
f) Flexibility: The ability to change the functionality of the system without incurring heavy NRE cost. Software is typically considered very flexible.
g) Time to prototype: The time needed to build a working version of the system, which may be bigger or more expensive than the final system implementation, but can be used to verify the system's usefulness and correctness and to refine the system's functionality.
h) Time to market: The time required to develop a system to the point that it can be released and sold to customers. The main contributors are design time, manufacturing time, and testing time.
i) Maintainability: The ability to modify the system after its initial release, especially by designers who did not originally design the system.
j) Correctness: Our confidence that we have implemented the system's functionality correctly. We can check the functionality throughout the process of designing the system, and we can insert test circuitry to check that manufacturing was correct.
k) Safety: The probability that the system will not cause harm.
5) What is a market window? Why is it important for products to reach the market early? Justify.
The time-to-market constraint has become especially demanding in recent years. Introducing an embedded system to the marketplace early can make a big difference in the system's profitability, since market time-windows for products are becoming quite short, often measured in months. Missing this window can mean a significant loss in sales. In some cases, each day that a product is delayed from introduction to the market can translate to a one million dollar loss. The average time-to-market constraint has been reported as having shrunk to only 8 months.
6) Assume 8 bit encoding of an input voltage ranging from -5 V to +5 V. Find the encoding for 1.2 V, then trace using the successive approximation approach. Find the resolution of the conversion. Extend the ratio and resolution equations to any voltage in the range of Vmin to Vmax.
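a. Expected output: (e - Vmin)/(Vmax - Vmin) = d/(2^n - 1), so d = (1.2 + 5)/10 * 255 = 158.1, i.e. d = 158 or 10011110
b. Resolution = (Vmax - Vmin)/(2^n - 1) => 10/255 = 0.0392 V
c. Successive approximation trace, d (8 bit encoding) after each step:
10000000
10000000
10000000
10010000
10011000
10011100
10011110
10011110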
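The successive approximation approach asked for in this question can be sketched in a few lines of Python. This is only an illustrative model of an ideal converter: the function names `successive_approximation` and `resolution` are hypothetical, and the comparison `vin >= mid` is an assumption about how a tie at the midpoint is resolved.

```python
def successive_approximation(vin, vmin, vmax, bits=8):
    """Binary-search the input range, deciding one bit per step (MSB first).

    Returns the final code and the code after every step as a bit string.
    """
    low, high = vmin, vmax
    code = 0
    trace = []
    for bit in range(bits - 1, -1, -1):
        mid = (low + high) / 2          # midpoint of the remaining range
        if vin >= mid:                  # assumption: a tie sets the bit
            code |= 1 << bit
            low = mid
        else:
            high = mid
        trace.append(format(code, f"0{bits}b"))
    return code, trace


def resolution(vmin, vmax, bits=8):
    """Voltage represented by one step of the digital code."""
    return (vmax - vmin) / (2 ** bits - 1)


code, trace = successive_approximation(1.2, -5.0, 5.0)
print(code, trace)
print(round(resolution(-5.0, 5.0), 4))
```

Because the range limits are passed as parameters, the same sketch applies to any Vmin to Vmax range.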
Add rn,rm
Sub rn,rm
Jz rn,relative
Add one instruction to the instruction set shown that would reduce the size of the summing assembly program by one instruction. Show the reduced program.
The reduced program is as follows:
      Mov r0,#0
      Mov r1,#10
      Mov r2,#1
      Mov r3,#0
Loc1: Jz r1,Next
      Add r0,r1
      Sub r1,r2
      Jz r3,Loc1
Next: // next instruction
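The listing does not name the instruction that was added, and it still clears r3 so that Jz r3 acts as an unconditional jump. As one hypothetical possibility, an unconditional `Jmp relative` instruction would make the `Mov r3,#0` unnecessary. The small Python interpreter below is a sketch for checking that idea; the tuple encoding, the `run` function and the use of label names in place of relative offsets are all assumptions made for illustration.

```python
def run(program, end_label="next"):
    """Execute (label, op, a, b) tuples and return the four registers."""
    labels = {instr[0]: i for i, instr in enumerate(program) if instr[0]}
    labels[end_label] = len(program)      # jumping here ends the program
    regs = [0, 0, 0, 0]
    pc = 0
    while pc < len(program):
        _label, op, a, b = program[pc]
        pc += 1
        if op == "mov":                   # Mov rn,#imm
            regs[a] = b
        elif op == "add":                 # Add rn,rm
            regs[a] += regs[b]
        elif op == "sub":                 # Sub rn,rm
            regs[a] -= regs[b]
        elif op == "jz":                  # Jz rn,label
            if regs[a] == 0:
                pc = labels[b]
        elif op == "jmp":                 # hypothetical added instruction
            pc = labels[a]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


# Program as listed: r3 is cleared so that Jz r3 always jumps.
listed = [
    (None, "mov", 0, 0),
    (None, "mov", 1, 10),
    (None, "mov", 2, 1),
    (None, "mov", 3, 0),
    ("loc1", "jz", 1, "next"),
    (None, "add", 0, 1),
    (None, "sub", 1, 2),
    (None, "jz", 3, "loc1"),
]

# Variant using the hypothetical Jmp: r3 is no longer needed.
with_jmp = [
    (None, "mov", 0, 0),
    (None, "mov", 1, 10),
    (None, "mov", 2, 1),
    ("loc1", "jz", 1, "next"),
    (None, "add", 0, 1),
    (None, "sub", 1, 2),
    (None, "jmp", "loc1", None),
]

print(run(listed)[0], len(listed))
print(run(with_jmp)[0], len(with_jmp))
```

Both programs leave the same total in r0, and the variant with the assumed Jmp is one instruction shorter.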
Moore's law:
A trend related to ICs: IC transistor capacity has doubled roughly every 18 months for the past several decades. This trend is shown in the figure below. It was actually predicted back in 1965 by Intel cofounder Gordon Moore, who predicted that semiconductor transistor density would double every 18 to 24 months. This trend is therefore known as Moore's law. Moore recently predicted about another decade before such growth slows down. The trend is mainly caused by improvements in IC manufacturing that result in smaller parts, such as transistor parts and wires, on the surface of the IC. The minimum part size, commonly known as feature size, for a CMOS IC in 2002 was about 130 nanometers.
10) In a successive approximation ADC, calculate the correct encoding of 5 V given an analog input voltage range from 0 to +15 V and an 8 bit digital encoding. Also determine the resolution of this ADC.
a. Expected output: 5/15 = d/255, so d = 85 or 01010101
b. Resolution => 15/255 => 0.0588 V
c. Successive approximation trace:
Step  Comparison (E)                   d (8 bit encoding)
1     5 < (15+0)/2                     00000000
2     5 > (7.5+0)/2                    01000000
3     5 < (7.5+3.75)/2                 01000000
4     5 > (5.625+3.75)/2               01010000
5     5 < (5.625+4.6875)/2             01010000
6     5 > (5.15625+4.6875)/2           01010100
7     5 < (5.15625+4.921875)/2         01010100
8     5 > (5.0390625+4.921875)/2       01010101
11) Determine the range and resolution of a 16 bit timer which operates at a clock frequency of 10 MHz and generates an overflow signal when it reaches FFFF. Calculate the terminal count value for measuring a 3 ms time interval. What is the minimum division needed in a prescaler for measuring 100 ms?
100 ms / 6.5536 ms = 15.258. Therefore we need to divide the original clock by 16.
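The timer quantities in this question can be checked with a short Python sketch. The function names are hypothetical, the range is taken as 2^16 counts (which matches the 6.5536 ms figure used above), and the prescaler is assumed to divide the clock by a power of two.

```python
def timer_parameters(clock_hz, bits, prescale=1):
    """Return (resolution, range) in seconds for a free-running timer."""
    resolution = prescale / clock_hz          # time per count
    time_range = (2 ** bits) * resolution     # assumption: 2^bits counts
    return resolution, time_range


def terminal_count(interval_s, clock_hz, prescale=1):
    """Number of counts that corresponds to the requested interval."""
    return round(interval_s * clock_hz / prescale)


def min_prescale(interval_s, clock_hz, bits):
    """Smallest power-of-two division whose range covers the interval."""
    prescale = 1
    while timer_parameters(clock_hz, bits, prescale)[1] < interval_s:
        prescale *= 2
    return prescale


res, rng = timer_parameters(10e6, 16)
print(res, rng)                          # about 0.1 us and 6.5536 ms
print(terminal_count(3e-3, 10e6))        # counts for 3 ms
print(min_prescale(100e-3, 10e6, 16))    # division needed for 100 ms
```

With a division of 16 the range grows to about 104.9 ms, which is the first power-of-two setting that covers 100 ms.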
12) With diagram explain the direct mapping technique for cache.
Cache mapping is the method for assigning main memory addresses to the far fewer available cache addresses, and for determining whether a particular main memory address's contents are in the cache.
Direct mapping: In this technique, the main memory address is divided into two fields, the index and the tag. The index represents the cache address, and thus the number of index bits is determined by the cache size, i.e., index size = log2(cache size). Note that many different main memory addresses will map to the same cache address. When we store a main memory address's contents in the cache, we also store the tag. To determine whether a desired main memory address is in the cache, we go to the cache address indicated by the index and compare the tag there with the desired tag.
Direct-mapped caches are easy to implement, but may result in numerous misses if two or more words with the same index are accessed frequently, since each will bump the other out of the cache. Fully-associative caches, on the other hand, are fast, but the comparison logic is expensive to implement. Set-associative caches can reduce misses compared to direct-mapped caches, without requiring nearly as much comparison logic as fully-associative caches.
Caches are usually designed to treat collections of a small number of adjacent main memory addresses as one indivisible block, typically consisting of about 8 addresses.
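The index and tag lookup described above can be illustrated with a minimal Python sketch. It is a simplified model under stated assumptions: each cache entry holds a single word (no multi-address blocks), only tags are stored, and the class name `DirectMappedCache` and its methods are hypothetical.

```python
class DirectMappedCache:
    """Word-granular direct-mapped cache that tracks tags only."""

    def __init__(self, index_bits):
        self.index_bits = index_bits
        # One stored tag per cache address; None means the entry is empty.
        self.tags = [None] * (1 << index_bits)

    def split(self, address):
        """Split a main memory address into (tag, index)."""
        index = address & ((1 << self.index_bits) - 1)   # low bits
        tag = address >> self.index_bits                 # remaining high bits
        return tag, index

    def access(self, address):
        """Return True on a hit; on a miss, replace the entry's tag."""
        tag, index = self.split(address)
        hit = self.tags[index] == tag
        self.tags[index] = tag
        return hit


cache = DirectMappedCache(index_bits=4)      # 16 cache addresses
for addr in (0x12, 0x12, 0x22, 0x12):
    print(hex(addr), cache.split(addr), cache.access(addr))
```

Addresses 0x12 and 0x22 share index 2 but have different tags, so alternating between them causes each access to bump the other out, which is the miss pattern the text warns about.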
13) Explain how UART is used for communication highlighting the advantages of UART.
A UART (Universal Asynchronous Receiver/Transmitter) receives serial data and stores it as parallel data (usually one byte), and takes parallel data and transmits it as serial data. Such serial communication is beneficial when we need to communicate bytes of data between devices separated by long distances, or when we simply have few available I/O pins. Principles of serial communication will be discussed in a later chapter. For our purpose in this section, we must be aware that we must set the transmission and reception rate, called the baud rate, which indicates the frequency at which the signal changes. Common rates include 2400, 4800, 9600, and 19.2k. We must also be aware that an extra bit, called parity, may be added to each data word to detect transmission errors: the parity bit is set high or low to indicate whether the word has an even or odd number of 1 bits.
Internally, a simple UART may possess a baud-rate configuration register, and two independently operating processors, one for receiving and the other for transmitting. The transmitter may possess a register, often called a transmit buffer, that holds data to be sent. This register is a shift register, so the data can be transmitted one bit at a time by shifting at the appropriate rate. Likewise, the receiver receives data into a shift register, and then this data can be read in parallel. Note that in order to shift at the appropriate rate, based on the configuration register, a UART requires a timer.
To use a UART, we must configure its baud rate by writing to the configuration register, and then we must write data to the transmit register and/or read data from the receive register. Unfortunately, configuring the baud rate is usually not as simple as writing the desired rate (e.g., 4800) to a register. Instead, the register value must be computed from an equation in which smod corresponds to 2 bits in a special-function register, oscfreq is the frequency of the oscillator, and TH1 is an 8-bit rate register of a built-in timer.
Note that we could use a general-purpose processor to implement a UART completely in software. If we used a dedicated general-purpose processor, the implementation would be inefficient in terms of size. We could alternatively integrate the transmit and receive functionality with our main program. This would require creating a routine to send data serially over an I/O port, making use of a timer to control the rate. It would also require using an interrupt service routine to capture serial data coming from another I/O port whenever such data begins arriving. However, as with the timer functionality, adding send and receive functionality can detract from time for other computations.
Knowing the number of cycles that each instruction requires, we could write a loop that executes the desired number of instructions; when this loop completes, we know that the desired time has passed. This implementation of a timer on a dedicated general-purpose processor is obviously quite inefficient in terms of size. One could alternatively incorporate the timer functionality into a main program, but the timer functionality then occupies much of the program's run time, leaving little time for other computations. Thus, the benefit of assigning timer functionality to a special-purpose processor becomes evident.
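To make the parallel-to-serial conversion and the parity bit concrete, the following Python sketch builds the bit sequence a transmitter would shift out for one byte. The frame layout is an assumption based on a common convention (one low start bit, 8 data bits sent least significant bit first, one parity bit, one high stop bit), and the function names are hypothetical.

```python
def parity_bit(byte, even=True):
    """Parity bit that makes the total number of 1 bits even (or odd)."""
    ones = bin(byte & 0xFF).count("1")
    bit = ones % 2                    # 1 when the data has an odd count of 1s
    return bit if even else bit ^ 1


def uart_frame(byte, even=True):
    """Bits in transmission order: start, data (LSB first), parity, stop."""
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] + data + [parity_bit(byte, even)] + [1]


def frame_time(baud, bits_per_frame=11):
    """Seconds needed to shift out one frame at the given baud rate."""
    return bits_per_frame / baud


frame = uart_frame(0x41)              # ASCII 'A'
print(frame)
print(frame_time(9600))
```

A receiver can recompute the parity over the 8 data bits and compare it with the received parity bit; a mismatch indicates that an odd number of bits were corrupted in transit.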
14) Explain the various events that take place when a processor executes an instruction. Explain how pipelining improves the execution speed.
A microprocessor's execution of instructions consists of several basic stages:
1. Fetch instruction: the task of reading the next instruction from memory into the instruction register.
2. Decode instruction: the task of determining what operation the instruction in the instruction register represents (e.g., add, move, etc.).
3. Fetch operands: the task of moving the instruction's operand data into appropriate registers.
4. Execute operation: the task of feeding the appropriate registers through the ALU and back into an appropriate register.
5. Store results: the task of writing a register into memory.
If each stage takes one clock cycle, then we see that a single instruction may take several cycles to complete.
Pipelining is a common way to increase the instruction throughput of a microprocessor. We make a simple analogy of two people approaching the chore of washing and drying 8 dishes. In one approach, the first person washes all 8 dishes, and then the second person dries all 8 dishes. Assuming 1 minute per dish per person, this approach requires 16 minutes. The approach is clearly inefficient, since at any time only one person is working and the other is idle.
15) Differentiate between the following: single purpose and general purpose processors; full custom and PLD technologies.
Single purpose and General purpose processors
Single purpose processor:
-Is a digital system intended to solve a specific computation task.
-Examples of hardware: JPEG codec (Joint Photographic Experts Group), GCD custom SPP, timers, counters.
-User does only processor design; there is no software, hence no program memory.

General purpose processor:
-Is intended to solve a wide variety of computation tasks.

DESIGN METRIC differences:
Design metric        GPP     SPP
Performance          Slow    Fast
Size                 Large   Small
Power consumption    More    Less
Time-to-market       Less    More
NRE cost             Less    More
Flexibility          More    Less
Full custom IC
All layers are optimized for an embedded system's particular digital implementation
-Placing transistors
-Sizing transistors
-Routing wires

PLD
All layers already exist
-Designers can purchase an IC
-Connections on the IC are either created or destroyed to implement the desired functionality
-FPGAs are very popular
16) Derive the equation for percentage revenue loss for any market rise angle. A product was delayed by 4 weeks in releasing to market. The peak revenue for the product for on-time entry to market would occur after 20 weeks for a market rise angle of 45°. Determine the revenue loss.
To derive: % revenue loss = {D(3W-D)/(2W^2)} * 100%
To compute the revenue loss from delayed entry, use the simplified revenue model.
[Figure: simplified revenue model. Revenue versus time, with triangles for on-time entry and delayed entry, labelled with market rise, market fall, peak revenue, peak revenue from delayed entry, and the times D, W and 2W]
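2W = product lifetime
W = time at which revenue peaks for on-time entry
D = delay in market entry
θ = market rise angle
% revenue loss = {(A on time - A delayed)/A on time} * 100%
= {1/2(2W * W tanθ) - 1/2[(2W-D) * (W-D) tanθ]} / {1/2(2W * W tanθ)} * 100%
= {2W^2 tanθ - 2W^2 tanθ + 2WD tanθ + DW tanθ - D^2 tanθ} / (2W^2 tanθ) * 100%
= {(3WD - D^2)/(2W^2)} * 100%
% revenue loss = {D(3W-D)/(2W^2)} * 100%
Given data: D = 4 weeks, W = 20 weeks (2W = 40 weeks)
% revenue loss = [{4(3*20-4)}/(2*20^2)] * 100% = 28%
Thus a delay of just 4 weeks results in a revenue loss of 28%.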
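The result can be cross-checked numerically by comparing the two triangle areas directly with the closed-form expression. The Python sketch below uses hypothetical function names; the tangent of the market rise angle is left out because it appears in both areas and cancels in the ratio.

```python
def loss_from_areas(delay, peak_time):
    """Percentage revenue loss computed from the two triangle areas."""
    w, d = peak_time, delay
    on_time = 0.5 * (2 * w) * w                # base 2W, height proportional to W
    delayed = 0.5 * (2 * w - d) * (w - d)      # base 2W-D, height proportional to W-D
    return (on_time - delayed) / on_time * 100.0


def loss_closed_form(delay, peak_time):
    """Percentage revenue loss from D(3W-D)/(2W^2)."""
    w, d = peak_time, delay
    return d * (3 * w - d) / (2 * w ** 2) * 100.0


print(loss_from_areas(4, 20))
print(loss_closed_form(4, 20))
```

Both functions give the same value for D = 4 weeks and W = 20 weeks, and the agreement holds for any delay smaller than W.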
For pipelining to work well, instruction execution must be decomposable into roughly equal-length stages, and instructions should each require the same number of cycles.
Speedup factor: a common method of comparing the performance of two systems. The speedup of system A over system B is determined simply as: speedup of A over B = performance of A / performance of B. Suppose the speedup of camera A over camera B is 2. Then we can also say that A is 2 times faster than B and B is 2 times slower than A.
18) Explain pipelining. If 6000 instructions are to be executed using a 4 stage pipelined processor at a clock frequency of 12 MHz, determine the speedup of the pipelined processor when compared to non-pipelined processor.
Pipelining is a common way to increase the instruction throughput of a microprocessor. Let's say we split the instruction execution into four stages, namely fetch, decode, execute and store. In pipelining, after the instruction fetch unit fetches the first instruction, the decode unit decodes it while the instruction fetch unit simultaneously fetches the next instruction.
For pipelining to work well:
-Instruction execution must be decomposable into roughly equal-length stages.
-Each instruction must require the same number of cycles.
Branching poses a problem for pipelining, as we don't know the next instruction until the execution stage of the branch instruction is complete. A few solutions are as follows:
-When there is a branch in the pipeline, stop fetching, wait for the branch instruction to complete its execution, and then fetch the correct instruction.
-Take a guess which way the branch might go and continue pipelining. If the guess is right, carry on; otherwise, ignore all the instructions fetched after the branch instruction, thus incurring a penalty. Modern pipelined microprocessors have very sophisticated branch predictors built in.
Given: 4 stage pipeline, 6000 instructions, 12 MHz clock
Pipelined execution time: 6000 + (4 - 1) = 6003 cycles are required for execution of 6000 instructions. 6003 * (1/12 MHz) = 500.25 µs.
Non-pipelined execution time: 6000 * 4 = 24000 cycles are required for execution of 6000 instructions. 24000 * (1/12 MHz) = 2 ms.
Speedup = 2 ms / 500.25 µs = 3.998, i.e. approximately 4.
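The cycle counts in this answer follow a simple pattern that can be sketched in Python. The function names are hypothetical, and the model assumes an ideal pipeline with one cycle per stage and no stalls or branch penalties.

```python
def pipelined_cycles(instructions, stages):
    """The first instruction fills the pipeline; each later one adds a cycle."""
    return instructions + stages - 1


def non_pipelined_cycles(instructions, stages):
    """Every instruction passes through all stages before the next starts."""
    return instructions * stages


def speedup(instructions, stages):
    """Ratio of non-pipelined to pipelined execution time at the same clock."""
    return non_pipelined_cycles(instructions, stages) / pipelined_cycles(instructions, stages)


clock_hz = 12e6
print(pipelined_cycles(6000, 4) / clock_hz)       # seconds, pipelined
print(non_pipelined_cycles(6000, 4) / clock_hz)   # seconds, non-pipelined
print(speedup(6000, 4))
```

As the number of instructions grows, the speedup approaches the number of stages, which is why the result here is just under 4.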
Linker: A linker allows a programmer to create a program in separately assembled or compiled files. It combines the machine instructions of each into a single program, perhaps incorporating instructions from standard library routines. A linker designed for embedded processors will also try to eliminate binary code associated with uncalled procedures and functions, as well as memory allocated to unused variables, in order to reduce the overall program footprint.
Debugger: Debuggers help programmers evaluate and correct their programs. They run on the development processor and support stepwise program execution, executing one instruction and then stopping, proceeding to the next instruction when instructed by the user. They permit execution up to user-specified breakpoints, which are instructions that, when encountered, cause the program to stop executing. Whenever the program stops, the user can examine the values of various memory and register locations. A source-level debugger enables step-by-step execution in the source program language, whether assembly language or a structured language. A good debugging capability is crucial, as today's programs can be quite complex and hard to write correctly. Because such debuggers are programs that run on your development processor but execute code designed for your target processor, they are also known as instruction-set simulators (ISS) or virtual machines.
Emulator: Emulators support debugging of the program while it executes on the target processor. An emulator typically consists of a debugger coupled with a board connected to the desktop processor via a cable. The board consists of the target processor plus some support circuitry (often another processor). The board may have another cable with a device having the same pin configuration as the target processor, allowing one to plug this device into a real embedded system. Such an in-circuit emulator enables one to control and monitor the program's execution in the actual embedded system circuit. In-circuit emulators are available for nearly any processor intended for embedded use, though they can be quite expensive if they are to run at real speeds.
20) The Analog input voltage range from -5 to +5v for an 8 bit ADC. Determine the resolution and digital output in binary when input is -2v using formula. Also trace using successive
Step  Comparison (E)                     d (8 bit encoding)
1     -2 < (5-5)/2                       00000000
2     -2 > (0-5)/2                       01000000
3     -2 < (0-2.5)/2                     01000000
4     -2 < (-1.25-2.5)/2                 01000000
5     -2 > (-1.875-2.5)/2                01001000
6     -2 > (-1.875-2.1875)/2             01001100
7     -2 < (-1.875-2.03125)/2            01001100
8     -2 < (-1.953125-2.03125)/2         01001100