All 10 uncovered topics explained simply — for exam use
6 missing topics, 4 partial topics, 10 exam questions
01
SHARC Processor — Introduction & Architecture
MISSING FROM BOTH (Module 2)
What is it? (Simple)
SHARC stands for Super Harvard Architecture Single-Chip Computer. It is a family of processors made by Analog Devices, designed specifically for DSP (Digital Signal Processing) tasks like audio processing, speech recognition, and video encoding. Think of it as a calculator that is extremely fast at doing math, especially multiply and add operations done repeatedly.
Easy Analogy
ARM is like a general-purpose worker who can do any job. SHARC is like a specialist chef who only cooks, but cooks much faster than the general worker because everything in the kitchen is designed just for cooking (signal math).
Key Points to Remember
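SHARC uses a Modified Harvard Architecture: separate program memory (PM) and data memory (DM) buses, where the PM bus can carry data as well as instructions. With the instruction supplied by the on-chip instruction cache, the core can fetch one instruction and two data operands in the same cycle (3 memory accesses per cycle).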
It has a MAC unit (Multiply-Accumulate) — can do multiply + add in ONE clock cycle. Critical for signal processing.
Supports 32-bit floating point and fixed-point arithmetic.
Has on-chip SRAM — very fast memory access without going off-chip.
Used in: audio equipment, modems, radar systems, medical imaging.
SHARC vs ARM: ARM = general embedded tasks. SHARC = heavy math/signal tasks.
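To make the MAC point concrete, here is a minimal sketch of the multiply-accumulate pattern that dominates DSP code, written in Python purely for illustration. The function name `fir_output` and the sample values are assumptions, not SHARC code; on a SHARC each loop step would map to a single-cycle MAC.

```python
def fir_output(coeffs, samples):
    """Compute one FIR filter output as a chain of multiply-accumulate steps."""
    acc = 0.0
    for c, x in zip(coeffs, samples):
        acc += c * x  # one MAC: multiply, then add into the accumulator
    return acc


# Illustrative values only: three filter taps applied to three input samples.
result = fir_output([0.5, 0.25, 0.25], [4.0, 8.0, 12.0])
```

A general-purpose CPU without a MAC unit needs a separate multiply and add for every tap, which is why a single-cycle MAC matters for filters with hundreds of taps.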
SHARC Memory Architecture
Program Memory (PM) Bus: instructions + data
↓
┌─────────────────────────────┐
│ PM RAM │ CORE │ DM RAM │
│(Program) │ (CPU) │ (Data) │
└─────────────────────────────┘
↑
Data Memory (DM) Bus: data
→ 1 instruction (from cache) + 2 data operands = 3 memory operations per cycle!
Likely Exam Questions
What does SHARC stand for? Explain its architecture.
How is SHARC different from a general-purpose processor like ARM?
Why is SHARC preferred for DSP applications?
What is a MAC unit and why is it important in SHARC?
02
Formal Specification Methods — UML, State Charts, DFD
MISSING FROM BOTH (Module 1)
What is it? (Simple)
Formalization means drawing a precise diagram of your system before you build it, so everyone understands exactly how it works. Just like an architect draws a blueprint before building a house, an embedded system designer draws state charts or DFDs before writing code.
Easy Analogy
Imagine a traffic light. Before coding it, you draw: OFF → RED → GREEN → YELLOW → RED. This is a state chart. Every circle is a “state” (what the system is doing), and every arrow is a “transition”, labelled with the “trigger” (the event that causes the change). This is formalization.
Key Points to Remember
State Chart / State Machine: Shows all possible states of the system and transitions between them. Used for control-heavy embedded systems.
DFD (Data Flow Diagram): Shows how data moves through the system — from input to processing to output. Uses circles (processes), arrows (data flows), rectangles (external entities).
UML (Unified Modeling Language): A standard set of diagram types — use case diagrams, sequence diagrams, class diagrams — for modeling system behavior.
Formalization helps catch bugs before coding. Much cheaper to fix on paper than in hardware.
Key UML diagrams for embedded: Use Case, Sequence, State Machine, Component diagrams.
State = current condition
Transition = change of state, caused by a trigger/event
→ Drawing these precisely IS formalization for system design
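The traffic-light state chart above can be turned directly into a table-driven state machine. This is a minimal sketch; the event names `power_on` and `timer` are assumptions chosen for illustration, since the notes only name the states.

```python
# Transition table: (current state, event) -> next state.
# Each entry corresponds to one arrow in the state chart.
TRANSITIONS = {
    ("OFF", "power_on"): "RED",
    ("RED", "timer"): "GREEN",
    ("GREEN", "timer"): "YELLOW",
    ("YELLOW", "timer"): "RED",
}


def step(state, event):
    """Return the next state; ignore events that have no arrow from this state."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="OFF"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["power_on", "timer", "timer"])` walks OFF → RED → GREEN → YELLOW. Because every legal transition is listed in one table, a missing or wrong arrow is easy to spot before any hardware exists.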
Likely Exam Questions
What is formalization in embedded system design? Why is it needed?
Draw a state chart for a simple embedded system (e.g. vending machine / traffic light).
Differentiate between DFD and State Machine diagrams.
What is UML? Name 3 UML diagrams used in embedded system design.
03
CPU Performance Metrics — CPI, MIPS, Benchmarks
PARTIALLY COVERED (Module 2)
What is it? (Simple)
How do we measure how fast a processor is? We use metrics. CPI = how many clock cycles one instruction takes on average. MIPS = how many million instructions run per second. These help compare different processors.
Easy Analogy
A worker (CPU) processes files. CPI = how many minutes per file. MIPS = how many files per minute. A faster worker has lower CPI and higher MIPS. Clock speed (MHz) is how fast the worker’s hands move — but if each task takes more hand-moves (high CPI), speed alone doesn’t help.
Key Formulas
CPU Time = Instructions × CPI × Clock Period
Or: CPU Time = (Instructions × CPI) / Clock Frequency
MIPS = Clock Frequency / (CPI × 10⁶)
Higher MIPS = faster processor (for same type of program)
Speedup = Old Time / New Time
Key Points to Remember
CPI (Cycles Per Instruction): Ideal for a simple pipelined processor = 1. RISC processors aim for CPI = 1. CISC processors have CPI > 1 for complex instructions.
Clock Speed alone is misleading: for the same instruction count, a 1 GHz processor with CPI = 1 (1000 MIPS) is faster than a 2 GHz processor with CPI = 3 (about 667 MIPS).
Benchmarks = standard test programs used to compare processors in real conditions (e.g. SPEC benchmarks).
Amdahl’s Law: Speedup is limited by the part of the program you CAN’T speed up. If 50% of code can’t be parallelized, max speedup = 2x even with infinite processors.
For embedded: power efficiency per MIPS matters more than raw MIPS.
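The formulas above are easy to check numerically. The sketch below codes them as small helper functions (the function names are assumptions made for this example) and applies them to the 500 MHz, CPI = 2 case and to Amdahl's Law in its usual form, Speedup = 1 / ((1 - p) + p / n), where p is the fraction that can be sped up and n is the number of processors.

```python
def mips(clock_hz, cpi):
    """MIPS = Clock Frequency / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


def cpu_time(instructions, cpi, clock_hz):
    """CPU Time = (Instructions x CPI) / Clock Frequency, in seconds."""
    return instructions * cpi / clock_hz


def amdahl_speedup(parallel_fraction, n):
    """Amdahl's Law: overall speedup when a fraction p runs on n processors."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n)


rating = mips(500e6, 2)            # 500 MHz, CPI = 2 gives 250 MIPS
slow_fast_clock = mips(2e9, 3)     # 2 GHz, CPI = 3
fast_slow_clock = mips(1e9, 1)     # 1 GHz, CPI = 1
limit = amdahl_speedup(0.5, 1e12)  # 50% parallel, huge n: approaches 2x
```

With half the program serial, `amdahl_speedup(0.5, n)` never reaches 2 however large `n` becomes, which is the point of the law.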
Likely Exam Questions
Define CPI. How does it affect CPU performance?
A processor runs at 500MHz with average CPI=2. Calculate MIPS.
State Amdahl’s Law and explain its significance.
Why is clock speed alone not a good measure of CPU performance?
04
Power Consumption in Embedded Processors
PARTIALLY COVERED (Module 2)
What is it? (Simple)
Many embedded systems run on batteries (phones, IoT devices), so power consumption is critical. There are two types of power loss in a chip: Dynamic power (used when the circuit is switching/working) and Static power (leaked even when idle).
Easy Analogy
Dynamic power = electricity used when your fan is ON and spinning. Static power = electricity slowly leaking even when the fan switch is OFF (due to imperfect insulation). In modern chips, both matter — static leakage is a big problem at small transistor sizes.
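Key Points to Remember
Dynamic Power is caused by transistors switching ON/OFF. P_dynamic = α × C × V² × f (α = switching activity, C = switched capacitance, V = supply voltage, f = clock frequency), so it is proportional to frequency and to voltage squared.
Static Power (Leakage) flows even when the processor is idle. It is a bigger problem with smaller transistors (below 90 nm).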
DVFS (Dynamic Voltage Frequency Scaling): Lower voltage + frequency when full performance not needed. Key power-saving technique.
Sleep modes: Embedded processors have multiple sleep states (idle, sleep, deep sleep) to save power when not active.
Power-Delay Product (PDP): Measure of energy per operation. Lower = better design.
ARM processors are popular in mobile because they have excellent MIPS per milliwatt ratio.
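A short worked example shows why voltage has the biggest impact and why DVFS works. It uses the standard dynamic power relation P = α × C × V² × f (activity factor, switched capacitance, supply voltage, clock frequency). The component values below are made-up illustration numbers, not data for any real chip.

```python
def dynamic_power(alpha, capacitance_f, voltage_v, freq_hz):
    """Dynamic power in watts: P = alpha * C * V^2 * f."""
    return alpha * capacitance_f * voltage_v ** 2 * freq_hz


def scaled_power_ratio(v_scale, f_scale):
    """Ratio of new to old dynamic power when V and f are both scaled (DVFS)."""
    return v_scale ** 2 * f_scale


# Hypothetical chip: alpha = 0.5, C = 1 nF, V = 1.0 V, f = 1 GHz.
full_speed = dynamic_power(0.5, 1e-9, 1.0, 1e9)

# Halving both voltage and frequency cuts dynamic power to one eighth.
ratio = scaled_power_ratio(0.5, 0.5)
```

Halving frequency alone only halves power, while halving voltage alone quarters it, so the squared voltage term dominates.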
Likely Exam Questions
Differentiate between dynamic and static power consumption.
What is DVFS? How does it help in reducing power consumption?
Write the formula for dynamic power. Which factor has the most impact?
Why is power consumption important in embedded systems design?
05
Programming I/O — Memory-Mapped I/O, Polling vs Interrupt
PARTIALLY COVERED (Module 2)
What is it? (Simple)
I/O (Input/Output) programming means: how does the CPU communicate with external devices like sensors, displays, buttons? There are two main approaches. Polling = CPU keeps checking if device is ready (like refreshing email manually). Interrupt = device taps CPU on shoulder when it needs attention (like getting a notification).
Easy Analogy
Polling = You go to the door every 5 minutes to check if a delivery arrived. Interrupts = Doorbell rings and you go only when delivery actually comes. Interrupts save CPU time. Polling wastes CPU cycles but is simpler to code.
Polling
CPU continuously checks device status
Simple to implement
Wastes CPU cycles
Good for fast/predictable devices
Low latency only while the CPU polls in a tight loop; otherwise up to one polling interval
Interrupt-Driven I/O
Device signals CPU when ready
CPU free to do other work
More complex (ISR needed)
Good for slow/unpredictable devices
Small latency to handle interrupt
Key Points to Remember
Memory-Mapped I/O: Peripheral devices are assigned memory addresses. CPU reads/writes those addresses to communicate with devices — same instructions as normal memory access.
ISR (Interrupt Service Routine): Special function that runs automatically when interrupt occurs. Must be short and fast.
DMA (Direct Memory Access): Hardware moves data between memory and device without the CPU copying each word. The CPU only sets up the transfer and is then free for other work, usually until a completion interrupt arrives.
ARM uses memory-mapped I/O. All peripherals (GPIO, UART, SPI) accessed via specific memory addresses.
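Memory-mapped I/O and polling can be sketched together as a simulation. Everything hardware-specific here is invented for illustration: the `FakeBus` class, the two register addresses, and the ready bit do not describe any real device. On real hardware the reads would be loads from fixed addresses (volatile pointer accesses in C).

```python
class FakeBus:
    """Simulated address space with one peripheral (hypothetical register map)."""

    STATUS_ADDR = 0x40000000  # status register: bit 0 = data ready
    DATA_ADDR = 0x40000004    # data register
    READY_BIT = 0x01

    def __init__(self, ready_after, value):
        self._polls_left = ready_after  # status reads before the device is ready
        self._value = value

    def read(self, addr):
        """Reading an address talks to the device, like a normal memory load."""
        if addr == self.STATUS_ADDR:
            if self._polls_left > 0:
                self._polls_left -= 1
                return 0
            return self.READY_BIT
        if addr == self.DATA_ADDR:
            return self._value
        raise ValueError("unmapped address")


def poll_read(bus, max_polls=1000):
    """Polling: spin on the status register, then read the data register."""
    polls = 0
    while not (bus.read(bus.STATUS_ADDR) & bus.READY_BIT):
        polls += 1  # each pass is CPU time spent doing nothing useful
        if polls >= max_polls:
            raise TimeoutError("device never became ready")
    return bus.read(bus.DATA_ADDR), polls
```

The returned poll count makes the cost visible: every wasted pass through the loop is time an interrupt-driven design could have spent on other work.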
Likely Exam Questions
What is memory-mapped I/O? How does it differ from port-mapped I/O?
Compare polling and interrupt-driven I/O with advantages and disadvantages.
What is an ISR? What are the rules for writing an ISR?
Explain DMA. When would you use DMA instead of interrupt-driven I/O?
06
Component Interfacing — SPI, I2C, UART
PARTIALLY COVERED (Module 3)
What is it? (Simple)
Embedded systems need to talk to sensors, displays, memory chips etc. They use standard communication protocols. UART, SPI, I2C are the three most common protocols — like different languages the CPU uses to talk to components.
Easy Analogy
UART = sending letters one by one between two people (simple, no shared clock, but relatively slow). SPI = a dedicated phone call with one person at a time (fast, but needs a separate phone line per person). I2C = a group meeting where everyone shares one room but takes turns speaking when called by name (efficient, only 2 wires).
Key Points to Remember
UART (Universal Asynchronous Receiver-Transmitter): 2 signal wires (TX, RX) plus a common ground. No clock line, so both ends must agree on the baud rate (bits per second) in advance. Asynchronous. Simple point-to-point. Used for: debug console, GPS, Bluetooth modules.
SPI (Serial Peripheral Interface): 4 wires (MOSI, MISO, SCLK, CS). Synchronous. Master-slave. Fast and full-duplex. Each slave needs its own CS line. Used for: flash memory, SD cards, displays.
I2C (Inter-Integrated Circuit): Only 2 wires (SDA, SCL). Multiple devices share the same bus using unique addresses (normally 7-bit, with a 10-bit extension). Slower than SPI. Used for: sensors (temperature, accelerometer), EEPROM.
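Baud rate questions usually come down to counting the bits in one UART frame. The sketch below assumes the common frame layout of 1 start bit, the data bits, an optional parity bit, and the stop bits; the function names are made up for this example.

```python
def uart_frame_bits(data_bits=8, parity=False, stop_bits=1):
    """Bits on the wire per character: start + data + optional parity + stop."""
    return 1 + data_bits + (1 if parity else 0) + stop_bits


def uart_frames_per_second(baud, data_bits=8, parity=False, stop_bits=1):
    """Maximum characters per second at a given baud rate (bits per second)."""
    return baud / uart_frame_bits(data_bits, parity, stop_bits)


# 9600 baud with 8 data bits, no parity, 1 stop bit ("8N1"): 10 bits per byte.
rate_8n1 = uart_frames_per_second(9600)
```

So 9600 baud does not mean 9600 bytes per second: with 8N1 framing, 2 of every 10 bits are overhead.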
Likely Exam Questions
Compare SPI and I2C protocols. When would you choose one over the other?
Explain UART communication. What is baud rate?
How many devices can be connected on I2C bus? How are they addressed?
What are MOSI, MISO, SCLK, CS in SPI?
07
Debugging Embedded Systems — JTAG, GDB, ICE
MISSING FROM BOTH (Module 3)
What is it? (Simple)
Debugging embedded systems is hard — you can’t just use printf and a screen like in normal programming. Special tools are needed to see inside the running hardware. JTAG is the most common standard interface used to debug embedded chips. It lets you pause the CPU, inspect registers, set breakpoints — all from your laptop.
Easy Analogy
JTAG is like a doctor’s stethoscope for a chip. The chip is running inside a machine with no screen. JTAG gives you a way to “listen in” on what the CPU is doing — check its pulse (registers), pause it for examination (breakpoints), and even inject instructions.
Key Points to Remember
JTAG (Joint Test Action Group): IEEE 1149.1 standard. Uses 4 required pins (TDI, TDO, TCK, TMS) plus an optional reset pin (TRST). Connects the development PC to the chip, usually through a debug probe.
JTAG uses a TAP (Test Access Port), driven by a small state machine (the TAP controller) built into JTAG-capable chips, that allows external access to internal signals.
GDB (GNU Debugger): Open-source debugger. For embedded, it typically connects through a debug probe and a GDB server (for example OpenOCD) over JTAG or SWD to set breakpoints, inspect memory/registers, and step through code on real hardware.
ICE (In-Circuit Emulator): Replaces the actual processor chip with a special emulator chip that has full debug visibility. Expensive but powerful — used in early development.
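Simulators vs Emulators: Simulator = software model of the hardware that runs on the development PC (no real chip needed, but timing and peripherals are only approximated). Emulator = hardware that mimics the real processor inside the actual circuit (more accurate, but needs the target hardware and costs more).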
ARM has CoreSight debug architecture built in — supports JTAG and SWD (Serial Wire Debug, 2-pin alternative to JTAG).
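The TAP controller mentioned above is a 16-state machine whose next state depends only on the TMS pin at each TCK clock edge. Here is a sketch of that state diagram as a lookup table; the state names follow IEEE 1149.1, while the helper function names are assumptions for this example.

```python
# state -> (next state when TMS = 0, next state when TMS = 1)
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}


def tap_step(state, tms):
    """Advance the TAP controller by one TCK edge with the given TMS value."""
    return TAP_NEXT[state][tms]


def tap_walk(tms_bits, state="Test-Logic-Reset"):
    """Apply a sequence of TMS bits and return the final state."""
    for bit in tms_bits:
        state = tap_step(state, bit)
    return state
```

For example, `tap_walk([0, 1, 0, 0])` ends in Shift-DR, where data is clocked through TDI and TDO. Holding TMS high for five clocks returns the controller to Test-Logic-Reset from any state, which is why the TRST pin is optional.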
Likely Exam Questions
What is JTAG? Explain its pins and how it is used for debugging.
Differentiate between a simulator and an emulator in embedded systems.
What is ICE (In-Circuit Emulator)? How is it different from JTAG debugging?
How is GDB used to debug an embedded system running on ARM hardware?
08
Hardware-Software Co-design
MISSING FROM BOTH (Module 3)
What is it? (Simple)
In embedded systems, hardware and software are designed together at the same time — not one after the other. This is called hardware-software co-design. The goal is to decide: which parts of the system should be hardware (fast but fixed) and which should be software (flexible but slower)?
Easy Analogy
Building a restaurant: you design the kitchen layout (hardware) and the chef’s recipe process (software) at the same time. If you design the kitchen first then write recipes, you might find the kitchen has no oven when the recipe needs baking. Co-design prevents such mismatches.
Key Points to Remember
HW/SW Partitioning: Deciding which functions go to hardware (FPGA/ASIC) and which go to software (CPU). Hardware = faster, less flexible. Software = slower, easily changed.
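As one way to picture HW/SW partitioning, the sketch below uses a simple greedy rule: move the functions with the best time saved per unit of hardware area into hardware until an area budget runs out. The greedy rule, the task names, and all numbers are assumptions made for illustration; real co-design tools use more elaborate cost models.

```python
def partition(tasks, area_budget):
    """Split tasks into (hardware, software) lists.

    tasks: list of (name, sw_time, hw_time, hw_area) tuples, with hw_area > 0.
    """
    # Rank by time saved per unit of hardware area, best first.
    ranked = sorted(tasks, key=lambda t: (t[1] - t[2]) / t[3], reverse=True)
    hardware, used_area = [], 0
    for name, sw_time, hw_time, hw_area in ranked:
        if sw_time > hw_time and used_area + hw_area <= area_budget:
            hardware.append(name)
            used_area += hw_area
    software = [t[0] for t in tasks if t[0] not in hardware]
    return hardware, software


# Hypothetical system: times in ms, area in arbitrary units.
tasks = [
    ("fir_filter", 100, 10, 40),
    ("fft", 80, 20, 50),
    ("ui_menu", 5, 4, 30),
    ("protocol", 20, 15, 25),
]
hw, sw = partition(tasks, area_budget=100)
```

With these numbers the math-heavy filter and FFT go to hardware, while the menu and protocol code stay in software where they are easy to change.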
Likely Exam Questions
What is hardware-software co-design? Why is it important?
Explain HW/SW partitioning with an example.
What are the advantages of co-design over sequential design?
Draw the hardware-software co-design flow diagram.
09
Execution Time Analysis & Optimization (WCET)
MISSING FROM BOTH (Module 4)
What is it? (Simple)
WCET = Worst Case Execution Time — the maximum time a piece of code can EVER take to run. In embedded systems (especially real-time), if a task takes too long even once, the system can fail. So we must analyze and optimize how long our code runs.
Easy Analogy
An air-bag controller must inflate the bag within 30ms of a crash — always. If the code sometimes takes 25ms and sometimes takes 45ms, the system is dangerous. We need to guarantee it ALWAYS finishes in time. WCET analysis proves that guarantee.
Key Points to Remember
WCET (Worst Case Execution Time): The longest time your code can possibly take. Must be less than the deadline in real-time systems.
Code optimization techniques for speed:
— Loop unrolling: reduce loop overhead by expanding iterations
— Inlining functions: remove function call overhead
— Avoid recursion: use iteration instead
— Use local variables: faster than global (register allocation)
— Minimize branching: branches flush CPU pipeline
Profiling: Measuring how much time each part of code takes. Tools: gprof, ARM DS-5 profiler. Find the “hotspot” (slowest part) and optimize that first.
Pipeline effects: Branch misprediction wastes cycles. Data hazards cause stalls. Write code to minimize both.
Cache optimization: Data accessed together should be stored together (spatial locality). Use data in loops repeatedly (temporal locality). Cache misses are expensive.
Compiler optimization flags: -O0 (none), -O1, -O2, -O3 (max). -Os optimizes for size. Higher optimization = faster but harder to debug.
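Loop unrolling, the first technique listed above, is easiest to see side by side. This sketch uses Python only to show the shape of the transformation; the real benefit (fewer loop-counter updates and branches per element) appears in compiled C, where the compiler can often do this itself at higher optimization levels. The function names are assumptions for this example.

```python
def sum_rolled(data):
    """Plain loop: one loop test and one index update per element."""
    total = 0
    for i in range(len(data)):
        total += data[i]
    return total


def sum_unrolled4(data):
    """Unrolled by 4: one loop test and one index update per four elements."""
    total = 0
    n = len(data)
    limit = n - (n % 4)  # largest multiple of 4 not exceeding n
    i = 0
    while i < limit:
        total += data[i] + data[i + 1] + data[i + 2] + data[i + 3]
        i += 4
    while i < n:  # clean-up loop for the 0 to 3 leftover elements
        total += data[i]
        i += 1
    return total
```

Both functions return the same result for any list. The trade-off is larger code size, which matters on small embedded targets and is why `-Os` tends to avoid this kind of expansion.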
Slow code patterns
Recursion (stack overhead)
Dynamic memory (malloc/free)
Global variables
Long if-else chains
Function calls in loops
Cache-unfriendly access
Fast code patterns
Iteration instead of recursion
Static/stack allocation
Local variables (registers)
Switch/lookup tables
Inline small functions
Sequential memory access
Likely Exam Questions
What is WCET? Why is it critical in real-time embedded systems?
List and explain any 4 techniques to optimize execution time of embedded software.
What is code profiling? How does it help in optimization?
Explain loop unrolling with a code example. What is the benefit?
What are compiler optimization levels (-O flags)? What is the trade-off?
10
SHARC Processor — Instruction Set & I/O Programming
MISSING FROM BOTH (Module 2)
What is it? (Simple)
SHARC has its own instruction set designed for signal processing. Its most powerful feature is parallel (multifunction) instruction execution: in one clock cycle it can perform a math operation and move data between memory and registers, all at the same time. Its I/O is also distinctive: transfers are DMA-driven, so data moves in and out without tying up the core.