Embedded Systems

Embedded Systems — Missing Topics Exam Notes
Exam Preparation Notes

All 10 uncovered topics explained simply — for exam use

6 missing topics, 4 partial topics, 10 exam questions
01
SHARC Processor — Introduction & Architecture
MISSING FROM BOTH Module 2

What is it? (Simple)

SHARC stands for Super Harvard Architecture Single-Chip Computer. It is a family of processors made by Analog Devices, designed specifically for DSP (Digital Signal Processing) tasks like audio processing, speech recognition, and video encoding. Think of it as a calculator that is extremely fast at doing math, especially multiply and add operations done repeatedly.

Easy Analogy

ARM is like a general-purpose worker who can do any job. SHARC is like a specialist chef who only cooks, but cooks far faster than the general worker because everything in the kitchen is designed just for cooking (signal math).

Key Points to Remember

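SHARC uses a Modified (Super) Harvard Architecture: separate program memory (PM) and data memory (DM) buses plus an on-chip instruction cache. With the instruction coming from the cache, the core can fetch one instruction and two data operands (one over each bus) in a single cycle.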
It has a MAC unit (Multiply-Accumulate) — can do multiply + add in ONE clock cycle. Critical for signal processing.
Supports 32-bit floating point and fixed-point arithmetic.
Has on-chip SRAM — very fast memory access without going off-chip.
Used in: audio equipment, modems, radar systems, medical imaging.
SHARC vs ARM: ARM = general embedded tasks. SHARC = heavy math/signal tasks.
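
To see why a single-cycle MAC matters, here is a minimal Python sketch of one FIR filter output sample, which is just a chain of multiply-accumulate steps. The function name `fir_sample` and the sample values are illustrative assumptions, not SHARC code; on a DSP each loop iteration would ideally map to one MAC operation.

```python
def fir_sample(coeffs, samples):
    """Compute one FIR filter output: the sum of coeffs[i] * samples[i].

    Each loop iteration is one multiply-accumulate (MAC) step.
    Returns the result and the number of MAC steps performed.
    """
    if len(coeffs) != len(samples):
        raise ValueError("coeffs and samples must have the same length")
    acc = 0.0
    mac_count = 0
    for c, x in zip(coeffs, samples):
        acc += c * x      # multiply + add: one MAC
        mac_count += 1
    return acc, mac_count


# Hypothetical 4-tap moving-average filter over four input samples
result, macs = fir_sample([0.25, 0.25, 0.25, 0.25], [4.0, 8.0, 12.0, 16.0])
print(result, macs)
```

A filter with N taps needs N MACs per output sample, so a processor that finishes one MAC per cycle needs roughly N cycles per sample.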

SHARC Memory Architecture

Program Memory (PM) Bus: instructions + data
    ↓
┌─────────────────────────────┐
│ PM RAM │ CORE │ DM RAM │
│(Program) │ (CPU) │ (Data) │
└─────────────────────────────┘
             ↑
  Data Memory (DM) Bus: data
→ With the instruction cache: 1 instruction + 2 data operands per cycle!

Likely Exam Questions

What does SHARC stand for? Explain its architecture.
How is SHARC different from a general-purpose processor like ARM?
Why is SHARC preferred for DSP applications?
What is a MAC unit and why is it important in SHARC?
02
Formal Specification Methods — UML, State Charts, DFD
MISSING FROM BOTH Module 1

What is it? (Simple)

Formalization means drawing a precise diagram of your system before you build it, so everyone understands exactly how it works. Just like an architect draws a blueprint before building a house, an embedded system designer draws state charts or DFDs before writing code.

Easy Analogy

Imagine a traffic light. Before coding it, you draw: OFF → RED → GREEN → YELLOW → RED. This is a state chart. Every circle is a “state” (what the system is doing), and every arrow is a “transition”, labelled with the “trigger” (the event that causes the change). This is formalization.

Key Points to Remember

State Chart / State Machine: Shows all possible states of the system and transitions between them. Used for control-heavy embedded systems.
DFD (Data Flow Diagram): Shows how data moves through the system — from input to processing to output. Uses circles (processes), arrows (data flows), rectangles (external entities).
UML (Unified Modeling Language): A standard set of diagram types — use case diagrams, sequence diagrams, class diagrams — for modeling system behavior.
Formalization helps catch bugs before coding. Much cheaper to fix on paper than in hardware.
Key UML diagrams for embedded: Use Case, Sequence, State Machine, Component diagrams.

Simple State Machine Example (Traffic Light)

[RED] ──timer──→ [GREEN]
                   ↓ timer
[RED] ←─timer── [YELLOW]

State = current condition
Transition = arrow from one state to the next, fired by a trigger/event
→ This IS formalization for system design
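
The traffic light chart above can be turned into code almost mechanically. This is a minimal sketch in Python; the names `TRANSITIONS`, `next_state` and `run` are illustrative assumptions, and unknown events are assumed to leave the state unchanged.

```python
# Transition table: (current state, event) -> next state
TRANSITIONS = {
    ("RED", "timer"): "GREEN",
    ("GREEN", "timer"): "YELLOW",
    ("YELLOW", "timer"): "RED",
}


def next_state(state, event):
    """Return the next state; stay in the same state for unknown events."""
    return TRANSITIONS.get((state, event), state)


def run(start, events):
    """Feed a list of events to the machine and return every state visited."""
    visited = [start]
    for event in events:
        visited.append(next_state(visited[-1], event))
    return visited


print(run("RED", ["timer", "timer", "timer"]))
# prints ['RED', 'GREEN', 'YELLOW', 'RED']
```

Keeping the transitions in a table rather than in nested if-else code makes it easy to check the program against the drawn chart, one arrow per table entry.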

Likely Exam Questions

What is formalization in embedded system design? Why is it needed?
Draw a state chart for a simple embedded system (e.g. vending machine / traffic light).
Differentiate between DFD and State Machine diagrams.
What is UML? Name 3 UML diagrams used in embedded system design.
03
CPU Performance Metrics — CPI, MIPS, Benchmarks
PARTIALLY COVERED Module 2

What is it? (Simple)

How do we measure how fast a processor is? We use metrics. CPI = how many clock cycles one instruction takes on average. MIPS = how many million instructions run per second. These help compare different processors.

Easy Analogy

A worker (CPU) processes files. CPI = how many minutes per file. MIPS = how many files per minute. A faster worker has lower CPI and higher MIPS. Clock speed (MHz) is how fast the worker’s hands move — but if each task takes more hand-moves (high CPI), speed alone doesn’t help.

Key Formulas

CPU Time = Instructions × CPI × Clock Period
Or: CPU Time = (Instructions × CPI) / Clock Frequency

MIPS = Clock Frequency / (CPI × 10⁶)
Higher MIPS = faster processor (for same type of program)

Speedup = Old Time / New Time
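
The formulas above can be checked with a few lines of Python. This is a small sketch; the function names and the 1-million-instruction program are assumptions chosen for illustration.

```python
def cpu_time(instructions, cpi, clock_hz):
    """CPU Time = (Instructions x CPI) / Clock Frequency, in seconds."""
    return instructions * cpi / clock_hz


def mips(clock_hz, cpi):
    """MIPS = Clock Frequency / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


def speedup(old_time, new_time):
    """Speedup = Old Time / New Time."""
    return old_time / new_time


# Hypothetical program of 1 million instructions on two processors
INSTRUCTIONS = 1_000_000
time_a = cpu_time(INSTRUCTIONS, cpi=3, clock_hz=2e9)   # 2 GHz, CPI = 3
time_b = cpu_time(INSTRUCTIONS, cpi=1, clock_hz=1e9)   # 1 GHz, CPI = 1

print(mips(500e6, 2))             # 500 MHz with CPI = 2
print(mips(2e9, 3), mips(1e9, 1)) # MIPS of processor A and processor B
print(speedup(time_a, time_b))    # how many times faster B is than A
```

With these numbers the 1 GHz processor finishes first even though its clock is half as fast, because it needs one third of the cycles per instruction.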

Key Points to Remember

CPI (Cycles Per Instruction): For a simple single-issue pipeline the ideal is 1. RISC processors aim for CPI = 1. CISC processors have CPI > 1 for complex instructions.
Clock Speed alone is misleading: for the same instruction count, a 1GHz processor with CPI=1 (1000 MIPS) is faster than a 2GHz processor with CPI=3 (about 667 MIPS).
Benchmarks = standard test programs used to compare processors in real conditions (e.g. SPEC benchmarks).
Amdahl’s Law: Speedup is limited by the part of the program you CAN’T speed up. If 50% of code can’t be parallelized, max speedup = 2x even with infinite processors.
For embedded: power efficiency per MIPS matters more than raw MIPS.
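
Amdahl's Law is easy to explore numerically. The sketch below uses the usual form Speedup = 1 / ((1 − p) + p / N), where p is the fraction of the program that can be sped up and N is the number of processors; the function name `amdahl_speedup` is an assumption for illustration.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Overall speedup when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# 50% of the code can be parallelized: the speedup approaches 2x but never reaches it
for n in (1, 2, 4, 1000):
    print(n, amdahl_speedup(0.5, n))
```

Even with 1000 processors the result stays just below 2, which is the limit stated in the notes for a program that is 50% serial.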

Likely Exam Questions

Define CPI. How does it affect CPU performance?
A processor runs at 500MHz with average CPI=2. Calculate MIPS.
State Amdahl’s Law and explain its significance.
Why is clock speed alone not a good measure of CPU performance?
04
Power Consumption in Embedded Processors
PARTIALLY COVERED Module 2

What is it? (Simple)

Many embedded systems run on batteries (phones, IoT devices), so power consumption is critical. There are two types of power loss in a chip: Dynamic power (used when the circuit is switching/working) and Static power (leaked even when idle).

Easy Analogy

Dynamic power = electricity used when your fan is ON and spinning. Static power = electricity slowly leaking even when the fan switch is OFF (due to imperfect insulation). In modern chips, both matter — static leakage is a big problem at small transistor sizes.

Key Formulas

Dynamic Power = α × C × V² × f
α=activity factor, C=capacitance, V=voltage, f=frequency

Reduce V → dynamic power falls with V² (most effective!)
Halving voltage at the same frequency = 4x less dynamic power
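
A quick numeric check of the dynamic power formula, as a sketch. The activity factor, capacitance, voltage and frequency below are made-up values chosen only to show the scaling, not data for any real chip.

```python
def dynamic_power(alpha, capacitance_f, voltage_v, freq_hz):
    """Dynamic Power = alpha x C x V^2 x f, in watts."""
    return alpha * capacitance_f * voltage_v ** 2 * freq_hz


# Hypothetical chip: alpha = 0.2, C = 1 nF, V = 1.2 V, f = 100 MHz
full = dynamic_power(0.2, 1e-9, 1.2, 100e6)
half_v = dynamic_power(0.2, 1e-9, 0.6, 100e6)          # voltage halved only
half_v_half_f = dynamic_power(0.2, 1e-9, 0.6, 50e6)    # voltage and frequency halved

print(full)
print(full / half_v)          # effect of halving the voltage
print(full / half_v_half_f)   # effect of halving voltage and frequency together
```

Halving the voltage alone gives a ratio of about 4, and halving the frequency as well gives about 8, which is why scaling both together saves so much power.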

Key Points to Remember

Dynamic Power is caused by transistors switching ON/OFF. Proportional to frequency and voltage².
Static Power (Leakage) flows even when processor is idle. Bigger problem in smaller transistors (below 90nm).
DVFS (Dynamic Voltage Frequency Scaling): Lower voltage + frequency when full performance not needed. Key power-saving technique.
Sleep modes: Embedded processors have multiple sleep states (idle, sleep, deep sleep) to save power when not active.
Power-Delay Product (PDP): Measure of energy per operation. Lower = better design.
ARM processors are popular in mobile because they have excellent MIPS per milliwatt ratio.

Likely Exam Questions

Differentiate between dynamic and static power consumption.
What is DVFS? How does it help in reducing power consumption?
Write the formula for dynamic power. Which factor has the most impact?
Why is power consumption important in embedded systems design?
05
Programming I/O — Memory-Mapped I/O, Polling vs Interrupt
PARTIALLY COVERED Module 2

What is it? (Simple)

I/O (Input/Output) programming means: how does the CPU communicate with external devices like sensors, displays, buttons? There are two main approaches. Polling = CPU keeps checking if device is ready (like refreshing email manually). Interrupt = device taps CPU on shoulder when it needs attention (like getting a notification).

Easy Analogy

Polling = You go to the door every 5 minutes to check if a delivery arrived. Interrupts = Doorbell rings and you go only when delivery actually comes. Interrupts save CPU time. Polling wastes CPU cycles but is simpler to code.

Polling

  • CPU continuously checks device status
  • Simple to implement
  • Wastes CPU cycles
  • Good for fast/predictable devices
  • Low, predictable latency while the CPU is actively polling

Interrupt-Driven I/O

  • Device signals CPU when ready
  • CPU free to do other work
  • More complex (ISR needed)
  • Good for slow/unpredictable devices
  • Small latency to save context and enter the ISR

Key Points to Remember

Memory-Mapped I/O: Peripheral devices are assigned memory addresses. CPU reads/writes those addresses to communicate with devices — same instructions as normal memory access.
ISR (Interrupt Service Routine): Special function that runs automatically when interrupt occurs. Must be short and fast.
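DMA (Direct Memory Access): Hardware moves data between memory and device without the CPU copying each byte. The CPU only sets up the transfer and is usually notified by an interrupt when it completes, so it is free for other work in between.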
ARM uses memory-mapped I/O. All peripherals (GPIO, UART, SPI) accessed via specific memory addresses.
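
The difference between polling and interrupts can be sketched with a simulated device. Everything here is hypothetical: `FakeSensor` is a software stand-in for a memory-mapped peripheral, `READY_BIT` is an assumed flag in its status register, and the `isr` callback plays the role of an interrupt service routine. Real code would read hardware registers instead.

```python
READY_BIT = 0x01  # assumed "data ready" flag in the status register


class FakeSensor:
    """Software stand-in for a peripheral (not real hardware)."""

    def __init__(self, value, ready_after):
        self._value = value
        self._reads_left = ready_after  # status reads before data is ready
        self.isr = None                 # ISR-style callback, set by the "CPU"

    def read_status(self):
        """Model a read of the status register."""
        if self._reads_left > 0:
            self._reads_left -= 1
            return 0x00
        return READY_BIT

    def read_data(self):
        """Model a read of the data register."""
        return self._value

    def fire_interrupt(self):
        """Model the device signalling the CPU that data is ready."""
        if self.isr is not None:
            self.isr(self._value)


def poll_read(device):
    """Polling: keep checking the status register until the ready bit is set."""
    polls = 0
    while True:
        polls += 1
        if device.read_status() & READY_BIT:
            return device.read_data(), polls


# Polling: the CPU spends 6 status reads to get one value
print(poll_read(FakeSensor(value=42, ready_after=5)))

# Interrupt-driven: the CPU registers a handler and does no status reads at all
received = []
sensor = FakeSensor(value=42, ready_after=5)
sensor.isr = received.append   # short ISR: just store the value
sensor.fire_interrupt()
print(received)
```

The polling version burns five status reads that return "not ready" before the sixth succeeds; in the interrupt version that time would be available for other work.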

Likely Exam Questions

What is memory-mapped I/O? How does it differ from port-mapped I/O?
Compare polling and interrupt-driven I/O with advantages and disadvantages.
What is an ISR? What are the rules for writing an ISR?
Explain DMA. When would you use DMA instead of interrupt-driven I/O?
06
Component Interfacing — SPI, I2C, UART
PARTIALLY COVERED Module 3

What is it? (Simple)

Embedded systems need to talk to sensors, displays, memory chips etc. They use standard communication protocols. UART, SPI, I2C are the three most common protocols — like different languages the CPU uses to talk to components.

Easy Analogy

UART = sending letters one by one (slow but works over long distance). SPI = a dedicated phone call with one person at a time (fast, but needs separate phone line per person). I2C = a group meeting where everyone shares one room but takes turns speaking using their name (efficient, only 2 wires).

Key Points to Remember

UART (Universal Asynchronous Receiver/Transmitter): 2 signal wires (TX, RX) plus a shared ground. No clock line. Asynchronous. Simple point-to-point. Used for: debug console, GPS, Bluetooth modules.
SPI (Serial Peripheral Interface): 4 wires (MOSI, MISO, SCLK, CS). Synchronous. Master-slave. Fast. Each slave needs its own CS line. Used for: flash memory, SD cards, displays.
I2C (Inter-Integrated Circuit): Only 2 wires (SDA, SCL). Multiple devices share same bus using unique addresses. Slower than SPI. Used for: sensors (temperature, accelerometer), EEPROM.
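Typical speed comparison: SPI (fastest, often tens of MHz) > I2C (medium, 100 kbit/s in standard mode, 400 kbit/s in fast mode) > UART (usually slowest for peripherals, e.g. 9600 to 115200 baud).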
Wires needed: I2C=2, UART=2, SPI=4+ (more slaves = more CS wires).

Quick Comparison Table

Protocol │ Wires │ Speed  │ Multi-device        │ Type
──────────────────────────────────────────────────────
UART     │ 2     │ Slow   │ No (point-to-point) │ Async
SPI      │ 4+    │ Fast   │ Yes (separate CS)   │ Sync
I2C      │ 2     │ Medium │ Yes (addresses)     │ Sync
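
Two small calculations help make the protocol details concrete. This is a sketch under stated assumptions: the UART frame is assumed to have 1 start bit, and I2C is assumed to use 7-bit addressing, where the first byte on the bus is the address shifted left by one with the read/write bit in the lowest position. The function names and the address 0x48 are illustrative only.

```python
def uart_bytes_per_second(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Payload bytes per second for a UART frame with 1 start bit."""
    bits_per_frame = 1 + data_bits + parity_bits + stop_bits
    return baud / bits_per_frame


def i2c_address_byte(address_7bit, read):
    """First byte sent on the I2C bus: 7-bit address plus the R/W bit."""
    if not 0 <= address_7bit <= 0x7F:
        raise ValueError("7-bit I2C addresses range from 0x00 to 0x7F")
    return (address_7bit << 1) | (1 if read else 0)


# 9600 baud with 8 data bits, no parity, 1 stop bit: 10 bits on the wire per byte
print(uart_bytes_per_second(9600))

# Hypothetical sensor at address 0x48: write request, then read request
print(hex(i2c_address_byte(0x48, read=False)), hex(i2c_address_byte(0x48, read=True)))
```

The baud rate counts every bit on the wire, including start and stop bits, so the usable byte rate is lower than baud / 8. A 7-bit address field allows at most 128 distinct values, some of which are reserved.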

Likely Exam Questions

Compare SPI and I2C protocols. When would you choose one over the other?
Explain UART communication. What is baud rate?
How many devices can be connected on I2C bus? How are they addressed?
What are MOSI, MISO, SCLK, CS in SPI?
07
Debugging Embedded Systems — JTAG, GDB, ICE
MISSING FROM BOTH Module 3

What is it? (Simple)

Debugging embedded systems is hard — you can’t just use printf and a screen like in normal programming. Special tools are needed to see inside the running hardware. JTAG is the most common standard interface used to debug embedded chips. It lets you pause the CPU, inspect registers, set breakpoints — all from your laptop.

Easy Analogy

JTAG is like a doctor’s stethoscope for a chip. The chip is running inside a machine with no screen. JTAG gives you a way to “listen in” on what the CPU is doing — check its pulse (registers), pause it for examination (breakpoints), and even inject instructions.

Key Points to Remember

JTAG (Joint Test Action Group): IEEE 1149.1 standard. Uses 4-5 pins (TDI, TDO, TCK, TMS, TRST). Connects laptop to chip for debugging.
JTAG uses a TAP (Test Access Port), controlled by a state machine built into most modern microcontrollers and processors, that allows external access to internal signals.
GDB (GNU Debugger): Open-source debugger. For embedded, it talks to the chip through a debug probe (JTAG or SWD), usually via a GDB server such as OpenOCD, to set breakpoints, inspect memory/registers, and step through code on real hardware.
ICE (In-Circuit Emulator): Replaces the actual processor chip with a special emulator chip that has full debug visibility. Expensive but powerful — used in early development.
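Simulators vs Emulators: Simulator = software model of the hardware that runs on the development PC (no real chip needed, but timing and peripherals are only approximated). Emulator = hardware that stands in for the target processor in the real circuit (behaves like the real chip, but needs extra equipment).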
ARM has CoreSight debug architecture built in — supports JTAG and SWD (Serial Wire Debug, 2-pin alternative to JTAG).

Likely Exam Questions

What is JTAG? Explain its pins and how it is used for debugging.
Differentiate between a simulator and an emulator in embedded systems.
What is ICE (In-Circuit Emulator)? How is it different from JTAG debugging?
How is GDB used to debug an embedded system running on ARM hardware?
08
Hardware-Software Co-design
MISSING FROM BOTH Module 3

What is it? (Simple)

In embedded systems, hardware and software are designed together at the same time — not one after the other. This is called hardware-software co-design. The goal is to decide: which parts of the system should be hardware (fast but fixed) and which should be software (flexible but slower)?

Easy Analogy

Building a restaurant: you design the kitchen layout (hardware) and the chef’s recipe process (software) at the same time. If you design the kitchen first then write recipes, you might find the kitchen has no oven when the recipe needs baking. Co-design prevents such mismatches.

Key Points to Remember

HW/SW Partitioning: Deciding which functions go to hardware (FPGA/ASIC) and which go to software (CPU). Hardware = faster, less flexible. Software = slower, easily changed.
Co-design flow: Specification → Partitioning → HW design + SW design (parallel) → Integration → Testing.
System-level language: Tools like SystemC or SpecC allow modelling both HW and SW in one language before committing to implementation.
Co-simulation: Simulating HW and SW together to find bugs before building real hardware.
Trade-offs: More in hardware = higher cost, faster speed, less flexibility. More in software = lower cost, slower, easy to update.
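
HW/SW partitioning can be illustrated with a toy example. The sketch below uses a simple greedy heuristic (move the functions that save the most time per unit of hardware area first, until an area budget is used up). This heuristic, the function names, and all numbers are assumptions for illustration; real co-design tools use more elaborate cost models.

```python
def partition(functions, area_budget):
    """Split functions between hardware and software.

    functions: list of (name, sw_time, hw_time, hw_area) tuples, hw_area > 0.
    Returns (hardware names, software names, total time if run one after another).
    """
    # Rank by time saved per unit of hardware area
    ranked = sorted(functions, key=lambda f: (f[1] - f[2]) / f[3], reverse=True)
    hardware, software, used_area = [], [], 0
    for name, sw_time, hw_time, area in ranked:
        if sw_time > hw_time and used_area + area <= area_budget:
            hardware.append(name)
            used_area += area
        else:
            software.append(name)
    total_time = sum(
        hw_time if name in hardware else sw_time
        for name, sw_time, hw_time, _ in functions
    )
    return hardware, software, total_time


# Hypothetical system: (name, software time, hardware time, hardware area)
system = [("fft", 100, 10, 50), ("ui", 20, 18, 30), ("filter", 60, 12, 40)]
print(partition(system, area_budget=90))
```

Here the math-heavy functions end up in hardware and the user interface stays in software, matching the trade-off described above: hardware where speed matters, software where flexibility matters.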

Co-design Flow

     [System Specification]
          ↓
    [HW/SW Partitioning]
     ↙         ↘
[HW Design]    [SW Design]
     ↘         ↙
   [Co-simulation]
          ↓
   [Integration & Test]

Likely Exam Questions

What is hardware-software co-design? Why is it important?
Explain HW/SW partitioning with an example.
What are the advantages of co-design over sequential design?
Draw the hardware-software co-design flow diagram.
09
Execution Time Analysis & Optimization (WCET)
MISSING FROM BOTH Module 4

What is it? (Simple)

WCET = Worst Case Execution Time — the maximum time a piece of code can EVER take to run. In embedded systems (especially real-time), if a task takes too long even once, the system can fail. So we must analyze and optimize how long our code runs.

Easy Analogy

An air-bag controller must inflate the bag within 30ms of a crash — always. If the code sometimes takes 25ms and sometimes takes 45ms, the system is dangerous. We need to guarantee it ALWAYS finishes in time. WCET analysis proves that guarantee.

Key Points to Remember

WCET (Worst Case Execution Time): The longest time your code can possibly take. Must be less than the deadline in real-time systems.
Code optimization techniques for speed:
  • Loop unrolling: reduce loop overhead by expanding iterations (at the cost of larger code)
  • Inlining functions: remove function call overhead
  • Avoid recursion: use iteration instead (also makes stack use and timing easier to bound)
  • Use local variables: often faster than globals because the compiler can keep them in registers
  • Minimize branching: mispredicted branches flush the CPU pipeline
Profiling: Measuring how much time each part of code takes. Tools: gprof, ARM DS-5 profiler. Find the “hotspot” (slowest part) and optimize that first.
Pipeline effects: Branch misprediction wastes cycles. Data hazards cause stalls. Write code to minimize both.
Cache optimization: Data accessed together should be stored together (spatial locality). Use data in loops repeatedly (temporal locality). Cache misses are expensive.
Compiler optimization flags (GCC/Clang): -O0 (none), -O1, -O2, -O3 (most aggressive). -Os optimizes for size. Higher optimization = usually faster code but harder to debug.
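
A rough WCET estimate can be built by always assuming the slowest path. The sketch below handles one simple code shape: some setup, then a loop with a known maximum iteration count whose body contains an if-else. The cycle counts, clock speed and deadlines are made-up values, and the model deliberately ignores cache and pipeline effects, which real WCET tools must account for.

```python
def wcet_cycles(setup, loop_bound, body_common, branch_a, branch_b):
    """Worst-case cycles for: setup; repeat loop_bound times { common; if A else B }."""
    worst_branch = max(branch_a, branch_b)   # always assume the slower branch
    return setup + loop_bound * (body_common + worst_branch)


def meets_deadline(cycles, clock_hz, deadline_s):
    """True if the worst-case time fits within the deadline."""
    return cycles / clock_hz <= deadline_s


# Hypothetical task: 100 setup cycles, at most 1000 iterations,
# 20 common cycles per iteration, branches costing 5 or 30 cycles
cycles = wcet_cycles(setup=100, loop_bound=1000, body_common=20, branch_a=5, branch_b=30)
print(cycles)
print(meets_deadline(cycles, clock_hz=100e6, deadline_s=1e-3))    # 1 ms deadline
print(meets_deadline(cycles, clock_hz=100e6, deadline_s=0.5e-3))  # 0.5 ms deadline
```

The estimate only works because the loop has a fixed upper bound, which is one reason unbounded loops and recursion are avoided in real-time code.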

Slow code patterns

  • Recursion (stack overhead)
  • Dynamic memory (malloc/free)
  • Global variables
  • Long if-else chains
  • Function calls in loops
  • Cache-unfriendly access

Fast code patterns

  • Iteration instead of recursion
  • Static/stack allocation
  • Local variables (registers)
  • Switch/lookup tables
  • Inline small functions
  • Sequential memory access

Likely Exam Questions

What is WCET? Why is it critical in real-time embedded systems?
List and explain any 4 techniques to optimize execution time of embedded software.
What is code profiling? How does it help in optimization?
Explain loop unrolling with a code example. What is the benefit?
What are compiler optimization levels (-O flags)? What is the trade-off?
10
SHARC Processor — Instruction Set & I/O Programming
MISSING FROM BOTH Module 2

What is it? (Simple)

SHARC has its own instruction set designed for signal processing. Its most powerful feature is parallel instruction execution: in one clock cycle, a single instruction can perform a compute operation AND two data moves (one over the DM bus, one over the PM bus), all at the same time. Its I/O model is also special, with DMA-driven transfers.

Key Points to Remember

SHARC Instruction Types: Compute instructions (ALU/MAC), Move instructions (data transfer), Program flow (branch/call).
Parallel execution: One SHARC instruction can specify: Compute + DM move + PM move simultaneously. In SHARC assembly the parallel parts are written on one line, separated by commas and ended with a semicolon: compute, DM move, PM move;
Zero-overhead loop (DO UNTIL): Hardware loop counter means no branch instruction needed at end of loop — loop at zero extra cost.
SHARC I/O: Uses DMA (Direct Memory Access) heavily — data streams in/out without CPU involvement, allowing CPU to keep doing DSP math simultaneously.
Serial ports: SHARC has dedicated serial ports (SPORT) for connecting to ADCs and DACs directly — critical for audio and signal applications.
Key comparison with ARM: ARM is load-store architecture with sequential execution. SHARC uses parallel issue and is optimized for streaming data math.

Likely Exam Questions

Explain the parallel execution model of SHARC processor with an example.
What is a zero-overhead loop in SHARC? Why is it useful?
How does SHARC handle I/O? What is the role of SPORT and DMA?
Compare ARM and SHARC instruction sets with respect to embedded application suitability.
