πŸ“˜ L2: Process Abstraction

Lecture Goal

Understand why the OS needs the process abstraction, master the three contexts (Memory, Hardware, OS), and know how processes transition through states and interact with the OS via system calls, exceptions, and interrupts.

The Big Picture

A process is the OS’s fundamental unit of execution. Everything the OS does β€” scheduling, memory management, IPC β€” revolves around processes. Understanding this abstraction is foundational.


1. Why We Need Processes

🧠 Core Concept

A program is a static set of instructions sitting on disk. A process (also called a task or job) is a dynamic abstraction of a program currently executing on the CPU.

The OS must support multiprogramming β€” running multiple programs seemingly at the same time by rapidly switching between them. To switch from running Program A to Program B, the OS needs to:

  1. Save all the execution information of Program A
  2. Load the execution information of Program B

This requires a formal structure to describe a running program β†’ the Process.

| Context          | Contents                                    |
|------------------|---------------------------------------------|
| Memory Context   | Code (Text), Data, Stack, Heap              |
| Hardware Context | Registers, PC, Stack Pointer, Frame Pointer |
| OS Context       | Process ID, Process State, Resources used   |
```mermaid
flowchart TD
    Program["Program<br/>(Static, on disk)"]
    Process["Process<br/>(Dynamic, executing)"]

    Program -->|"Loaded into memory"| Process

    Process --> MC["Memory Context<br/>Text, Data, Stack, Heap"]
    Process --> HC["Hardware Context<br/>PC, SP, FP, Registers"]
    Process --> OC["OS Context<br/>PID, State, Resources"]

    style Program fill:#ffcc99
    style Process fill:#99ff99
```

⚠️ Common Pitfalls

Pitfall: Program β‰  Process

A program is passive (a file on disk). A process is active (it has memory allocated, a PC pointing somewhere, registers with values). The same program can spawn multiple processes simultaneously (e.g., two terminal windows running the same shell).


❓ Mock Exam Questions

Q1: Dynamic Abstraction

Why is a process described as a β€œdynamic abstraction” for an executing program?

Answer

Answer: A process captures the runtime state of a program β€” its memory allocations, register values, and execution position. Unlike a static program file, a process changes continuously as it executes. The abstraction lets the OS manage execution without knowing program internals.

Q2: Three Contexts

Name the three major contexts that make up the process abstraction and give one example item from each.

Answer

Answer:

  • Memory Context: Text segment, Data segment, Stack, Heap
  • Hardware Context: Program Counter, Stack Pointer, General Purpose Registers
  • OS Context: Process ID (PID), Process State, Open file descriptors

2. Memory Context

2.1 Code & Data Regions

🧠 Core Concept

When a program is loaded into memory, it is divided into regions:

| Region       | Description                          | Properties                  |
|--------------|--------------------------------------|-----------------------------|
| Text Segment | Machine instructions (compiled code) | Read-only during execution  |
| Data Segment | Global and static variables          | Size known at compile time  |

Example: The C code int i = 0; i = i + 20; compiles to:

```asm
addi $1, $0, 0       # register $1 = 0
sw   $1, 4096($0)    # i = 0 (store to address 4096 in Data)
lw   $2, 4096($0)    # load i
addi $3, $2, 20      # $3 = i + 20
sw   $3, 4096($0)    # i = i + 20
```

The assembly instructions live in Text; variable i lives in Data at address 4096.


⚠️ Common Pitfalls

Pitfall: Local variables β‰  Data segment

The Data region is only for global/static variables known at compile time. Local variables live on the stack.


2.2 Stack Memory & Function Calls

🧠 Core Concept

Functions require memory that is created when called and destroyed when they return. This is too dynamic for the Data region.

Solution: Stack Memory

Each function invocation gets a stack frame (activation record) pushed onto the stack. The stack follows LIFO order β€” most recently called function is at the top.

| Field           | Purpose                                       |
|-----------------|-----------------------------------------------|
| Return PC       | Where to resume execution in the caller       |
| Parameters      | Arguments passed to the function              |
| Saved SP        | Old Stack Pointer to restore on return        |
| Saved FP        | Old Frame Pointer to restore on return        |
| Saved Registers | Registers the callee must preserve            |
| Local Variables | Space for variables inside the function       |
| Return Result   | Space for the value returned to the caller    |
```mermaid
flowchart TD
    subgraph Stack ["Stack Memory"]
        direction TB
        FrameH["Stack Frame for h()<br/>β”œβ”€ Local vars<br/>β”œβ”€ Saved regs<br/>β”œβ”€ Parameters<br/>└─ Return PC"]
        FrameG["Stack Frame for g()"]
        FrameF["Stack Frame for f()"]
        FrameMain["Stack Frame for main()"]
    end

    SP["SP β†’ Top of stack"]

    style Stack fill:#ffcc99
    style FrameH fill:#99ff99
```

Key Registers:

| Register           | Role                                                                      |
|--------------------|---------------------------------------------------------------------------|
| Stack Pointer (SP) | Points to the top of the stack; moves dynamically                         |
| Frame Pointer (FP) | Fixed reference within current frame; stable for accessing locals/parameters |

⚠️ Common Pitfalls

Pitfall: SP vs FP confusion

SP = moves dynamically, always points to current stack top. FP = stays fixed for one function call, used as stable reference.

Pitfall: No universal calling convention

The exact stack frame layout varies by architecture and language. The key concepts (return address, parameters, locals) are universal; their ordering and who sets them up differs.

Pitfall: Stack overflow

Unbounded recursion causes the stack to grow until it collides with the heap β†’ crash.


❓ Mock Exam Questions

Q3: Return PC Purpose

What is the purpose of saving the Return PC on the stack frame? What would go wrong if we did not save it?

Answer

Answer: The Return PC tells the CPU where to resume execution in the caller after the callee returns. Without it, the callee wouldn’t know where to return β€” execution would jump to an arbitrary location, likely crashing.

Q4: Register Spilling

What is β€œregister spilling” and why is it needed?

Answer

Answer: CPUs have limited General Purpose Registers (GPRs). When a function needs more registers than available, it β€œspills” some register values to memory (in its stack frame), uses the registers freely, then restores them before returning.


2.3 Heap Memory

🧠 Core Concept

Some memory needs can only be determined at runtime (e.g., allocating an array whose size comes from user input). This can’t go in Data or Stack:

  1. Not Data β†’ Size unknown at compile time
  2. Not Stack β†’ No fixed deallocation time

Solution: Heap Memory

Memory Layout of a Process:
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  High Address
β”‚    Stack     β”‚  (grows downward ↓)
β”‚      ↓       β”‚
β”‚   (free)     β”‚
β”‚      ↑       β”‚
β”‚    Heap      β”‚  (grows upward ↑)
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚    Data      β”‚  (global variables)
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚    Text      β”‚  (instructions)
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  Low Address
```
| Region | Allocation                           | Deallocation                |
|--------|--------------------------------------|-----------------------------|
| Stack  | Automatic (function call)            | Automatic (function return) |
| Heap   | Explicit (malloc()) or implicit (GC) | Explicit (free()) or GC     |

⚠️ Common Pitfalls

Pitfall: Using stack for large dynamic data

Variable-Length Arrays (VLAs) on the stack are risky for large sizes β€” can cause stack overflow. Use malloc() for large/dynamic allocations.

Pitfall: Memory leaks

In C/C++, failing to free() heap memory means it’s never reclaimed until process termination.


❓ Mock Exam Questions

Q5: Heap vs Stack

Why can dynamically allocated memory (malloc()) not be placed in the Stack region?

Answer

Answer: Stack memory follows strict LIFO allocation β€” a function’s local data is freed when it returns. Heap data must persist beyond function boundaries (e.g., a function returns a pointer to allocated memory). Placing it on the stack would cause it to be freed prematurely.

Q6: Heap Fragmentation

What is heap fragmentation? Give a scenario that causes it.

Answer

Answer: Fragmentation occurs when free memory is scattered into small non-contiguous blocks. Example: Allocate 100 bytes, then 50 bytes. Free the 100-byte block. Now you have a 100-byte gap, but if you need 80 bytes and a separate 50-byte block exists elsewhere, the 100-byte gap might be unusable if allocation requires contiguous space.


3. Hardware Context

🧠 Core Concept

The hardware context captures the state of CPU registers. During a context switch, the OS must save the current hardware context (into the PCB) and restore the new process’s hardware context.

| Register                  | Role                                                    |
|---------------------------|---------------------------------------------------------|
| General Purpose Registers | Intermediate values, function arguments, return values  |
| Program Counter (PC)      | Address of next instruction to execute                  |
| Stack Pointer (SP)        | Top of stack                                            |
| Frame Pointer (FP)        | Fixed reference in current stack frame                  |
```mermaid
flowchart LR
    subgraph ContextSwitch ["Context Switch"]
        Save["Save current process<br/>PC, SP, FP, GPRs β†’ PCB"]
        Restore["Load next process<br/>PC, SP, FP, GPRs ← PCB"]
    end

    Save --> Restore

    style Save fill:#ffcc99
    style Restore fill:#99ff99
```

The Program Counter is critical: without saving/restoring it, the OS cannot resume a process at the correct instruction.


⚠️ Common Pitfalls

Pitfall: Non-atomic context save

The hardware context must be saved atomically. If the OS saves some registers but gets interrupted mid-save, the stored context will be inconsistent. Real systems disable interrupts during save/restore.


4. OS Context – Process State & ID

🧠 Core Concept

Process ID (PID)

Every process gets a unique numeric identifier β€” the Process ID (PID). This lets the OS distinguish processes.

The 5-State Process Model

```mermaid
flowchart LR
    New["New"]
    Ready["Ready"]
    Running["Running"]
    Blocked["Blocked"]
    Terminated["Terminated"]

    New -->|"admit"| Ready
    Ready -->|"schedule"| Running
    Running -->|"release CPU"| Ready
    Running -->|"event wait"| Blocked
    Blocked -->|"event occurs"| Ready
    Running -->|"exit"| Terminated

    style New fill:#e0e0e0
    style Ready fill:#99ff99
    style Running fill:#ffcc99
    style Blocked fill:#ff9999
    style Terminated fill:#cccccc
```
| State      | Meaning                                       |
|------------|-----------------------------------------------|
| New        | Process just created; still initializing      |
| Ready      | Process is ready to run but waiting for CPU   |
| Running    | Process is currently executing on the CPU     |
| Blocked    | Process is waiting for an event (I/O, etc.)   |
| Terminated | Process has finished; OS may need cleanup     |

Key Transitions:

| Transition           | Trigger                                      |
|----------------------|----------------------------------------------|
| New β†’ Ready          | Initialization complete, admitted by OS      |
| Ready β†’ Running      | Scheduler selects this process               |
| Running β†’ Ready      | Preempted by scheduler OR voluntarily yields |
| Running β†’ Blocked    | Requests unavailable resource (I/O wait)     |
| Blocked β†’ Ready      | Awaited event occurs                         |
| Running β†’ Terminated | Process calls exit() or crashes              |

Global Constraint

With N CPUs, at most N processes can be Running simultaneously. Multiple processes can be Ready or Blocked at the same time.


⚠️ Common Pitfalls

Pitfall: Ready β‰  Blocked

A Ready process has everything it needs to run β€” just waiting for a CPU slot. A Blocked process cannot run even if a CPU is free β€” it’s waiting for an external event.

Pitfall: Running β†’ Ready vs Running β†’ Blocked

  • Running β†’ Ready: process could run, just lost its CPU time slot
  • Running β†’ Blocked: process cannot run until something happens externally

❓ Mock Exam Questions

Q7: Ready vs Blocked

Describe the difference between Ready and Blocked states. Give a concrete example of a transition from Running β†’ Blocked and then Blocked β†’ Ready.

Answer

Answer:

  • Ready: Process can run immediately if given CPU; waiting only for scheduler.
  • Blocked: Process cannot run; waiting for external event (I/O, signal).

Example: Process calls read() from disk β†’ Running β†’ Blocked. When disk completes, interrupt handler moves it β†’ Blocked β†’ Ready.

Q8: Maximum States

On a single-CPU system with 10 processes, what is the maximum number in Running, Ready, and Blocked states?

Answer

Answer:

  • Running: at most 1 (only one CPU)
  • Ready: at most 9 (if 1 is Running, others could all be Ready)
  • Blocked: at most 10 (all could be waiting for I/O)

Q9: State Transition Trace

A process calls read() to read from disk. Trace the state transitions using the 5-state model until it resumes executing.

Answer

Answer:

  1. Process is Running, calls read()
  2. Running β†’ Blocked (waiting for disk I/O)
  3. Disk I/O completes, interrupt fires
  4. Blocked β†’ Ready (event occurred)
  5. Scheduler picks this process
  6. Ready β†’ Running (resumes execution)

5. Process Control Block & Process Table

🧠 Core Concept

The Process Control Block (PCB) is the kernel’s data structure storing all information about one process:

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Process Control Block      β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  PID                        β”‚  ← OS Context
β”‚  Process State              β”‚
β”‚  Resources Used             β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  PC, SP, FP, …              β”‚  ← Hardware Context
β”‚  General Purpose Registers  β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Memory Region Info         β”‚  ← Memory Context
β”‚  (pointers to regions)      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

The Process Table is the collection of all PCBs β€” one entry per process.

Context switch mechanism:

  1. Save current process’s hardware context β†’ its PCB
  2. Load next process’s hardware context ← its PCB
  3. CPU now runs the new process

⚠️ Common Pitfalls

Pitfall: PCB stores metadata, not data

The PCB stores pointers/metadata about memory regions. The actual stack and heap data live in the process’s own memory space, not inside the PCB.


6. OS Interaction with Processes

🧠 Core Concept

Processes interact with the OS in three ways:

| Type        | Initiated By                 | Timing       | Example                     |
|-------------|------------------------------|--------------|-----------------------------|
| System Call | Process (voluntary)          | Synchronous  | getpid(), read()            |
| Exception   | Hardware (instruction error) | Synchronous  | Division by zero, segfault  |
| Interrupt   | Hardware (external event)    | Asynchronous | Timer, keyboard             |
```mermaid
flowchart TD
    subgraph Sync ["Synchronous"]
        SC["System Call<br/>Process-initiated"]
        EX["Exception<br/>Instruction-triggered"]
    end

    subgraph Async ["Asynchronous"]
        INT["Interrupt<br/>External event"]
    end

    style Sync fill:#ffcc99
    style Async fill:#99ff99
```

6.1 System Calls

A system call is how a user process requests OS kernel services. It requires a mode switch (user β†’ kernel) via a TRAP instruction.

Mechanism:

```mermaid
flowchart LR
    subgraph User ["User Space"]
        U1["1. Call getpid()"]
        U2["2. Library puts syscall# in register"]
        U3["3. Execute TRAP"]
    end

    subgraph Kernel ["Kernel Space"]
        K1["4. Dispatcher finds handler"]
        K2["5. Handler executes"]
        K3["6. Return result"]
    end

    U3 -->|"mode switch"| K1
    K3 -->|"mode switch"| U1

    style User fill:#ffcc99
    style Kernel fill:#99ff99
```
| OS          | Number of System Calls |
|-------------|------------------------|
| Unix/POSIX  | ~100                   |
| Windows API | ~1000+                 |

6.2 Exceptions

An exception is triggered by instruction execution causing an error:

  • Arithmetic: overflow, division by zero
  • Memory: illegal address, misaligned access

Effect: CPU transfers control to an exception handler. After handling, execution may resume (or process terminates for fatal exceptions).


6.3 Interrupts

An interrupt is triggered by external hardware events, independent of program execution:

  • Timer interrupt (preemption)
  • I/O completion (disk read done)
  • Network packet arrived

Handler steps:

  1. Save register/CPU state
  2. Perform handler routine
  3. Restore register/CPU state
  4. Return from interrupt

⚠️ Common Pitfalls

Pitfall: System call β‰  Function call

A system call must switch privilege levels (user β†’ kernel) via TRAP. Cost is much higher than a regular function call.

Pitfall: Synchronous vs Asynchronous

  • Synchronous: tied to a specific instruction (system call, exception)
  • Asynchronous: can happen at any time (interrupt)

❓ Mock Exam Questions

Q10: TRAP Instruction

Explain the role of the TRAP instruction in the system call mechanism.

Answer

Answer: The TRAP instruction switches the CPU from user mode to kernel mode. This is necessary because user programs cannot directly access kernel memory or hardware. The TRAP triggers a controlled entry into the kernel, which then dispatches to the appropriate handler.

Q11: Exception vs Interrupt

What is the difference between an exception and an interrupt? Give one example of each.

Answer

Answer:

  • Exception: Synchronous, caused by instruction execution (e.g., division by zero, segmentation fault)
  • Interrupt: Asynchronous, caused by external hardware event (e.g., timer interrupt, keyboard input)

Q12: printf() and System Calls

When printf() is called in a C program, is it a system call? What system call does it ultimately invoke?

Answer

Answer: printf() is a library function, not a system call directly. It buffers output and eventually invokes the write() system call to actually output data to the file descriptor. The library provides a friendlier interface over the raw system call.


7. Summary

πŸ“Š Key Concepts

| Concept          | Definition                                        |
|------------------|---------------------------------------------------|
| Process          | Dynamic abstraction of an executing program       |
| Memory Context   | Text, Data, Stack, Heap regions                   |
| Hardware Context | PC, SP, FP, GPRs                                  |
| OS Context       | PID, State, Resources                             |
| Context Switch   | Saving one process’s state, restoring another’s   |
| System Call      | Synchronous, process-initiated kernel request     |
| Exception        | Synchronous, instruction-triggered error          |
| Interrupt        | Asynchronous, external hardware event             |

🎯 Decision Guide: Which Memory Region?

```mermaid
flowchart TD
    Start["Where does this data go?"]
    Start --> Q1{"Size known at compile time?"}
    Q1 -- "Yes" --> Q2{"Scope?"}
    Q1 -- "No" --> Q3{"Lifetime?"}

    Q2 -- "Global/static" --> Data["Data Segment"]
    Q2 -- "Local" --> Stack["Stack"]

    Q3 -- "Function lifetime" --> Stack
    Q3 -- "Arbitrary" --> Heap["Heap (malloc/free)"]

    style Data fill:#99ff99
    style Stack fill:#ffcc99
    style Heap fill:#ff9999
```

πŸ”— Connections

| πŸ“š Lecture | πŸ”— Connection                                              |
|-----------|------------------------------------------------------------|
| L1        | OS introduction β€” processes are the OS’s main abstraction  |
| L3        | Process scheduling β€” scheduler picks from Ready queue      |
| L4        | IPC β€” processes need to communicate                        |
| L5        | Threads β€” multiple threads share one process’s memory context |

The Key Insight

The process abstraction lets the OS manage complexity. By encapsulating execution state (hardware, memory, OS context), the OS can transparently switch between processes, providing the illusion of simultaneous execution.