🧵 L5: Process Alternative — Threads

Lecture Goal

Understand why threads were invented as a lighter alternative to processes, how they differ from processes in resource sharing, and master the POSIX thread (pthread) API.

The Big Picture

Threads are the dominant concurrency mechanism in modern computing. They solve the two fatal flaws of processes: expensive creation and hard communication. But shared memory is a double-edged sword — it enables easy communication but causes race conditions!


1. Why Threads? (Motivation)

🧠 The Core Problem

Processes are expensive to create and manage for two main reasons:

```mermaid
flowchart TD
    Problem["Process Model Problems"]
    Problem --> P1["High Creation Cost<br/>fork() duplicates entire memory"]
    Problem --> P2["Hard IPC<br/>Processes have independent memory"]
    P1 --> Solution["Threads!"]
    P2 --> Solution

    style Problem fill:#ffcc99
    style Solution fill:#99ff99
```

| Problem | Description |
|---|---|
| High creation cost | fork() duplicates the entire memory space and process context — wasteful when you just want multiple tasks within the same program |
| Hard IPC | Processes have independent memory spaces; passing data requires special mechanisms (pipes, shared memory, message queues) |

🍱 The Cooking Analogy

Imagine preparing lunch: steam rice 🍚, fry fish 🐟, cook soup 🍲.

Single-threaded approach: One cook does all three tasks sequentially:

```mermaid
flowchart LR
    A["steamRice()"] --> B["fryFish()"] --> C["cookSoup()"] --> D["Lunch Ready!"]

    style A fill:#ffcc99
    style B fill:#ffcc99
    style C fill:#ffcc99
    style D fill:#99ff99
```

Total time = sum of all three task durations.

Multi-threaded approach: Three threads run concurrently within the same process:

```mermaid
flowchart TD
    subgraph Threads
        T1["Thread 1: steamRice()"]
        T2["Thread 2: fryFish()"]
        T3["Thread 3: cookSoup()"]
    end
    T1 --> Wait["Wait for all"]
    T2 --> Wait
    T3 --> Wait
    Wait --> Done["Lunch Ready!"]

    style T1 fill:#99ff99
    style T2 fill:#99ff99
    style T3 fill:#99ff99
    style Done fill:#99ff99
```

Total time ≈ duration of the longest task (not the sum!).


⚠️ Common Pitfalls

Pitfall 1: Concurrency ≠ Parallelism

Concurrent threads can interleave on a single CPU (they take turns). True parallelism requires multiple CPUs running threads simultaneously.

Pitfall 2: fork() is NOT threading

Using fork() creates a separate process with its own memory copy — threads within the same process share memory directly.


❓ Mock Exam Questions

Q1: Process Model Problems

Name two reasons why the process model (using fork()) is considered expensive.

Answer

Answer:

  1. High creation cost — fork() duplicates entire memory space
  2. Hard inter-process communication — processes have independent memory spaces, requiring IPC mechanisms

Q2: Multi-threaded Speedup

In the cooking analogy, why is the multi-threaded version faster than the single-threaded version?

Answer

Answer: Multi-threaded runs tasks concurrently, so total time ≈ longest task duration. Single-threaded runs sequentially, so total time = sum of all durations.

Q3: fork() Communication

Why does fork() make it hard for processes to communicate?

Answer

Answer: fork() creates an independent copy of the parent’s memory. Changes in the child don’t affect the parent. Communicating requires explicit IPC (pipes, shared memory, message queues).


2. What is a Thread?

🧠 Core Concept

A thread is a unit of execution within a process. Traditionally, a process has a single thread of control — only one instruction executes at any point in time.

```mermaid
flowchart TD
    subgraph Process["Process"]
        Shared["Shared: Code, Data, Heap, Files"]
        subgraph Threads
            T1["Thread 1<br/>PC, Registers, Stack"]
            T2["Thread 2<br/>PC, Registers, Stack"]
            T3["Thread 3<br/>PC, Registers, Stack"]
        end
    end
    Shared --> Threads

    style Shared fill:#ffcc99
    style Threads fill:#99ff99
```

Each thread has its own Program Counter (PC) that tracks where it is in the code.


⚠️ Common Pitfalls

Pitfall 1: Thread ≠ Process

Multiple threads live inside one process and share its address space.

Pitfall 2: "At the same time" is conceptual

On a single-core CPU, threads take turns very quickly, giving the illusion of parallelism.


❓ Mock Exam Questions

Q4: Thread of Control

What does “thread of control” mean? What hardware register tracks it?

Answer

Answer: “Thread of control” means the sequence of instructions being executed. The Program Counter (PC) register tracks it.

Q5: Program Counters

If a process has 3 threads, how many Program Counters exist?

Answer

Answer: 3. Each thread has its own PC to track its execution position.


3. Process vs Thread

🧠 What Threads Share vs. Own

| Resource | Shared Between Threads? |
|---|---|
| Code (Text Segment) | ✅ Yes |
| Global Data (Data Segment) | ✅ Yes |
| Heap (dynamic memory) | ✅ Yes |
| Open Files / Process ID | ✅ Yes (OS Context) |
| Registers (GPR, PC, SP, FP) | ❌ No — each thread has its own |
| Stack | ❌ No — each thread has its own |
| Thread ID | ❌ No — unique to each thread |

Why Each Thread Needs Its Own Stack

Each thread calls functions independently, so it needs its own call stack to track local variables and return addresses.


🔄 Context Switch Comparison

```mermaid
flowchart LR
    subgraph ProcessSwitch["Process Context Switch (Heavy)"]
        PS1["Save OS Context"]
        PS2["Save Hardware Context"]
        PS3["Switch Page Tables<br/>← Expensive!"]
        PS4["Load new PCB"]
    end

    subgraph ThreadSwitch["Thread Context Switch (Light)"]
        TS1["Save Hardware Context"]
        TS2["Change SP/FP<br/>← Just registers!"]
    end

    style ProcessSwitch fill:#ff9999
    style ThreadSwitch fill:#99ff99
```

| Component | Process Switch | Thread Switch (same process) |
|---|---|---|
| General Purpose Registers | ✅ Save/restore | ✅ Save/restore |
| Program Counter (PC) | ✅ Save/restore | ✅ Save/restore |
| Stack Pointer (SP) | ✅ Save/restore | ✅ Save/restore |
| Page Table | ✅ Switch — expensive! | ❌ Not needed |
| OS Context (PCB, files) | ✅ Load new PCB | ❌ Not needed |
| TLB Flush | ✅ Usually required | ❌ Not required |

This is why threads are called lightweight processes.


📊 Benefits of Threads

| Benefit | Explanation |
|---|---|
| Economy | Less resources needed than managing multiple processes |
| Resource Sharing | Threads share memory — no IPC overhead |
| Responsiveness | UI thread stays responsive while worker threads compute |
| Scalability | Can utilize multiple CPU cores simultaneously |

⚠️ Common Pitfalls

Pitfall 1: Shared memory is dangerous!

Threads can read/write each other’s data, enabling easy communication but causing race conditions if not synchronized.

Pitfall 2: Stack vs. Heap confusion

Local variables (on stack) are private to each thread. Global variables and heap allocations are shared by all threads.


❓ Mock Exam Questions

Q6: Shared vs. Unique Resources

List three things threads share, and two things unique to each thread.

Answer

Answer:

  • Shared: Code, global data, heap, open files, process ID
  • Unique: Registers (PC, SP, FP), stack, thread ID

Q7: Context Switch Cost

Why is a thread context switch cheaper than a process context switch?

Answer

Answer: Thread switching avoids expensive memory context switch (no page table swap, no TLB flush). Only hardware context (registers) needs saving/restoring.

Q8: Race Condition

Thread A and Thread B each execute counter++ on a shared counter 1000 times. Is the final value guaranteed to be 2000?

Answer

Answer: No. counter++ is not atomic (LOAD → ADD → STORE). If both threads read the same value before either writes, one increment is lost. Requires mutex locks to fix.


4. Thread Models: User vs Kernel Threads

🧠 Two Implementation Approaches

```mermaid
flowchart TD
    Model["Thread Models"]
    Model --> User["User Threads"]
    Model --> Kernel["Kernel Threads"]

    User --> U1["Library-managed<br/>OS unaware"]
    Kernel --> K1["OS-managed<br/>System calls"]

    style User fill:#ffcc99
    style Kernel fill:#99ff99
```

👤 User Threads

Implemented entirely as a user-space library. The OS kernel is completely unaware that multiple threads exist — it only sees one process.

| Advantage | Disadvantage |
|---|---|
| Works on any OS | OS schedules at process level — one thread blocks → all block |
| Thread ops are library calls — very fast | Cannot exploit multiple CPUs |
| Highly configurable scheduling | |


🔧 Kernel Threads

Threads are implemented inside the OS. Thread operations are system calls.

| Advantage | Disadvantage |
|---|---|
| Kernel schedules per-thread | Every operation is a system call — slower |
| One thread blocking does not block others | Less flexible — one-size-fits-all |
| True multi-core parallelism | |

⚠️ Common Pitfalls

Pitfall 1: User threads aren't "bad"

They’re faster for thread operations and great for programs that don’t need true parallelism. But they fail hard if any thread makes a blocking system call.

Pitfall 2: Kernel threads are heavier

Every thread operation costs a system call, but kernel threads are essential when you need real multi-core parallelism.


❓ Mock Exam Questions

Q9: User Thread Blocking

In the user thread model, what happens if one thread makes a blocking system call?

Answer

Answer: The OS sees the entire process as blocked. All other threads in that process are unable to run, even if ready.

Q10: Kernel Thread Advantage

Why can kernel threads exploit multiple CPUs but user threads cannot?

Answer

Answer: The OS is aware of each kernel thread and can schedule them independently on different CPUs. User threads are invisible to the OS, which only sees one schedulable unit (the process).

Q11: Web Server Scenario

A web server uses user threads for 100 clients. One thread blocks on a database query. What happens?

Answer

Answer: All 99 other client threads are blocked too — the web server becomes unresponsive until the database query returns.


5. Hybrid Thread Model

🧠 Best of Both Worlds

The hybrid model supports both user threads and kernel threads:

```mermaid
flowchart TD
    UT["User-level threads (many)"]
    LWP["Light-weight Processes / LWPs"]
    KT["Kernel-level threads"]
    CPU["Physical CPUs"]

    UT <-->|"many-to-many"| LWP
    LWP <-->|"1-to-1"| KT
    KT <--> CPU

    style UT fill:#ffcc99
    style LWP fill:#99ff99
    style KT fill:#99ff99
```

Benefits:

  • Flexibility — tune how many kernel threads a process gets
  • Concurrency control — limit parallelism per process/user
  • Efficiency — cheap user-thread operations within kernel thread context

🔧 Hardware-Level Threading (SMT)

Modern CPUs use Simultaneous Multi-Threading (SMT), aka Hyperthreading:

| Feature | Description |
|---|---|
| Multiple register sets | One physical core has multiple logical cores |
| Shared execution units | Threads share ALU, cache, etc. |
| True parallelism | No software context switch needed |

SMT ≠ Multiple Cores

Hyperthreading makes one physical core look like two logical cores, but they share execution units — performance gains are workload-dependent.


❓ Mock Exam Questions

Q12: Hybrid Model Advantage

How does the hybrid model avoid the main weakness of pure user threads?

Answer

Answer: Because user threads are multiplexed onto multiple kernel threads, a blocking call blocks only the kernel thread carrying it. The kernel can still schedule the process's other kernel threads, so the remaining user threads continue to run.

Q13: SMT vs. Software Threading

What is the key difference between SMT and software-level multi-threading?

Answer

Answer: SMT is hardware-level — multiple threads run truly simultaneously on the same physical core with no context switch overhead. Software threading requires the OS to save/restore context.


6. POSIX Threads (pthread)

🛠️ The Standard API

pthreads is the standard thread API for Unix/Linux, defined by POSIX.

```c
#include <pthread.h>
```

Compile with:

```shell
gcc myprogram.c -lpthread
```

Key data types:

| Type | Purpose |
|---|---|
| pthread_t | Thread ID |
| pthread_attr_t | Thread attributes (priority, stack size, etc.) |

🚀 Thread Creation: pthread_create

```c
int pthread_create(
    pthread_t* tidCreated,            // Output: TID of new thread
    const pthread_attr_t* threadAttr, // Thread attributes (NULL = defaults)
    void* (*startRoutine)(void*),     // Function for thread to execute
    void* argForStartRoutine          // Argument passed to that function
);
```
  • Returns 0 on success, non-zero on error
  • New thread begins executing startRoutine(arg) immediately
  • Caller and new thread run concurrently!

🛑 Thread Termination: pthread_exit

```c
void pthread_exit(void* exitValue);
```
  • exitValue is captured by pthread_join
  • If not called explicitly, thread terminates when startRoutine returns

🔗 Thread Synchronization: pthread_join

```c
int pthread_join(pthread_t threadID, void** status);
```
  • Blocks caller until target thread terminates
  • status receives the exit value
  • Analogous to waitpid() for processes
```mermaid
flowchart LR
    Main["Main Thread"] --> Create["pthread_create()"]
    Create --> Child["Child Thread Runs"]
    Child --> Exit["pthread_exit()"]
    Main --> Join["pthread_join()"]
    Join --> Continue["Continue"]

    style Join fill:#ffcc99
    style Exit fill:#99ff99
```

📖 Code Examples

Example 1 — Basic Creation

```c
void* sayHello(void* arg) {
    printf("Just to say hello!\n");
    pthread_exit(NULL);
}

int main() {
    pthread_t tid;
    pthread_create(&tid, NULL, sayHello, NULL);
    printf("Thread created with tid %lu\n", (unsigned long)tid);
    return 0;  // ⚠️ PROBLEM: may exit before child prints!
}
```

Bug: No pthread_join

main() might exit before sayHello prints! The main thread doesn’t wait.

Fix:

```c
pthread_join(tid, NULL);  // Wait for child to finish
return 0;
```

Example 2 — Shared Memory Race Condition

```c
int globalVar;  // Shared between ALL threads

void* doSum(void* arg) {
    for (int i = 0; i < 1000; i++)
        globalVar++;  // All 5 threads modify the SAME globalVar
    return NULL;      // start routine must return a void*
}

int main() {
    pthread_t tid[5];
    for (int i = 0; i < 5; i++)
        pthread_create(&tid[i], NULL, doSum, NULL);

    // ⚠️ PROBLEM: prints before threads finish!
    printf("Global variable is %i\n", globalVar);
    return 0;
}
```

Two problems:

  1. No pthread_join → main prints before threads finish
  2. globalVar++ is not atomic → race condition

Fix:

```c
for (int i = 0; i < 5; i++)
    pthread_join(tid[i], NULL);

printf("Global variable is %i\n", globalVar);
// Note: Race condition on globalVar++ still exists!
// Requires mutex locks (covered later)
```

⚠️ Common Pitfalls

Pitfall 1: Forgetting pthread_join

Main thread may exit and kill all child threads before they finish.

Pitfall 2: pthread_exit in main vs return

return 0 in main may terminate all threads. pthread_exit(NULL) lets other threads finish.

Pitfall 3: globalVar++ is NOT atomic

It compiles to LOAD → ADD → STORE. Another thread can interrupt between steps.


❓ Mock Exam Questions

Q14: pthread_join Purpose

What is the role of pthread_join? What happens if you forget it?

Answer

Answer: It blocks the calling thread until the target thread terminates. Forgetting it means the main thread may exit before child threads finish, killing them prematurely.

Q15: pthread_exit in main

Why does pthread_exit(NULL) in main() behave differently from return 0?

Answer

Answer: return 0 exits the process, terminating all threads. pthread_exit(NULL) terminates only the main thread, allowing other threads to continue running.

Q16: Expected vs. Actual Result

5 threads each increment globalVar 1000 times. What’s expected? What’s actual?

Answer

Answer:

  • Expected: 5000
  • Actual: Unpredictable (less than 5000) due to race condition on non-atomic increment.

7. Summary

📊 Key Takeaways

| Concept | Key Point |
|---|---|
| Why threads? | Solve expensive process creation and hard IPC |
| Thread vs Process | Threads share memory; each has own PC, registers, stack |
| Context switch | Thread switch is cheap (no page table swap) |
| User threads | Fast but can't use multiple CPUs; one blocks → all block |
| Kernel threads | Slower syscalls but true parallelism |
| pthread API | pthread_create, pthread_join, pthread_exit |
| Race conditions | Shared variables need synchronization (mutex) |

🔗 Connections

| 📚 Lecture | 🔗 Connection |
|---|---|
| L6 | Process Synchronization — how to fix race conditions with mutex locks and semaphores |

The Key Insight

Threads give you concurrency with cheap context switching and easy shared memory. But with great power comes great responsibility — shared mutable state requires synchronization!