- Understand why earliest finish time is the greedy choice for activity selection
- See how greedy choice property and optimal substructure guarantee optimality
- Learn to implement activity selection in O(n log n) time
Activity Selection Problem
You're organizing a conference room. People book it for different time slots:
- Meeting A: 9:00-10:00
- Meeting B: 9:30-11:00
- Meeting C: 10:00-11:30
- Meeting D: 11:00-12:00
- Meeting E: 10:30-13:00
How many meetings can you fit without overlaps? You can't attend both A and B (they overlap from 9:30-10:00). Which ones do you pick?
This is the activity selection problem. Given activities with start and end times, select the maximum number that don't overlap.
The Greedy Strategy
Intuition says: pick the shortest meetings first, or the ones that start earliest. Let's try.
Pick shortest first:
- Meeting A: 1 hour
- Meeting D: 1 hour
Can we fit both? A ends at 10:00, D starts at 11:00. Yes. Can we add more? Meeting C starts at 10:00, fits between A and D. That's three meetings: A, C, D.
Pick earliest finish time:
- Meeting A finishes at 10:00 (earliest)
- Next available: C finishes at 11:30 (starts at 10:00, no overlap with A)
- Wait, no. After A (ends 10:00), next activity starting at 10:00 or later with earliest finish? C finishes at 11:30, D finishes at 12:00. Pick C.
- After C (ends 11:30), next activity? Nothing starts at 11:30 or later in our list.
Actually, let me reconsider with the correct greedy strategy: always pick the activity that finishes earliest among those compatible with what you've already selected.
Start by sorting all activities by finish time:
A: 9:00-10:00
B: 9:30-11:00
C: 10:00-11:30
D: 11:00-12:00
E: 10:30-13:00
Pick A (finishes earliest at 10:00).
Next compatible activity (starts ≥ 10:00)?
- B starts at 9:30 (overlaps), skip
- C starts at 10:00 (compatible), finishes at 11:30
- D starts at 11:00 (compatible), finishes at 12:00
- E starts at 10:30 (compatible), finishes at 13:00
Which finishes earliest? C at 11:30. Pick C.
After C (ends 11:30), next compatible?
- D starts at 11:00 (overlaps with C), skip
- E starts at 10:30 (overlaps), skip
Done. Selected: A, C. Two meetings.
But wait, could we do better? Let me check all possibilities:
- A + C = 2
- A + D = 2 (A ends 10:00, D starts 11:00, compatible)
- D alone = 1
Actually, A and D don't overlap (A: 9:00-10:00, D: 11:00-12:00). And after picking A + D, can we fit C? C is 10:00-11:30, which overlaps with D. So A + D gives 2 meetings.
Hmm, the greedy strategy (earliest finish time) gave us A + C. But A + D also gives 2. Let me re-examine.
Actually, the greedy algorithm for activity selection is:
- Sort by finish time
- Pick the first activity
- Pick the next activity whose start time ≥ last finish time
Sorted by finish time:
A: 9:00-10:00 (finish 10:00)
B: 9:30-11:00 (finish 11:00)
C: 10:00-11:30 (finish 11:30)
D: 11:00-12:00 (finish 12:00)
E: 10:30-13:00 (finish 13:00)
- Pick A (finish 10:00)
- Check B: starts 9:30 < 10:00, skip
- Check C: starts 10:00 ≥ 10:00, pick (finish 11:30)
- Check D: starts 11:00 < 11:30, skip
- Check E: starts 10:30 < 11:30, skip
Result: A, C (2 activities)
Is this optimal? Let's verify by trying other combinations:
- A + D: A(9-10), D(11-12) = 2 activities ✓
- A + C: A(9-10), C(10-11:30) = 2 activities ✓
- Can we get 3? After A, we can pick C or D but not both. So max is 2.
Both are optimal. Greedy found one optimal solution.
Why Earliest Finish Time?
The intuition: finishing early frees up the most time for future activities.
Suppose we pick an activity that finishes late. It blocks a larger time window, potentially excluding multiple shorter activities. Picking the earliest finish gives maximum room for what comes next.
Proof sketch: If an optimal solution doesn't include the earliest-finishing activity, we can swap it in without creating new conflicts (since it finishes no later), proving earliest finish is always safe.
The Algorithm
def activity_selection(activities):
# Sort by finish time
activities.sort(key=lambda x: x[1])
selected = [activities[0]]
last_finish = activities[0][1]
for i in range(1, len(activities)):
start, finish = activities[i]
if start >= last_finish:
selected.append(activities[i])
last_finish = finish
return selected
For activities [(1,3), (2,4), (3,5), (4,6)]:
Sorted by finish: [(1,3), (2,4), (3,5), (4,6)]
Pick (1,3), last_finish = 3
Check (2,4): start 2 < 3, skip
Check (3,5): start 3 ≥ 3, pick, last_finish = 5
Check (4,6): start 4 < 5, skip
Selected: [(1,3), (3,5)]
Time: O(n log n) for sorting + O(n) for greedy selection = O(n log n).
A Harder Example
Activities:
A: [0, 6]
B: [1, 4]
C: [3, 5]
D: [5, 7]
E: [3, 9]
F: [5, 9]
G: [6, 10]
H: [8, 11]
I: [8, 12]
J: [2, 14]
Sorted by finish time:
B: [1, 4]
C: [3, 5]
A: [0, 6]
D: [5, 7]
E: [3, 9]
F: [5, 9]
G: [6, 10]
H: [8, 11]
I: [8, 12]
J: [2, 14]
Greedy picks:
- B (ends 4)
- C: starts 3 < 4, skip
- A: starts 0 < 4, skip
- D: starts 5 ≥ 4, pick (ends 7)
- E: starts 3 < 7, skip
- F: starts 5 < 7, skip
- G: starts 6 < 7, skip
- H: starts 8 ≥ 7, pick (ends 11)
- I: starts 8 < 11, skip
- J: starts 2 < 11, skip
Selected: B, D, H (3 activities)
Is this optimal? Let's check if we can get 4:
- To fit 4 activities in [0, 14], each would average 3.5 time units.
- B(1-4), D(5-7), H(8-11) leaves gaps: 0-1, 4-5, 7-8, 11-14.
- Can we fit an activity in 11-14? None available.
- What if we pick differently? Let's try A, D, H: A(0-6), D(5-7)—overlap! Can't do that.
Greedy found an optimal solution with 3 activities.
Why It Works
Greedy choice property: Picking the earliest-finishing activity is safe. You can always swap it into an optimal solution without making it worse.
Optimal substructure: After picking an activity, the remaining problem is the same: select max activities from those starting after the last finish time.
These properties guarantee greedy finds an optimal solution.
Variations
Weighted activities: Each activity has a value. Maximize total value, not count. Greedy fails here—you need dynamic programming.
Resource constraints: Multiple rooms/resources. Greedy can still work with modifications (assign to the resource that frees up earliest).
Intervals on a line: Instead of time intervals, consider intervals on a number line. Same algorithm applies.
Real Applications
Meeting scheduling: Booking conference rooms to maximize usage.
Job scheduling: On a single machine, schedule jobs to complete the most within a deadline.
Bandwidth allocation: Allocate time slots for data transmission to maximize throughput.
The activity selection problem is the foundation. Master it, and you understand the greedy paradigm for scheduling and resource allocation.
