- Understand what sets are and their unique properties
- Create and modify sets
- Perform set operations (union, intersection, difference)
- Know when to use sets over lists
Sets: Unique Collections
Imagine you're collecting stamps from around the world. You don't want duplicates – having two identical French stamps doesn't make your collection any richer. You just want one of each unique stamp. This is exactly what sets do in Python!
Sets are collections that automatically eliminate duplicates. They're also incredibly efficient at answering questions like "Is this item in my collection?" and performing mathematical operations like finding what's common between two groups.
What is a Set?
A set is an unordered collection of unique elements:
Set Characteristics
UNIQUE: No duplicates allowed
{1, 2, 2, 3} becomes {1, 2, 3}
UNORDERED: No guaranteed order
You can't access items by index
MUTABLE: Can add and remove items
(but items themselves must be immutable)
FAST MEMBERSHIP: O(1) lookup on average
"Is x in the set?" takes about the same time however large the set is
SET OPERATIONS: Union, intersection, difference
Mathematical operations built-in
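The characteristics above can be checked directly in the interpreter. The following is a minimal sketch; the names `numbers` and `points` are made up for illustration. It shows duplicates collapsing, indexing failing, and a mutable item being rejected (strictly, items must be hashable, which covers immutable built-ins such as numbers, strings, and tuples of those).

```python
# Duplicates collapse automatically
numbers = {1, 2, 2, 3}
print(len(numbers))  # 3

# Sets do not support indexing
try:
    numbers[0]
except TypeError as error:
    print(f"Indexing failed: {error}")

# Items must be hashable: a tuple of numbers works, a list does not
points = {(0, 0), (1, 2)}
try:
    points.add([3, 4])
except TypeError as error:
    print(f"Cannot add a list: {error}")
```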
Real-World Analogies
| Real-World | Why Use a Set |
|---|---|
| Unique words in a book | Count vocabulary, no duplicates |
| Members of a club | Each person appears once |
| Unique visitor IDs | Track who visited, not how many times |
| Tags on a blog post | Each tag used only once |
| Collected achievements | Each achievement earned once |
Creating Sets
Basic Set Creation
# Set with initial values
fruits = {"apple", "banana", "cherry"}
# From a list (removes duplicates!)
numbers = set([1, 2, 2, 3, 3, 3])
print(numbers) # {1, 2, 3}
# From a string (unique characters)
letters = set("hello")
print(letters) # e.g. {'h', 'e', 'l', 'o'} (only one 'l'; the order may vary)
# EMPTY SET - Be careful!
empty_set = set() # Correct
empty_dict = {} # This creates an empty DICTIONARY!
print(type(empty_set)) # <class 'set'>
print(type(empty_dict)) # <class 'dict'>
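`set()` is not limited to lists and strings; it accepts any iterable. As a small sketch (the variable names here are illustrative), the example below also uses a set comprehension, which builds a set from an expression in the same way a list comprehension builds a list.

```python
# set() accepts any iterable, not only lists
from_tuple = set((1, 2, 2, 3))
from_range = set(range(5))

# A set comprehension: lowercase every word, keep each result once
words = ["Apple", "apple", "BANANA", "banana"]
lowercase_words = {word.lower() for word in words}

print(from_tuple == {1, 2, 3})                 # True
print(from_range == {0, 1, 2, 3, 4})           # True
print(lowercase_words == {"apple", "banana"})  # True
```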
Visual Representation
Set vs List
List (allows duplicates):
1 2 2 3 3 → 5 items
Set (unique only):
1 2 3 → 3 unique items
Note: Sets don't have a specific order!
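The same comparison can be made in code. This is only a sketch with made-up data: comparing the length of a list with the length of its set is one simple way to tell whether the list contains duplicates.

```python
readings = [1, 2, 2, 3, 3]
unique_readings = set(readings)

print(len(readings))         # 5
print(len(unique_readings))  # 3

# Fewer unique items than total items means something was repeated
has_duplicates = len(unique_readings) < len(readings)
print(has_duplicates)        # True
```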
Modifying Sets
Adding Elements
fruits = {"apple", "banana"}
# add() - Add single element
fruits.add("cherry")
print(fruits) # {'apple', 'banana', 'cherry'}
# Adding duplicate does nothing (no error!)
fruits.add("apple")
print(fruits) # Still {'apple', 'banana', 'cherry'}
# update() - Add multiple elements
fruits.update(["date", "elderberry", "fig"])
print(fruits) # {'apple', 'banana', 'cherry', 'date', 'elderberry', 'fig'}
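One thing to watch with `update()`: it iterates over whatever it receives, so passing a bare string adds the individual characters. The sketch below (with illustrative names) shows the pitfall and two ways around it; `sorted()` is used only to make the printed order predictable.

```python
fruits = {"apple", "banana"}

# update() iterates over its argument, so a bare string is split into characters
letters_added = fruits.copy()
letters_added.update("fig")
print(sorted(letters_added))  # ['apple', 'banana', 'f', 'g', 'i']

# Use add() for one string, or wrap strings in a list for update()
fruits.add("fig")
fruits.update(["kiwi"], {"lime", "apple"})  # several iterables at once
print(sorted(fruits))  # ['apple', 'banana', 'fig', 'kiwi', 'lime']
```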
Removing Elements
fruits = {"apple", "banana", "cherry", "date"}
# remove() - Remove element (raises error if not found)
fruits.remove("cherry")
print(fruits) # {'apple', 'banana', 'date'}
# fruits.remove("grape") # KeyError!
# discard() - Remove element (no error if not found)
fruits.discard("banana")
print(fruits) # {'apple', 'date'}
fruits.discard("grape") # No error!
# pop() - Remove and return arbitrary element
item = fruits.pop()
print(f"Removed: {item}")
# clear() - Remove all elements
fruits.clear()
print(fruits) # set()
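Because `pop()` removes an arbitrary element, it is mostly useful for draining a set when the order does not matter. A minimal sketch, using a hypothetical `tasks` set, also shows that `pop()` on an empty set raises `KeyError`:

```python
tasks = {"write", "review"}

# Drain the set one item at a time; the removal order is arbitrary
processed = []
while tasks:
    processed.append(tasks.pop())
print(sorted(processed))  # ['review', 'write']

# pop() on an empty set raises KeyError, just like remove() on a missing item
try:
    tasks.pop()
except KeyError:
    print("Nothing left to pop")
```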
Set Operations
This is where sets truly shine! They support mathematical set operations:
Visual Guide to Set Operations
Set Operations Visualized
A = {1, 2, 3}    B = {3, 4, 5}
UNION (A | B)
All elements from both: everything in either circle
Result: {1, 2, 3, 4, 5}
INTERSECTION (A & B)
Only common elements: just the overlap
Result: {3}
DIFFERENCE (A - B)
In A but not in B: only A's own part
Result: {1, 2}
Set Operations in Code
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
# UNION: All unique elements from both
union = A | B # or A.union(B)
print(union) # {1, 2, 3, 4, 5, 6}
# INTERSECTION: Elements in both
intersection = A & B # or A.intersection(B)
print(intersection) # {3, 4}
# DIFFERENCE: In A but not in B
difference = A - B # or A.difference(B)
print(difference) # {1, 2}
# SYMMETRIC DIFFERENCE: In either, but not both
sym_diff = A ^ B # or A.symmetric_difference(B)
print(sym_diff) # {1, 2, 5, 6}
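The operator and method forms are not quite interchangeable. As a short sketch: the methods accept any iterable, while the operators require both sides to be sets. Each operator also has an in-place version that modifies the left-hand set.

```python
A = {1, 2, 3, 4}

# Methods accept any iterable
print(A.union([4, 5, 6]) == {1, 2, 3, 4, 5, 6})   # True
print(A.intersection(range(3, 10)) == {3, 4})      # True

# Operators require both operands to be sets
try:
    A | [4, 5, 6]
except TypeError as error:
    print(f"Operator failed: {error}")

# In-place versions modify the left-hand set
B = {3, 4, 5, 6}
B -= A  # same as B.difference_update(A)
print(B == {5, 6})  # True
```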
Summary Table
| Operation | Operator | Method | Meaning |
|---|---|---|---|
| Union | A \| B | A.union(B) | All from both |
| Intersection | A & B | A.intersection(B) | Common only |
| Difference | A - B | A.difference(B) | In A, not B |
| Symmetric Diff | A ^ B | A.symmetric_difference(B) | In one, not both |
Membership and Comparisons
Fast Membership Testing
numbers = {1, 2, 3, 4, 5}
# Very fast: O(1) time on average, regardless of set size
print(3 in numbers) # True
print(10 in numbers) # False
print(10 not in numbers) # True
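To get a feel for the difference, membership in a list and in a set can be timed with the standard `timeit` module. This is a rough sketch, not a rigorous benchmark: the collection size and repeat count are arbitrary choices, and the actual numbers depend on the machine.

```python
import timeit

size = 100_000
as_list = list(range(size))
as_set = set(as_list)
missing = -1  # worst case for the list: every element gets checked

# Run each membership test 100 times and report the total time
list_time = timeit.timeit(lambda: missing in as_list, number=100)
set_time = timeit.timeit(lambda: missing in as_set, number=100)

print(f"list: {list_time:.4f} s")
print(f"set:  {set_time:.6f} s")
```

The list has to scan its elements one by one, while the set jumps straight to where the value would be stored, so the gap widens as the collection grows.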
Set Comparisons
A = {1, 2, 3}
B = {1, 2, 3, 4, 5}
C = {1, 2, 3}
# Subset: Is A contained in B?
print(A <= B) # True (A is a subset of B)
print(A < B) # True (A is a proper subset of B)
# Superset: Does B contain A?
print(B >= A) # True (B is a superset of A)
# Equality
print(A == C) # True (same elements)
# Disjoint: No common elements?
D = {6, 7, 8}
print(A.isdisjoint(D)) # True (no overlap)
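Two details are easy to miss with these comparisons, sketched below with illustrative sets: equal sets are subsets of each other but never proper subsets, and sets that merely overlap are neither subset nor superset. The method forms `issubset()` and `issuperset()` also accept any iterable.

```python
A = {1, 2, 3}
C = {1, 2, 3}

# Equal sets are subsets of each other, but never proper subsets
print(A <= C)  # True
print(A < C)   # False

# Method forms accept any iterable
print(A.issubset([1, 2, 3, 4]))  # True
print(A.issuperset((1, 2)))      # True

# Sets that merely overlap are neither subset nor superset
E = {3, 4}
print(A <= E, A >= E)  # False False
```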
Practical Examples
Example 1: Finding Unique Items
# Remove duplicates from a list (the original order is not preserved)
shopping_list = ["apple", "milk", "bread", "apple", "milk", "eggs", "bread"]
unique_items = list(set(shopping_list))
print(f"Unique items: {unique_items}")
print(f"Reduced from {len(shopping_list)} to {len(unique_items)} items")
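Converting through a set does not promise to keep the items in their first-seen order. If the order matters, one common alternative (a different technique from the set-based one above) is `dict.fromkeys()`, which relies on dictionaries keeping insertion order in Python 3.7 and later. A minimal sketch:

```python
shopping_list = ["apple", "milk", "bread", "apple", "milk", "eggs", "bread"]

# dict keys are unique and keep insertion order (Python 3.7+)
unique_in_order = list(dict.fromkeys(shopping_list))
print(unique_in_order)  # ['apple', 'milk', 'bread', 'eggs']

# sorted() gives a predictable order if first-seen order is not needed
print(sorted(set(shopping_list)))  # ['apple', 'bread', 'eggs', 'milk']
```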
Example 2: Comparing Skills
# Job candidate matching
job_requires = {"Python", "SQL", "Git", "Docker", "AWS"}
candidate_has = {"Python", "SQL", "Git", "JavaScript", "React"}
# What does the candidate have that matches?
matching = job_requires & candidate_has
print(f"Matching skills: {matching}")
# What is the candidate missing?
missing = job_requires - candidate_has
print(f"Missing skills: {missing}")
# Extra skills the candidate has
extra = candidate_has - job_requires
print(f"Extra skills: {extra}")
# Match percentage
match_percent = len(matching) / len(job_requires) * 100
print(f"Match: {match_percent:.0f}%")
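The same comparison can be wrapped in a reusable function. The sketch below is one possible shape, not a fixed recipe: `skill_report` is a hypothetical helper name, and treating an empty requirements list as a 100% match is an assumption made here to avoid dividing by zero.

```python
def skill_report(required, offered):
    """Compare two collections of skills and summarise the overlap."""
    required, offered = set(required), set(offered)
    matching = required & offered
    # Assumption: no requirements counts as a full match (avoids ZeroDivisionError)
    percent = len(matching) / len(required) * 100 if required else 100.0
    return {
        "matching": sorted(matching),
        "missing": sorted(required - offered),
        "extra": sorted(offered - required),
        "percent": percent,
    }

report = skill_report(
    ["Python", "SQL", "Git", "Docker", "AWS"],
    ["Python", "SQL", "Git", "JavaScript", "React"],
)
print(report["matching"])           # ['Git', 'Python', 'SQL']
print(report["missing"])            # ['AWS', 'Docker']
print(f"{report['percent']:.0f}%")  # 60%
```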
Example 3: Finding Common Friends
# Social network: common connections
alice_friends = {"Bob", "Carol", "David", "Eve"}
bob_friends = {"Alice", "Carol", "Frank", "Eve"}
carol_friends = {"Alice", "Bob", "Grace", "Eve"}
# Friends that Alice and Bob have in common
common_ab = alice_friends & bob_friends
print(f"Alice & Bob's common friends: {common_ab}")
# Friends that all three know
all_common = alice_friends & bob_friends & carol_friends
print(f"Known by all three: {all_common}")
# All unique people in the network
everyone = alice_friends | bob_friends | carol_friends
print(f"Total people in network: {len(everyone)}")
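Chaining `&` and `|` works well for three sets, but becomes awkward when the number of sets is not known in advance. A sketch of one alternative, assuming the friend lists are kept in a dictionary (the `friend_lists` name is illustrative): `set.intersection()` and `set.union()` can take any number of sets when they are unpacked with `*`.

```python
friend_lists = {
    "Alice": {"Bob", "Carol", "David", "Eve"},
    "Bob": {"Alice", "Carol", "Frank", "Eve"},
    "Carol": {"Alice", "Bob", "Grace", "Eve"},
}

# Unpack all the sets into a single call
known_by_all = set.intersection(*friend_lists.values())
everyone = set.union(*friend_lists.values())

print(known_by_all)   # {'Eve'}
print(len(everyone))  # 7
```

Note that this form needs at least one set; unpacking an empty collection raises a `TypeError`.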
Example 4: Tracking Website Visitors
# Daily unique visitors
monday_visitors = {"user1", "user2", "user3", "user4"}
tuesday_visitors = {"user2", "user3", "user5", "user6"}
wednesday_visitors = {"user3", "user4", "user6", "user7"}
# New visitors on Tuesday (not seen Monday)
new_tuesday = tuesday_visitors - monday_visitors
print(f"New visitors Tuesday: {new_tuesday}")
# Loyal visitors (visited all 3 days)
loyal = monday_visitors & tuesday_visitors & wednesday_visitors
print(f"Visited all 3 days: {loyal}")
# Total unique visitors over 3 days
total_unique = monday_visitors | tuesday_visitors | wednesday_visitors
print(f"Total unique visitors: {len(total_unique)}")
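The "new on Tuesday" calculation compares only two days. To find first-time visitors for every day, one approach is to keep a running set of everyone seen so far. This is a sketch under the assumption that the daily sets arrive in chronological order; the names `daily_visitors` and `seen_so_far` are made up for the example.

```python
daily_visitors = [
    ("Monday", {"user1", "user2", "user3", "user4"}),
    ("Tuesday", {"user2", "user3", "user5", "user6"}),
    ("Wednesday", {"user3", "user4", "user6", "user7"}),
]

seen_so_far = set()
new_per_day = {}
for day, visitors in daily_visitors:
    new_per_day[day] = visitors - seen_so_far  # first-time visitors only
    seen_so_far |= visitors                    # remember them for later days

for day, newcomers in new_per_day.items():
    print(f"{day}: {sorted(newcomers)}")
# Monday: ['user1', 'user2', 'user3', 'user4']
# Tuesday: ['user5', 'user6']
# Wednesday: ['user7']
```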
When to Use Sets
Decision Guide: When to Use Sets
USE SETS when:
• You need only unique values
• You need fast membership testing ("is x in collection?")
• Performing set operations (union, intersection)
• Removing duplicates from a list
• Order doesn't matter
USE LISTS instead when:
• Order matters
• You need duplicates
• You need to access items by index
USE DICTIONARIES instead when:
• You need key-value pairs
• Items have associated data
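To make the guide concrete, here is the same data held in each of the three structures. It is only an illustrative sketch with made-up tag data; which structure fits depends on the question being asked.

```python
tags = ["python", "sets", "python", "tutorial", "sets", "python"]

# List: keeps order and duplicates, supports indexing
print(tags[0])    # python
print(len(tags))  # 6

# Set: unique values and fast membership tests
unique_tags = set(tags)
print("sets" in unique_tags)  # True
print(len(unique_tags))       # 3

# Dictionary: each tag has associated data (here, how often it appears)
tag_counts = {}
for tag in tags:
    tag_counts[tag] = tag_counts.get(tag, 0) + 1
print(tag_counts["python"])   # 3
```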
Key Takeaways
Remember These Points
Sets store unique elements only: {1, 2, 3}
Sets are unordered - no indexing!
Empty set: set(), NOT {} (that's a dict!)
Add: add() for one, update() for many
Remove: discard() (safe) or remove() (raises error)
Operations:
• Union: A | B (all elements)
• Intersection: A & B (common elements)
• Difference: A - B (in A but not B)
Membership test (x in set) is O(1) on average - much faster than searching a list!
Module Complete!
Congratulations! You've completed the Data Structures module!
You now understand Python's four core data structures:
- Lists: Ordered, mutable, for general collections
- Tuples: Ordered, immutable, for fixed data
- Dictionaries: Key-value pairs, for labeled data
- Sets: Unique elements only, for membership and set operations
Each has its strengths – choosing the right one makes your code cleaner, faster, and easier to understand. In the next module, you'll learn about Functions – how to write reusable blocks of code!
