Advanced Lists and Collections

Introduction

Python's built-in data structures are powerful, but the standard library offers advanced tools that make complex operations simple. Think of these as specialized tools in a carpenter's workshop - each designed for specific tasks that would be cumbersome with basic tools.

In this lesson, we'll explore advanced list operations and the collections module, which provides high-performance container datatypes beyond the built-in ones.

List Comprehensions: Elegant List Creation

List comprehensions are a concise way to create lists from existing iterables. They're like a factory assembly line - taking raw materials and transforming them into finished products in one smooth operation.

Basic Syntax

# Traditional approach
squares = []
for x in range(10):
    squares.append(x**2)

# List comprehension
squares = [x**2 for x in range(10)]

With Filtering

# Get even numbers only
evens = [x for x in range(20) if x % 2 == 0]

# Filter and transform
words = ["hello", "world", "python", "programming"]
upper_words = [word.upper() for word in words if len(word) > 5]  # ['PYTHON', 'PROGRAMMING']

Nested Comprehensions

# Create a multiplication table
table = [[i*j for j in range(1, 6)] for i in range(1, 6)]

# Flatten a nested list
nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flattened = [item for sublist in nested for item in sublist]
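The order of the for clauses in a flattening comprehension can look backwards at first. The sketch below writes the same logic as explicit nested loops to show that the clauses appear in loop order, outermost first. The names flattened_loop and odd_items are illustrative only.

```python
nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Equivalent nested loops: the outer loop comes first, as in the comprehension
flattened_loop = []
for sublist in nested:
    for item in sublist:
        flattened_loop.append(item)

flattened = [item for sublist in nested for item in sublist]
print(flattened == flattened_loop)  # True
print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# A filter can follow the innermost for clause
odd_items = [item for sublist in nested for item in sublist if item % 2 == 1]
print(odd_items)  # [1, 3, 5, 7, 9]
```

If a comprehension needs more than two for clauses, plain loops are usually easier to read.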

Advanced List Operations

Slicing with Steps

numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Every other element
every_other = numbers[::2]  # [0, 2, 4, 6, 8]

# Reverse the list
reversed_list = numbers[::-1]  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# Every third element starting from index 1
pattern = numbers[1::3]  # [1, 4, 7]
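A few slicing details are easy to miss: a slice is a new list, out-of-range bounds are clamped, and a negative step walks from right to left. The following is a small sketch of those behaviours; the variable names are illustrative.

```python
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# A slice is a new list, so changing it leaves the original untouched
first_three = numbers[:3]
first_three[0] = 99
print(numbers[0])  # 0

# Out-of-range slice bounds are clamped instead of raising IndexError
tail = numbers[7:100]
print(tail)  # [7, 8, 9]

# With a negative step, start and stop run from right to left
backwards = numbers[8:2:-2]
print(backwards)  # [8, 6, 4]

# A slice object lets a pattern be named and reused
every_third_from_one = slice(1, None, 3)
print(numbers[every_third_from_one])  # [1, 4, 7]
```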

List Methods for Efficiency

# Extend vs Append
list1 = [1, 2, 3]
list2 = [4, 5, 6]

appended = list1.copy()
appended.append(list2)  # [1, 2, 3, [4, 5, 6]] - adds list2 as one nested element

list1.extend(list2)  # [1, 2, 3, 4, 5, 6] - adds each element of list2

# Remove vs Pop
items = ['apple', 'banana', 'cherry', 'date']

items.remove('banana')  # Removes the first occurrence: ['apple', 'cherry', 'date']
removed_item = items.pop(1)  # Removes and returns the item at index 1: 'cherry'
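Two edge cases of remove and pop are worth seeing in action: remove deletes only the first match and raises ValueError when the value is absent, while pop without an index takes the last item. A minimal sketch, with an illustrative list of fruit names:

```python
items = ['apple', 'banana', 'cherry', 'banana']

items.remove('banana')  # removes only the first match
print(items)  # ['apple', 'cherry', 'banana']

last = items.pop()  # no index: removes and returns the last item
print(last)  # banana

# remove() raises ValueError when the value is not in the list
try:
    items.remove('kiwi')
    removed = True
except ValueError:
    removed = False

print(removed)  # False
print(items)  # ['apple', 'cherry']
```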

Collections Module: Specialized Containers

The collections module provides alternatives to built-in containers that are optimized for specific use cases.

Counter: Frequency Counting

Counter is like a tally sheet - it automatically counts how many times each element appears.

from collections import Counter

# Count word frequencies
text = "the quick brown fox jumps over the lazy dog"
words = text.split()
word_counts = Counter(words)
print(word_counts)  # Counter({'the': 2, 'quick': 1, 'brown': 1, ...})

# Most common elements
print(word_counts.most_common(3))  # [('the', 2), ('quick', 1), ('brown', 1)]

# Count characters
char_counts = Counter("hello world")
print(char_counts)  # Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1})
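Counter also behaves like a multiset. The sketch below, using a made-up inventory, shows that missing keys read as 0, that update adds to existing counts, and that subtraction drops entries whose count falls to zero or below.

```python
from collections import Counter

inventory = Counter(apples=3, pears=1)

# Missing keys return 0 instead of raising KeyError
print(inventory['kiwis'])  # 0

# update() adds to existing counts rather than replacing them
inventory.update(['apples', 'kiwis'])
print(inventory['apples'])  # 4

# Subtraction keeps only positive counts
sold = Counter(apples=1, pears=1)
remaining = inventory - sold
print(remaining)  # Counter({'apples': 3, 'kiwis': 1})
```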

Defaultdict: Automatic Key Creation

defaultdict supplies a default value for any missing key by calling a factory function such as list or int, so you don't need to check whether a key exists before using it.

from collections import defaultdict

# Traditional approach with an explicit key check
word_groups = {}
for word in ["apple", "banana", "cherry", "date"]:
    key = word[0]  # First letter
    if key not in word_groups:
        word_groups[key] = []
    word_groups[key].append(word)

# With defaultdict
word_groups = defaultdict(list)
for word in ["apple", "banana", "cherry", "date"]:
    word_groups[word[0]].append(word)

print(dict(word_groups))  # {'a': ['apple'], 'b': ['banana'], 'c': ['cherry'], 'd': ['date']}

# Other default types
int_dict = defaultdict(int)
for item in ['a', 'b', 'a', 'c', 'b', 'a']:
    int_dict[item] += 1
print(dict(int_dict))  # {'a': 3, 'b': 2, 'c': 1}
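One side effect to keep in mind: reading a missing key with square brackets stores the default in the dictionary. The sketch below (names are illustrative) contrasts that with .get() and in, which leave the dictionary unchanged, and shows a lambda used as the factory for nested counting.

```python
from collections import defaultdict

scores = defaultdict(list)
scores['alice'].append(90)

# Reading a missing key with [] calls the factory and stores the result
_ = scores['bob']
print('bob' in scores)  # True
print(len(scores))  # 2

# .get() and `in` do not trigger the factory
print(scores.get('carol'))  # None
print('carol' in scores)  # False

# Any zero-argument callable works as the factory
nested_counts = defaultdict(lambda: defaultdict(int))
nested_counts['fruit']['apple'] += 1
print(nested_counts['fruit']['apple'])  # 1
```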

Namedtuple: Readable Tuples

namedtuple is a factory function that creates tuple subclasses with named fields, making code more readable and self-documenting.

from collections import namedtuple

# Define a Point
Point = namedtuple('Point', ['x', 'y'])
p1 = Point(3, 4)
p2 = Point(6, 8)

print(p1.x, p1.y)  # 3 4
print(p1)  # Point(x=3, y=4)

# Define a Person
Person = namedtuple('Person', 'name age city')
alice = Person('Alice', 30, 'New York')
print(alice.name)  # Alice
print(alice.age)   # 30

# Convert to dictionary
print(alice._asdict())  # {'name': 'Alice', 'age': 30, 'city': 'New York'} (a plain dict in Python 3.8+)
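Because a namedtuple is still a tuple, its fields cannot be reassigned. The sketch below shows the resulting AttributeError, the _replace method that returns a modified copy, and ordinary tuple unpacking and indexing.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(3, 4)

# Fields cannot be reassigned
try:
    p.x = 10
    mutated = True
except AttributeError:
    mutated = False
print(mutated)  # False

# _replace returns a new instance and leaves the original unchanged
moved = p._replace(x=10)
print(moved)  # Point(x=10, y=4)
print(p)  # Point(x=3, y=4)

# It is still a tuple: unpacking and indexing both work
x, y = p
print(x + y, p[0])  # 7 3
```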

Deque: Double-Ended Queue

deque supports O(1) appends and pops at both ends, which makes it a good fit for queues and stacks. A list, by contrast, needs O(n) time to insert or remove an item at the front.

from collections import deque

# As a queue (FIFO)
queue = deque()
queue.append('task1')
queue.append('task2')
queue.append('task3')

print(queue.popleft())  # 'task1'
print(queue.popleft())  # 'task2'

# As a stack (LIFO)
stack = deque()
stack.append('bottom')
stack.append('middle')
stack.append('top')

print(stack.pop())  # 'top'
print(stack.pop())  # 'middle'

# Efficient rotation
numbers = deque([1, 2, 3, 4, 5])
numbers.rotate(2)   # Rotate right by 2: [4, 5, 1, 2, 3]
numbers.rotate(-1)  # Rotate left by 1: [5, 1, 2, 3, 4]
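The left-hand counterparts of append and extend deserve a quick look, since extendleft adds items one at a time and so reverses their order. The sketch below also checks rotate against the equivalent list slicing; the variable names are illustrative.

```python
from collections import deque

d = deque([3, 4, 5])
d.appendleft(2)
print(d)  # deque([2, 3, 4, 5])

# extendleft adds items one by one, so they end up in reverse order
d.extendleft([1, 0])
print(d)  # deque([0, 1, 2, 3, 4, 5])

# rotate(n) on a deque matches this slicing on a list (for 0 < n < len)
values = [1, 2, 3, 4, 5]
n = 2
rotated_list = values[-n:] + values[:-n]
rotated_deque = deque(values)
rotated_deque.rotate(n)
print(list(rotated_deque) == rotated_list)  # True
```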

Real-World Applications

Text Analysis

from collections import Counter, defaultdict

def analyze_text(text):
    words = text.lower().split()
    
    # Word frequency
    word_freq = Counter(words)
    
    # Words by length
    by_length = defaultdict(list)
    for word in words:
        by_length[len(word)].append(word)
    
    # Most common words
    common = word_freq.most_common(5)
    
    return {
        'total_words': len(words),
        'unique_words': len(word_freq),
        'most_common': common,
        'by_length': dict(by_length)
    }

text = "Python is a powerful programming language for data science and web development"
result = analyze_text(text)
print(result)
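Splitting on whitespace leaves punctuation attached to words, so "fun." and "fun!" would be counted separately. One possible refinement is sketched below. The tokenize helper and the sample sentence are hypothetical additions, not part of analyze_text above.

```python
import string
from collections import Counter

def tokenize(text):
    """Lowercase the text and strip leading/trailing punctuation from each word."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(string.punctuation)
        if word:  # skip tokens that were only punctuation
            tokens.append(word)
    return tokens

sample = "Python is fun. Python, they say, is FUN!"

naive_counts = Counter(sample.lower().split())
clean_counts = Counter(tokenize(sample))

print(naive_counts['python'])  # 1 ('python,' is counted as a different word)
print(clean_counts['python'])  # 2
print(clean_counts['fun'])  # 2
```

This only trims punctuation at the edges of a word, so contractions such as "don't" stay intact.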

Recent Items Cache

from collections import deque

class RecentItems:
    def __init__(self, max_size=10):
        self.items = deque(maxlen=max_size)
    
    def add(self, item):
        self.items.append(item)
    
    def get_recent(self, count=None):
        if count is None:
            return list(self.items)
        if count <= 0:
            return []  # a [-0:] slice would return every item
        return list(self.items)[-count:]

recent = RecentItems(max_size=5)
for i in range(10):
    recent.add(f"item_{i}")

print(recent.get_recent())    # ['item_5', 'item_6', 'item_7', 'item_8', 'item_9']
print(recent.get_recent(3))   # ['item_7', 'item_8', 'item_9']
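For a large deque, copying every item into a list just to read the last few can be wasteful. The sketch below shows one alternative that visits only the requested items by iterating from the right. The last_n helper and the history variable are hypothetical names, not part of the RecentItems class.

```python
from collections import deque
from itertools import islice

def last_n(items, count):
    """Return the last `count` items of a deque, oldest first."""
    if count <= 0:
        return []
    # reversed() walks from the right end, so only `count` items are visited
    newest_first = list(islice(reversed(items), count))
    newest_first.reverse()
    return newest_first

history = deque(maxlen=5)
for i in range(10):
    history.append(f"item_{i}")  # older entries are discarded automatically

print(last_n(history, 3))  # ['item_7', 'item_8', 'item_9']
print(last_n(history, 0))  # []
print(len(last_n(history, 99)))  # 5
```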

Performance Considerations

Data Structure	Best For	Time Complexity
List	Random access, appending at the end	O(1) index access, amortized O(1) append, O(n) insert/delete elsewhere
Deque	End operations	O(1) append/pop at both ends, O(n) access in the middle
Counter	Frequency counting	O(1) average increment, O(n log n) most_common(), O(n log k) most_common(k)
Defaultdict	Grouping operations	O(1) average access with defaults
Namedtuple	Read-only structured data	O(1) access by name or index
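The difference between removing from the front of a list and the front of a deque can be measured directly. The sketch below uses timeit with an arbitrary size of 20,000 items. The helper names drain_list and drain_deque are illustrative, and the timings will vary by machine, so no expected output is shown.

```python
from collections import deque
from timeit import timeit

def drain_list(n):
    items = list(range(n))
    while items:
        items.pop(0)  # O(n): every remaining element shifts left

def drain_deque(n):
    items = deque(range(n))
    while items:
        items.popleft()  # O(1)

n = 20_000
list_seconds = timeit(lambda: drain_list(n), number=1)
deque_seconds = timeit(lambda: drain_deque(n), number=1)

print(f"list.pop(0):     {list_seconds:.4f}s")
print(f"deque.popleft(): {deque_seconds:.4f}s")
```

On typical hardware the deque version should finish noticeably faster, and the gap widens as n grows.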

Key Points to Remember

List comprehensions provide elegant, readable ways to create and transform lists
collections.Counter excels at frequency counting and finding the most common elements
collections.defaultdict eliminates the need for key existence checks
Namedtuple makes tuple data more readable and self-documenting
Deque is optimized for operations at both ends of sequences
Choose the right tool for each job to write efficient, maintainable code

These advanced list operations and collections tools form the foundation for more complex data structures. In the next lesson, we'll explore stacks and queues - fundamental data structures that power many algorithms and real-world applications.