Final Review: Problem Set 1

Problem 1: Function Logger Decorator with Generators

Given the following Python code:

from functools import wraps
 
def function_logger(func):
    counter = 0
 
    @wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal counter
 
        counter += 1
        result = func(*args, **kwargs)
 
        print(f"The function {func.__name__} has run {counter} times")
 
        return result
    return wrapper
 
 
@function_logger
def count_generator():
    i = 0
    while True:
        yield i
        i += 1
 
 
@function_logger
def do_nothing() -> None:
    pass
 
 
gen = count_generator()
for _ in range(5):
    print(next(gen))
 
for _ in range(5):
    do_nothing()

(a) How many times will the output The function count_generator has run n times be printed?

Solution

The output The function count_generator has run n times will be printed 1 time.

Explanation: When you decorate count_generator with @function_logger, the decorator wraps the function. Calling count_generator() invokes the wrapper exactly once: the counter is incremented to 1, func() is called (which builds a generator object without executing the generator's body), and “The function count_generator has run 1 times” is printed before the generator object is returned. The subsequent calls to next(gen) resume that generator object directly; they never go through the wrapper again.
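
For reference, here is the full console output of the first loop; the single log line appears at the moment count_generator() is called, before any values are printed:

The function count_generator has run 1 times
0
1
2
3
4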

(b) How many times will the output The function do_nothing has run n times be printed?

Solution

The output The function do_nothing has run n times will be printed 5 times.

Explanation: Each call to do_nothing() in the loop invokes the wrapper function, incrementing the counter. So:

  • First call: counter = 1, prints “The function do_nothing has run 1 times”
  • Second call: counter = 2, prints “The function do_nothing has run 2 times”
  • Third call: counter = 3, prints “The function do_nothing has run 3 times”
  • Fourth call: counter = 4, prints “The function do_nothing has run 4 times”
  • Fifth call: counter = 5, prints “The function do_nothing has run 5 times”

The key difference is that count_generator is called once and returns a generator object, while do_nothing is called 5 times in the loop.

Problem 2: Fibonacci Coroutine with Send

Write a coroutine function that yields Fibonacci numbers indefinitely, but allows resetting its state via send(). The coroutine will receive any integer value as a sentinel for resetting its state.

fib_gen = fibonacci_generator()
 
# Generate some numbers
print(next(fib_gen))  # 0
print(next(fib_gen))  # 1
print(next(fib_gen))  # 1
print(next(fib_gen))  # 2
print(next(fib_gen))  # 3
 
# Reset the sequence
print(fib_gen.send(0))  # 0
 
# Start over
print(next(fib_gen))  # 1
print(next(fib_gen))  # 1
Solution
def fibonacci_generator():
    a, b = 0, 1
    while True:
        reset = yield a
        if reset is not None:  # i.e. a value was sent, be it 0 or otherwise
            a, b = 0, 1
        else:
            a, b = b, a + b

Explanation: The coroutine uses yield as an expression so that it can receive data via send(). When send(value) is called, the paused yield expression evaluates to that value, which is assigned to reset. If the sent value is not None, the Fibonacci state is reset to the beginning; otherwise (a plain next() call, where reset is None), the sequence advances to the next Fibonacci number.
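
A short usage sketch (fib is just an illustrative variable name): a generator must first be advanced to its initial yield with next() before send() can deliver a non-None value; doing so on a just-created generator raises a TypeError.

fib = fibonacci_generator()
# fib.send(1) here would raise TypeError: the generator has not reached its first yield yet
print(next(fib))     # 0 -- advances execution to `yield a`
print(fib.send(42))  # 0 -- any non-None value triggers the reset branch
print(next(fib))     # 1 -- the sequence starts over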

Problem 3: Decorator Replacing Generator

What will be the output of the following code?

def decorator(func):
    def inner():
        return (x for x in range(5))
 
    return inner
 
@decorator
def gen():
    for x in range(3):
        yield x
 
print(sum(gen()))
Solution

Output:

10

Explanation: A decorator normally wraps a function, adding behavior before and/or after the call, and its inner function calls the original function (the func parameter above) at some point. Here, however, func is never called: the decorator completely replaces gen with a new function, inner, that returns the generator expression (x for x in range(5)).

When gen() is called, it actually calls inner(), which returns a generator that yields numbers from 0 to 4. The sum() function then adds these numbers together: 0 + 1 + 2 + 3 + 4 = 10.
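
For contrast, a conventional version of the decorator would delegate to func; with the sketch below (same names, hypothetical replacement), gen() would yield 0, 1, 2 and sum(gen()) would be 3.

def decorator(func):
    def inner():
        return func()   # delegate to the original generator function
    return inner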

Problem 4: Nested Generator Expression Sum

Determine the output of this nested generator comprehension:

print(sum(x for x in (y for y in range(3))))
Solution

Output:

3

Explanation: The inner generator (y for y in range(3)) produces 0, 1, 2. The outer generator (x for x in ...) iterates over these values. The sum() function adds them: 0 + 1 + 2 = 3.

Problem 5: Generator Comprehension with Tuples

Is there a syntax error in this code? If not, determine the output:

gen = ((x * y,) for x in range(2) for y in range(2))
print(list(gen))
Solution

No syntax error. A single-element tuple must have a trailing comma; without it, the expression would not be recognized as a tuple. Output:

[(0,), (0,), (0,), (1,)]

Explanation: The generator expression creates tuples containing the product of x and y. It iterates through:

  • x=0, y=0: (0*0,) = (0,)
  • x=0, y=1: (0*1,) = (0,)
  • x=1, y=0: (1*0,) = (0,)
  • x=1, y=1: (1*1,) = (1,)

Note: The trailing comma is what creates the tuple; the parentheses around x * y, merely group it. Without them, the code would be a syntax error, because a bare comma is ambiguous inside a generator expression.

Problem 6: Multiple Decorators Order

Predict the output of the following code involving multiple decorators:

def dec1(func):
    def wrapper(x):
        return func(x) * 10
    return wrapper
 
 
def dec2(func):
    def wrapper(x):
        return -func(x)
    return wrapper
 
 
def dec3(func):
    def wrapper(x):
        return func(x) + 5
    return wrapper
 
 
@dec1
@dec2
@dec3
def number(n):
    return n
 
 
print(number(5))
Solution

Output:

-100

Explanation: Decorators are applied bottom-to-top, so number is wrapped by dec3 first, then dec2, then dec1. When number(5) is called, the call passes through dec1's wrapper, then dec2's, then dec3's, down to the original function, and the results compose on the way back out:

  1. The original number(5) returns 5
  2. The dec3 wrapper returns func(x) + 5 = 5 + 5 = 10
  3. The dec2 wrapper returns -func(x) = -(10) = -10
  4. The dec1 wrapper returns func(x) * 10 = -10 * 10 = -100

The final result is -100.
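
The stacked decorators are equivalent to composing them by hand, bottom-to-top (plain_number is an illustrative name):

def plain_number(n):
    return n

composed = dec1(dec2(dec3(plain_number)))
print(composed(5))  # -100, same as the decorated number(5)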

Problem 7: Nested Generator Expression Trace

Trace the following nested generator expression. What is the output of the print statements?

gen = ((x, y, z) for x in 'abc' for y in '123' for z in [True, False])
 
print(next(gen))
print(next(gen))
print(next(gen))
Solution

Output:

('a', '1', True)
('a', '1', False)
('a', '2', True)

Explanation: The nested generator creates tuples by iterating through x, then y, then z. The innermost loop (z) completes first:

  • First: x=‘a’, y=‘1’, z=True → (‘a’, ‘1’, True)
  • Second: x=‘a’, y=‘1’, z=False → (‘a’, ‘1’, False)
  • Third: x=‘a’, y=‘2’, z=True → (‘a’, ‘2’, True)

The order is depth-first with the rightmost loop varying fastest.
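
The same ordering falls out of the equivalent nested loops, with the rightmost clause as the innermost loop (gen_equivalent is an illustrative name):

def gen_equivalent():
    for x in 'abc':
        for y in '123':
            for z in [True, False]:
                yield (x, y, z)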

Problem 8: Dictionary Copy Behavior

What is the output of the following code? If we want the output to be the opposite, what do we have to change about the code?

data = {
    "name": "General Kenobi",
    "lines": {
        "hello there": 1,
        "this is where the fun begins": 0,
    }
}
 
new_data = data.copy()
new_data["lines"]["hello there"] = 2
 
print(data == new_data)
Solution

Output:

True

Explanation: This demonstrates the shallow copy pitfall. data.copy() creates a shallow copy, meaning it copies the top-level dictionary but not the nested dictionaries. Both data and new_data share the same reference to the nested “lines” dictionary. When you modify new_data["lines"]["hello there"], you’re modifying the shared nested dictionary, so data is also affected. Therefore, data == new_data is True.
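
A quick way to confirm the sharing, reusing the data dictionary from the problem:

new_data = data.copy()
print(new_data is data)                    # False -- the outer dict is a new object
print(new_data["lines"] is data["lines"])  # True  -- the nested dict is shared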

To get the opposite output (False), use a deep copy:

import copy
 
new_data = copy.deepcopy(data)
new_data["lines"]["hello there"] = 2
 
print(data == new_data)  # Output: False

A deep copy recursively copies all nested objects, so modifications to new_data don’t affect data.

Problem 9: Dictionary Unpacking

What is the output?

d1 = {'a': 1, 'b': 2}
d2 = {'b': 3, 'c': 4}
 
d2 = {**d1, **d2}
print(d2)
Solution

Output:

{'a': 1, 'b': 3, 'c': 4}

Explanation: Dictionary unpacking (**) is applied left to right. First, d1 is unpacked giving {'a': 1, 'b': 2}. Then, d2 (the original {'b': 3, 'c': 4}) is unpacked. When there are duplicate keys, the rightmost value wins. So 'b' has value 3 (from the second unpacking), not 2. The result is {'a': 1, 'b': 3, 'c': 4}.
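
Swapping the unpacking order (starting again from the original d1 and d2, before the reassignment) shows the same rule from the other side: now 'b' takes its value from d1.

d1 = {'a': 1, 'b': 2}
d2 = {'b': 3, 'c': 4}
print({**d2, **d1})  # {'b': 2, 'c': 4, 'a': 1}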

Problem 10: Memoization Function

Write a function create_memo that accepts a function reference func as an argument. create_memo should return a new function that caches the results of func: the first time it is called with a given argument it calls func and stores the result, and subsequent calls with the same argument return the cached result without calling func again.

Example Usage:

def expensive_calculation(x):
    print(f"Calculating for {x}...")
    return x * x
 
calc = create_memo(expensive_calculation)
print(calc(5))  # Output: Calculating for 5... 25
print(calc(5))  # Output: 25 (cached)
print(calc(7))  # Output: Calculating for 7... 49
print(calc(5))  # Output: 25 (cached)
print(calc(7))  # Output: 49 (cached)
Solution
def create_memo(func):
    cache = {}
 
    def memoized(x):
        if x not in cache:
            cache[x] = func(x)
        return cache[x]
 
    return memoized

Explanation: create_memo creates a cache dictionary and returns a memoized wrapper. When the wrapper is called, it uses the argument x as the cache key. If the key is already in the cache, the cached result is returned; otherwise the original function is called, its result is stored in the cache, and then returned. This avoids redundant calculations for repeated inputs.
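
A sketch of the same pattern generalized to any number of positional arguments, using the args tuple as the cache key (create_memo_multi is an illustrative name; the standard library provides similar behavior via functools.lru_cache):

def create_memo_multi(func):
    cache = {}

    def memoized(*args):
        # a tuple of hashable arguments can serve directly as a dictionary key
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return memoized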

Problem 11: File Processing with Line Filtering

In this problem, you will write functions to process text files.

Assume you have access to a global variable PUNCTUATION that is a string containing all possible punctuation characters. If you are asked to analyze individual words, make sure to normalize the words by stripping the punctuation from the ends of words.

(a) Write a function that opens the file, reads all lines, and prints only the lines that contain more than 30 characters.

Solution

Print lines with more than 30 characters

def print_long_lines(filename):
    with open(filename, 'r') as fd:
        for line in fd.read().splitlines():
            if len(line) > 30:
                print(line)
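
An equivalent streaming version (print_long_lines_streaming is an illustrative name): iterating the file object directly avoids reading the whole file into memory, but each line keeps its trailing newline, so it must be stripped before measuring the length.

def print_long_lines_streaming(filename):
    with open(filename, 'r') as fd:
        for line in fd:
            stripped = line.rstrip('\n')
            if len(stripped) > 30:
                print(stripped)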

(b) Modify your function to return a list of all words (separated by whitespace) that appear more than once in the file.

Solution

Return words that appear more than once

def get_repeated_words(filename):
    word_count = {}
 
    with open(filename, 'r') as fd:
        for line in fd:
            words = line.split()
            for word in words:
                # Strip punctuation from ends
                clean_word = word.strip(PUNCTUATION)
                if clean_word:
                    word_count[clean_word] = word_count.get(clean_word, 0) + 1
 
    return [word for word, count in word_count.items() if count > 1]

(c) Write a function that counts the number of lines that start with a vowel (a, e, i, o, u), case-insensitive.

Solution

Count lines starting with a vowel

def count_vowel_start_lines(filename):
    vowels = 'aeiou'
    count = 0
 
    with open(filename, 'r') as fd:
        for line in fd:
            if line and line[0].lower() in vowels:
                count += 1
    return count

Problem 12: Unique Words and Text Processing

In this problem, you will write functions to process text files.

Assume you have access to a global variable PUNCTUATION that is a string containing all possible punctuation characters. If you are asked to analyze individual words, make sure to normalize the words by stripping the punctuation from the ends of words.

(a) Write a function that takes a filename and returns the number of unique words in the file (case-insensitive).

Solution

Count unique words (case-insensitive, strip punctuation from ends)

def count_unique_words(filename):
    unique_words = set()
 
    with open(filename, 'r') as fd:
        for line in fd:
            words = line.lower().split()
            for word in words:
                # Strip punctuation from ends
                clean_word = word.strip(PUNCTUATION)
                if clean_word:
                    unique_words.add(clean_word)
 
    return len(unique_words)

(b) Modify your function to also return the 5 most common words and their counts.

Solution

Return unique count and 5 most common words (using a dictionary)

def count_unique_words_with_top(filename):
    word_count = {}
 
    with open(filename, 'r') as fd:
        for line in fd:
            words = line.lower().split()
            for word in words:
                clean_word = word.strip(PUNCTUATION)
                if clean_word:
                    word_count[clean_word] = word_count.get(clean_word, 0) + 1
 
    unique_count = len(word_count)
    sorted_by_decreasing_word_count = sorted(word_count.items(), key=lambda x: x[1], reverse=True)
 
    return unique_count, sorted_by_decreasing_word_count[:5]

Alternative using defaultdict:

from collections import defaultdict
 
def count_unique_words_with_top(filename):
    # defaultdict(lambda: 0) would also work, but defaultdict(int) is the idiomatic form:
    # calling int() with no arguments returns 0, so `int` serves as the default factory
    # for missing keys.
    word_count = defaultdict(int)
 
    with open(filename, 'r') as fd:
        for line in fd:
            words = line.lower().split()
            for word in words:
                clean_word = word.strip(PUNCTUATION)
                if clean_word:
                    word_count[clean_word] += 1
 
    unique_count = len(word_count)
    sorted_by_decreasing_word_count = sorted(word_count.items(), key=lambda x: x[1], reverse=True)
 
    return unique_count, sorted_by_decreasing_word_count[:5]
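
A further alternative, assuming collections.Counter is allowed (count_unique_words_with_top_counter is an illustrative name): Counter.most_common(5) replaces the manual sort.

from collections import Counter

def count_unique_words_with_top_counter(filename):
    word_count = Counter()

    with open(filename, 'r') as fd:
        for line in fd:
            for word in line.lower().split():
                clean_word = word.strip(PUNCTUATION)
                if clean_word:
                    word_count[clean_word] += 1

    return len(word_count), word_count.most_common(5)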

(c) Write a generator function that yields each sentence (ending with a period, exclamation mark, or question mark) from the file, one at a time.

Solution

Generator function for sentences

def sentence_generator(filename):
    with open(filename, 'r') as fd:
        text = fd.read()
 
    sentence = ""
    for char in text:
        sentence += char
        if char in '.!?':
            yield sentence.strip() # strip leading/trailing whitespace
            sentence = ""