Iterators, Generators, Closures, and Decorators in Python
Iterators
Iteration is a way to access elements of a collection. An iterator is an object that remembers the position during traversal. Iterator objects start from the first element of the collection and go until all elements are accessed. Iterators can only move forward, not backward.
1. Iterable Objects
We know that we can use the for...in... loop syntax on data types like list, tuple, str, etc., to retrieve data sequentially. This process is called traversal or iteration.
But can all data types be used in a for...in... statement? Let's test:
>>> for i in 100:
...     print(i)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
int is not iterable
Now, let's define a custom container class MyList to hold data, with an add method to add data:
>>> class MyList(object):
...     def __init__(self):
...         self.container = []
...     def add(self, item):
...         self.container.append(item)
...
>>> mylist = MyList()
>>> mylist.add(1)
>>> mylist.add(2)
>>> mylist.add(3)
>>> for num in mylist:
...     print(num)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'MyList' object is not iterable
The MyList container object is also not iterable.
We have created a custom container type MyList, but it cannot be used in a for...in... loop. Objects that can be iterated with a for...in... statement are called iterable objects.
2. Checking If an Object is Iterable
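You can use isinstance() with collections.abc.Iterable to check if an object is iterable (the older collections.Iterable alias was removed in Python 3.10):
In [50]: from collections.abc import Iterable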
In [51]: isinstance([], Iterable)
Out[51]: True
In [52]: isinstance({}, Iterable)
Out[52]: True
In [53]: isinstance('abc', Iterable)
Out[53]: True
In [54]: isinstance(mylist, Iterable)
Out[54]: False
In [55]: isinstance(100, Iterable)
Out[55]: False
3. The Essence of Iterable Objects
When iterating over an iterable object, each iteration (i.e., each loop in for...in...) returns the next data item. There must be something that records the current position to return the next data each time. This "something" is called an iterator.
The essence of an iterable object is that it provides an iterator to help us iterate through its data.
An iterable object provides an iterator through the __iter__ method. When we iterate over an iterable object, we first get its iterator and then use that iterator to fetch each data item.
Therefore, an object that has an __iter__ method is an iterable object.
>>> class MyList(object):
...     def __init__(self):
...         self.container = []
...     def add(self, item):
...         self.container.append(item)
...     def __iter__(self):
...         """Return an iterator"""
...         # We will ignore how to construct an iterator object for now
...         pass
...
>>> mylist = MyList()
>>> from collections.abc import Iterable
>>> isinstance(mylist, Iterable)
True
Now, the mylist object with the __iter__ method is an iterable object.
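One caveat is worth sketching here: the isinstance() check only confirms that an __iter__ method exists, not that it returns a usable iterator. The snippet below uses a hypothetical BrokenList class (a stand-in for the placeholder MyList, so the example is self-contained) to show the check passing while iter() still fails:

```python
from collections.abc import Iterable


class BrokenList:
    """Hypothetical stand-in for the placeholder MyList: __iter__ returns None."""

    def __init__(self):
        self.container = []

    def __iter__(self):
        # Placeholder body, so the method implicitly returns None
        pass


broken = BrokenList()
passes_check = isinstance(broken, Iterable)  # only looks for an __iter__ method

try:
    iter(broken)
    iter_failed = False
except TypeError:
    # iter() rejects the None that __iter__ returned
    iter_failed = True

print(passes_check, iter_failed)
```

So the object only becomes usable in a for loop once __iter__ returns a real iterator, which is what the following sections build.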
4. The iter() and next() Functions
list, tuple, etc., are iterable objects. We can get their iterators using the iter() function. Then we can keep using next() on the iterator to get the next data item. The iter() function essentially calls the __iter__ method of the iterable object.
>>> li = [11, 22, 33, 44, 55]
>>> li_iter = iter(li)
>>> next(li_iter)
11
>>> next(li_iter)
22
>>> next(li_iter)
33
>>> next(li_iter)
44
>>> next(li_iter)
55
>>> next(li_iter)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
Note: After iterating through all data, calling next() again raises a StopIteration exception, indicating that all data has been exhausted.
5. Checking If an Object is an Iterator
You can use isinstance() to check if an object is an iterator:
In [56]: from collections.abc import Iterator
In [57]: isinstance([], Iterator)
Out[57]: False
In [58]: isinstance(iter([]), Iterator)
Out[58]: True
In [59]: isinstance(iter("abc"), Iterator)
Out[59]: True
6. Iterator
Based on the analysis above, we know that an iterator records the current position during iteration. When we use next() on an iterator, it returns the next data item. In fact, calling next() invokes the __next__ method of the iterator object (in Python 3). To construct an iterator, we must implement its __next__ method. Additionally, Python requires that an iterator is itself iterable, so we must also implement the __iter__ method, which should return the iterator itself.
An object that implements both __iter__ and __next__ methods is an iterator.
class MyList(object):
    """A custom iterable object"""
    def __init__(self):
        self.items = []
    def add(self, value):
        self.items.append(value)
    def __iter__(self):
        iterator = MyIterator(self)
        return iterator
class MyIterator(object):
    """Custom iterator for the above iterable object"""
    def __init__(self, mylist):
        self.mylist = mylist
        self.current = 0  # Track current position
    def __next__(self):
        if self.current < len(self.mylist.items):
            item = self.mylist.items[self.current]
            self.current += 1
            return item
        else:
            raise StopIteration
    def __iter__(self):
        return self
if __name__ == '__main__':
    mylist = MyList()
    mylist.add(1)
    mylist.add(2)
    mylist.add(3)
    mylist.add(4)
    mylist.add(5)
    for num in mylist:
        print(num)
7. The Essence of the for...in... Loop
The for item in iterable loop essentially:
- Calls iter(iterable) to get an iterator.
- Repeatedly calls next() on the iterator to get the next value and assigns it to item.
- Stops when a StopIteration exception is raised.
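The three steps of the for loop can be written out by hand. The following is a minimal sketch, not how the interpreter is actually implemented; manual_for is a hypothetical helper name chosen for illustration:

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)       # step 1: get an iterator
    while True:
        try:
            item = next(iterator)   # step 2: fetch the next value
        except StopIteration:
            break                   # step 3: stop when exhausted
        collected.append(item)      # stands in for the loop body
    return collected


print(manual_for([11, 22, 33]))
```

Because the loop only relies on iter() and next(), it works for any object that follows the iterator protocol, including the custom MyList class.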
8. Application Scenarios for Iterators
The core function of an iterator is to return the next data value via next(). If the data values are generated programmatically by some rule rather than read from an existing data collection, we don't need to cache all data beforehand. This can save a significant amount of memory.
For example, consider the famous Fibonacci sequence: the first number is 0, the second is 1, and each subsequent number is the sum of the two preceding ones: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
We can implement a Fibonacci sequence iterator to generate the first n numbers on demand:
class FibonacciIterator(object):
    """Fibonacci sequence iterator"""
    def __init__(self, n):
        """
        :param n: int, generate the first n numbers of the sequence
        """
        self.n = n
        self.current = 0
        self.prev_prev = 0  # First number
        self.prev = 1  # Second number
    def __next__(self):
        if self.current < self.n:
            result = self.prev_prev
            self.prev_prev, self.prev = self.prev, self.prev_prev + self.prev
            self.current += 1
            return result
        else:
            raise StopIteration
    def __iter__(self):
        return self
if __name__ == '__main__':
    fib = FibonacciIterator(10)
    for num in fib:
        print(num, end=" ")
9. Not Only for Loops Accept Iterable Objects
Besides for loops, functions like list() and tuple() also accept iterable objects:
li = list(FibonacciIterator(15))
print(li)
tp = tuple(FibonacciIterator(6))
print(tp)
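One consequence of passing an iterator to such functions is that the iterator is used up afterwards. A small sketch, using a plain list iterator so the snippet is self-contained:

```python
numbers = iter([1, 2, 3])

first_pass = list(numbers)   # consumes every item
second_pass = list(numbers)  # the iterator is already exhausted

print(first_pass)
print(second_pass)
```

The second call yields an empty list because an iterator cannot be rewound. An iterable such as the MyList class above avoids this by returning a fresh iterator from every __iter__ call.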
Generators
1. Generators
With iterators, we can generate data on the fly using next(). However, implementing an iterator requires manually tracking the state. To simplify this, Python provides generators, which are a special type of iterator. Generators automatically maintain the state and support iteration.
2. Creating a Generator: Method 1
The first method is simple: replace the square brackets [] of a list comprehension with parentheses ().
In [15]: L = [x * 2 for x in range(5)]
In [16]: L
Out[16]: [0, 2, 4, 6, 8]
In [17]: G = (x * 2 for x in range(5))
In [18]: G
Out[18]: <generator object <genexpr> at 0x7f626c132db0>
The difference between L and G is only the brackets. L is a list, while G is a generator. You can use generator G with next(), for loops, list(), etc.
In [19]: next(G)
Out[19]: 0
In [20]: next(G)
Out[20]: 2
In [21]: next(G)
Out[21]: 4
In [22]: next(G)
Out[22]: 6
In [23]: next(G)
Out[23]: 8
In [24]: next(G)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-24-380e167d6934> in <module>()
----> 1 next(G)
StopIteration:
In [26]: G = (x * 2 for x in range(5))
In [27]: for x in G:
   ....:     print(x)
   ....:
0
2
4
6
8
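To make the difference between the list and the generator more concrete, the sketch below compares their sizes with sys.getsizeof. The size n is an arbitrary value chosen for illustration, and getsizeof reports only the container object itself, so treat the numbers as a rough indication:

```python
import sys

n = 100_000  # arbitrary size chosen for illustration
as_list = [x * 2 for x in range(n)]
as_gen = (x * 2 for x in range(n))

list_bytes = sys.getsizeof(as_list)  # grows with n
gen_bytes = sys.getsizeof(as_gen)    # small and independent of n

print(list_bytes > gen_bytes)
total = sum(as_gen)  # values are produced one at a time while summing
print(total)
```

The list stores every element up front, while the generator only stores the state needed to compute the next one.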
3. Creating a Generator: Method 2 (Using Functions)
If the generation logic is too complex to express in a single comprehension-style expression like the one above, we can write a function that uses the yield keyword.
Recall the Fibonacci iterator example. Here's how to implement it using a generator:
In [30]: def fibonacci_generator(n):
   ....:     current = 0
   ....:     num1, num2 = 0, 1
   ....:     while current < n:
   ....:         num = num1
   ....:         num1, num2 = num2, num1 + num2
   ....:         current += 1
   ....:         yield num
   ....:     return 'done'
   ....:
In [31]: F = fibonacci_generator(5)
In [32]: next(F)
Out[32]: 0
In [33]: next(F)
Out[33]: 1
In [34]: next(F)
Out[34]: 1
In [35]: next(F)
Out[35]: 2
In [36]: next(F)
Out[36]: 3
In [37]: next(F)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-37-8c2b02b4361a> in <module>()
----> 1 next(F)
StopIteration: done
In this implementation, the logic that was in the __next__ method is now in a function, and return is replaced with yield. This function becomes a generator function. Calling fibonacci_generator(5) returns a generator object (F), which can be used like an iterator.
Summary
- A function containing the yield keyword is a generator function. When called, it returns a generator object.
- The yield keyword does two things:
  - Saves the current execution state (breakpoint) and suspends execution (pauses the generator).
  - Returns the value after yield as the result (similar to return).
- Use next() to resume the generator from where it was paused.
- In Python 3, a generator can use return to provide a final return value. In Python 2, return is not allowed to return a value (it only exits).
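The final return value mentioned in the last point travels on the StopIteration exception, as the 'done' in the traceback above suggests. A minimal sketch of reading it explicitly, using a hypothetical countdown generator:

```python
def countdown(n):
    """Hypothetical generator: yields n, n-1, ..., 1, then returns a message."""
    while n > 0:
        yield n
        n -= 1
    return "done"


gen = countdown(2)
values = [next(gen), next(gen)]

try:
    next(gen)
    final = None
except StopIteration as exc:
    final = exc.value  # the generator's return value

print(values, final)
```

A for loop swallows the StopIteration, so the return value is only visible when next() is called directly like this.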
4. Waking a Generator with send()
Besides next(), you can use send() to resume a generator. The advantage is that send() can pass an additional value to the generator at the breakpoint.
Example: When yield is executed, the function pauses and returns the value. The temp variable receives the value sent by send() (or None if next() is used).
In [10]: def generator_with_send():
   ....:     i = 0
   ....:     while i < 5:
   ....:         temp = yield i
   ....:         print(temp)
   ....:         i += 1
   ....:
Using send()
In [43]: f = generator_with_send()
In [44]: next(f)
Out[44]: 0
In [45]: f.send('haha')
haha
Out[45]: 1
In [46]: next(f)
None
Out[46]: 2
In [47]: f.send('haha')
haha
Out[47]: 3
Using next()
In [11]: f = generator_with_send()
In [12]: next(f)
Out[12]: 0
In [13]: next(f)
None
Out[13]: 1
In [14]: next(f)
None
Out[14]: 2
In [15]: next(f)
None
Out[15]: 3
In [16]: next(f)
None
Out[16]: 4
In [17]: next(f)
None
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-17-468f0afdf1b9> in <module>()
----> 1 next(f)
StopIteration:
Using __next__() (Less Common)
In [18]: f = generator_with_send()
In [19]: f.__next__()
Out[19]: 0
In [20]: f.__next__()
None
Out[20]: 1
In [21]: f.__next__()
None
Out[21]: 2
In [22]: f.__next__()
None
Out[22]: 3
In [23]: f.__next__()
None
Out[23]: 4
In [24]: f.__next__()
None
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-24-39ec527346a9> in <module>()
----> 1 f.__next__()
StopIteration:
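As a slightly more practical use of send(), the sketch below keeps a running average of the values it receives. The running_average generator is a hypothetical example, not part of the standard library. It also shows that a generator must first be advanced to its first yield (with next()) before a non-None value can be sent:

```python
def running_average():
    """Hypothetical generator: receives numbers via send(), yields their mean."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # pause here; send() supplies `value`
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)                # prime the generator: run to the first yield
first = avg.send(10)     # 10 / 1
second = avg.send(20)    # 30 / 2
third = avg.send(60)     # 90 / 3
print(first, second, third)

fresh = running_average()
try:
    fresh.send(5)        # not primed yet, so a non-None value is rejected
    primed_error = False
except TypeError:
    primed_error = True
print(primed_error)
```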
Closures
1. Function References
def test1():
    print("--- in test1 func----")
# Call function
test1()
# Reference function
ret = test1
print(id(ret))
print(id(test1))
# Call function via reference
ret()
Output:
--- in test1 func----
140212571149040
140212571149040
--- in test1 func----
2. What is a Closure?
# Define a function
def outer_function(number):
    # Define an inner function that uses the outer function's variable
    def inner_function(number_in):
        print("in inner_function, number_in is %d" % number_in)
        return number + number_in
    # Return the inner function (the closure)
    return inner_function
# Assign: the argument 20 goes to 'number'
result = outer_function(20)
# Note: 100 goes to 'number_in'
print(result(100))
# Note: 200 goes to 'number_in'
print(result(200))
Output:
in inner_function, number_in is 100
120
in inner_function, number_in is 200
220
3. A Practical Example of a Closure
def create_line(a, b):
    def line(x):
        return a * x + b
    return line
line1 = create_line(1, 1)
line2 = create_line(4, 5)
print(line1(5))
print(line2(5))
In this example, the function line together with variables a and b forms a closure. By setting different values for a and b, we can create different linear functions (e.g., y = x + 1 and y = 4x + 5). This demonstrates how closures improve code reusability.
Without closures, we would have to pass all three values (a, b, x) on every call that evaluates a line, which means more parameter passing and less reusable code.
Note:
- A closure keeps its state in the enclosing function's variables, so it can sometimes replace a small class whose only job is to hold that state.
- Because a closure references local variables of the outer function, those variables stay alive as long as the closure itself is referenced, which can consume memory.
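The captured variables can be inspected directly, which makes it clearer why they outlive the call to the outer function. This sketch repeats create_line so it is self-contained and reads the captured values from the function object's __closure__ attribute:

```python
def create_line(a, b):
    def line(x):
        return a * x + b
    return line


line1 = create_line(1, 1)
line2 = create_line(4, 5)

# Names the inner function captures from the enclosing scope
free_names = line1.__code__.co_freevars
# Each captured name is stored in a cell on the function object
line1_cells = [cell.cell_contents for cell in line1.__closure__]
line2_cells = [cell.cell_contents for cell in line2.__closure__]

print(dict(zip(free_names, line1_cells)))
print(dict(zip(free_names, line2_cells)))
```

Each closure carries its own cells, so line1 and line2 do not interfere with each other.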
4. Modifying Variables in the Outer Function
Python 3 Method
def counter(start=0):
    def increment():
        nonlocal start  # rebind the variable in the enclosing scope
        start += 1
        return start
    return increment
c1 = counter(5)
print(c1())  # 6
print(c1())  # 7
c2 = counter(50)
print(c2())  # 51
print(c2())  # 52
print(c1())  # 8
print(c1())  # 9
print(c2())  # 53
print(c2())  # 54
Python 2 Method (Workaround)
def counter(start=0):
    count = [start]
    def increment():
        count[0] += 1
        return count[0]
    return increment
c1 = counter(5)
print(c1())  # 6
print(c1())  # 7
c2 = counter(100)
print(c2())  # 101
print(c2())  # 102
LEGB Rule
Python uses the LEGB order to look up a symbol (name):
locals -> enclosing function -> globals -> builtins
a = 1  # global
def function():
    a = 2  # enclosing
    def inner_function():
        a = 3  # local
        print("a=%d" % a)
    return inner_function
f = function()
f()
- locals: The namespace of the current function. Function parameters are also considered local variables.
- enclosing: The namespace of enclosing functions (common in closures).
  def fun1():
      a = 10
      def fun2():
          # a is in the enclosing namespace
          print(a)
- globals: The global namespace of the module where the function is defined.
  a = 1
  def fun():
      global a
      a = 2  # Modifies global variable, does not create a new local one
- builtins: The namespace of the built-in module. Python loads many built-in functions and classes (like dict, list, type, print) into the builtins module (named __builtin__ in Python 2). This is why we can use them without importing. The builtins module is loaded automatically on startup. Name resolution follows LEGB, where B stands for builtins.
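The four lookup levels can be seen side by side in one small sketch. All names here (label, outer, show_local, and so on) are hypothetical and chosen only for illustration:

```python
import builtins

label = "global"


def outer():
    label = "enclosing"

    def show_local():
        label = "local"
        return label       # L: found in the function's own scope

    def show_enclosing():
        return label       # E: found in outer()'s scope

    return show_local(), show_enclosing()


def show_global():
    return label           # G: no local or enclosing binding, so module level


def show_builtin():
    return len is builtins.len  # B: `len` is only defined in builtins


print(outer(), show_global(), show_builtin())
```

Each function finds the nearest binding of the name and stops searching there, which is why an assignment in an inner scope hides the outer values rather than changing them.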
Decorators
Decorators are a powerful feature in Python frequently used in development. They can significantly improve development efficiency. Understanding decorators is essential for Python interviews.
Basic Example: @w1
def w1(func):
    def inner():
        # Verification 1
        # Verification 2
        # Verification 3
        func()
    return inner
@w1
def f1():
    print('f1')
The Python interpreter reads the code from top to bottom:
- def w1(func): loads the function w1 into memory.
- @w1 is encountered. This is syntactic sugar. It executes the following:
  - w1 is called with the function below (f1) as its argument: w1(f1).
  - Inside w1, inner is defined, which includes the original f1 call and some extra verification code.
  - inner is returned.
- The name f1 is reassigned to the returned inner function.
So, when f1() is called later, it actually executes inner(), which first performs verification and then calls the original f1. This way, we add functionality without modifying the original function.
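The same steps can be carried out without the @ syntax, which may make the rebinding easier to see. In this sketch the calls list is a hypothetical log added only to make the call order visible, and inner returns the result of func() so it can be printed:

```python
calls = []  # hypothetical log used only to make the call order visible


def w1(func):
    def inner():
        calls.append("verify")  # stands in for the verification steps
        return func()
    return inner


def f1():
    calls.append("f1")
    return "f1 result"


original = f1
f1 = w1(f1)  # what writing `@w1` above `def f1` does

result = f1()
print(calls, result, f1.__name__)
```

After the rebinding, the name f1 refers to inner, while the original function is still reachable through the closure variable func.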
3. Revisiting Decorators
# Define function to wrap in bold
def make_bold(fn):
    def wrapped():
        return "<b>" + fn() + "</b>"
    return wrapped
# Define function to wrap in italic
def make_italic(fn):
    def wrapped():
        return "<i>" + fn() + "</i>"
    return wrapped
@make_bold
def test1():
    return "hello world-1"
@make_italic
def test2():
    return "hello world-2"
@make_bold
@make_italic
def test3():
    return "hello world-3"
print(test1())
print(test2())
print(test3())
Output:
<b>hello world-1</b>
<i>hello world-2</i>
<b><i>hello world-3</i></b>
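The output for test3 follows from the order in which stacked decorators are applied: the one closest to the function runs first. A sketch of the equivalent explicit calls, with a hypothetical undecorated function named plain:

```python
def make_bold(fn):
    def wrapped():
        return "<b>" + fn() + "</b>"
    return wrapped


def make_italic(fn):
    def wrapped():
        return "<i>" + fn() + "</i>"
    return wrapped


def plain():
    return "hello world-3"


# Equivalent to stacking @make_bold above @make_italic
bold_italic = make_bold(make_italic(plain))
# Reversing the order reverses the nesting of the tags
italic_bold = make_italic(make_bold(plain))

print(bold_italic())
print(italic_bold())
```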
4. Common Uses of Decorators
- Logging
- Timing function execution
- Preprocessing before execution
- Cleanup after execution
- Permission checking
- Caching
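As one illustration of these uses, here is a minimal sketch of a timing decorator. The names timed and last_elapsed are hypothetical, and the sketch relies on functools.wraps, which is covered at the end of this article:

```python
import functools
import time


def timed(func):
    """Hypothetical decorator: records how long the most recent call took."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so the timing is always recorded
            wrapper.last_elapsed = time.perf_counter() - start
    wrapper.last_elapsed = None
    return wrapper


@timed
def add(a, b):
    return a + b


total = add(3, 5)
print(total, add.last_elapsed)
```

Storing the measurement on the wrapper is just one design choice; logging it or returning it alongside the result would work as well.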
5. Decorator Examples
Example 1: Decorating a Function with No Arguments
from time import ctime, sleep
def time_log(func):
    def wrapper():
        print("%s called at %s" % (func.__name__, ctime()))
        func()
    return wrapper
@time_log
def foo():
    print("I am foo")
foo()
sleep(2)
foo()
Understanding: foo = time_log(foo) assigns the original foo to func, then foo is reassigned to the wrapper function returned by time_log.
Example 2: Decorating a Function with Arguments
from time import ctime, sleep
def time_log(func):
    def wrapper(a, b):
        print("%s called at %s" % (func.__name__, ctime()))
        print(a, b)
        func(a, b)
    return wrapper
@time_log
def foo(a, b):
    print(a + b)
foo(3, 5)
sleep(2)
foo(2, 4)
Example 3: Decorating a Function with Variable-Length Arguments
from time import ctime, sleep
def time_log(func):
    def wrapper(*args, **kwargs):
        print("%s called at %s" % (func.__name__, ctime()))
        func(*args, **kwargs)
    return wrapper
@time_log
def foo(a, b, c):
    print(a + b + c)
foo(3, 5, 7)
sleep(2)
foo(2, 4, 9)
Example 4: Handling Return Values
from time import ctime, sleep
def time_log(func):
    def wrapper():
        print("%s called at %s" % (func.__name__, ctime()))
        func()
    return wrapper
@time_log
def foo():
    print("I am foo")
@time_log
def get_info():
    return '----hahah---'
foo()
sleep(2)
foo()
print(get_info())
Output (without properly handling return):
foo called at Fri Nov 4 21:55:35 2016
I am foo
foo called at Fri Nov 4 21:55:37 2016
I am foo
get_info called at Fri Nov 4 21:55:37 2016
None
If we modify the wrapper to return func(), the return value is preserved:
def time_log(func):
    def wrapper():
        print("%s called at %s" % (func.__name__, ctime()))
        return func()
    return wrapper
Output:
foo called at Fri Nov 4 21:55:57 2016
I am foo
foo called at Fri Nov 4 21:55:59 2016
I am foo
get_info called at Fri Nov 4 21:55:59 2016
----hahah---
Summary: For general-purpose decorators, it's best to return the result of the original function.
Example 5: Parameterized Decorators
from time import ctime, sleep
def time_log_with_prefix(prefix="hello"):
    def decorator(func):
        def wrapper():
            print("%s called at %s %s" % (func.__name__, ctime(), prefix))
            return func()
        return wrapper
    return decorator
@time_log_with_prefix("itcast")
def foo():
    print("I am foo")
@time_log_with_prefix("python")
def too():
    print("I am too")
foo()
sleep(2)
foo()
too()
sleep(2)
too()
This can be understood as foo() == time_log_with_prefix("itcast")(foo)().
Example 6: Class Decorators
A decorator must accept a callable and return a callable. Any object that implements __call__ is callable.
class Test():
    def __call__(self):
        print('call me!')
t = Test()
t()  # call me!
Class decorator example:
class Test(object):
    def __init__(self, func):
        print("---initialization---")
        print("func name is %s" % func.__name__)
        self.__func = func
    def __call__(self):
        print("---decorator functionality---")
        self.__func()
@Test
def test():
    print("----test---")
test()
Explanation:
- When Test is used as a decorator on test, an instance of Test is created, and the original test function is passed to __init__.
- The name test now points to this Test instance.
- Calling test() invokes __call__ on the instance, which executes the decorator logic and then the original function (via self.__func).
Output:
---initialization---
func name is test
---decorator functionality---
----test---
functools Module
The functools module was introduced in Python 2.5 and provides utility functions.
In Python 3.5, the module includes functions like wraps, partial, lru_cache, etc.
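Before turning to wraps, here is a brief sketch of the other two functions named above. The functions power and slow_double and the computed list are hypothetical names used only to show the behavior:

```python
import functools


def power(base, exponent):
    return base ** exponent


# partial() fixes some arguments and returns a new callable
square = functools.partial(power, exponent=2)

computed = []  # hypothetical log showing when the function body really runs


@functools.lru_cache(maxsize=None)
def slow_double(x):
    computed.append(x)
    return x * 2


results = [slow_double(4), slow_double(4), slow_double(5)]
print(square(7), results, computed)
```

The second slow_double(4) call is answered from the cache, so the body runs only once for each distinct argument.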
The wraps Function
When using decorators, some side effects occur, such as changing the function name and docstring.
def note(func):
    "note function"
    def wrapper():
        "wrapper function"
        print('note something')
        return func()
    return wrapper
@note
def test():
    "test function"
    print('I am test')
test()
print(test.__doc__)
Output:
note something
I am test
wrapper function
The __doc__ of test has been replaced by that of wrapper. To preserve metadata, use @functools.wraps:
import functools
def note(func):
    "note function"
    @functools.wraps(func)
    def wrapper():
        "wrapper function"
        print('note something')
        return func()
    return wrapper
@note
def test():
    "test function"
    print('I am test')
test()
print(test.__doc__)
Output:
note something
I am test
test function
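The docstring is not the only attribute that wraps carries over. This sketch checks the function name as well, and uses the __wrapped__ attribute that wraps adds to reach the undecorated function. The functions return strings here instead of printing, purely to keep the checks simple:

```python
import functools


def note(func):
    @functools.wraps(func)
    def wrapper():
        "wrapper function"
        return func()
    return wrapper


@note
def test():
    "test function"
    return "I am test"


print(test.__name__)
print(test.__doc__)
print(test.__wrapped__())
```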