A generator is a special type of function that produces a series of values over time. That means we can use a generator to get a value, pause, and later get the next value.
This process is called lazy generation of a sequence of values. Let’s first understand the problem that generators are trying to solve, and why we need lazy generation.
Problem
Say, we have a function that generates a bunch of square values from 1 to a given number. Check the function below-
def square_store(n):
    result = []
    for i in range(n):
        result.append((i + 1) ** 2)
    return result

print(square_store(100))
Note that this function creates a bunch of numbers, saves them in a list, and finally returns the list of values.
What happens when the provided value of “n” is a very large value?
All those numbers will be generated, and saved in a list before being returned. This will make the memory usage high.
NOTES
Most of the time, we do not even need all these numbers at once.
We need them one by one.
Solution
The solution is to create a single value at a time and return it, instead of generating the full sequence up front.
Step #1: using “yield“
To create a generator we need “yield”. The “yield” keyword turns a function into a generator.
Advantages of using “yield”-
Check the code below-
def sample_numbers():
    yield 1
    yield 4
    yield 9
    yield 16
“yield” will return a value, and pause the execution at that point. When the next value is requested, the execution resumes from where it was paused.
So the first time we request a value from the generator returned by sample_numbers(), we get 1.
Request a value again and we get 4.
Request the next one and we get 9.
This is the difference between “return” and “yield”: “return” ends the function, while “yield” pauses it and keeps its state so that it can resume later.
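The pause-and-resume behavior can be made visible with a small sketch. The print() calls inside the generator are added only for illustration, and the built-in next() function is used here to request one value at a time.

```python
def sample_numbers():
    # The print() calls are only here to show when each part runs
    print("  generator: started")
    yield 1
    print("  generator: resumed after first yield")
    yield 4
    print("  generator: resumed after second yield")


gen = sample_numbers()   # nothing runs yet, we only get a generator object
first = next(gen)        # runs the body until the first yield
print("caller got:", first)
second = next(gen)       # resumes right after the first yield
print("caller got:", second)
```

Nothing inside the function body runs until the first value is requested, and each request runs the body only up to the next “yield”.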
Generator as Iterator
Let’s see what the generator returns, check the code below-
from collections.abc import Iterable, Iterator

def sample_numbers():
    yield 1
    yield 4
    yield 9
    yield 16
    # return 20

sample_gen = sample_numbers()
print("sample_gen value: ", sample_gen)
print("sample_gen type: ", type(sample_gen))
print("is sample_gen an iterator? ", isinstance(sample_gen, Iterator))
print("sample_gen dir result: ", dir(sample_gen))
Output:
After the code execution, we see the following-
sample_gen value: <generator object sample_numbers at 0x7f585747a030>
sample_gen type: <class 'generator'>
is sample_gen an iterator? True
sample_gen dir result:
['__class__', '__del__', '__delattr__', '__dir__', '__doc__',
'__eq__', '__format__', '__ge__', '__getattribute__', '__gt__',
'__hash__', '__init__', '__init_subclass__', '__iter__', '__le__',
'__lt__', '__name__', '__ne__', '__new__', '__next__',
'__qualname__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'close', 'gi_code', 'gi_frame', 'gi_running', 'gi_yieldfrom',
'send', 'throw']
You can see that the generator is an Iterator. And in the result of “dir”, we can find the “__iter__” and “__next__” methods, which every iterator must have.
So we can iterate over the result returned by the generator.
Let’s check the different ways to get the data yielded by the generator.
Usage #1: In a Loop
Use the generator as an iterator in a loop, and it will iterate through the generated values-
for number in sample_numbers():
    print(number)
Output:
1
4
9
16
Usage #2: Using next()
As we get an iterator back from the generator, the “next()” function can be used to get the results one by one.
Each time we call “next()” on the generator object, we get one result back.
number_gen = sample_numbers()
print(next(number_gen))
print(next(number_gen))
print(next(number_gen))
print(next(number_gen))
# Trying to get a value beyond the last one
# raises StopIteration:
# print(next(number_gen))
Output:
1
4
9
16
Usage #3: Get All as List
We can also use the list() constructor function to extract the full result from the generator and convert it to a list.
This is not an ideal use case for a generator, but in some cases, you might need to use this.
big_box_num_list = list(sample_numbers())
print(big_box_num_list)
WARNING
The advantages of the generator are lost when we convert it to a list, as the full result is generated at once and stored in the list.
Output:
[1, 4, 9, 16]
Usage #4: Generator Unpacking
We can unpack a generator in the print() function to print the elements. Add an asterisk (*) before the generator call.
The same unpacking syntax can be used in other places, like a list, tuple, or function arguments.
print(*sample_numbers())
# Unpack Generated Values to List
numbers_list = [*sample_numbers()]
print(numbers_list)
# Unpack Generated Values to tuple
numbers_tuple = (*sample_numbers(),)
print(numbers_tuple)
def my_func(*args):
    print("Inside my_func: ", args)

my_func(*sample_numbers())
Output:
1 4 9 16
[1, 4, 9, 16]
(1, 4, 9, 16)
Inside my_func: (1, 4, 9, 16)
Step #2: Use “yield” in Loop
In the previous step, we had a bunch of “yield” statements, which were static.
Now let’s put the “yield” statement in a loop. This way we can iterate and yield as many times as we want.
Each time the code reaches “yield”, it returns the data and pauses. It resumes from there when the next value is requested.
def sample_numbers():
    for i in range(5):
        yield i

for num in sample_numbers():
    print(num)
Output:
0
1
2
3
4
Step #3: Pass Range as Param
After the previous step, we can do some more modifications.
Add a param to the generator, and use it as the limit of the loop.
This is not directly related to generator construction, but most of the time the generator is used like this.
def sample_numbers(n):
    for i in range(n):
        yield i

sample_gen = sample_numbers(5)
for num in sample_gen:
    print(num)
Output:
0
1
2
3
4
Generator Expression
There is a compact way of defining a generator. It uses the following syntax-
generator = (expression for item in iterable if condition)
In the above one-liner, “expression” is evaluated for each “item” taken from “iterable”, and the optional “if condition” filters which items are used.
NOTES
This expression is similar to a list comprehension. The only difference is that in a list comprehension we use square brackets [], but in a generator expression we use parentheses ().
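To make the difference between the two forms concrete, here is a minimal sketch that builds the same squares both ways. The variable names are only illustrative, and the exact sizes reported by sys.getsizeof() depend on the Python version and platform, so only the comparison is shown.

```python
import sys

# List comprehension: all 1000 values are computed and stored right away
squares_list = [x * x for x in range(1000)]

# Generator expression: nothing is computed until values are requested
squares_gen = (x * x for x in range(1000))

print(type(squares_list))
print(type(squares_gen))

# The generator object itself is small, whatever the length of the sequence
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))

# Both produce the same values when consumed
print(sum(squares_list) == sum(squares_gen))
```

Note that sys.getsizeof() only measures the container object itself, so this is a rough indication and not a full memory measurement.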
Example #1: Number Generator
Let’s use the generator expression to create a number generator. It works the same way as the generator created with “yield”.
sample_gen_ex = (x**2 for x in range(10))
print(sample_gen_ex)
print("Individual sample number:")
print(next(sample_gen_ex))
print(next(sample_gen_ex))
print(next(sample_gen_ex))
print("In the loop:")
for n in sample_gen_ex:
    print(n)
Output:
<generator object <genexpr> at 0x7f1f12a502e0>
Individual sample number:
0
1
4
In the loop:
9
16
25
36
49
64
81
Example #2: Number Generator with Condition
We can add a condition for the generated numbers. Just add an “if” clause at the end of the expression.
Here we are only considering the numbers from the loop that are divisible by two (2).
sample_gen_ex = (x*x for x in range(10) if x%2 == 0)
print(sample_gen_ex)
print("Individual sample number:")
print(next(sample_gen_ex))
print(next(sample_gen_ex))
print(next(sample_gen_ex))
print("In the loop:")
for n in sample_gen_ex:
    print(n)
Output:
<generator object <genexpr> at 0x7fc0d5e842e0>
Individual sample number:
0
4
16
In the loop:
36
64
Resetting a Generator
Each time we go to the next step, the generator moves to the next “yield”. A generator object cannot be rewound, but in the following cases the generation starts from the beginning.
If we call the generator function again, we get a new generator object, which starts from the beginning. In the following example we are running a generator in the loop twice-
# Generator that generates numbers
# from 0 to 4
def sample_numbers():
    for i in range(5):
        yield i

# Use the generator
for num in sample_numbers():
    print(num)

# Use the generator again
for num in sample_numbers():
    print(num)
Output:
0
1
2
3
4
0
1
2
3
4
What if we break the first loop before completing it?
The generator will start from the beginning when we call it again, as the second loop creates a new generator object. Check the example below.
# Generator that generates numbers
# from 0 to 4
def sample_numbers():
    for i in range(5):
        yield i

# Use the generator
for num in sample_numbers():
    print(num)
    # Break with some condition
    # before the loop is complete
    if num == 2:
        break

# Use the generator again
for num in sample_numbers():
    print(num)
Output:
0
1
2
0
1
2
3
4
Check the following example. It will make the case clearer-
# Generator that generates numbers
# from 0 to 4
def sample_numbers():
    for i in range(5):
        yield i

# Generator initialized first time
gen = sample_numbers()
print(next(gen))
print(next(gen))
print(next(gen))

# Generator initialized again
gen = sample_numbers()
print(next(gen))
print(next(gen))
Output:
0
1
2
0
1
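By contrast, reusing the same generator object does not restart it. The following sketch (the variable names first_pass, second_pass, and fresh_pass are only illustrative) consumes one generator object twice, and then creates a fresh one.

```python
def sample_numbers():
    for i in range(5):
        yield i


gen = sample_numbers()

first_pass = list(gen)    # consumes every value of the generator
second_pass = list(gen)   # same object, already exhausted

print(first_pass)    # [0, 1, 2, 3, 4]
print(second_pass)   # []

# Calling the generator function again gives a new object that starts over
fresh_pass = list(sample_numbers())
print(fresh_pass)    # [0, 1, 2, 3, 4]
```

So the "reset" always comes from creating a new generator object, never from rewinding an existing one.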
Closing a Generator
Use the “close()” function on the generator to close it. After that, we cannot use the generator to generate data anymore.
NOTES
Once the generator is closed, we cannot restart or resume it.
If we try to get the next value, it raises an exception (StopIteration).
# Generator that generates numbers
# from 0 to 4
def sample_numbers():
    for i in range(5):
        yield i

# Generator initialized first time
gen = sample_numbers()
print(next(gen))
print(next(gen))

# Close the generator
gen.close()
print(next(gen))
Output:
0
1
Traceback (most recent call last):
File "generator2.py", line 19, in <module>
print(next(gen))
StopIteration
We can handle this case by catching the exception, like below.
# Generator that generates numbers
# from 0 to 4
def sample_numbers():
    for i in range(5):
        yield i

# Generator initialized first time
gen = sample_numbers()
print(next(gen))
print(next(gen))

# Close the generator
gen.close()
try:
    print(next(gen))
except StopIteration:
    print("Generator is closed.")
Output:
0
1
Generator is closed.
When “close()” is called, it raises “GeneratorExit” inside the generator, at the point where the generator was paused.
We can catch “GeneratorExit” specifically, or just use a “finally” block inside the generator, if some cleanup is required.
# Generator that generates numbers
# from 0 to 4
def sample_numbers():
    try:
        for i in range(5):
            yield i
    # except GeneratorExit:
    #     print("GeneratorExit: Cleaning up generator")
    finally:
        print("Generator closed, cleaning up.")

# Generator initialized first time
gen = sample_numbers()
print(next(gen))
print(next(gen))

# Close the generator
gen.close()
Output:
0
1
Generator closed, cleaning up.
Send Data to “yield“
We can send data to the generator for the next step in each call. Use the “send()” function to send data. send() resumes the generator and passes the data into it.
Inside the generator, assign the result of the “yield” expression to a variable to receive the sent value.
WARNING
We cannot send a value in the first call to the generator. If we use send() to send some value in the first call, it throws the following error-
TypeError: can’t send non-None value to a just-started generator.
We can use send(None) in the first call, if we want to. Or, use next() in the first call.
def sample_gen():
    received_val1 = yield 100
    print("1. Received in generator: ", received_val1)
    received_val2 = yield "Second Value"
    print("2. Received in generator: ", received_val2)
    received_val3 = yield "ABCDEF"
    print("3. Received in generator: ", received_val3)
    received_val4 = yield 999
    print("4. Received in generator: ", received_val4)

# Generator initialized
gen = sample_gen()

# First call
print("First Call-")
print("Outside generator: ", next(gen))

# Second call with value
print("Second Call-")
print("Outside generator: ", gen.send("S2"))

# Third call without value
print("Third Call-")
print("Outside generator: ", next(gen))

# Fourth call with value
print("Fourth Call-")
print("Outside generator: ", gen.send("S4"))
Output:
First Call-
Outside generator: 100
Second Call-
1. Received in generator: S2
Outside generator: Second Value
Third Call-
2. Received in generator: None
Outside generator: ABCDEF
Fourth Call-
3. Received in generator: S4
Outside generator: 999
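As a slightly more practical illustration of send(), here is a minimal sketch of a running-average generator. The function name running_average and its variables are assumptions made up for this example; the point is only to show a value going in through send() and a result coming back out through “yield”.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        # Hand back the current average, then wait for the next number
        value = yield average
        total += value
        count += 1
        average = total / count


avg_gen = running_average()
next(avg_gen)            # first call: run up to the first yield

a1 = avg_gen.send(10)    # 10.0
a2 = avg_gen.send(20)    # 15.0
a3 = avg_gen.send(60)    # 30.0
print(a1, a2, a3)
```

The first next() call is needed because a value cannot be sent to a generator that has not yet reached its first “yield”.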
Sub-Generator with “yield from“
Let’s see how we can use another generator inside a generator-
Sub-Generator in Loop [not recommended]
We can call the generator in a loop like below, and that will work as expected-
def sub_generator():
    yield 1
    yield 2

def main_generator():
    for value in sub_generator():
        yield value

for value in main_generator():
    print(value)
Output:
1
2
Use “yield from” to Simplify
Instead of using a loop, we can use “yield from” followed by the generator call.
“yield from” iterates over the sub-generator and yields each of its items in turn. So, for simple iteration it works exactly like iterating in a loop.
def sub_generator():
    yield 1
    yield 2

def main_generator():
    yield from sub_generator()

for value in main_generator():
    print(value)
Output:
1
2
“yield from” Multiple Times
We can use “yield from” multiple times, to call multiple generators inside a generator.
In that case, the processing of the first “yield from” is done completely, then the next one is processed.
def sub_generator1():
    yield 1
    yield 2

def sub_generator2():
    yield 444
    yield 555
    yield 666
    yield 777

def main_generator():
    yield from sub_generator1()  # This one runs to completion first
    yield from sub_generator2()  # This one runs after the previous one is complete

for value in main_generator():
    print(value)
Output:
1
2
444
555
666
777
“yield from” with Return Value
We can return from the sub generator and get that in the main generator. This is possible with “yield from”.
def sub_generator():
    yield 1
    yield 2
    return "sub_generator done"

def main_generator():
    result = yield from sub_generator()
    print(f"In main generator: {result}")

for value in main_generator():
    print(value)
Output:
1
2
In main generator: sub_generator done
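For comparison, here is a sketch of how the same return value can be read without “yield from”. When a generator returns, the value is attached to the StopIteration exception as its “value” attribute, so a caller that drives the generator manually with next() can pick it up there. The variable names values and result are only illustrative.

```python
def sub_generator():
    yield 1
    yield 2
    return "sub_generator done"


gen = sub_generator()
values = []
result = None

while True:
    try:
        values.append(next(gen))
    except StopIteration as stop:
        # The return value of the generator travels on the exception
        result = stop.value
        break

print(values)   # [1, 2]
print(result)   # sub_generator done
```

A plain for loop discards this value, which is why “yield from” is the more convenient way to receive it.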
Iterate over Iterables
If we have iterables on multiple levels, we can use the following format to iterate over each item one by one.
Here we pass a list, a tuple, and a set as arguments. We are using “yield from” in the generator, and it will iterate over each item of every iterable passed in. Note that a set is unordered, so its items may come out in a different order.
def print_iter(*iterables):
    for i in iterables:
        yield from i

for item in print_iter([1, 2, 3, 4], ("abc", "def"), {"big", "box", "code"}):
    print(item)
Output:
1
2
3
4
abc
def
box
code
big
Memory Usage Comparison
Let’s compare the memory usage of a generator and a simple function without a generator, for the same task.
The following scripts are used for comparison-
# Script 1: with a generator (using "yield")
import tracemalloc

# Start memory tracking
tracemalloc.start()

def square_store(n):
    for i in range(n):
        yield (i + 1) ** 2

ss = square_store(10_000_000)
for i in ss:
    print(i)

# Get memory usage
current, peak = tracemalloc.get_traced_memory()
print(f"Current memory usage: {current / 1024:.2f} KB")
print(f"Peak memory usage: {peak / 1024:.2f} KB")

# Stop memory tracking
tracemalloc.stop()
# Script 2: without a generator (values stored in a list)
import tracemalloc

# Start memory tracking
tracemalloc.start()

def square_store(n):
    result = []
    for i in range(n):
        result.append((i + 1) ** 2)
    return result

ss = square_store(10_000_000)
for i in ss:
    print(i)

# Get memory usage
current, peak = tracemalloc.get_traced_memory()
print(f"Current memory usage: {current / 1024:.2f} KB")
print(f"Peak memory usage: {peak / 1024:.2f} KB")

# Stop memory tracking
tracemalloc.stop()
Here is the peak memory usage-
NOTES
The following data was taken on a single machine, and is not meant as an exact memory usage measurement.
It is presented as a comparison. Look at the pattern of the memory usage.
Dataset Size (number/length) | Peak Memory(KB) Without “yield“ | Peak Memory(KB) with “yield“ |
---|---|---|
100 | 4.27 | 1.25 |
1,000 | 36.62 | 1.28 |
10,000 | 357.25 | 1.33 |
100,000 | 3,779.85 | 1.33 |
1,000,000 | 39,373.35 | 1.33 |
10,000,000 | 399,379.63 | 1.33 |
Here is the same data represented in a chart. You can see the spike in memory usage for a normal function, while the memory usage of the generator stays minimal.
Advantages
Here are the advantages of using a generator-
Examples
Let’s take a look at a few examples that we can use in real life-
Example #1: Number Generator
def infinite_number_gen(start=0):
    while True:
        yield start
        # For the next number
        start += 1

inf_gen = infinite_number_gen()

# First get 10 numbers in loop
for _ in range(10):
    print(next(inf_gen))

# Generate 2 more
print(next(inf_gen))
print(next(inf_gen))
Output:
0
1
2
3
4
5
6
7
8
9
10
11
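An infinite generator like this must never be passed to list() or looped over without a stopping condition. As one possible way to take a fixed number of values, the sketch below uses itertools.islice from the standard library. The variable names first_five and from_100 are only illustrative.

```python
from itertools import islice


def infinite_number_gen(start=0):
    while True:
        yield start
        start += 1


# Take only the first 5 values, the rest are never generated
first_five = list(islice(infinite_number_gen(), 5))
print(first_five)   # [0, 1, 2, 3, 4]

# Same idea with a different starting point
from_100 = list(islice(infinite_number_gen(100), 3))
print(from_100)     # [100, 101, 102]
```

islice() is lazy too, so it stops requesting values from the generator as soon as the limit is reached.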
Example #2: Fibonacci Generator
def fibonacci(n):
    # Initial values
    n1, n2 = 0, 1
    # Run in range
    for _ in range(n):
        # Return one value
        yield n1
        # Set the values for next steps
        n1, n2 = n2, n1 + n2

# Generate 10 fibonacci numbers
for fib_num in fibonacci(10):
    print(fib_num)
Output:
0
1
1
2
3
5
8
13
21
34
Example #3: Prime Number Generator
# Checker for prime
def is_prime(num):
    if num < 2:
        return False
    for i in range(2, int(num**0.5) + 1):
        if num % i == 0:
            return False
    return True

# Generator
def prime_num_gen(limit):
    for num in range(2, limit):
        if is_prime(num):
            yield num

# Demo usage
prime_nums = prime_num_gen(20)
print(list(prime_nums))
Output:
[2, 3, 5, 7, 11, 13, 17, 19]
Example #4: File Reader
Say, we have the following log file, named “big_box_log.txt”-
2025-01-05 10:15:32,451 - INFO - Starting the application
2025-01-05 10:15:32,459 - DEBUG - Loading configuration from config.json
2025-01-05 10:15:32,467 - INFO - Configuration loaded successfully
2025-01-05 10:15:32,482 - WARNING - No backup found. Proceeding without backup.
2025-01-05 10:15:32,495 - ERROR - Failed to connect to database: Connection timed out
2025-01-05 10:15:32,511 - INFO - Retrying connection to database
2025-01-05 10:15:32,532 - INFO - Connected to database successfully
2025-01-05 10:15:32,548 - DEBUG - Initializing cache
def read_file(path):
    with open(path) as file:
        for line in file:
            yield line

for line in read_file("big_box_log.txt"):
    # Each line already ends with a newline
    print(line, end="")
Output:
2025-01-05 10:15:32,451 - INFO - Starting the application
2025-01-05 10:15:32,459 - DEBUG - Loading configuration from config.json
2025-01-05 10:15:32,467 - INFO - Configuration loaded successfully
2025-01-05 10:15:32,482 - WARNING - No backup found. Proceeding without backup.
2025-01-05 10:15:32,495 - ERROR - Failed to connect to database: Connection timed out
2025-01-05 10:15:32,511 - INFO - Retrying connection to database
2025-01-05 10:15:32,532 - INFO - Connected to database successfully
2025-01-05 10:15:32,548 - DEBUG - Initializing cache
Cognitive Clarifications
Here are some questions and clarifications about Python Generators.
As the generator calculates and generates values on request, we can use it to avoid unnecessary computations.
No.
Generators are not thread-safe, so they cannot be shared across threads without proper synchronization.
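One possible form of such synchronization is to guard every next() call with a lock. The sketch below is only an illustration under that assumption: LockedIterator is a hypothetical helper written for this example and is not part of the standard library.

```python
import threading


class LockedIterator:
    """Hypothetical helper: lets only one thread at a time advance the iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread can be inside next() at any moment
        with self._lock:
            return next(self._it)


def sample_numbers(n):
    for i in range(n):
        yield i


shared = LockedIterator(sample_numbers(1000))
collected = []


def worker():
    for value in shared:
        collected.append(value)  # list.append is thread-safe in CPython


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(collected))                            # 1000
print(sorted(collected) == list(range(1000)))    # True
```

Each value is handed to exactly one thread, but the order in which the threads receive values is not predictable, which is why the result is sorted before checking it.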