Using Python Generators and yield: A Complete Guide

In this tutorial, you’ll learn how to use generators in Python, including how to interpret the yield expression and how to use generator expressions. You’ll learn what the benefits of Python generators are and why they’re often referred to as lazy iteration. Then, you’ll learn how they work and how they’re different from normal functions.

Python generators give you a way to write your own iterators as ordinary-looking functions. Because these functions produce their values lazily, one at a time and only when asked, they let you work through large or expensive sequences without holding everything in memory at once.

By the end of this tutorial, you’ll have learned:

  • What Python generators are and how to use the yield expression
  • How to use multiple yield keywords in a single generator
  • How to use generator expressions to make generators simpler to write
  • Some common use cases for Python generators

Understanding Python Generators

Before diving into what generators are, let’s explore what iterators are. Iterators are objects that can be iterated upon, meaning that they return one item at a time. To be considered an iterator, an object needs to implement two methods: __iter__() and __next__(). Constructs such as for loops and list comprehensions are not iterators themselves; they consume iterators, which Python obtains from iterables such as lists, strings, and file objects.
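
As a small sketch of the iterator protocol described above, the built-in iter() and next() functions can be used to step through a list by hand. The variable names here are only illustrative:

```python
numbers = [10, 20, 30]       # a list is an iterable, not an iterator
iterator = iter(numbers)     # iter() calls numbers.__iter__()

first = next(iterator)       # next() calls iterator.__next__()
second = next(iterator)
print(first, second)         # 10 20

# An iterator returns itself from __iter__()
print(iter(iterator) is iterator)  # True
```

The list itself has no __next__() method; only the iterator object created from it does.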

Generators are a Pythonic way of creating iterators without needing to explicitly implement a class with __iter__() and __next__() methods. Similarly, you don’t need to keep track of the object’s internal state yourself. An important thing to note is that generators produce their values lazily: each value is computed only when it is requested, so the full sequence is never stored in memory.
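
To make the comparison concrete, here is a rough sketch of the same counter written both ways. The names CountUpTo and count_up_to are hypothetical and exist only for this illustration:

```python
# A hand-written iterator class: we track the state ourselves
class CountUpTo:
    def __init__(self, n):
        self.n = n
        self.num = 0

    def __iter__(self):
        # An iterator returns itself from __iter__()
        return self

    def __next__(self):
        if self.num >= self.n:
            raise StopIteration
        value = self.num
        self.num += 1
        return value

# The equivalent generator function: Python tracks the state for us
def count_up_to(n):
    num = 0
    while num < n:
        yield num
        num += 1

class_result = list(CountUpTo(3))
generator_result = list(count_up_to(3))
print(class_result)      # [0, 1, 2]
print(generator_result)  # [0, 1, 2]
```

Both versions produce the same values, but the generator function needs no class, no explicit StopIteration, and no instance attributes.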

The yield statement’s job is to control the flow of a generator function. Each time yield runs, it hands a value back to the caller and pauses the function with its state intact; the function resumes from that point the next time a value is requested, for example with the next() function.

Creating a Simple Generator

In this section, you’ll learn how to create a basic generator. One of the key syntactical differences between a normal function and a generator function is that the generator function includes a yield statement.

Let’s see how we can create a simple generator function:

# Creating a Simple Generator Function in Python
def return_n_values(n):
    num = 0
    while num < n:
        yield num
        num += 1

Let’s break down what is happening here:

  1. We define a function, return_n_values(), which takes a single parameter, n
  2. In the function, we first set the value of num to 0
  3. We then enter a while loop that evaluates whether the value of num is less than our function argument, n
  4. While that condition is True, we yield the value of num
  5. Then, we increment the value of num using the augmented assignment operator

Immediately, there are two very interesting things that happen:

  1. We use yield instead of return
  2. A statement follows the yield statement, which isn’t ignored

Let’s see how we can actually use this function:

# Calling a Simple Generator Function
values = return_n_values(5)
print(values)

# Returns: <generator object return_n_values at 0x7fe06812cc10>

In the code above, we create a variable values, which is the result of calling our generator function with an argument of 5 passed in. Calling the function doesn’t run its body; it returns a generator object, which is what we see when we print values.

So, how do we access the values in our generator object? This is done using the next() function, which calls the generator’s internal .__next__() method. Let’s see how this works in Python:

# Using the next() Function to Access Generator Values
values = return_n_values(5)
print(next(values))

# Returns: 0

We can see here that the value of 0 is returned. However, intuitively, we know that the values of 0 through 4 should be returned. Because a Python generator remembers the function’s state, we can call the next() function multiple times. Let’s call it a few more times:

# Calling the next() Function Multiple Times
values = return_n_values(5)

print(next(values))
print(next(values))
print(next(values))
print(next(values))
print(next(values))

# Returns: 
# 0
# 1
# 2
# 3
# 4

In this case, we’ve yielded all of the values that the while loop will produce. Let’s see what happens when we call the next() function a sixth time:

# Exhausting yielded Elements in a Python Generator
def return_n_values(n):
    num = 0
    while num < n:
        yield num
        num += 1

values = return_n_values(5)
print(next(values))
print(next(values))
print(next(values))
print(next(values))
print(next(values))
print(next(values))

# Returns: 
# 0
# 1
# 2
# 3
# 4

# Traceback (most recent call last):
#   File "/Users/nikpi/datagy/generators.py", line 13, in <module>
#     print(next(values))
# StopIteration

We can see in the code sample above that once the condition of our while loop is no longer True, the function body finishes and the sixth call to next() raises StopIteration.
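
If you would rather not have an exhausted generator raise, one option is to give next() a default value, or to catch StopIteration yourself. The sketch below shows both; the variable names are only illustrative:

```python
def return_n_values(n):
    num = 0
    while num < n:
        yield num
        num += 1

values = return_n_values(2)
first = next(values)
second = next(values)

# With a default, next() returns it instead of raising StopIteration
third = next(values, None)
print(first, second, third)  # 0 1 None

# Alternatively, catch the exception explicitly
exhausted = False
try:
    next(values)
except StopIteration:
    exhausted = True
print(exhausted)  # True
```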

In the next section, you’ll learn how to create a Python generator using a for loop.

Creating a Python Generator with a For Loop

In the previous example, you learned how to create and use a simple generator. However, the example above is complicated by the fact that we’re yielding a value and then incrementing it. This can often make generators more difficult for beginners to understand.

Instead, we can use a for loop, rather than a while loop, for simpler generators. Let’s rewrite our previous generator using a for loop to make the process a little more intuitive:

# Creating a Generator with a For Loop
def return_n_values(n):
    for i in range(n):
        yield i

three = return_n_values(3)

print(next(three))
print(next(three))
print(next(three))

# Returns:
# 0
# 1
# 2

In the code block above, we used a for loop instead of a while loop. We used the Python range() function to create a range of values from 0 up to, but not including, n. This simplifies the generator a little bit, making it more approachable to readers of your code.
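
When a generator only passes along the values of another iterable, Python’s yield from syntax offers an even shorter form. The sketch below compares it with the for loop version; return_n_values_delegating is a hypothetical name used only here:

```python
def return_n_values(n):
    for i in range(n):
        yield i

def return_n_values_delegating(n):
    # yield from yields each item of the iterable in turn
    yield from range(n)

loop_version = list(return_n_values(3))
delegating_version = list(return_n_values_delegating(3))

print(loop_version)        # [0, 1, 2]
print(delegating_version)  # [0, 1, 2]
```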

Unpacking a Generator with a For Loop

In many cases, you’ll see generators wrapped inside of for loops, in order to exhaust all possible yields. In these cases, the benefit of generators is less about remembering the state (though this is used, of course, internally), and more about using memory wisely.

# Unpacking Generators With a For Loop
def return_n_values(n):
    num = 0
    while num < n:
        yield num
        num += 1

values = return_n_values(5)
for val in values:
    print(val, end=' ')

# Returns: 0 1 2 3 4 

In the code block above, we used a for loop to iterate over the values the generator yields. The loop implicitly calls the __next__() method and stops cleanly when StopIteration is raised. Note that we’re using the optional end= parameter of the print function, which allows you to override the default newline character.
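
To see what the for loop is doing on our behalf, here is an approximate hand-written equivalent. It is a simplified sketch of the loop’s behavior rather than Python’s actual implementation, and it collects the values in a list named collected for illustration:

```python
def return_n_values(n):
    num = 0
    while num < n:
        yield num
        num += 1

values = return_n_values(5)
iterator = iter(values)   # for a generator, this is the generator itself
collected = []

while True:
    try:
        val = next(iterator)   # what the for loop calls on each pass
    except StopIteration:
        break                  # the for loop ends quietly here
    collected.append(val)

print(collected)  # [0, 1, 2, 3, 4]
```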

Creating a Python Generator with Multiple Yield Statements

A very interesting difference between regular Python functions and generators is that a generator can run more than one yield statement. A regular function can contain more than one return statement, but a call ends as soon as the first return is reached, so at most one of them ever executes.

Let’s take a look at an example where we define a generator with more than one yield statement:

# Using Multiple yield Statements in a Generator
def multiple_yields():
    yield 'Hello world.'
    yield 'Welcome to datagy.io!'

multiple = multiple_yields()
print(next(multiple))
print(next(multiple))
print(next(multiple))

# Returns:
# Hello world.
# Welcome to datagy.io!
# Traceback (most recent call last):
#   File "/Users/nikpi/datagy/generators.py", line 8, in <module>
#     print(next(multiple))
# StopIteration

In the code block above, our generator has more than one yield statement. The first call to next() returns only the first yielded value. We can keep calling next() until all the yielded values are depleted. At that point, the generator raises a StopIteration exception.
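
One consequence worth seeing in code is that a depleted generator stays depleted. To go through the values again, you call the generator function again. A short sketch, with illustrative variable names:

```python
def multiple_yields():
    yield 'Hello world.'
    yield 'Welcome to datagy.io!'

multiple = multiple_yields()
first_pass = list(multiple)    # consumes both values
second_pass = list(multiple)   # the same generator is now empty

# Calling the function again creates a fresh generator
fresh_pass = list(multiple_yields())

print(first_pass)   # ['Hello world.', 'Welcome to datagy.io!']
print(second_pass)  # []
print(fresh_pass)   # ['Hello world.', 'Welcome to datagy.io!']
```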

Understanding the Performance of Python Generators

One of the key things to understand is why you’d want to use a Python generator. Because Python generators evaluate lazily, they use significantly less memory than data structures that hold every value at once, such as lists.

For example, if we created a generator that yielded the first one million numbers, the generator doesn’t actually hold the values. Meanwhile, by using a list comprehension to create a list of the first one million values, the list actually holds the values. Let’s see what this looks like:

# Comparing the Size of a List versus a Generator
import sys

def return_n_values(n):
    num = 0
    while num < n:
        yield num
        num += 1

gen_values = return_n_values(1_000_000)
list_values = [i for i in range(1_000_000)]

print(f'Generator size: {sys.getsizeof(gen_values)}')
print(f'List size: {sys.getsizeof(list_values)}')

# Returns:
# Generator size: 112
# List size: 8448728

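In the code block above, we import the sys module, which gives us access to the getsizeof() function. We then print the size, in bytes, of both the generator and the list. We can see that the list is over 75,000 times larger. Note that getsizeof() counts only the list object itself and not the integers it refers to, so the real gap is even wider.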
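
A related point is that a generator’s size does not depend on how many values it will eventually yield, while a list grows with its contents. The sketch below checks this on CPython; exact byte counts vary by Python version and platform, so none are shown:

```python
import sys

def return_n_values(n):
    num = 0
    while num < n:
        yield num
        num += 1

# Creating a generator does no work up front, whatever the value of n
small_gen_size = sys.getsizeof(return_n_values(10))
large_gen_size = sys.getsizeof(return_n_values(10_000_000))

small_list_size = sys.getsizeof(list(range(10)))
large_list_size = sys.getsizeof(list(range(100_000)))

print(small_gen_size == large_gen_size)    # True
print(large_list_size > small_list_size)   # True
```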

In the following section, you’ll learn how to simplify creating generators by using generator expressions.

Creating Python Generator Expressions

When you want to create one-off generators, using a function can seem redundant. Similar to list and dictionary comprehensions, Python allows you to create generator expressions. This simplifies the process of creating generators, especially for generators that you only need to use once.

In order to create a generator expression, you wrap the expression in parentheses. Say you wanted to create a generator that yields the numbers from zero through four. Then, you could write (i for i in range(5)).

# Creating Generator Expressions
five_values = (i for i in range(5))
print(next(five_values))
print(next(five_values))
print(next(five_values))
print(next(five_values))
print(next(five_values))

# Returns:
# 0
# 1
# 2
# 3
# 4

In the example above, we used a generator expression to yield values from 0 to 4. We then call the next() function five times to print out the values in the generator.
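
Generator expressions are often passed straight into functions that consume an iterable, such as sum() or any(). When the generator expression is the only argument, the extra pair of parentheses can be dropped. A brief sketch with illustrative names:

```python
# Sum of squares without building an intermediate list
total = sum(i * i for i in range(5))
print(total)  # 30

# any() stops pulling values as soon as one is truthy
has_large = any(i > 2 for i in range(5))
print(has_large)  # True

# Parentheses make a generator; square brackets make a list
squares_gen = (i * i for i in range(5))
squares_list = [i * i for i in range(5)]
print(list(squares_gen) == squares_list)  # True
```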

In the following section, we’ll dive further into the yield statement.

Understanding the Python yield Statement

The Python yield statement can often feel unintuitive to newcomers to generators. What separates the yield statement from the return statement is that rather than ending the function, it simply suspends it.

The yield statement suspends the function and hands the yielded value back to the caller. When next() is called again, the function resumes right after that yield and runs until the following value is yielded.

What is great about this is that the state of the function, including its local variables and its position in the code, is saved. This means that Python knows where to pick up the iteration, allowing it to move forward without a problem.
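
One way to watch this suspend-and-resume behavior is to record what happens in order. The sketch below uses a hypothetical generator named traced and a list named events to log each step:

```python
events = []

def traced():
    events.append('started')
    yield 1
    events.append('resumed after first yield')
    yield 2
    events.append('resumed after second yield')

gen = traced()
events.append('generator created')   # the body has not run yet

first = next(gen)                     # runs up to the first yield
events.append(f'received {first}')

second = next(gen)                    # resumes, runs to the second yield
events.append(f'received {second}')

last = next(gen, 'done')              # resumes, then the function ends

for event in events:
    print(event)

# generator created
# started
# received 1
# resumed after first yield
# received 2
# resumed after second yield
```

Notice that 'started' is logged only after the first call to next(), not when the generator is created.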

How to Throw Exceptions in Python Generators Using throw

Python generators have a special method, .throw(), which raises an exception inside the generator at the point where it is currently paused. This can be helpful if you know that an erroneous value may exist in the generator.

Let’s take a look at how we can use the .throw() method in a Python generator:

# Throwing an Exception with the .throw() Method
five_values = (i for i in range(5))
for value in five_values:
    print(value)
    if value == 3:
        five_values.throw(ValueError('The number is three!'))

# Returns:
# 0
# 1
# 2
# 3
# ValueError: The number is three!

Let’s break down how we can use the .throw() method to throw an exception in a Python generator:

  1. We create our generator using a generator expression
  2. We then use a for loop to loop over each value
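  3. Within the for loop, we use an if statement to check if the value is equal to 3. If it is, we call the .throw() method, which raises the ValueError inside the generator. Because the generator doesn’t handle it, the error propagates back out and ends the loop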
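
A generator function can also catch an exception that is thrown into it and carry on. The following is a minimal sketch of that idea; resilient_values and the handled list are hypothetical names introduced for this illustration:

```python
handled = []

def resilient_values(n):
    num = 0
    while num < n:
        try:
            yield num
        except ValueError:
            # The thrown exception surfaces at the paused yield
            handled.append(num)
        num += 1

gen = resilient_values(5)
first = next(gen)                                   # pauses at yield 0

# The generator catches the error, moves on, and yields its next value
resumed = gen.throw(ValueError('skip this one'))
rest = list(gen)

print(first)    # 0
print(resumed)  # 1
print(rest)     # [2, 3, 4]
print(handled)  # [0]
```

When the generator handles the exception, .throw() returns the next yielded value instead of raising.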

In some cases, you may simply want to stop a generator, rather than throwing an exception. This is what you’ll learn in the following section.

How to Stop a Python Generator Using close

Python allows you to stop iterating over a generator by using the .close() method. This can be very helpful if you’re reading a file using a generator and you only want to read the file until a certain condition is met.

Let’s repeat our previous example, though we’ll stop the generator rather than throwing an exception:

# Stopping Execution of a Generator Using the .close() Method
five_values = (i for i in range(5))
for value in five_values:
    print(value)
    if value == 3:
        five_values.close()

# Returns:
# 0
# 1
# 2
# 3

In the code block above we used the .close() method to stop the iteration. Once the generator is closed, the for loop’s next request raises StopIteration, so the loop ends after printing 3. While the example above is simple, it can be extended quite a lot. Imagine reading a file using Python: rather than reading the entire file, you may only want to read it until you find a given line.
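
The file-reading idea can be sketched as follows. To keep the example self-contained, it uses io.StringIO as a stand-in for a real file, and the read_lines function and the 'STOP' marker are assumptions made for this illustration:

```python
import io

def read_lines(file_obj):
    try:
        for line in file_obj:
            yield line.rstrip('\n')
    finally:
        # Runs when the generator finishes or is closed early
        file_obj.close()

fake_file = io.StringIO('alpha\nbeta\nSTOP\ngamma\n')
lines = read_lines(fake_file)

seen = []
for line in lines:
    if line == 'STOP':
        lines.close()   # stop reading; the finally block closes the file
        break
    seen.append(line)

print(seen)              # ['alpha', 'beta']
print(fake_file.closed)  # True
```

Placing the cleanup in a finally block means the file is closed whether the generator runs to the end or is closed partway through.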

Conclusion

In this tutorial, you learned how to use generators in Python, including how to interpret the yield expression and how to use generator expressions. You learned what the benefits of Python generators are and why they’re often referred to as lazy iteration. Then, you learned how they work and how they’re different from normal functions.

Nik Piepenbreier

Nik is the author of datagy.io and has over a decade of experience working with data analytics, data science, and Python. He specializes in teaching developers how to use Python for data science using hands-on tutorials.
