
Python Iterators and Generators Tutorial

Explore the difference between Python iterators and generators and learn which is best to use in various situations.
Nov 2022  · 10 min read

Iterators are objects that can be iterated upon. They are a common feature of the Python programming language, working behind the scenes of for loops and list comprehensions. Any object from which an iterator can be obtained is known as an iterable.

There is a lot of work that goes into constructing an iterator. For instance, every iterator object must implement both an __iter__() method and a __next__() method. In addition, the implementation must track the object's internal state and raise a StopIteration exception once no more values can be returned. These rules are known as the iterator protocol.
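The protocol described above can be sketched as a small class. This is a minimal illustration, not a canonical implementation: the class name Countdown and its start parameter are hypothetical and chosen only for the example.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        # internal state: the next value to hand out
        self.current = start

    def __iter__(self):
        # an iterator returns itself from __iter__()
        return self

    def __next__(self):
        # signal exhaustion once there is nothing left to return
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(next(countdown))  # 3
print(list(countdown))  # [2, 1] (the values that remain)
```

Because __iter__() returns the object itself, an instance can be used directly in a for loop as well as with next().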

Implementing your own iterator is a drawn-out process, and it is only sometimes necessary. A simpler alternative is to use a generator object. Generators are a special type of function that use the yield keyword to return an iterator that may be iterated over, one value at a time. 

The ability to discern the appropriate scenarios to implement an iterator or use a generator will improve your skills as a Python programmer. In the remainder of this tutorial, we will emphasize the distinctions between the two objects, which will help you decide the best one to use for various situations. 

Glossary

| Term | Definition |
|---|---|
| Iterable | A Python object that can be looped over. Examples of iterables include lists, sets, tuples, dictionaries, and strings. |
| Iterator | An object that returns its values one at a time and remembers its position between calls. An iterator may be finite or infinite. |
| Generator | A special type of function that does not return a single value: it returns an iterator object that produces a sequence of values. |
| Lazy evaluation | An evaluation strategy whereby values are only produced when required. Certain developer circles also refer to lazy evaluation as "call-by-need." |
| Iterator protocol | A set of rules that must be followed to define an iterator in Python. |
| next() | A built-in function used to return the next item from an iterator. |
| iter() | A built-in function used to obtain an iterator from an iterable. |
| yield | A Python keyword similar to the return keyword, except that yield hands a value back to the caller and suspends the function instead of ending it. Calling a function that contains yield returns a generator object. |

Python Iterators & Iterables

Iterables are objects capable of returning their members one at a time; in other words, they can be iterated over. Popular built-in Python data structures such as lists, tuples, and sets qualify as iterables. Strings and dictionaries are iterables too: iterating over a string yields its characters, and iterating over a dictionary yields its keys. As a rule of thumb, consider any object that can be looped over in a for loop an iterable.

Exploring Python iterables with examples

Given the definitions, we may conclude that all iterators are also iterable. However, not every iterable is an iterator. An iterable produces an iterator only when iter() is called on it, which a for loop does automatically.

To demonstrate this functionality, we will instantiate a list, which is an iterable, and produce an iterator by calling the iter() built-in function on the list. 

list_instance = [1, 2, 3, 4]
print(iter(list_instance))

"""
<list_iterator object at 0x7fd946309e90>
"""

Although the list by itself is not an iterator, calling the iter() function on it returns a new iterator object that walks over the list's elements. The list itself is left unchanged.

To demonstrate that not all iterables are iterators, we will instantiate the same list object and attempt to call the next() function, which is used to return the next item in an iterator.  

list_instance = [1, 2, 3, 4]
print(next(list_instance))
"""
--------------------------------------------------------------------
TypeError                         Traceback (most recent call last)
<ipython-input-2-0cb076ed2d65> in <module>()
    3 print(iter(list_instance))
    4
----> 5 print(next(list_instance))
TypeError: 'list' object is not an iterator
"""

In the code above, attempting to call the next() function on the list raised a TypeError (learn more about Exception and Error Handling in Python). This happened because a list object is an iterable, not an iterator.
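The distinction can also be checked without triggering an error. The sketch below assumes the abstract base classes Iterable and Iterator from the standard library's collections.abc module; the variable name list_iterator is illustrative.

```python
from collections.abc import Iterable, Iterator

list_instance = [1, 2, 3, 4]
list_iterator = iter(list_instance)

# a list is iterable, but it is not an iterator
print(isinstance(list_instance, Iterable))   # True
print(isinstance(list_instance, Iterator))   # False

# the object returned by iter() is both
print(isinstance(list_iterator, Iterable))   # True
print(isinstance(list_iterator, Iterator))   # True

# calling iter() on an iterator returns the same object
print(iter(list_iterator) is list_iterator)  # True
```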

Exploring Python iterators with examples

Thus, if the goal is to iterate on a list, then an iterator object must first be produced. Only then can we manage the iteration through the values of the list.

# instantiate a list object
list_instance = [1, 2, 3, 4]

# get an iterator over the list
iterator = iter(list_instance)

# return items one at a time
print(next(iterator))
print(next(iterator))
print(next(iterator))
print(next(iterator))
"""
1
2
3
4
"""

Python automatically produces an iterator object whenever you attempt to loop through an iterable object. 

# instantiate a list object
list_instance = [1, 2, 3, 4]

# loop through the list
for item in list_instance:
    print(item)
"""
1
2
3
4
"""

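Behind the scenes, the for loop calls iter() on the list and then calls next() on the resulting iterator until a StopIteration exception is raised. The loop catches that exception and ends.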
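The loop's behavior can be approximated by hand with a while loop. This is a rough sketch of the mechanism, not the interpreter's actual implementation.

```python
list_instance = [1, 2, 3, 4]

# roughly what the for loop above does behind the scenes
iterator = iter(list_instance)
while True:
    try:
        item = next(iterator)
    except StopIteration:
        # the iterator is exhausted, so leave the loop
        break
    print(item)
```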

The values obtained from an iterator can only be retrieved from left to right. Python does not have a previous() function to enable developers to move backward through an iterator. 

The lazy nature of iterators

It is possible to define multiple iterators based on the same iterable object. Each iterator will maintain its own state of progress. Thus, by defining multiple iterator instances of an iterable object, it is possible to iterate to the end of one instance while the other instance remains at the beginning.

list_instance = [1, 2, 3, 4]
iterator_a = iter(list_instance)
iterator_b = iter(list_instance)
print(f"A: {next(iterator_a)}")
print(f"A: {next(iterator_a)}")
print(f"A: {next(iterator_a)}")
print(f"A: {next(iterator_a)}")
print(f"B: {next(iterator_b)}")
"""
A: 1
A: 2
A: 3
A: 4
B: 1
"""

Notice that iterator_b returns the first element of the list even though iterator_a has already reached the end.

Thus, we can say iterators have a lazy nature: when an iterator is created, its elements are not produced until they are requested. In other words, an element of our list instance is only returned when we explicitly ask for it by calling next() on the iterator.
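A list iterator walks over values that already exist, so laziness is easier to observe with an iterator that computes its values. The sketch below uses the built-in map(), which returns an iterator in Python 3; the helper square() and the calls list are hypothetical and exist only to record when work happens.

```python
calls = []

def square(x):
    # record each call so we can see when the work actually happens
    calls.append(x)
    return x * x

lazy_squares = map(square, [1, 2, 3, 4])  # map() returns an iterator
print(calls)               # [] (nothing has been computed yet)
print(next(lazy_squares))  # 1
print(calls)               # [1] (only the first element was processed)
```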

However, all of the values from an iterator may be extracted at once by passing the iterator to a built-in container constructor (e.g., list(), set(), or tuple()), which forces the iterator to produce all of its elements.

# instantiate iterable
list_instance = [1, 2, 3, 4]

# produce an iterator from an iterable
iterator = iter(list_instance)
print(list(iterator))
"""
[1, 2, 3, 4]
"""

This is not recommended when the iterator produces a large number of elements or the elements themselves are large: every element must be generated and held in memory at once, which can be slow and memory-hungry.
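When only part of the data is needed, one alternative is to pull a bounded number of items instead of materializing everything. A minimal sketch, assuming itertools.islice from the standard library; the names big_iterator and first_five are illustrative.

```python
from itertools import islice

# range(10**9) describes a billion values without storing them all
big_iterator = iter(range(10**9))

# take only the first five values instead of calling list(big_iterator)
first_five = list(islice(big_iterator, 5))
print(first_five)          # [0, 1, 2, 3, 4]

# the iterator remembers its position and continues from there
print(next(big_iterator))  # 5
```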

When a large data file would swamp your machine's memory, or when a function needs to maintain its internal state between calls but writing a full iterator class is not worth the effort, a better alternative is to use a generator.

Python Generators

The most expedient alternative to implementing an iterator is to use a generator. Although generator functions look like ordinary Python functions, they behave differently. For starters, a generator function does not return all of its items at once. Instead, it uses the yield keyword to produce items on the fly, one at a time. Thus, we can say a generator is a special kind of function that leverages lazy evaluation.

Generators do not store their contents in memory as you would expect a typical iterable to do. For example, if the goal were to find all of the factors for a positive integer, we would typically implement a traditional function (learn more about Python Functions in this tutorial) as follows:  

def factors(n):
    factor_list = []
    for val in range(1, n + 1):
        if n % val == 0:
            factor_list.append(val)
    return factor_list

print(factors(20))
"""
[1, 2, 4, 5, 10, 20]
"""

The code above returns the entire list of factors. However, notice the difference when a generator is used instead of a traditional Python function:

def factors(n):
    for val in range(1, n + 1):
        if n % val == 0:
            yield val
print(factors(20))

"""
<generator object factors at 0x7fd938271350>
"""

Since we used the yield keyword instead of return, calling factors(20) does not run the function body. Instead, Python returns a generator object, which runs the body on demand and keeps track of its own state between values.
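One way to observe this is to place print() calls inside a generator function and watch when they fire. The function announce() below is a hypothetical example written for this purpose.

```python
def announce():
    print("body started")
    yield 1
    print("resumed after first yield")
    yield 2

gen = announce()            # nothing is printed here
print(type(gen).__name__)   # generator
print(next(gen))            # prints "body started", then 1
print(next(gen))            # prints "resumed after first yield", then 2
```

The body only begins executing on the first call to next(), and each later call picks up where the previous one stopped.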

Consequently, it is possible to call the next() function on the lazy iterator to show the elements of the series one at a time. 

def factors(n):
    for val in range(1, n + 1):
        if n % val == 0:
            yield val

factors_of_20 = factors(20)
print(next(factors_of_20))

"""
1
"""

Another way to create a generator is with a generator expression (sometimes called a generator comprehension). Generator expressions use a syntax similar to that of list comprehensions, except that they are wrapped in round brackets instead of square ones.

n = 20
print((val for val in range(1, n + 1) if n % val == 0))
"""
<generator object <genexpr> at 0x7fd940c31e50>
"""

Exploring Python’s yield Keyword

The yield keyword controls the flow of a generator function. Instead of exiting the function as return does, yield hands a value back to the caller and suspends the function while remembering the state of its local variables.

The generator returned by calling a generator function can be assigned to a variable and advanced with the next() function. Each call to next() executes the function up to the next yield statement it encounters. Once a yield is hit, the execution of the function is suspended and its state is saved. Thus, we can resume the function whenever we choose.

When resumed, the function continues from the line after the yield at which it was suspended. For example:

def yield_multiple_statements():
    yield "This is the first statement"
    yield "This is the second statement"
    yield "This is the third statement"
    yield "This is the last statement. Don't call next again!"
example = yield_multiple_statements()
print(next(example))
print(next(example))
print(next(example))
print(next(example))
print(next(example))
"""
This is the first statement
This is the second statement
This is the third statement
This is the last statement. Don't call next again!
--------------------------------------------------------------------
StopIteration                  Traceback (most recent call last)
<ipython-input-25-4aaf9c871f91> in <module>()
    11 print(next(example))
    12 print(next(example))
---> 13 print(next(example))
StopIteration:
"""

In the code above, our generator has four yield statements, but we call next() on it five times, which raises a StopIteration exception. This happens because our generator is finite: after the fourth value it is exhausted, and any further call to next() has nothing left to return.
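There are a few common ways to avoid running into the exception. The sketch below is illustrative only: count_up() is a hypothetical infinite generator, itertools.islice bounds how much of it is consumed, and the second argument to next() is a default returned in place of raising StopIteration.

```python
from itertools import islice

def count_up(start=0):
    # hypothetical infinite generator: this loop never finishes on its own
    while True:
        yield start
        start += 1

# an infinite generator never raises StopIteration, so take a bounded slice
first_three = list(islice(count_up(), 3))
print(first_three)  # [0, 1, 2]

# for a finite generator, passing a default to next() avoids the exception
finite = (letter for letter in "ab")
print(next(finite, "exhausted"))  # a
print(next(finite, "exhausted"))  # b
print(next(finite, "exhausted"))  # exhausted

# a for loop handles StopIteration automatically
for letter in (letter for letter in "ab"):
    print(letter)
```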

Wrap-Up 

To recap, iterators are objects that can be iterated on, and generators are special functions that leverage lazy evaluation. Implementing your own iterator means you must write both an __iter__() and a __next__() method, whereas a generator can be created with the yield keyword in a Python function or with a generator expression.
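The two approaches can be compared side by side. In this sketch, ResettableCounter and counter() are hypothetical names, and the reset() method stands in for any extra behavior a custom iterator might expose.

```python
class ResettableCounter:
    """Hypothetical iterator that counts from 1 to `limit` and can be reset."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current

    def reset(self):
        # extra behavior that a plain generator does not offer
        self.current = 0


def counter(limit):
    # generator equivalent: shorter, but it cannot be rewound
    for value in range(1, limit + 1):
        yield value


custom = ResettableCounter(3)
print(list(custom))  # [1, 2, 3]
custom.reset()
print(list(custom))  # [1, 2, 3] (again, after reset())

gen = counter(3)
print(list(gen))     # [1, 2, 3]
print(list(gen))     # [] (a new generator must be created)
```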

You may prefer a custom iterator over a generator when you require an object with complex state-maintaining behavior or wish to expose methods beyond __next__(), __iter__(), and __init__(). On the other hand, a generator may be preferable when dealing with large sets of data, since it does not store its contents in memory, or when a full iterator class is simply not necessary.
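The memory difference can be illustrated with sys.getsizeof from the standard library. This is a rough sketch: getsizeof() reports only the size of the container object itself, and the exact byte counts vary by Python version and platform, so the example compares the sizes rather than printing them.

```python
import sys

n = 100_000
squares_list = [val * val for val in range(n)]  # every element is built up front
squares_gen = (val * val for val in range(n))   # elements are built on demand

# the list grows with n, while the generator object stays small
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True

# both produce the same values when consumed
print(sum(squares_gen) == sum(squares_list))  # True
```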
