Convert a List of Dictionaries to a Pandas DataFrame

In this tutorial, you’ll learn how to convert a list of Python dictionaries into a Pandas DataFrame. Pandas provides several ways to do this. You’ll learn how to use the DataFrame constructor, the from_dict and from_records methods, and the json_normalize function.

By the end of this tutorial, you’ll have learned:

  • How to convert a list of dictionaries to a Pandas DataFrame
  • How to work with different sets of columns across dictionaries
  • How to set an index when converting a list of dictionaries to a DataFrame
  • How to convert nested dictionaries to a Pandas DataFrame

Summary of Methods

The table below breaks down the different ways in which you can read a list of dictionaries into a Pandas DataFrame. Each of these is covered in depth throughout the tutorial:

Method Name        Works with missing keys   Read only some columns   Set an index              Read nested dictionaries
DataFrame()        Yes                       Yes                      Yes                       No
from_dict()        Yes                       No                       Only using .set_index()   No
from_records()     Yes                       Yes                      Yes                       No
json_normalize()   Yes                       Yes                      Yes                       Yes

Exploring different methods to read a list of dictionaries into a Pandas DataFrame

Convert a List of Dictionaries to a Pandas DataFrame

In this section, you’ll learn how to convert a list of dictionaries to a Pandas DataFrame using the Pandas DataFrame class. By passing in a list of dictionaries, you’re easily able to create a DataFrame.

Each dictionary will represent a record in the DataFrame, while the keys become the columns. Let’s take a look at an example where each dictionary contains every key:

# Converting a List of Dictionaries to a DataFrame
import pandas as pd

list_of_dicts = [
{'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
{'Name': 'Kate', 'Age': 32, 'Location': 'London'},
{'Name': 'Evan', 'Age': 36, 'Location': 'London'}]

df = pd.DataFrame(list_of_dicts)

print(df)

# Returns:
#    Name  Age Location
# 0   Nik   33  Toronto
# 1  Kate   32   London
# 2  Evan   36   London

Because each dictionary in the list contains the same keys, several different methods produce the same DataFrame. Any of the following would work:

# These methods all produce the same result
df = pd.DataFrame(list_of_dicts)
df = pd.DataFrame.from_dict(list_of_dicts)
df = pd.DataFrame.from_records(list_of_dicts)
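
One way to check this claim for yourself is to compare the three results with the DataFrame.equals() method. The sketch below reuses the sample data from above; the variable names (df_constructor, df_from_dict, df_from_records) are only illustrative. If the three methods really do agree, both comparisons print True.

```python
import pandas as pd

list_of_dicts = [
    {'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
    {'Name': 'Kate', 'Age': 32, 'Location': 'London'},
    {'Name': 'Evan', 'Age': 36, 'Location': 'London'},
]

# Build the same DataFrame three different ways
df_constructor = pd.DataFrame(list_of_dicts)
df_from_dict = pd.DataFrame.from_dict(list_of_dicts)
df_from_records = pd.DataFrame.from_records(list_of_dicts)

# .equals() compares the shape, the values, and the column dtypes
same_as_from_dict = df_constructor.equals(df_from_dict)
same_as_from_records = df_constructor.equals(df_from_records)

print(same_as_from_dict)
print(same_as_from_records)
```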

Working with Missing Keys When Converting a List of Dictionaries to a Pandas DataFrame

Let’s now take a look at a more complex example. In the example below, we’ll provide dictionaries where one dictionary will be missing a key. Let’s use the .from_dict() method to read the list to see how the data will be read:

# Reading Dictionaries with Missing Keys
import pandas as pd

list_of_dicts = [{'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
{'Name': 'Kate', 'Age': 32, 'Location': 'London'},
{'Name': 'Evan', 'Age': 36}]

df = pd.DataFrame.from_dict(list_of_dicts)

print(df)

# Returns:
#    Name  Age Location
# 0   Nik   33  Toronto
# 1  Kate   32   London
# 2  Evan   36      NaN

You get the same result whether you use the pd.DataFrame() constructor, the .from_dict() method, or the .from_records() method. Whenever a dictionary is missing a key, the corresponding cell is filled with a missing value, NaN.
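
One side effect worth knowing about: NaN is a floating-point value, so when the missing key belongs to an otherwise integer column, Pandas stores that column as floats. The sketch below uses a hypothetical variation of the sample data in which Kate’s age is missing, counts the missing values per column with .isna().sum(), and then inspects the dtype of the 'Age' column.

```python
import pandas as pd

# Hypothetical variation of the sample data with two missing keys
list_of_dicts = [
    {'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
    {'Name': 'Kate', 'Location': 'London'},  # 'Age' is missing here
    {'Name': 'Evan', 'Age': 36},             # 'Location' is missing here
]

df = pd.DataFrame(list_of_dicts)

# Count the missing values in each column
missing_counts = df.isna().sum()
print(missing_counts)

# The NaN in 'Age' means the column is no longer stored as integers
age_dtype = df['Age'].dtype
print(age_dtype)
```

If you need to replace the missing values afterwards, the .fillna() method is one option.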

Reading Only Some Columns When Converting a List of Dictionaries to a Pandas DataFrame

You’ll often want to read dictionaries into a Pandas DataFrame but keep only a subset of the columns. In this case, you can use the columns= parameter. Note that this parameter only works in the pd.DataFrame() constructor and the pd.DataFrame.from_records() method. Passing it to the pd.DataFrame.from_dict() method with its default orient='columns' raises a ValueError.

Let’s load the same list of dictionaries but only read two of the columns:

# Reading only a subset of columns
import pandas as pd

list_of_dicts = [{'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
{'Name': 'Kate', 'Age': 32, 'Location': 'London'},
{'Name': 'Evan', 'Age': 36}]

df = pd.DataFrame.from_records(list_of_dicts, columns=['Name', 'Age'])
# Same as: df = pd.DataFrame(list_of_dicts, columns=['Name', 'Age'])

print(df)

# Returns:
#    Name  Age
# 0   Nik   33
# 1  Kate   32
# 2  Evan   36
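
The ValueError mentioned earlier in this section can be reproduced with a short sketch. It also shows one possible workaround if you prefer to stay with .from_dict(): read every column first, then select the ones you need. The raised flag is only there to make the outcome easy to inspect.

```python
import pandas as pd

list_of_dicts = [
    {'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
    {'Name': 'Kate', 'Age': 32, 'Location': 'London'},
    {'Name': 'Evan', 'Age': 36},
]

# With the default orient='columns', from_dict() rejects the columns= parameter
try:
    pd.DataFrame.from_dict(list_of_dicts, columns=['Name', 'Age'])
    raised = False
except ValueError as error:
    raised = True
    print(f'ValueError: {error}')

# Workaround: read all columns, then select the subset afterwards
df = pd.DataFrame.from_dict(list_of_dicts)[['Name', 'Age']]
print(df)
```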

Setting an Index When Converting a List of Dictionaries to a Pandas DataFrame

There are two different types of indices you may want to set when creating a DataFrame:

  1. A DataFrame index that is not part of the data you’re reading (such as 1, 2, 3), or
  2. A DataFrame index from the data that you’re reading (such as one of the columns)

Let’s take a look at the first use case. For this, we can only rely on the pd.DataFrame() constructor and the pd.DataFrame.from_records() method. To pass in an arbitrary index, we can use the index= parameter to pass in a list of values.

Let’s see how this is done in Pandas:

# Setting an index when reading a list of dictionaries
import pandas as pd

list_of_dicts = [{'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
{'Name': 'Kate', 'Age': 32, 'Location': 'London'},
{'Name': 'Evan', 'Age': 36, 'Location': 'New York'}]

df = pd.DataFrame.from_records(list_of_dicts, index=['Employee_001', 'Employee_002', 'Employee_003'])
# Same as: df = pd.DataFrame(list_of_dicts, index=['Employee_001', 'Employee_002', 'Employee_003'])

print(df)

# Returns:
#               Name  Age  Location
# Employee_001   Nik   33   Toronto
# Employee_002  Kate   32    London
# Employee_003  Evan   36  New York
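
The list passed to index= needs one label per dictionary, so typing the labels by hand gets tedious for longer lists. As a minimal sketch, the labels can be generated from the length of the list instead. The labels variable and the Employee_ naming scheme simply mirror the example above and are otherwise arbitrary.

```python
import pandas as pd

list_of_dicts = [
    {'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
    {'Name': 'Kate', 'Age': 32, 'Location': 'London'},
    {'Name': 'Evan', 'Age': 36, 'Location': 'New York'},
]

# Build one zero-padded label per record: Employee_001, Employee_002, ...
labels = [f'Employee_{number:03d}' for number in range(1, len(list_of_dicts) + 1)]

df = pd.DataFrame(list_of_dicts, index=labels)
print(df)
```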

To read a list of dictionaries and set an index based on one of the keys, we can use any of the three methods covered above. The pd.DataFrame() constructor and the .from_dict() method don’t provide a parameter that does this directly, so we can chain the .set_index() method onto the result instead.

Let’s read our data and use the 'Name' column as the index:

# Setting a column as an index
import pandas as pd

list_of_dicts = [{'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
{'Name': 'Kate', 'Age': 32, 'Location': 'London'},
{'Name': 'Evan', 'Age': 36, 'Location': 'New York'}]

df = pd.DataFrame(list_of_dicts).set_index('Name')
# Same as: df = pd.DataFrame.from_dict(list_of_dicts).set_index('Name')
# Same as: df = pd.DataFrame.from_records(list_of_dicts).set_index('Name')

print(df)

# Returns:
#       Age  Location
# Name               
# Nik    33   Toronto
# Kate   32    London
# Evan   36  New York
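
As a possible shortcut, the index= parameter of pd.DataFrame.from_records() is documented as also accepting a field name, in which case that field becomes the index and is dropped from the columns. Treat the sketch below as an illustration of that documented behavior rather than a replacement for .set_index(), which remains the approach that works with all three methods.

```python
import pandas as pd

list_of_dicts = [
    {'Name': 'Nik', 'Age': 33, 'Location': 'Toronto'},
    {'Name': 'Kate', 'Age': 32, 'Location': 'London'},
    {'Name': 'Evan', 'Age': 36, 'Location': 'New York'},
]

# Pass a field name (a string) rather than a list of labels
df = pd.DataFrame.from_records(list_of_dicts, index='Name')
print(df)
```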

In the final section, you’ll learn how to use the json_normalize() function to read a list of nested dictionaries to a Pandas DataFrame.

json_normalize: Reading Nested Dictionaries to a Pandas DataFrame

When loading data from different sources, such as web APIs, you may get a list of nested dictionaries returned to you. When reading these lists of dictionaries using the methods shown above, the nested dictionaries will simply be returned as dictionaries in a column.
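
To see this behavior, here is a small sketch that passes nested sample data (the same shape as the example later in this section) to the pd.DataFrame() constructor. The first_location variable is only there to inspect a single cell.

```python
import pandas as pd

list_of_dicts = [
    {'Name': 'Nik', 'Age': 33, 'Location': {'City': 'Toronto', 'Country': 'Canada'}},
    {'Name': 'Kate', 'Age': 32, 'Location': {'City': 'London', 'Country': 'UK'}},
    {'Name': 'Evan', 'Age': 36, 'Location': {'City': 'New York', 'Country': 'USA'}},
]

df = pd.DataFrame(list_of_dicts)

# 'Location' stays a single column...
print(df.columns.tolist())

# ...and each of its cells still holds a whole dictionary
first_location = df.loc[0, 'Location']
print(type(first_location))
print(first_location['City'])
```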

However, in many cases, you’ll want each of these nested fields to become its own column. For this, we can use the pd.json_normalize() function.

Let’s take a look at an example where our list’s dictionaries are nested and use the json_normalize function to convert it to a DataFrame:

# Convert a List of Nested Dictionaries to a DataFrame
import pandas as pd

list_of_dicts = [
    {'Name': 'Nik', 'Age': 33, 'Location': {'City': 'Toronto', 'Country': 'Canada'}},
    {'Name': 'Kate', 'Age': 32, 'Location': {'City': 'London', 'Country': 'UK'}},
    {'Name': 'Evan', 'Age': 36, 'Location': {'City': 'New York', 'Country': 'USA'}}
]

df = pd.json_normalize(list_of_dicts)

print(df)

# Returns:
#    Name  Age Location.City Location.Country
# 0   Nik   33       Toronto           Canada
# 1  Kate   32        London               UK
# 2  Evan   36      New York              USA
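
The dot in column names such as Location.City comes from the sep= parameter of pd.json_normalize(), which defaults to '.'. If you’d rather have names that are easier to use with attribute access or in other tools, you can pass a different separator. A minimal sketch using an underscore:

```python
import pandas as pd

list_of_dicts = [
    {'Name': 'Nik', 'Age': 33, 'Location': {'City': 'Toronto', 'Country': 'Canada'}},
    {'Name': 'Kate', 'Age': 32, 'Location': {'City': 'London', 'Country': 'UK'}},
    {'Name': 'Evan', 'Age': 36, 'Location': {'City': 'New York', 'Country': 'USA'}},
]

# Join the parent key and the nested key with an underscore instead of a dot
df = pd.json_normalize(list_of_dicts, sep='_')
print(df.columns.tolist())
```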

Conclusion

In this tutorial, you learned how to read a list of dictionaries into a Pandas DataFrame. You learned four different ways to accomplish this. You also learned how to read only a subset of columns, how to deal with missing data, and how to set an index. Finally, you learned how to read a list of nested dictionaries into a Pandas DataFrame.

Nik Piepenbreier

Nik is the author of datagy.io and has over a decade of experience working with data analytics, data science, and Python. He specializes in teaching developers how to use Python for data science using hands-on tutorials.