In this post, you’ll learn how to use Python to convert a Pandas DataFrame into a dictionary. Because Pandas DataFrames are complex data structures, there are many different ways in which this can be done. This post explores all of the different options that Pandas makes available! For example, Pandas allows you to convert a DataFrame into a list of dictionaries or a dictionary of column and value mappings.
By the end of this tutorial, you’ll have learned:
- How the Pandas .to_dict() method works
- How to customize the output of the method, and
- How to convert only a subset of columns to a dictionary using the zip() function
Understanding the Pandas .to_dict() Method
Before diving into applying the Pandas .to_dict() method, let’s take a look at what the method looks like:
# Understanding the Pandas .to_dict() Method
import pandas as pd
df = pd.DataFrame()
# The call below spells out the default arguments
df.to_dict(orient='dict', into=dict)
The orient= parameter accepts seven different values ('dict', 'list', 'series', 'split', 'tight', 'records', and 'index'), each giving you a different way to structure the resulting dictionary. This guide explores all of them! Let’s dive into how to use the method.
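As a rough overview before the detailed sections, the sketch below loops over the seven orient= values and records the type of object each one returns. It also tries the into= parameter from the signature above, using OrderedDict as one example of an alternative mapping class. The two-row DataFrame and the variable names are made up for illustration, and 'tight' assumes Pandas 1.4.0 or later.

```python
# Overview of the orient= and into= parameters (illustrative sketch)
from collections import OrderedDict

import pandas as pd

df = pd.DataFrame({'Name': ['Nik', 'Evan'], 'Age': [33, 32]})

# 'tight' assumes Pandas 1.4.0 or later
orients = ['dict', 'list', 'series', 'split', 'tight', 'records', 'index']
result_types = {
    orient: type(df.to_dict(orient=orient)).__name__
    for orient in orients
}
print(result_types)
# Returns:
# {'dict': 'dict', 'list': 'dict', 'series': 'dict', 'split': 'dict',
#  'tight': 'dict', 'records': 'list', 'index': 'dict'}

# into= swaps the mapping class used to build the result
ordered = df.to_dict(orient='list', into=OrderedDict)
print(type(ordered).__name__)
# Returns: OrderedDict
```

Only 'records' returns a list. Every other value returns a single mapping.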
Loading a Sample Pandas DataFrame
If you’d like to follow along with this tutorial line-by-line, I have provided a sample Pandas DataFrame in the code block below. The DataFrame is deliberately kept simple in order to better see what is happening.
# Load a Sample Pandas DataFrame
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
print(df)
# Returns:
# Name Age Score
# 0 Nik 33 90
# 1 Evan 32 95
# 2 Kate 33 100
From the code block above, you can see that the sample Pandas DataFrame has three columns and three records. Now that the DataFrame has been loaded, let’s see how you can apply the .to_dict() method to convert the DataFrame to a dictionary.
Convert a Pandas DataFrame to a Dictionary
By default, the Pandas DataFrame .to_dict() method will return a dictionary where the keys are the column names and the values are dictionaries that map each index label to the value in that column. This output is more informative when your indices are meaningful, rather than arbitrary numbers.
Let’s take a look at what the Pandas to_dict() method returns with default arguments:
# Convert a Pandas DataFrame to a Dictionary
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
print(df.to_dict())
# Returns:
# {'Name': {0: 'Nik', 1: 'Evan', 2: 'Kate'},
# 'Age': {0: 33, 1: 32, 2: 33},
# 'Score': {0: 90, 1: 95, 2: 100}}
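To illustrate the point about meaningful indices, here is a minimal sketch that first moves the Name column into the index. It assumes the names are unique. The variable name by_name is only for illustration.

```python
# Default orient with a meaningful index (illustrative sketch)
import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Evan', 'Kate'],
    'Age': [33, 32, 33],
    'Score': [90, 95, 100]
})

# Assumption: Name holds unique values, so it can act as the index
by_name = df.set_index('Name').to_dict()
print(by_name)
# Returns:
# {'Age': {'Nik': 33, 'Evan': 32, 'Kate': 33},
#  'Score': {'Nik': 90, 'Evan': 95, 'Kate': 100}}

print(by_name['Score']['Kate'])
# Returns: 100
```

With names as the inner keys, a lookup such as by_name['Score']['Kate'] reads more naturally than one keyed by row number.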
In the following sections, you’ll learn how to customize the method to return differently structured dictionaries.
Convert a Pandas DataFrame to a Dictionary of Column Values
To create a dictionary of column values using the Pandas DataFrame to_dict method, you can pass in 'list' as the orient argument. This will create key-value pairs of column names and ordered lists of column values.
Let’s see what this looks like:
# Convert a Pandas DataFrame to a Dictionary of Column Values
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
print(df.to_dict('list'))
# Returns:
# {'Name': ['Nik', 'Evan', 'Kate'], 'Age': [33, 32, 33], 'Score': [90, 95, 100]}
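One practical property of this layout is that it matches what the pd.DataFrame() constructor accepts, so the dictionary can be turned back into a DataFrame. The sketch below shows that round trip. The names columns_dict and restored are illustrative.

```python
# Round-trip a dictionary of column lists back into a DataFrame (sketch)
import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Evan', 'Kate'],
    'Age': [33, 32, 33],
    'Score': [90, 95, 100]
})

columns_dict = df.to_dict('list')

# The constructor accepts a mapping of column names to lists of values
restored = pd.DataFrame(columns_dict)
print(restored.to_dict('list') == columns_dict)
# Returns: True
```

Note that the 'list' orientation drops the index, so a custom index would not survive this round trip.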
In the following section, you’ll learn how to create a very similar dictionary, using Pandas Series objects instead of lists.
Convert a Pandas DataFrame to a Dictionary of Series Values
Similar to the process above, we can convert a Pandas DataFrame into a dictionary of column names and Pandas Series values. This can be accomplished by passing the string 'series' into the method as its orient argument.
This can be helpful if you want to maintain the Pandas data structures for use elsewhere. Let’s see what this looks like:
# Convert a Pandas DataFrame to a Dictionary of Series Values
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
print(df.to_dict('series'))
# Returns:
# {'Name': 0 Nik
# 1 Evan
# 2 Kate
# Name: Name, dtype: object,
# 'Age': 0 33
# 1 32
# 2 33
# Name: Age, dtype: int64,
# 'Score': 0 90
# 1 95
# 2 100
# Name: Score, dtype: int64}
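Because each value is still a Pandas Series, the usual Series methods remain available on the dictionary’s values. A short sketch, with series_dict as an illustrative name:

```python
# Use Series methods on the values of a 'series' dictionary (sketch)
import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Evan', 'Kate'],
    'Age': [33, 32, 33],
    'Score': [90, 95, 100]
})

series_dict = df.to_dict('series')

print(type(series_dict['Age']).__name__)
# Returns: Series

# Series methods such as .mean() and .max() still work
print(series_dict['Score'].mean())
# Returns: 95.0
print(series_dict['Age'].max())
# Returns: 33
```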
In the following section, you’ll learn how to separate indices, column names, and data into a dictionary.
Convert a Pandas DataFrame to a Dictionary Index, Columns, and Data
By using 'split' as the argument in the Pandas to_dict method, you can create a dictionary that splits the index, columns, and data into separate keys in the resulting dictionary. The method will return the following dictionary: {'index': list, 'columns': list, 'data': list of lists}.
This can be helpful when you want to pass items between data structures, where you may need to pass column headers and indices separate from their data.
# Convert a Pandas DataFrame to a Dictionary with Index, Columns, and Data
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
print(df.to_dict('split'))
# Returns:
# {'index': [0, 1, 2],
# 'columns': ['Name', 'Age', 'Score'],
# 'data': [['Nik', 33, 90], ['Evan', 32, 95], ['Kate', 33, 100]]}
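The three keys line up with the data, index, and columns arguments of the pd.DataFrame() constructor, so the pieces can be passed along separately and reassembled later. This is a minimal sketch of that idea. The names split_dict and restored are illustrative.

```python
# Rebuild a DataFrame from a 'split' dictionary (illustrative sketch)
import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Evan', 'Kate'],
    'Age': [33, 32, 33],
    'Score': [90, 95, 100]
})

split_dict = df.to_dict('split')

# Pass each piece to the matching constructor argument
restored = pd.DataFrame(
    data=split_dict['data'],
    index=split_dict['index'],
    columns=split_dict['columns'],
)
print(list(restored.columns))
# Returns: ['Name', 'Age', 'Score']
print(restored['Name'].tolist())
# Returns: ['Nik', 'Evan', 'Kate']
```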
To expand upon this, you can also pass 'tight' into the method to return a more detailed dictionary. The resulting dictionary also includes the index names and column names, which are None unless they have been set. This argument is only available as of Pandas version 1.4.0.
# Convert a Pandas DataFrame to a Dictionary of Index, Columns, and Data (Part 2)
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
print(df.to_dict('tight'))
# Returns:
# {'index': [0, 1, 2],
# 'columns': ['Name', 'Age', 'Score'],
# 'data': [['Nik', 33, 90], ['Evan', 32, 95], ['Kate', 33, 100]],
# 'index_names': [None],
# 'column_names': [None]}
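In the output above, both name lists contain None because the sample DataFrame has no named index. The sketch below sets the Name column as the index, which gives the index a name, and then rebuilds the DataFrame with pd.DataFrame.from_dict() and orient='tight'. It assumes Pandas 1.4.0 or later. The names named, tight_dict, and restored are illustrative.

```python
# 'tight' orientation with a named index (sketch, assumes Pandas >= 1.4.0)
import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Evan', 'Kate'],
    'Age': [33, 32, 33],
    'Score': [90, 95, 100]
})

# Moving Name into the index gives the index a name
named = df.set_index('Name')
tight_dict = named.to_dict('tight')

print(tight_dict['index_names'])
# Returns: ['Name']
print(tight_dict['column_names'])
# Returns: [None]

# from_dict() with orient='tight' restores the index name as well
restored = pd.DataFrame.from_dict(tight_dict, orient='tight')
print(restored.index.name)
# Returns: Name
```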
In the following section, you’ll learn how to convert a Pandas DataFrame to a list of dictionaries.
Convert a Pandas DataFrame to a List of Dictionaries
One of the most common uses of the Pandas to_dict method is to convert a DataFrame into a list of dictionaries. This structure closely resembles a JSON array of objects, which makes it easy to pass data between languages.
By passing 'records' into the method, you create a list that contains a single dictionary for each record in the DataFrame. Let’s see what this looks like:
# Convert a Pandas DataFrame to a List of Dictionaries
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
print(df.to_dict('records'))
# Returns:
# [{'Name': 'Nik', 'Age': 33, 'Score': 90},
# {'Name': 'Evan', 'Age': 32, 'Score': 95},
# {'Name': 'Kate', 'Age': 33, 'Score': 100}]
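To make the JSON comparison concrete, the sketch below serializes the list of records with the json module from the standard library. It assumes the values are plain strings and integers, as in the sample data. Other dtypes, such as timestamps, may need extra handling. The names records and payload are illustrative.

```python
# Serialize a list of records to a JSON string (illustrative sketch)
import json

import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Evan', 'Kate'],
    'Age': [33, 32, 33],
    'Score': [90, 95, 100]
})

records = df.to_dict('records')

# Assumption: every value is a plain string or integer
payload = json.dumps(records)
print(payload)
# Returns:
# [{"Name": "Nik", "Age": 33, "Score": 90}, {"Name": "Evan", "Age": 32, "Score": 95}, {"Name": "Kate", "Age": 33, "Score": 100}]
```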
The resulting list will be ordered based on the current order of the DataFrame. If you need to include the index in the dictionaries, you can first reset the index of the DataFrame, which turns the index into a regular column named index:
# Convert a Pandas DataFrame to a List of Dictionaries with Indices
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
df.reset_index(inplace=True)
print(df.to_dict('records'))
# Returns:
# [{'index': 0, 'Name': 'Nik', 'Age': 33, 'Score': 90},
# {'index': 1, 'Name': 'Evan', 'Age': 32, 'Score': 95},
# {'index': 2, 'Name': 'Kate', 'Age': 33, 'Score': 100}]
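If you would rather leave the original DataFrame unchanged, one option is to chain reset_index() without inplace=True. This is a small sketch of that variant. The name records_with_index is illustrative.

```python
# Include the index without modifying the original DataFrame (sketch)
import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Evan', 'Kate'],
    'Age': [33, 32, 33],
    'Score': [90, 95, 100]
})

# reset_index() returns a new DataFrame, so df itself is untouched
records_with_index = df.reset_index().to_dict('records')
print(records_with_index[0])
# Returns: {'index': 0, 'Name': 'Nik', 'Age': 33, 'Score': 90}

print(list(df.columns))
# Returns: ['Name', 'Age', 'Score']
```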
In the next section, you’ll learn how to convert a Pandas DataFrame to a dictionary of indices and values.
Convert a Pandas DataFrame to a Dictionary of Index and Values
In this section, you’ll learn how to convert a Pandas DataFrame to a dictionary where the indices are the keys and the values are a dictionary of column names and record values. This can be accomplished by passing 'index' into the to_dict method.
# Convert a Pandas DataFrame to a Dictionary of Index and Values
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
print(df.to_dict('index'))
# Returns:
# {0: {'Name': 'Nik', 'Age': 33, 'Score': 90},
# 1: {'Name': 'Evan', 'Age': 32, 'Score': 95},
# 2: {'Name': 'Kate', 'Age': 33, 'Score': 100}}
This approach is most useful when the index holds meaningful labels, such as names or IDs, rather than default row numbers.
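As a sketch of what that can look like, the example below uses the Name column as the index before converting. It assumes the names are unique. The variable name by_name is illustrative.

```python
# 'index' orientation with a meaningful index (illustrative sketch)
import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Evan', 'Kate'],
    'Age': [33, 32, 33],
    'Score': [90, 95, 100]
})

# Assumption: Name holds unique values, so each name becomes one key
by_name = df.set_index('Name').to_dict('index')
print(by_name)
# Returns:
# {'Nik': {'Age': 33, 'Score': 90},
#  'Evan': {'Age': 32, 'Score': 95},
#  'Kate': {'Age': 33, 'Score': 100}}

print(by_name['Evan']['Score'])
# Returns: 95
```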
Convert Two Pandas Series (Columns) Into a Dictionary
In this section, you’ll learn how to convert two Pandas columns into a dictionary. This approach works as expected only if the column meant to hold the keys contains unique values. Because Python dictionary keys must be unique, a repeated key keeps only the last value paired with it.
In order to accomplish this, we can use the Python zip function, which allows you to iterate over multiple iterables in parallel, pairing up their items. Let’s see what this looks like:
# Convert Two Pandas Columns into a Dictionary
import pandas as pd
df = pd.DataFrame({
'Name': ['Nik', 'Evan', 'Kate'],
'Age': [33, 32, 33],
'Score': [90, 95, 100]
})
ages = dict(zip(df['Name'], df['Age']))
print(ages)
# Returns:
# {'Nik': 33, 'Evan': 32, 'Kate': 33}
We pass a zip object built from the two columns into the dict() constructor function. This allows us to easily convert the resulting pairs into a dictionary.
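To show why the key column should be unique, the sketch below swaps the roles of the two columns so that the repeated age 33 becomes a key. It also shows one alternative to zip(): setting the key column as the index and calling .to_dict() on the resulting Series. The variable names are illustrative.

```python
# Duplicate keys with zip(), plus a Series-based alternative (sketch)
import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Evan', 'Kate'],
    'Age': [33, 32, 33],
    'Score': [90, 95, 100]
})

# Age contains 33 twice, so the later pair overwrites the earlier one
names_by_age = dict(zip(df['Age'], df['Name']))
print(names_by_age)
# Returns: {33: 'Kate', 32: 'Evan'}

# Alternative: index the DataFrame by the key column, then convert the Series
ages = df.set_index('Name')['Age'].to_dict()
print(ages)
# Returns: {'Nik': 33, 'Evan': 32, 'Kate': 33}
```

In the first result, 'Nik' is silently dropped, which is why it is worth checking for duplicates in the key column before converting.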
Conclusion
In this post, you learned how to convert a Pandas DataFrame into a dictionary. Pandas provides many different ways in which to accomplish this. This exhaustive guide covered all of the different ways in which to handle this conversion. You first learned how to use the .to_dict() method. Then, you also learned how to create a dictionary of two Pandas columns, using the zip() function.