This guide will teach you how to read CSV files in Python, including into Python lists and dictionaries. The Python csv library gives you significant flexibility in reading CSV files. For example, you can read CSV files into Python lists, including reading headers and using custom delimiters. Likewise, you can read CSV files into Python dictionaries.
By the end of this guide, you’ll have learned the following:
- How to read CSV files in Python using the csv.reader() function and the csv.DictReader() class
- How to read CSV files to Python lists and dictionaries
- How to handle common complexities, such as double quotes, different encodings, and escape characters
Quick Answer: How to Read CSV Files to a List in Python
If you’re in a hurry, the code below shows you how to read a CSV file into lists using Python. In the example, the code simply prints out the lists. You could just as well append them to a list to create a list of lists.
# Quick Answer: Reading a CSV File as Lists
import csv
with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
# Returns:
# ['Nik', '34', 'datagy.io', 'Toronto']
# ['Kate', '33', 'google', 'Paris']
# ['Evan', '32', 'bing', 'New York City']
# ['Kyra', '35', 'yahoo', 'Atlanta']
The Python csv module provides a great deal of flexibility in how you read CSV files. Throughout this tutorial, we’ll explore how to read CSV files into lists and dictionaries. We’ll also explore how to customize the way in which the CSV files are read, such as dealing with different delimiters and various encodings.
How to Read a CSV File in Python to a List
In this section, we’ll explore how to use Python to read a CSV file to a list or list of lists. We’ll do this by exploring the csv.reader() function, which returns an iterator over the rows of the file passed into it. We’ll also explore how to read all lines at once and how to read files that have a header row (as well as how to skip it).
How to Read a CSV File Line-by-Line to a List in Python
In order to read a CSV file in Python into a list, you can use the csv.reader() function and iterate over the rows, each of which is returned as a list of strings. Let’s see what this looks like in Python. We’ll work with a CSV file that looks like the file below:
Nik,34,datagy.io,Toronto
Kate,33,google,Paris
Evan,32,bing,New York City
Kyra,35,yahoo,Atlanta
Let’s see how we can use Python to read this CSV file using the csv module’s reader() function:
# How to Read a CSV File Into Lists Using Python
import csv
with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
# Returns:
# ['Nik', '34', 'datagy.io', 'Toronto']
# ['Kate', '33', 'google', 'Paris']
# ['Evan', '32', 'bing', 'New York City']
# ['Kyra', '35', 'yahoo', 'Atlanta']
Let’s break down what we’re doing in the code block above:
- We imported the csv module
- We opened the file using a context manager in read mode, 'r'
- We created a reader object by passing the file into csv.reader()
- We then looped over each record in the reader and printed it out. Each item that gets printed is a list.
If we want to create a list of lists, we can loop over each record and append it to an outer list.
# Creating a List of Lists from a CSV File
import csv
with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    data = []
    for row in reader:
        data.append(row)
print(data)
# Returns:
# [['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]
In the code block above, we added a variable, data, which starts as an empty list. Instead of printing out each row in the dataset, we appended the record to our list.
In some ways, this can feel redundant. If you know that you’re reading every row in the dataset and want to store it in a single data structure, there is an easier way to accomplish this. Let’s see how we can modify our code to accomplish this in the following section.
How to Read a CSV File All Lines to a List of Lists in Python
In order to read all lines in a CSV file to a Python list of lists, we can simply pass the reader object into the list() function. Depending on the size of your CSV file, this can have some memory implications, since you’ll be loading every row from the reader into memory at once.
Let’s see how we can use Python to read all lines in a CSV file to a list of lists:
# Read All Lines from a CSV File at Once
import csv
with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    data = list(reader)
print(data)
# Returns:
# [['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]
In the code block above, we didn’t iterate over the reader object ourselves. Instead, we passed it into the list() function, which consumes all of the rows at once.
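If a file is too large to load comfortably, one option is to take only as many rows as you need. The sketch below is a minimal illustration rather than part of the original example: it uses itertools.islice() from the standard library, and an io.StringIO object is assumed in place of the opened file so that the code runs on its own.

```python
import csv
import io
from itertools import islice

# io.StringIO stands in for open('file.csv', 'r') in this sketch
sample = io.StringIO(
    "Nik,34,datagy.io,Toronto\n"
    "Kate,33,google,Paris\n"
    "Evan,32,bing,New York City\n"
    "Kyra,35,yahoo,Atlanta\n"
)

reader = csv.reader(sample)

# Take only the first two rows; the remaining rows are not read yet
first_two = list(islice(reader, 2))
print(first_two)

# The reader keeps its position, so iteration resumes at the third row
remaining = list(reader)
print(remaining)
```

Because the reader keeps track of its position, the same pattern can be repeated to work through a file in batches.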
In the following section, you’ll learn how to read a header row from a CSV file.
How to Read a CSV File with a Header to a List in Python
When reading a CSV file with a header, we have three different options:
- Read the header into the same list of lists,
- Store the header as a separate list,
- Skip the header entirely
Imagine that we’re working with the dataset below:
Name, Age, Site, Location
Nik,34,datagy.io,Toronto
Kate,33,google,Paris
Evan,32,bing,New York City
Kyra,35,yahoo,Atlanta
In order to read the header into the list of lists, we don’t actually need to change anything from our previous example.
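As a minimal sketch of this first option, the code below reads the header and the data into the same list of lists. The io.StringIO object is an assumption used in place of the file on disk so that the example is self-contained.

```python
import csv
import io

# Stand-in for open('file.csv', 'r'); the contents mirror the dataset above
sample = io.StringIO(
    "Name, Age, Site, Location\n"
    "Nik,34,datagy.io,Toronto\n"
    "Kate,33,google,Paris\n"
)

data = list(csv.reader(sample))

# The header is simply the first inner list
print(data[0])
print(len(data))
```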
In order to read the header into a separate list, we can use the following method. This places the header into one list and then adds the following rows to another list.
# Reading a Header into a Separate List
import csv
with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    header = next(reader, None)
    data = []
    for item in reader:
        data.append(item)
print(f'{header=}')
print(f'{data=}')
# Returns:
# header=['Name', ' Age', ' Site', ' Location']
# data=[['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]
Doing this places the header into a separate list, header. We use the next() function, which reads the first item in this case. We pass in the second argument of None, which is returned if no record exists, so an empty file doesn’t raise an error.
Then, we read the remaining data row by row into a list of lists. This can be helpful when you want to maintain the header information and need to reference it at a later time.
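One way to reference the header later is to look up a column’s position by name. This is only a sketch under a few assumptions: the header here has no spaces after the commas, io.StringIO stands in for the file, and site_index and sites are names made up for the illustration.

```python
import csv
import io

# Stand-in for the file on disk (header written without spaces after commas)
sample = io.StringIO(
    "Name,Age,Site,Location\n"
    "Nik,34,datagy.io,Toronto\n"
    "Kate,33,google,Paris\n"
)

reader = csv.reader(sample)
header = next(reader, None)
data = list(reader)

# Find the column position from the header instead of hard-coding an index
site_index = header.index('Site')
sites = [row[site_index] for row in data]
print(sites)
```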
How are we printing these items?
In this case, we’re printing the items by using Python f-strings. While f-strings have been available since Python 3.6, the method we’re using is only available since Python 3.8. Writing print(f'{var=}') prints both the variable name and the value(s) it contains, which makes debugging easier.
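A tiny illustration of this specifier on its own, using made-up variables (requires Python 3.8 or later):

```python
name = 'Nik'
ages = [34, 33]

# The = specifier expands to the expression text followed by its repr
debug_line = f'{name=}, {ages=}'
print(debug_line)
# Prints: name='Nik', ages=[34, 33]
```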
In order to skip the header entirely, we can use the next() function to skip the first row. Because csv.reader() returns an iterator, calling next() on it consumes the first item. If we don’t store the result, the header is simply discarded.
Let’s see how this works:
# How to Skip a Header Row When Reading CSV Files
import csv
with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    next(reader, None)
    data = []
    for item in reader:
        data.append(item)
print(data)
# Returns:
# [['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]
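The None default passed to next() matters when a file might be empty. As a small sketch, with an empty io.StringIO object assumed in place of an empty file, next(reader, None) returns None instead of raising a StopIteration exception:

```python
import csv
import io

# An empty in-memory "file" stands in for an empty CSV on disk
empty_file = io.StringIO("")
reader = csv.reader(empty_file)

# With the default supplied, an exhausted reader yields None
header = next(reader, None)
rows = list(reader)

print(header)
print(rows)
```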
In the following sections, you’ll learn how to read a CSV file into a Python dictionary.
How to Read a CSV File in Python to a Dictionary
In order to read a CSV file in Python into a dictionary, you can use the csv.DictReader class and iterate over the rows, each of which is returned as a dictionary. The csv module will use the first row of the file as header fields unless custom fields are passed into it.
Because of this, we’ll begin this section of the guide by looking at an example without a header, where we specifically need to pass in field names. From there, we’ll cover how to read a file that has a header included.
How to Read a CSV File in Python to a Dictionary Line by Line
Let’s see what this looks like in Python. We’ll work with a CSV file that looks like the file below:
Nik,34,datagy.io,Toronto
Kate,33,google,Paris
Evan,32,bing,New York City
Kyra,35,yahoo,Atlanta
Let’s see how we can use Python to read this CSV file using the csv module’s DictReader class:
# How to Read a CSV File Into Dictionaries Using Python
import csv
with open('file.csv', 'r') as file:
    reader = csv.DictReader(
        file, fieldnames=['Name', 'Age', 'Site', 'Location'])
    for row in reader:
        print(row)
# Returns:
# {'Name': 'Nik', 'Age': '34', 'Site': 'datagy.io', 'Location': 'Toronto'}
# {'Name': 'Kate', 'Age': '33', 'Site': 'google', 'Location': 'Paris'}
# {'Name': 'Evan', 'Age': '32', 'Site': 'bing', 'Location': 'New York City'}
# {'Name': 'Kyra', 'Age': '35', 'Site': 'yahoo', 'Location': 'Atlanta'}
Let’s break down what we did in the code block above:
- We imported the csv module
- We opened the file using a context manager in read mode, 'r'
- We created a reader object by passing the file into csv.DictReader() and passed in our field names as a list of values
- We then looped over each record in the reader and printed it out. Each item that gets printed is a dictionary.
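Note that every value in the output above is a string, including the ages. If you need numbers, you can convert them as you loop. The following is a minimal sketch, assuming an io.StringIO object in place of the file and a made-up people list to collect the results:

```python
import csv
import io

# Stand-in for open('file.csv', 'r')
sample = io.StringIO(
    "Nik,34,datagy.io,Toronto\n"
    "Kate,33,google,Paris\n"
)

reader = csv.DictReader(
    sample, fieldnames=['Name', 'Age', 'Site', 'Location'])

people = []
for row in reader:
    # The csv module returns strings, so convert the age explicitly
    row['Age'] = int(row['Age'])
    people.append(row)

print(people[0])
```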
In the following section, you’ll learn how to read all the lines of a CSV file into a list of dictionaries.
How to Read a CSV File All Lines to a List of Dictionaries in Python
In order to read a CSV file into a list of dictionaries, we can pass the DictReader object into the list() function. This will read every row from the reader into a list. Because the DictReader object returns a dictionary for each row, we create a list of dictionaries.
Let’s see what this looks like:
# How to Read a CSV File Into a List of Dictionaries
import csv
with open('file.csv', 'r') as file:
    reader = csv.DictReader(
        file, fieldnames=['Name', 'Age', 'Site', 'Location'])
    data = list(reader)
print(data)
# Returns:
# [{'Name': 'Nik', 'Age': '34', 'Site': 'datagy.io', 'Location': 'Toronto'}, {'Name': 'Kate', 'Age': '33', 'Site': 'google', 'Location': 'Paris'}, {'Name': 'Evan', 'Age': '32', 'Site': 'bing', 'Location': 'New York City'}, {'Name': 'Kyra', 'Age': '35', 'Site': 'yahoo', 'Location': 'Atlanta'}]
Depending on the size of your CSV file, this step can use significant amounts of memory. Be mindful of this. If needed, loop over the DictReader object instead and keep only the rows or values you need.
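As a rough sketch of that row-by-row approach, the loop below keeps only the names of people older than 33, so the full list of dictionaries is never built. The threshold, the older_names list, and the io.StringIO stand-in for the file are all assumptions made for the illustration.

```python
import csv
import io

# Stand-in for open('file.csv', 'r')
sample = io.StringIO(
    "Nik,34,datagy.io,Toronto\n"
    "Kate,33,google,Paris\n"
    "Evan,32,bing,New York City\n"
    "Kyra,35,yahoo,Atlanta\n"
)

reader = csv.DictReader(
    sample, fieldnames=['Name', 'Age', 'Site', 'Location'])

older_names = []
for row in reader:
    # Only one row is held at a time; keep just the value we need
    if int(row['Age']) > 33:
        older_names.append(row['Name'])

print(older_names)
```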
How to Read a CSV File with a Header to a Dictionary in Python
If your file has a header row, you don’t need to pass in field names when reading a CSV file – Python will infer the field names from the first row of data. Let’s modify our CSV file to include field names, as shown below:
Name,Age,Website,Location
Nik,34,datagy.io,Toronto
Kate,33,google,Paris
Evan,32,bing,New York City
Kyra,35,yahoo,Atlanta
When we read our CSV file as we have before, we can skip the fieldnames= parameter. Let’s see what this looks like:
# Reading a CSV File with a Header Row in Python
import csv
with open('file.csv', 'r') as file:
    reader = csv.DictReader(file)
    data = list(reader)
print(data)
# Returns:
# [{'Name': 'Nik', 'Age': '34', 'Website': 'datagy.io', 'Location': 'Toronto'}, {'Name': 'Kate', 'Age': '33', 'Website': 'google', 'Location': 'Paris'}, {'Name': 'Evan', 'Age': '32', 'Website': 'bing', 'Location': 'New York City'}, {'Name': 'Kyra', 'Age': '35', 'Website': 'yahoo', 'Location': 'Atlanta'}]
We can see how intuitive it can be to read a file that contains a header row. Because Python uses the field names as the dictionary keys, we can simplify the process we use significantly.
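If you want to check which field names were inferred, the DictReader object exposes them through its fieldnames attribute. A short sketch, with io.StringIO assumed in place of the file:

```python
import csv
import io

# Stand-in for open('file.csv', 'r')
sample = io.StringIO(
    "Name,Age,Website,Location\n"
    "Nik,34,datagy.io,Toronto\n"
)

reader = csv.DictReader(sample)

# Accessing fieldnames reads the header row if it hasn't been read yet
columns = reader.fieldnames
print(columns)

first_row = next(reader)
print(first_row['Website'])
```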
In the following sections, you’ll learn how to customize the behavior of reading CSV files, such as when files contain custom delimiters or leading spaces.
How to Handle Custom Delimiters in Reading CSV Files in Python
When reading CSV files that have custom delimiters, you can use the delimiter= parameter when creating either a reader object or a DictReader object. This allows you to specify the one-character string that’s used as the delimiter in the file.
In many cases, the custom delimiter will be a tab or a pipe character. Imagine that we’re working with the file below:
Name|Age|Website|Location
Nik|34|datagy.io|Toronto
Kate|33|google|Paris
Evan|32|bing|New York City
Kyra|35|yahoo|Atlanta
In the file above, the data are pipe-separated. We can specify that we want our file to be read with this custom delimiter in mind by using the delimiter= parameter when creating the reader object. Let’s see what this looks like in Python:
# Reading a CSV File with a Custom Delimiter
import csv
with open('file.csv', 'r') as file:
    reader = csv.reader(file, delimiter='|')
    data = list(reader)
print(data)
# Returns:
# [['Name', 'Age', 'Website', 'Location'], ['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]
In the code block above, we passed in the '|' character as our delimiter. Python will parse the file for this delimiter and split the data accordingly.
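The same parameter handles tab-separated data and works with DictReader as well. The sketch below assumes a small tab-separated sample held in an io.StringIO object rather than a file on disk:

```python
import csv
import io

# Tab-separated sample; io.StringIO stands in for the opened file
sample = io.StringIO(
    "Name\tAge\tWebsite\tLocation\n"
    "Nik\t34\tdatagy.io\tToronto\n"
)

# '\t' is the tab character
reader = csv.DictReader(sample, delimiter='\t')
rows = list(reader)
print(rows)
```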
How to Skip a Preceding Space When Reading CSV Files in Python
Some CSV formats will include a space following a comma for readability. When Python reads these files, it will actually include this space in the resulting data. Let’s see what this looks like when we read the following file:
Name, Age, Website, Location
Nik, 34, datagy.io, Toronto
Kate, 33, google, Paris
Evan, 32, bing, New York City
Kyra, 35, yahoo, Atlanta
When we read this file using the methods you have learned above, we get the following result (note that we’re only reading the first line):
# Reading a File with Preceding Spaces
import csv
with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    print(next(reader))
# Returns:
# ['Name', ' Age', ' Website', ' Location']
We can see that each of the values (except for the first) has a preceding space. This is absolutely not how we want to read the data.
In order to resolve the initial space issue, we can use the skipinitialspace= parameter. By default, this is set to False. By setting it to True, Python will skip the space that immediately follows each delimiter. Let’s see what this looks like:
# Skipping an Initial Space When Reading a CSV
import csv
with open('file.csv', 'r') as file:
    reader = csv.reader(file, skipinitialspace=True)
    print(next(reader))
# Returns:
# ['Name', 'Age', 'Website', 'Location']
We can see that by using the skipinitialspace= parameter, we were able to resolve the preceding space issues.
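The leading spaces are especially easy to miss with DictReader, because they end up inside the dictionary keys. The following sketch compares both behaviors, assuming an io.StringIO object in place of the file; raw_row and clean_row are names chosen for the illustration.

```python
import csv
import io

# Stand-in for the file with a space after each comma
sample = io.StringIO(
    "Name, Age, Website, Location\n"
    "Nik, 34, datagy.io, Toronto\n"
)

# Without the parameter, the keys (and values) keep their leading space
raw_row = next(csv.DictReader(sample))
print(list(raw_row))

# Rewind the in-memory file and read again with skipinitialspace=True
sample.seek(0)
clean_row = next(csv.DictReader(sample, skipinitialspace=True))
print(clean_row)
```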
How to Specify Quoting Characters When Reading CSV Files in Python
When working with files that have quoting characters in them, Python provides a number of different options to read CSV files. Let’s take a look at this example file below:
Name,Num,Last Message
Nik,1,"Hey, how's it going?"
Kate,2,"Not so bad!"
When we read this file using the method we have learned so far, we get the following results:
# Reading a CSV File With Quoting Characters
import csv
with open('file2.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
# Returns:
# ['Name', 'Num', 'Last Message']
# ['Nik', '1', "Hey, how's it going?"]
# ['Kate', '2', 'Not so bad!']
We can see that the quoted fields were read as single values, even though the first message contains a comma. The quotation marks in the printed output depend on the contents of each string: Python displays a string in double quotes when it contains an apostrophe. The csv module provides a number of different constants to handle quotes:
- csv.QUOTE_ALL specifies that all fields have quotes around them
- csv.QUOTE_MINIMAL specifies that only the fields that contain special characters, such as the delimiter, quote character, or any line terminator, will have quotes around them
- csv.QUOTE_NONNUMERIC specifies that the CSV file has quotes around non-numeric entries; when reading, unquoted fields are converted to floats
- csv.QUOTE_NONE specifies that none of the entries have quotes around them
Let’s see how we can pass one of these constants using the quoting= parameter. Here, we use csv.QUOTE_MINIMAL, which is the default, so the result matches what we saw above:
# Specifying Quoting Formats
import csv
with open('file2.csv', 'r') as file:
    reader = csv.reader(file, quoting=csv.QUOTE_MINIMAL)
    for row in reader:
        print(row)
# Returns:
# ['Name', 'Num', 'Last Message']
# ['Nik', '1', "Hey, how's it going?"]
# ['Kate', '2', 'Not so bad!']
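To make the difference between the quoting constants more concrete when reading, here is a small sketch comparing three of them on single rows. The rows are held in io.StringIO objects as a stand-in for files, and the variable names are made up for the illustration. Note that the QUOTE_NONNUMERIC row has its text fields quoted, since unquoted text would fail to convert to a float.

```python
import csv
import io

# Default behavior (QUOTE_MINIMAL): quotes are removed, values stay strings
minimal = next(csv.reader(io.StringIO('Kate,2,"Not so bad!"\n')))
print(minimal)

# QUOTE_NONNUMERIC: unquoted fields are converted to floats when reading
nonnumeric = next(csv.reader(
    io.StringIO('"Kate",2,"Not so bad!"\n'),
    quoting=csv.QUOTE_NONNUMERIC))
print(nonnumeric)

# QUOTE_NONE: quote characters are kept as part of the value
unprocessed = next(csv.reader(
    io.StringIO('Kate,2,"Not so bad!"\n'),
    quoting=csv.QUOTE_NONE))
print(unprocessed)
```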
In the following section, you’ll learn how to use dialects to simplify reading multiple, similar files.
How to Use Dialects when Reading CSV Files in Python
Dialects in the CSV module allow you to set customization criteria to easily write and read files in the same style. While we have only covered custom delimiters so far, the CSV library provides significant flexibility to customize CSV files. Because of this, dialects allow you to set preset styles that can be used to read and write CSV files in Python.
Let’s see how we can create a dialect and use it when reading a CSV file:
# Registering and Using a Dialect
import csv
csv.register_dialect('sample_dialect', skipinitialspace=True, delimiter='|')
with open('file.csv', 'r') as file:
    reader = csv.reader(file, dialect='sample_dialect')
    for row in reader:
        print(row)
# Returns:
# ['Name', 'Age', 'Website', 'Location']
# ['Nik', '34', 'datagy.io', 'Toronto']
# ['Kate', '33', 'google', 'Paris']
# ['Evan', '32', 'bing', 'New York City']
# ['Kyra', '35', 'yahoo', 'Atlanta']
In the example above, we created a dialect using the csv.register_dialect() function. The function allows you to pass in a name as a string. Then, it can either build on an existing dialect or add custom formatting, as we have done.
This allows you to easily re-use the dialect when you are reading or writing CSV files. Because the dialect isn’t tied to a particular reader, it works with both the csv.reader() function and the csv.DictReader() class.
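As a quick sketch of that re-use, the code below registers a dialect and passes it to DictReader. The dialect name 'pipe_dialect' is hypothetical, and io.StringIO is assumed in place of a pipe-delimited file on disk.

```python
import csv
import io

# 'pipe_dialect' is a hypothetical name chosen for this sketch
csv.register_dialect('pipe_dialect', delimiter='|', skipinitialspace=True)

# Stand-in for a pipe-delimited file with a space after each delimiter
sample = io.StringIO(
    "Name| Age| Website| Location\n"
    "Nik| 34| datagy.io| Toronto\n"
)

# The same registered name works with DictReader
reader = csv.DictReader(sample, dialect='pipe_dialect')
rows = list(reader)
print(rows)
```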
In the section below, you’ll learn how to use the popular Pandas library to read CSV files in Python.
How to Read a CSV File With Pandas in Python
The popular data analysis library, pandas, provides a helpful function for reading different types of files, including CSV files. The function offers a great deal of flexibility in terms of reading CSV files into pandas DataFrame structures.
Let’s see how we can use the pandas read_csv() function to read a CSV file:
# Reading a CSV File with Pandas
import pandas as pd
df = pd.read_csv('file.csv')
print(df)
# Returns:
#    Name  Age    Website       Location
# 0   Nik   34  datagy.io        Toronto
# 1  Kate   33     google          Paris
# 2  Evan   32       bing  New York City
# 3  Kyra   35      yahoo        Atlanta
Let’s break down what we did in the code block above:
- We imported the pandas library using the conventional alias, pd
- We then created a DataFrame, df, using the read_csv() function by passing in the path to the file
- Finally, we printed the DataFrame, which returned a two-dimensional tabular data structure
The pandas read_csv() function provides an extensive number of parameters, including:
- delimiter=, which specifies the delimiter to use
- usecols=, which specifies which columns to read in
- skiprows=, which specifies the number of rows to skip
- nrows=, which specifies how many rows to read in
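A brief sketch of these parameters in use is shown below. It assumes pandas is installed, uses io.StringIO in place of a pipe-delimited file on disk, and the names subset and skipped are made up for the illustration. Here, skiprows= is given a list of zero-based line numbers to skip, which is one of the forms it accepts.

```python
import io
import pandas as pd

# Pipe-delimited sample data; io.StringIO stands in for a file path
text = (
    "Name|Age|Website|Location\n"
    "Nik|34|datagy.io|Toronto\n"
    "Kate|33|google|Paris\n"
    "Evan|32|bing|New York City\n"
    "Kyra|35|yahoo|Atlanta\n"
)

# Read two columns and only the first two data rows
subset = pd.read_csv(
    io.StringIO(text), delimiter='|',
    usecols=['Name', 'Location'], nrows=2)
print(subset)

# Skip line 1 of the file (the first data row); line 0 is the header
skipped = pd.read_csv(io.StringIO(text), delimiter='|', skiprows=[1])
print(skipped['Name'].tolist())
```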
The pandas read_csv() function provides a great deal of flexibility for reading data into tabular datasets.
Conclusion
In this guide, you learned how to use Python to read CSV files to both lists and dictionaries. The Python csv module provides significant flexibility in terms of reading CSV files. By providing reader objects for both lists and dictionaries, the module lets you choose the structure your CSV data is read into.
You first learned how to read CSV files to lists, including reading a single row, reading multiple rows, and handling headers. Then, you covered the same topics with Python dictionaries, as well as working with the additional options that the DictReader class provides. Finally, you learned how to customize behavior by exploring different options such as modifying delimiters and quoting, as well as creating dialects for easier reusability.