How to Read CSV Files in Python (to list, dict)

This guide will teach you how to read CSV files in Python, including to Python lists and dictionaries. The Python csv module gives you significant flexibility in reading CSV files. For example, you can read CSV files to Python lists, including reading headers and using custom delimiters. Likewise, you can read CSV files to Python dictionaries.

By the end of this guide, you’ll have learned the following:

  • How to read CSV files in Python using the csv.reader() function and the csv.DictReader class
  • How to read CSV files to Python lists and dictionaries
  • How to handle common complexities, such as double quotes, different encodings, and escape characters

Quick Answer: How to Read CSV Files to a List in Python

If you’re in a hurry, the code below shows you how to read a CSV file into lists using Python. In the example, the code simply prints out the lists. You could just as well append them to a list to create a list of lists.

# Quick Answer: Reading a CSV File as Lists
import csv

with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

# Returns:
# ['Nik', '34', 'datagy.io', 'Toronto']
# ['Kate', '33', 'google', 'Paris']
# ['Evan', '32', 'bing', 'New York City']
# ['Kyra', '35', 'yahoo', 'Atlanta']

The Python csv module provides a great deal of flexibility in how you read CSV files. Throughout this tutorial, we’ll explore how to read CSV files into lists and dictionaries. We’ll also explore how to customize the way in which the CSV files are read, such as dealing with different delimiters and quoting styles.

How to Read a CSV File in Python to a List

In this section, we’ll explore how to use Python to read a CSV file to a list or list of lists. We’ll do this by exploring the csv.reader() function, which returns an iterator (a reader object) over the rows of the file passed into it. We’ll also explore how to read all lines at a single time and how to read files that have a header row (as well as how to skip it).

How to Read a CSV File Line-by-Line to a List in Python

In order to read a CSV file in Python into a list, you can use the csv.reader class and iterate over each row, returning a list. Let’s see what this looks like in Python. We’ll work with a CSV file that looks like the file below:

Nik,34,datagy.io,Toronto
Kate,33,google,Paris
Evan,32,bing,New York City
Kyra,35,yahoo,Atlanta

Let’s see how we can use Python to read this CSV file using the csv module’s reader class:

# How to Read a CSV File Into Lists Using Python
import csv

with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

# Returns:
# ['Nik', '34', 'datagy.io', 'Toronto']
# ['Kate', '33', 'google', 'Paris']
# ['Evan', '32', 'bing', 'New York City']
# ['Kyra', '35', 'yahoo', 'Atlanta']

Let’s break down what we’re doing in the code block above:

  1. We imported the csv module
  2. We opened the file using a context manager in read mode, 'r'
  3. We created a reader object by passing the file into the csv.reader() function
  4. We then looped over each record in the reader and printed it out. Each record is a list of strings.
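One detail worth knowing: the official csv documentation recommends opening files with newline='' so that the module can handle line endings itself. The sketch below assumes the file is saved as UTF-8 (the encoding= value is an assumption, so adjust it to match your data). It writes a small sample file to a temporary folder only so that the example is self-contained.

```python
import csv
import os
import tempfile

# Hypothetical sample data, written to a temporary file so the sketch runs on its own
sample = 'Nik,34,datagy.io,Toronto\nKate,33,google,Paris\n'
path = os.path.join(tempfile.mkdtemp(), 'file.csv')
with open(path, 'w', newline='', encoding='utf-8') as file:
    file.write(sample)

# newline='' leaves line-ending handling to the csv module;
# encoding='utf-8' is an assumption about how the file was saved
with open(path, 'r', newline='', encoding='utf-8') as file:
    rows = [row for row in csv.reader(file)]

print(rows)
```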

If we wanted to create a list of lists, we can loop over each record and append it to an outer list.

# Creating a List of Lists from a CSV File
import csv

with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    data = []
    for row in reader:
        data.append(row)
    print(data)

# Returns:
# [['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]

In the code block above, we added a variable, data, which is an empty list. Instead of printing out each row in the dataset, we appended each record to our list.

In some ways, this can feel redundant. If you know that you’re reading every row in the dataset and want to store it in a single data structure, there is an easier way to accomplish this. Let’s see how we can modify our code to accomplish this in the following section.

How to Read a CSV File All Lines to a List of Lists in Python

In order to read all lines in a CSV file to a Python list of lists, we can simply pass the reader object into the list() function. Depending on the size of your CSV file, this can have some memory implications, since you’ll be loading every row from the reader into memory at once.

Let’s see how we can use Python to read all lines in a CSV file to a list of lists:

# Read All Lines from a CSV File at Once
import csv

with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    data = list(reader)
    print(data)

# Returns:
# [['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]

In the code block above, we didn’t loop over the reader object ourselves. Instead, we passed it into the list() function, which consumes the reader and collects all of the rows at once.
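If memory is a concern, you don’t have to read everything. As a minimal sketch, the standard library’s itertools.islice() function can pull only the first few rows from the reader. Here, io.StringIO stands in for an open file so the example is self-contained, and the variable names are only illustrative.

```python
import csv
import io
from itertools import islice

# io.StringIO stands in for an open file here
file = io.StringIO(
    'Nik,34,datagy.io,Toronto\n'
    'Kate,33,google,Paris\n'
    'Evan,32,bing,New York City\n'
    'Kyra,35,yahoo,Atlanta\n'
)

reader = csv.reader(file)

# Read only the first two rows; the reader keeps its position afterwards
first_two = list(islice(reader, 2))

# Anything read later continues from the third row
remaining = list(reader)

print(first_two)
print(remaining)
```

Because the reader only moves forward, the second call picks up where the first one stopped.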

In the following section, you’ll learn how to read a header row from a CSV file.

How to Read a CSV File with a Header to a List in Python

When reading a CSV file with a header, we have three different options:

  1. Read the header into the same list of lists,
  2. Store the header as a separate list,
  3. Skip the header entirely

Imagine that we’re working with the dataset below:

Name, Age, Site, Location
Nik,34,datagy.io,Toronto
Kate,33,google,Paris
Evan,32,bing,New York City
Kyra,35,yahoo,Atlanta

In order to read the header into the list of lists, we don’t actually need to change anything from our previous example.

In order to read the header into a separate list, we can use the following method. This places the header into one list and then adds the following rows to another list.

# Reading a Header into a Separate List
import csv

with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    header = next(reader, None)
    data = []
    for item in reader:
        data.append(item)

    print(f'{header=}')
    print(f'{data=}')

# Returns:
# header=['Name', ' Age', ' Site', ' Location']
# data=[['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]

Doing this places the header into a separate list, header. We use the next() function, which reads the first item in this case. We pass in the second argument of None, which safely handles a case where no record exists.

Then, we read the remaining data row by row into a list of lists. This can be helpful when you want to maintain the header information and need to reference it at a later time.
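One way to put the stored header to use is to look up a column’s position by name. The sketch below assumes a header without spaces after the commas (unlike the sample file above) and uses io.StringIO in place of an open file; the age_index name is only illustrative.

```python
import csv
import io

# io.StringIO stands in for an open file; this header has no spaces after the commas
file = io.StringIO(
    'Name,Age,Site,Location\n'
    'Nik,34,datagy.io,Toronto\n'
    'Kate,33,google,Paris\n'
    'Evan,32,bing,New York City\n'
    'Kyra,35,yahoo,Atlanta\n'
)

reader = csv.reader(file)
header = next(reader, None)

# Find the position of the 'Age' column from the header
age_index = header.index('Age')

# Every value is read as a string, so convert it explicitly
ages = [int(row[age_index]) for row in reader]

print(ages)
```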

How are we printing these items?

In this case, we’re printing the items by using Python f-strings. While f-strings have been available since Python 3.6, the method we’re using is only available since Python 3.8. The method to print variables as print(f'{var=}') is used for easier debugging, which prints the variable name and the value(s) it contains.

In order to skip the header entirely, we can call the next() function on the reader without storing the result. Because the reader object is an iterator, next() advances it by one row, so the loop that follows starts at the first row of data.

Let’s see how this works:

# How to Skip a Header Row When Reading CSV Files
import csv

with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    next(reader, None)
    data = []
    for item in reader:
        data.append(item)

    print(data)


# Returns:
# [['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]

In the following sections, you’ll learn how to read a CSV file into a Python dictionary.

How to Read a CSV File in Python to a Dictionary

In order to read a CSV file in Python into a dictionary, you can use the csv.DictReader class and iterate over each row, each of which is returned as a dictionary. The csv module will use the first row of the file as the field names unless custom field names are passed into it.

Because of this, we’ll cover this section of the guide by first looking at an example without a header, where we specifically need to pass in field names. From there, we’ll cover how to read a file that has a header included.

How to Read a CSV File in Python to a Dictionary Line by Line

Let’s see what this looks like in Python. We’ll work with a CSV file that looks like the file below:

Nik,34,datagy.io,Toronto
Kate,33,google,Paris
Evan,32,bing,New York City
Kyra,35,yahoo,Atlanta

Let’s see how we can use Python to read this CSV file using the csv module’s DictReader class:

# How to Read a CSV File Into Dictionaries Using Python
import csv

with open('file.csv', 'r') as file:
    reader = csv.DictReader(
        file, fieldnames=['Name', 'Age', 'Site', 'Location'])
    for row in reader:
        print(row)

# Returns:
# {'Name': 'Nik', 'Age': '34', 'Site': 'datagy.io', 'Location': 'Toronto'}
# {'Name': 'Kate', 'Age': '33', 'Site': 'google', 'Location': 'Paris'}
# {'Name': 'Evan', 'Age': '32', 'Site': 'bing', 'Location': 'New York City'}
# {'Name': 'Kyra', 'Age': '35', 'Site': 'yahoo', 'Location': 'Atlanta'}

Let’s break down what we did in the code block above:

  1. We imported the csv module
  2. We opened the file using a context manager in read mode, 'r'
  3. We created a reader object by passing the file into the csv.DictReader() constructor and passed in our field names as a list of values
  4. We then looped over each record in the reader and printed it out. Each record is a dictionary.
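The benefit of dictionaries is that you can access values by field name rather than by position. As a small sketch (with io.StringIO standing in for an open file and an illustrative ages variable), we can build a lookup of names to ages:

```python
import csv
import io

# io.StringIO stands in for an open file without a header row
file = io.StringIO(
    'Nik,34,datagy.io,Toronto\n'
    'Kate,33,google,Paris\n'
    'Evan,32,bing,New York City\n'
    'Kyra,35,yahoo,Atlanta\n'
)

reader = csv.DictReader(
    file, fieldnames=['Name', 'Age', 'Site', 'Location'])

# Access each value by its field name; values are strings until converted
ages = {row['Name']: int(row['Age']) for row in reader}

print(ages)
```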

In the following section, you’ll learn how to read all the lines of a CSV file into a list of dictionaries.

How to Read a CSV File All Lines to a List of Dictionaries in Python

In order to read a CSV file into a list of dictionaries, we can pass the DictReader object into the list() function. This will consume the reader and collect its rows into a list. Because the DictReader object returns a dictionary for each row, we create a list of dictionaries.

Let’s see what this looks like:

# How to Read a CSV File Into a List of Dictionaries
import csv

with open('file.csv', 'r') as file:
    reader = csv.DictReader(
        file, fieldnames=['Name', 'Age', 'Site', 'Location'])
    data = list(reader)
    print(data)

# Returns:
# [{'Name': 'Nik', 'Age': '34', 'Site': 'datagy.io', 'Location': 'Toronto'}, {'Name': 'Kate', 'Age': '33', 'Site': 'google', 'Location': 'Paris'}, {'Name': 'Evan', 'Age': '32', 'Site': 'bing', 'Location': 'New York City'}, {'Name': 'Kyra', 'Age': '35', 'Site': 'yahoo', 'Location': 'Atlanta'}]

Depending on the size of your CSV file, this unpacking step can use significant amounts of memory. Be mindful of this. If needed, simply loop over each item in the DictReader object and append them line by line.
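As a rough sketch of that line-by-line approach, the loop below keeps only the rows that meet a condition, so the rest never have to be stored. The age threshold is an arbitrary example, and io.StringIO stands in for an open file.

```python
import csv
import io

# io.StringIO stands in for an open file without a header row
file = io.StringIO(
    'Nik,34,datagy.io,Toronto\n'
    'Kate,33,google,Paris\n'
    'Evan,32,bing,New York City\n'
    'Kyra,35,yahoo,Atlanta\n'
)

reader = csv.DictReader(
    file, fieldnames=['Name', 'Age', 'Site', 'Location'])

# Only rows that pass the check are appended; the others are discarded as we go
over_33 = []
for row in reader:
    if int(row['Age']) > 33:
        over_33.append(row)

print([row['Name'] for row in over_33])
```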

How to Read a CSV File with a Header to a Dictionary in Python

If your file has a header row, you don’t need to pass in field names when reading a CSV file – Python will infer the field names from the first row of data. Let’s modify our CSV file to include field names, as shown below:

Name,Age,Website,Location
Nik,34,datagy.io,Toronto
Kate,33,google,Paris
Evan,32,bing,New York City
Kyra,35,yahoo,Atlanta

When we read our CSV file as we have before, we can skip the fieldnames= parameter. Let’s see what this looks like:

# Reading a CSV File with a Header Row in Python
import csv

with open('file.csv', 'r') as file:
    reader = csv.DictReader(file)
    data = list(reader)
    print(data)

# Returns:
# [{'Name': 'Nik', 'Age': '34', 'Website': 'datagy.io', 'Location': 'Toronto'}, {'Name': 'Kate', 'Age': '33', 'Website': 'google', 'Location': 'Paris'}, {'Name': 'Evan', 'Age': '32', 'Website': 'bing', 'Location': 'New York City'}, {'Name': 'Kyra', 'Age': '35', 'Website': 'yahoo', 'Location': 'Atlanta'}]

We can see how intuitive it can be to read a file that contains a header row. Because Python uses the field names as the dictionary keys, we can simplify the process we use significantly.
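If you want to check which field names were inferred, the DictReader object exposes them through its fieldnames attribute. In this minimal sketch, io.StringIO stands in for an open file:

```python
import csv
import io

# io.StringIO stands in for an open file that includes a header row
file = io.StringIO(
    'Name,Age,Website,Location\n'
    'Nik,34,datagy.io,Toronto\n'
    'Kate,33,google,Paris\n'
)

reader = csv.DictReader(file)

# Accessing fieldnames reads the header row if it hasn't been read yet
fields = reader.fieldnames

# The next record is therefore the first row of data
first = next(reader)

print(fields)
print(first)
```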

In the following sections, you’ll learn how to customize the behavior of reading CSV files, such as when files contain custom delimiters or leading spaces.

How to Handle Custom Delimiters in Reading CSV Files in Python

When reading CSV files that have custom delimiters, you can use the delimiter= parameter when creating either a reader object or a DictReader object. This allows you to specify the one-character string that’s used as the delimiter in the file.

In many cases, the custom delimiter will be a tab or a pipe character. Imagine that we’re working with the file below:

Name|Age|Website|Location
Nik|34|datagy.io|Toronto
Kate|33|google|Paris
Evan|32|bing|New York City
Kyra|35|yahoo|Atlanta

In the file above, the data are pipe separated. We can specify that we want our file to be read with this custom delimiter in mind by using the delimiter= parameter when creating the reader object. Let’s see what this looks like in Python:

# Reading a CSV File with a Custom Delimiter
import csv

with open('file.csv', 'r') as file:
    reader = csv.reader(file, delimiter='|')
    data = list(reader)
    print(data)

# Returns:
# [['Name', 'Age', 'Website', 'Location'], ['Nik', '34', 'datagy.io', 'Toronto'], ['Kate', '33', 'google', 'Paris'], ['Evan', '32', 'bing', 'New York City'], ['Kyra', '35', 'yahoo', 'Atlanta']]

In the code block above, we passed in the '|' character as our delimiter. Python will parse the file for this delimiter and split the data accordingly.
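Tab-separated files work the same way. As a quick sketch, we can pass the tab escape sequence, '\t', as the delimiter. The sample data is hypothetical, and io.StringIO stands in for an open file.

```python
import csv
import io

# io.StringIO stands in for an open, tab-separated file
file = io.StringIO(
    'Name\tAge\tWebsite\tLocation\n'
    'Nik\t34\tdatagy.io\tToronto\n'
    'Kate\t33\tgoogle\tParis\n'
)

# '\t' is the escape sequence for a tab character
reader = csv.reader(file, delimiter='\t')
data = list(reader)

print(data)
```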

How to Skip a Preceding Space When Reading CSV Files in Python

Some CSV formats will include a space following a comma for readability. When Python reads these files, it will actually include this space in the resulting data. Let’s see what this looks like when we read the following file:

Name, Age, Website, Location
Nik, 34, datagy.io, Toronto
Kate, 33, google, Paris
Evan, 32, bing, New York City
Kyra, 35, yahoo, Atlanta

When we read this file using the methods you have learned above, we get the following result (note that we’re only reading the first line):

# Reading a File with Preceding Spaces
import csv

with open('file.csv', 'r') as file:
    reader = csv.reader(file)
    print(next(reader))

# Returns:
# ['Name', ' Age', ' Website', ' Location']

We can see that each of the values (except for the first) has a preceding space. This is absolutely not how we want to read the data.

In order to resolve the initial space issue, we can use the skipinitialspace= parameter. By default, this is set to False. By modifying it to True, Python will skip the initial space. Let’s see what this looks like:

# Skipping an Initial Space When Reading a CSV
import csv

with open('file.csv', 'r') as file:
    reader = csv.reader(file, skipinitialspace=True)
    print(next(reader))

# Returns:
# ['Name', 'Age', 'Website', 'Location']

We can see that by using the skipinitialspace= parameter, we were able to resolve the preceding space issues.
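The same parameter can be passed to the DictReader class, which forwards it to the underlying reader. This matters even more with dictionaries, since a stray space would otherwise end up in the keys. A short sketch, with io.StringIO standing in for an open file:

```python
import csv
import io

# io.StringIO stands in for an open file with a space after each comma
file = io.StringIO(
    'Name, Age, Website, Location\n'
    'Nik, 34, datagy.io, Toronto\n'
)

# Without skipinitialspace=True, the keys would be ' Age', ' Website', and so on
reader = csv.DictReader(file, skipinitialspace=True)
rows = list(reader)

print(rows)
```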

How to Specify Quoting Characters When Reading CSV Files in Python

When working with files that have quoting characters in them, Python provides a number of different options to read CSV files. Let’s take a look at this example file below:

Name,Num,Last Message
Nik,1,"Hey, how's it going?"
Kate,2,"Not so bad!"

When we read this file using the method we have learned so far, we get the following results:

# Reading a CSV File With Quoting Characters
import csv

with open('file2.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

# Returns:
# ['Name', 'Num', 'Last Message']
# ['Nik', '1', "Hey, how's it going?"]
# ['Kate', '2', 'Not so bad!']

We can see that the csv module removed the surrounding double quotes and kept the comma inside the quoted field as part of the value. (The double quotes around the first message in the output are only how Python displays a string that contains an apostrophe.) The csv module provides a number of different constants to control quoting, which are passed into the quoting= parameter:

  1. csv.QUOTE_ALL specifies that all fields have quotes around them
  2. csv.QUOTE_MINIMAL, the default, specifies that only the fields that contain special characters, such as the delimiter, quote character or any line terminator, will have quotes around them
  3. csv.QUOTE_NONNUMERIC specifies that the CSV file has quotes around non-numeric entries. When reading, every unquoted field is converted to a float
  4. csv.QUOTE_NONE specifies that none of the entries have quotes around them. When reading, quote characters get no special treatment

Let’s see how we can pass one of these constants into the quoting= parameter. In the example below, we use csv.QUOTE_MINIMAL, which is also the default, so the results match what we saw above:

# Specifying Quoting Formats
import csv

with open('file2.csv', 'r') as file:
    reader = csv.reader(file, quoting=csv.QUOTE_MINIMAL)
    for row in reader:
        print(row)

# Returns:
# ['Name', 'Num', 'Last Message']
# ['Nik', '1', "Hey, how's it going?"]
# ['Kate', '2', 'Not so bad!']
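When reading, csv.QUOTE_NONNUMERIC behaves differently from the other constants: the reader converts every unquoted field to a float. The sketch below uses hypothetical sample data in which all text fields are quoted and there is no header row, since an unquoted text value would raise a ValueError. io.StringIO stands in for an open file.

```python
import csv
import io

# Hypothetical data: text fields are quoted, numbers are not, and there is no header
file = io.StringIO(
    '"Nik",34,"datagy.io"\n'
    '"Kate",33,"google"\n'
)

# Unquoted fields are converted to floats while reading
reader = csv.reader(file, quoting=csv.QUOTE_NONNUMERIC)
data = list(reader)

print(data)
```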

In the following section, you’ll learn how to use dialects to simplify reading multiple, similar files.

How to Use Dialects when Reading CSV Files in Python

Dialects in the CSV module allow you to set customization criteria to easily write and read files in the same style. While we have only covered custom delimiters so far, the CSV library provides significant flexibility to customize CSV files. Because of this, dialects allow you to set preset styles that can be used to read and write CSV files in Python.

Let’s see how we can create a dialect and use it when reading a CSV file:

# Registering and Using a Dialect
import csv

csv.register_dialect('sample_dialect', skipinitialspace=True, delimiter='|')

with open('file.csv', 'r') as file:
    reader = csv.reader(file, dialect='sample_dialect')
    for row in reader:
        print(row)

# Returns:
# ['Name', 'Age', 'Website', 'Location']
# ['Nik', '34', 'datagy.io', 'Toronto']
# ['Kate', '33', 'google', 'Paris']
# ['Evan', '32', 'bing', 'New York City']
# ['Kyra', '35', 'yahoo', 'Atlanta']

In the example above, we created a dialect using the csv.register_dialect() function. The function allows you to pass in a name as a string. Then, you can either pass in a Dialect subclass or specify custom formatting parameters as keyword arguments, as we have done.

This allows you to easily re-use the dialect when you are reading or writing CSV files. Because the dialect isn’t tied to a particular reader class, it works with both csv.reader() and csv.DictReader.
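As a minimal sketch of that reuse, the example below registers a dialect under the hypothetical name 'pipe_dialect' and passes it to a DictReader. The sample data is made up, and io.StringIO stands in for an open file.

```python
import csv
import io

# 'pipe_dialect' is a hypothetical name chosen for this sketch
csv.register_dialect('pipe_dialect', delimiter='|', skipinitialspace=True)

# io.StringIO stands in for an open, pipe-delimited file
file = io.StringIO(
    'Name| Age| Website\n'
    'Nik| 34| datagy.io\n'
)

# The same registered dialect works with DictReader as well as csv.reader()
reader = csv.DictReader(file, dialect='pipe_dialect')
rows = list(reader)

print(rows)

# Registered dialect names can be listed at any time
print('pipe_dialect' in csv.list_dialects())
```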

In the section below, you’ll learn how to use the popular Pandas library to read CSV files in Python.

How to Read a CSV File With Pandas in Python

The popular data analysis library, pandas, provides helpful functions for reading different types of files, including CSV files. The read_csv() function offers a great deal of flexibility in terms of reading CSV files into pandas DataFrame structures.

Let’s see how we can use the pandas read_csv() function to read a CSV file:

# Reading a CSV File with Pandas
import pandas as pd

df = pd.read_csv('file.csv')
print(df)

# Returns:
#    Name   Age     Website        Location
# 0   Nik    34   datagy.io         Toronto
# 1  Kate    33      google           Paris
# 2  Evan    32        bing   New York City
# 3  Kyra    35       yahoo         Atlanta

Let’s break down what we did in the code block above:

  1. We imported the pandas library using the convention, pd
  2. We then created a DataFrame, df, by passing the path to the file into the read_csv() function
  3. Finally, we printed the DataFrame, which returned a two-dimensional tabular data structure

The pandas read_csv() function provides an extensive number of parameters, including:

  • delimiter= (an alias for sep=) which specifies the delimiter to use
  • usecols= which specifies what columns to read in
  • skiprows= which specifies the number of rows to skip
  • nrows= which specifies how many rows to read in

The pandas read_csv() function provides huge amounts of flexibility for reading data into tabular datasets.

Conclusion

In this guide, you learned how to use Python to read CSV files to both lists and dictionaries. The Python csv module provides significant flexibility in terms of reading CSV files. By providing reader objects for both lists and dictionaries, the module lets you customize the output of your CSV file.

You first learned how to read CSV files to lists, including reading a single row, multiple rows, and adding headers. Then, you covered the same topics with Python dictionaries, as well as working with the additional options that the DictReader class provides. Finally, you learned how to customize behavior by exploring different options such as modifying delimiters and double quoting, as well as creating dialects for easier reusability.

Nik Piepenbreier

Nik is the author of datagy.io and has over a decade of experience working with data analytics, data science, and Python. He specializes in teaching developers how to use Python for data science using hands-on tutorials.