Skip to content

How to Zip and Unzip Files in Python

How to Zip and Unzip Files in Python Cover image

In many cases, you’ll encounter zipped files (or need to zip files yourself). Because zip files are so common, being able to work with them programmatically is an important skill. In this tutorial, you’ll learn how to zip and unzip files using Python.

By the end of this tutorial, you’ll have learned:

  • How to zip files using Python
  • How to add files to an existing zip file
  • How to zip files with password protection (encryption)
  • How to unzip one file, some files, or all files in a zip file
  • How to unzip files conditionally

Understanding the Python zipfile Module

Python comes packaged with a module, zipfile, for creating, reading, writing, and appending to a zip file. Because the module is built into Python, there is no need to install anything. In particular, the zipfile module comes with a class, ZipFile, which has many helpful methods to allow you to read and write zip files in Python.

Let’s dive into using the module to first learn how to zip files using the zipfile module.

How to Zip Files Using Python

In order to zip files using Python, we can create a zip file using the Zipfile class. When we open the Zipfile for the first time, Python will create the file if it doesn’t already exist. Because of this, we can simply call the open action without needing to create a file first.

From there, we can use the .write() method to add individual files to a zip file. In order to make this process easier, we can use a for loop to pass in multiple files in a single action. If we want to add all files in a directory, we can create a list of all the files in a folder.

Let’s see how we can use Python to zip files, using all files in a directory:

# How to Zip All Files in a Directory
import os
import zipfile

directory = '/Users/datagy/files'
files = os.listdir(directory)

with zipfile.ZipFile('zipfile.zip', 'w') as zip:
    for file in files:
        file_path = os.path.join(directory, file)
        zip.write(file_path)

In the example above, we used the os.listdir() function to list out all the files in a directory. Then, we used the with context manager to open the file. The benefit of this is that Python will automatically close the file when all operations are done.

From there, we looped over the list of files and wrote each of them to the zip file using the .write() method.

How to Add Files to an Existing Zip File in Python

Adding files to an exciting zip file is made easy using the Zipfile class. When you instantiate a Zipfile object, you can open it in append mode using the 'a' mode. From there, the process works in the same way as though you were adding files to a newly created file.

Let’s see how we can use the Zipfile class to add a file to an existing zip file:

# Add a File to An Existing Zip File in Python
import os
import zipfile

directory = '/Users/datagy/old/'
file_name = 'file.txt'

with zipfile.ZipFile('zipfile.zip', 'a') as zip:
    file_path = os.path.join(directory, file_name)
    zip.write(filename=file_path, arcname=file_name)

Notice that in the code block above that we use a second argument, arcname=. Let’s take a look at the difference between these two parameters:

  1. filename= represents the file that you want to add into your zip file
  2. arcname= represents the path and filename you want to use in the archive itself.

If we were to leave the second argument blank, then it would use the filename value. This would replicate the whole folder structure, meaning that, in our case, Python would add the Users/datagy/old/ directory.

How to Unzip a Zip File in Python

To unzip a zip file using Python, you can use the .open() method and use the 'r' option to read the file. This simply opens the file in read mode. However, how can we see what’s inside it?

That’s where the .printdir() method comes into play. The method will print out the contents and some information about the contents. Let’s take a look at what this looks like:

# Printing Contents of a Zip File
import zipfile

with zipfile.ZipFile('/Users/nikpi/ThatExcelSiteGSC/Archive.zip', 'r') as zip:
    zip.printdir()

# Returns:
# File Name                                             Modified             Size
# file1.txt                                      2022-11-13 12:51:02            0
# file2.txt                                      2022-11-13 07:22:26            0
# file3.txt                                      2022-11-13 07:22:46            0
# file4.txt                                      2022-11-13 07:22:46            0

Now we can see that our zip file has four different files.

How to Extract a Single File From a Zipfile in Python

In order to extract a file, we can use the .extract() method. The method takes both the filename that you want to extract and the destination folder.

Let’s take a closer look at method in more detail:

# Understanding the .extract() Method
ZipFile.extract(member, path=None, pwd=None)

We can see that the method has three different parameters. Let’s break these down in more detail:

  1. member= is the file you want to extract
  2. path= is the destination of where you want to place the file
  3. pwd= is the password if the file is encrypted (which we’ll cover soon)

Let’s see what this looks like to extract a single file from a zip file in Python:

# How to Extract a File from a Zip File in Python
import zipfile

with zipfile.ZipFile('/Users/datagy/Archive.zip', 'r') as zip:
    zip.extract('file1.txt', '/Users/datagy/')

Note that we’re opening the zip file using the 'r' method, indicating we want to read the file. We then instruct Python to extract file1.txt and place it into the /Users/datagy/ directory.

In many cases, you’ll want to extract more than a single file. In the following section, you’ll learn how to extract all files from a zipfile in Python.

How to Extract All Files From a Zipfile in Python

In order to extract all files from a zip file in Python, you can use the .extractall() method. The method only requires that you pass in the destination path to where you want to extract all of the members of the zip file.

Let’s take a look at how to extract all files from a zip file in Python:

# How to Extract All Files from a Zip File in Python
import zipfile

with zipfile.ZipFile('/Users/datagy/Archive.zip', 'r') as zip:
    zip.extractall('/Users/datagy/all/')

In the example above, we can see how simple it is to extract all files from a zip file. We first open the zip file using the context manager. From there, we simply pass the path of where we want to extract the files to into the .extractall() method.

In the next section, you’ll learn how to decrypt encrypted files.

How to Decrypt a Password-Protected Zip File Using Python

The Python zipfile library lets you easily decrypt zip files that have a password. When you’re working with a file that is password protected and simply try to get a file, Python will raise a RunTimeError. Let’s see what this looks like:

import zipfile

with zipfile.ZipFile('/Users/datagy/password.zip', 'r') as zip:
    zip.extractall('/Users/datagy/all/')

# Returns: 
# RuntimeError: File <ZipInfo filename='all/file2.txt' filemode='-rw-r--r--' file_size=0 compress_size=12> is encrypted, password required for extraction

Let’s see how we can pass in a password to extract the files from a password-protected zip file. The .extractall() and .extract() methods both accept an additional parameter, pwd=. This parameter accepts the password of the zip file, or of the individual file itself.

Let’s see how we can use Python to extract files from a password-protected file:

# How to Extract Files from a Password Protected Zip File
import zipfile

with zipfile.ZipFile('/Users/datagy/password.zip', 'r') as zip:
    zip.extractall('/Users/datagy/all/', pwd=b'datagy')

In the example above, we pass in the password in the pwd= parameter. Note that we’re using a binary string, as noted by prefixing a b in front of the string itself.

How to Unzip Files Conditionally in Python

In this section, you’ll learn how to unzip files conditionally. This can be helpful if you have a large zip file and only want to extract files that are of a certain file type. Similarly, you could use the method you’ll learn to only extract files that are smaller than a certain filesize.

In order to extract files conditionally, we can loop over each file in the zip file and see if it meets the condition. Let’s see how we can use Python to extract only jpg files from our zip file.

# How to Extract Files Conditionally
import zipfile
import os

with zipfile.ZipFile('/Users/datagy/Archive.zip', 'r') as zip:
    files = zip.namelist()
    for file in files:
        filename, extension = os.path.splitext(file)

        if extension == '.jpg':
            zip.extract(file, '/Users/datagy/Conditionally/')

Let’s break down how to extract files conditionally from a zip file in Python:

  1. Open the zip file using the read method
  2. We then get a list of all the files in the zip file using the .namelist() method
  3. We then loop over each file and get the filename and extension for each file to check its extension
  4. If the extension is matches the desired extension,then we extract it using the .extract() method

Conclusion

In this tutorial, you learned how to use Python to zip and unzip files. Being able to work with compressed files is a common practice regardless of the industry you work in. For example, working in data science, you’ll often find transactional records compressed to save space.

You first learned how to create a zip file and how to add a single file or all files in the zip file. From there, you learned how to open zip files and get either a single file or all files. Then, you learned how to handle files that are password protected. Finally, you learned how to extract files conditionally, such as files of a certain type.

Additional Resources

To learn more about related topics, check out the tutorials below:

Tags:

Leave a Reply

Your email address will not be published.