Python: Remove Punctuation from a String (3 Different Ways!)

In this tutorial, you’ll learn how to use Python to remove punctuation from a string. You’ll learn how to strip punctuation using the str.translate() method, the str.replace() method, the regular expression library re, and, finally, a for-loop.

Being able to work with and manipulate strings is an essential skill for any budding Pythonista. Strings you find on the internet or in your files often need quite a bit of cleaning before you can analyze them. One task you’ll encounter often is removing punctuation from a string.

The Quick Answer: Use .translate() for the fastest performance

Use Python to Remove Punctuation from a String with Translate

One of the easiest ways to remove punctuation from a string in Python is to use the str.translate() method. The method takes a translation table, which we’ll build using the str.maketrans() method.

Let’s take a look at how we can use the .translate() method to remove punctuation from a string in Python. In order to do this, we’ll import the built-in string module, which comes bundled with a punctuation constant.

import string

a_string = '!hi. wh?at is the weat[h]er lik?e.'
new_string = a_string.translate(str.maketrans('', '', string.punctuation))

print(new_string)

# Returns: hi what is the weather like

The str.maketrans() method here takes three arguments. The first two are empty strings, and the third is a string containing the punctuation characters we want to remove. This builds a table that maps each punctuation character to None, which tells .translate() to delete it.
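To make the translation table less abstract, here is a minimal sketch that builds a table for just two characters and prints it. The variable names (table, sample, cleaned) are only illustrative.

```python
# Build a table that deletes only '!' and '?'
# (the third argument lists the characters to delete)
table = str.maketrans('', '', '!?')

# The table is a dict mapping Unicode code points to None, meaning "delete"
print(table)
# Returns: {33: None, 63: None}

sample = '!hi. wh?at'
cleaned = sample.translate(table)

print(cleaned)
# Returns: hi. what
```

Because only ! and ? are in the table, the period is left untouched.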

In case you’re curious which characters are included in string.punctuation, let’s have a quick look:

print(string.punctuation)

# Returns: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

Want to learn more? If you want to learn how to use the translate method (and others!) to remove a character from a string in Python, check out my in-depth tutorial here.

Use Python to Strip Punctuation from a String with Regular Expressions

The Python regular expression library, re, feels like it can do just about anything – including stripping punctuation from a string!

Regular expressions are great because they come with a number of helpful character classes that let us select different types of characters. For example, \w matches word characters (letters, digits, and the underscore) and \s matches whitespace. Placing both inside a character set and starting it with the ^ character, as in [^\w\s], selects the opposite: anything that isn’t a word character or whitespace. In our case, that means punctuation.

Let’s see how we can use regex to remove punctuation in Python:

import re

a_string = '!hi. wh?at is the weat[h]er lik?e.'
new_string = re.sub(r'[^\w\s]', '', a_string)

print(new_string)

# Returns: hi what is the weather like

This approach looks for anything that isn’t a word character or whitespace and replaces it with an empty string, thereby removing it. Keep in mind that \w also matches the underscore, so underscores are left in place.
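If underscores matter for your data, one possible variation is to build the character set from string.punctuation instead. The sketch below is an alternative to the pattern above, not a replacement for it, and the sample string and variable names are made up for illustration.

```python
import re
import string

snake_string = 'snake_case, is_it?'

# \w counts the underscore as a word character, so it is kept
kept_underscores = re.sub(r'[^\w\s]', '', snake_string)
print(kept_underscores)
# Returns: snake_case is_it

# Build a character set from string.punctuation;
# re.escape() guards characters such as ] and \ that are special in a pattern
punctuation_pattern = '[' + re.escape(string.punctuation) + ']'
no_underscores = re.sub(punctuation_pattern, '', snake_string)
print(no_underscores)
# Returns: snakecase isit
```

The second pattern removes exactly the characters listed in string.punctuation, which matches the behavior of the translate example.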

Use Python to Remove Punctuation from a String with str.replace

The str.replace() method makes easy work of replacing a single character. If you only want to remove one punctuation character, it is a simple, straightforward solution.

Let’s say you only wanted to remove the ! character from our string. We could use the str.replace() method to accomplish this. Let’s take a look at how:

a_string = '!hi. wh?at is the weat[h]er lik?e.'
new_string = a_string.replace('!', '')

print(new_string)

# Returns: hi. wh?at is the weat[h]er lik?e.

What we’ve done here is call the .replace() method on our string. The first parameter is the substring to replace, which in this case is our ! character. The second parameter is what to replace it with, which in this case is an empty string.

In the next example, you’ll learn how to use a for-loop to remove all punctuation from a string.
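One detail worth knowing: .replace() removes every occurrence of the substring, not just the first. The short sketch below shows this with the ? character, along with the optional third argument that limits the number of replacements. The variable names are illustrative.

```python
a_string = '!hi. wh?at is the weat[h]er lik?e.'

# By default, every occurrence of '?' is removed
all_removed = a_string.replace('?', '')
print(all_removed)
# Returns: !hi. what is the weat[h]er like.

# The optional third argument limits how many occurrences are replaced
first_removed = a_string.replace('?', '', 1)
print(first_removed)
# Returns: !hi. what is the weat[h]er lik?e.
```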

Use Python to Strip Punctuation from a String using a for-loop

In the previous section of the tutorial, you learned how to use the str.replace() method to remove a single punctuation character. In this section, we’ll repeat this example, but use a for-loop to be able to remove every punctuation character.

Let’s see how we can do this in Python:

import string

a_string = '!hi. wh?at is the weat[h]er lik?e.'

for character in string.punctuation:
    a_string = a_string.replace(character, '')

print(a_string)

# Returns: hi what is the weather like

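One thing to note here is that we’re writing over our original string. Because strings are immutable, each call to .replace() returns a new string, so we reassign the result to a_string on every pass. If we instead assigned each result to a different variable while still calling .replace() on the original string, only the last punctuation character in the loop would end up removed.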
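If you’d rather keep the original string around, one option is to wrap the loop in a small function that works on its own local variable. This is only a sketch, and remove_punctuation is a hypothetical helper name rather than anything built into Python.

```python
import string

def remove_punctuation(text):
    """Return a copy of text without any character from string.punctuation."""
    cleaned = text  # start from the input and reassign on each pass
    for character in string.punctuation:
        cleaned = cleaned.replace(character, '')
    return cleaned

a_string = '!hi. wh?at is the weat[h]er lik?e.'
new_string = remove_punctuation(a_string)

print(new_string)
# Returns: hi what is the weather like

print(a_string)
# Returns: !hi. wh?at is the weat[h]er lik?e.
```

The loop still overwrites a variable on each pass, but it is the local cleaned variable, so a_string is unchanged.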

Now that you’ve learned a number of methods, let’s see which of these methods is the fastest.

What is the Fastest Way to Strip Punctuation from a Python String?

In this tutorial, you’ve learned three different methods to remove punctuation from a string in Python. Let’s see which of these methods is the fastest.

For this test, we created a string that’s over 1,000,000,000 characters long and removed all of its punctuation using each method.

Let’s take a look at the results:

Method                        Time Taken
str.translate()               2.35 seconds
regular expressions           88.8 seconds
for loop with str.replace()   20.6 seconds

Figuring out which method is fastest to replace all punctuation in a string in Python

Of course, speed isn’t everything, but a method that significantly slows down your code will often lead to a poorer user experience.
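If you want to run a comparison like this yourself, the sketch below uses the built-in timeit module. It is not the exact benchmark behind the table above: the test string is far shorter, the function names are made up for illustration, and your timings will vary by machine and Python version.

```python
import re
import string
import timeit

# Assumption: a much smaller test string than the one used for the table above
test_string = '!hi. wh?at is the weat[h]er lik?e.' * 10_000

def with_translate(text):
    return text.translate(str.maketrans('', '', string.punctuation))

def with_regex(text):
    return re.sub(r'[^\w\s]', '', text)

def with_replace_loop(text):
    for character in string.punctuation:
        text = text.replace(character, '')
    return text

# Time 10 runs of each approach and print the total for each
for func in (with_translate, with_regex, with_replace_loop):
    seconds = timeit.timeit(lambda: func(test_string), number=10)
    print(f'{func.__name__}: {seconds:.4f} seconds')
```

Increasing the multiplier or the number argument gives steadier measurements at the cost of a longer run.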

Conclusion

In this post, you learned how to strip punctuation from a Python string. You learned how to do this using the str.translate() method, as well as regular expressions. You also learned how to do this with the .replace() method as well as with a for-loop. Finally, you learned which of these methods is fastest.

To learn more about the str.translate() method, check out the official documentation here.