In this tutorial, you’ll learn how to use Python to remove newline characters from a string.
Working with strings in Python can be tricky, and it often involves a lot of data pre-processing. Since the strings we find online often come with many issues, learning how to clean your strings can save you a lot of time. One common issue you’ll encounter is extra newline characters that can cause problems in your work.
The Quick Answer: Use Python string.replace()
What are Python Newline Characters
Python comes with a special character to let the computer know to insert a new line. This is called the newline character, and it is written like this: \n
When you have a string that includes this character, the text following the newline character will be printed on a new line.
Let’s see how this looks in practice:
a_string = 'Hello!\nWelcome to Datagy!\nHow are you?\n'
print(a_string)
# Returns
# Hello!
# Welcome to Datagy!
# How are you?
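Because newline characters are invisible once a string is printed, it can help to inspect the raw string instead. The snippet below is a small sketch that reuses the sample string from above; the variable names `visible` and `newline_count` are only illustrative.

```python
a_string = 'Hello!\nWelcome to Datagy!\nHow are you?\n'

# repr() shows escape sequences such as \n instead of rendering them
visible = repr(a_string)
print(visible)

# .count() tells us how many newline characters the string contains
newline_count = a_string.count('\n')
print(newline_count)
```

This is a handy check to run before and after cleaning a string, since it confirms whether any newline characters are left.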
Now that you know how newline characters work in Python, let’s learn how you can remove them!
Use Python to Remove All Newline Characters from a String
Python’s strings come built in with a number of useful methods. One of these is the .replace() method, which does exactly what it describes: it allows you to replace parts of a string.
# Use string.replace() to replace newline characters in a string
a_string = 'Hello!\n Welcome to Datagy!\n How are you?\n'
a_string = a_string.replace('\n', '')
print(a_string)
# Returns: Hello! Welcome to Datagy! How are you?
Let’s see what we’ve done here:
- We called the string.replace() method on our string
- The first positional argument indicates which substring we want to replace. Here, we specified the newline character, \n
- The second argument indicates what to replace that substring with. In this case, we replaced it with an empty string, thereby removing the character.
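One thing to watch for: if the text has no space after each newline, replacing with an empty string joins the neighbouring words together. The sketch below, which uses a sample string without those spaces, compares that with replacing each newline with a single space instead. The names `no_gap` and `spaced` are hypothetical, and the final .strip() call is just one way to drop the leftover trailing space.

```python
a_string = 'Hello!\nWelcome to Datagy!\nHow are you?\n'

# Replacing with an empty string glues the lines together
no_gap = a_string.replace('\n', '')
print(no_gap)

# Replacing with a space keeps the sentences apart;
# .strip() then removes the space left by the final newline
spaced = a_string.replace('\n', ' ').strip()
print(spaced)
```

Which replacement you want depends on whether the newlines in your data are standing in for spaces or are purely extra.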
In this section, you learned how to use string.replace() to remove newline characters from a Python string. In the next section, you’ll learn how to remove trailing newlines.
Tip! If you want to learn more about how to use the .replace() method, check out my in-depth guide here.
Use Python to Remove Trailing Newline Characters from a String
There may be times in your text pre-processing when you don’t want to remove all newline characters, but only the trailing ones. In these cases, the .replace() method isn’t ideal. Thankfully, Python comes with a different string method that allows us to strip characters from the trailing end of a string: the .rstrip() method.
Let’s dive into how this method works in practice:
# Remove trailing newline characters from a string in Python
a_string = 'Hello! \nWelcome to Datagy! \nHow are you?\n'
a_string = a_string.rstrip()
print(a_string)
# Returns
# Hello!
# Welcome to Datagy!
# How are you?
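The .rstrip() method works by removing any whitespace characters from the right end of the string. Because a newline counts as whitespace, we didn’t need to specify the newline character. Note that the newlines in the middle of the string are left in place.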
If you only want to remove trailing newline characters, you can pass '\n' as an argument, letting Python know to keep any other trailing whitespace. This would look like the line below:
a_string = a_string.rstrip('\n')
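To make the difference between the two calls concrete, here is a short sketch with a sample string that ends in a space followed by two newlines. The variable names are only for illustration.

```python
a_string = 'Hello!\nHow are you? \n\n'

# No argument: strips all trailing whitespace (spaces and newlines)
stripped_all = a_string.rstrip()
print(repr(stripped_all))

# With '\n': strips only trailing newlines, so the trailing space stays
stripped_newlines = a_string.rstrip('\n')
print(repr(stripped_newlines))
```

In both cases the newline in the middle of the string is untouched, and every trailing newline is removed, not just the last one.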
In the next section, you’ll learn how to use regex to remove newline characters from a string in Python.
Tip! If you want to learn more about the .rstrip() (as well as the .lstrip()) method in Python, check out my in-depth tutorial here.
Use Python Regex to Remove Newline Characters from a String
Python’s built-in regular expression library, re, is a very powerful tool that lets you work with strings and manipulate them in creative ways. One of the things we can use regular expressions (regex) for is removing newline characters from a Python string.
Let’s see how we can do this:
# Use regular expressions to remove newline characters from a string in Python
import re
a_string = 'Hello! \nWelcome to Datagy! \nHow are you?\n'
a_string = re.sub('\n', '', a_string)
print(a_string)
# Returns: Hello! Welcome to Datagy! How are you?
Let’s see what we’ve done here:
- We imported re to allow us to use the regex library
- We used the re.sub() function, to which we passed three arguments: (1) the pattern we want to replace, (2) the string we want to replace it with, and (3) the string on which the replacement is to be done
It may seem like overkill to use re for this, and it often is, but if you’re importing re anyway, you may as well use this approach, as it lets you do much more complex removals!
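As one example of a more complex removal, the sketch below collapses any run of line-break characters, including Windows-style \r\n endings, into a single space. The sample string and the pattern `[\r\n]+` are assumptions chosen for illustration, not part of the examples above.

```python
import re

a_string = 'Hello!\r\nWelcome to Datagy!\n\n\nHow are you?\n'

# [\r\n]+ matches one or more consecutive carriage returns or newlines,
# so each run of line breaks becomes a single space
cleaned = re.sub(r'[\r\n]+', ' ', a_string).strip()
print(cleaned)
```

Doing the same with .replace() would take several chained calls, which is where a regex starts to pay off.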
In this post, you learned how to use Python to remove newline characters from a string. You learned how to do this using the string.replace() method, how to remove trailing newlines using .rstrip(), and how to accomplish this using the regular expression library’s re.sub() function.
To learn more about the .rstrip() method, check out the official documentation here.