In this tutorial, you’ll learn how to use Python to split a string on multiple delimiters. You’ll learn how to do this with the built-in regular expressions library re as well as with the built-in string .split() method.
But why even learn how to split data? Splitting data is an immensely useful skill. Data comes in all shapes, and it’s often not as clean as we would like it to be. There will be many times when you want to split a string by multiple delimiters to make it easier to work with.
Now let’s get started!
How do you split a string in Python?
Python has a built-in method you can apply to a string, called .split(), which allows you to split a string by a certain delimiter.
The method looks like this:
str.split(sep=None, maxsplit=-1)
In this method:
- sep: the delimiter string to split on. If no argument is provided, the string is split on runs of whitespace.
- maxsplit: the maximum number of splits to do, where the default value is -1, meaning that all occurrences are split.
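To make the two arguments concrete, here is a minimal sketch of how the whitespace default, an explicit separator, and maxsplit behave. The sample strings are made up for illustration.

```python
text = 'my  name is\tnik'

# With no arguments, runs of whitespace (spaces, tabs, newlines) act as one delimiter
whitespace_split = text.split()
print(whitespace_split)
# ['my', 'name', 'is', 'nik']

# With an explicit separator, every occurrence splits, so empty strings can appear
space_split = text.split(' ')
print(space_split)
# ['my', '', 'name', 'is\tnik']

# maxsplit=1 stops after the first split and leaves the rest of the string intact
limited_split = 'a,b,c,d'.split(',', maxsplit=1)
print(limited_split)
# ['a', 'b,c,d']
```

Note that the whitespace default and an explicit ' ' separator are not equivalent: only the default collapses repeated whitespace.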
Let’s say you had a string that you wanted to split by commas – let’s learn how to do this:
sample_string = 'my name is nik, welcome to datagy'
split_string = sample_string.split(',')
print(split_string)

# Returns: ['my name is nik', ' welcome to datagy']
We can see here that what’s returned is a list that contains all of the newly split values.
Split a Python String on Multiple Delimiters using Regular Expressions
The most intuitive way to split a string on multiple delimiters is to use the built-in regular expression library re. The library has a built-in split() function, similar to the string method covered above. What’s unique about this function is that it allows you to use regular expressions to split your strings.
Let’s see what this function looks like:
re.split(pattern, string, maxsplit=0, flags=0)
Similar to the example above, the maxsplit= argument allows us to set how often a string should be split. If it’s set to any positive number, it’ll split at most that number of times. The default value of 0 means there is no limit (unlike str.split(), where the default is -1).
So, let’s repeat our earlier example with the re module:
import re

sample_string = 'my name is nik, welcome to datagy'
split_string = re.split(',', sample_string)
print(split_string)

# Returns: ['my name is nik', ' welcome to datagy']
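The maxsplit argument of re.split() can be sketched in the same way. This is a small illustration with a made-up sample string, not part of the original example.

```python
import re

numbers = 'one,two,three,four'

# maxsplit=0 (the default) places no limit on the number of splits
all_parts = re.split(',', numbers)
print(all_parts)
# ['one', 'two', 'three', 'four']

# maxsplit=2 splits at the first two commas and keeps the remainder whole
limited_parts = re.split(',', numbers, maxsplit=2)
print(limited_parts)
# ['one', 'two', 'three,four']
```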
Now, say you have a string with multiple delimiters. The re module makes it easy to split this string too!
Let’s take a look at another example:
import re

sample_string = 'hi! my name is nik, welcome; to datagy'
split_string = re.split(r',|!|;', sample_string)
print(split_string)

# Returns: ['hi', ' my name is nik', ' welcome', ' to datagy']
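What we’ve done here is passed in a raw string containing a regular expression pattern for re to interpret. The pipe character | acts as an “or” operator, so the pattern matches a comma, an exclamation mark, or a semicolon.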
We can simplify this even further by passing in a regular expression character class, which is a set of characters written inside square brackets. Let’s see how we can do this:
import re

sample_string = 'hi! my name is nik, welcome; to datagy'
split_string = re.split(r'[,;!]', sample_string)
print(split_string)

# Returns: ['hi', ' my name is nik', ' welcome', ' to datagy']
This returns the same thing as before, but it’s a bit cleaner to write and to read.
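The results above keep the space that follows each delimiter. One possible way to drop it, offered here as a sketch rather than as part of the original examples, is to let the pattern also consume optional whitespace around the delimiter with \s*.

```python
import re

sample_string = 'hi! my name is nik, welcome; to datagy'

# \s* on both sides lets each match swallow any whitespace around the delimiter
trimmed = re.split(r'\s*[,;!]\s*', sample_string)
print(trimmed)
# ['hi', 'my name is nik', 'welcome', 'to datagy']
```

This only removes whitespace next to a delimiter. Whitespace at the very start or end of the original string would still need a separate .strip() call.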
Split a Python String on Multiple Delimiters using String Split
You’re also able to avoid use of the re module altogether. The module can be a little intimidating, so if you’re more comfortable with plain string methods, you can accomplish this without the module as well.
In the example below, you’ll learn how to split a Python string with multiple delimiters by first replacing values. We’ll take our string and replace each delimiter with one consistent delimiter, then split on that delimiter. Let’s take a look:
sample_string = 'hi! my name is nik, welcome; to datagy'
new_string = sample_string.replace('!', ',').replace(';', ',')
split_string = new_string.split(',')
print(split_string)

# Returns: ['hi', ' my name is nik', ' welcome', ' to datagy']
This method works fine when you have a small number of delimiters, but it quickly becomes messy when you have more than two or three delimiters that you would want to split your string by. It’s better to stick to the re module for more complex splits.
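If you still prefer to avoid re with a longer list of delimiters, the chained .replace() calls can be folded into a loop. The helper below, split_without_re, is a hypothetical name used only for this sketch, and it assumes every delimiter is a non-empty string.

```python
def split_without_re(string, delimiters):
    """Hypothetical helper: split on several delimiters using only str methods."""
    if not delimiters:
        return [string]

    # Use the first delimiter as the single, consistent delimiter
    target = delimiters[0]

    # Replace every other delimiter with the target delimiter
    for delimiter in delimiters[1:]:
        string = string.replace(delimiter, target)

    return string.split(target)


sample_string = 'hi! my name is nik, welcome; to datagy'
loop_result = split_without_re(sample_string, [',', '!', ';'])
print(loop_result)
# ['hi', ' my name is nik', ' welcome', ' to datagy']
```

The loop scans the string once per delimiter, so for many delimiters or very long strings the re approach is likely the better fit.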
Create a Function to Split a Python String with Multiple Delimiters
Finally, let’s take a look at how to split a string using a function. For this function, we’ll use the re module. You’ll be able to pass in a string and a list of delimiters and have a list of split strings returned.
Let’s get started!
from re import escape, split

def split_string(string, delimiters):
    """Splits a string by a list of delimiters.

    Args:
        string (str): string to be split
        delimiters (list): list of delimiters

    Returns:
        list: list of split strings
    """
    # Escape each delimiter so that characters like '.' or '|' are matched literally
    pattern = '|'.join(escape(delimiter) for delimiter in delimiters)
    return split(pattern, string)

sample_string = 'hi! my name is nik, welcome; to datagy'
split_strings = split_string(sample_string, [',', ';', '!'])
print(split_strings)

# Returns: ['hi', ' my name is nik', ' welcome', ' to datagy']
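One thing to keep in mind when building a pattern from a list of delimiters: some characters, such as . or |, have a special meaning in regular expressions. The sketch below uses a made-up sample string to compare joining the delimiters as-is with escaping them first via re.escape().

```python
import re

delimiters = ['.', ';']
sample_string = 'a.b;c'

# Joined as-is, '.' is read as "any character", so every character acts as a delimiter
naive_pattern = '|'.join(delimiters)
naive_result = re.split(naive_pattern, sample_string)
print(naive_result)
# ['', '', '', '', '', '']

# re.escape() makes each delimiter match literally
escaped_pattern = '|'.join(re.escape(delimiter) for delimiter in delimiters)
escaped_result = re.split(escaped_pattern, sample_string)
print(escaped_result)
# ['a', 'b', 'c']
```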
In this post, you learned how to split a Python string by multiple delimiters. You learned how to do this using the built-in .split() method, as well as the built-in regular expression library re.