In this tutorial, you’ll learn how to use the NumPy split function to split an array into chunks. Being able to work with and manipulate arrays in NumPy using Python is an important skill for anyone working with data.
By the end of this tutorial, you’ll have learned how to:
- Understand and use the NumPy split() function
- Split a NumPy array into evenly-sized chunks
- Split NumPy arrays into differently-sized chunks
- Split NumPy arrays across different axes
Want to learn how to split a list into chunks instead? Check out this complete guide to splitting lists in Python.
Understanding the NumPy Split Function
The NumPy split() function takes three parameters, two of which are required: the array that you want to split, and either the number of sections or the indices at which to split it. Optionally, you can also pass in the axis, which defaults to 0.
The code block below shows all of the different parameters that are available in the function:
# Understanding the NumPy split() Function
import numpy as np
np.split(ary, indices_or_sections, axis=0)
In my experience, the axis parameter is rarely used. However, it’s important to know that it’s there. We’ll dive into using it as well, to make sure that all your bases are covered.
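Before using the function, it helps to know what it hands back. The short sketch below (the variable names are only illustrative) checks that np.split() returns a plain Python list, and that the chunks are views of the original array rather than copies, so writing to a chunk also changes the source array:

```python
import numpy as np

arr = np.arange(6)
chunks = np.split(arr, 2)

# The return value is a Python list of NumPy arrays
print(type(chunks))
print(len(chunks))

# Each chunk shares memory with the original array (it is a view)
print(np.shares_memory(arr, chunks[0]))

# Writing to a chunk therefore changes the original array as well
chunks[0][0] = 99
print(arr[0])
```

If you need independent chunks, calling .copy() on each one avoids this coupling.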
Splitting a NumPy Array Into Evenly-Sized Chunks
One of the simplest use cases of the NumPy split function is to split the array into a number of evenly-sized chunks. This is done using the indices_or_sections parameter. To split the array into evenly-sized chunks, pass in a single integer representing the number of chunks you want.
Let’s first create an array using the NumPy arange function, which creates a sequence of numbers. We’ll then split the array into three chunks:
# Splitting a NumPy Array into Evenly-Sized Chunks
import numpy as np
arr = np.arange(9)
arrs = np.split(arr, 3)
print(arrs)
# Returns:
# [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]
Let’s break down what we did in the code block above:
- We first create a NumPy array called “arr” using np.arange(9), which creates an array with values 0 through 8.
- We then use the np.split() function to split the “arr” array into 3 evenly-sized chunks.
- The resulting chunks are stored in a list called “arrs”.
- Finally, the print() function displays the contents of “arrs”: a list of 3 NumPy arrays, each containing 3 elements.
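Because the result is an ordinary list, the chunks can also be unpacked directly into separate variables when you know how many to expect. A minimal sketch, with the names first, middle and last chosen purely for illustration:

```python
import numpy as np

arr = np.arange(9)

# Unpack the three equal chunks into individual variables
first, middle, last = np.split(arr, 3)

print(first)   # [0 1 2]
print(middle)  # [3 4 5]
print(last)    # [6 7 8]
```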
To recap, np.split() divides a NumPy array into evenly-sized chunks and returns them in a list. Let’s now see how we can use the function to split the array at different indices.
Splitting a NumPy Array Into Chunks
Alternatively, you can pass in a list of indices at which to split the array. This can be useful when you want to create multiple different arrays from an original array at different indices. Let’s take a look at an example:
# Splitting a NumPy Array at Indices
import numpy as np
arr = np.arange(9)
arrs = np.split(arr, [1,5])
print(arrs)
# Returns:
# [array([0]), array([1, 2, 3, 4]), array([5, 6, 7, 8])]
In the example above, we passed in the list of indices [1, 5]. This splits the original array into the following slices:
- [0:1] – from the beginning up to (but not including) index one
- [1:5] – from index one up to (but not including) index five
- [5:] – from index five through to the end
This is a much simpler and cleaner way of cutting an array into several pieces than slicing out each piece by hand.
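To make the link to slicing concrete, the sketch below compares the result of np.split() with slices written by hand, and then builds the same slices from the index list. The names bounds and pieces are illustrative, not part of NumPy:

```python
import numpy as np

arr = np.arange(9)
indices = [1, 5]

# Split at the given indices
from_split = np.split(arr, indices)

# The equivalent slices, written out by hand
from_slices = [arr[:1], arr[1:5], arr[5:]]

for a, b in zip(from_split, from_slices):
    print(np.array_equal(a, b))  # True for every pair

# The same slices, built from the index list: pair each boundary
# with the next one, adding the start and the end of the array
bounds = [0, *indices, len(arr)]
pieces = [arr[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]
print([p.tolist() for p in pieces])
```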
Splitting a NumPy Array Across Different Axes
The third parameter, axis, tells NumPy which axis to split the array along. By default, the axis parameter is set to 0, which splits a two-dimensional array row-wise; by changing it to 1, the array is split column-wise.
Let’s create a two-dimensional array by stacking the array over itself using the stack function. Then, we’ll split the array along axis 1 (the second axis).
# Splitting an Array Across Different Axes
import numpy as np
arr = np.arange(3)
arrs = np.stack([arr, arr, arr])
split = np.split(arrs, 3, axis=1)
print(split)
# Returns:
# [
#  array([[0],
#         [0],
#         [0]]),
#  array([[1],
#         [1],
#         [1]]),
#  array([[2],
#         [2],
#         [2]])
# ]
In the example above, we created a 3×3 array by stacking the same values on top of each other. We then split it into three chunks along axis 1, so each chunk holds one column of the original array.
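Comparing the shapes of the chunks is a quick way to see what the axis parameter changes. The sketch below uses a 3×4 array (an assumption made only for this illustration) and splits it once along each axis:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

# axis=0 splits along the rows: three chunks of one row each
rows = np.split(grid, 3, axis=0)
print([r.shape for r in rows])  # [(1, 4), (1, 4), (1, 4)]

# axis=1 splits along the columns: two chunks of two columns each
cols = np.split(grid, 2, axis=1)
print([c.shape for c in cols])  # [(3, 2), (3, 2)]

# The first column-wise chunk holds the first two columns of every row
print(cols[0].tolist())  # [[0, 1], [4, 5], [8, 9]]
```

Note that the number of sections has to divide the length of the chosen axis evenly: 3 works for the rows here and 2 works for the columns.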
Let’s now dive into dealing with errors that can be returned from the function.
Dealing with ValueError in NumPy Split
The NumPy split function will raise a ValueError if the function is passed an integer, but the array cannot be split into evenly-sized chunks.
Let’s take a look at an example of how this might occur:
import numpy as np
arr = np.arange(9)
arrs = np.split(arr, 2)
print(arrs)
# Raises:
# ValueError: array split does not result in an equal division
In the code block above, we attempted to split our array with nine values into two chunks. Because the function expects the values to be split evenly, if this isn’t possible, NumPy will raise an error.
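If uneven chunks are acceptable, one option is to catch the error and fall back to np.array_split(), a separate NumPy function that allows sections of unequal size. This is a sketch of one possible approach rather than the only way to handle the error:

```python
import numpy as np

arr = np.arange(9)

try:
    # Raises ValueError because 9 items cannot be divided evenly by 2
    chunks = np.split(arr, 2)
except ValueError:
    # array_split() tolerates uneven sections; earlier chunks get the extra items
    chunks = np.array_split(arr, 2)

print([c.tolist() for c in chunks])  # [[0, 1, 2, 3, 4], [5, 6, 7, 8]]
```

Whether falling back is appropriate depends on your data: if downstream code assumes equal-sized chunks, letting the ValueError surface is the safer choice.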
Conclusion
In conclusion, the NumPy split function is a powerful tool for working with arrays in Python. By understanding the different parameters available, you can split arrays into evenly-sized or differently-sized chunks, as well as across different axes.
Additionally, it’s important to be aware of potential errors that may arise when using the function, such as the ValueError that occurs when attempting to split an array into uneven chunks. By mastering the NumPy split function, you’ll be better equipped to handle and manipulate data in a variety of contexts.