In this tutorial, you’ll learn how to use the NumPy where() function to process or return elements based on a single condition or multiple conditions. The np.where() function is one of the most versatile functions in NumPy: it can either return the indices where a condition is met or choose between values, element by element, depending on whether a condition is met. The function is often poorly explained and underused, and this tutorial aims to fix that.
By the end of this tutorial, you’ll have learned:
- What the NumPy where() function is and how to understand its parameters
- How to replace items in an array with the NumPy where() function
- How to process items in an array with the NumPy where() function
- How to use the np.where() function with multiple conditions
- How to use the np.where() function to return indices where a condition is met
Understanding the np.where() Function
Before we dive into using the np.where() function, let’s take a look at what the function is and the different parameters it offers. The official documentation describes the function as “Return elements chosen from x or y depending on condition”.
This means that the function returns elements from one of two arrays (or scalars), x or y, depending on whether each element of the boolean array passed as condition is True or False. If this is still confusing, don’t worry: the examples shown below will help clear up any confusion.
Let’s take a look at the syntax of the np.where() function:
# Understanding np.where()
numpy.where(
condition, # Where True, yield x, otherwise y
[x, y, ] # Values to choose from
)
The syntax of the function can be a bit confusing: x and y are optional, but they must be supplied together or not at all. The examples provided below should make the usage of the function much clearer. Let’s dive right into some examples!
Using np.where() to Replace Items in a NumPy Array
One of the most straightforward use cases of the np.where() function is to replace values in an array. Let’s take a look at an example and then break down what we did:
# Using np.where() to Replace Items in an Array
import numpy as np
arr = np.array([1,2,3,4,5,6,7,8,9,10])
replaced = np.where(arr > 5, 'Met', 'Not Met')
print(replaced)
# Returns:
# ['Not Met' 'Not Met' 'Not Met' 'Not Met' 'Not Met' 'Met' 'Met' 'Met' 'Met' 'Met']
Let’s break down what we did above:
- We loaded numpy using the alias np
- We created an array, arr, that includes the values of 1 through 10
- We then created another array, replaced, which used the np.where() function to replace values in our array, arr
The function broadcasts the condition array against the two choices and returns, for each element, the first value where the condition is True and the second value where it is False.
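To make the element-by-element selection concrete, here is a minimal sketch that compares the np.where() call above with a plain Python list comprehension. The name looped is introduced only for this illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Vectorized selection with np.where()
replaced = np.where(arr > 5, 'Met', 'Not Met')

# Plain-Python equivalent, shown only for comparison (much slower on large arrays)
looped = ['Met' if value > 5 else 'Not Met' for value in arr]

print(replaced.tolist() == looped)
# Returns: True
```

Both approaches pick the first choice where the condition is True and the second where it is False; np.where() simply does so without an explicit loop.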
Similarly, we can use arrays as our selections. For example, if you wanted to return the original value where a condition is met and another value otherwise, you could write the following:
# Using np.where() to Replace Items in an Array with an Array
import numpy as np
arr = np.array([1,2,3,4,5,6,7,8,9,10])
replaced = np.where(arr > 5, arr, -1)
print(replaced)
# Returns:
# [-1 -1 -1 -1 -1 6 7 8 9 10]
Similarly, we could use two arrays in our np.where() function and select from either array based on a condition being met.
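As a rough sketch of selecting between two arrays, the example below uses made-up data: primary and backup are hypothetical sensor readings, and -1 is assumed to mark a missing value.

```python
import numpy as np

# Hypothetical data: two sensors reporting the same five measurements
primary = np.array([10, -1, 30, -1, 50])    # -1 marks a missing reading
backup = np.array([11, 21, 31, 41, 51])

# Take the primary reading where it is valid, otherwise the backup reading
combined = np.where(primary >= 0, primary, backup)
print(combined)
# Returns:
# [10 21 30 41 50]
```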
Using np.where() to Process Items in a NumPy Array
In this section, you’ll learn how to use the np.where() function to process items in a NumPy array. This can be very helpful when you want to apply a calculation based on a condition being met.
Let’s see how this works in Python:
# Using np.where() to Process Items in an Array
import numpy as np
arr = np.array([1,2,3,4,5,6,7,8,9,10])
replaced = np.where(arr % 2 == 0, arr * 10, arr / 10)
print(replaced)
# Returns:
# [ 0.1 20. 0.3 40. 0.5 60. 0.7 80. 0.9 100. ]
In the code above, we evaluate whether each item is an even value (using the modulo operator). If the item is even, we multiply the value by 10. Otherwise, we divide the number by 10. This can be a great way to modify arrays based on a condition.
It’s important to note that in our example, the modified values came from the original array. This doesn’t have to be the case! We could use two different arrays and process them in different ways.
Say we had an array of values that identified each object as either a square or a circle. We also had an array that contains either the radius of a circle or the length of a square’s side. We can use the np.where() function to return an array of the areas, as shown below:
# Using np.where() with a Practical Example
import numpy as np
# types must be a NumPy array: comparing a plain Python list to a string yields a single False
types = np.array(['square', 'circle', 'square', 'circle', 'circle', 'square'])
arr = np.array([2.3, 3.4, 1.2, 5.4, 3.2, 1.1])
areas = np.where(types == 'square', arr ** 2, np.pi * arr ** 2)
print(areas)
# Returns:
# [ 5.29 36.31681108 1.44 91.60884178 32.16990877 1.21 ]
In the example above, we worked with two arrays: one containing information on the shape of an object and another containing some dimensions about that object. We were able to use the np.where() function to calculate the area of the object using the appropriate formula.
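One caveat when processing items this way: both candidate expressions are computed for every element before np.where() picks between them. The sketch below, using a made-up array that contains a zero, shows why that matters for a division, and one possible alternative based on the where and out arguments of np.divide.

```python
import numpy as np

arr = np.array([0.0, 1.0, 2.0, 4.0])

# 1 / arr is computed for every element first, including the zero,
# so the division-by-zero warning has to be silenced explicitly
with np.errstate(divide='ignore'):
    naive = np.where(arr != 0, 1 / arr, 0.0)

# Alternative: np.divide only performs the division where the mask is True
# and leaves the pre-filled zeros in `out` everywhere else
safe = np.divide(1.0, arr, out=np.zeros_like(arr), where=arr != 0)

print(naive)
print(safe)
```

Both arrays hold the same values; the second approach just avoids evaluating the division for elements that would be discarded anyway.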
Using np.where() with Multiple Conditions
In this section, you’ll learn how to use the np.where() function with multiple conditions. This can greatly extend the usefulness of the function.
Let’s take a look at how we can extend an earlier example: we can return the value if it’s greater than five and even, and return 0 otherwise:
# Using np.where() with Multiple Conditions
import numpy as np
arr = np.arange(10)
filtered = np.where((arr % 2 == 0) & (arr > 5), arr, 0)
print(filtered)
# Returns: [0 0 0 0 0 0 6 0 8 0]
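In the example above, we used the & operator to select items where both conditions are True. It’s important to wrap each condition in parentheses: & binds more tightly than comparison operators such as == and >, so without the parentheses Python would group the expression differently and would not evaluate the two conditions as intended.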
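The following sketch illustrates what happens when the parentheses are left out. The flag failed is a name used only for this demonstration.

```python
import numpy as np

arr = np.arange(10)

# Without parentheses, & is applied first, so Python reads this as the
# chained comparison arr % 2 == (0 & arr) > 5, which cannot be reduced
# to a single True/False for a multi-element array
try:
    np.where(arr % 2 == 0 & arr > 5, arr, 0)
    failed = False
except ValueError as error:
    failed = True
    print(f'ValueError: {error}')

# With parentheses, each comparison is evaluated before & combines them
filtered = np.where((arr % 2 == 0) & (arr > 5), arr, 0)
print(filtered)
# Returns: [0 0 0 0 0 0 6 0 8 0]
```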
We can also select items based on either condition being met, using the | operator. Let’s see what this looks like:
# Using np.where() with Multiple Conditions and |
import numpy as np
arr = np.arange(10)
filtered = np.where((arr % 2 == 0) | (arr > 5), arr, 0)
print(filtered)
# Returns: [0 0 2 0 4 0 6 7 8 9]
In this example, we use the | operator, which acts as an element-wise logical or on boolean arrays, to select items where either condition is met.
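If the operators feel cryptic, the same conditions can be spelled out with NumPy's named functions. This is a small sketch showing np.logical_or as an equivalent of |, plus the ~ operator for negating a condition.

```python
import numpy as np

arr = np.arange(10)

with_operator = np.where((arr % 2 == 0) | (arr > 5), arr, 0)
with_function = np.where(np.logical_or(arr % 2 == 0, arr > 5), arr, 0)
print(np.array_equal(with_operator, with_function))
# Returns: True

# ~ negates a boolean array: keep values that are NOT even
odd_only = np.where(~(arr % 2 == 0), arr, 0)
print(odd_only)
# Returns: [0 1 0 3 0 5 0 7 0 9]
```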
Using np.where() with Multiple Dimensions
In this section, we’ll take a look at using the np.where() function with arrays of multiple dimensions. This works the same way as it does for one-dimensional arrays. Let’s take a look at how we can use a two-dimensional array with the np.where() function:
# Using the np.where() Function with a NumPy Matrix
import numpy as np
arr = np.arange(9).reshape(3, 3)
filtered = np.where(arr % 2 == 0, arr, -1)
print(filtered)
# Returns:
# [[ 0 -1 2]
# [-1 4 -1]
# [ 6 -1 8]]
In the example above, we return a value from the original array if it’s even; otherwise, we return -1.
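Because the condition is broadcast against the other arguments, it does not need to have the same shape as the array. As a sketch, the example below uses a hypothetical per-row mask, keep_row, reshaped to (3, 1) so that it applies to every column of its row.

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

# One boolean per row, reshaped to (3, 1) so it broadcasts across the columns
keep_row = (np.arange(3) % 2 == 0).reshape(3, 1)

filtered = np.where(keep_row, arr, -1)
print(filtered)
# Returns:
# [[ 0  1  2]
#  [-1 -1 -1]
#  [ 6  7  8]]
```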
Using np.where() to Return Indices Where a Condition is Met
The np.where() function can also be used to return only the indices of an array where a condition is met. This happens when only the condition is passed in, without x or y. The result is a tuple of index arrays, one per dimension of the input.
Let’s see how we can accomplish this in Python:
# Using np.where() to Return Indices Where a Condition is Met
import numpy as np
arr = np.arange(9)
idx = np.where(arr % 2 == 0)
print(idx)
# Returns:
# (array([0, 2, 4, 6, 8]),)
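The tuple form becomes more useful with a two-dimensional input. The sketch below unpacks the result into row and column indices, uses them to index back into the array, and shows np.argwhere as an alternative that groups the same positions into pairs.

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

# For a 2D input the tuple holds one array of row indices and one of column indices
rows, cols = np.where(arr % 2 == 0)
print(rows)  # [0 0 1 2 2]
print(cols)  # [0 2 1 0 2]

# The index arrays can be used to pull out the matching values
print(arr[rows, cols])  # [0 2 4 6 8]

# np.argwhere groups the same positions as (row, column) pairs
pairs = np.argwhere(arr % 2 == 0)
print(pairs.tolist())  # [[0, 0], [0, 2], [1, 1], [2, 0], [2, 2]]
```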
Conclusion
In this tutorial, you learned how to use the np.where() function to select and transform items in an array that meet a condition. You first learned how to understand the syntax of the function and then worked through a simple example. Then, you learned how to use the function to replace and transform items in an array. You also learned how to use the function with multiple conditions and with arrays of multiple dimensions. Finally, you learned how to use the function to return the indices of an array that meet a condition.
Enjoyed your training article on numpy.where() function. Am new to numpy. Struggling with selective processing of a single dimension in a 2 dimensional array. Most of the articles I am reading cover conditions on the full array.
Would be great to see an advanced example and explanation on how to return the full array and only change values in a single column.
Looking forward to your response.
DeVon
Hi DeVon, sorry for the late reply! Hopefully the sample below helps:
import numpy as np
# Create a sample 2D array
array = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
# Define a condition
condition = array[:, 0] % 2 == 0 # Select rows where the first column is even
# Define values to replace for those rows
new_values = array[:, 1] * 10 # Multiply values in the second column by 10
# Apply the condition using where() and modify only the second column
array[:, 1] = np.where(condition, new_values, array[:, 1])
print(array)
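For what it's worth, the same single-column update can probably also be written without np.where(), by assigning through a boolean mask. This is only a sketch of that alternative, reusing the sample array from the reply above.

```python
import numpy as np

array = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

# Rows whose first column is even
condition = array[:, 0] % 2 == 0

# Multiply the second column by 10 only in those rows, modifying the array in place
array[condition, 1] *= 10

print(array)
# Returns:
# [[ 1  2  3]
#  [ 4 50  6]
#  [ 7  8  9]]
```

The np.where() version builds the whole new column and assigns it back, while the mask version touches only the selected rows; either way the rest of the array is left unchanged.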