In this tutorial, you’ll learn how to use Python to calculate the Euclidean distance between two points. You’ll learn how to calculate it in two dimensions, as well as in any other number of dimensions.
You’ll first learn a naive way of doing this in pure Python, then use numpy’s sum() and square() functions, followed by the dot() product of the difference vector with itself, and finally the math and scipy libraries. You’ll close off the tutorial by gaining an understanding of which method is fastest.
Let’s get started!
The Quick Answer: Use math.dist() or scipy’s distance.euclidean()
What is the Euclidean distance between two points?
The Euclidean distance is the length of the straight line segment between two points, which makes it the shortest distance between them. It is also known as the Pythagorean distance, because it follows from the Pythagorean theorem. In two dimensions, it is calculated using:
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
We can extend this to points with more than two dimensions: find the difference between the two points along each dimension, square each difference, add the squares together, and take the square root of the sum.
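Written out for two points p and q with n coordinates each (the symbols p, q, and n are notation introduced here for illustration), the general form can be sketched as:

```latex
d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}
```

With n = 2, this reduces to the two-dimensional formula above.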
Euclidean distances have many uses, in particular in machine learning. For example, they are used extensively in k-nearest neighbour classification. Because of this, knowing a few easy ways to calculate the distance between two points in Python is a helpful (and often necessary) skill.
To learn more about the Euclidean distance, check out this helpful Wikipedia article on it.
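To make the nearest-neighbour use case a little more concrete, here is a minimal sketch of picking the closest labelled point to a query point. The training points, their labels, and the nearest_label() helper are hypothetical and exist only for illustration. The sketch relies on math.dist(), which is covered later in this tutorial and requires Python 3.8 or later.

```python
from math import dist  # requires Python 3.8+

# Hypothetical labelled training points: (coordinates, label)
training_points = [
    ((1, 2), "a"),
    ((4, 7), "b"),
    ((6, 1), "a"),
]

def nearest_label(query, labelled_points):
    """Return the label of the training point closest to `query`."""
    closest_point, closest_label = min(
        labelled_points, key=lambda item: dist(query, item[0])
    )
    return closest_label

print(nearest_label((5, 6), training_points))
# Prints: b
```

This is only the "1-nearest neighbour" special case. A real k-nearest neighbour classifier would look at the k closest points and take a vote.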
Find the Euclidean Distance between Two Points in Python using Sum and Square
A very intuitive way to use Python to find the distance between two points, or the Euclidean distance, is to use the built-in sum() function together with the exponent operator (**).
Say we have two points, located at (1,2) and (4,7). Let’s take a look at how we can calculate the Euclidean distance:
# Python Euclidean Distance using Naive Method
point_1 = (1,2)
point_2 = (4,7)

def naive_euclidian_distance(point1, point2):
    differences = [point1[x] - point2[x] for x in range(len(point1))]
    differences_squared = [difference ** 2 for difference in differences]
    sum_of_squares = sum(differences_squared)
    return sum_of_squares ** 0.5

print(naive_euclidian_distance(point_1, point_2))
# Returns 5.830951894845301
We can see here that we:
- Iterate over each point’s coordinates and find the differences
- Square these differences and add them up
- Return the square root of this sum
We can dramatically cut down the code used for this, as it was deliberately verbose to show how the distance is calculated:
# Python Euclidean Distance using Naive Method
point_1 = (1,2)
point_2 = (4,7)

def naive_euclidian_distance(point1, point2):
    return sum([(point1[x] - point2[x]) ** 2 for x in range(len(point1))]) ** 0.5

print(naive_euclidian_distance(point_1, point_2))
# Returns 5.830951894845301
We were able to cut down our function to just a single return statement. Keep in mind, it’s not always ideal to refactor your code to the shortest possible implementation. It’s much better to strive for readability in your work!
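As one possible middle ground between the verbose and the one-line versions, the sketch below pairs up coordinates with zip() instead of indexing, and checks that both points have the same number of dimensions. The function name euclidean_distance() and the error message are illustrative choices, not part of any library.

```python
import math

def euclidean_distance(point1, point2):
    """Euclidean distance between two points of equal dimension."""
    if len(point1) != len(point2):
        raise ValueError("Both points must have the same number of dimensions")
    # zip() pairs up matching coordinates, so no index arithmetic is needed
    squared_differences = ((a - b) ** 2 for a, b in zip(point1, point2))
    return math.sqrt(sum(squared_differences))

print(euclidean_distance((1, 2), (4, 7)))
# Prints: 5.830951894845301
print(euclidean_distance((1, 2, 3), (4, 6, 3)))
# Prints: 5.0
```

The explicit length check matters because zip() silently stops at the shorter input, which would otherwise hide a mismatch between the two points.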
Want to learn more about Python list comprehensions? Check out my in-depth tutorial here, which covers everything you need to know about creating and using list comprehensions in Python.
In the next section, you’ll learn how to use the numpy library to find the distance between two points.
Use Numpy to Find the Euclidean Distance
We can easily use numpy’s built-in functions to recreate the formula for the Euclidean distance. Let’s see how:
# Python Euclidean Distance using Numpy sum and square
import numpy as np
point_1 = (1,2)
point_2 = (4,7)

def numpy_euclidian_distance(point_1, point_2):
    array_1, array_2 = np.array(point_1), np.array(point_2)
    squared_distance = np.sum(np.square(array_1 - array_2))
    distance = np.sqrt(squared_distance)
    return distance

print(numpy_euclidian_distance(point_1, point_2))
# Returns: 5.830951894845301
Let’s take a look at what we’ve done here:
- We imported numpy and declared our two points
- We then created a function numpy_euclidian_distance() which takes two points as parameters
- We then turned both the points into numpy arrays
- We calculated the sum of the squared differences along each axis
- We then took the square root of this sum and returned it
If you wanted to use this method, but shorten the function significantly, you could also write:
import numpy as np
point_1 = (1,2)
point_2 = (4,7)

def numpy_euclidian_distance_short(point_1, point_2):
    return np.sqrt(np.sum(np.square(np.array(point_1) - np.array(point_2))))

print(numpy_euclidian_distance_short(point_1, point_2))
# Returns: 5.830951894845301
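As an aside that the timings later in this tutorial do not cover, numpy also provides np.linalg.norm(), which computes the length (L2 norm) of a vector. Applying it to the difference between the two points gives the same result. The function name numpy_norm_distance() below is a hypothetical name chosen for this sketch.

```python
import numpy as np

point_1 = (1, 2)
point_2 = (4, 7)

def numpy_norm_distance(point1, point2):
    # The Euclidean distance is the L2 norm of the difference vector
    return np.linalg.norm(np.array(point1) - np.array(point2))

print(numpy_norm_distance(point_1, point_2))
# Prints: 5.830951894845301
```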
Before we continue with other libraries, let’s see how we can use another numpy function to calculate the Euclidean distance between two points.
Use Dot to Find the Distance Between Two Points in Python
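Numpy also comes built-in with a function that allows you to calculate the dot product between two vectors, aptly named the dot() function. The dot product of the difference vector with itself is the sum of its squared components, so taking its square root gives the distance. Let’s see how we can use the dot product to calculate the Euclidean distance in Python: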
# Python Euclidean Distance using Numpy dot
import numpy as np
point_1 = (1,2)
point_2 = (4,7)

def numpy_dot_euclidian_distance(point1, point2):
    array1, array2 = np.array(point1), np.array(point2)
    differences = array1 - array2
    squared_sums = np.dot(differences.T, differences)
    distance = np.sqrt(squared_sums)
    return distance

print(numpy_dot_euclidian_distance(point_1, point_2))
# Returns 5.830951894845301
Want to learn more about calculating the square root in Python? I have an in-depth guide to different methods, including the one shown above, in my tutorial found here!
Again, this function is a bit wordy. We can definitely trim it down a lot, as shown below:
# Python Euclidean Distance using Numpy dot
import numpy as np
point_1 = (1,2)
point_2 = (4,7)

def numpy_dot_euclidian_distance(point1, point2):
    differences = np.array(point1) - np.array(point2)
    distance = np.sqrt(np.dot(differences.T, differences))
    return distance

print(numpy_dot_euclidian_distance(point_1, point_2))
# Returns 5.830951894845301
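To see why the dot product works here, the short check below (a sketch using the same two points) shows that the dot product of the difference vector with itself equals the sum of its squared entries. It also shows that .T leaves a one-dimensional array unchanged, so the transpose is optional for points stored as flat arrays.

```python
import numpy as np

differences = np.array((1, 2)) - np.array((4, 7))  # array([-3, -5])

# For a 1-D array, .T returns the array unchanged, so it can be omitted
print(differences.T.shape == differences.shape)
# Prints: True

# The dot product of a vector with itself is the sum of its squared entries
print(np.dot(differences, differences), np.sum(differences ** 2))
# Prints: 34 34
```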
In the next section, you’ll learn how to use the math library, built right into Python, to calculate the distance between two points.
Use Math to Find the Euclidean Distance between Two Points in Python
Python comes built-in with a handy library for handling regular mathematical tasks, the math library. Because calculating the distance between two points is a common math task you’ll encounter, the Python math library comes with a built-in function called dist(), which was added in Python 3.8.
The dist() function takes two parameters, your two points, and calculates the distance between them. Both points must have the same number of dimensions.
Let’s see how we can calculate the Euclidean distance with the math.dist() function:
# Python Euclidean Distance using math.dist
from math import dist
point_1 = (1,2)
point_2 = (4,7)
print(dist(point_1, point_2))
# Returns 5.830951894845301
We can see here that this is an incredibly clean way to calculate the distance between two points in Python. Not only is the function name relevant to what we’re calculating, but it also abstracts away the math behind the equation!
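The same call works for points with more than two dimensions. The sketch below uses made-up three-dimensional points, and also shows what happens when the two points do not have the same number of dimensions.

```python
from math import dist  # available in Python 3.8 and later

# Points can have any number of dimensions, as long as both match
print(dist((1, 2, 3), (4, 6, 3)))
# Prints: 5.0

# Mismatched dimensions raise a ValueError
try:
    dist((1, 2), (1, 2, 3))
except ValueError as error:
    print(f"ValueError: {error}")
```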
In the next section, you’ll learn how to use the scipy library to calculate the distance between two points.
Use Python and Scipy to Find the Distance between Two Points
Similar to the math library example you learned in the section above, the scipy library also comes with a number of helpful mathematical and, well, scientific, functions built into it.
Let’s use the euclidean() function from the scipy.spatial.distance module and learn how to calculate the Euclidean distance between two points:
# Python Euclidean Distance using scipy
from scipy.spatial import distance
point_1 = (1,2)
point_2 = (4,7)
print(distance.euclidean(point_1, point_2))
# Returns 5.830951894845301
We can see here that calling the distance.euclidean() function is even more specific than the dist() function from the math library. Being specific can help a reader of your code clearly understand what is being calculated, without you needing to document anything, say, with a comment.
Now that you’ve learned multiple ways to calculate the Euclidean distance between two points in Python, let’s compare these methods to see which is the fastest.
Fastest Method to Find the Distance Between Two Points in Python
In the previous sections, you’ve learned a number of different ways to calculate the Euclidean distance between two points in Python. In each section, we’ve covered how to tighten up the code and commented on how clear the actual function call is.
Let’s take a look at how long these methods take, in case you’re computing distances between millions of points and require optimal performance.
Each method was timed over 7 runs, with at least 10,000 loops of the function call in each run.
| Method | Time to Execute |
|---|---|
| Naive Method | 162 µs |
| Numpy | 68 µs |
| Numpy (dot) | 64 µs |
| math.dist | 2.47 µs |
| scipy.spatial.distance.euclidean | 71.5 µs |
We can see that the math.dist() function is the fastest. The only drawback is that the function is only available in Python 3.8 and later.
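If you want to run a comparison like this on your own machine, the sketch below shows one way to do it with the standard library’s timeit module. It is not necessarily how the numbers in the table were produced. It only times the two pure-Python options, the helper best_time_per_call() is a hypothetical name, and your results will differ depending on your hardware and Python version.

```python
import timeit
from math import dist  # Python 3.8+

point_1 = (1, 2)
point_2 = (4, 7)

def naive_euclidian_distance(point1, point2):
    return sum((point1[x] - point2[x]) ** 2 for x in range(len(point1))) ** 0.5

def best_time_per_call(func, repeat=7, number=10_000):
    """Best average time per call, in microseconds, over `repeat` runs."""
    timings = timeit.repeat(
        lambda: func(point_1, point_2), repeat=repeat, number=number
    )
    return min(timings) / number * 1e6

for name, func in [("Naive Method", naive_euclidian_distance), ("math.dist", dist)]:
    print(f"{name}: {best_time_per_call(func):.2f} µs")
```

Taking the minimum over several runs is a common choice with timeit, since the slower runs usually reflect interference from other processes and not the code being measured.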
Conclusion
In this post, you learned how to use Python to calculate the Euclidean distance between two points. The Euclidean distance measures the shortest distance between two points and has many machine learning applications. You learned how to calculate this with a naive method, two methods using numpy, as well as ones using the math and scipy libraries.
To learn more about the math.dist() function, check out the official documentation here.