Converting Pandas DataFrame Column from Object to Float

In this tutorial, you’ll learn how to convert a Pandas DataFrame column from object (or string) to a float data type. Data cleaning is an essential skill for any Python developer. Being able to convert data types in Python, especially to numeric data types, is important for conducting analysis.

By the end of this tutorial, you’ll have learned the following:

  • How to use the Pandas to_numeric() function to convert strings to floats
  • How to use the Pandas astype() method to convert strings to floats
  • How to read files and specify numeric data types

Quick Answer: Use Pandas astype

The easiest way to convert a Pandas DataFrame column’s data type from object (or string) to float is to use the astype method. The method can be applied to a Pandas DataFrame column or to an entire DataFrame, making it very flexible.

Take a look at the code block below to see how this can best be accomplished:

# Convert a Pandas DataFrame Column From String to Float
df[col] = df[col].astype(float)

Let’s now dive into the different methods of how to do this in more detail.

Loading a Sample Pandas DataFrame

In order to follow along with this tutorial, I have created a Pandas DataFrame. If you want to follow along line-by-line, copy and paste the code below into your favorite code editor.

# Load a Sample Pandas DataFrame
import pandas as pd
data = {'Quantity': [10, 20, 30, 40, 50],
        'Price': ['1.00', '2.50', '3.20', '4.70', '5.00']}
df = pd.DataFrame(data)
print(df)

# Returns:
#    Quantity Price
# 0        10  1.00
# 1        20  2.50
# 2        30  3.20
# 3        40  4.70
# 4        50  5.00

We can see from the code above that the DataFrame has two columns. While the Price column looks like it contains floats, its values are actually stored as strings. We can confirm this by checking the data types of the DataFrame:

# Check the Data Types
print(df.dtypes)

# Returns:
# Quantity     int64
# Price       object
# dtype: object

We can see that the Price column’s data type is actually an object. Let’s now dive into how we can convert the column to a floating point value.

Convert a Pandas Column From Object to Float with to_numeric

To convert a Pandas column’s data type from object to float you can use the to_numeric function. The function allows you to pass in a Pandas column and convert it to a numeric data type, where the data type is inferred.

Let’s take a look at how we can convert a Pandas column to floats using the pd.to_numeric() function:

# Convert a Pandas DataFrame Column to Float
df['Price'] = pd.to_numeric(df['Price'])
print(df.dtypes)

# Returns:
# Quantity      int64
# Price       float64
# dtype: object

In the example above, we reassigned a column to itself, after passing the column into the to_numeric function. The function also provides a helpful parameter, errors=, to handle errors when converting values.

By default, this parameter is set to 'raise', which raises an error if a value can’t be converted. You can change this to 'coerce', which replaces any value that can’t be converted with a missing value (NaN).
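As a minimal sketch of the difference between the two settings, the example below uses a small, made-up Series (not part of the sample DataFrame) that contains one value that can’t be parsed as a number:

```python
import pandas as pd

# Hypothetical column with one value that can't be parsed as a number
prices = pd.Series(['1.00', '2.50', 'unknown', '4.70'])

# errors='raise' (the default) fails when it hits the unparseable value
try:
    pd.to_numeric(prices)
    raised = False
except ValueError:
    raised = True

# errors='coerce' replaces the unparseable value with NaN instead
coerced = pd.to_numeric(prices, errors='coerce')

print(raised)
print(coerced)
```

Using 'coerce' keeps the conversion from failing, but it’s worth checking how many missing values it produced (for example with coerced.isna().sum()) so that bad data isn’t silently lost.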

Let’s now take a look at another method, the .astype() method.

Convert a Pandas Column From Object to Float with astype

One of the most common ways to convert a Pandas DataFrame column’s data type from object to float is to use the Pandas astype method. The astype method allows you to pass in a data type that you want to use.

Let’s take a look at how we can convert a Pandas DataFrame column from string to floats using the .astype() method:

# Convert a Pandas DataFrame Column to Float
df['Price'] = df['Price'].astype(float)
print(df.dtypes)

# Returns:
# Quantity      int64
# Price       float64
# dtype: object

The benefit of this approach is that you choose the target data type explicitly. While the pd.to_numeric() function infers the data type from the column’s values, the astype() method converts the column to exactly the type you pass in.
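To illustrate the difference between inferring and specifying a data type, here is a small sketch. The two Series are made up for this example and aren’t part of the sample DataFrame:

```python
import pandas as pd

# Hypothetical columns: whole-number strings and decimal strings
whole = pd.Series(['1', '2', '3'])
decimal = pd.Series(['1.5', '2.5', '3.5'])

# to_numeric infers the type: whole-number strings become integers
inferred_whole = pd.to_numeric(whole)
inferred_decimal = pd.to_numeric(decimal)

# astype uses exactly the type that was requested
as_float = whole.astype(float)
as_float32 = decimal.astype('float32')

print(inferred_whole.dtype, inferred_decimal.dtype)
print(as_float.dtype, as_float32.dtype)
```

If you need a float column regardless of what the values look like, astype(float) is the more predictable choice, since to_numeric may return integers when every value is a whole number.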

Convert a Pandas DataFrame From Object to Float with astype

Similar to the example above, you can use the Pandas astype method to convert all columns from strings to floats. This can be done by applying the method to the DataFrame as a whole. This will attempt to convert all columns to the specified data type.

Let’s take a look at an example of how to use the astype method on an entire DataFrame:

# Convert a Pandas DataFrame to Float
df = df.astype(float)
print(df.dtypes)

# Returns:
# Quantity    float64
# Price       float64
# dtype: object

In the example above, we applied the astype method to the entire DataFrame. Because the method is applied to every column, it also converted the Quantity column from int64 to float64.
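If you only want to convert some of the columns, the astype method also accepts a dictionary that maps column labels to data types. The sketch below rebuilds the sample DataFrame and converts only the Price column, leaving Quantity as it was:

```python
import pandas as pd

df = pd.DataFrame({'Quantity': [10, 20, 30, 40, 50],
                   'Price': ['1.00', '2.50', '3.20', '4.70', '5.00']})

# Passing a dictionary converts only the listed columns
df = df.astype({'Price': float})
print(df.dtypes)
```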

Convert a Pandas Column From Object to Float When Reading Files

One of the best ways to convert a string column to a float column is to read it correctly when reading the file. Many of the functions that allow you to read data in Pandas, such as the read_csv function, allow you to specify data types.

This can be done using the dtype= parameter in the Pandas read_csv function. The parameter accepts a dictionary, where the keys are column labels and the values are data types.

# Convert a Pandas DataFrame When Reading a File
# `file` is the path to your CSV file; 'float_col' is the column to read as a float
df = pd.read_csv(file, dtype={'float_col': float})

In the code block above, I demonstrated how you can specify that a column should be read as a float. If our sample DataFrame above was a CSV file, this approach would have saved us a step later on!

Similarly, the Pandas read_excel function also has a dtype parameter that works in the same way.
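To try the read_csv approach without creating a file, the sketch below reads CSV text from memory using io.StringIO. The CSV contents are made up to mirror the sample DataFrame:

```python
import io
import pandas as pd

# In-memory stand-in for a CSV file (hypothetical contents)
csv_text = "Quantity,Price\n10,1.00\n20,2.50\n30,3.20\n"

# Without dtype=, Quantity would be read as an integer column
df = pd.read_csv(io.StringIO(csv_text),
                 dtype={'Quantity': float, 'Price': float})
print(df.dtypes)
```

With a real file, you would pass the file path in place of the io.StringIO object.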

Conclusion

In this tutorial, we learned how to convert a Pandas DataFrame column from object (or string) to a float data type. Data cleaning is an essential skill for any Python developer, and being able to convert data types in Python, especially to numeric data types, is important for conducting analysis. We covered three different methods to convert a Pandas column from object to float: using the Pandas to_numeric() function, using the Pandas astype() method, and specifying the data type when reading files.

The astype() method is the easiest and most flexible way to convert a single column, while the to_numeric() function gives you more control over error handling. Finally, specifying data types when reading files lets the data be read correctly from the start. By using these methods, you can ensure that your data is in the correct format for analysis.

To learn more about the Pandas to_numeric function, check out the official documentation.

Nik Piepenbreier

Nik is the author of datagy.io and has over a decade of experience working with data analytics, data science, and Python. He specializes in teaching developers how to use Python for data science using hands-on tutorials.
