In this post, you’ll learn how to transpose a Pandas dataframe using both the .T attribute and the .transpose() method. Additionally, you’ll learn how to create a copy of the dataframe (rather than a view).
Loading a Sample Dataframe
For this tutorial, we’ll be using two different dataframes. This is because the .transpose() method works differently depending on whether your dataframe has mixed datatypes or not.
Let’s get started by loading some sample dataframes:
import pandas as pd

# df1 mixes strings and integers
df1 = pd.DataFrame.from_dict(
    {
        'Name': ['Jane', 'Melissa', 'John', 'Matt'],
        'Age': [23, 45, 35, 64],
        'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
        'Gender': ['F', 'F', 'M', 'M']
    }
)

# df2 contains only integers
df2 = pd.DataFrame.from_dict(
    {
        'Age': [1, 2, 3, 4, 5],
        'Size': [10, 24, 43, 54, 56]
    }
)

print('Dataframe df1:')
print(df1)
print('Dataframe df2:')
print(df2)
This returns the following:
Dataframe df1:
      Name  Age Birth City Gender
0     Jane   23     London      F
1  Melissa   45      Paris      F
2     John   35    Toronto      M
3     Matt   64    Atlanta      M
Dataframe df2:
   Age  Size
0    1    10
1    2    24
2    3    43
3    4    54
4    5    56
What is Transposing?
In linear algebra, and therefore in machine learning, transposing a matrix means switching its rows and columns. This operation is often used, for example, when estimating variances and covariances in regression.
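As a rough illustration of how the transpose shows up in a covariance calculation, here is a minimal sketch. It assumes the df2 dataframe from this tutorial, and the names centered and manual_cov are made up for the example. It multiplies the transposed, mean-centered data by itself and compares the result with Pandas' built-in .cov() method.

```python
import pandas as pd

df2 = pd.DataFrame({'Age': [1, 2, 3, 4, 5], 'Size': [10, 24, 43, 54, 56]})

# Subtract each column's mean so the data is centered around zero
centered = df2 - df2.mean()

# Sample covariance matrix: (X^T X) / (n - 1), where X is the centered data
manual_cov = centered.T @ centered / (len(df2) - 1)
print(manual_cov)

# Compare against the built-in method (allowing for floating-point noise)
print((manual_cov - df2.cov()).abs().values.max() < 1e-9)
```

The transpose is what lines up the columns of the data with each other, so the product ends up with one row and one column per variable.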
The transpose of a matrix, say df, is often denoted as df^T.
Switching the rows and columns of a matrix changes its shape. The exception is a square matrix, which has the same number of rows and columns, so its shape stays the same.
For example, say you have a matrix with 3 rows and 2 columns. The transpose of this matrix would have 2 rows and 3 columns.
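The shape change described above can be checked quickly in code. This is a small sketch using a made-up dataframe (the name matrix and its values are purely illustrative):

```python
import pandas as pd

# A hypothetical dataframe with 3 rows and 2 columns
matrix = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

print(matrix.shape)      # (3, 2)
print(matrix.T.shape)    # (2, 3)

# Transposing twice brings back the original shape
print(matrix.T.T.shape)  # (3, 2)
```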
Learn more about transposing by checking out the Wikipedia entry on the transpose.
Transposing a Pandas Dataframe
Pandas has two easy ways to transpose a dataframe. You can do this either by appending .T to the end of a dataframe, or by calling the .transpose() method (which gives you a bit more flexibility, since it accepts parameters).
Let’s start by transposing the dataframe df2 and exploring the result a little before we dive into a more complicated transpose.
Let’s use the .T attribute to transpose our dataframe:
>>> df2_T = df2.T
>>> print(df2_T)
       0   1   2   3   4
Age    1   2   3   4   5
Size  10  24  43  54  56
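If you want to confirm that the two spellings behave the same way, a short check along these lines should do it. The variable names via_attribute and via_method are just illustrative:

```python
import pandas as pd

df2 = pd.DataFrame({'Age': [1, 2, 3, 4, 5], 'Size': [10, 24, 43, 54, 56]})

via_attribute = df2.T
via_method = df2.transpose()

# Both spellings produce the same dataframe
print(via_attribute.equals(via_method))

# The old column labels become the index, and the old index becomes the columns
print(list(via_attribute.index))
print(list(via_attribute.columns))
```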
Now, it’s important here to note that all of our values have the same datatype (integers, to be specific). Let’s explore the data types of the original dataframe as well as its transpose:
df2_T = df2.T
print('df2\'s datatypes:')
print(df2.dtypes)
print('\n')
print('df2_T\'s datatypes:')
print(df2_T.dtypes)
This returns the following information:
df2's datatypes:
Age int64
Size int64
dtype: object
df2_T's datatypes:
0 int64
1 int64
2 int64
3 int64
4 int64
dtype: object
Transposing a Pandas Dataframe with Mixed Data Types
In the previous section, both the original dataframe and its transpose contained only int64 columns.
Now let’s explore what happens with mixed data types, as found in df1.
df1_T = df1.transpose()
print(df1_T)
This returns the following:
                 0        1        2        3
Name          Jane  Melissa     John     Matt
Age             23       45       35       64
Birth City  London    Paris  Toronto  Atlanta
Gender           F        F        M        M
If we now look at the data types of these two dataframes we can see the following items are returned:
df1_T = df1.T
print('df1\'s datatypes:')
print(df1.dtypes)
print('\n')
print('df1_T\'s datatypes:')
print(df1_T.dtypes)
Which returns:
df1's datatypes:
Name object
Age int64
Birth City object
Gender object
dtype: object
df1_T's datatypes:
0 object
1 object
2 object
3 object
dtype: object
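Each row of df1 held a mix of values (a string name, an integer age, and so on), so every column of the transpose now contains mixed values, and Pandas assigns each one the object data type. Keep this in mind if you’re hoping to perform calculations on these columns, since you may need to convert the values back to a numeric type first.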
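One possible way to get numeric values back out of the transposed dataframe is sketched below. It uses pd.to_numeric() and .infer_objects(), which are existing Pandas functions, while the names ages and round_trip are just illustrative:

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        'Name': ['Jane', 'Melissa', 'John', 'Matt'],
        'Age': [23, 45, 35, 64],
        'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
        'Gender': ['F', 'F', 'M', 'M']
    }
)
df1_T = df1.T

# Every column of the transpose is object, so convert the 'Age' row explicitly
ages = pd.to_numeric(df1_T.loc['Age'])
print(ages.dtype)
print(ages.mean())

# Transposing back does not restore the original dtypes on its own;
# infer_objects() asks Pandas to re-detect them
round_trip = df1_T.T.infer_objects()
print(round_trip.dtypes)
```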
Create a Transpose Copy for Mixed Datatypes
Pandas always creates a copy of the data when a dataframe with mixed data types is transposed. This happens implicitly, so you don’t need to request it.
That being said, if your dataframe has the same data type throughout, as in our df2 example, the transpose may not be a copy. In that case, you can explicitly tell Pandas to generate one by passing copy=True to the .transpose() method.
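Here is a minimal sketch of requesting a copy, assuming the df2 dataframe from earlier (the name df2_T_copy is illustrative). Newer Pandas releases document the copy keyword as slated for removal as Copy-on-Write becomes the default, so treat this as an example for versions where the parameter is still accepted:

```python
import pandas as pd

df2 = pd.DataFrame({'Age': [1, 2, 3, 4, 5], 'Size': [10, 24, 43, 54, 56]})

# Explicitly request a copy of the transposed data
df2_T_copy = df2.transpose(copy=True)

# Changing the copy leaves the original dataframe untouched
df2_T_copy.iloc[0, 0] = 100
print(df2_T_copy.iloc[0, 0])  # 100
print(df2.loc[0, 'Age'])      # 1
```

Note that the .T attribute takes no arguments, so the copy parameter is only available through .transpose().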
Conclusion
In this post, you learned how to transpose a Pandas dataframe. You learned how to do this using both .T and the .transpose() method. You also learned a little bit about what transposing actually is, and how different data types may impact your results.
To learn more about the Pandas transpose function, check out the official Pandas documentation for DataFrame.transpose.