Transpose a Pandas Dataframe


In this post, you’ll learn how to transpose a Pandas dataframe using both the .T attribute and the .transpose() method. You’ll also learn how to create a copy of the transposed dataframe (rather than a view).

Loading a Sample Dataframe

For this tutorial, we’ll be using two different dataframes. This is because the result of a transpose differs depending on whether your dataframe has mixed datatypes or not.

Let’s get started by loading some sample dataframes:

import pandas as pd

df1 = pd.DataFrame.from_dict(
    {
        'Name': ['Jane', 'Melissa', 'John', 'Matt'],
        'Age': [23, 45, 35, 64],
        'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
        'Gender': ['F', 'F', 'M', 'M']
    }
)

df2 = pd.DataFrame.from_dict(
    {
        'Age': [1,2,3,4,5],
        'Size': [10,24,43,54,56]
    }
)

print('Dataframe df1:')
print(df1)
print('Dataframe df2:')
print(df2)

This returns the following:

Dataframe df1:
      Name  Age Birth City Gender
0     Jane   23     London      F
1  Melissa   45      Paris      F
2     John   35    Toronto      M
3     Matt   64    Atlanta      M


Dataframe df2:
   Age  Size
0    1    10
1    2    24
2    3    43
3    4    54
4    5    56

What is Transposing?

In linear algebra, and by extension in machine learning, transposing a matrix means switching its rows and columns. The operation is often used, for example, when estimating variances and covariances in regression.

The transpose of a matrix, say df, is often denoted as df^T.

Switching the rows and columns of a matrix changes its shape, unless the matrix is square (that is, it has the same number of rows as columns), in which case the shape stays the same.

For example, say you have a matrix with 3 rows and 2 columns. The transpose of this matrix would have 2 rows and 3 columns.
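
To make the shape change concrete, here is a minimal sketch using a small, made-up dataframe. The name matrix and its values are assumptions for illustration and are not part of this tutorial’s sample data:

```python
import pandas as pd

# Hypothetical 3 x 2 matrix, used only to illustrate the shape change
matrix = pd.DataFrame([[1, 2], [3, 4], [5, 6]])
transposed = matrix.T

print(matrix.shape)      # (3, 2)
print(transposed.shape)  # (2, 3)
```

The first column of the original (1, 3, 5) becomes the first row of the transpose.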

To learn more about transposing, check out the Wikipedia entry on it.

Transposing a Pandas Dataframe

Pandas has two easy ways to transpose a dataframe. You can do this either by appending .T to the end of a dataframe, or by calling the .transpose() method (which gives you a bit more flexibility because it accepts parameters).

Let’s start by transposing the dataframe df2 and exploring the result, before we move on to a more complicated example.

Let’s use the .T attribute to transpose our dataframe:

>>> df2_T = df2.T
>>> print(df2_T)

       0   1   2   3   4
Age    1   2   3   4   5
Size  10  24  43  54  56
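
As a quick sanity check, the sketch below rebuilds df2 and compares the results of .T and .transpose(). The variable names via_attribute, via_method and same are illustrative only:

```python
import pandas as pd

df2 = pd.DataFrame({'Age': [1, 2, 3, 4, 5], 'Size': [10, 24, 43, 54, 56]})

# Transpose once with the attribute and once with the method
via_attribute = df2.T
via_method = df2.transpose()

# .equals() compares both the values and the row/column labels
same = via_attribute.equals(via_method)
print(same)  # True
```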

Now, it’s important here to note that all of our values have the same datatype (integers, to be specific). Let’s explore the data types of the original dataframe as well as its transpose:

# Compare the data types before and after transposing
df2_T = df2.T
print("df2's datatypes:")
print(df2.dtypes)
print('\n')
print("df2_T's datatypes:")
print(df2_T.dtypes)

This returns the following information:

df2's datatypes:
Age     int64
Size    int64
dtype: object


df2_T's datatypes:
0    int64
1    int64
2    int64
3    int64
4    int64
dtype: object


Transposing a Pandas Dataframe with Mixed Data Types

We saw above that all the datatypes are int64 in both df2 and its transpose.

Now let’s explore what happens with mixed data types, as found in df1. This time, we’ll use the .transpose() method:

df1_T = df1.transpose()
print(df1_T)

This returns the following:

                 0        1        2        3
Name          Jane  Melissa     John     Matt
Age             23       45       35       64
Birth City  London    Paris  Toronto  Atlanta
Gender           F        F        M        M

If we now compare the data types of the original dataframe and its transpose, we can see the difference:

# Compare the data types before and after transposing
df1_T = df1.T
print("df1's datatypes:")
print(df1.dtypes)
print('\n')
print("df1_T's datatypes:")
print(df1_T.dtypes)

Which returns:

df1's datatypes:
Name          object
Age            int64
Birth City    object
Gender        object
dtype: object


df1_T's datatypes:
0    object
1    object
2    object
3    object
dtype: object

We can see that any new column holding mixed data types is assigned the object data type. This includes values that were numeric in the original dataframe, such as Age. Keep this in mind if you’re hoping to perform calculations on these columns.
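
If you do need to calculate on a transposed row, one possible approach is to convert that row back to a numeric type first. The sketch below assumes the df1 dataframe from earlier and uses pd.to_numeric(); the names age_row and age_numeric are illustrative only:

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        'Name': ['Jane', 'Melissa', 'John', 'Matt'],
        'Age': [23, 45, 35, 64],
        'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
        'Gender': ['F', 'F', 'M', 'M']
    }
)
df1_T = df1.T

# After transposing, the 'Age' row is stored with the object data type
age_row = df1_T.loc['Age']
print(age_row.dtype)  # object

# Convert the row back to a numeric type before calculating
age_numeric = pd.to_numeric(age_row)
print(age_numeric.mean())  # 41.75
```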

Create a Transpose Copy for Mixed Datatypes

The .transpose() method accepts a copy parameter, which defaults to False. When a dataframe has mixed data types, Pandas always creates a copy, regardless of this setting. This happens implicitly, so you don’t need to do anything yourself.

That being said, should you have the same data type throughout, as in our df2 example, you can explicitly tell Pandas to generate a copy by setting copy=True.
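
As a minimal sketch of requesting a copy, the example below rebuilds df2 and passes copy=True. It assumes a Pandas version in which .transpose() still accepts the copy parameter; newer releases may deprecate or ignore it, in which case calling .copy() on the transposed dataframe is an alternative. The name df2_T_copy is illustrative only:

```python
import pandas as pd

df2 = pd.DataFrame({'Age': [1, 2, 3, 4, 5], 'Size': [10, 24, 43, 54, 56]})

# Explicitly request a copy of the transposed data
# (assumption: your Pandas version still accepts the copy parameter)
df2_T_copy = df2.transpose(copy=True)

# Changing the copy leaves the original dataframe untouched
df2_T_copy.iloc[0, 0] = 999
print(df2.loc[0, 'Age'])  # 1
```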

Conclusion

In this post, you learned how to transpose a Pandas dataframe. You learned how to do this using both .T and the .transpose() method. You also learned a little bit about what transposing actually is, and how different data types may impact your results.

To learn more about the Pandas .transpose() method, check out the official documentation.