This article explores the different ways you can select columns in Pandas, including using loc, iloc, and how to create copies of dataframes. You’ll learn a range of techniques for selecting columns using handy follow-along examples.
Let’s get started!
- Why Select Columns in Python?
- Creating our Dataframe
- Using loc to Select Columns
- Using iloc to Select Columns
- Select a Single Column in Pandas
- Select Multiple Columns in Pandas
- Copying Columns vs. Selecting Columns
Why Select Columns in Python?
The datasets used in lots of tutorials are very clean and have a limited number of columns. But this isn’t true all the time.
In many cases, you’ll run into datasets that have many columns – most of which are not needed for your analysis.
In these cases, you’ll want to select only the columns you need.
This often has the added benefit of using less memory on your computer (when removing columns you don’t need), as well as reducing the number of columns you need to keep track of mentally.
Creating our Dataframe
To get started, let’s create our dataframe to use throughout this tutorial. We’ll create one that has multiple columns, but a small amount of data (to be able to print the whole thing more easily).
We’ll need to import pandas and load some data from a CSV file. Simply copy the code and paste it into your editor or notebook.
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/datagy/pivot_table_pandas/master/select_columns.csv')
print(df.head())
This returns the following:
Name Age Height Score Random_A Random_B Random_C Random_D Random_E
0 Joe 28 5'9 30 73 59 5 4 31
1 Melissa 26 5'5 32 30 85 38 32 80
2 Nik 31 5'11 34 80 71 59 71 53
3 Andrea 33 5'6 38 16 63 86 81 42
4 Jane 32 5'8 29 19 40 48 5 68
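Let’s take a quick look at what makes up a dataframe in Pandas: each column has a label (the header, such as Name or Age), each row has an index label (0 to 4 here), and the values sit where a row and a column meet.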
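As a rough sketch of those building blocks, the snippet below rebuilds the tutorial’s dataframe by hand (a stand-in for the CSV, typed in from the rows printed above, so it runs without a network connection) and then inspects its column labels, row index, and shape.

```python
import pandas as pd

# Stand-in for the tutorial's CSV, built from the rows printed above
df = pd.DataFrame({
    "Name": ["Joe", "Melissa", "Nik", "Andrea", "Jane"],
    "Age": [28, 26, 31, 33, 32],
    "Height": ["5'9", "5'5", "5'11", "5'6", "5'8"],
    "Score": [30, 32, 34, 38, 29],
    "Random_A": [73, 30, 80, 16, 19],
    "Random_B": [59, 85, 71, 63, 40],
    "Random_C": [5, 38, 59, 86, 48],
    "Random_D": [4, 32, 71, 81, 5],
    "Random_E": [31, 80, 53, 42, 68],
})

print(df.columns.tolist())  # the column labels (headers)
print(df.index.tolist())    # the row labels: [0, 1, 2, 3, 4]
print(df.shape)             # (5, 9): five rows, nine columns
```

The column labels are what loc works with, while the positions of rows and columns are what iloc works with.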

Using loc to Select Columns
The loc indexer is a great way to select a single column or multiple columns in a dataframe if you know the column name(s).
This method is great for:
- Selecting columns by column name,
- Selecting rows along with columns,
- Selecting columns using a single label, a list of labels, or a slice
The loc method uses square brackets and takes the row labels first, followed by the column labels: df.loc[rows, columns].

Now, if you wanted to select only the name column and the first three rows, you would write:
selection = df.loc[:2,'Name']
print(selection)
This returns:
0 Joe
1 Melissa
2 Nik
You’ll probably notice that this didn’t return the column header. That’s because selecting a single column returns a Pandas series rather than a dataframe.
Note: Indexes in Pandas start at 0. That means if you wanted to select the first item, you would use position 0, not 1.
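One detail worth checking for yourself: with loc, a slice such as :2 is based on labels and includes the end label, which is why three rows came back above. The sketch below uses a trimmed, hand-built stand-in for the tutorial’s dataframe (only two columns, typed in from the printed output) to confirm this, and shows that passing the column name inside a list keeps the header.

```python
import pandas as pd

# Trimmed stand-in for the tutorial's dataframe (values from the printed output)
df = pd.DataFrame({
    "Name": ["Joe", "Melissa", "Nik", "Andrea", "Jane"],
    "Age": [28, 26, 31, 33, 32],
})

# A single column label returns a Series
selection = df.loc[:2, "Name"]
print(type(selection).__name__)  # Series
print(len(selection))            # 3, because label 2 is included
print(selection.name)            # Name

# A one-item list of labels returns a DataFrame, so the header is kept
as_frame = df.loc[:2, ["Name"]]
print(type(as_frame).__name__)   # DataFrame
```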
If you wanted to select multiple columns, you can include their names in a list:
selection = df.loc[:2,['Name', 'Age', 'Height', 'Score']]
print(selection)
This returns:
Name Age Height Score
0 Joe 28 5'9 30
1 Melissa 26 5'5 32
2 Nik 31 5'11 34
Additionally, you can slice columns if you want to return two columns as well as all of those in between. With loc, both ends of the slice are included. The same code we wrote above can be rewritten like this:
selection = df.loc[:2,'Name':'Score']
print(selection)
This returns:
Name Age Height Score
0 Joe 28 5'9 30
1 Melissa 26 5'5 32
2 Nik 31 5'11 34
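To check that the list form and the slice form really select the same data, you can compare the two results with the equals method. This is a small sketch on a hand-built stand-in for the tutorial’s dataframe (the first five columns, typed in from the printed output).

```python
import pandas as pd

# Trimmed stand-in for the tutorial's dataframe (values from the printed output)
df = pd.DataFrame({
    "Name": ["Joe", "Melissa", "Nik", "Andrea", "Jane"],
    "Age": [28, 26, 31, 33, 32],
    "Height": ["5'9", "5'5", "5'11", "5'6", "5'8"],
    "Score": [30, 32, 34, 38, 29],
    "Random_A": [73, 30, 80, 16, 19],
})

by_list = df.loc[:2, ["Name", "Age", "Height", "Score"]]
by_slice = df.loc[:2, "Name":"Score"]  # 'Score' is included, 'Random_A' is not

print(by_slice.columns.tolist())   # ['Name', 'Age', 'Height', 'Score']
print(by_list.equals(by_slice))    # True
```

Keep in mind that a slice follows the order of the columns in the dataframe, so it only matches the list form when the list uses that same order.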
Now, let’s take a look at the iloc method for selecting columns in Pandas.
Using iloc to Select Columns
The iloc indexer is one of the primary ways of selecting data in Pandas. The name “iloc” stands for integer location indexing, where rows and columns are selected using their integer positions.
This method is great for:
- Selecting columns by column position (index),
- Selecting rows along with columns,
- Selecting columns using a single position, a list of positions, or a slice of positions
The standard format of the iloc method mirrors loc, but takes integer positions instead of labels: df.iloc[row positions, column positions].

Now, for example, if we wanted to select the first two rows and first two columns of our dataframe, we could write:
selection = df.iloc[:2,:2]
print(selection)
This returns:
Name Age
0 Joe 28
1 Melissa 26
Note that we didn’t write df.iloc[0:2,0:2], but that would have yielded the same result.
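A point that often trips people up is that iloc slices exclude the end position, whereas loc slices include the end label. The following sketch, run on a trimmed, hand-built stand-in for the tutorial’s dataframe, puts the two side by side.

```python
import pandas as pd

# Trimmed stand-in for the tutorial's dataframe (values from the printed output)
df = pd.DataFrame({
    "Name": ["Joe", "Melissa", "Nik", "Andrea", "Jane"],
    "Age": [28, 26, 31, 33, 32],
    "Height": ["5'9", "5'5", "5'11", "5'6", "5'8"],
})

by_position = df.iloc[:2, :2]        # positions 0 and 1 only (end excluded)
by_label = df.loc[:2, "Name":"Age"]  # labels 0, 1 and 2 (end included)

print(by_position.shape)  # (2, 2)
print(by_label.shape)     # (3, 2)

# Writing the start of each slice explicitly gives the same result
print(by_position.equals(df.iloc[0:2, 0:2]))  # True
```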
If we wanted to select all columns with iloc, we could do that by writing:
selection = df.iloc[:2,]
print(selection)
This returns:
Name Age Height Score Random_A Random_B Random_C Random_D Random_E
0 Joe 28 5'9 30 73 59 5 4 31
1 Melissa 26 5'5 32 30 85 38 32 80
Similarly, we could select all rows by leaving out the row values (but including a colon before the comma).
selection = df.iloc[:,:2]
print(selection)
This returns:
Name Age
0 Joe 28
1 Melissa 26
2 Nik 31
3 Andrea 33
4 Jane 32
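Besides slices, iloc also accepts a single position or a list of positions, and negative positions count from the end. Here is a short sketch of both, again on a hand-built stand-in for the first four columns of the tutorial’s dataframe.

```python
import pandas as pd

# Trimmed stand-in for the tutorial's dataframe (values from the printed output)
df = pd.DataFrame({
    "Name": ["Joe", "Melissa", "Nik", "Andrea", "Jane"],
    "Age": [28, 26, 31, 33, 32],
    "Height": ["5'9", "5'5", "5'11", "5'6", "5'8"],
    "Score": [30, 32, 34, 38, 29],
})

# A list of positions picks out columns that are not next to each other
picked = df.iloc[:, [0, 2]]
print(picked.columns.tolist())  # ['Name', 'Height']

# A single negative position selects from the end and returns a Series
last_column = df.iloc[:, -1]
print(last_column.name)         # Score
```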
Select a Single Column in Pandas
Now, if you want to select just a single column, there’s a much easier way than using either loc or iloc.
This can be done by selecting the column as a series in Pandas. You can pass the column name as a string to the indexing operator.
For example, to select only the Name column, you can write:
selection = df['Name']
print(selection)
This returns the following:
0 Joe
1 Melissa
2 Nik
3 Andrea
4 Jane
Similarly, you can select columns by using the dot operator. To do the same as above using the dot operator, you could write:
selection = df.Name
print(selection)
This returns the same as above:
0 Joe
1 Melissa
2 Nik
3 Andrea
4 Jane
However, while the dot operator is easier to type, using it is often not recommended. This is because you can’t:
- Select columns with spaces in the name,
- Use columns that have the same names as dataframe methods or attributes (such as ‘count’ or ‘shape’),
- Pick columns whose names aren’t strings, and
- Select multiple columns.
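The limitations listed above can be seen in a small sketch. The dataframe here is hypothetical and exists only for illustration: its column names (‘First Name’, ‘count’, and the integer 2020) were chosen to trigger each case and are not part of the tutorial’s dataset.

```python
import pandas as pd

# Hypothetical dataframe with awkward column names, for illustration only
demo = pd.DataFrame({
    "First Name": ["Joe", "Melissa"],  # contains a space
    "count": [3, 7],                   # same name as a DataFrame method
    2020: [1.5, 2.5],                  # not a string
})

# Square brackets work for all three columns
print(demo["First Name"].tolist())  # ['Joe', 'Melissa']
print(demo["count"].tolist())       # [3, 7]
print(demo[2020].tolist())          # [1.5, 2.5]

# The dot operator finds the DataFrame method, not the column
print(callable(demo.count))         # True
```

Expressions such as demo.First Name or demo.2020 are not even valid Python, which is why square brackets are the safer habit.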
Select Multiple Columns in Pandas
Similar to the code you wrote above, you can select multiple columns.
To do this, pass a list of column names to the indexing operator, which means wrapping the names in double square brackets.
If you wanted to select the Name, Age, and Height columns, you would write:
selection = df[['Name', 'Age', 'Height']]
print(selection)
This returns:
Name Age Height
0 Joe 28 5'9
1 Melissa 26 5'5
2 Nik 31 5'11
3 Andrea 33 5'6
4 Jane 32 5'8
What’s great about this method is that you can return columns in whatever order you want. If you wanted to switch the order around, you could just change it in your list:
selection = df[['Name', 'Height', 'Age']]
print(selection)
Which returns:
Name Height Age
0 Joe 5'9 28
1 Melissa 5'5 26
2 Nik 5'11 31
3 Andrea 5'6 33
4 Jane 5'8 32
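The difference between single and double square brackets is easy to check: a column name on its own returns a series, while a list of column names returns a dataframe, even when the list holds just one name. A minimal sketch, using a hand-built stand-in for the first three columns of the tutorial’s dataframe:

```python
import pandas as pd

# Trimmed stand-in for the tutorial's dataframe (values from the printed output)
df = pd.DataFrame({
    "Name": ["Joe", "Melissa", "Nik", "Andrea", "Jane"],
    "Age": [28, 26, 31, 33, 32],
    "Height": ["5'9", "5'5", "5'11", "5'6", "5'8"],
})

single = df["Name"]                 # one label: a Series
one_col_frame = df[["Name"]]        # list with one label: a DataFrame
reordered = df[["Name", "Height", "Age"]]

print(type(single).__name__)         # Series
print(type(one_col_frame).__name__)  # DataFrame
print(reordered.columns.tolist())    # ['Name', 'Height', 'Age']
```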
Copying Columns vs. Selecting Columns
Something important to note for all the methods covered above: it might look like a fresh dataframe was created for each selection. However, that’s not necessarily the case!
In Python, the equal sign (“=”) doesn’t copy anything. It simply assigns a name to whatever the selection returns, which may still be linked to the original dataframe.
Because of this, you can run into issues (in many versions of Pandas, a SettingWithCopyWarning) when trying to modify a selected dataframe.
In order to avoid this, you’ll want to use the .copy() method to create a brand new object that isn’t just a reference to the original.
To accomplish this, simply append .copy() to the end of your selection when creating the new dataframe.
For example, if we wanted to create a filtered dataframe of our original that only includes the first four columns, we could write:
new_df = df.iloc[:,:4].copy()
print(new_df)
This returns:
Name Age Height Score
0 Joe 28 5'9 30
1 Melissa 26 5'5 32
2 Nik 31 5'11 34
3 Andrea 33 5'6 38
4 Jane 32 5'8 29
This is incredibly helpful if you want to work with only a smaller subset of a dataframe.
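To see that a copy really is independent, you can change a value in the copy and confirm that the original stays the same. The sketch below uses a hand-built stand-in for the tutorial’s dataframe, and the replacement value 99 is made up for the demonstration.

```python
import pandas as pd

# Trimmed stand-in for the tutorial's dataframe (values from the printed output)
df = pd.DataFrame({
    "Name": ["Joe", "Melissa", "Nik", "Andrea", "Jane"],
    "Age": [28, 26, 31, 33, 32],
    "Height": ["5'9", "5'5", "5'11", "5'6", "5'8"],
})

# Select the first two columns and make an explicit copy
new_df = df.iloc[:, :2].copy()

# Modify the copy only (99 is an arbitrary value for the demonstration)
new_df.loc[0, "Age"] = 99

print(new_df.loc[0, "Age"])  # 99
print(df.loc[0, "Age"])      # 28, the original is untouched
```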
Conclusion: Using Pandas to Select Columns
Thanks for reading all the way to the end of this tutorial!
Using follow-along examples, you learned how to select columns using the loc method (to select based on names), the iloc method (to select based on column/row numbers), and, finally, how to create copies of your dataframes.
You also learned how to make column selection easier, when you want to select all rows.
