`Pandas` is a popular Python library used to manipulate tabular data. It provides a versatile `dataframe` object that can be loaded with data from many popular formats, such as Excel, SQL, CSV and more. It also provides helpful methods to both reshape and analyze your data in different ways.
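
As a minimal sketch of the workflow described above, the snippet below loads a small CSV into a DataFrame, then analyzes and reshapes it. The CSV text and the column names (`region`, `product`, `sales`) are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content, used here in place of a real file on disk
csv_text = """region,product,sales
East,Widget,100
East,Gadget,150
West,Widget,200
West,Gadget,50
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# Analyze: total sales per region
totals = df.groupby("region")["sales"].sum()

# Reshape: one row per region, one column per product
wide = df.pivot(index="region", columns="product", values="sales")

print(totals)
print(wide)
```

Reading from Excel or SQL follows the same pattern with `pd.read_excel` and `pd.read_sql`, which need an extra engine or database connection.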

## Featured Pandas Articles

Python Pivot Tables – The Ultimate Guide – Learn everything you need to know about Pandas pivot tables in this in-depth guide covering the versatile function. You’ll learn how to work with multi-index pivot tables and how to create your own custom functions to analyze data with.

Exploring the Pandas Style API – Learn how to style your Pandas DataFrame in different ways, using both colour and value formatting to better illustrate the importance of what you’re presenting. The Pandas style API provides you with many different tools that make styling tabular data much easier.
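
As a rough illustration of the multi-index pivot tables and custom functions mentioned in the pivot table guide, the sketch below uses a small made-up dataset. The column names and the `value_range` helper are hypothetical and are not taken from the guide itself.

```python
import pandas as pd

# Made-up sales data for illustration only
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q1", "Q2"],
    "sales": [100, 150, 200, 50, 75],
})


def value_range(values):
    """Hypothetical custom aggregation: largest value minus smallest value."""
    return values.max() - values.min()


# Passing two index columns produces a multi-index on the rows
totals = pd.pivot_table(df, index=["region", "quarter"], values="sales", aggfunc="sum")

# Any callable that reduces a group of values to one value can be an aggfunc
spread = pd.pivot_table(df, index=["region", "quarter"], values="sales", aggfunc=value_range)

print(totals)
print(spread)
```

Rows of a multi-index result are addressed with tuples, for example `totals.loc[("West", "Q1"), "sales"]`.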

## Pandas Tutorials

- Splitting Your Dataset with Scikit-Learn train_test_split – In this tutorial, you’ll learn how to split your Python dataset using Scikit-Learn’s train_test_split function. You’ll gain a strong understanding of the importance of splitting your data for machine learning to avoid underfitting or overfitting your models. You’ll also learn how the function is applied in many machine learning applications. Being able to split your…
- Introduction to Random Forests in Scikit-Learn (sklearn) – In this tutorial, you’ll learn what random forests in Scikit-Learn are and how they can be used to classify data. Decision trees can be incredibly helpful and intuitive ways to classify data. However, they can also be prone to overfitting, resulting in poor performance on new data. One easy way in which to reduce overfitting is…
- Introduction to Pandas for Data Science – In this tutorial, you’ll learn how to dive into the wonderful world of Pandas. Pandas is a Python package that provides fast and flexible data structures used for data manipulation and analysis. By the end of this tutorial, you’ll have learned how to: install pandas for Python using pip or conda; understand the pandas series…
- Indexing, Selecting, and Assigning Data in Pandas – In this tutorial, you’ll learn how to index, select and assign data in a Pandas DataFrame. Understanding how to index and select data is an important first step in almost any exploratory work you’ll take on in data science. Similarly, knowing how to assign values in Pandas can open up a whole new world of potential…
- Summarizing and Analyzing a Pandas DataFrame – In this tutorial, you’ll learn how to quickly summarize and analyze a Pandas DataFrame. By the end of this tutorial, you’ll have learned to take on some exploratory analysis of your dataset using pandas. You’ll learn how to calculate general attributes of your dataset, such as measures of central tendency or measures of dispersion. You’ll…
- Transforming Pandas Columns with map and apply – In this tutorial, you’ll learn how to transform your Pandas DataFrame columns using vectorized functions and custom functions using the map and apply methods. By the end of this tutorial, you’ll have a strong understanding of how Pandas applies vectorized functions and how these are optimized for performance. You’ll also learn how to use custom…
- Binning Data in Pandas with cut and qcut – In this tutorial, you’ll learn how to bin data in Python with the Pandas cut and qcut functions. You’ll learn why binning is a useful skill in Pandas and how you can use it to better group and distill information. By the end of this tutorial, you’ll have learned: how to use the cut and…
- DateTime in Pandas and Python – In this tutorial, you’ll learn how to work with dates, times, and DateTime in Pandas and Python. Working with DateTime in Python and Pandas can be a complicated thing. This guide aims to make the complicated simple by focusing on what you need to know to get started and to know enough to discover more…
- Plotting in Python with Matplotlib – In this tutorial, you’ll learn how to get started with plotting in Python with the matplotlib library. You’ll learn how the matplotlib library works and gain an understanding of its “anatomy”. You’ll learn how to plot and customize some simple graphs and how to use the matplotlib library with Pandas. Finally, you’ll learn how to…
- Data Cleaning and Preparation in Pandas and Python – In this tutorial, you’ll learn how to clean and prepare data in a Pandas DataFrame. You’ll learn how to work with missing data, how to work with duplicate data, and how to deal with messy string data. Being able to effectively clean and prepare a dataset is an important skill. Many data scientists estimate that they spend…
- Pandas GroupBy: Group, Summarize, and Aggregate Data in Python – The Pandas groupby method is an incredibly powerful tool to help you gain effective and impactful insight into your dataset. In just a few easy-to-understand lines of code, you can aggregate your data in incredibly straightforward and powerful ways. By the end of this tutorial, you’ll have learned how the Pandas .groupby() method…
- Pandas Datetime to Date Parts (Month, Year, etc.) – In this tutorial, you’ll learn how to use Pandas to extract date parts from a datetime column, such as the date, year, and month. Pandas provides a number of easy ways to extract parts from a datetime object, including using the .dt accessor. By the end of this tutorial, you’ll have learned how the dt…
- Calculate the Pearson Correlation Coefficient in Python – In this tutorial, you’ll learn how to calculate the Pearson Correlation Coefficient in Python. The tutorial will cover a brief recap of what the Pearson correlation coefficient is, how to calculate it with SciPy and how to calculate it for a Pandas Dataframe. Being able to understand the correlation between different variables is a key…
- Pandas: Get the Row Number from a Dataframe – Learn how to use Pandas to get the row number of rows matching a condition or multiple conditions, and how to count rows matching conditions.
- How to Calculate a Z-Score in Python (4 Ways) – In this tutorial, you’ll learn how to use Python to calculate a z-score for an array of numbers. You’ll learn a brief overview of what the z-score represents in statistics and how it’s relevant to machine learning. You’ll then learn how to calculate a z-score from scratch in Python as well as how to use…
- Pandas: How to Drop a Dataframe Index Column – Learn how to use Pandas to drop a dataframe index column using the reset_index and set_index methods and how to read a CSV without an index.
- Calculate a Weighted Average in Pandas and Python – Learn how to use Pandas to calculate the weighted average in Python, using groupby, numpy, and the zip function between two lists.
- How to Shuffle Pandas Dataframe Rows in Python – Learn how to shuffle a Pandas Dataframe using three different methods, including how to reproduce your shuffle results.
- Python zfill & rjust: Pad a String in Python – In this tutorial, you’ll learn how to use Python’s zfill method to pad a string with leading zeroes. You’ll learn how the method works and how to zero pad a string and a number. You’ll also learn how to use the method in Pandas as well as how to use sign prefixes, such as +…
- Pandas: Number of Columns (Count Dataframe Columns) – Learn how to use Python and Pandas to count the number of columns in a dataframe, including counting the number of columns meeting a condition.
- Pandas Sum: Add Dataframe Columns and Rows – Learn how to use Pandas to calculate a sum, including adding Pandas Dataframe columns and rows, and how to add columns conditionally.
- Pandas Diff: Calculate the Difference Between Pandas Rows – Learn how to use the Pandas diff method to calculate the difference between dataframe rows and columns, including at defined intervals.
- Normalize a Pandas Column or Dataframe (w/ Pandas or sklearn) – Learn how to normalize and standardize a Pandas Dataframe with sklearn, including max absolute scaling, min-max scaling and z-score scaling.
- Pandas Quantile: Calculate Percentiles of a Dataframe – Learn how to use the Pandas quantile method to calculate percentiles in Pandas, including how to modify the interpolation of values.
- Pandas Rank Function: Rank Dataframe Data (SQL row_number Equivalent) – Learn how to use the Pandas rank method to rank your data, including how to rank a grouped dataframe using the groupby method.
- Pandas Describe: Descriptive Statistics on Your Dataframe – Learn how to use the Pandas describe method to generate summary statistics on your Pandas Dataframe, including changing percentiles.
- Python SHA256 Hashing Algorithm: Explained – Learn how to implement Python SHA256 using the hashlib module, including working with unicode strings, files, and Pandas Dataframes.
- Pandas Shift: Shift a Dataframe Column Up or Down – Learn how to use the Python Pandas shift function to move a dataframe’s rows up or down, including working with time series and missing data.
- 7 Ways to Sample Data in Pandas – Learn how to sample data in Pandas using Python, including how to use the sample function, reproduce results, and take weighted samples of data.
- Python Lowercase String with .lower(), .casefold(), and .islower() – Learn to use Python to lowercase text using the lower and casefold methods, check whether strings are lowercase, and convert lists to lowercase.
- Pandas Dataframe to CSV File – Export Using .to_csv() – Use Python and Pandas to export a dataframe to a CSV file, using .to_csv, including changing separators, encoding, and missing values.
- Pandas: Iterate over a Pandas Dataframe Rows – Learn how to use Python and Pandas to iterate over rows of a dataframe, why vectorization is better, and how to use iterrows and itertuples.
- Pandas: Convert Column Values to Strings – Learn how to use Python and Pandas to convert dataframe column values to strings, including how to optimize for memory and efficiency.
- Python Absolute Value: Abs() in Python – Learn how to calculate a Python absolute value using the abs() function, as well as how to calculate it for a numpy array and a pandas dataframe.
- Pandas Variance: Calculating Variance of a Pandas Dataframe Column – Learn how to calculate the variance of a variable in Pandas, including how to calculate it for a single column, multiple columns, or a whole dataframe.
- Pandas: Create a Dataframe from Lists (5 Ways!) – Learn how to create a Pandas dataframe from lists, including using lists of lists, the zip() function, and ways to add columns and an index.
- Pandas Rename Index: How to Rename a Pandas Dataframe Index – Learn how to rename a Pandas index, including a single index or multi-index, as well as how to drop an index name altogether.
- Pandas: Count Unique Values in a GroupBy Object – Learn how to use Pandas to count unique values in a GroupBy object, allowing you to count distinct values using the popular groupby method.
- Pandas Reset Index: How to Reset a Pandas Index – Learn how to use the Pandas reset index method to reset an index, including working with a multi-index and dropping the original index.
- Pandas: Add Days to a Date Column – Learn how to use Pandas to add days to a date column, both as constant values and based on another column, using the Pandas timedelta function.
- Pandas Mean: Calculate Pandas Average for One or Multiple Columns – Learn how to calculate the Pandas mean (or Pandas Average), including how to calculate it on a column, dataframe, and row, and with nulls.
- Python List Difference: Find the Difference between 2 Python Lists – Learn how to find the Python list difference to find the differences between two lists, including how to find the symmetric list difference.
- Pandas Column to List – Convert a Pandas Series to a List – Learn how to convert a Pandas column to a list with this tutorial. Learn three different ways to accomplish this, all very easy to follow!
- Get Pandas Column Names as a List – Get Pandas column names as a list using this tutorial! Also learn to sort this list alphabetically and see if a column exists.
- Transpose a Pandas Dataframe – Learn how to transpose a pandas dataframe, including how to work with mixed datatypes and what their outputs may be.
- Python: Find Average of List or List of Lists – In this post, you’ll learn how to use Python to find the average of a list or a list of lists, using built-in tools and packages like numpy.
- Python: Split a Pandas Dataframe – Learn how to split a Pandas dataframe in Python. Split a dataframe by column value, by position, and by random values.
- Get Pandas Column Names as a List – Learn how to get Pandas columns as a list, a sorted list and how to check if a column exists in a particular dataframe.
- Python: Count Number of Occurrences in a String (4 Ways!) – Learn how to count the number of occurrences in a string using Python, including the built-in count method and the counter module.
- Convert Python String to Date: Python’s strptime Function – Learn how to convert a Python string to date using the datetime module’s strptime function. Also learn how to do this to a Pandas dataframe!
- Pandas: Number of Rows in a Dataframe (6 Ways) – Learn how to count the number of rows in a Pandas Dataframe, including identifying how many rows contain a value or meet a condition.
- Pandas Replace: Replace Values in Pandas Dataframe – Learn how to use the Pandas replace method to replace values across columns and dataframes, including with regular expressions.
- Create an Empty Pandas Dataframe and Append Data – In this post, you’ll learn how to create empty pandas dataframes and how to add data to them row-by-row and add rows via a loop.
- Seaborn Boxplot – How to create box and whisker plots – Learn how to create a Seaborn boxplot, including how to add styles, titles, and axis labels and how to add grouped boxplots.
- Seaborn Line Plot – Create Lineplots with Seaborn relplot – Learn how to use the Seaborn line plot and relplot functions to create beautiful line charts, add titles, customize styles, and plot multiple line charts.
- Seaborn Barplot – Make Bar Charts with sns.barplot – Learn how to use the Seaborn barplot and countplot functions to create beautiful bar charts, add titles, customize styles, and group bar charts.
- Rename Pandas Columns with Pandas .rename() – Learn the ways in which you can rename Pandas columns using the Pandas .rename() method. Learn how to rename one column or all columns, including automatically.
- Matplotlib Scatter Charts – Learn all you need to know – Learn how to create Matplotlib scatter charts, including how to customize colours, add titles, and change the transparency and size of markers.
- Reorder Pandas Columns: Pandas Reindex and Pandas insert – Learn different ways to reorder Pandas columns, including the Pandas reindex function as well as a custom function.
- Pandas get dummies (One-Hot Encoding) Explained – The pandas get dummies function allows you to easily one-hot encode your data sets for use in machine learning algorithms.
- Relative Frequencies and Absolute Frequencies in Python and Pandas – In this post, you’ll learn how to calculate relative frequencies and absolute frequencies using pure Python, as well as the popular data science library, Pandas. A relative frequency measures how often a certain value occurs in a dataset, relative to the total number of values in that dataset. An absolute frequency, meanwhile, simply measures how…
- Pandas Fiscal Year – Get Financial Year with Pandas – Learn how to calculate a custom Pandas fiscal year and how to format these fiscal years in custom ways.
- How to Sort Data in a Pandas DataFrame – Learn how to sort data in a Pandas dataframe, including how to sort in ascending or descending order, as well as sorting by multiple columns.
- Pandas Value_counts to Count Unique Values – The Pandas value_counts function counts values in a Pandas dataframe. Learn to normalize, include missing values, and combine with groupby.
- Create New Columns in Pandas – Pandas is one of the quintessential libraries for data science in Python. A useful skill is the ability to create new columns, either by adding your own data or calculating data based on existing data.
- Pandas Crosstab – Everything You Need to Know – The Pandas crosstab function is one of the many ways in which Pandas allows you to customize data. On the surface, it appears to be quite similar to the Pandas pivot table function, which I’ve covered extensively here. This post will give you a complete overview of how to best leverage the function. The crosstab…
- Pandas Fillna – Dealing with Missing Values – In this post, you’ll learn about the Pandas Fillna function and how to deal with missing values. No dataset is perfect. Learning how to deal with missing values is an important step in retaining useful data.
- How to Drop Duplicates in Pandas – Learn how to drop duplicates in Pandas, including keeping the first or last instance, and dropping duplicates based only on a subset of columns.
- Use Pandas & Python to Extract Tables from Webpages (read_html) – Learn how to easily scrape data from the web without having to build a complex web scraping script!
- Plotting a Histogram in Python with Matplotlib and Pandas – Learn what histograms are and how to create them in Python with Matplotlib and Pandas.
- All the Ways to Get Pandas Unique Values – Pandas provides a lot of different ways to interact with unique values. Learn how to get unique values as a list, get unique values across columns and more!
- All the Ways to Filter Pandas Dataframes – Learn all the ways in which to filter pandas dataframes in this tutorial, including filtering dates, multiple columns, and using the iloc, loc and query functions!
- Unpivot Your Data with the Pandas Melt Function – You can easily unpivot and reshape your data with Python by using Pandas and the Melt function! Find out how using this thorough overview!
- Use Pandas to Drop Columns and Rows – Learn how to use Pandas to drop columns and rows in a dataframe, including how to drop columns or rows based on conditions.
- 4 Ways to Use Pandas to Select Columns in a Dataframe – This article explores all the different ways you can select columns in Pandas, including using loc, iloc, and how to create copies of dataframes. You’ll learn a ton of different tricks for selecting columns using handy follow-along examples.
- Exploring the Pandas Style API – Explore how to style Pandas dataframes and make them presentation ready, including how to add conditional formatting and data type labels!
- Pivot Tables in Pandas with Python – You may be familiar with pivot tables in Excel to generate easy insights into your data. In this post, you’ll learn how to create pivot tables in Python and Pandas using the .pivot_table() method. This post will give you a complete overview of how to use the .pivot_table() function! Being able to quickly summarize data…
- Using Pandas for Descriptive Statistics in Python – Learn how to easily generate high-level descriptive statistics on any dataframe using a simple Pandas function!
- Creating Date Ranges with Pandas – Creating date ranges with Pandas can significantly speed up your workflow when you need to iterate over a number of dates. For example, when we run queries on APIs or databases, we may need to generate a list of dates that can be iterated over. This can be a time-consuming task, but thankfully we can…
- Binning Data in Python with Pandas’ cut() – In this post, we’ll explore how binning data in Python works with the cut() method in Pandas. In the past, we’ve explored how to use the describe() method to generate some descriptive statistics. In particular, the describe method allows us to see the quarter percentiles of a numerical column. However, as we’re generating insight into…
- VLOOKUP in Python and Pandas using .map() or .merge() – You may be familiar with VLOOKUP in Excel and be wondering how to accomplish this in Python. Using this tutorial, we’ll demonstrate Pandas’ .map() and .merge() methods to accomplish the same thing!

## Numpy Tutorials

NumPy is a library for working with arrays and matrices, used for linear algebra and many other numerical applications. It provides list-like NumPy arrays, which can be up to 50 times faster than Python lists, and it forms the basis for many other libraries.
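
To make the comparison with Python lists concrete, here is a small sketch that computes the same result with a list comprehension and with a NumPy array. It does not measure speed, and any speed-up depends on the operation and the size of the data; the variable names are illustrative only.

```python
import numpy as np

values = list(range(1_000))

# Pure Python: the interpreter loops over the elements one at a time
squares_list = [v * v for v in values]

# NumPy: the multiplication is applied to the whole array in compiled code
arr = np.array(values)
squares_arr = arr * arr

# A basic linear algebra operation: the dot product of the array with itself
dot = np.dot(arr, arr)

print(squares_arr[:5])
print(dot)
```

Wrapping each approach in the standard library’s `timeit` module on larger inputs is a simple way to see the difference on your own machine.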

- NumPy for Data Science in Python – In this tutorial, you’ll learn how to use Python’s NumPy library for data science. You’ll learn why the library matters in the realm of data science and how it’s foundational for many other libraries. You’ll learn about the NumPy ndarray data structure and how it works. By the end of the tutorial, you’ll have learned:…
- Calculate the Pearson Correlation Coefficient in Python – In this tutorial, you’ll learn how to calculate the Pearson Correlation Coefficient in Python. The tutorial will cover a brief recap of what the Pearson correlation coefficient is, how to calculate it with SciPy and how to calculate it for a Pandas Dataframe. Being able to understand the correlation between different variables is a key…
- Python: Multiply Lists (6 Different Ways) – Learn how to use Python to multiply lists, including multiplying lists by a number and multiplying lists element-wise using numpy.
- How to Calculate a Z-Score in Python (4 Ways) – In this tutorial, you’ll learn how to use Python to calculate a z-score for an array of numbers. You’ll learn a brief overview of what the z-score represents in statistics and how it’s relevant to machine learning. You’ll then learn how to calculate a z-score from scratch in Python as well as how to use…
- Calculate a Weighted Average in Pandas and Python – Learn how to use Pandas to calculate the weighted average in Python, using groupby, numpy, and the zip function between two lists.
- Python: Get Index of Max Item in List – Learn how to use Python to get the index of the max item in a list, including when duplicates exist, using for loops, enumerate, and numpy.
- Numpy Dot Product: Calculate the Python Dot Product – Learn how to use Python and numpy to calculate the dot product, including between arrays of different dimensions and of scalars.
- Python Natural Log: Calculate ln in Python – Learn how to use Python to calculate the natural logarithm, often referred to as ln, using the math and numpy libraries, and how to plot it.
- Python: Convert Degrees to Radians (and Radians to Degrees) – Learn how to use Python to convert degrees to radians and radians to degrees, using the math library and the numpy library.
- Python Absolute Value: Abs() in Python – Learn how to calculate a Python absolute value using the abs() function, as well as how to calculate it for a numpy array and a pandas dataframe.
- Python: Subtract Two Lists (4 Easy Ways!) – Learn how to use Python to subtract two lists, using the numpy library, the zip function, for-loops, as well as list comprehensions.
- Python: Transpose a List of Lists (5 Easy Ways!) – Learn how to use Python to transpose a list of lists using numpy, itertools, for loops, and list comprehensions in this tutorial!