
Python: Count Unique Values in a List (4 Ways)


In this tutorial, you’ll learn how to use Python to count unique values in a list. You’ll see four approaches: a naive for loop, the Counter class from the collections module, the built-in set() function, and numpy. We’ll close off the tutorial by timing each method to see which is fastest, so you can get the best performance out of your script.

The Quick Answer: Use Python Sets

# Using sets to count unique values in a list
a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

num_values = len(set(a_list))
print(num_values)

# Returns 5

Why Count Unique Values?

Python lists are a useful built-in data structure! One of the perks that they offer is the ability to have duplicate items within them.

There may be many times when you want to count the unique values contained within a list. For example, if you receive a list that records every login to a site, you could determine how many unique people actually logged in.

Using Collections to Count Unique Values in a List

Python’s built-in collections module can be used to count unique values in a list. The module provides a class called Counter, which returns a dictionary-like object with the unique values as keys and the number of occurrences of each as values.

Because of this, we can count the number of keys to get the number of unique values.
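To make the dictionary-like structure concrete, here is a minimal sketch that prints what Counter holds for the sample list. The variable names fruits and counts are only for illustration.

```python
from collections import Counter

fruits = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

counts = Counter(fruits)

# Each unique value is a key; its value is how many times it appears
print(dict(counts))
# {'apple': 5, 'orage': 1, 'banana': 1, 'orange': 1, 'grape': 2}

# There is exactly one key per unique value
print(list(counts.keys()))
# ['apple', 'orage', 'banana', 'orange', 'grape']
```

Since every distinct value shows up as exactly one key, the number of keys is the number of unique values.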

Tip! Want to learn more about the Python collections module and its Counter class? Check out my in-depth tutorial here, where you’ll learn how to count occurrences of a substring in a string.

Let’s see how we can use the Counter object to count unique values in a Python list:

# Use Counter from collections to count unique values in a Python list
from collections import Counter

a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

counter_object = Counter(a_list)
keys = counter_object.keys()
num_values = len(keys)

print(num_values)

# Returns 5

Let’s see what we’ve done here:

  1. We passed our list into Counter to create a Counter object
  2. We got the keys using the .keys() method
  3. Finally, we got the length of that keys object

We can make this much easier to write by simply chaining the process together, as shown below.

# Using Counter from collections to count unique values in a list
from collections import Counter

a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

num_values = len(Counter(a_list).keys())
print(num_values)

# Returns 5

This process returns the same thing, but is much quicker to write!
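Because Counter is a subclass of dict, calling len() on it directly should give the same number as going through .keys(). A short sketch, again using fruits as an illustrative name:

```python
from collections import Counter

fruits = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

counts = Counter(fruits)

# len() of a Counter is the number of distinct keys
num_values = len(counts)
print(num_values)  # 5

# Same result as the .keys() version
print(num_values == len(counts.keys()))  # True
```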

Using Sets to Count Unique Values in a Python List

Sets are another built-in Python data structure. One of the things that separates sets from lists is that they can only contain unique values.

Python comes with a built-in set() function that creates a set from whatever iterable is passed into it. When we pass a list into the function, it turns the list into a set, thereby stripping out duplicate values.
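As a quick illustration of that de-duplication, here is a small sketch using made-up login names (the logins list is hypothetical data, not from the examples in this tutorial):

```python
logins = ['ana', 'ben', 'ana', 'cy', 'ben']  # hypothetical data

unique_logins = set(logins)

print(len(logins))         # 5
print(len(unique_logins))  # 3

# Sets are unordered, so sort them if you need a stable display order
print(sorted(unique_logins))  # ['ana', 'ben', 'cy']
```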

Now, let’s see how we can use sets to count unique values in a list:

# Using sets to count unique values in a list
a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

unique_set = set(a_list)
num_values = len(unique_set)

print(num_values)

# Returns 5

What we’ve done here is:

  1. Turned our list into a set using the built-in set() function
  2. Returned the number of values by counting the length of the set, using the len() function

We can also make this a little shorter by nesting the function calls, as demonstrated below:

# Using sets to count unique values in a list
a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

num_values = len(set(a_list))

print(num_values)

# Returns 5

This returns the same value but is a little faster to write out.
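One caveat worth keeping in mind: set() only works when every item is hashable. A list of lists, for example, raises a TypeError. One possible workaround, sketched here with made-up data, is to convert each inner list to a tuple first:

```python
pairs = [[1, 2], [3, 4], [1, 2]]  # hypothetical nested data

try:
    len(set(pairs))
except TypeError as error:
    # Lists are mutable, so they cannot be members of a set
    print(f"set() failed: {error}")

# Tuples are hashable, so converting each inner list works
num_values = len(set(tuple(pair) for pair in pairs))
print(num_values)  # 2
```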


Use Numpy to Count Unique Values in a Python List

You can also use numpy to count unique values in a list. Numpy uses a data structure called a numpy array, which behaves similarly to a list but comes with many other helpful functions, such as the unique() function for removing duplicates.

Let’s see how we can use numpy to count unique values in a list:

# Use numpy in Python to count unique values in a list
import numpy as np

a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

array = np.array(a_list)
unique = np.unique(array)
num_values = len(unique)

print(num_values)

# Returns 5

Let’s see what we’ve done here:

  1. We imported numpy as np and created an array using the array() function
  2. We used the unique() function from numpy to remove any duplicates
  3. Finally, we calculated the length of that array

We can also write this out more concisely by nesting the function calls. Let’s see how this can be done:

# Use numpy in Python to count unique values in a list
import numpy as np

a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

num_values = len(np.unique(np.array(a_list)))

print(num_values)

# Returns 5

This returns the same result as before.
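If you also want to know how often each value occurs, np.unique() takes a return_counts argument. The sketch below additionally passes the plain list straight to np.unique(), which converts it to an array internally. The names fruits, values, and counts are illustrative.

```python
import numpy as np

fruits = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

# np.unique returns the sorted unique values and, optionally, their counts
values, counts = np.unique(fruits, return_counts=True)

print(len(values))  # 5

print(dict(zip(values.tolist(), counts.tolist())))
# {'apple': 5, 'banana': 1, 'grape': 2, 'orage': 1, 'orange': 1}
```

Note that the unique values come back sorted, not in the order they first appeared in the list.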

Use a For Loop in Python to Count Unique Values in a List

Finally, let’s take a look at a more naive method to count unique items in a list. For this, we’ll use a Python for loop to iterate over a list and count its unique items.

# Use a for loop to count unique values in a list
a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

unique_list = []
unique_items = 0

for item in a_list:
    if item not in unique_list:
        unique_list.append(item)
        unique_items += 1

print(unique_items)

# Returns 5

Let’s see what we’ve done here:

  1. We create a new list called unique_list and an integer of 0 called unique_items
  2. We then loop over our original list and see if the current item is in the unique_list
  3. If it isn’t, then we append it to the list and add 1 to our counter unique_items
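The steps above check membership against a list, which means every check scans the items collected so far. As a variation (not the naive method itself), the sketch below tracks seen items in a set instead, where membership checks take roughly constant time. The name seen is illustrative.

```python
a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

seen = set()
unique_items = 0

for item in a_list:
    # Looking up an item in a set does not require scanning every element
    if item not in seen:
        seen.add(item)
        unique_items += 1

print(unique_items)  # 5
```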


What Method is Fastest to Count Unique Values in a Python List?

Now that you’ve learned four different ways of counting unique values in a Python list, let’s take a look at which method is fastest.

What we’ll do is create a Python decorator to time each method. We’ll create a function that executes each method and decorate it to identify how long its execution takes.

For our sample list, we’ll use the first few paragraphs of A Christmas Carol, where each word is a list item, and repeat that list 10,000 times to make it a bit of a challenge:

import time

def time_it(func):
    """Print the runtime of a decorated function."""
    def wrapper_time_it(*args, **kwargs):
        start_time = time.perf_counter()
        value = func(*args, **kwargs)
        end_time = time.perf_counter()
        run_time = end_time - start_time
        print(f"Finished {func.__name__!r} in {run_time:.10f} seconds")
        return value
    return wrapper_time_it

@time_it
def counter_method(a_list):
    from collections import Counter
    return len(Counter(a_list).keys())

@time_it
def set_method(a_list):
    return len(set(a_list))

@time_it
def numpy_method(a_list):
    import numpy as np
    return len(np.unique(np.array(a_list)))

@time_it
def for_loop_method(a_list):
    unique_list = list()
    unique_items = 0

    for item in a_list:
        if item not in unique_list:
            unique_list.append(item)
            unique_items += 1

    return unique_items


sample_list = ['Marley', 'was', 'dead:', 'to', 'begin', 'with.', 'There', 'is', 'no', 'doubt', 'whatever', 'about', 'that.', 'The', 'register', 'of', 'his', 'burial', 'was', 'signed', 'by', 'the', 'clergyman,', 'the', 'clerk,', 'the', 'undertaker,', 'and', 'the', 'chief', 'mourner.', 'Scrooge', 'signed', 'it:', 'and', 'Scrooge’s', 'name', 'was', 'good', 'upon', '’Change,', 'for', 'anything', 'he', 'chose', 'to', 'put', 'his', 'hand', 'to.', 'Old', 'Marley', 'was', 'as', 'dead', 'as', 'a', 'door-nail.', 'Mind!', 'I', 'don’t', 'mean', 'to', 'say', 'that', 'I', 'know,', 'of', 'my', 'own', 'knowledge,', 'what', 'there', 'is', 'particularly', 'dead', 'about', 'a', 'door-nail.', 'I', 'might', 'have', 'been', 'inclined,', 'myself,', 'to', 'regard', 'a', 'coffin-nail', 'as', 'the', 'deadest', 'piece', 'of', 'ironmongery', 'in', 'the', 'trade.', 'But', 'the', 'wisdom', 'of', 'our', 'ancestors', 'is', 'in', 'the', 'simile;', 'and', 'my', 'unhallowed', 'hands', 'shall', 'not', 'disturb', 'it,', 'or', 'the', 'Country’s', 'done', 'for.', 'You', 'will', 'therefore', 'permit', 'me', 'to', 'repeat,', 'emphatically,', 'that', 'Marley', 'was', 'as', 'dead', 'as', 'a', 'door-nail.', 'Scrooge', 'knew', 'he', 'was', 'dead?', 'Of', 'course', 'he', 'did.', 'How', 'could', 'it', 'be', 'otherwise?', 'Scrooge', 'and', 'he', 'were', 'partners', 'for', 'I', 'don’t', 'know', 'how', 'many', 'years.', 'Scrooge', 'was', 'his', 'sole', 'executor,', 'his', 'sole', 'administrator,', 'his', 'sole', 'assign,', 'his', 'sole', 'residuary', 'legatee,', 'his', 'sole', 'friend,', 'and', 'sole', 'mourner.']

sample_list *= 10000

counter_method(sample_list)
set_method(sample_list)
numpy_method(sample_list)
for_loop_method(sample_list)

# Returns
# Finished 'counter_method' in 0.2321387500 seconds
# Finished 'set_method' in 0.0463015000 seconds
# Finished 'numpy_method' in 0.2570261250 seconds
# Finished 'for_loop_method' in 7.1416198340 seconds

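From this, we can see that while the Counter and Numpy methods are reasonably fast in this run, the set method is the fastest of the bunch, and the for loop is by far the slowest. The set method does the least work: it hashes each item once and simply discards duplicates, whereas Counter also has to keep a running count for every key. The for loop is slow because each "item not in unique_list" check scans the list built so far. Keep in mind that the Counter and Numpy functions import their modules inside the timed call, so their figures can include some import overhead, and exact timings will vary from machine to machine.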
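A single timed run can be noisy. As an alternative to the decorator, here is a rough sketch that uses the standard library’s timeit module to run each method several times and keep the best result. The small sample list and the repeat counts are arbitrary choices for illustration, not the benchmark used above.

```python
import timeit
from collections import Counter

# A small stand-in list; swap in your own data
sample = ['a', 'b', 'a', 'c', 'b', 'a'] * 10_000

# Run each statement 20 times per measurement, take 3 measurements,
# and keep the fastest one to reduce the effect of background noise
set_time = min(timeit.repeat(lambda: len(set(sample)), number=20, repeat=3))
counter_time = min(timeit.repeat(lambda: len(Counter(sample)), number=20, repeat=3))

print(f"set:     {set_time:.4f} seconds")
print(f"Counter: {counter_time:.4f} seconds")
```

Because the imports happen once at the top, outside the timed statements, import overhead is not part of the measurement.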

Conclusion

In this post, you learned how to count unique values in a Python list. You learned how to do this using built-in sets, using the collections module, using numpy, and finally using a for-loop. You then learned which of these methods is the fastest method to execute, to ensure you’re not bogging down your script unnecessarily.

To learn more about the Counter object in the collections module, you can check out the official documentation here.
