In this tutorial, you’ll learn how to use the Python ord and chr functions to work more effectively with Unicode. You’ll get a quick refresher on Unicode and on how string characters can be represented in different ways in Python. Then, you’ll learn how the ord and chr functions work in Python, both for single characters and for multiple characters. You’ll also learn how to work with hexadecimal data. Let’s get started!
The Python ord() function converts a character into an integer that represents the character’s Unicode code point. Similarly, the chr() function converts a Unicode code point (an integer) into the corresponding one-character string.
What is Unicode and How is it Used in Python?
Before diving into the ord and chr functions, let’s start off by covering why learning about Unicode is so important. At a basic level, computers work with numbers. Because of this, the characters and letters that appear on a screen are numbers under the hood.
In the past, many different types of encoding existed. However, many of these were incomplete when considering the vast variety of characters that exist on the internet. In order to resolve this, the Unicode Consortium standardized specifications of how to represent characters in 1991.
The Unicode standard assigned numerical values to every type of character, from letters to symbols to emojis. The standard allowed computers to have a much easier time understanding symbols, especially as more and more symbols are being added to the internet.
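To make the idea of characters as numbers concrete, here is a minimal sketch that uses Python’s built-in ord() function to look up the number behind a few characters. The sample characters and the variable names are illustrative choices only.

```python
# Each character maps to a Unicode code point, which is just an integer.
# The sample characters below are illustrative choices only.
samples = ['a', 'A', '€', '😀']

# Build a mapping from each character to its code point
code_points = {character: ord(character) for character in samples}

for character, code_point in code_points.items():
    # U+XXXX is the conventional way of writing a code point in hexadecimal
    print(f"{character!r} -> {code_point} (U+{code_point:04X})")
```

Letters, currency symbols, and emojis are all handled the same way: each one is simply a different integer.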
Python Ord Function: Unicode to Integer
The Python ord() function is used to convert a single Unicode character into its integer representation. We can pass in any string containing exactly one character, and the function will return an integer.
Let’s see what this looks like:
# Converting Unicode to Int Using ord()
character = 'd'
print(ord(character))

# Returns: 100
We can see that the integer representation of the Unicode letter 'd' is 100. The ord() function works by taking a single character as its input: the character that you want to convert to an integer.
Let’s see what happens when we pass in more than one character:
# Converting Unicode to Int Using ord()
character = 'datagy'
print(ord(character))

# Raises: TypeError: ord() expected a character, but string of length 6 found
We can see that passing more than one character into the ord() function raises a TypeError. This happens because the function expects only a single character to be passed in.
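If you would rather not let the error stop your program, one option is to catch it. The sketch below wraps ord() in a small helper. The name safe_ord is hypothetical and not part of Python, and returning None is only one possible design choice.

```python
def safe_ord(text):
    """Hypothetical helper: return ord(text), or None when text is not
    exactly one character long."""
    try:
        return ord(text)
    except TypeError:
        # ord() raises TypeError for empty strings and longer strings
        return None


print(safe_ord('d'))       # 100
print(safe_ord('datagy'))  # None
```

Returning None keeps the example short. In real code, you may prefer to let the TypeError propagate so the mistake is noticed early.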
Python Ord for Multiple Characters
In order to resolve the TypeError that is raised when more than one character is passed into the ord() function, we need to iterate over each character in the string. Because Python strings are iterable objects, we can directly iterate over these string values.
Let’s see how we can repeat our earlier example without raising a TypeError:
# Converting Multiple Characters into Integers
word = 'datagy'
for letter in word:
    print(ord(letter))

# Returns:
# 100
# 97
# 116
# 97
# 103
# 121
We can now see the integer representation of each character in our Unicode string. This was accomplished by looping over each letter in the string and applying the ord() function to it.
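If you want to keep the integers rather than only print them, the same loop can be written as a list comprehension. This is a minimal sketch, and the variable names are illustrative.

```python
word = 'datagy'

# Collect the code point of every character into a list
code_points = [ord(letter) for letter in word]
print(code_points)  # [100, 97, 116, 97, 103, 121]

# An equivalent alternative using the built-in map() function
same_code_points = list(map(ord, word))
print(same_code_points == code_points)  # True
```

Both versions call ord() once per character, so choosing between them is mostly a matter of style.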
In the next section, you’ll learn about the reverse of this function: the chr() function.
Python Chr Function: Integer to Unicode Character
The Python chr function does the opposite of the ord function: it converts an integer into its corresponding Unicode character, returned as a one-character string. Let’s try to convert some values from integers into their Unicode counterparts:
# Converting Integers to their Unicode Equivalent
numbers = [100, 97, 116, 97, 103, 121]
for number in numbers:
    print(chr(number))

# Returns:
# d
# a
# t
# a
# g
# y
We can go further and convert this list of numbers into an actual Python string. We can do this by using the .join() method. Let’s take a look at how this works:
# Converting Integers to their Unicode Equivalent
numbers = [100, 97, 116, 97, 103, 121]
word = ''.join([chr(number) for number in numbers])
print(word)

# Returns: datagy
As of the writing of this article, the chr function accepts any integer from 0 through 1,114,111 (0x10FFFF in hexadecimal), which covers the full range of Unicode code points. If a value outside of this range is passed into the function, the function will raise a ValueError. Let’s see what this looks like:
# Raising a ValueError with chr
chr(1114112)

# Raises: ValueError: chr() arg not in range(0x110000)
We can see that, as expected, a ValueError was raised.
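When the integers come from an untrusted source, it can help to guard the call. The sketch below uses a hypothetical helper named safe_chr, which is not part of Python. It also relies on sys.maxunicode, which holds the largest valid code point.

```python
import sys


def safe_chr(number):
    """Hypothetical helper: return chr(number), or None when number is
    outside the valid range of Unicode code points."""
    try:
        return chr(number)
    except (ValueError, OverflowError):
        # ValueError: outside 0..sys.maxunicode
        # OverflowError: can be raised for extremely large integers
        return None


print(sys.maxunicode)     # 1114111
print(safe_chr(100))      # d
print(safe_chr(1114112))  # None
print(safe_chr(-1))       # None
```

Catching OverflowError as well is a defensive assumption here, since very large integers may fail before the range check is reached.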
Working with Hexadecimal Data in Python Ord and Chr
Hexadecimal numbers are numbers written in base 16 rather than the familiar base 10. They can be used together with both the chr and ord functions. In Python, you can write an integer in hexadecimal by prefixing it with 0x.
We can convert an integer into its hexadecimal equivalent by using the hex function, which returns the result as a string. Let’s give this a shot:
# Converting an Integer to a Hexadecimal Number
number = 100
hex_number = hex(number)
print(hex_number)

# Returns: 0x64
Now that we have the hexadecimal value for the number 100, we can pass it into the chr function as a hexadecimal literal to convert it to its Unicode representation:
# Converting Hexadecimal to Unicode
print(chr(0x64))

# Returns: d
Here we can see that 0x64 is a valid integer literal in Python. Python interprets the 0x prefix as marking a hexadecimal number, so 0x64 is the same integer as 100, and the chr function converts it into the corresponding Unicode character.
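One detail worth keeping in mind is that hex() returns a string, not an integer, so its result cannot be passed straight into chr(). The sketch below shows one way to go from a character to a hexadecimal string and back again, using int() with a base of 16. The variable names are illustrative.

```python
character = 'd'

# Character -> code point -> hexadecimal string
hex_string = hex(ord(character))
print(hex_string)  # 0x64

# Hexadecimal string -> integer -> character
number = int(hex_string, 16)
restored = chr(number)
print(number, restored)  # 100 d
```

Passing 16 as the second argument tells int() to read the string as base 16. The 0x prefix is accepted in that case.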
In this tutorial, you learned how to work with the ord and chr functions in Python. These functions allow you to translate characters into their Unicode code points and code points back into characters. You also learned how to use the ord() function for multiple characters. Finally, you learned how to work with hexadecimal data in Python when using the ord and chr functions.
To learn more about the Python ord and chr functions, check out the official Python documentation.