
What is Unicode?

Unicode is a universal standard for character encoding. It assigns a unique number, called a code point, to virtually every character used in the world’s writing systems, and defines how those numbers are stored in binary.

Unicode is the cornerstone of modern digital text. Whether you’re writing an email, programming software, or sending a text message with an emoji, you’re almost certainly using Unicode. It is designed to cover the characters and symbols of all the world’s major writing systems.

How Unicode works.

At its core, Unicode represents characters as abstract entities rather than specific visual representations: the standard defines what a character is, while fonts decide how it looks.

Code points and character encoding.

In Unicode, every character is assigned a unique code point, written as U+ followed by a hexadecimal number. These code points serve as the foundation for character representation in digital text. For example, the code point for the Latin letter ‘A’ is U+0041.
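As a minimal sketch of the idea, the Python snippet below looks up the code point of a character and formats it in the U+ notation. The helper name `code_point` is made up for this illustration; `ord` and `chr` are Python built-ins.

```python
def code_point(ch: str) -> str:
    """Return the code point of a single character in U+XXXX notation."""
    # ord() gives the code point as an integer; format it as at least
    # four uppercase hexadecimal digits.
    return f"U+{ord(ch):04X}"


print(code_point("A"))   # U+0041
print(code_point("€"))   # U+20AC
print(chr(0x0041))       # A (chr() goes the other way, from number to character)
```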

Unicode defines several encoding forms, such as UTF-8, UTF-16, and UTF-32, which determine how code points are stored as bytes. UTF-8 uses one to four bytes per code point, UTF-16 uses two or four, and UTF-32 always uses four.
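To make the difference concrete, here is a short Python sketch that encodes the same character in all three forms and prints the resulting bytes. The helper name `encoded_hex` is hypothetical, and the big-endian variants are chosen only so that no byte order mark is added to the output.

```python
def encoded_hex(ch: str) -> dict[str, str]:
    """Encode one character in three Unicode encoding forms, as hex bytes."""
    # The "-be" (big-endian) codecs are used so that Python does not
    # prepend a byte order mark to the result.
    encodings = ("utf-8", "utf-16-be", "utf-32-be")
    return {enc: ch.encode(enc).hex(" ") for enc in encodings}


for ch in ("A", "€", "\U0001F600"):  # Latin letter, euro sign, grinning face
    print(f"U+{ord(ch):04X}", encoded_hex(ch))
```

The letter ‘A’ takes one byte in UTF-8, two in UTF-16, and four in UTF-32, while the emoji takes four bytes in all three.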

Multilingual support.

One of Unicode’s greatest strengths is its capacity to accommodate a multitude of languages and scripts, from Arabic and Chinese to the many Indic scripts, making it the foundation for global digital communication.

Emoji and symbols.

Emojis and symbols have become ubiquitous in our digital conversations. Unicode standardises the code points behind them, so a smiling face or a thumbs-up sent from one device is recognised as the same emoji on another. The exact artwork still depends on each platform’s font, so the same emoji can look slightly different from device to device.
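Under the hood, an emoji is simply one or more code points with standard names. The sketch below uses Python’s built-in `unicodedata` module to list them; the helper name `describe` is made up for this example.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """List the code point and official Unicode name of each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


print(describe("\U0001F600"))            # grinning face: one code point
print(describe("\U0001F44D\U0001F3FD"))  # thumbs-up plus a skin-tone modifier
```

The second example shows that what looks like a single emoji on screen can be a sequence of two code points.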

Using Unicode in SMS.

When you use Unicode in SMS, you can include a wide range of characters and symbols from different writing systems and languages, making text messages more versatile and inclusive. Unlike standard SMS, which uses the 7-bit GSM character set and mainly supports the Latin alphabet, Unicode SMS enables the transmission of multilingual content.

However, it’s essential to know that Unicode characters take up more message space. A single character outside the standard 7-bit set switches the whole message to a 16-bit encoding (UCS-2), which reduces the maximum from 160 to 70 characters per SMS. Longer messages are split into parts of 67 characters each (153 for standard SMS), and many emojis count as two or more characters. This can result in additional spend you might not be aware of until you receive your invoice.
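The effect on message length can be estimated with a rough Python sketch. It assumes the commonly cited limits of 160 characters for a single 7-bit message (153 per part when split) and 70 for a single Unicode message (67 per part when split). The function name `sms_segments` is hypothetical, and the character tables are a simplified rendering of the default GSM 7-bit alphabet, so treat the result as an estimate rather than what any particular carrier or provider will bill.

```python
import math

# Assumption: the default GSM 7-bit alphabet, without national language tables.
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension characters need an escape prefix, so each one uses two slots.
GSM7_EXTENDED = frozenset("^{}\\[~]|€")


def sms_segments(text: str) -> tuple[str, int, int]:
    """Estimate the (encoding, units used, number of SMS parts) for a message."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        encoding, single, multi = "GSM-7", 160, 153
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
    else:
        # Any other character forces the whole message into 16-bit units;
        # characters outside the Basic Multilingual Plane (most emojis) use two.
        encoding, single, multi = "UCS-2", 70, 67
        units = len(text.encode("utf-16-be")) // 2
    # Simplification: ignores the rule that a two-unit character
    # cannot be split across two parts.
    parts = 1 if units <= single else math.ceil(units / multi)
    return encoding, units, parts


print(sms_segments("Hello"))            # ('GSM-7', 5, 1)
print(sms_segments("Hi \U0001F44D"))    # ('UCS-2', 5, 1)
print(sms_segments("a" * 161))          # ('GSM-7', 161, 2)
```

Note how a single emoji is enough to move a short message from the 160-character limit to the 70-character one.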

[Infographic: How many characters can you fit in a text message?]

Using Unicode in SMS can bridge language barriers, boost expressiveness, and enable richer communication, but it’s important to consider potential limitations in terms of message length and compatibility with the recipient’s device.

Benefits of Unicode.

Unicode offers numerous benefits that make it indispensable in today’s digital age. 

Universal standard.

Unicode provides a single, universal standard for character encoding, covering characters and symbols from languages and writing systems around the globe.

Forward compatibility.

Unicode is forward-compatible, meaning that new characters and scripts are added without disrupting existing ones: once a character is assigned a code point, that assignment never changes. As languages evolve and new characters are needed, new versions of Unicode add them.

Compatibility with ASCII.

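Unicode is compatible with ASCII (American Standard Code for Information Interchange), a legacy encoding system: the first 128 Unicode code points match ASCII exactly, and ASCII text is already valid UTF-8. This makes it easier for systems and apps to switch to the universal standard. Unicode also has provisions for diacritical marks and special characters, either as single precomposed characters or as combining marks added to a base letter.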
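Both points can be checked with a few lines of Python, shown here as an illustrative sketch that uses only the standard library. The variable names are arbitrary.

```python
import unicodedata

# ASCII text produces exactly the same bytes when encoded as UTF-8.
text = "Hello"
same_bytes = text.encode("ascii") == text.encode("utf-8")
print(same_bytes)  # True

# An accented letter can be one precomposed code point (U+00E9)
# or a base letter followed by a combining accent (U+0065 U+0301).
composed = "\u00e9"
decomposed = "e\u0301"
print(composed == decomposed)  # False: different code point sequences

# Normalisation converts between the two forms so they compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
print(unicodedata.normalize("NFD", composed) == decomposed)   # True
```

Because the two forms render identically but are stored differently, normalising text before comparing or counting characters avoids surprises.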

Flexibility.

Because Unicode text can be stored in different encoding forms (like UTF-8, UTF-16, and UTF-32), you can choose the one that best suits your memory usage and data transmission needs.
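As a rough illustration of that trade-off, the Python sketch below compares how many bytes the same text needs in each encoding form. The helper name `encoded_sizes` is hypothetical, and the little-endian variants are used only to leave out the byte order mark.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes the text needs in each encoding form."""
    # The "-le" codecs avoid adding a byte order mark to the count.
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(text.encode(enc)) for enc in encodings}


print(encoded_sizes("hello"))  # {'utf-8': 5, 'utf-16-le': 10, 'utf-32-le': 20}
print(encoded_sizes("你好"))    # {'utf-8': 6, 'utf-16-le': 4, 'utf-32-le': 8}
```

UTF-8 is the most compact choice for mostly Latin text, while UTF-16 can be smaller for text written mainly in scripts such as Chinese.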

