Emoji Input Validation

Author: Vicky Adi Firmansyah (23109504)
Date: March 25, 2025
Category: Frontend Development

In the development of input features with character length validation, emoji can cause problems because each emoji consists of multiple characters in Unicode encoding. This can confuse users if the maximum validation is based on the number of Unicode characters, not the number of visible characters.

The Problem

Character Count Mismatch

When users enter emojis, the visible character count doesn’t match the actual Unicode character count, causing validation failures that confuse users.

Example Scenario

Setting a maximum input limit of 7 characters. If the user enters the emoji ❤️ (love), which is actually two characters in Unicode, then when the user types four love emojis (❤️❤️❤️❤️), the calculated Unicode length could be more than 7, causing validation to fail even though visually the user only sees four emojis.

Example 001 The image above shows an example of an input form using Ant Design with a maximum validation of 7 characters.

Root Causes

1. UTF-16 Code Units

The .length function in JavaScript or other languages counts based on UTF-16 code units (MDN JavaScript Reference), not the number of characters visible to the user.

Example:

"❤️".length  // Returns 2 (UTF-16 code units)
// But users see it as 1 character

2. Emoji Complexity

Emoji in Unicode are often made up of multiple code points and some emojis can have modifiers, such as skin color variations, that increase the number of characters in the encoding.

Examples:

Base emoji: ❤️ = 2 code units
With modifier: 👋🏻 = 4 code units (hand + skin tone)
Complex emoji: 👨‍👩‍👧‍👦 = 11 code units (family emoji)

Solution & Implementation

Form Implementation: React Ant Design Integration

In this example, we are using React Ant Design Form. Validation can’t be used directly, so we need to add it to the rules in Form.Item using a validator.

<Form.Item
  name="nik"
  label="NIK"
  rules={[
    {
      validator: (_, value) => {
        const NIK_MAX_LENGTH = 7
        if (getVisibleCharacterCount(value) > NIK_MAX_LENGTH) {
          return Promise.reject(new Error(`NIK Must be less than ${NIK_MAX_LENGTH} characters`));
        }
        return Promise.resolve();
      },
    },
  ]}
>
  <Input />
</Form.Item>

Implementation Details:

Custom Validator
- The validator function uses our getVisibleCharacterCount function
- Properly counts visible characters including emojis
- Validates against the maximum length
Error Handling
- Returns Promise.reject with a clear error message when limit exceeded
- Returns Promise.resolve() when validation passes
- User-friendly error messages

Before vs After Comparison

Before (Problematic)

Example-001

Using .length validation causes issues with emoji input, where visible characters don’t match Unicode character count.

Issues:

Emoji counted incorrectly
Validation fails unexpectedly
Confusing user experience
Users can’t enter expected amount of emojis

After (Fixed)

Example-002

Using Intl.Segmenter provides accurate character counting that matches user expectations.

Benefits:

✅ Accurate emoji counting
✅ Intuitive validation
✅ Better user experience
✅ Predictable behavior

Key Takeaways

Problem Summary:

Using .length to validate input with emoji can be problematic due to Unicode encoding complexity
Emoji consist of multiple UTF-16 code units but appear as single characters
This mismatch confuses users and causes unexpected validation failures

Solution:

✅ With Intl.Segmenter we can count the number of characters visible to the user more accurately
✅ This approach makes validation more intuitive and aligns with user expectations
✅ Easy to implement with modern JavaScript APIs
✅ Works well with popular form libraries like Ant Design