Replacing UTF8mb4 characters with HTML entities in a string with PHP

Published on 2024-06-22 • Modified on 2024-06-22

This snippet shows how to replace four-byte UTF-8 characters (the ones that need MySQL's utf8mb4 charset), such as emojis, with numeric HTML entities in a PHP string. This can be helpful, for example, if your "legacy" database doesn't support this encoding.


<?php

declare(strict_types=1);

namespace App\Controller\Snippet;

/**
 * I am using a PHP trait to isolate each snippet in a file.
 * This code should be called from a Symfony controller extending AbstractController (as of Symfony 4.2)
 * or Symfony\Bundle\FrameworkBundle\Controller\Controller (Symfony <= 4.1).
 * Services are injected in the main controller constructor.
 */
trait Snippet306Trait
{
    public function snippet306(): void
    {
        $rawString = 'Hello 🙏 world 🙂!';
        echo 'initial string: '.$rawString.PHP_EOL;

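        $string = preg_replace_callback(
            // A four-byte UTF-8 sequence: one lead byte followed by three continuation bytes.
            '/[\xF0-\xF7][\x80-\xBF]{3}/',
            static function (array $matches): string {
                // Big-endian UTF-32 stores the code point as a plain 32-bit integer,
                // so its hex dump converts directly to the decimal entity number.
                return '&#'.hexdec(bin2hex(mb_convert_encoding($matches[0], 'UTF-32BE', 'UTF-8'))).';';
            },
            $rawString
        );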

        echo 'final string: '.$string.PHP_EOL;

        // That's it! 😁
    }
}
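
The same replacement can also be sketched as a standalone function, outside of the Symfony trait. This is only a minimal variant under a few assumptions: the name `encodeFourByteChars` is hypothetical, the input is expected to be valid UTF-8, and PHP 7.4+ with the mbstring extension is available (for the arrow function and `mb_ord()`). Instead of matching raw bytes, it uses the `u` modifier to match code points above U+FFFF, which are exactly the ones that need four bytes in UTF-8.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the original snippet): replaces every
 * code point above U+FFFF with its decimal numeric HTML entity.
 */
function encodeFourByteChars(string $input): string
{
    return preg_replace_callback(
        // With the "u" modifier, PCRE matches whole code points instead of bytes.
        '/[\x{10000}-\x{10FFFF}]/u',
        // mb_ord() returns the code point of the matched character as an integer.
        static fn (array $m): string => '&#'.mb_ord($m[0], 'UTF-8').';',
        $input
    ) ?? $input; // preg_replace_callback() returns null on error (e.g. invalid UTF-8): keep the input as is.
}

$encoded = encodeFourByteChars('Hello 🙏 world 🙂!');
echo $encoded.PHP_EOL; // Hello &#128591; world &#128578;!

// When reading the value back, html_entity_decode() restores the original characters.
echo html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8').PHP_EOL; // Hello 🙏 world 🙂!
```

Characters that fit in three bytes or fewer, such as "é" or "€", are left untouched, so only the part of the string that the legacy charset cannot store is rewritten.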
