Article originated from https://medium.com/@hafiqiqmal93/normalizing-fancy-text-to-normal-text-in-laravel-7d9ed56d5a78
Text input from users is rarely as simple as it looks. With Unicode support now standard on smartphones, users have the luxury (and sometimes the whimsy) to input text in a variety of styles and formats. From emojis to diacritics, ligatures to full-width characters, the range of “fancy text” can be confusing or difficult for a system to process. While visually appealing, these text variations pose a significant challenge, particularly in terms of data consistency, searchability, and user experience.
Here is an example of fancy text:
???????? ???? ? ??? ?????????? ????? ?? ??? ????? ??? ?? ?????????? ?? ??????? ???? ?????? ??? ??? ???? ????? ??? ? ?? ???? ?? ????? ??? ??????? ?? ???? ???? ?? ??? ?? ????? ??? ???????? ?????? ????? ?????, ?? ???? ??????? ???? ????..????? ?? ??? ????. ??? ?????? ???? ?? ???? ????? ?????????
It looks like italic text, but it is not italic. The characters actually belong to the Mathematical Alphanumeric Symbols Unicode block.
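To make the distinction concrete, here is a minimal sketch that builds a short fancy string from its code points and compares it with plain ASCII. The sample word "Hello" and the variable name `$fancy` are illustrative assumptions, not taken from the original example.

```php
<?php
// "Hello" spelled with Mathematical Italic letters
// (U+1D43B, U+1D452, U+1D459, U+1D459, U+1D45C). Illustrative sample only.
$fancy = "\u{1D43B}\u{1D452}\u{1D459}\u{1D459}\u{1D45C}";

// The string renders like "Hello" but is not equal to it.
var_dump($fancy === 'Hello');                   // bool(false)

// Each of the five letters takes 4 bytes in UTF-8.
var_dump(strlen($fancy));                       // int(20)

// A byte-wise check for ASCII letters finds none.
var_dump(preg_match('/^[A-Za-z]+$/', $fancy));  // int(0)
```

This is why a search for "Hello", or a validation rule that expects plain letters, will not match such input.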
A very obvious problem is that PHP cannot JSON-encode a string that contains malformed UTF-8. In modern web development, where APIs and frontend frameworks exchange data as JSON, this is a real problem. Handled incorrectly, such characters result in data corruption, crashes, or angry users.
Our goal is simple: come up with a solution that converts fancy text into normal, readable text.
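The JSON failure can be reproduced with a short sketch. The byte sequence below is a hand-made example of malformed UTF-8 (an assumption for illustration, not captured from real user input), and `JSON_INVALID_UTF8_SUBSTITUTE` assumes PHP 7.2 or later.

```php
<?php
// 0xB1 cannot start a UTF-8 character, so this string is malformed.
$malformed = "caf\xB1";

// json_encode() gives up on the whole payload and returns false.
$json = json_encode(['name' => $malformed]);
$error = json_last_error();

var_dump($json);                       // bool(false)
var_dump($error === JSON_ERROR_UTF8);  // bool(true)

// One possible fallback: replace invalid bytes with U+FFFD instead of failing.
$substituted = json_encode(['name' => $malformed], JSON_INVALID_UTF8_SUBSTITUTE);
var_dump(is_string($substituted));     // bool(true)
```

Substitution keeps the response alive but loses information, so it is better treated as a safety net than as a replacement for cleaning the input.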
Unicode normalization forms are pivotal to understanding the normalization process. They cater to different linguistic and technical needs. For instance, NFC composes characters into their precomposed forms, whereas NFD does the opposite, decomposing precomposed characters into their constituent parts (a base letter plus combining marks). NFKC and NFKD go further: they also replace compatibility characters, such as ligatures, full-width letters, and mathematical styled letters, with their plain equivalents. These forms ensure that text comparison, searching, and storage are consistent and reliable.
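The difference between the canonical forms (NFC, NFD) and the compatibility forms (NFKC, NFKD) is easier to see on a few characters. The sketch below assumes the intl extension is installed and enabled; the sample characters and variable names are chosen for illustration.

```php
<?php
// Requires ext-intl (assumption: it is enabled in this PHP build).
$composed   = "\u{00E9}";   // "é" as a single code point
$decomposed = "e\u{0301}";  // "e" followed by a combining acute accent
$ligature   = "\u{FB01}";   // the "fi" ligature as one character
$fullWidth  = "\u{FF21}";   // full-width Latin capital "A"

// NFC composes, NFD decomposes.
var_dump(Normalizer::normalize($decomposed, Normalizer::FORM_C) === $composed);  // bool(true)
var_dump(Normalizer::normalize($composed, Normalizer::FORM_D) === $decomposed);  // bool(true)

// Canonical forms leave compatibility characters untouched.
var_dump(Normalizer::normalize($ligature, Normalizer::FORM_C) === $ligature);    // bool(true)

// Compatibility forms fold them into plain characters.
var_dump(Normalizer::normalize($ligature, Normalizer::FORM_KC));                 // string(2) "fi"
var_dump(Normalizer::normalize($fullWidth, Normalizer::FORM_KC));                // string(1) "A"
```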
The following snippet shows how PHP can solve this problem with little code. Let's dissect the solution, understand its components, and see how it fits together:
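    // Static method, intended to live in a helper class such as Utils.
    public static function normalizeText($text): ?string
    {
        if (!$text) {
            return null;
        }

        // Forms are tried in this order. NFC, NFD, NFKC, NFKD and NFKC_CF
        // are aliases of the matching FORM_* constants.
        $intl = [
            \Normalizer::FORM_C,
            \Normalizer::FORM_D,
            \Normalizer::NFD,
            \Normalizer::FORM_KC,
            \Normalizer::NFKC,
            \Normalizer::FORM_KC_CF,
            \Normalizer::FORM_KD,
            \Normalizer::NFKD,
            \Normalizer::NFC,
            \Normalizer::NFKC_CF,
        ];

        foreach ($intl as $form) {
            // Normalize with the first form the text does not already satisfy.
            if (!\Normalizer::isNormalized($text, $form)) {
                return \Normalizer::normalize($text, $form);
            }
        }

        return $text;
    }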
The usage is simple:

    $normalText = Utils::normalizeText($YOUR_FANCY_STRING);
You can also wrap it in a global helper function to make it easier to use. For example:

    if (! function_exists('normalize_text')) {
        function normalize_text(string $text): ?string
        {
            // normalizeText() returns null for empty input, so the return type is nullable.
            return Utils::normalizeText($text);
        }
    }

    // USAGE
    $normalText = normalize_text($YOUR_FANCY_STRING);
At its core, this function leverages PHP's **Normalizer** class, part of the Internationalization (intl) extension, to handle the normalization. The **Normalizer** class offers several normalization forms, each tailored to different needs. The function iterates through these forms and checks whether the text is already normalized in each one using the **isNormalized** method. At the first form where it is not, the function normalizes the text to that form and returns the result. For fancy text such as mathematical italic letters, the canonical forms (NFC and NFD) report the text as already normalized, so the first form that applies is NFKC, which maps the styled letters to plain ones.
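The walk through the forms can be traced with a small sketch. It assumes the intl extension is enabled, and the sample string and the names `$forms` and `$firstMismatch` are hypothetical, introduced only for this illustration.

```php
<?php
// Requires ext-intl (assumption). "Hello" in Mathematical Italic letters.
$fancy = "\u{1D43B}\u{1D452}\u{1D459}\u{1D459}\u{1D45C}";

// The first three distinct forms, in the order the function tries them.
$forms = [
    'FORM_C'  => Normalizer::FORM_C,
    'FORM_D'  => Normalizer::FORM_D,
    'FORM_KC' => Normalizer::FORM_KC,
];

$firstMismatch = null;
foreach ($forms as $name => $form) {
    $ok = Normalizer::isNormalized($fancy, $form);
    printf("%-8s already normalized: %s\n", $name, $ok ? 'yes' : 'no');

    // Remember the first form the text does not satisfy.
    if (!$ok && $firstMismatch === null) {
        $firstMismatch = $name;
    }
}

var_dump($firstMismatch);                                       // string(7) "FORM_KC"
var_dump(Normalizer::normalize($fancy, Normalizer::FORM_KC));   // string(5) "Hello"
```

Because the function returns at the first mismatch, plain text that is only missing a canonical form (for example a precomposed "é", which is not NFD-normalized) comes back in that form instead. When the goal is only to flatten fancy text, calling `Normalizer::normalize($text, Normalizer::FORM_KC)` directly is a simpler alternative to consider.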
While fancy text may add visual appeal to user input, it poses significant challenges for data processing and system interoperability. However, with the adoption of PHP's Normalizer class and the implementation of normalization forms, developers can overcome these challenges and ensure that their applications maintain data consistency and reliability in the face of diverse text inputs.
Do you have any experiences or challenges related to handling fancy text in your projects? How do you currently address such issues, and do you find PHP's Normalizer class useful in your workflow? Let's continue the conversation and share our insights to help each other navigate the complexities of modern web development.