Introduction to a class library that extends PHP-based Emoji processing methods

Aug 08, 2016 am 09:22 AM

Introduction to Carmela

Carmela provides a set of solutions for processing 4-byte UTF-8 characters, such as the common Emoji symbols, in PHP, PHP extensions, Java, C++ and other languages.

Background

Inserting a UTF-8 string that contains Emoji directly into a database that has not been adjusted for it causes an error. The problem can be avoided by changing the character set of the database and tables to utf8mb4 (for example with the utf8mb4_general_ci collation). However, in many large-scale systems and architectures, changing the database character set can cause many problems, such as PC-side display and compatibility between new and old data. An alternative is to replace the characters before they enter the database and to do the reverse replacement, according to the client type, after they leave it.

Carmela implements this approach as a PHP extension. It replaces UTF-8 characters longer than 3 bytes with UBB tags: for example, the character %f0%9f%91%a4 (shown URL-encoded for ease of display) becomes [u]1f464[/u]. When the data is read back from the database, the reverse substitution is done according to the requesting client (iOS, Android, PC).

The name Carmela comes from "Different Carmela", a series of stories about the adventures of the hen Carmela and her children Carmelido and Carmen. Everyone in the Carmela family is different: they dare to dream, and they dare to try things that others dare not think of.

Installation

  • Compile and package

    git clone https://github.com/ugg/Carmela
    /phpize
    ./configure --with-php-c/php-config-path
    make
    make install

  • Modify the configuration file

    vim /php.ini

  • Add the following content

    [carmela]
    extension=carmela.so

Methods

carmela_str2ubb: converts a string containing Emoji characters into UBB form, so that a character looks like [u]1f464[/u] after replacement. An example:

    $str = urldecode("This is test %F0%9F%98%9C+%F0%9F%98%99 by ugg");
    echo "str:".$str."\n";
    echo "ubb:".carmela_str2ubb($str)."\n";

Output result:

    str:This is test xxxx (Emoji cannot be displayed on CSDN, so XXXX is shown instead) by ugg
    ubb:This is test [u]1f61c[/u] [u]1f619[/u] by ugg

carmela_ubb2str: converts a string containing UBB tags back into a UTF-8 string. For conversion on the PC platform, refer to the carmela_ubb2str method in encode.class.php. An example:

    $str = urldecode("This is test %F0%9F%98%9C+%F0%9F%98%99 by ugg");
    $str = carmela_str2ubb($str);
    echo "ubb:".$str."\n";
    echo "str:".carmela_ubb2str($str)."\n";

Output result:

    ubb:This is test [u]1f61c[/u] [u]1f619[/u] by ugg
    str:This is test (Emoji cannot be displayed on CSDN, so XXXX is shown instead) by ugg

carmela_substr: extracts a specified number of characters from a string containing Emoji characters.

carmela_sububb: extracts a specified number of characters from a string containing UBB tags.

carmela_delstr: deletes Emoji characters from a string (non-strict mode: 3-byte Emoji characters cannot be deleted; mainly used in some ...).

carmela_delubb: deletes UBB tags from a string containing UBB tags.

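The conversion performed by carmela_str2ubb and carmela_ubb2str can be approximated in plain PHP, which may help when the extension is not installed or when porting the idea to another language. The following is a minimal sketch, not the extension's actual code: the function names sketch_str2ubb and sketch_ubb2str are hypothetical, and the only behavior taken from the article is the [u]hex[/u] tag format with a lowercase hexadecimal code point.

```php
<?php
// Hypothetical pure-PHP stand-ins for illustration only;
// these are NOT the functions shipped by the Carmela extension.

// Replace every 4-byte UTF-8 character with [u]<hex code point>[/u].
function sketch_str2ubb(string $s): string
{
    // With the /u flag, this class matches code points U+10000..U+10FFFF,
    // which are exactly the characters encoded as 4 bytes in UTF-8.
    return preg_replace_callback('/[\x{10000}-\x{10FFFF}]/u', function (array $m): string {
        $b = array_values(unpack('C4', $m[0]));
        // Decode the 4-byte sequence: 3 payload bits, then 6 + 6 + 6.
        $cp = (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
            | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
        return '[u]' . dechex($cp) . '[/u]';
    }, $s) ?? $s; // preg returns null on invalid UTF-8; fall back to the input
}

// Replace every [u]<hex>[/u] tag with the 4-byte UTF-8 character it names.
function sketch_ubb2str(string $s): string
{
    return preg_replace_callback('/\[u\]([0-9a-f]{5,6})\[\/u\]/', function (array $m): string {
        $cp = hexdec($m[1]);
        if ($cp < 0x10000 || $cp > 0x10FFFF) {
            return $m[0]; // not a 4-byte code point: leave the tag untouched
        }
        return chr(0xF0 | ($cp >> 18))
             . chr(0x80 | (($cp >> 12) & 0x3F))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }, $s) ?? $s;
}

$str = urldecode("This is test %F0%9F%98%9C+%F0%9F%98%99 by ugg");
$ubb = sketch_str2ubb($str);
echo "ubb:" . $ubb . "\n"; // ubb:This is test [u]1f61c[/u] [u]1f619[/u] by ugg
echo "round trip ok: " . var_export(sketch_ubb2str($ubb) === $str, true) . "\n";
```

Like the non-strict mode described for carmela_delstr, this sketch only touches 4-byte characters, so Emoji that are encoded in 3 bytes pass through unchanged.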
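Performance

Two methods were implemented in pure PHP, one using PHP's str_replace and one that scans for four-byte emoji and replaces them, alongside the PHP extension. All three were tested with the same data; the results are as follows.

    =========================== Option 1: PHP str_replace =========================
    =========== EMOJI TO STRING ==========
    TIME: 781.94ms, lines processed: 100, characters processed: 10100, bytes processed: 31028
    Average time per line: 7.819ms
    =========== STRING TO EMOJI ==========
    TIME: 118.566ms, lines processed: 100, characters processed: 18710, bytes processed: 37793
    Average time per line: 1.186ms
    =========================== Option 2: PHP character search =========================
    =========== EMOJI TO STRING ==========
    TIME: 51.526ms, lines processed: 100, characters processed: 10100, bytes processed: 31028
    Average time per line: 0.515ms
    =========== STRING TO EMOJI ==========
    TIME: 27.959ms, lines processed: 100, characters processed: 23092, bytes processed: 41236
    Average time per line: 0.28ms
    =========================== Option 3: PHP extension =========================
    =========== EMOJI TO STRING ==========
    TIME: 0.721ms, lines processed: 100, characters processed: 10100, bytes processed: 31028
    Average time per line: 0.007ms
    =========== STRING TO EMOJI ==========
    TIME: 0.956ms, lines processed: 100, characters processed: 20308, bytes processed: 38452
    Average time per line: 0.01ms

These results show that the str_replace approach performs very poorly. Writing the replacement function directly in PHP is more than 10 times faster (51.526 ms versus 781.94 ms for emoji to UBB). The extension is faster still: when converting emoji from character form to UBB form, it is about 1000 times faster than str_replace (0.721 ms versus 781.94 ms).

The test data can be generated dynamically with create_file.php. This test case generates 100 lines of data with 100 characters per line, of which 3 to 10 may be emoji characters. Run benchmark.php directly to see the performance.

Principle

The principle behind handling four-byte emoji is very simple: find the emoji characters by character comparison and replace them. The difficulty lies in improving performance on top of this basic principle, that is, in searching and replacing quickly. The PHP extension offers one approach, which can serve as a reference for implementing Java, C#, JS and other versions.

How can the PC display Emoji?

Find the images directory under the emoji directory in the project directory, create an emoji folder under the web root directory, and copy the whole images folder into it. Then call the carmela_ubb2str method in encode.class.php, Util_Encode::carmela_ubb2str($str, "PC");, to display Emoji on the PC. 845 emoji have been collected so far, and some newer emoji are not included. Note that this method has not yet been written into the PHP extension, so its performance is relatively low.

Contact ugg.xchj@gmail.com for all questions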
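As a rough illustration of the PC display step, a UBB tag can be turned into an image reference with a single regular-expression pass. This is only a sketch and not the Util_Encode::carmela_ubb2str implementation: the function name sketch_ubb2img is hypothetical, and it assumes that each image under /emoji/images/ is named after its lowercase hexadecimal code point with a .png extension, which the article does not state.

```php
<?php
// Hypothetical helper for illustration; NOT Util_Encode::carmela_ubb2str.
// Assumption: images live at /emoji/images/<hex code point>.png.
function sketch_ubb2img(string $s, string $baseUrl = '/emoji/images/'): string
{
    // Escape the text first so stored user input cannot inject markup.
    // The UBB tags contain no HTML-special characters, so they survive intact.
    $safe = htmlspecialchars($s, ENT_QUOTES, 'UTF-8');

    // Then swap each [u]hex[/u] tag for an <img> element.
    return preg_replace_callback('/\[u\]([0-9a-f]{4,6})\[\/u\]/', function (array $m) use ($baseUrl): string {
        return '<img class="emoji" src="' . $baseUrl . $m[1] . '.png" alt="">';
    }, $safe) ?? $safe;
}

echo sketch_ubb2img('This is test [u]1f61c[/u] by <ugg>'), "\n";
// This is test <img class="emoji" src="/emoji/images/1f61c.png" alt=""> by &lt;ugg&gt;
```

Escaping before substitution keeps the generated img elements intact while still neutralizing any markup in the stored text.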
