In PHP, detecting the language of a UTF-8 string is a common task. One versatile solution is the Text_LanguageDetect PEAR package.
The package is simple to use and ships with a database of 52 languages. However, it does not support East Asian languages.
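To use the Text_LanguageDetect package, include Text/LanguageDetect.php, create a Text_LanguageDetect instance, and pass the string to its detect() method.
If detection succeeds, detect() returns an array of candidate languages with their confidence scores. Otherwise it returns a PEAR error object whose message can be displayed.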
Consider the following example:
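<?php
require_once 'Text/LanguageDetect.php';

$l = new Text_LanguageDetect();

// Ask for at most 4 candidate languages.
$result = $l->detect("Hallo Welt", 4);

if (PEAR::isError($result)) {
    // Detection failed: show the error message.
    echo $result->getMessage();
} else {
    // Detection succeeded: dump the language => score pairs.
    print_r($result);
}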
This code detects the language of the string "Hallo Welt". The second argument, 4, limits the result to the four most likely languages. The returned array maps each language name to its confidence score and may look like this:
Array
(
    [german] => 0.407037037037
    [dutch] => 0.288065843621
    [english] => 0.283333333333
    [danish] => 0.234526748971
)
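For a string as short as "Hallo Welt" the scores can be close together, so it may be worth checking how far the top candidate is ahead before trusting it. The following is a minimal sketch that works on a plain score array like the one above and does not call the package itself. The function name pickLanguage and the 0.05 margin are assumptions made for illustration, not part of Text_LanguageDetect.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Text_LanguageDetect): return the
 * highest-scoring language, or null when the lead over the runner-up
 * is smaller than $minMargin.
 */
function pickLanguage(array $scores, float $minMargin = 0.05): ?string
{
    if ($scores === []) {
        return null;
    }

    // Sort by score, highest first, keeping the language names as keys.
    arsort($scores);
    $languages = array_keys($scores);
    $values    = array_values($scores);

    // With a single candidate there is nothing to compare against.
    if (count($values) === 1) {
        return $languages[0];
    }

    // Reject the result when the top two candidates are too close.
    return ($values[0] - $values[1]) >= $minMargin ? $languages[0] : null;
}

// Sample scores copied from the output shown above.
$result = [
    'german'  => 0.407037037037,
    'dutch'   => 0.288065843621,
    'english' => 0.283333333333,
    'danish'  => 0.234526748971,
];

echo pickLanguage($result) ?? 'undetermined', PHP_EOL; // prints "german"
```

With the sample scores, German leads Dutch by about 0.12, so the helper returns "german". Raising the margin to 0.2 would make it return null, which the caller can treat as "undetermined". A suitable margin depends on the typical text length and is best tuned on real input.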