PHP code for Chinese word segmentation

WBOY
Release: 2016-07-20 11:10:01

I used the dedecms word segmentation function before, but in testing the results were not ideal and only became acceptable after some extra processing. Today I came across this word segmentation method again, so I am sharing it here.

<?php
class NLP {
    // Directory that contains the ictclas executable; does not end with '/'
    private static $cmd_path;

    public static function set_cmd_path($path) {
        self::$cmd_path = $path;
    }

    // Send $str to ictclas through a pipe and return its output as UTF-8.
    private static function cmd($str) {
        $descriptorspec = array(
            0 => array("pipe", "r"), // stdin of the child process
            1 => array("pipe", "w"), // stdout of the child process
        );
        $cmd = self::$cmd_path . "/ictclas";
        $output = '';
        $process = proc_open($cmd, $descriptorspec, $pipes);
        if (is_resource($process)) {
            // The program is fed GBK here, so convert on the way in and out.
            $str = iconv('utf-8', 'gbk', $str);
            fwrite($pipes[0], $str);
            // Close stdin before reading so the program sees end of input.
            fclose($pipes[0]);
            $output = stream_get_contents($pipes[1]);
            fclose($pipes[1]);
            $return_value = proc_close($process);
        }
        /*
        $cmd = "printf '$str' | " . self::$cmd_path . "/ictclas";
        exec($cmd, $output, $ret);
        $output = join("\n", $output);
        */
        $output = trim($output);
        $output = iconv('gbk', 'utf-8', $output);
        return $output;
    }

    /**
     * Perform word segmentation and return a word list.
     */
    public static function tokenize($str) {
        $tokens = array();
        $output = self::cmd($str);
        if ($output) {
            $ps = preg_split('/\s+/', $output);
            foreach ($ps as $p) {
                // Each item has the form "word/tag"; pad so a missing tag
                // does not raise an undefined-offset notice.
                list($seg, $tag) = array_pad(explode('/', $p), 2, '');
                $item = array(
                    'seg' => $seg,
                    'tag' => $tag,
                );
                $tokens[] = $item;
            }
        }
        return $tokens;
    }
}
NLP::set_cmd_path(dirname(__FILE__));
?>
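
The parsing step in tokenize() can be tried without the ICTCLAS binary. The sketch below is only an illustration: parse_tagged() is a hypothetical helper, not part of the class above, and the sample line is made up rather than real ICTCLAS output. It splits each item at the last '/' instead of the first, on the assumption that a token could itself be a slash.

```php
<?php
// Hypothetical helper: turn "word/tag word/tag ..." into seg/tag pairs.
function parse_tagged(string $output): array
{
    $tokens = [];
    // Split on runs of whitespace and drop empty pieces.
    $parts = preg_split('/\s+/u', trim($output), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    foreach ($parts as $part) {
        // Split at the LAST slash so a token that is itself "/" survives.
        $pos = strrpos($part, '/');
        if ($pos === false) {
            $tokens[] = ['seg' => $part, 'tag' => ''];
        } else {
            $tokens[] = [
                'seg' => substr($part, 0, $pos),
                'tag' => substr($part, $pos + 1),
            ];
        }
    }
    return $tokens;
}

// Made-up sample line for illustration; the tags are placeholders.
$tokens = parse_tagged("中文/n 分词/v //w");
print_r($tokens);
```

Whether the extra care for slash tokens matters depends on what the segmenter actually emits, so treat it as an optional refinement.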

It is very simple to use (make sure the compiled ICTCLAS executable and its dictionary are in the same directory as NLP.php, since that is where the class looks for the executable):
<?php
require_once('NLP.php');
var_dump(NLP::tokenize('Hello, World!'));
?>

The PHP class above uses the proc_open() function to run the word segmentation program and interacts with it through pipes: it writes the text to be segmented to the program's standard input and reads the segmentation result from its standard output.
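
The pipe interaction can be tried on its own with a stand-in program. This is a minimal sketch under some assumptions: it runs from the PHP command line (so PHP_BINARY points at the CLI binary), PHP is 7.4 or later (for the array form of the command), and run_with_stdin() is a hypothetical helper name. A child PHP process that upper-cases its input plays the role of the segmenter.

```php
<?php
// Hypothetical helper: run a command, feed it $input on stdin, return its stdout.
function run_with_stdin(array $command, string $input): ?string
{
    $descriptorspec = [
        0 => ['pipe', 'r'], // the child reads its stdin from this pipe
        1 => ['pipe', 'w'], // the child writes its stdout to this pipe
    ];
    $process = proc_open($command, $descriptorspec, $pipes);
    if (!is_resource($process)) {
        return null;
    }
    fwrite($pipes[0], $input);
    fclose($pipes[0]); // closing stdin tells the child there is no more input
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($process);
    return $output === false ? null : $output;
}

// Stand-in for the segmenter: a child PHP process that upper-cases its stdin.
$child = [PHP_BINARY, '-r', 'echo strtoupper(file_get_contents("php://stdin"));'];
$result = run_with_stdin($child, "hello pipes");
echo $result, "\n";
```

Writing everything and then reading is fine for short inputs like this one. For large texts the two pipes can fill up and block each other, so a more careful version would interleave the reads and writes.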


source:php.cn