java - How can a script determine whether a run of consecutive Chinese characters is a person's name?
阿神 2017-04-17 13:47:43

冷狐毕军,
高李阳子,
闻人共建,
欧阳新成,
徐姜敏然,
某家公司,
欧阳伟强,
石戴菲子,
朱为准,
徐海峰,
王潇荔,
种亚男,
付义平,
鲁雅萍,
... ...

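Given a list like the one above, how can a script decide whether each line is a person's name, and automatically delete the lines that are not?
Any language is fine!
My idea: write the Hundred Family Surnames (百家姓) into a file, then take the first character of the Chinese string to be checked and compare it with the first character of every surname in the list. If one matches, the second step goes the other way: take that surname's complete string, say it is n characters long, and compare it with the first n characters of the string being checked. If that also matches, treat the string as a Chinese name by default!
Is this hard to implement?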
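
The two steps described above could be sketched roughly as follows. This is only a minimal illustration, not a tested solution: `SURNAMES` is a small hypothetical sample standing in for the full 百家姓 file, and the function names are made up for this example. Candidates that share a first character are tried longest first, so a compound surname such as 欧阳 is preferred over 欧.

```python
from collections import defaultdict

# Hypothetical sample; a real run would load the full 百家姓 list from a file.
SURNAMES = ["欧阳", "欧", "闻人", "高", "徐", "朱", "王", "种", "付", "鲁", "石", "冷"]


def build_index(surnames):
    """Step one: group surnames by their first character."""
    index = defaultdict(list)
    for surname in surnames:
        index[surname[0]].append(surname)
    # Longest first, so compound surnames are tried before single characters.
    for candidates in index.values():
        candidates.sort(key=len, reverse=True)
    return index


def match_surname(text, index):
    """Step two: return the longest known surname that prefixes text, or None."""
    if not text:
        return None
    for surname in index.get(text[0], []):
        # Require at least one character left over for the given name.
        if text.startswith(surname) and len(text) > len(surname):
            return surname
    return None


def filter_names(lines, index):
    """Keep only the lines whose text starts with a known surname."""
    kept = []
    for line in lines:
        # Strip whitespace and a trailing ASCII or full-width comma.
        candidate = line.strip().rstrip(",,")
        if match_surname(candidate, index):
            kept.append(line)
    return kept


if __name__ == "__main__":
    index = build_index(SURNAMES)
    for line in filter_names(["欧阳新成,", "某家公司,", "朱为准,"], index):
        print(line)
```

Note that this checks the surname only, so any string that happens to begin with a surname character (a company name starting with 王, for instance) would still be kept.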


All replies (3)
左手右手慢动作

The surname is easy to handle. The hard part is judging the given name. The logic is fiddly, but the implementation itself is simple.
Also, surnames are not always a single character; there are compound surnames as well.
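
To illustrate why compound surnames and the given name complicate things, here is a small sketch that lists every possible (surname, given name) split of a string. It rests on assumptions that are not from this thread: `SURNAMES` is a tiny hypothetical sample, and `max_given_len=2` assumes a given name has one or two characters.

```python
# Hypothetical sample surname set; not the full 百家姓.
SURNAMES = {"欧", "欧阳", "闻人", "徐", "朱"}
MAX_SURNAME_LEN = max(len(s) for s in SURNAMES)


def candidate_splits(text, max_given_len=2):
    """List every (surname, given_name) split whose surname is known.

    max_given_len is an assumed limit on the given-name length.
    """
    splits = []
    # Leave at least one character for the given name.
    for n in range(1, min(MAX_SURNAME_LEN, len(text) - 1) + 1):
        surname, given = text[:n], text[n:]
        if surname in SURNAMES and 1 <= len(given) <= max_given_len:
            splits.append((surname, given))
    return splits


if __name__ == "__main__":
    print(candidate_splits("欧阳新成"))  # only the compound-surname split fits
    print(candidate_splits("欧阳伟"))    # ambiguous: both 欧 and 欧阳 are possible
    print(candidate_splits("徐姜敏然"))  # rejected under the length assumption
```

Under these assumptions a string like 徐姜敏然 from the question is rejected, because 徐 would leave a three-character given name. That is one reason a simple length rule on the given name is not enough.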

洪涛

It can't be determined.
I could name my son "How to use", "script judgment", "break a few", "continuous", "Chinese characters"... and every one of those would fully comply with Chinese law.

黄舟

You need a Named Entity Recognizer (NER).
For example: http://nlp.stanford.edu/software/CRF-NER.shtml. It says "Chinese models built from the Ontonotes Chinese named entity data", but I have not confirmed how well it works.
