Chinese name self-expanding recognition method based on search logs

A self-expanding recognition method, applied in special data processing applications, instruments, electrical digital data processing, etc., that addresses problems such as the low efficiency of person-name recognition, improving the recognition rate and reducing noise information.

Active Publication Date: 2016-12-21
BEIJING INFORMATION SCI & TECH UNIV

AI Technical Summary

Problems solved by technology

[0005] In order to solve problems such as low efficiency of name recognition in current search logs, the present invention provides a self-expanding recognition method for Chinese names based on search logs. The method includes the following steps:



Examples


Embodiment Construction

[0024] To meet the current demand for precise search and to solve the problem of name recognition in search queries, the embodiment of the present invention provides a method for recognizing Chinese names based on search logs. Following a self-expanding recognition approach, name templates are constructed from seed names; the name contexts are screened according to the trend in each template's ranking between the query strings containing the seed names and the query strings of the entire target corpus; and candidate names are delimited by pattern matching. This reduces the noise information in name recognition and improves the recognition rate.
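The template construction and screening described in [0024] can be illustrated with a minimal Python sketch. It is not the patented procedure: the `<NAME>` slot marker, the function names, the sample queries, and the scoring rule are all assumptions made for illustration. In particular, the "ranking change trend" is approximated here by a simple ratio between how often a context occurs with a seed name and how often it occurs anywhere in the log.

```python
from collections import Counter

SLOT = "<NAME>"  # hypothetical marker for the position a name occupied


def extract_templates(queries, seed_names):
    """Count templates formed by replacing a seed name in a query with SLOT."""
    counts = Counter()
    for query in queries:
        for name in seed_names:
            if name in query and query != name:
                counts[query.replace(name, SLOT, 1)] += 1
    return counts


def context_frequency(queries, template):
    """Count queries in the whole log that fit the template with any filler."""
    prefix, suffix = template.split(SLOT, 1)
    return sum(
        1
        for q in queries
        if q.startswith(prefix)
        and q.endswith(suffix)
        and len(q) > len(prefix) + len(suffix)
    )


def screen_templates(queries, seed_names, top_n=2):
    """Keep the top_n templates whose context is most specific to seed names.

    Assumed score: (matches with a seed name) / (matches in the whole log).
    """
    scored = []
    for template, seed_count in extract_templates(queries, seed_names).items():
        total = context_frequency(queries, template)
        scored.append((seed_count / total, template))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [template for _, template in scored[:top_n]]


# Invented sample log, for illustration only.
queries = ["张伟 简历", "王芳 简历", "李娜 简历", "张伟 图片",
           "风景 图片", "汽车 图片", "美食 图片"]
seeds = {"张伟", "王芳"}
best = screen_templates(queries, seeds, top_n=1)
```

In this invented log the context " 简历" (résumé) co-occurs mostly with names, while " 图片" (pictures) follows many non-name queries, so the ratio favors the former. A real system would need a scoring rule matching the source's ranking-trend criterion, which is not spelled out in this excerpt.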

[0025] In order to make the purpose, technical method and advantages of the embodiments of the present invention clearer, the technical solutions provided by the embodiments of the present invention will be described in detail below in conjunction with the accompanyi...


Abstract

The invention belongs to the field of natural language processing in computational linguistics and discloses a self-expanding recognition method for Chinese names based on search logs. The method uses the characteristic that a query string in the search logs may begin with a family name, and mines seed names with a "family-name-driven" recognition approach. The seed names are used to generate a set of candidate name templates from the search logs, and the templates are screened according to how the frequency of each candidate template changes between the corresponding query strings and the whole search log. Candidate names are generated from the name templates, then delimited and screened using forward and reverse keywords to obtain a name set. The top n templates with the highest degree of discrimination serve as seed templates for the next iteration, so that the names in the search logs are mined progressively. Because seed names and name templates are built from the characteristics of the search logs, and templates are filtered by the change in name contexts between the corresponding query strings and the whole log, noise during name recognition is reduced and the name recognition rate in search logs is improved.
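The iterative loop in the abstract (seed names, templates, candidate names, next round) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the disclosed algorithm: seeds are taken to be whole queries of 2 to 3 characters that begin with a surname, templates are ranked only by how many known names they surround (standing in for the frequency-trend screening), and "reverse keywords" are modeled as a plain exclusion set. The surname list, keyword set, function names, and sample queries are all invented.

```python
from collections import Counter


def mine_seed_names(queries, surnames):
    """Assumed seed rule: a whole query of 2-3 characters starting with a surname."""
    return {q for q in queries if 2 <= len(q) <= 3 and q[0] in surnames}


def top_templates(queries, names, top_n):
    """Rank (prefix, suffix) contexts by the number of (query, name) matches."""
    support = Counter()
    for q in queries:
        for name in names:
            if name in q and q != name:
                prefix, _, suffix = q.partition(name)
                support[(prefix, suffix)] += 1
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], kv[0]))
    return [template for template, _ in ranked[:top_n]]


def candidates_from(queries, templates, surnames, reverse_keywords):
    """Fill each template from the log and keep fillers that look like names."""
    found = set()
    for q in queries:
        for prefix, suffix in templates:
            if (q.startswith(prefix) and q.endswith(suffix)
                    and len(q) > len(prefix) + len(suffix)):
                filler = q[len(prefix):len(q) - len(suffix)]
                if (2 <= len(filler) <= 3 and filler[0] in surnames
                        and filler not in reverse_keywords):
                    found.add(filler)
    return found


def expand_names(queries, surnames, reverse_keywords, top_n=3, max_rounds=5):
    """Alternate template generation and name extraction until nothing new appears."""
    names = mine_seed_names(queries, surnames) - reverse_keywords
    for _ in range(max_rounds):
        templates = top_templates(queries, names, top_n)
        new = candidates_from(queries, templates, surnames, reverse_keywords) - names
        if not new:
            break
        names |= new
    return names


# Invented sample data, for illustration only.
surnames = {"张", "王", "李", "刘"}
reverse_keywords = {"张家界"}  # a place name that happens to start with a surname
queries = ["张伟", "张伟 简历", "张伟 微博", "李娜 简历", "王芳 微博",
           "李娜 照片", "刘洋 照片", "张家界 微博", "天气 简历"]
names = expand_names(queries, surnames, reverse_keywords)
```

With this sample log the first round learns the contexts " 简历" and " 微博" from the single seed and picks up two more names; only in the second round does " 照片" become a template, which is the self-expanding behavior the abstract describes. The exclusion set keeps a surname-initial place name out of the result.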

Description

Technical field

[0001] The invention belongs to the field of natural language processing in computational linguistics, and in particular relates to a self-expanding recognition method for Chinese personal names based on search logs.

Background technique

[0002] With the rapid growth of network information, search engines have become increasingly significant. Chinese search engines now have a huge number of users, process hundreds of millions of requests every day, and accumulate large-scale query logs. Named entities make up a large proportion of search logs. According to statistics reported by researchers, 2~4% of the web search queries updated every day consist of a personal name alone, and about 30% of queries contain a name; after annotating 76,717 query strings, researchers found 961 queries in which names appeared, with a total frequency of 6,245, accounting for 8.14% of the to...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/3331
Inventor: 吕学强, 文彬
Owner: BEIJING INFORMATION SCI & TECH UNIV