A name disambiguation method applied to Web person search

A person-name disambiguation technique for Web person search. It relies on the observations that the social circles of different people with the same name rarely overlap and that such people rarely share the same occupation, and it improves disambiguation accuracy.
CN109815401A · Inactive · Publication Date: 2019-05-28 · 四川易诚智讯科技有限公司 +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: 四川易诚智讯科技有限公司
Publication Date: 2019-05-28
Estimated Expiration: Not applicable · inactive patent


Abstract

The invention discloses a name disambiguation method applied to Web person search, comprising the following steps: S1, extracting the HTML webpage source code and removing from it noise irrelevant to person information; S2, extracting a person-webpage feature set; S3, generating, from the person-webpage feature set extracted in step S2, a combined feature vector representing a webpage related to a given person; S4, performing hierarchical clustering with an agglomerative hierarchical clustering algorithm to obtain the person-webpage clustering result. By introducing an n-gram capitalization model, the method overcomes a limitation of traditional named entity recognition, namely that named-entity extraction is restricted and cannot extract many of the specialized terms in a text. The extracted features are assigned different weights according to how important each is for representing a person, which improves name disambiguation accuracy.
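Steps S1 to S4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the feature types, the weights in `WEIGHTS`, the reading of the capitalization-based n-gram model as "n-grams over runs of consecutive capitalized words", the cosine similarity, the average linkage, and the 0.6 stopping threshold are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """S1: keep visible text, drop <script>/<style> content as noise."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_html(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def capitalized_ngrams(text, max_n=3):
    """S2 (assumed reading): n-grams over runs of consecutive capitalized words."""
    tokens = re.findall(r"[A-Za-z]+|[^\sA-Za-z]", text)
    runs, current = [], []
    for tok in tokens:
        if tok[0].isupper():
            current.append(tok)
        else:  # a lowercase word or punctuation ends the current run
            if current:
                runs.append(current)
            current = []
    if current:
        runs.append(current)
    grams = []
    for run in runs:
        for n in range(1, max_n + 1):
            for i in range(len(run) - n + 1):
                grams.append(" ".join(run[i:i + n]))
    return grams


# Hypothetical weights: capitalized n-grams count more than plain words.
WEIGHTS = {"cap": 2.0, "word": 1.0}


def feature_vector(html):
    """S3: combine weighted features into one sparse vector per page."""
    text = strip_html(html)
    vec = Counter()
    for gram in capitalized_ngrams(text):
        vec[("cap", gram)] += WEIGHTS["cap"]
    for word in re.findall(r"[a-z]+", text.lower()):
        vec[("word", word)] += WEIGHTS["word"]
    return vec


def cosine(a, b):
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def agglomerative(vectors, threshold=0.6):
    """S4: average-linkage agglomerative clustering with a similarity cutoff."""
    clusters = [[i] for i in range(len(vectors))]

    def link(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, (a, b) = max(
            ((link(clusters[a], clusters[b]), (a, b))
             for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda item: item[0],
        )
        if best < threshold:
            break
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return clusters
```

Each returned cluster is a list of page indices that the sketch attributes to one person. Sentence-initial words are also capitalized, so a real feature extractor would need to filter them; the sketch leaves that out for brevity.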

Description

Technical field

[0001] The invention relates to a name disambiguation method applied to Web person search.

Background art

[0002] With the advent of the mobile Internet era, search engines have become an important tool for acquiring knowledge, and searching the Internet for information about a person is very common. According to statistics, about 5%-10% of search engine queries involve names, and fewer than 20% of people are willing to add extra information when searching for a name. At the same time, personal names are highly ambiguous: according to a report of the US Census Bureau, 100 million people share only 90,000 different names. A name query to a search engine therefore returns a mixture of webpages about several different people with the same name, and "celebrity" webpages tend to crowd out "non-celebrity" ones. For example, when searching Google for "Michael Jordan", the resu...
