Data disambiguation method and apparatus, and computer device

A data disambiguation technology, applied in the field of data processing, that addresses problems such as incomplete data category identification, high labor cost, and poor data disambiguation results

Active Publication Date: 2018-01-19
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0002] In related technologies, machine learning methods and dictionaries are generally used to disambiguate data categories, or named entity recognition technology is used to identify categories such as person names, place names, and organization names. As a result, data category identification is not comprehensive enough and is not combined with actual scenarios, it consumes substantial labor cost, and the data disambiguation effect is poor.



Examples


Embodiment Construction

[0027] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents falling within the spirit and scope of the appended claims.

[0028] Figure 1 is a schematic flowchart of a data disambiguation method according to an embodiment of the present invention.

[0029] Referring to Figure 1, the method includes:

[0030] S11: Construct training data.

[0031] In this embodiment of the present invention, proper names are used as an example of the training data; the invention is not limited to this.

[0032] The proper name ca...


Abstract

The invention provides a data disambiguation method and apparatus, and a computer device. The method comprises: labeling each piece of data in the training data against the to-be-classified types, to obtain multiple pieces of first data labeled as belonging to the to-be-classified types and multiple pieces of second data labeled as not belonging to them; determining, based on a user click behavior log, a feature related to each piece of first data and a feature related to each piece of second data as a first feature and a second feature respectively; and training on the labels corresponding to each piece of first data and each piece of second data according to the first feature and the second feature. With the method and the apparatus, the data in the user click behavior log can be mined in depth, referable data can be extracted for analysis, and scenarios can be combined from multiple directions, so that data disambiguation accuracy is greatly improved, the time and expense of data disambiguation are reduced, and the effect of automatic data disambiguation is improved at lower cost.
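The steps in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy click log, the "movie title" type, the choice of click-category shares as the features, and the use of logistic regression as the trained model. The source states only that features are derived from a user click behavior log and used to train on the labels.

```python
import math
from collections import Counter, defaultdict

# Hypothetical click log: (query text, category of the clicked result).
CLICK_LOG = [
    ("apple", "shopping"), ("apple", "shopping"), ("apple", "recipes"),
    ("harry potter", "video"), ("harry potter", "video"), ("harry potter", "books"),
    ("inception", "video"), ("inception", "video"),
    ("python", "programming"), ("python", "programming"), ("python", "pets"),
    ("banana", "recipes"), ("banana", "shopping"),
]

# Training data labeled against one assumed to-be-classified type ("movie title"):
# 1 = first data (belongs to the type), 0 = second data (does not belong).
LABELS = {"harry potter": 1, "inception": 1, "apple": 0, "python": 0, "banana": 0}


def click_features(log):
    """Per-query share of clicks that landed in each result category."""
    counts = defaultdict(Counter)
    for query, category in log:
        counts[query][category] += 1
    return {
        q: {cat: n / sum(c.values()) for cat, n in c.items()}
        for q, c in counts.items()
    }


def predict(weights, bias, feats):
    """Logistic score that an item belongs to the to-be-classified type."""
    z = bias + sum(weights.get(k, 0.0) * v for k, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = defaultdict(float), 0.0
    for _ in range(epochs):
        for query, y in labels.items():
            feats = features[query]
            err = predict(weights, bias, feats) - y
            bias -= lr * err
            for k, v in feats.items():
                weights[k] -= lr * err * v
    return weights, bias


features = click_features(CLICK_LOG)
weights, bias = train(features, LABELS)
```

Calling `predict(weights, bias, features["harry potter"])` then returns a score above 0.5 and `predict(weights, bias, features["apple"])` a score below 0.5, because in this toy log only the positively labeled items attract clicks on video results.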

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a data disambiguation method and apparatus, and a computer device.

Background

[0002] In related technologies, machine learning methods and dictionaries are generally used to disambiguate data categories, or named entity recognition technology is used to identify categories such as person names, place names, and organization names. As a result, data category identification is not comprehensive enough and is not combined with actual scenarios, it consumes substantial labor cost, and the data disambiguation effect is poor.

Summary of the invention

[0003] The present invention aims to solve, at least to a certain extent, one of the technical problems in the related art.

[0004] Therefore, an object of the present invention is to propose a data disambiguation method, which can realize in-depth mining of the data of the user's click behavior log, extract the refer...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
Inventor: 刘琼琼
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD