Data generalization method and device, equipment and medium

A data generalization technology, applied in the field of data processing, that addresses inappropriate generalization of retrieval items and avoids unreasonable generalization results.

Active Publication Date: 2019-05-10
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

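In most cases "who" and "which person" are equivalent. But for a query such as "who is the champion of the 2018 Football World Cup", it is obviously inappropriate to generalize it to "which person is the champion of the 2018 Football World Cup", because the champion is a team rather than a person.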

Examples

Embodiment 1

[0028] Figure 1 is a flowchart of the data generalization method provided by Embodiment 1 of the present invention. This embodiment is applicable to the generalization of retrieval items. The method can be performed by a data generalization device, and the device can be implemented in software and/or hardware. Referring to Figure 1, the data generalization method provided by this embodiment includes:

[0029] S110. Group the retrieval item set, which includes the target retrieval item to be generalized and the historical retrieval items, according to the words in each retrieval item.

[0030] Here, the historical retrieval items include retrieval items (queries) previously entered by users.
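
The grouping in S110 can be pictured with a minimal sketch. The excerpt does not spell out the grouping rule, so the code below assumes one simple possibility: each word defines a group, and the target's group is every historical item that shares at least one word with it. The names `tokenize`, `group_by_words`, and `target_group` are hypothetical, and whitespace splitting stands in for real word segmentation.

```python
from collections import defaultdict


def tokenize(item):
    # Assumption: whitespace splitting stands in for real word segmentation.
    return item.lower().split()


def group_by_words(items):
    """Build a word -> retrieval items index; each word defines one group."""
    groups = defaultdict(set)
    for item in items:
        for word in set(tokenize(item)):
            groups[word].add(item)
    return groups


def target_group(target, historical):
    """Collect historical items sharing at least one word with the target."""
    groups = group_by_words([target, *historical])
    candidates = set()
    for word in set(tokenize(target)):
        candidates |= groups[word]
    candidates.discard(target)  # the target is not its own generalization
    return candidates
```

Indexing by word keeps the later comparison step cheap: only items that land in the same group as the target need to be compared with it, rather than the whole query history.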

[0031] In the process of implementing the present invention, the inventor found that, measured over a one-month period, the queries that are searched on an average day account for no more than 1% of all queries. Queries with an average daily search count of no more than 1 are general...

Embodiment 2

[0051] Figure 2 is a flowchart of the data generalization method provided by Embodiment 2 of the present invention. This embodiment is an optional solution proposed on the basis of the foregoing embodiments. Referring to Figure 2, the data generalization method provided by this embodiment includes:

[0052] S210. Unify the synonyms and/or the words identifying the same entity that appear in the retrieval items.

[0053] Here, words identifying the same entity may be aliases of that entity.

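[0054] For example, given the retrieval items "which era does Li Bai belong to" and "which dynasty did Shixian live in", the unified retrieval items are "which dynasty does Li Bai belong to" and "which dynasty did Li Bai live in". Here "era" and "dynasty" are treated as synonyms, and "Shixian" (诗仙, "Poet Immortal") is an alias of Li Bai.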
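
The unification in S210 can be sketched as a table lookup applied word by word. This is only an illustration under stated assumptions: the `SYNONYMS` and `ALIASES` tables and the `unify` function are hypothetical, whitespace splitting stands in for word segmentation, and the importance-based filtering mentioned in the following paragraphs is not modeled.

```python
# Hypothetical lookup tables; a real system would load these from a
# synonym dictionary and an entity alias table.
SYNONYMS = {"era": "dynasty"}
ALIASES = {"shixian": "li bai"}


def unify(item):
    """Map each word to its canonical form, checking aliases before synonyms."""
    words = item.lower().split()
    unified = [ALIASES.get(word, SYNONYMS.get(word, word)) for word in words]
    return " ".join(unified)
```

Applied to the two retrieval items in the example, `unify` maps both onto wording that uses "dynasty" and "li bai", so that the later grouping step sees them as sharing more words.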

[0055] To avoid introducing too many unification errors, unifying the synonyms and/or the words identifying the same entity in the retrieval items includes:

[0056] The synonymous words and/or the words identifying the same entity in the retrieval items whose impo...

Embodiment 3

[0064] Figure 3 is a flowchart of the data generalization method provided by Embodiment 3 of the present invention. This embodiment is an optional solution proposed on the basis of the foregoing embodiments. Referring to Figure 3, the data generalization method provided by this embodiment includes:

[0065] S310. Group the retrieval item set, which includes the target retrieval item to be generalized and the historical retrieval items, according to the words in each retrieval item.

[0066] S320. According to the conversion loss between the target retrieval item and each historical retrieval item in the target retrieval item's group, determine the generalized retrieval item of the target retrieval item from the historical retrieval items.

[0067] Here, the conversion loss refers to the loss required to transform one retrieval item into another. The smaller the loss, the more similar the two retrieval items are, and the more likely they exp...
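
The selection in S320 can be illustrated with a small sketch. The excerpt does not define how the conversion loss is computed, so the code below substitutes a word-level edit distance as a stand-in; that choice, the `max_loss` threshold, and the names `conversion_loss` and `pick_generalized` are all assumptions made for illustration, not the patent's definition.

```python
def conversion_loss(source, target):
    """Word-level edit distance, used here only as a stand-in loss."""
    a, b = source.split(), target.split()
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def pick_generalized(target, group, max_loss=2):
    """Return the group item with the smallest loss, or None above max_loss."""
    best = min(group, key=lambda item: conversion_loss(target, item), default=None)
    if best is None or conversion_loss(target, best) > max_loss:
        return None
    return best
```

The threshold reflects the idea stated above that a smaller loss means more similar retrieval items: a candidate whose loss is too large is rejected rather than returned as a generalization.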

Abstract

An embodiment of the invention discloses a data generalization method, device, equipment and medium, and relates to the technical field of data processing. The method comprises: grouping a retrieval item set, which comprises a target retrieval item to be generalized and historical retrieval items, according to the words in each retrieval item; and determining, according to the grouping result, a generalized retrieval item of the target retrieval item from the historical retrieval items. The embodiment achieves reasonable and broad generalization of the retrieval items to be generalized.

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of data processing, and in particular to a data generalization method, device, equipment, and medium.

Background

[0002] Retrieval items (queries) expressing the same semantics often have more than one form of expression. Mining as many of these forms as possible is the generalization of a query.

[0003] At present, query generalization mainly replaces keywords based on synonyms.

[0004] However, although keyword substitution can handle some generalizations, it is not comprehensive enough. The problems that keyword replacement can solve are limited, while people may always come up with unexpected ways of asking a question.

[0005] In addition, keyword substitution may be wrong in certain situations. For example, in most cases, "who" and "which person" are equivalent. But for example, ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
CPC: Y02D10/00
Inventor: 周环宇, 冯欣伟, 余淼
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD