Universal tag mining method and device, server and medium

A tag and sequence technology, applied in the Internet field, that addresses the problems that existing tags are not general enough, cannot meet question-and-answer needs, and do not capture users' subjective tags. It thereby reduces development time and meets users' specific needs.

Active Publication Date: 2018-05-04
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, structured tag extraction based on vertical websites is not universal enough: it cannot be applied to niche fields that have no vertical websites, or to cases where the vertical websites carry no tag attributes. Moreover, most tags mined from the structure of vertical websites are conventional noun tags, which cannot meet more specific question-and-answer needs.
Tag extraction based on other text attributes of entities is also limited: because those text attributes are not rich enough, some subjective user tags cannot be mined.


Examples


Embodiment 1

[0028] Figure 1 is a flow chart of the universal tag mining method in Embodiment 1 of the present invention. This embodiment is applicable to tag mining for different fields, different entities and various types of websites. The method can be executed by a universal tag mining device, which can be implemented in software and/or hardware; for example, the device can be configured in a server. As shown in Figure 1, the method specifically includes:

[0029] S110. Match tag seed rules including tag placeholders and attributes of the tag placeholders with historical search information to determine matching tags.

[0030] In this embodiment, tags are attributes used to describe entity characteristics in the knowledge graph, and are generally used to support general-demand Q&A in products. For example, "love" in "movies about love" is a tag. A tag can also be defined more broadly. In addition to the above-mentioned conventional nou...
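
As a rough illustration of step S110, the sketch below matches a seed rule containing a tag placeholder against a few search queries. The rule syntax (`[T]` as the placeholder), the `max_words` placeholder attribute, and the names `SeedRule` and `match_tags` are assumptions made for this example; the text above does not specify how rules or placeholder attributes are represented.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedRule:
    """Hypothetical tag seed rule: a template with one "[T]" placeholder."""
    template: str        # e.g. "movies about [T]"
    max_words: int = 2   # assumed placeholder attribute: max tag length in words

    def to_regex(self) -> re.Pattern:
        # Keep the literal parts of the template and turn the placeholder
        # into a capture group of 1..max_words alphabetic words.
        prefix, suffix = self.template.split("[T]")
        word = r"[^\W\d_]+"
        group = rf"({word}(?: {word}){{0,{self.max_words - 1}}})"
        return re.compile(rf"^{re.escape(prefix)}{group}{re.escape(suffix)}$")


def match_tags(rules, queries):
    """Return the set of tags captured by any rule in any historical query."""
    tags = set()
    for rule in rules:
        pattern = rule.to_regex()
        for query in queries:
            m = pattern.match(query.strip().lower())
            if m:
                tags.add(m.group(1))
    return tags


history = [
    "movies about love",
    "Movies about time travel",
    "songs about love",                 # does not fit the rule
    "movies about a very long thing",   # exceeds max_words
]
print(sorted(match_tags([SeedRule("movies about [T]")], history)))
# ['love', 'time travel']
```

Here the placeholder attribute only limits tag length; a real system might instead constrain part of speech or entity type, which this excerpt does not detail.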

Embodiment 2

[0051] Figure 2 is a flow chart of the universal tag mining method in Embodiment 2 of the present invention. This embodiment is further optimized on the basis of the above embodiments. As shown in Figure 2, the method includes:

[0052] S210. Match the tag seed rules including tag placeholders and attributes of the tag placeholders with historical search information to determine matching tags.

[0053] S220. Combine the existing tag seed rules and the matched tags to construct a new search sequence set.

[0054] S230. Generalize each search sequence in the new search sequence set to obtain new tag seed rules, and return to the operation of matching the new tag seed rules with the historical search information to determine new tags, until the tags and the tag seed rules meet the convergence condition.

[0055] S240. Use the web pages corresponding to the search sequence determined according to the obtained tags ...
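
The iteration in steps S210 to S230 resembles a bootstrapping loop, which can be sketched as follows. Several details are assumptions, since the excerpt does not spell them out: the new search sequence set is taken to be the historical queries that contain a known tag, generalization replaces that tag with the `[T]` placeholder, a candidate rule is kept only if at least `min_support` distinct tags produce it, and convergence means that a round yields no new tags and no new rules. All function names are hypothetical.

```python
import re

PLACEHOLDER = "[T]"


def rule_to_regex(rule):
    # The placeholder captures one or two lowercase words (assumed attribute).
    prefix, suffix = rule.split(PLACEHOLDER)
    return re.compile(
        rf"^{re.escape(prefix)}([a-z]+(?: [a-z]+)?){re.escape(suffix)}$"
    )


def match_tags(rules, history):
    """S210: match every rule against every historical query."""
    tags = set()
    for rule in rules:
        pattern = rule_to_regex(rule)
        for query in history:
            m = pattern.match(query)
            if m:
                tags.add(m.group(1))
    return tags


def generalize(query, tag):
    """S230 (assumed form): replace a single whole-word tag with the placeholder."""
    pattern = re.compile(rf"(?<![a-z]){re.escape(tag)}(?![a-z])")
    if len(pattern.findall(query)) != 1:
        return None
    return pattern.sub(PLACEHOLDER, query)


def mine_tags(seed_rules, history, min_support=2, max_rounds=10):
    rules, tags = set(seed_rules), set()
    for _ in range(max_rounds):
        new_tags = match_tags(rules, history) - tags
        tags |= new_tags
        # S220 (assumed form): queries containing a known tag are the
        # candidate search sequences; record which tags support each rule.
        support = {}
        for query in history:
            for tag in tags:
                candidate = generalize(query, tag)
                if candidate and candidate != PLACEHOLDER:
                    support.setdefault(candidate, set()).add(tag)
        new_rules = {r for r, ts in support.items() if len(ts) >= min_support} - rules
        rules |= new_rules
        if not new_tags and not new_rules:  # assumed convergence condition
            break
    return tags, rules


history = [
    "movies about love", "movies about war",
    "films on love", "films on war", "films on friendship",
    "love quotes",
]
tags, rules = mine_tags({"movies about [T]"}, history)
print(sorted(tags))   # ['friendship', 'love', 'war']
print(sorted(rules))  # ['films on [T]', 'movies about [T]']
```

In this toy run the seed rule finds "love" and "war", those tags surface the new rule "films on [T]", and that rule in turn finds "friendship". The `min_support` threshold is one simple way to keep a one-off pattern such as "[T] quotes" from becoming a rule.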

Embodiment 3

[0068] Figure 3 is a flow chart of the universal tag mining method in Embodiment 3 of the present invention. This embodiment is further optimized on the basis of the foregoing embodiments. As shown in Figure 3, the method includes:

[0069] S310. Match the tag seed rules including tag placeholders and attributes of the tag placeholders with historical search information to determine matching tags.

[0070] S320. Combine the existing tag seed rules and the matched tags to construct a new search sequence set.

[0071] S330. Generalize each search sequence in the new search sequence set to obtain new tag seed rules, and return to the operation of matching the new tag seed rules with the historical search information to determine new tags, until the tags and the tag seed rules meet the convergence condition.

[0072] S340. Use the web pages corresponding to the search sequence determined according to the obtained tags ...



Abstract

An embodiment of the invention discloses a universal tag mining method and device, a server and a medium. The universal tag mining method comprises: matching tag seed rules, which contain tag placeholders and attributes of the tag placeholders, with historical search information to determine matched tags; combining the existing tag seed rules and the matched tags to construct a new search sequence set; and generalizing the search sequences in the new search sequence set to obtain new tag seed rules, then returning to match the new tag seed rules with the historical search information to determine new tags, until the tags and the tag seed rules meet the convergence conditions. The method provided by the embodiment allows more comprehensive and deeper tags to be mined. The whole tag mining process does not depend on vertical websites, and the same process can be used to mine tags from various web pages, so development time can be greatly shortened and users' specific needs are met.

Description

Technical field

[0001] Embodiments of the present invention relate to Internet technologies, and in particular to a universal tag mining method, device, server and medium.

Background

[0002] With the development of the Internet, service platforms allow users to query the resources they want. Currently, when a user searches for resources with a search term, the list of resources matching the search term is usually determined through the tags in the search term.

[0003] At present, there are two approaches to tag mining. One is structured extraction from vertical websites: in most fields there are some high-quality vertical websites on the Internet, and conventional tag attributes, such as song genre and movie classification, have likely already been built on them. The other is extraction from other text attributes of the entity, such as establishing an extraction model from the summary of a movie based on features suc...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/2465, G06F16/951, G06F16/9532, G06F16/955
Inventor: 冯欣伟, 曹徐平, 张一麟, 李莹
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD