UGC text content mining method, system, equipment and storage medium

A text content mining technology, applied in the field of OTA, that addresses the inability to mine the topics users are interested in, and that improves mining efficiency, improves accuracy and saves users time.

Pending Publication Date: 2021-09-24
携程旅游信息技术(上海)有限公司

AI Technical Summary

Problems solved by technology

[0003] The technical problem to be solved by the present invention is to overcome the defect in the prior art that it is impossible to quickly and accur



Examples


Example Embodiment

[0074] Example 1

[0075] This embodiment provides a method for mining UGC text content. Referring to Figure 1, the mining method includes:

[0076] S11. Obtain the UGC text content.

[0077] S12. Obtain the subject word input by the user.

[0078] S13. Obtain an expanded word set of the subject word based on the subject word, wherein the expanded word set includes expanded words similar to the subject word, and the expanded words are output by a model trained on the UGC text content.

[0079] S14. Output the expanded word set.

[0080] S15. Use the expanded word selected from the expanded word set as the subject word selection result.

[0081] S16. Calculate the correlation between the subject word selection result and the UGC text content, sort in descending order of correlation, and output the several UGC text contents that rank highest.

[0082] Among them, the UGC text content may include review information of sceni...
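Steps S11 to S16 can be illustrated with a minimal Python sketch. The visible text does not say which model produces the expanded words or how correlation is computed, so everything here is an assumption made for illustration: a toy co-occurrence model with cosine similarity stands in for the trained model, and correlation is simply the number of selected words a text contains. The function names `cooccurrence_vectors`, `expand` and `rank_texts` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a text into lowercase whitespace-separated tokens."""
    return text.lower().split()


def cooccurrence_vectors(texts):
    """Toy stand-in for the model trained on UGC text (assumption):
    map each word to a Counter of words it co-occurs with in a text."""
    vectors = {}
    for text in texts:
        tokens = set(tokenize(text))
        for word in tokens:
            vec = vectors.setdefault(word, Counter())
            for other in tokens:
                if other != word:
                    vec[other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand(subject_word, vectors, top_n=3):
    """S13: return up to top_n words most similar to the subject word."""
    if subject_word not in vectors:
        return []
    target = vectors[subject_word]
    scored = [(cosine(target, vec), word)
              for word, vec in vectors.items() if word != subject_word]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for score, word in scored[:top_n] if score > 0]


def rank_texts(selected_words, texts, top_k=2):
    """S16: score each text by how many selected words it contains
    (assumed correlation measure), sort descending, keep the top_k."""
    selected = set(selected_words)
    scored = [(sum(1 for tok in tokenize(text) if tok in selected), text)
              for text in texts]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: -pair[0])
    return [text for _, text in scored[:top_k]]


if __name__ == "__main__":
    ugc = [                                   # S11: UGC text content
        "beach sunset view beautiful",
        "beach sand sunset walk",
        "hotel breakfast buffet good",
        "hotel room clean breakfast",
    ]
    model = cooccurrence_vectors(ugc)
    candidates = expand("beach", model, top_n=5)   # S12 + S13
    print(candidates)                              # S14
    chosen = ["sunset", "sand"]                    # S15: user's selection
    print(rank_texts(chosen, ugc))                 # S16
```

In a real system the co-occurrence model would likely be replaced by whatever similarity model is trained on the UGC corpus, and the overlap count by a proper relevance score; the control flow between the steps would stay the same.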

Example Embodiment

[0119] Example 2

[0120] This embodiment also provides a mining system for UGC text content. Referring to Figure 6, the mining system includes: a text content acquisition module 1, a subject word acquisition module 2, an expanded word set calculation module 3, an output module 4, a subject word selection module 5 and a first correlation degree calculation module 6.

[0121] The text content acquisition module 1 is used to acquire UGC text content.

[0122] The subject word acquisition module 2 is used to obtain the subject word input by the user.

[0123] The expanded word set calculation module 3 is used to obtain an expanded word set of the subject word based on the subject word, wherein the expanded word set includes expanded words similar to the subject word, and the expanded words are output by a model trained on the UGC text content.

[0124] The output module 4 is used for outputting the expanded word set.

[0125] The subject word selection module 5 i...
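One way to picture how the six modules fit together is the following Python sketch. It is an illustration under stated assumptions, not the patented implementation: the class name `MiningSystem`, its field names and the callable signatures are all hypothetical, and each module is reduced to a plain function so the wiring between them is visible.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MiningSystem:
    """Hypothetical wiring of modules 1 to 6; each field is one module."""
    acquire_texts: Callable[[], List[str]]                           # module 1
    acquire_subject_word: Callable[[], str]                          # module 2
    compute_expansions: Callable[[str, List[str]], List[str]]        # module 3
    output: Callable[[List[str]], None]                              # module 4
    select_words: Callable[[List[str]], List[str]]                   # module 5
    rank_by_relevance: Callable[[List[str], List[str]], List[str]]   # module 6

    def run(self) -> List[str]:
        """Invoke the modules in the order the embodiment lists them."""
        texts = self.acquire_texts()
        subject = self.acquire_subject_word()
        expansions = self.compute_expansions(subject, texts)
        self.output(expansions)
        selected = self.select_words(expansions)
        return self.rank_by_relevance(selected, texts)


if __name__ == "__main__":
    shown: List[str] = []
    system = MiningSystem(
        acquire_texts=lambda: ["beach sunset", "hotel breakfast"],
        acquire_subject_word=lambda: "beach",
        # Stub: a real module 3 would query the trained model.
        compute_expansions=lambda subject, texts: ["sunset", "sand"],
        output=shown.extend,
        # Stub: a real module 5 would take the user's selection.
        select_words=lambda words: words[:1],
        rank_by_relevance=lambda selected, texts: [
            t for t in texts if any(w in t.split() for w in selected)
        ],
    )
    print(system.run())
```

Passing the modules in as callables keeps each one replaceable, which mirrors the way the embodiment describes them as separate units.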

Example Embodiment

[0163] Example 3

[0164] Figure 7 is a schematic structural diagram of an electronic device according to Embodiment 3 of the present invention. The electronic device includes a memory, a processor, and a computer program stored on the memory and executable on the processor. When the processor executes the program, the method for mining UGC text content in Embodiment 1 is implemented. The electronic device 30 shown in Figure 7 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present invention.

[0165] The electronic device 30 may take the form of a general-purpose computing device, which may be, for example, a server device. Components of the electronic device 30 may include, but are not limited to, the above-mentioned at least one processor 31, the above-mentioned at least one memory 32, and a bus 33 connecting different system components (including the memory 32 and the processor 31).

[0166] The ...


Abstract

The invention provides a UGC text content mining method, system, device and storage medium. The mining method comprises the following steps: acquiring UGC text content; obtaining a subject word input by a user; obtaining an expanded word set of the subject word based on the subject word, wherein the expanded word set comprises expanded words similar to the subject word, and the expanded words are output by a model trained on the UGC text content; outputting the expanded word set; taking the expanded word selected from the expanded word set as a subject word selection result; and calculating the relevancy between the subject word selection result and the UGC text content, sorting in descending order of relevancy, and outputting the several UGC text contents that rank highest. The method helps the user accurately mine expanded words related to the subject word, so that the UGC text content the user is interested in can be obtained through the selected expanded words, which improves accuracy, improves mining efficiency and saves the user's time.

Description

Technical field

[0001] The present invention relates to the technical field of OTA (Online Travel Agency), in particular to a method, system, device and storage medium for mining UGC (User Generated Content) text content.

Background

[0002] In the field of tourism, a large amount of UGC is generated every day. Before purchasing or learning about a product, users often read user reviews or travel guides. At present, content on topics of interest to users cannot be mined quickly and accurately from massive data (hundreds of millions of entries). How to quickly and accurately mine the subject content that users are interested in from massive data is an urgent problem to be solved in the field of tourism.

Summary of the invention

[0003] The technical problem to be solved by the present invention is to overcome the defect in the prior art that it is impossible to quickly and accurately mine the...


Application Information

IPC(8): G06F16/9535, G06F40/295
CPC: G06F16/9535, G06F40/295
Inventor: 刘新, 何蜀波, 孙玉霞, 朱登龙
Owner: 携程旅游信息技术(上海)有限公司