
UGC text content mining method, system, equipment and storage medium

A text content mining technology, applied in the field of OTA, that addresses the inability to mine the topics users are interested in, and that improves mining accuracy and efficiency while saving the user's time.

Pending Publication Date: 2021-09-24
携程旅游信息技术(上海)有限公司

AI Technical Summary

Problems solved by technology

[0003] The technical problem to be solved by the present invention is to overcome the defect in the prior art that the subject content of interest to users cannot be mined quickly and accurately from massive data, and to provide a mining method, system, device and storage medium for UGC text content.



Examples


Embodiment 1

[0075] This embodiment provides a method for mining UGC text content. Referring to Figure 1, the mining method includes:

[0076] S11. Acquire the UGC text content.

[0077] S12. Acquire the subject word input by the user.

[0078] S13. Obtain an extended word set for the subject word, wherein the extended word set includes extended words similar to the subject word, and the extended words are output by a model trained on the UGC text content.

[0079] S14. Output the extended word set.

[0080] S15. Use the extended words selected from the extended word set as the subject word selection result.

[0081] S16. Calculate the degree of correlation between the subject word selection result and the UGC text content, sort in descending order of correlation, and output the several UGC text contents that rank highest in correlation with the extended words.
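
Steps S11 to S16 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the excerpt does not say which model produces the extended words or how the correlation degree is computed, so the co-occurrence "model", the cosine similarity, the term-frequency score and every function name here (`tokenize`, `train_cooccurrence`, `extended_word_set`, `rank_texts`) are hypothetical stand-ins, not the patented implementation.

```python
from __future__ import annotations

from collections import Counter
from math import sqrt


def tokenize(text: str) -> list[str]:
    """Lowercased whitespace tokenizer (assumption; real UGC needs a proper word segmenter)."""
    tokens = (tok.strip(".,!?;:").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def train_cooccurrence(texts: list[str]) -> dict[str, Counter]:
    """Stand-in 'model trained on UGC text': per-word counts of words sharing a text."""
    vectors: dict[str, Counter] = {}
    for text in texts:
        words = set(tokenize(text))
        for word in words:
            vectors.setdefault(word, Counter()).update(words - {word})
    return vectors


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def extended_word_set(subject: str, vectors: dict[str, Counter], top_n: int = 5) -> list[str]:
    """S13: return up to top_n words most similar to the subject word."""
    if subject not in vectors:
        return []
    scored = [(cosine(vectors[subject], vec), word)
              for word, vec in vectors.items() if word != subject]
    scored = [(score, word) for score, word in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best first, ties alphabetical
    return [word for _, word in scored[:top_n]]


def rank_texts(selected: list[str], texts: list[str], top_k: int = 3) -> list[str]:
    """S16: score each text by the share of its tokens that are selected words (assumed metric)."""
    chosen = set(selected)
    scored = []
    for text in texts:
        tokens = tokenize(text)
        score = sum(1 for tok in tokens if tok in chosen) / len(tokens) if tokens else 0.0
        scored.append((score, text))
    scored.sort(key=lambda pair: -pair[0])  # descending; stable for equal scores
    return [text for score, text in scored[:top_k] if score > 0]


if __name__ == "__main__":
    ugc_texts = [                                   # S11: made-up sample reviews
        "the beach was great for swimming",
        "swimming at the beach with kids",
        "the museum had long queues",
        "kids loved the beach",
    ]
    model = train_cooccurrence(ugc_texts)
    extended = extended_word_set("beach", model)    # S12 + S13: subject word "beach"
    print(extended)                                 # S14: show the extended word set
    selected = extended[:2]                         # S15: stand-in for the user's selection
    print(rank_texts(selected, ugc_texts))          # S16: highest-correlation texts first
```

In a real system the user would pick the extended words interactively at S15; the slice `extended[:2]` only simulates that choice.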

[0082] Wherein, the UGC text content may include review information of scenic spots, revi...

Embodiment 2

[0120] This embodiment also provides a mining system for UGC text content. Referring to Figure 6, the mining system includes: a text content acquisition module 1, a subject word acquisition module 2, an extended word set calculation module 3, an output module 4, a subject word selection module 5 and a first correlation degree calculation module 6.

[0121] The text content acquisition module 1 is used to acquire the UGC text content.

[0122] The subject word acquisition module 2 is used to acquire the subject word input by the user.

[0123] The extended word set calculation module 3 is used to obtain the extended word set for the subject word, wherein the extended word set includes extended words similar to the subject word, and the extended words are output by a model trained on the UGC text content.

[0124] The output module 4 is used to output the extended word set.

[0125] The subject word selection module 5 is used to use the selected extended wo...
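
One way to picture how the six modules listed in [0120] fit together is as pluggable callables wired in the order of the method steps. The sketch below is an assumption for illustration only: the class `MiningSystem`, its field names and the `run` method are hypothetical and are not taken from the patent text.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class MiningSystem:
    """Hypothetical wiring of modules 1 to 6; each field is a replaceable callable."""
    acquire_texts: Callable[[], list[str]]                            # module 1
    acquire_subject_word: Callable[[], str]                           # module 2
    compute_extended_words: Callable[[str, list[str]], list[str]]     # module 3
    output: Callable[[list[str]], None]                               # module 4
    select_subject_words: Callable[[list[str]], list[str]]            # module 5
    rank_by_correlation: Callable[[list[str], list[str]], list[str]]  # module 6

    def run(self) -> list[str]:
        """Call the modules in the same order as the method steps."""
        texts = self.acquire_texts()
        subject = self.acquire_subject_word()
        extended = self.compute_extended_words(subject, texts)
        self.output(extended)
        selected = self.select_subject_words(extended)
        return self.rank_by_correlation(selected, texts)


if __name__ == "__main__":
    # Trivial stand-ins for every module, just to show the data flow.
    system = MiningSystem(
        acquire_texts=lambda: ["quiet beach hotel", "busy city museum"],
        acquire_subject_word=lambda: "beach",
        compute_extended_words=lambda word, texts: [word, "hotel"],
        output=print,
        select_subject_words=lambda words: words[:1],
        rank_by_correlation=lambda words, texts: [
            t for t in texts if any(w in t.split() for w in words)
        ],
    )
    print(system.run())
```

Keeping each module behind a plain callable makes it easy to swap a stand-in for a real component, for example replacing `compute_extended_words` with a trained model, without touching the rest of the pipeline.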

Embodiment 3

[0164] Figure 7 is a schematic structural diagram of an electronic device provided by Embodiment 3 of the present invention. The electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the program, the processor implements the method for mining UGC text content of Embodiment 1. The electronic device 30 shown in Figure 7 is only an example, and should not limit the functions or scope of use of the embodiments of the present invention.

[0165] The electronic device 30 may take the form of a general-purpose computing device, for example a server device. Components of the electronic device 30 may include, but are not limited to: at least one processor 31, at least one memory 32, and a bus 33 connecting the different system components (including the memory 32 and the processor 31).

[0166] The bus 33 includes a data bus, an address bus, and a control bus.

[0167] The memory 32 may i...


Abstract

The invention provides a UGC text content mining method, system, device and storage medium. The mining method comprises the following steps: acquiring UGC text content; acquiring a subject word input by a user; obtaining an extended word set for the subject word, wherein the extended word set comprises extended words similar to the subject word, and the extended words are output by a model trained on the UGC text content; outputting the extended word set; taking the extended words selected from the extended word set as a subject word selection result; and calculating the degree of correlation between the subject word selection result and the UGC text content, sorting in descending order of correlation, and outputting the several UGC text contents that rank highest in correlation with the extended words. The invention helps the user accurately mine extended words related to the subject word, so that the UGC text content the user is interested in can be obtained through the selected extended words, which improves accuracy and mining efficiency and saves the user's time.

Description

Technical field

[0001] The present invention relates to the technical field of OTA (Online Travel Agency), and in particular to a method, system, device and storage medium for mining UGC (User Generated Content) text content.

Background

[0002] In the field of tourism, a large amount of UGC content is generated every day. Before purchasing or learning about a product, users often read user reviews or travel guide information. At present, content on topics of interest to users cannot be mined quickly and accurately from massive data (hundreds of millions of entries). How to quickly and accurately mine the subject content that users are interested in from massive data is an urgent problem to be solved in the field of tourism.

Summary of the invention

[0003] The technical problem to be solved by the present invention is to overcome the defect in the prior art that it is impossible to quickly and accurately mine the...


Application Information

IPC(8): G06F16/9535, G06F40/295
CPC: G06F16/9535, G06F40/295
Inventor: 刘新, 何蜀波, 孙玉霞, 朱登龙
Owner: 携程旅游信息技术(上海)有限公司