
Text topic mining method based on word semantic weight of Internet service

An Internet and word-processing technology, applied in semantic analysis, natural language data processing, instruments, etc. It addresses the problems that Mashup service description texts are short, that key words are only weakly distinguished, and that the NMF model therefore cannot model Mashup services effectively, and it alleviates the sparsity problem.

Pending Publication Date: 2021-05-25
ZHEJIANG UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] Existing Mashup service description texts are short, so document-word frequency information or the TF-IDF method distinguishes key words only weakly, and the NMF model cannot model Mashup services effectively. To overcome this problem, the present invention proposes a text topic mining method based on the semantic weight of Internet service words. Building on TF-IDF, the method combines service label information and context word information to recalculate word weights and increase the weight of key words, so that Mashup services are modeled effectively and the topics of service documents are identified.



Examples


Embodiment Construction

[0059] The present invention will be further described below.

[0060] A text topic mining method based on the semantic weight of Internet service words, comprising the following steps:

[0061] Step 1: Use the Natural Language Toolkit (NLTK) in Python to perform part-of-speech tagging on the words in the Mashup service description documents. NLTK is a widely used Python library for natural language processing. The steps are as follows:

[0062] 1.1 Traverse each word in the current Mashup service description document, and use NLTK to lemmatize it (restore it to its base form according to its part of speech);

[0063] 1.2 Use NLTK to extract the root of the word and judge whether the word is a noun; if it is a noun, add it to the noun set Nset;

[0064] 1.3 Repeat from step 1.1 until all Mashup services have been processed;
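Step 1 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function names `build_noun_set`, `nltk_tagger` and `nltk_lemmatizer` are hypothetical, the sketch tags each token first and then normalizes only the nouns (the patent's exact ordering of lemmatization, stemming and tagging may differ), and it assumes Penn Treebank tags, where noun tags start with `NN`. The NLTK helpers import the library lazily and require the corresponding NLTK data packages to be downloaded beforehand.

```python
from typing import Callable, Iterable, List, Set, Tuple

Tagger = Callable[[List[str]], List[Tuple[str, str]]]
Normalizer = Callable[[str], str]


def nltk_tagger(tokens: List[str]) -> List[Tuple[str, str]]:
    """Tag tokens with NLTK's default tagger (Penn Treebank tag set)."""
    import nltk  # lazy import; the tagger model must be installed via nltk.download
    return nltk.pos_tag(tokens)


def nltk_lemmatizer(word: str) -> str:
    """Reduce a noun to its WordNet base form."""
    from nltk.stem import WordNetLemmatizer  # needs the 'wordnet' corpus
    return WordNetLemmatizer().lemmatize(word, pos="n")


def build_noun_set(
    documents: Iterable[List[str]],
    tagger: Tagger = nltk_tagger,
    normalize: Normalizer = nltk_lemmatizer,
) -> Set[str]:
    """Collect the normalized nouns of all tokenized documents into Nset."""
    nset: Set[str] = set()
    for tokens in documents:              # one Mashup description per iteration
        for word, tag in tagger(tokens):  # step 1.1: visit every word
            if tag.startswith("NN"):      # step 1.2: keep nouns only (assumed tag set)
                nset.add(normalize(word.lower()))
    return nset
```

Passing the tagger and normalizer as parameters keeps the sketch usable with a different tag set or stemmer, for example NLTK's `PorterStemmer`, if that matches the intended "root extraction" more closely.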

[0065] Step 2: Count word frequency information and calculate TF-IDF information. The steps are as follows:

[0066] 2.1 Traverse each word in the Mashup serv...
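The remainder of Step 2 is truncated in this listing, so the patent's exact formula is not available here. As a rough illustration only, the sketch below computes the common textbook TF-IDF variant: term frequency is the word count divided by the document length, and inverse document frequency is `log(N / df)`. The function name `tf_idf` and this choice of variant are assumptions, not taken from the patent.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {word: weight} mapping per tokenized document."""
    n_docs = len(documents)

    # Document frequency: in how many documents does each word occur?
    df: Counter = Counter()
    for tokens in documents:
        df.update(set(tokens))

    weights: List[Dict[str, float]] = []
    for tokens in documents:
        counts = Counter(tokens)  # raw word frequency in this document
        total = len(tokens)
        weights.append(
            {
                word: (count / total) * math.log(n_docs / df[word])
                for word, count in counts.items()
            }
        )
    return weights
```

With this variant a word that occurs in every document gets weight 0, which illustrates why short descriptions that share a small vocabulary leave few words clearly distinguished.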


Abstract

A text topic mining method based on the semantic weight of Internet service words comprises the following steps: 1, performing part-of-speech tagging on the words in the Mashup service description documents using the Natural Language Toolkit (NLTK) in Python; 2, counting word frequency information and calculating TF-IDF information; 3, extracting Mashup service label information and recalculating the semantic weight of each word in the Mashup service description documents on the basis of the noun set Nset and the TF-IDF values; and 4, solving for the Mashup topic features with an NMF model. Building on TF-IDF, the method combines service label information and context word information to recalculate word weights and increase the weight of key words, so that Mashup services are modeled effectively and the topics of service documents are identified.
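To illustrate step 4, the sketch below factorizes a non-negative document-word weight matrix V into W (documents by topics) and H (topics by words) using the standard Lee-Seung multiplicative updates for the squared-error objective. It is a generic NMF sketch under stated assumptions: the patent's actual objective, initialization and stopping rule are not given in this listing, and the names `nmf`, `matmul`, `transpose` and `frobenius_error` are hypothetical.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def frobenius_error(v: Matrix, w: Matrix, h: Matrix) -> float:
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v: Matrix, k: int, iterations: int = 200, seed: int = 0,
        eps: float = 1e-9) -> Tuple[Matrix, Matrix]:
    """Factorize the n x m matrix V into W (n x k) and H (k x m), both >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h
```

In this reading, row i of W gives the topic strengths of document i and row t of H gives the word weights of topic t; V would be the recalculated semantic-weight matrix from step 3.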

Description

Technical field

[0001] The invention relates to a text topic mining method based on the semantic weight of Internet service words.

Background

[0002] Driven by the development of cloud computing and the "service-oriented" idea of service computing, more and more companies publish data, resources or related business to the Internet in the form of Web services in order to improve the utilization of information and their own competitiveness. However, traditional SOAP-based Web services suffer from complex technical stacks and poor scalability, which makes it difficult for them to adapt to complex and changeable real-world application scenarios. To overcome these problems, a lightweight information service composition mode has emerged on the Internet in recent years: Mashup technology, which can mix and match a variety of different Web APIs to develop new Web services, to alleviate It is difficult for tr...


Application Information

IPC(8): G06F40/216, G06F40/30, G06F40/284
CPC: G06F40/216, G06F40/30, G06F40/284, Y02D10/00
Inventor: 陆佳炜, 赵伟, 郑嘉弘, 徐俊, 张元鸣, 肖刚
Owner ZHEJIANG UNIV OF TECH