Document modeling method

A document modeling technology, applied in metadata text retrieval, character and pattern recognition, and text database clustering/classification, that addresses the problem of existing models not directly modeling the types of label information.

Active Publication Date: 2017-11-17
SHENZHEN IPIN INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

Neither topic models based on Bayesian probabilistic graphical models nor deep Boltzmann machines directly model the types of label information.

Embodiment Construction

[0041] So that the above objects, features and advantages of the present invention may be understood more clearly, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. Note that, where no conflict arises, the embodiments of the present application and the features within them may be combined with one another.

[0042] Many specific details are set forth in the following description to provide a full understanding of the present invention. However, the present invention can also be implemented in ways other than those described here. Therefore, the protection scope of the present invention is not limited to the specific embodiments disclosed below.

[0043] Figure 1 shows a schematic flowchart of a document modeling method of the present invention.

[0044] As shown in Figure 1, the present invention discloses a method for document modeling, comp...

Abstract

The invention discloses a document modeling method. A semi-structured document is modeled effectively by a method that uses words and tag information at the same time and exploits tag type information automatically. Using an autonomous compensation mechanism, the effects of different types of tag information on document modeling are learned within a deep Boltzmann machine, so that the heterogeneous information carried by different tag types is fully taken into account and more effective semi-structured document vectors are learned.

Description

Technical field

[0001] The present invention relates to document processing and modeling technology, and more specifically to a document modeling method.

Background

[0002] Semi-structured documents are documents that contain rich tag information, such as web page text accompanied by structural information like category, title, author, and date. With the development of the Internet, more and more semi-structured text data appears in network applications. Text data containing tag information is collectively referred to as semi-structured documents (Semi-Structured Documents). How to model such data effectively has become a research hotspot. Traditionally, an effective means of modeling semi-structured document data is to use a topic model based on a Bayesian probabilistic graphical model. This modeling method is mainly based on the bag-of-words assumption, and s...

Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F16/35, G06F16/38, G06F18/214
Inventor: 李双印, 潘嵘
Owner: SHENZHEN IPIN INFORMATION TECH CO LTD