Semantic model based text message extraction method and device

A text information extraction technique based on semantic models, applied in the field of text processing. It addresses problems such as increased staff workload, low matching flexibility, and low extraction efficiency, and achieves the effects of reducing generation difficulty, improving extraction efficiency, and reducing workload.
CN107608949A · Active · Publication Date: 2018-01-19 · ZHONGKE DINGFU BEIJING TECH DEV

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
ZHONGKE DINGFU BEIJING TECH DEV
Publication Date
2018-01-19


Abstract

The invention discloses a semantic model based text message extraction method and device. The method comprises the following steps: to-be-extracted text messages are obtained; message extraction is then performed on the to-be-extracted text messages according to extract expressions and the semantic models corresponding to the extract expressions, and target messages are obtained. The extract expressions comprise a part-of-speech extract expression, a time extract expression and/or a rule extract expression. The semantic model corresponding to the part-of-speech extract expression is a statistical semantic model, the semantic model corresponding to the time extract expression is a time semantic conceptual model, and the semantic model corresponding to the rule extract expression is a rule semantic model. Accordingly, the extract expressions and semantic models are set according to different extraction requirements and used to extract messages from the to-be-extracted text messages, so workers do not need to compile complex regular expressions one by one. The generation difficulty is lowered and the matching flexibility is improved, so the method both improves extraction efficiency and lowers the workload of the workers.

Description

Technical field

[0001] The present application relates to the technical field of text processing, and in particular to a semantic model-based text information extraction method and device.

Background art

[0002] With the explosive growth of Internet information, the content of documents is becoming increasingly varied. Because the information people need is buried in content of many different styles, it is increasingly difficult to find. People therefore need information extraction methods to locate the required information in relevant texts.

[0003] At present, information extraction mainly relies on HTML-structure-based methods. Such a method uses an HTML parser to scan the characters in the HTML text information one by one, analyzes the structural hierarchy of the HTML text information, numbers identical HTML tags sequentially from zero, and finally forms a DOM tree corresponding to the HTML text information, and then...
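The tag-numbering step described in [0003] can be illustrated with a minimal sketch. It is not taken from the patent: it assumes that "numbering identical tags from zero" means indexing same-named siblings under a common parent, and the names `Node`, `TreeBuilder`, and `text_by_path` are hypothetical. It uses Python's standard `html.parser` module to build a simplified tree and report the text found at each numbered path.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag (partial list, enough for the sketch).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """One element of a simplified DOM tree (hypothetical helper)."""

    def __init__(self, tag, index, parent=None):
        self.tag = tag
        self.index = index      # position among same-tag siblings, from zero
        self.parent = parent
        self.children = []
        self.text = ""

    def path(self):
        """Return a numbered path such as /html[0]/body[0]/p[1]."""
        parts = []
        node = self
        while node.parent is not None:
            parts.append(f"{node.tag}[{node.index}]")
            node = node.parent
        return "/" + "/".join(reversed(parts))


class TreeBuilder(HTMLParser):
    """Builds the tree while the parser scans the HTML text."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document", 0)
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # Assumption: identical tags are numbered per parent, starting at zero.
        index = sum(1 for child in self.current.children if child.tag == tag)
        node = Node(tag, index, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag, then step out of it.
        node = self.current
        while node.parent is not None and node.tag != tag:
            node = node.parent
        if node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def walk(node):
    """Yield every descendant of node in document order."""
    for child in node.children:
        yield child
        yield from walk(child)


def text_by_path(html):
    """Map each numbered tag path to the text directly inside that element."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return {n.path(): n.text for n in walk(builder.root) if n.text}


if __name__ == "__main__":
    sample = "<html><body><div><p>a</p><p>b</p></div><div><p>c</p></div></body></html>"
    for path, text in text_by_path(sample).items():
        print(path, text)
```

Because every path encodes fixed tag positions, an extraction rule written against one page layout stops matching as soon as the layout changes.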

Claims
