A method and related device for determining the novelty of text

A text novelty technology, applied in the field of data processing, that addresses the strong influence of keyword selection on retrieval results and the low accuracy of the novelty determined between a target text and candidate texts, and achieves more accurate novelty calculation.

Active Publication Date: 2021-09-03
BEIJING HEXIANG WISDOM TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] In the current method, the user needs to determine the keywords. Keyword selection has a great impact on the retrieval results, and it is subjective and does not necessarily reflect an understanding of the actual content of the target text. Therefore, the accuracy of the novelty determined between the target text and the candidate text is low.

Examples

Embodiment 1

[0124] With reference to Figure 1, the text structuring method provided in the embodiment of the present application is described in detail below. The method mainly includes two parts: the first is training the structured model, and the second is producing the structured representation of the text.

[0125] First, train the structured model;

[0126] The structured model includes an entity extraction model for extracting entities and a relationship extraction model for extracting relationships between entities. The training method includes the following steps:

[0127] Step 101. Acquire a labeled first corpus set, which is obtained by annotating the entities in each text of the first text set according to a first preset rule.

[0128] The first text set includes but is not limited to technical documents, patents, academic papers, etc. In this embodiment of the present application, the first text set is described using patents as an example. ...

Embodiment 2

[0222] As shown in Figure 5, the embodiment of the present application also provides a method for determining text similarity. The method in this example is applied to an electronic device, which may be a server or a terminal. The method may include the following steps:

[0223] Step 301. Obtain a target text and a candidate data set. The candidate data set includes a plurality of arrays, and each of the plurality of arrays represents a semantic vector of an entity; the entity is included in the candidate text.
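One way to picture the candidate data set from Step 301 is as a mapping from each candidate text to a list of arrays, one semantic vector per entity. The sketch below is only illustrative: the identifiers (`candidate_data_set`, `cosine_similarity`, `best_match`), the three-dimensional toy vectors, and the use of cosine similarity to compare vectors are all assumptions, since this excerpt does not state how the vectors are produced or compared.

```python
import math
from typing import Dict, List

Vector = List[float]

# Hypothetical layout: candidate text id -> one semantic vector per entity.
# The toy 3-dimensional values are made up for illustration.
candidate_data_set: Dict[str, List[Vector]] = {
    "candidate_1": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "candidate_2": [[0.6, 0.8, 0.0]],
}


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (assumed comparison measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_match(target_vector: Vector, entity_vectors: List[Vector]) -> float:
    """Highest similarity between one target entity vector and a candidate's entities."""
    return max(cosine_similarity(target_vector, v) for v in entity_vectors)


# Compare a single (made-up) target entity vector against every candidate text.
target_vector: Vector = [1.0, 0.0, 0.0]
best = {cid: best_match(target_vector, vecs) for cid, vecs in candidate_data_set.items()}
```

Keeping the vectors grouped per candidate text makes it straightforward to compare a target entity against each candidate independently.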

[0224] The server may receive the target text sent by the terminal, for example, the target text may be a patent.

[0225] The specific method for the server to obtain the candidate data set includes at least the following two methods:

[0226] In the first possible implementation:

[0227] First, a text collection is obtained, the text collection includes n candidate texts, and n is an integer greater than or equal to 2. It can be understood that ...

Embodiment 3

[0330] As shown in Figure 7, the embodiment of the present application also provides a method for determining the novelty of text. The method is applied to an electronic device, which may be a server or a terminal. In this embodiment, a terminal is used as an example for illustration. The method specifically includes the following steps:

[0331] Step 401, determine the target text.

[0332] For example, the target text may be a patent or a paper. In this embodiment, the target text is described using a patent as an example.

[0333] Step 402, extracting multiple target entities in the target text to obtain a target entity set.

[0334] In this example, multiple target entities in the target text are extracted through the entity extraction model in Embodiment 1. Specifically, the target text is input into the entity extraction model, which identifies multiple target entities in the target text, ...

Abstract

The embodiment of the present application provides a method for determining the novelty of a text and a related device. The method includes: determining the target text; extracting multiple target entities in the target text to obtain a target entity set; obtaining a candidate entity set of each candidate text; determining a first entity intersection between the target entity set and the candidate entity set, where the first entity intersection consists of the entities that match between the target entity set and the candidate entity set; and determining the novelty of the target text relative to the candidate text according to a difference parameter between the first entity intersection and the target entity set. In the embodiment of the present application, the accuracy of the novelty calculation is improved.
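The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented calculation: entities are matched here by normalized string equality, and the "difference parameter" is assumed to be the share of target entities missing from the intersection, i.e. 1 - |intersection| / |target set|. The function names and sample entities are hypothetical.

```python
from typing import Dict, Iterable


def normalize(entity: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(entity.lower().split())


def novelty(target_entities: Iterable[str], candidate_entities: Iterable[str]) -> float:
    """Assumed difference parameter: fraction of target entities absent from the candidate."""
    target = {normalize(e) for e in target_entities}
    candidate = {normalize(e) for e in candidate_entities}
    if not target:
        raise ValueError("target entity set is empty")
    intersection = target & candidate  # the "first entity intersection"
    return 1.0 - len(intersection) / len(target)


# Made-up entity sets for one target text and two candidate texts.
target_entity_set = ["entity extraction model", "semantic vector", "candidate text", "novelty"]
candidate_entity_sets: Dict[str, list] = {
    "doc_a": ["Semantic Vector", "candidate  text", "keyword matching"],
    "doc_b": ["lithium battery", "electrode"],
}

scores = {
    name: novelty(target_entity_set, entities)
    for name, entities in candidate_entity_sets.items()
}
```

Under this assumed definition, a score of 0.0 means every target entity also appears in the candidate text, and 1.0 means none does. A semantic match between entity vectors could replace the string comparison without changing the rest of the sketch.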

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to a method and a related device for determining the novelty of a text.

Background

[0002] With the advent of the era of technological explosion, both the importance of information and the amount of data continue to increase, so information retrieval is particularly important.

[0003] Users often need to search a database according to a target text and query the candidate texts in the database that are similar to the target text. However, most current retrieval methods are based on text retrieval, which focuses on matching text characters. For example, the user determines the keywords in the target text and enters them, and the retrieval system then matches those keywords against the candidate texts in the database. The more keywords are matched, the lower the novelty of the candidate text and t...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/194, G06F16/33
CPC: G06F40/194
Inventor: 陈伟然, 姜庭欣, 杨冠梅, 段博超, 郭永红, 何佳, 王志强, 王希桢, 李静毅, 刘乾楠
Owner: BEIJING HEXIANG WISDOM TECH CO LTD