A method and related device for determining the novelty of text
A technique for determining the novelty of text, applied in the field of data processing. It addresses the problems that retrieval results have a large influence on the outcome and that the novelty computed between a target text and candidate texts is inaccurate, and it achieves a more accurate novelty calculation.
Examples
Embodiment 1
[0124] With reference to Figure 1, the text structuring method provided in this embodiment of the present application is described in detail below. The method mainly includes two parts: the first is training the structured model, and the second is producing the structured representation of the text.
[0125] First, train the structured model.
[0126] The structured model includes an entity extraction model for extracting entities and a relationship extraction model for extracting the relationships between entities. The training method includes the following steps:
[0127] Step 101. Acquire an annotated first corpus set, which is obtained by annotating the entities in each text of the first text set according to a first preset rule.
[0128] The first text set includes, but is not limited to, technical documents, patents, and academic papers. In this embodiment of the present application, the first text set is described using patents as an example. ...
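To make the annotated first corpus set of Step 101 concrete, the following is a minimal sketch of one way such a corpus could be represented. The source does not specify the first preset rule or the annotation format, so everything here is an assumption for illustration: the character-span layout, the names `EntitySpan`, `AnnotatedText`, and `annotate`, the `COMPONENT` label, and the sample sentence are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntitySpan:
    start: int   # inclusive character offset into the text
    end: int     # exclusive character offset into the text
    label: str   # hypothetical entity type, e.g. "COMPONENT"


@dataclass(frozen=True)
class AnnotatedText:
    text: str
    spans: tuple[EntitySpan, ...]

    def entities(self) -> list[tuple[str, str]]:
        """Return (surface form, label) pairs for every annotated span."""
        return [(self.text[s.start:s.end], s.label) for s in self.spans]


def annotate(text: str, marked: dict[str, str]) -> AnnotatedText:
    """Stand-in for the 'first preset rule' (an assumption, not the patent's rule):
    mark the first occurrence of each listed phrase with its label."""
    spans = []
    for phrase, label in marked.items():
        start = text.find(phrase)
        if start != -1:
            spans.append(EntitySpan(start, start + len(phrase), label))
    return AnnotatedText(text, tuple(sorted(spans, key=lambda s: s.start)))


# Hypothetical first text set containing a single patent-style sentence.
first_text_set = ["A heat sink is attached to the processor by a spring clip."]
marked_phrases = {
    "heat sink": "COMPONENT",
    "processor": "COMPONENT",
    "spring clip": "COMPONENT",
}

# The annotated first corpus set: one AnnotatedText per text in the first text set.
first_corpus_set = [annotate(t, marked_phrases) for t in first_text_set]
```

Storing offsets rather than copied strings keeps each annotation tied to its position in the original text, which is typically what an entity extraction model needs as training input.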
Embodiment 2
[0222] As shown in Figure 5, an embodiment of the present application also provides a method for determining text similarity. The method in this example is applied to an electronic device, which may be a server or a terminal. The method may include the following steps:
[0223] Step 301. Obtain a target text and a candidate data set. The candidate data set includes a plurality of arrays, each of which represents the semantic vector of an entity contained in a candidate text.
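The candidate data set of Step 301 can be pictured as a collection of fixed-length arrays, one per entity. The sketch below is illustrative only: the source does not say how semantic vectors are produced or stored, so the `CandidateDataSet` class, the `toy_embed` function (a deterministic hash that carries no real semantics), the vector dimension, and the sample entities are all assumptions.

```python
import hashlib
from dataclasses import dataclass, field


def toy_embed(entity: str, dim: int = 4) -> list[float]:
    """Hypothetical stand-in for a semantic encoder.

    It is deterministic but NOT semantic; a real system would use a trained model.
    """
    digest = hashlib.sha256(entity.lower().encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


@dataclass
class CandidateDataSet:
    arrays: list[list[float]] = field(default_factory=list)  # one vector per entity
    entities: list[str] = field(default_factory=list)        # entity behind arrays[i]
    sources: list[int] = field(default_factory=list)         # candidate text index

    def add(self, entity: str, text_index: int) -> None:
        """Append the vector for one entity found in one candidate text."""
        self.arrays.append(toy_embed(entity))
        self.entities.append(entity)
        self.sources.append(text_index)


# Hypothetical entities already extracted from two candidate texts.
entities_per_candidate = {
    0: ["heat sink", "processor"],
    1: ["cooling fan"],
}

candidate_data_set = CandidateDataSet()
for text_index, entities in entities_per_candidate.items():
    for entity in entities:
        candidate_data_set.add(entity, text_index)
```

Keeping the entity name and source index alongside each array makes it possible to trace a vector back to the candidate text it came from.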
[0224] The server may receive the target text sent by the terminal, for example, the target text may be a patent.
[0225] The server can obtain the candidate data set in at least the following two ways:
[0226] In the first possible implementation:
[0227] First, a text collection is obtained, the text collection includes n candidate texts, and n is an integer greater than or equal to 2. It can be understood that ...
Embodiment 3
[0330] As shown in Figure 7, an embodiment of the present application also provides a method for determining the novelty of text. The method is applied to an electronic device, which may be a server or a terminal. In this embodiment, a terminal is used as the example electronic device. The method specifically includes the following steps:
[0331] Step 401, determine the target text.
[0332] For example, the target text may be a patent or a paper. In this embodiment, the target text is described using a patent as an example.
[0333] Step 402, extract multiple target entities from the target text to obtain a target entity set.
[0334] In this example, the multiple target entities in the target text are extracted by the entity extraction model of Embodiment 1. Specifically, the target text is input into the entity extraction model, which identifies the multiple target entities in the target text, ...
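Step 402 can be sketched as a function that passes the target text to an extractor and collects the distinct results. This is a minimal illustration under stated assumptions: the trained entity extraction model of Embodiment 1 is replaced by a hypothetical lexicon lookup (`lexicon_extractor`), and the function names, lowercase normalization, and sample text are not taken from the source.

```python
from typing import Callable, Iterable

# Any callable that maps a text to the entities it finds can play the extractor role.
EntityExtractor = Callable[[str], Iterable[str]]


def lexicon_extractor(lexicon: set[str]) -> EntityExtractor:
    """Build a toy extractor that reports lexicon phrases found in the text.

    Hypothetical stand-in for the trained entity extraction model.
    """
    def extract(text: str) -> list[str]:
        lowered = text.lower()
        return [phrase for phrase in sorted(lexicon) if phrase in lowered]
    return extract


def build_target_entity_set(target_text: str, extractor: EntityExtractor) -> set[str]:
    """Run the extractor on the target text and keep each entity once."""
    return {e.strip().lower() for e in extractor(target_text) if e.strip()}


extractor = lexicon_extractor({"heat sink", "processor", "spring clip", "fan"})
target_text = (
    "The processor is cooled by a heat sink; "
    "the heat sink is held by a spring clip."
)
target_entity_set = build_target_entity_set(target_text, extractor)
```

Using a set means an entity mentioned several times in the target text, such as "heat sink" here, appears only once in the target entity set.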


