Data processing method and system, storage medium and electronic equipment

A data processing technology applied in the field of data processing. It addresses the problem that repeated text in complex structured data cannot be selected and deleted, which lowers the accuracy of the generated text summary data, and so achieves the effect of improving that accuracy.

Pending Publication Date: 2021-12-03
JD DIGITS HAIYI INFORMATION TECHNOLOGY CO LTD

AI Technical Summary

Problems solved by technology

[0003] Before a similarity algorithm is run, the data is usually preprocessed. At present, the commonly used preprocessing method is to remove special characters from the data (such as punctuation, brackets, and labels). However, with complex data objects (such as structured data), the result is still interfered with by non-special characters (letters, numbers, Chinese characters). Repeated text in complex structured data therefore cannot be selected and deleted, the generated text summary data contains repeated text, and the accuracy of the generated text summary data is reduced.
[0004] As a result, text summary data generated by existing methods has low accuracy.
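The limitation described in [0003] can be illustrated with a minimal sketch. The regular expression, the function name `strip_special`, and the sample records are assumptions made for illustration; they are not taken from the application. Removing special characters makes two plain sentences identical, but two structured records that carry the same message still differ because of their timestamps and identifiers.

```python
import re

# Assumed definition of "special characters": anything that is not a word
# character (letters, digits, underscore, Chinese characters) or whitespace.
_SPECIAL = re.compile(r"[^\w\s]")


def strip_special(text: str) -> str:
    """Replace special characters with spaces, then collapse whitespace."""
    return " ".join(_SPECIAL.sub(" ", text).split())


# Plain text: punctuation was the only difference, so the results match.
plain_a = "Payment failed!!!"
plain_b = "Payment failed."

# Structured records (hypothetical): same message, different ts and id.
log_a = '{"ts": 1638500001, "id": "a91f", "msg": "payment failed"}'
log_b = '{"ts": 1638500377, "id": "c07e", "msg": "payment failed"}'

print(strip_special(plain_a) == strip_special(plain_b))  # plain text pair
print(strip_special(log_a) == strip_special(log_b))      # structured pair
print(strip_special(log_a))
```

The digits and letters in the `ts` and `id` fields survive the cleanup, which is the interference from non-special characters that the paragraph describes.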


Embodiment Construction

[0053] The technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments of the application. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application, without creative effort, fall within the scope of protection of this application.

[0054] In this application, the terms "comprises", "includes", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus comprising a set of elements includes not only those elements but also other elements not specifically listed, or elements inherent in such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising a ..." does not...

Abstract

The invention discloses a data processing method and system, a storage medium, and electronic equipment. The method comprises: obtaining to-be-processed text data and performing data type identification on it to obtain a data type result; determining a corresponding feature configuration list based on the data type result; obtaining a corresponding extraction rule according to the data type result; extracting feature data from the feature configuration list based on the extraction rule; and, when the feature data meets a preset condition, generating text summary data based on a preset summary rule and the feature data. With this scheme, even for complex data structures that include non-special characters, feature extraction is performed per data type to obtain the corresponding feature data. This meets the requirements of automatic type identification, automatic feature extraction, and automatic text summary generation for complex data structures, and improves the accuracy of the resulting text summary data. In addition, running the similarity algorithm on the text summary data improves the accuracy of the similarity calculation result.
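The steps in the abstract can be sketched as a small pipeline. This is an illustrative reading only: the data types (`json`, `kv`, `plain`), the field names, the extraction functions, the preset condition (every configured feature must be found), and the summary rule (join the values in configured order) are all assumptions, since the listing does not specify them.

```python
import json
from typing import Callable, Dict, List, Optional

# Hypothetical feature configuration lists, keyed by data type.
FEATURE_CONFIG: Dict[str, List[str]] = {
    "json": ["level", "msg"],
    "kv": ["level", "msg"],
}


def identify_type(text: str) -> str:
    """Data type identification (assumed heuristics)."""
    s = text.strip()
    if s.startswith("{"):
        try:
            json.loads(s)
            return "json"
        except ValueError:
            pass
    return "kv" if "=" in s else "plain"


def extract_json(text: str, fields: List[str]) -> Dict[str, str]:
    obj = json.loads(text)
    return {f: str(obj[f]) for f in fields if f in obj}


def extract_kv(text: str, fields: List[str]) -> Dict[str, str]:
    pairs = dict(p.split("=", 1) for p in text.split() if "=" in p)
    return {f: pairs[f] for f in fields if f in pairs}


# Hypothetical extraction rules, also keyed by data type.
EXTRACTION_RULES: Dict[str, Callable[[str, List[str]], Dict[str, str]]] = {
    "json": extract_json,
    "kv": extract_kv,
}


def summarize(text: str) -> Optional[str]:
    data_type = identify_type(text)
    fields = FEATURE_CONFIG.get(data_type)
    rule = EXTRACTION_RULES.get(data_type)
    if fields is None or rule is None:
        return None  # no configuration for this data type
    features = rule(text, fields)
    # Assumed preset condition: every configured feature was found.
    if len(features) != len(fields):
        return None
    # Assumed summary rule: join feature values in configured order.
    return " | ".join(features[f] for f in fields)


rec_a = '{"ts": 1638500001, "level": "ERROR", "msg": "payment failed"}'
rec_b = '{"ts": 1638500377, "level": "ERROR", "msg": "payment failed"}'
print(summarize(rec_a))
print(summarize(rec_a) == summarize(rec_b))
```

Because the timestamp is not in the configured feature list, the two records produce the same summary, so a later similarity check would treat them as duplicates.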

Description

Technical field

[0001] The present application relates to the technical field of data processing, and more specifically, to a data processing method, system, storage medium and electronic equipment.

Background

[0002] In natural language processing tasks, whether two documents are similar is judged by calculating their degree of similarity with a similarity algorithm. For example, when discovering microblog hot topics with a clustering algorithm, the content similarity of each text must be measured so that microblogs with sufficiently similar content are grouped into one cluster; when preprocessing a corpus, duplicate text is selected and deleted based on text similarity.

[0003] Before a similarity algorithm is run, the data is usually preprocessed. At present, the commonly used preprocessing method is to remove special characters from the data (such as punctuation, brackets, and labels). Ho...
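The corpus deduplication mentioned in [0002] can be sketched as follows. The application does not name a specific similarity algorithm; Jaccard similarity over whitespace-separated tokens is swapped in here as a simple stand-in, and the threshold of 0.8 and the sample texts are assumptions. Whitespace tokenization would not suit Chinese text, which would need a word segmenter or character n-grams instead.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the lowercase token sets of two texts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty texts are treated as identical
    return len(ta & tb) / len(ta | tb)


def deduplicate(texts: List[str], threshold: float = 0.8) -> List[str]:
    """Keep a text only if it is not too similar to any text already kept."""
    kept: List[str] = []
    for t in texts:
        if all(jaccard(t, k) < threshold for k in kept):
            kept.append(t)
    return kept


docs = [
    "the payment service failed again",
    "the payment service failed again today",
    "weather is sunny",
]
print(deduplicate(docs))
```

The first two texts share 5 of 6 distinct tokens (about 0.83), so the second is dropped as a near duplicate, while the third is kept.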

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/194, G06F40/205, G06F16/34
CPC: G06F40/194, G06F40/205, G06F16/345
Inventor: 吴东
Owner: JD DIGITS HAIYI INFORMATION TECHNOLOGY CO LTD