
Method and device for identifying abnormal Chinese character strings

A recognition technique applied to the identification of abnormal Chinese character strings, addressing the problem of low accuracy when identifying abnormal Chinese character strings in text.

Active Publication Date: 2019-10-25
BEIJING GRIDSUM TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The main purpose of this application is to provide a method and device for identifying abnormal Chinese character strings, so as to solve the problem in the related art that improving the efficiency of identifying whether a text contains abnormal Chinese character strings results in low identification accuracy.


Examples


Embodiment Construction

[0019] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and embodiments.

[0020] To enable those skilled in the art to better understand the solution of the present application, the technical solution in the embodiments of the application is described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without creative effort shall fall within the scope of protection of this application.

[0021] It should be noted that the terms "first" and "second...


Abstract

The present application discloses a method and device for identifying abnormal Chinese character strings. The method comprises: determining the number of times to sample the character strings in the text to be processed; sampling the character strings in the text to be processed that number of times to obtain a sampled string set; calculating a Chinese character string proportion value from the sampled string set, wherein the Chinese character string proportion value is the proportion of Chinese character strings among all the character strings in the sampled string set; and identifying whether there is an abnormal Chinese character string in the text to be processed according to the Chinese character string proportion value. The present application solves the problem in the related art that improving the efficiency of identifying whether a text contains abnormal Chinese character strings results in low identification accuracy.
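The four steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the source does not say how a "character string" is delimited, how the number of samples is chosen, or which way the decision goes. Here a sampled string is assumed to be a fixed-length window at a random position, a string is assumed to count as Chinese when every character lies in the CJK Unified Ideographs block (U+4E00 to U+9FFF), and a text is assumed abnormal when the proportion falls below a threshold. All function names, `length`, `num_samples`, and `threshold` are hypothetical.

```python
import random


def is_chinese_string(s: str) -> bool:
    """Assumed test: non-empty and every character is in U+4E00..U+9FFF."""
    return bool(s) and all("\u4e00" <= ch <= "\u9fff" for ch in s)


def sample_strings(text: str, num_samples: int, length: int = 2, rng=None) -> list:
    """Draw num_samples fixed-length windows at random positions (with replacement)."""
    if rng is None:
        rng = random.Random()
    if len(text) < length:
        # Text too short for a full window: use the whole text, if any.
        return [text] if text else []
    last_start = len(text) - length
    starts = (rng.randint(0, last_start) for _ in range(num_samples))
    return [text[i:i + length] for i in starts]


def chinese_ratio(samples: list) -> float:
    """Proportion of sampled strings that are Chinese; 0.0 for an empty set."""
    if not samples:
        return 0.0
    return sum(is_chinese_string(s) for s in samples) / len(samples)


def looks_abnormal(text: str, num_samples: int = 100,
                   threshold: float = 0.5, seed=None) -> bool:
    """Assumed decision rule: abnormal when the Chinese proportion is below threshold."""
    rng = random.Random(seed)
    samples = sample_strings(text, num_samples, rng=rng)
    return chinese_ratio(samples) < threshold


print(looks_abnormal("自然语言处理是人工智能的重要方向", seed=0))        # False
print(looks_abnormal("http://ad.example.com/?id=12345&ref=abc", seed=0))  # True
```

Because only `num_samples` windows are inspected, the cost of the check does not grow with the length of the text, which is the efficiency motivation the abstract alludes to. The trade-off is that the result is an estimate whose reliability depends on the number of samples.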

Description

Technical field

[0001] The present application relates to the field of natural language processing, and in particular to a method and device for identifying abnormal Chinese character strings.

Background art

[0002] When natural language processing is performed on network texts, many abnormal texts may arise for system or non-system reasons. Such abnormalities include Chinese encoding errors, malicious advertising links, and so on. If the text to be processed is not checked for abnormalities before the natural language processing parsing task, problems such as unknown parsing errors or excessively long parsing times may result. Therefore, before text processing, some mechanism is needed to check the text to be processed for abnormalities. Usually, this is done by traversing all the characters in the string, counting each character, and formulating some filter conditions to judge whether there are abnormal Chinese strings in the text to be processed...
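For comparison, the conventional full-traversal check mentioned in the background might look like the sketch below. The source does not specify the counting scheme or the filter conditions, so the ones here (skip whitespace, count characters in U+4E00 to U+9FFF) and the function name are assumptions for illustration only.

```python
def chinese_char_ratio_full_scan(text: str) -> float:
    """Visit every character once and return the share of Chinese characters.

    Whitespace is ignored; an empty or all-whitespace text yields 0.0.
    """
    total = 0
    chinese = 0
    for ch in text:
        if ch.isspace():
            continue
        total += 1
        if "\u4e00" <= ch <= "\u9fff":
            chinese += 1
    return chinese / total if total else 0.0


print(chinese_char_ratio_full_scan("中文 ab"))  # 0.5
```

This scan is exact, but it touches every character, so its cost grows linearly with the length of the text.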


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62
CPC: G06F18/285
Inventor: 何鑫
Owner: BEIJING GRIDSUM TECH CO LTD