Named entity recognition and extraction using genetic programming

A genetic programming technique for named entity recognition and extraction. It addresses problems such as slow speed and low efficiency in producing recognition programs, with the stated effects of reducing false positive errors, improving results, and reducing manual input and errors.

Active Publication Date: 2021-09-07
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

Generating pattern programs for named entity recognition usually requires extensive expert programming effort, which is slow and inefficient.
In the era of big data and cloud-based services, service providers and platforms must handle entity recognition tasks across a large variety of data stream categories, more than manual programming can cover.



Embodiment Construction

[0019] This document describes techniques for generating pattern programs using genetic algorithms. The genetic algorithm operates on example data strings that represent the categories of data to be identified or extracted by named entity recognition. Such an example data string is called a "positive example" data string. The genetic algorithm can also operate on negative example data strings, which are the counterpart of positive examples, e.g., strings that are not the target of the named entity recognition task. In an initialization phase, initial pattern programs are generated from the positive example data strings. Starting from the initial pattern programs, genetic operations are performed iteratively to generate several generations of offspring pattern programs. In each round of genetic operations, the offspring pattern programs are generated through crossover operations and mutation operations....
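The loop described in [0019] can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: pattern programs are represented as lists of regular-expression tokens, fitness is the number of matched positive examples minus the number of matched negative examples, and the names `generalize`, `fitness`, `crossover`, `mutate`, and `evolve` are hypothetical.

```python
import random
import re

# Hypothetical token set used by mutation (an assumption, not from the patent).
TOKENS = [r"\d", "[a-z]", "[A-Z]", "."]

def generalize(ch):
    """Map one character to a regex token for its character class."""
    if "0" <= ch <= "9":
        return r"\d"
    if "a" <= ch <= "z":
        return "[a-z]"
    if "A" <= ch <= "Z":
        return "[A-Z]"
    return re.escape(ch)  # keep punctuation and other characters literal

def fitness(program, positives, negatives):
    """Assumed score: matched positives minus matched negatives."""
    rx = re.compile("".join(program))
    tp = sum(1 for s in positives if rx.fullmatch(s))
    fp = sum(1 for s in negatives if rx.fullmatch(s))
    return tp - fp

def crossover(a, b, rng):
    """One-point crossover: a prefix of one parent joined to a suffix of the other."""
    return a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]

def mutate(program, rng):
    """Replace one randomly chosen token with a random token."""
    if not program:
        return program
    child = list(program)
    child[rng.randrange(len(child))] = rng.choice(TOKENS)
    return child

def evolve(positives, negatives, generations=20, size=30, seed=0):
    rng = random.Random(seed)

    def score(program):
        return fitness(program, positives, negatives)

    # Initialization: one fully generalized program per positive example,
    # then partially generalized variants until the population is full.
    population = [[generalize(c) for c in s] for s in positives]
    while len(population) < size:
        example = rng.choice(positives)
        population.append(
            [generalize(c) if rng.random() < 0.7 else re.escape(c) for c in example]
        )

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: size // 2]  # the fittest half survives unchanged
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng)
            if rng.random() < 0.3:
                child = mutate(child, rng)
            children.append(child)
        population = parents + children

    return "".join(max(population, key=score))

if __name__ == "__main__":
    positives = ["138-0013", "555-1234", "020-9999"]
    negatives = ["abc-1234", "5551234", "555-12345"]
    print(evolve(positives, negatives))
```

Because the fittest half of each generation is carried over unchanged, the best score in the population never decreases between rounds. A real system would likely use a richer program representation and a fitness function that also accounts for pattern length or generality.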

Abstract

Disclosed herein are methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a pattern program using a genetic algorithm. The genetic algorithm operates on example data strings that represent the data categories to be recognized or extracted through named entity recognition. In the initialization stage, the initial pattern programs are generated based on example data strings that represent the data categories to be recognized or extracted through named entity recognition. Starting from the initial pattern programs, genetic operations are iteratively conducted to generate generations of offspring pattern programs. In each round of the genetic operation, offspring pattern programs are generated through the crossover operation and the mutation operation.

Description

Background

[0001] Advances in network and storage subsystem design continue to enable the processing of ever larger data flows between and within computer systems. At the same time, the content of such data streams has come under increasing scrutiny. For example, the collection, analysis and storage of personal data is subject to scrutiny and regulation. Organizations must ensure that personal data is collected legally under strict conditions. Organizations that collect and manage personal data have an obligation to protect it from misuse and unlawful use, and an obligation to respect the rights of data owners. Personal data or other sensitive data includes, but is not limited to, name, date of birth, place of birth, ID number, home address, credit card number, phone number, email address, URL, IP address, bank account number, etc.

[0002] Classifying and extracting personal or other sensitive data from data streams involves named entity recognition. ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/12
CPC: G06F40/295, G06F40/205, G06N3/126, G06N5/04
Inventor: 王德胜, 贾茜, 刘洋, 章鹏, 张谦, 郑鹏
Owner: ALIPAY (HANGZHOU) INFORMATION TECH CO LTD