A resume analysis method based on binaryzation

A resume parsing method based on dualization. It addresses problems in existing approaches, such as difficult implementation, a low success rate of label recognition, and the difficulty of building dictionary libraries and algorithm models, and it achieves good practicability and accurate information recognition.

Active Publication Date: 2019-06-28
深圳市前海欢雀科技有限公司

AI Technical Summary

Problems solved by technology

[0004] The three technologies above all have limitations that hinder resume parsing at the practical level. Patent CN105787047A offers no specific algorithm model for information extraction, only a conceptual solution: its matching-based extraction depends heavily on a powerful dictionary library and a complex algorithm model, both of which are very difficult to build. Patent CN107145584A mainly targets resumes from a known source. Such resumes usually follow a predetermined standard format in which each module carries a prefix keyword as an information prompt, so a prefix dictionary can be generated from those keywords to help segment and extract the resume content. Resumes without a known source, which often make up the majority, have no prefix keywords before their content, so this method cannot extract their information effectively. Patent CN107392143A uses an SVM to identify resume information in XML tags, but because XML templates vary widely, training on a limited sample yields a low tag recognition success rate. A large number of training samples is required, which makes the approach difficult to realize in practice.
Therefore, in view of the deficiencies of the above schemes in actual implementation, they should be corrected and improved. This design was created for that purpose: it provides a dualization-based resume parsing method that achieves the goal of resume parsing at the practical level.

Embodiment Construction

[0020] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations.

[0021] Accordingly, the following detailed description of the embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art wi...

Abstract

The invention discloses a resume analysis method based on binaryzation. The method comprises the following steps: reading resumes in batches; converting the batch-read resumes into HTML and TXT text formats; judging whether a resume converted into HTML format can be applied to an accurate-identification resume template; parsing the HTML-format resume information through XPath using regular expressions, scoring the parsed resume information, and judging whether the score is higher than a predefined threshold; using a BI-LSTM-CRF machine learning model built with TensorFlow to extract named entities from the resume information; extracting and identifying the resume information using a label dictionary combined with the named entities, and cutting the resume information into blocks; traversing the content of each resume block and storing the extracted resume information in a linked list; and storing the re-parsed resume information as JSON or XML structured data. According to the invention, resume information can be accurately identified on the basis of a limited resume sample.
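The two-path flow in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the template, its XPath expressions, the class names, the scoring rule (fraction of fields matched), and the threshold value of 0.8 are all assumptions made for illustration, and the BI-LSTM-CRF path is replaced by a trivial placeholder function.

```python
import json
import re
import xml.etree.ElementTree as ET

# Hypothetical template: field name -> (XPath, validating regular expression).
TEMPLATE = {
    "name": (".//span[@class='name']", re.compile(r"\S.*")),
    "email": (".//span[@class='email']", re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")),
    "phone": (".//span[@class='phone']", re.compile(r"\+?\d[\d -]{6,}\d")),
}
THRESHOLD = 0.8  # assumed value; the source only says "predefined threshold"


def parse_with_template(html):
    """Template path: return (fields, score), score = fraction of fields matched."""
    try:
        root = ET.fromstring(html)  # assumes well-formed (X)HTML input
    except ET.ParseError:
        return {}, 0.0
    fields = {}
    for key, (xpath, pattern) in TEMPLATE.items():
        node = root.find(xpath)
        text = "".join(node.itertext()).strip() if node is not None else ""
        if pattern.fullmatch(text):
            fields[key] = text
    return fields, len(fields) / len(TEMPLATE)


def parse_with_ner(txt):
    """Placeholder for the BI-LSTM-CRF path; only finds an e-mail by regex."""
    match = TEMPLATE["email"][1].search(txt)
    return {"email": match.group(0)} if match else {}


def parse_resume(html, txt):
    """Use the template result if it scores above the threshold, else fall back."""
    fields, score = parse_with_template(html)
    if score > THRESHOLD:
        return {"route": "template", "score": score, "fields": fields}
    return {"route": "ner", "score": score, "fields": parse_with_ner(txt)}


def to_json(result):
    """Serialize the parsed resume as structured JSON."""
    return json.dumps(result, ensure_ascii=False)
```

Calling `parse_resume(html, txt)` with the HTML and TXT renderings of one resume returns a dictionary recording which path was taken, and `to_json` serializes it. A real system would replace `parse_with_ner` with a trained model and a label dictionary.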

Description

Technical field

[0001] The present invention relates to a resume parsing method, in particular to a resume parsing method based on dualization.

Background technique

[0002] Resume parsing can be classified as a Natural Language Processing (NLP) task, an important part of which is Named Entity Recognition (NER). In resume parsing, the entities to be identified in the resume text include: name, email address, phone number, place of origin, school, major, dates in the education history, position, company name, dates in the work history, etc.

[0003] Similar resume analysis technologies in the prior art mainly include the following methods: (1) a method for extracting, analyzing, and converting resume information disclosed in patent CN105787047A, which includes a computer reading in the file path for storing resumes, reading the file stream, extracting the text content, outputting a large text character string, read in the extracted lar...
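To make the named entity recognition task in [0002] concrete, the following Python sketch groups per-token tags into entities. It assumes the common BIO tagging convention that sequence taggers such as BI-LSTM-CRF models often emit. The source does not state which tagging scheme is used, and the label names `NAME` and `SCHOOL` are hypothetical.

```python
def bio_to_entities(tokens, tags):
    """Group BIO-tagged tokens into a list of (label, text) entities.

    "B-X" starts an entity of type X, "I-X" continues it, and "O" (or an
    "I-" tag that does not match the open entity) closes any open entity.
    """
    entities = []
    label, parts = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if label is not None:
                entities.append((label, " ".join(parts)))
            label, parts = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            parts.append(token)
        else:
            if label is not None:
                entities.append((label, " ".join(parts)))
            label, parts = None, []
    if label is not None:
        entities.append((label, " ".join(parts)))
    return entities
```

Given the tokens of a sentence and one tag per token, the function returns the recognized spans in order, which could then be matched against a label dictionary as described in the abstract.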

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/21, G06F17/22
CPC: Y02D10/00
Inventor: 钟实, 陈少燕, 潘志锋
Owner: 深圳市前海欢雀科技有限公司