A Resume Accurate Parsing Method Based on SVM Text Classification

A parsing method built on text classification technology, applied to the accurate parsing of resumes based on SVM text classification. It addresses the problems of missing information in parsing results, erroneous parsing results, and loss of useful content information, and achieves high segmentation accuracy while avoiding parsing errors and information loss.

Active Publication Date: 2019-12-27
INST OF SOFTWARE - CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

The common approach converts the uploaded resume to plain text and looks up field-name keywords in it. This generally has two major disadvantages. First, if a resume keyword appears in the middle of a large block of text, it splits the original content, and the parsing result of this method will be completely wrong. Information is then missing from the parsing result, and the algorithm is not robust.
Second, if all files are converted to plain text on upload, the original resume format information is lost, and a lot of useful content information is lost with it.
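
The first failure mode can be reproduced with a short sketch of the keyword-dictionary approach. This is an illustration only: the dictionary `FIELD_KEYWORDS`, the function `keyword_parse`, and the sample resume are hypothetical, and "the content within a certain range" is assumed here to mean "everything up to the next keyword hit".

```python
# Hypothetical field-name dictionary; a real system would list many more.
FIELD_KEYWORDS = ["education", "work experience"]

def keyword_parse(text, keywords=FIELD_KEYWORDS):
    """Naive baseline: each keyword hit starts a field that runs to the next hit."""
    lowered = text.lower()
    hits = []
    for kw in keywords:
        start = lowered.find(kw)
        while start != -1:
            hits.append((start, kw))
            start = lowered.find(kw, start + 1)
    hits.sort()
    fields = {}
    for idx, (pos, kw) in enumerate(hits):
        end = hits[idx + 1][0] if idx + 1 < len(hits) else len(text)
        # A later hit of the same keyword overwrites the earlier field value.
        fields[kw] = text[pos + len(kw):end].strip()
    return fields

resume = (
    "Education\n"
    "2010-2014 Example University\n"
    "Work Experience\n"
    "2014-2018 Example Corp, built an online education platform for schools\n"
)
parsed = keyword_parse(resume)
for field, value in parsed.items():
    print(field, "->", repr(value))
```

Because the word "education" also occurs inside the work-experience text, the work-experience field is cut short and the education field ends up holding a fragment of the wrong section.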




Embodiment Construction

[0030] The present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.

[0031] As shown in Figure 1, the implementation steps of the present invention are as follows:

[0032] Introduction to each part of the method:

[0033] ● Resume format conversion technology

[0034] To avoid the problems caused by current parsing technology relying purely on pattern matching, the present invention first cuts the uploaded resume into large sections. A typical resume is divided into several basic modules, such as basic information, education experience, work experience, and project experience. The font, font size, or color of the module titles generally differs from that of the body content, and these differences are reflected in the XML format of the resume. The XML file adds a label to each line of the document. The content of the label includes font, font size, color, etc., which can be used ...
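
As a rough illustration of how per-line labels could be turned into features, the sketch below reads a small XML document and builds one vector per text line. The XML layout (`<line>` elements with `font`, `size`, and `color` attributes), the function `line_features`, and the choice of three features are all assumptions made for this example; the source does not show the actual schema or feature set.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical XML layout: one <line> element per resume text line.
# The real schema depends on the PDF-to-XML converter that is used.
SAMPLE_XML = """<resume>
  <line font="Arial-Bold" size="14" color="#1F4E79">Education Experience</line>
  <line font="Arial" size="10.5" color="#000000">2010-2014 Example University</line>
  <line font="Arial" size="10.5" color="#000000">Courses: algorithms, databases</line>
  <line font="Arial-Bold" size="14" color="#1F4E79">Work Experience</line>
  <line font="Arial" size="10.5" color="#000000">2014-2018 Example Corp</line>
</resume>"""

def line_features(xml_text):
    """Return (text, feature_vector) pairs, one per text line, in document order."""
    lines = ET.fromstring(xml_text).findall("line")
    sizes = [float(el.get("size")) for el in lines]
    # Treat the most common font size and color as the body style.
    body_size = Counter(sizes).most_common(1)[0][0]
    body_color = Counter(el.get("color") for el in lines).most_common(1)[0][0]
    result = []
    for el, size in zip(lines, sizes):
        vector = [
            size / body_size - 1.0,                               # relative size
            1.0 if "bold" in el.get("font", "").lower() else 0.0,  # bold font
            1.0 if el.get("color") != body_color else 0.0,         # non-body color
        ]
        result.append(((el.text or "").strip(), vector))
    return result

features = line_features(SAMPLE_XML)
for text, vector in features:
    print([round(v, 2) for v in vector], text)
```

Expressing size and color relative to the dominant body style, rather than as absolute values, keeps the vectors comparable across resumes that use different base fonts.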


Abstract

The present invention relates to a method for accurately parsing resumes based on SVM text classification. The steps are as follows: (1) operate Microsoft Office under the .NET Framework to convert resume files in various formats into PDF, and then convert the PDF into an XML file; (2) extract the label of each resume text line in the XML file and generate the corresponding feature vector; (3) label each resume text line, and train an SVM classifier on the label values and the feature vectors of the text lines; (4) cut each resume into blocks according to the obtained classifier, and parse and extract information block by block, thereby completing the accurate parsing of each resume.
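
Steps (3) and (4) can be sketched as follows. This is a minimal illustration under several assumptions, not the patented implementation: the labels are taken to mark section-title lines (+1) versus content lines (-1), the feature vectors are hypothetical `[relative size, bold, non-body color]` triples, and the classifier is a small linear soft-margin SVM trained by full-batch sub-gradient descent so the example runs without third-party libraries. The source does not state which kernel or SVM library is used.

```python
def score(w, b, x):
    """Signed distance proxy: positive means 'section title'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, eta=0.01, steps=5000):
    """Minimize lam/2*|w|^2 + mean hinge loss by sub-gradient descent."""
    n = len(X)
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(steps):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * score(w, b, xi) < 1.0:  # inside the margin: hinge is active
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - eta * g for wj, g in zip(w, grad_w)]
        b -= eta * grad_b
    return w, b

def cut_into_blocks(lines, w, b):
    """Start a new block at every line the classifier marks as a title."""
    blocks = []
    for text, x in lines:
        if score(w, b, x) > 0.0:
            blocks.append({"title": text, "lines": []})
        elif blocks:
            blocks[-1]["lines"].append(text)
        else:
            blocks.append({"title": "", "lines": [text]})
    return blocks

# Hypothetical labelled lines: +1 = section title, -1 = content.
TRAIN_X = [[0.33, 1, 1], [0.40, 1, 1], [0.20, 1, 0],
           [0.0, 0, 0], [0.0, 0, 0], [-0.05, 0, 0], [0.0, 0, 0], [0.05, 0, 0]]
TRAIN_Y = [1, 1, 1, -1, -1, -1, -1, -1]
w, b = train_linear_svm(TRAIN_X, TRAIN_Y)

RESUME = [
    ("Education Experience", [0.33, 1, 1]),
    ("2010-2014 Example University", [0.0, 0, 0]),
    ("Work Experience", [0.33, 1, 1]),
    ("2014-2018 Example Corp", [0.0, 0, 0]),
    ("Built an online education platform", [0.0, 0, 0]),
]
blocks = cut_into_blocks(RESUME, w, b)
for block in blocks:
    print(block["title"], "->", block["lines"])
```

Because the cut points come from layout features rather than keywords, a field name that appears inside body text does not split the block that contains it. Field extraction can then run separately within each block.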

Description

Technical field

[0001] The invention relates to an accurate resume parsing method based on SVM text classification. It belongs to the fields of natural language processing and pattern recognition, and integrates multiple technologies, including AC automaton search and operating Microsoft Word through .NET.

Background technique

[0002] At present, the resume-upload parsing solutions used by human resources websites on the market generally work as follows: convert the file uploaded by the user into plain text, list a dictionary of the field names that need to be parsed, and then look up the dictionary words in the resume. When a word is found, the content within a certain range after it is returned as the parsing result for that field. There are generally two big disadvantages in doing this: first, if a resume keyword appears in the middle of a large piece of text, it will disconnect the original content, and the analysis result of this method wi...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/00, G06F16/35, G06Q10/10
CPC: G06F16/35, G06Q10/1053, G06V30/414
Inventor: 毕翔, 薛云志, 刘张宇
Owner: INST OF SOFTWARE - CHINESE ACAD OF SCI