Data searching and capturing method of information of a plurality of high-end talents

A technology for high-end talent information data, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc., that addresses the problem of low accuracy of search results.

Inactive Publication Date: 2013-06-26
Foreign Talent Information Research Center, State Administration of Foreign Experts Affairs (国家外国专家局国外人才信息研究中心)

AI Technical Summary

Problems solved by technology

Enterprises that make simple search requests in ordinary Internet search engines, based on a few conditions such as keywords, website names, and time, obtain few results, and the accuracy of those results is not high.

Examples

Embodiment 1

[0017] Step 1: Prepare real resume pages.

[0018] Provide 5,000 resume pages, divided into ten groups of 500 pages each. All of the resumes are in English; they can be crawled from the Internet by computer using existing web crawler technology, or retrieved and screened manually.
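The grouping in Step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_groups` and the placeholder URLs are assumptions made for the example, and the real URL list is the one referred to in Figure 1.

```python
from typing import List


def split_into_groups(urls: List[str], group_size: int = 500) -> List[List[str]]:
    """Partition a list of URLs into consecutive groups of `group_size` items."""
    return [urls[i:i + group_size] for i in range(0, len(urls), group_size)]


# Placeholder URLs (hypothetical); in practice these come from a crawler
# or from manual retrieval and screening.
resume_urls = [f"http://example.com/resume/{i}" for i in range(5000)]
groups = split_into_groups(resume_urls)
```

With 5,000 URLs and a group size of 500, `groups` holds ten lists of 500 URLs each.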

[0019] The list of pre-prepared resume URLs is shown in Figure 1.

[0020] Step 2: Obtain the text content of each resume in the first group of resumes.

[0021] Manually extract the body text of each resume web page, that is, remove advertisements, page headers, page footers, and other non-body content; then remove the markup tags programmatically.
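The final tag-removal part of Step 2 might look like the sketch below, which uses only Python's standard `html.parser` module. It is an assumption about how "removing the label code by program" could be done; the class name `TextExtractor` and the helper `strip_tags` are hypothetical, and the manual removal of advertisements, headers, and footers described above is not covered.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self) -> str:
        return " ".join(self._parts)


def strip_tags(html: str) -> str:
    """Return the text of `html` with all markup tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

For example, `strip_tags("<h1>Jane Doe</h1><p>Senior <b>Engineer</b></p>")` yields the plain string `Jane Doe Senior Engineer`.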

[0022] Step 3: Count the total number of words, T, in each resume.

[0023] Use word segmentation technology (or manual processing) to further process the text content obtained in step 2, removing function words and retaining content words. Save all the words...
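A rough sketch of Step 3 for English text follows. Several points are assumptions rather than part of the source: the small `FUNCTION_WORDS` set stands in for whatever function-word list is actually used, tokenization is a simple regular expression rather than a real word segmenter, and T is taken to be the number of retained content words.

```python
import re
from typing import List, Set

# Illustrative function-word list (assumption); a real list would be far larger.
FUNCTION_WORDS: Set[str] = {
    "a", "an", "the", "of", "in", "on", "at", "to", "and", "or",
    "for", "with", "is", "are", "was", "were", "by", "as",
}


def content_words(text: str, function_words: Set[str] = FUNCTION_WORDS) -> List[str]:
    """Lowercase the text, split it into word tokens, and drop function words."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return [t for t in tokens if t not in function_words]


def total_word_count(text: str) -> int:
    """T: the number of content words retained for one resume (assumed definition)."""
    return len(content_words(text))
```

Applied to "Managed a team of engineers in the Beijing office", the retained words are `managed`, `team`, `engineers`, `beijing`, and `office`, so T is 5 under these assumptions.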

Embodiment 2

[0066] Step 1 to Step 16 of Embodiment 2 are exactly the same as Embodiment 1. After the sixteenth step, the following steps seventeen' to twenty-one' are also included.

[0067] Step seventeen': Calculate the final negative evaluation score A of the new web page.

[0068] Based on the same principle, take 10 groups of web pages that are not resumes, 500 pages per group, and, following steps 2 to 5, calculate the 100 most frequent words in each of the 10 groups. The scores of these top 100 words from the non-resume web pages are defined as negative scores. Using the top 100 words obtained in step 5 from the first group of non-resume web pages, score the new web page captured in step 16 according to the method in step 6 to obtain the first negative evaluation score A1 of the new web page, and so on, using the first 100 ...
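Extracting the most frequent words from one group of pages can be sketched as below. This covers only the "top 100 words per group" part: the scoring method of step 6 is not shown in this excerpt and is not reproduced here. The function name `top_words` is hypothetical, each page is assumed to be already reduced to a list of content words, and frequency is assumed to mean total occurrences across the group.

```python
from collections import Counter
from typing import List


def top_words(group_pages: List[List[str]], n: int = 100) -> List[str]:
    """Return the `n` most frequent words across all pages of one group."""
    counts = Counter()
    for words in group_pages:
        counts.update(words)
    return [word for word, _ in counts.most_common(n)]


# Tiny made-up group of non-resume pages, already tokenized (illustrative only).
group = [["price", "cart", "buy"], ["price", "cart"], ["price"]]
top_two = top_words(group, n=2)
```

Here `price` occurs three times and `cart` twice, so `top_two` is `["price", "cart"]`. Running the same function on each of the 10 non-resume groups would give the 10 word lists the paragraph refers to.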

Embodiment 3

[0078] Step 1: Prepare real resumes.

[0079] Provide 5,000 resumes, divided into ten groups of 500 resumes each. The resumes may all be in Chinese, Japanese, Korean, or any other language, and can be crawled by computer using existing web crawler technology, or retrieved and screened manually from the Internet. The remaining steps are the same as Step 2 to Step 19 of Embodiment 1.

Abstract

The invention relates to a method for searching and capturing massive amounts of high-end talent information data. By combining web crawler technology with data analysis technology, the method helps an enterprise build a talent information database. It is a method for building a resume database, and it has the advantages of being quick, efficient, accurate, and reliable.

Description

Technical field

[0001] The invention relates to a method for searching and capturing massive amounts of high-end talent information data.

Background technique

[0002] Enterprises are increasingly inclined to look for the high-level talent they need on the Internet. There are generally two ways of acquiring resumes. In the first, a recruitment portal website provides a resume registration system: applicants register their resumes on the website, and the company then searches the website's resume database for the talent it needs. In this way, the talent resources available to the company are limited to one or a few websites. In the second, enterprises make simple search requests in ordinary Internet search engines based on a few conditions such as keywords, website names, and time. The search results are few, and their accuracy is not high.

Contents of the invention

[0003] The technical problem to be solved by the present invent...

Application Information

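Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventors: 付俊生, 钟延光, 苏小鲁, 陈化北, 夏兵, 王勇
Owner: Foreign Talent Information Research Center, State Administration of Foreign Experts Affairs (国家外国专家局国外人才信息研究中心)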