Copyright resource identification method and copyright resource identification device

A copyright resource identification technology, applied in the computer field, that addresses the wasted human resources, lagging response, and low efficiency of existing approaches, saving human resources and improving efficiency while ensuring accuracy.

Active Publication Date: 2013-05-29
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

This method consumes human resources, is inefficient, and responds slowly. It also cannot find other copyright resources with the same content, and cannot assess copyright resources that have not been reported.
[0005] 2) The identification method based on topic retrieval performs identification by obtaining the title of the copyright resource. Since the content of the resource text is not identified,



Examples


Example Embodiment

[0061] Embodiment 1, taking document resources as an example.

[0062] Figure 1 is a flow chart of the method provided by Embodiment 1 of the present invention. As shown in Figure 1, the method includes the following steps:

[0063] Step S101: Search using the titles of existing copyright resources to obtain a positive sample corpus, and search using the titles of non-copyright resources to obtain a negative sample corpus.

[0064] Obtain the existing copyright resources and non-copyright resources. Extract the titles of the existing copyright resources and use them as search terms (queries) in a search engine to obtain search results. The search results include pages related to the titles of the copyright resources, specifically page titles, abstracts, site information, link information, and so on. The search results are combined with information on the existing copyright resources, including their titles and text content, to constitute the positive sample co...
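The corpus construction in step S101 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `build_corpus` function, the `SearchHit` dictionary layout, and the `fake_search` stand-in for a search engine are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List

# Hypothetical shape of one search hit: page title, abstract, site, and link.
SearchHit = Dict[str, str]
SearchFn = Callable[[str], List[SearchHit]]


def build_corpus(resources: List[Dict[str, str]], search: SearchFn) -> List[str]:
    """Build one corpus entry per resource from its title, text, and search hits."""
    corpus = []
    for res in resources:
        hits = search(res["title"])  # the resource title is used as the query
        parts = [res["title"], res.get("text", "")]
        for hit in hits:
            parts.extend(hit.get(k, "") for k in ("title", "abstract", "site", "link"))
        corpus.append(" ".join(p for p in parts if p))
    return corpus


def fake_search(query: str) -> List[SearchHit]:
    # Stand-in for a real search engine call (assumption); returns one canned hit.
    return [{
        "title": query + " - official page",
        "abstract": "publisher listing",
        "site": "example.com",
        "link": "http://example.com/item",
    }]


# Toy inputs, invented for illustration only.
copyrighted = [{"title": "Some Novel", "text": "chapter one"}]
non_copyrighted = [{"title": "Meeting notes template", "text": "agenda"}]

positive_corpus = build_corpus(copyrighted, fake_search)
negative_corpus = build_corpus(non_copyrighted, fake_search)
```

Passing the search function in as a parameter keeps the sketch independent of any particular search engine API.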

Example Embodiment

[0104] Embodiment 2

[0105] Figure 2 is a structural diagram of the device provided by Embodiment 2 of the present invention. As shown in Figure 2, the device may include: a training corpus acquisition module 101, a classification model establishment module 102, a to-be-predicted corpus acquisition module 103, a confidence acquisition module 104, and a recognition module 105.
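One way to picture how the five modules fit together is the sketch below. It is an assumption-laden illustration: the `CopyrightRecognizer` class, its field names, the toy lambdas, and the 0.5 threshold are hypothetical and are not taken from the patent, which only names the modules.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class CopyrightRecognizer:
    # Each field stands in for one module of the device (names are hypothetical).
    acquire_training_corpus: Callable[[], Tuple[List[str], List[str]]]  # module 101
    build_model: Callable[[List[str], List[str]], Any]                  # module 102
    acquire_prediction_corpus: Callable[[str], str]                     # module 103
    confidence: Callable[[Any, str], float]                             # module 104
    threshold: float = 0.5  # assumed decision threshold used by module 105
    model: Any = None

    def train(self) -> None:
        pos, neg = self.acquire_training_corpus()
        self.model = self.build_model(pos, neg)

    def recognize(self, title: str) -> bool:
        """Module 105: decide from the confidence whether this is a copyright resource."""
        corpus = self.acquire_prediction_corpus(title)
        return self.confidence(self.model, corpus) >= self.threshold


# Toy stand-ins: the "model" is just the set of words seen only in positive samples.
recognizer = CopyrightRecognizer(
    acquire_training_corpus=lambda: (["novel publisher edition"], ["meeting agenda template"]),
    build_model=lambda pos, neg: set(" ".join(pos).split()) - set(" ".join(neg).split()),
    acquire_prediction_corpus=lambda title: title.lower(),
    confidence=lambda model, corpus: (
        sum(w in model for w in corpus.split()) / max(len(corpus.split()), 1)
    ),
)
recognizer.train()
```

With these toy stand-ins, `recognizer.recognize("Novel Publisher")` returns `True` and `recognizer.recognize("Agenda")` returns `False`.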

[0106] The training corpus acquisition module 101 is configured to search using the titles of existing copyright resources to obtain a positive sample corpus, and to search using the titles of non-copyright resources to obtain a negative sample corpus.

[0107] The training corpus acquisition module 101 acquires the titles of existing copyright resources and non-copyright resources and uses the acquired titles as queries in a search engine to obtain search results. These search results include pages related to the titles of existing copyright resources, specifically including p...


Abstract

The invention provides a copyright resource identification method and a copyright resource identification device. The method includes the following steps: S1, a positive sample corpus and a negative sample corpus are acquired using the titles of existing copyright resources and non-copyright resources; S2, classification features of the positive and negative sample corpora are extracted, the weight of each classification feature in its category is obtained through machine-learning training, and a classification model is established; S3, to-be-recognized resources are acquired, and steps S31 to S33 are carried out on each of them; S31, a to-be-predicted corpus is acquired using the title of the to-be-recognized resource; S32, classification features of the to-be-predicted corpus are extracted, and the confidence that the to-be-recognized resource belongs to the copyright resources or the non-copyright resources is determined according to the established classification model; and S33, according to that confidence, the to-be-recognized resource is recognized as a copyright resource or not. The method and the device can ensure accuracy and recall, save human resources, and improve efficiency.
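Steps S2, S32, and S33 can be illustrated with a small word-feature classifier. The abstract does not say which learning algorithm is used, so the sketch below swaps in a naive Bayes model with add-one smoothing as an assumption; the function names, the equal class priors, the toy corpora, and the 0.7 threshold are likewise hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List

Model = Dict[str, Dict[str, float]]


def train(pos: List[str], neg: List[str]) -> Model:
    """S2: learn a per-class log-weight for each word feature (add-one smoothing)."""
    counts = {"copyright": Counter(), "non_copyright": Counter()}
    for doc in pos:
        counts["copyright"].update(doc.lower().split())
    for doc in neg:
        counts["non_copyright"].update(doc.lower().split())
    vocab = set(counts["copyright"]) | set(counts["non_copyright"])
    model: Model = {}
    for label, c in counts.items():
        total = sum(c.values()) + len(vocab)
        model[label] = {w: math.log((c[w] + 1) / total) for w in vocab}
    return model


def confidence(model: Model, corpus: str) -> float:
    """S32: posterior probability that the corpus belongs to the copyright class."""
    scores = {}
    for label, weights in model.items():
        # Words outside the training vocabulary are ignored; equal priors assumed.
        scores[label] = sum(weights[w] for w in corpus.lower().split() if w in weights)
    top = max(scores.values())
    exp = {k: math.exp(v - top) for k, v in scores.items()}
    return exp["copyright"] / sum(exp.values())


def is_copyright(model: Model, corpus: str, threshold: float = 0.7) -> bool:
    """S33: recognize the resource by comparing confidence to an assumed threshold."""
    return confidence(model, corpus) >= threshold


# Toy corpora, invented for illustration only.
pos = ["novel publisher edition isbn", "album publisher record label"]
neg = ["meeting agenda template", "homework exercise template"]
model = train(pos, neg)
```

Here `is_copyright(model, "novel publisher")` is `True` and `is_copyright(model, "agenda template")` is `False`. In practice the to-be-predicted corpus would be built from search results for the resource title, as in S31.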

Description

【Technical field】

[0001] The present invention relates to the field of computer technology, in particular to a copyright resource identification method and device.

【Background technique】

[0002] With the continuous development of network technology, people are becoming more and more accustomed to using the Internet to share and acquire resources. Some resource sharing platforms, such as Baidu Tieba, Baidu Wenku, MP3, Video, Douding.com, Daoke Baba, etc., are open platforms for netizens to share documents, audio, video and other resources online. Users can freely upload resources for sharing, watch videos and listen to songs online on the platform, and browse or download online documents and materials in various fields such as courseware, exercises, exam question banks, thesis reports, professional materials, official letter templates, legal documents, and literary novels. The resources accumulated on the open platform are all uploaded by users. Th...


Application Information

IPC(8): G06F17/30
Inventors: 徐兴军, 吴羡, 刘婵
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD