Data mining method based on multi-source heterogeneous patent data semantic integration

A data mining and semantic integration technology, applied in the computer field, that addresses problems such as data sparseness and improves extraction accuracy, data mining efficiency, and mining accuracy.

Inactive Publication Date: 2014-03-26
肖冬梅
Cites: 1; Cited by: 13

AI Technical Summary

Problems solved by technology

The article "Overview of Text-Oriented Ontology Learning Research", published in the second issue of "Computer Science" in 2007, discloses text-oriented ontology learning techniques, in particular a "statistics-based method" that extracts concepts and the relationships between concepts by calculating term frequencies. This method, however, has the disadvantage of generating sparse data.


Embodiment Construction

[0043] The present invention will be further described below in conjunction with accompanying drawings.

[0044] A data mining method based on semantic integration of multi-source heterogeneous patent data, comprising the following steps in sequence:

[0045] The first step is to use global patent data and inter-translation dictionaries as data sources to automatically perform ontology learning to construct a patent global ontology. Patent documents in the patent database are structured information. By combining professional dictionaries and inter-translation dictionaries, word frequency statistical methods can be used to complete patent subject identification.
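The word-frequency idea in the first step can be illustrated with a minimal sketch. It assumes each patent record has already been split into candidate terms; the dictionary contents and the names `TRANSLATION`, `DOMAIN_TERMS`, and `subject_terms` are hypothetical and are not taken from the patent.

```python
from collections import Counter

# Hypothetical inter-translation dictionary: maps a term in another
# language to a canonical English term.
TRANSLATION = {"本体": "ontology", "数据挖掘": "data mining", "专利": "patent"}

# Hypothetical professional dictionary of accepted domain terms.
DOMAIN_TERMS = {"ontology", "data mining", "patent", "semantic integration"}


def normalize(term):
    """Translate a term if the dictionary knows it, otherwise lowercase it."""
    return TRANSLATION.get(term, term.lower())


def subject_terms(term_lists, top_n=3):
    """Count normalized domain terms across records and return the most frequent."""
    counts = Counter()
    for terms in term_lists:
        for term in terms:
            canonical = normalize(term)
            if canonical in DOMAIN_TERMS:
                counts[canonical] += 1
    return counts.most_common(top_n)


records = [
    ["专利", "Ontology", "widget"],
    ["patent", "本体", "数据挖掘"],
    ["Patent"],
]
top = subject_terms(records, top_n=2)
```

Normalizing through the translation dictionary before counting lets the same concept accumulate frequency across languages, which is one plausible way to reduce the sparseness that the background section attributes to purely statistical methods.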

[0046] The second step is to construct a function for judging the similarity of individual information in each data source according to the corresponding attributes. Individual data in different data sources have their corresponding unique attributes, and the functions used to judge the similarity of each correspondi...
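Because the paragraph above is truncated, the patent's actual similarity functions are not visible here. The following is only a sketch of the general pattern it describes, one similarity function per attribute, under the assumption that records are dictionaries with `title` and `ipc` fields. The function names, the choice of Jaccard and prefix matching, and the weights are all illustrative assumptions.

```python
def jaccard(a, b):
    """Token-set Jaccard similarity between two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def ipc_similarity(a, b):
    """Length of the shared leading prefix divided by the longer code length."""
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    longest = max(len(a), len(b))
    return shared / longest if longest else 1.0


# Hypothetical registry: each attribute gets its own similarity function.
SIMILARITY = {"title": jaccard, "ipc": ipc_similarity}
# Hypothetical weights for combining attribute scores.
WEIGHTS = {"title": 0.7, "ipc": 0.3}


def record_similarity(r1, r2):
    """Weighted average over the attributes present in both records."""
    total = 0.0
    weight_sum = 0.0
    for attr, fn in SIMILARITY.items():
        if attr in r1 and attr in r2:
            total += WEIGHTS[attr] * fn(r1[attr], r2[attr])
            weight_sum += WEIGHTS[attr]
    return total / weight_sum if weight_sum else 0.0
```

Keeping the functions in a registry keyed by attribute name means a new data source with a new attribute only needs one more entry, rather than a change to the comparison logic.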


Abstract

The invention discloses a data mining method based on semantic integration of multi-source heterogeneous patent data. The method comprises the following steps in sequence: (1) global patent data and inter-translation dictionaries are taken as data sources, ontology learning is performed, and a patent global ontology is constructed; (2) a function for judging the similarity of individual information in each data source is constructed according to the corresponding attributes; (3) using the constructed similarity functions, similarity information for individuals in different data sources is calculated under the guidance of the patent global ontology; (4) data mining is performed according to the similarity information. With this method, the extraction precision of concepts and concept relations is greatly improved, patent global ontology learning is achieved, and data mining efficiency and accuracy are greatly improved.
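Steps (3) and (4) can be pictured with a small sketch. It assumes the global ontology is available as a mapping from surface terms to concepts and treats "data mining" simply as grouping linked records. The ontology fragment, the record layout, the threshold, and the grouping strategy are hypothetical choices for illustration, not the method the patent claims.

```python
from itertools import combinations

# Hypothetical fragment of a global ontology: surface term -> concept.
ONTOLOGY = {
    "data mining": "DataMining",
    "knowledge discovery": "DataMining",
    "ontology learning": "OntologyLearning",
    "ontology construction": "OntologyLearning",
}


def concepts(record):
    """Map a record's terms to ontology concepts, dropping unknown terms."""
    return {ONTOLOGY[t] for t in record["terms"] if t in ONTOLOGY}


def concept_similarity(r1, r2):
    """Jaccard similarity computed on ontology concepts, not raw terms."""
    c1, c2 = concepts(r1), concepts(r2)
    union = c1 | c2
    return len(c1 & c2) / len(union) if union else 0.0


def group_similar(records, threshold=0.5):
    """Link records from different sources whose similarity meets the
    threshold, then return the linked groups of record ids (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        different_source = records[i]["source"] != records[j]["source"]
        if different_source and concept_similarity(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, record in enumerate(records):
        groups.setdefault(find(i), []).append(record["id"])
    return sorted(groups.values())


records = [
    {"id": "CN-1", "source": "cn", "terms": ["data mining"]},
    {"id": "US-1", "source": "us", "terms": ["knowledge discovery"]},
    {"id": "EP-1", "source": "ep", "terms": ["ontology learning"]},
]
groups = group_similar(records)
```

In this toy input, "data mining" and "knowledge discovery" share no tokens, yet the two records are linked because the ontology maps both terms to the same concept. That is the sense in which the ontology guides the similarity calculation in this sketch.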

Description

Technical field

[0001] This invention relates to the computer field, in particular to a data mining method based on semantic integration of multi-source heterogeneous patent data.

Background technique

[0002] An ontology, as a clear and formal specification of a shared conceptual model, has a great advantage in establishing a common understanding of different information. Ontologies are therefore widely used in fields related to semantic integration. In practice, applying ontologies in any field depends on first constructing one, and ontology construction remains a tedious and laborious but crucial task.

[0003] The master's thesis titled "Intelligent Search Engine Based on Semantic Technology" published by Beijing University of Posts and Telecommunications in 2009 discloses an intelligent system for realizing semantic retrieval by constructing an ontology. Wherein, the ontology in the disclosed technical solution is manually established by domain exp...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/374, G06F16/335, G06F16/835
Inventor: 肖冬梅, 程戈, 吕宁, 杨萍, 林政均, 方舟之
Owner: 肖冬梅