
Method and system for judging text region based on RDF knowledge base

The method applies knowledge-base technology to text region analysis (special data processing applications, electronic digital data processing). It addresses two problems in prior methods: the inability to extract and judge the regional attributes of a text, and the failure to guarantee recall. It achieves good scalability, improved recall, and high accuracy.

Active Publication Date: 2017-04-19
XIAMEN MEIYA PICO INFORMATION

AI Technical Summary

Problems solved by technology

In this case, the method cannot extract and judge the geographical attributes of the text.
[0006] Chinese patent publication CN105608072A proposes a text-related analysis method. Although that solution can guarantee accuracy, it cannot guarantee recall.



Examples


Embodiment 1

[0099] Referring to Figure 2, the first embodiment of the present invention is a method for judging a text region based on an RDF knowledge base, comprising the following steps:

[0100] S1: Build an RDF knowledge base of geographic information and index it. The geographic information includes the regional divisions and their related terms.

[0101] S2: Preset the level of each regional division; for example, set the level of provinces to 1, prefectures and cities to 2, and districts and counties to 3.

[0102] S3: Preset the position weight of each region-related term according to where the term appears in the text; further, a first position weight for terms in the title and a second position weight for terms in the body can be preset separately.

[0103] S4: Obtain the region-related terms in the text, such as landmark buildings, schools, highways, airports, companies, activities, events, and sports events.

[...
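Steps S2 to S4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numeric position weights, the `TermHit` type, the `find_region_terms` helper, and the sample vocabulary are hypothetical and do not come from the patent text, and plain substring matching stands in for whatever term extraction the actual method uses.

```python
from dataclasses import dataclass

# S2: level presets as given in the text (province 1, prefecture/city 2, district/county 3).
REGION_LEVELS = {"province": 1, "city": 2, "district": 3}

# S3: position weights. The numeric values are assumptions for illustration only.
TITLE_WEIGHT = 2.0  # "first position weight" (title)
BODY_WEIGHT = 1.0   # "second position weight" (body)


@dataclass(frozen=True)
class TermHit:
    term: str
    position: str  # "title" or "body"
    weight: float


def find_region_terms(title, body, vocabulary):
    """S4: report each known region-related term found in the title or body."""
    hits = []
    for term in sorted(vocabulary):
        if term in title:
            hits.append(TermHit(term, "title", TITLE_WEIGHT))
        if term in body:
            hits.append(TermHit(term, "body", BODY_WEIGHT))
    return hits


# Hypothetical vocabulary of region-related terms taken from the knowledge base.
vocabulary = {"Gulangyu", "Xiamen University", "Gaoqi Airport"}
hits = find_region_terms(
    "Concert held on Gulangyu",
    "Visitors flew into Gaoqi Airport before taking the ferry to Gulangyu.",
    vocabulary,
)
for hit in hits:
    print(hit)
```

A term that appears in both the title and the body yields two hits here, one per position, so a later scoring step can decide whether to sum the weights or keep only the larger one.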

Embodiment 2

[0122] This embodiment further expands step S1 of the first embodiment. The RDF knowledge base can be built from regional data held in a traditional table structure. Specifically, concepts and relationships are defined through an ontology (a conceptual modeling tool that can describe information systems at the semantic and knowledge levels). A data conversion tool (Apache D2RQ or the R2RML mapping language) and a preset mapping file are used to map and export the table-structured regional data, which is then imported into an RDF triple store. SPARQL statements retrieve all dimensional information of the regional divisions, that is, the region-related terms; the documents corresponding to the region-related terms are obtained and their region terms recorded; and a full-text search engine (Lucene or Solr) is used to index the documents. Figure 3 is a schematic diagram of the Ontology d...
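To make the table-to-triples idea concrete, the following is a minimal plain-Python sketch. It does not use D2RQ, R2RML, SPARQL, Lucene, or Solr; it only imitates the data flow with tuples and a dictionary. The namespace `http://example.org/geo/`, the predicate names (`name`, `level`, `partOf`, `relatedTerm`), and the sample rows are all assumptions, since the source does not give the ontology or the mapping file.

```python
from collections import defaultdict

# Hypothetical namespace; the patent's actual ontology URIs are not given in the source.
NS = "http://example.org/geo/"

# Table-structured regional data (sample rows for illustration).
rows = [
    {"id": "fujian", "name": "Fujian", "level": 1, "parent": None},
    {"id": "xiamen", "name": "Xiamen", "level": 2, "parent": "fujian"},
    {"id": "siming", "name": "Siming District", "level": 3, "parent": "xiamen"},
]
# Region-related terms and the division each one belongs to (sample data).
related_terms = [("Gulangyu", "siming"), ("Xiamen University", "siming")]


def rows_to_triples(rows, related_terms):
    """Map each table row to (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        subject = NS + row["id"]
        triples.append((subject, NS + "name", row["name"]))
        triples.append((subject, NS + "level", row["level"]))
        if row["parent"] is not None:
            triples.append((subject, NS + "partOf", NS + row["parent"]))
    for term, region_id in related_terms:
        triples.append((NS + region_id, NS + "relatedTerm", term))
    return triples


def build_term_index(triples):
    """Index every name or related term to the divisions it points at."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in (NS + "name", NS + "relatedTerm"):
            index[obj].add(subject)
    return dict(index)


triples = rows_to_triples(rows, related_terms)
term_index = build_term_index(triples)
print(term_index["Gulangyu"])
```

In a real deployment the dictionary lookup would be replaced by a query against the full-text index, and the triples would live in an RDF store rather than a Python list.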

Embodiment 3

[0131] Referring to Figure 6, this embodiment is a text region judgment system based on an RDF knowledge base, corresponding to the above embodiments, comprising:

[0132] The construction module 1, used to construct and index the RDF knowledge base of geographic information, the geographic information including the regional divisions and their related terms;

[0133] The first preset module 2, used to preset the level of each regional division;

[0134] The second preset module 3, used to preset the position weight of each region-related term according to where the term appears in the text;

[0135] The first acquisition module 4, used to acquire the region-related terms in the text;

[0136] The search module 5, configured to use a region-related term as a keyword, search the RDF database to obtain the corresponding set of regional divisions, and obtain the respective set of regional divisions corresponding to all geographically relat...


Abstract

The invention discloses a method and a system for judging a text region based on an RDF knowledge base. The method comprises the following steps: constructing an RDF knowledge base of regional information and indexing it; presetting the levels of the regional divisions; presetting the position weights of region-related terms; acquiring the region-related terms in a text; acquiring the set of regional divisions corresponding to each region-related term in the text; calculating, for each regional division in each set, a first value for the corresponding region-related term; acquiring the set of region-related terms corresponding to each regional division across all the sets; accumulating the first values of each regional division to obtain its second value; and calculating, from the second value, the probability that each regional division is the region of the text. Region labeling of the text is thus realized on the basis of the RDF knowledge base, which improves the recall of the method while maintaining relatively high accuracy.
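The scoring flow in the abstract (first value per term and division, second value per division, then a probability) can be sketched as follows. The abstract does not state how the first value or the probability is computed, so two assumptions are made here and should be read as placeholders: the first value is taken as position weight times division level, and the probability is the second value normalized over all candidate divisions. The function name `score_regions` and the sample data are hypothetical.

```python
from collections import defaultdict


def score_regions(term_weights, term_regions, region_levels):
    """Return an assumed probability for each candidate regional division.

    term_weights:  region-related term -> position weight
    term_regions:  region-related term -> set of candidate divisions
    region_levels: division -> preset level (1, 2, 3, ...)
    """
    second_values = defaultdict(float)
    for term, weight in term_weights.items():
        for region in term_regions.get(term, ()):
            # Assumed first value: position weight times division level.
            first_value = weight * region_levels[region]
            # Second value: accumulated first values for this division.
            second_values[region] += first_value
    total = sum(second_values.values())
    if total == 0:
        return {}
    # Assumed probability: second value normalized over all divisions.
    return {region: value / total for region, value in second_values.items()}


term_weights = {"Gulangyu": 2.0, "Gaoqi Airport": 1.0}
term_regions = {
    "Gulangyu": {"Siming District", "Xiamen"},
    "Gaoqi Airport": {"Huli District", "Xiamen"},
}
region_levels = {"Xiamen": 2, "Siming District": 3, "Huli District": 3}

probabilities = score_regions(term_weights, term_regions, region_levels)
for region, p in sorted(probabilities.items(), key=lambda item: -item[1]):
    print(f"{region}: {p:.2f}")
```

The sketch keeps a single weight per term for brevity; a term found in both the title and the body would need a rule for combining its two position weights.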

Description

Technical field

[0001] The present invention relates to the technical field of text region analysis, in particular to a method and system for judging a text region based on an RDF knowledge base.

Background

[0002] The explosive growth of network data places more and higher demands on data analysis. Text analysis and mining is now widely used: the semantic content of a text is extracted through appropriate techniques, and operations such as classification and clustering are then performed on the text. It is mainly used in product recommendation, public opinion analysis, text search, and other fields.

[0003] Public opinion analysis requires sorting and analyzing online public opinion under different themes, such as public opinion hotspots and development trends in different regions. For this reason, extracting and judging the geographical information involved in the cont...


Application Information

IPC(8): G06F17/30
Inventor: 李晟, 段思欣, 栾江霞, 黄钦泉, 章正道, 王备战
Owner: XIAMEN MEIYA PICO INFORMATION