
Domain term semantic drift extraction method

A method for extracting the semantic drift of domain terms, applied in the field of text semantic extraction. It addresses two problems: manually mining domain terms and their time- and region-related semantics from massive professional text is unrealistic, and keyword-based retrieval has a low recall rate and cannot provide semantic retrieval results for a term.

Pending Publication Date: 2020-04-21
HARBIN ENG UNIV

AI Technical Summary

Problems solved by technology

[0004] Generally speaking, manually mining domain terms and their time- and region-related semantics from massive professional texts is unrealistic. Traditional information retrieval systems based on keywords and Boolean retrieval have a recall rate of only about 20% and cannot provide semantic search results for terms. Modern search technology supported by artificial intelligence provides semantic search results by introducing natural language understanding, but no relevant research has considered this kind of semantic drift across time and space.



Examples


Embodiment 1

[0121] To obtain a quadruple (domain term, time, region, semantics) that describes the semantic drift of a domain term, two kinds of content must be extracted from the domain corpus: the domain terms themselves, and their time- and region-related semantics. Figure 1 shows the overall flow chart of the method. The process in the dotted box marked with number 1 represents domain term extraction: starting from the domain corpus, it goes through step A1 (rule-based candidate domain term extraction), step A2 (statistical candidate domain term screening) and step A3 (domain term filtering based on small-sample learning) to obtain the final domain term set. The process in the dotted box marked with number 2 represents extraction of the time- and region-related semantics of domain terms: it takes the extracted domain terms and the domain corpus as input, and goes through step B1 (rule-based extraction of candidate time-region-related semantics for domain terms), step B2 (filtering o...
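The flow described above can be illustrated with a minimal Python sketch. This is not the patented implementation: the toy corpus, the quoted-phrase rule for A1, the frequency threshold for A2, the predicate standing in for the small-sample learned filter in A3, and the sentence pattern for B1 are all assumptions made for illustration. Step B2 is omitted because its description is truncated in this excerpt.

```python
import re
from collections import Counter

# Hypothetical toy corpus (assumption); real input would be domain text
# such as social insurance laws, regulations and policies.
CORPUS = [
    'In 2018, the "deductible line" in Beijing is 1,800 yuan for in-service employees.',
    'In 2018, the "deductible line" in Shanghai is 1,500 yuan for in-service employees.',
    'In 2018, the "annual report" in Beijing is due in March.',
]

def a1_candidates(corpus):
    """Step A1 (rule-based). Assumed rule: every double-quoted phrase is a candidate."""
    return [m for text in corpus for m in re.findall(r'"([^"]+)"', text)]

def a2_screen(candidates, min_count=2):
    """Step A2 (statistical). Assumed statistic: raw frequency >= min_count."""
    counts = Counter(candidates)
    return {term for term, n in counts.items() if n >= min_count}

def a3_filter(terms, is_domain_term):
    """Step A3. A plain predicate stands in for the small-sample learned filter."""
    return {term for term in terms if is_domain_term(term)}

# Step B1 (rule-based). Assumed sentence pattern: year, quoted term, region, semantics.
B1_RULE = re.compile(r'In (\d{4}), the "([^"]+)" in (\w+) is (.+?)\.$')

def b1_extract(corpus, terms):
    """Return candidate quadruples (domain term, time, region, semantics)."""
    quads = []
    for text in corpus:
        m = B1_RULE.match(text)
        if m and m.group(2) in terms:
            year, term, region, semantics = m.groups()
            quads.append((term, year, region, semantics))
    return quads

terms = a3_filter(a2_screen(a1_candidates(CORPUS)), lambda t: True)
quads = b1_extract(CORPUS, terms)
```

In this sketch the second stage only sees terms that survived all three filters of the first stage, which mirrors the dependency between the two dotted boxes in the flow chart.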



Abstract

The invention belongs to the technical field of text semantic extraction, and in particular relates to a method for extracting the semantic drift of domain terms. Whenever term semantics in a professional field have distinct time and region characteristics (for example, laws, regulations and policies in the social insurance field), the method can be used to extract quadruples (domain term, time, region, semantics), which describe the semantic drift of the domain terms.

Description

Technical field

[0001] The invention belongs to the technical field of text semantic extraction, and in particular relates to a method for extracting the semantic drift of domain terms.

Background art

[0002] For terms in certain professional fields, the interpretation or definition (that is, the semantics) changes with time and region. A clear example is terminology in the legal field, such as the term "deductible line" in the laws, regulations and policies related to social insurance (medical insurance). In the context of Beijing medical insurance in 2018 it was interpreted as 1,300 yuan for retirees and 1,800 yuan for in-service employees; in the same year, in the context of Shanghai medical insurance, it was interpreted as 700 yuan for retirees and 1,500 yuan for in-service employees.

[0003] The objects to be processed and extracted all come from a large amount of text in a certain professional field. As a representative of ...
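The "deductible line" example can be written down as quadruples (domain term, time, region, semantics). The following is a minimal sketch, assuming a simple in-memory representation; the class name `TermSense` and the helpers `lookup` and `drifts` are hypothetical and do not come from the patent. The figures are the ones quoted in paragraph [0002].

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TermSense:
    """One quadruple: (domain term, time, region, semantics)."""
    term: str
    time: str
    region: str
    semantics: str

# Values taken from the Beijing / Shanghai example in paragraph [0002].
SENSES = [
    TermSense("deductible line", "2018", "Beijing",
              "1,300 yuan for retirees; 1,800 yuan for in-service employees"),
    TermSense("deductible line", "2018", "Shanghai",
              "700 yuan for retirees; 1,500 yuan for in-service employees"),
]

def lookup(senses, term, time, region):
    """Return the semantics recorded for a term in a given time and region."""
    return [s.semantics for s in senses
            if (s.term, s.time, s.region) == (term, time, region)]

def drifts(senses, term):
    """True if the term has more than one distinct semantics across contexts."""
    return len({s.semantics for s in senses if s.term == term}) > 1
```

Here `lookup(SENSES, "deductible line", "2018", "Shanghai")` returns the Shanghai interpretation only, and `drifts(SENSES, "deductible line")` is true because the two regions give different semantics for the same term in the same year.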


Application Information

IPC(8): G06F40/289, G06F40/30
CPC: Y02D10/00
Inventor: 黄少滨, 李轶, 李熔盛, 申林山, 何杰, 李泽松, 张柏嘉, 颜伟
Owner HARBIN ENG UNIV