URL-based online semantic mining system for Chinese polysemous words

A semantic mining technique for polysemous words, applied in fields such as instruments, network data indexing, and other database retrieval. It addresses problems such as a slow mining process, failure to make full use of online semantic information, and low efficiency.

Inactive Publication Date: 2014-01-01
EAST CHINA NORMAL UNIV
1 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

Online semantic mining requires downloading the relevant webpages, and downloading webpages is very time-consuming, which makes the mining process extremely slow. Earlier semantic mining methods are therefore inefficient.
In short, earlier semantic mining methods were mostly based on text processing and failed to make full use of other online semantic information.



Examples


Embodiment

[0030] The invention takes a Chinese polysemous word as input and obtains the search results for that word online. The invention is described further below with reference to the accompanying drawings, taking the Chinese polysemous word "bib" as an example.

[0031] Figure 1 is an overall flow chart of the invention. In Figure 1, the Internet resources that the invention references are shown in dashed-line boxes, and the invention's own modules are shown in solid-line boxes. Module 1 is a URL-based semantic classification module, and module 2 is a semantic generation module. Specifically, the Chinese polysemous word is received in the "search engine search" module, and the obtained search results (webpage URLs and the corresponding abstracts) are then classified in module 1 using the "online URL classifier" constructed by the presen...
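
The URL-based classification step in module 1 can be illustrated with a minimal sketch. The patent text available here does not give the classifier's features or model, so everything below is an assumption for illustration: the token-overlap scoring, the stop-token list, the function names (`url_tokens`, `train`, `classify`), and the `example.com` catalogue entries are all hypothetical. The point is only that a category can be guessed from the URL string alone, without downloading the page.

```python
import re
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Tokens that carry little category information (an assumed, illustrative list).
STOP_TOKENS = {"www", "com", "cn", "html", "htm"}

def url_tokens(url):
    """Split a URL's host and path into lowercase alphabetic tokens."""
    parsed = urlparse(url)
    text = (parsed.netloc + " " + parsed.path).lower()
    return [t for t in re.split(r"[^a-z]+", text) if t and t not in STOP_TOKENS]

def train(catalogue):
    """Build per-category token counts from (url, category) pairs.

    The pairs stand in for entries of an online URL classification catalogue.
    """
    counts = defaultdict(Counter)
    for url, category in catalogue:
        counts[category].update(url_tokens(url))
    return counts

def classify(url, counts):
    """Return the category whose catalogue tokens overlap the URL most, or None."""
    tokens = url_tokens(url)
    scores = {cat: sum(c[t] for t in tokens) for cat, c in counts.items()}
    best = max(scores, key=scores.get, default=None)
    return best if best is not None and scores[best] > 0 else None

# Hypothetical catalogue entries; real ones would come from an online directory.
catalogue = [
    ("http://fashion.example.com/clothing/scarf", "fashion"),
    ("http://shop.example.com/clothing/winter", "fashion"),
    ("http://tech.example.com/internet/microblog", "internet"),
    ("http://news.example.com/internet/social", "internet"),
]
model = train(catalogue)
print(classify("http://blog.example.com/internet/post1", model))
print(classify("http://store.example.com/clothing/item", model))
```

A URL with no token in common with any catalogue entry is left unclassified (`None`) rather than forced into a category, which keeps the initial classification conservative.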



Abstract

The invention discloses a URL-based online semantic mining system for Chinese polysemous words. The system uses a URL-based webpage classification method and can mine the senses of Chinese polysemous words online. The process has three steps: first, a URL classifier is constructed from an online URL classification catalogue; next, the search results for the polysemous word returned by a search engine (webpage URLs and abstracts) are classified with the URL classifier to obtain initial semantic classification results; finally, the initial semantic classification results are clustered according to the webpage abstracts to obtain the semantic mining results for the polysemous word. The system achieves satisfactory accuracy and recall and is well suited to semantic mining of popular Internet words.
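
The final step, clustering results by their abstracts, can be sketched as follows. The abstract does not name a clustering algorithm, so this is only one plausible choice under stated assumptions: bag-of-words vectors, cosine similarity, greedy single-pass clustering, and a hypothetical `threshold` of 0.3. The abstracts are assumed to be already segmented into space-separated tokens (Chinese text would need a word segmenter first), and the sample abstracts are invented for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_abstracts(abstracts, threshold=0.3):
    """Greedy single-pass clustering; returns lists of abstract indices.

    Each abstract joins the most similar existing cluster if the similarity
    reaches the threshold, otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for i, text in enumerate(abstracts):
        vec = Counter(text.lower().split())
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)  # accumulate term counts
            best["members"].append(i)
    return [cluster["members"] for cluster in clusters]

# Invented, pre-tokenized abstracts standing in for search-result snippets.
abstracts = [
    "warm wool scarf for winter",
    "microblog social network post",
    "wool scarf winter sale",
    "social network microblog users",
]
groups = cluster_abstracts(abstracts)
print(groups)
```

Each resulting group of indices would correspond to one candidate sense of the query word. A single-pass scheme is order-dependent, so a production system might prefer a batch method such as hierarchical clustering.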

Description

Technical field

[0001] The invention relates to technical fields such as web crawling, webpage information cleaning, named entity recognition, URL feature extraction, URL-based semantic classification, text feature word extraction, and clustering algorithms. It is a semantic mining system for Chinese polysemous words.

Background

[0002] Semantic knowledge learning has important applications in artificial intelligence and has therefore long been an active research topic in natural language processing (NLP). Within it, semantic mining studies how to acquire the semantic information of polysemous words, and is widely used in areas such as correlation calculation and query expansion. Polysemy is particularly pronounced in nouns, so nouns are the research focus of semantic mining. For Chinese polysemous nouns, semantic mining should uncover their latest senses as comprehensively as possible. For example, for...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/951, G06F16/9566
Inventor: 刘一正
Owner: EAST CHINA NORMAL UNIV