Extraction method and device for Internet-oriented meaningful strings

An extraction method and device for Internet-oriented meaningful strings, applied in the field of meaningful string extraction. It addresses the meaningless, similar, and redundant results that arise when calculating the frequency of occurrence of single words. The extracted strings have good semantic independence and reduced similarity to one another, and extraction accuracy is improved.

Inactive Publication Date: 2010-10-06
HARBIN ENG UNIV
3 Cites, 12 Cited by

AI Technical Summary

Benefits of technology

The technology effectively extracts meaningful strings from web pages such as news pages and forums. It first discovers repeated strings and then filters them by in-string, out-string, and among-string analysis, which reduces redundancy and similarity among the results. The extracted strings have good semantic independence and can serve as features for clustering and classifying massive Internet data.

Problems solved by technology

The problems addressed arise when text features are obtained by calculating the frequency of occurrence of single words: the resulting units can be meaningless on their own, similar in content, or redundant.

Embodiment Construction

[0056] To make the purpose, technical solution, and advantages of the present invention clearer, the method and system for extracting Internet-oriented meaningful strings are described in further detail below with reference to the accompanying drawings and embodiments.

[0057] The invention extracts meaningful strings from massive webpages existing on the Internet. Meaningful strings are complete language units with independent semantics, tight coupling, and wide circulation. The meaningful strings extracted by the present invention can be used as the feature representation of the text representation model and applied to the clustering and classification of massive Internet data.
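As a rough illustration of using extracted strings as a feature representation, the sketch below turns a text into a count vector over a given list of meaningful strings. The function name `string_features` and the choice of raw overlapping counts are assumptions made for this example; the source does not specify the text representation model.

```python
def string_features(text, meaningful_strings):
    """Return one count per meaningful string (overlapping matches included).

    Hypothetical helper: the source only says the strings serve as features,
    not how feature values are computed.
    """
    counts = []
    for s in meaningful_strings:
        n = 0
        start = text.find(s)
        while start != -1:
            n += 1
            # Advance by one character so overlapping occurrences are counted.
            start = text.find(s, start + 1)
        counts.append(n)
    return counts


if __name__ == "__main__":
    print(string_features("abcabcab", ["abc", "ab", "zz"]))
```

Vectors built this way could be passed to any standard clustering or classification routine; weighting schemes other than raw counts are equally plausible.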

[0058] The present invention divides the meaningful string mining process into four stages: repeated string discovery, internal analysis, external analysis, and inter-string analysis. The whole process is shown in Figure 1, including ...
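The four stages named in [0058] can be sketched as a serial filter pipeline. This is a minimal sketch, not the patent's procedure: the source names the stages but does not give their criteria here, so the internal test (pointwise mutual information of the weakest split), the external test (entropy of neighboring characters), and the inter-string test (dropping a string nested in a longer one with the same count) are all assumptions, as are every function name and threshold.

```python
import math
from collections import Counter


def count_overlapping(text, sub):
    """Number of (possibly overlapping) occurrences of sub in text."""
    return sum(1 for i in range(len(text) - len(sub) + 1) if text.startswith(sub, i))


def discover_repeats(text, max_len=4, min_count=2):
    """Stage 1: substrings of length 2..max_len seen at least min_count times."""
    counts = Counter(
        text[i:i + n]
        for n in range(2, max_len + 1)
        for i in range(len(text) - n + 1)
    )
    return {s: c for s, c in counts.items() if c >= min_count}


def internal_score(s, text):
    """Stage 2 (assumed criterion): lowest PMI over all two-part splits of s."""
    total = len(text)
    p_s = count_overlapping(text, s) / total
    return min(
        math.log(
            p_s
            / ((count_overlapping(text, s[:k]) / total)
               * (count_overlapping(text, s[k:]) / total))
        )
        for k in range(1, len(s))
    )


def boundary_entropy(s, text):
    """Stage 3 (assumed criterion): smaller of left/right neighbor entropy."""
    left, right = Counter(), Counter()
    for i in range(len(text) - len(s) + 1):
        if text.startswith(s, i):
            j = i + len(s)
            # "" marks the start or end of the text as a distinct neighbor.
            left[text[i - 1] if i > 0 else ""] += 1
            right[text[j] if j < len(text) else ""] += 1

    def entropy(c):
        n = sum(c.values())
        return -sum(v / n * math.log(v / n) for v in c.values())

    return min(entropy(left), entropy(right))


def drop_nested(cands):
    """Stage 4 (assumed criterion): drop strings nested in a longer candidate
    that occurs exactly as often."""
    return {
        s: c
        for s, c in cands.items()
        if not any(s != t and s in t and c == cands[t] for t in cands)
    }


def extract(text, min_pmi=0.0, min_entropy=0.5):
    """Run the four stages in series and return surviving strings with counts."""
    cands = discover_repeats(text)
    cands = {s: c for s, c in cands.items() if internal_score(s, text) > min_pmi}
    cands = {s: c for s, c in cands.items() if boundary_entropy(s, text) >= min_entropy}
    return drop_nested(cands)


if __name__ == "__main__":
    print(extract("xabyabz"))  # {'ab': 2}
```

Running the stages in series means each later, more expensive check sees only the candidates that survived the earlier ones. The thresholds `min_pmi` and `min_entropy` are placeholders and would need tuning on real web text.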

Abstract

The invention provides an extraction method and a device for Internet-oriented meaningful strings. The extraction method comprises the following steps: extracting repeated character strings and filtering the character strings sequentially by in-string analysis, out-string analysis and among-string analysis; and the extraction device comprises a repeated string discovery module, an in-string analysis module, an out-string analysis module and an among-string analysis module which are successively connected in series. The invention can effectively extract meaningful strings on news pages and forums, and can be widely used in the fields of network public opinion management, Internet intelligent information processing and the like.

Application Information

Owner HARBIN ENG UNIV