Forward word segmentation method and device based on Chinese retrieval

A word segmentation method and technology applied to Chinese web page retrieval in search engines. It addresses problems such as blindly selecting the maximum matching length.

Status: Inactive. Publication Date: 2014-01-29
JIANGSU XINRUIFENG INFORMATION TECH
2 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

The present invention addresses the problems of using the forward maximum matching algorithm on its own. In a search system, especially a vertical search system, the professional environment is fully utilized to build a professional thesaurus in the machine dictionary. The value of MAX_Length is first determined from the maximum length of the proper nouns in the thesaurus, which solves the problem of blindly selecting the maximum length in the matching algorithm. Combining this with the forward maximum matching algorithm forms a forward matching algorithm, which greatly improves retrieval accuracy.

Abstract

The invention discloses a forward word segmentation method and device based on Chinese retrieval, relating to the field of processing web page text information in computer networks. A Chinese character string S = C1C2C3C4...Cn is segmented by a device composed of a central processing unit, input/output equipment, a register, a machine dictionary, a window counter and a memory. The invention aims to solve the problems of using the forward maximum matching algorithm on its own. In a search system, particularly a vertical search system, the professional environment is fully utilized and professional word banks are established in the machine dictionary. The value of MAX_Length is determined from the maximum length of the proper nouns in the word banks, which solves the problem of the matching algorithm selecting the maximum length blindly. A forward matching algorithm is then formed in combination with the forward maximum matching algorithm, which improves retrieval accuracy to a great extent.
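The core idea in the abstract, taking MAX_Length from the longest entry in the word bank and then matching forward with a shrinking window, can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the function names `max_word_length` and `forward_max_match` and the sample lexicon are hypothetical, a plain Python set stands in for the machine dictionary, and the register and window counter of the device are not modeled.

```python
def max_word_length(lexicon):
    """Length of the longest lexicon entry; plays the role of MAX_Length."""
    return max((len(word) for word in lexicon), default=1)


def forward_max_match(text, lexicon, max_length=None):
    """Greedy left-to-right segmentation with a shrinking window."""
    if max_length is None:
        # Assumption: MAX_Length is derived from the lexicon, not hard-coded.
        max_length = max_word_length(lexicon)
    tokens = []
    i = 0
    while i < len(text):
        # Open a window of at most max_length characters starting at i.
        window = min(max_length, len(text) - i)
        # Shrink the window from the right until it matches an entry;
        # a single character is always accepted as a fallback.
        while window > 1 and text[i:i + window] not in lexicon:
            window -= 1
        tokens.append(text[i:i + window])
        i += window
    return tokens


# Hypothetical domain lexicon for illustration only.
lexicon = {"中文", "检索", "分词", "方法", "搜索引擎", "垂直搜索"}
print(forward_max_match("中文检索分词方法", lexicon))
```

Because the window never opens wider than the longest lexicon entry, no lookups are spent on candidate strings that cannot possibly match, which is the cost of choosing the maximum length blindly.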

Description

Technical field

The invention relates to the field of text information processing of webpages in computer networks, in particular to a method and device for retrieving Chinese webpages in search engines.

Background technique

With the continuous development of the Internet, the number of web pages has increased dramatically, and web pages have become the largest and most extensive source of information for people. A lot of useful information is submerged in the vast number of Web pages. Faced with massive information, people can no longer simply rely on manual processing of all the information. Text search is one of the important application technologies in the field of large-scale information processing, and it is also an important research direction in the field of information processing. With the in-depth study of text classification search technology, text search technology is more and more widely used in information technology. The word segmentation technology is the "...

Application Information

IPC(8): G06F17/27, G06F17/30
Inventors: 刘迎春, 魏华峰, 方筠捷
Owner: JIANGSU XINRUIFENG INFORMATION TECH