Phrase extraction method and device, electronic equipment and storage medium

A phrase extraction technology, applied in the field of intelligent search, that addresses the limitations of collocation analysis, the heavy manpower cost, and the inability to efficiently construct a phrase collocation corpus, and that achieves high processing efficiency.

Pending Publication Date: 2019-12-03
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, manual annotation requires a great deal of manpower to label the corpus, and co-occurrence analysis cannot capture collocations between words that lie beyond a set distance of each other in a sentence. Neither method can therefore efficiently build a comprehensive phrase collocation corpus.
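The distance limitation of co-occurrence analysis can be illustrated with a minimal sketch. The function name `window_cooccurrences`, the window size, and the example sentence are assumptions made for illustration; the source does not specify how the co-occurrence method is implemented.

```python
from collections import Counter


def window_cooccurrences(tokens, window):
    """Count ordered pairs (left, right) whose positions differ by at most `window`."""
    pairs = Counter()
    for i, left in enumerate(tokens):
        # Only look at the next `window` tokens to the right of position i.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            pairs[(left, tokens[j])] += 1
    return pairs


# Hypothetical example sentence: the verb "read" and its object "book"
# are six tokens apart.
tokens = "she read the long and rather difficult book".split()
pairs = window_cooccurrences(tokens, window=2)

print(("read", "the") in pairs)   # neighbouring words fall inside the window
print(("read", "book") in pairs)  # the verb-object pair falls outside it
```

Widening the window recovers the distant pair but also admits many unrelated pairs, which is one reason a fixed window is a poor fit for long-range collocations.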

Embodiment Construction

[0084] Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Abstract

The invention discloses a phrase extraction method and device, electronic equipment and a storage medium, and relates to the technical field of big data. The method comprises: segmenting a corpus text to obtain short sentences; extracting candidate phrases according to the dependency relationships and parts of speech of the words in the short sentences; and, if the candidate phrases meet preset conditions, storing them in a phrase collocation corpus. Phrase collocation patterns that conform to a part-of-speech combination can thus be determined from the dependency relationships and parts of speech of the words in a sentence, which improves the efficiency and accuracy of phrase extraction from the corpus text.
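The three steps in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `Token` structure, the `ALLOWED` pattern set, the dependency labels, and the use of a minimum frequency as the "preset condition" are all hypothetical, and the dependency parses are hand-annotated rather than produced by a real parser.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    pos: str   # part-of-speech tag, e.g. "VERB"
    head: int  # index of the governing token in the short sentence, -1 for the root
    dep: str   # dependency label linking this token to its head


# Hypothetical (head POS, dependency label, dependent POS) combinations to keep.
ALLOWED = {("VERB", "obj", "NOUN"), ("NOUN", "amod", "ADJ")}


def split_short_sentences(text):
    """Step 1: segment corpus text into short sentences at punctuation marks."""
    return [s.strip() for s in re.split(r"[,.;!?，。；！？]", text) if s.strip()]


def candidate_phrases(tokens):
    """Step 2: pair each word with its head when the POS/dependency pattern is allowed."""
    found = []
    for tok in tokens:
        if tok.head < 0:
            continue  # the root has no governing word
        head = tokens[tok.head]
        if (head.pos, tok.dep, tok.pos) in ALLOWED:
            found.append((head.text, tok.text))
    return found


def build_corpus(parsed_sentences, min_count=2):
    """Step 3: keep candidates meeting an assumed preset condition (minimum frequency)."""
    counts = Counter()
    for tokens in parsed_sentences:
        counts.update(candidate_phrases(tokens))
    return {phrase for phrase, n in counts.items() if n >= min_count}


text = "He read the long and difficult book, then she read a book."
short_sentences = split_short_sentences(text)

# Hand-annotated parses standing in for the output of a dependency parser.
first = [
    Token("He", "PRON", 1, "nsubj"), Token("read", "VERB", -1, "root"),
    Token("the", "DET", 6, "det"), Token("long", "ADJ", 6, "amod"),
    Token("and", "CCONJ", 5, "cc"), Token("difficult", "ADJ", 3, "conj"),
    Token("book", "NOUN", 1, "obj"),
]
second = [
    Token("then", "ADV", 2, "advmod"), Token("she", "PRON", 2, "nsubj"),
    Token("read", "VERB", -1, "root"), Token("a", "DET", 4, "det"),
    Token("book", "NOUN", 2, "obj"),
]

corpus = build_corpus([first, second], min_count=2)
print(short_sentences)
print(candidate_phrases(first))
print(corpus)
```

Because candidates are read off dependency arcs rather than token positions, the verb "read" is paired with its object "book" in the first short sentence even though several words separate them.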

Description

Technical field

[0001] The present application relates to intelligent search technology in the field of big data technology, and in particular to a phrase extraction method, device, electronic equipment and storage medium.

Background art

[0002] With the development of data processing technology, intelligent search is becoming more and more powerful. In addition to searching for relevant content based on keywords, users can also search for phrase collocation results by entering a phrase that contains an interrogative word.

[0003] At present, the phrase collocation corpus is generally pre-built by manual annotation or co-occurrence analysis. When the user enters a phrase that contains an interrogative word, the search engine retrieves all phrases that meet the requirements from the phrase collocation corpus.

[0004] However, the manual labeling method requires a lot of manpower for corpus labeling, and the co-occurrence analysis method cannot analyze the collocation of words ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F16/33
CPC: G06F16/3344
Inventor: 郭辰阳, 钱璟, 吕继根, 邵英杰
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD