A distributed text approximate nearest neighbor semantic search algorithm

An approximate nearest neighbor semantic search technology, applied in computing, semantic analysis, and natural language data processing. It addresses the difficulty of recommending scientific and technological achievements, unsatisfactory retrieval results, and poor user experience, and it optimizes search speed, reduces the amount of calculation, and improves accuracy.

Inactive Publication Date: 2018-12-28
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

[0003] At present, many online literature resource databases and technology-matching trading platforms offer only traditional keyword-based search, which is inconvenient to use, lacks semantic understanding, returns unsatisfactory results, and gives a poor user experience. This makes it difficult to provide users with recommendations of scientific and technological achievements.

Examples

Embodiment Construction

[0049] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0050] The present invention proposes a distributed text approximate nearest neighbor semantic search calculation method comprising four steps: (1) constructing text semantic vectors based on a deep learning model; (2) constructing a multi-layer clustering index over the text semantic vectors, so that dividing the space by hyperplanes allows similar texts to be obtained quickly; (3) distributed, balanced storage of the text semantic vectors, which spreads semantically similar text vectors evenly across different nodes, reducing data skew and speeding up similarity calculation; and (4) text semantic search with multi-dimensional user preference screening, which efficiently screens large-scale texts on multiple preference dimensions, performs precise real-time semantic search, and returns the texts most semantically similar to the user's needs.
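Step (2) can be illustrated with a minimal sketch. It assumes the multi-layer clustering index is a binary tree built by repeatedly splitting a cluster with 2-means (bisecting K-means, as named in the abstract), and that a query descends to the nearest centroid at each layer before ranking the leaf by cosine similarity. The function names, the `leaf_size` parameter, and the toy vectors are hypothetical and not taken from the patent text.

```python
import math
import random


def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def two_means(ids, vecs, rng, iters=10):
    """Split ids into two groups with plain 2-means (ties go to the first group)."""
    c0, c1 = (vecs[i] for i in rng.sample(ids, 2))
    left, right = [], []
    for _ in range(iters):
        left = [i for i in ids if sqdist(vecs[i], c0) <= sqdist(vecs[i], c1)]
        right = [i for i in ids if sqdist(vecs[i], c0) > sqdist(vecs[i], c1)]
        if not left or not right:
            break
        c0 = mean([vecs[i] for i in left])
        c1 = mean([vecs[i] for i in right])
    return left, right


def build(ids, vecs, rng, leaf_size=3):
    """Recursively bisect until a cluster holds at most leaf_size vectors."""
    node = {"centroid": mean([vecs[i] for i in ids]), "ids": ids}
    if len(ids) > leaf_size:
        left, right = two_means(ids, vecs, rng)
        if left and right:  # keep the node as a leaf if the split degenerates
            node["children"] = [build(left, vecs, rng, leaf_size),
                                build(right, vecs, rng, leaf_size)]
    return node


def search(node, query, vecs, top_k=1):
    """Descend to the nearest centroid per layer, then rank the leaf by cosine."""
    while "children" in node:
        node = min(node["children"], key=lambda c: sqdist(c["centroid"], query))
    ranked = sorted(node["ids"], key=lambda i: cosine(vecs[i], query), reverse=True)
    return ranked[:top_k]


# Toy 2-D "semantic vectors": ids 0-2 form one topic, ids 3-5 another.
VECS = [(1.0, 0.0), (0.9, 0.1), (0.95, 0.05), (0.0, 1.0), (0.1, 0.9), (0.05, 0.95)]
index = build(list(range(len(VECS))), VECS, random.Random(0), leaf_size=3)
print(search(index, (0.9, 0.1), VECS, top_k=2))
```

Because the search only inspects one leaf, it compares the query with a small fraction of the stored vectors. The result is approximate: a true nearest neighbor that falls just across a split boundary can be missed.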

[0051] With reference to Figure 1, the present inv...

Abstract

The invention discloses a distributed text approximate nearest neighbor semantic search calculation method. The method comprises construction of text semantic vectors, construction of a multi-layer clustering index over the text semantic vectors, distributed balanced storage of the text semantic vectors, and text semantic search with multi-dimensional user preference screening. Construction of the text semantic vectors comprises corpus extraction, text segmentation, word vector model training, and text vector calculation. Construction of the multi-layer clustering index comprises bisecting K-means clustering of the text vectors. The distributed balanced storage comprises distance calculation in the multidimensional space of the text semantic vectors followed by distributed balanced storage. The search comprises efficient multi-dimensional preference screening and precise real-time semantic search over large-scale text. The invention reduces the amount of calculation and optimizes search speed.
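The remaining pieces named in the abstract can be sketched under stated assumptions. The abstract does not say how a text vector is calculated from word vectors, how vectors are assigned to nodes, or how preferences are represented. The sketch below assumes an average of word vectors, a round-robin deal of each cluster's members across nodes, and an exact-match filter on metadata fields. The names `WORD_VECS`, `text_vector`, `balanced_placement`, and `screen` are hypothetical, and whitespace splitting stands in for a real text segmenter.

```python
from collections import Counter, defaultdict

# Hypothetical toy word vectors; a real system would train a word vector model.
WORD_VECS = {
    "semantic": [0.9, 0.1],
    "search": [0.8, 0.2],
    "patent": [0.1, 0.9],
    "claim": [0.2, 0.8],
}


def text_vector(text):
    """Average the vectors of known tokens; None if no token is known."""
    hits = [WORD_VECS[t] for t in text.lower().split() if t in WORD_VECS]
    if not hits:
        return None
    return [sum(col) / len(hits) for col in zip(*hits)]


def balanced_placement(cluster_of, num_nodes):
    """Deal documents to nodes cluster by cluster, so that similar documents
    land on different nodes and node loads differ by at most one."""
    members = defaultdict(list)
    for doc_id, cluster in sorted(cluster_of.items()):
        members[cluster].append(doc_id)
    node_of, slot = {}, 0
    for cluster in sorted(members):
        for doc_id in members[cluster]:
            node_of[doc_id] = slot % num_nodes
            slot += 1
    return node_of


def screen(metadata, prefs):
    """Keep ids whose metadata matches every preference exactly (assumed form)."""
    return [doc_id for doc_id, meta in sorted(metadata.items())
            if all(meta.get(k) == v for k, v in prefs.items())]


# Six documents in two clusters, placed on three nodes.
CLUSTER_OF = {"d1": 0, "d2": 0, "d3": 0, "d4": 1, "d5": 1, "d6": 1}
placement = balanced_placement(CLUSTER_OF, num_nodes=3)
print(placement)
print(Counter(placement.values()))

METADATA = {"d1": {"field": "NLP", "year": 2018},
            "d2": {"field": "NLP", "year": 2017},
            "d4": {"field": "IR", "year": 2018}}
print(screen(METADATA, {"field": "NLP", "year": 2018}))
```

Spreading each cluster over all nodes means a query that targets one cluster is scored in parallel on every node instead of overloading a single one, which is one way to read the "reducing data skew" goal. Screening before similarity ranking shrinks the candidate set that has to be scored.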

Description

Technical field

[0001] The invention belongs to the technical field of big data text analysis and relates to natural language processing, in particular to a distributed text approximate nearest neighbor semantic search calculation method.

Background technique

[0002] With the advent of the big data era and the rapid development of information technology, scientific and technological achievements have grown explosively in a short period, and a large amount of information is generated every day. On average, 13,000 to 14,000 papers containing new knowledge are published every day; more than 300,000 invention patents are registered every year, an average of 800 to 900 patents per day. We are submerged in an ocean of scientific and technological achievements. How to quickly find the scientific and technological achievements we need and promote the transfer and transformation of scientific and technological achievements is an...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F40/289, G06F40/30
Inventor: 徐小良, 穆诗棋, 王宇翔
Owner: HANGZHOU DIANZI UNIV