394 results about "Sentence segmentation" patented technology

Chinese entity relation extraction method based on keyword and verb dependency

The invention discloses a Chinese entity relation extraction method based on keyword and verb dependency. Taking large-scale unstructured free text as the target text, the text is first segmented and keywords are extracted to form a text keyword thesaurus. The text is then subjected to sentence segmentation, word segmentation, part-of-speech tagging, named entity recognition, and dependency parsing, and an entity corpus is constructed by combining the named entity thesaurus and the keyword thesaurus. According to the characteristics of Chinese sentence structure, syntactic structure, and the dependencies between words, entity-relation syntactic rules are constructed from verbs, and each sentence in the text is then matched against these rules. Finally, the relation triples are output and the set of text relation triples is obtained. The invention can make entity relation extraction from large-scale Chinese text more effective and more accurate.
Owner:SHANGHAI DATATOM INFORMATION TECH CO LTD
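Most of the methods listed on this page begin with a sentence segmentation step before word segmentation or parsing. As a rough illustration only, a rule-based splitter for Chinese text might look like the sketch below. The terminator set and the function name `split_sentences` are assumptions made for this example; none of the abstracts specifies the actual rules used.

```python
import re

# Hypothetical terminator set (full-width and ASCII); not taken from any patent.
# The ASCII period is deliberately left out to avoid splitting decimals.
_TERMINATORS = "。！？!?；;"

def split_sentences(text: str) -> list[str]:
    """Split text after each run of sentence-ending punctuation."""
    # Each match is a run of non-terminators followed by any trailing terminators.
    pieces = re.findall(rf"[^{_TERMINATORS}]+[{_TERMINATORS}]*", text)
    return [p.strip() for p in pieces if p.strip()]

if __name__ == "__main__":
    for sentence in split_sentences("今天下雨。你带伞了吗？好的"):
        print(sentence)
```

A punctuation rule like this is only a baseline; several of the listed patents combine it with part-of-speech estimates, acoustic features, or learned models.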

Voice answering method for combining intelligent answer with artificial answer

The invention relates to the technical field of human-computer interaction, and discloses a voice answering method for combining intelligent answer with artificial answer. The voice answering method provided by the invention has a core thought that a voice identification algorithm is firstly used for converting problem voice information into problem text information; then, the problem text information is subjected to sentence segmentation processing to obtain a user interrogative sentence; and finally, on the basis of an interrogative sentence similarity, whether a standard interrogative sentence which is most similar to the user interrogative sentence and corresponding answering information are found in a QA library or a manual service desk is accessed to obtain answering information is determined. Therefore, voice answering can be realized so as to bring convenience for users to input problem information, and questioning and answering efficiency and user experience are improved. Meanwhile, when a proper answer can not be found in the QA library, manual answer can be switched to, and the doubts of users can be solved in time.
Owner:JIANGMEN POWER SUPPLY BUREAU OF GUANGDONG POWER GRID

Information processing method and device for realizing intelligent question answering

The invention relates to the technical field of man-machine interaction, and discloses an information processing method and device for realizing intelligent question answering. The information processing method comprises the following steps of: carrying out sentence segmentation on question text information to obtain a user question; and searching a standard question most similar to the user question and corresponding answer information from a QA library on the basis of a question similarity. Compared with the existing keyword retrieval-based question answering method, the method disclosed by the invention does not need to require the users to have keyword decomposition ability, is automatic in the whole process and is capable of greatly enhancing the user experience and improving the search effect and the pertinence and effectiveness of answers. Meanwhile, through fusing natural language understanding technologies such as sentence model analysis, lexical analysis and lexical meaning extension, and carrying out comprehensive calculation on multi-dimensional similarity, the method is capable of improving the correctness of a final sentence similarity in a Chinese automatic question answering process, and enabling a Chinese intelligent question answering system to be possible.
Owner:JIANGMEN POWER SUPPLY BUREAU OF GUANGDONG POWER GRID

Large-length voice full-automatic segmentation method

The invention relates to a large-length voice full-automatic segmentation method which is a zero-labeling sentence automatic segmentation algorithm having higher accuracy. The algorithm enables a Force-alignment non-supervision algorithm and a semi-supervised learning method based on an HMM to be blended, automatic expansion is carried out on a few precise labeling sets provided for the zero-labeling sentence segmentation algorithm by a semi-supervised learning minimization labeling sentence segmentation algorithm through establishment of an iteration mechanism based on a timer shaft, the purpose of the maximization of the precise labeling sets is realized, and then according to obtained correct periods, voice of an original length is cut into smaller paragraphs or sets of sentences. According to the method, a Force-aligned method under the HMM and a Co_training method in semi-supervised learning are blended together, so the facts that in a large-length voice sentence segmentation process, manual intervention is not needed, and segmentation accuracy is high are guaranteed. The large-length voice full-automatic segmentation method can be applied to rapid and automatic construction of a voice corpus.
Owner:张巍

Browsing based Chinese input method

A platform to implement a Chinese input method consists of the following components: 1. A Keypad. 2. A cascade Multi-Window, 3. A Sentence Editing Buffer. 4. An Attribute Viewing Window. 5. A Text Accumulation Window. 6. A two-level phrase set refining control window. The keypads are designed based on the lexical structures of the Zhu-Yin and Pin-Yin phonetic systems. Efficient mouse operations have been designed to enter phonetic symbol strings. The Multi-Window can present a great many candidate words and phrases. It allows a user to browse on its multi pages without mouse clicking. A two-phase sentence generation procedure relieves users from the burden of sentence segmentation and creates the possibility of harvesting system supplied longer generalized phrases.
Owner:DU MIN WEN +1

Chinese short text sentiment classification method based on fields

The present invention discloses a Chinese short text sentiment classification method based on fields, which includes: data preprocessing of a short text, including sentence segmentation, word segmentation, stop word filtration, and field division; construction of a field-oriented sentiment dictionary; extraction and matching of sentiment paths, extraction and polarity discrimination of candidates, and TF-IDF weight calculation of sentiment words using the field-oriented sentiment dictionary with a corpus as the data set; sentiment feature extraction of the short text; and corpus training or discrimination of unknown sentiment types by a random forest algorithm. Experiments show that the scheme provided by the present invention has a high accuracy rate.
Owner:GUANGDONG UNIV OF PETROCHEMICAL TECH

Reinforcement learning based anaphora resolution method

The invention discloses a reinforcement learning based anaphora resolution method, which comprises the following steps: data preprocessing: carrying out word segmentation, sentence segmentation, part-of-speech tagging, part-of-speech reduction, named entity identification, syntactic analysis and word vector conversion on text data to obtain features related to candidate antecedents and anaphoric words; constructing a neural network model: combining the word vector features with the related features so that the model can learn mention pairs and their semantic information, better ranking and scoring the candidate antecedents and the anaphoric words, and finally obtaining a coreference chain; and using the trained model to carry out anaphora resolution, inputting text data, and outputting a resolution chain. According to the method, deep learning training is carried out with a reward measurement mechanism to overcome the defects of a heuristic loss function, the model effect is improved, hyper-parameters are set automatically for different language data sets so that manual setting is unnecessary, the practicability of the model is improved, and the application range is expanded.
Owner:NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT +1

Man-machine interaction question-answering method and system based on complex intention intelligent identification

The invention discloses a man-machine interaction question-answering method and system based on complex intention intelligent recognition, and the method comprises the steps: obtaining an original question sentence of a user, carrying out the sentence segmentation and part-of-speech tagging, and obtaining the part-of-speech information of each component word of the question sentence; performing dependency syntax analysis on the question sentence to obtain a dependency syntax tree; carrying out industry entity identification to obtain industry entities and the number, and extracting a core dependency tree to simplify questions; carrying out industry question relation classification on the questions, carrying out Chinese multi-intention question rewriting, and then carrying out knowledge retrieval on the questions; and selecting and generating answers for knowledge retrieval results, and returning the answers to the user. According to the method and system, multi-intention complex questions can be effectively simplified in any industrial scene, the intention of the user can be accurately understood, the industrial knowledge can be more naturally fed back to the user, the user can more accurately and quickly obtain the required industrial knowledge, the user experience is improved, and the method and system are particularly suitable for man-machine interaction intelligent questions and answers in the medical industry.
Owner:HUNAN UNIV

Sentence segmentation method and sentence segmentation apparatus, machine translation system, and program product using sentence segmentation method

The aim is to provide a highly accurate sentence segmentation process in natural language processing by estimating the parts of speech of words in the text to be processed. Dictionary data is used to perform a sentence segmentation process on a text to be processed. If it cannot be determined through use of the dictionary data whether the text should be broken into sentences, the parts of speech of the words constituting the text are estimated and a further sentence segmentation process is performed based on the result of the estimation.
Owner:NUANCE COMM INC

Method and apparatus for translating a speech

There is provided a method for translating speech, which includes recognizing the speech into a text that includes a long sentence containing a plurality of simple sentences, segmenting the long sentence into the simple sentences, and translating each simple sentence into a sentence of a target language. A long sentence segmentation module is inserted between the speech recognition module and the machine translation module, so that a long sentence in the recognized text can be split into several simple and complete sentences. In this way, difficulties in translation are relieved, and translation quality is improved. Further, there is also provided a user interface which allows the user to modify the segmentation results conveniently. The modifying operations of the user are recorded to update the segmentation model online, improving the effect of the automatic segmentation step by step.
Owner:KK TOSHIBA

Semantic information retrieval method

The invention discloses a semantic information retrieval method. The method includes: receiving query terms submitted by a user, and performing term segmentation to obtain keywords included in the query terms; according to semantic relation among the keywords, performing query analysis and converting the query terms into conceptual expressions; reading texts to be retrieved from a storage medium by taking piece as unit; subjecting the texts to be retrieved to sentence segmentation and term segmentation, and segmenting the read texts into sentences and terms; subjecting the sentences to semantic analysis to obtain conceptual categories of the sentences and conceptual expressions of the terms; computing semantic distance between the acquired conceptual expressions of the query terms and the conceptual expressions of the texts to be retrieved; sorting from the near to the distant according to the semantic distance, and returning query results. Compared with retrieval results obtained by term matching according to a traditional information retrieval method, retrieval results can be effectively improved in accuracy.
Owner:吴晨

Dependency syntax tree-based knowledge graph expansion method and system

The invention provides a dependency syntax tree-based knowledge graph expansion method and system. The method comprises the steps of crawling a basic corpus set A and a knowledge extraction corpus set B; performing cleaning, sentence segmentation and word segmentation processing on corpora of the two corpus sets; performing syntax analysis on the corpora subjected to the word segmentation, and according to an analysis result, constructing a dependency syntax tree; combining the dependency syntax tree constructed based on the basic corpus set A with corresponding knowledge in a knowledge graph to generate dependency syntax tree rules, calculating score values, and adding the score values and the dependency syntax tree rules to a dependency syntax tree rule library G0; expanding the rule library G0 by the corpus set B subjected to syntax analysis to form a rule library G1; extracting knowledge in the corpus set B subjected to the syntax analysis by utilizing the rule library G1, and taking the knowledge with the highest score on matching of an attribute word subjected to knowledge extraction and an attribute word library as alternative knowledge; and adding the alternative knowledge not occurring in the knowledge graph to the knowledge graph. Labor cost is reduced, and knowledge from different fields can be added to the knowledge graph.
Owner:南京云问网络技术有限公司

Unsupervised training for overlapping ambiguity resolution in word segmentation

A method for resolving overlapping ambiguity strings in unsegmented languages such as Chinese. The methodology includes segmenting sentences into two possible segmentations and recognizing overlapping ambiguity strings in the sentences. One of the two possible segmentations is selected as a function of probability information. The probability information is derived from unsupervised training data. A method of constructing a knowledge base containing probability information needed to select one of the segmentation is also provided.
Owner:MICROSOFT TECH LICENSING LLC

A drug entity relationship extraction method and system based on an attention mechanism neural network

The invention relates to a drug entity relationship extraction method and system based on an attention mechanism neural network. The method comprises the following steps: (1) analyzing the text content of a pharmaceutical document, using sentences as basic units for sentence segmentation, and performing vectorization representation on each word in the sentences; (2) inputting the vectorized representation result into a recurrent neural network, extracting association characteristics of words in the sentences according to a front-back bidirectional word sequence through the recurrent neural network, and identifying all drug entities; (3) obtaining inter-word importance weights in the sentences through an attention mechanism neural network, and combining the inter-word importance weights with the output of step (2); and (4) inputting the result obtained in step (3) into a convolutional neural network, and predicting a category relation between every two drug entity words through the convolutional neural network. By using the attention mechanism to increase the weight of entity class information during classification, the influence of wrong dependency analysis results in long sentences can be reduced, and the accuracy of extracting drug entity relationships is improved.
Owner:PEKING UNIV

Traditional Chinese medicine diagnosis and treatment knowledge graph automatic construction method based on deep learning

The invention discloses a traditional Chinese medicine diagnosis and treatment knowledge graph automatic construction method based on deep learning. The method comprises the steps of: constructing an initialized literature medical record corpus, carrying out sentence segmentation and word segmentation on a medical record, and marking the theory-law-prescription-medicine entities in the medical record; predicting the entities through a bidirectional LSTM, and automatically extracting the entities from traditional Chinese medicine literature medical records through a deep learning model; and clustering similar entities appearing in the same medical record to form an entity group, then forming triples according to a predefined relationship between entities, and constructing a knowledge graph. According to the invention, the relationship between traditional Chinese medicine diagnosis and treatment concepts is predefined; construction of the knowledge graph is converted into a traditional Chinese medicine diagnosis and treatment named entity recognition task; and entities are automatically extracted from traditional Chinese medicine literature medical records through a deep learning model and clustered to form an entity set, so that the many-to-many problem between traditional Chinese medicine diagnosis and treatment concepts is solved, and the diagnosis and treatment thinking of famous veteran traditional Chinese medicine practitioners in the medical records is completely displayed.
Owner:UNIV OF ELECTRONIC SCI & TECH OF CHINA

Financial text sentiment analysis method

Active | CN105138506A | In line with the habit of thinking | Sentiment analysis results are accurate and reasonable | Special data processing applications | Hidden layer | Sentence segmentation
The invention relates to a financial text sentiment analysis method, comprising the following operational steps of: firstly, constructing a financial sentiment dictionary; secondly, performing sentence segmentation on a text, performing word segmentation, and generating a word segmentation sequence vector comprising a word text, a word property and a word sentiment value; thirdly, correcting the influence of a negative word, a degree word, a single concept word, a transitional word, a standard word and the like on the sentiment value; fourthly, calculating a fused financial text sentiment value by using weighted combination of a multiplication sentiment model for calculation of a sentiment generation function and an addition sentiment model for words in articles; and fifthly, compatibly expressing sentiment values [0,2] and [-1,1]. According to the method, for different sentiment environments, an input layer is applied as a word, a hidden layer is applied as a sentence sentiment layer expressed by the sentiment generation function, and an output layer is applied as a neural network of a nerve cell to calculate financial sentiment.
Owner:TIANYUN RONGCHUANG DATA TECH BEIJING CO LTD

Visual document atlas construction method

The invention discloses a visual document atlas construction method, which comprises the following steps: S1, extracting keywords from an input text, carrying out sentence segmentation on the input text, carrying out word segmentation on each single sentence generated by sentence segmentation, and sequentially carrying out part-of-speech tagging, named entity identification and dependency syntax analysis on each word formed by word segmentation; S2, formulating a relationship extraction rule, and extracting triple data from each simple sentence obtained in S1 based on the relationship extraction rule, wherein the triple data is composed of two entity words and a relation word; S3, importing the triple data obtained in S2 into a graph database to form a document graph; and S4, performing graph mining operations on the data in the graph database, and realizing document graph visualization based on the mining operations. According to the visual document atlas construction method, the key information of the text can be extracted, and the document is mapped into a visual graph based on semantic association, helping a user to efficiently grasp the semantic information of the article.
Owner:SHANGHAI DATATOM INFORMATION TECH CO LTD

Method for structured processing of Chinese pathological text

The present invention relates to a method for structured processing of a Chinese pathological text. The method comprises the following steps: extracting template information corresponding to each sample from a hierarchical structure of a sample of text data of a pathological report text data and indicator; extracting the template information comprising short sentence segmentation and indicator name extraction; classifying the short sentences; with respect to each sample, in combination with a classification result cluster and a short sentence cluster, calculating a TF value, an IDF value and a C-value of each indicator name in an indicator name list in a short sentence language material, screening out the indicator names whose TF value, IDF value and C-value satisfy a threshold, and using the obtained indicator names as components of the final template. According to the present invention, a non-structured Chinese pathological text can be structured.
Owner:DONGHUA UNIV +1

Automatic detection and analysis method for English composition grammar errors

The invention provides an automatic detection and analysis method for English composition grammar errors. According to the method, first, sentence segmentation is performed on an input English composition to be detected; next, word segmentation is performed on each sentence after sentence segmentation; then, spelling checking is performed on the words; after it is detected that all spellings are correct, part-of-speech tagging is performed on all the words; the tagging results of multi-tag words are corrected; a flow chart of the different error instance rules is constructed; combined with the existing grammar rules and error examples, a comprehensive grammar check is performed on the sentences; finally, the position where each grammatical error occurs in the composition is located, and a specific modification suggestion is given. The method can locate the grammatical error position and give specific error content and a solution; at the same time, the grammar rules can be extended by modifying the error example flow chart. The method has high grammar error detection and correction capability, can quickly check an English composition and give feedback, and can be applied to a real-time environment.
Owner:SOUTH CHINA UNIV OF TECH

Graph convolutional network relationship extraction method based on multi-dependency relationship representation mechanism

The invention provides a graph convolutional network relationship extraction method based on a multi-dependency relationship representation mechanism, and the method comprises the following steps: carrying out preprocessing on a collected unstructured text, including sentence segmentation, word segmentation, part-of-speech tagging, entity type tagging, relationship type annotation and generation of a semantic embedding vector of each segmented word, performing dependency relationship analysis on sentences, and generating a dependency relationship tree; capturing context semantic features of sentences based on a bidirectional long-short-term memory recurrent neural network; generating a full adjacency matrix, a concentrated adjacency matrix and a distance weight adjacency matrix according to the dependency relationship tree, performing convolution operation on the adjacency matrix, the concentrated adjacency matrix and the distance weight adjacency matrix in combination with context semantic features of the sentence, and performing maximum pooling processing on a result after the convolution operation to obtain a sentence representation vector; obtaining the entity relationship feature information based on the feedforward neural network, and carrying out the entity relationship classification. According to the method, relation extraction can be better assisted, and the recognition precision is improved.
Owner:中国科学院电子学研究所苏州研究院

Method and device for analyzing semantic orientation of Chinese network topic comment text

The invention discloses a method and device for analyzing the semantic orientation of a Chinese network topic comment text. The method comprises the following steps: performing word segmentation and sentence segmentation on the Chinese network topic comment text to obtain a result sequence; performing syntactic analysis and grammatical analysis on the result sequence to obtain an evaluation object; performing sentence pattern analysis on the result sequence to determine simple sentences and complex sentences in the comment text, judging the relations among all the simple sentences forming a complex sentence, and determining a first emotion orientation value of sentence pattern analysis; extracting emotion phrases in each sentence in the result sequence according to the evaluation object and a preset phrase matching mode, and calculating a second emotion orientation value of each emotion phrase; calculating a third emotion orientation value of each sentence in the comment text according to the first emotion orientation value and the second emotion orientation values; determining a text emotion orientation value of the comment text according to the third emotion orientation values. According to the method, the accuracy and the recall rate of the semantic orientation analysis of the network topic comment text are improved.
Owner:BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1

Online corpus alignment method and system

The invention discloses an online corpus alignment method and system. The method comprises the steps of analyzing a bilingual inter-translated file to obtain a result file; performing paragraph adjustment on the result file to enable paragraphs between an original text and a translated text to correspond; automatically performing sentence segmentation on the original text and the translated text through a preset sentence segmentation rule to obtain original text sentences and translated text sentences, and performing calculation according to a preset arrangement rule to obtain arrangement combinations of the original text sentences and the translated text sentences; and calculating sentence similarity corresponding to each arrangement combination of the original text sentences and the translated text sentences, and selecting the arrangement combination with the maximum similarity as a final sentence-sentence alignment result. According to the method and the system, the accuracy of alignment can be improved.
Owner:上海一者信息科技有限公司

Abstract generation method and device, server and storage medium

The embodiment of the invention discloses an abstract generation method and device, a server and a storage medium. The method comprises the following steps: carrying out sentence segmentation on a target text to obtain a sentence set; obtaining a target topic corresponding to the target text, and predicting each sentence in the sentence set by using a pre-trained abstract model in combination with the target topic to obtain a probability value that each sentence is an abstract sentence; and selecting a plurality of abstract sentences from the sentence set according to the probability values, and forming an abstract of the target text from the abstract sentences. According to the embodiment of the invention, when the abstract is generated, an abstract which is more relevant to the topic and more accurate is generated by taking the topic of the text into account, the abstract's coverage of important information is improved, and diversified abstracts can be generated according to different topics.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

Information mining method and device

The invention discloses an information mining method and device. A specific implementation way of the method comprises the following steps: carrying out sentence segmentation on obtained text information to obtain a sub-sentence set; selecting at least one candidate sub-sentence from the sub-sentence set according to the preset public opinion word set; carrying out word segmentation on the at least one candidate sub-sentence on the basis of a domain dictionary, carrying out dependency parsing on various words obtained after word segmentation to obtain at least one candidate word collocation pair; selecting at least one word collocation pair from the at least one candidate word collocation pair as a first word collocation pair set mined from the text information according to the public opinion word set. The implementation way achieves rapid and accurate information mining.
Owner:SHANGHAI YOUYANG XINMEI INFORMATION TECH CO LTD

Text sentence segmentation method and system

The invention discloses a text sentence segmentation method and system. The method comprises the following steps: pre-collecting a small amount of textual data and corresponding speech data, and constructing a long-term memory segmentation model based on text segmentation features and acoustic segmentation features; when the text is segmented, obtaining the text to be segmented and the corresponding speech data; extracting the text segmentation features and the acoustic segmentation features from the to-be-segmented text and its corresponding speech data, respectively; and segmenting the to-be-segmented text according to the extracted text segmentation features, the acoustic segmentation features, and the long-term memory segmentation model. The invention can effectively improve the accuracy of text segmentation.
Owner:IFLYTEK CO LTD

Integrated automatic lexical analysis method and system for ancient Chinese texts

The invention discloses an integrated automatic lexical analysis method for ancient Chinese texts. The method includes the following steps: pre-training ancient Chinese word vectors with semantic features by using the Word2Vec model; adding the information data appearing in the historical documents to the ancient name database to form a number of proper noun entries; adjusting each parameter of the Bi-LSTM-CRF neural network model; and preprocessing the final training corpus into a model-readable form, loading it into the neural network model, learning iteratively, and automatically evaluating the labeling result on the test corpus. According to the method, an integrated tagging approach covering sentence segmentation, word segmentation and part-of-speech tagging is adopted, so the repeated tagging process of lexical analysis across multiple sub-tasks is omitted, and multi-stage propagation of tagging errors is also avoided. A deep learning model is adopted, so rich language features can be learned automatically, and the work of manually customizing feature templates in traditional machine learning is omitted. The labeling model is accelerated with GPU hardware, so the model training time can be greatly shortened, and the efficiency is much higher than that of a traditional machine learning model.
Owner:NANJING NORMAL UNIVERSITY

Meeting recording method, apparatus and device and computer readable medium

The invention provides a meeting recording method. The meeting recording method comprises the following steps: carrying out preprocessing on conference voice; carrying out sentence segmentation on the conference voice, and uploading the conference voice after sentence segmentation to a voice server for text conversion; and receiving the text information obtained after voice conversion by the voice server. According to the embodiment provided by the invention, meeting recording is carried out by converting voice into text, so work efficiency can be improved and resources saved. Further, according to the embodiment of the invention, processing including noise reduction can be carried out on the conference voice, and the accuracy rate of voice recognition can be improved.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

Method and device for generating document summary

The invention discloses a method and device for generating a document summary. The method comprises the steps of conducting sentence segmentation on a document set to obtain a sentence set and expressing the sentence set by using a vector space model, determining similar sentences corresponding to each sentence and the number of the similar sentences according to a preset similarity threshold, obtaining corresponding importance scores by calculating, obtaining each sentence in the sentence set sequentially as current processing sentences, counting and comparing the number of the similar sentences of the current processing sentences with the number of the similar sentences corresponding to each similar sentence of the current processing sentences to find out maximum values, adding the corresponding sentences into a diversity reference set, calculating diversity scores and comprehensive scores of each sentence, and sorting all of the sentences in the sentence set and filtering to form the document summary. The invention further provides the device for generating the document summary. According to the method and device for generating the document summary, internal information of the sentence and global information in the document set are comprehensively considered to reduce the redundancy of the document summary as a whole.
Owner:SHENZHEN RAISOUND TECH
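The similarity-counting step in the abstract above (finding, for each sentence, how many other sentences exceed a preset similarity threshold) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: it uses raw term-count vectors and cosine similarity as the vector space model, splits words on whitespace, and the threshold of 0.5 and the names `cosine` and `similar_counts` are hypothetical.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similar_counts(sentences: list[str], threshold: float = 0.5) -> list[int]:
    """For each sentence, count the other sentences at or above the threshold."""
    # Assumed representation: lowercase whitespace tokens with raw counts.
    vectors = [Counter(s.lower().split()) for s in sentences]
    counts = []
    for i, vi in enumerate(vectors):
        n = sum(1 for j, vj in enumerate(vectors)
                if j != i and cosine(vi, vj) >= threshold)
        counts.append(n)
    return counts

if __name__ == "__main__":
    docs = ["the cat sat", "the cat sat down", "dogs bark loudly"]
    print(similar_counts(docs))
```

The abstract goes on to derive importance and diversity scores from these counts; the exact formulas are not given there, so they are not reproduced here.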

Method and device for searching webpages according to sentence serial numbers

Inactive | CN101923556A | Increase sort weight | Rank the top | Special data processing applications | Sentence segmentation | Ranking
The invention discloses method and device for searching webpages according to sentence serial numbers. The method comprises the following steps of: A, obtaining a plurality of webpages and downloading to a webpage database; B, carrying out sentence segmentation on the plurality of webpages and respectively distributing serial numbers for sentences of each webpage; C, making a forward index table including sentence serial numbers; D, making an inverted index table including the sentence index numbers; E, inputting a search item and segmenting the search item into at least one key letter, one key word or a punctuation mark; and F, calculating a sequencing weight value of a webpage including the key letter, the key word or the punctuation mark according to the inverted index table and outputting search results. By adopting the method and the device of the invention, the sequencing weight value of webpages with zero distances or smaller distances among sentences including the key letter, the keyword or the punctuation mark can be increased, thereby putting the ranking of the webpages forwards to increase the search satisfaction of users.
Owner:SHANGHAI LAISEEK CO LTD
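Steps B through F of the abstract above describe an inverted index that records sentence serial numbers, so that pages where the search terms fall in the same or nearby sentences can be ranked higher. A minimal sketch of that idea follows. The data layout, whitespace tokenization, and the names `build_inverted_index` and `min_sentence_distance` are assumptions for illustration; the abstract does not specify how the weight value is computed from the distance.

```python
from collections import defaultdict

def build_inverted_index(pages: dict[str, list[str]]) -> dict:
    """Map term -> page id -> list of sentence serial numbers containing the term.

    `pages` maps a page id to its already-segmented sentences (assumed input).
    """
    index = defaultdict(lambda: defaultdict(list))
    for page_id, sentences in pages.items():
        for number, sentence in enumerate(sentences):
            for term in set(sentence.lower().split()):
                index[term][page_id].append(number)
    return index

def min_sentence_distance(index: dict, page_id: str, term_a: str, term_b: str):
    """Smallest gap in sentence numbers between two terms on one page, or None."""
    a = index.get(term_a, {}).get(page_id, [])
    b = index.get(term_b, {}).get(page_id, [])
    if not a or not b:
        return None
    return min(abs(x - y) for x in a for y in b)

if __name__ == "__main__":
    pages = {
        "p1": ["red apple pie", "green tea"],
        "p2": ["red car", "blue sky", "apple tree"],
    }
    idx = build_inverted_index(pages)
    # A distance of 0 means both terms occur in the same sentence.
    print(min_sentence_distance(idx, "p1", "red", "apple"))
    print(min_sentence_distance(idx, "p2", "red", "apple"))
```

A ranking function could then boost pages with smaller distances, which matches the abstract's stated goal of favoring pages with zero or small sentence distances between query terms.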