84 results about "Search engine technology" patented technology

A search engine is an information retrieval program that discovers, crawls, transforms, and stores information for retrieval and presentation in response to user queries. A search engine normally consists of four components: a search interface, a crawler (also known as a spider or bot), an indexer, and a database. The crawler traverses a document collection, deconstructs document text, and assigns surrogates for storage in the search engine index. Online search engines also store images, link data, and metadata for each document.

Image intelligent mode recognition and searching method

Inactive | CN101211341A | Improve hit rate | Short response time | Character and pattern recognition | Special data processing applications | The Internet | Uniform resource locator

The invention proposes an intelligent image pattern recognition and search method. The method builds a database of image sample training sets and combines basic text search engine technology with basic image content query technology. A web crawler searches the Internet for images and parses URL information, capturing each image URL and related information into a local primary database. The images then undergo preprocessing such as preliminary filtering, decompression, and pre-classification. Color, texture, and shape features are computed for the extracted images to obtain the corresponding feature vector sets, which are combined with the image URL information before the images are saved to a basic image database and indexed. Feature vector similarity is calculated between images in the basic image database and the sample training sets, and the classified images are saved to an image classification database. Finally, the method accepts keywords or an image description entered by the user, creates an index vector, computes its similarity with the image feature vectors in the image classification database, and returns the results to the user.

Owner:SHANGHAI XINSHENG ELECTRONICS TECH

Distribution-base mass log collection system

Inactive | CN104036025A | Processing in near real time | Deal effectively with data collection issues | Database management systems | Hardware monitoring | Collection system | Distributed cache

The invention discloses a distributed mass log collection system comprising a data source layer, a distributed cache layer, a distributed storage and computing layer, a business processing layer, a visual display layer, and a unified scheduling and management module. The system effectively solves the problems of log collection and high-speed storage. By adopting distributed storage and search engine technology, it increases query and retrieval speed and allows mass logs to be collected and analyzed accurately and at high speed.

Owner:蓝盾信息安全技术有限公司

Authentication Process Using Search Technology

Active | US20110258118A1 | Finance | Protocol authorisation | Payment | Risk profiling

Systems and methods are presented for improved authentication and risk analysis processes using search engine technology. In one potential implementation, an authorization request message is received at a payment processing network as part of a transaction between a user and a merchant. The payment processing network analyzes risk based on a search history associated with the user involved in the transaction with the merchant. A response to the authentication request is made based in part on the risk associated with the user search history. In further embodiments, a user registers with a search engine as part of a service for improved authentication, where the user accepts privacy settings allowing storage of search and transaction data by a search engine server. The search engine server passes search and transaction data to a risk analysis server for creation of risk parameters which may be used to authenticate transactions.

Owner:VISA INT SERVICE ASSOC

Method for judging whether web page content is identical or not

Inactive | CN101350032A | Help filter | Easy to view | Special data processing applications | Web content | Search engine technology

The invention relates to a method for judging whether web pages have the same content, which can be used in the technical field of search engines to filter query results with identical page content. The method calculates the similarity of the web page titles and the similarity of the web page text, and judges from these two similarities whether the pages have the same content: if both the title similarity and the text similarity reach a certain threshold, the pages are determined to have the same content; otherwise they are determined to have different content.

Owner:胡辉
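
The title-and-text comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not say how similarity is measured or what the threshold values are, so the use of Python's `difflib.SequenceMatcher`, the function names, and the two threshold defaults below are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] (assumed measure, not the patent's)."""
    return SequenceMatcher(None, a, b).ratio()


def same_content(page_a: dict, page_b: dict,
                 title_threshold: float = 0.8,
                 text_threshold: float = 0.9) -> bool:
    """Treat two pages as identical only if BOTH similarities reach their thresholds.

    The threshold values are hypothetical; the abstract only says "a certain threshold".
    """
    title_sim = similarity(page_a["title"], page_b["title"])
    text_sim = similarity(page_a["text"], page_b["text"])
    return title_sim >= title_threshold and text_sim >= text_threshold
```

A search front end could call `same_content` on pairs of results and keep only one page from each group of matches.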

Method for extracting, analyzing and searching network flow and content

Active | CN103281213A | Solve the repeatability | Solve problems such as serial number reset to zero | Data switching networks | Special data processing applications | Data information | Original data

The invention discloses a method for extracting, analyzing, and searching network traffic and content. The method comprises the following steps: splitting the original traffic into n data processing queues; having each queue independently process its original data messages, performing protocol recognition and filtering on the messages and reassembling the TCP (Transmission Control Protocol) sessions they contain; performing protocol parsing and decoding on each reassembled TCP session and extracting the structured data it contains; and, for key information specified by requirements, searching and labeling the data extracted by a content parsing and extraction module using a multi-pattern matching algorithm or search engine technology, then submitting the labeling results to a search labeling information database, which provides search labeling results to multiple kinds of applications. The method addresses problems such as duplicate data packets and sequence numbers resetting to zero during TCP session reassembly, labels the original traffic at the character level, and lets users obtain useful information conveniently.

Owner:XI AN JIAOTONG UNIV

Search engine technology based on relevance feedback and clustering

Inactive | CN101853272A | Meet the query requirements | Won't throw away | Special data processing applications | Web page | Retrieval result

The invention relates to a search engine technology based on relevance feedback and clustering. It uses both user relevance feedback and relevance ranking to guide the clustering of retrieval results, so that the final partitioning of the results meets the user's query requirements. During clustering, large numbers of documents irrelevant to the user and duplicate web pages are removed, which speeds up clustering and improves the retrieval results. In addition, cluster centers are not modified by clusters irrelevant to the user, which ensures that documents relevant to the user are not lost when clustering irrelevant documents introduces noise.

Owner:NORTH CHINA ELECTRIC POWER UNIV (BAODING)

Searching engine based on information extraction technique

Inactive | CN1410918A | Improve work efficiency | Data acquisition and logging | Learning based | Relevant information

A machine learning method is applied to a set of sample HTML pages that contain homogeneous information in similar layouts, in order to obtain rules for extracting information and structural information from quasi-free HTML text. The number of rules and their degree of abstraction are adjusted through training to meet the precision requirement. The learned rule set is then used to extract information from text files outside the sample set, and from pages with specific content collected by the search engine. The invention raises information-processing efficiency by combining the search engine with information extraction techniques.

Owner:ZHEJIANG UNIV

Bayesian model-based commodity code classification method and system

Active | CN107704892A | Solve the sparsity problem | Solving Encoding Classification Problems | Character and pattern recognition | Natural language data processing | Classification methods | Conditional probability

The invention provides a Bayesian model-based commodity code classification method and system. To address the sparsity and loss of context caused by the short-text nature of commodity names, the method applies synonym expansion to the sparse words obtained after word segmentation and enriches the synonyms of each word using an external search engine. A Bayesian model is then used to calculate the conditional probabilities between the word sequences of commodity names and the code categories, yielding a Bayesian code classification model; the trained model is used to predict the commodity code category of an input commodity name.

Owner:NINGBO AISINO +1
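
As a rough illustration of the classification step in this abstract, the sketch below trains a multinomial naive Bayes model with Laplace smoothing on word-segmented commodity names. The abstract does not specify which Bayesian variant is used, so this choice, the class name `NaiveBayesCoder`, and the category labels in the usage note are assumptions; the synonym expansion step is not covered.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesCoder:
    """Multinomial naive Bayes over segmented commodity names (illustrative only)."""

    def __init__(self):
        self.class_counts = Counter()            # number of names per code category
        self.word_counts = defaultdict(Counter)  # word frequencies per code category
        self.vocab = set()

    def fit(self, samples):
        """samples: iterable of (list_of_words, code_category) pairs."""
        for words, code in samples:
            self.class_counts[code] += 1
            self.word_counts[code].update(words)
            self.vocab.update(words)

    def predict(self, words):
        """Return the category with the highest log posterior for the word sequence."""
        total = sum(self.class_counts.values())
        best_code, best_score = None, -math.inf
        for code, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            denom = sum(self.word_counts[code].values()) + len(self.vocab)
            for w in words:
                # Laplace smoothing keeps unseen words from zeroing the probability.
                score += math.log((self.word_counts[code][w] + 1) / denom)
            if score > best_score:
                best_code, best_score = code, score
        return best_code
```

For example, after fitting on a few names labeled with hypothetical categories such as `"STEEL"` and `"TEXTILE"`, `predict(["steel", "pipe"])` returns the category whose training names share the most vocabulary with the input.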

Method for Identifying Phone Numbers and Alphanumeric Sequences

Inactive | US20100005426A1 | Reduce decrease | Natural language data processing | Office automation | Voice over IP | Application software

A method and system provide the ability to identify phone numbers and alphanumeric sequences in a text source such as a web page and format them and/or convert them into usable information. According to one aspect, the invention automatically identifies and formats phone numbers in web pages so that they can be automatically dialed using VoIP or PSTN phone services. According to other aspects, the formatted phone number can be provided to any network voice applications, such as voice-over-IP (VoIP) solutions, voice chat, voice IM, and wireless, WiFi, and WiMax connections. According to additional aspects, the invention can be integrated with internet browsing applications to allow automatic dialing of phone numbers over VoIP services or PSTN services. According to still other aspects, the invention allows technical support, electronic commerce, customer support numbers, phone numbers for product ordering, etc., on company web pages to be identified and dialed with a mouse click. According to yet other aspects, a search engine provider using the invention can charge an advertiser for phone calls placed to the advertiser, which are far more valuable than “clicks” on advertiser links, thereby reducing the “click fraud” problem afflicting current search engine technology.

Owner:VAN BENEDICT +1
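
To make the identify-and-format idea in this abstract concrete, here is a small sketch that finds ten-digit phone numbers in page text and turns them into dialable `tel:` links. The regular expression, the restriction to North American number layouts, and the `+1` country prefix are assumptions for illustration; the patent abstract does not describe its matching rules.

```python
import re

# Matches layouts such as (800) 555-0199, 800-555-0199, and 800.555.0199.
# This pattern is an assumption and covers only ten-digit numbers.
PHONE_RE = re.compile(r"\(?\b(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})\b")


def find_phone_links(text: str) -> list[str]:
    """Return a tel: URI for every phone number found in the text."""
    links = []
    for match in PHONE_RE.finditer(text):
        area, prefix, line = match.groups()
        # The +1 country code is assumed; real pages would need locale detection.
        links.append(f"tel:+1{area}{prefix}{line}")
    return links
```

A browser extension or page rewriter could wrap each match in a link using the returned URI so that a click hands the number to a VoIP or phone application.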

Searching method of search engine system

Inactive | CN1991829A | Search results are accurate | Best sort | Special data processing applications | Internet search engines | Database

The invention relates to search engine technology, specifically to a search method for user search requests expressed in natural language. The search system receives search requests submitted in different ways, in natural language and in any domain. The search engine looks up the user's request in a search history database, classifies the question, queries an Internet search engine or a registered expert group, and forms a preliminary result record set. Each record in the preliminary results is then voted on, either automatically by a program or by registered voters, and a score is calculated for each record to form a confirmed result record set. The confirmed results are returned to the user in order, and the user's question and the confirmed results are stored in the search history database. The user can also classify, confirm, and vote on the returned results.

Owner:陈亚斌

Authentication process using search technology

Inactive | US20140058949A1 | Finance | Protocol authorisation | Payment | Risk profiling

Systems and methods are presented for improved authentication and risk analysis processes using search engine technology. In one potential implementation, an authorization request message is received at a payment processing network as part of a transaction between a user and a merchant. The payment processing network analyzes risk based on a search history associated with the user involved in the transaction with the merchant. A response to the authentication request is made based in part on the risk associated with the user search history. In further embodiments, a user registers with a search engine as part of a service for improved authentication, where the user accepts privacy settings allowing storage of search and transaction data by a search engine server. The search engine server passes search and transaction data to a risk analysis server for creation of risk parameters which may be used to authenticate transactions.

Owner:VISA INT SERVICE ASSOC

Key word dynamic matching generating based on search engine technology

Inactive | CN101000608A | Avoid problems | Maintain fairness | Special data processing applications | Web site | Internet users

A method for dynamically matching and generating keywords based on search engine technology. A search engine matches and generates multiple search keywords for the knowledge information on a specific web page or website. The ranking parameter changes dynamically as Internet users click on that knowledge information via those keywords, so that public judgment serves as a search attribute that improves search accuracy.

Owner:吴风勇

System and method for rural information service and comprehensive management

Inactive | CN102142973A | Complete management | Efficient management | Data switching networks | Special data processing applications | Embedded technology | Time information

The invention discloses a system and a method for rural information service and comprehensive management. The system comprises seven parts: a comprehensive management terminal system for rural information, an information management platform system, a distributed database group system, an expert and government agency management system, a network information search engine system, a shared short message platform, and a rural comprehensive information website. Information is exchanged over the Internet and the 3G network, and the seven parts are combined around the information management platform. The system can use hardware such as the rural information comprehensive management terminal, the shared short message gateway, and servers, and it integrates embedded technology, Internet technology, distributed database technology, search engine technology, information acquisition and automatic classification technology, multimedia information broadcasting technology, e-commerce technology, and the like. To meet rural residents' information needs, it provides convenient, fast, real-time information services and management, and it supplies governments with accurate rural information for policy making.

Owner:HUNAN CITY UNIV

Method for automatically recognizing semanteme of natural language sentences understood by computer

Inactive | CN102681982A | Implement automatic semantic annotation | Get rid of complexity | Special data processing applications | Semantic matching | Questions and answers

The invention relates to a method for automatic semantic recognition of natural language sentences by a computer, in particular a method for accurately recognizing Chinese. The steps are: (a) establishing an ontology base for a given field; (b) establishing a semantic framework knowledge base on the basis of the field ontology; and (c) directly matching natural language sentences against semantic structures using the ontology mapping of the semantic frameworks, and recognizing matches according to the framework patterns. The method differs substantially from the mainstream word segmentation approach of second-generation search engine technology. The segmented words carry concept annotations from the field ontology, so accurate semantic matching of natural language sentences can be obtained and the computer system can compute and reason over the ontology knowledge; the method therefore has broad application prospects in deep artificial-intelligence question answering.

Owner:SHANGHAI YUNSOU NETWORK TECH

Index maintenance method for supporting multiple data sources

Inactive | CN101989301A | Avoid merging | Achieve mutual coexistence | Special data processing applications | Data source | Engineering

The invention belongs to the technical field of search engines, and in particular relates to an index maintenance method that supports multiple data sources. The entire index library is divided into a series of sub-index libraries; each sub-index library stores the index for a certain time granularity and consists of an independent directory and its related files. The method comprises three operations: loading data into sub-index libraries, merging sub-index libraries, and processing user retrieval requests. Using sub-index libraries makes real-time index updates straightforward; setting an appropriate merge detection period lets sub-index libraries of different time granularities coexist; and retrieval requests limited to a user-specified time range are mapped onto the matching sub-index libraries. Because the index is updated in independent sub-index libraries without affecting user retrieval requests, response time can be kept within user requirements.

Owner:FUDAN UNIV
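
The three operations named in this abstract (loading, merging, and time-ranged retrieval over sub-index libraries) can be sketched in memory as follows. The class and method names are hypothetical, a day is assumed to be the finest time granularity, and the patent's on-disk directory layout and merge detection period are not modeled.

```python
from collections import defaultdict
from datetime import date


class SubIndex:
    """An inverted index covering the closed date range [start, end]."""

    def __init__(self, start: date, end: date):
        self.start, self.end = start, end
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text: str):
        for term in text.lower().split():
            self.postings[term].add(doc_id)


class TimePartitionedIndex:
    def __init__(self):
        self.subs = []

    def add(self, doc_id, text: str, day: date):
        """Load a document into the daily sub-index, creating it if needed."""
        for sub in self.subs:
            if sub.start == sub.end == day:
                sub.add(doc_id, text)
                return
        sub = SubIndex(day, day)
        sub.add(doc_id, text)
        self.subs.append(sub)

    def merge(self, start: date, end: date):
        """Merge every sub-index fully inside [start, end] into one coarser sub-index."""
        merged, keep = SubIndex(start, end), []
        for sub in self.subs:
            if start <= sub.start and sub.end <= end:
                for term, ids in sub.postings.items():
                    merged.postings[term] |= ids
            else:
                keep.append(sub)
        self.subs = keep + [merged]

    def search(self, term: str, start: date, end: date):
        """Query only sub-indexes overlapping the range.

        Simplification: a coarse sub-index that partly overlaps is searched whole.
        """
        hits = set()
        for sub in self.subs:
            if sub.start <= end and sub.end >= start:
                hits |= sub.postings.get(term, set())
        return hits
```

Because new documents always land in small daily sub-indexes, updates never touch the larger merged ones, which is the property the abstract relies on to keep queries responsive.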

Method and system for scheduling tasks of distributed network crawlers

Inactive | CN103514301A | Web data indexing | Special data processing applications | Web crawler | Web page

The invention belongs to the technical field of Internet search engines and provides a method and system for scheduling the tasks of distributed web crawlers. The method comprises the following steps: configuring a distributed web crawler cluster; having a first crawler parse the web page corresponding to a first-layer link and extract the second-layer links in that page; assigning the crawl task for each second-layer link according to a consistent hashing algorithm; if a second-layer link is assigned to a crawler other than the first crawler, recording its crawl task in the crawl task file of the crawler with the corresponding sequence number; packaging and uploading the crawl task files to a shared directory at preset intervals; and having each crawler periodically fetch and execute its crawl tasks from the shared directory. The shared directory thus enables cooperative task scheduling across the distributed crawlers, so that tasks are distributed evenly among them.

Owner:SHENZHEN COSHIP ELECTRONICS CO LTD
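
The link-assignment step in this abstract can be sketched with a standard consistent hash ring. The ring construction with virtual nodes, the MD5 hash, and the names `HashRing` and `schedule` are assumptions; the abstract only states that a consistent hashing algorithm assigns second-layer links to crawlers, and the shared-directory upload is represented here by an in-memory outbox.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, crawler_ids, replicas: int = 100):
        # Each crawler gets several virtual nodes so load spreads more evenly.
        self._ring = sorted(
            (_hash(f"{cid}#{i}"), cid) for cid in crawler_ids for i in range(replicas)
        )
        self._keys = [h for h, _ in self._ring]

    def crawler_for(self, url: str):
        """Return the crawler owning the first ring position at or after the URL hash."""
        idx = bisect.bisect(self._keys, _hash(url)) % len(self._keys)
        return self._ring[idx][1]


def schedule(links, ring: HashRing, local_id):
    """Split extracted links into local tasks and per-crawler task lists.

    The outbox stands in for the crawl task files that the abstract says are
    uploaded to a shared directory.
    """
    local, outbox = [], {}
    for url in links:
        owner = ring.crawler_for(url)
        if owner == local_id:
            local.append(url)
        else:
            outbox.setdefault(owner, []).append(url)
    return local, outbox
```

Since every crawler builds the same ring, each one independently agrees on which crawler owns a given URL without any central coordinator.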

Database retrieval method based on search engine technology

Inactive | CN103744913A | Efficient search function | Quick search | Relational databases | Special data processing applications | Relational database | Retrieval result

The invention discloses a database retrieval method based on search engine technology. The method comprises: S1, setting up a retrieval server with a service interface that provides an auxiliary fast retrieval service; the retrieval server obtains data from a relational database and builds indexes; S2, a client sends a retrieval request to the retrieval server through the service interface; S3, the retrieval server obtains a retrieval result according to the retrieval conditions and sends it to the client through the service interface; S4, the client processes and displays the returned result. By adding the retrieval server, the method achieves efficient retrieval without affecting the performance of the original database.

Owner:GOSUNCN TECH GRP
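
Steps S1 to S3 of this abstract can be illustrated with a small in-process sketch: a retrieval component reads rows from a relational database, builds an inverted index, and answers keyword queries from that index. The `documents` table with `id` and `body` columns, the class name `RetrievalServer`, and the use of SQLite are assumptions for illustration; the service interface and network transport are omitted.

```python
import sqlite3
from collections import defaultdict


class RetrievalServer:
    """Builds an inverted index over a relational table (hypothetical schema)."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.index = defaultdict(set)  # term -> set of row ids

    def build_index(self):
        """S1: pull rows from the database and index their terms."""
        for row_id, body in self.conn.execute("SELECT id, body FROM documents"):
            for term in body.lower().split():
                self.index[term].add(row_id)

    def search(self, query: str):
        """S3: resolve the query against the index, then fetch matching rows by id."""
        terms = query.lower().split()
        if not terms:
            return []
        ids = set.intersection(*(self.index.get(t, set()) for t in terms))
        if not ids:
            return []
        placeholders = ",".join("?" * len(ids))
        sql = f"SELECT id, body FROM documents WHERE id IN ({placeholders}) ORDER BY id"
        return self.conn.execute(sql, sorted(ids)).fetchall()
```

The database is only touched for a primary-key lookup at the end, so full-text matching does not add scan load to the original database.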

An electric power marketing knowledge base system based on an intelligent retrieval technology

Inactive | CN109189752A | Improve service support capabilities | Improve normative | Website content management | Special data processing applications | Personalization | Multi platform

The invention discloses an electric power marketing knowledge base system based on an intelligent retrieval technology. The system as a whole follows a B/S (browser/server) structure and comprises: a storage layer, which classifies and stores knowledge according to standards and also holds a backup of the knowledge base; a service layer, which connects the storage layer and the application layer and provides functions and services such as data, data exchange, and data quality; and an application layer, which displays the system's operation interface and provides the relevant application functions to users. The system uses the distributed search engine Elasticsearch for real-time retrieval of big data. The application layer runs on the Web side and the storage layer on the server side. While remaining convenient to operate, the system can summarize the knowledge contained in massive numbers of documents, forms an orderly document classification system, and lets users manage documents conveniently. With Elasticsearch providing real-time retrieval, the system supports extremely fast file dynamics, multi-platform file interconnection, intelligent full-text retrieval, customization of personalized rules, window-like operation, and so on.

Owner:ELECTRIC POWER RESEARCH INSTITUTE OF STATE GRID SHANDONG ELECTRIC POWER COMPANY +1

Web crawler system based on browser kernel

Inactive | CN106649567A | Make up for deficiencies | Web data indexing | Program loading/initiating | Network communication | Web crawler

The invention relates to web search engine technology and aims to provide a web crawler system based on a browser kernel. The system comprises a browser engine module, a network communication module, and a strategy module, and is used to parse pages and discover the URLs of other pages. Using dynamic analysis, the system loads the resources a page depends on through the built-in browser kernel, executes JavaScript, and performs dynamic operations on DOM nodes, such as simulated mouse clicks, double clicks, and Enter key presses, to discover new pages, overcoming the shortcomings of traditional crawlers.

Owner:HANGZHOU ANHENG INFORMATION TECH CO LTD

Method of managing websites registered in search engine and a system thereof

Active | US20080208858A1 | Automatic detection | Enhance self-purification | Data processing applications | Web data indexing | Web site | Database

The present invention relates to a search engine that provides information on a predetermined website on the Internet. According to a preferred embodiment of the present invention, there is provided a method of managing websites registered in a search engine in a search engine administration system, comprising the steps of allowing a predetermined interface module to receive information on a website and allowing a website registration module to sort the received website information by the predetermined field and then to record the sorted information in a database means; extracting an HTML file constituting web pages of the website; detecting a predetermined function that generates a pop-up window by analyzing the extracted HTML file; increasing a predetermined counter value as much as a given value depending on the number of pop-up windows generated due to the detected function; determining whether the counter value exceeds a predetermined value; and if it is determined that the counter value exceeds the predetermined value, controlling a predetermined process to be performed for the registered website.

Owner:NHN CORP
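
The pop-up counting logic in this abstract can be sketched in a few lines. The abstract does not name the pop-up-generating function it detects, so treating JavaScript's `window.open(` as that function is an assumption, as are the limit value and the `"flag"` / `"ok"` outcomes standing in for the "predetermined process".

```python
import re

# Assumed pop-up trigger: a JavaScript window.open call in the page source.
POPUP_RE = re.compile(r"window\.open\s*\(")


def count_popups(html: str) -> int:
    """Count pop-up-generating calls in one HTML file."""
    return len(POPUP_RE.findall(html))


def review_site(pages, limit: int = 2) -> str:
    """Add up pop-up calls across a site's pages and compare against a limit.

    The limit and the returned labels are hypothetical placeholders.
    """
    counter = sum(count_popups(html) for html in pages)
    return "flag" if counter > limit else "ok"
```

An administration system could run `review_site` on each registered website's extracted HTML and queue flagged sites for review or removal.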

Problem-posing commodity information consultation method based on search engine technique

Inactive | CN101882291A | Accurate marketing | Commerce | Special data processing applications | The Internet | Sales promotion

The invention discloses a question-driven commodity information consultation method based on search engine technology, relating to Internet technology. The method comprises the following steps: a merchant stores commodities in a database together with the consumer problems each commodity can solve; a consumer enters a problem encountered in daily life, and the computer searches the database for commodities that can solve it; the consumer's problem is immediately forwarded to the merchant at their computer, prompting the merchant to answer online; the consumer can rate the merchant's answers; and the ratings accumulate on the merchant's account and determine the rank of the merchant's commodities in search results. With this method, consumers only need to enter the problems they meet in life to find products and services that solve them, without knowing any product terminology; likewise, merchants can promote to target users precisely without large-scale advertising.

Owner:万昌洵 +1

Method of search engine log data mining facing user information requirements

Active | CN103164537A | Improve service quality | Efficient and fast division | Special data processing applications | Personalization | Data mining

The invention relates to the field of Internet search engine log segmentation, and in particular to a method of mining search engine log data oriented to user information requirements. The method comprises the steps of query log block classification, query similarity calculation, and user information requirement provision. Search term similarity and search result similarity are combined into a query similarity, which is used to judge whether two queries express the same information requirement, so that search logs can be segmented effectively and quickly. To address the inability of traditional search engine quality evaluation methods to fully describe users' complex and vague information requirements, the invention provides a method for evaluating how well a search engine satisfies user information requirements based on behavior logs. Taking the user information requirement as the unit, it evaluates user satisfaction by analyzing users' search behavior in the logs and analyzes users' personal requirements, thereby promoting the development of search engine technology and improving search engine service quality.

Owner:ZHEJIANG HONGCHENG COMP SYST
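
The query similarity described in this abstract combines term similarity with result similarity. The sketch below is one possible reading: both similarities are Jaccard overlaps, they are mixed with a weight `alpha`, and consecutive queries whose combined similarity reaches a threshold are placed in the same log segment. The Jaccard measure, the weight, the threshold, and the dictionary layout of a query record are all assumptions.

```python
def jaccard(a, b) -> float:
    """Overlap of two collections as |A ∩ B| / |A ∪ B| (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def query_similarity(q1: dict, q2: dict, alpha: float = 0.5) -> float:
    """Weighted mix of search-term similarity and search-result similarity."""
    term_sim = jaccard(q1["terms"], q2["terms"])
    result_sim = jaccard(q1["results"], q2["results"])
    return alpha * term_sim + (1 - alpha) * result_sim


def segment_log(queries, threshold: float = 0.3):
    """Group consecutive queries that appear to share one information requirement."""
    segments = []
    for q in queries:
        if segments and query_similarity(segments[-1][-1], q) >= threshold:
            segments[-1].append(q)
        else:
            segments.append([q])
    return segments
```

Each resulting segment can then be treated as one information requirement when analyzing clicks and reformulations for satisfaction.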

Search algorithm for Chinese word segmentation

Active | CN108846016A | Improve search efficiency | Less build time | Natural language data processing | Special data processing applications | Theoretical computer science | Chinese word

The invention belongs to the technical field of text search engines and specifically relates to a search algorithm for Chinese word segmentation. The algorithm has two phases: offline indexing and online searching. In the offline indexing phase, the suffix string sets of all original strings are extracted and used to generate an improved suffix tree. In the online searching phase, the query results for a keyword are obtained from the suffix-tree-based index model, the matching degree between the keyword and each query result is quantified, and the results are sorted by matching degree from high to low and returned. Through the improved suffix-tree-based index structure, the algorithm balances index construction time against space usage, and its search efficiency is much higher than computing the matching degree by brute force and sorting the result set.

Owner:FUDAN UNIV

Network abstract customization search engine

Inactive | CN101059815A | Timely and convenient grasp | Find out exactly | Special data processing applications | The Internet | Information needs

The invention relates to a search engine service platform system capable of automatic generation based on network creation, and a related method. A user of the network platform sets keywords and a catalog of link subjects according to personal needs, and the system automatically browses the Internet layer by layer, on a schedule or in real time, for the new information the user needs. Whenever the user is online, the system automatically stores the retrieved information in the user's network space, and it lets the user set the search time and frequency according to personal needs. The invention combines the novel search engine technique with the user's requirements on direction, time and position, thereby realizing network abstract creation with search engine technology.

Owner:宋鸣

Speech recognition and cloud search engine technology based man-machine interactive system and method

Inactive | CN103902613A | Achieve preservation | Realize processing | Speech recognition | Transmission | Key pressing | Man machine

The invention relates to a man-machine interactive system and method based on speech recognition and cloud search engine technology. The system comprises an MIC interface, a speech receiving module, a key processing module, an infrared processing module, a storage module, a wireless emitting module, a power management module and a main processing chip; all of the other components are connected to the main processing chip. The system and method overcome the cumbersome manual input of a remote controller by allowing direct speech input. Users can learn about TV shopping actively instead of passively as in the past, which addresses their earlier lack of interest in shopping and increases user participation and interaction.

Owner:QINGDAO PENGHAI SOFT CO LTD

Label automatic generation method based on meta-search engine

Inactive | CN106682149A | Reduce coverage | Guaranteed recall | Special data processing applications | Part of speech | Algorithm

The invention discloses an automatic label generation method based on a meta-search engine. The method comprises the following steps. First, text preprocessing is optimized: Chinese word segmentation is performed while basic information about each word is saved, including part of speech, word position and word frequency, stored as a five-tuple. Second, the words are filtered: stop words are removed and part-of-speech filtering is applied, keeping nouns, verbs and gerunds based on experience to reduce noise. Third, the information content of each word is recalculated: from the saved basic information, a word position score, word frequency and word span are computed and combined into a comprehensive score used as the word's weight. Finally, the similarity between words is calculated and used as the edge weight in the TextRank algorithm, which computes a TextRank value for each word. By combining meta-search engine technology with automatic label generation and applying automatic labeling to the search engine, the method aims to ensure both recall and precision.
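
The final TextRank step can be sketched as a weighted PageRank over a word graph. The sketch below covers only that step and makes several assumptions: the input is an already segmented and filtered word list, co-occurrence counts within a small `window` stand in for the word similarity used as edge weight, and the damping factor and iteration count are conventional defaults rather than values from the patent.

```python
from collections import defaultdict


def textrank(words, window=2, damping=0.85, iterations=50):
    """Rank words by weighted TextRank; returns words from highest to lowest score."""
    # Build a symmetric weighted graph from co-occurrence within the window.
    weights = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + 1 + window, len(words))):
            if words[j] != w:
                weights[w][words[j]] += 1.0
                weights[words[j]][w] += 1.0

    scores = {w: 1.0 for w in weights}
    for _ in range(iterations):
        updated = {}
        for w in weights:
            # Each neighbour v passes on its score in proportion to the edge weight.
            rank = sum(
                weights[v][w] / sum(weights[v].values()) * scores[v]
                for v in weights[w]
            )
            updated[w] = (1 - damping) + damping * rank
        scores = updated
    return sorted(scores, key=scores.get, reverse=True)


tokens = ["search", "engine", "label", "search", "index", "search", "label"]
labels = textrank(tokens)
```

The top-ranked words would then serve as candidate labels; the position, frequency and span weighting from the earlier steps is not modelled here.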

Owner:HUNAN UNIV OF SCI & ENG

Applying search engine technology to HCM employee searches

Active | US7991787B2 | Reliable method | Provide quickly | Web data indexing | Digital data processing details | Data mining | Result list

The present system provides an efficient and reliable method for name searching within an employee records database. The present invention uses a plurality of different searching algorithms such as an exact algorithm and a fuzzy algorithm. The exact algorithm is used to provide a first set of a limited number of results from the entire employee database. The fuzzy algorithm is then used to search through only the first set of results to quickly provide a ranked results list of employee names that is displayed to a user. The user is then able to select the appropriate name from the results list for further processing.
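
The two-stage idea, an exact pass that narrows the database followed by a fuzzy pass that ranks only the narrowed set, might look like the sketch below. The specific choices are assumptions for illustration: the exact stage matches on the last name token, the fuzzy stage uses `difflib.SequenceMatcher` from the Python standard library, and `limit` caps the first result set.

```python
from difflib import SequenceMatcher


def exact_stage(names, query, limit=100):
    """Exact pass: keep names whose last token equals the query's last token."""
    tokens = query.split()
    if not tokens:
        return []
    last = tokens[-1].lower()
    matches = [n for n in names if n.split() and n.split()[-1].lower() == last]
    return matches[:limit]


def fuzzy_stage(candidates, query):
    """Fuzzy pass: rank the reduced candidate set by string similarity."""
    def score(name):
        return SequenceMatcher(None, query.lower(), name.lower()).ratio()
    return sorted(candidates, key=score, reverse=True)


def search_employees(names, query, limit=100):
    return fuzzy_stage(exact_stage(names, query, limit), query)


employees = ["Alice Smith", "John Smith", "Bob Jones"]
ranked = search_employees(employees, "Jon Smith")
```

Running the cheaper exact pass first keeps the more expensive similarity scoring confined to a small candidate set.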

Owner:SAP AG

Server and Chinese character segmentation method and device

Active | CN104462105A | Improve accuracy | Quick fix | Mathematical models | Machine learning | Pattern recognition | Chinese characters

The invention discloses a server and a Chinese character segmentation method and device, and belongs to the technical field of search engines. The Chinese character segmentation method includes: receiving a word segmentation instruction; acquiring a first Chinese character set; acquiring the retrieval information corresponding to each character in the first character set according to preset correspondences; acquiring multiple combined characters and their retrieval probabilities from the first character set and the retrieval information of each character; combining paths from the characters included in the combined characters; acquiring the retrieval probability of each path; determining the path with the highest probability; and segmenting the keywords according to the combined characters on that path. Manual segmentation is avoided and there is no dependence on tools such as dictionaries, so operation is convenient. Data sources are dynamically updated, wrong segmentations can be corrected rapidly, new characters are distinguished well, and segmentation accuracy is improved.
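
Selecting the highest-probability path can be illustrated with a small dynamic program. This is only a sketch under assumptions: the `probs` table is a hypothetical stand-in for the retrieval probabilities the method derives from search data, path probability is taken as the product of the piece probabilities, and unknown single characters receive an assumed floor probability so that a path always exists.

```python
import math


def segment(text, probs, max_len=4, floor=1e-6):
    """Return the segmentation of text whose pieces have the highest joint probability."""
    # best[i] = (log-probability of the best path covering text[:i], start of last piece)
    best = [(-math.inf, 0)] * (len(text) + 1)
    best[0] = (0.0, 0)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            p = probs.get(piece)
            if p is None and len(piece) == 1:
                p = floor              # assumed fallback for unseen characters
            if p is None:
                continue
            score = best[start][0] + math.log(p)
            if score > best[end][0]:
                best[end] = (score, start)
    # Walk back from the end of the text to recover the pieces.
    pieces, end = [], len(text)
    while end > 0:
        start = best[end][1]
        pieces.append(text[start:end])
        end = start
    return pieces[::-1]


probs = {"搜索": 0.4, "引擎": 0.3, "索引": 0.1, "搜": 0.05, "索": 0.05, "擎": 0.01}
pieces = segment("搜索引擎", probs)
```

Because the probability table is just data, refreshing it from new search logs changes the segmentation without touching the algorithm.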

Owner:TENCENT TECH (SHENZHEN) CO LTD

Method and system for language classification of sites

Active | CN104572767A | Fulfil requirements | Simple system architecture | Web data indexing | Special data processing applications | Algorithm | Classification methods

The invention provides a method and a system for language classification of sites. The method comprises the following steps: for each language, searching with a preset default word of that language to obtain all web page links corresponding to the language; classifying the web page links by link address, so that each class corresponds to one site; sampling some of the web page links from the sub-class of each site to form a sample set; generating a training model for the language from the number and the language information of the web page links in the sample set; classifying the web page link set of the web page resources to be detected by site to obtain each site that needs to be detected; and obtaining a predicted language value for each site to be detected from the language training model. Building on language recognition for single web pages, the invention provides a reasonable and efficient method for language classification of sites; the system framework is simple and easy to maintain, and meets the requirements of modern search engine technology.
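
The step of grouping links by site and deriving a site-level language from page-level results can be sketched as below. The sketch replaces the trained model with a simple majority vote and assumes that each page already carries a language label from a single-page recognizer; the host name is used as the site key, and the `.example` URLs are placeholders.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def classify_sites(pages):
    """Map each site (host name) to its majority page language and that language's share."""
    by_site = defaultdict(Counter)
    for url, lang in pages:
        by_site[urlparse(url).netloc][lang] += 1   # group links by site
    result = {}
    for site, counts in by_site.items():
        lang, hits = counts.most_common(1)[0]
        result[site] = (lang, hits / sum(counts.values()))
    return result


pages = [
    ("http://a.example/1", "fr"),
    ("http://a.example/2", "fr"),
    ("http://a.example/3", "en"),
    ("https://b.example/x", "de"),
]
site_languages = classify_sites(pages)
```

The share returned alongside each language gives a rough confidence that a caller could threshold before accepting the site-level label.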

Owner:NEW FOUNDER HLDG DEV LLC +2

Methods and system for improving the relevance, usefulness, and efficiency of search engine technology

Inactive | US20180121545A1 | Improve relevance | Improve the usefulness | Web data indexing | Semantic analysis | Semantic search | Full scale

The disclosed methods, systems, and apparatus use Natural Language Processing (NLP) in conjunction with a world model and cognitive frames to semantically analyze, understand, rank, store, and retrieve digital text. The goal is to improve the relevance, usefulness and efficiency of information search. The world model represents things existing in the real world, whereas cognitive frames specify possible user interactions with such a world. Using NLP in conjunction with a world model and cognitive frames to understand text is an advancement in automated text analysis. It addresses three serious shortcomings of existing search technology: an inadequate measure of the meaningful content in web pages, a poor understanding of users' goals and tasks in their search, and irrelevant search results. The disclosed methods have led to the successful implementation of a full-scale semantic search engine in medicine, and they are applicable and adaptable to other disciplines.

Owner:COGILEX R&D INC
