30 results about "Web crawler" patented technology

A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web, typically for the purpose of Web indexing (web spidering).

Systems and methods for interactive scheduling

Active | US12664521B1 | Natural language data processing | Office automation | Data pack | Web crawler

Disclosed herein are embodiments of systems, methods, and products comprising an analytic server that automatically manages appointment scheduling. The analytic server receives a customer request to schedule an appointment. The analytic server determines the data required from both the customer and the service provider to make the appointment. The analytic server retrieves customer data comprising requested service attributes, user preferences, and user attributes from an internal database and external data sources. The analytic server retrieves service provider data comprising provider service attributes and provider attributes from an internal database and external data sources. The analytic server accesses external data sources by web crawling various websites. The analytic server executes an artificial intelligence model to predict user preferences and needs. The analytic server determines potential service providers that best match the customer's input or predicted preferences. The analytic server generates an appointment for each matching service provider and transmits an electronic message comprising the appointments to the customer device.

Owner:UNITED SERVICES AUTOMOBILE ASSOCIATION (USAA)

A method and device for evaluating influence effect of an academic recommendation algorithm based on a social robot

Pending | CN122262395A | Realize accurate quantification | Improve causal explanation power | Web data indexing | Text database indexing | Engineering | Social robot

A method and device for evaluating the influence effect of an academic recommendation algorithm based on a social robot, the method comprising: collecting academic papers, citation relationships and author information through a data interface or a web crawler, and screening target personnel data according to a research field; constructing a virtual social robot and establishing a research interest vector of the virtual social robot based on historical literature and research theme information of the target personnel; presetting multiple literature acquisition strategies for the social robot to form differentiated experimental conditions; controlling the social robot to perform operations such as paper retrieval, access and click recommendation on an academic platform, and recording literature access paths, recommendation results and browsing sequence data in real time; calculating the research theme distribution of the social robot according to a literature set contacted in the experiment, and comparing the research theme distribution with the initial research interest, so as to obtain the change degree of the research direction; and comparing the change results of the research direction of the social robot under different strategies, and quantitatively evaluating the influence degree of the recommendation system on the research direction evolution of the scientific researchers.

Owner:ZHEJIANG UNIV OF TECH
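
The final step of the entry above compares the research theme distribution of the literature a social robot encountered with its initial research interest. The abstract does not name a distance measure, so the sketch below assumes Jensen-Shannon divergence over a fixed topic vocabulary; the function names and the sample topics are hypothetical.

```python
import math
from collections import Counter

def topic_distribution(topics, vocabulary):
    """Normalize topic counts over a fixed vocabulary into a probability vector."""
    counts = Counter(topics)
    total = sum(counts[t] for t in vocabulary)
    if total == 0:
        raise ValueError("no topics from the vocabulary were observed")
    return [counts[t] / total for t in vocabulary]

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical data: topics of the robot's seed papers vs. papers it visited.
vocab = ["nlp", "vision", "robotics"]
initial = topic_distribution(["nlp", "nlp", "vision"], vocab)
after = topic_distribution(["vision", "vision", "robotics"], vocab)
shift = js_divergence(initial, after)  # degree of change in research direction
```

Computing `shift` once per literature acquisition strategy would give one number per experimental condition to compare.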

Systems and method for large language model-based differentiation

Active | US12675639B1 | Differentiator | Engineering

A system for Large Language Model (LLM) based differentiation, the system including a processor configured to receive input data associated with an entity, classify the input data to a descriptive class, command an adaptive web crawler to retrieve descriptive content associated with the descriptive class, extract a plurality of positioning signals from the input data associated with the entity, generate a contrast score for each positioning signal of the plurality of positioning signals by comparing the plurality of positioning signals to the descriptive content, encode the contrast score into a differentiator profile including at least one categorical tag and at least one weighted relationship for each positioning signal, modify a generation behavior of a base LLM using the differentiator profile as a conditioning input and generate, by the base LLM conditioned on the differentiator profile, one or more differentiator outputs.

Owner:CRISP INC

Crawler configuration method and related device

Pending | CN122132613A | Web data indexing | Execution for user interfaces | Configuration item | Web crawler

This application provides a web crawler configuration method and related equipment. The web crawler configuration method includes: displaying a corresponding web crawler configuration interface in response to the selection of a configuration mode, the web crawler configuration interface containing multiple web crawler configuration items; obtaining a preliminary configuration scheme in response to configuration operations on each of the web crawler configuration items; executing corresponding debugging steps according to the preliminary configuration scheme in response to the triggering of the debugging function of the configuration interface; and forming a final configuration scheme in response to the completion of debugging. The technical solution of this application significantly reduces the complexity of web crawler configuration by responding to each step of user operation, combining a visual interface and real-time debugging, enabling non-professional users to easily and quickly configure web crawlers, improving user experience and configuration accuracy.

Owner:BEIJING HONGTENG INTELLIGENT TECH CO LTD

A deep crawler method and system based on multi-level recursion

Pending | CN122316719A | IP address | Model selection

This invention discloses a deep web crawler method and system based on multi-level recursion, belonging to the field of computer and artificial intelligence technology. The method includes data collection and preprocessing, deep learning model selection and training, real-time monitoring and detection, system integration and performance optimization. Data collection and preprocessing involves collecting data and cleaning and denoising it, as well as feature extraction and transformation. Deep learning model selection and training involves choosing different models or hybrid models based on the data type. Real-time monitoring and detection involves deploying the trained model to a real network environment and performing real-time monitoring and detection, monitoring the IP addresses of access requests, especially identifying a large number of requests from the same IP address or a range of IP addresses, which suggests web crawler activity. This invention can distinguish between normal users and web crawlers, helping websites and service providers effectively prevent malicious web crawlers and protect their data and resources from adverse effects.

Owner:陈雨洁

Crawler detection method, device and readable storage medium

Active | CN116599686B | Securing communication | Web site | Analytic model

The application discloses a crawler detection method and device and a readable storage medium. An analysis server periodically acquires traffic data of a target website, groups the traffic data according to IP addresses of various access requests in the traffic data, and obtains groups corresponding to different IP addresses. Then, the analysis server extracts features of the groups to obtain a first feature set of each IP address, and the features in the first feature set are used to represent access behaviors of the IP address. After that, the analysis server inputs the features in the first feature set of each IP address into an analysis model, and determines which IP addresses are normal IP addresses and which IP addresses are abnormal IP addresses among multiple IP addresses accessing the target website in a current period. By using the scheme, the historical access traffic is analyzed in advance to obtain the analysis model, the network crawler is detected by using the analysis model, and the dependence on professional experience of security personnel is reduced.

Owner:XIAMEN WANGSU CO LTD
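
The detection flow in the entry above (group traffic by IP address, extract per-IP behavior features, then classify each IP) can be sketched as follows. This is a minimal illustration, not the patented method: the feature names are assumptions, and the threshold rule in `is_abnormal` is a placeholder for the trained analysis model, which the abstract does not describe.

```python
from collections import defaultdict

def group_by_ip(requests):
    """Group (ip, timestamp_seconds, path) records by source IP address."""
    groups = defaultdict(list)
    for ip, timestamp, path in requests:
        groups[ip].append((timestamp, path))
    return groups

def extract_features(entries):
    """Build a small feature set describing one IP's access behavior."""
    times = sorted(t for t, _ in entries)
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "request_count": len(entries),
        "distinct_paths": len({p for _, p in entries}),
        "mean_gap_s": sum(gaps) / len(gaps) if gaps else float("inf"),
    }

def is_abnormal(features, max_requests=100, min_mean_gap_s=0.5):
    """Placeholder rule standing in for the trained analysis model (assumption)."""
    return (features["request_count"] > max_requests
            or features["mean_gap_s"] < min_mean_gap_s)

def classify_period(requests):
    """Return {ip: True if flagged abnormal} for one monitoring period."""
    return {ip: is_abnormal(extract_features(entries))
            for ip, entries in group_by_ip(requests).items()}

# Hypothetical traffic: one slow visitor and one client requesting every 0.1 s.
traffic = [("10.0.0.1", 0.0, "/a"), ("10.0.0.1", 30.0, "/b")]
traffic += [("10.0.0.2", i * 0.1, f"/item/{i}") for i in range(50)]
verdicts = classify_period(traffic)
```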

A method, equipment, and media for monitoring regional catering markets based on multi-source heterogeneity.

Pending | CN122132614A | Web data indexing | Commerce | Analysis data | Data source

This invention discloses a method, equipment, and medium for monitoring regional catering markets based on multi-source heterogeneity, belonging to the fields of data science and computer science. It addresses the technical problems in existing regional catering market monitoring, such as fragmented data sources of real-time consumption, limited data analysis dimensions, and lagging decision support. The method includes: collecting and processing relevant catering data for the current region using an adaptive web crawler framework to obtain multi-source heterogeneous consumption data; cleaning and anomaly monitoring the multi-source heterogeneous consumption data to obtain preprocessed multi-source heterogeneous consumption data; performing multi-layer data analysis on the multi-source heterogeneous consumption data under relevant multi-dimensional catering models to obtain catering market analysis data; simulating policy subsidies using the catering market analysis data to obtain catering subsidy simulation data; and visualizing the catering market analysis data and catering subsidy simulation data to generate a regional catering visualization map.

Owner:INSPUR ZHUOSHU BIG DATA IND DEV CO LTD

A highly robust web crawler system

Pending | CN122173697A | Web data indexing | Biological models | Data stream | Data set

This invention discloses a highly robust web crawler system, belonging to the fields of big data acquisition and artificial intelligence technology. The system is a closed-loop intelligent system with multiple modules working collaboratively. It includes a multi-data source database construction and pre-training module, a real-time monitoring module for crawler operation status, a data semantic analysis and association mining module, a data request-matching evaluation and parameter optimization module, and a dataset generation and delivery module. Through tight coupling between modules and effective transmission of data flow and control flow, the system achieves full-process automation and intelligence from data discovery, intelligent crawling, dynamic adaptation to high-quality delivery. The system features high robustness and anti-crawling capabilities, strong data semantic understanding and association accuracy, and low manual costs.

Owner:BOXIAN GROUP HONG KONG LTD

Network crawler system and method including advertisement filtering

Active | CN117633327B | Engineering | Web crawler

The application discloses a web crawler system and method that include advertisement filtering. In the system, a scheduler distributes crawling tasks to multiple crawlers according to the targets to be crawled; each crawler executes its crawling task and sends the crawling result to a content parser; the content parser separates the crawling results into first crawling results, which do not need to be crawled again, and second crawling results, which do, parses the first crawling results to obtain first crawling content, and sends the second crawling results to a static rule filtering engine; the static rule filtering engine filters the second crawling results to obtain third crawling results and sends them to a machine learning filtering engine; the machine learning filtering engine filters the third crawling results to obtain second targets to be crawled and feeds them back to the scheduler; and a result processor outputs the first crawling content. The application addresses the technical problem that existing web crawler engines crawl large amounts of advertisement content, which places heavy resource pressure on both the crawling party and the content provider.

Owner:CHINA TELECOM CORP LTD
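
The two filtering stages in the entry above, a static rule engine followed by a machine learning engine whose output is fed back to the scheduler, might look roughly like this. The regular expressions, the `score_fn` callback (assumed to return an advertisement probability), and the 0.5 threshold are all hypothetical; the abstract specifies none of them.

```python
import re

# Hypothetical static rules; a real deployment would load a maintained rule list.
AD_PATTERNS = [re.compile(p) for p in (r"/ads?/", r"doubleclick\.", r"[?&]utm_")]

def static_rule_filter(urls):
    """Drop URLs that match any static advertisement rule."""
    return [u for u in urls if not any(p.search(u) for p in AD_PATTERNS)]

def ml_filter(urls, score_fn, threshold=0.5):
    """Drop URLs whose predicted ad probability reaches the threshold."""
    return [u for u in urls if score_fn(u) < threshold]

def next_targets(candidate_urls, score_fn):
    """Chain both engines; the survivors go back to the scheduler."""
    return ml_filter(static_rule_filter(candidate_urls), score_fn)

candidates = [
    "https://example.com/news/1",
    "https://example.com/ads/banner",
    "https://example.com/promo?x=1&utm_source=a",
    "https://example.com/sponsored/2",
]
# Stand-in scorer used only for this example.
targets = next_targets(candidates, lambda u: 0.9 if "sponsored" in u else 0.1)
```

Running the cheap static rules first means the model only scores URLs the rules could not decide.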

A media data visual analysis system

Pending | CN122112285A | Web data indexing | Multimedia data browsing/visualisation | Social media | Original data

The application provides a media data visual analysis system, which comprises a data acquisition module, a data preprocessing module, an analysis engine module, a visual rendering module and a system control module. The data acquisition module is configured with a mixed mechanism of a network crawler and API calling, can adapt to technical protocols of various information sources such as social media, news websites and video platforms, and can continuously and stably obtain original data with different structures. Through the introduction of a heat analysis formula considering time decay and comprehensive behaviors such as likes, comments and shares, the attention degree and propagation effect of information units can be quantified more scientifically and accurately. The application integrates data acquisition, preprocessing, analysis engine and visual rendering into one system, forms a complete closed-loop workflow for media data value mining and display, and overcomes the defects of fragmented links and low integration in the prior art.

Owner:ZHONGBEI UNIV
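
The entry above mentions a heat analysis formula that combines likes, comments, and shares with time decay, but it does not state the formula. As one plausible form, the sketch below assumes a weighted engagement sum multiplied by an exponential half-life decay; the weights and the 24-hour half-life are illustrative assumptions.

```python
def heat_score(likes, comments, shares, age_hours,
               weights=(1.0, 2.0, 3.0), half_life_hours=24.0):
    """Weighted engagement, halved every `half_life_hours` (assumed form)."""
    w_like, w_comment, w_share = weights
    engagement = w_like * likes + w_comment * comments + w_share * shares
    decay = 0.5 ** (age_hours / half_life_hours)
    return engagement * decay

fresh = heat_score(likes=10, comments=5, shares=2, age_hours=0)
day_old = heat_score(likes=10, comments=5, shares=2, age_hours=24)
```

With these assumed defaults, an item loses half its heat for every day without new engagement.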

System and method for extracting and categorizing information from online sources

PCT designated stage | WO2026109928A1 | Natural language analysis | Web data indexing | Domain name | Schema for Object-Oriented XML

A system and method for efficiently extracting and categorizing business information from online sources is disclosed. The system comprises a web crawler that obtains company domains from a database and collects depth-1 URLs from company homepages. A classification model, utilizing a fine-tuned BERT architecture, predicts which URLs contain relevant information for generating tags. A content extractor then extracts content from these predicted URLs using one or more modules. Finally, a large language model (LLM) processes the extracted content and generates tags using custom prompts designed for each tag category. These prompts are tailored to the nature of the extracted content, enhancing the context provided to the LLM. This multi-stage approach addresses challenges in processing large-scale, unstructured business data from diverse web sources, potentially offering improved efficiency, scalability, and accuracy in automated business intelligence gathering.

Owner:6SENSE INSIGHTS INC
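
The first stage of the entry above collects depth-1 URLs from a company homepage. A minimal sketch using only the Python standard library follows; it assumes that "depth-1" means same-domain links found directly on the homepage, and it takes the HTML as a string so that fetching stays out of scope.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

def depth1_urls(homepage_url, html):
    """Return unique same-domain URLs linked from the homepage, in page order."""
    parser = LinkCollector()
    parser.feed(html)
    domain = urlparse(homepage_url).netloc
    seen, result = set(), []
    for href in parser.hrefs:
        # Resolve relative links and drop fragments such as "#top".
        url = urljoin(homepage_url, href).split("#", 1)[0]
        if urlparse(url).netloc == domain and url != homepage_url and url not in seen:
            seen.add(url)
            result.append(url)
    return result

page = ('<a href="/about">About</a><a href="https://other.com/x">Ext</a>'
        '<a href="products#top">Products</a><a href="/about">Again</a>')
urls = depth1_urls("https://example.com/", page)
```

The resulting list is what the abstract's classification model would then score for relevance.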

Data aging identification method and device

Active | CN112199565B | Web data indexing | Special data processing applications | The Internet | Engineering

The present disclosure relates to a data timeliness identification method and device, a content pushing method and device, an electronic device and a computer readable storage medium. The data timeliness identification method comprises: obtaining to-be-processed data; obtaining associated data by using a web crawler according to the to-be-processed data; and determining timeliness information of the to-be-processed data based on a semantic relationship between the to-be-processed data and the associated data, wherein the timeliness information comprises old news or non-old news. The real publication time of news content can be determined by using Internet information through the web crawler, the timeliness of the news content is further determined through the determined real publication time, whether the publication time of the news content is modified by a content cooperation party can be identified, and the news content in the database is marked with timeliness, so that the timeliness problem can be fully considered during subsequent pushing.

Owner:BEIJING XIAOMI PINECONE ELECTRONICS CO LTD

Simulated user behavior-based crawler method, device and related equipment

Pending | CN122286757A | Web site | Feature extraction

This invention discloses a web crawler method, apparatus, and related equipment based on simulated user behavior. The method collects web browsing behavior data from several users, extracts features from the behavior data to obtain user browsing behavior features, and constructs a behavior feature library based on these features. Based on feature parameters of trajectory jitter characteristics and preset curve construction logic, a mouse movement trajectory conforming to user operating habits is constructed. According to preset time-series calculation logic and time-series distribution characteristics, fluctuation intervals and page dwell times consistent with user mouse operations are generated. A complete user operation behavior chain is constructed based on the fluctuation intervals, page dwell times, and mouse movement trajectory, resulting in a behavior chain. Anti-crawling feedback from the target website is monitored in real time, and the behavior chain is adjusted according to the anti-crawling feedback. Data is then crawled according to the adjusted behavior chain. This method significantly improves the anti-crawling capability of crawled data.

Owner:BEIJING TAIXIN TIANCHENG TECHNOLOGY CO LTD

A digital preservation system and method for folk culture

Pending | CN122134310A | Input/output for user-computer interaction | Data processing applications | Data set | Engineering

This invention belongs to the interdisciplinary field of folk culture and big data technology, and discloses a digital protection system and method for folk culture. The system comprises five core modules, forming a closed-loop process: a folk culture uploading and administrator review module allows users to upload information in text, audio, and video formats, with administrators reviewing it intuitively according to cultural characteristics; a big data analysis and tracing module constructs a database through multi-path web crawlers, extracts multi-dimensional features, and establishes a tracing network to accurately determine the origin and connections of culture; a regional folk culture analysis module constructs regional feature datasets and mines cultural background; a digital display and database module realizes data classification management, cloud storage, and diversified display, supporting retrieval and sharing; and a folk culture education module provides diversified courses, knowledge graphs, gamified interactions, and VR / AR immersive experiences, adaptable to different learners. This invention provides an efficient solution for the digital inheritance of folk culture.

Owner:QINGDAO UNIV OF TECH

System and method for detecting copyrighted material

Pending | US20260141005A1 | Data processing applications | Web data indexing | Web site | Engineering

Systems, methods, and computer-readable storage media for detecting copyrighted material online, and more specifically for searching for media content, comparing found content to known content, sending notices regarding copyright infringement, and monitoring removal of the infringing material. Doing so involves the use of web crawlers to detect media content, generating fingerprints for that content, and comparing the fingerprints against known proprietary fingerprints. Once proprietary content is identified, the system can send notices out to the owner of the web site where the content was found and can continue monitoring the site until the content is removed.

Owner:INNOVASOFT TECH HOLDINGS LTD

Method and system for creating a browsing node using frequent pattern mining

Active | CN116628101B | PathPing | Engineering

The present disclosure provides methods and systems for creating browse nodes using frequent pattern mining. Browse node pages are addressed by their path. As a result, web crawlers have a greater chance of finding browse nodes than corresponding parameter-based search pages. Browse nodes and search result pages can be further distinguished by using a title or header meta tag that indicates information about the browse node and distinguishes the browse node from a general search result page. The number of combinations of keywords, categories, and keyword-value pairs makes it prohibitive to create a browse node for every possible combination in all but the simplest applications. Methods and systems for identifying which search result pages should be converted to browse nodes are disclosed herein.

Owner:EBAY INC
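
The entry above selects which search result pages to convert into browse nodes by mining frequent patterns. The abstract does not say which mining algorithm is used, so the sketch below simply counts every combination of up to two search constraints by brute force (no Apriori-style pruning) and keeps those that reach a minimum support. The `key=value` item format and the sample log are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_patterns(searches, min_support, max_size=2):
    """Count constraint combinations across searches and keep the frequent ones."""
    counts = Counter()
    for items in searches:
        unique = sorted(set(items))  # canonical order so equal sets share a key
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    return {pattern: n for pattern, n in counts.items() if n >= min_support}

# Hypothetical search log: each search is a list of category/keyword-value pairs.
log = [
    ["category=shoes", "brand=acme"],
    ["category=shoes", "brand=acme", "color=red"],
    ["category=shoes"],
    ["category=hats"],
]
browse_node_candidates = frequent_patterns(log, min_support=2)
```

Each surviving pattern is a candidate for a path-addressed browse node, while rare combinations stay as parameter-based search pages.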

A deep learning-based ancient seal seal script recognition method and system

Active | CN115439863B | Pattern recognition | Chinese characters

The application discloses a deep learning-based method and system for recognizing ancient seal script. The method obtains a dataset of ancient seal characters with a web crawler and preprocesses it by removing frames and segmenting characters. It optimizes the distribution of the seal script data with an automatic data augmentation method based on joint loss optimization, replaces the cross-entropy loss with a KL divergence loss, and uses pre-trained model parameters as the initial parameters for recognition. Fine-tuning a deep neural network on the augmented dataset yields the final ancient seal character recognition model. By combining a deep neural network with an automatic data augmentation strategy, the application improves the accuracy of ancient seal script recognition and model performance, and provides more robust recognition results for ancient seal characters.

Owner:WUHAN UNIV

Dynamic optimization of request parameters for proxy servers

Pending | CN122372546A | Engineering | Web crawler

The present disclosure relates to dynamic optimization of request parameters for a proxy server. Systems and methods of task fulfillment are extended as provided herein and target the web scraping process through the step of a client submitting a request to a web crawler. The systems and methods allow for more complex requests to be defined for the web crawler in order to receive more specific data. In one aspect, a method for extracting and collecting data from a network by a service provider infrastructure includes the steps of inspecting parameters of a request received from a user's device, adjusting the request parameters according to pre-established scraping logic, selecting a proxy according to criteria of the pre-established scraping logic, sending the adjusted request to a target through the selected proxy, inspecting metadata received from the target, and forwarding the data to the user's device.

Owner:OKOSILA BOSE PTE LTD

Optimizing scraping requests through browsing profiles

Active | US12670221B2 | User device | Engineering

Systems and methods of task implementation are extended as provided herein and target the web crawling process through a step of submitting a request by a customer to a web crawler. The systems and methods allow a request for a web crawler to be enriched with a customized browsing profile in order to be categorized as an organic human user to obtain targeted content. In one aspect, a method for data extraction and gathering from a network by a service provider infrastructure includes at least some of the following exemplary steps: receiving and examining the parameters of a request received from a user's device, enriching the request parameters with a pre-established browsing profile, sending the enriched request to a target through the selected proxy, receiving a response from the target, dissecting the response's metadata that is appropriate for updating the browsing profile utilized for the request, and forwarding the data to the user's device pursuant to the examination of the response obtained from the target system.

Owner:OXYLABS UAB

A web crawler detection system based on application scenarios

Active | CN115525813B | Effectively locate abnormal parts | Locating abnormal parts | Web data indexing | Special data processing applications | Business enterprise | Web crawler

This invention discloses a web crawler detection system based on application scenarios, including a crawler detection platform. The crawler detection platform comprises a user analysis unit, a human-machine recognition unit, a secondary verification unit, a space development unit, and a reality integration unit. This invention relates to the field of web crawler detection technology. This application scenario-based web crawler detection system divides the enterprise network into different application scenarios according to their uses. Based on these application scenarios, it performs triple malicious crawler identification, achieving a more comprehensive screening of malicious crawlers. Simultaneously, it provides registered users with personal spaces, offering convenience and effectively avoiding false positives on genuine users, thus improving user experience. Furthermore, by combining network information related to the application scenarios to manage personal spaces, it greatly enhances the flexibility of malicious crawler detection while effectively locating abnormal parts of the enterprise network, providing powerful assistance for the operation and management of the enterprise network.

Owner:WUHAN JIYI NETWORK TECH CO LTD

Systems and methods for adapting platform behavior using machine-learning-based remote entity lifecycle monitoring

Active | US12645988B2 | Web data indexing | Machine learning | Entity identifier | Engineering

Platform behavior may be adapted via machine-learning-based entity lifecycle monitoring. A web crawler agent collects data comprising an entity identifier token. A machine learning model is trained to determine, based at least in part on the entity identifier token, whether a corresponding entity is associated with the computing platform (e.g., whether the corresponding entity is a platform subscriber entity for the computing platform). Based on the output of the machine learning model applied to the entity identifier token (in some embodiments, in combination with other relevant data parsed from the collected data), an indication of an entity lifecycle status and a confidence value therefor are determined. Based on the entity lifecycle status and the confidence value, a listener is bound to the platform subscriber entity. The listener monitors activity of the platform subscriber entity with respect to the platform and identifies platform action to take in response.

Owner:CAPITAL ONE SERVICES LLC

Guided web crawler for automated identification and verification of webpage resources

Active | US12675538B2 | Engineering | Web crawler

There are provided systems and methods for a guided web crawler for automated identification and verification of webpage resources. A service provider, such as an online transaction processor, may provide a guided web crawler and / or resources for such crawler for execution by computing devices of users. Users may load different pluggable modules to the guided web crawler, which are associated with specific web crawling tasks. Web crawling tasks may correspond to identification and verification of webpage resources on a webpage, such as a location, placement, use of, and / or number of appearances of the resource. The web crawler may use code from the pluggable module being executed to parse and / or crawl webpage data for a webpage and identify requested resources. Thereafter, the guided web crawler may automate resources to use, display, and / or interact with the identified and verified resource.

Owner:PAYPAL INC

System and method for detecting copyrighted material

PCT designated stage | WO2026110044A1 | Data processing applications | Web data indexing | Web site | Engineering

Systems, methods, and computer-readable storage media for detecting copyrighted material online, and more specifically for searching for media content, comparing found content to known content, sending notices regarding copyright infringement, and monitoring removal of the infringing material. Doing so involves the use of web crawlers to detect media content, generating fingerprints for that content, and comparing the fingerprints against known proprietary fingerprints. Once proprietary content is identified, the system can send notices out to the owner of the web site where the content was found and can continue monitoring the site until the content is removed.

Owner:INNOVASOFT TECH HOLDINGS LTD

Network content crawling method, electronic device, and storage medium

Active | CN115186160B | Program initiation/switching | Resource allocation | Computer network | Engineering

The application provides a network content crawling method, an electronic device and a storage medium. The network crawler comprises a main crawler, a task distribution platform and a plurality of sub-crawlers for processing different task types. The method comprises: the main crawler sending a target task type of a network content crawling task to the task distribution platform, wherein the target task type is determined according to task information carried in a task processing request of the network content crawling task; and the task distribution platform distributing the network content crawling task to a corresponding sub-crawler for processing according to the target task type. Since the network content crawling task is divided into task types and processed by corresponding sub-crawlers in the network content crawling process, the possibility of blocking can be reduced in the large-scale network content crawling process.

Network content crawling method, electronic device, and storage medium

Owner:SHANGHAI NEWTOUCH SOFTWARE CO LTD
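
The type-based distribution described in the abstract above could look something like the sketch below. It assumes the task information carries an explicit `kind` field, and the class, function, and task-type names are hypothetical. The abstract does not say how the target task type is actually derived.

```python
from typing import Callable, Dict

Task = Dict[str, str]


class TaskDistributionPlatform:
    """Routes each task to the sub-crawler registered for its task type."""

    def __init__(self) -> None:
        self._sub_crawlers: Dict[str, Callable[[Task], str]] = {}

    def register(self, task_type: str, sub_crawler: Callable[[Task], str]) -> None:
        self._sub_crawlers[task_type] = sub_crawler

    def distribute(self, task_type: str, task: Task) -> str:
        if task_type not in self._sub_crawlers:
            raise ValueError(f"no sub-crawler registered for task type {task_type!r}")
        return self._sub_crawlers[task_type](task)


def main_crawler(platform: TaskDistributionPlatform, request: Task) -> str:
    # Assumption: the task type is read directly from a "kind" field.
    task_type = request["kind"]
    return platform.distribute(task_type, request)


platform = TaskDistributionPlatform()
platform.register("article", lambda task: f"article crawler handled {task['url']}")
platform.register("image", lambda task: f"image crawler handled {task['url']}")

result = main_crawler(platform, {"kind": "image", "url": "https://example.com/cat.png"})
```

Keeping the routing table in the platform rather than in the main crawler means a new task type only needs a new registration.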

A material downloading method and a material downloader

Active | CN120768895B | Efficient crawling | Solve technical issues that affect video production efficiency | Message queue | Engineering

The present application relates to the field of web crawlers, and in particular to a material downloading method and a material downloader. The method first acquires the search field input by a user, then crawls the material resource links related to the search field and pushes them into a message queue. It then parses the message queue to obtain parsing information and pushes that information into a task queue so that the batch material download tasks execute in parallel. Compared with the prior art, the method crawls materials efficiently, lets users keep editing videos smoothly while new materials download in batches, and addresses the loss of video production efficiency caused by existing material crawling methods.

A material downloading method and a material downloader

Owner:GUANGZHOU TAIDONG TECH CO LTD
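
As a rough sketch of the pipeline in the abstract above (links into a message queue, parsed entries into a task queue, downloads in parallel), the following uses Python's in-process `queue.Queue` and a thread pool. The crawl and download functions are placeholders that touch no network, and every name and URL is hypothetical. The patent may well use a distributed message queue instead.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def crawl_links(search_field: str) -> list:
    # Placeholder for the real crawl: returns made-up resource links.
    return [f"https://example.com/{search_field}/{i}.mp4" for i in range(3)]


def parse_link(link: str) -> dict:
    # "Parsing information" is assumed here to be a URL plus a file name.
    return {"url": link, "filename": link.rsplit("/", 1)[-1]}


def download(info: dict) -> str:
    # Placeholder download step; a real one would fetch info["url"].
    return f"downloaded {info['filename']}"


def run(search_field: str) -> list:
    message_queue = queue.Queue()
    for link in crawl_links(search_field):
        message_queue.put(link)

    task_queue = queue.Queue()
    while not message_queue.empty():
        task_queue.put(parse_link(message_queue.get()))

    tasks = []
    while not task_queue.empty():
        tasks.append(task_queue.get())

    # Execute the batch of download tasks in parallel; map keeps input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(download, tasks))


results = run("cats")
```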

A dangerous driving crime sentencing pre-judgment and leniency path analysis system

Pending | CN122451195A | Neural network | Named-entity recognition

The present application relates to the technical field of sentencing prediction and leniency path analysis, and discloses a dangerous driving crime sentencing prediction and leniency path analysis system comprising a data acquisition and preprocessing module and a legal named entity recognition model. The data acquisition and preprocessing module includes a web crawler unit that sends HTTP requests to the China Judgments Document Net, the open APIs of local courts, and the procuratorial document net, receives the returned HTML pages or JSON data, and inserts a random time interval between requests. By automatically collecting and structuring massive historical judgment data and constructing a multi-task deep neural network model, the application can quantitatively predict the number of days of detention, the amount of the fine, and the probability of probation for a dangerous driving crime based on multi-dimensional features such as alcohol content, accident consequences, confession stage, and compensation understanding. It thereby fills the technical gap in which existing legal retrieval systems only provide static legal articles and case pushing and cannot accurately predict sentencing for individual cases.

A dangerous driving crime sentencing pre-judgment and leniency path analysis system

Owner:郝唯伊
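
The "random time interval between requests" mentioned in the abstract above can be illustrated with a short sketch. The function name, the delay bounds, and the injectable `fetch` and `sleep` parameters are assumptions made for illustration. The abstract does not specify the interval range or the HTTP client.

```python
import random
import time
from typing import Callable, List


def fetch_with_random_delay(
    urls: List[str],
    fetch: Callable[[str], str],
    min_delay: float = 1.0,
    max_delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch each URL in order, pausing a random interval between requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait a random number of seconds before every request but the first.
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(url))
    return results


# Usage with stand-ins so nothing is actually requested or slept:
recorded_delays = []
pages = fetch_with_random_delay(
    ["u1", "u2", "u3"],
    fetch=lambda url: url.upper(),  # hypothetical fetcher
    min_delay=0.5,
    max_delay=1.5,
    sleep=recorded_delays.append,  # records delays instead of sleeping
)
```

Passing `fetch` and `sleep` in as parameters keeps the pacing logic separate from whichever HTTP library performs the requests.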

Large language model-based vulnerability remediation action descriptions

Active | US12670263B2 | Data description | The Internet

A vulnerability documentation system detects vulnerabilities having outdated or undocumented formatted descriptions for corresponding remediation actions. A web crawler crawls the Internet for configuration data for software / firmware affected by the detected vulnerabilities and descriptive content for the remediation actions. The vulnerability documentation system prompts an LLM with a prompt for each detected vulnerability comprising instructions to generate a formatted description for remediation actions using the crawled configuration data / descriptive content. The vulnerability documentation system then populates natural language descriptions of remediation actions from the formatted descriptions and pushes the natural language descriptions to affected devices.

Large language model-based vulnerability remediation action descriptions

Owner:PALO ALTO NETWORKS INC

Server for providing service for educating english and method for operation thereof

Active | KR102993476B1 | Human–machine interface | Data transformation

According to various embodiments, a server providing an artificial intelligence model trained according to a meta-learning method may include: a data collection unit (501) that collects data from multiple servers using web crawling and converts the collected data into numerical data; a data preprocessing unit (502) that generates training data by preprocessing the numerical data to normalize the data and performing a dimension transformation on the normalized data; a data visualization unit (503) that visualizes the training data; a model training unit (504) that trains an artificial intelligence model using the training data according to a meta-learning method; a model verification unit (505) that verifies the reliability of the artificial intelligence model by inputting verification / evaluation data corresponding to the training data into the artificial intelligence model; a model providing unit (506) that allows multiple users to access the artificial intelligence model when the reliability of the artificial intelligence model is above a threshold value; and a Human-Machine Interface (HMI) unit (507) that provides a graphical user interface that allows the multiple users to connect to the server and perform multiple operations related to the artificial intelligence model. Various other embodiments are also possible.

Server for providing service for educating english and method for operation thereof

Owner:KEPCO KDN CO LTD

Hotspot problem-based knowledge base incremental updating method and device, equipment and medium

Pending | CN122114101A | Web data indexing | Website content management | User input | Engineering

The application discloses a hotspot-question-based incremental updating method, device, equipment, and medium for a knowledge base, to solve the problem of lagging RAG knowledge base updates in the related art. The method includes: rewriting the question input by a user; if an AI model judges, according to the rewritten question and the knowledge in the knowledge base, that the question cannot be answered, storing the rewritten question; when a preset trigger condition is met, dividing the stored rewritten questions into question clusters using hierarchical clustering and K-means clustering, selecting the target question clusters that meet a preset screening condition, and summarizing the questions in each target question cluster using the AI model to obtain the hotspot question corresponding to that cluster; obtaining, through a web crawler, the web page content related to each hotspot question and, based on the semantic relevance between the hotspot question and each related web page, extracting the text content with the highest semantic relevance as the related knowledge of the hotspot question; and storing the related knowledge in the knowledge base.

Hotspot problem-based knowledge base incremental updating method and device, equipment and medium

Owner:AISINO CORPORATION

A multi-agent intention hidden crawler method based on a large language model

Pending | CN122332634A | Linguistic model | Privacy protection

This invention discloses a multi-agent intent-hiding web crawler method based on a large language model, belonging to the field of intelligent web page data collection and privacy protection technology. This method constructs a collaborative system driven by a large language model, consisting of a task planning agent, a sub-task execution agent, and a feedback optimization agent. It decomposes the user's high-level collection intent into a low-sensitivity sub-task chain, and uses a feedback optimization mechanism to achieve scheduling, execution, and dynamic adjustment. It automatically completes web page access, content extraction, and anomaly handling, and outputs the results in a structured manner. This invention, employing the aforementioned multi-agent intent-hiding web crawler method based on a large language model, significantly reduces the risk of crawler intent exposure, improves the success rate and adaptability of collection in high-anti-crawling and dynamic web page environments, reduces manual rule configuration, and is highly versatile and stable, suitable for various public web page data collection and open-source intelligence acquisition scenarios.

A multi-agent intention hidden crawler method based on a large language model

Owner:BEIJING INST OF TECH
