Website text content-based online loan website entity recognition method and system
A technique combining entity recognition and web page content analysis, applied to online loan website recognition. It addresses the problems of poor timeliness, low accuracy, and high cost, and achieves high timeliness and high accuracy.
Examples
Embodiment 1
[0043] As shown in Figure 1, a method for entity recognition of online loan websites based on website text content comprises the following steps:
[0044] S01. Build the training-set domain name table. Collect the domain name hosts of websites whose types are known, obtain the web page content text for each host with a crawler, and label each website's type: 1 indicates an online loan website and 0 indicates any other website. If the website is an online loan website, also mark its entity name; otherwise leave the entity name blank. This produces the training-set domain name table T_host, with the fields host (domain name), content (web page content), label (online loan website label), and entity (entity name);
[0045] S02. Build the prediction-set domain name table. Obtain the operator's DPI (deep packet inspection) data, extract the host field of the doma...
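The labeling rules of step S01 can be sketched as a small Python record type. This is only an illustration under assumptions: the class name `THostRow`, the helper `build_t_host`, and the choice to reject inconsistent rows are hypothetical and not taken from the patent, and the page text is assumed to have been fetched already by a crawler.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class THostRow:
    """One row of the training-set table T_host (hypothetical representation)."""
    host: str                     # domain name host
    content: str                  # web page text obtained by the crawler
    label: int                    # 1 = online loan website, 0 = other website
    entity: Optional[str] = None  # entity name, blank unless label == 1

    def __post_init__(self) -> None:
        if self.label not in (0, 1):
            raise ValueError("label must be 0 or 1")
        # Assumption: a non-loan row must leave the entity blank.
        if self.label == 0 and self.entity:
            raise ValueError("entity must be blank when label is 0")
        # Assumption: a loan row must carry an entity name.
        if self.label == 1 and not self.entity:
            raise ValueError("entity is required when label is 1")


def build_t_host(
    samples: Iterable[Tuple[str, str, int, Optional[str]]]
) -> List[THostRow]:
    """Turn (host, content, label, entity) tuples into validated rows."""
    return [THostRow(*sample) for sample in samples]
```

For example, `build_t_host([("loan.example", "Apply for a loan", 1, "ExampleLoan"), ("news.example", "Headlines", 0, None)])` yields two validated rows, while a row with label 0 and a non-blank entity raises `ValueError`.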
Embodiment 2
[0060] Corresponding to Embodiment 1, this embodiment provides an online loan website entity recognition system based on website text content, comprising:
[0061] A training-set domain name table building module, which collects the domain name hosts of websites whose types are known, obtains the web page content text for each host with a crawler, and labels each website's type: 1 indicates an online loan website and 0 indicates any other website. If the website is an online loan website, the module also marks its entity name; otherwise the entity name is left blank. This produces the training-set domain name table T_host, with the fields host (domain name), content (web page content), label (online loan website label), and entity (entity name);
[0062] A prediction-set domain name table building module, which obtains the operator's DPI data, extracts the domain name host field from the data, and forms ...
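The host-extraction part of the prediction-set module might look like the following minimal sketch. The DPI record layout is an assumption: each record is taken to be a mapping with a `"host"` key, and lowercasing plus de-duplication are illustrative choices that the source text does not specify. The function name `extract_hosts` is hypothetical.

```python
from typing import Iterable, List, Mapping


def extract_hosts(dpi_records: Iterable[Mapping[str, str]]) -> List[str]:
    """Collect the host field from DPI records.

    Assumptions (not from the patent): records are dict-like with a
    "host" key; hosts are normalized to lowercase and de-duplicated
    while keeping first-seen order.
    """
    seen = set()
    hosts: List[str] = []
    for record in dpi_records:
        # Records with a missing or empty host field are skipped.
        host = (record.get("host") or "").strip().lower()
        if host and host not in seen:
            seen.add(host)
            hosts.append(host)
    return hosts
```

Each returned host could then be passed to the same crawler used for the training set to obtain its page text.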

