Method and device for labeling corpus
A corpus labeling technology, applied in semantic tool creation, natural language data processing, unstructured text data retrieval, and related fields. It addresses the problem of process redundancy and achieves a simple algorithm, high annotation efficiency, and less repetitive manual processing work.
Embodiment Construction
[0034] In order to make the technical problems to be solved, the technical solutions, and the beneficial effects of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present invention and are not intended to limit it.
[0035] As shown in Figure 1, the corpus labeling method of the present invention includes the following steps:
[0036] a. Vectorize the text to be processed to obtain the text vectors of the text;
[0037] b. Based on the text vectors, cluster the text using the DBSCAN clustering algorithm to obtain a long-tail corpus and a corpus to be labeled;
[0038] c. For the long-tail corpus, return to step b; for the corpus to be labeled, set a label to obtain a labeling result;
[0039] d. Merge all labeling results to obtain the final labeling result.
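Steps a to d can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the bag-of-words vectors, the cosine distance, the hypothetical `name_cluster` callback (a stand-in for the annotator who sets the label), and the choice to widen `eps` on each return to step b are all assumptions that the text above does not specify. DBSCAN noise points are treated here as the long-tail corpus.

```python
import math
from collections import Counter

def vectorize(texts):
    # Step a (assumed form): bag-of-words counts over whitespace tokens.
    return [Counter(t.lower().split()) for t in texts]

def cosine_distance(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * \
           math.sqrt(sum(c * c for c in v.values()))
    return 1.0 - dot / norm if norm else 1.0

def dbscan(vectors, eps, min_samples):
    # Minimal DBSCAN: returns one cluster id per vector, -1 for noise.
    n = len(vectors)
    neighbors = [[j for j in range(n)
                  if cosine_distance(vectors[i], vectors[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1              # provisional noise
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # border point, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])
        cluster += 1
    return labels

def label_corpus(texts, name_cluster, eps=0.5, min_samples=2,
                 eps_step=0.15, max_rounds=3):
    vectors = vectorize(texts)
    pending = list(range(len(texts)))   # indices not yet labeled
    final = {}
    for _ in range(max_rounds):
        if not pending:
            break
        # Step b: cluster whatever is still unlabeled.
        ids = dbscan([vectors[i] for i in pending], eps, min_samples)
        groups = {}
        for idx, cid in zip(pending, ids):
            groups.setdefault(cid, []).append(idx)
        # Step c: noise is the long tail and goes back to step b.
        pending = groups.pop(-1, [])
        for members in groups.values():
            tag = name_cluster([texts[i] for i in members])
            for i in members:
                final[i] = tag          # step d: merge into one result
        eps += eps_step                 # assumption: relax eps each round
    return final, pending

def most_common_word(cluster_texts):
    # Hypothetical stand-in for a human choosing the cluster label.
    words = Counter(w for t in cluster_texts for w in t.lower().split())
    return words.most_common(1)[0][0]

texts = [
    "reset my password", "reset password please", "password reset help",
    "track my order", "track order status", "where is my order",
    "kindly cancel the subscription today",
    "cancel subscription for this account",
    "what are your opening hours",
]
labels, leftover = label_corpus(texts, most_common_word)
```

In this toy run the password and order texts are grouped in the first round, the two cancellation texts are only grouped after the long tail is re-clustered with a wider `eps`, and the last text remains unlabeled in `leftover`. Labeling once per cluster rather than once per text is what reduces the repetitive manual work.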
[0040] Step a further includes:
[0041] a1. Processing the text to obtain a word re...