A Chinese address semantic tagging method based on a Bayesian word segmentation algorithm
A technique for semantic annotation of Chinese addresses, applicable to computing, geographic information databases, and natural language data processing, intended to make semantic analysis of addresses fast and accurate
Examples
Embodiment 1
[0076] The implementation of the present invention is described below, taking the Chinese address "Yenlord Food Plaza, No. 137, Dongma Road, Nankai District" as an example.
[0077] P1: Define the annotation relationship table, which can be designed as shown in Table 1.
[0078] P2: Obtain the set T of NT pre-segmented and tagged pieces of Chinese address data as the training corpus, T = {T_i}, where T_i is the i-th piece of Chinese address data and 1 ≤ i ≤ NT.
[0079] P3: Perform statistical learning on the set T. The specific steps of statistical learning include:
[0080] P31: For each segmented word in the set T, count the frequency of the word and the frequency with which the word occurs together with its immediately preceding word, and store both in the word frequency dictionary Word_dic;
[0081] P32: For each word, record the tagging relationship corresponding to that word, and store it in the tagging relationship dictionary Taging_dic;
[0082] P3...
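The counting in steps P31 and P32 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the corpus layout (a list of (word, tag) pairs per address), the tag names, the second sample address, and the function name learn are all hypothetical, and the internal layout of Word_dic (separate "unigram" and "bigram" counters) is one possible choice. Only the dictionary names Word_dic and Taging_dic come from the text.

```python
from collections import Counter, defaultdict

# Hypothetical training corpus T: each record T_i is a list of (word, tag)
# pairs. The tag names are placeholders, not the entries of Table 1.
T = [
    [("Nankai District", "DISTRICT"), ("Dongma Road", "ROAD"),
     ("No. 137", "HOUSE_NO"), ("Yenlord Food Plaza", "POI")],
    [("Nankai District", "DISTRICT"), ("Dongma Road", "ROAD"),
     ("No. 140", "HOUSE_NO")],
]

def learn(corpus):
    """Count word frequencies, (previous word, word) frequencies, and word/tag pairs."""
    unigram = Counter()             # P31: frequency of each word
    bigram = Counter()              # P31: frequency of (previous word, word)
    tagging = defaultdict(Counter)  # P32: word -> tags observed for that word
    for record in corpus:
        prev = None
        for word, tag in record:
            unigram[word] += 1
            if prev is not None:
                bigram[(prev, word)] += 1
            tagging[word][tag] += 1
            prev = word
    word_dic = {"unigram": unigram, "bigram": bigram}
    return word_dic, tagging

Word_dic, Taging_dic = learn(T)
```

Keeping the adjacent-pair counts next to the single-word counts means a later step can look up both from the same dictionary without rescanning the corpus.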
Embodiment 2
[0094] Embodiment 1 above covers the case where the address information contains no words whose tagging is uncertain. Next, the implementation of the present invention is described taking the Chinese address "Lane 98, Bixiu Road, Minhang District, Shanghai" as an example.
[0095] A1: Define the annotation relationship table, which can be designed as shown in Table 1.
[0096] A2: Obtain the set T of NT pre-segmented and tagged pieces of Chinese address data as the training corpus, T = {T_i}, where T_i is the i-th piece of Chinese address data and 1 ≤ i ≤ NT.
[0097] A3: Perform statistical learning on the set T. The specific steps of statistical learning include:
[0098] A31: For each segmented word in the set T, count the frequency of the word and the frequency with which the word occurs together with its adjacent word, and store both in the word frequency dictionary Word_dic;
[0099] A32: Count each word and the tagging relationship corres...
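One possible reading of "words whose tagging is uncertain" is a word that the tagging relationship dictionary either has never seen or has seen with more than one tag. The sketch below illustrates that reading only; the source does not confirm it. The dictionary contents, the tag names, and the helper uncertain_words are hypothetical.

```python
from collections import Counter

# Hypothetical tagging relationship dictionary: word -> Counter of observed tags.
Taging_dic = {
    "Minhang District": Counter({"DISTRICT": 5}),
    "Bixiu Road": Counter({"ROAD": 3}),
    "Lane 98": Counter({"LANE": 2, "HOUSE_NO": 1}),
}

def uncertain_words(words, tagging_dic):
    """Return the words that are unseen or were observed with more than one tag."""
    return [w for w in words if len(tagging_dic.get(w, {})) != 1]

# Assumed segmentation of the example address, for illustration only.
address = ["Shanghai", "Minhang District", "Bixiu Road", "Lane 98"]
flagged = uncertain_words(address, Taging_dic)
```

Here "Shanghai" is flagged because it is absent from the hypothetical dictionary, and "Lane 98" because it carries two candidate tags.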