Organization name abbreviation generation method and device, and computer-readable storage medium
A technology for generating organization name abbreviations, applied in the field of natural language processing. It addresses problems such as the difficulty of ensuring that abbreviations are correct, and achieves the effect of improving recall and accuracy.
Examples
Embodiment 1
[0064] Embodiment 1 Organization name abbreviation generation method 1
[0065] As shown in Figure 1, a method for generating an organization name abbreviation according to an embodiment of the present invention includes the following steps:
[0066] Step 101: Obtain a dictionary of place names, a dictionary of institutional terms, a dictionary of industry terms, and a text corpus;
[0067] It should be noted that this application divides the words used in the full name of an organization into four categories: place-name nouns, organization proper names, industry nouns, and organization-nature nouns. Place-name nouns identify the place-name information in the full name of the organization; the organization proper name identifies the proper noun of the organization name in the full name of the organization; the industry noun identifies the noun that reflects the industry to which ...
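The four-way word classification described in [0067] can be sketched as a simple dictionary lookup. This is a minimal illustration, not the patented procedure: the dictionary contents, the sample organization name, the function name `categorize`, and the rule that any word not found in a dictionary is treated as the proper name are all assumptions made for the example. The full name is assumed to be already segmented into words.

```python
from enum import Enum


class WordCategory(Enum):
    PLACE = "place-name noun"
    PROPER = "organization proper name"
    INDUSTRY = "industry noun"
    NATURE = "organization-nature noun"


# Hypothetical toy dictionaries; real ones would be loaded in Step 101.
PLACE_NAMES = {"北京", "上海"}
INDUSTRY_TERMS = {"科技", "银行"}
INSTITUTIONAL_TERMS = {"有限公司", "大学"}


def categorize(words):
    """Label each pre-segmented word of a full name with one category.

    Assumption: a word found in no dictionary is the proper name.
    """
    labeled = []
    for word in words:
        if word in PLACE_NAMES:
            category = WordCategory.PLACE
        elif word in INDUSTRY_TERMS:
            category = WordCategory.INDUSTRY
        elif word in INSTITUTIONAL_TERMS:
            category = WordCategory.NATURE
        else:
            category = WordCategory.PROPER
        labeled.append((word, category))
    return labeled


if __name__ == "__main__":
    # "华兴" is a made-up proper name used only for illustration.
    for word, category in categorize(["北京", "华兴", "科技", "有限公司"]):
        print(word, category.value)
```

Checking the place-name dictionary first is an arbitrary choice here; a word that appears in more than one dictionary would need a tie-breaking rule that the excerpt does not specify.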
Embodiment 2
[0167] Embodiment 2 Organization name abbreviation generation method 2
[0168] As shown in Figure 2, a method for generating an organization name abbreviation according to an embodiment of the present invention includes the following steps:
[0169] Step 201: Obtain the full name of the organization and a text corpus, and search the text corpus for text containing the full name of the organization;
[0170] In an exemplary embodiment, the text corpus includes a news corpus and a Wikipedia corpus.
[0171] In an example of this embodiment, the text corpus is built by crawling news text and downloading the text data of Wikipedia (these data are updated regularly), and the data in the text corpus are indexed with retrieval software to facilitate subsequent searches.
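The index-then-search idea of Step 201 can be illustrated with a small in-memory index. The excerpt does not name the retrieval software, so the character-bigram inverted index below is only a stand-in chosen for brevity; the function names `build_bigram_index` and `search_full_name` and the sample texts are hypothetical.

```python
from collections import defaultdict


def build_bigram_index(docs):
    """Map every two-character substring to the ids of the texts containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for i in range(len(text) - 1):
            index[text[i:i + 2]].add(doc_id)
    return index


def search_full_name(full_name, docs, index):
    """Return the texts that contain full_name as an exact substring."""
    if len(full_name) < 2:
        # Too short for a bigram lookup; fall back to a plain scan.
        return [text for text in docs if full_name in text]
    bigrams = [full_name[i:i + 2] for i in range(len(full_name) - 1)]
    # Texts must contain every bigram of the name to be candidates.
    candidate_ids = set.intersection(*(index.get(b, set()) for b in bigrams))
    # Sharing all bigrams does not guarantee a contiguous match, so verify.
    return [docs[i] for i in sorted(candidate_ids) if full_name in docs[i]]


if __name__ == "__main__":
    corpus = [
        "华兴科技有限公司今日发布公告",
        "上海某银行发布年报",
        "北京华兴科技有限公司成立",
    ]
    idx = build_bigram_index(corpus)
    for hit in search_full_name("华兴科技有限公司", corpus, idx):
        print(hit)
```

The final substring check matters because the bigram intersection only narrows the candidates; a production system would more likely rely on a dedicated full-text search engine.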
[0172] Step 202: In the retrieved text, extract strings of I to J adjacent Chinese characters as candidate character strings, wherein I and J are preset natural numbe...
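The candidate extraction of Step 202 can be sketched as enumerating every substring of I to J consecutive Chinese characters. The sketch below makes several assumptions: "Chinese character" is approximated by the basic CJK Unified Ideographs range (U+4E00 to U+9FFF), the bounds are passed in as parameters `min_len` and `max_len` because the excerpt does not give preset values, and the function name `candidate_strings` is hypothetical.

```python
import re

# Assumption: a "Chinese character" is any code point in U+4E00..U+9FFF.
CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")


def candidate_strings(text, min_len, max_len):
    """Collect all substrings of min_len..max_len adjacent Chinese characters."""
    if not 1 <= min_len <= max_len:
        raise ValueError("require 1 <= min_len <= max_len")
    candidates = set()
    # Punctuation, Latin letters, and digits split the text into separate runs,
    # so no candidate spans across them.
    for run in CJK_RUN.findall(text):
        for length in range(min_len, max_len + 1):
            for start in range(len(run) - length + 1):
                candidates.add(run[start:start + length])
    return candidates


if __name__ == "__main__":
    # The bounds 2 and 3 are arbitrary values chosen for the demonstration.
    for candidate in sorted(candidate_strings("华兴科技,abc", 2, 3)):
        print(candidate)
```

Returning a set removes duplicates; if later steps need occurrence counts for scoring, a `collections.Counter` would be the natural replacement.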
Embodiment 3
[0187] Embodiment 3 Computer-readable storage medium
[0188] An embodiment of the present invention also provides a computer-readable storage medium in which one or more programs are stored. The one or more programs can be executed by one or more processors to implement the steps of the organization name abbreviation generation method described in any of the above embodiments.

