Method for identifying user name abbreviations

A user name abbreviation identification technology, applied in the computer field. It addresses the high time consumption of existing approaches and their poor fit for large-scale real-time user name abbreviation recognition tasks. It reduces the amount of calculation by avoiding a large number of repeated calculations, which makes the determination easier.

Inactive Publication Date: 2016-11-30
INST OF INFORMATION ENG CHINESE ACAD OF SCI
View PDF2 Cites 5 Cited by

AI Technical Summary

Problems solved by technology

[0005] However, when identifying user name abbreviations, the enumeration method must enumerate all substrings of a user name, which is time-consuming and makes the method difficult to apply to large-scale, real-time user name abbreviation recognition tasks.

Examples

Embodiment 1

[0042] Embodiment 1 identifies whether a pinyin abbreviation exists between user names. A Chinese name has at least two characters, for example Zhang Wei or Shi Xiaoming. Taking Zhang Wei as an example, the pinyin of the name is ZhangWei or WeiZhang. A pinyin abbreviation can in principle be formed arbitrarily, but from a statistical point of view it is most likely to consist of the first letters of the name, that is, zw or wz. Likewise, the pinyin of Shi Xiaoming's name is ShiXiaoming or XiaomingShi, and the pinyin abbreviation is likely to be sxm or xms. From the above analysis, W is taken to be a set of pinyin whose character string length is not less than 2, and ΔL=2. Similarly, to identify the abbreviation of an English name: since middle names are used relatively rarely, an English name consists of at least a first name and a last name. For example, the English name Sheldon Lee Cooper is often She...

Embodiment 2

[0045] Embodiment 2 identifies whether a pinyin abbreviation exists between user names. From the analysis in the above embodiment, W is a set of pinyin whose character string length is not less than 2, and ΔL=2.

[0046] Given two user names a=wanxia68 and b=wanter_123, the segmentation results obtained through the second step above are X_a={wan,xia,6,8} and X_b={wan,te,r,1,2,3}, and the abbreviated forms are Y_a=wx68 and Y_b=wtr123. The calculation in the third step above then gives the length of the longest abbreviation of user names a and b as m=0. Since m<ΔL, there is no abbreviation phenomenon between user names a and b.
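The filtering, segmentation, and first-letter steps of this example can be sketched in Python as follows. This is only an illustrative sketch: the toy dictionary `W`, the function names, and the greedy longest-match segmentation strategy are assumptions made here, since the text states only that fragments are obtained from a specified dictionary. The dynamic programming of the third step is not reproduced, because this excerpt does not spell it out.

```python
import re

# Hypothetical toy dictionary W: pinyin syllables of length >= 2.
# A real dictionary would hold the full pinyin syllable set.
W = {"wan", "xia", "te"}


def filter_name(name):
    """Step 1: keep only English letters and digits, unified to lower case."""
    return re.sub(r"[^A-Za-z0-9]", "", name).lower()


def segment(name, dictionary):
    """Step 2a: split into dictionary words or single characters.

    Greedy longest-match is an assumed strategy, not taken from the source.
    """
    max_len = max(map(len, dictionary), default=0)
    fragments, i = [], 0
    while i < len(name):
        # Try the longest candidate first, down to length 2.
        for size in range(min(max_len, len(name) - i), 1, -1):
            if name[i:i + size] in dictionary:
                fragments.append(name[i:i + size])
                i += size
                break
        else:
            # No dictionary word starts here: emit a single character.
            fragments.append(name[i])
            i += 1
    return fragments


def abbreviate(fragments):
    """Step 2b: join the initial character of every fragment."""
    return "".join(fragment[0] for fragment in fragments)


x_a = segment(filter_name("wanxia68"), W)
x_b = segment(filter_name("wanter_123"), W)
print(x_a, abbreviate(x_a))
print(x_b, abbreviate(x_b))
```

With this toy dictionary, the sketch yields the same fragments and abbreviated forms as the embodiment (wx68 and wtr123). The underscore in wanter_123 is dropped by the filtering step before segmentation.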

[0047] The method provided by the invention uses an algorithm to automatically identify whether an abbreviation phenomenon exists between user names, without enumerating all the substrings of the user name as in the p...

Abstract

The invention provides a method for identifying user name abbreviations. The method includes the following steps: (1) filtering the characters of two or more user names so that only English letters and digits are kept; (2) segmenting each filtered user name into a plurality of continuous fragments, and selecting the initial character of each fragment to form a new character string; (3) acquiring the length of the longest abbreviation from the new character strings, and determining that a user name abbreviation exists among the user names if the length is greater than or equal to a given threshold ΔL. In addition, the English letters are uniformly converted to lower case or upper case; the fragments are words or single characters; the fragments are obtained by segmentation according to a specified dictionary; and the length of the longest abbreviation is obtained from the new character strings via a dynamic programming algorithm.

Description

Technical field

[0001] The invention relates to the field of computers, in particular to a method for identifying user name abbreviations.

Background technique

[0002] In recent years, the Internet has developed rapidly and has penetrated all aspects of social life, such as watching news or videos on portal sites such as Sina, Sohu, and Tencent, and exchanging information on social networks such as Weibo, Tieba, and online communities. When people use these networks, they register an account and fill in a user name. A user name is a character string, filled in by the user when registering on a website, that meets certain rules and identifies the user. It usually consists of English letters, digits, underscores, and other special characters.

[0003] When a user registers on a certain website, because user names on the website must be unique and the user's commonly used user name has already been registered by others, or due to other considerations such as protecting persona...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/30
CPC: G06F16/9535, G06F40/279
Inventor: 亚静, 王玉斌, 柳厅文, 时金桥, 李全刚
Owner INST OF INFORMATION ENG CHINESE ACAD OF SCI