Method and system for acquiring shortened form of organization name based on website homepage information

A technology relating to organizations and organization names, applied in the fields of network data retrieval, network data indexing, special data processing applications, etc.

Inactive Publication Date: 2016-09-21
CHINA INTERNET NETWORK INFORMATION CENTER
5 Cites 10 Cited by

AI Technical Summary

Problems solved by technology

However, the anchor-text-based method for obtaining institutional aliases used in the above patent has a limitation: not all institutional aliases will appear in



Examples


Embodiment 1

[0032] Figure 1 is a flowchart of the method for extracting the abbreviation of an organization name. As shown in Figure 1, the method comprises four main steps, which are described in detail below.

[0033] Step 1: Train on the homepage information of organizations whose full names and abbreviations are known, in order to extract the words that frequently appear together with the organization name. These words are the context feature words of the organization name, and they are later used to extract organization names at scale. Since the full name and the abbreviated name are interchangeable in context, we do not distinguish between them when training context feature words for organization names. The feature word training process is described in detail below.

[0034] Select the domain name addresses of 200 organizations, determine the full name and abbreviation of the organization names of these websit...
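The feature word training in Step 1 can be sketched roughly as follows. This is a minimal illustration and not the patented procedure: the function name `context_feature_words`, the window size, the plain frequency ranking, and the regex tokenization are all assumptions made here, and real Chinese homepage text would need a proper word segmenter.

```python
import re
from collections import Counter

def context_feature_words(pages, window=2, top_k=5):
    """Rank words that co-occur with known organization names.

    pages: list of (homepage_text, known_names) pairs, where known_names
    holds both the full name and the abbreviation (hypothetical input format).
    """
    counts = Counter()
    for text, names in pages:
        text = text.lower()
        # Full name and abbreviation are treated as interchangeable, so both
        # are mapped to one placeholder. Longer names are replaced first so
        # that an abbreviation inside the full name does not split it.
        for name in sorted(names, key=len, reverse=True):
            text = text.replace(name.lower(), " <org> ")
        tokens = re.findall(r"<org>|\w+", text)
        for i, tok in enumerate(tokens):
            if tok != "<org>":
                continue
            lo, hi = max(0, i - window), i + window + 1
            # Count every non-placeholder token inside the window.
            counts.update(t for t in tokens[lo:hi] if t != "<org>")
    return [word for word, _ in counts.most_common(top_k)]

# Made-up sample pages, for illustration only.
sample_pages = [
    ("Welcome to Acme Corporation official website", ["Acme Corporation", "Acme"]),
    ("About Acme official website contact", ["Acme Corporation", "Acme"]),
]
features = context_feature_words(sample_pages, window=2, top_k=2)
```

Mapping both name forms to a single placeholder mirrors the statement in [0033] that full names and abbreviations are not distinguished during training.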


Abstract

The present invention discloses a method and system for acquiring the shortened form of an organization name based on website homepage information. The method uses the homepage information of an organization's website to acquire the shortened name, so that the commonly used shortened name of the organization can be acquired efficiently and in a targeted manner. The shortened name can be acquired without using anchor text information, so the method complements methods that determine the shortened form of an organization name from anchor text. In addition, the degree of similarity between a shortened name and a full name can be calculated, so that a relatively high accuracy rate is achieved in shortened name acquisition.
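The abstract does not say how the similarity between a shortened name and a full name is computed. As one hypothetical measure, the sketch below scores a candidate as 1.0 when its characters appear in order inside the full name, and otherwise falls back to the fraction of its characters that the full name contains. The function names and the scoring rule are assumptions for illustration, not the patent's formula.

```python
from collections import Counter

def is_subsequence(short, full):
    """True if the characters of short appear in full in the same order."""
    remaining = iter(full)
    return all(ch in remaining for ch in short)

def abbreviation_similarity(short, full):
    """Hypothetical similarity score in [0, 1]; not the patent's formula."""
    short = short.lower().replace(" ", "")
    full = full.lower().replace(" ", "")
    if not short or not full:
        return 0.0
    if is_subsequence(short, full):
        return 1.0
    # Fallback: share of the candidate's characters also present in the
    # full name, counted as a multiset overlap.
    overlap = sum((Counter(short) & Counter(full)).values())
    return overlap / len(short)

score_cnnic = abbreviation_similarity("CNNIC", "China Internet Network Information Center")
score_beiyou = abbreviation_similarity("北邮", "北京邮电大学")
score_unrelated = abbreviation_similarity("xyz", "Acme")
```

A character-level rule of this kind suits Chinese abbreviations, which usually keep selected characters of the full name in their original order, but a production system would likely need a threshold and additional signals.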

Description

Technical field

[0001] The invention relates to the technical field of Internet data analysis, in particular to a method and a system for obtaining the shortened form of an organization name based on website homepage information.

Background

[0002] Organizations generally refer to government agencies, groups, or other enterprises and institutions, including government departments, research institutes, schools of all kinds, companies, international organizations, etc. In daily life, we habitually replace the full names of organizations that have many characters with their conventional abbreviations. For example, "Institute of Computing Technology, Chinese Academy of Sciences" is usually referred to by its shortened name, and "Beijing University of Posts and Telecommunications" is usually referred to as "Beiyou". With the popularity of the Internet and the rapid expansion of various types of information, more and more Internet users are accustomed to using search e...


Application Information

IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 李晓东, 张俊玲, 耿光刚, 延志伟, 陈勇
Owner CHINA INTERNET NETWORK INFORMATION CENTER