Method and apparatus for normalizing non-numeric characteristics of file

A normalization technique for non-numeric features, applied in the computer field, that addresses problems such as classifiers being unable to recognize and accept non-numeric features.

Active Publication Date: 2016-06-22
IBM CORP
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Because a classifier can accept only numeric features as input, non-numeric features of configuration files, such as file paths, cannot be used by classifiers for configuration file identification.

Embodiment Construction

[0020] Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

[0021] The main idea of the present invention is the observation that differences in file metadata, such as the file path of the same type of configuration file in different environments and systems, are not random but structural. For example, the file path has a hierarchical structure and local... To make full use of this inherent feature of file metadata, the invention extracts words (tokens) from strings of metadata such as file paths, builds a wo...
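
The token extraction described in [0021] can be illustrated with a minimal sketch. The function name `tokenize_path`, the choice of separators, and the decision to record the directory depth of each token are assumptions made for illustration; the source does not specify how tokens are cut.

```python
import re

def tokenize_path(path):
    """Split a file path into (depth, token) pairs.

    Assumption: path components are separated by '/' or '\\', and each
    component is further split on any non-letter character.
    """
    tokens = []
    # Split into hierarchy levels and drop empty pieces (e.g. the leading '/').
    parts = [p for p in re.split(r"[\\/]+", path) if p]
    for depth, part in enumerate(parts):
        # Split a component such as 'httpd.conf' into 'httpd' and 'conf'.
        for tok in re.split(r"[^A-Za-z]+", part):
            if tok:
                tokens.append((depth, tok.lower()))
    return tokens

linux_tokens = tokenize_path("/etc/httpd/conf/httpd.conf")
windows_tokens = tokenize_path("C:\\Apache24\\conf\\httpd.conf")
print(linux_tokens)
print(windows_tokens)
```

Although the two example paths differ, both end in the tokens `conf`, `httpd`, `conf` at the same relative depths, which is the kind of structural regularity the paragraph refers to.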

Abstract

The invention discloses a method and a corresponding apparatus for normalizing non-numeric characteristics of a file. The method comprises the steps of: cutting at least one pair of positive instances of the non-numeric characteristics of a given file into a plurality of tokens; comparing the tokens in the at least one pair of positive instances to obtain matched tokens; and, for each matched token, calculating a weight with which the token matches the given file and storing the token and its weight in a token library.
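
The three steps in the abstract (cut into tokens, compare a pair of positive instances, weight and store the matched tokens) can be sketched as follows. This is a sketch under stated assumptions, not the patented method: the abstract does not give the weight formula, so the weight used here (the fraction of positive pairs in which a token matched) is a placeholder, and the names `tokenize`, `matched_tokens`, and `build_token_library` are hypothetical.

```python
import re
from collections import Counter

def tokenize(value):
    """Cut a non-numeric feature (e.g. a file path) into lowercase tokens."""
    return [t.lower() for t in re.split(r"[^A-Za-z]+", value) if t]

def matched_tokens(a, b):
    """Return the tokens that appear in both positive instances."""
    return set(tokenize(a)) & set(tokenize(b))

def build_token_library(positive_pairs):
    """Map each matched token to a weight.

    Assumed weight: share of positive pairs in which the token matched.
    The source does not state the actual formula.
    """
    if not positive_pairs:
        return {}
    counts = Counter()
    for a, b in positive_pairs:
        counts.update(matched_tokens(a, b))
    total = len(positive_pairs)
    return {tok: count / total for tok, count in counts.items()}

# Pairs of paths that are all known to point at the same kind of file.
pairs = [
    ("/etc/httpd/conf/httpd.conf", "/opt/apache/conf/httpd.conf"),
    ("/usr/local/apache/conf/httpd.conf", "/opt/apache/conf/httpd.conf"),
]
library = build_token_library(pairs)
print(library)
```

With these two pairs, `httpd` and `conf` match in both pairs and `apache` in only one, so under the assumed formula the first two tokens receive a higher weight than `apache`.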

Description

Technical field

[0001] The invention relates to the field of computers, in particular to a method and device for normalizing non-numeric features of files.

Background

[0002] Most modern software uses configuration files to give users the flexibility to customize configuration items for their specific usage scenarios. For example, users can customize the value of the configuration item MaxClients (maximum number of clients) in the configuration file httpd.conf to adjust the maximum number of clients simultaneously connected to the Apache HTTP server.

[0003] Some day-to-day IT operations, such as application or data backup and recovery, workload migration, and file disaster recovery, are becoming more complex and challenging because they depend heavily on the identification of configuration files in a distributed environment. Therefore, there is a strong need to identify these configuration files in existing environments to accomplish these common IT...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: H04L41/0806, G06F16/2455, G06F16/248, G06F16/24578
Inventor: 孟繁晶, 杨林, 李长升, 徐景民, E·H·斯特恩, 卓雪君, 王晗
Owner: IBM CORP