Method for setting information classification threshold for optimizing lam percentage and information filtering system using same

A classification threshold setting method, applied in the field of information filtering, that addresses inconsistent evaluation indicators, deviation of model optimization results, and constrained performance, with the effect of improving performance and optimizing technical indicators.

Status: Inactive. Publication date: 2010-12-08
HEILONGJIANG INST OF TECH +1

AI Technical Summary

Problems solved by technology

[0023] To solve the problems of existing information filtering models, namely the inconsistency between the optimization target and the evaluation index of the filtering problem, the deviation of the model optimization results, and the restricted performance, the present invention proposes a method for setting the information classification threshold that optimizes lam%, and an information filtering system that uses this method.

Examples

Specific Embodiment 1

[0069] Specific Embodiment 1: This embodiment describes a method for setting an information classification threshold that optimizes lam%. A biased classification threshold is set so that hm% or sm% approaches 0, which in turn makes the value of lam% approach 0; that is, the value of the lam% expression tends to zero, which achieves the purpose of minimizing lam%.
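
The expression itself is not reproduced in this text. For orientation only, the logistic average misclassification percentage used in TREC Spam Track style evaluations is usually written as below. Treating this as the expression meant in [0069] is an assumption, as is reading hm% as the ham (normal information) misclassification rate and sm% as the spam misclassification rate, both taken as fractions strictly between 0 and 1.

```latex
\[
\mathrm{lam\%} = \mathrm{logit}^{-1}\!\left(\frac{\mathrm{logit}(\mathrm{hm\%}) + \mathrm{logit}(\mathrm{sm\%})}{2}\right),
\qquad
\mathrm{logit}(x) = \log\frac{x}{1 - x},
\qquad
\mathrm{logit}^{-1}(z) = \frac{1}{1 + e^{-z}}
\]
```

Under this definition, if hm% tends to 0 while sm% stays below 1, logit(hm%) tends to negative infinity, so the averaged logit does too and lam% tends to 0. The same holds with the roles of hm% and sm% exchanged.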

[0070] For example, the classification threshold can be set to 0.999999.

[0071] In this embodiment, the threshold must not be pushed too far toward the extreme; otherwise log(0) would have to be evaluated, and lam% could not be calculated. Therefore, the information classification threshold in this embodiment is biased so that hm% or sm% approaches zero, but does not reach it.
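
The behaviour described in [0069] to [0071] can be illustrated with a minimal Python sketch. It assumes the usual logistic-average definition of lam% (the average of the logits of hm% and sm%, mapped back through the inverse logit), with both rates expressed as fractions; the function names `logit` and `lam` are hypothetical and not taken from the patent.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a rate p; undefined at p = 0 and p = 1."""
    if not 0.0 < p < 1.0:
        # This is the log(0) situation that [0071] warns about.
        raise ValueError(f"logit is undefined for p={p}")
    return math.log(p / (1.0 - p))


def lam(hm: float, sm: float) -> float:
    """Logistic average misclassification rate as a fraction (assumed definition)."""
    z = (logit(hm) + logit(sm)) / 2.0
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    sm = 0.5  # illustrative spam misclassification rate, not a measured value
    for hm in (0.1, 0.01, 1e-4, 1e-6):
        # lam shrinks toward 0 as hm shrinks, even though sm is unchanged.
        print(f"hm={hm:g} sm={sm:g} lam={lam(hm, sm):.6f}")
    try:
        lam(0.0, sm)
    except ValueError as exc:
        print("cannot compute lam:", exc)
```

The sketch shows why a heavily biased threshold lowers the reported lam% without the filter itself improving, and why a rate of exactly zero has to be avoided.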

[0072] The above method for obtaining the information classification threshold is independent of the filtering model used by the filtering system, so this method for setting information classificatio...

Specific Embodiment 2

[0081] Embodiment 2: This embodiment describes an information filtering system based on the method for setting information classification thresholds described in Embodiment 1, which includes a feature weight library, a trainer, and an information filter, wherein:

[0082] The feature weight library is used to store the features and weight information of spam and normal information;

[0083] The trainer is used to adjust / update the features and their weights in the feature weight library according to the user's feedback;

[0084] The information filter is used to extract features from the received information and obtain feature information; it is also used to identify the received information based on the features in the feature weight library, and to classify the information into normal information and junk information;
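
The three components listed in [0082] to [0084] can be sketched as follows. This is only an illustration of how they fit together: the class and function names, the whitespace tokenizer, the logistic scoring rule, and the gradient-style weight update are all assumptions, since the filtering model itself is not given in this text.

```python
import math


def extract_features(text: str) -> set[str]:
    """Hypothetical feature extractor: lowercase whitespace tokens."""
    return set(text.lower().split())


class FeatureWeightLibrary:
    """Stores one weight per feature; positive weights point toward junk."""

    def __init__(self) -> None:
        self.weights: dict[str, float] = {}

    def score(self, features: set[str]) -> float:
        # Assumed scoring rule: logistic function of the summed weights.
        z = sum(self.weights.get(f, 0.0) for f in features)
        return 1.0 / (1.0 + math.exp(-z))


class Trainer:
    """Adjusts weights in the library according to user feedback."""

    def __init__(self, library: FeatureWeightLibrary, rate: float = 0.5) -> None:
        self.library = library
        self.rate = rate

    def feedback(self, text: str, is_junk: bool) -> None:
        features = extract_features(text)
        error = (1.0 if is_junk else 0.0) - self.library.score(features)
        for f in features:
            old = self.library.weights.get(f, 0.0)
            self.library.weights[f] = old + self.rate * error


class InformationFilter:
    """Classifies information by comparing its score with a threshold."""

    def __init__(self, library: FeatureWeightLibrary, threshold: float) -> None:
        self.library = library
        self.threshold = threshold

    def classify(self, text: str) -> str:
        score = self.library.score(extract_features(text))
        return "junk" if score >= self.threshold else "normal"


library = FeatureWeightLibrary()
trainer = Trainer(library)
for _ in range(5):
    trainer.feedback("win a free prize", is_junk=True)
    trainer.feedback("meeting at noon", is_junk=False)

print(InformationFilter(library, 0.5).classify("free prize"))
print(InformationFilter(library, 0.999999).classify("free prize"))
```

With the biased threshold of 0.999999 from Embodiment 1, the same library labels almost everything as normal information, which is the effect the threshold setting method relies on.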

[0085] In the information filter, the method for identifying new information is:

[0086] Establish an information filtering model framework based on ranking ...

Specific Embodiment 3

[0101] Embodiment 3: This embodiment provides another information filtering system based on the method for setting the spam classification threshold described in Embodiment 1. The system includes a feature weight library, a trainer, and an information filter, wherein:

[0102] The feature weight library is used to store the features and weight information of spam and normal information;

[0103] The trainer is used to adjust / update the features and their weights in the feature weight library according to the user's feedback;

[0104] The information filter is used to extract features from the received information and obtain feature information; it is also used to identify the received information based on the features in the feature weight library, and to classify the information into normal information and junk information;

[0105] In the information filter, the method for identifying new information is:

[0106] Establish an information filtering model framework based on ran...

Abstract

The invention discloses an information filtering system and an information filtering technique that address three problems of conventional information filtering models: the inconsistency between the optimization target and the evaluation index of the filtering problem, the deviation of the model optimization result, and the restricted performance. In the method for setting the information classification threshold for optimizing lam%, a biased classification threshold is set so that hm% or sm% approaches zero, which makes lam% approach zero. The information filtering system comprises a feature weight library, a trainer and an information filter. The information filter extracts features from the received information to obtain feature information, identifies the received information based on the features in the feature weight library, and divides the information into normal information and junk information. The information filtering system can be used to filter electronic information such as network information and mobile phone spam.

Description

Technical field

[0001] The present invention relates to an information filtering method and a threshold setting method used within it, in particular to a classification threshold setting method for information filtering tasks such as junk mail and short message filtering.

Background technique

[0002] With the rapid development of information technology, e-mail and mobile phone text messages have become the main means of communication and exchange in people's daily work and life, effectively promoting the production and progress of human society. However, a large number of spam emails and spam text messages have seriously affected their normal use. In the third quarter of 2008, Chinese netizens received an average of 17.86 spam emails per week, an increase of 1.17 or 7.0% compared with the same period last year; the proportion of spam emails received was 57.89%, an increase of 2.04 percentage points, and the global average is even higher than the abov...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 韩咏, 齐浩亮, 杨沐昀, 何晓宁, 李生, 王丁, 孙育华, 雷国华
Owner: HEILONGJIANG INST OF TECH