[0067] Figure 1 shows the flow of the file scanning method in an embodiment, which includes the following steps:
[0068] Step S110: Obtain the file to be scanned.
[0069] In this embodiment, after the scan engine of the antivirus or anti-Trojan software is started, the files to be scanned are obtained according to the scanning range that the user selects on the scanning interface. To keep the file scanning easy to maintain while it runs, the files within the selected scanning range are enumerated according to a set queue length, forming an enumeration queue of that length that waits for scanning.
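The enumeration queue described above might be sketched as follows. This is only an illustration under stated assumptions: the function name `enumerate_in_queues`, the `QUEUE_LENGTH` value, and the choice to walk a directory tree are hypothetical and do not come from the embodiment, which says only that the queue length is set.

```python
import os
from collections import deque
from typing import Iterator

QUEUE_LENGTH = 4  # assumed value; the embodiment only says the length is "set"


def enumerate_in_queues(scan_root: str, queue_length: int = QUEUE_LENGTH) -> Iterator[deque]:
    """Yield enumeration queues holding at most queue_length file paths each."""
    queue: deque = deque()
    for dirpath, _dirnames, filenames in os.walk(scan_root):
        for name in filenames:
            queue.append(os.path.join(dirpath, name))
            if len(queue) == queue_length:
                yield queue          # a full queue is handed over for scanning
                queue = deque()
    if queue:                        # final, partially filled queue
        yield queue
```

Handing over fixed-length queues keeps memory use bounded regardless of how large the selected scanning range is.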
[0070] Step S130: Extract the attribute information of the files to be scanned one by one from the acquired files to be scanned.
[0071] In this embodiment, the attribute information of each file to be scanned is extracted from the acquired files to be scanned. Specifically, the attribute information includes the file path name, file generation time, file modification time, file size, and file identification number. Comparing this attribute information with an earlier record shows whether the file to be scanned has been modified.
[0072] In practice, the attribute information may contain different items. For example, to reduce the amount of data, the attribute information may include only the file generation time, the file modification time, and the file size, which is enough to tell whether the corresponding file to be scanned has been modified. However, such attribute information is more likely to be exploited by a Trojan horse program, because a Trojan horse program can reset the file generation time and file modification time to those of the original file so that the file appears unmodified. The file identification number, by contrast, is allocated by the operating system and is difficult to modify. Therefore, in the preferred embodiment, the attribute information includes the file path name, file generation time, file modification time, file size, and file identification number. Attribute information that contains the file identification number is less likely to be forged by a Trojan horse program, and the file path name helps optimize subsequent processing.
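As a rough illustration of the preferred attribute set, the sketch below collects the five items with Python's `os.stat`. The type name `FileAttributes` and the function name `extract_attributes` are hypothetical, and mapping the "file identification number" to `st_ino` is an assumption rather than something the embodiment specifies.

```python
import os
from typing import NamedTuple


class FileAttributes(NamedTuple):
    path: str        # file path name
    created: float   # file generation time (see the st_ctime caveat below)
    modified: float  # file modification time
    size: int        # file size in bytes
    file_id: int     # identification number allocated by the operating system


def extract_attributes(path: str) -> FileAttributes:
    """Collect the attribute information of one file to be scanned."""
    st = os.stat(path)
    return FileAttributes(
        path=os.path.abspath(path),
        # Caveat: st_ctime is the creation time on Windows, but the time of the
        # last metadata change on Unix-like systems.
        created=st.st_ctime,
        modified=st.st_mtime,
        size=st.st_size,
        file_id=st.st_ino,
    )
```

Two records taken at different times can then be compared field by field, or as whole tuples, to see whether the file has changed.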
[0073] Step S150: Compare the extracted attribute information with the stored attribute information, and determine whether the extracted attribute information is the same as the stored attribute information; if not, go to step S170; if yes, go to step S190.
[0074] In this embodiment, the stored attribute information is the attribute information of the files identified as normal in the result of the previous Trojan horse scan. The extracted attribute information of the file to be scanned is compared with the pre-stored attribute information one by one to determine whether it is the same as any stored attribute information. If it is the same as a stored piece of attribute information, the file to be scanned has not been modified since the last scan, so it can be determined to be safe and does not need to be scanned again. If it differs from all the stored attribute information, the file to be scanned has changed and may have been modified by a Trojan horse program, so it must be scanned to determine whether it is safe.
[0075] In addition, if this is the first scan, or the acquired file to be scanned is being scanned for the first time, there is no corresponding pre-stored attribute information. In this case, a Trojan horse scan is performed on the file directly, and the attribute information of each file identified as normal in the resulting Trojan horse scan result is stored.
[0076] Step S170: Scan the file to be scanned corresponding to the extracted attribute information.
[0077] In this embodiment, the scanning engine is triggered to scan the file to be scanned whose attribute information has changed, and the obtained scanning result is used to determine whether the file contains a Trojan horse program. If the scanning result identifies the file as normal, that is, as a safe file, the corresponding attribute information is stored for comparison in the next scan.
[0078] Step S190: Remove the file to be scanned corresponding to the extracted attribute information from the acquired files to be scanned.
[0079] In this embodiment, the acquired files to be scanned are determined by the user's selection. In practice, however, a file that was identified as normal by the previous scan and has not changed since then does not need another Trojan horse scan. That is, when it is determined that a file to be scanned has not changed and does not need to be scanned again, it is removed from the acquired files to be scanned. This reduces the number of files waiting to be scanned and thereby avoids unnecessary resource consumption.
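Steps S110 to S190 can be summarized in a short sketch. It is a simplified model, not the embodiment's implementation: `incremental_scan`, `get_attributes`, and `scan_file` are hypothetical names, the scanning engine is represented by a callback that returns `True` for a file identified as normal, and the stored attribute information is modeled as an in-memory set.

```python
from typing import Callable, Hashable, Iterable, List, Set


def incremental_scan(
    files: Iterable[str],
    get_attributes: Callable[[str], Hashable],
    scan_file: Callable[[str], bool],  # hypothetical engine hook: True means "normal"
    stored: Set[Hashable],
) -> List[str]:
    """Return the files that were actually scanned; update `stored` in place."""
    scanned = []
    for path in files:                 # S110: files to be scanned
        attrs = get_attributes(path)   # S130: extract attribute information
        if attrs in stored:            # S150: same as a stored record?
            continue                   # S190: removed from the files to be scanned
        scanned.append(path)
        if scan_file(path):            # S170: scan a changed or unseen file
            stored.add(attrs)          # keep attributes of files identified as normal
    return scanned
```

On a first run `stored` is empty, so every file is scanned, which matches the first-scan case described above.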
[0080] In another embodiment, as shown in Figure 2, the above step of comparing the extracted attribute information with the stored attribute information further includes the following steps:
[0081] Step S210: Obtain the Trojan scan result.
[0082] In this embodiment, the Trojan horse scan result is obtained after the Trojan horse scan is completed. The Trojan horse scan result records information such as the file name and status identifier of each file scanned. That is, the Trojan horse scan result shows which files are identified as normal and which files are identified as dangerous.
[0083] Step S230: Extract files identified as normal from the Trojan horse scanning results.
[0084] Step S250: Obtain and store, one by one, the attribute information corresponding to the files identified as normal.
[0085] In this embodiment, after the attribute information corresponding to each file identified as normal is obtained, it is stored for comparison during file scanning. In a preferred embodiment, a caching mechanism is applied to improve file scanning performance: the attribute information corresponding to each file identified as normal is cached, which gives faster data queries while using fewer resources.
[0086] In another embodiment of the above Trojan horse scanning method, the attribute information takes the form of an attribute value that uniquely identifies the corresponding attribute information. Before the step of comparing the extracted attribute information with the stored attribute information, the method further includes: calculating on the extracted attribute information to obtain the attribute value corresponding to the file to be scanned, and calculating on the attribute information of the obtained file to obtain the attribute value corresponding to the obtained file.
[0087] In this embodiment, storing the attribute information itself may occupy a lot of storage space. To avoid an excessive amount of stored information and reduce the occupied space, a digest calculation is performed on the attribute information to obtain the attribute value. Whether the corresponding attribute information has changed is then determined by judging whether the attribute value has changed. The attribute value can be any one of an MD5 value, a CRC (cyclic redundancy check) value, and a hash value.
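A minimal sketch of how an attribute value might be derived, using Python's `hashlib.md5` and `zlib.crc32`. The function names, the field order, and the `|` separator are assumptions made for illustration; the embodiment does not specify how the attribute information is serialized before the calculation.

```python
import hashlib
import zlib


def serialize_attributes(path: str, created: float, modified: float,
                         size: int, file_id: int) -> bytes:
    # Numeric fields first and the path last, so the "|" separator stays
    # unambiguous even if the path itself contains "|".
    record = "|".join([repr(created), repr(modified), str(size), str(file_id), path])
    return record.encode("utf-8")


def attribute_value_md5(path: str, created: float, modified: float,
                        size: int, file_id: int) -> str:
    """Attribute value as a 32-character hexadecimal MD5 digest."""
    data = serialize_attributes(path, created, modified, size, file_id)
    return hashlib.md5(data).hexdigest()


def attribute_value_crc32(path: str, created: float, modified: float,
                          size: int, file_id: int) -> int:
    """Attribute value as an unsigned 32-bit CRC."""
    data = serialize_attributes(path, created, modified, size, file_id)
    return zlib.crc32(data)
```

Note that a digest identifies its input only up to collisions; a 32-bit CRC collides far more readily than a 128-bit MD5 value, so the choice trades storage space against the risk of a false match.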
[0088] In another embodiment, as shown in Figure 3, the above Trojan horse scanning method includes the following steps:
[0089] Step S301: Obtain the file to be scanned.
[0090] Step S302: Obtain the attribute information of the files to be scanned one by one from the obtained files to be scanned.
[0091] Step S303: Calculate the extracted attribute information to obtain the attribute value corresponding to the file to be scanned.
[0092] Step S304: Extract the file path name of the file to be scanned from the extracted attribute information, and calculate the file path name to obtain the information digest value corresponding to the file to be scanned.
[0093] In this embodiment, in order to speed up the comparison, the stored attribute information can be searched according to the file path name of the file to be scanned. Since the attribute information is stored as an attribute value, a digest calculation is likewise performed on the file path name to obtain the corresponding information digest value. The form of the information digest value matches that of the attribute value: for example, if the attribute value is a hash value, the information digest value is also a hash value.
[0094] For example, if the file path name of a file to be scanned is C:\Windows\System32\kernel32.dll, the digest calculation on C:\Windows\System32\kernel32.dll yields a string of characters, and this string of characters is the corresponding information digest value.
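The path example above might look like this in code, assuming MD5 is the chosen digest. The function name `path_digest` is hypothetical, and the sketch hashes the path exactly as given; whether the path should first be normalized (for instance, lowercased on a case-insensitive file system) is not addressed by the embodiment.

```python
import hashlib


def path_digest(path_name: str) -> str:
    """Information digest value of a file path name, as a hexadecimal string."""
    return hashlib.md5(path_name.encode("utf-8")).hexdigest()


# The path from the example above; the result is a 32-character string.
digest = path_digest(r"C:\Windows\System32\kernel32.dll")
```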
[0095] Step S305: Query the stored information digest values for one that is the same as the information digest value corresponding to the file to be scanned.
[0096] In this embodiment, a query is performed among the multiple stored information digest values to find the one that is the same as the information digest value of the file to be scanned. To find the relevant attribute values within the large amount of stored data, the information digest value is used as the index for the search. The information digest value is obtained by a digest calculation on the file path name and serves as the unique identifier of that file path name. Because there may be multiple files under one file path name, a given information digest value may correspond to the attribute values of multiple files.
[0097] Step S306: From the stored correspondence, obtain the stored attribute value that corresponds to the information digest value found by the query.
[0098] In this embodiment, since the information digest value and the corresponding attribute value of each file are pre-stored, and the correspondence between the information digest value and the attribute value is established, the attribute value of a file can be found from its information digest value.
[0099] After a stored information digest value that is the same as the information digest value corresponding to the file to be scanned has been found, one or more stored attribute values can be obtained according to the stored correspondence.
[0100] Step S307: Determine whether the attribute value corresponding to the file to be scanned is the same as the stored attribute value; if not, perform step S308; if yes, perform step S309.
[0101] In this embodiment, it is determined whether the attribute value corresponding to the file to be scanned has changed. If the attribute value corresponding to the file to be scanned is the same as the stored attribute value, the attribute value has not changed, which in turn means that neither the attribute information nor the file to be scanned has changed. The file to be scanned can therefore be confirmed as a safe file identified as normal, and there is no need to scan it again. If the attribute value corresponding to the file to be scanned is not the same as the stored attribute value, the attribute value has changed and the file may have been modified by a Trojan horse program, so the file to be scanned should be scanned.
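The lookup and comparison in steps S305 to S307 could be sketched as below. The stored correspondence is modeled as a dictionary from information digest value to a set of attribute values; this structure and the name `needs_scan` are assumptions, since the embodiment does not fix a storage format.

```python
from typing import Dict, Set


def needs_scan(index: Dict[str, Set[str]], info_digest: str, attribute_value: str) -> bool:
    """Return True if the file must be scanned (S308), False if it can be removed (S309)."""
    # S305 and S306: find the stored digest and the attribute values it maps to.
    stored_values = index.get(info_digest, set())
    # S307: the file is unchanged only if its attribute value matches a stored one.
    return attribute_value not in stored_values
```

A digest that is absent from the index yields an empty set, so a file that has never been scanned is always sent to the scanner.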
[0102] Step S308: Scan the file to be scanned corresponding to the extracted attribute value.
[0103] Step S309: Remove the file to be scanned from the acquired file to be scanned.
[0104] Step S310: Obtain the Trojan scan result.
[0105] Step S311: Extract files identified as normal from the Trojan horse scanning results.
[0106] In this embodiment, after the scanning is completed, the files identified as normal are extracted from the Trojan horse scan result obtained by the scan; a file identified as normal is a safe file.
[0107] Step S312: Obtain the attribute information corresponding to the files identified as normal status one by one.
[0108] Step S313: Calculate the attribute information of the acquired file to obtain the attribute value corresponding to the acquired file.
[0109] Step S314: Extract the file path name from the attribute information of the obtained file, and calculate the file path name to obtain the information digest value corresponding to the obtained file.
[0110] In this embodiment, the file path name of the file is extracted from the acquired attribute information of the file, and a digest calculation is then performed on the file path name to obtain the information digest value.
[0111] Step S315: Using the information digest value corresponding to the acquired file as the index, establish the correspondence between the information digest value and the attribute value corresponding to the acquired file, and store the correspondence.
[0112] In this embodiment, in order to speed up queries during processing, the attribute values are stored with the information digest value as the index. Since there may be multiple files under one file path name, one information digest value may correspond to multiple attribute values in the correspondence between the information digest value and the attribute value of the obtained file.
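Steps S310 to S315 might be sketched as follows, again under assumptions: the scan result is modeled as tuples of path name, status identifier, and already serialized attribute information, and `NORMAL`, `md5_hex`, and `store_normal_files` are hypothetical names. A `defaultdict` of sets lets one information digest value map to several attribute values.

```python
import hashlib
from collections import defaultdict
from typing import DefaultDict, Iterable, Set, Tuple

NORMAL = "normal"  # hypothetical status identifier used in the scan result


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def store_normal_files(
    index: DefaultDict[str, Set[str]],
    scan_results: Iterable[Tuple[str, str, str]],  # (path name, status, attribute info)
) -> None:
    """Index the attribute values of files identified as normal by path digest."""
    for path_name, status, attributes in scan_results:  # S310: scan result
        if status != NORMAL:                            # S311: keep normal files only
            continue
        attribute_value = md5_hex(attributes)           # S312 and S313
        info_digest = md5_hex(path_name)                # S314
        index[info_digest].add(attribute_value)         # S315: store correspondence
```

Keying the store by the path digest means a later lookup touches only the attribute values recorded for that path instead of comparing against every stored record.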
[0113] Figure 4 shows the Trojan horse scanning system in an embodiment, which includes a file enumeration module 102, an information acquisition module 104, a comparison module 106, a scanning module 108, and a file removal module 110.
[0114] The file enumeration module 102 is used to obtain files to be scanned.
[0115] In this embodiment, after the scanning engine of the antivirus or anti-Trojan software is started, the file enumeration module 102 obtains the files to be scanned according to the scanning range that the user selects on the scanning interface. To keep the file scanning easy to maintain while it runs, the file enumeration module 102 enumerates the files within the selected scanning range according to a set queue length, forming an enumeration queue of that length that waits for scanning.
[0116] The information acquisition module 104 is configured to extract attribute information of the files to be scanned one by one from the acquired files to be scanned.
[0117] In this embodiment, the information acquisition module 104 extracts the attribute information of each file to be scanned from the acquired files to be scanned, and the attribute information of each file to be scanned is different. Specifically, the attribute information includes the file path name, file generation time, file modification time, file size, and file identification number. Comparing this attribute information with an earlier record shows whether the file to be scanned has been modified.
[0118] In practice, the attribute information may contain different items. For example, to reduce the amount of data, the attribute information may include only the file generation time, the file modification time, and the file size, which is enough to tell whether the corresponding file to be scanned has been modified. However, such attribute information is more likely to be exploited by a Trojan horse program, because a Trojan horse program can reset the file generation time and file modification time to those of the original file so that the file appears unmodified. The file identification number, by contrast, is allocated by the operating system and is difficult to modify. Therefore, in the preferred embodiment, the attribute information includes the file path name, file generation time, file modification time, file size, and file identification number. Attribute information that contains the file identification number is less likely to be forged by a Trojan horse program, and the file path name helps optimize subsequent processing.
[0119] The comparison module 106 is used to compare the extracted attribute information with the stored attribute information and determine whether the extracted attribute information is the same as the stored attribute information; if not, it notifies the scanning module 108; if yes, it notifies the file removal module 110.
[0120] In this embodiment, multiple pieces of attribute information are stored in advance, and the files corresponding to them are all files that were identified as normal by a previous scan. The comparison module 106 compares the extracted attribute information of the file to be scanned with the pre-stored attribute information one by one to determine whether it is the same as any stored attribute information. When the comparison module 106 determines that the extracted attribute information is the same as a stored piece of attribute information, the file to be scanned has not been modified since the last scan, so it can be determined to be safe and does not need to be scanned again. When the comparison module 106 determines that the extracted attribute information differs from all the stored attribute information, the file to be scanned has changed and may have been modified by a Trojan horse program, so it must be scanned to determine whether it is safe.
[0121] In addition, if this is the first scan, or the acquired file to be scanned is being scanned for the first time, there is no corresponding pre-stored attribute information. In this case, the scanning module 108 is notified to perform a Trojan horse scan on the file directly, and the attribute information of each file identified as normal in the Trojan horse scan result is stored.
[0122] The scanning module 108 is configured to scan the file to be scanned corresponding to the extracted attribute information.
[0123] In this embodiment, the scanning module 108 triggers the scanning engine to scan the file to be scanned whose attribute information has changed, and uses the obtained scanning result to determine whether the file contains a Trojan horse program. If the scanning result identifies the file as normal, that is, as a safe file, the corresponding attribute information is stored for comparison in the next scan.
[0124] The file removal module 110 is used to remove the file to be scanned from the acquired files to be scanned.
[0125] In this embodiment, the acquired files to be scanned are determined by the user's selection. In practice, however, a file that was identified as normal by the previous scan and has not changed since then does not need another Trojan horse scan. That is, when it is determined that a file to be scanned has not changed and does not need to be scanned again, the file removal module 110 removes it from the enumeration queue. This reduces the number of files waiting to be scanned and thereby avoids unnecessary resource consumption.
[0126] In another embodiment, as shown in Figure 5, the above Trojan horse scanning system also includes a result acquisition module 112 and an extraction module 114.
[0127] The result acquisition module 112 is used to obtain the Trojan horse scan result.
[0128] In this embodiment, the Trojan horse scan result is obtained after the Trojan horse scan is completed. The Trojan horse scan result records information such as the file name and status identifier of each file scanned. That is, the Trojan horse scan result shows which files are identified as normal and which files are identified as dangerous.
[0129] The extraction module 114 is configured to extract files identified as normal from the Trojan horse scanning results.
[0130] The information acquisition module 104 is also configured to obtain and store, one by one, the attribute information corresponding to the files identified as normal.
[0131] In this embodiment, after obtaining the attribute information corresponding to each file identified as normal, the information acquisition module 104 stores the attribute information for comparison during file scanning. In a preferred embodiment, a caching mechanism is applied to improve file scanning performance: the information acquisition module 104 caches the attribute information corresponding to each file identified as normal, which gives faster data queries while using fewer resources.
[0132] In another embodiment, the attribute information takes the form of an attribute value that uniquely identifies the corresponding attribute information. As shown in Figure 6, the above Trojan horse scanning system also includes an attribute value calculation module 116.
[0133] The attribute value calculation module 116 is configured to calculate on the extracted attribute information to obtain the attribute value corresponding to the file to be scanned, and to calculate on the attribute information of the obtained file to obtain the attribute value corresponding to the obtained file.
[0134] In this embodiment, storing the attribute information itself may occupy a lot of storage space. To avoid an excessive amount of stored information and reduce the occupied space, the attribute value calculation module 116 performs a digest calculation on the attribute information to obtain the attribute value. Whether the corresponding attribute information has changed is then determined by judging whether the attribute value has changed. The attribute value can be any one of an MD5 value, a CRC value, and a hash value.
[0135] In another embodiment, as shown in Figure 7, the above Trojan horse scanning system also includes a digest value calculation module 118 and a relationship establishment module 120.
[0136] The digest value calculation module 118 is configured to extract the file path name of the file to be scanned from the extracted attribute information, and to calculate on that file path name to obtain the information digest value corresponding to the file to be scanned.
[0137] In this embodiment, in order to speed up the comparison, the digest value calculation module 118 allows the stored attribute information to be searched according to the file path name of the file to be scanned. Since the attribute information is stored as an attribute value, a digest calculation is likewise performed on the file path name to obtain the corresponding information digest value. The form of the information digest value matches that of the attribute value: for example, if the attribute value is a hash value, the information digest value is also a hash value.
[0138] The digest value calculation module 118 is further configured to extract the file path name from the attribute information of the acquired file, and to calculate on that file path name to obtain the information digest value corresponding to the acquired file.
[0139] In this embodiment, the digest value calculation module 118 extracts the file path name of the file from the acquired attribute information of the file, and then performs a digest calculation on the file path name to obtain the information digest value.
[0140] The relationship establishment module 120 is used to establish, with the information digest value corresponding to the acquired file as the index, the correspondence between the information digest value and the attribute value corresponding to the acquired file, and to store the correspondence.
[0141] In this embodiment, in order to speed up queries during processing, the attribute values are stored with the information digest value as the index. Since there may be multiple files under one file path name, one information digest value may correspond to multiple attribute values in the correspondence between the information digest value and the attribute value of the obtained file.
[0142] In a specific embodiment, as shown in Figure 8, the comparison module 106 includes a query unit 1062, an attribute value query unit 1064, and a judgment unit 1066.
[0143] The query unit 1062 is used to query the stored information digest values for one that is the same as the information digest value corresponding to the file to be scanned.
[0144] In this embodiment, the query unit 1062 performs a query among the multiple stored information digest values to find the one that is the same as the information digest value of the file to be scanned. To find the relevant attribute values within the large amount of stored data, the information digest value is used as the index for the search. The information digest value is obtained by a digest calculation on the file path name and serves as the unique identifier of that file path name. Because there may be multiple files under one file path name, a given information digest value may correspond to the attribute values of multiple files.
[0145] The attribute value query unit 1064 is configured to obtain, from the stored correspondence, the stored attribute value that corresponds to the information digest value found by the query.
[0146] In this embodiment, since the information digest value and the corresponding attribute value of each file are stored in advance, and the correspondence between the information digest value and the attribute value is established, the attribute value query unit 1064 can find the attribute value of a file from its information digest value.
[0147] After a stored information digest value that is the same as the information digest value corresponding to the file to be scanned has been found, the attribute value query unit 1064 can obtain one or more stored attribute values according to the stored correspondence.
[0148] The judgment unit 1066 is configured to determine whether the attribute value corresponding to the file to be scanned is the same as the stored attribute value; if not, it notifies the scanning module 108; if yes, it notifies the file removal module 110.
[0149] In this embodiment, the judgment unit 1066 judges whether the attribute value corresponding to the file to be scanned has changed. If the attribute value corresponding to the file to be scanned is the same as the stored attribute value, the attribute value has not changed, which in turn means that neither the attribute information nor the file to be scanned has changed. The file to be scanned can therefore be confirmed as a safe file identified as normal, and there is no need to scan it again. If the attribute value corresponding to the file to be scanned is not the same as the stored attribute value, the attribute value has changed and the file may have been modified by a Trojan horse program, so the file to be scanned should be scanned.
[0150] In the above Trojan horse scanning method and system, when the attribute information of a file to be scanned differs from the pre-stored attribute information, a security scan is performed on that file to determine whether it is identified as normal. No security scan is performed on a file to be scanned whose attribute information is the same as the pre-stored attribute information. Because such files have a very high probability of not having been modified, the number of files that need a security scan is greatly reduced, which flexibly reduces resource consumption while still ensuring file security and improving file scanning speed.
[0151] In the above Trojan horse scanning method and system, the attribute information is stored as the attribute value calculated from it. Compared with attribute information containing multiple kinds of information, the attribute value is just a string of characters, which avoids the problem of storing an excessive amount of data and reduces the resources occupied.
[0152] In the above Trojan horse scanning method and system, for each file identified as normal in the Trojan horse scan result, the corresponding information digest value is calculated from the file path name in the attribute information, and the correspondence between the information digest value and the attribute value is established with the information digest value as the index. This effectively speeds up querying and comparing the stored attribute values, thereby increasing the file scanning speed.
[0153] The above embodiments express only several implementation modes of the present invention. Although they are described in specific detail, they should not be interpreted as limiting the patent scope of the present invention. It should be pointed out that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of the patent of the present invention shall be subject to the appended claims.