Software detection method and device, electronic equipment and computer readable storage medium

CN115495751B · Active · Publication Date: 2026-06-23 · CHINA TELECOM CORP LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHINA TELECOM CORP LTD
Filing Date
2022-09-29
Publication Date
2026-06-23

Abstract

The present disclosure provides a software detection method and device, electronic equipment and computer readable storage medium, relating to the technical field of Internet. The method comprises: obtaining first component information of a software to be detected; obtaining a target candidate component according to the first component information and second component information of a plurality of candidate components; obtaining component dangerous data flow information according to component vulnerability information of the target candidate component and component source code of the software to be detected; extracting the dangerous data flow information to obtain data flow node information; and determining whether the software to be detected has a vulnerability according to the calling function and the data flow node information. The present disclosure analyzes the application source code of the software to be detected to obtain the first component information of the dependent component, determines the target candidate component in the second component information of the plurality of candidate components through the first component information, increases the accuracy of determining the target candidate component, and is conducive to reducing the false negative rate of software detection.

Description

Technical Field

[0001] This disclosure relates to the field of Internet technology, and in particular to a software detection method, apparatus, electronic device, and computer-readable storage medium.

Background Technology

[0002] Open-source components are components whose source code is publicly available, and whose use, modification, and distribution are not restricted by licenses. For software developed based on open-source components, it is necessary to examine the open-source components used by the software in order to analyze its security performance.

[0003] In related technologies, software detection typically involves comparing software information with components in a vulnerability database. Specifically, it searches for vulnerability data based on dependencies and then uses this data to detect the software. However, this approach often fails to comprehensively and accurately identify vulnerabilities in the software under test, resulting in a high false negative rate.

[0004] It should be noted that the information disclosed in the background section above is only used to enhance the understanding of the background of this disclosure, and therefore may include information that does not constitute prior art known to those skilled in the art.

Summary of the Invention

[0005] This disclosure provides a software detection method, apparatus, electronic device, and computer-readable storage medium, which at least to some extent overcomes the problem of the inability to fully and accurately identify vulnerabilities in the software under test, resulting in a high false negative rate for software vulnerabilities.

[0006] Other features and advantages of this disclosure will become apparent from the following detailed description, or may be learned in part from practice of this disclosure.

[0007] According to one aspect of this disclosure, a software detection method is provided, the method comprising: acquiring first component information of a software to be detected, the first component information being obtained based on the application source code of the software to be detected, the application source code including function calls; obtaining a target candidate component based on the first component information and second component information of a plurality of candidate components; obtaining component dangerous data flow information based on component vulnerability information of the target candidate component and the component source code of the software to be detected; extracting the dangerous data flow information to obtain data flow node information; and determining whether the software to be detected has vulnerabilities based on the function calls and the data flow node information.

[0008] In one embodiment of this disclosure, before obtaining the first component information of the software to be tested, the method further includes: obtaining the first component information of the software to be tested based on the configuration file of the application source code of the software to be tested.

[0009] In one embodiment of this disclosure, before obtaining the target candidate component based on the first component information and the second component information of the plurality of candidate components, the method includes: crawling vulnerability information data of the plurality of candidate components from a vulnerability release platform; extracting information from the vulnerability information data to obtain the second component information of the plurality of candidate components.

[0010] In one embodiment of this disclosure, the first component information includes one or more of component name, component source, release time, vulnerability information, and license; the second component information includes one or more of component name, component source, release time, vulnerability information, and license, wherein obtaining the target candidate component based on the first component information and the second component information of multiple candidate components includes: selecting target second component information that matches the first component information from the second component information of multiple candidate components; and taking the candidate component corresponding to the target second component information as the target candidate component.

[0011] In one embodiment of this disclosure, obtaining component dangerous data flow information based on the component vulnerability information of the target component to be selected and the component source code of the software to be tested includes: extracting the component vulnerability information of the target component to be selected to obtain vulnerability exploit points; analyzing the component source code of the target component to be selected and the component source code of the software to be tested based on the vulnerability exploit points to obtain component dangerous data flow information, wherein the component dangerous data flow information includes one or more of the following: the source point, propagation path, and outbreak point of dangerous data.

[0012] In one embodiment of this disclosure, determining whether the software to be tested has a vulnerability based on the called function and the data stream node information includes: determining whether the called function and the data stream node information match; if they match, then the software to be tested has a vulnerability.

[0013] In one embodiment of this disclosure, the method further includes: if there is no match, then the software to be tested does not have a vulnerability.

[0014] According to another aspect of this disclosure, a software detection apparatus is provided, the apparatus comprising: an acquisition module for acquiring first component information of software to be detected, the first component information being obtained based on the application source code of the software to be detected, the application source code including function calls; a selection module for obtaining a target component to be selected based on the first component information and second component information of a plurality of components to be selected; a hazard information generation module for obtaining component hazard data stream information based on component vulnerability information of the target component to be selected and the component source code of the software to be detected; an extraction module for extracting the hazard data stream information to obtain data stream node information; and a processing module for determining whether the software to be detected has vulnerabilities based on the function calls and the data stream node information.

[0015] According to another aspect of this disclosure, an electronic device is provided, comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the above-described software detection method by executing the executable instructions.

[0016] According to another aspect of this disclosure, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the above-described software detection method.

[0017] According to another aspect of this disclosure, a computer program product is provided, the computer program product comprising a computer program or computer instructions, the computer program or computer instructions being loaded and executed by a processor to enable a computer to implement any of the software detection methods described above.

[0018] This disclosure provides a software detection method, apparatus, electronic device, and computer-readable storage medium. The method involves acquiring first component information of the software to be detected, which is obtained based on the application source code of the software, including function calls. Based on the first component information and second component information of multiple candidate components, a target candidate component is obtained. Based on component vulnerability information of the target candidate component and the component source code of the software to be detected, dangerous data flow information of the component is obtained. The dangerous data flow information is extracted to obtain data flow node information. Based on the function calls and data flow node information, it is determined whether the software to be detected has vulnerabilities. This disclosure, on the one hand, increases the accuracy of determining the target candidate component by analyzing the application source code of the software to be detected to obtain the first component information of the dependent components, and by using the first component information to determine the target candidate component from the second component information of multiple candidate components. This helps to reduce the false negative rate of software detection.

[0019] On the other hand, by determining whether the software under test has vulnerabilities based on the function calls and data flow node information, it is possible to identify and compare the actual dangerous functions in the software and to perform comprehensive and accurate software detection, thus addressing the problem of a high false positive rate in software detection.

[0020] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and are not intended to limit this disclosure.

Description of the Drawings

[0021] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this disclosure and, together with the description, serve to explain the principles of this disclosure. It is obvious that the drawings described below are merely some embodiments of this disclosure, and those skilled in the art can obtain other drawings based on these drawings without any inventive effort.

[0022] Figure 1 shows a schematic diagram of a system architecture according to an embodiment of this disclosure;

[0023] Figure 2 shows a flowchart of a software detection method according to an embodiment of this disclosure;

[0024] Figure 3 shows a schematic diagram of data flow nodes of the software to be tested in an embodiment of this disclosure;

[0025] Figure 4 shows a flowchart for determining the target candidate component in an embodiment of this disclosure;

[0026] Figure 5 shows a schematic diagram of a software detection apparatus according to an embodiment of this disclosure;

[0027] Figure 6 shows a schematic diagram of a software detection apparatus according to another embodiment of this disclosure;

[0028] Figure 7 shows a structural block diagram of an electronic device according to an embodiment of this disclosure;

[0029] Figure 8 shows a schematic diagram of a computer-readable storage medium provided in an embodiment of this disclosure.

Detailed Implementation

[0030] Exemplary embodiments will now be described more fully with reference to the accompanying drawings. However, these exemplary embodiments can be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, they are provided so that this disclosure will be more comprehensive and complete, and will fully convey the concept of the exemplary embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

[0031] Furthermore, the accompanying drawings are merely illustrative of this disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and therefore repeated descriptions of them will be omitted. Some block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network and / or processor devices and / or microcontroller devices.

[0032] Software detection typically involves comparing software information with components in a vulnerability database. However, this method lacks fine-grained risk comparison, leading to a high false positive rate. Furthermore, software detection often involves numerous exploit points and vulnerability sources. This method cannot identify and compare actual dangerous functions within components, thus failing to provide comprehensive and accurate software detection.

[0033] To address at least one of the aforementioned problems, embodiments of this disclosure provide software detection methods, apparatus, electronic devices, and computer-readable storage media, applicable to scenarios involving component security assessment, code security detection, and code and component security requirements across various industries. For example, it can be applied to software detection. According to the technical solutions provided by embodiments of this disclosure, on one hand, by analyzing the application source code of the software to be detected, first component information of the dependent components is obtained. Using this first component information, the target component is determined from the second component information of multiple candidate components, increasing the accuracy of determining the target component and helping to reduce the false negative rate in software detection. On the other hand, by determining whether the software to be detected has vulnerabilities based on function calls and data flow node information, it is possible to identify and compare actual dangerous functions in the software, enabling comprehensive and accurate software detection and solving the problem of high false positive rates in software detection.

[0034] To facilitate a comprehensive understanding of the technical solutions provided in the embodiments of this disclosure, the software detection system provided in the embodiments of this disclosure will be described below.

[0035] Figure 1 shows a schematic diagram of an exemplary system architecture to which the software detection method or software detection apparatus of the embodiments of this disclosure can be applied.

[0036] As shown in Figure 1, the system architecture may include terminal device 102, network 101, and server 103.

[0037] Network 101 is a medium used to provide a communication link between terminal device 102 and server 103, and can be a wired network or a wireless network.

[0038] Optionally, the aforementioned wireless or wired networks use standard communication technologies and / or protocols. The network is typically the Internet, but can also be any network, including but not limited to Local Area Networks (LANs), Metropolitan Area Networks (MANs), Wide Area Networks (WANs), mobile, wired or wireless networks, private networks, or any combination of virtual private networks. In some embodiments, technologies and / or formats including Hyper Text Markup Language (HTML), Extensible Markup Language (XML), etc., are used to represent data exchanged over the network. Furthermore, conventional encryption technologies such as Secure Socket Layer (SSL), Transport Layer Security (TLS), Virtual Private Networks (VPNs), and Internet Protocol Security (IPsec) can be used to encrypt all or some links. In other embodiments, custom and / or dedicated data communication technologies can be used to replace or supplement the aforementioned data communication technologies.

[0039] Terminal device 102 can be various electronic devices, including but not limited to smartphones, tablets, laptops, desktop computers, wearable devices, augmented reality devices, virtual reality devices, etc.

[0040] Optionally, the client of the application installed on different terminal devices 102 may be the same, or the client of the same type of application based on different operating systems. Depending on the terminal platform, the specific form of the application client may also be different; for example, the application client may be a mobile client, a PC client, etc.

[0041] Server 103 can be a server that provides various services, such as a backend management server that supports the device operated by the user using terminal device 102. The backend management server can analyze and process received requests and other data, and feed the processing results back to the terminal device.

[0042] Optionally, server 103 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.

[0043] The technical solutions provided in the embodiments of this disclosure will now be described.

[0044] Those skilled in the art will understand that the numbers of terminal devices 102, networks 101, and servers 103 in Figure 1 are merely illustrative; any number of terminal devices 102, networks 101, and servers 103 can be used as needed. This disclosure does not limit these numbers.

[0045] Having introduced the system architecture provided in the embodiments of this disclosure, the software detection method provided in the embodiments of this disclosure will be described next. This method can be executed by any electronic device with computing power. In some embodiments, the software detection method provided in the embodiments of this disclosure can be executed on server 103 as shown in Figure 1.

[0046] Figure 2 shows a flowchart of a software detection method according to an embodiment of the present disclosure. As shown in Figure 2, the software detection method provided in this embodiment may include the following steps S201 to S205.

[0047] S201, Obtain the first component information of the software to be tested. The first component information is obtained based on the application source code of the software to be tested, which includes function calls.

[0048] The source code of the software under test includes application source code and component source code. Application source code can refer to an uncompiled text file written according to a certain programming language specification. The application source code includes calling functions, which are used to directly call open source components to realize hardware control and data acquisition. Component source code is the source code of the open source components called by the software under test.

[0049] The first component information includes one or more of the following: component name, component source, release time, vulnerability information, and license. This embodiment does not limit the specific content of the first component information. For example, the first component information may include component name, component source, release time, vulnerability information, and license.
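As an illustration only, the first component information could be held in a small record type. The following is a minimal sketch, assuming a Python representation; the class name `ComponentInfo`, its field names, and the sample values are hypothetical and do not come from this disclosure. Treating the component name as the only required field is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ComponentInfo:
    """One component record; all fields except the name are optional (assumed)."""
    name: str
    source: Optional[str] = None          # where the component comes from
    release_time: Optional[str] = None    # release date as text, if known
    vulnerability_ids: tuple = ()         # identifiers of known vulnerabilities
    license: Optional[str] = None


# Hypothetical first component information for one dependency.
first_info = ComponentInfo(
    name="example-lib",
    source="org.example",
    release_time="2021-03-01",
    license="Apache-2.0",
)
```

The same record shape could also hold second component information, since both are described with the same set of fields.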

[0050] S202, based on the first component information and the second component information of multiple candidate components, the target candidate component is obtained.

[0051] In one embodiment, the software detection method may further include: extracting information from known publicly released vulnerability information data to obtain second component information of multiple candidate components.

[0052] The vulnerability information data can be vulnerability information publicly released by a vulnerability release platform. This disclosure embodiment does not limit the specific method by which known publicly released vulnerability information data is publicly released.

[0053] In another embodiment, prior to S202, the software detection method may further include: crawling vulnerability information data of multiple candidate components from a vulnerability release platform; extracting information from the vulnerability information data to obtain second component information of the multiple candidate components.

[0054] The components to be selected are those in the component vulnerability information database. Vulnerability distribution platforms include one or more of NVD, CVE, SecurityFocus, CNNVD, CNVD, and WooYun. Web crawlers are used to crawl the vulnerability distribution platforms, and the crawled information is used to build the component vulnerability information database. The database creation process may include steps such as deduplication, translation, associating with projects and components, and manual review. The vulnerability information data sources are authoritative, and the vulnerability information data is characterized by its breadth and richness.
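The deduplication step of the database creation process can be sketched as follows. This is a hedged illustration, not the procedure of this disclosure: the record keys (`vuln_id`, `component`, `platform`), the merge rule, and the placeholder identifiers are all assumptions, and the crawling itself is not shown.

```python
def deduplicate(records):
    """Merge crawled records that share a vulnerability id and component name."""
    merged = {}
    for rec in records:
        # Normalise case and whitespace so the same entry from two platforms collides.
        key = (rec["vuln_id"].strip().upper(), rec["component"].strip().lower())
        if key not in merged:
            merged[key] = {"vuln_id": key[0], "component": key[1], "platforms": []}
        if rec["platform"] not in merged[key]["platforms"]:
            merged[key]["platforms"].append(rec["platform"])
    return list(merged.values())


# Placeholder records standing in for crawled data (identifiers are invented).
crawled = [
    {"vuln_id": "CVE-0000-0001", "component": "Example-Lib", "platform": "NVD"},
    {"vuln_id": "cve-0000-0001", "component": "example-lib", "platform": "CNNVD"},
    {"vuln_id": "CVE-0000-0002", "component": "other-lib", "platform": "NVD"},
]
database = deduplicate(crawled)
```

Keeping the list of platforms on each merged entry preserves where a vulnerability was reported, which may help the manual review step.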

[0055] It should be noted that the component vulnerability database can be updated as needed. Updates can be scheduled at preset times, such as once a week; alternatively, the database can be updated the day before testing the component to be tested. This embodiment does not limit the specific timing of updates; updates can be performed as needed. This disclosure crawls multiple vulnerability publishing platforms, resulting in more comprehensive vulnerability information data and thus reducing the false negative rate of software detection results.

[0056] The second component information includes one or more of the following: component name, component source, release date, vulnerability information, and license. The second component information may be identical to or partially identical to the first component information. This embodiment does not limit the specifics of the second component information; for example, it may include the component name, component source, release date, vulnerability information, and license. The license may be, for example, the Apache License (a free software license issued by the Apache Software Foundation), the MIT License (named after the Massachusetts Institute of Technology), the Lesser General Public License (LGPL), the Berkeley Software Distribution (BSD) license, the General Public License (GPL), the Mozilla Public License (MPL), or the Server Side Public License (SSPL), without specific limitation.

[0057] Statistical analysis is performed on the vulnerability information data of components in the component vulnerability information database to extract detailed information related to the vulnerabilities, thereby obtaining second component information. Second component information can be extracted using OpenUE (Open Toolkit). This disclosure does not limit the specific method used to extract second component information; any method that can extract second component information from vulnerability information data is acceptable.

[0058] This disclosure increases the accuracy of determining the target candidate component by selecting the second component information that is closest to the first component information from multiple candidate component second component information, and using the candidate component corresponding to the closest second component information as the target candidate component. This helps to reduce the false negative rate of software detection.

[0059] S203. Based on the component vulnerability information of the target component to be selected and the component source code of the software to be tested, obtain the component dangerous data stream information.

[0060] In an exemplary embodiment, S203 includes steps A1 and A2.

[0061] Step A1: Extract the component vulnerability information of the target component to be selected to obtain the vulnerability exploitation point.

[0062] Exploit points can be a crucial means of gaining system control. Users identify vulnerable exploit points in a target system and then use these exploits to gain access, thereby achieving control over the target system. Exploit points can be blank or default passwords, default shared keys, application weaknesses, and spoofed IPs (Internet Protocol). This disclosure does not limit the specific exploit points used.

[0063] Exploitation points can be obtained from known publicly released vulnerability information data. This disclosure does not restrict the specific methods used to obtain exploit points.

[0064] Step A2: Analyze the source code of the target component and the source code of the software to be tested based on the vulnerability exploitation point to obtain component dangerous data flow information. The component dangerous data flow information includes one or more of the following: the source point, propagation path, and outbreak point of the dangerous data.

[0065] From the component source code of the software under test, find the component source code related to the vulnerability exploitation point. Combine the component source code of the target candidate component with the component source code found in the software under test, and perform code analysis based on the vulnerability exploitation point to obtain the component dangerous data flow information related to the vulnerability, as shown in Figure 3. In part (1) of Figure 3, 31 indicates an ordinary node of the application source code data flow of the software to be tested, and 32 indicates a function call node of the application source code of the software to be tested. In part (2) of Figure 3, 33 indicates the source point of the dangerous data, 34 indicates an ordinary node of the component source code data flow of the target candidate component, and 35 indicates the outbreak point. The data flow direction between the outbreak point 35, the ordinary node 34, and the source point 33 constitutes the propagation path in the component dangerous data flow information.
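The relationship between source point, propagation path, and outbreak point can be illustrated with a small graph search. The following is a minimal sketch, assuming the data flow has already been reduced to a directed graph of function names; the graph, the node names, and the use of breadth-first search are assumptions made for illustration and are not stated in this disclosure.

```python
from collections import deque


def find_propagation_path(flow_graph, source, outbreak):
    """Breadth-first search for one path from the source point to the outbreak point."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == outbreak:
            return path
        for nxt in flow_graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # the outbreak point is not reachable from the source point


# Hypothetical data flow edges inside a component.
flow_graph = {
    "readInput": ["parseHeader"],
    "parseHeader": ["buildQuery", "logHeader"],
    "buildQuery": ["executeQuery"],
}
dangerous_flow = {
    "source": "readInput",
    "outbreak": "executeQuery",
    "path": find_propagation_path(flow_graph, "readInput", "executeQuery"),
}
```

Here `readInput` plays the role of source point 33, `executeQuery` the role of outbreak point 35, and the intermediate entries in `path` correspond to ordinary nodes 34.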

[0066] S204, extract the dangerous data stream information to obtain the data stream node information.

[0067] Data stream node information can be obtained through static code analysis of hazardous data stream information. This disclosure does not limit the method used to extract hazardous data stream information; any method capable of extracting data stream node information is acceptable. For example, hazardous data stream information can be extracted using Soot (a static code analysis tool). Data stream node information includes at least one or more of the following: source point, propagation path, and outbreak point.
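To make the extraction step concrete, the sketch below collects node information from dangerous data flow records. It is a plain Python stand-in and does not use Soot; the record layout (`source`, `path`, `outbreak`) and the node names are hypothetical.

```python
def extract_node_info(dangerous_flows):
    """Collect source, propagation path, and outbreak nodes from each dangerous flow."""
    nodes = set()
    for flow in dangerous_flows:
        nodes.add(flow["source"])
        nodes.update(flow["path"])
        nodes.add(flow["outbreak"])
    return nodes


# Hypothetical dangerous data flow information for one component.
flows = [
    {
        "source": "readInput",
        "path": ["readInput", "buildQuery", "executeQuery"],
        "outbreak": "executeQuery",
    }
]
node_info = extract_node_info(flows)
```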

[0068] S205, determine whether the software under test has vulnerabilities based on the function calls and data flow node information.

[0069] In one embodiment, it is determined whether the called function and the data stream node information match; if they match, the software to be tested has a vulnerability.

[0070] As shown in Figure 3, it is determined whether the function call node 32 of the application source code of the software under test is equal to the data flow node information. If they are equal, the software under test has a vulnerability.

[0071] In another embodiment, it is determined whether the called function and the data stream node information match. If they do not match, the software to be tested does not have any vulnerabilities.

[0072] As shown in Figure 3, it is determined whether the function call node 32 of the application source code of the software under test is equal to the data flow node information. If they are not equal, the software under test does not have a vulnerability.
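The two outcomes of S205 can be sketched as a single comparison. The following is an illustrative sketch, assuming that both the function calls and the data flow node information are available as collections of function names and that a match means an identical name; the function name `detect_vulnerability` and the sample values are hypothetical.

```python
def detect_vulnerability(called_functions, data_flow_nodes):
    """Report which called functions coincide with dangerous data flow nodes."""
    matches = sorted(set(called_functions) & set(data_flow_nodes))
    return {"vulnerable": bool(matches), "matched_nodes": matches}


# Hypothetical inputs: functions called by the application, and dangerous nodes.
result = detect_vulnerability(
    ["openFile", "executeQuery"],
    {"readInput", "buildQuery", "executeQuery"},
)
safe_result = detect_vulnerability(["openFile"], {"readInput", "executeQuery"})
```

Returning the matched nodes alongside the verdict makes it possible to report which call reaches the dangerous data flow, rather than only that one exists.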

[0073] This disclosure combines the data flow nodes of the target component with the actual function call nodes in the application source code of the software under test for node matching. By considering the actual function call behavior of the application source code, it achieves comprehensive and accurate software testing and can also perform security assessments of the components the software under test depends on. Furthermore, it can reduce the false positive rate of software detection.

[0074] Furthermore, this disclosure analyzes the source code of the target component and the source code of the software to be tested based on the vulnerability exploitation points, thereby achieving fine-grained risk comparison and reducing the false alarm rate of software detection.

[0075] In one embodiment, before obtaining the first component information of the software to be tested, the software testing method may further include: obtaining the first component information of the software to be tested based on the configuration file of the application source code of the software to be tested.

[0076] The configuration file can be a computer file that configures parameters and initial settings for some computer programs. The configuration file includes information about the first component; for example, it can be pom.xml (a configuration file) and / or build.gradle (a configuration file). This disclosure does not limit the specific type of configuration file. The software to be tested may include multiple components.
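As an example of reading component information from a configuration file, the sketch below parses the dependency list of a pom.xml using Python's standard `xml.etree.ElementTree` module. The sample POM content is invented, and mapping `groupId` to component source and `artifactId` to component name is an assumption; the `version` field is included for illustration although it is not one of the fields listed in this disclosure.

```python
import xml.etree.ElementTree as ET

# Maven POM files declare this XML namespace on the root element.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>example-lib</artifactId>
      <version>1.2.3</version>
    </dependency>
  </dependencies>
</project>
"""


def read_dependencies(pom_text):
    """Return one dictionary per <dependency> element in the POM text."""
    root = ET.fromstring(pom_text)
    deps = []
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        deps.append({
            "source": dep.findtext("m:groupId", default="", namespaces=POM_NS),
            "name": dep.findtext("m:artifactId", default="", namespaces=POM_NS),
            "version": dep.findtext("m:version", default="", namespaces=POM_NS),
        })
    return deps


dependencies = read_dependencies(SAMPLE_POM)
```

A build.gradle file is not XML and would need a different parser; that case is not covered by this sketch.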

[0077] By performing dependency analysis on the configuration file of the application code of the software to be tested, the information of the first component that the code depends on is obtained; the information of the first component is matched with the component vulnerability information database to determine whether a component in the software to be tested is a dangerous component; if the component is a non-dangerous component, the next component is matched with the component vulnerability information database; if the component is a dangerous component, the component to be selected in the component vulnerability information database corresponding to the component is selected as the target component to be selected.
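The component-by-component check described above can be sketched as a loop over the dependencies. This is a simplified illustration, assuming the component vulnerability information database is a dictionary keyed by lower-case component name; the lookup rule and all sample data are hypothetical.

```python
def select_target_candidates(first_components, vulnerability_db):
    """Return the database entries for dependencies found in the database."""
    targets = []
    for comp in first_components:
        entry = vulnerability_db.get(comp["name"].lower())
        if entry is None:
            continue  # non-dangerous component: move on to the next one
        targets.append(entry)  # dangerous component: keep its database entry
    return targets


# Hypothetical database and dependency list.
vulnerability_db = {
    "example-lib": {"name": "example-lib", "vulnerability_ids": ["CVE-0000-0001"]},
}
first_components = [{"name": "Example-Lib"}, {"name": "safe-lib"}]
targets = select_target_candidates(first_components, vulnerability_db)
```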

[0078] In one embodiment, the first component information includes one or more of the following: component name, component source, release date, vulnerability information, and license; the second component information includes one or more of the following: component name, component source, release date, vulnerability information, and license. As shown in Figure 4, obtaining the target candidate component based on the first component information and the second component information of multiple candidate components may include S401 and S402.

[0079] S401, Select the target second component information that matches the first component information from the second component information of multiple components to be selected;

[0080] The specific description of the first component information is the same as that in S201, and the specific description of the second component information is the same as that in S202, so it will not be repeated here. This embodiment of the disclosure can determine the target second component information by calculating the correlation between the first component information and the second component information.

[0081] The second component information includes one or more of the following: component name, component source, release time, vulnerability information, and license. The second component information may be the same as or partially the same as the first component information.

[0082] It should be noted that the software under test may depend on multiple candidate components. Each candidate component that the software under test depends on has corresponding second component information. By statistically analyzing the first component information and the second component information of the software under test, the correlation between the software under test and the multiple candidate components can be obtained. Alternatively, the correlation between the software under test and the multiple candidate components can be obtained by calculating the Euclidean distance between the first component information and the second component information, or by calculating the cosine similarity between the first component information and the second component information. This disclosure does not limit the specific method of obtaining the correlation.
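To make the two measures concrete, the sketch below compares two text fields by their character counts. The disclosure does not say how component information is turned into vectors, so the character-count representation is an assumption, and the function names are hypothetical. Under that assumption, the cosine similarity of "NVD" and "CVE" is 1/3, which agrees with the 0.33 listed for the component source in Table 1.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of the character-count vectors of two strings."""
    counts_a, counts_b = Counter(a), Counter(b)
    dot = sum(counts_a[ch] * counts_b[ch] for ch in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty field is treated as unrelated
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Euclidean distance between the character-count vectors of two strings."""
    counts_a, counts_b = Counter(a), Counter(b)
    characters = set(counts_a) | set(counts_b)
    return math.sqrt(sum((counts_a[ch] - counts_b[ch]) ** 2 for ch in characters))
```

Cosine similarity grows with likeness, whereas Euclidean distance shrinks, so a distance has to be converted (for example, inverted) before it can serve as a correlation value.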

[0083] For example, a component in the software to be tested is called the first component, and it corresponds to the first component information. A component among the candidate components is called the second component, and it corresponds to the second component information. The correlation between the first component information and the second component information is calculated using cosine similarity, and the result is shown in Table 1.

[0084] Table 1. Correlation between the first component information and the second component information.

[0085]
| Item | First component information | Second component information | Correlation |
|---|---|---|---|
| Component name | Newtonsoft.Json | Newtonsoft.Json | 1 |
| Component source | NVD | CVE | 0.33 |
| Release time | 20220912 | 20220912 | 1 |
| Vulnerability information | SQL injection attack | Cross-site scripting vulnerability | 0 |
| License | Apache | Apache | 1 |

[0086] It should be noted that a higher correlation value indicates a greater similarity between component information, and vice versa. A correlation value of 1 indicates that the component information is the same; a correlation value of 0 indicates that the component information is completely different. Table 1 shows the correlation between the first and second component information. The correlation is represented by an association array, i.e., {1, 0.33, 1, 0, 1}. When there are multiple components to be selected, the second component information of each component to be selected is analyzed for correlation with the first component information, resulting in multiple association arrays. The target second component information is determined through multiple association arrays. This disclosure does not limit how the target second component information is determined; for example, it can be determined through a combination of functional relationships, neural network models, or graphs.

[0087] For example, the target second component information can be determined through a functional relationship, as follows:

[0088] $y = a x_1 + b x_2 + c x_3 + d x_4 + e x_5$ (1)

[0089] Where $y$ is the total relevance; $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$ are the relevance of component name, component source, release time, vulnerability information and license, respectively; and $a$, $b$, $c$, $d$ and $e$ are the corresponding weights. The sum of $a$, $b$, $c$, $d$ and $e$ can be a constant value, for example 1.

[0090] The second component information with the largest y-value can be selected as the target second component information.
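Formula (1) and the selection rule can be illustrated with a short Python sketch. The weights below are hypothetical values chosen only so that they sum to 1, and the candidate names and the second association array are invented for the example; only the array {1, 0.33, 1, 0, 1} comes from Table 1.

```python
# Hypothetical weights a, b, c, d, e for name, source, release time,
# vulnerability information and license; they sum to 1.
WEIGHTS = (0.4, 0.1, 0.2, 0.2, 0.1)


def total_relevance(association, weights=WEIGHTS):
    """Compute y = a*x1 + b*x2 + c*x3 + d*x4 + e*x5 for one association array."""
    if len(association) != len(weights):
        raise ValueError("association array and weights must have equal length")
    return sum(w * x for w, x in zip(weights, association))


def select_target(association_arrays, weights=WEIGHTS):
    """Return the candidate name whose association array has the largest y."""
    return max(
        association_arrays,
        key=lambda name: total_relevance(association_arrays[name], weights),
    )


# Usage: the first array is taken from Table 1, the second is made up.
arrays = {
    "candidate_a": [1, 0.33, 1, 0, 1],
    "candidate_b": [0, 0.33, 0, 1, 0],
}
target = select_target(arrays)
```

With these weights the Table 1 array gives y = 0.4 + 0.033 + 0.2 + 0 + 0.1 = 0.733, so candidate_a is selected over candidate_b (y = 0.233).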

[0091] S402: take the candidate component corresponding to the target second component information as the target candidate component.

[0092] In this embodiment of the disclosure, the candidate component corresponding to the target second component information can be found through the correspondence relationship. For example, the candidate component corresponding to the target second component information can be obtained by querying the correspondence relationship table. This embodiment of the disclosure does not limit how the candidate component corresponding to the target second component information is obtained.

[0093] Based on the same inventive concept, this disclosure also provides a software detection device, as described in the following embodiments. Because the device embodiment solves the problem on a principle similar to that of the method embodiment described above, its implementation can refer to the implementation of the method embodiment, and details already described are not repeated here.

[0094] Figure 5 illustrates a software detection device according to an embodiment of the present disclosure. As shown in Figure 5, the device includes an acquisition module 51, a selection module 52, a danger information generation module 53, an extraction module 54, and a processing module 55.

[0095] The acquisition module 51 is used to acquire the first component information of the software to be tested. The first component information is obtained based on the application source code of the software to be tested, and the application source code includes function calls.

[0096] The selection module 52 is used to obtain the target candidate component based on the first component information and the second component information of multiple candidate components.

[0097] The danger information generation module 53 is used to obtain component dangerous data flow information based on the component vulnerability information of the target candidate component and the component source code of the software to be tested.

[0098] The extraction module 54 is used to extract the dangerous data flow information to obtain data flow node information.

[0099] The processing module 55 is used to determine whether the software to be tested has vulnerabilities based on the called function and the data flow node information.

[0100] In one embodiment, as shown in Figure 6, the acquisition module 51 is used to acquire the application source code of the software to be tested and determine the first component information of the software to be tested based on the application source code; the selection module 52 is used to acquire the second component information of multiple candidate components in the component vulnerability information database and obtain the target candidate component based on the first component information and the second component information of the multiple candidate components; the danger information generation module 53 is used to acquire the component source code of the software to be tested and obtain the component dangerous data flow information based on the component vulnerability information of the target candidate component and the component source code of the software to be tested.

[0101] In one embodiment, the acquisition module 51 is further configured to obtain the first component information of the software to be tested based on the configuration file of the application source code of the software to be tested.

[0102] In one embodiment, the acquisition module 51 is further configured to crawl vulnerability information data of multiple candidate components on a vulnerability release platform before obtaining the target candidate component based on the first component information and the second component information of multiple candidate components; and to extract information from the vulnerability information data to obtain the second component information of multiple candidate components.

[0103] In one embodiment, the first component information includes one or more of the following: component name, component source, release time, vulnerability information, and license; the second component information includes one or more of the following: component name, component source, release time, vulnerability information, and license; the selection module 52 is further configured to select target second component information that matches the first component information from the second component information of multiple candidate components; and to use the candidate component corresponding to the target second component information as the target candidate component.

[0104] In one embodiment, the danger information generation module 53 is further configured to extract component vulnerability information of the target component to be selected to obtain vulnerability exploit points; and analyze the component source code of the target component to be selected and the component source code of the software to be tested based on the vulnerability exploit points to obtain component danger data flow information, which includes one or more of the following: the source point, propagation path, and outbreak point of the danger data.

[0105] In one embodiment, the processing module 55 is further configured to determine whether the called function and data stream node information match; if they match, the software to be detected has a vulnerability.

[0106] In one embodiment, the processing module 55 is further configured to determine whether the called function and the data stream node information match; if they do not match, the software to be tested does not have any vulnerabilities.
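The matching step performed by the processing module can be sketched as below. This section does not define what counts as a match, so the sketch assumes a match means that a function called by the application has the same name as a node of a dangerous data flow. The class DangerousDataFlow, the function has_vulnerability, and all function names in the usage are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DangerousDataFlow:
    """One dangerous data flow: source point, propagation path, outbreak point."""
    source: str
    propagation_path: tuple
    outbreak_point: str

    def nodes(self):
        """Return the set of data flow node names extracted from this flow."""
        return {self.source, *self.propagation_path, self.outbreak_point}


def has_vulnerability(called_functions, flows):
    """Return True if any called function matches a data flow node by name."""
    node_names = set()
    for flow in flows:
        node_names |= flow.nodes()
    return any(function in node_names for function in called_functions)


# Usage with made-up names: the application calls the outbreak point directly.
flow = DangerousDataFlow(
    source="lib.parse_input",
    propagation_path=("lib.build_query",),
    outbreak_point="lib.execute_sql",
)
vulnerable = has_vulnerability(["app.main", "lib.execute_sql"], [flow])
not_vulnerable = has_vulnerability(["app.main"], [flow])
```

Comparing against the functions the application actually calls, rather than against the mere presence of the component, is what lets the check rule out components whose dangerous code paths are never reached.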

[0107] The apparatus provided in this disclosure, on the one hand, analyzes the application source code of the software to be detected to obtain first component information of the components it depends on. Using this first component information, it identifies the target component from second component information of multiple candidate components, increasing the accuracy of identifying the target component and helping to reduce the false negative rate in software detection. On the other hand, it determines whether the software to be detected has vulnerabilities based on function calls and data flow node information. This allows for the identification and comparison of actual dangerous functions in the software, enabling comprehensive and accurate software detection and solving the problem of high false positive rates in software detection.

[0108] Those skilled in the art will understand that various aspects of this disclosure can be implemented as a system, method, or program product. Therefore, various aspects of this disclosure can be specifically implemented in the following forms: a completely hardware implementation, a completely software implementation (including firmware, microcode, etc.), or a combination of hardware and software aspects, collectively referred to herein as a "circuit," "module," or "system."

[0109] An electronic device 700 according to such an embodiment of the present disclosure is described below with reference to Figure 7. The electronic device 700 shown in Figure 7 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments disclosed herein.

[0110] As shown in Figure 7, the electronic device 700 takes the form of a general-purpose computing device. The components of the electronic device 700 may include, but are not limited to: at least one processing unit 710, at least one storage unit 720, and a bus 730 connecting different system components (including storage unit 720 and processing unit 710).

[0111] The storage unit stores program code that can be executed by the processing unit 710, causing the processing unit 710 to perform the steps described in the "Exemplary Methods" section of this specification according to various exemplary embodiments of this disclosure. For example, the processing unit 710 can perform the following steps of the above method embodiments: obtaining first component information of the software to be tested, the first component information being obtained based on the application source code of the software to be tested, the application source code including function calls; obtaining a target component to be selected based on the first component information and second component information of multiple components to be selected; obtaining component dangerous data flow information based on the component vulnerability information of the target component to be selected and the component source code of the software to be tested; extracting the dangerous data flow information to obtain data flow node information; and determining whether the software to be tested has vulnerabilities based on the function calls and the data flow node information.

[0112] Storage unit 720 may include a readable medium in the form of a volatile storage unit, such as random access memory (RAM) 7201 and / or cache memory 7202, and may further include a read-only memory (ROM) 7203.

[0113] The storage unit 720 may also include a program/utility 7204 having a set of (at least one) program modules 7205. Such program modules 7205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.

[0114] Bus 730 can represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.

[0115] Electronic device 700 can also communicate with one or more external devices 740 (e.g., keyboard, pointing device, Bluetooth device, etc.), and with one or more devices that enable a user to interact with electronic device 700, and / or with any device that enables electronic device 700 to communicate with one or more other computing devices (e.g., router, modem, etc.). This communication can be performed via input / output (I / O) interface 750. Furthermore, electronic device 700 can also communicate with one or more networks (e.g., local area network (LAN), wide area network (WAN), and / or public networks, such as the Internet) via network adapter 760. As shown, network adapter 760 communicates with other modules of electronic device 700 via bus 730. It should be understood that, although not shown in the figures, other hardware and / or software modules can be used in conjunction with electronic device 700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.

[0116] From the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein can be implemented by software or by combining software with necessary hardware. Therefore, the technical solutions according to the embodiments of this disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, external hard drive, etc.) or on a network, including several instructions to cause a computing device (such as a personal computer, server, terminal device, or network device, etc.) to execute the methods according to the embodiments of this disclosure.

[0117] In exemplary embodiments of this disclosure, a computer-readable storage medium is also provided, which may be a readable signal medium or a readable storage medium. Figure 8 shows a schematic diagram of a computer-readable storage medium provided in an embodiment of the present disclosure. As shown in Figure 8, the computer-readable storage medium 800 stores a program product capable of implementing the methods described above in this disclosure. In some possible embodiments, various aspects of this disclosure may also be implemented as a program product comprising program code that, when run on a terminal device, causes the terminal device to perform the steps described in the "Exemplary Methods" section of this specification according to various exemplary embodiments of this disclosure.

[0118] More specific examples of computer-readable storage media in this disclosure may include, but are not limited to: electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

[0119] In this disclosure, a computer-readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such propagated data signals may take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A readable signal medium may also be any readable medium other than a readable storage medium that is capable of sending, propagating, or transporting a program for use by or in connection with an instruction execution system, apparatus, or device.

[0120] Optionally, the program code contained on the computer-readable storage medium may be transmitted using any suitable medium, including but not limited to wireless, wired, optical fiber, RF, etc., or any suitable combination thereof.

[0121] In practical implementation, program code for performing the operations of this disclosure can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as C or similar languages. The program code can execute entirely on the user's computing device, partially on the user's device, as a standalone software package, partially on the user's computing device and partially on a remote computing device, or entirely on a remote computing device or server. In cases involving remote computing devices, the remote computing device can be connected to the user's computing device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (e.g., via the Internet using an Internet service provider).

[0122] The disclosed embodiments also provide a computer program product, which includes a computer program or computer instructions that are loaded and executed by a processor to enable a computer to perform the steps of the various exemplary embodiments according to the present disclosure described in the "Detailed Description" section above.

[0123] It should be noted that although several modules or units for the device used to perform actions have been mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of this disclosure, the features and functions of two or more modules or units described above can be embodied in one module or unit. Conversely, the features and functions of one module or unit described above can be further divided and embodied by multiple modules or units.

[0124] Furthermore, although the steps of the method in this disclosure are described in a specific order in the accompanying drawings, this does not require or imply that the steps must be performed in that specific order, or that all the steps shown must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or a step may be broken down into multiple steps.

[0125] From the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein can be implemented by software or by combining software with necessary hardware. Therefore, the technical solutions according to the embodiments of this disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, external hard drive, etc.) or on a network, including several instructions to cause a computing device (such as a personal computer, server, mobile terminal, or network device, etc.) to execute the methods according to the embodiments of this disclosure.

[0126] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope of this disclosure is indicated by the appended claims.

Claims

1. A software detection method, characterized by comprising: obtaining first component information of software to be tested, the first component information being obtained based on application source code of the software to be tested, the application source code including function calls; obtaining a target candidate component based on the first component information and second component information of multiple candidate components; obtaining component dangerous data flow information based on component vulnerability information of the target candidate component and component source code of the software to be tested; extracting the dangerous data flow information to obtain data flow node information; and determining whether the software to be tested has a vulnerability based on the called function and the data flow node information; wherein the step of obtaining the component dangerous data flow information based on the component vulnerability information of the target candidate component and the component source code of the software to be tested includes: extracting the component vulnerability information of the target candidate component to obtain vulnerability exploit points; and analyzing the component source code of the target candidate component and the component source code of the software to be tested based on the vulnerability exploit points to obtain the component dangerous data flow information, wherein the component dangerous data flow information includes one or more of the following: a source point, a propagation path, and an outbreak point of dangerous data;
and wherein the step of determining whether the software to be tested has a vulnerability based on the called function and the data flow node information includes: determining whether the called function and the data flow node information match; if they match, the software to be tested has a vulnerability; if they do not match, the software to be tested does not have a vulnerability.

2. The software detection method of claim 1, wherein, before obtaining the first component information of the software to be tested, the method further includes: obtaining the first component information of the software to be tested based on a configuration file of the application source code of the software to be tested.

3. The software detection method of claim 1, wherein, before obtaining the target candidate component based on the first component information and the second component information of the multiple candidate components, the method includes: crawling vulnerability information data of the multiple candidate components from a vulnerability release platform; and extracting information from the vulnerability information data to obtain the second component information of the multiple candidate components.

4. The software detection method of claim 1, wherein the first component information includes one or more of the following: component name, component source, release time, vulnerability information, and license; the second component information includes one or more of the following: component name, component source, release time, vulnerability information, and license; and the step of obtaining the target candidate component based on the first component information and the second component information of the multiple candidate components includes: selecting, from the second component information of the multiple candidate components, target second component information that matches the first component information; and taking the candidate component corresponding to the target second component information as the target candidate component.

5. A software detection apparatus, characterized by comprising: an acquisition module configured to acquire first component information of software to be tested, the first component information being obtained based on application source code of the software to be tested, the application source code including function calls; a selection module configured to obtain a target candidate component based on the first component information and second component information of multiple candidate components; a danger information generation module configured to obtain component dangerous data flow information based on component vulnerability information of the target candidate component and component source code of the software to be tested; an extraction module configured to extract the dangerous data flow information to obtain data flow node information; and a processing module configured to determine whether the software to be tested has a vulnerability based on the called function and the data flow node information; wherein the danger information generation module is further configured to extract the component vulnerability information of the target candidate component to obtain vulnerability exploit points, and to analyze the component source code of the target candidate component and the component source code of the software to be tested based on the vulnerability exploit points to obtain the component dangerous data flow information, wherein the component dangerous data flow information includes one or more of the following: a source point, a propagation path, and an outbreak point of dangerous data; and wherein the processing module is further configured to determine whether the called function and the data flow node information match; if they match, the software to be tested has a vulnerability; if they do not match, the software to be tested does not have a vulnerability.

6. An electronic device, comprising: a processor; and a memory configured to store executable instructions of the processor; wherein the processor is configured to execute the executable instructions to implement the software detection method of any one of claims 1-4.

7. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the software detection method of any one of claims 1-4.