Chip searching method and system based on knowledge graph and storage medium

By constructing a multi-dimensional chip knowledge graph, the invention addresses the difficulty of chip search, enabling fast and accurate chip search while reducing cost and improving efficiency.

CN117453758B · Active · Publication Date: 2026-06-26 · JIANGSU JICUI IC APPL TECH MANAGEMENT CO LTD +2

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
JIANGSU JICUI IC APPL TECH MANAGEMENT CO LTD
Filing Date
2023-11-16
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing chip search methods rely on experienced engineers and are inefficient, costly, and ineffective, making it difficult to accurately find the right chip model when coverage of original chip manufacturers and distributors is insufficient.

Method used

A multi-dimensional chip knowledge graph is constructed by crawling chip model, manufacturer, and application field information, performing structured processing, and building a knowledge graph. Combined with natural language processing and machine learning algorithms, a multi-dimensional chip search method is realized.

Benefits of technology

It enables rapid and accurate chip locating from multiple dimensions, reducing repetitive searching work for engineers, lowering costs, and improving search efficiency and effectiveness.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a chip search method, system, and storage medium based on a knowledge graph. The method includes: crawling three kinds of entity information (chip model, original chip manufacturer, and application field) to obtain a data source; structuring the unstructured information in the data source and inserting the structured information into the knowledge graph; constructing a chip knowledge graph database; integrating, parsing, and translating the user's query requirements; and returning the query result from the chip knowledge graph database to the user. The system includes a crawler module, a data processing module, a chip knowledge graph, a requirement acquisition module, a search module, and an output module. Based on knowledge graph technology, the application collects chip, manufacturer, and application field information into a graph so that the user can search for the required chip from multiple dimensions. In particular, when product information is incomplete, a targeted search can still be carried out, with high search efficiency and accurate results, which is conducive to the popularization and application of semiconductor chips.

Description

Technical Field

[0001] This invention relates to the field of big data search and mining, specifically to a chip search method, system, and storage medium based on knowledge graphs.

Background Technology

[0002] Integrated circuits are the "heart" of electronic information products, and are widely used in consumer electronics, computers, network communications, automotive electronics, the Internet of Things, cloud computing, energy conservation and environmental protection, high-end equipment, medical electronics, and other fields. With the increasing specialization of industry, the integrated circuit industry can be divided into sub-sectors such as integrated circuit design, integrated circuit manufacturing, integrated circuit packaging and testing, integrated circuit equipment manufacturing, and integrated circuit materials.

[0003] China is a major consumer of memory chips, consuming nearly 50% of the world's memory production capacity. With a massive chip supply, the number of similar component models and manufacturers on the market keeps increasing. Although there are more websites and channels for finding chips, selecting and using chips is becoming increasingly difficult.

[0004] Existing chip search methods mainly rely on experienced engineers selecting chips from well-known domestic and international agent websites, which has three problems: (1) products cannot be selected because their parameters in the website inventory are incomplete or they are not in the inventory at all; (2) some chip manufacturers have no cooperation agreement with the well-known agent websites, so their products are unavailable for selection; (3) chip search can only be conducted through chip parameters. As the number of semiconductor-related companies in China increases and their product lines grow, it is becoming more difficult to select suitable chip products. As a result, many engineers cannot find suitable chip models through these websites and must identify chip manufacturers themselves based on their needs and then check chip information on the manufacturers' websites, which is inefficient, costly, and ineffective.

Summary of the Invention

[0005] Purpose of the invention: To address the difficulties in classifying and searching chip data caused by the large number of chip manufacturers, products, and incomplete aggregation by agent websites, this invention constructs a multi-dimensional chip knowledge graph and proposes a new chip search method based on this knowledge graph. Furthermore, this invention proposes a system and storage medium for executing this chip search method.

[0006] In a first aspect, the present invention proposes a chip lookup method based on knowledge graphs, comprising the following steps:

[0007] (1) Crawl three types of entity information: chip model, chip manufacturer, and application field to obtain data source;

[0008] (2) Structure the unstructured information in the data source and insert the structured information into the knowledge graph;

[0009] (3) Construct a chip knowledge graph based on the relationship between the three entities: chip model, application field, and chip manufacturer, and / or the attribute relationship within the three entities;

[0010] (4) Based on the user's query requirements, integrate, parse, and translate the requirements;

[0011] (5) The chip knowledge graph database returns the query results to the user.

[0012] In one embodiment, the unstructured information is processed using natural language processing methods.

[0013] In one embodiment, the method further includes steps for maintaining and updating the knowledge graph database in an automatic and / or manual manner.

[0014] In one embodiment, if neither the chip model nor a pin-to-pin similar chip model is known, the user inputs specific application requirements and a small number of chip requirements, where "a small number" means two or fewer chip requirements. First, the application category is identified by the application category recognition algorithm. Then, based on the application category and the chip requirements, the shortest distance between the corresponding application field and the chip entity attributes is searched in the chip knowledge graph database, and a list of chip models is output, sorted by price.

[0015] In one embodiment, if neither the chip model nor a pin-to-pin similar chip model is known, the user inputs several specific chip requirements, the chip entity attributes are searched in the chip knowledge graph database, and similarity is calculated as follows:

[0016] Suppose the specific chip requirement has n chip attributes: $R_1, R_2, R_3, \ldots, R_n$;

[0017] For each chip attribute with a numerical parameter, the system searches the chip knowledge graph database for chips whose parameter value is greater than or equal to the required value.

[0018] If an exactly equal value is found, the similarity is recorded as $S_i = 100\%$;

[0019] If a greater value is found, the similarity is recorded as $S_i = 90\%$ and the deviation value as $dev_i = R_I - R_i$, where $R_I$ is the parameter value found and $R_i$ is the parameter value input by the user;

[0020] If no corresponding attribute is found, the similarity is recorded as $S_i = 0\%$;

[0021] Iterate over the parameter values of the n chip attributes, calculate the mean similarity S and the total deviation value dev of each chip, and sort the chips by similarity in descending order; chips with the same similarity are sorted by deviation value in ascending order, and the chip list is output.

[0022] The formulas for calculating the mean similarity S and the total deviation value dev are as follows:

[0023] $S = \frac{1}{n} \sum_{i=1}^{n} S_i$

[0024] $dev = \sum_{i=1}^{n} dev_i$
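The scoring rule in paragraphs [0016] to [0022] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mean similarity is taken as the arithmetic mean of the per-attribute scores, the total deviation as the plain sum of the per-attribute deviations, and the chip data is a hypothetical in-memory dictionary rather than a graph database. The names `score_attribute` and `rank_chips` are invented for this sketch.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for chip entities: attribute name -> numeric value.
Chip = Dict[str, float]


def score_attribute(required: float, found: Optional[float]) -> Tuple[float, float]:
    """Return (S_i, dev_i) for one numeric attribute."""
    if found is None or found < required:
        return 0.0, 0.0               # attribute missing or below requirement
    if found == required:
        return 1.0, 0.0               # exact match: S_i = 100%
    return 0.9, found - required      # larger value: S_i = 90%, dev_i = R_I - R_i


def rank_chips(requirements: Dict[str, float],
               chips: Dict[str, Chip]) -> List[Tuple[str, float, float]]:
    """Rank chips by mean similarity (descending), then total deviation (ascending)."""
    n = len(requirements)
    if n == 0:
        raise ValueError("at least one chip requirement is needed")
    rows = []
    for name, attrs in chips.items():
        scores = [score_attribute(req, attrs.get(key))
                  for key, req in requirements.items()]
        mean_s = sum(s for s, _ in scores) / n      # assumed: arithmetic mean
        total_dev = sum(d for _, d in scores)       # assumed: plain sum
        rows.append((name, mean_s, total_dev))
    rows.sort(key=lambda row: (-row[1], row[2]))
    return rows


# Hypothetical example data (not from the patent).
requirements = {"flash_kb": 64, "ram_kb": 16}
chips = {
    "A": {"flash_kb": 64, "ram_kb": 16},
    "B": {"flash_kb": 128, "ram_kb": 16},
    "C": {"flash_kb": 64, "ram_kb": 32},
    "D": {"flash_kb": 32, "ram_kb": 16},
}
ranking = rank_chips(requirements, chips)
```

In this example chips B and C tie on similarity, so the smaller total deviation decides their order, which is the tie-break the text describes.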

[0025] Furthermore, if the chip model and pin-to-pin similar chip models are unknown, the user inputs specific chip requirements and specific application requirements. First, the application category is identified based on the application category recognition algorithm. Then, the corresponding application field is searched in the chip knowledge graph database, similarity is calculated, and the chip list is output after sorting.

[0026] Furthermore, if a suitable chip cannot be found, the application field is identified by the application category recognition algorithm. Then, all chip manufacturer entities corresponding to that application field are searched in the chip knowledge graph database, the centrality of each chip manufacturer $v_j$ is calculated, and detailed information for each manufacturer is output in descending order of centrality. The formula for calculating the centrality is as follows:

[0027]

[0028] Where $A_j$ represents the number of other nodes in the subnet that are connected to chip manufacturer $v_j$, and N represents the number of all nodes in the network containing chip manufacturer $v_j$. The higher the centrality, the more important the chip manufacturer is in the knowledge graph of domestic chips.
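As a rough illustration of the manufacturer ranking, the sketch below computes a degree-style centrality on a small in-memory graph. The exact formula is not reproduced in this text, so the sketch assumes the conventional normalized degree centrality, $A_j / (N - 1)$; the patent's own normalization may differ. The helper names and the sample graph are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_undirected(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list."""
    graph: Dict[str, Set[str]] = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph: Dict[str, Set[str]], node: str) -> float:
    """Assumed formula: A_j / (N - 1), the usual normalized degree centrality."""
    n = len(graph)
    if n <= 1:
        return 0.0
    return len(graph[node]) / (n - 1)


def rank_manufacturers(graph: Dict[str, Set[str]],
                       manufacturers: Iterable[str]) -> List[Tuple[str, float]]:
    """Sort manufacturers by centrality, largest first."""
    ranked = [(m, degree_centrality(graph, m)) for m in manufacturers]
    ranked.sort(key=lambda item: -item[1])
    return ranked


# Hypothetical subnet for one application field.
graph = build_undirected([
    ("MakerA", "chip1"), ("MakerA", "chip2"), ("MakerB", "chip3"),
    ("chip1", "motor_control"), ("chip2", "motor_control"),
    ("chip3", "motor_control"),
])
ranked = rank_manufacturers(graph, ["MakerB", "MakerA"])
```

Here MakerA is linked to two chips and MakerB to one, so MakerA is listed first.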

[0029] In a second aspect, the present invention proposes a chip lookup system for performing the chip lookup method, comprising:

[0030] The crawler module is used to crawl three types of entity information: chip model, chip manufacturer, and application field to obtain data sources.

[0031] The data processing module, connected to the crawler module, is used to process the data source;

[0032] A chip knowledge graph is connected to a data processing module. Based on the data source, it constructs relationships between three entities: chip model, application field, and chip manufacturer, and / or attribute relationships within these three entities.

[0033] The demand acquisition module is used to collect users' chip search requirements;

[0034] The search module is used to parse the user's query requirements and translate the parsed query requirements into query commands for graph search; the user's query requirements are one or more of the following: application field, chip attribute, chip model, similar chips, and chip manufacturer; the application field information is identified by the application category recognition algorithm built into the search module.

[0035] The output module is used to return the graph query results to the user.

[0036] Preferably, the application category recognition algorithm is built based on the BERT model:

[0037] First, based on the crawled text data, a word segmentation tool is used to segment the specific application descriptions and the text, and word frequency statistics are computed to obtain one or more application categories. For each application category, the frequency with which that category's word combinations occur in a text is determined; if the frequency is greater than a threshold μ, the text is considered to belong to that application category, which completes the initial data screening. After manual fine screening, the training data is constructed, converted into the input format of the BERT model, and used for training to obtain a classification model with the expected accuracy.

[0038] In a third aspect, the present invention provides a computer-readable storage medium storing at least one executable instruction, which, when executed on an electronic device, causes the electronic device to perform the chip lookup method.

[0039] Compared with existing methods, the present invention has the following advantages:

[0040] (1) The chip manufacturer information and product information are collected into the knowledge graph, and a multi-layer chip knowledge graph structure is constructed to meet the needs of searching for chips individually or in combination through multiple query dimensions (chip attributes, chip model, similar chips, application fields, chip manufacturers, etc.);

[0041] (2) When chip parameters are incomplete or not uploaded, it can list all original manufacturer information according to specific application requirements, and then directly search for the required chip on the relevant official website, thus achieving targeted search and saving engineers the time and difficulty of searching for chip manufacturers in the corresponding field one by one.

[0042] (3) The join query for traditional relational tables is transformed into a path query for point and edge graphs, resulting in faster query speed;

[0043] (4) It adopts an automatic information extraction and insertion mode, and provides interfaces and room for continued development and expansion.

Attached Figure Description

[0044] Figure 1 is a flowchart of a chip search method based on a knowledge graph according to an embodiment of the present invention;

[0045] Figure 2 shows the structure of a chip knowledge graph database according to an embodiment of the present invention.

Detailed Implementation

[0046] To make the above-mentioned objectives, features, and advantages of this application more apparent and understandable, the specific embodiments of this application are described in detail below with reference to the accompanying drawings. Many specific details are set forth in the following description to provide a thorough understanding of this application. However, this application can be implemented in many other ways different from those described herein, and those skilled in the art can make similar modifications without departing from the spirit of this application. Therefore, this application is not limited to the specific embodiments disclosed below.

[0047] Knowledge graphs are essentially knowledge bases of semantic networks. They can be simply understood as multi-relationship graphs or as a knowledge base. This is why they can be used to answer search-related questions. When we perform a search, we can directly obtain the final answer through keyword extraction and matching on the knowledge base. This search method differs from traditional search engines, which return web pages rather than the final answer, thus adding an extra layer of information filtering and selection for the user.

[0048] I. Construction of Knowledge Graph

[0049] Applying a knowledge graph first requires constructing it. The chip knowledge graph can be constructed as a property graph. The embodiment shown in Figure 2 is a chip knowledge graph with a three-layer entity architecture. Specifically for the chip search domain, it constructs the relationships between three entities (application field, chip model, and original manufacturer) as well as the attribute relationships within these three entities.

[0050] Specifically, the application areas mainly include: film and television, data computing, mobile communication, network communication, wireless communication, camera equipment, audio equipment, game consoles, medical sensing equipment, drug testing, artificial intelligence in medicine, intelligent transportation, smart homes, smart cities, robotics, etc. Each area has corresponding chip application modules, and each area may have its own technical classifications, reflected in the relationship between specific applications and chips. Chips, in turn, have many general attributes, including but not limited to: interfaces, packaging, processors, peripherals, connectivity, price, and pin-to-pin compatibility. The association between a chip and the application areas above can be one-to-one or one-to-many.

[0051] Secondly, the original manufacturer also possesses many common attributes, including but not limited to: official website, products, and customers, which can further establish a subordinate relationship between the chip and the original manufacturer.

[0052] It's also possible to link chips with application companies through the relationship between the original manufacturer and the end-user company. Common attributes of end-user companies include, but are not limited to: website, products, registered capital, and establishment date.

[0053] Chip application modules (products) may occasionally have a subordinate relationship with specific fields.
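To make the three-layer structure concrete, the following is a minimal in-memory property graph sketch. It is not the patent's storage layer (which is a graph database); the node labels, the relationship names `USED_IN` and `MADE_BY`, and the sample chip "X100" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    label: str                      # "Application", "Chip" or "Manufacturer"
    name: str
    props: Dict[str, object] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: List[Tuple[str, str, str]] = field(default_factory=list)  # (src, rel, dst)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def add_edge(self, src: str, rel: str, dst: str) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((src, rel, dst))

    def neighbors(self, name: str, rel: str) -> List[str]:
        """Names of nodes reached from `name` over relationship `rel`."""
        return [dst for src, r, dst in self.edges if src == name and r == rel]


g = PropertyGraph()
g.add_node(Node("Chip", "X100", {"package": "LQFP64", "price": 1.2}))
g.add_node(Node("Manufacturer", "Company A", {"website": "https://example.com"}))
g.add_node(Node("Application", "smart home"))
g.add_node(Node("Application", "robotics"))
g.add_edge("X100", "MADE_BY", "Company A")   # chip belongs to its manufacturer
g.add_edge("X100", "USED_IN", "smart home")  # one chip, several application areas
g.add_edge("X100", "USED_IN", "robotics")
```

Attributes such as package and price live on the node itself, while the chip-to-application link is a one-to-many relationship, matching the description above.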

[0054] II. Data Mining and Filling Based on Web Crawler Technology and Knowledge Extraction

[0055] The prerequisite for building a knowledge graph is to extract data from different data sources. These data sources mainly come from two channels: one is structured data from chip distributor websites; the other is publicly available or scraped data from the internet. The former generally only requires simple preprocessing before it can be used as input for subsequent AI systems, while the latter data usually exists in the form of web pages and is therefore unstructured. It generally requires the use of technologies such as natural language processing to extract structured information.

[0056] Therefore, preferably, the data source in this invention is crawled along three directions: chip model, chip manufacturer, and application field. Specifically, crawler technology is used to collect relevant information on chip models, chip manufacturers, and application fields from existing chip agent websites, chip manufacturer websites, and public networks. Then, based on the keywords of the entity attributes in the knowledge graph structure designed above, natural language processing techniques such as named entity recognition, relation extraction, entity unification, and coreference resolution are used to structure the unstructured information. Finally, the structured information is inserted using the Cypher language. In addition to filling the graph with crawled data, the knowledge graph database can be continuously maintained and updated automatically or manually. As the database grows and improves, the query results become more accurate and easier to obtain.
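The insertion step can be pictured as turning each structured record into a parameterized Cypher statement. The sketch below only builds the statement and its parameters; it does not connect to a database. The labels `Chip`, `Manufacturer`, `Application`, the relationship types, and the record fields are assumptions made for this example, since the patent does not name them.

```python
from typing import Dict, Tuple

# Hypothetical labels and relationship types; MERGE avoids duplicate nodes
# when the same chip or manufacturer is crawled more than once.
INSERT_CHIP = """
MERGE (c:Chip {model: $model})
SET c += $attrs
MERGE (m:Manufacturer {name: $manufacturer})
MERGE (a:Application {name: $application})
MERGE (c)-[:MADE_BY]->(m)
MERGE (c)-[:USED_IN]->(a)
""".strip()


def build_insert(record: Dict[str, object]) -> Tuple[str, Dict[str, object]]:
    """Turn one structured record into (cypher_text, parameters)."""
    core = {"model", "manufacturer", "application"}
    missing = core - set(record)
    if missing:
        raise ValueError(f"record is missing fields: {sorted(missing)}")
    params = {
        "model": record["model"],
        "manufacturer": record["manufacturer"],
        "application": record["application"],
        # Every remaining key becomes a chip property (package, price, ...).
        "attrs": {k: v for k, v in record.items() if k not in core},
    }
    return INSERT_CHIP, params


# Hypothetical record, as it might look after the NLP structuring step.
query, params = build_insert({
    "model": "X100",
    "manufacturer": "Company A",
    "application": "smart home",
    "package": "LQFP64",
    "price": 1.2,
})
```

Passing values as parameters rather than formatting them into the query text keeps crawled strings from being interpreted as Cypher.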

[0057] III. Chip lookup based on the constructed knowledge graph

[0058] The chip search function is user-oriented and includes two parts: data acquisition and data search.

[0059] Data Acquisition: Collect user input regarding application areas, chip attributes, chip models, similar chips, and chip manufacturers' requirements, and automatically extract and insert the required information.

[0060] Data search: Based on one or more search dimensions such as chip attributes (including chip parameters), chip model, similar chips, application field, and chip manufacturer, the system analyzes and combines user needs, then translates the analyzed specific requirements into Cypher query commands for graph lookup. There are three specific search scenarios, as follows:

[0061] 1) Known chip model:

[0062] The Cypher command can be used to directly query the chip entity's name and output the query results.

[0063] 2) Known pin-to-pin similar chip model:

[0064] Related chips are found based on chip attributes, for example to find the lowest-priced chip: first, a Cypher command matches the known similar chip model against the chip entities to find all pin-to-pin compatible chips, and the results are then sorted and output from low to high price.

[0065] 3) Neither the chip model nor a pin-to-pin similar chip model is known, and the user has only partial requirements. This covers most chip search scenarios. Depending on the type of user requirement, it is divided into the following categories:

[0066] (1) The user's requirement type is a specific application requirement plus a small number of chip requirements ("a small number" means two or fewer chip requirements): the user inputs a description of the specific requirement, and the application category is identified by the application category recognition algorithm (the category matches a specific application in the knowledge graph). After the category is output, a Cypher command finds the shortest distance between the specific application entity (for example "Module 1") and the chip requirement attribute (for example "Processor: 68000"), and the chip model list is output sorted by price.

[0067] (2) The user's requirement type is a specific chip requirement. Based on the requirement attribute in the chip entity, Cypher command search is performed, and similarity is calculated according to the following strategy:

[0068] Suppose the specific chip requirement lists n chip attributes, $R_1, R_2, R_3, \ldots, R_n$, of which M have numerical parameters. A graph search command is used to find chips whose parameter values are greater than or equal to these numerical values.

[0069] If an exactly matching value is found, the similarity is recorded as $S_i = 100\%$; if a chip with a parameter value greater than the required value is found, the similarity is recorded as $S_i = 90\%$ and the deviation value as $dev_i = R_I - R_i$, where $R_I$ is the parameter value found and $R_i$ is the value required by the user; if no corresponding attribute is found (non-numeric attributes, or chip parameter values less than the required values), the similarity is recorded as $S_i = 0\%$. Each attribute $R_i$ is evaluated in this way, the total similarity S and total deviation value dev of each output chip are calculated, and the chips are sorted by similarity in descending order; chips with the same similarity are sorted by deviation value in ascending order, and the chip list is output.

[0070] The formulas for calculating the total similarity S and the total deviation value dev are as follows:

[0071]

[0072]

[0073] (3) The user's requirement type is both specific chip requirements and specific application requirements: combining the search methods of (1) and (2) above, the system first identifies the application category, then searches with a Cypher command using the specific application entity together with the chip requirements, and finally applies the similarity strategy to calculate, sort, and output a chip list.

[0074] (4) If a suitable chip cannot be found because the chip manufacturer's online information is incomplete or for other reasons, then after the application category is identified from the requirements, all chip manufacturer entities corresponding to this specific application category are searched in the chip knowledge graph, the centrality of each chip manufacturer $v_j$ is calculated, and detailed information about each chip manufacturer is output in descending order of centrality to facilitate further communication and chip selection by engineers. The centrality is calculated as follows:

[0075]

[0076] Where $A_j$ represents the number of other nodes in the subnet that are connected to chip manufacturer $v_j$, and N represents the number of all nodes in the network containing chip manufacturer $v_j$. The higher the centrality, the more important the chip manufacturer is in the knowledge graph of domestic chips.
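The search scenarios described in this part can be summarized as a small dispatcher that picks a strategy from what the user supplied. This is a sketch of one possible reading, not the patent's implementation: the strategy names are invented labels, and using the count of chip requirements (two or fewer versus more) to separate the shortest-path case from the combined case is an assumption drawn from the "small number" wording.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Query:
    chip_model: Optional[str] = None          # scenario 1: model known
    similar_to: Optional[str] = None          # scenario 2: pin-to-pin model known
    application_text: Optional[str] = None    # free-text application description
    chip_requirements: Dict[str, float] = field(default_factory=dict)


def choose_strategy(q: Query, previous_result_empty: bool = False) -> str:
    """Pick a search strategy; all strategy names are hypothetical labels."""
    if q.chip_model:
        return "lookup_by_model"
    if q.similar_to:
        return "pin_to_pin_sorted_by_price"
    has_app = bool(q.application_text)
    n_req = len(q.chip_requirements)
    if has_app and (previous_result_empty or n_req == 0):
        return "manufacturers_by_centrality"          # fallback: rank manufacturers
    if has_app and n_req <= 2:
        return "shortest_path_sorted_by_price"        # application + few requirements
    if has_app:
        return "application_filter_then_similarity"   # application + requirements
    if n_req > 0:
        return "attribute_similarity"                 # chip requirements only
    raise ValueError("the query contains no usable search dimension")


s1 = choose_strategy(Query(chip_model="X100"))
s2 = choose_strategy(Query(application_text="motor drive", chip_requirements={"pins": 64}))
s3 = choose_strategy(Query(chip_requirements={"pins": 64, "flash_kb": 128, "ram_kb": 32}))
s4 = choose_strategy(Query(application_text="motor drive",
                           chip_requirements={"pins": 64}),
                     previous_result_empty=True)
```

Keeping the choice of strategy separate from query execution makes it easy to add further search dimensions later without touching the graph queries themselves.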

[0077] According to the chip search method proposed in this invention, users can query results from multiple dimensions such as application field, chip attributes, chip model, similar chips, and chip manufacturer. The knowledge graph database can be Neo4j, and the query language can be Cypher. If the chip model cannot be accurately located by chip attributes, users can first locate the company by the application field of the chip, then visit the official website of each company for detailed searching, and finally contact the manufacturer to inquire about unpublished chip information.

[0078] In addition to simple queries based on entity nodes and attribute relationships in the knowledge graph, this invention can also satisfy various types of query needs by utilizing knowledge graph query functions such as degree centrality, betweenness centrality, closeness centrality, and shortest distance.

[0079] The above data insertion and querying will be executed by the backend using commands generated by the Cypher language through page input (or checkboxes). For example, based on the shortest distance between "Industry" and "Company A", the system can locate all chip products developed by Company A in the industrial field, and based on the retrieved chip products, it can locate application terminals and other information. The command execution format is as follows:

[0080] MATCH (GongYe:application {name: "industrial"}), (CompanyA:`original factory`

[0081] {name: "Company A"}), p = allShortestPaths((GongYe)-[*]-(CompanyA))

[0082] RETURN p
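To show what an all-shortest-paths query returns, the sketch below enumerates every shortest path between two nodes of a small undirected graph using breadth-first search. It runs in memory and does not use Neo4j; the sample graph, node names, and helper names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def build_undirected(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list."""
    adj: Dict[str, Set[str]] = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def all_shortest_paths(adj: Dict[str, Set[str]], start: str, goal: str) -> List[List[str]]:
    """Return every shortest path from start to goal (empty list if unreachable)."""
    if start == goal:
        return [[start]]
    dist = {start: 0}
    parents: Dict[str, List[str]] = {start: []}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break  # all predecessors of goal were already expanded
        for nxt in sorted(adj.get(node, ())):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                parents[nxt].append(node)   # another equally short route
    if goal not in parents:
        return []
    paths: List[List[str]] = []

    def walk(node: str, suffix: List[str]) -> None:
        # Follow parent links back to the start, collecting the path.
        if node == start:
            paths.append([start] + suffix)
            return
        for parent in parents[node]:
            walk(parent, [node] + suffix)

    walk(goal, [])
    return paths


adj = build_undirected([
    ("industrial", "chip1"), ("industrial", "chip2"),
    ("chip1", "Company A"), ("chip2", "Company A"),
    ("chip3", "Company A"),          # chip3 is not linked to "industrial"
])
paths = all_shortest_paths(adj, "industrial", "Company A")
```

The intermediate nodes on the returned paths are exactly the chips that connect the application field to the manufacturer, which is how the query locates Company A's products in that field.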

[0083] Based on this, the present invention also proposes a chip lookup system for performing a chip lookup method, comprising:

[0084] The web crawler module is used to crawl information such as chip model, application field, and chip manufacturer to obtain data sources;

[0085] The data processing module, connected to the crawler module, is used to process the data source;

[0086] A chip knowledge graph is connected to a data processing module. Based on the data source, it constructs relationships between three entities: chip model, application field, and chip manufacturer, and / or attribute relationships within these three entities.

[0087] The demand acquisition module is used to collect users' chip search requirements;

[0088] The search module is used to parse the user's query requirements and translate the parsed query requirements into query commands for graph search. The user's query requirements can be one or more of the following: application field, chip attribute, chip model, similar chips, and chip manufacturer. The application field needs to be identified by the application category recognition algorithm built into the search module.

[0089] Preferably, the application category recognition algorithm can be built based on the BERT model.

[0090] In the BERT model training phase, the text data comes from crawling specific websites for chip entities and company entities, and from crawling the public web for specific applications. A word segmentation tool such as jieba is used to segment the specific application descriptions and the text, and word frequency statistics are computed to obtain one or more application categories. For each application category, the frequency with which that category's word combinations occur in a text is determined; if the frequency is greater than a threshold μ, the text is considered to belong to that application category, which completes the initial data screening. A small amount of training data (around 1000 pieces) is then produced by manual refinement, completing the training data construction. Finally, the training data is converted into the input format of the BERT model and used for training, resulting in a classification model with good accuracy.

[0091] Once the model is trained, input the text to be recognized, and it will identify and output the application category.
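The initial screening step that precedes BERT training can be sketched as a keyword-frequency filter. Several simplifications are assumed here: whitespace tokenization stands in for a segmentation tool such as jieba, "frequency" is read as the share of tokens that belong to a category's keyword list, and the keyword lists and threshold value are invented for the example. The BERT fine-tuning itself is not shown.

```python
from collections import Counter
from typing import Dict, List


def prescreen(text: str, category_keywords: Dict[str, List[str]], mu: float) -> List[str]:
    """Return the categories whose keyword frequency in `text` exceeds mu."""
    tokens = text.lower().split()        # assumption: whitespace tokenization
    if not tokens:
        return []
    counts = Counter(tokens)
    matched = []
    for category, keywords in category_keywords.items():
        # Share of tokens in the text that are keywords of this category.
        freq = sum(counts[k] for k in keywords) / len(tokens)
        if freq > mu:
            matched.append(category)
    return matched


# Hypothetical keyword lists and threshold.
keywords = {
    "smart_home": ["smart", "home"],
    "robotics": ["robot", "servo"],
}
labels = prescreen("smart home gateway with wifi for smart home lighting", keywords, mu=0.2)
```

Texts that pass this filter would still go through manual refinement before being used as labeled training examples, as the paragraph above describes.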

[0092] In addition, application categories can be identified through conventional algorithms such as machine learning and deep learning.

[0093] Furthermore, embodiments of the present invention also provide a non-transitory computer-readable storage medium for storing computer instructions that cause the computer to execute the methods provided in the above-described method embodiments.

[0094] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0095] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application should be determined by the appended claims.

Claims

1. A chip lookup method based on knowledge graphs, characterized in that it includes the following steps:
(1) Crawl three types of entity information: chip model, chip manufacturer, and application field to obtain the data source;
(2) Structure the unstructured information in the data source and insert the structured information into the knowledge graph;
(3) Construct a chip knowledge graph based on the relationship between the three entities: chip model, application field, and chip manufacturer, and / or the attribute relationship within the three entities;
(4) Based on the user's query requirements, integrate, parse, and translate the query; specifically:
(4-1) If neither the chip model nor a pin-to-pin similar chip model is known, the user inputs specific application requirements and a small number of chip requirements, where a small number of chip requirements means two or fewer chip requirements. First, the application category is identified by the application category identification algorithm. Based on the application category and the chip requirements, the shortest distance between the corresponding application field and the chip entity attributes is searched in the chip knowledge graph database, and the chip model list is output sorted by price.
(4-2) If neither the chip model nor a pin-to-pin similar chip model is known, the user inputs several specific chip requirements, the chip entity attributes are searched in the chip knowledge graph database, and similarity is calculated as follows: Suppose the specific chip requirement has n chip attributes: $R_1, R_2, R_3, \ldots, R_n$. For each chip attribute with a numerical parameter, the chip knowledge graph database is searched for chips whose parameter value is greater than or equal to the required value. Specifically: if an exactly equal value is found, the similarity is recorded as $S_i = 100\%$; if a greater value is found, the similarity is recorded as $S_i = 90\%$ and the deviation value as $dev_i = R_I - R_i$, where $R_I$ is the parameter value found and $R_i$ is the parameter value input by the user; if no corresponding attribute is found, the similarity is recorded as $S_i = 0\%$. Iterate over the parameter values of the n chip attributes and calculate the mean similarity S and the total deviation value dev of each chip. Sort the chips by similarity in descending order; chips with the same similarity are sorted by deviation value in ascending order, and the chip list is output. The mean similarity S and the total deviation value dev are calculated as follows: [formulas omitted in source];
(4-3) If neither the chip model nor a pin-to-pin similar chip model is known, the user inputs specific chip requirements and specific application requirements. First, the application category is identified by the application category identification algorithm. The corresponding application field is searched in the chip knowledge graph database, the similarity is calculated, and the chip list is output after sorting.
(4-4) If a suitable chip cannot be found, the application field is identified by the application category recognition algorithm. Then, all chip manufacturer entities corresponding to the application field are searched in the chip knowledge graph database, the centrality of each chip manufacturer $v_j$ is calculated, and detailed information for each manufacturer is output in descending order of centrality. The centrality is calculated as follows: [formula omitted in source]; where $A_j$ represents the number of other nodes in the subnet that are connected to chip manufacturer $v_j$, and N represents the number of all nodes in the network containing chip manufacturer $v_j$; the greater this centrality, the more important the chip manufacturer is in the knowledge graph of domestic chips.
(5) The chip knowledge graph database returns the query results to the user.

2. The chip search method according to claim 1, characterized in that, in step (2), the unstructured information is processed using natural language processing methods.

3. The chip search method according to claim 1, characterized in that it further includes a step of maintaining and updating the knowledge graph database automatically and / or manually.

4. A chip lookup system for performing the chip lookup method as described in any one of claims 1-3, characterized in that it includes: a crawler module, used to crawl three types of entity information (chip model, chip manufacturer, and application field) to obtain data sources; a data processing module, connected to the crawler module and used to process the data source; a chip knowledge graph, connected to the data processing module, which, based on the data source, constructs relationships between the three entities (chip model, application field, and chip manufacturer) and / or attribute relationships within these three entities; a demand acquisition module, used to collect users' chip search requirements; a search module, used to parse the user's query requirements and translate the parsed query requirements into query commands for graph search, where the user's query requirements are one or more of the following: application field, chip attribute, chip model, similar chips, and chip manufacturer, and the application field information is identified by the application category recognition algorithm built into the search module; and an output module, used to return the graph query results to the user.

5. The chip lookup system according to claim 4, characterized in that the application category identification algorithm is based on the BERT model. First, based on the crawled text data, a word segmentation tool is used to segment the specific application descriptions and the text, and word frequency statistics are computed to obtain one or more application categories. For each application category, the frequency with which that category's word combinations occur in a text is determined; if the frequency is greater than a threshold μ, the text is considered to belong to that application category, which completes the initial data screening. After manual fine screening, the training data is constructed, converted into the input format of the BERT model, and used for training to obtain a classification model with the expected accuracy.

6. A computer-readable storage medium, characterized in that the storage medium stores at least one executable instruction, which, when executed on an electronic device, causes the electronic device to perform the chip lookup method as described in any one of claims 1-3.