Product matching master generation device, product matching master generation method, and program

The product master generation device and method address the inefficiencies in product data transmission by automatically aligning and integrating data across organizations, enhancing data consistency and reducing errors.

JP7875578B2 · Active · Publication Date: 2026-06-18 · LAZULI CO LTD

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Patents
Current Assignee / Owner
LAZULI CO LTD
Filing Date
2021-07-16
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

Conventional methods of transmitting product information between organizations are time-consuming and laborious, often leading to transcription errors and data inconsistencies due to varying notations and formats, especially when handling multiple products across different organizations.

Method used

A product master generation device and method that utilizes pattern generation, vector information creation, similarity calculation, and integration to automatically consolidate and align product data, determining identical products across different organizations.

Benefits of technology

Automatically consolidates and aligns product data, reducing the time and effort required for data integration and transmission, while minimizing errors and inconsistencies.

Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

To provide a name identification product master generation device, a generation method thereof, and a program that automatically identify names and collect a plurality of pieces of product data. SOLUTION: In a product name identification system 1, a name identification product master generation device 10 includes: a pattern generation unit that generates a plurality of product identification latent patterns for each of a plurality of pieces of product data, based on product identification latent information including at least a product name included in the product data; a vector information generation unit that generates vector information for each product identification latent pattern for each piece of product data; a same product determination unit having a similarity calculation unit that calculates the degree of similarity between two products for each product identification latent pattern based on the vector information of the product data on the two products, and a determination unit that determines whether the two products are the same based on the degree of similarity; and an integration generation unit that integrates, for each category of product information, the product information included in the two pieces of product data that are determined to be the same by the determination unit, and generates a name identification product master, which is a product master of the same product. SELECTED DRAWING: Figure 2

Description

【Technical Field】 【0001】 The present invention relates to a matching product master generation device, a method for generating a matching product master, and a program. 【Background Art】 【0002】 Goods are transported and distributed through commodity transactions between multiple organizations, such as between a manufacturer and a wholesaler, between a wholesaler and a retailer, or between multiple departments within a single company. Within each organization, product data including product information is managed using spreadsheet software or the like. Commodity transactions involve not only the transportation of goods but also the transmission of product data between organizations. Conventionally, product data is sent by email to the business partner's organization in order to convey the product information of the goods involved in the transaction. 【Prior Art Documents】 【Patent Documents】 【0003】 【Patent Document 1】 Japanese Patent No. 6427850 【Summary of the Invention】 【Problems to be Solved by the Invention】 【0004】 However, the conventional method of transmitting product information described above made handling product information time-consuming and laborious. For example, even for the same product, product data could vary in notation or format from organization to organization. Therefore, in order to use product information contained in product data obtained from another organization, it was sometimes necessary to re-enter the data into one's own organization's spreadsheet software and manage it there. In such cases, there was a risk of transcription errors or omissions of data that should have been entered. Furthermore, this data preparation work had to be repeated every time the product information contained in the product data was updated. 
In addition, one organization may handle multiple products with multiple organizations, such as between manufacturers and wholesalers, between wholesalers and retailers, or between multiple departments within a single company, in which case the above-mentioned data preparation work becomes enormous. Thus, the transmission and preparation of product information had the problem of being time-consuming and laborious. 【0005】 The present invention was made to solve these problems, and one of its objectives is to provide a product master generation device, a method for generating a product master, and a program that can automatically consolidate multiple product data, thereby saving the time and effort required to integrate product data between different organizations and facilitating the transmission of product information. [Means for solving the problem] 【0006】 A matching product master generation device according to one aspect of the present invention is characterized by comprising: a pattern generation unit that generates a plurality of product identification latent patterns for each of the plurality of product data based on product identification latent information including at least a product name contained in the product data; a vector information generation unit that generates vector information for each of the product identification latent patterns for each of the product data; a similarity calculation unit that calculates the similarity of two products for each of the product identification latent patterns based on the vector information of the product data of the two products; a determination unit that determines whether the two products are the same based on the similarity; and an integration generation unit that integrates the product information contained in the two product data determined to be the same by the determination unit for each category of the product information and generates a matching product master which is a product master for the same 
products. 【0007】 A method for generating a matching product master according to one aspect of the present invention is characterized by comprising: a pattern generation step of generating a plurality of product identification latent patterns for each of a plurality of product data based on product identification latent information including at least a product name contained in the product data; a vector information generation step of generating vector information for each of the product identification latent patterns for each of the product data; a similarity calculation step of calculating the similarity of two products for each of the product identification latent patterns based on the vector information of the product data of the two products; a determination step of determining whether the two products are identical based on the similarity; and an integration generation step of integrating the product information contained in the two product data determined to be identical in the determination step for each category of the product information to generate a matching product master which is a product master for the identical products. 【0008】 One aspect of the present invention is a program that causes a computer to execute the above method. [Effects of the Invention] 【0009】 According to the present invention, multiple product data can be automatically consolidated, thus eliminating the time and effort required to integrate product data between different organizations and facilitating the transmission of product information. [Brief explanation of the drawing] 【0010】 [Figure 1] This figure shows the hardware configuration of the product name matching system according to the embodiment. [Figure 2] This is a diagram showing the configuration of the product name matching system according to the embodiment. [Figure 3] This is an example of product data, specifically an example of web data on an e-commerce site. 
[Figure 4] This figure shows an example of the detailed configuration of the identical product determination unit. [Figure 5] This is a diagram illustrating the process for determining whether two products are identical. [Figure 6] This diagram illustrates an example where the identical product determination unit determines that products 0 to 3 are the same product. [Figure 7] This diagram illustrates an example where the identical product determination unit determines that products 0 and 4 to 6 are different products. [Figure 8] This diagram illustrates an example where the identical product determination unit determines that products 10 to 13, which are different from product 0, are the same product. [Figure 9] This diagram illustrates an example where the identical product determination unit determines that products 10 and 14 to 16 are different products. [Figure 10] This diagram illustrates an example of identical product determination for products 20 to 23 by the identical product determination unit. [Figure 11] This diagram illustrates an example of identical product determination for products 30 to 33 by the identical product determination unit. [Figure 12] This figure shows an example of the detailed configuration of the integrated generation unit. [Figure 13] This figure shows an example of a product name matching master. [Figure 14] This figure shows an example of the detailed configuration of the feature / evaluation information generation unit. [Figure 15] This is a diagram showing an example of the detailed configuration of the product graph generation unit. [Figure 16] This is a diagram showing an example of a product graph. [Figure 17] This is an example of the operation flowchart of the product name matching system according to the embodiment. [Figure 18] This is an example of the operation flowchart of the product name matching master generation device according to the embodiment. 
[Embodiments for Carrying Out the Invention] 【0011】 Hereinafter, embodiments of the product name matching system and the product name matching master generation device according to the present invention will be described in detail with reference to the drawings. In this specification, unnecessarily detailed explanations may be omitted for convenience. For example, detailed explanations of well-known matters and duplicate explanations of substantially the same configurations may be omitted. 【0012】 [1. Configuration] [1-1. Overview] The product name matching system of this embodiment acquires a plurality of product data, which are data related to products. When these product data relate to the same product, the product information included in the product data is classified and integrated for each category of the product information to generate a product name matching master (see Figure 13, described later). 【0013】 Product data is data related to products and includes, for example, product information such as product names, product specification information, and information indicating product features / evaluations. Product data may include product logistics information (e.g., inventory location, allowable days for shipping / arrival), transaction information (e.g., purchase unit price, selling unit price), customer information, and purchase information. The products targeted by the product data may be pharmaceuticals. 【0014】 Even for the same product, product data can vary and be inconsistent. For example, in the case of product names, some data may include the manufacturer's official product name, while other data may include only an abbreviation or common name instead. Similarly, in the case of product specifications, the units for product volume and size may vary. 
Thus, the product name matching system of this embodiment organizes and integrates the inconsistent product data for each product to generate a unified product master, which is a product name matching master. 【0015】 Furthermore, the product name matching system generates product feature / evaluation information from data indicating the characteristics of the product included in the product data. The product name matching system may also generate a product graph based on the feature / evaluation information. The feature / evaluation information is information that indicates the characteristics and / or evaluation of a product, and does not include product name or specification information. Details of the feature / evaluation information will be described later. The product graph is a graph that shows the relationships between products, and details will be described later. 【0016】 Such product name matching systems, product master lists, feature / evaluation information, and product graphs can be provided via cloud computing. 【0017】 [1-2. Hardware Configuration] Figure 1 shows the hardware configuration of the product name matching system according to this embodiment. The product name matching system 1 comprises a processor 1a, a storage device 1b, an input device 1c, an output device 1d, and a communication device 1e. Each of the components 1a to 1e is connected by a bus 1f. An interface may be interposed between the bus 1f and each of the components 1a to 1e as needed. The product name matching system 1 can be configured to include computers such as desktop computers, tablet computers, and notebook computers, and does not need to be composed of a single physical device, but may be composed of multiple physical devices. 【0018】 Processor 1a controls the operation of the entire product name matching system 1. Processor 1a is an electronic circuit such as a CPU or MPU. 
Processor 1a performs various processes by reading and executing programs and data stored in the memory device 1b. Processor 1a may be composed of multiple processors. 【0019】 The storage device 1b includes volatile memory RAM 1b-1 and non-volatile memory ROM 1b-2. The storage device 1b may also include external memory 1b-3. RAM 1b-1 functions as the main memory and / or workspace of the processor 1a. The processor 1a loads programs and other necessary data from ROM 1b-2 and external memory 1b-3 into RAM 1b-1 to perform various operations, and executes the loaded programs. ROM 1b-2 and external memory 1b-3 store the BIOS and OS, which are control programs for the processor 1a, as well as various programs, data, tables, etc., necessary to realize the functions executed by the computer. External memory 1b-3 can include, for example, flash memory, hard disk, DVD-RAM, USB memory, SSD, etc. 【0020】 Input device 1c receives operation instructions and input from the user, etc. Input device 1c is a user interface such as an input button, keyboard, mouse, touch panel, touchpad, wireless remote control, microphone, or camera. Note that a touch panel can function as both an input device 1c and an output device 1d. 【0021】 The output device 1d outputs data processed by the processor 1a and / or stored in the storage device 1b. The output device 1d may include, for example, display devices such as CRT displays, liquid crystal displays, organic EL displays, and plasma displays; sound devices such as speakers that emit sound; and printing devices such as printers. 【0022】 The communication device 1e is an interface that connects to and communicates with external devices via a network or directly. The communication device 1e can be, for example, a serial interface, a LAN interface, or the like. 【0023】 Each part of the product name matching system 1 is realized by various programs stored in ROM 1b-2 and external memory 1b-3 using each configuration 1a to 1f as resources. 【0024】 [1-3. 
Detailed Configuration] Figure 2 shows the configuration of the product name matching system 1 according to this embodiment. As shown in Figure 2, the product name matching system 1 includes a product data acquisition unit 20, a same product determination unit 30, an integrated generation unit 40, a feature / evaluation information generation unit 50, a product graph generation unit 60, and an information addition unit 70. The product name matching master generation device 10 may also include the same product determination unit 30 and the integrated generation unit 40. The product name matching master generation device 10 may also include the product data acquisition unit 20, and the hardware configuration of the product name matching master generation device 10 is the same as that of the product name matching system 1 shown in Figure 1, so a description is omitted. 【0025】 The product data acquisition unit 20 acquires multiple product data. Product data can include, for example, web data related to products such as HTML, and product master data used within the organization. Web data refers to internet data related to products, such as web pages related to the manufacturer's products on the manufacturer's website, or web pages related to products on e-commerce (EC) sites. 【0026】 In one example, the product data acquisition unit 20 uses a crawler program stored in the storage device 1b to traverse the internet and acquire web data related to products. This web data can be, for example, HTML data about products on e-commerce sites or HTML data about products on manufacturer websites. 【0027】 In another example, the product data acquisition unit 20 acquires product master data used by two or more organizations (for example, different companies or different departments within the same company). 
For example, the product data acquisition unit 20 may acquire the product master data from equipment owned by an organization via a communication device 1e, either wired or wirelessly, or it may acquire it from an external memory 1b-3 where the product master data is stored. 【0028】 In other examples, the product data acquisition unit 20 may acquire web data and product master data held by the organization as product data. In other words, the product data acquisition unit 20 can acquire any data related to products, not limited to open data accessible to unspecified persons such as web data, or closed data such as internal product master data within the organization. 【0029】 The product data acquisition unit 20 stores the acquired product data in the storage device 1b, and the storage device 1b can hold the acquired product data as a product data database. 【0030】 Product data is data related to a product, and includes product information such as product name, product specifications, and information indicating product features / evaluations. Product data may also include product identification codes such as JAN codes assigned to each product. Product names can include the product's name, abbreviation, or common name. Specifications vary depending on the type of product, but for example, if the product is bottled tea, it can include objective information about the product such as the volume, container type, size, weight, and number of bottles. Specifications may also include numerical information related to the product. Numerical information includes the number and its unit. Information indicating product features / evaluations can include product descriptions, product introductions, sales pitches, catchphrases, customer reviews, and consumer feedback on the product. One example is the product description included in web data related to a product. 【0031】 Product data includes product identification latent information. 
This product identification latent information may include information that can identify a product, as well as information that can potentially identify a product. For example, product identification latent information may include the product name. 【0032】 Figure 3 shows an example of product data, specifically an example of web data on an e-commerce site. In the example shown in Figure 3, the web data includes the product image G0, product identification latent information G1, specification information G2, and information indicating the product's features / evaluation G3. In the example shown in Figure 3, the product identification latent information G1 is "Yokohama Foods Instant Noodles, Pork Kimchi Ramen, Spicy Flavor, 3-Pack" written at the top of the web page. The product identification latent information G1 includes the product name, "Pork Kimchi Ramen." The specification information G2 includes the product brand "Pork Kimchi Ramen," the manufacturer name "Yokohama Foods," product size, weight, ingredients, etc. The information indicating the product's features / evaluation G3 is product description information, specifically, "Our pride and joy is our spicy soup made with kimchi and sesame oil. Kimchi and egg are included as toppings. Adding chives makes it devilishly delicious...! Guaranteed to be addictive." 【0033】 The identical product determination unit 30 analyzes multiple product data acquired by the product data acquisition unit 20 and identifies multiple product data related to the same product. Figure 4 shows an example of the detailed configuration of the identical product determination unit 30. 【0034】 As shown in Figure 4, the identical product determination unit 30 includes a pattern generation unit 31, a vector information generation unit 32, a similarity calculation unit 33, and a determination unit 34. 
【0035】 (Pattern generation unit) The pattern generation unit 31 generates multiple product identification latent patterns for each of the multiple product data based on the product identification latent information. Specifically, the pattern generation unit 31 uses program 311 to identify and extract the product identification latent information contained in the product data according to the type of product data. The pattern generation unit 31 then identifies one or more pieces of unit information contained in the product identification latent information and generates multiple product identification latent patterns based on that unit information. Program 311 may use artificial intelligence (AI). In this specification, the AI may be a narrow AI specialized for a specific application, or a general-purpose AI capable of handling various situations and tasks. The AI may also use known machine learning models such as neural networks, decision trees, random forests, SVMs (support vector machines), and k-nearest neighbors. The machine learning model may be updated after training. 【0036】 (Product identification latent information) Product identification latent information is information that can potentially identify a product, and comprises one or more pieces of unit information. Each piece of unit information is a coherent chunk of information. Unit information includes at least the product name. In addition to the product name, unit information may also include the manufacturer, net weight, quantity per package, and surplus information. Surplus information is information that is not necessary for identifying the product, such as "free shipping," "bulk purchase," "refill," or "credit card payment only." The product name may be the product's name, an abbreviation, or a common name. 
【0037】 In one example, if the product data is HTML data obtained from an e-commerce site, the product identification latent information is the product title. The product title is the information containing the product name, found in the section of the HTML data marked with a title tag or name tag. In the example shown in Figure 3, the product identification latent information (product title) is product identification latent information G1, which includes the product name "Pork Kimchi Ramen". The product title is typically displayed at the top of the web page along with the product image, as shown in Figure 3. In another example, if the product data is a product master, the product identification latent information is the product name included in the product master. The notation of the product name is not particularly limited; it may be the official product name published by the manufacturer, an abbreviation or common name of the product, or a katakana notation. 【0038】 If the product data is web data, the pattern generation unit 31 can identify the tags mentioned above using program 311, a trained model, or AI, and locate the sections containing the product identification latent information, thereby identifying the product identification latent information. If the product data is a product master, the pattern generation unit 31 can likewise identify the product identification latent information using program 311, a trained model, or AI. The program 311, trained models, and AI are stored in the storage device 1b, read by the pattern generation unit 31, and processed by the processor 1a. 【0039】 The pattern generation unit 31 excludes or converts predetermined unit information based on the dictionary 313 or AI. The dictionary 313 or AI includes exclusion and conversion dictionaries and is stored in the storage device 1b. 
For example, the exclusion dictionary 313a defines words and phrases that are unit information to be excluded, and the pattern generation unit 31 refers to the exclusion dictionary 313a from the storage device 1b and excludes the words and phrases from the extracted product identification latent information. The predetermined unit information to be excluded is, for example, surplus information. The conversion dictionary 313b defines unit information to be converted, and the pattern generation unit 31 converts the unit information defined in the conversion dictionary 313b into the corresponding unit information. The unit information to be converted can be, for example, the manufacturer name or product name. Conversions can include conversion from half-width to full-width, conversion from full-width to half-width, conversion between English and Japanese, and conversion of abbreviations or common names of manufacturer names and product names to their official names. In one example, the pattern generation unit 31 converts the abbreviation or common name of a product, which is unit information included in the product identification latent information, into the official name of the product based on the conversion dictionary 313b. The exclusion AI is a model that excludes the unit information to be excluded from the extracted product identification latent information. The conversion AI is a model that identifies and converts the unit information to be converted from the extracted product identification latent information. 【0040】 After converting the unit information as described above, the pattern generation unit 31 may exclude one of the duplicate unit information if there is any duplicate information. 【0041】 Thus, the pattern generation unit 31 performs preprocessing for exclusion or transformation before generating product identification latent patterns. This improves the accuracy of determining whether products are identical. 
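The exclusion, conversion, and de-duplication preprocessing described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the dictionary entries, the `preprocess` function name, and the use of Unicode NFKC normalization for full-width to half-width folding are all hypothetical choices, since the specification does not list the contents of dictionaries 313a and 313b.

```python
import unicodedata

# Hypothetical stand-in for exclusion dictionary 313a (surplus information).
EXCLUSION_DICT = {"free shipping", "bulk purchase", "refill",
                  "credit card payment only"}
# Hypothetical stand-in for conversion dictionary 313b
# (abbreviation or common name -> official name).
CONVERSION_DICT = {"yokohama": "Yokohama Foods"}


def preprocess(units):
    """Exclude surplus units, convert the rest, then drop duplicates."""
    result = []
    seen = set()
    for unit in units:
        # NFKC folds full-width Latin letters and digits to half-width.
        normalized = unicodedata.normalize("NFKC", unit).strip()
        key = normalized.lower()
        if key in EXCLUSION_DICT:
            continue  # surplus information is dropped
        converted = CONVERSION_DICT.get(key, normalized)
        if converted.lower() in seen:
            continue  # keep only one of any duplicated unit information
        seen.add(converted.lower())
        result.append(converted)
    return result


units = ["Free shipping", "Ｙｏｋｏｈａｍａ", "Yokohama Foods", "Pork Kimchi Ramen"]
print(preprocess(units))
```

In this sketch the full-width abbreviation is first folded to half-width, then converted to the official manufacturer name, which makes the later "Yokohama Foods" entry a duplicate that is dropped.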
【0042】 After the preprocessing, the pattern generation unit 31 generates a product identification latent pattern based on the converted unit information and the other unit information included in the product identification latent information. The other unit information referred to here is unit information that was neither excluded nor converted in the preprocessing. 【0043】 (Product identification latent patterns) A product identification latent pattern is a string pattern generated by excluding, transforming, or extracting unit information from product identification latent information. Here, the multiple product identification latent patterns include any of the first to seventh patterns, or a combination of two or more of these. 【0044】 The first pattern is a string pattern generated by excluding or transforming predetermined unit information from the product identification latent information, and also by excluding the specification information included in the product identification latent information. The predetermined unit information to be excluded is surplus information. Specification information includes, for example, net volume, size, weight, container type, and number of items per package. If the specification information is the net volume "500ml", then "500ml" is excluded from the product identification latent information in the first pattern. The first pattern is a string pattern that includes, for example, the manufacturer name and product name. 【0045】 The second pattern is a string pattern obtained by converting the first pattern into another notation format. Conversion to another notation format means converting to a notation format different from the original one, such as kanji, hiragana, katakana, or English letters. 
In one example of this embodiment, if the product identification latent information includes kanji, hiragana, katakana, or a combination thereof, the second pattern is the kana string obtained by converting these strings into kana characters. The second pattern may also involve conversions between full-width and half-width characters. 【0046】 The third pattern is a string pattern obtained by removing the manufacturer name from the first pattern. The third pattern is, for example, a string pattern that includes the product name or brand name. 【0047】 The fourth pattern is a string pattern generated by excluding or transforming predetermined unit information and numerical information from the product identification latent information. The predetermined unit information to be excluded or transformed is the same as in the first pattern. The fourth pattern is a string pattern that includes, for example, the manufacturer name, product name, and brand name. Numerical information includes the number and its unit. 【0048】 The fifth pattern is a string pattern consisting only of the manufacturer's name included in the product identification latent information. The fifth pattern may be, for example, the manufacturer's official name; if the product identification latent information includes the manufacturer's abbreviation or common name, it may be converted to the official name using the conversion dictionary 313b or the like. Alternatively, if the conversion dictionary 313b defines a relationship between the product name (official name, abbreviation, or common name of the product) and the manufacturer's name, the pattern generation unit 31 may generate the fifth pattern by converting the product name included in the product identification latent information into the manufacturer's name. 【0049】 The sixth pattern is a string pattern consisting only of the product name (brand name) included in the product identification latent information. 
The sixth pattern may be, for example, the official name of the product, and may also be the result of converting the abbreviated or common name of the product name (brand name) to the official name using the conversion dictionary 313b, etc., if the product identification latent information includes an abbreviation or common name of the product name (brand name). 【0050】 The seventh pattern is a string pattern consisting only of product specification information included in the product identification latent information. The seventh pattern can be, for example, a string pattern consisting only of content volume, size, weight, container type, or number of items per package. Multiple seventh patterns may be generated for each type of specification information, such as content volume, size, weight, container type, and number of items per package. 【0051】 The product identification latent pattern is not limited to patterns 1 through 7; it can be any pattern based on product identification latent information. 【0052】 Figure 5 is a diagram illustrating the process for determining identical products. In the example shown in Figure 5, the pattern generation unit 31 identifies and extracts product identification latent information G1 "Free shipping Showa Goemon Genmaicha PET 2000ml x 6 bottles" from the product data as product identification latent information. The pattern generation unit 31 identifies unit information from product identification latent information G1. Here, for example, the pattern generation unit 31 identifies unit information such as "shipping fee," "free," "Showa," "Goemon," "Genmaicha," "PET," "2000ml," and "6 bottles." Then, the pattern generation unit 31 excludes "shipping fee" and "free," which are defined in the exclusion dictionary 313a, from product identification latent information G1, and uses a trained model, AI, etc. 
to identify specification information (here, "PET," "2000ml x 6 bottles") and delete it from product identification latent information G1 to generate the first pattern P1, "Showa Goemon Genmaicha." 【0053】 The pattern generation unit 31 converts the first pattern P1 into kana and half-width characters to generate the second pattern P2. The pattern generation unit 31 removes the manufacturer name "Showa" from the first pattern P1 using a dictionary 313 defining the manufacturer name, AI, etc., to generate the third pattern P3, "Goemon Genmaicha". 【0054】 As with the first pattern P1, the pattern generation unit 31 removes "free shipping" and "2000ml x 6 bottles" from the product identification latent information G1 to generate the fourth pattern P4, "Showa Goemon Genmaicha PET bottle". The pattern generation unit 31 uses a dictionary 313 defining manufacturer names, AI, etc., to generate the fifth pattern P5, which consists only of the manufacturer name "Showa" from the product identification latent information G1. The pattern generation unit 31 uses a dictionary 313 defining product names or brand names, AI, etc., to generate the sixth pattern P6, which consists only of the product name or brand name "Goemon". 【0055】 The pattern generation unit 31 generates the seventh pattern P7 containing only specification information using the program 311, AI, etc. In this case, three seventh patterns P7 are generated: pattern P7a with "2000" indicating the content volume, pattern P7b with "pet" indicating the container type, and pattern P7c with "6" indicating the number of items per package. 【0056】 In this way, the pattern generation unit 31 decomposes the product identification latent information into unit information, and generates multiple product identification latent patterns from multiple perspectives by excluding, transforming, or extracting the unit information. 
Therefore, each product identification latent pattern, consisting of a combination of unit information and / or transformed unit information, can focus on the unit information that humans pay attention to when identifying a product. 【0057】 As described above, the pattern generation unit 31 excludes or converts predetermined unit information included in the product identification latent information, and generates the product identification latent pattern based on the converted unit information and other unit information included in the product identification latent information. 【0058】 Furthermore, if the product identification latent information of the reference product data does not include the manufacturer's name (or its abbreviation or common name) or the specification information, the pattern generation unit 31 may identify this information from the product data and generate the fifth and seventh patterns based on the identified information. The reference product data is the product data that serves as the basis for determining the identity of products, and can be, for example, the web data related to the product on the manufacturer's website. The reference product data is also referred to as the standard product data. 【0059】 Referring again to Figure 4, the vector information generation unit 32 generates vector information for each product identification latent pattern for each product data. For example, if product identification latent patterns 1 to 7 have been generated for two product data, the vector information generation unit 32 generates vector information for patterns 1 to 7 for one product data and vector information for patterns 1 to 7 for the other product data. In the example shown in Figure 5, the vector information generation unit 32 generates vector information V1 to V7 for the first pattern P1 to the seventh pattern P7 for each product data. The vector information for patterns P7a to P7c is vector information V7a to V7c. 
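The decomposition described for Figure 5 can be sketched as below. This is only an illustration under stated assumptions: the unit information is passed in already identified (the text leaves that step to dictionaries, a trained model, or AI), the small dictionaries are hypothetical stand-ins for dictionary 313, the function name `generate_patterns` is invented here, and the second pattern (kana conversion) is omitted because it needs the original Japanese text. The fourth pattern keeps the unit "PET" as written rather than expanding it to "PET bottle".

```python
import re

# Hypothetical dictionary contents standing in for dictionary 313.
EXCLUSION_DICTIONARY = {"shipping fee", "free"}          # exclusion dictionary 313a
MANUFACTURERS = {"Showa"}
BRANDS = {"Goemon"}
SPEC_UNITS = {"PET": "container", "2000ml": "volume", "6 bottles": "count"}


def generate_patterns(units):
    """Build product identification latent patterns from unit information."""
    kept = [u for u in units if u not in EXCLUSION_DICTIONARY]
    # Specification units present in this product, keyed by their kind.
    specs = {kind: u for u, kind in SPEC_UNITS.items() if u in kept}
    p1 = [u for u in kept if u not in SPEC_UNITS]            # no specification info
    p3 = [u for u in p1 if u not in MANUFACTURERS]           # P1 minus manufacturer
    p4 = [u for u in kept if SPEC_UNITS.get(u) not in ("volume", "count")]
    return {
        "P1": " ".join(p1),
        "P3": " ".join(p3),
        "P4": " ".join(p4),
        "P5": " ".join(u for u in kept if u in MANUFACTURERS),
        "P6": " ".join(u for u in kept if u in BRANDS),
        "P7a": re.sub(r"\D", "", specs.get("volume", "")),   # digits of the volume
        "P7b": specs.get("container", "").lower(),
        "P7c": re.sub(r"\D", "", specs.get("count", "")),    # digits of the count
    }


units_g1 = ["shipping fee", "free", "Showa", "Goemon", "Genmaicha",
            "PET", "2000ml", "6 bottles"]
patterns_g1 = generate_patterns(units_g1)
```

A pattern whose unit information is absent comes out as an empty string, which corresponds to a blank cell in the pattern tables of the later figures.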
【0060】 Vector information can be generated using publicly available tools that convert strings into vectors (N-dimensional numerical values). Examples include, but are not limited to, Google's BERT and Facebook's fastText. 【0061】 The vector information generation unit 32 may also generate vector information for each pattern, such as the fifth pattern and the seventh pattern, that is generated based on the reference product data. 【0062】 The similarity calculation unit 33 calculates the similarity between two products for each product identification latent pattern based on the vector information of the product data of the two products. For example, the similarity calculation unit 33 calculates the similarity based on the vector information of the i-th pattern of one product's data and the vector information of the i-th pattern of the other product's data (where i is a natural number from 1 to the number of product identification latent patterns). In other words, the similarity calculation unit 33 calculates the similarity independently for each pattern without adding or multiplying the vector information of different patterns. The similarity can be, for example, the dot product of the vector information of the i-th pattern of one product's data and the vector information of the i-th pattern of the other product's data, the cosine similarity, or the Euclidean distance. In this embodiment, the similarity is the cosine similarity. In one example, a similarity value closer to 0 indicates less similarity, and a similarity value closer to 1 indicates greater similarity. 【0063】 The determination unit 34 determines whether the products indicated by the two product data are the same, based on the similarity calculated by the similarity calculation unit 33. 
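The per-pattern cosine similarity of 【0062】 can be illustrated with a short sketch. The embodiment uses a string-to-vector tool such as BERT or fastText; to stay self-contained, the code below substitutes a toy character-bigram count vector, which is an assumption made purely for illustration. The function names and the convention of returning `None` for a blank pattern are likewise hypothetical.

```python
import math
from collections import Counter


def embed(text, n=2):
    """Toy stand-in for a string-to-vector tool: character n-gram counts."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(u, v):
    """Cosine similarity of two sparse vectors; None if either one is empty."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else None


def pattern_similarities(patterns_a, patterns_b):
    """Compare the i-th pattern only with the i-th pattern, never across patterns."""
    names = patterns_a.keys() | patterns_b.keys()
    return {name: cosine(embed(patterns_a.get(name, "")),
                         embed(patterns_b.get(name, "")))
            for name in names}
```

Because the toy vectors contain only non-negative counts, the result lies between 0 and 1, matching the reading in which values near 1 indicate greater similarity.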
【0064】 In one example, the determination unit 34 determines that the products indicated by the two product data are the same product if the similarity of one or more of the multiple product identification latent patterns is above a predetermined threshold, and determines that the products indicated by the two product data are different products if the similarity of all patterns is below the predetermined threshold. 【0065】 In another example, the determination unit 34 determines that the products indicated by the two product data are the same product if the similarity of at least one of the first to fourth patterns among the multiple product identification latent patterns is above a predetermined threshold, and determines that the products indicated by the two product data are different products if none of those similarities is above the predetermined threshold. This makes it possible to determine whether the products are the same at the granularity of the product name. 【0066】 In yet another example, the determination unit 34 determines that the products indicated by the two product data are the same product if the similarity of all seven of the multiple product identification latent patterns is above a predetermined threshold, and determines that the products indicated by the two product data are different products if the similarity of any of the seven patterns is not above the predetermined threshold. This makes it possible to determine whether the products are the same at the granularity of product identification codes such as JAN codes. 【0067】 Furthermore, the predetermined threshold may differ for each product identification latent pattern, and can be appropriately modified, for example, by setting it to 0.6, 0.7, 0.8, or 0.9. 
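The three determination rules of 【0064】 to 【0066】, together with per-pattern thresholds, can be sketched as follows. The function names are hypothetical, the seventh pattern is treated as a single score for brevity, and "above the threshold" is implemented as `>=`, which is one possible reading; a strict `>` would work the same way.

```python
NAME_PATTERNS = ("P1", "P2", "P3", "P4")


def passes(sims, thresholds, pattern):
    """True if the pattern has a similarity score that meets its own threshold."""
    score = sims.get(pattern)
    return score is not None and score >= thresholds[pattern]


def same_if_any(sims, thresholds):
    """Rule of [0064]: one or more patterns meet the threshold."""
    return any(passes(sims, thresholds, p) for p in thresholds)


def same_at_name_level(sims, thresholds):
    """Rule of [0065]: at least one of the first to fourth patterns meets it."""
    return any(passes(sims, thresholds, p) for p in NAME_PATTERNS)


def same_if_all(sims, thresholds):
    """Rule of [0066]: every pattern meets its threshold."""
    return all(passes(sims, thresholds, p) for p in thresholds)


# Hypothetical scores and thresholds, chosen only to exercise the rules.
example_sims = {"P1": 0.9, "P2": 0.85, "P3": 0.3, "P4": 0.9,
                "P5": 1.0, "P6": 1.0, "P7": 0.5}
example_thresholds = {p: 0.8 for p in example_sims}
```

With these sample values the first two rules report the same product, while the all-patterns rule reports a different product because P3 and P7 fall below 0.8.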
Also, while the condition for being the same product is defined as the similarity being above the predetermined threshold, the condition may also be that the similarity exceeds the predetermined threshold in any of the product identification latent patterns. 【0068】 In this way, the level of granularity at which products are determined to be the same can be adjusted depending on the type and number of product identification latent patterns. 【0069】 When reference product data (hereinafter also referred to as standard product data) exists, the identical product determination unit 30 can use units 31 to 34 to determine whether products are identical by comparing the reference product data with product data whose identity as the same product as the reference product data has not yet been determined (hereinafter also referred to as undetermined product data). The reference product data can be the web data for the product on the manufacturer's website. Furthermore, regardless of the existence of reference product data, even if the two product data are both undetermined product data, units 31 to 34 can determine whether the products are identical. 【0070】 Here, using Figures 6 to 11, we will explain examples of identical product determination using web data and examples using the organization's product master as product data. 【0071】 (Example using web data) Figure 6 is a diagram illustrating an example in which the identical product determination unit 30 determines that products 0 to 3 are the same product. The product identification latent information G1 is as follows: Product 1 is "Showwa Orange Drink 900ml 1 case (12 bottles)" (however, "Showwa Orange Drink" is written in half-width katakana); Product 2 is "Free Shipping Heisei Orange Drink 900ml PET bottle x 12 bottles"; Product 3 is "Showwa Heisei Orange Drink 900ml x 12 bottles 1 case KK"; and Product 0 is "Heisei Orange Drink". 
The product identification latent information G1 for products 1 to 3 was obtained from the e-commerce site, and the product identification latent information G1 for product 0 was obtained from the manufacturer's site for product 0. In other words, the product data for product 0 is the standard product data. 【0072】 The exclusion dictionary 313a defines entries such as "1 case," "free shipping," and "KK." The conversion dictionary 313b defines entries such as "Showwa" and "orange drink." 【0073】 The first pattern P1 to the seventh pattern P7 for each of products 0 to 3 are generated by the pattern generation unit 31, as shown in Figure 6. As shown in Figure 6, redundant information and numerical information are removed in the first to sixth patterns, and half-width katakana characters are converted to full-width characters. However, blank spaces in the product identification latent patterns in Figure 6 (for example, the sixth pattern for product 1, the fifth pattern for product 2, and the seventh pattern P7b for product 3) indicate that the unit information is not included in the product identification latent information G1. Furthermore, the fifth pattern P5 and the seventh pattern P7 for product 0 are generated by the pattern generation unit 31 based on the standard product data, identifying the manufacturer name and each specification information. Note that the seventh pattern P7c for product 0 indicates that product 0 is a single case. 【0074】 The lower part of Figure 6 shows the similarity scores for each pattern of products 0 to 3. The vector information for each pattern of each product is generated by the vector information generation unit 32. The similarity scores for each pattern of products 0 to 3 are values calculated by the similarity calculation unit 33 based on the vector information of each pattern of products 0 to 3 and the vector information of each pattern of product 0. 
The similarity score for each pattern of product 0 is 1.0 for all patterns because it is based only on its own vector information. Note that patterns with a blank or 0 similarity score indicate that the unit information corresponding to that pattern is not included in the product identification latent information G1. 【0075】 In the example shown in Figure 6, although each of products 1 to 3 has one or two patterns whose similarity is blank or 0, at least five of the first pattern P1 to the sixth pattern P6 have a similarity of 0.6 or higher, and the determination unit 34 therefore determines that products 1 to 3 are the same product as product 0. 【0076】 Figure 7 is a diagram illustrating an example in which the identical product determination unit 30 determines that products 4 to 6 are different products from product 0. The product identification latent information G1 for product 4 is "[Outlet] Elber Delicious Vitamin C Orange 1 box (24 bottles)", for product 5 is "Nagoya Kyoryuchi Orange & Camu Camu 500mL PET bottle [24 bottles]", and for product 6 is "Free shipping Heisei Orange Drink 200ml paper pack 24 bottles [Showa]". The product identification latent information G1 for products 4 to 6 was obtained from an e-commerce site. Product 0 is the same as product 0 in Figure 6. 【0077】 The exclusion dictionary 313a defines terms such as "outlet," "1 box," and "free shipping." The conversion dictionary 313b defines conversions for terms such as {mL, ml}, {PET bottle, pet}, and {paper carton, pack}. 【0078】 The first pattern P1 to the seventh pattern P7 for each of products 4 to 6 are generated by the pattern generation unit 31, as shown in Figure 7. As shown in Figure 7, redundant information and numerical information are removed from the first to sixth patterns, and units are converted to lowercase. 
However, the blank spaces in the product identification latent patterns in Figure 7 (for example, the sixth pattern P6 and the seventh patterns P7a and P7b for product 4, and the fifth pattern P5 for product 5) indicate that the unit information is not included in the product identification latent information G1. 【0079】 The similarity scores for each pattern of products 0 and 4 to 6 are shown at the bottom of Figure 7. The similarity score for each pattern of products 0 and 4 to 6 is a value calculated by the similarity calculation unit 33 based on the vector information of each pattern of products 0 and 4 to 6 and the vector information of each pattern of product 0. The similarity score for each pattern of product 0 is 1.0 for all patterns because it is based only on its own vector information. Note that patterns with a blank or 0 similarity score indicate that the unit information corresponding to that pattern is not included in the product identification latent information G1. 【0080】 In the example shown in Figure 7, products 4 and 5 have many patterns where the similarity is blank or 0, and the calculated similarity of each pattern is less than 0.7, with some exceptions. Therefore, the determination unit 34 determines that they are different products from product 0. Product 6 has a similarity of 0.7 or higher for the first pattern P1 to the sixth pattern P6, and can be determined to be the same product at the brand name level. However, the similarity of the seventh pattern P7 regarding the specification information is low, less than 0.7, so the determination unit 34 determines that it is a different product from product 0. In fact, product 0 is a PET bottle product, while product 6 is a paper carton product, so this determination is reasonable. In other words, by using the criterion of whether all patterns, including the seventh pattern regarding the specification information, have a similarity above a predetermined threshold, it is possible to determine identical products at a fine granularity level, such as the JAN code level. 
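The reasoning applied to product 6 in Figure 7, where the name-related patterns agree but the specification patterns do not, can be sketched as a two-tier judgment. The scores in the usage example are hypothetical (the figure's actual values are not reproduced in the text), the function name `judge` is an assumption, and blank scores are simply counted as 0, which is stricter than the Figure 6 example where a few blank patterns are tolerated.

```python
BRAND_PATTERNS = ("P1", "P2", "P3", "P4", "P5", "P6")
SPEC_PATTERNS = ("P7a", "P7b", "P7c")


def judge(sims, threshold=0.7):
    """Classify a candidate against the reference product at two granularities."""
    def all_meet(patterns):
        # A blank (None or missing) score is treated as 0.0 in this sketch.
        return all((sims.get(p) or 0.0) >= threshold for p in patterns)

    if not all_meet(BRAND_PATTERNS):
        return "different product"
    if all_meet(SPEC_PATTERNS):
        return "same product"
    return "same brand name, different product"


# Hypothetical scores shaped like product 6: names match, container type does not.
product6_like = {"P1": 0.9, "P2": 0.9, "P3": 0.9, "P4": 0.8, "P5": 1.0,
                 "P6": 1.0, "P7a": 0.3, "P7b": 0.2, "P7c": 1.0}
```

Calling `judge(product6_like)` yields the brand-level match only, which mirrors the PET bottle versus paper carton distinction described above.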
【0081】 Figure 8 is a diagram illustrating an example in which the identical product determination unit 30 determines that products 10 to 13, which are different from product 0, are the same product. While the product category of products 0 to 6 is beverages, the product category of products 10 to 13 is food. The product identification latent information G1 for product 11 is "Yokohama Beef Ramen Bowl 12 servings x 1 case [Credit card payment only]", for product 12 it is "Yokohama Foods Yokohama Beef Ramen Bowl (12 pieces) Disaster Prevention", for product 13 it is "<<Case>> Yokohama Foods Beef Ramen Bowl (85g) x 12 cup noodles", and for product 10 it is "Beef Ramen Bowl". The product identification latent information G1 for products 11 to 13 was obtained from the e-commerce site, and the product identification latent information G1 for product 10 was obtained from the manufacturer's site for product 10. In other words, the product data for product 10 is the standard product data. 【0082】 The exclusion dictionary 313a defines terms such as "credit card payment," "only," "1 case," "case," and "<<case>>." The conversion dictionary 313b defines terms such as "Yokohama," "Yokohama Foods," etc. 【0083】 The first patterns P1 to the seventh patterns P7 for each product 10 to 13 are generated by the pattern generation unit 31 as shown in Figure 8. As shown in Figure 8, in the first to sixth patterns, redundant information and numerical information are removed, and the manufacturer name is converted to the official name "Yokohama Foods". However, the blank spaces in the product identification latent pattern in Figure 8 (for example, the seventh patterns P7b and P7c for product 11, the seventh patterns P7b and P7c for product 12, and the seventh pattern P7b for product 13) indicate that the unit information is not included in the product identification latent information G1. 
In addition, the fifth pattern P5 and the seventh pattern P7 for product 10 are generated by the pattern generation unit 31 based on the standard product data, identifying the manufacturer name and each specification information. The seventh pattern P7c for product 10 indicates that product 10 is one case. 【0084】 The lower part of Figure 8 shows the similarity scores for each pattern of products 10 to 13. The vector information for each pattern of each product was generated by the vector information generation unit 32. The similarity scores for each pattern of products 10 to 13 are values calculated by the similarity calculation unit 33 based on the vector information of each pattern of products 10 to 13 and the vector information of each pattern of product 10. The similarity scores for each pattern of product 10 are all 1.0 because they are based only on its own vector information. 【0085】 In the example shown in Figure 8, although each of products 11 to 13 has one or two patterns whose similarity is blank or 0, at least six of the first pattern P1 to the sixth pattern P6 have a similarity of 0.6 or higher, and the determination unit 34 therefore determines that products 11 to 13 are the same product as product 10. 【0086】 Figure 9 is a diagram illustrating an example in which the identical product determination unit 30 determines that products 14 to 16 are different products from product 10. The product identification latent information G1 for product 14 is "Oyama Company Child Star Ramen Beef 39g [Snack 1 case 24 bags]", for product 15 is "Yokohama Beef Ramen Cabbage Salad Light Aromatic Soy Sauce Flavor 3-pack 120g", and for product 16 is "S&X Snack Ramen (Beef Flavor) 12 servings". The product identification latent information G1 for products 14 to 16 was obtained from an e-commerce site. Product 10 is the same as product 10 in Figure 8. 【0087】 The exclusion dictionary 313a defines terms such as "confectionery" and "1 case." 
The conversion dictionary 313b defines terms such as "Yokohama" and "Yokohama Foods." 【0088】 The first pattern P1 to the seventh pattern P7 for each of products 10 and 14 to 16 are generated by the pattern generation unit 31 as shown in Figure 9. As shown in Figure 9, in the first to sixth patterns, redundant information and numerical information are removed, and the manufacturer name is converted to the official name "Yokohama Foods". However, the blank spaces in the product identification latent patterns in Figure 9 (for example, the seventh pattern P7c for product 14, the seventh pattern P7c for product 15, and the fifth pattern P5 and the seventh patterns P7b and P7c for product 16) indicate that the unit information is not included in the product identification latent information G1. 【0089】 The lower part of Figure 9 shows the similarity scores for each pattern of products 10 and 14 to 16. The vector information for each pattern of each product is generated by the vector information generation unit 32. The similarity scores for each pattern of products 10 and 14 to 16 are values calculated by the similarity calculation unit 33 based on the vector information of each pattern of products 10 and 14 to 16 and the vector information of each pattern of product 10. The similarity score for each pattern of product 10 is 1.0 for all patterns because it is based only on its own vector information. Note that patterns with a blank or 0 similarity score indicate that the unit information corresponding to that pattern is not included in the product identification latent information G1. 【0090】 In the example shown in Figure 9, products 14 to 16 are determined to be different products from product 10 because, with some exceptions, the calculated similarity of each pattern is less than 0.7. 【0091】 (Example using the organization's product master) Figure 10 is a diagram illustrating an example of identical product determination for products 20 to 23 by the identical product determination unit 30. 
The product identification latent information G1 is as follows: product 21 is "Reiwa Probao R-1 Drink Type 112ml", product 22 is "Reiwa "R-1" 112g", product 23 is "Reiwa "R-1" Low Fat 112g", and product 20 is "Reiwa Probao Yogurt R-1 112g". The product identification latent information G1 for products 21 to 23 is obtained from the product master of the organization handling those products (e.g., wholesaler, retailer, etc.), while the product identification latent information G1 for product 20 is obtained from the manufacturer's website or the manufacturer's product master for product 20. In other words, the product data for product 20 is the standard product data. 【0092】 The exclusion dictionary 313a defines various surplus information. The conversion dictionary 313b defines conversions such as {R-1, Probao Yogurt R-1}, {reiwa, reiwa}, {reiwa, Reiwa}, {R-1, R-1}, {R-1, Probao Yogurt R-1}. 【0093】 The first pattern P1 to the seventh pattern P7 of each of products 20 to 23 are generated by the pattern generation unit 31 as shown in Figure 10. As shown in Figure 10, in the first to sixth patterns, surplus information and numerical information are excluded, the abbreviations "R-1", "R-1" are converted to the official product name "Probio Yogurt R-1", and the full-width alphabetic notation "reiwa" is converted to the kanji notation "令和". However, the blanks in the product identification latent patterns in Figure 10 (for example, the seventh pattern P7a of product 21, the seventh patterns P7b and P7c of product 22, and the seventh patterns P7b and P7c of product 23) indicate that the unit information is not included in the product identification latent information G1. Also, the fifth pattern P5 and the seventh pattern P7 of product 20 are generated by the pattern generation unit 31 by identifying each specification information based on the standard product data. 
Note that the seventh pattern P7b of product 20 indicates that the unit form of its content volume is the ml form, and the official unit form of the content volume is the g form, so the corresponding unit information is not included in the product identification latent information G1. The seventh pattern P7c of product 20 indicates that product 20 is in the cup form. That is, product 20 is a yogurt with the product name "Probio Yogurt R-1" in one cup with a content volume of 112 g, and is a yogurt of the type that is eaten rather than drunk. 【0094】 The similarity in each pattern of each of products 20 to 23 is shown at the bottom of Figure 10. The vector information for each pattern of each product is generated by the vector information generation unit 32. The similarity of each pattern of products 20 to 23 is a value calculated by the similarity calculation unit 33 based on the vector information of each pattern of products 20 to 23 and the vector information of each pattern of product 20. Since the similarity of each pattern of product 20 is based only on its own vector information, it is 1.0 in each pattern. 【0095】 In the example shown in Figure 10, product 22 has a similarity score of 1.0 in all patterns except for the seventh patterns P7b and P7c, and the determination unit 34 determines that it is the same product as product 20. That is, product 22 is a yogurt of the type that is eaten, just like product 20. On the other hand, product 21 has several patterns with similarity scores of less than 0.8 (for example, the first pattern P1 to the sixth pattern P6), and the determination unit 34 determines that it is a different product from product 20. In fact, product 21 is the same brand as product 20, but it is a drink type, and is different from product 20, which is eaten. In other words, the string "drink type" included in the first to fourth patterns P1 to P4 lowered the similarity scores, which is why it was determined to be a different product. 
Furthermore, product 23 has a similarity score of 0.8 or higher in many patterns, indicating similarity with product 20. However, in the third pattern P3, the similarity score is 0.3, which is below the threshold of 0.8. Therefore, the determination unit 34 determines that product 23 is a different product from product 20. In fact, product 23 is the same brand as product 20, but it is a low-fat version, and the inclusion of "low-fat" in the third pattern P3 lowered the similarity score, which is why it was determined to be a different product. 【0096】 Thus, it can be seen that even when the product data is the organization's product master and a product abbreviation is included in the product identification latent information G1, it is still possible to correctly determine whether the products are the same. 【0097】 Figure 11 is a diagram illustrating an example of identical product determination for products 30 to 33 by the identical product determination unit 30. The product identification latent information G1 is as follows: product 31 is "Good Pureance Care Shampoo" in half-width characters, product 32 is "Good Pureance Natural Shampoo 340 ml refill", product 33 is "Hanako kako Good Pureance Natural Shampoo Pump 425 ml", and product 30 is "Good Pureance Natural Cleansing Care Shampoo 425 ml". The product identification latent information G1 of products 31 to 33 is obtained from the product master of the organization (such as a wholesaler or retailer) handling products 31 to 33, and the product identification latent information G1 of product 30 is obtained from the manufacturer's website or the manufacturer's product master of product 30. That is, the product data of product 30 is the standard product data. 【0098】 Various surplus information is defined in the exclusion dictionary 313a. Conversions such as {kako, Hanako} and {Good, Hanako} are defined in the conversion dictionary 313b. 
【0099】 The first pattern P1 to the seventh pattern P7 of each of products 30 to 33 are generated by the pattern generation unit 31 as shown in Figure 11. As shown in Figure 11, in the first to sixth patterns, surplus information and numerical information are excluded, and the English name "kako" of the manufacturer is converted to the Chinese character notation "Hanako". For products 31 and 32, the manufacturer name "Hanako" is not included in the product identification latent information G1, but the fifth pattern P5 of products 31 and 32 is generated by converting the product name "Good" to the manufacturer name "Hanako" using the conversion dictionary 313b. However, the blank spaces in the product identification latent patterns in Figure 11 (for example, the seventh pattern P7 of product 31 and the seventh pattern P7b of product 32) indicate that the unit information is not included in the product identification latent information G1. Also, the fifth pattern P5 and the seventh pattern P7 of product 30 are generated by the pattern generation unit 31 by identifying each specification information based on the standard product data. Product 30 is a shampoo of the bottle (pump) type with a content volume of 425 ml. 【0100】 The similarity scores for each pattern of products 30 to 33 are shown at the bottom of Figure 11. The vector information for each pattern of each product was generated by the vector information generation unit 32. The similarity scores for each pattern of products 30 to 33 are values calculated by the similarity calculation unit 33 based on the vector information of each pattern of products 30 to 33 and the vector information of each pattern of product 30. The similarity scores for each pattern of product 30 are all 1.0 because they are based only on its own vector information. 
【0101】 In the example shown in Figure 11, product 33 has a similarity score of 0.7 or higher in all patterns except for the seventh pattern P7b of the container type (packaging type), and the determination unit 34 determines that it is the same product as product 30. That is, product 33 is a bottle-type shampoo, just like product 30. The similarity score of the seventh pattern P7b of the container type (packaging type) is 0.6, which is lower than the other patterns, because the seventh pattern P7b of the reference product 30 contains two pieces of unit information: "bottle, pump". On the other hand, product 31 has a similarity score of 0.8 in the first to sixth patterns, which is above the threshold of 0.7, but there is no similarity score for the seventh pattern, and the content volume and container type (packaging type) are unknown. Therefore, it can be determined that it is the same product at the brand name level, but it is not possible to determine whether it is the same product or a different product at the JAN code level. Product 32 has a similarity score of 0.6 in about half of the patterns, which is below the threshold of 0.7, and the determination unit 34 determines that it is a different product from product 30. In other words, product 32 is a "refill" shampoo, which is different from a bottle-type shampoo, and this is thought to be why the similarity score was low in half of the patterns. 【0102】 (Integration generation unit) Figure 12 shows an example of the detailed configuration of the integration generation unit. The integration generation unit 40 analyzes multiple product data, classifies the product information contained in the product data according to the category of the product information, and for each product, integrates the product information contained in multiple product data related to the same product according to the category of the product information to generate a matching product master. 
【0103】 Specifically, as shown in Figure 12, the integration generation unit 40 estimates, classifies, and / or categorizes the categories of product information included in the product data. For example, the integration generation unit 40 estimates, classifies, and / or categorizes the categories of product information included in the product data using AI 41. Then, the integration generation unit 40 generates a matching product master by integrating the product information of the product data relating to the same product for each categorized category. 【0104】 The integration generation unit 40 estimates the category of a product from the product information of the product data. The product category can be any product category classification, such as food, beverages, confectionery, alcoholic beverages, bags, cosmetics, etc. The product category may also be estimated by classifying it into multiple levels, such as major, medium, and minor categories. In one example, if the product is beer, the integration generation unit 40 estimates the product category by classifying food as the major category, beverages as the medium category, and alcoholic beverages or alcohol as the minor category. The estimation of the product category can be done, for example, by dictionary and / or AI. The integration generation unit 40 generates a product information category for the product category and stores the estimated product category under it, thereby adding the product category to the matching product master as one of the product information categories. 【0105】 The integration generation unit 40 may eliminate duplicate product information for each categorized item during the integration process. For example, the integration generation unit 40 may associate identical product names from two product data sets with a product name category, but eliminate one of them. 
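The category-wise integration with duplicate elimination described in 【0103】 and 【0105】 can be sketched as follows. This assumes the product information has already been assigned to categories (the text leaves that step to AI 41 or a dictionary); the function name `integrate` and the sample values, including the energy figure, are hypothetical.

```python
def integrate(records):
    """Merge categorized product information for one product, dropping duplicates."""
    master_row = {}
    for record in records:
        for category, value in record.items():
            values = master_row.setdefault(category, [])
            if value not in values:      # identical information is kept only once
                values.append(value)
    return master_row


# Two product data already judged to describe the same product (made-up values).
ec_site_data = {"product name": "Heisei Orange Drink", "capacity": "900ml"}
maker_site_data = {"product name": "Heisei Orange Drink", "energy": "45kcal"}
merged_row = integrate([ec_site_data, maker_site_data])
```

Categories that appear in only one source, such as capacity or energy here, are retained in the merged row, so the result aggregates information from every source.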
【0106】 The integration generation unit 40 may normalize the representation of product information for each categorized unit during integration. Normalization may include, for example, unifying the units included in the product information within a category, or unifying variations in the words and phrases that constitute the product information. A normalization dictionary 42 and a normalization AI 43 can be used for normalization. The normalization dictionary 42 and the normalization AI 43 are, for example, dictionaries and AIs that convert units, such as converting l to ml and m to cm, or dictionaries and AIs that unify variations in the notation of words and phrases that constitute product information, such as converting full-width characters to half-width characters. The normalization dictionary 42 and the normalization AI 43 may be provided independently for unit unification and notation unification. 【0107】 (Product matching master) A product master for matching products is a product master formed by integrating product information from two or more product data sets. A product master for matching products can be a table where each product is listed vertically, and the categorized product information for each product is listed horizontally. In other words, a product master for matching products is a table where product information for one product is stored in a categorized column in each row. Each column represents a categorized or classified item. A product master for matching products includes product information categories found in only one of the product data sets relating to two or more identical products. That is, product information from two or more identical product data sets is aggregated in the product master for matching products. 【0108】 Figure 13 shows an example of a matching product master M. In the example shown in Figure 13, product information for three products is aggregated in each row. 
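As an illustration of the normalization described in 【0106】, the following sketch unifies volume units to ml and converts full-width characters to half-width. It is a dictionary-style stand-in only: the table `TO_ML`, the function names, and the choice of Unicode NFKC normalization are assumptions, not the contents of the normalization dictionary 42 or the normalization AI 43.

```python
import re
import unicodedata

# Hypothetical unit table: multiplier to convert each unit to millilitres.
TO_ML = {"ml": 1, "l": 1000}


def normalize_notation(text):
    """Convert full-width letters and digits to half-width using NFKC."""
    return unicodedata.normalize("NFKC", text)


def normalize_volume(text):
    """Rewrite a volume such as '1.5 L' as millilitres, e.g. '1500 ml'."""
    cleaned = normalize_notation(text).lower()
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(ml|l)\s*", cleaned)
    if match is None:
        return text  # leave unrecognized values untouched
    value = float(match.group(1)) * TO_ML[match.group(2)]
    return f"{value:g} ml"


print(normalize_volume("1.5 L"))
print(normalize_volume("５００ｍｌ"))
```

Unrecognized values are returned unchanged so that a dictionary miss never discards product information.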
The matching product master M has at least three categorized product information categories (columns) C10 to C17 for each product. Columns C10 to C17 represent product name, product identifier (JAN code in this case), product category, capacity, energy, product size, feature / evaluation information, and vector information, respectively. The feature / evaluation information and vector information are generated by the feature / evaluation information generation unit 50 and the product graph generation unit 60, respectively, and added to the matching product master M by the information addition unit 70. Note that although the matching product master M in Figure 13 has three products, it is not limited to this. The matching product master M can have any number of products for which product data can be obtained. 【0109】 (Feature / Evaluation Information Generation Unit) Figure 14 shows an example of the detailed configuration of the feature / evaluation information generation unit. The feature / evaluation information generation unit 50 generates product feature / evaluation information from data indicating the features of products included in the product data. Specifically, the feature / evaluation information generation unit 50 has an identification unit 51, a generation unit 52, and a category estimation unit 53. 【0110】 The identification unit 51 identifies data that indicates the characteristics of the product included in the product data. In this case, the data indicating the characteristics of the product is text data describing the product, but it may also be image data or sound data, as long as it indicates the characteristics of the product. The identification unit 51 identifies words and phrases that indicate the characteristics of the product from the identified text data. This identification can be done by a dictionary 51a, AI, or both. Alternatively, it may be done by a program, depending on the type of product data. 
【0111】 In one example, if the product data is HTML data obtained from an e-commerce site, the identification unit 51 identifies and extracts the product description text contained in the location or range to which the description tag is attached in the data. In Figure 3, for example, this location or range is the location or range to which the code G3 is attached. The identification unit 51 then uses AI to perform natural language analysis on the extracted product description text to break it down into words and phrases, and identifies words and phrases that indicate the product characteristics. Morphological analysis can be used in the process of breaking down the product description text into words. Examples of morphological analysis tools that can be used include, but are not limited to, MeCab, Juman++, and Janome. 【0112】 In the example shown in Figure 3, the identification unit 51 identifies words such as "kimchi," "sesame oil," "deliciously spicy," "deliciously spicy soup," "egg," "devilishly delicious," and "addictive" from the information G3 which describes the product's characteristics / evaluation: "We are proud of our delicious spicy soup made with kimchi and sesame oil. Kimchi and eggs are included as toppings. Adding chives makes it devilishly delicious...! Guaranteed to be addictive." 【0113】 The generation unit 52 generates feature / evaluation information in association with the identified words and phrases. Specifically, it generates a feature / evaluation information column as one of the product information categories (columns) of the product master data, and generates feature / evaluation information by associating this column with the identified words and phrases. 
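The extraction and identification steps above can be sketched as follows. The sketch assumes the description is marked up as `<p class="description">`, which is a hypothetical stand-in for the "description tag" in the HTML data, and it replaces morphological analysis (MeCab, Juman++, Janome) with plain substring lookup against a small hypothetical dictionary. The class and function names are illustrative only.

```python
from html.parser import HTMLParser


class DescriptionExtractor(HTMLParser):
    """Collects text inside <p class="description"> (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p" and ("class", "description") in attrs:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


# Hypothetical stand-in for dictionary 51a.
FEATURE_DICTIONARY = ["kimchi", "sesame oil", "egg", "addictive"]


def identify_features(html):
    """Return dictionary words found in the product description text."""
    parser = DescriptionExtractor()
    parser.feed(html)
    text = " ".join(parser.chunks).lower()
    return [word for word in FEATURE_DICTIONARY if word in text]


html = (
    "<html><body><h1>Soup</h1>"
    '<p class="description">We are proud of our delicious spicy soup made '
    "with kimchi and sesame oil. Guaranteed to be addictive.</p>"
    "</body></html>"
)

# Associate the identified words with a feature / evaluation column.
record = {"feature_evaluation": identify_features(html)}
```

Substring lookup cannot separate inflected or compound forms, which is why a morphological analyzer would normally take the place of the `in` test in `identify_features`.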
【0114】 In the example shown in Figure 3, the generation unit 52 generates a feature / evaluation information column and generates feature / evaluation information by associating this feature / evaluation information column with words such as "kimchi," "sesame oil," "deliciously spicy," "deliciously spicy soup," "egg," "devilishly delicious," and "addictive." 【0115】 In the example shown in Figure 13, the characteristics / evaluation information for the top-ranked product "ABC Coffee" in the product master M is "Black," "Mandheling," "Refreshing," and "Fruit." The characteristics / evaluation information for the middle-ranked product "DEF Sports Drink" in the product master M is "Sports," "Large Capacity," and "Grapefruit Flavor." The characteristics / evaluation information for the bottom-ranked product "GHI Snack" in the product master M is "Corn," "Crispy," "BBQ," and "Limited." These words and phrases are associated with the characteristics / evaluation information column C16 for each product, and characteristic / evaluation information is generated. 【0116】 Furthermore, the generation unit 52 may associate the product information category estimated or generated by the category estimation unit 53 with the identified words and phrases. This product information category is a feature / evaluation information column, or a category classification or category division that assigns meaning to the identified words and phrases contained in the feature / evaluation information column. 
That is, the generation unit 52 can associate the feature / evaluation information column with the identified words and phrases regardless of the meaning of the identified words and phrases, or it may associate the feature / evaluation information column corresponding to the meaning of the identified words and phrases with the identified words and phrases, or it may associate the semantic content of further identified words and phrases within the feature / evaluation information column with the identified words and phrases. In the example shown in Figure 13, as will be described later, the category estimation unit 53 generates a raw material column from "Mandheling" and "corn," and associates these words and phrases with the column, and it may also generate an impression column from "refreshing" and "crispy," and associate these words and phrases with the column. In this case, the feature / evaluation information column C16 includes raw material columns containing "Mandheling" and "corn," as well as impression columns containing "refreshing" and "crispy," etc. 【0117】 The category estimation unit 53 estimates or generates product information categories corresponding to the identified words and phrases. This estimation or generation can be performed using a dictionary 53a, a natural language processing library 53b, AI, or two or more of these. In other words, the category estimation unit 53 assigns meaning to the identified words and phrases. Assigning meaning means estimating or generating product information categories (columns) corresponding to the words and phrases. 【0118】 In one example, the category estimation unit 53 determines the part of speech of a word or phrase and associates the determined part of speech with the word or phrase. Parts of speech include nouns, adjectives, ideograms, adverbs, etc. An ideogram is a word that describes the shape of a product. 
A dictionary 53a, a natural language processing library 53b, AI, or two or more of these can be used to determine the part of speech. The category estimation unit 53 estimates and generates the determined part of speech as a category of product information, or generates a product information category corresponding to the determined part of speech. 【0119】 In another example, the category estimation unit 53 determines the meaning of a word or phrase and estimates or generates a product information category that encompasses that meaning. That is, the category estimation unit 53 estimates or generates one of the product information categories (i.e., columns) from the meaning of a word or phrase. A dictionary 53a, a natural language processing library 53b, AI, or two or more of these can be used for this determination. 【0120】 Specifically, the category estimation unit 53 estimates or generates a product information category that corresponds to the subjective meaning of a product when the meaning of a word or phrase corresponds to the subjective meaning of the product. In one example, the category estimation unit 53 determines whether a word or phrase corresponds to one of the product information categories (i.e., columns) such as impression, atmosphere, taste, texture, quality, and use. The product information categories corresponding to the determined words or phrases are not limited to these and may be arbitrarily estimated or generated to suit the product. For example, words and phrases such as "refreshing," "chewy," and "fluffy" would correspond to the impression column. The product information categories to be semantized are not limited to subjective ones, but may also be objective ones such as the raw materials and materials of the product. A dictionary 53a, a natural language processing library 53b, AI, or two or more of these can be used for estimation or generation. 
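A dictionary-only version of this category estimation can be sketched as below. The table `CATEGORY_DICTIONARY` is a hypothetical stand-in for dictionary 53a; its entries follow the examples in the text ("refreshing", "crispy", "chewy", "fluffy" as impressions, "Mandheling" and "corn" as raw materials). Words without an entry fall back to the generic feature / evaluation column, which is an assumption of this sketch.

```python
# Hypothetical stand-in for dictionary 53a: category -> known words.
CATEGORY_DICTIONARY = {
    "impression": {"refreshing", "crispy", "chewy", "fluffy"},
    "raw_material": {"mandheling", "corn"},
}


def estimate_categories(words):
    """Group identified words under the product information category they match."""
    columns = {}
    for word in words:
        category = next(
            (
                name
                for name, vocabulary in CATEGORY_DICTIONARY.items()
                if word.lower() in vocabulary
            ),
            "feature_evaluation",  # fallback: generic column
        )
        columns.setdefault(category, []).append(word)
    return columns


columns = estimate_categories(["Black", "Mandheling", "Refreshing", "Fruit"])
```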
【0121】 In yet another example, the category estimation unit 53 generates combinations of words or phrases that have a dependency relationship. Dependency refers to a relationship in which different words or phrases are semantically connected, such as subject and predicate, modifier and modified word, or a referring expression and its referent. Examples of dependencies include "vivid_color" and "stylish_atmosphere," but are not limited to these; they can be chosen to suit the product based on information that indicates the product's characteristics / evaluation. A dictionary 53a, a natural language processing library 53b, AI, or two or more of these can be used to generate the combinations. 【0122】 Subjective feature / evaluation information such as impressions, and dependency-based feature / evaluation information, are highly likely to be entered as search keywords, making them valuable when searching a database of product master data. For example, they are highly valuable as metadata for e-commerce sites. 【0123】 The feature / evaluation information generation unit 50 generates a single piece of feature / evaluation information that combines all the generated feature / evaluation information. In this specification, this feature / evaluation information is referred to as overall feature / evaluation information, and the feature / evaluation information relating to the above words, phrases, or combinations of words and phrases may be referred to as individual feature / evaluation information. The overall feature / evaluation information is a string formed by concatenating the words and phrases of all the individual feature / evaluation information, and is associated with the overall feature / evaluation information column, which is one of the feature / evaluation information columns generated by the generation unit 52. 
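The relationship between individual and overall feature / evaluation information can be shown with a small sketch. The underscore join mirrors the "vivid_color" example in the text. The single-space separator used for concatenation and both function names are assumptions, since the specification does not state how the strings are joined.

```python
def dependency_tag(modifier, head):
    """Join two dependent words into one tag, as in "vivid_color"."""
    return f"{modifier}_{head}"


def overall_feature_info(individual):
    """Concatenate all individual feature / evaluation strings into one string."""
    return " ".join(individual)


# Individual feature / evaluation information for "DEF Sports Drink" (Figure 13).
individual = ["Sports", "Large Capacity", "Grapefruit Flavor"]
overall = overall_feature_info(individual)
tag = dependency_tag("vivid", "color")
```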
【0124】 (Feature / Evaluation Information) Feature / evaluation information is a string of characters that indicates the characteristics of a product, and is also called a meta tag. In this embodiment, feature / evaluation information is a word, a phrase, or a combination of words or phrases with a dependency relationship that indicates the characteristics of the product. Feature / evaluation information is associated with the corresponding product information category. The information addition unit 70 associates the feature / evaluation information with the corresponding product in the product matching master. Feature / evaluation information is one type of additional information added to the product matching master. 【0125】 (Product graph generation unit) Figure 15 shows an example of the detailed configuration of the product graph generation unit. The product graph generation unit 60 generates a product graph that shows the relationships between products based on feature / evaluation information. Specifically, the product graph generation unit 60 includes a vector information calculation unit 61, a distance calculation unit 62, and a graph generation unit 63. 【0126】 The vector information calculation unit 61 calculates vector information based on feature / evaluation information. For example, the vector information calculation unit 61 converts the string of feature / evaluation information into vector information. For this calculation (conversion) of vector information, known tools that convert strings into vectors (N-dimensional numerical values) can be used. For example, BERT provided by Google and fastText provided by Facebook can be used, but the tools are not limited to these. 【0127】 The vector information calculation unit 61 calculates vector information for all feature / evaluation information, that is, for all individual feature / evaluation information and overall feature / evaluation information. 
In this specification, the vector information for individual feature / evaluation information may be referred to as individual vector information, and the vector information for overall feature / evaluation information may be referred to as overall vector information. Each calculated vector information is stored in the storage device 1b in association with the corresponding product and the matching product master. This association can be performed, for example, by the information addition unit 70. 【0128】 The distance calculation unit 62 calculates the distance between products based on the vector information. This distance can be, for example, the dot product of the vector information or the Euclidean distance. 【0129】 The distance between products can be broadly divided into the distance between individual vector information (also called the "distance between feature / evaluation information") and the distance between overall vector information (also called the "distance between products"). The distance between feature / evaluation information includes the distance between individual vector information of different products within the same product category, and the distance between individual vector information of different products within different product categories. The distance between products includes the distance between overall vector information of different products within the same product category, and the distance between overall vector information of different products within different product categories. 【0130】 The graph generation unit 63 (product graph generation unit 60) generates product graphs for the same product category and / or product graphs for multiple product categories. This generation is based on the distance calculated by the distance calculation unit 62. For example, the graph generation unit 63 generates a product graph that includes products whose calculated distance is within a predetermined distance. 
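The distance calculation and the "within a predetermined distance" rule can be sketched as a simple adjacency map. The sketch assumes Euclidean distance and small hand-written two-dimensional vectors. The product names, vector values, and the threshold of 0.5 are hypothetical, and real vectors would come from the vector information calculation unit 61.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def build_product_graph(vectors, max_distance):
    """Link every pair of products whose distance is within max_distance."""
    graph = {name: set() for name in vectors}
    for a, b in combinations(vectors, 2):
        if euclidean(vectors[a], vectors[b]) <= max_distance:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical overall vector information for three products.
vectors = {
    "pudding A": [0.9, 0.1],
    "pudding B": [0.8, 0.2],
    "cake C": [0.1, 0.9],
}

graph = build_product_graph(vectors, max_distance=0.5)
# The two puddings are linked to each other; the cake stays isolated.
```

Comparing every pair is quadratic in the number of products, so this form is only suitable for small product sets.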
Product graphs can be generated using known methods such as social graph creation methods. 【0131】 Figure 16 shows an example of a product graph. The product graph in Figure 16 is a product graph for convenience store sweets from companies A, B, and C. In other words, this product graph was obtained by extracting products from a product master where the manufacturer name is "Company A," "Company B," or "Company C," and which have a "Convenience Store Sweets" characteristics / evaluation information column, and plotting them on the graph. In the product graph, cohesive areas are enclosed in circles, and each area is labeled with a word that represents the characteristics / evaluation information, as shown in Figure 16. For example, the area of a product with material characteristics / evaluation information is labeled with the material (e.g., "strawberry," "blueberry," etc.), the area of a product with product category characteristics / evaluation information is labeled with the category (e.g., "cake," "pudding," etc.), and the area of a product with impression characteristics / evaluation information is labeled with the impression (e.g., "smooth," "chewy," "moist," etc.). This product graph allows us to understand the competitive relationships between each company. 【0132】 (Product graph) Product graphs for the same product category include: (1) a product graph showing the relationships between products within the same product category based on the distance between single individual features / evaluation information; (2) a product graph showing the relationships between products within the same product category based on the distance between multiple individual features / evaluation information; and (3) a product graph showing the relationships between products within the same product category based on the distance between overall features / evaluation information. 
Product graph (1) above is a graph (map) for products that share a product category and a single individual feature / evaluation information. Product graph (2) above is a graph (map) for products that share a product category and multiple individual features / evaluation information. Product graph (3) above is a graph (map) for products that share a product category. 【0133】 Product graphs across multiple product categories include: (4) a product graph showing the relationships between products within multiple product categories based on the distance between single individual features / evaluation information; (5) a product graph showing the relationships between products within multiple product categories based on the distance between multiple individual features / evaluation information; and (6) a product graph showing the relationships between products within multiple product categories based on the distance between overall features / evaluation information. The product graph in (4) above is a graph (map) for products that share a single individual feature / evaluation information. The product graph in (5) above is a graph (map) for products that share multiple individual features / evaluation information. The product graph in (6) above is a graph (map) for products across multiple product categories. 【0134】 The product graphs (1) to (6) above can provide relationships between products from different perspectives based on the type and number of features / evaluation information and the number of product categories, making it easier to give users insights when analyzing the relationships between product groups. 【0135】 (Information addition unit) The information addition unit 70 associates characteristic / evaluation information related to the same product with that product. 
Specifically, the information addition unit 70 stores the characteristic / evaluation information associated with the product in the storage device 1b, in association with the product matching master. 【0136】 The information addition unit 70 stores the product graph generated by the product graph generation unit 60 in the storage device 1b in association with the corresponding product and the matching product master. The information addition unit 70 may also store the vector information calculated by the vector information calculation unit 61 in the storage device 1b in association with the corresponding product and the matching product master. The vector information calculated by the vector information calculation unit 61 is individual vector information and / or overall vector information, and is also referred to as graph vector information. 【0137】 Feature / evaluation information, product graphs, and graph vector information are included in the additional information added to the matching product master. Product graphs and graph vector information are included in the product graph information. When the information addition unit 70 associates feature / evaluation information and / or product graph information with the matching product master, it associates at least one of the feature / evaluation information, product graphs, and graph vector information with the matching product master. 【0138】 The product matching master only needs to store the product name, specification information, feature / evaluation information, and product graph information associated with each product in the storage device 1b. 
Based on this association, the product matching system 1 may also include a relational database having a product matching master database generated by the integrated generation unit 40, a feature / evaluation information database that collects feature / evaluation information for each product, and a product graph information database that collects product graphs and / or graph vector information for each product. 【0139】 [2. Operation] [2-1. Overall Operation] Figure 17 is an example of an operation flowchart of the product name matching system of this embodiment. First, the product name matching system 1 acquires two or more product data sets using the product data acquisition unit 20 (S01: Acquisition of product data). Here, the product data is assumed to be two or more product data sets (HTML data) acquired from one or more e-commerce sites, but as mentioned above, it is not limited to this, and a product master used by the organization may also be acquired. 【0140】 Next, the product name matching system 1 analyzes multiple product data using the identical product determination unit 30 and identifies multiple product data related to the same product (S02: Identification of product data related to the same product). Specifically, the identical product determination unit 30 determines whether two or more acquired product data relate to the same product. If the identical product determination unit 30 determines that two or more acquired product data do not relate to the same product, it returns to S01. If the identical product determination unit 30 determines that two or more acquired product data relate to the same product, it proceeds to the next S03. 
【0141】 If it is determined that two or more acquired product data pertain to the same product, the integration generation unit 40 analyzes the two or more product data and classifies the product information contained in the product data according to the category of the product information (S03: Category classification of product information). In one example, the integration generation unit 40 estimates, classifies, and separates the categories of product information using AI 41. Then, the integration generation unit 40 integrates the product information contained in two or more product data pertaining to the same product according to the category of the product information to generate a matching product master (S04: Generation of matching product master). As a result, product data pertaining to the same product is integrated, eliminating the need to manually enter the product data of one product into the other's product data. 【0142】 Furthermore, in S04, the integration generation unit 40 eliminates duplication of product information for each divided category. This is because identical information is unnecessary in the same product information category. Also in S04, the integration generation unit 40 normalizes the representation of product information for each divided category. That is, the integration generation unit 40 uses the normalization dictionary 42 and the normalization AI 43 to unify the units included in the product information within a category and to unify the variations in words and phrases that constitute the product information. Which units, words, or phrases to unify to may be defined in the normalization dictionary 42 or determined by the normalization AI 43. 【0143】 The feature / evaluation information generation unit 50 generates product feature / evaluation information from data indicating the product's features included in the product data (S05: Generation of feature / evaluation information). 
In one example, the identification unit 51 identifies and extracts product description texts that are included in the parts or ranges with predetermined tags, from the text data indicating the product's features included in each product data. The identification unit 51 then uses AI to perform natural language analysis on the extracted product description texts to break them down into words and phrases, and identifies words and phrases that indicate product features. The generation unit 52 generates feature / evaluation information associated with these words. More specifically, the category estimation unit 53 may estimate the product information category corresponding to the identified words and phrases, and the generation unit 52 may generate feature / evaluation information associated with the words, phrases and product information categories. 【0144】 The product graph generation unit 60 generates a product graph based on the generated feature / evaluation information (S06: Product Graph Generation). Specifically, the product graph generation unit 60 calculates vector information (i.e., graph vector information) based on the feature / evaluation information using the vector information calculation unit 61, and calculates the distance between products using the distance calculation unit 62. Then, the graph generation unit 63 generates product graphs for the same product category and / or product graphs between multiple product categories based on the calculated distances. 【0145】 The information addition unit 70 associates the generated feature / evaluation information and / or product graph information with the product corresponding to the feature / evaluation information (S07: Addition of feature / evaluation information and / or product graph information). 
As a result, the feature / evaluation information and / or product graph information, which are additional information, are added to the matching product master, and comprehensive information can be obtained for each product from the matching product master. 【0146】 In the above example, product graph information was associated with the product matching master, but this association is not always necessary. Alternatively, instead of the product graph, the vector information calculated by the vector information calculation unit 61 may be associated with the corresponding product and the product matching master by the information addition unit 70. 【0147】 [2-2. Product Master Data Generation Process] Figure 18 shows an example of an operation flowchart of the product master generation device for product matching according to this embodiment. Here, the product master generation device 10 is configured to include a same product determination unit 30 and an integrated generation unit 40, and product data obtained by the product data acquisition unit 20, etc., is input to the product master generation device 10. Furthermore, the product data is to be obtained from one or more e-commerce sites, consisting of two or more product data (HTML data), but as described above, it is not limited to this, and product masters used by the organization may also be obtained. 【0148】 First, the pattern generation unit 31 generates multiple product identification latent patterns for each of the multiple product data based on the product identification latent information (S21: Generation of multiple product identification latent patterns). Specifically, the pattern generation unit 31 identifies and extracts the product identification latent information contained in the product data using the program 311, AI, a trained model, or a combination thereof (S211: Identification and extraction of product identification latent information). 
Then, the pattern generation unit 31 identifies one or more unit information contained in the product identification latent information using the program 311, AI, a trained model, or a combination thereof (S212: Identification of unit information). Furthermore, the pattern generation unit 31 excludes and / or transforms predetermined unit information contained in the product identification latent information based on the dictionary 313 or AI (S213: Exclusion and / or transformation of predetermined unit information). The predetermined unit information to be excluded is, for example, surplus information, and the unit information to be transformed can be, for example, the manufacturer name or product name. The conversion capabilities include converting from half-width characters to full-width characters, from full-width characters to half-width characters, converting between English and Japanese, and converting abbreviations and common names of manufacturer names and product names to their official names. 【0149】 In this way, after excluding and / or converting predetermined unit information, multiple product identification latent patterns are generated based on the converted unit information and other unit information included in the product identification latent information (S214: generation of multiple product identification latent patterns). The product identification latent patterns can be determined based on the granularity of product identity requested by the user. For example, if the granularity of identity is at the product name (brand name) level, that is, if products are determined to be the same product if the product names are the same, the pattern generation unit 31 only needs to generate at least one of the 1st to 6th patterns. 
If the granularity of identity is at the product identification code level, such as a JAN code including the sales format, that is, if products are determined to be the same product if the product name, manufacturer name, and various specification information are the same, the pattern generation unit 31 generates at least the 1st to 7th patterns. To improve the accuracy of identity determination, it is good to generate multiple 7th patterns related to specification information. 【0150】 Next, the vector information generation unit 32 generates vector information for each product identification latent pattern for each product data (S22: Vector information generation). Specifically, the vector information generation unit 32 uses a known tool to convert the string of the product identification latent pattern into vector information, which is a collection of N-dimensional numbers. 【0151】 The similarity calculation unit 33 calculates the similarity between two products for each product identification latent pattern based on the vector information of the product data of the two products (S23: Similarity calculation). The similarity here is cosine similarity, where a similarity closer to 0 indicates less similarity, and a similarity closer to 1 indicates greater similarity. 【0152】 The determination unit 34 determines whether two products are identical based on their similarity (S24: Are they identical products?). The determination unit 34 determines whether the products are identical based on whether the similarity of each pattern is above a predetermined threshold. The predetermined threshold and the number and types of patterns to be determined as identical products can be determined according to the level of granularity of product identity desired by the user. 
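Steps S22 to S24 can be sketched end to end as follows. Character bigram counts are used here purely as a stand-in for a known string-to-vector tool such as BERT or fastText, so the scores are illustrative only. The pattern strings, the function names, and the `require_all` switch (all patterns versus at least one pattern at or above the threshold, depending on the desired granularity) are assumptions of this sketch.

```python
import math
from collections import Counter


def to_vector(text, n=2):
    """Character n-gram counts; a stand-in for a learned embedding."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine_similarity(u, v):
    """Cosine similarity of two sparse count vectors (0 = unrelated, 1 = same)."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def is_same_product(patterns_a, patterns_b, threshold=0.7, require_all=True):
    """Compare two products pattern by pattern against the threshold."""
    scores = [
        cosine_similarity(to_vector(a), to_vector(b))
        for a, b in zip(patterns_a, patterns_b)
    ]
    passed = [score >= threshold for score in scores]
    return all(passed) if require_all else any(passed)


# Hypothetical patterns: a name pattern and a container-type pattern.
product_a = ["ABC Shampoo", "bottle"]
product_b = ["ABC Shampoo", "refill"]

same_at_code_level = is_same_product(product_a, product_b, require_all=True)
same_at_name_level = is_same_product(product_a, product_b, require_all=False)
```

Because count vectors have no negative components, the similarity stays between 0 and 1, matching the range described for S23.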
【0153】 In one example, if products are determined to be the same if their product names are identical, they are determined to be the same product if the similarity of at least one of the first to sixth patterns is equal to or greater than a predetermined threshold (e.g., 0.7, 0.8, or 0.9), and they are determined to be different products if the similarity of all patterns falls below the predetermined threshold. 【0154】 In another example, if products are determined to be the same product if the product name, manufacturer name, and various specifications are identical, they are determined to be the same product if the similarity of all seven patterns (Pattern 1 to Pattern 7) is equal to or greater than a predetermined threshold (e.g., 0.7, 0.8, or 0.9). If any of the patterns falls below the predetermined threshold, they are determined to be different products. 【0155】 If the determination unit 34 determines that the products are not the same (NO in S24), the process returns to the input of product data prior to S21 (for example, S01). If the determination unit 34 determines that the products are the same (YES in S24), the integration generation unit 40 classifies the product information contained in the two product data determined to be the same by the determination unit 34 according to the category of the product information (S03), and integrates them to generate a unified product master (S04). S03 and S04 are the same as in Figure 17, so their explanation is omitted. Note that in S04, the integration generation unit 40 may eliminate duplication of product information for each classified category and / or normalize the representation of product information for each classified category. 【0156】 [3. 
Action / Effect] (1) The product matching master generation device 10 of this embodiment includes: a pattern generation unit 31 that generates a plurality of product identification latent patterns for each of a plurality of product data based on product identification latent information including at least the product name contained in the product data; a vector information generation unit 32 that generates vector information for each product identification latent pattern for each product data; a similarity calculation unit 33 that calculates the similarity of two products for each product identification latent pattern based on the vector information of the product data of the two products; a determination unit 34 that determines whether the two products are the same based on the similarity; and an integration generation unit 40 that integrates the product information contained in the two product data determined to be the same by the determination unit 34 for each category of the product information and generates a product matching master which is the product master of the same product. 【0157】 This allows for the automatic consolidation of multiple product data sets, saving the time and effort required to integrate product data between different organizations and facilitating the transmission of product information. 【0158】 (2) The pattern generation unit 31 identifies one or more unit information contained in the product identification latent information and generates multiple product identification latent patterns based on said unit information. As a result, multiple product identification latent patterns are generated based on the smallest unit of information contained in the product identification latent information, so it is possible to make a multifaceted determination of whether or not they are the same product from multiple perspectives, and the accuracy of the determination of whether or not they are the same product can be improved. 
In other words, since a number of product identification latent patterns can be generated in accordance with the actual handling of the products, the accuracy of the determination can be improved. 【0159】 (3) The pattern generation unit 31 excludes or converts predetermined unit information included in the product identification latent information, and generates a product identification latent pattern based on the converted unit information and other unit information included in the product identification latent information after exclusion. This makes it possible to exclude unit information included in the product identification latent information that does not affect the determination of identical products, or to convert variations in notation of things that do affect the determination of identical products, thereby improving the accuracy of determining whether or not they are identical products. 【0160】 (4) The pattern generation unit 31 converts the abbreviated name of a product, which is unit information included in the latent product identification information, into the official name of the product. This improves the accuracy of determining whether or not they are the same product. In particular, certain businesses such as wholesalers have a practice of entering the abbreviated name of a product, rather than the official name, into the product master for a product, and identifying products by the abbreviated name. When a product master, which is product data from an industry with such a practice, is obtained, generating a latent product identification pattern based on the abbreviated name of the product may reduce the accuracy of determining whether or not they are the same product. In contrast, in this embodiment, the abbreviated name of a product is converted into the official name of the product, thereby improving the accuracy of the determination. 
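The exclusion and conversion of unit information described in items (3) and (4) can be sketched roughly as follows. The dictionary contents, the names `EXCLUSION_DICTIONARY`, `CONVERSION_DICTIONARY`, and `normalize_units`, and the use of Unicode NFKC normalization as a stand-in for the half-width / full-width conversion are all assumptions made for illustration; the embodiment itself refers only to the dictionary 313 or AI.

```python
import unicodedata

# Hypothetical exclusion dictionary: surplus terms that do not affect identity.
EXCLUSION_DICTIONARY = {"sale", "new", "limited"}

# Hypothetical conversion dictionary: abbreviation or common name -> official name.
CONVERSION_DICTIONARY = {
    "acme co": "acme corporation",
    "grn tea": "green tea",
}


def normalize_units(units):
    """Exclude surplus unit information and convert notation variants."""
    result = []
    for unit in units:
        # NFKC folds full-width alphanumerics and spaces to their half-width
        # forms (and half-width katakana to full-width), unifying the notation.
        text = unicodedata.normalize("NFKC", unit).strip().lower()
        if not text or text in EXCLUSION_DICTIONARY:
            continue  # excluded: does not affect the identity determination
        # Replace an abbreviation with its official name when one is known.
        result.append(CONVERSION_DICTIONARY.get(text, text))
    return result
```

For example, `normalize_units(["ＡＣＭＥ　Ｃｏ", "grn tea", "SALE"])` drops the surplus term and maps the two remaining entries to their official names, so that records written with abbreviations by a wholesaler and with official names by a manufacturer produce the same strings before vectorization.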
【0161】 (5) The product identification latent information further includes product specification information, and the pattern generation unit 31 generates a product identification latent pattern that further includes the specification information. This improves the accuracy of determining identical products. For example, even for bottled tea from the same manufacturer and with the same product name, there are various sales formats, such as selling 500ml bottles individually, selling in boxes of 12 bottles, or selling 2L bottles individually. Each product is assigned a product identification code such as a JAN code that matches its sales format. In other words, even if the brand name is the same, different sales formats result in different products being identified. Even in such cases, it is possible to determine whether they are identical products with the accuracy of the product identification code. In other words, it is possible to determine identical products with finer granularity than determining identical products based on the manufacturer or brand name, regardless of specification information such as capacity, quantity, weight, size, and container type, thereby improving the accuracy of the determination. 【0162】 (6) The integrated generation unit 40 uses the specification information as product information, classifies and integrates the product information by category, and generates a matching product master. This makes it possible to consolidate all specification information into the matching product master. 
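The role of specification information in item (5) can be illustrated with a small sketch. It assumes the unit information has already been identified and split into a manufacturer name, a product name, and specification information; the `UnitInfo` type, the `generate_patterns` function, and the exact composition of each pattern are simplifications introduced here (only a subset of the patterns is shown, labeled P1, P5, P6, and P7 after the symbol list).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitInfo:
    """Hypothetical container for already-identified unit information."""
    manufacturer: str
    product_name: str
    specification: str


def generate_patterns(info):
    """Build several product identification latent patterns from one record."""
    return {
        # Manufacturer and product name, with specification information excluded.
        "P1": f"{info.manufacturer} {info.product_name}".strip(),
        # Manufacturer name only.
        "P5": info.manufacturer,
        # Product name only.
        "P6": info.product_name,
        # Specification information only (capacity, quantity, and so on).
        "P7": info.specification,
    }


# Bottled tea with the same manufacturer and product name but different sales formats.
single_bottle = generate_patterns(UnitInfo("Acme", "Green Tea", "500ml x 1"))
large_bottle = generate_patterns(UnitInfo("Acme", "Green Tea", "2L x 1"))
```

In this sketch the two records share the same P1, P5, and P6 strings and differ only in P7, which is why a determination that also requires the specification pattern to match can separate them at the granularity of a product identification code.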
【0163】 (7) Multiple product identification latent patterns include at least one of the following: a first pattern in which predetermined unit information is excluded or converted from the product identification latent information and specification information included in the product identification latent information is excluded; a second pattern in which the first pattern is converted into another notation format; a third pattern in which the manufacturer name included in the product identification latent information is excluded from the first pattern; a fourth pattern in which predetermined unit information is excluded or converted from the product identification latent information and numerical information included in the product identification latent information is excluded; a fifth pattern containing only the manufacturer name included in the product identification latent information; a sixth pattern containing only the product name included in the product identification latent information; and a seventh pattern containing only the product specification information included in the product identification latent information. This makes it possible to improve the accuracy of identifying identical products in response to the product information transmission methods of multiple organizations. 【0164】 (8) The integration generation unit 40 eliminates duplication of product information for each category during integration. This makes the product master data easier to use. 【0165】 (9) The integration generation unit 40 normalizes the representation of product information for each category during integration. This makes the matching product master easier to use. 【0166】 (10) Of the product data for two products, one is the standard product data that serves as the criterion for determining whether the products are the same, and the other is the undetermined product data that has not been determined by the determination unit 34. 
If the determination unit 34 determines that the product indicated by the undetermined product data and the product indicated by the standard product data are the same product, the undetermined product data is added to the standard product data. This allows training data (correct answer data) to accumulate, which can be used to train learning models, for example the models used to exclude and convert unit information. 【0167】 (11) The product name matching system 1 of this embodiment includes: a product data acquisition unit 20 that acquires multiple product data; a same product determination unit 30 that analyzes the multiple product data and identifies the product data relating to the same product; an integration generation unit 40 that analyzes the multiple product data, classifies the product information contained in the product data according to the category of the product information, and, for each product, integrates the product information contained in the multiple product data relating to the same product according to the category of the product information to generate a product matching master; a feature / evaluation information generation unit 50 that generates product feature / evaluation information from data indicating the characteristics of the product contained in the product data; and an information addition unit 70 that associates additional information, including the feature / evaluation information relating to the same product, with that product. 【0168】 This reduces the time and effort required for transmitting product information and preparing data, and allows the product-related information needed for marketing analysis such as demand forecasting, product development, or product recommendations to be provided through the product information or the additional information.
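The handling of standard and undetermined product data in item (10) can be sketched as below. This is a minimal sketch under stated assumptions: standard product data is held as a list of groups, each group containing the records already accepted as one product, and `is_same` is any caller-supplied determination function. The name `match_and_promote` and this data layout are hypothetical.

```python
def match_and_promote(undetermined, standard_groups, is_same):
    """Compare an undetermined record with the standard product data.

    standard_groups: list of lists; each inner list holds the standard
        product data accepted so far for one product.
    is_same: function (record_a, record_b) -> bool.
    Returns the group the record was added to, or None if nothing matched.
    """
    for group in standard_groups:
        if any(is_same(undetermined, standard) for standard in group):
            # The undetermined data becomes one of the standard product data,
            # so later records can also be compared against it.
            group.append(undetermined)
            return group
    return None
```

Because each accepted record is kept, the groups also serve as accumulated correct-answer data of the kind the text mentions.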
【0169】 (12) The feature / evaluation information generation unit 50 includes an identification unit 51 that identifies words that represent the features of a product from text data that represents the features of a product included in the product data, and a generation unit 52 that generates feature / evaluation information in association with the words. 【0170】 This allows for the storage of product characteristics from various perspectives or angles as data associated with the product master, enabling the provision of product-related information from these perspectives or angles. For example, it can improve product searchability and provide product-related information necessary for marketing analysis, product development, or product recommendations. In one example, instead of limiting the product group to only wine, it is possible to extract products from other categories such as wine, sake, and cocktails, enabling broader analysis and recommendations. In another example, in an e-commerce site using the product master, product searchability can be improved by using characteristic / evaluation information as search tags. 【0171】 (13) The feature / evaluation information generation unit 50 has a category estimation unit 53 that estimates the category of product information corresponding to a word, and the generation unit 52 generates feature / evaluation information in association with the word and category. 【0172】 This makes it possible to classify the categories of words identified by the identification unit 51. For example, if a product is a beverage with a refreshing taste, the word "refreshing" can be classified into categories such as impression or flavor. Furthermore, these categories can be hierarchically structured according to the scope of their concepts. 
As a result, it becomes possible to search the database of the product master using characteristic / evaluation information of various granularities as search keys, thereby improving search accuracy and enabling the generation of product graphs from various angles, which can lead to new insights for the user. 【0173】 (14) The feature / evaluation information is information that falls under at least one of the following product information categories: impression, atmosphere, taste, texture, quality, and use. This allows the product features to be presented from a subjective or sensory perspective, providing hints for marketing analysis, product development, or product recommendations. 【0174】 (15) The system includes a product graph generation unit 60 that generates a product graph showing the relationships between products based on feature / evaluation information, and the additional information includes a product graph or graph vector information for generating a product graph. This allows the user to see the relationships between products, which can be used for marketing analysis such as demand forecasting, product development, or product recommendations, and can give the user new insights into products. 【0175】 (16) The product graph generation unit 60 is configured to generate product graphs for products in the same product category. This makes it possible to show the relationships between products in the same product category, making it easier to analyze groups of products in the same product category. 【0176】 (17) The product graph generation unit 60 is configured to generate product graphs between multiple product categories. This allows for the presentation of cross-category relationships between products that are not limited to specific product categories, thereby providing users with new insights that cannot be obtained within the same product category. 
For example, even if there are multiple product categories, products with the same taste and texture are presented in the product graph, which can provide retailers with hints for shelf layout and product ordering, and manufacturers with hints for product development. Online retailers can also recommend products with a consistent taste and texture. 【0177】 (18) The product graph generation unit 60 includes a vector information calculation unit 61 that calculates vector information based on feature / evaluation information, and a distance calculation unit 62 that calculates the distance between products based on the vector information. This makes it possible to prepare the materials for generating various product graphs. 【0178】 (19) Any product data in any of the multiple product data sets includes at least one of the following: product name, specifications, product logistics information, transaction information, customer information, and purchase information. This allows for automatic linking of product information and additional information with at least one of the logistics information, transaction information, customer information, and purchase information for the same product, thus eliminating the effort required to acquire and input information. Furthermore, by including product logistics information, transaction information, customer information, or purchase information in the matching product master, more advanced marketing analysis, product development, or product recommendations can be achieved. 【0179】 For example, if product data includes purchase information, it is possible to identify best-selling products from the purchase information and then analyze whether the reason for the product's popularity lies in the product information or additional information contained in the product master. 
Based on the results of this analysis, retailers can be provided with hints for shelf allocation and product ordering, and manufacturers can be provided with hints for product development. In one example, POS data containing product names and sales data for those products is known as purchase information, but POS data does not include product information such as product specifications. This requires the effort of collecting product information and additional information and entering it into the product master. However, according to this embodiment, this collection and effort can be omitted. Furthermore, sales trends can be grasped from the purchase information, and since the product master is linked to this purchase information with product information and additional information from various perspectives, it is possible to analyze the factors behind the popularity of best-selling products based on the commonalities in their product information and additional information. 【0180】 Furthermore, by including product name, specifications, logistics information, transaction information, customer information, and purchasing information, it becomes possible to integrate all information related to the product, from material procurement to manufacturing and distribution. 【0181】 (20) Multiple product data sets include product data from two or more different organizations. This allows for the creation of a unified product master across different organizations, enabling the quick and appropriate sharing of product information between different organizations without the need for manual entry of product information and additional information. 【0182】 [4. Other Embodiments] In other embodiments of the present invention, the present invention may also include a program that implements the functions and information processing shown in the flowchart of the embodiments of the present invention described above, or a computer-readable storage medium that stores the program. 
In yet another embodiment, the present invention may also include a method for implementing the functions and information processing shown in the flowchart of the embodiments of the present invention described above. In yet another embodiment, the present invention may also include a server that can supply a computer with a program that implements the functions and information processing shown in the flowchart of the embodiments of the present invention described above. In yet another embodiment, the present invention may also include a virtual machine that implements the functions and information processing shown in the flowchart of the embodiments of the present invention described above. 【0183】 The processes or operations described above can be freely modified, as long as no inconsistencies arise in them, such as using data that should not yet be available at a given step. Furthermore, the embodiments described above are illustrative examples for explaining the present invention, and the present invention is not limited to these embodiments. The present invention can be implemented in various forms without departing from its essence. 【0184】 In the above embodiment, data and information related to products are handled, but services may be handled instead of products. For example, service data related to services may be used instead of product data. In this case, the functions of each part of the system 1 and the device 10 can be replaced with functions related to services rather than products.

[Explanation of symbols] 【0185】
1 Product name matching system
1a Processor
1b Storage device
1b-1 RAM
1b-2 ROM
1b-3 External memory
1c Input device
1d Output device
1e Communication device
1F Bus
10 Product matching master generation device
20 Product data acquisition unit
30 Same product determination unit
31 Pattern generation unit
311 Program
313 Dictionary
313a Exclusion dictionary
313b Conversion dictionary
32 Vector information generation unit
33 Similarity calculation unit
34 Determination unit
40 Integration generation unit
41 AI
42 Normalization dictionary
43 AI for normalization
50 Feature / evaluation information generation unit
51 Identification unit
52 Generation unit
53 Category estimation unit
53a Dictionary
53b Natural language processing library
60 Product graph generation unit
61 Vector information calculation unit
62 Distance calculation unit
63 Graph generation unit
70 Information addition unit
C10 to C17 Product information categories (columns)
G0 Product image
G1 Product identification latent information
G2 Specifications
G3 Product feature / evaluation information
M Product matching master
P1 to P7 Patterns 1 to 7
V1 to V7 Vector information for patterns 1 to 7

Claims

[Claim 1] A product matching master generation device comprising: a pattern generation unit that generates a plurality of product identification latent patterns for each of a plurality of product data, based on product identification latent information that includes at least the product name contained in the product data; a vector information generation unit that generates, for each of the product data, vector information for each of the product identification latent patterns; a similarity calculation unit that calculates, for each of the product identification latent patterns, the similarity between two products based on the vector information of the product data of the two products; a determination unit that determines whether the two products are identical based on the similarity; and an integration generation unit that integrates, for each category of product information, the product information contained in the product data of the two products that the determination unit has determined to be identical, and generates a product matching master, which is a product master for the identical product, wherein the pattern generation unit identifies a plurality of pieces of unit information contained in the product identification latent information and generates the plurality of product identification latent patterns based on at least two of the plurality of pieces of unit information, each of the product identification latent patterns being a different combination of the unit information.

[Claim 2] The product matching master generation device according to claim 1, wherein the pattern generation unit identifies one or more pieces of unit information included in the product identification latent information and generates the plurality of product identification latent patterns based on the unit information.

[Claim 3] The product matching master generation device according to claim 1 or 2, wherein the pattern generation unit excludes or converts predetermined unit information included in the product identification latent information, and generates the product identification latent pattern based on the converted unit information and the other unit information remaining in the product identification latent information after the exclusion.

[Claim 4] The product matching master generation device according to claim 3, wherein the pattern generation unit converts an abbreviated name of the product, which is unit information included in the product identification latent information, into the official name of the product.

[Claim 5] The product matching master generation device according to any one of claims 1 to 4, wherein the product identification latent information further includes specification information of the product, and the pattern generation unit generates the product identification latent pattern so that it further includes the specification information.

[Claim 6] The product matching master generation device according to claim 5, wherein the integration generation unit treats the specification information as product information, classifies and integrates the product information by category, and generates the product matching master.

[Claim 7] The product matching master generation device according to any one of claims 1 to 6, wherein the plurality of product identification latent patterns include a combination of at least two of the following: a first pattern in which predetermined unit information is excluded or converted from the product identification latent information and specification information included in the product identification latent information is excluded; a second pattern in which the first pattern is converted into another notation format; a third pattern in which the manufacturer name included in the product identification latent information is excluded from the first pattern; a fourth pattern in which predetermined unit information is excluded or converted from the product identification latent information and numerical information included in the product identification latent information is excluded; a fifth pattern consisting only of the manufacturer name included in the product identification latent information; a sixth pattern consisting only of the product name included in the product identification latent information; and a seventh pattern consisting only of the product specification information included in the product identification latent information.

[Claim 8] The product matching master generation device according to any one of claims 1 to 7, wherein the integration generation unit eliminates duplication of the product information for each category during the integration.

[Claim 9] The product matching master generation device according to any one of claims 1 to 8, wherein the integration generation unit normalizes the representation of the product information for each category during the integration.

[Claim 10] The product matching master generation device according to any one of claims 1 to 9, wherein, of the product data for the two products, one is standard product data that serves as the criterion for determining whether the products are identical, and the other is undetermined product data that has not yet been determined by the determination unit, and wherein, if the determination unit determines that the product indicated by the undetermined product data and the product indicated by the standard product data are the same product, the undetermined product data is designated as one of the standard product data.

[Claim 11] A product matching master generation method in which a computer executes: a pattern generation step of generating a plurality of product identification latent patterns for each of a plurality of product data, based on product identification latent information that includes at least the product name contained in the product data; a vector information generation step of generating, for each of the product data, vector information for each of the product identification latent patterns; a similarity calculation step of calculating, for each of the product identification latent patterns, the similarity between two products based on the vector information of the product data of the two products; a determination step of determining whether the two products are identical based on the similarity; and an integration generation step of integrating, for each category of product information, the product information contained in the product data of the two products determined to be identical in the determination step, and generating a product matching master, which is a product master for the identical product, wherein the pattern generation step includes identifying a plurality of pieces of unit information contained in the product identification latent information and generating the plurality of product identification latent patterns based on at least two of the plurality of pieces of unit information, each of the product identification latent patterns being a different combination of the unit information.

[Claim 12] A program that causes a computer to perform the method according to claim 11.