Social risk calculation device, social risk calculation method, and computer program
The system addresses inaccuracies in existing ESG risk evaluations by generating feature vectors to calculate detailed social risks by country, industry, and social issue, ensuring precise ESG risk management and evaluation.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- AIESG INC
- Filing Date
- 2025-09-12
- Publication Date
- 2026-06-18
AI Technical Summary
Existing ESG risk evaluation methods, such as those described in Patent Document 1, face inaccuracies due to differences in ESG risk scores between companies that do and do not make news, lack consideration of sentiment and relevance scores, and cannot perform risk evaluations by country, industry, or social issue.
A system that generates feature vectors from news article data to calculate industry-related, social issue-related, and emotional scores, using pre-trained neural networks to assess social risks, and assigns country labels, enabling detailed social risk assessments by country, industry, and social issue.
Enables accurate and comprehensive social risk assessments by country, industry, and social issue, allowing for precise ESG risk management and evaluation of companies, with the ability to update scores efficiently with new news data.
Abstract
Description
Social Risk Calculation Device, Social Risk Calculation Method, and Computer Program 【0001】 The present invention relates to a social risk calculation device, a social risk calculation method, and a computer program for calculating social risk. 【0002】 In recent years, as an evaluation of corporations such as companies, local governments, and university corporations, it has become common to evaluate corporations from three perspectives of ESG (Environment, Social, Governance) as well as from a financial perspective. For this reason, it has become indispensable for the corporate side to manage ESG-related risks that can affect business entities, and it has become important to grasp and consider social risks due to ESG even in the selection of raw material suppliers and business partners. 【0003】 For example, in Patent Document 1, news articles are collected, classified by date and company, similarity analysis between news articles is performed, clustering is performed on news articles with a similarity equal to or higher than a reference value, and then news articles are classified by each item of E (Environment), S (Social), and G (Governance), and a technique for calculating an ESG corporate evaluation score by calculating ESG risks in cluster units is disclosed. 【0004】 Japanese Patent Publication No. 2021-504789 【0005】 However, in the technique of Patent Document 1, since news on the Internet is collected, there is a problem that there is a difference in ESG risk scores between companies that do not make the news and companies that are covered in the news, and the scores deviate from the actual situation. 【0006】 Further, in Patent Document 1, since only the ESG evaluation of a company is calculated, there is a problem that risk evaluations by country, industry, and social issue cannot be performed. Also, since the sentiment score and relevance score in individual news are not considered, there is a problem that the accuracy in risk evaluation is insufficient. 
【0007】Therefore, in this invention, feature vectors are generated for each piece of collected news article data, and scores such as industry-related scores and social issue-related scores are calculated based on the generated feature vectors of each piece of news article data, thereby enabling social risk assessment by country, industry, and social issue. 【0008】 The present invention provides a social risk calculation device for calculating social risks, comprising: a news article input unit for inputting news article data; a related score calculation unit for calculating an industry related score, which is a related score between the input news article data and an industry, and a social issue related score, which is a related score between the news article data and a social issue; an emotional score calculation unit for calculating the emotional tendency of the content of the news article data as an emotional score; and a social risk calculation unit for calculating a social risk value based on the industry related score, social issue related score, and emotional score of a plurality of news articles. 【0009】 Furthermore, it may also have a country labeling unit that assigns country labels to news article data. 【0010】 The social risk calculation device according to the present invention further includes a news article score storage unit that stores country labels, industry-related scores, social issue-related scores, and sentiment scores assigned to news article data, and the social risk calculation unit may calculate the social risk in a particular social issue of a particular industry in a particular country based on the industry-related score, social issue-related score, and sentiment score calculated from each news article. 
【0011】 Furthermore, the system may also include a news article feature vector generation unit that generates news article feature vectors from the input news article data, an association score calculation unit that calculates an industry association score by comparing the news article feature vectors generated by the news article feature vector generation unit with industry feature vectors and calculating the similarity, a social issue association score by comparing the news article feature vectors with social issue feature vectors and calculating the similarity, and a sentiment score calculation unit that calculates a sentiment score by inputting news article feature vectors, which may be a pre-trained machine learning model. 【0012】 Furthermore, the association score calculation unit includes a feature vector matching unit, which is an association score calculation model based on a pre-trained neural network that has been trained to calculate a high score when news article data is highly related to an industry or social issue by matching the feature vectors of news article data with the feature vectors of an industry or social issue. 【0013】 Furthermore, the system may include a social risk memory unit that stores the social risks calculated by the social risk calculation unit, categorized by industry and social issue. 
【0014】 The present invention provides a social risk calculation method for calculating social risks using a computer, comprising: an input step of inputting news article data; an association score calculation step of calculating an industry association score, which is the association score between the input news article data and industry, and a social issue association score, which is the association score between the news article data and social issues; an emotion score calculation step of calculating the emotional tendency of the content of the news article data as an emotion score; and a social risk calculation step of calculating a social risk value based on the industry association score, social issue association score, and emotion score of multiple news articles. 【0015】 The present invention provides a computer program for causing a computer to function to calculate social risks, which includes an input step of inputting news article data, a related score calculation step of calculating an industry related score, which is a related score between the input news article data and an industry, and a social issue related score, which is a related score between the news article data and a social issue, a sentiment score calculation step of calculating the sentiment tendency of the content of the news article data as a sentiment score, and a social risk calculation step of calculating a social risk value based on the industry related score, social issue related score, and sentiment score of multiple news articles. 【0016】According to this invention, it is possible to collect a large amount of news article data and calculate social risks based on that data. This allows corporations seeking to implement ESG risk management to understand the social risks necessary for ESG risk management, broken down by country, industry, and social issue. 
Therefore, in corporate evaluations, it becomes possible to analyze companies by country, industry, and social issue before conducting a comprehensive evaluation. Furthermore, since scores can be calculated for each social issue by country, it is possible to conduct risk assessments in specific countries. 【0017】 Furthermore, because social risk is calculated based on industry-related scores, social issue-related scores, and sentiment scores calculated for individual news article data, even if new news article data is added, only the score for each individual news article needs to be calculated, allowing for the calculation of social risk with minimal computation. 【0018】 Figure 1 is a block diagram showing an example of the functional configuration of the social risk calculation device 100 in the present invention. Figure 2 is a flowchart showing an example of the process of assigning country labels performed by the country label assignment unit 120. Figure 3 is a diagram showing an example of a flowchart for generating news article feature vectors in the news article feature vector generation unit 130. Figure 4 is a functional block diagram showing an example of the configuration of the association score calculation unit 140. Figure 5 is a block diagram schematically showing the procedure for learning the news article feature vector generation unit 130, the term feature vector generation unit 1430, and the feature vector matching unit 1440. Figure 6 is a block diagram explaining the process of calculating sentiment scores in the sentiment score calculation unit 150. Figure 7 is an example of a database of news article scores stored in the news article score storage unit 160. Figure 8 is a block diagram explaining an example of the process of calculating social risk in the social risk calculation unit 170. Figure 9 is an example of a database of social risk values stored in the social risk storage unit 180. 
Figure 10 is an example of a hardware configuration diagram constituting the social risk calculation device 100 in the present invention. 【0019】Hereinafter, embodiments for carrying out the present invention will be described with reference to the drawings. In this specification and the drawings, components having substantially the same function and configuration are denoted by the same reference numerals, and redundant explanations are omitted. 【0020】 Figure 1 is a block diagram showing an example of the functional configuration of the social risk calculation device 100 in the present invention. The social risk calculation device 100 of the present invention includes a news article input unit 110, a country label assignment unit 120, a news article feature vector generation unit 130, a related score calculation unit 140, an emotion score calculation unit 150, a news article score storage unit 160, a social risk calculation unit 170, and a social risk storage unit 180. The news article input unit 110 inputs news article data. The news article data is, for example, text data that can be collected from the internet. The news article data includes news articles in various languages and from various publication media. The news article input unit 110 inputs news article data via communication such as the internet from an external storage medium or an external website. The news article data is not limited to Japanese, but is collected and input from a wide range of sources around the world, including news in English and other languages. The news article data may be configured to be collected and input periodically. 【0021】 The country label assignment unit 120 assigns a country label to the news article data input by the news article input unit 110. The country label indicates which country the news article data pertains to. 
The country label assignment unit 120 may, for example, assign a country label to the news article data based on the country name contained in the news article data. Alternatively, the country label assignment unit 120 may assign a country label to the news article data based on the language of the news article data. Furthermore, the country label assignment unit 120 may assign a country label of the country from which the news article originated. Since the country label is assigned to group news articles into countries, for example, if the country to which the label is assigned is Japan, the country label may be "Japan" or the abbreviation for Japan, "JP," and the format of the country label can be any format. 【0022】 The news article feature vector generation unit 130 generates news article feature vectors from news article data input from the news article input unit 110. For example, the news article feature vector generation unit 130 tokenizes the text contained in the news article data, converts it into a token ID list of a predetermined length, and generates feature vectors for the news article data. At this time, the news article feature vector generation unit 130 is a neural network using BERT and is trained to generate feature vectors for news article data. The news article feature vector generation unit 130 is trained together with the association score calculation unit 140. Details will be described later. 【0023】 The related score calculation unit 140 calculates an industry related score, which indicates the degree to which the input news article data is related to a particular industry. The related score calculation unit 140 calculates the industry related score based on the news article feature vector input from the news article feature vector generation unit 130 and the term feature vector based on industry terms. 
The related score calculation unit 140 also calculates a social issue related score, which is the relatedness score between the news article data and social issues. The related score is a score that indicates the degree of relatedness between news article data and an industry or social issue, with a higher value indicating a higher degree of relatedness. The related score calculation unit 140 calculates the industry related score by comparing the news article feature vector generated by the news article feature vector generation unit 130 with the industry feature vector and calculating the similarity. The related score calculation unit 140 also calculates the social issue related score by comparing the news article feature vector generated by the news article feature vector generation unit 130 with the social issue feature vector and calculating the similarity. 【0024】The association score calculation unit 140 may be an association score calculation model, which is a pre-trained neural network that has been trained to calculate a high score when the association between news article data and industry or social issues is high, by matching the feature vectors of news article data with the feature vectors of industry or social issues. The association score calculation unit 140 may also be trained to calculate a value close to 1 when news article data and industry or social issues are related to each other. 【0025】 The emotion score calculation unit 150 calculates an emotion score based on the emotional tendencies of the content of the news article data. The emotion score here is a numerical representation of the general emotional tendencies towards the news content that can be gleaned from the news article. The emotion score shown by the content of the news article reflects the attitude of society, and a negative attitude indicates a high social risk. 
【0026】 The sentiment score calculation unit 150 may be a pre-trained machine learning model that calculates a sentiment score by inputting a news article feature vector. For example, the sentiment score calculation unit 150 may be a sentiment score calculation model composed of a pre-trained neural network. 【0027】The news article score storage unit 160 is a database that stores, in association with the country label assigned by the country label assignment unit 120, the industry-related score and social issue-related score calculated by the related score calculation unit 140, and the sentiment score calculated by the sentiment score calculation unit 150, for each news article data. When news article data is input from the news article input unit 110, the country label assignment unit 120 assigns a country label to that news article data, and the news article feature vector generation unit 130 generates a news article feature vector. Based on this news article feature vector, the industry-related score, social issue-related score, and sentiment score are calculated and stored in the news article score storage unit 160 in association with each other. Therefore, the social risk calculation unit 170, described later, calculates a score of social risk by country, industry, and social issue based on the country label, industry-related score, social issue-related score, and sentiment score assigned to each news article data. 【0028】 The social risk calculation unit 170 calculates a social risk value based on the industry-related score, social issue-related score, and sentiment score of multiple news articles. The social risk calculation unit 170 calculates the social risk of a certain social issue in a certain industry in a certain country based on the industry-related score, social issue-related score, and sentiment score calculated from multiple news articles. 
The social risk calculation unit 170 calculates the social risk based on the industry-related score, social issue-related score, and sentiment score of each news article stored in the news article score storage unit 160. However, the social risk calculation unit 170 is not limited to this and may be configured to calculate social risk based on the country label assigned by the country label assignment unit 120, the related score calculation unit 140, and the score calculated by the sentiment score calculation unit 150. 【0029】 The social risk memory unit 180 stores the social risk score for a particular social issue in a particular industry for each country, calculated by the social risk calculation unit 170. 【0030】Figure 2 is a flowchart illustrating an example of the process by which the country label assignment unit 120 assigns a country label. The country label assignment unit 120 identifies which country the input news article data pertains to and assigns a country label. For example, the country label assignment unit 120 determines whether the news language is English (step S201). If the news language is not English (No), it assigns the label of the country to which the media or company publishing the news belongs to the news article data (step S202). If the news language is English (Yes), the country label assignment unit 120 searches the news article data by country name (step S203). For example, the country label assignment unit 120 may have a country name database that stores country names in advance and search whether the stored country name is included in the news article data. In this case, the system may be configured to store not only the official name but also abbreviations and multiple country names for the same country. For example, in the case of the United States, not only United States of America but also USA and America may be stored. 
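The country label flow of Figure 2 (steps S201 to S203 above, with steps S204 and S205 described in paragraph 【0031】) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the alias table, the function name `assign_country_label`, the two-letter labels, and the rule of preferring the longest matching alias are all assumptions made for the example.

```python
import re

# Hypothetical country-name database: lower-cased alias -> country label.
# The text only says that official names and abbreviations may be stored.
COUNTRY_ALIASES = {
    "united states of america": "US",
    "usa": "US",
    "america": "US",
    "japan": "JP",
    "south korea": "KR",
}


def assign_country_label(text, language, publisher_country):
    """Return a country label following the branch structure of Figure 2."""
    # S201: is the news language English?
    if language != "en":
        # S202: use the country of the publishing media outlet or company.
        return publisher_country

    # S203: search the article text for stored country names.
    lowered = text.lower()
    # Assumption: try longer aliases first so that full names win over short ones.
    for alias in sorted(COUNTRY_ALIASES, key=len, reverse=True):
        if re.search(r"\b" + re.escape(alias) + r"\b", lowered):
            # S204 (Yes): a country name was found, so assign that country.
            return COUNTRY_ALIASES[alias]

    # S205: no country name found, so fall back to the publisher's country.
    return publisher_country
```

How an article that names several countries should be labeled is not stated in the text; the sketch simply returns the first alias that matches.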
【0031】 The country label assignment unit 120 determines whether the news article data contains a country name (step S204). If the country name is not included (No), it assigns the label of the country to which the media outlet or company publishing the news belongs to the news article data, similar to step S202 (step S205). If the country name is included (Yes), it assigns the label of that country to the news article data. Here, the system is configured to search by country name only for English news articles, but it is not limited to this, and the system may be configured to search for whether the news article contains a country name for news article data in all languages. In this case, a database of country names would be maintained for each language, and if the news article contains a country name, the label of that country would be assigned, and if the news does not contain a country name, the label of the country of the news publisher would be assigned. 【0032】Figure 3 shows an example of a flowchart for generating news article feature vectors in the news article feature vector generation unit 130. When news article data is input from the news article input unit 110, the news article feature vector generation unit 130 performs preprocessing on the news article data (step S301). Preprocessing of news article data is performed to make the news article data easier to machine learn using a natural language processing model. For example, this includes cleaning processes to remove punctuation and case sensitivity, normalization processes to split into words, and stop word removal. For example, preprocessing may be performed to adapt the data to BERT (Bidirectional Encoder Representations from Transformers) as a natural language processing model. 【0033】 The news article feature vector generation unit 130 may perform tokenization as a preprocessing step, dividing the sentence contained in the news article data into individual words, etc. 
For example, the sentence "I love Japan." is divided into tokens such as "I", "love", and "Japan". The news article feature vector generation unit 130 may also map each token to a token ID after tokenization, that is, to the corresponding ID in the dictionary held by BERT. For example, the token ID for "I" in the BERT dictionary is 3678, and the sentence "I love Japan." above is mapped to the token ID list "3678, 6430, 84530, 538". 【0034】 Furthermore, the news article feature vector generation unit 130 performs padding and truncation as preprocessing to fix the length of the token ID list. Because the amount of text varies from article to article, the length of the token ID list also varies, so the length is fixed before the news article feature vector is generated. For example, the length of the token ID list is set to 512. It may be longer or shorter than this, but if it is too short, the content of the news article will not be accurately reflected. When a news article is short, its token ID list may be shorter than the predetermined length; in this case, zeros are appended to the end of the list until it reaches the predetermined length. Conversely, when the token ID list is longer than the predetermined length, the tokens from the beginning up to the predetermined length may be kept and the remainder truncated. Truncation is not limited to this, however; for example, a predetermined number of tokens at the beginning may be cut off, a fixed-length span of the following tokens kept, and the rest truncated. 
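The tokenization, token ID mapping, padding, and truncation of paragraphs 【0033】 and 【0034】 can be illustrated with the following sketch. It does not use an actual BERT tokenizer: `TOY_VOCAB` is a hypothetical four-entry dictionary that reuses the illustrative IDs from the text (assigning 538 to the final period is an assumption), `UNK_ID` is a made-up ID for unknown tokens, and a real BERT pipeline would additionally use subword splitting and special tokens.

```python
import re

# Toy dictionary reusing the illustrative token IDs from the text.
# Assumption: 538 is taken to be the ID of the final period.
TOY_VOCAB = {"I": 3678, "love": 6430, "Japan": 84530, ".": 538}
UNK_ID = 1  # hypothetical ID for tokens missing from the toy dictionary
PAD_ID = 0  # zeros are appended as padding, as described in the text


def to_fixed_length_ids(sentence, max_len=512):
    """Tokenize, map to token IDs, then truncate or pad to max_len."""
    # Split into word tokens and single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    ids = [TOY_VOCAB.get(token, UNK_ID) for token in tokens]
    # Truncation: keep the leading tokens up to the predetermined length.
    ids = ids[:max_len]
    # Padding: append zeros until the predetermined length is reached.
    return ids + [PAD_ID] * (max_len - len(ids))


print(to_fixed_length_ids("I love Japan.", max_len=8))
# [3678, 6430, 84530, 538, 0, 0, 0, 0]
```

The sketch implements only the first truncation variant (keeping the leading tokens); the alternative of discarding some leading tokens before keeping a fixed-length span would change the slice bounds.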
【0035】 Next, the news article feature vector generation unit 130 generates the feature vector of a news article data from the token ID list of predetermined length corresponding to that news article data (step S302). The news article feature vector is generated, for example, using a neural network model that partially uses BERT. When a fixed-length token ID list is input, the news article feature vector generation unit 130 generates a tensor and converts each token into a vector using a model adapted from BERT. For example, when a token ID list of length 512 is input, it generates a tensor of length 512 and converts each token into a 768-element vector, resulting in a 512 × 768 tensor. This is then passed through an addition layer, a concatenation layer, a pooling layer, etc., and a 768-dimensional tensor is output as the news article feature vector. Although a 768-dimensional tensor is used as the feature vector here, the dimension is not limited to this; it is sufficient to generate a feature vector of the same dimension as the feature vectors generated from industrial terms or social issue terms. Although not shown in the figure, the generated news article feature vector may be stored in a storage unit, which avoids repeating the same calculation. 【0036】 The news article feature vector generation model, which is the neural network model that generates news article feature vectors, is obtained by training it together with the neural network model of the related score calculation unit, which will be described later. 【0037】 Figure 4 is a functional block diagram showing an example configuration of the related score calculation unit 140. The related score calculation unit 140 includes an industrial term memory unit 1410, a social issue term memory unit 1420, a term feature vector generation unit 1430, and a feature vector matching unit 1440. 
The industrial term memory unit 1410 stores industrial terms. Industrial terms are words that indicate industries such as forestry, fisheries, insurance, mining, telecommunications, education, automobile manufacturing, petroleum, and coal. Industrial terms may be, for example, industry names defined in the Japan Standard Industrial Classification, or industry names based on an internationally established industrial classification; in either case, the terms are stored so as to cover all industries in that classification. 【0038】 The social issue term memory unit 1420 stores terms for social issues that are considered risks in ESG management. Examples include: "wage assessment," "workers in poverty," "child labor," "forced labor," "excessive working hours," "freedom of association, collective bargaining, and the right to strike," "migrant labor," "social benefits," "labor laws and treaties," "discrimination and equal opportunity," "unemployment," "occupational hazards and risks," "work-related accidents and fatalities," "gender equality," "conflict-prone areas," "non-communicable diseases and other health risks," "communicable diseases," "environmental sustainability," "legal systems," "corruption," "democracy," "freedom of speech," "access to improved drinking water sources," "health and safety," "children out of school," "access to healthcare," "access to electricity," "property rights," "indigenous rights," and "smallholder farmers and commercial farms." However, the terms are not limited to these; any issue that is considered a social issue may be stored as a social issue term. 【0039】 The term feature vector generation unit 1430 sequentially reads industrial terms from the industrial term memory unit 1410 and generates industrial term feature vectors, and also sequentially reads social issue terms from the social issue term memory unit 1420 and generates social issue term feature vectors. 
The term feature vector generation unit 1430 has a term feature vector generation model that is a neural network model using BERT, similar to the news article feature vector generation unit 130. The term feature vector generation unit 1430 also performs preprocessing of input terms to generate a token ID list, similar to the news article feature vector generation unit 130, and generates term feature vectors by inputting the generated token ID list. However, since it generates from terms and not sentences, the predetermined fixed-length token ID list can be short, for example, 16. 【0040】When the fixed-length token ID list is input, the term feature vector generation unit 1430 generates a tensor and converts each token into a vector using a model transplanted from BERT. For example, when the length of the token ID list is 16, it is converted into a 16×768-dimensional tensor, which is then passed through a neural network model including an addition layer, a concatenation layer, a pooling layer, etc., and a 768-dimensional tensor is output as the term feature vector. Here, although it is set to 768 dimensions, it is not limited to this, and it may be output as a feature vector with the same dimensions as the news article feature vector. In this way, the term feature vectors of industrial terms or social issue terms are generated by the learned neural network model. Although not shown here, the generated term feature vectors may be configured to be stored in the storage unit. By storing the term feature vectors in the storage unit, it is possible to prevent performing calculation processing multiple times. 【0041】 The feature vector matching unit 1440 calculates the relevance between a news article and its terms by matching the news article feature vector generated by the news article feature vector generation unit 130 and the term feature vector generated by the term feature vector generation unit 1430. 
The term feature vectors include industrial term feature vectors generated from industrial terms and social issue term feature vectors generated from social issue terms. These term feature vectors are input sequentially, and the relevance is calculated by matching each term feature vector with the news article feature vector. That is, each news article has a single news article feature vector, and the feature vector matching unit 1440 matches that vector against the term feature vectors of all industrial terms and all social issue terms in turn to calculate each relevance. 【0042】 The feature vector matching unit 1440 is a relevance score calculation model based on a trained neural network. By matching the feature vector of the news article data with the term feature vector of an industry or a social issue, it is trained to output a high score when the news article data is highly relevant to that industry or social issue. The feature vector matching unit 1440 receives as input the news article feature vector generated by the news article feature vector generation unit 130 and the term feature vector generated by the term feature vector generation unit 1430 from an industrial term or a social issue term. The two feature vectors have the same dimension. The feature vector matching unit 1440 calculates the difference between them in a difference layer that is part of the neural network, and then passes the result through, for example, several dimensionality reduction layers to obtain the relevance as a one-dimensional numerical value. The numerical value may be, for example, a value between 0 and 1, with a value closer to 1 indicating higher relevance. 
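The structure of the feature vector matching unit 1440 (a difference layer followed by several dimensionality reduction layers and an output between 0 and 1) can be sketched as below. This is only a structural illustration under stated assumptions: the layer widths (768 to 64 to 8 to 1), the ReLU activations, the sigmoid output, and the random untrained weights are not taken from the text, and all function names are hypothetical.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Random, untrained weights with zero bias (illustration only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def match_score(article_vec, term_vec, layers):
    """Difference layer, then dimensionality reduction down to one value."""
    if len(article_vec) != len(term_vec):
        raise ValueError("feature vectors must have the same dimension")
    # Difference layer: element-wise difference of the two feature vectors.
    hidden = [a - t for a, t in zip(article_vec, term_vec)]
    # Dimensionality reduction layers (assumed here to use ReLU).
    for weights, bias in layers[:-1]:
        hidden = relu(dense(hidden, weights, bias))
    weights, bias = layers[-1]
    logit = dense(hidden, weights, bias)[0]
    # Assumption: a sigmoid maps the final value into the range 0 to 1.
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
DIM = 768
layers = [init_layer(rng, DIM, 64), init_layer(rng, 64, 8), init_layer(rng, 8, 1)]
article = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
term = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
score = match_score(article, term, layers)
```

Because the weights here are random, `score` carries no meaning; in the described system the parameters would be learned so that related article and term pairs score close to 1.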
【0043】 When the relevance calculated by the feature vector matching unit 1440 is obtained by matching the feature vector of the news article data with the term feature vector of a certain industry, it is taken as the industry relevance score of the news article data for that industry. For example, when matching with the term feature vector of the communications industry, the relevance of the news article data to the communications industry is calculated as the industry relevance score for the communications industry. The feature vector matching unit 1440 thus calculates, for one news article data, the industry relevance score for every industrial term stored in the industrial term memory unit 1410. 【0044】 The calculation of the industry relevance score of a certain industry for a certain news article data can be expressed by the following formula. Let the industry relevance score of industry j for news article data i be $\mathrm{Industry}_{ij}$. Then, 【0045】 $\mathrm{Industry}_{ij} = \mathrm{TMPT}(i, j)$ 【0046】 Here, i is a certain news article data, j is a certain industry, and TMPT is the neural network model for relevance calculation in the feature vector matching unit 1440. 【0047】 Similarly, when the feature vector of the news article data is matched with the term feature vector of a particular social issue, the result is taken as the social issue relevance score of that news article data. For example, when the term feature vector of corruption is matched with the feature vector of the news article data, the relevance of that news article data to corruption is calculated as the social issue relevance score for corruption. The feature vector matching unit 1440 thus calculates, for one news article data, the social issue relevance score for every social issue term stored in the social issue term memory unit 1420. 【0048】 The calculation of the social issue relevance score of a certain social issue for a certain news article data can be expressed by the following formula. 
Let the social issue relevance score of social issue k for news article data i be $\mathrm{Topic}_{ik}$. Then, 【0049】 $\mathrm{Topic}_{ik} = \mathrm{TMPT}(i, k)$ 【0050】 Here, i represents a news article data, k represents a social issue, and TMPT is the neural network model used for calculating relevance in the feature vector matching unit 1440. 【0051】 Figure 5 is a block diagram schematically showing the procedure for training the news article feature vector generation unit 130, the term feature vector generation unit 1430, and the feature vector matching unit 1440. 【0052】 The news article feature vector generation unit 130, the term feature vector generation unit 1430, and the feature vector matching unit 1440 are all composed of neural networks, and the model is built by training these three units together. For example, news data on various industries and social issues is prepared as training data, and each news item is labeled with the industry or social issue it pertains to. The news article data is input to the news article feature vector generation unit 130, and industry terms or social issue terms are input to the term feature vector generation unit 1430. If the news article is related to the input industry or social issue, the parameters are adjusted so that the output of the feature vector matching unit 1440 becomes 1. If the industry term or social issue term is unrelated to the news article data, the parameters are adjusted so that the output of the feature vector matching unit 1440 becomes 0. 【0053】 Figure 6 is a block diagram illustrating the process of calculating the sentiment score in the sentiment score calculation unit 150. The sentiment score calculation unit 150 calculates a sentiment score, which is a numerical value that indicates the sentiment tendency of the content of the news article data. The sentiment expressed in the content of a news article reflects the attitude of society, and a negative attitude of society indicates a high social risk.
For this reason, the sentiment score may be set to a range of -1 to 1, with a value close to -1 indicating a negative attitude of society and a value close to 1 indicating a positive attitude of society. The sentiment score calculation unit 150 receives the news article feature vector generated by the news article feature vector generation unit 130 as input. This news article feature vector is, for example, a feature vector with 768 dimensions. 【0054】 The sentiment score calculation unit 150 is a pre-trained sentiment prediction model. For example, XGBoost (eXtreme Gradient Boosting), a gradient boosting algorithm based on decision trees, may be used. Alternatively, the Natural Language API provided by Google may be used. During the training phase, training is performed using training data in which each of a number of news articles is assigned a value of 1 if its content is positive and -1 if it is negative. When the pre-trained sentiment score calculation unit 150 receives a news article feature vector as input, it outputs a number in the range of -1 to 1 as the sentiment score. 【0055】 Figure 7 shows an example of a database of news article scores stored in the news article score storage unit 160. The country label assigned to a news article data and the calculated industry-related score, social issue-related score, and sentiment score are stored in the news article score storage unit 160 in association with the news ID assigned to each news article data. The storage is not limited to association with the news ID; the scores may also be stored in association with the news article data itself. 【0056】 In the database example in Figure 7, the news article with news ID 1 is assigned the country label "KR" indicating South Korea, and an industry relevance score is stored for each industry. Here, the industry relevance score is represented by a number from 0 to 1, with a value closer to 1 indicating a higher degree of relevance.
Only chemical, coal, and telecommunications are displayed, and other industries are omitted, but the industry relevance scores for this news article data are stored for all industries stored in the industry terminology database. For the news article with news ID 1, the chemical industry relevance score is 0.8, indicating that it is likely news related to chemicals. 【0057】Similarly, for social issues, the social issue-related score is expressed as a number between 0 and 1, with a value closer to 1 indicating a higher degree of relevance. Here, only the social issue scores for corruption and democracy are displayed, while others are omitted. However, social issue scores are also stored for other social issue terms stored in the social issue terminology database. In addition, sentiment scores are stored in association with the news ID. Here, news ID 1 has a score of -0.5, indicating a negative social attitude. On the other hand, news ID 2 has a sentiment score of 0.7, indicating a positive sentiment. 【0058】 Figure 8 is a block diagram illustrating the process of calculating social risk in the social risk calculation unit 170. Social risk is a social risk value calculated by country, industry, and social issue. For example, the numerical range is between -1 and 1, with 0 as the dividing line. A negative value closer to -1 indicates a higher social risk, while a positive value closer to 1 indicates a lower social risk. As mentioned above, each news article data is classified by country. Therefore, in calculating the social risk for a particular social issue in a particular industry in a particular country, the industry-related score for that industry, the social issue-related score for that social issue, and the sentiment score, which are assigned to each news article data classified for that country, are used to calculate the social risk value. 
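The per-article record of Figure 7 and the country-based selection described above can be pictured with a small data structure. This is a sketch under stated assumptions: the class name `NewsArticleScore`, its field names, and the helper `articles_for_country` are hypothetical. Only the country label "KR", the chemical score 0.8, and the sentiment scores -0.5 and 0.7 come from the description; every other value, including the country of news ID 2, is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NewsArticleScore:
    """One row of the news article score store, keyed by news ID."""
    news_id: int
    country: str                                               # country label, e.g. "KR"
    industry: Dict[str, float] = field(default_factory=dict)   # industry-related scores, 0..1
    topic: Dict[str, float] = field(default_factory=dict)      # social issue-related scores, 0..1
    sentiment: float = 0.0                                      # sentiment score, -1..1


def articles_for_country(store: List[NewsArticleScore], country: str) -> List[NewsArticleScore]:
    """Select the records whose country label matches the requested country."""
    return [record for record in store if record.country == country]


store = [
    NewsArticleScore(
        news_id=1,
        country="KR",
        # 0.8 for chemical is from the description; the other scores are placeholders.
        industry={"chemical": 0.8, "coal": 0.1, "telecommunications": 0.2},
        topic={"corruption": 0.6, "democracy": 0.3},  # placeholders
        sentiment=-0.5,
    ),
    NewsArticleScore(
        news_id=2,
        country="JP",  # placeholder: the description does not state this label
        industry={"chemical": 0.1, "coal": 0.2, "telecommunications": 0.7},  # placeholders
        topic={"corruption": 0.2, "democracy": 0.5},  # placeholders
        sentiment=0.7,
    ),
]

korean_articles = articles_for_country(store, "KR")
```

Keeping the industry and social issue scores as dictionaries keyed by term lets the same record hold a score for every stored term without fixing the term list in advance.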
【0059】 The social risk value for a certain social issue related to a certain industry in a given country, based on a given news article data, can be expressed by the following formula: 【0060】 【0061】 Here, $SR_{icjk}$ represents the social risk value for social issue k in industry j, based on news article data i belonging to country c. $\mathrm{Industry}_{ij}$ is the industry-related score for industry j in news article data i, $\mathrm{Topic}_{ik}$ is the social issue-related score for social issue k in news article data i, and $\mathrm{Sentiment}_{i}$ is the sentiment score of news article data i. 【0062】 The social risk value for a particular social issue in a particular industry within a given country is calculated by summing the per-article social risk values, each derived from the industry-related score for that industry, the social issue-related score for that social issue, and the sentiment score assigned to a news article data classified within that country, and then dividing the sum by the number of news articles. 【0063】 The social risk calculation unit 170 specifies a country, industry, and social issue, and reads from the news article score storage unit 160 the industry-related score, social issue-related score, and sentiment score assigned to each news article data classified under that country. Based on the scores read out, the social risk calculation unit 170 calculates the social risk value for the specified social issue in the specified industry for each news article data, and then averages the calculated values to obtain the social risk value for that social issue in that industry in that country.
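The two-stage calculation above (a per-article value, then an average over the articles of one country) can be sketched as follows. The exact per-article formula is not reproduced in this text, so the sketch assumes it is the product of the three scores, which keeps the result in the stated range of -1 to 1. That product form, the dictionary layout of the records, and the function names are all assumptions, and the sample scores are invented for illustration.

```python
def article_risk(industry_score, topic_score, sentiment):
    """Per-article social risk value.

    ASSUMPTION: the product of the three scores. The description only says
    the value is derived from these scores.
    """
    return industry_score * topic_score * sentiment


def social_risk(records, country, industry, topic):
    """Average the per-article values over all articles labeled with `country`.

    Returns None when no article is classified under that country.
    """
    values = [
        article_risk(r["industry"][industry], r["topic"][topic], r["sentiment"])
        for r in records
        if r["country"] == country
    ]
    if not values:
        return None
    return sum(values) / len(values)


# Invented sample records (not data from the description).
records = [
    {"country": "JP", "industry": {"telecommunications": 0.5},
     "topic": {"corruption": 0.5}, "sentiment": -1.0},
    {"country": "JP", "industry": {"telecommunications": 1.0},
     "topic": {"corruption": 1.0}, "sentiment": 0.5},
    {"country": "KR", "industry": {"telecommunications": 1.0},
     "topic": {"corruption": 1.0}, "sentiment": -1.0},
]

risk_jp = social_risk(records, "JP", "telecommunications", "corruption")
```

Under the product assumption, an article that is barely relevant to the industry or the social issue contributes a value near 0 regardless of its sentiment, so highly relevant negative articles dominate the average.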
【0064】 For example, when calculating the social risk of "environmental sustainability" in the "telecommunications industry" in "Japan," the social risk calculation unit 170 reads out the industry-related score for the telecommunications industry, the social issue-related score for "environmental sustainability," and the sentiment score assigned to the news article data classified as Japan. The social risk calculation unit 170 calculates the social risk value for each news article data based on the read-out scores, and then calculates the social risk value for environmental sustainability in the telecommunications industry in Japan. 【0065】 The social risk calculation unit 170 calculates social risk values by country, industry, and social issue, but is not limited to these. It may also calculate the social risk value for a particular social issue across all industries in a country, or calculate the social risk value for a particular social issue in a particular industry within a region that includes multiple countries. When calculating social risk values for multiple industries or multiple countries together, the social risk values for individual industries or countries may be calculated and averaged, or the countries or industries may be weighted by their production value, etc., before calculation. 【0066】The calculated social risk value may be stored in the social risk storage unit 180, or it may be output by displaying it on the display unit. The social risk calculation unit 170 may also be configured to calculate the risk by receiving a social risk value calculation request from the user via the input unit or the internet, and to output the calculated social risk value. 【0067】 Here, the system is configured to output social risk values as numerical values in the range of -1 to 1. However, it is also possible to set multiple thresholds and output risk categories such as "low risk," "high risk," and "medium risk." 
For example, values between -1 and -0.5 could be "very high risk," values between -0.5 and 0 could be "high risk," values between 0 and 0.5 could be "medium risk," and values above 0.5 could be "low risk." The numerical ranges and categories shown here are examples and can be changed in various ways. 【0068】 Figure 9 shows an example of a database of social risk values stored in the social risk storage unit 180. The social risk storage unit 180 stores the social risk values calculated by the social risk calculation unit 170 in association with countries, industries, and social issues. 【0069】 Figure 10 is an example of a hardware configuration diagram of the social risk calculation device 100 in the present invention. The computer 100 forming the social risk calculation device 100 is configured, as shown in Figure 10, by connecting a CPU 11, a communication interface 12 connected to a network such as the Internet, a ROM 13, a RAM 14, a hard disk drive 15, an input / output interface 16, a display unit 17 connected to the input / output interface 16, a pointing device 18, and a keyboard 19 to a bus. An external storage device 20 such as a USB memory can also be connected to the input / output interface 16. 【0070】 The display unit 17 is, for example, a display device such as a liquid crystal display. The pointing device 18 is, for example, a mouse or a trackball. 【0071】 When the series of processes is executed by a program, the functions of, for example, the news article input unit 110, the country label assignment unit 120, the news article feature vector generation unit 130, the relevance score calculation unit 140, the sentiment score calculation unit 150, and the social risk calculation unit 170 are stored in the ROM 13 or the hard disk drive 15 as a computer program that calculates social risk, and the CPU 11 executes the various functions.
In addition, the industry term storage unit 1410, the social issue term storage unit 1420, and the news article score storage unit 160 may be stored in the ROM 13 or the hard disk drive 15 as part of a computer program that calculates social risk. 【0072】 The computer program for corporate evaluation is installed on the information processing device 100 by connecting an external storage device 20, such as a USB memory stick containing the computer program, to the input / output interface 16. Alternatively, the computer program may be installed on the information processing device 100 via the communication interface 12 over a network, or it may be pre-installed in the information processing device itself, for example, in a ROM 13 on which the computer program is stored. 【0073】
100 Social risk calculation device
110 News article input unit
120 Country label assignment unit
130 News article feature vector generation unit
140 Relevance score calculation unit
150 Sentiment score calculation unit
160 News article score storage unit
170 Social risk calculation unit
180 Social risk storage unit
Claims
1. A social risk calculation device for calculating social risk, comprising: a news article input unit for inputting news article data; a related score calculation unit for calculating an industry related score, which is a related score between the input news article data and an industry, and a social issue related score, which is a related score between the news article data and a social issue; an emotional score calculation unit for calculating the emotional tendency of the content of the news article data as an emotional score; and a social risk calculation unit for calculating a social risk value based on the industry related score, social issue related score, and emotional score of multiple news articles.
2. The social risk calculation device according to claim 1, further comprising a country labeling unit for assigning country labels to the news article data.
3. The social risk calculation device according to claim 2, further comprising a news article score storage unit that stores the country label, the industry-related score, the social issue-related score, and the sentiment score assigned to the news article data, wherein the social risk calculation unit calculates the social risk in a certain social issue of a certain industry in a certain country based on the industry-related score, the social issue-related score, and the sentiment score calculated from each news article.
4. The social risk calculation device according to claim 1, further comprising a news article feature vector generation unit that generates news article feature vectors from input news article data, wherein the association score calculation unit calculates an industry association score by comparing the news article feature vectors generated by the news article feature vector generation unit with industry feature vectors and calculating the similarity, and calculates a social issue association score by comparing the news article feature vectors with social issue feature vectors and calculating the similarity, and wherein the sentiment score calculation unit is a trained machine learning model that receives the news article feature vectors as input and calculates a sentiment score.
5. The social risk calculation device according to claim 4 is characterized in that the association score calculation unit has a feature vector matching unit which is an association score calculation model using a trained neural network that has been trained to calculate a high score when the news article data is highly associated with the industry or social issue by matching the feature vector of the news article data with the feature vector of the industry or social issue.
6. The social risk calculation device according to claim 1 further comprises a social risk storage unit that stores the social risks calculated by the social risk calculation unit by industry and by social issue.
7. A method for calculating social risk using a computer, comprising: an input step of inputting news article data; an association score calculation step of calculating an industry association score, which is the association score between the input news article data and an industry, and a social issue association score, which is the association score between the news article data and a social issue; an emotion score calculation step of calculating the emotion tendency of the content of the news article data as an emotion score; and a social risk calculation step of calculating a social risk value based on the industry association score, social issue association score, and emotion score of multiple news articles.
8. A computer program for causing a computer to function to calculate social risk, the program causing the computer to perform the following steps: an input step of inputting news article data; an association score calculation step of calculating an industry association score, which is the association score between the input news article data and an industry, and a social issue association score, which is the association score between the news article data and a social issue; an emotional score calculation step of calculating the emotional tendency of the content of the news article data as an emotional score; and a social risk calculation step of calculating a social risk value based on the industry association scores, social issue association scores, and emotional scores of multiple news articles.