Neural-symbolic methods for entity linking

By using logical neural networks (LNNs) to generate feature sets and learn logical connection rules, ambiguity in entity linking for short texts is resolved, improving entity linking accuracy in short text environments.

CN117043785B · Active · Publication Date: 2026-06-16 · INTERNATIONAL BUSINESS MACHINE CORPORATION

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
INTERNATIONAL BUSINESS MACHINE CORPORATION
Filing Date
2022-03-14
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing technologies struggle to resolve ambiguity in entity linking for short texts, such as a single sentence or question, because the context surrounding a mention is limited.

Method used

Interpretable rule-based logical neural networks (LNNs) are used to resolve ambiguity. The method generates feature sets, evaluates them against logical connection rules and their connectivity weights, learns the connectivity weights using artificial neural networks and machine learning algorithms, dynamically updates the weights associated with the rules, and generates a learned model.

Benefits of technology

It improves the accuracy and efficiency of entity linking in short text environments, effectively handles entity linking challenges in single sentences or questions, and enhances the computer's ability to understand natural language.

Abstract

A system, computer program product, and method for entity linking in a logical neural network (LNN) are provided. A set of features is generated for one or more entity-mentions in an annotated dataset. The generated set of features is evaluated against entity linking LNN rule templates having one or more logical connection rules and corresponding connectivity weights organized in a tree structure. An artificial neural network is used with a corresponding machine learning algorithm to learn the connectivity weights. The connectivity weights associated with the logical connection rules are selectively updated and a learned model is generated with learned thresholds and learned weights for the logical connection rules.

Description

Background Technology

[0001] This invention relates to a computer system, computer program product, and computer-implemented method using artificial intelligence (AI) and machine learning to disambiguate references in text by linking them to entities in a knowledge graph. More specifically, embodiments involve using interpretable rules for logical neural network entity linking and learning corresponding connectivity weights and rules.

[0002] Entity linking is the task of disambiguating textual mentions by linking them to canonical entities provided by a knowledge graph. General approaches target long texts consisting of multiple sentences, where features are extracted to measure the degree of similarity between a mention and one or more candidate entities, and a disambiguation step is then performed via a non-learning heuristic to link the mention to the actual entity. Entity linking is more challenging for short texts, such as single sentences or questions, because the context surrounding the mention is limited. Platforms supporting short texts include dialogue systems such as chatbots. The embodiments shown and described herein are directed to an artificial intelligence (AI) platform for entity linking that mitigates the challenges associated with short texts and their corresponding platform(s).

Summary of the Invention

[0003] The embodiments disclosed herein include computer systems, computer program products, and computer-implemented methods for resolving ambiguity in textual mentions by linking mentions in text to entities using interpretable rules in a logical neural network. These embodiments are further described in the following detailed description. This Summary is not intended to identify key or essential features or concepts of the claimed subject matter, nor is it intended to be used in any way that would limit the scope of the claimed subject matter.

[0004] In one aspect, a computer system has a processor operatively coupled to memory and an artificial intelligence (AI) platform operatively coupled to the processor. The AI platform is configured with a feature manager, an evaluator, and a machine learning (ML) manager, which together support entity linking in a logical neural network (LNN). The feature manager is configured to generate a feature set for one or more entity-mention pairs in an annotated dataset. The evaluator is operatively coupled to the feature manager and configured to evaluate the generated feature set against an entity linking LNN rule template having one or more logical connection rules and corresponding connectivity weights organized in a hierarchical structure. The ML manager, operatively coupled to the evaluator, is configured to learn the connectivity weights using an artificial neural network and a corresponding ML algorithm. The ML manager is also configured to selectively update the connectivity weights associated with the logical connection rules. A learned model is generated with the learned thresholds and the learned connectivity weights for the logical connection rules.

[0005] In another aspect, a computer program product having a computer-readable storage medium is provided, the computer-readable storage medium having program code embodied therewith. The program code is executable by a processing unit to generate a feature set for one or more entity-mention pairs in an annotated dataset. The generated feature set is evaluated against an entity linking LNN rule template having one or more logical connection rules and corresponding connectivity weights organized in a hierarchical structure. The program code supports learning the connectivity weights using an artificial neural network and a corresponding machine learning algorithm. The connectivity weights associated with the logical connection rules are selectively updated, and a learned model is generated with the learned thresholds and the learned connectivity weights for the logical connection rules.

[0006] In yet another aspect, a method is provided. A feature set is generated for one or more entity-mention pairs in an annotated dataset. The generated feature set is evaluated against an entity linking LNN rule template having one or more logical connection rules and corresponding connectivity weights organized in a hierarchical structure. An artificial neural network, along with a corresponding machine learning algorithm, is used to learn the connectivity weights. The connectivity weights associated with the logical connection rules are selectively updated, and a learned model is generated with the learned thresholds and the learned connectivity weights for the logical connection rules.

[0007] These and other features and advantages will become apparent from the following detailed description of the presently preferred embodiments, taken in conjunction with the accompanying drawings.

Description of the Drawings

[0008] The accompanying drawings, which are referenced herein, form part of the specification. The features shown in the drawings are merely illustrative of some embodiments, and not of all embodiments, unless explicitly indicated otherwise.

[0009] Figure 1 depicts a block diagram of a computer system, illustrating tools that support a neural-symbolic solution for entity linking, which in an exemplary embodiment is applied to a short text scenario.

[0010] Figure 2 depicts a block diagram illustrating the tools shown in Figure 1 and their associated APIs.

[0011] Figures 3A to 3C depict a flowchart illustrating the process of learning thresholding operations and weights in an entity linking algorithm.

[0012] Figure 4 depicts a flowchart illustrating the process of using an LNN to learn new rules with appropriate weights for logical connectives.

[0013] Figure 5 depicts a block diagram of an example LNN reformulation of the EL algorithm.

[0014] Figure 6 depicts a block diagram illustrating an example of a computer system / server of a cloud-based support system for implementing the systems and processes described with respect to Figures 1 to 5.

[0015] Figure 7 depicts a block diagram illustrating a cloud computing environment.

[0016] Figure 8 depicts a block diagram illustrating the set of functional abstraction model layers provided by a cloud computing environment.

Detailed Implementation

[0017] It is readily understood that, as generally described and illustrated in the accompanying drawings, the components of this embodiment can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of embodiments of the apparatus, system, method, and computer program product of this embodiment, as presented in the drawings, is not intended to limit the scope of the claimed embodiments, but is merely representative of selected embodiments.

[0018] Throughout this specification, references to "a select embodiment," "one embodiment," or "an embodiment" mean that a particular feature, structure, or characteristic described in connection with that embodiment is included in at least one embodiment. Therefore, the phrases "a select embodiment," "in one embodiment," or "in an embodiment" appearing throughout this specification do not necessarily refer to the same embodiment.

[0019] The illustrated embodiments will be better understood by referring to the accompanying drawings, in which the same components are always designated by the same reference numerals. The following description is intended to be illustrative only and shows only some selected embodiments of devices, systems, and processes consistent with the embodiments claimed herein.

[0020] Artificial intelligence (AI) is a field of computer science concerned with computers and computer behavior as related to humans. AI refers to the intelligence of machines that can make decisions based on information so as to maximize their chances of success on a given task. More specifically, AI can learn from datasets to solve problems and provide relevant recommendations. For example, in the field of artificial intelligence computer systems, natural language (NL) systems (such as IBM's ... artificial intelligence computer system or other natural language question answering systems) process NL based on the knowledge acquired by the system.

[0021] In the field of AI computer systems, Natural Language Processing (NLP) systems process natural language based on acquired knowledge. NLP is a field of AI that serves as a translation platform between computer language and human language. More specifically, NLP enables computers to analyze and understand human language. Natural Language Understanding (NLU) is a type of NLP that parses and translates input according to natural language principles. Examples of such NLP systems include IBM's artificial intelligence computer system and other natural language question-answering systems.

[0022] Machine learning (ML) is a subset of AI that uses algorithms to learn from data and make predictions based on that data. ML applies AI by creating models, such as artificial neural networks, that can demonstrate learned behavior by performing tasks that are not explicitly programmed. Different types of ML exist, including learning problems such as supervised, unsupervised, and reinforcement learning; hybrid learning problems such as semi-supervised, self-supervised, and multi-instance learning; statistical inference such as inductive, deductive, and transductive learning; and learning techniques such as multi-task, active, online, transfer, and ensemble learning.

[0023] At the heart of AI and related reasoning lies the concept of similarity. Structures, including static and dynamic structures, define a definite output or action for a given, definite input. More specifically, the determined output or action is based on explicit or inherent relationships within the structure. This arrangement may be satisfactory for select circumstances and conditions. However, it should be understood that dynamic structures are inherently subject to change, and the output or action may change accordingly. Existing solutions for efficiently identifying objects, understanding NL, and processing content in response to that identification and understanding, as well as to changes in the structures, are extremely difficult in practice.

[0024] Artificial Neural Networks (ANNs) are models of how the nervous system operates. The basic units are called neurons, which are typically organized into layers. ANNs work by simulating a large number of interconnected processing units that resemble abstract versions of neurons. There are generally three parts in an ANN: an input layer with units representing the input domain, one or more hidden layers, and an output layer with one or more units representing the target domain. These units are connected with varying connection strengths or weights. Input data is provided to the first layer, and values propagate from each neuron to neurons in the next layer. At a basic level, each layer of a neural network includes one or more operators or functions that are operatively coupled to the input and output. Each neuron is evaluated by applying an activation function to its inputs, and the resulting output is referred to herein as the activation. Complex neural networks are designed to mimic how the human brain works, thus allowing computers to be trained to support poorly defined abstractions and problems where training data is available. ANNs are frequently used in image recognition, speech, and computer vision applications.

[0025] Natural Language Processing (NLP) is a field within AI and linguistics that studies the inherent problems in the processing and manipulation of natural language, with the aim of improving computers' ability to understand human language. NLP focuses on extracting meaning from unstructured data.

[0026] Entity linking (EL) is referred to herein as the task of disambiguating (e.g., removing uncertainty from) textual mentions by linking them to canonical entities provided by a knowledge graph (KG). Text or textual data $T$ contains a set of mentions $M = \{m_1, m_2, \ldots\}$, where each mention $m_i$ is contained in $T$. The knowledge graph (KG) consists of a set of entities $\varepsilon$, where each entity is referred to herein as $e_{ij}$. Entity linking is a many-to-one function that links each mention $m_i \in M$ to an entity in the KG. More specifically, the link points to an entity $e_{ij} \in C_i$, where $C_i \subseteq \varepsilon$ is the subset of candidate entities relevant to mention $m_i$.
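The many-to-one mapping described above can be illustrated with a minimal sketch. It assumes, beyond what the text states, that the link for each mention is chosen as the highest-scoring candidate in its candidate set; the function name `link_mentions` and the `score` callable are hypothetical and introduced only for illustration.

```python
from typing import Callable, Dict, List


def link_mentions(
    candidates: Dict[str, List[str]],
    score: Callable[[str, str], float],
) -> Dict[str, str]:
    """Map each mention m_i to one entity e_ij from its candidate set C_i.

    Assumption: the chosen entity is the candidate with the highest score.
    Mentions with an empty candidate set are left unlinked.
    """
    return {
        mention: max(cands, key=lambda entity: score(mention, entity))
        for mention, cands in candidates.items()
        if cands
    }


# Hypothetical usage: two mentions may link to the same entity (many-to-one).
example = link_mentions(
    {"Paris": ["Paris_France", "Paris_Texas"], "City of Light": ["Paris_France"]},
    score=lambda m, e: 1.0 if e.endswith("France") else 0.5,
)
```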

[0027] Logical Neural Networks (LNNs) are neuro-symbolic frameworks designed to simultaneously provide the key properties of neural networks (learning) and symbolic logic (knowledge and reasoning). LNNs build on the observation that the weights of logical neurons can be constrained so that the neurons act as logical AND or OR gates, creating a direct correspondence between artificial neurons and logical elements. The LNNs shown and described herein employ rules expressed in first-order logic (FOL), a form of symbolic reasoning in which each sentence or statement is broken down into a subject and a predicate. Each rule is a disambiguation model that captures specific characteristics of the linking. Given a rule template, the parameters of the rule are learned from a labeled dataset; these parameters take the form of thresholding operations on the predicates and weights of the predicates appearing in the rule. Thus, the LNN learns the parameters of the rule, enabling parameter tuning.

[0028] Structurally, an LNN is a graph consisting of the syntax trees of all represented formulas, which are interconnected by neurons added for each proposition. Specifically, there is one neuron for each logical operation that occurs in each formula, and additionally one neuron for each unique proposition that occurs in any formula. All neurons return pairs of values in the range [0, 1], which represent the lower and upper bounds of the truth values of their corresponding subformulas and propositions.

[0029] Using the semantics of FOL, LNNs enforce constraints when learning operators. Examples of such operators include, but are not limited to, logical AND and logical OR, denoted herein as LNN-∧ and LNN-∨, respectively. Logical AND (LNN-∧) is represented as:

[0030] $\text{LNN-}\wedge(x, y) = \max(0, \min(1, \beta - w_1(1 - x) - w_2(1 - y)))$

[0031] It has the following constraints:

[0032] $\beta - (1 - \alpha)(w_1 + w_2) \geq \alpha$ (Constraint 1)

[0033] $\beta - \alpha w_1 \leq 1 - \alpha$ (Constraint 2)

[0034] $\beta - \alpha w_2 \leq 1 - \alpha$ (Constraint 3)

[0035] $w_1, w_2 \geq 0$

[0036] Here $\beta$, $w_1$, and $w_2$ are learnable parameters, $x, y \in [0, 1]$ are inputs, and $\alpha \in [1/2, 1]$ is a hyperparameter. Similar to logical AND, logical OR is defined in terms of logical AND as follows:

[0037] $\text{LNN-}\vee(x, y) = 1 - \text{LNN-}\wedge(1 - x, 1 - y)$

[0038] In conventional Boolean logic, logical AND returns 1 (true) only when both inputs are 1. LNNs relax Boolean connectives, such as logical AND, by using α as a surrogate for 1 and 1-α as a surrogate for 0. When both inputs are at least α, constraint 1 forces the output of logical AND to be at least α. Similarly, constraints 2 and 3 constrain the behavior of logical AND when one input is low and the other is high. More specifically, for y = 1 and x ≤ 1-α, constraint 2 forces the output of logical AND to be at most 1-α. This formulation allows unconstrained learning when x, y ∈ [1-α, α]. The degree of learning can be controlled by changing α. In an exemplary embodiment, the constraints (e.g., constraints 1, 2, and 3) can be relaxed.
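The operator and its constraints can be checked numerically. The following is a minimal sketch rather than the patented implementation: the function names are hypothetical, and the parameter values (β = 1.5, w1 = w2 = 2, α = 0.9) are illustrative choices that happen to satisfy constraints 1 to 3.

```python
def lnn_and(x: float, y: float, beta: float, w1: float, w2: float) -> float:
    """Weighted conjunction: max(0, min(1, beta - w1*(1-x) - w2*(1-y)))."""
    return max(0.0, min(1.0, beta - w1 * (1.0 - x) - w2 * (1.0 - y)))


def lnn_or(x: float, y: float, beta: float, w1: float, w2: float) -> float:
    """Disjunction defined from the conjunction: 1 - LNN-AND(1-x, 1-y)."""
    return 1.0 - lnn_and(1.0 - x, 1.0 - y, beta, w1, w2)


def satisfies_constraints(beta: float, w1: float, w2: float, alpha: float,
                          tol: float = 1e-9) -> bool:
    """Check constraints 1-3 and non-negativity of the weights."""
    if not 0.5 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [1/2, 1]")
    return (
        beta - (1.0 - alpha) * (w1 + w2) >= alpha - tol   # Constraint 1
        and beta - alpha * w1 <= 1.0 - alpha + tol        # Constraint 2
        and beta - alpha * w2 <= 1.0 - alpha + tol        # Constraint 3
        and w1 >= 0.0 and w2 >= 0.0
    )


# Illustrative parameters (assumed, not taken from the source).
ALPHA, BETA, W1, W2 = 0.9, 1.5, 2.0, 2.0
ok = satisfies_constraints(BETA, W1, W2, ALPHA)
both_high = lnn_and(0.9, 0.9, BETA, W1, W2)   # both inputs >= alpha
one_low = lnn_and(0.1, 1.0, BETA, W1, W2)     # x <= 1 - alpha, y = 1
```

With these values, two inputs at α produce an output of at least α, while a low first input drives the output to at most 1-α, matching the behavior the constraints are meant to enforce. Note that β = w1 = w2 = 1 satisfies the constraints only for α = 1.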

[0039] Features are referred to herein as attributes that measure the degree of similarity between a textual mention and a candidate entity. In exemplary embodiments, a catalog of feature functions is used to generate features, including non-embedding and embedding-based functions. As shown and described herein, an exemplary set of non-embedding-based feature functions is provided to measure the similarity between mention $m_i$ and candidate entity $e_{ij}$. Name features are a set of general-purpose similarity functions, such as, but not limited to, Jaccard, Jaro-Winkler, Levenshtein, and partial ratio, used to compute the similarity between the name of mention $m_i$ and the name of candidate entity $e_{ij}$. Context features are the aggregated similarity between the context of mention $m_i$ and the description of candidate entity $e_{ij}$. In an exemplary embodiment, the context feature Ctx is evaluated as follows:

[0040]

[0041] Here, pr is the partial ratio, which measures the similarity between each contextual mention and the description. In one exemplary embodiment, the partial ratio computes the maximum similarity between the shorter input string and substrings of the longer input string. Type features are the overlap similarity between the type of mention $m_i$ and the domain set of $e_{ij}$. In an exemplary embodiment, a type detection model based on Bidirectional Encoder Representations from Transformers (BERT) is used to obtain the type information for each mention $m_i$. Entity prominence (saliency) features measure the prominence of candidate entity $e_{ij}$ as the number of entities in the target knowledge graph that link to $e_{ij}$, i.e., indegree($e_{ij}$).
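The name and context features can be sketched with the standard library alone. This is an illustration under stated assumptions, not the feature catalog of the embodiments: Jaccard is computed over lowercase word sets, Levenshtein distance is normalized to a [0, 1] similarity, the partial ratio slides an equal-length window over the longer string, and the context feature sums the partial ratio over the other mentions in the text. All function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets (tokenization is assumed)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def levenshtein_similarity(a: str, b: str) -> float:
    """1 - edit_distance / max_length, so identical strings score 1.0."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return 1.0 - prev[-1] / max(len(a), len(b))


def partial_ratio(a: str, b: str) -> float:
    """Best similarity of the shorter string against windows of the longer one."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    if not short:
        return 0.0
    n = len(short)
    return max(
        SequenceMatcher(None, short, long_[i:i + n]).ratio()
        for i in range(len(long_) - n + 1)
    )


def context_feature(mention: str, all_mentions: Iterable[str],
                    description: str) -> float:
    """Assumed aggregation: sum of partial ratios of the other mentions."""
    desc = description.lower()
    return sum(partial_ratio(m.lower(), desc)
               for m in all_mentions if m != mention)
```

Summation is only one possible aggregation; a maximum or mean over the contextual mentions would fit the same description.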

[0042] As shown and described in Figures 1 to 5, the entity linking (EL) algorithm, composed of a set of disjunctive rules, is reformulated into an LNN representation for learning. The entity linking rules are a restricted form of first-order logic (FOL) rules, comprising a set of Boolean predicates connected by the logical operators AND (∧) and OR (∨). A Boolean predicate has the form $f_k > \theta$, where $f_k \in F$ is one of the feature functions and $\theta$ is a learned threshold. Below are examples of two entity linking rules:

[0043] $R_1(m_i, e_{ij}) \leftarrow jacc(m_i, e_{ij}) > \theta_1 \wedge Ctx(m_i, e_{ij}) > \theta_2$

[0044] $R_2(m_i, e_{ij}) \leftarrow lev(m_i, e_{ij}) > \theta_3 \wedge Prom(m_i, e_{ij}) > \theta_4$

[0045] Based on these examples, the first example rule $R_1(m_i, e_{ij})$ evaluates to true if the predicates $jacc(m_i, e_{ij}) > \theta_1$ and $Ctx(m_i, e_{ij}) > \theta_2$ are both true, and the second example rule $R_2(m_i, e_{ij})$ evaluates to true if the predicates $lev(m_i, e_{ij}) > \theta_3$ and $Prom(m_i, e_{ij}) > \theta_4$ are both true. In an exemplary embodiment, rules such as the first and second example rules can be combined by disjunction to form a larger EL algorithm. Below is an example of such an extension:

[0046] $Links(m_i, e_{ij}) \leftarrow R_1(m_i, e_{ij}) \vee R_2(m_i, e_{ij})$

[0047] $Links(m_i, e_{ij})$ evaluates to true if either the first or the second rule is true. In an exemplary embodiment, the Links predicate represents a disjunction of at least two rules and is used to store high-quality links between mentions and candidate entities that satisfy the conditions of at least one rule.
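The Boolean reading of the two example rules and their disjunction can be written directly. The sketch below assumes the four feature values have already been computed for one mention-candidate pair and are passed in a dictionary; the dictionary keys, function names, and threshold values are hypothetical.

```python
from typing import Dict, Sequence

Features = Dict[str, float]  # feature values for one (mention, candidate) pair


def rule_1(f: Features, theta1: float, theta2: float) -> bool:
    """R1: jacc > theta1 AND Ctx > theta2."""
    return f["jacc"] > theta1 and f["ctx"] > theta2


def rule_2(f: Features, theta3: float, theta4: float) -> bool:
    """R2: lev > theta3 AND Prom > theta4."""
    return f["lev"] > theta3 and f["prom"] > theta4


def links(f: Features, thetas: Sequence[float]) -> bool:
    """Links: R1 OR R2, with thetas = (theta1, theta2, theta3, theta4)."""
    t1, t2, t3, t4 = thetas
    return rule_1(f, t1, t2) or rule_2(f, t3, t4)


# Illustrative values only: R1 fires, R2 does not, so the pair is linked.
pair = {"jacc": 0.9, "ctx": 0.8, "lev": 0.2, "prom": 0.1}
linked = links(pair, (0.5, 0.5, 0.5, 0.5))
```

In this crisp form the thresholds are fixed by hand; the LNN reformulation replaces the hard comparisons and connectives with learnable counterparts.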

[0048] The EL algorithm is also used as a scoring mechanism. Below is an example of a scoring function based on the first and second rules in the example:

[0049]

[0050] Here, $rw_i$ are manually assignable rule weights and $fw_i$ are manually assignable feature weights. As shown and described herein, learning is performed with respect to the thresholding operations $\theta_i$, the feature weights $fw_i$, and the rule weights $rw_i$.

[0051] Referring to Figure 1, a block diagram (100) is provided to illustrate a computer system with tools supporting a neural-symbolic solution for entity linking, which in an exemplary embodiment is applied to a short text scenario. Typically, entity linking extracts features that measure the degree of similarity between a textual mention and any of a number of candidate entities. In an exemplary embodiment, the short text is a single sentence or question. A challenge for techniques operating in short text environments is the limited context surrounding the mention. As described herein, the system and associated tools combine logical rules and learning, so that multiple types of EL features can be combined in an interpretable manner and learned using gradient-based techniques. As shown, a server (110) is provided that communicates with multiple computing devices (180), (182), (184), (186), (188), and (190) via a network connection (105). The server (110) is configured with a processing unit (112) operatively coupled to a memory (114) via a bus (116). A tool in the form of an artificial intelligence (AI) platform (150) is shown local to the server (110) and operatively coupled to the processing unit (112) and the memory (114). As shown, the AI platform (150) includes tools in the form of a feature manager (152), an evaluator (154), a machine learning (ML) manager (156), and a rule manager (158). Together, these tools provide functional support for entity linking over the network (105) from one or more of the computing devices (180), (182), (184), (186), (188), and (190). The computing devices (180), (182), (184), (186), (188), and (190) communicate with each other and with other devices or components via one or more wired and / or wireless data communication links, wherein each communication link may include one or more wires, routers, switches, transmitters, receivers, etc. In this network arrangement, the server (110) and network connection (105) enable the generation of features and the application of the generated features to an EL algorithm, which consists of a set of disjunctive rules reformulated into an LNN representation for learning. Other embodiments of the server (110) can be used with components, systems, subsystems and / or devices other than those described herein.

[0052] The AI platform (150), or in one embodiment the tools embedded therein, including the feature manager (152), the evaluator (154), the ML manager (156), and the rule manager (158), can be configured to receive input from various sources, including but not limited to input from the network (105) and an operatively coupled knowledge base (160). As shown herein, the knowledge base (160) includes a first library (162$_0$) of annotated datasets, denoted herein as dataset$_{0,0}$ (164$_{0,0}$), dataset$_{0,1}$ (164$_{0,1}$), ..., dataset$_{0,N}$ (164$_{0,N}$). The number of datasets in the first library (162$_0$) is for illustrative purposes and should not be considered limiting. Similarly, in an exemplary embodiment, the knowledge base (160) may include one or more supplementary libraries, each containing one or more datasets. Therefore, the number of libraries shown and described herein should not be considered limiting.

[0053] The various computing devices (180), (182), (184), (186), (188), and (190) communicating with the network (105) serve as access points to the AI platform (150) and the corresponding tools (e.g., managers and evaluator), including the feature manager (152), the evaluator (154), the ML manager (156), and the rule manager (158). Some of the computing devices may include devices for use by the AI platform (150), and in one embodiment by the tools (152), (154), (156), and (158), to support generating a learned model with learned thresholding operations and weights for logical connectives, and dynamically generating templates for application of the learned model. In various embodiments, the network (105) may include local network connections and remote connections, enabling the AI platform (150) and the embedded tools (152), (154), (156), and (158) to operate in environments of any size, including local and global (e.g., the Internet). Therefore, the server (110) and AI platform (150) serve as the front-end system, and the knowledge base (160) and its one or more libraries and datasets serve as the back-end system.

[0054] Data annotation is the process of adding metadata to a dataset, effectively labeling the associated data and allowing ML algorithms to leverage the corresponding pre-existing data classifications. As described in detail below, the server (110) and the AI platform (150) utilize input from the knowledge base (160) in the form of an annotated dataset from one of the libraries, e.g., dataset$_{0,1}$ (164$_{0,1}$) of library (162$_0$). In an exemplary embodiment, the annotated data is in the form of entity-mention pairs ($m_i$, $e_{ij}$), each pair having a corresponding label. Similarly, in one embodiment, an annotated dataset can be transmitted across the network (105) from one or more operatively coupled machines or systems. The AI platform (150) utilizes the feature manager (152) to generate a feature set for one or more entity-mention pairs in the annotated dataset. In an exemplary embodiment, a catalog of feature functions (including non-embedding and embedding-based functions) is used to generate features that measure (e.g., compute) the similarity between mention $m_i$ and candidate entity $e_{ij}$ for a subset of the labeled entity-mention pairs, where each feature has a corresponding similarity predicate. Examples of such features include, but are not limited to: name features, which compute the degree of similarity between the name of mention $m_i$ and the name of candidate entity $e_{ij}$; context features, which assess the aggregated similarity between the context of mention $m_i$ and the description of candidate entity $e_{ij}$; type features, which measure the overlap similarity between the type of mention $m_i$ and the domain set of $e_{ij}$; and entity prominence (saliency) features, which measure the prominence of candidate entity $e_{ij}$ as the number of entities in the target knowledge graph that link to $e_{ij}$. Therefore, the initial aspect involves a similarity assessment of candidate entity-mention pairs, which generates quantitative properties.

[0055] The evaluator (154) shown herein is operatively coupled to the feature manager (152) and evaluates the features generated for the entity-mention pairs against an entity linking (EL) logical neural network (LNN) rule template. More specifically, the evaluator (154) reformulates the entity linking algorithm, consisting of a set of disjunctive rules, into an LNN representation. An example LNN rule template (e.g., LNN representation) is shown and described in Figure 5. In an exemplary embodiment, one or more LNN rule templates are provided in a knowledge base or transmitted to the evaluator (154) via the network (105). As an example, the knowledge base (160) is shown herein with a library of LNN rule templates, namely a second library (162$_1$), whose templates are shown herein as template$_{1,0}$ (164$_{1,0}$), template$_{1,1}$ (164$_{1,1}$), ..., template$_{1,M}$ (164$_{1,M}$). The number of rule templates in the second library (162$_1$) is for illustrative purposes and should not be considered limiting. Similarly, in an exemplary embodiment, the knowledge base (160) may include one or more supplementary libraries, each containing one or more LNN rule templates. As illustrated in the example of Figure 5, an LNN rule template can be formulated as an inverted binary tree structure with one or more logical connection rules and corresponding connectivity weights. This example rule template is relatively basic. In exemplary embodiments, the LNN rule template can be extended with additional layers and extended rules in the binary tree. Therefore, as shown herein, the generated features are evaluated against the selected or identified LNN rule template.

[0056] LNN rule templates can be formulated as inverted binary trees, where features, or a subset of the feature functions, are represented in the leaf nodes of the binary tree. Each feature is associated with a corresponding threshold $\theta_i$, also referred to herein as a thresholding operation. Internal nodes of the binary tree represent logical AND or logical OR operations. Edges are provided between each internal node and a thresholding operation, and between each internal node and the root node. In an exemplary embodiment, the binary tree may have multiple levels of internal nodes, where edges extend between adjacent levels of nodes. Each edge has a corresponding weight, referred to herein as a rule weight. The thresholding operations and rule weights, collectively referred to herein as connectivity weights, are each learned. As shown herein, an ML manager (156) operatively coupled to the evaluator (154) is configured to learn the thresholding operations and connectivity weights using an ANN and a corresponding ML algorithm. For a thresholding operation, the ML manager (156) learns an appropriate threshold for each computed feature associated with the corresponding similarity predicate. The evaluator (154) interfaces with the ML manager (156) to filter one or more features based on the learned thresholds. More specifically, filtering enables the evaluator (154) to determine whether to incorporate a feature into an LNN rule template, which is done by either removing the feature or assigning a non-zero score to the feature.
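The tree-shaped template can be sketched as a small data structure. This is a hedged illustration, not the structure of Figure 5: it assumes that a leaf passes its feature value through when the value exceeds the threshold and outputs 0 otherwise, and that internal nodes apply the weighted AND formula from the earlier paragraphs generalized to a list of children, with OR derived from AND. The class names, field names, and parameter values are hypothetical, and the parameters here are fixed by hand rather than learned.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


def _clamp(v: float) -> float:
    return max(0.0, min(1.0, v))


@dataclass
class Leaf:
    """Leaf node: a feature with its thresholding operation."""
    feature: str
    theta: float

    def evaluate(self, feats: Dict[str, float]) -> float:
        value = feats[self.feature]
        # Assumed filtering: keep the score above the threshold, else drop to 0.
        return value if value > self.theta else 0.0


@dataclass
class Node:
    """Internal node: weighted logical AND or OR over its children."""
    op: str                               # "and" or "or"
    children: List[Union["Node", Leaf]]
    weights: List[float]                  # one edge weight per child
    beta: float = 1.0

    def evaluate(self, feats: Dict[str, float]) -> float:
        xs = [child.evaluate(feats) for child in self.children]
        if self.op == "and":
            return _clamp(self.beta - sum(w * (1.0 - x)
                                          for w, x in zip(self.weights, xs)))
        if self.op == "or":
            # 1 - AND(1 - x_1, ..., 1 - x_n)
            return 1.0 - _clamp(self.beta - sum(w * x
                                                for w, x in zip(self.weights, xs)))
        raise ValueError(f"unknown operator: {self.op}")


# Template mirroring Links <- R1 OR R2 with illustrative parameters.
template = Node("or", [
    Node("and", [Leaf("jacc", 0.5), Leaf("ctx", 0.5)], [1.0, 1.0]),
    Node("and", [Leaf("lev", 0.5), Leaf("prom", 0.5)], [1.0, 1.0]),
], [1.0, 1.0])

score = template.evaluate({"jacc": 0.9, "ctx": 0.8, "lev": 0.2, "prom": 0.1})
```

A hard cutoff at the leaves is not differentiable in the threshold; a learned version would typically substitute a smooth gate so that gradients can reach each threshold.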

[0057] Connectivity weights are identified and associated with each rule template. As illustrated by example herein, template$_{1,0}$ (164$_{1,0}$) is shown with a set of connectivity weights, referred to herein as weight$_{1,0}$ (166$_{1,0}$), weight$_{1,1}$ (166$_{1,1}$), ..., weight$_{1,M}$ (166$_{1,M}$). Although not shown, each of the other templates (e.g., template$_{1,1}$ (164$_{1,1}$) and template$_{1,M}$ (164$_{1,M}$)) has corresponding connectivity weights. The number and characteristics of the weights are based on the corresponding template. Similarly, in an exemplary embodiment, the knowledge base (160) has a third library (162$_2$) populated with ANNs, illustrated herein by way of example as ANN$_{2,0}$ (164$_{2,0}$), ANN$_{2,1}$ (164$_{2,1}$), ..., ANN$_{2,P}$ (164$_{2,P}$). The number of ANNs shown herein is for illustrative purposes only and should not be considered limiting. In one embodiment, each ANN may have a corresponding or embedded ML algorithm. Thresholding operations and connectivity weights are parameters that are learned and selectively updated, separately or jointly, by the ML manager (156). Details of the learning are shown and described below with reference to Figure 4. Once learning and updating are complete, a learned model with learned thresholding operations and learned weights for the logical connectives is generated.

[0058] As shown and described herein, rule templates with corresponding rules can be provided, wherein thresholding operations and connectivity weights are learned to generate a learned model. In an exemplary embodiment, given a feature set and an EL-annotated dataset, new rules with appropriate weights for logical connectives can be learned. A rule manager (158), operatively coupled to the evaluator (154) as shown herein, is provided to support such functionality. More specifically, the rule manager (158) learns one or more logical connection rules, dynamically generates a template in the form of a binary tree, and learns logical rules associated with the template. Once the rules are learned, the rule manager (158) evaluates the selected rules on the labeled dataset and selectively assigns the selected rules to corresponding nodes in the binary tree. The rule manager (158) selectively assigns a conjunction operator (e.g., logical AND) or a disjunction operator (e.g., logical OR) to each internal node of the binary tree. Details of the functions of the rule manager (158) regarding rule learning and node operator assignment are shown and described in Figure 4.

[0059] Although shown as being included in or integrated with the server (110), the AI platform (150) may be implemented in a separate computing system (e.g., 190) connected to the server (110) across the network (105). Similarly, although shown as being local to the server (110), the tools (152), (154), (156), and (158) may be distributed together or separately on the network (105). Regardless of where they are implemented, the feature manager (152), evaluator (154), ML manager (156), and rule manager (158) are used to support and enable LNN EL.

[0060] The information processing systems that can utilize the server (110) range from small handheld devices, such as handheld computers / mobile phones (180), to large systems, such as mainframe computers (182). Examples of handheld computers (180) include personal digital assistants (PDAs) and personal entertainment devices such as MP4 players, portable televisions, and CD players. Other examples of information processing systems include pen or tablet computers (184), laptop or notebook computers (186), personal computer systems (188), and servers (190). As shown, the various information processing systems can be networked together using a computer network (105). Types of computer networks (105) that can be used to interconnect the various information processing systems include local area networks (LANs), wireless local area networks (WLANs), the Internet, the public switched telephone network (PSTN), other wireless networks, and any other network topology that can be used to interconnect information processing systems. Many information processing systems include non-volatile data storage, such as hard disk drives and / or non-volatile memory. Some information processing systems may use a separate non-volatile data storage device (e.g., the server (190) uses a non-volatile data storage device (190A), and the mainframe computer (182) uses a non-volatile data storage device (182A)). The non-volatile data storage device (182A) may be a component external to the various information processing systems or may be a component internal to one of the information processing systems.

[0061] Information processing systems can take many forms, some of which are shown in Figure 1. For example, an information processing system can take the form of a desktop computer, server, portable computer, laptop computer, notebook computer, or other form factor computer or data processing system. Furthermore, an information processing system can take other form factors, such as a personal digital assistant (PDA), gaming device, ATM machine, portable telephone device, communication device, or other device including a processor and memory.

[0062] An application programming interface (API) is understood in the art as a software intermediary between two or more applications. With respect to the embodiments shown and described in Figure 1, one or more APIs can be used to support one or more of the AI platform tools, including the feature manager (152), the evaluator (154), the ML manager (156), and the rule manager (158), and their associated functionality. Referring to Figure 2, a block diagram (200) is provided illustrating the AI platform tools and their associated APIs. As shown, multiple tools are embedded within the AI platform (205), including a feature manager (252) associated with API0 (212), an evaluator (254) associated with API1 (222), an ML manager (256) associated with API2 (232), and a rule manager (258) associated with API3 (242). Each API can be implemented using one or more languages and interface specifications.

[0063] API0 (212) provides support for generating feature sets for entity-mention pairs. API1 (222) provides support for evaluating the generated features against EL LNN rule templates. API2 (232) provides support for learning the thresholding operations and connectivity weights in the rule templates. API3 (242) provides support for learning EL rules and selectively assigning the learned rules to templates.

[0064] As shown, each of the APIs (212), (222), (232), and (242) is operatively coupled to an API orchestrator (260), otherwise known as an orchestration layer, which is understood in the art as an abstraction layer that transparently threads the individual APIs together. In one embodiment, the functionality of the individual APIs can be joined or combined. Thus, the configuration of the APIs shown herein should not be considered limiting. Accordingly, as shown herein, the functionality of each tool can be embodied or supported by its respective API.

[0065] Referring to Figures 3A to 3C, a flowchart (300) is provided to illustrate a process for learning thresholding operations and weights in an entity linking (EL) algorithm. As shown, the EL algorithm has rules (302) in the form of Boolean predicates connected by logical AND and logical OR operators. To facilitate learning of the thresholding operations and weights in the EL algorithm, the Boolean logic rules are mapped to an LNN formal system (304), in which the LNN constructs for logical OR and logical AND allow continuous real values in [0, 1]. In an exemplary embodiment, the LNN formal system may be an inverted tree structure with features assigned to leaf nodes and the entity linking rules represented in internal nodes and the root node. Each LNN operator produces a value in [0, 1] based on the values of its inputs, their weights, and a bias β, where the weights and bias are learnable parameters. The LNN formal system (also referred to herein as an LNN rule template) consists of external nodes that are operatively connected to internal nodes via corresponding links. External nodes represent features, or feature nodes, and internal nodes represent one of logical AND, logical OR, or thresholding operations.

[0066] The thresholds for the feature weights and rule weights in the LNN formal system (e.g., an LNN rule template) are initialized (306). In an exemplary embodiment, the feature weights and rule weights are collectively referred to herein as weights. After initialization at step (306), a subset S of labeled mention-entity pairs (e.g., triples) is selected or received from the labeled dataset L (308). In an exemplary embodiment, the selection at step (308) is a random selection of mention-entity pairs. Each triple is represented as (m_i, e_i, y_i), where m_i denotes a mention, e_i denotes an entity, and y_i denotes a match or a non-match; in a non-limiting exemplary embodiment, 1 represents a match and 0 represents a non-match. The quantity of selected triples in the subset is assigned to the variable S_Total (310), and a corresponding triple counting variable S is initialized (312). The features in the inverted tree structure are known or determined, and the quantity of features is assigned to the variable F_Total (314). For each feature, from F = 1 to F_Total, a similarity measure between the mention m_i and the candidate entity e_i (also referred to herein as a feature function, feature_F) is computed (316). Examples of feature measures include, but are not limited to, name, context, type, and entity saliency, as described above. As shown, a feature set is computed for each entity-mention pair. In an exemplary embodiment, the feature set comprises similarity predicates that use one or more string similarity functions to compare the mention m_i with the candidate entity e_i.
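
As a minimal sketch of the feature computation at step (316), the Python below computes two string-similarity features for one mention and candidate entity pair. The function names (`jaccard_similarity`, `ratio_similarity`, `compute_features`) and the choice of similarity functions are illustrative assumptions; the text does not specify which string similarity functions are used.

```python
from difflib import SequenceMatcher
from typing import Dict


def jaccard_similarity(mention: str, entity: str) -> float:
    """Token-level Jaccard overlap in [0, 1] (an assumed, illustrative feature)."""
    a, b = set(mention.lower().split()), set(entity.lower().split())
    if not a and not b:
        return 1.0  # two empty strings are treated as identical
    return len(a & b) / len(a | b)


def ratio_similarity(mention: str, entity: str) -> float:
    """Character-level similarity in [0, 1] from difflib (an assumed feature)."""
    return SequenceMatcher(None, mention.lower(), entity.lower()).ratio()


def compute_features(mention: str, entity: str) -> Dict[str, float]:
    """Return a feature set for one mention-entity pair (m_i, e_i)."""
    return {
        "jaccard": jaccard_similarity(mention, entity),
        "ratio": ratio_similarity(mention, entity),
    }
```

Each value lies in [0, 1], so it can be fed directly to a thresholding operation at a leaf node of the rule template.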

[0067] After feature computation, each entity-mention pair is evaluated against an EL logical neural network (LNN) rule template, where the template has one or more logical connection rules and corresponding connectivity weights organized in a binary tree (also referred to herein as a hierarchical structure). The binary tree is organized such that the root node is operatively coupled to two or more internal nodes, and the internal nodes are operatively coupled to leaf nodes residing in the last level of the binary tree. As shown herein, the rules R are evaluated on each triple, triple_S. The triple is processed through the tree structure in a bottom-up manner, i.e., starting from the leaf nodes that represent the features. Each node in the tree is referred to herein as a vertex v, and each vertex can be a root node, an internal node, or a leaf node. The number of vertices in the tree is assigned to the variable v_Total (318). For each vertex, from v = 1 to v_Total, it is determined whether vertex v is a thresholding operation (320). Each feature is represented in a leaf node, and each feature has a corresponding or associated thresholding operation. Following a positive response at step (320), the corresponding thresholding operation is calculated as follows:

[0068] $$f_i \cdot \left[1 + \exp(\theta_v - f_i)\right]^{-1}$$

[0069] The calculation result is sent upstream to the next level in the inverted tree structure (322). In an exemplary embodiment, the evaluation at step (322) involves filtering features based on the corresponding learned threshold θ of each feature. As an example, if the feature value f_i is 0.1, then, depending on the value of θ_v, the factor [1 + exp(θ_v - f_i)]^{-1} can evaluate to a number between 1 and 0.29. For example, if θ_v is 0.9, the thresholding factor evaluates to approximately 0.3. Multiplying this value by f_i reduces the output to a value close to 0, effectively removing the feature from consideration. Therefore, feature filtering at step (322) selectively incorporates the feature into the LNN rule template by either effectively removing the feature or assigning a non-zero score to it.
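
The worked example above can be checked numerically. The sketch below implements the thresholding operation of paragraph [0068]; the function name `threshold_gate` is a hypothetical label, not a name from the source.

```python
import math


def threshold_gate(f_i: float, theta_v: float) -> float:
    """Soft threshold from [0068]: f_i * [1 + exp(theta_v - f_i)]^-1."""
    return f_i / (1.0 + math.exp(theta_v - f_i))


if __name__ == "__main__":
    # Values from the example in the text: f_i = 0.1, theta_v = 0.9.
    factor = 1.0 / (1.0 + math.exp(0.9 - 0.1))  # sigmoid factor alone
    print(round(factor, 2))
    print(round(threshold_gate(0.1, 0.9), 3))   # gated feature value
```

With f_i = 0.1 and θ_v = 0.9 the factor is about 0.31 and the gated value about 0.031, so a low-scoring feature is pushed close to 0, whereas a feature well above its threshold keeps most of its value.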

[0070] If the response at step (320) is negative, it is determined whether vertex v is a logical AND operation (324). Following a positive response at step (324), the logical AND operation is evaluated as follows:

[0071]

[0072] The calculation result is then sent upstream to the next level in the inverted tree structure (326). A negative response at step (324) indicates that vertex v is a logical OR operation (328). The logical OR operation is evaluated as follows:

[0073]
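
The formulas for the logical AND and logical OR evaluations are not reproduced in this text. Purely as an illustration, and as an assumption rather than the source's own formulas, the sketch below uses a weighted Łukasiewicz-style formulation in which each operator clamps a weighted sum to [0, 1] using a bias β and per-input weights. This is consistent with the description that each LNN operator produces a value in [0, 1] from its inputs, their weights, and a bias. The names `clamp01`, `lnn_and`, and `lnn_or` are hypothetical.

```python
from typing import Sequence


def clamp01(x: float) -> float:
    """Clamp a real value into the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def lnn_and(inputs: Sequence[float], weights: Sequence[float], beta: float = 1.0) -> float:
    """Assumed weighted conjunction: clamp(beta - sum_j w_j * (1 - x_j))."""
    return clamp01(beta - sum(w * (1.0 - x) for x, w in zip(inputs, weights)))


def lnn_or(inputs: Sequence[float], weights: Sequence[float], beta: float = 1.0) -> float:
    """Assumed weighted disjunction: clamp(1 - beta + sum_j w_j * x_j)."""
    return clamp01(1.0 - beta + sum(w * x for x, w in zip(inputs, weights)))
```

With β = 1 and unit weights, both operators reduce to classical Boolean AND and OR on inputs of exactly 0 or 1, while intermediate inputs yield intermediate truth values that remain differentiable almost everywhere in the weights and bias.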

[0074] The calculation result is then sent upstream to the next level in the inverted tree structure (330). After each of the vertices has been evaluated as shown in steps (322), (326), and (330), the rule prediction, which corresponds to the logical OR operation represented in the root node, is assigned to the variable p_i (332). The triple, triple_S, has a label y_i, and the loss between y_i and p_i is calculated (334). Details of the loss calculation are shown and described below. As shown in steps (320) through (332), the thresholds and weights (collectively referred to herein as connectivity weights) are learned. More specifically, an artificial neural network (ANN) and a corresponding machine learning (ML) algorithm are used to calculate the loss corresponding to the feature prediction.

[0075] Following step (334), the triple counting variable S is incremented (336), and it is determined whether each triple in the subset has been evaluated (338). A negative response to this determination is followed by a return to step (314) to evaluate the next triple in the subset, and a positive response concludes the initial aspect of the rule evaluation. More specifically, backpropagation is performed following a positive response at step (338), which involves computing gradients based on all losses in the subset S_Total (340), and propagating the gradients for the subset S_Total to update the parameters in rule R, including θ_v, β_v, and the corresponding weights (342). Accordingly, a suitable threshold is learned for each of the computed features. In an exemplary embodiment, the ANN and the corresponding ML algorithm train the LNN-formulated EL rules on the labeled dataset and perform gradient descent using a margin ranking loss over all candidates in C_i. For a mention m_i and a candidate set C_i, the loss function L(m_i, C_i) is defined as:

[0076]

[0077] where e_ip ∈ C_i is the positive candidate, C_i \ {e_ip} is the set of negative candidates, and μ is the margin hyperparameter. The positive and negative labels are obtained from the labels L_i. It is then determined whether another subset of labeled mention-entity pairs exists in the labeled dataset for learning the rule R (344). A negative response is followed by returning the learned rule R (346), while a positive response is followed by a return to step (308). Thus, the labeled dataset and its corresponding entity-mention pairs are processed through the LNN formal system to learn the corresponding rule R, including the connectivity weights on the links connecting the nodes of the tree structure.
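
The loss formula itself is not reproduced in this text. As a hedged illustration only, the sketch below assumes a common form of ranking loss with hyperparameter μ, in which the score of the positive candidate e_ip is compared against the score of each negative candidate in C_i \ {e_ip} and a penalty is incurred whenever the positive score does not exceed the negative score by at least μ. The function name and the scoring inputs are assumptions.

```python
from typing import Sequence


def margin_ranking_loss(pos_score: float, neg_scores: Sequence[float], mu: float = 0.1) -> float:
    """Assumed loss: sum over negatives of max(0, mu - (s_pos - s_neg)).

    pos_score  -- rule prediction for the positive candidate e_ip
    neg_scores -- rule predictions for the negative candidates in C_i \\ {e_ip}
    mu         -- hyperparameter separating positive from negative scores
    """
    return sum(max(0.0, mu - (pos_score - s_neg)) for s_neg in neg_scores)
```

Under this assumed form, a negative candidate contributes nothing once the positive candidate outscores it by at least μ, so gradient updates concentrate on the candidates that are still confusable with the correct entity.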

[0078] As shown in Figures 3A to 3C, given a set of rule templates, a set of features, and a labeled EL dataset, LNNs are used to learn appropriate weights for logical connectives. Referring to Figure 4, a flowchart (400) is provided to illustrate a process of using LNNs to learn new rules with appropriate weights for logical connectives. As mentioned above, an exemplary set of non-embedding-based feature functions is provided to measure the similarity between a mention m_i and a candidate entity e_ij. Exemplary sets include name features, context features, type features, and entity saliency features. The variable F is used herein to represent a partition of these features (402). The input is in the form of a labeled dataset L (e.g., entity-mention pairs) and the feature partition F (404). The number of binary trees that can be constructed with a number of leaves bounded by |F| is given by C(|F|-1), where C denotes the Catalan number (406). In the steps described below, it is assumed that each node has an operation, whereby a logical AND or logical OR operator may be assigned to the node. The following pseudocode demonstrates the process of selecting a logical operator and assigning it to an internal node of the binary tree:

[0079]

[0080] The pseudocode demonstrates the process of learning one or more logical connection rules, and more specifically, the aspect of dynamically generating templates. In an exemplary embodiment, the template is a hierarchical structure in the form of a binary tree, and the nodes processed for rule assignment are internal nodes. More specifically, as shown, logical rules R are learned based on the generated template, and the selected rules are evaluated on a validation set, such as a labeled dataset. Based on this evaluation, the selected rules are selectively assigned to corresponding internal nodes in the hierarchical structure. In an exemplary embodiment, the assigned rules are conjunction or disjunction LNN operators. Thus, as illustrated herein, given a feature set and an EL-labeled dataset, new rules with corresponding weights for logical connectives are learned.
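
The count of candidate tree shapes at step (406), C(|F|-1), can be computed directly from the closed form of the Catalan numbers. The sketch below is an illustration only; the function names are hypothetical and the code does not reproduce the pseudocode referenced above.

```python
from math import comb


def catalan(n: int) -> int:
    """n-th Catalan number: C(n) = binom(2n, n) / (n + 1)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return comb(2 * n, n) // (n + 1)


def count_tree_shapes(num_features: int) -> int:
    """Number of binary tree shapes with |F| leaves, i.e. C(|F| - 1)."""
    if num_features < 1:
        raise ValueError("at least one feature is required")
    return catalan(num_features - 1)
```

For example, a partition with five features admits C(4) = 14 tree shapes, before any AND or OR operator is assigned to the internal nodes.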

[0081] Referring to Figure 5, a block diagram (500) is provided to illustrate an example LNN reformulation of the EL algorithm. As shown in this example, the reformulation is an inverted tree structure with features and corresponding thresholds, logical operators, and associated weights. In this example, five features are shown. In exemplary embodiments, there may be a different number of features in the reformulation, and therefore the number shown and described herein should not be considered limiting. The five features, referred to herein as f0 (510), f1 (512), f2 (514), f3 (516), and f4 (518), are represented as individual leaf nodes of the inverted tree structure. Each feature is shown with a corresponding threshold. More specifically, feature f0 (510) is shown operatively connected to the corresponding threshold operation θ0 (520), feature f1 (512) is shown operatively connected to the corresponding threshold operation θ1 (522), feature f2 (514) is shown operatively connected to the corresponding threshold operation θ2 (524), feature f3 (516) is shown operatively connected to the corresponding threshold operation θ3 (526), and feature f4 (518) is shown operatively connected to the corresponding threshold operation θ4 (528). Each threshold operation is learned and directly associated with one or more feature functions.

[0082] As further shown, a first set of internal nodes of the inverted tree, shown herein as internal node_{0,0} (530) and internal node_{0,1} (550), is operatively connected to a selection of the features and their corresponding thresholds. Internal node_{0,0} (530) is operatively connected to features f0 (510), f1 (512), and f2 (514), and internal node_{0,1} (550) is operatively connected to features f3 (516) and f4 (518). Edges are shown operatively connecting the leaf nodes and their corresponding thresholding elements to the first set of internal nodes (530) and (550). Specifically, edge_{0,0} (532) operatively connects feature f0 (510) and the corresponding threshold θ0 (520) to node_{0,0} (530), edge_{0,1} (534) operatively connects feature f1 (512) and the corresponding threshold θ1 (522) to node_{0,0} (530), and edge_{0,2} (536) operatively connects feature f2 (514) and the corresponding threshold θ2 (524) to node_{0,0} (530). Similarly, edge_{1,0} (552) connects feature f3 (516) and the corresponding threshold θ3 (526) to node_{0,1} (550), and edge_{1,1} (554) connects feature f4 (518) and the corresponding threshold θ4 (528) to node_{0,1} (550). Each of the edges, including edge_{0,0} (532), edge_{0,1} (534), edge_{0,2} (536), edge_{1,0} (552), and edge_{1,1} (554), has a separate corresponding weight that, like the thresholds, is learned. In an exemplary embodiment, these weights are referred to as feature weights fw, where edge_{0,0} (532) has feature weight fw0, edge_{0,1} (534) has feature weight fw1, edge_{0,2} (536) has feature weight fw2, edge_{1,0} (552) has feature weight fw3, and edge_{1,1} (554) has feature weight fw4. A second internal node, node_{1,0} (560), is shown operatively coupled to internal node_{0,0} (530) and internal node_{0,1} (550). Two edges are shown operatively coupled to the second internal node_{1,0} (560), namely edge_{2,0} (562) and edge_{2,1} (564). Each of these edges has a corresponding weight, referred to herein as a rule weight rw. That is, edge_{2,0} (562) has rule weight rw0, and edge_{2,1} (564) has rule weight rw1. Like the feature weights and thresholds, the rule weights are learned.

[0083] In this example, each of internal node_{0,0} (530) and internal node_{0,1} (550) represents an LNN logical AND (∧) operation, and the second internal node_{1,0} (560), which in this example is also the root node, represents a logical OR (∨). As an example, rule R1 associated with internal node_{0,0} (530) is as follows:

[0084] R1: (f0>θ0)∧(f1>θ1)∧(f2>θ2)

[0085] If f0 > θ0 is true, f1 > θ1 is true, and f2 > θ2 is true, then R1 evaluates to true. Similarly, as an example, a second rule, rule R2, associated with internal node_{0,1} (550) is as follows:

[0086] R2: (f3>θ3)∧(f4>θ4)

[0087] If f3 > θ3 is true and f4 > θ4 is true, then R2 evaluates to true. The second internal node_{1,0} (560) is the root node of the inverted tree structure and, as shown herein, combines the Boolean logic of internal node_{0,0} (530) and internal node_{0,1} (550). As an example, rule R3 of the root node, node_{1,0} (560), is as follows:

[0088] R3: R1∨R2

[0089] If either the first or the second rule R1 or R2 evaluates to true, then R3 evaluates to true.
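
The Boolean reading of rules R1, R2, and R3 for the five-feature example can be sketched as follows. This is the crisp (non-learned) form of the rules only; the function name `evaluate_rules` and the dictionary layout are illustrative assumptions, and the feature weights fw and rule weights rw are deliberately omitted.

```python
from typing import Dict, Sequence


def evaluate_rules(f: Sequence[float], theta: Sequence[float]) -> Dict[str, bool]:
    """Evaluate R1, R2, and R3 for features f0..f4 and thresholds theta0..theta4."""
    if len(f) != 5 or len(theta) != 5:
        raise ValueError("this example expects exactly five features and thresholds")
    passed = [fi > ti for fi, ti in zip(f, theta)]  # (f_k > theta_k) for each leaf
    r1 = passed[0] and passed[1] and passed[2]      # R1: conjunction over f0, f1, f2
    r2 = passed[3] and passed[4]                    # R2: conjunction over f3, f4
    return {"R1": r1, "R2": r2, "R3": r1 or r2}     # R3: disjunction at the root
```

For instance, with all thresholds at 0.5 and features (0.9, 0.8, 0.7, 0.1, 0.9), R1 holds, R2 fails because f3 is below its threshold, and R3 holds because one of its two disjuncts is true.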

[0090] Aspects of the tools (152), (154), (156), and (158) and their associated functionality can be embodied in a computer system / server in a single location, or, in an embodiment, can be configured in a cloud-based system that shares computing resources. Referring to Figure 6, a block diagram (600) is provided illustrating an example of a computer system / server (602), hereinafter referred to as a host (602), in communication with a cloud-based support system to implement the systems and processes described above with respect to Figures 1 to 5. Examples of well-known computing systems, environments, and / or configurations suitable for use with the host (602) include, but are not limited to, personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and file systems (e.g., distributed storage environments and distributed cloud computing environments) that include any of the aforementioned systems, devices, and their equivalents.

[0091] The host (602) can be described in the general context of computer system executable instructions such as program modules that are executed by the computer system. Typically, a program module may include routines, programs, objects, components, logic, data structures, etc., that perform a specific task or implement a specific abstract data type. The host (602) can be implemented in a distributed cloud computing environment (610), where tasks are performed by remote processing devices linked via a communication network. In a distributed cloud computing environment, program modules can reside in local and remote computer system storage media, including memory storage devices.

[0092] As shown in Figure 6, the host (602) is illustrated in the form of a general-purpose computing device. Components of the host (602) may include, but are not limited to, one or more processors or processing units (604), system memory (606), and a bus (608) that couples various system components, including the system memory (606), to the processor (604). The bus (608) represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example and not limitation, these architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus. The host (602) typically includes a variety of computer system readable media. Such media can be any available media accessible to the host (602), and include volatile and non-volatile media, and removable and non-removable media.

[0093] The memory (606) may include computer system-readable media in the form of volatile memory, such as random access memory (RAM) (630) and / or cache memory (632). By way of example only, the storage system (634) may be provided for reading from and writing to a non-removable, non-volatile magnetic medium (not shown, and generally referred to as a "hard disk drive"). Although not shown, a disk drive may be provided for reading from and writing to a removable, non-volatile disk (e.g., a "floppy disk"), and an optical disk drive may be provided for reading from or writing to a removable, non-volatile optical disk such as a CD-ROM, DVD-ROM, or other optical media. In this case, each may be connected to a bus (608) via one or more data media interfaces.

[0094] A program / utility (640) having a set (at least one) of program modules (642) may be stored in memory (606), by way of example and not limitation, along with an operating system, one or more applications, other program modules, and program data. Each of the operating system, one or more applications, other program modules, and program data, or some combination thereof, may include an implementation of a networking environment. The program modules (642) typically perform the functionality and / or methods of embodiments of entity linking in a logical neural network. For example, the set of program modules (642) may include modules configured as the tools (152), (154), (156), and (158) described in Figure 1.

[0095] The host (602) can also communicate with one or more external devices (614), such as a keyboard, pointing device, sensory input device, sensory output device, etc.; a display (624); one or more devices that enable a user to interact with the host (602); and / or any device that enables the host (602) to communicate with one or more other computing devices (e.g., a network card, modem, etc.). Such communication can occur via an input / output (I / O) interface (622). Furthermore, the host (602) can communicate with one or more networks via a network adapter (620), such as a local area network (LAN), a general wide area network (WAN), and / or a public network (e.g., the Internet). As described, the network adapter (620) communicates with other components of the host (602) via a bus (608). In one embodiment, multiple nodes of a distributed file system (not shown) communicate with the host (602) via the I / O interface (622) or via the network adapter (620). It should be understood that, although not shown, other hardware and / or software components can be used in conjunction with the host (602). Examples include, but are not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archive storage systems.

[0096] In this document, the terms “computer program medium,” “computer-usable medium,” and “computer-readable medium” are used to generally refer to media such as main memory (606), including RAM (630), cache (632), and storage systems (634), such as removable storage drives and hard disks installed in hard disk drives.

[0097] The computer program (also referred to as computer control logic) is stored in memory (606). The computer program can also be received via a communication interface, such as a network adapter (620). When run, such a computer program enables the computer system to perform the features of this embodiment as discussed herein. In particular, when run, the computer program enables the processing unit (604) to perform features of the computer system. Thus, such a computer program represents a controller for the computer system.

[0098] In one embodiment, the host (602) is a node in a cloud computing environment. As is well known in the art, cloud computing is a service delivery model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, and services) that can be rapidly provisioned and released with minimal management effort or interaction with a service provider. This cloud model may include at least five characteristics, at least three service models, and at least four deployment models. Examples of these characteristics are as follows:

[0099] On-demand self-service: A cloud consumer can unilaterally and automatically provision computing capabilities, such as server time and network storage, as needed, without requiring human interaction with the service provider.

[0100] Broad network access: Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, laptops, and PDAs).

[0101] Resource pooling: The provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to demand. There is a sense of location independence in that the consumer generally has no control over or knowledge of the exact location of the provided resources, but may be able to specify location at a higher level of abstraction (e.g., country, state, or data center).

[0102] Rapid elasticity: In some cases, capabilities can be rapidly and elastically provisioned to scale out quickly and released to scale in quickly. To the consumer, the capabilities available often appear to be unlimited and can be purchased in any quantity at any time.

[0103] Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at a level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and the consumer of the utilized service.

[0104] The service models are as follows:

[0105] Software as a Service (SaaS): The capability offered to consumers is the ability to use the provider's applications running on cloud infrastructure. Applications can be accessed from various client devices through thin client interfaces such as web browsers (e.g., web-based email). Consumers do not manage or control the underlying cloud infrastructure, including the network, servers, operating system, storage, or even individual application capabilities, with possible exceptions such as limited user-specific application configuration settings.

[0106] Platform as a Service (PaaS): The capability provided to the consumer is to deploy consumer-created or acquired applications onto the cloud infrastructure using programming languages and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure, including networks, servers, operating systems, or storage, but has control over the deployed applications and possibly the configuration of the application hosting environment.

[0107] Infrastructure as a Service (IaaS): The capabilities offered to consumers are processing, storage, networking, and other basic computing resources that enable consumers to deploy and run arbitrary software, which may include operating systems and applications. Consumers do not manage or control the underlying cloud infrastructure, but have control over the operating system, storage, deployed applications, and possibly limited control over selected networking components (e.g., host firewalls).

[0108] The deployment models are as follows:

[0109] Private cloud: The cloud infrastructure is operated solely for an organization. It can be managed by the organization or a third party and can exist on-premises or off-premises.

[0110] Community cloud: Cloud infrastructure shared by several organizations and supporting a specific community with shared concerns (e.g., tasks, security requirements, policies, and compliance considerations). It can be managed by an organization or a third party and can exist on-site or off-site.

[0111] Public cloud: Cloud infrastructure available to the general public or large industrial groups and owned by organizations that sell cloud services.

[0112] Hybrid cloud: The cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).

[0113] A cloud computing environment is service-oriented, with a focus on statelessness, loose coupling, modularity, and semantic interoperability. At the heart of cloud computing is an infrastructure comprising a network of interconnected nodes.

[0114] Referring now to Figure 7, an illustrative cloud computing network (700) is shown. As illustrated, the cloud computing network (700) includes a cloud computing environment (750) having one or more cloud computing nodes (710) with which local computing devices used by cloud consumers can communicate. Examples of such local computing devices include, but are not limited to, a personal digital assistant (PDA) or cellular phone (754A), a desktop computer (754B), a laptop computer (754C), and/or an automotive computer system (754N). The nodes (710) can also communicate with one another. They can be grouped physically or virtually (not shown) in one or more networks, such as private, community, public, or hybrid clouds as described above, or a combination thereof. This allows the cloud computing environment (700) to offer infrastructure, platforms, and/or software as services for which a cloud consumer does not need to maintain resources on a local computing device. It should be understood that the types of computing devices (754A-N) shown in Figure 7 are intended to be illustrative only, and that the cloud computing environment (750) can communicate with any type of computerized device over any type of network and/or network-addressable connection (e.g., using a web browser).

[0115] Referring now to Figure 8, a set of functional abstraction layers (800) provided by the cloud computing network of Figure 7 is shown. It should be understood in advance that the components, layers, and functions shown in Figure 8 are intended to be illustrative only, and the embodiments are not limited thereto. As depicted, the following layers and corresponding functions are provided: a hardware and software layer (810), a virtualization layer (820), a management layer (830), and a workload layer (840). The hardware and software layer (810) includes hardware and software components. Examples of hardware components include mainframes, in one example IBM zSeries systems; servers based on a RISC (Reduced Instruction Set Computer) architecture, in one example IBM pSeries systems; IBM xSeries systems; IBM BladeCenter systems; storage devices; and networks and networking components. Examples of software components include network application server software, in one example IBM WebSphere application server software; and database software, in one example IBM DB2 database software. (IBM, zSeries, pSeries, xSeries, BladeCenter, WebSphere, and DB2 are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide.)

[0116] The virtualization layer (820) provides an abstraction layer from which the following examples of virtual entities can be provided: virtual servers; virtual storage; virtual networks, including virtual private networks; virtual applications and operating systems; and virtual clients.

[0117] In one example, the management layer (830) can provide the following functions: resource provisioning, metering and pricing, user portal, service level management, and SLA planning and fulfillment. Resource provisioning provides dynamic procurement of computing resources and other resources that are used to perform tasks within the cloud computing environment. Metering and pricing provides cost tracking as resources are utilized within the cloud computing environment, and billing or invoicing for consumption of these resources. In one example, these resources may include application software licenses. Security provides identity verification for cloud consumers and tasks, as well as protection for data and other resources. The user portal provides access to the cloud computing environment for consumers and system administrators. Service level management provides cloud computing resource allocation and management such that required service levels are met. Service Level Agreement (SLA) planning and fulfillment provides pre-arrangement for, and procurement of, cloud computing resources for which a future requirement is anticipated in accordance with an SLA.

[0118] The workload layer (840) provides examples of functions that can be utilized in a cloud computing environment. Examples of workloads and functions that can be provided from this layer include, but are not limited to: map creation and navigation; software development and lifecycle management; virtual classroom education delivery; data analysis and processing; transaction processing; and entity linking in logical neural networks.

[0119] The systems and flowcharts shown herein can also take the form of computer program devices for entity linking in logical neural networks. These devices have program code embodied therein. This program code can be executed by a processing unit to support the described functionality.

[0120] Although specific embodiments have been shown and described, it will be apparent to those skilled in the art that, based on the teachings herein, changes and modifications can be made without departing from the embodiments and their broader aspects. Therefore, the appended claims are intended to cover all such changes and modifications within the scope of the embodiments. Furthermore, it should be understood that the embodiments are defined solely by the appended claims. Those skilled in the art will understand that if a specific number of an introduced claim element is intended, such intent will be explicitly recited in the claim, and in the absence of such recitation no such limitation is present. As a non-limiting example, and as an aid to understanding, the appended claims contain the introductory phrases “at least one” and “one or more” to introduce claim elements. However, the use of such phrases should not be construed as implying that introducing a claim element by the indefinite article “a” or “an” limits any particular claim containing such an introduced claim element to an embodiment containing only one such element, even when the same claim includes the introductory phrase “one or more” or “at least one” and an indefinite article such as “a” or “an”; the same applies to the use of definite articles in the claims.

[0121] The present embodiments may be a system, a method, and/or a computer program product. Furthermore, selected aspects of the present embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and/or hardware aspects, all of which may be collectively referred to herein as a "circuit," "module," or "system." Additionally, aspects of the present embodiments may take the form of a computer program product embodied in a computer-readable storage medium (or media) having computer-readable program instructions thereon for causing a processor to carry out aspects of the present embodiments. Thus embodied, the systems, methods, and/or computer program products disclosed herein are operable to improve the functionality and operation of entity linking in logical neural networks.

[0122] A computer-readable storage medium can be a tangible device capable of retaining and storing instructions for use by an instruction execution device. A computer-readable storage medium can be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of computer-readable storage media includes the following: portable computer disks, hard disks, dynamic or static random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), magnetic storage devices, portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. As used herein, a computer-readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses passing through fiber optic cables), or electrical signals transmitted through wires.

[0123] The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to a corresponding computing/processing device, or via a network, such as the Internet, a local area network (LAN), a wide area network (WAN), and/or a wireless network, to an external computer or external storage device. The network may include copper transmission cables, optical fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the corresponding computing/processing device.

[0124] Computer-readable program instructions used to perform the operations of this embodiment may be assembly instructions, instruction set architecture (ISA) instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages (e.g., Java, Smalltalk, C++, etc.) and conventional procedural programming languages (e.g., the "C" programming language or similar programming languages). The computer-readable program instructions may be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server cluster. In the latter case, the remote computer may be connected to the user's computer via any type of network, including a local area network (LAN) or wide area network (WAN), or may be connected to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs) may execute computer-readable program instructions by utilizing state information from the computer-readable program instructions to personalize the electronic circuitry in order to perform one or more aspects of this embodiment.

[0125] Aspects of this embodiment are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

[0126] These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of a flowchart and/or block diagram. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to operate in a particular manner, such that the computer-readable storage medium in which the instructions are stored comprises an article of manufacture including instructions that implement aspects of the functions/actions specified in one or more blocks of a flowchart and/or block diagram.

[0127] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions, which execute on the computer, other programmable apparatus, or other device, implement the functions/actions specified in one or more blocks of a flowchart and/or block diagram.

[0128] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of instructions comprising one or more executable instructions for implementing a specified logical function. In some alternative embodiments, the functions noted in the blocks may occur out of the order shown in the figures. For example, two blocks shown consecutively may actually be executed substantially simultaneously, or these blocks may sometimes be executed in reverse order, depending on the functions involved. It will also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by special-purpose hardware-based systems that perform the specified functions or actions, or that carry out combinations of special-purpose hardware and computer instructions.

[0129] It should be understood that although specific embodiments have been described herein for illustrative purposes, various modifications can be made without departing from the scope of the embodiments. In particular, annotation of unstructured NL data and extraction of facts into a structured format can be performed by different computing platforms or across multiple devices. Furthermore, libraries can be localized, remote, or distributed across multiple systems. Therefore, the scope of protection of the embodiments is defined only by the appended claims and their equivalents.

Claims

1. A computer system, comprising: a processor operatively coupled to memory; and an artificial intelligence (AI) platform operatively coupled to the processor, the AI platform comprising: a feature manager configured to generate a feature set for one or more entity-mention pairs in an annotated dataset, wherein the feature set corresponds to attributes that measure the degree of similarity between textual mentions and candidate entities; an evaluator configured to evaluate the generated feature set of the one or more entity-mention pairs against an entity linking (EL) logical neural network (LNN) rule template, the template having one or more logical connection rules and corresponding connectivity weights organized in a hierarchical structure; and a machine learning (ML) manager operatively coupled to the evaluator, the ML manager being configured to learn the connectivity weights using an artificial neural network (ANN) and a corresponding ML algorithm; the AI platform being configured to: formulate the LNN rule template using EL rules; perform gradient descent using the ANN and the corresponding ML algorithm to train the LNN-formulated EL rules on the labeled dataset; and generate a learned model with learned thresholds and learned connectivity weights for the logical connection rules, thereby disambiguating mentions by linking mentions in short texts to entities using interpretable rules in a logical neural network, wherein the short texts include single sentences or questions.

2. The system of claim 1, wherein the evaluation further comprises the evaluator to reformulate the entity linking algorithm, consisting of a set of disjunctive rules, into an LNN representation for learning, wherein entity linking is a first-order logic rule in a restricted form, the first-order logic rule comprising a set of Boolean predicates connected by logical operators.

3. The system of claim 2, wherein the entity-mention pair evaluation further comprises the evaluator for computing one or more features for a subset of labeled entity-mention pairs, wherein each of the features has a corresponding similarity predicate.

4. The system of claim 3, further comprising the ML manager to utilize the ANN and the ML algorithm to learn an appropriate threshold for each of the computed features in relation to the corresponding similarity predicate.

5. The system of claim 4, further comprising the evaluator filtering the computed one or more features based on a learned threshold corresponding to the computed one or more features, and in response to the filtering, selectively merging the computed one or more features into the LNN rule template, the selective merging comprising removing features or assigning non-zero scores to the features.

6. The system of claim 2, further comprising a rule manager operatively coupled to the evaluator, the rule manager being configured to learn one or more of the logical connection rules, including to: dynamically generate a template for the hierarchical structure; learn logical rules based on the dynamically generated template; evaluate the selected rules on the labeled dataset; and selectively assign the selected rules to corresponding nodes in the hierarchical structure based on the evaluation.

7. The system of claim 6, wherein the template is a binary tree and the corresponding node is an internal node, and the system further includes the rule manager to selectively assign conjunction or disjunction LNN operators to the internal nodes.

8. A computer program product for disambiguating mentions in text, the computer program product comprising program code executable by a processor to: generate features for one or more entity-mention pairs in an annotated dataset, wherein the features correspond to attributes that measure the degree of similarity between textual mentions and candidate entities; evaluate the generated features of the one or more entity-mention pairs against an entity linking (EL) logical neural network (LNN) rule template, the template having one or more logical connection rules and corresponding connectivity weights organized in a hierarchical structure; learn the connectivity weights using an artificial neural network (ANN) and a corresponding machine learning (ML) algorithm; formulate the LNN rule template using EL rules; perform gradient descent using the ANN and the corresponding ML algorithm to train the LNN-formulated EL rules on the labeled dataset; and generate a learned model with learned thresholds and learned connectivity weights for the logical connection rules, thereby disambiguating mentions by linking mentions in short texts to entities using interpretable rules in a logical neural network, wherein the short texts include single sentences or questions.

9. The computer program product of claim 8, wherein the evaluation of each entity-mention pair against the LNN rule template further includes program code configured to reformulate an entity linking algorithm consisting of a set of disjunctive rules into an LNN representation for learning, wherein entity linking is a first-order logic rule in a restricted form, the first-order logic rule comprising a set of Boolean predicates connected by logical operators.

10. The computer program product of claim 9, wherein the entity-mention pair evaluation further comprises program code configured to compute a set of features for each entity-mention pair, wherein each of the features has a corresponding similarity predicate.

11. The computer program product of claim 10, further comprising program code configured to: learn, using the ANN and the ML algorithm, an appropriate threshold for each of the one or more computed features in relation to the corresponding similarity predicate; filter the computed one or more features based on the corresponding learned thresholds; and selectively incorporate the computed features into the LNN rule template, wherein the selective incorporation includes removing features or assigning non-zero scores to the features.

12. The computer program product of claim 9, further comprising program code configured to learn one or more of the logical connection rules, including to: dynamically generate a template for the hierarchical structure; learn logical rules based on the dynamically generated template; evaluate the selected rules on the labeled dataset; and selectively assign the selected rules to corresponding nodes in the hierarchical structure based on the evaluation.

13. The computer program product of claim 12, wherein the template is a binary tree and the corresponding node is an internal node, and the computer program product further includes program code configured to selectively assign conjunction or disjunction LNN operators to the internal nodes.

14. A method for disambiguating mentions in text, comprising: generating features for one or more entity-mention pairs in an annotated dataset, wherein the features correspond to attributes that measure the degree of similarity between textual mentions and candidate entities; evaluating the generated features of the one or more entity-mention pairs against an entity linking (EL) logical neural network (LNN) rule template, the template having one or more logical connection rules and corresponding connectivity weights organized in a hierarchical structure; learning the connectivity weights using an artificial neural network (ANN) and a corresponding machine learning (ML) algorithm; formulating the LNN rule template using EL rules; performing gradient descent using the ANN and the corresponding ML algorithm to train the LNN-formulated EL rules on the labeled dataset; and generating a learned model with learned thresholds and learned connectivity weights for the logical connection rules, thereby disambiguating mentions by linking mentions in short texts, including single sentences or questions, to entities using interpretable rules in a logical neural network.
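As a non-limiting illustration of the training step recited above, the following minimal sketch learns per-feature thresholds and connectivity weights for a single conjunctive rule by gradient descent. Everything specific in it is an assumption made for illustration and is not taken from the claims: the two feature positions (a name similarity and a context similarity), the toy labeled pairs in `DATA`, the smooth gate `score * sigmoid(k * (score - theta))`, the weighted Lukasiewicz-style conjunction, the squared-error loss, and the use of finite-difference gradients in place of ANN backpropagation.

```python
import math

# Hypothetical labeled entity-mention pairs: ([name_sim, context_sim], label).
DATA = [
    ([0.90, 0.80], 1.0),
    ([0.95, 0.30], 0.0),
    ([0.40, 0.90], 0.0),
    ([0.85, 0.90], 1.0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(score, theta, sharpness=10.0):
    """Smooth gate (assumed form): scores well below theta shrink toward 0."""
    return score * sigmoid(sharpness * (score - theta))

def weighted_and(inputs, weights, beta=1.0):
    """Weighted Lukasiewicz-style conjunction, clamped to [0, 1] (assumed form)."""
    total = beta - sum(w * (1.0 - x) for w, x in zip(weights, inputs))
    return min(1.0, max(0.0, total))

def predict(params, features):
    """params = thresholds followed by weights, one of each per feature."""
    n = len(features)
    thetas, weights = params[:n], params[n:]
    gated = [soft_threshold(f, t) for f, t in zip(features, thetas)]
    return weighted_and(gated, weights)

def loss(params, dataset):
    """Mean squared error between rule output and label."""
    return sum((predict(params, f) - y) ** 2 for f, y in dataset) / len(dataset)

def numeric_grad(params, dataset, eps=1e-5):
    """Central finite differences, standing in for backpropagation."""
    grads = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        down = params[:i] + [params[i] - eps] + params[i + 1:]
        grads.append((loss(up, dataset) - loss(down, dataset)) / (2 * eps))
    return grads

def train(dataset, params, steps=300, lr=0.1):
    """Plain gradient descent; returns the lowest-loss parameters seen."""
    best, best_loss = list(params), loss(params, dataset)
    for _ in range(steps):
        grads = numeric_grad(params, dataset)
        params = [p - lr * g for p, g in zip(params, grads)]
        current = loss(params, dataset)
        if current < best_loss:
            best, best_loss = list(params), current
    return best, best_loss

initial = [0.5, 0.5, 1.0, 1.0]  # [theta_name, theta_ctx, w_name, w_ctx]
learned, final_loss = train(DATA, initial)
print("learned thresholds:", learned[:2], "learned weights:", learned[2:])
```

The learned thresholds and weights remain directly readable as parameters of the rule, which is the sense in which such a model is interpretable. A practical implementation would typically constrain the weights (for example, to be non-negative) and use automatic differentiation; both are omitted here for brevity.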

15. The method of claim 14, wherein the entity-mention pair evaluation comprises re-formulating an entity linking algorithm consisting of a set of disjunctive rules into an LNN representation for learning, wherein entity linking is a first-order logic rule in a restricted form, the first-order logic rule comprising a set of Boolean predicates connected by logical operators.

16. The method of claim 15, wherein the entity-mention pair evaluation comprises computing a set of features for each entity-mention pair, wherein each feature has a corresponding similarity predicate.

17. The method of claim 16, further comprising using the ANN and the ML algorithm to learn an appropriate threshold for each of the computed features in relation to the corresponding similarity predicate.

18. The method of claim 17, further comprising filtering the computed one or more features based on a learned threshold corresponding to the computed one or more features, and in response to the filtering, selectively incorporating the computed one or more features into the LNN rule template, the selective incorporation comprising removing features or assigning non-zero scores to the features.
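The filtering and selective incorporation recited above can be pictured with the following minimal sketch. The feature names, scores, and threshold values are hypothetical, and the hard comparison `score >= threshold` is an assumed simplification of how a learned threshold might be applied at inference time.

```python
def filter_features(scores, thresholds):
    """Keep features that meet their learned threshold; drop the rest.

    scores and thresholds are dicts keyed by (hypothetical) feature name.
    A kept feature retains its non-zero similarity score; a removed
    feature is simply absent from the result.
    """
    kept = {}
    for name, score in scores.items():
        if score >= thresholds[name]:
            kept[name] = score
    return kept

# Hypothetical learned thresholds and computed features for one pair.
learned_thresholds = {"jaccard": 0.60, "jaro_winkler": 0.80, "context_overlap": 0.30}
pair_scores = {"jaccard": 0.72, "jaro_winkler": 0.65, "context_overlap": 0.41}

print(filter_features(pair_scores, learned_thresholds))
```

In this toy case the `jaro_winkler` score falls below its threshold and is removed, while the other two features are passed on to the rule template with their scores unchanged.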

19. The method of claim 15, further comprising learning one or more of the logical connection rules, including: dynamically generating a template for the hierarchical structure; learning logical rules based on the dynamically generated template; evaluating the selected rules on the labeled dataset; and selectively assigning the selected rules to corresponding nodes in the hierarchical structure based on the evaluation.

20. The method of claim 19, wherein the template is a binary tree and the corresponding node is an internal node, and the method further comprises selectively assigning conjunction or disjunction LNN operators to the internal node.
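For illustration only, the binary-tree template with conjunction or disjunction operators on its internal nodes might be sketched as follows. The leaf feature names, the toy labeled data, the unweighted Lukasiewicz operators, and the exhaustive search over operator assignments are all assumptions of this sketch; the claims do not prescribe a particular search procedure.

```python
from dataclasses import dataclass
from itertools import product
from typing import Union

@dataclass
class Leaf:
    feature: str  # hypothetical feature name looked up per pair

@dataclass
class Internal:
    left: "Node"
    right: "Node"
    op: str = "and"  # "and" or "or", assigned during rule learning

Node = Union[Leaf, Internal]

def evaluate(node, features):
    """Evaluate the tree with unweighted Lukasiewicz AND/OR (assumed)."""
    if isinstance(node, Leaf):
        return features[node.feature]
    a = evaluate(node.left, features)
    b = evaluate(node.right, features)
    if node.op == "and":
        return max(0.0, a + b - 1.0)
    return min(1.0, a + b)

def internal_nodes(node):
    """Yield internal nodes in pre-order (root first)."""
    if isinstance(node, Internal):
        yield node
        yield from internal_nodes(node.left)
        yield from internal_nodes(node.right)

def assign_operators(root, dataset):
    """Try every AND/OR assignment and keep the one with the lowest squared error."""
    nodes = list(internal_nodes(root))
    best_ops, best_loss = None, float("inf")
    for ops in product(("and", "or"), repeat=len(nodes)):
        for node, op in zip(nodes, ops):
            node.op = op
        total = sum((evaluate(root, f) - y) ** 2 for f, y in dataset)
        if total < best_loss:
            best_ops, best_loss = ops, total
    for node, op in zip(nodes, best_ops):
        node.op = op
    return best_ops

# Template: (name_sim ? type_match) ? context_sim, operators to be learned.
template = Internal(Internal(Leaf("name_sim"), Leaf("type_match")), Leaf("context_sim"))
DATA = [
    ({"name_sim": 1.0, "type_match": 1.0, "context_sim": 0.0}, 1.0),
    ({"name_sim": 1.0, "type_match": 0.0, "context_sim": 0.0}, 0.0),
    ({"name_sim": 0.0, "type_match": 0.0, "context_sim": 1.0}, 1.0),
    ({"name_sim": 0.0, "type_match": 1.0, "context_sim": 0.0}, 0.0),
]
print(assign_operators(template, DATA))
```

On this toy data the search recovers the rule "(name_sim AND type_match) OR context_sim". Exhaustive enumeration grows exponentially with the number of internal nodes, so it is only practical for small templates.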