Multi-modal knowledge graph construction and retrieval system and method

A knowledge graph and retrieval technology, applied to multimodal knowledge graph construction and retrieval. It addresses low efficiency and low real-time performance, and achieves high-precision fitting, improved real-time performance and efficiency, and efficient knowledge graph construction.

Pending Publication Date: 2022-07-12
北京德信电通科技有限公司

AI-Extracted Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is the technica...

Method used

[0066] In this embodiment, when multimodal data is used as the raw data for constructing the knowledge graph, the processing decision and method are chosen according to the data's rate of change, which improves real-time performance and efficiency. The invention calculates the real-time rate of change of each type of multimodal data and divides the data into high-speed update data $P_n$ and slow update data $P_r$ according to a predefined rate-of-change threshold. For $P_n$, it calls the fast-changing data estimation and fusion program to estimate and fuse the data. For $P_r$, it directly calls the slow-changing data processing and fusion program for calculation and fusion, and directly calculates the change value of $P_r$. This achieves efficient knowledge graph construction.
[0104] In this embodiment, when the multimodal data is used as the original data to construct the knowledge map, for the data, according to the rate of ch...

Abstract

The invention relates to a multi-modal knowledge graph construction and retrieval system that addresses the technical problems of low efficiency and low real-time performance. The system comprises a knowledge data acquisition and processing unit, a knowledge graph construction management unit, and a knowledge graph application service unit, connected in cascade. The knowledge data acquisition and processing unit acquires and transmits data and comprises a multi-modal data acquisition unit. The knowledge graph construction management unit handles construction and update management of the knowledge graph; construction comprises building an ontology according to business requirements, completing knowledge fusion according to the data content and the ontology structure, associating labeled data with the ontology, and completing the construction of the knowledge graph model. The knowledge graph application service unit comprises a knowledge retrieval unit, a knowledge association and recommendation unit, and a knowledge question-and-answer unit. The system and method can be used for multi-modal knowledge graph construction and retrieval.

Examples

Example Embodiment

[0057] Example 1
[0058] This embodiment provides a multimodal knowledge graph construction and retrieval system. As shown in Figure 1, the system includes a knowledge data acquisition and processing unit, a knowledge graph construction management unit, and a knowledge graph application service unit, connected in cascade;
[0059] The knowledge data acquisition and processing unit collects and transmits data, and includes a multimodal data collection unit;
[0060] The knowledge graph construction management unit handles construction and update management of the knowledge graph; construction of the knowledge graph includes constructing an ontology according to business needs, completing knowledge fusion according to the data content and the ontology structure, associating the labeled data with the ontology, and completing the construction of the knowledge graph model;
[0061] The knowledge graph application service unit includes a knowledge retrieval unit, a knowledge association and recommendation unit, and a knowledge question-and-answer unit;
[0062] As shown in Figure 2, knowledge fusion is performed in the following steps:
[0063] Step S1: calculate the real-time rate of change of each type of multimodal data, and divide the data into high-speed update data $P_n$ and slow update data $P_r$ according to a predefined rate-of-change threshold; for $P_n$, call the fast-changing data estimation and fusion program to estimate and fuse the data; for $P_r$, directly call the slow-changing data processing and fusion program for calculation and fusion, and directly calculate the change value of $P_r$;
[0064] Step S2: if the change value estimated by the fast-changing data estimation and fusion program exceeds a predefined threshold, or the change value of the slow update data $P_r$ exceeds a predefined threshold, call at least two knowledge graph construction models to complete the model construction of the knowledge graph;
[0065] Step S3: combine the knowledge graph models constructed in step S2 into the final knowledge graph model according to a predefined voting strategy.
[0066] In this embodiment, when multimodal data is the raw data for constructing the knowledge graph, the processing decision and method are chosen according to the data's rate of change, which improves real-time performance and efficiency. The invention calculates the real-time rate of change of each type of multimodal data and divides the data into high-speed update data $P_n$ and slow update data $P_r$ according to the predefined rate-of-change threshold; for $P_n$, it calls the fast-changing data estimation and fusion program to estimate and fuse the data; for $P_r$, it directly calls the slow-changing data processing and fusion program for calculation and fusion, and directly calculates the change value of $P_r$, achieving efficient knowledge graph construction.
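Steps S1 to S3 can be illustrated with a minimal sketch. The text does not specify how the rate of change is measured or which voting strategy is used, so the sketch assumes a mean absolute change per unit time and a simple majority vote. The names rate_of_change, split_by_rate, and majority_vote are hypothetical and do not come from the patent.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Stream = Tuple[Sequence[float], Sequence[float]]  # (values, timestamps)


def rate_of_change(values: Sequence[float], timestamps: Sequence[float]) -> float:
    """Assumed metric: total absolute change divided by the observed time span."""
    if len(values) < 2:
        return 0.0
    span = timestamps[-1] - timestamps[0]
    if span <= 0:
        return 0.0
    total = sum(abs(b - a) for a, b in zip(values, values[1:]))
    return total / span


def split_by_rate(streams: Dict[str, Stream], threshold: float) -> Tuple[List[str], List[str]]:
    """Step S1: split stream names into fast (P_n) and slow (P_r) groups."""
    fast: List[str] = []
    slow: List[str] = []
    for name, (values, timestamps) in streams.items():
        if rate_of_change(values, timestamps) > threshold:
            fast.append(name)
        else:
            slow.append(name)
    return fast, slow


def majority_vote(candidates: Sequence[str]) -> str:
    """Step S3 stand-in: most frequent candidate wins; ties go to the first seen."""
    return Counter(candidates).most_common(1)[0][0]


streams = {
    "video": ([0.0, 5.0, 1.0, 7.0], [0.0, 1.0, 2.0, 3.0]),
    "text": ([1.0, 1.0, 1.5, 1.5], [0.0, 1.0, 2.0, 3.0]),
}
fast, slow = split_by_rate(streams, threshold=1.0)
winner = majority_vote(["model_a", "model_b", "model_a"])
```

Here each candidate label stands for the output of one knowledge graph construction model from step S2; a real voting strategy would compare graph structures rather than strings.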
[0067] Specifically, the multimodal data collection unit includes a text data collection unit, an image data collection unit, an audio data collection unit, and a video data collection unit.
[0068] Preferably, to fit, estimate, and fuse fast-changing data efficiently, this embodiment uses a purpose-built fast-changing data estimation and fusion program. Existing data fusion methods can also be used. The method in this embodiment includes:
[0069] Step R1: define [formula missing], where $\{x_1, x_2, \ldots, x_k, \ldots, x_K\}$ are the observed values of K independent data samples from the historical high-speed update data samples, $k = 1, 2, 3, \ldots, K$; j and w are predefined parameters, and $w_1, w_2, \ldots, w_k$ is a set of real numbers;
[0070] Step R2: from $y_k = \mu + \alpha t_k + \varepsilon_k$ with $\mu = \log(2\gamma)$, calculate the characteristic exponent $\alpha$ and the dispersion coefficient $\gamma$, where the $\varepsilon_k$ are predefined independent, identically distributed error terms with mean 0, $t_k = \log|w_k|$, and K is the number of historical samples;
[0071] Step R3: from $z_k = \delta w_k + \varepsilon_k$, calculate the location parameter $\delta$, where $z_k = \arctan(\mathrm{Im}(w_k)/\mathrm{Re}(w_k))$ and the $\varepsilon_k$ are predefined independent, identically distributed error terms with mean 0;
[0072] Step R4: substitute the characteristic exponent $\alpha$, dispersion coefficient $\gamma$, and location parameter $\delta$ obtained in steps R2 and R3 into $\varphi(w) = \exp\{j\delta w - \gamma|w|^{\alpha}\}$, and apply the Fourier transform to obtain the probability density function $f(x)$, completing the fitting, estimation, and fusion of the high-speed update data $P_n$.
[0073] Specifically, calling the fast-changing data estimation and fusion program to perform data estimation and fusion further includes:
[0074] Step R5: take [expression missing] as the index for judging whether the change value estimated by the fast-changing data estimation and fusion program exceeds a predefined threshold $T_{max}$, where A is the data value to be estimated and fused in real time, [symbol missing] is the parameter estimated from the historical high-speed update data samples, and $T_{max}$ is the detection threshold corresponding to the predefined fusion rate.
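Steps R2 to R4 resemble a regression-type fit of a symmetric stable characteristic function, and a rough sketch under that reading is given below. Because the R1 formula is missing from the text, the sketch assumes the empirical characteristic function (1/K) Σ exp(j w x_k), and it further assumes that y_k = log(−log|φ̂(w_k)|²) and that z_k is the phase of φ̂(w_k). These are assumptions, not statements from the patent. The function names are hypothetical, and NumPy is used for the least-squares fits.

```python
import numpy as np


def empirical_cf(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Assumed R1 form: empirical characteristic function at each frequency w_k."""
    return np.exp(1j * np.outer(w, x)).mean(axis=1)


def fit_stable(x: np.ndarray, w: np.ndarray):
    """Estimate (alpha, gamma, delta) by two linear regressions (R2 and R3)."""
    phi = empirical_cf(x, w)
    # R2: y_k = mu + alpha * t_k, with mu = log(2*gamma) and t_k = log|w_k|.
    y = np.log(-np.log(np.abs(phi) ** 2))
    t = np.log(np.abs(w))
    alpha, mu = np.polyfit(t, y, 1)  # slope, intercept
    gamma = np.exp(mu) / 2.0
    # R3: z_k = delta * w_k, least squares through the origin.
    # arctan2 is used in place of arctan(Im/Re) to stay well-defined when Re <= 0.
    z = np.arctan2(phi.imag, phi.real)
    delta = np.dot(w, z) / np.dot(w, w)
    return float(alpha), float(gamma), float(delta)


def stable_pdf(xs: np.ndarray, alpha: float, gamma: float, delta: float,
               w_max: float = 50.0, n: int = 20001) -> np.ndarray:
    """R4: invert phi(w) = exp(j*delta*w - gamma*|w|**alpha) with a Riemann sum."""
    w = np.linspace(-w_max, w_max, n)
    dw = w[1] - w[0]
    phi = np.exp(1j * delta * w - gamma * np.abs(w) ** alpha)
    integrand = phi[None, :] * np.exp(-1j * np.outer(xs, w))
    return integrand.real.sum(axis=1) * dw / (2.0 * np.pi)


# Synthetic Gaussian samples stand in for historical high-speed update data.
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=1.0, size=20000)
freqs = np.linspace(0.1, 1.0, 10)
alpha_hat, gamma_hat, delta_hat = fit_stable(samples, freqs)
density = stable_pdf(np.array([0.0, 1.0, 2.0]), alpha_hat, gamma_hat, delta_hat)
```

For Gaussian input with mean 1 and unit variance, the characteristic function is exp(jw − w²/2), so the estimates should land near α = 2, γ = 0.5, δ = 1. The frequency grid is kept small so that the phase does not wrap.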
[0075] Preferably, to prevent degradation of the real-time system and the inefficiency caused by failures, there are multiple knowledge data acquisition and processing units, multiple knowledge graph construction management units, and multiple knowledge graph application service units;
[0076] Step A1: arbitrarily select multiple knowledge data acquisition and processing units, multiple knowledge graph construction management units, and multiple knowledge graph application service units to form a real-time system;
[0077] Step A2: arbitrarily select two adjacent levels; the unit of the preceding level is defined as the primary unit, and the unit of the following level is defined as the secondary unit;
[0078] Step A3: define the real-time system performance model as $H = H_1 \cdot H_2 \cdot H_3 \cdot H_4 \cdot H_5$, where $H_1$ is the effectiveness, $H_2$ is the processing efficiency, $H_3$ is the system load rate, $H_4$ is the data processing accuracy, and $H_5$ is the system failure rate;
[0079] Step A4: $H_4$ is predefined, $H_5$ is the real-time system failure rate calculated from historical records, and the remaining terms are calculated according to the following formulas: $H_2 = P H_{21} + (1-P) H_{21} H_{22}$, $H_3 = (N H_{31} + N H_{32})/(N+M)$;
[0080] Here, $W = P W_1 + (1-P)(W_1 + W_2)$ and $T = P T_1 + (1-P)(T_1 + T_2)$; t is the total time the data spends in the primary unit and the secondary unit; P is the predefined probability that data passes from the primary unit into the secondary unit; the processing efficiency of the primary unit is [formula missing], the processing efficiency of the secondary unit is [formula missing], the load factor of the primary unit is [formula missing], and the load factor of the secondary unit is [formula missing]; N is the number of primary units, M is the number of secondary units, and R is an integer; $P_R$ is obtained from the predefined average data volume of the primary units [formula missing], and $Q_R$ is obtained from the predefined average data volume of the secondary units [formula missing]; $W_1 = L_1/\lambda$ is the average response time of primary unit data, and $W_2 = L_2/\lambda H_{21} P$ is the response time of the secondary unit; $T_1 = 1/\mu_1$ is the average service time for primary unit data, and $T_2 = 1/\mu_2$ is the average service time for secondary unit data; $\mu_1$ and $\mu_2$ are parameters of the exponential distribution, and $\lambda$ is the predefined Poisson parameter;
[0081] Step A5: calculate the overall performance value of the real-time system and compare it with a predefined threshold. If it is greater than the threshold, return to step A1 and reselect units to form a new real-time system.
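The arithmetic in steps A3 to A5 can be traced with a small sketch. The formulas for $H_{21}$, $H_{22}$, $H_{31}$, and $H_{32}$ are missing from the text, so the sketch takes them as inputs rather than deriving them. The $H_3$ expression and the reselection rule are reproduced exactly as written above. All function names and the sample numbers are hypothetical.

```python
def processing_efficiency(p: float, h21: float, h22: float) -> float:
    """H_2 = P*H_21 + (1 - P)*H_21*H_22."""
    return p * h21 + (1.0 - p) * h21 * h22


def load_rate(h31: float, h32: float, n: int, m: int) -> float:
    """H_3 = (N*H_31 + N*H_32) / (N + M), kept as stated in step A4."""
    return (n * h31 + n * h32) / (n + m)


def mean_response_time(p: float, w1: float, w2: float) -> float:
    """W = P*W_1 + (1 - P)*(W_1 + W_2)."""
    return p * w1 + (1.0 - p) * (w1 + w2)


def mean_service_time(p: float, mu1: float, mu2: float) -> float:
    """T = P*T_1 + (1 - P)*(T_1 + T_2), with T_i = 1/mu_i."""
    t1, t2 = 1.0 / mu1, 1.0 / mu2
    return p * t1 + (1.0 - p) * (t1 + t2)


def overall_performance(h1: float, h2: float, h3: float, h4: float, h5: float) -> float:
    """H = H_1 * H_2 * H_3 * H_4 * H_5 (step A3)."""
    return h1 * h2 * h3 * h4 * h5


def needs_reselection(h: float, threshold: float) -> bool:
    """Step A5 as written: reselect units when H is greater than the threshold."""
    return h > threshold


# Illustrative inputs only; none of these values come from the source.
h2 = processing_efficiency(p=0.5, h21=0.8, h22=0.5)
h3 = load_rate(h31=0.5, h32=0.3, n=2, m=2)
h = overall_performance(h1=0.9, h2=h2, h3=h3, h4=0.95, h5=0.1)
reselect = needs_reselection(h, threshold=0.05)
```

Keeping each term in its own function makes it straightforward to substitute the missing sub-formulas once they are available.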
[0082] This embodiment also provides a method for constructing and retrieving a multimodal knowledge graph, the method comprising:
[0083] Step 1: the multimodal data collection unit collects knowledge data and preprocesses it, distinguishes the data categories, establishes data identifiers, and generates standard data records; it then determines whether each record already exists in the knowledge graph database, obtains the identifier index if it does, and stores the record if it does not;
[0084] Step 2: build an ontology according to business needs, and build a mapping between the standard data records and the ontology to complete the preliminary construction of the knowledge graph model;
[0085] Step 3: complete the knowledge fusion process according to the data content and the ontology structure, and update the knowledge graph, including:
[0086] Step S1: calculate the real-time rate of change of each type of multimodal data, and divide the data into high-speed update data $P_n$ and slow update data $P_r$ according to a predefined rate-of-change threshold; for $P_n$, call the fast-changing data estimation and fusion program to estimate and fuse the data; for $P_r$, directly call the slow-changing data processing and fusion program for calculation and fusion, and directly calculate the change value of $P_r$;
[0087] Step S2: if the change value estimated by the fast-changing data estimation and fusion program exceeds a predefined threshold, or the change value of the slow update data $P_r$ exceeds a predefined threshold, call at least two knowledge graph construction models to complete the model construction of the knowledge graph;
[0088] Step S3: combine the knowledge graph models constructed in step S2 into the final knowledge graph model according to the predefined voting strategy;
[0089] Step 4: the knowledge graph application service unit invokes the knowledge graph according to business requirements to help complete the business task.
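The existence check in Step 1 can be sketched as a small in-memory store. The text does not say how data identifiers are formed, so the sketch assumes a SHA-256 hash over the category and the raw payload. RecordStore and its ingest method are hypothetical names, and a list stands in for the knowledge graph database.

```python
import hashlib
from typing import Dict, List, Tuple


class RecordStore:
    """Toy stand-in for the knowledge graph database in Step 1."""

    def __init__(self) -> None:
        self._index: Dict[str, int] = {}       # identifier -> position in _records
        self._records: List[dict] = []

    def ingest(self, category: str, payload: bytes) -> Tuple[int, bool]:
        """Return (index, stored). stored is False when the record already existed."""
        # Assumed identifier scheme: hash of category plus payload.
        identifier = hashlib.sha256(category.encode("utf-8") + b"\x00" + payload).hexdigest()
        if identifier in self._index:
            return self._index[identifier], False
        self._records.append({"id": identifier, "category": category, "payload": payload})
        self._index[identifier] = len(self._records) - 1
        return self._index[identifier], True


store = RecordStore()
first = store.ingest("text", b"transformer capacity report")
repeat = store.ingest("text", b"transformer capacity report")
other = store.ingest("image", b"transformer capacity report")
```

The second call returns the existing index without storing a duplicate, while the same payload under a different category is treated as a new record.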
[0090] Preferably, building on conventional data fusion methods, the fast-changing data estimation and fusion program of this embodiment includes:
[0091] Step R1: define [formula missing], where $\{x_1, x_2, \ldots, x_k, \ldots, x_K\}$ are the observed values of K independent data samples from the historical high-speed update data samples, $k = 1, 2, 3, \ldots, K$; j and w are predefined parameters, and $w_1, w_2, \ldots, w_k$ is a set of real numbers;
[0092] Step R2: from $y_k = \mu + \alpha t_k + \varepsilon_k$ with $\mu = \log(2\gamma)$, calculate the characteristic exponent $\alpha$ and the dispersion coefficient $\gamma$, where the $\varepsilon_k$ are predefined independent, identically distributed error terms with mean 0, $t_k = \log|w_k|$, and K is the number of historical samples;
[0093] Step R3: from $z_k = \delta w_k + \varepsilon_k$, calculate the location parameter $\delta$, where $z_k = \arctan(\mathrm{Im}(w_k)/\mathrm{Re}(w_k))$ and the $\varepsilon_k$ are predefined independent, identically distributed error terms with mean 0;
[0094] Step R4: substitute the characteristic exponent $\alpha$, dispersion coefficient $\gamma$, and location parameter $\delta$ obtained in steps R2 and R3 into $\varphi(w) = \exp\{j\delta w - \gamma|w|^{\alpha}\}$, and apply the Fourier transform to obtain the probability density function $f(x)$, completing the fitting, estimation, and fusion of the high-speed update data $P_n$.
[0095] Preferably, calling the fast-changing data estimation and fusion program to perform data estimation and fusion further comprises:
[0096] Step R5: take [expression missing] as the index for judging whether the change value estimated by the fast-changing data estimation and fusion program exceeds a predefined threshold $T_{max}$, where A is the data value to be estimated and fused in real time, [symbol missing] is the parameter estimated from the historical high-speed update data samples, and $T_{max}$ is the detection threshold corresponding to the predefined fusion rate.
[0097] Preferably, to improve the real-time efficiency of the system and prevent failure or degradation of system function, the multimodal knowledge graph construction and retrieval method further includes:
[0098] Step A1: arbitrarily select multiple knowledge data acquisition and processing units, multiple knowledge graph construction management units, and multiple knowledge graph application service units to form a real-time system;
[0099] Step A2: arbitrarily select two adjacent levels; the unit of the preceding level is defined as the primary unit, and the unit of the following level is defined as the secondary unit;
[0100] Step A3: define the real-time system performance model as $H = H_1 \cdot H_2 \cdot H_3 \cdot H_4 \cdot H_5$, where $H_1$ is the effectiveness, $H_2$ is the processing efficiency, $H_3$ is the system load rate, $H_4$ is the data processing accuracy, and $H_5$ is the system failure rate;
[0101] Step A4: $H_4$ is predefined, $H_5$ is the real-time system failure rate calculated from historical records, and the remaining terms are calculated according to the following formulas: $H_2 = P H_{21} + (1-P) H_{21} H_{22}$, $H_3 = (N H_{31} + N H_{32})/(N+M)$;
[0102] Here, $W = P W_1 + (1-P)(W_1 + W_2)$ and $T = P T_1 + (1-P)(T_1 + T_2)$; t is the total time the data spends in the primary unit and the secondary unit; P is the predefined probability that data passes from the primary unit into the secondary unit; the processing efficiency of the primary unit is [formula missing], the processing efficiency of the secondary unit is [formula missing], the load factor of the primary unit is [formula missing], and the load factor of the secondary unit is [formula missing]; N is the number of primary units, M is the number of secondary units, and R is an integer; $P_R$ is obtained from the predefined average data volume of the primary units [formula missing], and $Q_R$ is obtained from the predefined average data volume of the secondary units [formula missing]; $W_1 = L_1/\lambda$ is the average response time of primary unit data, and $W_2 = L_2/\lambda H_{21} P$ is the response time of the secondary unit; $T_1 = 1/\mu_1$ is the average service time for primary unit data, and $T_2 = 1/\mu_2$ is the average service time for secondary unit data; $\mu_1$ and $\mu_2$ are parameters of the exponential distribution, and $\lambda$ is the predefined Poisson parameter;
[0103] Step A5: calculate the overall performance value of the real-time system and compare it with a predefined threshold. If it is greater than the threshold, return to step A1 and reselect units to form a new real-time system.
[0104] In this embodiment, when the knowledge graph is constructed from multimodal raw data, the processing decision and method are chosen according to the data's rate of change, which improves real-time performance and efficiency. The invention calculates the real-time rate of change of each type of multimodal data and divides the data into high-speed update data $P_n$ and slow update data $P_r$ according to the predefined rate-of-change threshold; for $P_n$, it calls the fast-changing data estimation and fusion program to estimate and fuse the data; for $P_r$, it directly calls the slow-changing data processing and fusion program for calculation and fusion, and directly calculates the change value of $P_r$, achieving efficient knowledge graph construction. For fast-changing data, the estimation procedure of steps R1 to R5 provides high-speed, high-precision fitting. In addition, the overall performance of the system is evaluated in real time, and the composition of the system is adjusted accordingly to keep it effective and efficient.