Fast in-memory technique for building a reverse CSR graph index in an RDBMS

By constructing a reverse CSR from a forward CSR using parallelization techniques, the problem that existing CSR representations cannot efficiently traverse directed edges in any direction is solved, improving the performance of graph pattern matching queries and graph algorithms, and reducing storage and time costs.

CN115280306B · Active · Publication Date: 2026-06-19 · ORACLE INT CORP

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: ORACLE INT CORP
Filing Date: 2021-03-10
Publication Date: 2026-06-19

Application Information

IPC: G06F16/901; G06F17/10

AI Tagging

Application Domain

Other databases indexing Complex mathematical operations

AI Technical Summary

Technical Problem

Existing technologies, especially CSR representations, cannot efficiently traverse directed edges in any direction when building graph indexes in relational database management systems. This limits the performance of graph pattern matching queries and graph algorithms, and the storage and time costs of building reverse CSRs are also high.

Method used

By using parallelization techniques, a reverse CSR is constructed from a pre-existing forward CSR. By utilizing the mapping relationship pattern to the graph data model and employing a fast in-memory algorithm, the storage and time costs of constructing the reverse CSR are reduced, enabling fast traversal of directed edges in any direction.

Benefits of technology

It improves the performance of graph pattern matching queries and graph algorithms, reduces storage requirements and construction time, enhances the efficiency of multi-threaded processing, and reduces synchronization requirements.

✦ Generated by Eureka AI based on patent content.

Abstract

In an embodiment, a computer obtains a mapping of a relational schema of a database to a graph data model. The relational schema identifies one or more vertex tables that correspond to one or more vertex types in the graph data model and one or more edge tables that correspond to one or more edge types in the graph data model. Each edge type is associated with a source vertex type and a target vertex type. Based on the mapping, a forward compressed sparse row (CSR) representation is populated for forward traversal of edges of a same edge type. Each edge originates at a source vertex and terminates at a target vertex. Based on the forward CSR representation, a reverse CSR representation of the edge type is populated for reverse traversal of the edges of the edge type. Acceleration occurs in two ways. Values calculated for the forward CSR are reused for the reverse CSR. Elastic and inelastic scaling may occur.

Description

Technical Field

[0001] The present invention relates to loading a heterogeneous graph into memory from tables of a relational database. Presented herein are parallelization techniques that accelerate construction of pairs of redundant compressed sparse row (CSR) encodings for traversing directed edges of a property graph in either direction.

Background

[0002] The demand for graph analysis of data residing in relational database management systems (RDBMS) is growing. Some solutions require constructing graphs outside of the RDBMS to store the data of interest. One solution requires constructing the graph in a dedicated graph analysis engine. Another solution requires migrating the data to a graph database. These solutions are undesirable because they significantly increase the complexity of data management within the enterprise and result in significant loading / data transfer costs for external engines.

[0003] In an RDBMS, performing graph analytics directly upon relational tables (such as graph pattern match querying, graph algorithm execution, or a combination of both) is notably slower than a dedicated graph engine, especially for algorithms of interest such as PageRank. An RDBMS may fulfil a graph algorithm as a succession of table joins that are slow and require burdensome materialization of short-lived intermediate results.

[0004] Some queries, such as path-finding queries, are better expressed as graph queries than as relational queries such as structured query language (SQL). For example, a topological query may be better expressed as a regular expression or a context-free expression, neither of which is readily expressed in SQL. An RDBMS that expects only SQL and/or tabular queries typically lacks data structures dedicated to graph analytics. In these ways, a state-of-the-art RDBMS may be too slow for graph analytics.

[0005] An RDBMS processes relational data, that is, data stored as tables that are connected together through primary key-foreign key relationships. Such relational data can be analyzed as a graph. For example, N:M relationships can be interpreted as edges. In-memory graph representations, such as adjacency lists or adjacency matrices, can then be built on top of the relational data.

[0006] One in-memory graph indexing approach is the compressed sparse row (CSR) representation, which provides a compact representation of an adjacency list using only two arrays, referred to herein as a source array and a destination array. A technical problem for directed graphs is that a CSR representation only makes it possible to follow edges in one direction, from the source array to the destination array, which may perform poorly for various graph analyses.

Brief Description of the Drawings

[0007] In the drawings:

[0008] Figure 1A is a block diagram that depicts an example mapping between an example relational schema and an example graph data model;

[0009] Figure 1B is a block diagram that depicts an example forward CSR and an example reverse CSR;

[0010] Figure 2 is a flow diagram that depicts an example process that populates a pair of CSRs for an edge type of a graph;

[0011] Figure 3 depicts example activities that may occur while populating a reverse CSR;

[0012] Figure 4 is a block diagram that depicts an example computer that uses parallelism to accelerate population of pairs of CSR encodings of a graph;

[0013] Figure 5 depicts example activities that facilitate parallel population of data and/or metadata of a reverse destination array;

[0014] Figure 6 depicts example activities that facilitate parallel population of data and/or metadata of a reverse source array;

[0015] Figure 7 is a block diagram that illustrates a computer system upon which an embodiment of the invention may be implemented;

[0016] Figure 8 is a block diagram of a basic software system that may be employed for controlling the operation of a computing system.

Detailed Description

[0017] In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

[0018] General Overview

[0019] Presented herein are approaches for loading a heterogeneous graph into memory from tables of a relational database. Parallelization techniques are presented that accelerate construction of pairs of redundant compressed sparse row (CSR) encodings for traversing directed edges of a property graph in either direction.

[0020] Forward and reverse CSRs are in-memory relational database management system (RDBMS) graph indexes. Individually or as a pair, they make it possible to quickly follow edges of graph data in either direction and to accelerate some graph pattern matching queries and graph algorithms such as PageRank. Like a forward CSR, a reverse CSR can be built from relational data using SQL queries, but doing so is slow. The techniques herein accelerate creation of a reverse CSR by building it from a pre-existing forward CSR using a fast, in-memory, parallel algorithm.

[0021] In addition to the forward CSR, it can be beneficial to build a second graph index, referred to herein as the reverse CSR, that stores all edges of the graph again, except with their directions reversed. The advantage of building a reverse CSR in addition to a forward CSR is that edges can then be followed in either direction, which benefits graph pattern matching queries and graph algorithms. In one example, a user wants to match a long chain of vertices (a1) → (a2) → ... → (aN) with a highly selective filter on a property of vertex (aN). If the forward CSR is the only graph index available, many chains up to (aN) must be explored before being discarded. If a reverse CSR is available, exploration can start from (aN), and a chain is followed only in the few cases where (aN) was not filtered out, which improves performance.

[0022] In another example, a user wants to run the PageRank graph algorithm, explained later herein, using multiple threads. Without a reverse CSR, the threads iterate over different portions of the source array of the forward CSR, each time increasing the rank of the neighbor vertices found in the destination array. Because several source vertices processed by different threads may be connected to the same destination vertex, the threads require synchronization to update the new rank of a destination vertex. With a reverse CSR, however, the threads can iterate over different destination vertices, each time finding all of the corresponding source vertices and computing the new rank immediately, without synchronization. Removing the need for synchronization across threads increases throughput and decreases latency.
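The synchronization-free iteration described above can be sketched as a "pull" style PageRank over a reverse CSR. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function name `pagerank_pull`, the array names, the damping factor, and the three-vertex example graph are all hypothetical, and vertices without outgoing edges are not given any special treatment. Python threads are used merely to show that each task writes only its own result; they do not imply a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def pulled_rank(v, old, rev_offsets, rev_sources, out_degree, damping):
    """New rank of destination vertex v, computed only from reads of `old`."""
    n = len(old)
    incoming = rev_sources[rev_offsets[v]:rev_offsets[v + 1]]
    total = sum(old[s] / out_degree[s] for s in incoming)
    return (1.0 - damping) / n + damping * total


def pagerank_pull(rev_offsets, rev_sources, out_degree,
                  damping=0.85, iterations=20, workers=4):
    """Pull-based PageRank: one task per destination vertex, no locks needed."""
    n = len(rev_offsets) - 1
    rank = [1.0 / n] * n
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(iterations):
            # Each task reads the previous ranks and returns one value, so no
            # two tasks ever write to the same memory location.
            task = partial(pulled_rank, old=rank, rev_offsets=rev_offsets,
                           rev_sources=rev_sources, out_degree=out_degree,
                           damping=damping)
            rank = list(pool.map(task, range(n)))
    return rank


# Hypothetical graph with edges 0->1, 0->2, 1->2, 2->0, stored as a reverse CSR.
rev_offsets = [0, 1, 2, 4]   # per destination vertex: start of its incoming edges
rev_sources = [2, 0, 0, 1]   # per incoming edge: offset of the source vertex
out_degree = [2, 1, 1]       # per vertex: number of outgoing edges
ranks = pagerank_pull(rev_offsets, rev_sources, out_degree)
```

With only a forward CSR, the inner loop would instead add to `rank[destination]` from several threads at once, which is the update that needs synchronization.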

[0023] Using both a forward and a reverse CSR has two main drawbacks: (1) memory usage is double that of the forward CSR alone; and (2) building two graph indexes is slower than building one. The techniques herein reduce the impact of (2) by proposing a fast, in-memory, parallel algorithm that builds the reverse CSR from a pre-existing forward CSR instead of building it from scratch using SQL queries over the vertex and edge tables.

[0024] In an embodiment, a computer obtains a mapping of a relational schema of a database to a graph data model. The relational schema identifies one or more vertex tables that correspond to one or more respective vertex types in the graph data model and one or more edge tables that correspond to one or more respective edge types in the graph data model. Each edge type is directed and is therefore associated with a respective source vertex type and a respective target vertex type. Based on the mapping, a forward compressed sparse row (CSR) representation is populated for forward traversal of edges of a same edge type. Each edge of the edge type originates at a source vertex of the source vertex type of the edge type and terminates at a target vertex of the target vertex type of the edge type. Based on the forward CSR representation, a reverse CSR representation of the edge type is populated for reverse traversal of the edges of the edge type. Acceleration occurs in two ways. First, values calculated for the forward CSR are reused for the reverse CSR. Second, elastic and inelastic scaling may occur in respective ways.

[0025] 1.0 Example Computer and Graph

[0026] Figures 1A-1B are block diagrams that depict an example computer 100 and an example graph 105 in an embodiment. Computer 100 uses parallelism to accelerate population of pairs of redundant compressed sparse row (CSR) encodings of logical graph 105. Computer 100 may be at least one rack server such as a blade, a personal computer, a mainframe, a virtual computer, or other computing device. When computer 100 comprises multiple computers, the computers are interconnected by a communication network.

[0027] Graph 105 is a directed graph that contains vertices A-D and directed edges U-Z that interconnect vertices A-D as shown. Graph 105 is an instance of graph data model 130, which contains vertex types 141-142 and edge types 151-152, as shown in the element type column of graph data model 130. The display column of graph data model 130 is an illustrative legend for graph instances such as 105. For example, and according to the display column, edge Y is shown as a dotted line, which indicates that edge Y is an instance of edge type 152.

[0028] Figure 1A depicts an example mapping 120 between an example relational schema 160 and an example graph data model 130 in an embodiment. Graph data model 130 may define properties (not shown) and types of vertices and edges. For example, vertex type 141 may have an age property and a color property, some, none, or all of which may also be properties of vertex type 142.

[0029] According to graph data model 130, each edge type has a respective source vertex type and a respective target vertex type, either, neither, or both of which may be identical to those of other edge types. For example, both edge types 151-152 have the same source vertex type 141 but different respective target vertex types 141-142.

[0030] For edge type 151, the source vertex type is also the destination vertex type, which facilitates self-directed edges, such as X, that originate from and terminate at the same vertex. In some embodiments, a first vertex can be redundantly connected to the same second vertex in the same or opposite directions by multiple edges of the same or different edge types. For example, edges U and X redundantly connect vertex A to itself.

[0031] In operation and depending on the embodiment, graph 105 is loaded for analytics into volatile or non-volatile memory as various columnar vectors. The contents of a vector are homogeneous as to element data type, although different vectors may have different content data types. For example, a vector may store values of a same property of vertices or edges of a same type. A vector may store a system property of a graph element, such as identifiers of vertices of vertex type 141. A vector may store an application property of a graph element, such as the transport status of vertices of vertex type 141.

[0032] Although elements of the same vector are stored contiguously in memory, multiple vectors of different corresponding properties of the same graph element type do not need to be adjacent in memory. Multiple vectors of different corresponding properties of the same graph element type should have the same number of elements, and the contents of those vectors should have the exact same order. For example, for vertex type 141, the color or age of vertex A should appear at the same offset in each of the corresponding color and age property vectors.

[0033] That offset can be operated on as a canonical offset to access all properties of the same vertex or edge. The canonical offset may also be referred to in this document as an internal identifier, a volatile identifier, or an in-memory graph topology identifier (IMGTID). As explained later in this document, the canonical offset is one of a variety of dense identifiers.
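As a minimal sketch of how a canonical offset addresses every property of one graph element, the snippet below keeps one list per property for vertex type 141 and reads all of them at the same offset. The property values, the list names, and the helper `vertex_properties` are invented for illustration; only the age and color properties and vertices A-C come from the text.

```python
# One property vector per property of vertex type 141. All vectors have the
# same length and the same vertex order, so one offset addresses one vertex.
vertex_id = ["A", "B", "C"]          # system property: vertex identifier
color = ["red", "green", "blue"]     # application property (made-up values)
age = [30, 41, 27]                   # application property (made-up values)


def vertex_properties(offset):
    """Gather all properties of the vertex stored at a canonical offset."""
    return {"id": vertex_id[offset], "color": color[offset], "age": age[offset]}


vertex_b = vertex_properties(1)
```

The same integer used against the vectors of another vertex type or of an edge type would address an unrelated element, which is why canonical offsets are only meaningful together with their graph element type.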

[0034] As used herein and depending on context, an in-memory array for a graph element type may be a single property vector or a logical aggregation of multiple distinct property vectors that accept canonical offsets of that graph element type as offsets. Each graph element type has its own zero-based, increasing sequence of canonical offset values. Computer 100 takes care not to confuse canonical offset values of different graph element types, even though such offsets are syntactically interchangeable. Canonical offsets are not semantically interchangeable.

[0035] A canonical offset uniquely identifies a vertex or edge within its vertex type or edge type. A canonical offset is not globally unique. Vertices and/or edges of different types may coincidentally share a same canonical offset. For example, zero may be the canonical offset of both vertices A and D, which have different vertex types.

[0036] In an embodiment, uniqueness of a canonical offset is guaranteed only within a same graph instance. If graph data model 130 describes multiple graph instances that concurrently reside in memory, then each graph instance has its own set of vertex arrays and edge arrays. For example, regardless of whether two graph instances share the same graph data model 130, if the two graph instances share vertex type 141, then there are two separate vertex arrays with separate property vectors for the same vertex type 141. Thus, vertices of the same vertex type 141 in the two graph instances should not be confused.

[0037] In embodiments, graph instances may partially overlap to share one or more vertices and / or one or more edges. Even for graph instances that do not share metadata 120, 130, and / or 160, graph instances may share some CSRs, vectors, or arrays when one or more vertex types and / or one or more edge types are shared. For example, such aggregation and / or indexing structures can store the union of two graph instances for one, some, or all graph element types. In embodiments, only metadata 120, 130, and / or 160 may be shared, but the graph instance content may not be shared. CSRs will be explained later in this document.

[0038] Loading any or every graph element type into memory creates at least one property vector per graph element type. Thus, every vertex type and every edge type has a non-empty logical set of property vectors. For a vertex type, that logical set of vectors is referred to herein as a vertex array, which is logically tabular. For an edge type, that logical set of vectors is referred to herein as an edge array, which is logically tabular.

[0039] Thus, vertex types 141-142 and edge types 151-152 each have a respective vertex array or edge array of property vectors. Herein, all internal identifiers of graph elements in memory are canonical offset values into the vertex array or edge array of the respective graph element type. Each graph element type has its own zero-based, dense, ascending, and continuous sequence of non-negative integer values that is valid while graph 105 is loaded in memory and until graph 105 is evicted from and/or reloaded into memory, as explained in related U.S. patent application 16/747,827.

[0040] Some property vectors in an array of one graph element type may store canonical offsets of another graph element type for cross-referencing. For example, an edge array may have a property vector that stores canonical offsets of vertices of the target vertex type of the edge type. Thus, various graph element arrays may be related to each other, which is sufficient to encode the entire topology of graph 105.

[0041] Graph 105 is loaded from a relational database that has relational schema 160, which defines vertex tables 171-172 and edge tables 181-182. Relational schema 160 defines the persistence format of the data for graph 105, and graph data model 130 defines an analytical format that is suitable for graph analytics in memory. For example, each row of vertex table 171 may be a persistent representation of a respective vertex of vertex type 141. For example, vertex A may be stored as a row in vertex table 171, and vertex D may be stored in vertex table 172.

[0042] Mapping 120 is more or less a data binding between graph data model 130 and relational schema 160. Mapping 120 may be bidirectional to facilitate data reformatting during loading or persisting. In an embodiment, rows of mapping 120 are stored as rows in a mapping table, such as in relational schema 160 or in a different schema and/or database. In an embodiment, mapping 120 is instead persisted to a separate data file or interactively entered during operation.

[0043] Although not shown, mapping 120 can contain bindings that are finer or coarser than a one-to-one mapping from table to vertex type. For example, mapping 120 can contain query predicates that can selectively bind rows of a vertex table to different corresponding vertex types based on the content of the vertex table rows. Similarly, mapping 120 can contain query unions or query joins that can bind vertex types to multiple vertex tables.

[0044] The semantics of mappings, such as 120, provide flexibility to facilitate various scenarios. For example, multiple database instances may share the same relational schema 160 but have different content per database instance in the relational tables, and the same graph data model 130 and mapping 120 may be used to generate a separate graph instance for each database instance. Different mappings, such as 120, may each map the same relational schema 160 to different respective graph data models. Different mappings, such as 120, may each map different respective relational schemas to a same graph data model.

[0045] Mapping 120 provides flexibility for various structural normalization, renormalization, or denormalization scenarios. For example, each vertex table row may map to a vertex, and each edge table row may map to an edge. An edge table may have a foreign key to a vertex table, or vice versa. These and the following mapping details, such as which table columns are primary or foreign keys, how those keys are used, and how they are associated with graph element types, are specified in mapping 120.

[0046] In some embodiments, the polarity of table relationships may vary as follows. For example, an edge table that connects two vertex tables may hold a foreign key to one vertex table, while the other vertex table holds a foreign key to the edge table. The edge table may be an associative table that has two foreign keys, one for each of the two connected vertex tables. An edge table may have no foreign keys, such as when both connected vertex tables hold a foreign key to the edge table. An edge type need not have any edge table, such as when one vertex table holds a foreign key to the other vertex table.

[0047] Table rows may be overloaded, such that mapping 120 may map a same row of a same vertex table to multiple vertex types. For example, the same row may have two columns with different respective foreign keys that are used to map to different respective edge types that have different respective source vertex types and/or different respective target vertex types.

[0048] Various embodiments of mapping 120 may contain various binding tuples, such as any of the following:

[0049] • (Relational table, graph element type)

[0050] • (Source vertex type, edge type, target vertex type)

[0051] • (Source vertex table, edge table, target vertex table)

[0052] • (Source primary key, source foreign key, target primary key, target foreign key).

[0053] Implementations may combine some or all of those kinds of tuples to alternatively implement other kinds of tuples.
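One way to picture such binding tuples is as plain records, combined here into a single edge binding per edge type. This is a hypothetical sketch rather than the format of mapping 120: the class `EdgeBinding`, its field names, the table and column names, and the pairing of edge tables 181-182 with edge types 151-152 are assumptions made for illustration. The source and target vertex types of edge types 151-152 and the pairing of vertex tables 171-172 with vertex types 141-142 follow the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeBinding:
    """Hypothetical combined binding tuple for one edge type."""
    edge_type: str
    edge_table: str
    source_vertex_type: str
    target_vertex_type: str
    source_vertex_table: str
    target_vertex_table: str
    source_foreign_key: str   # edge-table column referencing the source vertex table
    target_foreign_key: str   # edge-table column referencing the target vertex table


# Assumed pairing: edge table 181 holds edge type 151, edge table 182 holds 152.
mapping = [
    EdgeBinding("edge_type_151", "edge_table_181",
                "vertex_type_141", "vertex_type_141",
                "vertex_table_171", "vertex_table_171",
                "src_id", "dst_id"),
    EdgeBinding("edge_type_152", "edge_table_182",
                "vertex_type_141", "vertex_type_142",
                "vertex_table_171", "vertex_table_172",
                "src_id", "dst_id"),
]


def edge_types_from(source_vertex_type):
    """List the edge types whose edges originate at the given vertex type."""
    return [b.edge_type for b in mapping
            if b.source_vertex_type == source_vertex_type]


from_141 = edge_types_from("vertex_type_141")
```

A loader could walk such records to decide which tables to scan and which key columns to translate into canonical offsets for each edge type.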

[0054] The many approaches presented above provide enough flexibility that mapping 120 may be reused with different database instances, such as a January sales database and a February sales database. Different mappings may: a) adapt different respective relational schemas to a same graph data model, and/or b) adapt different respective graph data models to a same relational schema. For example, two different mappings may alternatively map a same edge table to different respective edge types that differ only in direction in different respective graph data models. For example, both edge types may connect the same two vertex types, such that one edge type uses as its source vertex type the vertex type that the other edge type uses as its target vertex type. Thus, foreign key polarity and edge type direction may or may not be related.

[0055] Such adaptability facilitates integration with a legacy database without disturbing its legacy schema, thereby future-proofing the legacy schema and its content. Thus, reuse and/or repurposing of mappings, relational schemas, graph data models, and/or database content is facilitated.

[0056] 1.1 Example Forward CSR

[0057] Figure 1B depicts example forward CSR 110 and example reverse CSR 115 in an embodiment. Figure 1B is presented with reference to Figure 1A.

[0058] Mapping 120 of Figure 1A is metadata and need not provide the actual content of any particular graph instance, such as 105, for analytics. The analytic representation of graph 105 is a topological encoding in memory, such as volatile dynamic random access memory (DRAM), based on one or more CSR aggregations such as 110 and/or 115. As shown, CSRs 110 and 115 encode only edges of edge type 151. Other edge types may each have a separate pair of CSRs.

[0059] Forward CSR 110 contains forward arrays 190 and 195, which, although shown as a table, are integer vectors with corresponding single columns, and their actual stored contents are shown in bold. Columns in forward CSR 110 that are not shown in bold are implicit columns; they may be descriptive and are not actually stored.

[0060] Vertices and edges of graph 105 are topologically encoded into pairs of CSRs, such as 110 and 115 for edge type 151, as follows. Each edge type has its own forward CSR, which has its own forward source array, such as 190. Each row of forward source array 190 represents a distinct vertex of vertex type 141, which is the source vertex type of edge type 151. Each edge type has its own edge array, such as forward destination array 195 for edge type 151. Each row of forward destination array 195 represents a distinct edge of edge type 151.

[0061] Each edge type has its own CSR pair, such as CSRs 110 and 115 for edge type 151. Although multiple edge types 151-152 share the same source vertex type 141, the respective forward CSRs of edge types 151-152 have their own respective forward source arrays.

[0062] Forward source array 190 contains a forward edge position vector that contains offsets of rows of forward destination array 195. Values in the forward edge position vector of forward source array 190 are monotonically increasing and indicate the starting position of the subsequence of rows of forward destination array 195 that represent edges of edge type 151 that originate from the vertex of a given row of forward source array 190. For example, in forward source array 190, vertex A originates edges of edge type 151 that are represented as contiguous respective rows starting at row 0 of forward destination array 195. Each value in the forward edge position vector of forward source array 190 may be calculated by adding, to the previous value, the count of edges of edge type 151 that originate from the previous vertex of the previous row in forward source array 190.

[0063] For example, vertex A originates four edges U-X of edge type 151, which are represented by rows 0-3 of forward destination array 195. Thus, zero + four = four is the value in the forward edge position vector of forward source array 190 for vertex B. Likewise, vertex B originates no edges, so four + zero = four is the value in the forward edge position vector of forward source array 190 for vertex C. In an embodiment, a last entry in the forward edge position vector of forward source array 190 contains the count of edges of edge type 151, which is also the count of rows in forward destination array 195.
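The running-sum calculation of the forward edge position vector can be sketched as a counting pass followed by a prefix sum. This is a minimal sequential sketch, not the loading procedure of the patent: the function `build_forward_csr` and its variable names are hypothetical, and the example contains only edges U-X because this excerpt does not state which edges of edge type 151, if any, originate at vertex C. The endpoint of edge W (vertex C) and the row order U, V, W, X are assumptions.

```python
from itertools import accumulate


def build_forward_csr(num_source_vertices, edges):
    """Build (forward source array, forward destination array).

    edges is a list of (source_offset, target_offset) canonical offsets.
    """
    # Count the edges that originate at each source vertex.
    counts = [0] * num_source_vertices
    for source, _ in edges:
        counts[source] += 1
    # Prefix sum: entry i is where the edges of vertex i start; the extra
    # last entry holds the total edge count.
    positions = [0] + list(accumulate(counts))
    # Scatter each edge into the next free row of its source vertex.
    cursor = positions[:-1]              # slicing makes a fresh working copy
    destinations = [0] * len(edges)
    for source, target in edges:
        destinations[cursor[source]] = target
        cursor[source] += 1
    return positions, destinations


A, B, C = 0, 1, 2                        # canonical offsets in vertex type 141
edges_151 = [(A, A), (A, B), (A, C), (A, A)]   # U, V, W, X (W -> C is assumed)
fwd_source, fwd_destination = build_forward_csr(3, edges_151)
```

For this input the position vector is 0 for vertex A, then 0 + 4 = 4 for vertex B, then 4 + 0 = 4 for vertex C, matching the arithmetic in the paragraph above, and the last entry equals the number of rows in the destination array.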

[0064] Each edge row of forward destination array 195 indicates, in the vertex position vector, the offset of a row in the vertex array of target vertex type 141, which in this case may be or include forward source array 190, as explained below. For example, the vertex position vector of forward destination array 195 indicates that edge V terminates at the vertex in row 1 of forward source array 190, which is vertex B.

[0065] By using only the forward edge position vector of forward source array 190, computer 100 can detect that vertex A originates four edges of edge type 151 by subtracting adjacent values. By using forward destination array 195 after using forward source array 190, computer 100 can further detect that those four edges terminate at vertices A-C. With a separate CSR for each edge type, the entire topology of graph 105 may be densely encoded and rapidly traversed.
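The two lookups just described can be sketched in a few lines. The helper names `out_degree` and `out_neighbors` are hypothetical, and the array contents are an assumption consistent with the text: vertex A originates edges U-X in rows 0-3, U and X return to A, V ends at B, and W is assumed to end at C.

```python
# Forward CSR for edge type 151, using canonical offsets A=0, B=1, C=2.
fwd_source = [0, 4, 4, 4]        # forward edge position vector plus total count
fwd_destination = [0, 1, 2, 0]   # vertex position vector, one row per edge
names = ["A", "B", "C"]


def out_degree(source_array, v):
    """Number of outgoing edges of v: difference of adjacent positions."""
    return source_array[v + 1] - source_array[v]


def out_neighbors(source_array, destination_array, v):
    """Target vertices of the edges that originate at v."""
    return destination_array[source_array[v]:source_array[v + 1]]


degree_of_a = out_degree(fwd_source, 0)
targets_of_a = [names[t] for t in out_neighbors(fwd_source, fwd_destination, 0)]
```

Only the source array is touched to obtain a degree, while the neighbor lookup reads one contiguous slice of the destination array, which is what keeps forward traversal fast.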

[0066] Arrays 190 and 195 are both represented as columns or vectors with vertex positions and forward edge positions. All of those columns / vectors contain canonical offsets for the graph elements, which are specific to a single graph element type for a given column or vector. In forward source array 190, the vertex position column contains canonical offsets for vertex type 141, while the forward edge position vector contains canonical offsets for edge type 151.

[0067] In the forward destination array 195, the forward edge position column contains the canonical offset for edge type 151, and the vertex position vector contains the canonical offset for vertex type 141. The forward edge position vectors and columns of the corresponding arrays 190 and 195 in the same CSR 110 should be used for the same edge type. Depending on whether the source vertex type and the destination vertex type of edge type 151 are the same, the vertex position columns and vectors of the corresponding arrays 190 and 195 in the same CSR 110 may or may not be used for the same vertex type.

[0068] Although arrays 190 and 195 are shown as contents of forward CSR 110, in some embodiments, those arrays may logically also be vertical slices of the graph element array. For example, in an embodiment, forward source array 190 may be a subset of the columns of the vertex array of vertex type 141. In any case, forward source array 190 and the vertex array of vertex type 141 have the same number and order of vertices.

[0069] Even though edge types 151-152 all have the same source vertex type 141, edge types 151-152 also have separate CSRs with individual forward edge position vectors. In an embodiment, those individual forward edge position vectors can also be separate columns in the same vertex array used for vertex type 141.

[0070] In one embodiment, the forward destination array 195 may be a subset of the columns of the edge array for edge type 151, in which case the forward destination array 195 has the same edge ordering as the edge array for edge type 151. In another embodiment, the forward destination array 195 and the edge array for edge type 151 may have different edge orders, as long as there is a mapping between those orders, as explained later herein. In any case, the forward destination array 195 and the edge array for edge type 151 have the same number of edges.

[0071] 1.2 Example of Reverse CSR

[0072] CSRs 110 and 115 are a pair that facilitate bidirectional traversal of edges of unidirectional edge type 151. Edge traversal in the direction of the edge uses forward CSR 110. Edge traversal in the direction of the reverse edge uses reverse CSR 115. Therefore, edge traversal in either direction can occur in more or less similar time and space.

[0073] Although CSRs 110 and 115 are used for traversals in opposite directions, this differs from having two distinct edge types that connect the same vertex types in opposite directions. For example, edge type 152 can have a pair of CSRs for edges originating from vertex type 141 and terminating at vertex type 142. By contrast, another edge type originating from vertex type 142 and terminating at vertex type 141 will have its own separate pair of CSRs. Even if both edge types have the same source vertex type and destination vertex type, each still has its own separate pair of CSRs.

[0074] CSRs 110 and 115 are a redundant pair because both CSRs encode the same topological portion of graph 105 in an alternative manner, and one or both methods can be used during the same traversal of graph 105. For example, querying a route in a city with one-way streets might require two concurrent searches, one from the origin and one from the destination, and success if both searches reach any of the same intermediate vertices. In this embodiment, both CSRs 110 and 115 are used to treat the directed graph 105 as an undirected graph. An example where only the reverse CSR 115 is needed is the PageRank algorithm, which measures the importance of a webpage by traversing backwards through hyperlinks to referencing webpages to discover the transitive closure surrounding the webpage being measured.

[0075] Reverse CSR 115 contains reverse arrays 117 and 119, which, although presented as a table, are integer vectors of their respective individual columns, and their actual stored contents are shown in bold. Columns in Reverse CSR 115 that are not shown in bold are implicit columns; they may be descriptive but are not actually stored.

[0076] The vertices and edges of graph 105 are topologically encoded as reverse CSRs, such as 115 for edge type 151, as follows. Each edge type has its own reverse CSR, which has its own reverse destination array, such as 117. Each row of the reverse destination array 117 represents a different vertex of vertex type 141, which is the target vertex type of edge type 151. Each row of the reverse source array 119 represents a different edge of edge type 151.

[0077] Each edge type has its own CSR pair, such as CSRs 110 and 115 for edge type 151. Although multiple edge types 151-152 share the same target vertex type 141, the reverse CSRs of edge types 151-152 each have their own reverse source array.

[0078] The reverse destination array 117 contains a reverse edge position vector, which contains the offsets of rows in the reverse source array 119. The values in the reverse edge position vector of the reverse destination array 117 monotonically increase to indicate the starting position of a subsequence of rows in the reverse source array 119 representing edges of edge type 151 that terminate at a vertex in a given row of the reverse destination array 117. For example, in the reverse destination array 117, vertex A terminates edges of edge type 151, which are represented as consecutive corresponding rows starting at row 0 of the reverse source array 119. Each value in the reverse edge position vector of the reverse destination array 117 can be calculated by adding the count of edges of edge type 151 terminating at the previous vertex in the previous row of the reverse destination array 117 to the previous value.

[0079] For example, vertex A terminates two edges U and X of edge type 151, which are represented by rows 0-1 of the reverse source array 119. Therefore, 0 + 2 = 2 is the value in the reverse edge position vector of the reverse destination array 117 for vertex B. Similarly, vertex B terminates two edges V and Z, so 2 + 2 = 4 is the value in the reverse edge position vector of the reverse destination array 117 for vertex C. In this embodiment, the last entry in the reverse edge position vector of the reverse destination array 117 contains the count of edges of edge type 151, which is also the count of rows in the reverse source array 119.
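The running-sum rule in the two paragraphs above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `reverse_edge_positions` is hypothetical, the counts for vertices A and B come from the text, and the count of one for vertex C is an assumption made only so that the example has a final total.

```python
from itertools import accumulate

def reverse_edge_positions(in_degree):
    """Return the starting row of each vertex's edges, plus a trailing edge total.

    in_degree[v] is the number of edges terminating at vertex position v.
    """
    # Each position is the previous position plus the previous vertex's count.
    return [0] + list(accumulate(in_degree))

# A and B each terminate two edges (from the text); C's count of 1 is assumed.
positions = reverse_edge_positions([2, 2, 1])
print(positions)  # [0, 2, 4, 5]
```

The trailing entry equals the number of rows in the reverse source array, which matches the statement that the last entry holds the edge count.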

[0080] Each edge row of the reverse source array 119 indicates, in the vertex position vector, the offset of a row in the vertex array of source vertex type 141, which in this case may be or include the reverse destination array 117, as explained below. For example, the vertex position vector of the reverse source array 119 indicates that edge Z originates from the vertex in row 2 of the reverse destination array 117, namely vertex C.

[0081] By using only the reverse edge position vector of the reverse destination array 117, the computer 100 can detect that vertex B terminates two edges of edge type 151 by subtracting adjacent values, such as subtracting the value for vertex B from the value for vertex C, resulting in 4 - 2 = two. By using the reverse source array 119 after using the reverse destination array 117, the computer 100 can also detect that these two edges originate from vertices A and C. Using a separate reverse CSR for each edge type allows for dense encoding and fast backward traversal of the entire topology of graph 105.
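As a rough sketch of the lookup just described, the following Python reads an in-degree by subtracting adjacent positions and then slices the reverse source array to find where the edges come from. The helper names are hypothetical. The array contents are toy data: only the facts that A and B each terminate two edges, and that B's edges originate from A and C, come from the text; the remaining entries are assumed.

```python
def in_degree(rev_pos, v):
    """Count edges terminating at vertex position v by subtracting adjacent values."""
    return rev_pos[v + 1] - rev_pos[v]

def in_neighbors(rev_pos, rev_src_vertex, v):
    """Return the source vertex positions of all edges terminating at v."""
    return rev_src_vertex[rev_pos[v]:rev_pos[v + 1]]

A, B, C = 0, 1, 2
rev_pos = [0, 2, 4, 5]            # reverse edge positions with a trailing total (assumed)
rev_src_vertex = [A, B, A, C, C]  # rows 2-3 follow the text; rows 0, 1, 4 are assumed

print(in_degree(rev_pos, B))                     # 2
print(in_neighbors(rev_pos, rev_src_vertex, B))  # [0, 2], i.e. vertices A and C
```

Both operations touch only contiguous array ranges, which is why backward traversal through a reverse CSR costs about the same as forward traversal through a forward CSR.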

[0082] Reverse arrays 117 and 119 are both shown as columns or vectors with vertex positions and reverse edge positions. All of those columns/vectors contain canonical offsets for the graph elements, which are specific to a given graph element type for a given column or vector. In reverse destination array 117, the vertex position column contains canonical offsets for vertex type 141, and the reverse edge position vector contains canonical offsets for edge type 151.

[0083] In the reverse source array 119, the reverse edge position column contains the canonical offset for edge type 151, and the vertex position vector contains the canonical offset for vertex type 141. The reverse edge position vectors and columns of the corresponding reverse arrays 117 and 119 in the same reverse CSR 115 should be used for the same edge type. Depending on whether the source vertex type and the target vertex type of edge type 151 are the same, the vertex position columns and vectors of the corresponding reverse arrays 117 and 119 in the same reverse CSR 115 may or may not be used for the same vertex type.

[0084] Although reverse arrays 117 and 119 are shown as contents of reverse CSR 115, in some embodiments, those reverse arrays may logically also be vertical slices of the graph element array. For example, in an embodiment, reverse destination array 117 may be a subset of the columns of the vertex array of vertex type 141. In any case, reverse destination array 117 and the vertex array for vertex type 141 have the same number and order of vertices.

[0085] The two edge types have separate CSRs, each with a separate reverse destination array and a separate reverse edge position vector, even if both edge types have the same target vertex type. In an embodiment, those separate reverse edge position vectors can also be separate columns in the same vertex array for the target vertex type.

[0086] The forward edge position vector of the reverse source array 119 stores the offset of each edge within the forward destination array 195 and/or the edge array for edge type 151. In an embodiment, the reverse source array 119 and the edge array for edge type 151 can have different orderings of edges, as long as there is a mapping between those orderings. For example, the forward and reverse edge position columns of the reverse source array 119 can be used as a bidirectional lookup table to translate edge positions. In any case, the reverse source array 119 and the edge array for edge type 151 have the same number of edges.
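One way to picture the bidirectional lookup is as a permutation and its inverse. The sketch below assumes the forward edge position vector is available as a Python list indexed by reverse edge position; the function name `invert_edge_mapping` and the sample values are hypothetical.

```python
def invert_edge_mapping(rev_to_fwd):
    """Given rev_to_fwd[reverse_row] = forward_row, build fwd_to_rev."""
    fwd_to_rev = [0] * len(rev_to_fwd)
    for reverse_row, forward_row in enumerate(rev_to_fwd):
        fwd_to_rev[forward_row] = reverse_row
    return fwd_to_rev

# Hypothetical forward edge position vector of a five-edge reverse source array.
rev_to_fwd = [1, 2, 0, 3, 4]
fwd_to_rev = invert_edge_mapping(rev_to_fwd)
print(fwd_to_rev)  # [2, 0, 1, 3, 4]
```

Because every edge appears exactly once in each ordering, the mapping is a permutation and translation in either direction is a single array read.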

[0087] Presented herein are techniques for filling entries of the same reverse CSR and/or the same CSR pair in parallel, for acceleration in shared memory, such as by symmetric multiprocessing (SMP), such as with a multi-core processor. For example, graph 105 can be enormous, such as having billions of vertices, trillions of edges, and/or a diameter of tens or hundreds of thousands of vertices. For example, temporal feasibility can depend on horizontally scaled filling of the same reverse CSR according to the synchronization and coordination techniques described herein.

[0088] In this embodiment, memory structures such as CSRs and vertex tables are optional. The following data definition language (DDL) statements can specify my_graph 105 as eligible to be loaded into memory, where the owner is a user or a schema.

[0089] ALTER PROPERTY GRAPH [owner.]my_graph INMEMORY

[0090] Similar DDL statements can specify that graph 105 is no longer eligible for loading into memory. In embodiments and as discussed later herein, computer 100 exposes graph 105 to clients in the same manner, regardless of whether graph 105 resides in memory. For example, if graph 105 does not reside in memory, computer 100 can apply Data Manipulation Language (DML) statements, such as Structured Query Language (SQL), to the database and its tables containing relational schema 160 to perform filtering, joins, and projections as needed to retrieve a result set representing all of graph 105, a specific graph data element, or all instances of a specific graph data element type.

[0091] Furthermore, as described later herein, loading some or all of graph 105 into memory can occur asynchronously in a background process, such that: a) a client request is more or less entirely delegated for query processing to a database management system (DBMS) hosted by computer 100 or a different computer; but b) a repetition of the same request is instead applied only to graph 105 in memory during the same graph analysis session. Various embodiments may incorporate some or all of the graph processing functionality of computer 100 into the DBMS itself. For example, the DBMS on computer 100 may operate as both a relational database engine and a graph database engine.

[0092] As will be described later herein, graph 105 can be loaded into and/or unloaded from memory in a segmented manner, synchronously or asynchronously (e.g., in the background) with respect to client requests. For example, CSRs and/or vertex tables are loaded into memory individually, driven by demand, and evicted from memory when memory is scarce. In another example described later herein: a) horizontal and/or vertical slices of vertex and/or edge tables store their data in memory blocks; b) each block can be loaded individually, or, in embodiments, evicted individually; and c) multiple blocks can be loaded in parallel from one or more of the same or different relational tables. Therefore, fulfilling client requests may require mixed access to database tables and memory.

[0093] 2.0 CSR Filling Process

[0094] Figure 2 is a flowchart depicting an example process that computer 100 can execute to fill a pair of CSRs 110 and 115 for edge type 151 in graph 105. Figure 2 is discussed with reference to Figures 1A-1B. Parallel filling of the reverse CSR 115 is presented later herein. Parallel filling of the forward CSR 110 is described in related U.S. patent application 16/747,827.

[0095] As presented earlier herein, step 202 yields a mapping 120 that binds relational schema 160 to graph data model 130. For example, mapping 120 may include a lookup table whose keys are relational table names and whose values are vertex type names. Mapping 120 may specify source and destination vertex types for each edge type. Mapping 120 may be composed manually or derived automatically, such as by analyzing graph data model 130 and relational schema 160.

[0096] Individual CSR pairs can be generated for each edge type 151-152 of the graph data model 130. Steps 204 and 206 populate CSR pairs for one edge type and can be repeated for additional edge types of the same graph data model 130.

[0097] Based on mapping 120, step 204 fills in the forward CSR 110, as described elsewhere herein and/or in related U.S. patent application 16/747,827. After step 204, the forward CSR 110 is ready for use.

[0098] Based on the forward CSR representation 110, step 206 fills in the reverse CSR representation 115, as described elsewhere in this document. The parallelization of step 206 will be presented later. After step 206, the reverse CSR 115 is ready for use.

[0099] Consulting the forward CSR 110 accelerates step 206 by reusing work already performed during step 204. Consulting the forward CSR 110 accelerates step 206 by avoiding redundant work such as calculating, grouping, or sorting vertices and/or edges, and/or input/output (I/O) (such as accessing persistent storage). Other techniques used to populate a CSR do not consult another CSR.
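To make step 206 concrete, here is a single-threaded Python sketch that derives a reverse CSR purely from the arrays of a forward CSR, without touching any table. It is an illustration under assumptions rather than the claimed method: the function name, the plain-list representation, the trailing total in each position vector, and the toy five-edge graph are all hypothetical.

```python
def build_reverse_csr(fwd_pos, fwd_dst, num_dst_vertices):
    """Build (rev_pos, rev_src_vertex, rev_fwd_edge) from a forward CSR.

    fwd_pos[s]..fwd_pos[s+1] is the range of forward edge rows of source s.
    fwd_dst[e] is the destination vertex position of forward edge row e.
    """
    num_edges = len(fwd_dst)
    # 1. Count the edges that terminate at each destination vertex.
    counts = [0] * num_dst_vertices
    for dst in fwd_dst:
        counts[dst] += 1
    # 2. A running sum of the counts gives each vertex's first reverse row.
    rev_pos = [0] * (num_dst_vertices + 1)
    for v in range(num_dst_vertices):
        rev_pos[v + 1] = rev_pos[v] + counts[v]
    # 3. Scatter every forward edge into the next free row of its destination.
    cursor = rev_pos[:num_dst_vertices]  # slice makes a copy; next free row per vertex
    rev_src_vertex = [0] * num_edges
    rev_fwd_edge = [0] * num_edges
    for src in range(len(fwd_pos) - 1):
        for edge in range(fwd_pos[src], fwd_pos[src + 1]):
            row = cursor[fwd_dst[edge]]
            cursor[fwd_dst[edge]] += 1
            rev_src_vertex[row] = src
            rev_fwd_edge[row] = edge
    return rev_pos, rev_src_vertex, rev_fwd_edge

# Toy forward CSR (assumed data): three vertices, five edges.
rev_pos, rev_src_vertex, rev_fwd_edge = build_reverse_csr(
    fwd_pos=[0, 2, 3, 5], fwd_dst=[1, 0, 0, 1, 2], num_dst_vertices=3)
print(rev_pos)         # [0, 2, 4, 5]
print(rev_src_vertex)  # [0, 1, 0, 2, 2]
print(rev_fwd_edge)    # [1, 2, 0, 3, 4]
```

Everything the function needs is already in the forward CSR, so no grouping, sorting, or I/O is repeated. The three numbered steps are the ones that later sections parallelize.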

[0100] 3.0 Sample Performance Improvements

[0101] Figure 3 depicts example activities that may occur to populate reverse CSR 115. Figure 3 is discussed with reference to Figures 1A-1B and 2. In one embodiment, the forward CSR 110 is filled before some activities of Figure 3 occur. In an embodiment, the filling of the forward CSR 110 and some activities of Figure 3 occur somewhat concurrently.

[0102] The embodiments may perform some or all of activities 301-309 in any order. Activities 301-309 present non-exclusive design choices and may or may not be optional. Most activities 301-309 can be combined.

[0103] Activity 301 efficiently populates the reverse CSR 115. Activities 302-303 disable certain high-latency activities. In other words, activities 302-303 specify operations that should not occur when populating the reverse CSR 115.

[0104] Activity 302 fills the reverse CSR 115 without performing input/output (I/O). In other words, activity 302 has all the necessary data already available in random access memory (RAM), such as volatile RAM. For example, the forward CSR 110 may already reside in memory and can be consulted by activity 302.

[0105] Activity 303 populates the reverse CSR 115 without accessing the relational tables that provide vertices and edges for edge type 151. For example, activity 303 does not need to access any relational tables or anything persisted in the database unless the edge type is used as a leaf edge, as explained later herein. Specifically, activity 303 does not access any of the following: the source vertex table 171, the destination vertex table (also 171 in this example), or the edge table 181. For example, activity 303 could instead access the required data in the forward CSR 110 in memory.

[0106] Activity 304 uses parallelization to accelerate the filling of the reverse CSR 115, such as by horizontal scaling with multiple compute threads, CPUs, and/or CPU cores. For example, the filling of some (one or more) rows in the reverse arrays 117 and/or 119 can be assigned to different threads. Work allocation techniques such as data partitioning, thread pools, backlogs, and thread safety are discussed later herein.

[0107] Activity 305 processes at least two edges concurrently. For example, two edges originating from or terminating at the same vertex can be processed by corresponding threads. For example, each row in the reverse source array 119 can be filled by a corresponding thread.

[0108] Activity 306 performs an atomic operation to increment a counter for the target vertex. For example, and as discussed later herein, each target vertex A-C of vertex type 141 can have its own corresponding counter, which is accessed by atomic instructions of the instruction set architecture (ISA). For example, fetch-and-add can atomically read and increment the counter. As discussed later herein, contention can cause concurrent atomic instructions to be executed serially.

[0109] Activity 307 counts the edges of edge type 151 that terminate at each vertex of target vertex type 141. For example, the reverse edge position vector in reverse CSR 115 can be filled based on this edge count. Similarly, the rows for edges in the reverse source array 119 can be assigned to the target vertex based on this edge count.

[0110] The filling of CSRs 110 and 115 can overlap in time to some extent, especially when the filling of the forward CSR 110 requires and/or computes the values needed to fill the reverse CSR 115. For example, synchronous logic or asynchronous pipeline parallelization can make the filling of CSRs 110 and 115: a) occur concurrently to some extent, b) copy and/or reuse the computed values, and/or c) consult previously filled data partitions, such as blocks presented later herein.

[0111] When filling the forward destination array 195 for each edge of edge type 151, activity 308 increments the corresponding counter for the target vertex that terminates the edge. For example, since target vertex B terminates two edges V and Z, the counter for target vertex B should be incremented by one twice. Such edge counters and thread safety are discussed later herein.

[0112] The forward destination array 195 can provide any or all edges of edge type 151. Activity 308 can avoid, or facilitate avoiding, linear processing (i.e., iteration) of edges, such as edges within the forward destination array 195. Activity 309 is a single-threaded embodiment that counts the edges for each target vertex by linearly iterating through the forward destination array 195 or another edge array for edge type 151. Activities 308-309 can be somewhat mutually exclusive. However, as presented later herein, data chunking can facilitate multithreading that concurrently counts edges in different chunks but processes edges within the same chunk sequentially.

[0113] 4.0 Parallel Filling of the Reverse Destination Array

[0114] Figure 4 is a block diagram depicting an example computer 400 and an example graph 410 in an embodiment. Computer 400 uses parallelization to accelerate the filling of a pair of redundant compressed sparse row (CSR) encodings of logical graph 410. Computer 400 may be an implementation of computer 100.

[0115] Figure 5 depicts example activities that facilitate parallel filling of data and/or metadata of the reverse destination array 430. An embodiment can implement some or all of activities 501-507, which can occur in any order. Some of activities 501-507 can be combined into a single activity. The following example configuration and operation of computer 400 is discussed with reference to Figures 4-5.

[0116] Example graph 410 has a graph data model with only one vertex type and two edge types, shown as solid or dashed arrows respectively. The solid edge type can be encoded into a CSR pair as follows, excluding edge R, which has a different edge type and belongs to a different CSR pair. The forward destination array 420 and the forward source array are encoded as described earlier herein.

[0117] Single-Program Multiple-Data (SPMD) with shared memory and multiple asynchronous compute threads, CPUs, and/or CPU cores provides horizontal scaling, which accelerates the filling of some or all of the arrays of a CSR pair, as follows. In an embodiment, any array of a CSR pair can be logically and/or physically divided into multiple blocks, each block having multiple adjacent rows of the array. For example, the forward destination array 420 contains blocks A1-A2, each with three edges.

[0118] To speed up the processing of any array, each thread can process the corresponding block of the array simultaneously. If there are more blocks than threads, then an ordered or unordered backlog can contain unprocessed blocks. When a thread finishes processing a block, it can take another block from the backlog and process it until the backlog is empty.
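The backlog scheme above can be sketched with Python's standard `queue` and `threading` modules. This is only an illustration: `process_blocks` and its parameters are hypothetical names, and the per-block work here is an arbitrary function passed in by the caller.

```python
import queue
import threading

def process_blocks(blocks, work, num_threads):
    """Run work(block) for every block using a fixed number of threads."""
    backlog = queue.Queue()
    for index, block in enumerate(blocks):
        backlog.put((index, block))
    results = [None] * len(blocks)

    def worker():
        # Keep taking blocks until the backlog is empty.
        while True:
            try:
                index, block = backlog.get_nowait()
            except queue.Empty:
                return
            results[index] = work(block)  # each slot is written by one thread only

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Four blocks but only two threads, so two blocks wait in the backlog.
print(process_blocks([[1], [2, 3], [4, 5, 6], []], sum, num_threads=2))  # [1, 5, 15, 0]
```

A thread that finishes early simply takes more blocks, which balances the load when blocks take unequal time.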

[0119] Two concurrent threads can process blocks A1-A2 separately, which may include filling and / or subsequently reading blocks A1-A2. For convenience, those threads are known in this document according to their respective blocks. Thus, blocks A1-A2 are processed by threads A1-A2 respectively, which concurrently fill the reverse destination array 430 as shown as activity 501.

[0120] Threads A1-A2 can operate at various stages and share data with other threads in other parallel or serial stages of the processing pipeline. The goal is to populate the reverse edge position vector of the reverse destination array 430. To achieve this, the reverse edge position vector of the reverse destination array 430 can temporarily store various intermediate values at different times, such as the old and new values shown.

[0121] In the first phase, threads A1-A2 concurrently compute and adjust the old values stored in the reverse edge position vector of the reverse destination array 430. Although threads A1-A2 are concurrent together, each thread processes each edge of its own block serially. That is, each thread processes one edge at a time, such as in each period of time T1-T4.

[0122] Because the reverse edge position vector of the reverse destination array 430 has elements for each vertex E-J, each element has an old value. Initially, before T1, all old values were zero.

[0123] Time intervals T1-T4 are logical and relative times. Although they monotonically increase, they do not need to be at equal intervals. For example, T1-T2 can be separated by a few milliseconds, while T2-T3 can be separated by a few nanoseconds.

[0124] At time T1, threads A1-A2 process the first edge of their respective blocks by: a) detecting which vertex is the target vertex of the first edge while filling or reading the forward destination array 420, and b) incrementing the old value of that target vertex by one. For example, the target vertex of the first edge of block A2 is vertex H, whose vertex position is three. Therefore, as shown at time T1 for vertex H, 1A2 means that thread A2 stores one, that is, initially zero and incremented by one. Thread A1 concurrently behaves similarly at time T1 for its first edge N, shown as 1A1.

[0125] Old value is a descriptive alias for the reverse edge position vector of the reverse destination array 430. Therefore, the values shown in the old value for times T1-T4 are actually stored in the reverse edge position vector of the reverse destination array 430 at those times.

[0126] At time T2, threads A1-A2 process their respective second edges O and S, both terminating at the same destination vertex F. Therefore, threads A1-A2 simultaneously attempt to increment the old value of vertex F, which is a race condition that could potentially corrupt the old value of vertex F. Some implementations do not support redundant edges, such as O and S with the same source vertex and the same destination vertex.

[0127] In a thread-safe embodiment, the old value is protected by atomic instructions of the CPU instruction set architecture (ISA). An atomic instruction such as fetch-and-add can atomically both read the value of a numeric variable and increment that variable. Threads A1-A2 use an atomic instruction to increment the old value. In an embodiment, compare-and-swap is the atomic instruction used instead of fetch-and-add.

[0128] When multiple threads simultaneously issue atomic instructions for the same variable (such as an array element or a memory address), execution of the atomic instructions is serialized, such as along times T2-T3, as shown in the figure. For example, as shown in the figure, thread A2 increments the old value of vertex F at time T2, and thread A1 increments the old value of vertex F at time T3. This serialization safely resolves conflicts.

[0129] In other embodiments, other software synchronization mechanisms achieve thread safety, such as mutexes, semaphores, locks, or critical sections. In any case, a conflict can only occur on the same element of the array. Simultaneous access to different elements of the same array is inherently thread-safe. For example, each element can have its own independent lock.
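The first counting phase can be sketched in Python using the per-element lock alternative mentioned above, since Python exposes no user-level fetch-and-add instruction. The function name `count_target_edges` is hypothetical. The block contents are partly assumed: the text fixes that block A1 starts with edge N and continues with O to vertex F, and that block A2 starts with an edge to vertex H and continues with S to vertex F; the remaining targets are chosen only to match the final counts described for blocks B1 and B2.

```python
import threading

def count_target_edges(blocks, num_vertices):
    """Count, per target vertex, the edges found in all blocks of the
    forward destination array, with one thread per block."""
    old_value = [0] * num_vertices
    locks = [threading.Lock() for _ in range(num_vertices)]  # one lock per element

    def run(block):
        for dst in block:          # edges within a block are processed serially
            with locks[dst]:       # stands in for an atomic fetch-and-add
                old_value[dst] += 1

    threads = [threading.Thread(target=run, args=(block,)) for block in blocks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                   # acts as the barrier that ends this phase
    return old_value

E, F, G, H, I, J = range(6)
block_a1 = [E, F, F]   # edges N, O, and an assumed third edge
block_a2 = [H, F, H]   # an edge to H, edge S, and an assumed third edge
print(count_target_edges([block_a1, block_a2], 6))  # [1, 3, 0, 2, 0, 0]
```

Threads contend only when they hit the same vertex at the same moment, as with vertex F here. Increments on different elements never block each other.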

[0130] Due to atomic serialization or other factors, thread A2 finishes first, completing its last edge at time T3, as shown in the figure. Thread A1, however, lags behind and completes at time T4, as shown in the figure. This filling phase continues until all participating threads have completed, which can be detected using software synchronization mechanisms such as semaphores or barriers.

[0131] In this embodiment, the reverse destination array 430 contains blocks B1-B3, each containing two target vertices. In the next filling phase, threads B1-B3 can process the corresponding blocks B1-B3. In an embodiment, threads A1-A2 and B1-B3 are overlapping sets of threads. For example, threads A1 and B3 could be the same thread that has been repurposed. For example, a thread could return to a pool of idle threads after completing processing and wait to be repurposed.

[0132] Based on the final old values, each of threads B1-B3 iterates over the target vertices of its corresponding block to calculate a running total of edges in the block. For each target vertex, the running total is stored in the corresponding element of the new value. As shown in the figure, the new value of the first target vertex in each block is set to zero. The new value of each subsequent target vertex in the block is the sum of the new value of the previous element in the block and the final old value of the previous element.

[0133] For example, for the second target vertex F of block B1, the new value of the previous target vertex E is zero, and the final old value of the previous target vertex E is one. Therefore, thread B1 calculates the new value for target vertex F as 0 + 1 = one, as shown in the figure, and stores it. New value is a descriptive alias for the reverse edge position vector of the reverse destination array 430. Therefore, the value shown in the new value is actually stored in the reverse edge position vector of the reverse destination array 430, thereby overwriting the previously stored old value, as shown in activity 503.

[0134] Each block B1-B3 may contain or otherwise associate with its own metadata fields, such as the last value and block offset shown. Each block may have metadata computed and stored by the corresponding thread. As explained above, the old value is the count of edges for each target vertex, and the new value is the running sum of those counts within the block.

[0135] The new value for a target vertex does not include the old edge count of that target vertex, but only the old edge counts of the preceding target vertices in the block. Therefore, the old edge count of the last target vertex in the block is excluded from every new value in the block. Although excluded from the new values, that count should be included to finalize the running total, which is stored in the last value metadata field of the block.

[0136] As shown for demonstration purposes, the last value is effectively, though not operationally, the sum of all the final old values for all target vertices in the block. For example, at time T3, the final old values for block B1 are one and three, respectively. Therefore, the last value is actually 1 + 3 = four, as shown in the figure. Thread safety for filling in the new and last values is inherent because each thread B1-B3 only accesses its own block.
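The per-block running total can be sketched as an in-place exclusive scan. In this hedged Python illustration, `scan_block` and `scan_all_blocks` are hypothetical names, and the starting counts are the final old values described above for vertices E, F, and H, with zero counts for G, I, and J inferred from the stated block totals.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_block(values, start, end):
    """Overwrite old counts in values[start:end] with new values (running
    totals that exclude the element's own count); return the last value."""
    running = 0
    for i in range(start, end):
        count = values[i]    # old value: edge count of this target vertex
        values[i] = running  # new value: counts of earlier vertices in the block
        running += count
    return running           # includes the count of the block's final vertex

def scan_all_blocks(values, bounds):
    """Scan every block concurrently; each thread touches only its own block."""
    with ThreadPoolExecutor(max_workers=max(1, len(bounds))) as pool:
        return list(pool.map(lambda b: scan_block(values, b[0], b[1]), bounds))

values = [1, 3, 0, 2, 0, 0]        # old values for vertices E through J
bounds = [(0, 2), (2, 4), (4, 6)]  # blocks B1, B2, B3
last_values = scan_all_blocks(values, bounds)
print(values)       # [0, 1, 0, 0, 0, 0]
print(last_values)  # [4, 2, 0]
```

No locking is needed because the blocks are disjoint slices of the same array.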

[0137] Therefore, the new and last values are computed in a multi-threaded processing phase. The next processing phase is single-threaded, but can be pipelined with the previous phase, as shown below. For example, pipelined parallelism might require concurrency for activities 502-503 in the corresponding pipelined phase.

[0138] A single thread should process blocks B1-B3 in the order they appear in the reverse destination array 430. Therefore, a single thread should process block B1 first and block B3 last. When starting to process a block, a single thread should wait until the thread in the previous stage has completed that block.

[0139] For example, a single thread should not begin processing a block before thread B1 has completed block B1, even if thread B3 has already completed block B3. Pipelining occurs when a single thread processes one block while another block is still being processed by another thread in the previous stage. For example, if the previous stage completed block B1 before block B3, then a single thread can process block B1 while block B3 is still being processed by thread B3 in the previous stage.

[0140] A single thread populates the block offset metadata field for all blocks in ascending order, one block at a time. The arithmetic formula used for the block offset depends on which block it is. As shown in the figure, the block offset of the first block B1 is zero. The block offset of the second block B2 is the last value metadata field of the first block B1, which is four, as shown in the figure.

[0141] As calculated by Activity 502, the block offset of each subsequent block is the sum of the last value of the previous block and the block offset of the previous block. For example, for block B3, the last value of the previous block B2 is two, as shown in the figure, and the block offset of the previous block B2 is four, as shown in the figure. Therefore, as shown in the figure, the block offset of block B3 is 2 + 4 = six.
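The single-threaded block offset rule amounts to one short loop. The sketch below uses the last values stated above for blocks B1 and B2; the last value of zero for block B3 is inferred, and the function name `block_offsets` is hypothetical.

```python
def block_offsets(last_values):
    """offset[0] = 0; offset[k] = offset[k-1] + last_value[k-1]."""
    offsets = [0] * len(last_values)
    for k in range(1, len(last_values)):
        offsets[k] = offsets[k - 1] + last_values[k - 1]
    return offsets

print(block_offsets([4, 2, 0]))  # [0, 4, 6]
```

Each offset depends on the previous one, which is why this step is serial. It visits one entry per block rather than one per vertex, so it stays cheap even for very large arrays.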

[0142] When a single thread completes, the block metadata is populated, and the final parallel population phase for the reverse destination array 430 occurs as follows. Each block is processed in two parallel phases, one of which has already occurred. In this embodiment, each block is processed by the same thread in both parallel phases.

[0143] For example, threads B1-B3 again process their respective blocks. In an embodiment, the assignment of blocks to threads changes during a second parallel phase, such as when a thread returns to the thread pool between the two parallel phases. In an embodiment, the two parallel phases have different numbers of threads.

[0144] Activity 504 applies the block offset as follows. The second parallel phase finally determines the values in the reverse edge position vector of the reverse destination array 430, as shown in the figure. New value is a descriptive alias for the reverse edge position vector of the reverse destination array 430 at the beginning of this phase. At the end of this phase, the reverse edge position vector of the reverse destination array 430 has its final values, as shown in the figure. For each block, the final value is the new value plus the block offset of the block, which is added during activity 505.

[0145] For example, block B2 has a block offset of four. Therefore, four is added to each new value of block B2. For example, the new value of the target vertex H is zero. Therefore, in the reverse destination array 430, the reverse edge position for the target vertex H is 0 + 4 = four, as shown in the figure.

[0146] Thread safety is inherent for this arithmetic phase because each thread B1-B3 accesses only its own block. As explained above, SPMD processes multiple blocks in parallel, but for earlier processing phases, processing can be sequential within a block. In this embodiment, this arithmetic phase combines SPMD with Single Instruction Multiple Data (SIMD) for data parallelization to further accelerate the process during activity 507 via non-elastic scaling. In this case, and during activity 506, some or all of the final values in a block can be computed concurrently because the same block offset is added to all new values in the block, which is suitable for vector hardware.
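The final arithmetic phase can be sketched as below. The names are hypothetical, the input values are the new values and block offsets from the walkthrough above (with inferred zeros for vertices G, I, and J), and plain Python threads stand in for SPMD. The inner loop marks where a SIMD vector add would apply in a lower-level language.

```python
from concurrent.futures import ThreadPoolExecutor

def add_block_offset(values, start, end, offset):
    """Add one block's offset to each of its new values, in place."""
    for i in range(start, end):  # same addend for every element: vectorizable
        values[i] += offset

def apply_block_offsets(values, bounds, offsets):
    """Finalize all blocks concurrently; blocks are disjoint, so no locks."""
    with ThreadPoolExecutor(max_workers=max(1, len(bounds))) as pool:
        futures = [pool.submit(add_block_offset, values, start, end, offset)
                   for (start, end), offset in zip(bounds, offsets)]
        for future in futures:
            future.result()      # wait for the block and surface any exception
    return values

new_values = [0, 1, 0, 0, 0, 0]    # vertices E through J after the block scan
bounds = [(0, 2), (2, 4), (4, 6)]  # blocks B1, B2, B3
final = apply_block_offsets(new_values, bounds, offsets=[0, 4, 6])
print(final)  # [0, 1, 4, 4, 6, 6]
```

The result is the completed reverse edge position vector: edges terminating at vertex F occupy rows 1-3 of the reverse source array, and edges terminating at vertex H occupy rows 4-5.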

[0147] After all threads have completed this phase, the reverse destination array 430 is filled. The reverse CSR that contains the reverse destination array 430 remains incomplete until the reverse source array 440 is filled, as follows.

[0148] 5.0 Parallel Filling of Reverse Source Array

[0149] Figure 6 depicts example activities that facilitate parallel filling of data and/or metadata of the reverse source array 440. An embodiment can implement some or all of activities 601-604, which can occur in any order. Some of activities 601-604 can be combined into a single activity. The following example configuration and operation of computer 400 is discussed with reference to Figures 4 and 6.

[0150] In this embodiment, the filling of the reverse source array 440 is based on the destination arrays 420 and 430 as follows. As explained above, the old values are used as shared counters for edges terminating at the corresponding target vertices in the reverse destination array 430. The parallel filling of the reverse source array 440 also uses a vector of shared counters as follows.

[0151] As discussed above, blocks A1-A2 of the forward destination array 420 are processed in parallel by threads A1-A2 during the initial parallel phase. In the subsequent final parallel phase, threads A1-A2, or the same or a different number of other concurrent threads, process blocks A1-A2 in parallel again, as described below.

[0152] Each thread processes the edges in its block sequentially as follows: a) detects, in the forward destination array 420, the vertex position of the target vertex of the edge, b) uses that vertex position as the offset of that target vertex in the reverse destination array 430, c) detects, in the reverse edge position vector of the reverse destination array 430, the first offset in the reverse source array 440 of the subset of edges terminating at that target vertex, and d) performs a thread-safe mutation, as described below.

[0153] Each edge terminating at the target vertex should be filled into a separate adjacent row in the reverse source array 440. Parallelization of threads A1-A2 allows edges terminating at the target vertex to be processed in any order, which is tolerable. Because threads A1-A2 process separate edges during activity 604, threads A1-A2 should not share the same row in the reverse source array 440.

[0154] Similarly, threads A1-A2 should not leave any empty rows in the reverse source array 440. Therefore, some coordination is needed to assign rows in the reverse source array 440 to be filled by the corresponding threads A1-A2.

[0155] In this embodiment, each row of the reverse destination array 430 is associated with a corresponding counter that indicates how many edges terminating at the corresponding target vertex have been filled into adjacent rows of the reverse source array 440. Each counter is initially zero and increments by one as edges terminating at the corresponding target vertex are filled into the reverse source array 440.

[0156] In this embodiment, during activity 605, those counters are thread-safe, and each counter is synchronized using atomic instructions (such as fetch-and-add) or otherwise, as explained above. For example, when threads A1-A2 process edges O and S in their respective blocks A1-A2, both edges terminate at the same target vertex F, so threads A1-A2 may contend when selecting the rows of the reverse source array 440 that each should fill. This contention is resolved by threads A1-A2 using the same atomic counter for target vertex F.

[0157] Based on the reverse edge position column of the reverse destination array 430, threads A1-A2 detect that the edges terminating at the target vertex F start at reverse edge position 1 in the reverse source array 440. In this example, thread A1 has already filled edge N at that position in the reverse source array 440 and incremented the counter for the target vertex F from zero to one. In this example, thread A1 runs ahead of thread A2 and reads one as the counter value when processing edge O.

[0158] To calculate the position of edge O within the reverse source array 440, thread A1 adds the value of the counter to the starting offset of the target vertex F in the reverse source array 440, and increments the counter by one during activity 601. Therefore, thread A1 fills edge O at the reverse edge position 1 + 1 = 2 within the reverse source array 440, as shown in the figure. Thus, edge O and eventually edge S are safely and atomically added to separate rows of the reverse source array 440. Filling edge O in the reverse source array 440 during activities 602 and/or 603 may require copying the forward edge position of edge O from array 420 to array 440, as shown in the figure.
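
The row assignment described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: Python has no hardware fetch-and-add, so the hypothetical AtomicCounter class emulates one with a lock, and the names claim_row, counter_f, and start_f are assumptions made only for this illustration. The starting offset 1 for target vertex F and the edge names N, O, and S come from the example above.

```python
import threading

class AtomicCounter:
    """Lock-based stand-in for a hardware fetch-and-add counter (assumption)."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, delta=1):
        # Return the old value and add delta as one indivisible step.
        with self._lock:
            old = self._value
            self._value += delta
            return old

def claim_row(start_offset, counter):
    """Row of the reverse source array for the next edge of one target vertex."""
    return start_offset + counter.fetch_add(1)

# Edges terminating at target vertex F start at reverse edge position 1.
counter_f = AtomicCounter()
start_f = 1
rows = {}

def fill(edge_name):
    rows[edge_name] = claim_row(start_f, counter_f)

fill("N")  # edge N is filled first, as in the example
workers = [threading.Thread(target=fill, args=(e,)) for e in ("O", "S")]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Whichever thread runs first, edges O and S receive distinct adjacent rows after edge N, so no row is shared and no row is left empty.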

[0159] 6.0 Exemplary Embodiment

[0160] This is an exemplary embodiment based on a modern relational DBMS (RDBMS) such as Oracle. This embodiment refines the previous examples, which may further explain this embodiment. The following explanation of this embodiment is therefore abridged to emphasize the refinements. Limitations that are described as requirements of this embodiment need not be requirements of the previous examples.

[0161] Vertices and edges can be persisted as table rows in the RDBMS database. Each row can contain natural identifiers, such as primary keys. Each row can contain or be associated with dense identifiers, such as monotonically increasing sequence numbers. In this embodiment, some identifiers are native to the RDBMS. Variations of implementations that may or may not rely on native identifiers will be presented later in this document. Below are two example ways in which the RDBMS can provide native identifiers.

[0162] One approach is used in main-memory columnar databases, such as SAP HANA, Actian Vector, and Vertica. These main-memory database systems typically already have first-class identifiers that can be used to access specific values in data columns. These identifiers are sequential and start from zero, and thus they are dense identifiers.

[0163] Oracle Database takes a different, hybrid approach to storage, as follows. Data is primarily stored as rows on disk (i.e., row-major) but can optionally be cached in a columnar repository in main memory (i.e., column-major). In this case, a first-class identifier starting from zero is not permanently associated with each row. Instead, each row is permanently associated with a corresponding disk address, whose values can be non-contiguous and which is thus a sparse identifier. For example, Oracle has ROWID.

[0164] A forward CSR structure can be used to provide data for creating a reverse CSR. In this embodiment, a forward CSR consists of two arrays. In this embodiment, the CSR structure requires more data within the context of the RDBMS, as shown below.

[0165] The source and destination arrays of a forward CSR are assumed to be segmented, which gives the RDBMS better control over memory management. The segmented arrays are broken down into blocks that can be used as units of parallelization.

[0166] When the edge type has the same vertex type for both the source and destination vertex types, the forward source array contains the offset (DSTOFF) to the forward destination array, and the forward destination array contains the offset (SRCOFF) to the forward source array. Since the positions in the forward source array, i.e., SRCOFF, start from 0 and are ordered, they are equal to the corresponding DENSEID in the vertex table. Variant 1 below can be applied to RDBMS where DENSEID is not available.

[0167] The element i in the forward source array points to the index at the beginning of the list of outer neighbors of the source vertex i in the forward destination array, and the element i+1 in the forward source array points to the index in the forward destination array after which the list of outer neighbors of the source vertex i ends.

[0168] In addition to the indices from the forward source array, the forward destination array also stores the DENSEID of the rows in the edge table, called the EDGEID value.
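
As an illustration of the forward CSR layout described above, the following Python sketch builds the two arrays for a small hypothetical graph and lists the outer neighbors of a vertex. The graph, the names DstEntry, fwd_src, fwd_dst, and outer_neighbors, and the extra sentinel element at the end of the source array (so that element i+1 exists for the last vertex) are assumptions of this sketch, not details taken from the embodiment.

```python
from typing import NamedTuple

class DstEntry(NamedTuple):
    srcoff: int  # position of the edge's destination vertex in the source array
    edgeid: int  # DENSEID of the edge's row in the edge table

# Hypothetical graph with edges 0->1, 0->2, 1->2, 2->0 (edge rows 0..3).
# fwd_src[i] is the DSTOFF of source vertex i; the last element is a sentinel.
fwd_src = [0, 2, 3, 4]
fwd_dst = [DstEntry(1, 0), DstEntry(2, 1), DstEntry(2, 2), DstEntry(0, 3)]

def outer_neighbors(v):
    """Entries of the forward destination array for the edges leaving vertex v."""
    begin, end = fwd_src[v], fwd_src[v + 1]
    return fwd_dst[begin:end]

neighbors_of_0 = [entry.srcoff for entry in outer_neighbors(0)]  # [1, 2]
```

Because SRCOFF equals the DENSEID of the destination vertex in this homogeneous case, each entry leads directly to both the neighboring vertex row and the edge row.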

[0169] In this exemplary embodiment, some arrays (such as the destination array in a forward CSR) contain pairs of elements sharing the same index. This can be achieved by using a composite data type.

[0170] In this exemplary embodiment, the reverse CSR data structure is filled as follows. The reverse CSR data structure is more or less identical to the forward CSR data structure, except that the source array and the destination array function in reverse, because reverse CSR makes it possible to quickly find inner neighbors rather than outer neighbors:

[0171] The element i in the reverse destination array points to the index at the beginning of the inner neighbor list of the target vertex i in the reverse source array, and the element i+1 in the reverse destination array points to the index in the reverse source array after which the inner neighbor list for the destination vertex i ends.

[0172] In addition to the index from the reverse destination array, the reverse source array also stores the DENSEID of the rows in the edge table, called the EDGEID value.

[0173] In this exemplary embodiment, if a forward CSR is already available, the reverse CSR is constructed directly from it. The construction of the reverse CSR is completed in steps 1-3.

[0174] Step 1 allocates the full and/or final-sized but unfilled reverse destination and reverse source arrays. When the edges have the same vertex type at both ends, the reverse destination array has the same number of elements as the source array of the forward CSR (the number of rows in the vertex table). This means the size of the reverse destination array is known and the array can be allocated directly. Similarly, the reverse source array has the same number of elements as the destination array of the forward CSR (the number of rows in the edge table), meaning it can also be allocated immediately. For both reverse arrays, the allocation of the individual blocks can be parallelized if the memory allocator supports it.
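
A minimal Python sketch of step 1 follows, assuming that a segmented array can be modeled as a list of fixed-size blocks. The function name allocate_segmented, the block size, and the row counts are hypothetical values chosen only for illustration.

```python
def allocate_segmented(num_elements, block_size, fill=0):
    """Allocate a segmented array as a list of blocks; the last block may be shorter."""
    return [
        [fill] * min(block_size, num_elements - start)
        for start in range(0, num_elements, block_size)
    ]

# Sizes are known up front from the forward CSR (hypothetical values).
num_vertex_rows = 5   # elements in the forward source array
num_edge_rows = 7     # elements in the forward destination array
BLOCK_SIZE = 4

rev_dst = allocate_segmented(num_vertex_rows, BLOCK_SIZE)          # blocks of 4 and 1
rev_src = allocate_segmented(num_edge_rows, BLOCK_SIZE, fill=None)  # blocks of 4 and 3
```

Each block is an independent allocation, which is what would allow the blocks to be allocated by separate threads where the allocator permits.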

[0175] Step 2 populates the reverse CSR by calculating the offsets into the array, as shown below. Step 2 calculates the SRCOFF values stored in the reverse destination block. These values are offsets in the reverse source array at which the inner neighbors of each destination vertex begin. The SRCOFF value can be derived from the number of inner neighbors of each vertex, since the difference between SRCOFF i+1 and SRCOFF i is the number of inner neighbors that vertex i has. The number of inner neighbors can be looked up from the persistent relation table via SQL operations; however, the efficiency of Step 2 is improved because the in-memory forward CSR is available for filling. Afterward, a running sum of the number of inner neighbors needs to be calculated to obtain the SRCOFF value. Where possible, using multithreading to improve performance is desirable.

[0176] Step 2 is accomplished through the following four sub-steps, a-d. Sub-step a is multi-threaded and computes the number of incoming edges for each vertex using the forward CSR. All values in the destination array of the reverse CSR are initialized to 0. Threads are then spawned, each working concurrently on a single corresponding block of the source array of the forward CSR. Each thread follows each outgoing edge from each source vertex in its block and increments the element in the destination array of the reverse CSR that corresponds to the destination of that outgoing edge. Since multiple threads can increment the same value simultaneously, atomic instructions provided by the hardware are used to perform the increments. For example, the first element in destination block 2 has 3 inner neighbors, which are source vertices that are scattered and may come from different blocks. This means that these three increments can come from multiple threads, and atomic operations are necessary to avoid lost updates.
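
Sub-step a can be sketched in Python as follows. This is only an illustration under stated assumptions: the forward CSR is a small hypothetical graph stored in flat lists with a sentinel at the end of fwd_src, one task is submitted per block of source vertices, and a per-element lock stands in for the hardware atomic increment, which Python does not expose.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical forward CSR: edges 0->1, 0->2, 1->2, 2->0, 2->2, 3->2.
fwd_src = [0, 2, 3, 5, 6]        # DSTOFF per source vertex, plus a sentinel
fwd_dst = [1, 2, 2, 0, 2, 2]     # destination vertex (SRCOFF) of each edge
num_vertices = len(fwd_src) - 1
block_size = 2

in_degree = [0] * num_vertices   # reverse destination array, initialized to 0
locks = [threading.Lock() for _ in range(num_vertices)]  # emulate atomic increments

def count_block(first_vertex):
    """Count incoming edges contributed by one block of source vertices."""
    last_vertex = min(first_vertex + block_size, num_vertices)
    for v in range(first_vertex, last_vertex):
        for pos in range(fwd_src[v], fwd_src[v + 1]):
            d = fwd_dst[pos]
            with locks[d]:       # other blocks may target the same vertex
                in_degree[d] += 1

with ThreadPoolExecutor() as pool:
    list(pool.map(count_block, range(0, num_vertices, block_size)))
```

In this example vertex 2 receives increments from both blocks, which is the situation in which unsynchronized increments could lose updates.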

[0177] Sub-step b is multi-threaded and calculates a running sum for each block from the numbers of incoming edges. In this sub-step, each thread works on one block of the destination array of the reverse CSR at a time. A local, zero-based running sum is calculated for each block. For example, the first three elements of destination block 1 have 3, 2, and 1 inner neighbors, respectively. After sub-step b, the first four elements of destination block 1 are 0, 3 (= 0 + 3), 5 (= 3 + 2), and 6 (= 5 + 1). The last value calculated for the block, 2398 (i.e., the last value presented earlier in this document), is not part of the block: it is the total number of inner neighbors in the block, and it needs to be the first value of the next block at the end of step 2 (i.e., the first new value presented earlier in this document). At this point, this value is stored in a field of the block's metadata, referred to below as LASTVAL and earlier in this document as the last value.
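
The local running sum of sub-step b can be sketched in Python as below. The counts 3, 2, and 1 are taken from the example above, but the three-element block size, the second block, and the function name local_running_sum are assumptions made for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def local_running_sum(block):
    """Replace counts with a zero-based running sum in place; return LASTVAL."""
    total = 0
    for i, count in enumerate(block):
        block[i] = total
        total += count
    return total  # total number of inner neighbors in this block

# Two hypothetical blocks of inner-neighbor counts produced by sub-step a.
blocks = [[3, 2, 1], [4, 0, 2]]

# Blocks are independent, so each one can be handled by a different thread.
with ThreadPoolExecutor() as pool:
    lastvals = list(pool.map(local_running_sum, blocks))
```

After the call, the first block holds 0, 3, 5 and its LASTVAL is 6, which is the value that the next block will eventually start from.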

[0178] Sub-step c is single-threaded and calculates the block offsets, as follows. The CHUNKOFF value is also stored in the block's metadata and represents the offset at which the block begins. It is calculated as a zero-based, block-level running sum of the LASTVAL values: the CHUNKOFF value of destination block 0 is zero, the CHUNKOFF value of destination block 1 is equal to the LASTVAL of destination block 0, and the CHUNKOFF value of destination block i (i>1) is equal to the sum of the CHUNKOFF and LASTVAL of destination block i-1. After the CHUNKOFF values are calculated, the LASTVAL values can be discarded. In the illustrated example, whose blocks are numbered from one, the CHUNKOFF value of destination block 2 is set to the LASTVAL of destination block 1, which is 2398, and the CHUNKOFF value of destination block 3 is set to the sum of the CHUNKOFF and LASTVAL of destination block 2, which is 3682 (= 2398 + 1284). Note that sub-step c does not need to wait until sub-step b has completely finished: the block-level running sum for destination block i can be calculated as soon as sub-step b is completed for the preceding destination blocks (0 to i-1).
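
A short Python sketch of sub-step c follows. The LASTVAL values 2398 and 1284 come from the example above; the third LASTVAL of 500 and the function name chunk_offsets are hypothetical.

```python
def chunk_offsets(lastvals):
    """Zero-based, block-level running sum of LASTVAL values (one CHUNKOFF per block)."""
    offsets = []
    running = 0
    for lastval in lastvals:
        offsets.append(running)   # offset at which this block begins
        running += lastval
    return offsets

# LASTVAL of the first three blocks; the third value is a made-up placeholder.
chunkoff = chunk_offsets([2398, 1284, 500])
# chunkoff == [0, 2398, 3682]
```

The loop is inherently sequential, but it performs only one addition per block, which is consistent with running it on a single thread.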

[0179] Sub-step d is multi-threaded and is used to compute the final SRCOFF value. The final SRCOFF value can be computed on a per-block basis using multiple threads: each thread simply adds the CHUNKOFF value to each element in the block. This operation can be hardware-vectorized, such as using SIMD. At the end of this sub-step, the CHUNKOFF value can be discarded. Elements from destination block 1 remain unchanged because the first block has no offset, but 2398 is added to all elements from destination block 2 because it is the CHUNKOFF value used for that block.
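
Sub-step d can be sketched in Python as follows, together with a check that the blocked result equals a plain sequential running sum. The inner-neighbor counts, the three-element block size, and all names are assumptions of this sketch; the blocks and chunkoff lists are written out by hand as the state that sub-steps b and c would leave behind for those counts.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate

# Hypothetical inner-neighbor counts for seven destination vertices.
in_degrees = [3, 2, 1, 4, 0, 2, 1]

# State after sub-steps b and c for blocks of three elements.
blocks = [[0, 3, 5], [0, 4, 4], [0]]
chunkoff = [0, 6, 12]

def add_offset(task):
    """Add one block's CHUNKOFF to every element of that block."""
    block, offset = task
    for i in range(len(block)):
        block[i] += offset   # same addend for every element, so it vectorizes well

with ThreadPoolExecutor() as pool:
    list(pool.map(add_offset, zip(blocks, chunkoff)))

srcoff = [value for block in blocks for value in block]
expected = [0] + list(accumulate(in_degrees))[:-1]   # sequential reference result
```

The first block is unchanged because its offset is zero, and every later block is shifted by the number of inner neighbors that precede it.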

[0180] After the four sub-steps a-d of step 2, the reverse destination blocks contain their final values. Step 3 has sub-steps a-b. To ensure good performance, the reverse source blocks are populated by multiple threads, some of which may add different EDGEID (e.g., edge offsets into the reverse source array) and DSTOFF values for the same destination vertex. To handle conflicts, it is necessary to keep track of how many EDGEID/DSTOFF values have been inserted for each element during this step. For this purpose, a new segmented array named cur_pos is allocated and initially filled with zeros during step 3a of constructing the reverse CSR. The values in cur_pos are atomically incremented using atomic instructions from the hardware.

[0181] Step 3b completes the reverse CSR filling, which involves filling the reverse source blocks with the EDGEID and DSTOFF values of the incoming edges and inner neighbors. This sub-step again makes full use of the forward CSR. Threads are spawned or repurposed, each working on a block from the source array of the forward CSR. Each thread follows each outgoing edge from each source vertex of the block it is currently processing, and each time, the thread adds a new (EDGEID, DSTOFF) pair to the inner neighbor list of the destination vertex to which the outgoing edge leads. The position in the source array of the reverse CSR where the (EDGEID, DSTOFF) pair should be inserted is calculated as the sum of the SRCOFF of the destination vertex reached by the outgoing edge and that vertex's cur_pos value, which counts the pairs previously inserted into its inner neighbor list. The EDGEID value is copied from the destination array of the forward CSR, and the DSTOFF value corresponds to the position of the source vertex in the source array of the forward CSR. The cur_pos value is atomically incremented when it is read for this operation. Atomic increments are necessary to avoid conflicts with other threads that might attempt to write EDGEID/DSTOFF values for the same destination vertex in the reverse CSR. After all EDGEID/DSTOFF values have been filled, the cur_pos array is no longer needed and can therefore be freed.
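
Step 3 can be sketched in Python as follows, under several assumptions: the forward CSR is a small hypothetical graph in flat lists with a sentinel at the end of fwd_src, rev_dst already holds the SRCOFF values that step 2 would produce for that graph, and a per-vertex lock emulates the atomic read-and-increment of cur_pos.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical forward CSR: edges 0->1, 0->2, 1->2, 2->0, 2->2, 3->2.
fwd_src = [0, 2, 3, 5, 6]                                   # DSTOFF plus sentinel
fwd_dst = [(1, 0), (2, 1), (2, 2), (0, 3), (2, 4), (2, 5)]  # (SRCOFF, EDGEID)
rev_dst = [0, 1, 2, 6]                                      # SRCOFF values from step 2
num_vertices = 4
block_size = 2

rev_src = [None] * len(fwd_dst)   # to be filled with (EDGEID, DSTOFF) pairs
cur_pos = [0] * num_vertices      # step 3a: per-vertex insertion counters
locks = [threading.Lock() for _ in range(num_vertices)]

def fill_block(first_vertex):
    """Step 3b for one block of source vertices of the forward CSR."""
    last_vertex = min(first_vertex + block_size, num_vertices)
    for s in range(first_vertex, last_vertex):
        for pos in range(fwd_src[s], fwd_src[s + 1]):
            d, edgeid = fwd_dst[pos]
            with locks[d]:                     # read and increment as one step
                slot = rev_dst[d] + cur_pos[d]
                cur_pos[d] += 1
            rev_src[slot] = (edgeid, s)        # DSTOFF is the source vertex position

with ThreadPoolExecutor() as pool:
    list(pool.map(fill_block, range(0, num_vertices, block_size)))
```

The order of the pairs within one inner neighbor list depends on thread scheduling, so only the set of pairs per destination vertex is deterministic.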

[0182] After completing the three steps 1-3 above, the reverse CSR is fully populated. Since this is done entirely in memory using multiple threads, the overhead of building a reverse CSR is much lower than that of building a forward CSR via SQL queries, as discussed later herein.

[0183] The algorithm described above can be used with any number of spawned threads, including a single thread, in which case it is sequential. However, note that since each thread always works on one block at a time, using more threads than there are blocks in the destination array will result in some threads being idle. Therefore, a number of threads between 1 and the number of blocks in the destination array should be used. Choosing the exact number of threads to use is a logistical problem that depends on the hardware, machine usage, and decisions regarding various trade-offs.

[0184] The following is a comparison of filling forward versus reverse CSR data structures. The steps for efficiently constructing a forward CSR from the source and destination relational tables are as follows (I-III), particularly as presented in related U.S. patent application 16/747,827.

[0185] Similar to reverse CSR filling, in step I, the source and destination arrays need to be allocated. In reverse CSR filling, which relies on a pre-existing forward CSR, the amount of memory to allocate for these arrays is already known, but that approach cannot be used to fill a forward CSR. The number of elements in the forward source and destination arrays is equal to the number of rows in the source vertex and edge relation tables, respectively. If the RDBMS does not cache the table sizes and the source and destination vertices have the same vertex type, the following two SQL queries need to be run to retrieve these row counts:

[0186] select count(*) from vertex; select count(*) from edge;

[0187] Afterwards, memory allocation can be performed. Block allocation can be done in parallel, similar to the reverse CSR.

[0188] In step II, the DSTOFF value in the source array must be calculated. As a reminder, in step 2 of the reverse CSR fill above, filling the destination array with SRCOFF values involved using the forward CSR to find the number of inner neighbors for each destination vertex. This method cannot be used here to find the number of outer neighbors for each source vertex. Instead, the following query can be used to find the number of outer neighbors for each source vertex:

[0189] select src, count(*) from edge where edge.rowid group by src order by src;

[0190] A filter for the range of rows from the edge table can be added to the query to split the work across multiple processes. After this operation, the running sum of degrees can be computed in parallel, similar to the operation done on in-degrees in reverse CSR population.
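
Step II can be illustrated with Python's built-in sqlite3 module, purely as a stand-in for the RDBMS. The table contents, the assumption that the src column already holds dense vertex positions, and the use of "rowid between ? and ?" as the row-range filter are choices made for this sketch and are not taken from the embodiment.

```python
import sqlite3
from itertools import accumulate

conn = sqlite3.connect(":memory:")
conn.execute("create table edge (src integer, dst integer)")
conn.executemany(
    "insert into edge values (?, ?)",
    [(0, 1), (0, 2), (1, 2), (2, 0), (2, 2), (3, 2)],
)
num_vertices = 4  # would come from: select count(*) from vertex

# Each row range could be handled by a separate process; here they run in turn.
row_ranges = [(1, 3), (4, 6)]  # SQLite rowids start at 1
out_degree = [0] * num_vertices
for low, high in row_ranges:
    query = (
        "select src, count(*) from edge "
        "where rowid between ? and ? group by src order by src"
    )
    for src, count in conn.execute(query, (low, high)):
        out_degree[src] += count   # partial counts from different ranges add up

# DSTOFF values are the running sum of the out-degrees (with a final sentinel).
dstoff = [0] + list(accumulate(out_degree))
```

A source vertex with no outgoing edge never appears in the query result, which is why the counts are accumulated into a zero-initialized list.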

[0191] In step III, the outer neighbors of each source vertex need to be found to populate the destination array. In step 2 of the reverse CSR filling, the inner neighbors of each destination vertex were found by fully utilizing the forward CSR, but this method cannot be used here either. Instead, a double JOIN query needs to be run to find the outer neighbors:

[0192] select src.rowid, dst.rowid, etab.rowid from vertex src, vertex dst, edge etab where src.key = etab.src and dst.key = etab.dst where etab.rowid;

[0193] Furthermore, a range filter for rows can be added to the query so that multiple processes can run a portion of it. Concurrency handling when populating the destination array can be similar to what is done for the reverse CSR when populating the source array, i.e., using cur_pos segmented arrays and atomic instructions.
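
Step III can likewise be sketched with sqlite3 as a stand-in for the RDBMS. The tables, the key values, the precomputed dstoff list, and the conversion of SQLite's one-based rowid into a zero-based position are assumptions of this sketch; the row-range filter is omitted for brevity, and the loop is sequential, so cur_pos needs no atomic instructions here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table vertex (key text)")
conn.execute("create table edge (src text, dst text)")
conn.executemany("insert into vertex values (?)", [("a",), ("b",), ("c",)])
conn.executemany(
    "insert into edge values (?, ?)", [("a", "b"), ("a", "c"), ("c", "a")]
)

# Double join: resolve both endpoints of every edge to vertex rows.
rows = conn.execute(
    "select src.rowid, dst.rowid, etab.rowid "
    "from vertex src, vertex dst, edge etab "
    "where src.key = etab.src and dst.key = etab.dst"
).fetchall()

dstoff = [0, 2, 2, 3]            # from step II: out-degrees 2, 0, 1, plus sentinel
fwd_dst = [None] * len(rows)     # forward destination array of (SRCOFF, EDGEID)
cur_pos = [0] * 3                # per-source-vertex insertion counters

for src_rowid, dst_rowid, edge_rowid in rows:
    s = src_rowid - 1                        # zero-based position (assumption)
    slot = dstoff[s] + cur_pos[s]
    cur_pos[s] += 1
    fwd_dst[slot] = (dst_rowid - 1, edge_rowid - 1)
```

The join may return rows in any order, so the entries within one outer neighbor list are not in a guaranteed order.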

[0194] Steps II-III of filling the forward CSR use ORDER BY and a double JOIN, which are expensive operations. Below are important variants 1-2, which are suitable for different RDBMS row identification schemes and/or different constraints regarding the graph topology.

[0195] Variant 1 is used when the RDBMS does not provide a DENSEID identifier. The reverse CSR fill algorithm described above expects the RDBMS to provide a sequential identifier for each table, starting from 0, i.e., DENSEID. Variant 1 handles the case where the database only provides a non-sequential identifier that can start from any value. That identifier is SPARSEID.

[0196] While forward and reverse CSRs can be constructed as described above even if the DENSEID identifier is unavailable, their usefulness will be limited because it's impossible to identify the row in the vertex table corresponding to a position in the source array of the forward CSR or the destination array of the reverse CSR. This means it's impossible to access vertex properties. To address this, in variant 1, the source array of the forward CSR stores SRCIDs, i.e., the SPARSEID for each source vertex, and the destination array of the reverse CSR stores DSTIDs, i.e., the SPARSEID for each destination vertex. The reason DENSEIDs don't do this is that they are equal to their indices in the array and can therefore be inferred. Although EDGEIDs in this variant store SPARSEIDs instead of DENSEIDs, they can be used to access edge properties in the same way as above.

[0197] The reverse CSR filling described above needs adjustment for use with variant 1. In step 1 above, the destination array of the reverse CSR will be larger because it also needs to accommodate the DSTIDs. In step 2, the DSTIDs in the destination array of the reverse CSR can be directly copied from the SRCIDs in the source array of the forward CSR, in the same order, provided the graph is homogeneous, meaning the graph has only one vertex type and one edge type. In other words, in a homogeneous graph, the edge type has the same vertex type for both the source and destination vertices.
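
The variant 1 adjustment for a homogeneous graph can be sketched in Python as follows. The SPARSEID values, the offsets, and the names SrcEntry and RevDstEntry are hypothetical; the sketch only shows the DSTIDs being copied from the SRCIDs in the same order.

```python
from typing import NamedTuple

class SrcEntry(NamedTuple):
    srcid: int    # SPARSEID of the source vertex row
    dstoff: int   # offset into the forward destination array

class RevDstEntry(NamedTuple):
    dstid: int    # SPARSEID of the destination vertex row
    srcoff: int   # offset into the reverse source array

# Forward source array of a hypothetical three-vertex graph.
fwd_src = [SrcEntry(1007, 0), SrcEntry(1012, 2), SrcEntry(1050, 3)]

# SRCOFF values as they would result from step 2 (made-up numbers).
srcoffs = [0, 1, 1]

# Same vertex table on both ends, so position i denotes the same vertex row.
rev_dst = [
    RevDstEntry(entry.srcid, off) for entry, off in zip(fwd_src, srcoffs)
]
```

With the SPARSEID stored next to each offset, a position in either array can be mapped back to its row in the vertex table.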

[0198] Variant 2 is used when edge types can have different source and destination vertex types. The reverse CSR filling described above assumes the graph is homogeneous, i.e., it contains a single vertex table and a single edge table. An RDBMS can also support heterogeneous graphs, i.e., graphs with multiple vertex tables and/or multiple edge tables. In a heterogeneous graph, the source and destination columns in the edge table can point to two different vertex tables. Heterogeneous graphs offer several advantages, particularly in terms of performance. Heterogeneous graphs also affect forward CSRs, as follows.

[0199] CSRs for different edge types can be daisy-chained, as follows. The SRCOFF values in the destination array of a forward CSR might not identify elements in the source array of that forward CSR; instead, they can identify elements in the source array of another forward CSR. Similarly, the DSTOFF values in the source array of a reverse CSR might not identify elements in the destination array of that reverse CSR; instead, they can identify elements in the destination array of another reverse CSR. In this way, CSR pairs can be used as chains of operations for traversing paths of multiple edge types.
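
Daisy-chaining can be illustrated with a small Python sketch of a two-hop forward traversal. The two edge types (A to B, then B to C), all array contents, the sentinel elements, and the assumption that the last destination array holds positions of rows in the C table are hypothetical and serve only to show one CSR's SRCOFF values indexing another CSR's source array.

```python
# Edge type 1 connects vertex table A to vertex table B.
ab_src = [0, 2, 3]     # DSTOFF for two A vertices, plus sentinel
ab_dst = [0, 1, 1]     # SRCOFF values that index into bc_src, not ab_src

# Edge type 2 connects vertex table B to vertex table C.
bc_src = [0, 1, 3]     # DSTOFF for two B vertices, plus sentinel
bc_dst = [2, 0, 1]     # positions of rows in the C table (assumption)

def two_hop(a):
    """C rows reachable from A vertex a via one edge of each type."""
    reached = []
    for i in range(ab_src[a], ab_src[a + 1]):
        b = ab_dst[i]                          # hop into the second CSR
        for j in range(bc_src[b], bc_src[b + 1]):
            reached.append(bc_dst[j])
    return reached

from_a0 = two_hop(0)   # [2, 0, 1]
```

No lookup table is needed between the hops because the offset stored by the first CSR is already a position in the second CSR's source array.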

[0200] Graph queries can traverse various types of edges and vertices to find a solution. Some types of edges and vertices may be related only to intermediate values, and not to the final destination vertex of the query. The last edge in a graph traversal may have a special CSR encoding, and those edges are referred to herein as leaves. Therefore, daisy-chained CSRs can exist for one or more specific queries or one or more patterns. For example, different queries traversing the same edge type may or may not share the same CSR pairs of that edge type. Thus, an edge type can have multiple CSR pairs for different contextual uses.

[0201] Whether a CSR in a pair is a leaf depends on: a) whether the edge type of the CSR pair is the last edge type in the traversal, and b) the direction of the traversal. Here, the last edge type is based on the direction of the traversal, which may be the same as or different from the direction of the edges of the edge type. For forward traversal, the destination array of the forward CSR of the last edge type encodes the last edge.

[0202] For reverse traversal, the last edge type in the traversal is somewhat counterintuitive; it's the first edge type found in the path. The source array of the reverse CSR for the first edge type found in the path encodes the last traversed edge to identify those paths, since the paths are traversed backwards.

[0203] For forward traversal, the destination array of the forward CSR for a leaf edge has a special encoding. Similarly, for reverse traversal, the source array of the reverse CSR for a leaf edge has the following special encoding.

[0204] If a forward CSR is used for a leaf edge, its SRCOFF will be replaced with the identifier of the row in the destination table (DENSEID for the standard algorithm and SPARSEID for variant 1). Similarly, if a reverse CSR is used for a leaf edge, its DSTOFF will be replaced with the identifier of the row in the source table (DENSEID for the standard algorithm and SPARSEID for variant 1).

[0205] Variant 2 handles heterogeneous cases of reverse CSR filling. The following are modifications to the algorithm described above. Step 1 needs to be modified because the size of the destination array of the reverse CSR may not be equal to the size of the source array of the corresponding forward CSR. Instead, the following modifications may be necessary.

[0206] If the forward CSR is not a leaf, then the destination array of the reverse CSR has the same number of elements as the source array whose elements are identified by the SRCOFF values in the destination array of the forward CSR. If the reverse CSR is a leaf in variant 1, then the size of the elements can be different because DSTOFF is replaced with a SPARSEID rather than a DENSEID, and a SPARSEID can be larger than an offset.

[0207] If the forward CSR is a leaf, its destination array will not identify elements from the source array of another CSR, but will instead identify rows directly from the destination table: the number of SRCOFFs (and DSTIDs in variant 1) in the destination array of the reverse CSR is equal to the number of rows in that table. If the number of rows in the destination table is not cached, it can be retrieved via an SQL query.

[0208] Regarding step 2, there are two scenarios. In the standard implementation, step 2 requires no modification. Note that the destination offset caused by the forward CSR is within the range of the source array of the reverse CSR created in step 1.

[0209] If variant 2 is combined with variant 1, then in step 2, the DSTID in the destination array of the reverse CSR is a copy of the SRCID of the source array element that is identified by the SRCOFF in the destination array of the corresponding forward CSR.

[0210] Step 3 in Variant 2 remains unchanged. Variant 2 is suitable for Oracle databases. In this embodiment, DENSEID is implemented in the main memory columnar repository (e.g., as a volatile identifier) and does not support the creation of forward and reverse CSRs for tables not loaded into memory.

[0211] 7.0 Database Overview

[0212] Embodiments of the present invention are used in the context of a database management system (DBMS). Therefore, a description of an example DBMS is provided.

[0213] Generally speaking, a server, such as a database server, is a combination of integrated software components and the allocation of computing resources, such as memory, nodes, and processes on those nodes for executing the integrated software components. This combination of software and computing resources is dedicated to providing specific types of functionality on behalf of clients. A database server controls and facilitates access to a specific database, handling client requests to access the database.

[0214] Users interact with the database server of a DBMS by submitting commands to the database server, instructing it to perform operations on the data stored in the database. A user can be one or more applications running on a client computer that interact with the database server. Multiple users may also be collectively referred to herein as a user.

[0215] A database consists of data and a database dictionary, stored on a persistent storage mechanism such as a set of disks. Each database is defined by its own separate database dictionary. The database dictionary contains metadata that defines the database objects contained within the database. In fact, the database dictionary defines many aspects of the database. Database objects include tables, table columns, and tablespaces. A tablespace is a collection of one or more files used to store data for various types of database objects, such as tables. If data for database objects is stored in tablespaces, then the database dictionary maps the database objects to one or more tablespaces that hold the data for those database objects.

[0216] The DBMS refers to the database dictionary to determine how to execute database commands submitted to the DBMS. Database commands can access database objects defined by the dictionary.

[0217] Database commands can take the form of database statements. For a database server to process database statements, those statements must conform to a database language supported by the database server. A non-limiting example of a database language supported by many database servers is SQL, including proprietary forms of SQL supported by database servers such as Oracle (e.g., Oracle Database 11g). SQL Data Definition Language (“DDL”) instructions are issued to the database server to create or configure database objects, such as tables, views, or complex types. Data Manipulation Language (“DML”) instructions are issued to the DBMS to manage data stored within database structures. For example, SELECT, INSERT, UPDATE, and DELETE are common examples of DML instructions in some SQL implementations. SQL / XML is a common extension of SQL used when manipulating XML data in object-relational databases.

[0218] A multi-node database management system consists of interconnected nodes that share access to the same database. Typically, nodes are interconnected via a network and share access to shared storage devices to varying degrees, such as shared access to a collection of disk drives and the blocks of data stored thereon. Nodes in a multi-node database system can take the form of a group of computers (such as workstations and / or personal computers) interconnected via a network. Alternatively, nodes can be nodes in a grid, which consists of nodes in the form of server blades interconnected with other server blades on a rack.

[0219] In a multi-node database system, each node hosts a database server. A server (such as a database server) is a combination of integrated software components and computing resources (such as memory, nodes, and processes on the nodes for executing the integrated software components on the processor), a combination of software and computing resources dedicated to performing specific functions on behalf of one or more clients.

[0220] Resources from multiple nodes in a multi-node database system can be allocated to run the software of a specific database server. Each combination of software and resource allocation across nodes is referred to herein as a "server instance" or "instance". A database server may include multiple database instances, some or all of which may run on separate computers, including separate server blades.

[0221] 7.1 Query Processing

[0222] A query is an expression, command, or set of commands that, when executed, causes a server to perform one or more operations on a dataset. A query can specify one or more source data objects from which to determine one or more result sets, such as one or more tables, one or more columns, one or more views, or one or more snapshots. For example, one or more source data objects can appear in the FROM clause of a Structured Query Language (“SQL”) query. SQL is a well-known example language for querying database objects. As used herein, the term “query” refers to any form of query, including queries in the form of database statements and any data structure used for internal query representation. The term “table” refers to any source object that is referenced or defined by a query and represents a collection of rows (such as a database table, view, or inline query block, such as an inline view or subquery).

[0223] A query can perform operations on data from the source data object(s) row by row as the object(s) are loaded, or on the entire source data object(s) after the object(s) have been loaded. A result set generated by some operation(s) can be made available to other operation(s), and in this way the result set can be filtered or narrowed based on certain criteria, and / or joined or combined with other result set(s) and / or other source data object(s).

[0224] A subquery is a part or component of a query that is distinct from the other parts or components of the query and can be evaluated separately from the other parts or components of the query (i.e., as a separate query). The other parts or components of the query can form an outer query, which may or may not include other subqueries. Subqueries nested within an outer query can be evaluated individually once or multiple times, while the results are computed for the outer query.
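As a minimal illustration of the terms above, the following Python sketch runs an outer query whose FROM clause names one source data object and whose WHERE clause contains a subquery that could also be evaluated as a separate query. The table names `person` and `knows`, their columns, and the sample rows are hypothetical and chosen only for this example; SQLite stands in for the database server.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the terminology.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (src INTEGER, dst INTEGER);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO knows VALUES (1, 2), (1, 3), (2, 3);
""")

# "person" is the source data object in the FROM clause of the outer query.
# The parenthesized SELECT is a subquery: it is distinct from the rest of
# the statement and could be run on its own as a separate query.
rows = conn.execute("""
    SELECT name
    FROM person
    WHERE id IN (SELECT dst FROM knows WHERE src = 1)
    ORDER BY name
""").fetchall()

names = [row[0] for row in rows]
print(names)
```

Here the result set of the subquery narrows the result set of the outer query, which is the filtering behavior described in paragraph [0223].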

[0225] Generally, a query parser receives a query statement and generates an internal query representation of the query statement. Typically, the internal query representation is a collection of interconnected data structures that represent the various components and structure of the query statement.

[0226] The internal query representation can take the form of a node graph, where each interconnected data structure corresponds to a node and a component of the query statement it represents. The internal representation is typically generated in memory for evaluation, manipulation, and transformation.
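The node-graph form of an internal query representation can be pictured with a toy data structure. The sketch below is an assumption made for illustration, not the representation used by any particular database server: the class `QueryNode`, its fields, and the helper `count_kind` are hypothetical names, and the tree is built by hand rather than by a real parser.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryNode:
    """One node of a toy in-memory query representation (hypothetical)."""
    kind: str                      # e.g. "query", "select", "from", "where"
    text: str = ""                 # the statement fragment the node stands for
    children: List["QueryNode"] = field(default_factory=list)


def count_kind(node: QueryNode, kind: str) -> int:
    """Walk the node graph and count the nodes of the given kind."""
    own = 1 if node.kind == kind else 0
    return own + sum(count_kind(child, kind) for child in node.children)


# Hand-built representation of:
#   SELECT name FROM person WHERE id IN (SELECT dst FROM knows)
subquery = QueryNode("subquery", children=[
    QueryNode("select", "dst"),
    QueryNode("from", "knows"),
])
outer = QueryNode("query", children=[
    QueryNode("select", "name"),
    QueryNode("from", "person"),
    QueryNode("where", "id IN", [subquery]),
])

print(count_kind(outer, "from"), count_kind(outer, "subquery"))
```

Because each component of the statement is its own node, a transformation step can inspect or rewrite one part, such as the subquery, without touching the others.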

[0227] Hardware Overview

[0228] According to one embodiment, the techniques described herein are implemented by one or more dedicated computing devices. The dedicated computing device may be hardwired to execute these techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs) persistently programmed to execute these techniques, or may include one or more general-purpose hardware processors programmed to execute these techniques according to program instructions in firmware, memory, other storage devices, or a combination thereof. Such a dedicated computing device may also combine custom hardwired logic, ASICs, or FPGAs with custom programming to implement these techniques. The dedicated computing device may be a desktop computer system, a portable computer system, a handheld device, a networking device, or any other device that combines hardwired and / or program logic to implement these techniques.

[0229] For example, Figure 7 is a block diagram illustrating a computer system 700 on which embodiments of the present invention can be implemented. The computer system 700 includes a bus 702 or other communication mechanism for transmitting information, and a hardware processor 704 coupled to the bus 702 to process information. The hardware processor 704 may be, for example, a general-purpose microprocessor.

[0230] Computer system 700 also includes main memory 706, such as random access memory (RAM) or other dynamic storage device, coupled to bus 702, for storing information and instructions to be executed by processor 704. Main memory 706 can also be used to store temporary variables or other intermediate information during the execution of instructions by processor 704. When stored in non-transitory storage media accessible to processor 704, these instructions render computer system 700 a special-purpose machine customized to perform the operations specified in the instructions.

[0231] The computer system 700 also includes a read-only memory (ROM) 708 or other static storage device coupled to the bus 702 for storing static information and instructions for the processor 704. A storage device 710 (such as a disk, optical disk, or solid-state drive) is provided and coupled to the bus 702 for storing information and instructions.

[0232] Computer system 700 can be coupled to display 712 (such as a cathode ray tube (CRT)) via bus 702 for displaying information to the computer user. Input device 714, including alphanumeric keys and other keys, is coupled to bus 702 for transmitting information and command selections to processor 704. Another type of user input device is cursor control 716 (such as a mouse, trackball, or arrow keys) for transmitting directional information and command selections to processor 704 and for controlling cursor movement on display 712. Such input devices typically have two degrees of freedom on two axes, a first axis (e.g., x) and a second axis (e.g., y), which allows the device to specify its position in a plane.

[0233] Computer system 700 may implement the techniques described herein using custom hardwired logic, one or more ASICs or FPGAs, firmware and / or program logic (which, in conjunction with the computer system, enable or program the computer system 700 as a special-purpose machine). According to one embodiment, computer system 700 performs the techniques described herein in response to processor 704 executing one or more sequences of one or more instructions contained in main memory 706. These instructions may be read into main memory 706 from another storage medium, such as storage device 710. Execution of the sequence of instructions contained in main memory 706 causes processor 704 to perform the processing steps described herein. In alternative embodiments, hardwired circuitry may be used instead of or in combination with software instructions.

[0234] As used herein, the term "storage medium" refers to any non-transitory medium that stores data and / or instructions that cause a machine to operate in a particular manner. Such storage media can include non-volatile media and / or volatile media. Non-volatile media include, for example, optical discs, magnetic disks, or solid-state drives, such as storage device 710. Volatile media include dynamic memory, such as main memory 706. Common forms of storage media include, for example, floppy disks, flexible disks, hard disks, solid-state drives, magnetic tape or any other magnetic data storage medium, CD-ROMs, any other optical data storage medium, any physical medium with patterns of holes, RAM, PROMs and EPROMs, FLASH-EPROMs, NVRAMs, and any other memory chip or cartridge.

[0235] Storage media are distinct from transmission media but can be used in conjunction with them. Transmission media participate in transferring information between storage media. For example, transmission media include coaxial cables, copper wire, and optical fibers, including the wires that make up bus 702. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infrared data communications.

[0236] Various forms of media can be used to transfer one or more sequences of one or more instructions to processor 704 for execution. For example, instructions may initially be carried on a disk or solid-state drive of a remote computer. The remote computer may load the instructions into its dynamic memory and transmit them over a telephone line using a modem. A modem local to computer system 700 may receive data over the telephone line and convert the data into an infrared signal using an infrared transmitter. An infrared detector may receive the data carried in the infrared signal, and appropriate circuitry may place the data on bus 702. Bus 702 transfers the data to main memory 706, from which processor 704 retrieves and executes the instructions. Instructions received by main memory 706 may optionally be stored on storage device 710 before or after execution by processor 704.

[0237] Computer system 700 also includes a communication interface 718 coupled to bus 702. Communication interface 718 provides a two-way data communication coupling to network link 720, which is connected to local network 722. For example, communication interface 718 may be an Integrated Services Digital Network (ISDN) card, a cable modem, a satellite modem, or a modem that provides a data communication connection to a corresponding type of telephone line. As another example, communication interface 718 may be a Local Area Network (LAN) card that provides a data communication connection to a compatible LAN. A wireless link may also be implemented. In any such implementation, communication interface 718 transmits and receives electrical, electromagnetic, or optical signals carrying streams of digital data representing various types of information.

[0238] Network link 720 typically provides data communication to other data devices via one or more networks. For example, network link 720 can provide a connection via local network 722 to host computer 724 or to data devices operated by Internet Service Provider (ISP) 726. ISP 726 then provides data communication services via a global packet data communication network (now commonly referred to as the "Internet" 728). Both local network 722 and Internet 728 use electrical, electromagnetic, or optical signals carrying digital data streams. Signals through various networks, as well as signals on network link 720 and through communication interface 718 (which carries digital data to and from computer system 700), are example forms of transmission media.

[0239] Computer system 700 can send messages and receive data, including program code, through the one or more networks, network link 720, and communication interface 718. In the Internet example, server 730 can transmit requested code for an application program through the Internet 728, ISP 726, local network 722, and communication interface 718.

[0240] The received code can be executed by processor 704 upon receipt and / or stored in storage device 710 or other non-volatile memory for later execution.

[0241] Software Overview

[0242] Figure 8 is a block diagram of a basic software system 800 that can be used to control the operation of computing system 700. Software system 800 and its components, including their connections, relationships, and functions, are merely exemplary and are not intended to limit the implementation of one or more example embodiments. Other software systems suitable for implementing one or more example embodiments may have different components, including components with different connections, relationships, and functions.

[0243] A software system 800 is provided to guide the operation of the computing system 700. The software system 800, which may be stored on system memory (RAM) 706 and fixed storage devices (e.g., hard disk or flash memory) 710, includes a kernel or operating system (OS) 810.

[0244] OS 810 manages the low-level aspects of computer operations, including managing process execution, memory allocation, file input and output (I / O), and device I / O. One or more applications, designated 802A, 802B, 802C...802N, can be "loaded" (e.g., transferred from fixed storage device 710 to memory 706) for execution by system 800. Applications or other software intended for use on computer system 700 can also be stored as downloadable computer-executable instruction sets, for example, for downloading and installing from internet locations (e.g., web servers, app stores, or other online services).

[0245] Software system 800 includes a graphical user interface (GUI) 815 for receiving user commands and data graphically (e.g., "click" or "touch gestures"). These inputs can then be manipulated by system 800 according to instructions from operating system 810 and / or (one or more) applications 802. GUI 815 also displays the results of operations from OS 810 and (one or more) applications 802, allowing the user to provide additional input or terminate the session (e.g., log off).

[0246] OS 810 can execute directly on the bare hardware 820 of computer system 700 (e.g., one or more processors 704). Alternatively, a hypervisor or virtual machine monitor (VMM) 830 can be inserted between the bare hardware 820 and OS 810. In this configuration, VMM 830 acts as a software “buffer” or virtualization layer between OS 810 and the bare hardware 820 of computer system 700.

[0247] VMM 830 instantiates and runs one or more virtual machine instances (“guest machines”). Each guest machine includes a “guest” operating system (such as OS 810) and one or more applications (such as Application(s) 802) designed to run on the guest operating system. VMM 830 presents a virtual operating platform to the guest operating system and manages the execution of the guest operating system.

[0248] In some cases, VMM 830 can allow a guest operating system to run as if it were running directly on the bare hardware 820 of computer system 700. In these instances, the same version of the guest operating system configured to run directly on the bare hardware 820 can also run on VMM 830 without modification or reconfiguration. In other words, VMM 830 can provide full hardware and CPU virtualization to a guest operating system in some situations.

[0249] In other cases, guest operating systems can be specifically designed or configured to run on VMM 830 for improved efficiency. In these instances, the guest operating system is "aware" that it is running on a virtual machine monitor. In other words, VMM 830 can provide paravirtualization to guest operating systems under certain circumstances.

[0250] A computer system process involves the allocation of hardware processor time and the allocation of memory (physical and / or virtual). Memory allocation is used to store instructions executed by the hardware processor, to store data generated by the execution of those instructions, and / or to store hardware processor state (e.g., register contents) between hardware processor time allocations when the computer system process is not running. Computer system processes run under the control of the operating system and can also run under the control of other programs executing on the computer system.

[0251] Cloud Computing

[0252] The term "cloud computing" is generally used herein to describe a computing model that enables on-demand access to a shared pool of computing resources, such as computer networks, servers, software applications, and services, and that allows for the rapid provisioning and release of resources with minimal management effort or service provider interaction.

[0253] Cloud computing environments (sometimes called cloud environments or the cloud itself) can be implemented in various ways to best suit different requirements. For example, in a public cloud environment, the underlying computing infrastructure is owned by an organization that makes its cloud services available to other organizations or the public. In contrast, private cloud environments are generally used only by a single organization or within a single organization. Community clouds are designed to be shared by several organizations within a community; while hybrid clouds include two or more types of clouds (e.g., private, community, or public) bound together by data and application portability.

[0254] Generally, a cloud computing model enables some of the responsibilities that might previously have been handled by an organization's own IT department to be delivered instead as service layers within a cloud environment, for use by consumers (either within or outside the organization, depending on the public / private nature of the cloud). Depending on the specific implementation, the precise definition of the components or features provided by or within each cloud service layer can vary, but common examples include: Software as a Service (SaaS), where consumers use software applications running on cloud infrastructure, while the SaaS provider manages or controls the underlying cloud infrastructure and applications; Platform as a Service (PaaS), where consumers can use software programming languages and development tools supported by the PaaS provider to develop, deploy, and otherwise control their own applications, while the PaaS provider manages or controls other aspects of the cloud environment (i.e., everything below the runtime execution environment); and Infrastructure as a Service (IaaS), where consumers can deploy and run arbitrary software applications and / or provision processing, storage, networks, and other basic computing resources, while the IaaS provider manages or controls the underlying physical cloud infrastructure (i.e., everything below the operating system layer). Database as a Service (DBaaS) is a service where consumers use database servers or database management systems running on cloud infrastructure, while the DBaaS provider manages or controls the underlying cloud infrastructure and applications.

[0255] The basic computer hardware and software, as well as the cloud computing environment, presented above are intended to illustrate the basic underlying computer components that can be used to implement one or more example embodiments. However, the one or more example embodiments are not necessarily limited to any particular computing environment or computing device configuration. Instead, according to this disclosure, the one or more example embodiments can be implemented in any type of system architecture or processing environment that will be understood by those skilled in the art to be capable of supporting the features and functionality of the one or more example embodiments presented herein.

[0256] In the foregoing description, embodiments of the invention have been described with reference to numerous specific details, which may vary from implementation to implementation. Therefore, the description and drawings should be considered illustrative rather than restrictive. The sole and exclusive indicator of the scope of the invention, and of what the applicant intends to be the scope of the invention, is the literal and equivalent scope of the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction.

Claims

1. A computer-implemented method, comprising: obtaining a mapping from a relational schema of a database to a graph data model, wherein: the relational schema identifies one or more vertex tables that correspond to one or more respective vertex types in the graph data model and one or more edge tables that correspond to one or more respective edge types in the graph data model, and each edge type of the one or more edge types is associated with a respective source vertex type and a respective target vertex type of the one or more vertex types; populating, based on the mapping, a forward compressed sparse row representation for forward traversal of edges of an edge type of the one or more edge types, wherein each edge of the edge type: originates at a source vertex of the source vertex type of the edge type, and terminates at a target vertex of the target vertex type of the edge type; wherein the forward compressed sparse row representation includes: a forward destination array that indicates which vertices of the target vertex type of the edge type terminate which respective edges of the edge type, and a forward source array that indicates, for each vertex of the source vertex type of the edge type, a respective forward range of offsets into the forward destination array; populating, based on the forward compressed sparse row representation, a reverse compressed sparse row representation of the edge type for reverse traversal of the edges of the edge type, wherein the reverse compressed sparse row representation includes: a reverse source array that indicates which vertices of the source vertex type of the edge type originate which respective edges of the edge type, and a reverse destination array that indicates, for each vertex of the target vertex type of the edge type, a respective reverse range of offsets into the reverse source array;
wherein populating the reverse compressed sparse row representation includes, for each edge of the edge type: calculating an offset into the reverse source array by summing the value of the element of the reverse destination array that corresponds to the target vertex of the edge and the value of a respective counter for the target vertex, wherein the respective counter for the target vertex indicates how many edges of the edge type that terminate at the target vertex have already been populated into the reverse source array; and storing, based on the offset, into an element of the reverse source array: an identifier of the edge, and / or an offset of the source vertex of the edge into the forward source array; wherein multiple threads process multiple blocks of edges of the edge type in parallel; and wherein each respective counter for a target vertex is thread-safe and allows multiple threads of the plurality of threads to safely perform the storing into respective rows of the reverse source array.
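For orientation only, and not as part of the claims, the population step recited in claim 1 can be sketched in Python. The sketch makes several assumptions: vertices are identified by dense integer offsets, the forward source array carries one extra trailing element so that each range is `fwd_src[v]` to `fwd_src[v + 1]`, a per-vertex lock stands in for the thread-safe counter (a native implementation would more likely use an atomic fetch-and-add), and the names `build_reverse_csr`, `rev_edge`, and `filled` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def build_reverse_csr(fwd_src, fwd_dst, num_targets, num_threads=4):
    """Derive a reverse CSR from a forward CSR without touching any tables."""
    num_sources = len(fwd_src) - 1
    num_edges = len(fwd_dst)

    # Count the edges that terminate at each target vertex.
    in_degree = [0] * num_targets
    for dst in fwd_dst:
        in_degree[dst] += 1

    # Running sum: rev_dst[v] is the first slot of vertex v in rev_src.
    rev_dst = [0] * (num_targets + 1)
    for v in range(num_targets):
        rev_dst[v + 1] = rev_dst[v] + in_degree[v]

    rev_src = [-1] * num_edges    # source vertex offset per reverse edge
    rev_edge = [-1] * num_edges   # forward edge identifier per reverse edge
    filled = [0] * num_targets    # per-target count of slots already used
    locks = [threading.Lock() for _ in range(num_targets)]

    def process_chunk(lo, hi):
        # Each chunk of source vertices covers a contiguous block of edges.
        for src in range(lo, hi):
            for e in range(fwd_src[src], fwd_src[src + 1]):
                dst = fwd_dst[e]
                with locks[dst]:  # stand-in for an atomic fetch-and-add
                    slot = rev_dst[dst] + filled[dst]
                    filled[dst] += 1
                rev_src[slot] = src
                rev_edge[slot] = e

    chunk = max(1, -(-num_sources // num_threads))  # ceiling division
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(process_chunk, lo, min(lo + chunk, num_sources))
                   for lo in range(0, num_sources, chunk)]
        for future in futures:
            future.result()       # re-raise any worker exception
    return rev_dst, rev_src, rev_edge


# Edges 0->1, 0->2, 1->2, 2->0 over three vertices.
rev_dst, rev_src, rev_edge = build_reverse_csr([0, 2, 3, 4], [1, 2, 2, 0], 3)
```

Because threads may claim slots in any order, the entries within one target vertex's range of the reverse source array are not in a fixed order; only the set of entries in each range is determined.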

2. The method of claim 1, wherein: populating the forward compressed sparse row representation is further based on: the vertex table corresponding to the source vertex type of the edge type, the vertex table corresponding to the target vertex type of the edge type, and the edge table corresponding to the edge type; and populating the reverse compressed sparse row representation accesses the forward compressed sparse row representation but does not access: the vertex table corresponding to the source vertex type of the edge type, the vertex table corresponding to the target vertex type of the edge type, and the edge table corresponding to the edge type.

3. The method of claim 1, wherein: populating the reverse compressed sparse row representation includes counting, for each vertex of the target vertex type of the edge type, the respective edges of the edge type that terminate at the vertex; and the counting includes, when populating the forward destination array, for each edge of the edge type, incrementing the respective counter for the target vertex that terminates the edge.

4. The method of claim 3, wherein a reverse edge position vector of the reverse destination array is populated based on the edge count indicated by the respective counter for each vertex.

5. The method of claim 4, wherein the counting of edges terminating at each vertex includes concurrently processing at least two edges.

6. The method of claim 5, wherein: the at least two edges terminate at the same target vertex; and the concurrent processing of the at least two edges includes an atomic operation that increments the counter for the same target vertex.
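As a non-limiting sketch of the counting recited in claims 3 to 6, the Python fragment below counts, in several threads at once, how many edges terminate at each target vertex. Python exposes no atomic integer increment, so a per-vertex lock is used here as a stand-in for the atomic operation; the function name `count_in_degrees` and the block-partitioning scheme are assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def count_in_degrees(fwd_dst, num_targets, num_threads=4):
    """Count edges per target vertex, one block of edges per task."""
    counters = [0] * num_targets
    locks = [threading.Lock() for _ in range(num_targets)]

    def count_block(lo, hi):
        for e in range(lo, hi):
            dst = fwd_dst[e]
            # Two threads may hit the same target vertex at the same time;
            # the lock makes the read-modify-write indivisible.
            with locks[dst]:
                counters[dst] += 1

    num_edges = len(fwd_dst)
    block = max(1, -(-num_edges // num_threads))  # ceiling division
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(count_block, lo, min(lo + block, num_edges))
                   for lo in range(0, num_edges, block)]
        for future in futures:
            future.result()       # re-raise any worker exception
    return counters


# Target vertex 2 terminates three edges that fall into different blocks.
degrees = count_in_degrees([1, 2, 2, 0, 2], 3)
print(degrees)
```

The final counts do not depend on how the threads interleave, which is what makes the counters safe to share across blocks.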

7. The method of claim 4, wherein: one or more arrays are processed as a plurality of blocks; each block of the plurality of blocks contains a plurality of elements; and the one or more arrays include: the forward destination array, the forward source array, the reverse destination array, and / or the reverse source array.

8. The method of claim 7, further comprising replacing a respective old value of each element in each block of the reverse destination array with a respective new value, wherein: when the element is the first element in the block, the new value is zero; and otherwise, the new value is the sum of the old value of the previous element of the block and the new value of the previous element.

9. The method of claim 8, wherein replacing the old value of each element in each block comprises processing at least two blocks concurrently.

10. The method of claim 8, wherein each block of the reverse destination array is associated with: a last value, which is the sum of the old values of the plurality of elements of the block, and a block offset, which is: zero when the block is the first block in the reverse destination array, the last value of the first block when the block is the second block in the reverse destination array, and otherwise the sum of the last value of the previous block and the block offset of the previous block.

11. The method of claim 10, further comprising concurrently: calculating the sum of the last value of the previous block and the block offset of the previous block, and replacing the old value of each element in the block.

12. The method of claim 10, further comprising incrementing each element of the block by a block offset of the block.

13. The method of claim 12, wherein incrementing each element of the block comprises concurrently incrementing a plurality of elements of the block.

14. The method of claim 13, wherein the concurrent incrementing of the plurality of elements of the block comprises Single Instruction Multiple Data (SIMD).
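The block-wise running sum recited in claims 8 to 14 can be pictured with the following Python sketch, offered for illustration only. It runs sequentially; the per-block passes (steps 1 and 3) touch disjoint ranges and are the parts that could run concurrently or with SIMD instructions. The function name `blocked_exclusive_scan` and the choice of a fixed `block_size` are assumptions.

```python
def blocked_exclusive_scan(counts, block_size):
    """Turn per-vertex edge counts into start offsets, block by block."""
    values = list(counts)
    blocks = [(lo, min(lo + block_size, len(values)))
              for lo in range(0, len(values), block_size)]

    # Step 1: inside each block, the first element becomes zero and every
    # other element becomes old(previous) + new(previous).
    last_values = []
    for lo, hi in blocks:
        running = 0
        for i in range(lo, hi):
            old = values[i]
            values[i] = running
            running += old
        last_values.append(running)   # sum of the block's old values

    # Step 2: block offset is zero for the first block, otherwise the
    # previous block's last value plus the previous block's offset.
    block_offsets = []
    offset = 0
    for last in last_values:
        block_offsets.append(offset)
        offset += last

    # Step 3: add each block's offset to every element of that block.
    for (lo, hi), block_offset in zip(blocks, block_offsets):
        for i in range(lo, hi):
            values[i] += block_offset
    return values, block_offsets


offsets, block_offsets = blocked_exclusive_scan([1, 1, 2, 0, 3, 1], 2)
print(offsets, block_offsets)
```

For the counts `[1, 1, 2, 0, 3, 1]` with blocks of two elements, step 1 yields `[0, 1 | 0, 2 | 0, 3]` with last values 2, 2, and 4, step 2 yields block offsets 0, 2, and 4, and step 3 yields the final offsets `[0, 1, 2, 4, 4, 7]`.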

15. The method of claim 1, wherein the storing includes storing, into elements of the reverse source array, identifiers of edges copied from the forward destination array.

16. The method of claim 1, wherein the multiple threads concurrently process at least two edges that terminate at the same vertex, and the processing includes an atomic operation that increments the counter for the same vertex.

17. The method of claim 1, further comprising populating, based on a second forward compressed sparse row representation of a second edge type of the one or more edge types, a second reverse compressed sparse row representation of the second edge type for reverse traversal of edges of the second edge type.

18. The method of claim 17, wherein: the forward compressed sparse row representation is a first forward compressed sparse row representation; the reverse compressed sparse row representation is a first reverse compressed sparse row representation; and the second forward compressed sparse row representation includes a forward destination array containing a sparse identifier or a persistent dense identifier for each vertex of a second source vertex type of the second edge type, the second source vertex type corresponding to a vertex table of the one or more vertex tables, and / or the first reverse compressed sparse row representation includes a reverse source array containing a sparse identifier or a persistent dense identifier for each vertex of a second target vertex type of the second edge type.

19. The method of claim 1, wherein: the forward compressed sparse row representation includes a forward destination array containing a sparse identifier or a persistent dense identifier for each vertex of the target vertex type of the edge type, and / or the reverse compressed sparse row representation includes a reverse source array containing a sparse identifier or a persistent dense identifier for each vertex of the source vertex type of the edge type.

20. One or more non-transitory computer-readable media storing one or more sequences of instructions, which, when executed by one or more processors, cause the steps of any one of claims 1-19 to be performed.

21. A computer-implemented device, comprising: one or more processors; and a memory coupled to the one or more processors and including instructions stored thereon which, when executed by the one or more processors, cause performance of the steps of any one of claims 1-19.