Expert System And Data Analysis Tool Utilizing Data As A Concept

A data analysis and expert system technology, applied in the field of data, that addresses problems such as analysts not always having domain expertise, and that improves data discovery, data aggregation and analysis, and data quality.

Inactive Publication Date: 2018-06-14
DATAVORE LABS
6 Cites, 13 Cited by

AI Technical Summary

Benefits of technology

[0005]Various embodiments of the present invention generally relate to data as a concept and related systems and methods. In particular, systems and methods are disclosed that allow for the aggregation, reconciliation, analysis, and visualization of data in a unitary end-to-end tool. In some embodiments, the tool, which may also be referred to herein as a platform, may be called Datavore. The tool may be used to automatically learn concepts and relationships existing within big data sets. For example, the tool may be used to aggregate data by ingesting and combining raw data from multiple sources with different file types. As another example, the tool may be used to reconcile the data by scrubbing “messy” data (e.g., noisy data) to produce high-quality data that permits better aggregation and analysis. As yet another example, the tool may be used to analyze the data by using multiple data manipulation techniques (e.g., “Excel-like” data manipulation techniques) and statistical analysis to allow for better data discovery. As yet another example, the tool may be used to visualize the data by using dynamic graphs and charts to illustrate key relationships and trends within the data. As yet a further example, the tool may be used to export data, metadata, or visualizations to other tools, programs, and/or modules. In some embodiments, the tool may be an end-to-end Software as a Service (SaaS) solution that allows data analysis experts to easily conduct complex analysis on big data sets. In some embodiments, the tool may act as a Master Data Management tool that includes reference data and analytical data and serves as an authoritative source of master data. In such embodiments, the tool may operate to reconcile data by removing duplicate and/or incorrect data and automatically generating rules to prevent such data from entering the system or any data analysis step.
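
The aggregation and reconciliation steps described above can be illustrated with a minimal Python sketch. It is not the Datavore implementation; the function names (`ingest`, `reconcile`) and the field names (`ticker`, `revenue`) are hypothetical and chosen only for illustration. The sketch combines records from a CSV source and a JSON source, then drops records whose value is not numeric and records whose key has already been seen. Automatic rule generation is not covered.

```python
import csv
import io
import json


def ingest(csv_text, json_text):
    """Aggregate: combine records from a CSV source and a JSON source."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.extend(json.loads(json_text))
    return rows


def reconcile(rows, key="ticker", value="revenue"):
    """Reconcile: normalize keys, drop invalid and duplicate records.

    Returns (clean_rows, dropped), where dropped lists (key, reason) pairs.
    """
    seen, clean, dropped = set(), [], []
    for row in rows:
        # Normalize the key so "aapl" and "AAPL " refer to the same entity.
        k = str(row.get(key) or "").strip().upper()
        if not k:
            dropped.append((k, "missing key"))
            continue
        try:
            amount = float(row.get(value))
        except (TypeError, ValueError):
            dropped.append((k, "invalid value"))
            continue
        if k in seen:
            dropped.append((k, "duplicate"))
            continue
        seen.add(k)
        clean.append({key: k, value: amount})
    return clean, dropped
```

Calling `reconcile(ingest(csv_text, json_text))` returns the cleaned rows together with a list of dropped keys and the reason each was dropped, which a reviewer could inspect before analysis.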
[0006]Such a tool may be streamlined and may offer several advantages due to its ability to learn concepts and relationships within big data sets. For example, the tool may allow for a user-defined world in which domain expertise is captured to make appropriate “apples-to-apples” comparisons between similar types of data, from the user's perspective. As another example, the tool may allow for superior analysis of data by conducting customized statistical and predictive analysis of financial and market data. As yet another example, the tool may allow for data curation by cleaning and integrating disparate, messy, or syntactically different data sets. As yet another example, the tool may allow for “smart” visualizations of the data by automatically creating graphs and charts that show the most important relationships between similar and/or different data, including magnitudes and relations, and that allow for trend and outlier detection. As a further example, the tool may have an intuitive interface that is simple and seamless to the user because it does not involve computer programming, creation of macros, or cryptic database queries.
[0007]A particular example of the use of the tool disclosed herein includes industry comparables analysis on financial data. The tool may be used to aggregate multiple financial statements that may be siloed in, e.g., Bloomberg, CapIQ, and/or other data sources. The tool may be used to reconcile the data by quickly creating “apples-to-apples” comparisons of related or similar companies' financials; the comparisons may include industry Key Performance Indicators (KPIs) from the relevant industries. The tool may be used to analyze the data by comparing the performance of a company against specifications and metrics defined by the data analysis expert. The work data flow (the step-by-step procedure by which the data is manipulated in order to analyze the data) may be used by the data analysis expert to analyze the data. The analysis may include filtering and grouping the data (e.g., in accordance with the industries in which the company operates). The work data flow may be stored by the tool for later use. For example, the stored work data flow may be used for automation of analysis on different data, portability of data analysis techniques, or as one or more building blocks for additional data analysis. This work data flow and other work data flows created by a user of the tool, in conjunction with learned concepts, may be considered the user's lens with respect to viewing/analyzing particular types of data. The tool may be used to visualize the data by simultaneously viewing company financials and KPIs over time and across the industries in which the company operates. During visualization, outliers and trends may be recognized by the tool. For example, for the use case of multi-strategy and long-short equities hedge funds, the tool may allow for a holistic industry review, industry comparables analysis, simulated portfolio performance, and macro data correlations. As another example, for the use case of fixed income and real estate hedge funds, the tool may allow for bond data cleaning, capital structure assessments, complex financial instrument analysis, and merger ramifications.
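
The idea of a stored, reusable work data flow can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the class name `WorkDataFlow`, its methods, and the column names are hypothetical. Each user action is recorded as a plain step, and the recorded steps can later be replayed against a different data set.

```python
from dataclasses import dataclass, field


@dataclass
class WorkDataFlow:
    """Records analysis steps so they can be stored and replayed later."""
    steps: list = field(default_factory=list)

    def filter(self, column, value):
        # Record a filter step; nothing is computed until run() is called.
        self.steps.append(("filter", column, value))
        return self

    def group_sum(self, by, column):
        # Record a grouping step that sums `column` within each `by` group.
        self.steps.append(("group_sum", by, column))
        return self

    def run(self, rows):
        """Replay the recorded steps, in order, against any list of rows."""
        data = rows
        for op, a, b in self.steps:
            if op == "filter":
                data = [r for r in data if r[a] == b]
            elif op == "group_sum":
                totals = {}
                for r in data:
                    totals[r[a]] = totals.get(r[a], 0) + r[b]
                data = [{a: k, b: v} for k, v in totals.items()]
        return data
```

Because the steps are stored as plain tuples rather than as results, the same flow object can be applied to a new quarter's data or serialized for reuse, which mirrors the automation and portability uses described above.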
[0008]The Datavore tool, described herein, may be able to learn concepts and relationships associated with different data types to simplify the analysis of the data and to allow data analysis experts to more efficiently work with the data. The tool may be able to combine the learned concepts and relationships with user-defined concepts and relationships associated with the different data types.
[0014]In some embodiments a user context mapping module may be used, e.g., by a user of the tool, to specify a particular “context” to which a data concept belongs. A syntax and semantic reconciliation module may automatically attempt to correct spelling errors such as small typos (using string distance / n-grams), may apply sounds-like corrections, and may perform language normalization. The syntax and semantic reconciliation module may also automatically attempt to map labels in the data (associated with the data concepts in, e.g., a data set) to universal identifiers stored, e.g., in one or more remote data stores. The user may be presented with the matches that result from the operation of the syntax and semantic reconciliation module. The user can disambiguate items based on, e.g., a confidence score associated with a particular one of the matches. The user's actions may be stored in a context mapping memory module to assist in future mapping of the same data concept.
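
One way to picture the label-matching step is a character n-gram comparison that produces a confidence score per candidate. The sketch below is an assumption-laden illustration, not the patented module: it uses Jaccard similarity over character trigrams as the score, a plain dictionary as a stand-in for the context mapping memory, and hypothetical names (`ngrams`, `similarity`, `match_label`, `remember`). Sounds-like correction and language normalization are not covered.

```python
def ngrams(text, n=3):
    """Return the set of character n-grams of a padded, lowercased label."""
    padded = f"  {text.lower().strip()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b):
    """Jaccard similarity of trigram sets, used here as a confidence score."""
    ga, gb = ngrams(a), ngrams(b)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def match_label(label, identifiers, memory, threshold=0.3):
    """Return candidate (identifier, confidence) pairs, best match first.

    A choice previously stored in `memory` wins outright.
    """
    key = label.lower().strip()
    if key in memory:
        return [(memory[key], 1.0)]
    scored = [(name, similarity(label, name)) for name in identifiers]
    kept = [m for m in scored if m[1] >= threshold]
    return sorted(kept, key=lambda m: m[1], reverse=True)


def remember(label, chosen, memory):
    """Store the user's disambiguation so the same label maps directly next time."""
    memory[label.lower().strip()] = chosen
```

A user would review the ranked candidates, pick one, and call `remember` so that later occurrences of the same label bypass scoring. The threshold of 0.3 is an arbitrary illustrative value.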

Problems solved by technology

Data analysis experts that are tasked with analyzing big data sets (e.g. to build complex predictive models) do not always have domain expertise to know the goals of data analysis and what measurements to take relating to the data sets.

Embodiment Construction

[0056]Various embodiments of the present invention are directed to methods and systems for learning concepts and relationships associated with different data to simplify the analysis of the data, tracking and storing the analysis techniques used to manipulate the data, and visualizing and filtering the analyzed data. In particular, the Datavore tool, described herein, may be able to learn concepts and relationships associated with different data types to simplify the analysis of the data and to allow data analysis experts to more efficiently work with the data. The tool may be able to combine the learned concepts and relationships with user-defined concepts and relationships associated with the different data types. The tool may be able to track and store the data analysis experts' data analysis techniques (work data flows). The tool may also allow for the visualization and filtering of the analyzed data. Although multiple references are made herein to a user (e.g., data analysis expert) ...

Abstract

Work data flow, a step-by-step procedure by which data is manipulated in order to analyze the actual data, is created from user requests through interactions with one or more representations of a domain-specific language. The domain-specific language features a full typing system and a language compiler. The domain-specific language is a functional expression language, all representations of which are isomorphic. User interactions with the domain-specific language drive a query generator, which creates an execution plan represented by an abstract syntax tree.
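
To make the abstract more concrete, the following is a small sketch of a typed functional expression language whose expressions form an abstract syntax tree. Everything here is hypothetical and for illustration only: the node classes (`Col`, `Lit`, `BinOp`), the `SCHEMA` mapping, and the three functions are assumptions, not the language the patent describes. The sketch shows type checking before execution, evaluation of the tree against one row, and a textual rendering of the same tree as a second representation.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


# Hypothetical column types for the data set being queried.
SCHEMA = {"revenue": "number", "cost": "number", "sector": "string"}
ARITHMETIC = {"+": operator.add, "-": operator.sub, "*": operator.mul}
COMPARISON = {"==": operator.eq, ">": operator.gt, "<": operator.lt}


def type_of(node):
    """Type-check an expression tree and return its result type."""
    if isinstance(node, Col):
        return SCHEMA[node.name]
    if isinstance(node, Lit):
        return "string" if isinstance(node.value, str) else "number"
    lt, rt = type_of(node.left), type_of(node.right)
    if node.op in ARITHMETIC and lt == rt == "number":
        return "number"
    if node.op in COMPARISON and lt == rt:
        return "boolean"
    raise TypeError(f"cannot apply {node.op!r} to {lt} and {rt}")


def evaluate(node, row):
    """Execute the tree against one row (a dict of column values)."""
    if isinstance(node, Col):
        return row[node.name]
    if isinstance(node, Lit):
        return node.value
    fn = ARITHMETIC.get(node.op) or COMPARISON[node.op]
    return fn(evaluate(node.left, row), evaluate(node.right, row))


def to_text(node):
    """Render the same tree as text, a second representation of the plan."""
    if isinstance(node, Col):
        return node.name
    if isinstance(node, Lit):
        return repr(node.value)
    return f"({to_text(node.left)} {node.op} {to_text(node.right)})"
```

For instance, `BinOp(">", BinOp("-", Col("revenue"), Col("cost")), Lit(10))` type-checks as `boolean`, while adding a string column to a number is rejected by `type_of` before anything is executed.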

Description

PRIORITY
[0001]This application is a continuation-in-part of U.S. patent application Ser. No. 14/934,246, filed Nov. 6, 2015, which claims priority to U.S. Provisional 62/084,430, now expired, both of which are hereby incorporated by reference as if submitted in their entireties.
FIELD OF THE INVENTION
[0002]The present invention relates to data as a concept and related systems and methods, and, more particularly, various embodiments of the present application relate to systems and methods allowing for the creation of dynamic data relationships.
[0003]Various embodiments of the present application relate to systems and methods allowing for the creation of dynamic data relationships and utilizing data as a concept normalization algorithms, tracking analysis of the data, and visualizing the analyzed data.
BACKGROUND
[0004]Data analysis experts that are tasked with analyzing big data sets (e.g. to build complex predictive models) do not always have domain expertise to know the goals of data ...

Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F17/30
CPC: G06F17/30554, G06F17/30327, G06F17/30427, G06Q10/0637, G06F16/215, G06F11/0793, G06F11/30, G06Q10/10, G06F16/2246, G06F16/2452, G06F16/248
Inventor: VENKATESWARULU, SANJAY; PERLMAN-GARR, JAKE
Owner: DATAVORE LABS