Semantic-based Hadoop system

An analysis system based on semantic analysis technology, applied in the field of data networks. It addresses the problems that unstructured data is difficult to acquire, has low unit value, and has not been fully developed and utilized by industry. Its stated effects are rich semantic information, high accuracy, and promising market prospects.

Inactive Publication Date: 2015-01-14
ANHUI HUAZHEN INFORMATION SCI & TECH

AI Technical Summary

Problems solved by technology

At present, comparable products on the market focus on analyzing an enterprise's internal data. The value of massive unstructured data, such as text from the web, has not yet been fully developed and utilized, because such data is relatively difficult to acquire and has relatively low unit value.



Embodiment Construction

[0026] As shown in Figure 1, the embodiment of the present invention proposes a semantic-based big data analysis system comprising: a data collection and storage component 10, a real-time data stream processing component 20, a storage system component 30, an underlying support component 40, and a business output component 50.
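The five-component layout in [0026] can be sketched as follows. This is a minimal illustration only: the `Component` class, the `make_stage` and `run` helpers, and the strictly linear hand-off from component 10 to component 50 are all assumptions made for the example, since the excerpt names the components but does not specify how data moves between them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Component:
    """One of the five components named in [0026].

    `handler` is a hypothetical stand-in for the component's real work.
    """
    number: int
    name: str
    handler: Callable[[dict], dict]


def make_stage(label: str) -> Callable[[dict], dict]:
    """Build a placeholder handler that only records that it ran."""
    def handler(record: dict) -> dict:
        out = dict(record)  # copy so the input record is not mutated
        out["trace"] = record.get("trace", []) + [label]
        return out
    return handler


def build_system() -> List[Component]:
    """Create the five components with the reference numbers from [0026]."""
    names = [
        (10, "data collection and storage"),
        (20, "real-time data stream processing"),
        (30, "storage system"),
        (40, "underlying support"),
        (50, "business output"),
    ]
    return [Component(n, name, make_stage(name)) for n, name in names]


def run(system: List[Component], record: dict) -> dict:
    """Pass a record through every component in order (assumed linear flow)."""
    for component in system:
        record = component.handler(record)
    return record


if __name__ == "__main__":
    result = run(build_system(), {"text": "example page text"})
    print(result["trace"])
```

Each placeholder handler can be replaced independently, which mirrors the separation of concerns the embodiment describes.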

[0027] The data collection and storage component 10 comprises: a distributed crawler module 11, which handles data source detection, Internet data collection, and HTML (HyperText Markup Language) preprocessing; and a data source adapter 12, which provides access to third-party data resources. For example, customer-specified data that needs to be analyzed can enter the system's processing flow through the data source adapter.
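The two intake paths in [0027] can be illustrated with a short sketch. It assumes that HTML preprocessing means stripping markup, scripts, and styles to leave visible text, which the excerpt does not spell out. The names `preprocess_html`, `DataSourceAdapter`, `ListAdapter`, and `ingest` are hypothetical, and actual crawling is left out: the pages are assumed to be already fetched.

```python
from html.parser import HTMLParser
from typing import Iterable, Iterator, List, Protocol


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def preprocess_html(html: str) -> str:
    """Assumed preprocessing step: reduce an HTML page to plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


class DataSourceAdapter(Protocol):
    """Hypothetical interface for third-party data sources (adapter 12)."""

    def records(self) -> Iterator[dict]: ...


class ListAdapter:
    """Toy adapter wrapping customer-supplied texts held in memory."""

    def __init__(self, texts: Iterable[str], source: str) -> None:
        self._texts = list(texts)
        self._source = source

    def records(self) -> Iterator[dict]:
        for text in self._texts:
            yield {"source": self._source, "text": text}


def ingest(crawled_pages: Iterable[str],
           adapters: Iterable[DataSourceAdapter]) -> List[dict]:
    """Merge crawler output and adapter output into one record list."""
    records = [{"source": "crawler", "text": preprocess_html(page)}
               for page in crawled_pages]
    for adapter in adapters:
        records.extend(adapter.records())
    return records
```

Both paths produce records of the same shape, so later components would not need to know whether a text came from the crawler or from a customer-supplied source.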

[0028] The real-time data stream processing component 20 handles real-time processing of the data stream. It includes a temporary storage module 21, which uses the memory of the cluster as ...


Abstract

The invention discloses a semantic-based Hadoop system. The system comprises a data acquisition and loading component, a real-time data stream processing component, a storage system component, a bottom layer support component, and a business layer component. The data acquisition and loading component handles data source detection, Internet data acquisition, HTML (HyperText Markup Language) preprocessing, and access to third-party data resources. The real-time data stream processing component processes data streams in real time. The storage system component provides storage through a Hadoop cluster and a MySQL cluster. The bottom layer support component extracts semantic information from text and supports other services that require semantic extraction and semantic analysis, including text processing, text retrieval, and semantic search. The business layer component handles the execution, scheduling, and presentation of specific business tasks, together with the application sets closely tied to specific applications. According to the abstract, the system realizes web-based Hadoop, offers high accuracy and rich semantic information, and is highly practical and suited to industrialization.
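As a rough illustration of the text retrieval service attributed to the bottom layer support component, the sketch below builds a small inverted index. It is an assumption-laden example: the abstract does not say how retrieval is implemented, `tokenize` and `InvertedIndex` are hypothetical names, and this is plain keyword matching rather than the semantic search the abstract also mentions.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Map each token to the set of document ids that contain it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[int]] = defaultdict(set)

    def add(self, doc_id: int, text: str) -> None:
        for token in tokenize(text):
            self._postings[token].add(doc_id)

    def search(self, query: str) -> Set[int]:
        """Return ids of documents containing every query token."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result


if __name__ == "__main__":
    index = InvertedIndex()
    index.add(1, "Big data analysis system")
    index.add(2, "Semantic analysis of web text")
    print(index.search("semantic analysis"))
```

A semantic layer, for example one that expands queries with related terms, could sit in front of `search` without changing the index itself.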

Description

Technical field

[0001] The invention relates to the technical field of data networks, in particular to a semantic-based big data analysis system.

Background technique

[0002] In early 2012, the big data market, including software, hardware, and services, was worth about $5 billion. Over time, the potential of big data will attract growing attention. Enterprises need the relevant analysis capabilities to gain a competitive advantage and improve operational efficiency. As related technologies and services are deployed, the big data market will grow significantly. At present, comparable products on the market focus on analyzing an enterprise's internal data. The value of massive unstructured data, such as text from the web, has not yet been fully developed and utilized, because such data is relatively difficult to acquire and has relatively low unit value.

Contents of the ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/951, G06F40/30
Inventor: 贾岩
Owner: ANHUI HUAZHEN INFORMATION SCI & TECH