Storage efficiency is the ability to store and manage data so that it consumes the least amount of space with little to no impact on performance, resulting in a lower total operational cost. Efficiency addresses the real-world demands of managing costs, reducing complexity, and limiting risk. The Storage Networking Industry Association (SNIA) defines storage efficiency in the SNIA Dictionary as follows: storage efficiency = (effective capacity + free capacity) / (raw capacity).
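The SNIA ratio above can be evaluated directly. The following is a minimal sketch in Python; the function name, the parameter names, and the sample capacities are illustrative assumptions and do not come from the SNIA Dictionary. All three capacities are assumed to be expressed in the same unit.

```python
def storage_efficiency(effective_capacity, free_capacity, raw_capacity):
    """Return (effective capacity + free capacity) / raw capacity.

    The function and parameter names are hypothetical; only the ratio
    itself comes from the definition quoted in the text.
    """
    if raw_capacity <= 0:
        raise ValueError("raw_capacity must be positive")
    return (effective_capacity + free_capacity) / raw_capacity


if __name__ == "__main__":
    # Made-up figures in TB: 60 effective, 20 free, 100 raw.
    print(storage_efficiency(60, 20, 100))
```

Because effective capacity counts the data stored after any space-saving techniques are applied, the ratio can exceed 1 on systems that deduplicate or compress heavily.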

Similar web page duplicate-removing system based on parallel programming mode

The invention provides a similar web page duplicate-removing system based on a parallel programming mode. It comprises a web page content pre-processing module, a web page eigenvector extracting module, a web page feature fingerprint calculation module, a web page fingerprint online duplicate-removing module, a web page fingerprint distributed batch duplicate-removing module, and a specific distributed computing platform. For web pages obtained by web crawlers, the system performs steps such as unified conversion of text content encoding, standardization of document structure, removal of web page noise content, analysis and identification of thematic content, and lexical segmentation of continuous text, thereby forming eigenvectors that represent the web pages. Relevant algorithms can then be applied to these vectors to obtain web page fingerprints that represent the pages' characteristics. Across massive amounts of Internet data, the system accurately and quickly detects exact or near-duplicate web page content caused by site mirroring, reposting of web documents, and the like, and removes the duplicates, thereby improving the storage efficiency of search engines and giving their users a better experience.
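The abstract does not name the fingerprint algorithm it uses. As an illustration only, the sketch below uses SimHash, a widely used near-duplicate fingerprint, as a stand-in. The tokenizer, the 64-bit width, the use of MD5 as the token hash, and the distance threshold of 3 are all assumptions rather than details of the patented system.

```python
import hashlib
import re


def tokenize(text):
    """Very rough lexical segmentation: lowercase word tokens (assumption)."""
    return re.findall(r"\w+", text.lower())


def simhash(tokens, bits=64):
    """Compute a SimHash fingerprint from a list of tokens."""
    weights = [0] * bits
    for tok in tokens:
        # Take the first 8 bytes of MD5 as a 64-bit token hash.
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a, text_b, max_distance=3):
    """Treat two pages as duplicates if their fingerprints are close."""
    fa = simhash(tokenize(text_a))
    fb = simhash(tokenize(text_b))
    return hamming_distance(fa, fb) <= max_distance


if __name__ == "__main__":
    page = "Storage efficiency matters for search engines."
    mirror = "Storage efficiency matters for search engines."
    print(is_near_duplicate(page, mirror))
```

In a batch setting, fingerprints like these could be grouped by bit prefix and compared within each group in parallel, which is one plausible reading of the distributed batch module. The abstract itself does not specify the mechanism.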

All-service CDN system based on HTTP and working method thereof

Status: Inactive. Publication number: CN104320410A. Tags: Guaranteed instant; Guaranteed proximity; Transmission; Special data processing applications; Traffic capacity; High availability.
The invention discloses an all-service CDN system based on HTTP and its working method. They address the problems that a traditional CDN supports only one service, has only one scheduling policy, and can hardly meet the demands of mass digital media services. The system is mainly composed of a stream media processing module, a center node, an edge acceleration caching node, and an intelligent DNS module. The stream media processing module can process various media streams and unify them into stream media segment files that can be distributed on the HTTP-based CDN. The center node supports both active push and passive pull of content within the CDN system. The edge acceleration caching node's unique P2P-based caching function improves the storage efficiency and hit rate of the server cluster. The intelligent DNS module performs global scheduling based on node load and link state, which shortens the delay users experience when accessing a CDN node, reduces traffic on backbone networks, and guarantees the high availability and stability of the CDN.
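The abstract says only that the intelligent DNS module schedules on node load and link state. One hypothetical way to combine the two signals is a weighted score, sketched below. The `EdgeNode` fields, the weighting, and the latency cap are assumptions made for illustration, not the patent's method.

```python
from dataclasses import dataclass


@dataclass
class EdgeNode:
    """Hypothetical view of an edge node as seen by the scheduler."""
    name: str
    load: float          # fraction of capacity in use, 0.0 to 1.0
    latency_ms: float    # measured link latency to the requesting user
    healthy: bool = True


def pick_node(nodes, load_weight=0.5, max_latency_ms=200.0):
    """Return the healthy node with the lowest combined load/latency score."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        raise LookupError("no healthy edge node available")

    def score(node):
        # Normalize latency to 0..1 so it is comparable with load.
        latency_part = min(node.latency_ms / max_latency_ms, 1.0)
        return load_weight * node.load + (1 - load_weight) * latency_part

    return min(candidates, key=score)


if __name__ == "__main__":
    nodes = [
        EdgeNode("edge-a", load=0.9, latency_ms=20),
        EdgeNode("edge-b", load=0.2, latency_ms=40),
        EdgeNode("edge-c", load=0.1, latency_ms=10, healthy=False),
    ]
    print(pick_node(nodes).name)
```

Excluding unhealthy nodes before scoring reflects the availability goal stated in the abstract. A real scheduler would also need to refresh load and link measurements continuously.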
