Method and apparatus for data re-assembly with a high performance network interface

A network interface and high-performance technology, applied in the field of data reassembly with a high-performance network interface. It addresses problems such as CPU stalling, high cost, and inefficient use of system resources, and achieves effects such as avoiding header processing, data copying, and checksumming, and improved cost-effectiveness.

Inactive Publication Date: 2005-09-15
ALACRITECH
Cites 99 · Cited by 177

AI Technical Summary

Benefits of technology

[0013] The present invention offloads network processing tasks from a CPU to a cost-effective intelligent network interface card (INIC). An advantage of this approach is that a vast majority of network message data is moved directly from the INIC into its final destination. Another advantage of this approach is that the data may be moved in a single trip across the system memory bus. The offloading allows the CPU to avoid header processing, data copying, and checksumming. Since network message data does not need to be placed in a CPU cache, the CPU cache may be free for storage of other important instructions or data. Interrupts may be reduced to four interrupts per 64 k SMB read and two interrupts per 64 k SMB write. Other advantages include a reduction of CPU reads over the PCI bus and fewer PCI operations per receive or transmit transaction.

Problems solved by technology

Network processing as it exists today is a costly and inefficient use of system resources. Copying received data with the CPU is particularly expensive because while the CPU is moving the data it can do nothing else; it is typically stalled waiting for the relatively slow memory to satisfy its read and write requests. Even today's advanced pipelining technology does not help in these situations, because pipelining relies on the CPU being able to do useful work while it waits for the memory controller to respond. Moving all this data with the CPU slows the system down even after the move is complete: the data displaces the previous contents of the CPU cache, which will likely need to be pulled back in, stalling the CPU even when we are not performing network processing.

Protocol processing adds further expense. The most obvious cost is calculating the checksum for each TCP segment (or UDP datagram). In addition, the TCP connection object must be located when a given TCP segment arrives, IP header checksums must be calculated, there are buffer and memory management issues, and there is the significant expense of interrupt processing. Each incoming segment may result in an interrupt to the CPU. In the worst case there is one interrupt per frame; while this is possible, it is not terribly likely, and delays in interrupt processing may mean that we are able to process more than one incoming network frame per interrupt. Interrupts nevertheless tend to be very costly to the system: each one flushes the processor pipeline, and while the pipeline is an extremely efficient way of improving CPU performance, it is expensive to get going again after it has been flushed. Finally, each of these interrupts results in expensive register accesses across the peripheral bus (PCI). Whereas a CPU stalled on system memory may wait several hundred nanoseconds, a CPU reading from PCI may be stalled for many microseconds. Most troubling, since interrupt lines are shared on PC-based systems, we may have to perform this expensive PCI read even when the interrupt is not meant for us. Other peripheral bus inefficiencies exist as well.
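The per-segment checksum described above is the standard Internet checksum of RFC 1071, a one's-complement sum over 16-bit words. A minimal software sketch of the computation the CPU would otherwise run over every segment:

```c
#include <stdint.h>
#include <stddef.h>

/* RFC 1071 Internet checksum: one's-complement sum of big-endian
   16-bit words, carries folded back in, result complemented. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1]; /* next 16-bit word */
        data += 2;
        len -= 2;
    }
    if (len == 1)                 /* odd trailing byte, zero-padded */
        sum += (uint32_t)data[0] << 8;
    while (sum >> 16)             /* fold carries into low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

A receiver verifies by summing the data including the stored checksum: a valid packet yields zero. Touching every byte this way is precisely the work the INIC's hardware checksumming removes from the CPU.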

Method used




Embodiment Construction

[0073] In order to keep the system CPU from having to process the packet headers or checksum the packet, this task is performed on the INIC, which presents a challenge. There are more than 20,000 lines of C code that make up the FreeBSD TCP/IP protocol stack, for example. This is more code than could be efficiently handled by a competitively priced network card. Further, as noted above, the TCP/IP protocol stack is complicated enough to consume a 200 MHz Pentium-Pro. In order to perform this function on an inexpensive card, special network processing hardware has been developed instead of simply using a general purpose CPU.

[0074] In order to operate this specialized network processing hardware in conjunction with the CPU, we create and maintain what is termed a context. The context keeps track of information that spans many, possibly discontiguous, pieces of information. When processing TCP/IP data, there are actually two contexts that must be maintained. The first context is requi...
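The shape of such a context can be sketched as a small per-connection record plus a lookup by the TCP 4-tuple, which is the "locate the TCP connection object" step mentioned earlier. This is an illustrative sketch only; the field names and the linear-scan lookup are assumptions, not the patent's design (real hardware would use a CAM or hash table):

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-connection context: enough TCP/IP state for the
   card to place payload data, free of headers, into the host buffer. */
struct tcp_context {
    uint32_t saddr, daddr;   /* IP source/destination addresses */
    uint16_t sport, dport;   /* TCP source/destination ports */
    uint32_t rcv_nxt;        /* next expected sequence number */
    void    *dest;           /* host destination buffer for payload */
    size_t   dest_off;       /* current offset into that buffer */
};

/* Locate the context for an arriving segment by its 4-tuple. */
struct tcp_context *ctx_lookup(struct tcp_context *tbl, size_t n,
                               uint32_t saddr, uint32_t daddr,
                               uint16_t sport, uint16_t dport)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].saddr == saddr && tbl[i].daddr == daddr &&
            tbl[i].sport == sport && tbl[i].dport == dport)
            return &tbl[i];
    return NULL; /* no context: fall back to slow-path host processing */
}
```

When no context is held by the card, the segment is handed to the host stack unmodified, which is the slow-path fallback the abstract describes.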



Abstract

An intelligent network interface card (INIC) or communication processing device (CPD) works with a host computer for data communication. The device provides a fast-path that avoids protocol processing for most messages, greatly accelerating data transfer and offloading time-intensive processing tasks from the host CPU. The host retains a fallback processing capability for messages that do not fit fast-path criteria, with the device providing assistance such as validation even for slow-path messages, and messages being selected for either fast-path or slow-path processing. A context for a connection is defined that allows the device to move data, free of headers, directly to or from a destination or source in the host. The context can be passed back to the host for message processing by the host. The device contains specialized hardware circuits that are much faster at their specific tasks than a general purpose CPU. A preferred embodiment includes a trio of pipelined processors devoted to transmit, receive and utility processing, providing full duplex communication for four Fast Ethernet nodes.
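The fast-path/slow-path selection can be illustrated with a simple eligibility check: a frame qualifies for the fast path only when it is a plain, in-order TCP data segment on a connection the card already holds. The criteria below are illustrative assumptions, not the patent's full test list:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what the card has learned about a frame. */
struct frame_info {
    bool ip_fragmented;    /* IP MF flag set or nonzero fragment offset */
    bool ip_has_options;   /* IP header longer than 20 bytes */
    bool tcp_has_flags;    /* SYN/FIN/RST/URG present */
    bool checksum_ok;      /* hardware-verified TCP and IP checksums */
    bool ctx_found;        /* connection context held by the card */
    uint32_t seq, expected_seq;
};

/* Fast path only for in-order data on a known, clean connection;
   anything unusual is punted to the host's slow path. */
bool fast_path_eligible(const struct frame_info *f)
{
    return f->ctx_found && f->checksum_ok &&
           !f->ip_fragmented && !f->ip_has_options &&
           !f->tcp_has_flags &&
           f->seq == f->expected_seq;
}
```

Keeping the fast-path test narrow is the design point: the card only needs hardware for the common case, while every exceptional frame still gets full, correct treatment from the host stack.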

Description

CROSS REFERENCE TO RELATED APPLICATIONS [0001] This application claims the benefit under 35 U.S.C. §120 of (is a continuation of) U.S. patent application Ser. No. 10/005,536, filed Nov. 7, 2001, which in turn claims the benefit under 35 U.S.C. §120 of (is a continuation of) U.S. patent application Ser. No. 09/384,792, filed Aug. 27, 1999, now U.S. Pat. No. 6,434,620, which in turn: 1) claims the benefit under 35 U.S.C. §119 of Provisional Patent Application Ser. No. 60/098,296, filed Aug. 27, 1998, 2) claims the benefit under 35 U.S.C. §120 of (is a continuation-in-part of) U.S. patent application Ser. No. 09/067,544, filed Apr. 27, 1998, now U.S. Pat. No. 6,226,680, and 3) claims the benefit under 35 U.S.C. §120 of (is a continuation-in-part of) U.S. patent application Ser. No. 09/141,713, filed Aug. 28, 1998, now U.S. Pat. No. 6,389,479. [0002] U.S. Pat. No. 6,226,680 and U.S. Pat. No. 6,389,479 both claim the benefit under 35 U.S.C. §119 of Provisional Patent Application Ser. No....

Claims


Application Information

Patent Type & Authority: Application (United States)
IPC (IPC8): H04L12/56, H04L45/74, H04L47/36, H04L49/901, H04L49/9015
CPC: H04L12/5693, H04L29/0653, H04L29/12009, H04L29/12018, H04L47/10, H04L47/193, H04L47/36, H04L47/6225, H04L49/90, H04L49/901, H04L49/9042, H04L49/9052, H04L49/9063, H04L49/9073, H04L49/9094, H04L61/10, H04L67/34, H04L69/16, H04L69/166, H04L67/325, H04L67/10, H04L67/327, H04L69/22, H04L69/08, H04L69/161, H04L69/163, H04L69/12, H04L69/32, H04L29/06, H04L47/50, G06F13/28, G06F13/4022, G06F13/4221, H04L61/00, H04L67/62, H04L67/63, H04L9/40, H04L49/9021, H04L49/9068, H04L45/66, H04L45/745, H04L69/162
Inventor: PHILBRICK, CLIVE M.; CRAFT, PETER K.; HIGGEN, DAVID A.; STARR, DARYL D.; BLIGHTMAN, STEPHEN E. J.; BOUCHER, LAURENCE B.
Owner ALACRITECH