Segmentation Offloads in the Linux Networking Stack

Introduction
============

This document describes a set of techniques in the Linux networking stack
to take advantage of segmentation offload capabilities of various NICs.

The following technologies are described:
 * TCP Segmentation Offload - TSO
 * UDP Fragmentation Offload - UFO
 * IPIP, SIT, GRE, and UDP Tunnel Offloads
 * Generic Segmentation Offload - GSO
 * Generic Receive Offload - GRO
 * Partial Generic Segmentation Offload - GSO_PARTIAL

TCP Segmentation Offload
========================

TCP segmentation allows a device to segment a single frame into multiple
frames with a data payload size specified in skb_shinfo()->gso_size.
When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
skb_shinfo()->gso_size should be set to a non-zero value.
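
The arithmetic implied by gso_size can be sketched in userspace C. The
helper names tso_segment_count and tso_segment_len are hypothetical and are
not kernel APIs; the sketch only assumes that every segment except the last
carries exactly gso_size bytes of payload.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a kernel API: number of segments produced when
 * payload_len bytes of TCP payload are cut into pieces of at most gso_size
 * bytes. */
static size_t tso_segment_count(size_t payload_len, size_t gso_size)
{
	assert(gso_size > 0);	/* the text requires a non-zero gso_size */
	return (payload_len + gso_size - 1) / gso_size;
}

/* Hypothetical helper: payload bytes carried by segment idx (0-based). */
static size_t tso_segment_len(size_t payload_len, size_t gso_size, size_t idx)
{
	size_t off = idx * gso_size;

	assert(gso_size > 0 && off < payload_len);
	return payload_len - off < gso_size ? payload_len - off : gso_size;
}
```

For example, a 4000 byte payload with a gso_size of 1448 yields three
segments, the last of which carries the remaining 1104 bytes.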

TCP segmentation is dependent on support for the use of partial checksum
offload. For this reason TSO is normally disabled if the Tx checksum
offload for a given device is disabled.

In order to support TCP segmentation offload, it is necessary to populate
the network and transport header offsets of the skbuff so that the device
drivers will be able to determine the offsets of the IP or IPv6 header and
the TCP header. In addition, as CHECKSUM_PARTIAL is required, csum_start
should also point to the TCP header of the packet.
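
As an illustration of the offsets involved, the following userspace sketch
derives them for an untagged Ethernet + IPv4 + TCP frame. struct
frame_offsets and tso_offsets are hypothetical names, and the offsets are
measured from the start of the frame, which is a simplification of how an
skbuff stores them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14	/* untagged Ethernet header, assumed for this sketch */

/* Hypothetical stand-in for the offsets an skbuff carries. */
struct frame_offsets {
	size_t network;		/* start of the IPv4 header */
	size_t transport;	/* start of the TCP header */
	size_t csum_start;	/* where checksumming begins */
};

/* Derive the offsets from a raw Ethernet + IPv4 + TCP frame. */
static struct frame_offsets tso_offsets(const uint8_t *frame)
{
	struct frame_offsets o;
	size_t ihl;

	assert(frame != NULL);
	ihl = frame[ETH_HDR_LEN] & 0x0f;	/* IPv4 header length, 32-bit words */
	assert(ihl >= 5);

	o.network = ETH_HDR_LEN;
	o.transport = o.network + ihl * 4;
	o.csum_start = o.transport;	/* checksum starts at the TCP header */
	return o;
}
```

With a 20 byte IPv4 header the TCP header, and therefore csum_start, sits at
offset 34; IPv4 options push it further out.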

For IPv4 segmentation we support one of two types in terms of the IP ID.
The default behavior is to increment the IP ID with every segment. If the
GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
ID and all segments will use the same IP ID. If a device has
NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
and we will either increment the IP ID for all frames, or leave it at a
static value based on driver preference.
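
The two IP ID behaviors can be modeled as follows. tso_fill_ip_ids is a
hypothetical userspace helper; its fixed_id argument stands in for
SKB_GSO_TCP_FIXEDID being set in gso_type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill ids[] with the IPv4 ID each of nsegs segments
 * would carry, starting from the ID found in the original header. */
static void tso_fill_ip_ids(uint16_t base_id, bool fixed_id,
			    uint16_t *ids, size_t nsegs)
{
	assert(ids != NULL);
	for (size_t i = 0; i < nsegs; i++) {
		/* default: increment per segment, wrapping modulo 2^16 */
		ids[i] = fixed_id ? base_id : (uint16_t)(base_id + i);
	}
}
```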

UDP Fragmentation Offload
=========================

UDP fragmentation offload allows a device to fragment an oversized UDP
datagram into multiple IPv4 fragments. Many of the requirements for UDP
fragmentation offload are the same as TSO. However, the IPv4 ID for the
fragments should not increment, because all of them belong to a single
IPv4 datagram.
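
A minimal sketch of the resulting fragment layout, assuming standard IPv4
fragmentation rules (offsets in 8 byte units, MF set on every fragment but
the last). struct ipv4_frag and ufo_fragment are hypothetical names, not
kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one IPv4 fragment. */
struct ipv4_frag {
	uint16_t id;		/* identical for every fragment */
	uint16_t frag_off;	/* offset in 8-byte units */
	bool more_fragments;	/* MF flag */
	size_t len;		/* IP payload bytes in this fragment */
};

/* Split ip_payload_len bytes (UDP header plus data) into fragments that
 * carry at most max_frag bytes each. Returns the number of fragments
 * written, or 0 if out[] is too small. */
static size_t ufo_fragment(uint16_t id, size_t ip_payload_len,
			   size_t max_frag, struct ipv4_frag *out,
			   size_t out_cap)
{
	/* non-final fragments must carry a multiple of 8 bytes */
	size_t step = max_frag & ~(size_t)7;
	size_t n = 0;

	assert(out != NULL && step > 0);
	for (size_t off = 0; off < ip_payload_len; off += step) {
		size_t left = ip_payload_len - off;

		if (n == out_cap)
			return 0;
		out[n].id = id;
		out[n].frag_off = (uint16_t)(off / 8);
		out[n].more_fragments = left > step;
		out[n].len = left > step ? step : left;
		n++;
	}
	return n;
}
```

Note that the ID is copied unchanged into every fragment, in contrast to the
default TSO behavior.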

IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
========================================================

In addition to the offloads described above, it is possible for a frame to
contain additional headers such as an outer tunnel. In order to account
for such instances, an additional set of segmentation offload types was
introduced, including SKB_GSO_IPIP, SKB_GSO_SIT, SKB_GSO_GRE, and
SKB_GSO_UDP_TUNNEL. These extra segmentation types are used to identify
cases where there is more than one set of headers. For example, in the
case of IPIP and SIT we should have the network and transport headers moved
from the standard list of headers to "inner" header offsets.

Currently only two levels of headers are supported. The convention is to
refer to the tunnel headers as the outer headers, while the encapsulated
data is normally referred to as the inner headers. Below is the list of
calls to access the given headers:

IPIP/SIT Tunnel:
            Outer                   Inner
MAC         skb_mac_header
Network     skb_network_header      skb_inner_network_header
Transport   skb_transport_header

UDP/GRE Tunnel:
            Outer                   Inner
MAC         skb_mac_header          skb_inner_mac_header
Network     skb_network_header      skb_inner_network_header
Transport   skb_transport_header    skb_inner_transport_header
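
To make the outer/inner split concrete, the sketch below computes where
each header of a UDP-tunneled frame would start. It assumes untagged
Ethernet and option-free IPv4 at both levels; struct tunnel_offsets and
udp_tunnel_offsets are hypothetical names that mirror the accessors listed
above, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_LEN  14	/* untagged Ethernet header */
#define IPV4_LEN 20	/* IPv4 header without options */
#define UDP_LEN  8	/* UDP header */

/* Hypothetical mirror of the outer and inner header offsets. */
struct tunnel_offsets {
	size_t mac, network, transport;				/* outer */
	size_t inner_mac, inner_network, inner_transport;	/* inner */
};

/* Fill *o for a frame laid out as:
 * outer Ethernet | outer IPv4 | outer UDP | tunnel header |
 * inner Ethernet | inner IPv4 | inner transport header */
static void udp_tunnel_offsets(size_t tunnel_hdr_len,
			       struct tunnel_offsets *o)
{
	assert(o != NULL);
	o->mac = 0;
	o->network = o->mac + ETH_LEN;
	o->transport = o->network + IPV4_LEN;
	o->inner_mac = o->transport + UDP_LEN + tunnel_hdr_len;
	o->inner_network = o->inner_mac + ETH_LEN;
	o->inner_transport = o->inner_network + IPV4_LEN;
}
```

With an 8 byte tunnel header the inner MAC header starts at offset 50 and
the inner transport header at offset 84.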

In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
SKB_GSO_UDP_TUNNEL_CSUM. These two additional tunnel types indicate that
the outer header also requests a non-zero checksum.

Finally there is SKB_GSO_REMCSUM, which indicates that a given tunnel
header has requested a remote checksum offload. In this case the inner
headers will be left with a partial checksum and only the outer header
checksum will be computed.

Generic Segmentation Offload
============================

Generic segmentation offload is a pure software offload that is meant to
deal with cases where device drivers cannot perform the offloads described
above. What occurs in GSO is that a given skbuff will have its data broken
out over multiple skbuffs that have been resized to match the MSS provided
via skb_shinfo()->gso_size.
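
The splitting step can be illustrated with a toy userspace model. The
struct pkt type, its fixed 4 byte header, and gso_segment are all
hypothetical; real GSO operates on skbuffs and also rewrites header fields
such as lengths, sequence numbers and checksums, which this sketch omits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HDR_LEN 4	/* toy header length, assumed for illustration */
#define MAX_PKT 64	/* toy buffer size */

/* Hypothetical packet: HDR_LEN header bytes followed by payload. */
struct pkt {
	unsigned char data[MAX_PKT];
	size_t len;
};

/* Split *in into packets that each carry a copy of the header and at most
 * mss payload bytes. Returns the number of packets written, or 0 if out[]
 * is too small. */
static size_t gso_segment(const struct pkt *in, size_t mss,
			  struct pkt *out, size_t out_cap)
{
	size_t payload, n = 0;

	assert(in != NULL && out != NULL);
	assert(mss > 0 && HDR_LEN + mss <= MAX_PKT);
	assert(in->len >= HDR_LEN && in->len <= MAX_PKT);

	payload = in->len - HDR_LEN;
	for (size_t off = 0; off < payload; off += mss) {
		size_t chunk = payload - off < mss ? payload - off : mss;

		if (n == out_cap)
			return 0;
		memcpy(out[n].data, in->data, HDR_LEN);
		memcpy(out[n].data + HDR_LEN, in->data + HDR_LEN + off, chunk);
		out[n].len = HDR_LEN + chunk;
		n++;
	}
	return n;
}
```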

Before any hardware segmentation offload is enabled, a corresponding
software offload is required in GSO. Otherwise it becomes possible for a
frame to be re-routed between devices and end up being unable to be
transmitted.
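
The reason for the software fallback can be sketched as a feature check at
transmit time. The flag names and pick_tx_path below are hypothetical
stand-ins for illustration, not the kernel's feature bits or functions.

```c
#include <assert.h>

/* Hypothetical offload bits, for illustration only. */
enum {
	F_TSO        = 1 << 0,
	F_UDP_TUNNEL = 1 << 1,
};

enum tx_path { TX_HW_OFFLOAD, TX_SW_GSO };

/* Decide how a frame that needs the offloads in pkt_needs is segmented on
 * a device advertising dev_features. */
static enum tx_path pick_tx_path(unsigned int pkt_needs,
				 unsigned int dev_features)
{
	assert(pkt_needs != 0);	/* only frames needing segmentation get here */

	/* Hardware may segment only if it supports every required offload;
	 * otherwise the software implementation has to do the work. */
	return (pkt_needs & ~dev_features) == 0 ? TX_HW_OFFLOAD : TX_SW_GSO;
}
```

A frame built for a tunnel-capable device but re-routed to a device that
only offers plain TSO takes the TX_SW_GSO path in this model; without a
software implementation it would have nowhere to go.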

Generic Receive Offload
=======================

Generic receive offload is the complement to GSO. Ideally any frame
assembled by GRO should be segmented to create an identical sequence of
frames using GSO, and any sequence of frames segmented by GSO should be
able to be reassembled back to the original by GRO. The only exception to
this is the IPv4 ID in the case that the DF bit is set for a given IP
header. If the value of the IPv4 ID is not sequentially incrementing, it
will be altered so that it is incrementing when a frame assembled via GRO
is segmented via GSO.
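
The receive-side counterpart can be modeled in the same toy fashion. struct
pkt and gro_merge are hypothetical, and the only merge condition checked
here is an identical 4 byte header; real GRO performs many more checks per
protocol before it coalesces frames.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define HDR_LEN 4	/* toy header length, assumed for illustration */
#define MAX_PKT 64	/* toy buffer size */

/* Hypothetical packet: HDR_LEN header bytes followed by payload. */
struct pkt {
	unsigned char data[MAX_PKT];
	size_t len;
};

/* Append the payload of *seg to *agg when both carry the same header and
 * the result still fits. Returns true if the segment was merged. */
static bool gro_merge(struct pkt *agg, const struct pkt *seg)
{
	size_t payload;

	assert(agg != NULL && seg != NULL);
	assert(agg->len >= HDR_LEN && agg->len <= MAX_PKT);
	assert(seg->len >= HDR_LEN && seg->len <= MAX_PKT);

	payload = seg->len - HDR_LEN;
	if (memcmp(agg->data, seg->data, HDR_LEN) != 0)
		return false;	/* different flow in this toy model */
	if (agg->len + payload > MAX_PKT)
		return false;	/* aggregate would overflow */

	memcpy(agg->data + agg->len, seg->data + HDR_LEN, payload);
	agg->len += payload;
	return true;
}
```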

Partial Generic Segmentation Offload
====================================

Partial generic segmentation offload is a hybrid between TSO and GSO. What
it effectively does is take advantage of certain traits of TCP and tunnels
so that, instead of having to rewrite the packet headers for each segment,
only the inner-most transport header and possibly the outer-most network
header need to be updated. This allows devices that do not support tunnel
offloads or tunnel offloads with checksum to still make use of segmentation.

With the partial offload, all headers excluding the inner transport header
are updated such that they contain the values that would be correct if the
header were simply duplicated. The one exception to this is the outer IPv4
ID field. It is up to the device drivers to guarantee that the IPv4 ID
field is incremented in the case that a given header does not have the DF
bit set.