- The Symmetric Communication Interface (SCIF, pronounced "skiff") is a
- low-level communications API across PCIe, currently implemented for MIC. SCIF
- provides inter-node communication within a single host platform, where a
- node is a MIC coprocessor or a Xeon-based host. SCIF abstracts the details of
- communicating over the PCIe bus while providing an API that is symmetric
- across all the nodes in the PCIe network. An important design objective for SCIF
- is to deliver the maximum possible performance given the communication
- abilities of the hardware. SCIF has been used to implement an offload compiler
- runtime and OFED support for MPI implementations for MIC coprocessors.
- ==== SCIF API Components ====
- The SCIF API has the following parts:
- 1. Connection establishment using a client-server model
- 2. Byte stream messaging intended for short messages
- 3. Node enumeration to determine online nodes
- 4. Poll semantics for detection of incoming connections and messages
- 5. Memory registration to pin down pages
- 6. Remote memory mapping for low-latency CPU accesses via mmap
- 7. Remote DMA (RDMA) for high-bandwidth DMA transfers
- 8. Fence APIs for RDMA synchronization
- SCIF exposes the notion of a connection which can be used by peer processes on
- nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
- process in a SCIF node initiates a SCIF connection to a peer process on a
- different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
- which are similar to connection-oriented socket APIs. Connected SCIF endpoints
- can also register local memory and then transfer data using DMA, CPU copies,
- or remote memory mapping via mmap. SCIF supports both user-mode and
- kernel-mode clients, which are functionally equivalent.
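Because SCIF messaging is described above as similar to connection-oriented socket APIs, the ordering of calls can be illustrated with ordinary POSIX sockets. The sketch below is an analogue only and does not call SCIF: it uses an AF_UNIX stream socket, keeps both peers in one process for brevity, and leaves the connecting side unbound. The function name `socket_analogue_demo` and the rendezvous path under `/tmp` are hypothetical names chosen for this illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * POSIX socket analogue (NOT SCIF) of an endpoint life cycle:
 * open, bind, listen, connect, accept, send/recv, close.
 * Returns 0 if the short message arrives intact, -1 on any error.
 */
int socket_analogue_demo(void)
{
    const char msg[] = "hello";
    char buf[sizeof(msg)] = {0};
    struct sockaddr_un addr;
    int listener = -1, client = -1, server = -1, rc = -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* Hypothetical rendezvous path, made unique per process. */
    snprintf(addr.sun_path, sizeof(addr.sun_path),
             "/tmp/scif_analogue_%ld.sock", (long)getpid());
    unlink(addr.sun_path);

    listener = socket(AF_UNIX, SOCK_STREAM, 0);  /* cf. scif_open */
    client = socket(AF_UNIX, SOCK_STREAM, 0);    /* cf. scif_open */
    if (listener < 0 || client < 0)
        goto out;

    /* Listening side: bind to a known address, then listen. */
    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;                                /* cf. scif_bind */
    if (listen(listener, 1) < 0)                 /* cf. scif_listen */
        goto out;

    /* Connecting side: the connect is queued on the listen backlog. */
    if (connect(client, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;                                /* cf. scif_connect */
    server = accept(listener, NULL, NULL);       /* cf. scif_accept */
    if (server < 0)
        goto out;

    /* Short message over the established connection. */
    if (send(client, msg, sizeof(msg), 0) != (ssize_t)sizeof(msg))
        goto out;                                /* cf. scif_send */
    if (recv(server, buf, sizeof(buf), MSG_WAITALL) != (ssize_t)sizeof(buf))
        goto out;                                /* cf. scif_recv */

    rc = (memcmp(msg, buf, sizeof(msg)) == 0) ? 0 : -1;
out:
    if (server >= 0)
        close(server);                           /* cf. scif_close */
    if (client >= 0)
        close(client);
    if (listener >= 0)
        close(listener);
    unlink(addr.sun_path);
    return rc;
}
```

Calling `socket_analogue_demo()` from a test program should return 0 when the message round-trips. The actual SCIF calls take different arguments (node and port identifiers rather than socket addresses), so this shows only the sequence of steps.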
- ==== SCIF Performance for MIC ====
- A comparison of DMA bandwidth between the TCP (over Ethernet over PCIe) stack
- and SCIF shows the performance advantages of SCIF for HPC applications and
- runtimes.
- [Figure: Comparison of TCP and SCIF based BW. Y axis: Throughput (GB/sec),
- 0 to 8. X axis: Transfer Size (KBytes), 1 to 100000. Series: PCIe Bandwidth,
- TCP, SCIF.]
- SCIF allows memory sharing via mmap(..) between processes on different PCIe
- nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
- latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
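The idea of sharing memory through mmap so that a peer's data is reached with plain loads and stores can be sketched with a purely local analogue. The code below does not use SCIF or PCIe: it assumes a Linux-style `MAP_SHARED | MAP_ANONYMOUS` mapping shared between a parent and a forked child, and the function name `shared_mapping_demo` is hypothetical. It says nothing about the latency figure quoted above.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Local analogue (NOT SCIF): a page mapped MAP_SHARED is written by a
 * child process and read by the parent with ordinary loads and stores.
 * Returns the 8-byte value the parent observes, or 0 on error.
 */
uint64_t shared_mapping_demo(uint64_t value)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile uint64_t *window;
    uint64_t seen = 0;
    pid_t pid;
    int status;

    if (page <= 0)
        return 0;

    /* One shared page stands in for a mapped memory "window". */
    window = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (window == MAP_FAILED)
        return 0;
    window[0] = 0;

    pid = fork();
    if (pid == 0) {
        window[0] = value;  /* the peer stores an 8-byte message */
        _exit(0);
    }

    /* Waiting for the child ensures its store is visible to the load. */
    if (pid > 0 && waitpid(pid, &status, 0) == pid)
        seen = window[0];

    munmap((void *)window, (size_t)page);
    return seen;
}
```

For example, `shared_mapping_demo(0x1122334455667788ULL)` should return the same value when the mapping and fork succeed. No send or receive call is involved once the mapping exists, which is the property the mmap path relies on.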
- SCIF has a user-space library, a thin IOCTL wrapper that provides a user-space
- API similar to the kernel API in scif.h. The SCIF user-space library
- is distributed at https://software.intel.com/en-us/mic-developer
- Here is some pseudo code for an example of how two applications on two PCIe
- nodes would typically use the SCIF API:
- Process A (on node A)                   Process B (on node B)
- /* get online node information */
- scif_get_node_ids(..)                   scif_get_node_ids(..)
- scif_open(..)                           scif_open(..)
- scif_bind(..)                           scif_bind(..)
- scif_listen(..)
- scif_accept(..)                         scif_connect(..)
- /* SCIF connection established */
- /* Send and receive short messages */
- scif_send(..)/scif_recv(..)             scif_send(..)/scif_recv(..)
- /* Register memory */
- scif_register(..)                       scif_register(..)
- /* RDMA */
- scif_readfrom(..)/scif_writeto(..)      scif_readfrom(..)/scif_writeto(..)
- /* Fence DMAs */
- scif_fence_signal(..)                   scif_fence_signal(..)
- mmap(..)                                mmap(..)
- /* Access remote registered memory */
- /* Close the endpoints */
- scif_close(..)                          scif_close(..)