Author: eilemann@gmail.com
State: Implemented in 0.3
Overview
InfiniBand (IB) is supported using the Sockets Direct Protocol (SDP) layer. This layer is transparent to socket-based applications, such as the Equalizer network layer. This document describes the setup needed to use IB hardware with Equalizer applications.
Windows XP
Overview
The usage of InfiniBand requires a Mellanox card, since Mellanox is the only vendor providing an SDP implementation. Cards from other vendors are often Mellanox cards with a different firmware and can be 'converted' by flashing a Mellanox firmware. The Mellanox Windows SDP implementation is described in great detail in the paper 'Architecture and Implementation of Sockets Direct Protocol in Windows'.
By default, the Mellanox Win32 implementation does not allow using SDP and TCP/IP sockets concurrently. That means that all machines have to be connected using InfiniBand, and only SDP IP addresses are to be used for all connections. See the section SDP and TCP Coexistence below for how to mix both socket types.
Installation and Configuration
The following steps are necessary to use SDP on Windows:
- Download and install the WinIB Software from Mellanox (www.mellanox.com, go to Product: Software: WinIB). The version tested was 1.3.0.
- At the end of the installer, select 'Enable Socket Direct Protocol'.
- Set up the network ports with private IP addressing, e.g., address 10.1.1.x, netmask 255.255.255.0, no default gateway or DNS.
- If you are running a switchless configuration, start one subnet manager per IB port (%ProgramFiles%/Mellanox/WinIB/bin/OpenSM.exe).
- Start the SDP service: net start sdp
- Set all Equalizer applications to be 'SDP Applications'. This is done using the environment variable SdpApplications. For eqPly, it has to be set to eqServer.exe;eqPly.exe.
- Optional: Verify network performance using netPerf. Set SdpApplications to netPerf.exe, then run one server (netPerf.exe -s IB_ADDR) on one machine and a client (netPerf.exe -c SERVER_IB_ADDR) on another machine. On SDR IB hardware, the performance is around 700-800 MB/s using SDP, as compared to 100-200 MB/s using IPoIB.
- Use the IP addresses, as configured above, as the hostname for the node connection settings in the configuration files.
- Start the server using the SDP server address, for example: eqServer.exe config.eqc --eq-listen 10.1.1.x
- Start the application so that it connects to the SDP server address and uses the local SDP address as listening address (instead of localhost), for example: eqPly.exe -- --eq-server 10.1.1.x --eq-listen 10.1.1.x
- Depending on the CPU speed, disabling image compression might improve performance. Compression is enabled in definesWin32.h by EQ_USE_COMPRESSION.
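The launch steps above can be scripted. The following is a minimal Python sketch, not part of Equalizer: the helper names sdp_environment, server_command and client_command are hypothetical, and it assumes that setting SdpApplications in the environment of the launched process is sufficient, which the text does not confirm (the SDP provider may read it from the system environment instead).

```python
import os


def sdp_environment(executables, base_env=None):
    """Return a copy of the environment with SdpApplications set.

    Assumption: the variable is honoured when set per process.
    """
    env = dict(os.environ if base_env is None else base_env)
    # The text gives the value for eqPly as "eqServer.exe;eqPly.exe",
    # so executable names are joined with semicolons.
    env["SdpApplications"] = ";".join(executables)
    return env


def server_command(config_file, listen_address):
    """Argument list for the server, listening on its SDP address."""
    return ["eqServer.exe", config_file, "--eq-listen", listen_address]


def client_command(server_address, listen_address):
    """Argument list for eqPly, mirroring the example command line."""
    return ["eqPly.exe", "--",
            "--eq-server", server_address,
            "--eq-listen", listen_address]
```

The returned environment and argument lists can be passed to subprocess.Popen(command, env=env), with the 10.1.1.x placeholder replaced by the address configured on each machine.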
SDP and TCP Coexistence
Starting with WinIB 1.4, it is possible to use both SDP and TCP sockets in one application process. The applications have to specify the socket type explicitly, which allows mixing TCP and SDP sockets. Please refer to the Mellanox documentation on how to activate the 'mixed SDP applications' mode. Revisions 1001 to 1010 implement support for multiple node connections: if a node (or the server) has multiple connections specified in the configuration file, it now listens on all of them, which allows mixed connection types. When a node connects to another node, the connections are tried in the order in which they are specified. Therefore, connections should be configured in fastest-to-slowest order.
Transparent SDP support, as described above, still works the same way. Explicit SDP support is used by leaving the variable SdpApplications unset and explicitly configuring SDP socket connections:
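The ordering rule can be illustrated with a small Python sketch. It is not the Equalizer implementation: connect_first and the try_connect callback are hypothetical names, and the sketch only shows why the fastest connection should be listed first, since the first description that succeeds is the one used.

```python
from typing import Callable, Iterable, Optional, TypeVar

D = TypeVar("D")  # a connection description, e.g. "10.1.1.x:SDP"
C = TypeVar("C")  # an established connection


def connect_first(descriptions: Iterable[D],
                  try_connect: Callable[[D], Optional[C]]) -> Optional[C]:
    """Try each description in its configured order.

    Returns the first connection that could be established, or None
    if every attempt failed. Later (slower) entries are only reached
    when all earlier ones have failed.
    """
    for description in descriptions:
        connection = try_connect(description)
        if connection is not None:
            return connection
    return None
```

With a node configured as SDP first and TCPIP second, a peer without SDP capabilities fails the first attempt and falls back to the TCP connection.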
- Unset SdpApplications.
- Set the global connection type to SDP (EQ_CONNECTION_IATTR_TYPE SDP in config file globals).
- Start the server as usual, i.e., without the --eq-listen parameter. The server will create one default connection (SDP), unless connections are specified explicitly in the configuration file.
- Use SDP IP addresses for the nodes in the config file.
- Start the client with eqPly.exe -- --eq-server 10.1.1.x:SDP --eq-listen 10.1.1.x:SDP (note the :SDP qualifier, which is needed for the client to use SDP, since it is unaware of the configuration file during startup).
- Optionally configure (secondary) TCP connections. Note that when a node has no SDP capabilities, the server and all nodes it connects to (potentially all in the configuration) need to have a secondary TCP connection configured.
Implementation
The current (1.3.0) Mellanox Windows SDP implementation only supports a subset of the Windows socket interface. To work within this subset, the eq::net socket connection was rewritten to use overlapped (asynchronous) IO in svn revision 951. TCP/IP connections also exhibit a performance increase due to the use of overlapped IO.
The --eq-listen parsing was extended to support a simplified connection description in the form hostname(:port)(:type). Port is an unsigned short value, type is either SDP or TCPIP.
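To make the hostname(:port)(:type) form concrete, here is a minimal Python sketch of a parser for it. It is an illustration only and does not reproduce the Equalizer parser: the names parse_connection and ConnectionDescription are hypothetical, and the sketch assumes that the port precedes the type, that type names are case-sensitive, and that hostnames contain no colons (so no IPv6 literals).

```python
from typing import NamedTuple, Optional

VALID_TYPES = {"SDP", "TCPIP"}  # the two types named in the text


class ConnectionDescription(NamedTuple):
    hostname: str
    port: Optional[int]  # None when no port was given
    type: Optional[str]  # None when no type was given


def parse_connection(text: str) -> ConnectionDescription:
    """Parse 'hostname(:port)(:type)'; port and type are optional."""
    hostname, *fields = text.split(":")
    if not hostname:
        raise ValueError(f"missing hostname in {text!r}")
    if len(fields) > 2:
        raise ValueError(f"too many fields in {text!r}")

    port = None
    conn_type = None
    for field in fields:
        is_number = field.isascii() and field.isdigit()
        if is_number and port is None and conn_type is None:
            port = int(field)
            if port > 0xFFFF:  # must fit into an unsigned short
                raise ValueError(f"port out of range in {text!r}")
        elif field in VALID_TYPES and conn_type is None:
            conn_type = field
        else:
            raise ValueError(f"unexpected field {field!r} in {text!r}")
    return ConnectionDescription(hostname, port, conn_type)
```

For the client command line shown earlier, parse_connection("10.1.1.x:SDP") yields the hostname 10.1.1.x, no port, and the type SDP; how Equalizer fills in a missing port or type is not covered by this sketch.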
The interpretation of the node configuration has changed. When multiple connections are described, they are all set up as listening connections during Node::initLocal(). Therefore, nodes can now listen on multiple sockets for incoming connections.
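The idea of one node listening on several sockets at once can be sketched in Python with the standard selectors module. This is a generic illustration using plain TCP sockets, not the eq::net code and not SDP; the function names open_listeners and accept_ready are hypothetical.

```python
import selectors
import socket


def open_listeners(addresses):
    """Bind one listening TCP socket per (host, port) pair.

    All sockets are registered with a single selector so that one
    loop can wait for incoming connections on any of them.
    """
    selector = selectors.DefaultSelector()
    listeners = []
    for host, port in addresses:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind((host, port))
        sock.listen()
        selector.register(sock, selectors.EVENT_READ)
        listeners.append(sock)
    return selector, listeners


def accept_ready(selector, timeout=1.0):
    """Accept one pending connection on each listener that is ready.

    Returns a list of (listener_address, connection) pairs.
    """
    accepted = []
    for key, _events in selector.select(timeout):
        connection, _peer = key.fileobj.accept()
        accepted.append((key.fileobj.getsockname(), connection))
    return accepted
```

Each configured connection description would correspond to one entry in addresses; a peer may then reach the node through whichever of them it can use.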
The connection setup code between two nodes has been improved.
The config file now accepts connection descriptions for the server section.
Related SVN changes: 951 952 953 1001 1002 1006 1007 1009 1010
Known Bugs
When using SDP, the client process hangs upon exit. We are investigating this issue. Defining EQ_WIN32_SDP_JOIN_WAR in definesWin32.h enables a workaround.
Linux
The Linux usage of SDP is very similar to the Windows usage described above. For the implicit, transparent mode just LD_PRELOAD the SDP library (typically libsdp.so) instead of setting SdpApplications. For explicit support, follow the same configuration as for Windows.
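For the transparent mode on Linux, the preload can be set per process. The following Python sketch builds such an environment; the helper name with_sdp_preload is hypothetical, and the library name libsdp.so is the typical one mentioned above, which may differ on a given installation.

```python
import os


def with_sdp_preload(base_env=None, library="libsdp.so"):
    """Return a copy of the environment that preloads the SDP library.

    An existing LD_PRELOAD value is kept; the SDP library is placed
    in front of it, separated by a colon.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    env["LD_PRELOAD"] = f"{library}:{existing}" if existing else library
    return env
```

The result can be passed as the env argument of subprocess.Popen when starting the server and the application, so that other processes on the machine are not affected.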