Following up on a prior blog article: “Making a case for a disaggregated storage architecture”.
NVM Express over Fabrics (NVMe-oF) is an emerging standard designed to access storage over a network. Since it offers high performance and low latency, it has captured the imagination of the industry. Implementations are available for popular operating systems, such as Linux and Windows. They allow remote access to storage (especially NVMe Flash storage) with latency and performance very similar to locally-attached storage.
However, before you start testing and deploying NVMe-oF, there are some specifics of the technology that you should be aware of:
1. Know the Language
Specifications and tutorials talk about targets, initiators and discovery services. What are they?
- The target is the machine hosting the storage. It includes the NVMe-oF target and the media. It allows configuration of what storage is available, and how.
- The initiator is the system accessing the remote storage. Note that today’s standard only defines a point-to-point connection between a target and an initiator. It includes no synchronization between two initiators that might access the same media. This is not an issue if they access different storage exposed by the same target.
- Discovery is a service run by the target that allows the initiator to find out what storage is available. Currently you need to know the address of the discovery service. There are multiple initiatives to improve and build upon basic NVMe-oF discovery, and to allow new functionality such as notifications and network management. A notable example is the SNIA Swordfish protocol, a RESTful protocol that allows NVMe-oF network configuration, among other features.
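To make the three roles concrete, here is a minimal sketch of how an initiator might query a discovery service and then connect to a target using the Linux `nvme-cli` tool. It only builds the command lines and does not run them. The address `192.0.2.10`, the subsystem name `nqn.2018-01.com.example:subsys1` and the helper function names are illustrative assumptions, and port 4420 is assumed as the commonly used NVMe-oF port for RDMA transports.

```python
import shlex
from typing import List

# Assumption: 4420 is the port commonly used for NVMe-oF over RDMA.
DEFAULT_NVME_OF_PORT = 4420


def discover_command(transport: str, address: str,
                     port: int = DEFAULT_NVME_OF_PORT) -> List[str]:
    """Build the nvme-cli call that asks a discovery service what is exposed."""
    return ["nvme", "discover", "-t", transport, "-a", address, "-s", str(port)]


def connect_command(transport: str, address: str, nqn: str,
                    port: int = DEFAULT_NVME_OF_PORT) -> List[str]:
    """Build the nvme-cli call that attaches the initiator to one subsystem."""
    return ["nvme", "connect", "-t", transport, "-n", nqn,
            "-a", address, "-s", str(port)]


if __name__ == "__main__":
    # Hypothetical target address and subsystem name, for illustration only.
    target_address = "192.0.2.10"
    subsystem_nqn = "nqn.2018-01.com.example:subsys1"
    print(shlex.join(discover_command("rdma", target_address)))
    print(shlex.join(connect_command("rdma", target_address, subsystem_nqn)))
```

The discovery step has to be pointed at a known address, which matches the limitation described above: the initiator cannot yet find the discovery service on its own.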
2. Plan the Network
NVMe-oF allows transfers over different network fabrics, including Ethernet, InfiniBand or Fibre Channel. In addition, it can use different lower-level network protocols. For example, Ethernet transfers can currently be done using RDMA protocols: RoCE (v1 or v2) or iWARP. However, a TCP version is being defined and is expected for release later this year.
When planning your network, keep in mind that some network adapters support only one protocol. RDMA protocols are incompatible with each other. For example, a RoCE-based system will not be able to communicate with an iWARP one. It may be worth considering programmable elements such as the Kalray Target Controller (KTC).
Also be aware of the requirements of your network protocols. For example, RoCE requires lossless Ethernet, so deploying PFC (Priority Flow Control) with PFC-enabled switches is highly recommended.
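As a planning aid, the compatibility constraint can be sketched as a simple set intersection: two adapters can only talk to each other if they share at least one RDMA protocol. The function name and the protocol labels below are illustrative assumptions, not part of any NVMe-oF tool.

```python
from typing import Iterable, Set


def common_rdma_protocols(initiator: Iterable[str],
                          target: Iterable[str]) -> Set[str]:
    """Return the RDMA protocols supported by both adapters.

    An empty result means the two adapters cannot communicate over RDMA.
    """
    return set(initiator) & set(target)


if __name__ == "__main__":
    # Hypothetical adapter capabilities, for illustration only.
    roce_adapter = {"RoCEv1", "RoCEv2"}
    iwarp_adapter = {"iWARP"}
    multi_protocol_adapter = {"RoCEv2", "iWARP"}

    print(common_rdma_protocols(roce_adapter, iwarp_adapter))
    print(common_rdma_protocols(roce_adapter, multi_protocol_adapter))
```

Running the same check for every initiator and target pair before buying hardware is a cheap way to spot a single-protocol adapter that would be isolated from the rest of the network.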
3. Prepare for Change
NVMe-oF is an emerging and evolving technology. The protocol implementation receives frequent updates, so it is important to track changes in your Linux distribution. CentOS and SUSE provide backports of the NVMe-oF functionality to older kernels, but we recommend at least the 4.9 series with all patches applied. The associated tools are also in development, and you may need to update them regularly. New functions and services are in development, especially for management.
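A quick way to check a machine against the 4.9 recommendation is to compare the running kernel release with that minimum. The sketch below uses Python's standard `platform.release()`; the helper names are hypothetical. Note that it only compares version numbers, so a distribution kernel with backported NVMe-oF support may be reported as too old even though it works.

```python
import platform
import re
from typing import Tuple


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '4.9.0-8-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release: str, minimum: Tuple[int, int] = (4, 9)) -> bool:
    """Return True if the release is at or above the minimum (major, minor)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    running = platform.release()
    if kernel_at_least(running):
        print(f"{running}: meets the 4.9 recommendation")
    else:
        print(f"{running}: older than 4.9, check for distribution backports")
```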
The way the system is configured right now will probably change and become simpler. For example, the next NVMe-oF standard version, expected at the beginning of 2019, should include TCP transport, enhanced discovery and authentication.
Apart from the protocol itself, elements of the NVMe-oF networks may include more functions. For instance, the NVMe-oF targets could provide services beyond mere access to the drives. In other words, when preparing your system, take into account the fact that it should be easy to update as new services appear.