Configuring Priority Flow Control for Cluster Heartbeat

Hello all

A while back, Microsoft changed its recommendation for Priority Flow Control on clusters running RDMA with SMB3 in any form, whether that is an old-fashioned Storage Spaces setup with Scale-Out File Servers or Storage Spaces Direct. With RoCE RDMA, the recommendation has always been to use Data Center Bridging (DCB) and Priority Flow Control (PFC) for SMB3 traffic.

RDMA started becoming available to the public as prices on 10Gbit came down and as Storage Spaces and Storage Spaces Direct became more popular. An issue then presented itself: Cluster Heartbeats were being dropped or missed because of high SMB traffic on the 10Gbit interfaces. This led to nodes being isolated, which in turn led to clusters going offline.

Because of this, Microsoft changed the recommendation to add a PFC queue for Cluster Heartbeats on Priority 7 with a minimum bandwidth of 1%. This ensures that Cluster Heartbeats are not missed when traffic on the network cards is high.
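
As a rough sketch of what this looks like on the Windows side, the snippet below uses the built-in NetQos cmdlets. It assumes the Data Center Bridging feature is already installed on the node. Priority 7 and the 1% reservation come from the recommendation above; priority 3 and the 50% reservation for SMB are assumptions I use for illustration, so adjust them to match your own design and your switch configuration.

```powershell
# Sketch only. Priority 3 and 50% for SMB are assumed values, not part of the recommendation.

# Tag SMB Direct (RDMA) traffic on port 445 with 802.1p priority 3 (assumed)
New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3

# Tag Cluster Heartbeat traffic with 802.1p priority 7
New-NetQosPolicy "Cluster" -Cluster -PriorityValue8021Action 7

# Enable Priority Flow Control for the SMB priority
Enable-NetQosFlowControl -Priority 3

# Reserve minimum bandwidth per traffic class using ETS
New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS
New-NetQosTrafficClass "Cluster" -Priority 7 -BandwidthPercentage 1 -Algorithm ETS
```

The same priorities have to be configured on the switches, otherwise the tagging on the hosts has no effect end to end. You can review the result afterwards with Get-NetQosPolicy, Get-NetQosTrafficClass and Get-NetQosFlowControl.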

I have updated my main post about DCB/RDMA/PFC with the new recommendations, and updated the PowerShell script and the PFC settings for Dell and Lenovo switches. You can find the post here: https://jtpedersen.com/2017/06/rocerdmadcb-what-is-it-and-how-to-configure-it/

PFC is a must for anyone running RoCE. RoCEv2 is coming with a new lossless protocol in the future that will eliminate the need for PFC, but for now you will need it.

For any high-IO RDMA network, I will always recommend enabling DCB and PFC, whether it's RoCE/RoCEv2 or iWARP. iWARP does not need it for low IO, as it runs over TCP, whereas RoCEv2 runs over UDP.
