Omni-Path Upgrade to InfiniBand
We are currently upgrading our Omni-Path infrastructure to InfiniBand. This will improve both the throughput and reliability of the HPC cluster.
UPDATE: March 20, 2020
This project is almost complete and has yielded significant performance improvements on the nodes that were upgraded. Approximately 20 nodes remain to be upgraded, and we have coordinated with the affected customers.
We are going to run benchmarks and update the documentation on our website after the upgrade is complete.
We are upgrading our storage network fabric from Omni-Path to InfiniBand. There are several reasons for this, not least that our primary vendor no longer supports Omni-Path networking.
In the process, we are going to install InfiniBand (IB) networking cards across our entire cluster, even in nodes that did not have Omni-Path installed to begin with. This means you may see your job performance increase by up to 20 times, especially if the job is storage-I/O bound.
Unfortunately, this involves taking nodes offline for up to several hours at a time to complete the upgrade. Our goal is to disrupt research as little as possible, so we will take a staggered approach to the upgrade. We will also reach out to customers individually to schedule downtime for purchased resources.
The upgrade will begin immediately and last until the end of the Spring 2020 semester. If you have any questions or concerns, please let us know: