Rogue Scholar Beiträge

language
Veröffentlicht in JSC Accelerating Devices Lab

Environment Setup Enabling UCC in OpenMPI Enabling NCCL in UCC (Team Layer Selection) All The Variables Results 1. Plain OpenMPI 2. OpenMPI with UCC 3. OpenMPI with UCC+NCCL Scaling Plots Average Latency Bus Bandwidth Comparing MPI, UCC, UCC+NCCL Comparing UCC+NCCL, NCCL Summary Technical Details This post showcases how to use the highly optimized, GPU-based collectives of Nvidia’s (NCCL library) by using it as a UCC backend through MPI.