Microsoft’s updated DeepSpeed can train trillion-parameter AI models with fewer GPUs

Microsoft today released an updated version of its DeepSpeed library that introduces a new approach to training AI models containing trillions of parameters, the variables internal to the model that inform its predictions. The company claims the technique, dubbed 3D parallelism, adapts to varying workload requirements to power extremely large models while maintaining scaling efficiency.
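To make the idea concrete, here is a minimal sketch (not DeepSpeed's actual API) of what "3D" means: the cluster's GPUs are tiled along three axes at once, data parallelism (replicas of the model), pipeline parallelism (layer stages), and model/tensor parallelism (slices within a layer), so each GPU holds only a fraction of a very large model. The cluster size and axis sizes below are hypothetical.

```python
def gpu_coordinates(world_size, data, pipeline, model):
    """Map each global GPU rank to a (data, pipeline, model) coordinate.

    Illustrative only: real frameworks choose the axis ordering to match
    network topology, but the tiling arithmetic is the same.
    """
    assert data * pipeline * model == world_size, "axes must tile the cluster"
    coords = {}
    for rank in range(world_size):
        # Peel off the data-parallel index first, then pipeline stage,
        # then the model-parallel slice within that stage.
        d, rem = divmod(rank, pipeline * model)
        p, m = divmod(rem, model)
        coords[rank] = (d, p, m)
    return coords

# Hypothetical 48-GPU cluster: 2-way data x 4-way pipeline x 6-way model.
coords = gpu_coordinates(48, data=2, pipeline=4, model=6)
print(coords[0])   # first GPU: replica 0, stage 0, slice 0
print(coords[47])  # last GPU: replica 1, stage 3, slice 5
```

Each GPU then communicates only with the small groups that share two of its three coordinates, which is what lets the approach scale to trillion-parameter models without every GPU holding a full copy.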

Single massive AI models with billions of parameters have achieved great strides in a range of…


