AWS and Nvidia Announce Major Generative AI Partnership

At AWS’s annual re:Invent conference this week, AWS CEO Adam Selipsky and Nvidia founder and CEO Jensen Huang shared the stage to provide an in-depth look at AWS’s expanding partnership with Nvidia and its overall strategy for generative AI.

Selipsky laid out AWS’s “three macro layers” approach – investing across the generative AI stack in infrastructure, tools for building with foundation models, and applications. “We think about generative AI as having actually three macro layers, if you will, of a stack, and they’re all equally important. And we are investing in all three of them,” he said.

On the infrastructure front, AWS announced it will be the first cloud provider to offer Nvidia’s new GH200 Grace Hopper Superchips, while Huang touted the pairing of the upcoming H200 GPU with Nvidia’s TensorRT-LLM compiler for up to 4x faster large language model inference. “The H200, this is a really amazing thing. The combination between the brand-new TensorRT-LLM optimizing compilers for generative AI and H200 improves the throughput of inference, large language model inference, by a factor of four, reducing the cost in just one year by a factor of four,” said Huang.
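For context on the software side of that claim, TensorRT-LLM is Nvidia’s open-source library for compiling LLMs into optimized inference engines. The following is a minimal sketch of its high-level Python API, assuming the tensorrt_llm package is installed on a machine with a supported Nvidia GPU; the Hugging Face model ID is just an illustrative placeholder:

```python
from tensorrt_llm import LLM, SamplingParams

# Placeholder model ID; any supported Hugging Face checkpoint could go here.
# On first use, TensorRT-LLM compiles the model into an optimized engine.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Basic sampling settings for generation.
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# Run inference; the throughput gains Huang described come from the compiled
# engine and the underlying GPU, not from any change to this calling code.
for output in llm.generate(["What is generative AI?"], params):
    print(output.outputs[0].text)
```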

AWS also unveiled Graviton4, its latest-generation Arm-based server processor, claiming up to 30% better compute performance, 50% more cores, and 75% more memory bandwidth than Graviton3. It also announced Trainium2, its second-generation AI training chip, promising up to 4x faster training for large language models. “We realized almost 10 years ago that if we wanted to continue to push the envelope on price performance for all of your workloads, we had to reinvent general purpose computing for the cloud era all the way down to the silicon,” noted Selipsky of the custom-silicon strategy behind both chips.

On the access and services layer, AWS is also bringing DGX Cloud, Nvidia’s AI supercomputing service, to its platform.

And at the application layer, AWS launched Amazon Q, a new generative AI assistant that connects to a company’s data sources and business systems to answer natural language questions. Selipsky also distinguished the two workloads such assistants sit on top of: “With the foundation models, there are two main types of workloads: training and inference. Training is to create and improve FMs by learning patterns from large amounts of training data,” he explained.
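Amazon Q’s business variant is reachable programmatically through the AWS SDK. Here is a hedged sketch using boto3’s qbusiness client and its chat_sync operation; the application ID and user ID are hypothetical placeholders, and a Q application already connected to data sources is assumed to exist:

```python
import boto3

# Placeholder identifiers; a real Q Business application must already exist.
APPLICATION_ID = "11111111-2222-3333-4444-555555555555"
USER_ID = "employee@example.com"

# The qbusiness client exposes a synchronous chat API.
client = boto3.client("qbusiness", region_name="us-east-1")

# Ask a natural-language question against the connected data sources.
response = client.chat_sync(
    applicationId=APPLICATION_ID,
    userId=USER_ID,
    userMessage="Summarize last quarter's onboarding feedback.",
)

# systemMessage holds the assistant's generated answer.
print(response["systemMessage"])
```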

The announcements highlight AWS’s aggressive push into generative AI across infrastructure, services, and applications – aiming to give customers a complete platform for leveraging large language models and other emerging techniques. They also extend AWS’s long-running collaboration with Nvidia, which spans more than a decade of GPU innovation in the cloud.

“It’s so early in the game, and we’re incredibly excited about what we’re gonna be able to do together,” concluded Selipsky. With heavy investments in specialized hardware and models for both training and inference, AWS is positioning itself to be a dominant player in next-generation AI.
