Amazon and AI startup Hugging Face partner to run AI models on Amazon's custom chips.
- RISHI KORDE
- May 23, 2024
- 1 min read

Matt Wood, head of artificial intelligence products at AWS, pointed to the efficiency of the Inferentia2 chip: a model may be trained once a month, he said, but it can be used for inference tens of thousands of times per hour, and that frequency of use is where Inferentia2 excels.
On Wednesday, Amazon's cloud division announced a partnership with AI startup Hugging Face to make it easier to run thousands of AI models on Amazon's specialized computing chips. Hugging Face, valued at $4.5 billion and backed by major companies including Amazon, Google, and Nvidia, is a central platform where AI researchers and developers share and build on AI software such as Meta Platforms' Llama 3.
After developers fine-tune an open-source AI model, they typically want to use it to power a software application. Amazon and Hugging Face have now teamed up to make that step work on AWS's custom Inferentia2 chip. Jeff Boudier, head of product and growth at Hugging Face, emphasized the focus on efficiency and cost-effectiveness, with the aim of letting as many people as possible run models affordably.
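To picture the workflow Boudier describes, here is a minimal sketch using Hugging Face's optimum-neuron library, which compiles Hub models for Inferentia hardware. It assumes an AWS inf2 instance with the Neuron SDK and optimum-neuron installed; the model name, input shapes, and example text are illustrative choices, not details from the announcement:

```python
# Sketch: serving a Hugging Face Hub model on Inferentia2 via the
# optimum-neuron library. Assumes an AWS inf2 instance with the Neuron
# SDK installed; model name and shapes are illustrative assumptions.
from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForSequenceClassification

model_id = "distilbert-base-uncased-finetuned-sst-2-english"

# export=True compiles the model for the Neuron cores. Inferentia
# requires static input shapes, hence the fixed batch/sequence sizes.
model = NeuronModelForSequenceClassification.from_pretrained(
    model_id,
    export=True,
    batch_size=1,
    sequence_length=128,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Inputs are padded to the compiled sequence length to match the
# static shapes the accelerator expects.
inputs = tokenizer(
    "Running this model on Inferentia2 was painless.",
    return_tensors="pt",
    padding="max_length",
    max_length=128,
    truncation=True,
)

# This call is the inference step that, per Wood, may happen tens of
# thousands of times per hour in a production application.
logits = model(**inputs).logits
print(model.config.id2label[int(logits.argmax())])
```

Compilation happens once up front; after that, each request reuses the compiled artifact, which is the usage pattern the partnership is built around.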
The partnership is part of AWS's push to attract more AI developers to its cloud services for AI deployment. While Nvidia dominates the market for training models, AWS argues that its chips can run those trained models, the inference step, more cost-effectively over time.