AMD Introduces Instinct MI350P PCIe GPUs to Bring Enterprise AI to Existing Data Centers

AMD has expanded its enterprise AI portfolio with the introduction of the AMD Instinct MI350P PCIe GPUs, a new class of accelerators designed to run advanced AI workloads on existing data center infrastructure. The company positioned the launch as a response to organizations struggling to balance AI adoption with the cost, complexity and unpredictability of cloud‑based or fully redesigned on‑premises systems. The MI350P PCIe cards offer a third path: high‑performance AI acceleration that fits directly into standard air‑cooled server racks without requiring major power or cooling upgrades.

The MI350P PCIe cards are engineered for the emerging era of agentic AI, enabling enterprises to deploy inference workloads on premises using familiar infrastructure. As dual‑slot, drop‑in cards, they support up to eight accelerators in air‑cooled systems and are optimized for small, medium and large AI models, including retrieval‑augmented generation (RAG) pipelines. AMD emphasized that the PCIe form factor provides a cost‑effective option for organizations that need more AI compute than CPUs can deliver but are not yet ready to invest in full GPU‑accelerator platforms.

The company highlighted several performance advantages, including native support for the lower-precision MXFP6 and MXFP4 formats, sparsity acceleration across mainstream 8- and 16-bit precisions, and an estimated 2,299 TFLOPS of peak compute, rising to as much as 4,600 peak TFLOPS at MXFP4, which AMD says is the highest performance currently available in an enterprise PCIe card. The GPUs also feature 144GB of HBM3E memory delivering up to 4TB/s of bandwidth.
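One way to see why the low-precision formats matter for on-premises inference is a quick capacity estimate. The sketch below is back-of-the-envelope arithmetic under stated assumptions (weights only, no KV cache or activations, ignoring the small per-block scaling overhead of MX formats), not AMD sizing guidance:

```python
# Back-of-the-envelope sketch (our own arithmetic, not an AMD sizing guide):
# roughly how many model parameters fit in the card's 144GB of HBM3E at
# each precision mentioned in the announcement. These are upper bounds:
# real deployments also need memory for KV cache and activations, and MX
# formats carry a small per-block scale overhead.

HBM_BYTES = 144 * 10**9  # 144GB of HBM3E per card

BITS_PER_PARAM = {"fp16": 16, "fp8": 8, "mxfp6": 6, "mxfp4": 4}

def max_params(fmt: str, memory_bytes: int = HBM_BYTES) -> float:
    """Parameters that fit if weights alone filled the card's memory."""
    return memory_bytes * 8 / BITS_PER_PARAM[fmt]

for fmt, bits in BITS_PER_PARAM.items():
    print(f"{fmt} ({bits}-bit): ~{max_params(fmt) / 1e9:.0f}B parameters")
```

At MXFP4 roughly four times as many parameters fit per card as at FP16 (~288B vs. ~72B in this idealized estimate), which is the practical reason low-precision support figures so prominently in an inference-focused PCIe card.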

AMD framed the MI350P launch as part of its commitment to an open AI ecosystem. The cards integrate with the AMD enterprise AI software stack, which includes Kubernetes GPU Operator support, cloud‑native inference microservices and compatibility with frameworks such as PyTorch. The open‑source reference stack is provided at no licensing cost, enabling enterprises to migrate inference workloads with minimal code changes and without per‑token fees.
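As a concrete illustration of the Kubernetes integration, a pod might request one of these GPUs through the device-plugin resource exposed by the GPU Operator. This is a minimal sketch: the `amd.com/gpu` resource name follows AMD's ROCm device plugin convention, and the container image is a hypothetical placeholder.

```yaml
# Hedged sketch: a pod requesting one AMD GPU via the device-plugin
# resource exposed by the GPU Operator. The amd.com/gpu resource name
# follows AMD's ROCm device plugin convention; the image is hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: mi350p-inference
spec:
  containers:
    - name: inference
      image: example.com/llm-inference:latest  # hypothetical placeholder
      resources:
        limits:
          amd.com/gpu: 1  # schedule onto a node with a free accelerator
```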

OEM and software partners—including Dell, HPE, Cisco, Lenovo, Supermicro, Red Hat, Akamai, Broadcom and Nutanix—praised the MI350P’s ability to deliver scalable, energy‑efficient AI performance within standard data center environments. AMD stated that the new GPUs allow enterprises to run more models, serve more users and operationalize AI faster without rebuilding infrastructure from the ground up.
