AWS has released Neuron SDK 2.23, which brings NxD Inference to general availability (GA) alongside new training and inference capabilities and upgraded developer tools. NxD Inference provides high-performance, low-latency machine learning inference on AWS Inferentia hardware.
The update improves model performance for LLMs and other generative AI applications. The release also tightens integration across the toolchain, with improved compilation, profiling tools, and framework support including PyTorch and TensorFlow.
Together, these improvements streamline AI/ML workloads on AWS and reinforce the company's focus on optimizing generative AI infrastructure and performance at scale.