October 24, 2024

ExecuTorch Beta: On-Device AI and LLMs, Stability, and Acceleration with Partners

ExecuTorch has achieved Beta status with the release of v0.4, providing stable APIs and runtime, as well as extensive kernel coverage. ExecuTorch is the recommended on-device inference engine for Llama 3.2 1B/3B models, offering enhanced performance and memory efficiency for both original and quantized models. There has been a significant increase in adoption and ecosystem growth for ExecuTorch, and the focus is now on improving reliability, performance, and coverage for non-CPU backen...

October 23, 2024

TorchRec and FBGEMM 1.0 Stable Release

We are happy to announce the stable release, 1.0, for TorchRec and FBGEMM. TorchRec is the PyTorch native recommendation systems library, powered by FBGEMM’s (Facebook GEneral Matrix Multiplication) efficient, low-level kernels.

October 17, 2024

PyTorch 2.5 Release Blog

We are excited to announce the release of PyTorch® 2.5 (release note)! This release features a new cuDNN backend for SDPA, enabling speedups by default for users of SDPA on H100s or newer GPUs. In addition, regional compilation of torch.compile offers a way to reduce the cold start-up time for torch.compile by allowing users to compile a repeated nn.Module (e.g. a transformer layer in an LLM) without recompilations. Finally, TorchInductor CPP backend offers solid performance speedup with numerous en...

October 15, 2024

The Path to Achieve PyTorch Performance Boost on Windows CPU

PyTorch’s lower CPU performance on Windows compared to Linux has been a significant issue, and multiple factors contribute to the disparity. Our investigation identified several reasons for poor CPU performance on Windows and pinpointed two primary issues: the inefficiency of the Windows default malloc memory allocator and the absence of SIMD for vectorization optimizations on the Windows platform. In this article, we show how PyTorch CPU...

October 08, 2024

PyTorch Foundation Technical Advisory Council Elects New Leadership

We are pleased to announce the first-ever Chair and Vice Chair of the PyTorch Foundation’s Technical Advisory Council (TAC): Luca Antiga as Chair and Jiong Gong as Vice Chair. Both leaders bring extensive experience and deep commitment to the PyTorch community, and they are set to guide the TAC in its mission to foster an open, diverse, and innovative PyTorch technical community. Meet the New Leadership: Luca Antiga has been the CTO at Lightning AI since 2022. He is an early contributor to P...

October 02, 2024

PyTorch Conference 2024 Recap: On Fire 🔥

The 2024 PyTorch Conference in San Francisco gathered nearly 1,500 AI researchers, developers, and enthusiasts. Over two days, the event featured engaging discussions, insightful keynotes, and hands-on sessions focused on artificial intelligence (AI) and advancements in PyTorch, the leading open-source machine learning framework. Attendees delved into the future of generative AI, Large Language Models (LLMs), and the crucial role open-source technology plays in driving AI innovation. Here’s...

September 26, 2024

PyTorch Native Architecture Optimization: torchao

We’re happy to officially launch torchao, a PyTorch-native library that makes models faster and smaller by leveraging low-bit dtypes, quantization, and sparsity. torchao is an accessible toolkit of techniques written (mostly) in easy-to-read PyTorch code spanning both inference and training. This blog will help you pick which techniques matter for your workloads. We benchmarked our techniques on popular GenAI models like Llama 3 and Diffusion models and saw minimal drops in accuracy. Unless o...
