Articles from Source: Nvidia-Developer-Blog

Building AI Agents for AR Glasses and XR Devices with NVIDIA XR AI

2026-06-16 22:30
Developers of AR glasses and XR devices struggle to build AI experiences because of an infrastructure gap. NVIDIA XR AI aims to bridge this gap by providing a foundation for connecting XR devices to AI services. Now in beta, it gives developers an open-source library for building intelligent agents that enhance user interactions. These agents can assist in fields from healthcare to manufacturing by providing contextual information and guiding users through tasks. NVIDIA XR...
Source: Nvidia Developer Blog
Greg Barbone

Build Your Own Transaction Foundation Model for Financial Intelligence

2026-06-16 20:30
Unlock the potential of transaction data! 💳 The article discusses how financial networks capture human behavior through transactions. Traditional methods rely on fixed, hand-engineered features, whereas foundation models are pretrained on transaction data and then reused across tasks. Leading firms like Stripe and Visa are seeing significant results with these transformer-based models, enhancing tasks like fraud detection and credit scoring. NVIDIA provides a developer example to create your own transaction...
Source: Nvidia Developer Blog
Benjamin Wu

Build On-Device AI Companions with the NVIDIA ACE Game Agent SDK and Unreal Engine 5 Plugins

2026-06-16 17:00
🚀 Exciting advancements in AI for gaming! NVIDIA has announced the ACE Game Agent SDK, which simplifies creating on-device AI companions in Unreal Engine 5. This includes new plugins for ASR, LLM, and TTS, enhancing developer capabilities. Join the live webinar on June 30 for insights on integrating DLSS 4.5. Explore how AI NPCs like Ally in PUBG deliver dynamic gameplay experiences! 🎮🤖 #NVIDIA #UnrealEngine5 #GamingInnovation #AICompanions #GameDevelopment
Source: Nvidia Developer Blog
Phillip Singh

How to Optimize Transformer-Based Models for Low-Precision Training

2026-06-16 16:00
Unlocking the potential of transformer-based models is key for AI advancements. This article discusses optimizing low-precision training using NVIDIA Hopper and Blackwell GPUs, which support FP8 and NVFP4 operations. These innovations can accelerate training time and reduce costs. It emphasizes understanding GEMM workloads to find the best precision settings for your model, enhancing efficiency in training. For example, CodonFM, a biology-focused language model, illustrates these concepts in...
Source: Nvidia Developer Blog
Jonathan Mitchell

NVIDIA Blackwell Tops MLPerf Training 6.0 with Industry-Leading Scale and Performance

2026-06-16 15:11
🚀 NVIDIA has achieved a significant milestone in MLPerf Training v6.0, the benchmark suite from the MLCommons consortium, leading every benchmark in the round. The NVIDIA platform delivered the fastest training times and highest performance across the tests, including new pretraining benchmarks like DeepSeek-V3 and GPT-OSS-20B. With up to 8,192 Blackwell GPUs working together, NVIDIA showcased the power of its GB300 NVL72 system, setting new records for time-to-train in complex workloads. #NVIDIA #MLPerf #AI...
Source: Nvidia Developer Blog
Farshad Ghodsian

Fine-Tuning Biological Foundation Models with LoRA Using NVIDIA BioNeMo Recipes

2026-06-15 18:07
Foundation models are transforming computational biology, leveraging vast datasets of protein and genomic sequences. Models like ESM2 and Evo2 are effective for various tasks, but adapting them can be challenging due to their size. Low-Rank Adaptation (LoRA) offers a solution by training only a small number of parameters while keeping the main model frozen, significantly reducing resource demands. NVIDIA BioNeMo Recipes simplify this process, providing accessible training workflows on popular...
Source: Nvidia Developer Blog
Bruno Alvisio
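
To make the LoRA idea in the summary above concrete, here is a minimal sketch in plain Python. It is not taken from BioNeMo Recipes; the layer sizes, the names `down` and `up`, and the `alpha` scaling value are illustrative assumptions. It shows a frozen weight matrix plus a trainable low-rank update, and how few parameters the update adds.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_forward(x, w_frozen, down, up, alpha):
    """Return x @ W + (alpha / rank) * (x @ down) @ up.

    w_frozen stays fixed; only `down` and `up` would be trained.
    """
    rank = len(up)
    scale = alpha / rank
    base = matmul(x, w_frozen)
    update = matmul(matmul(x, down), up)
    return [[b + scale * u for b, u in zip(b_row, u_row)]
            for b_row, u_row in zip(base, update)]

random.seed(0)
d_in, d_out, rank = 64, 64, 4  # hypothetical layer sizes

# Frozen pretrained weight (random here, standing in for a real checkpoint).
w_frozen = [[random.gauss(0, 0.02) for _ in range(d_out)] for _ in range(d_in)]
# Low-rank factors: `down` starts random, `up` starts at zero, so the
# adapted layer initially matches the frozen layer exactly.
down = [[random.gauss(0, 0.02) for _ in range(rank)] for _ in range(d_in)]
up = [[0.0] * d_out for _ in range(rank)]

frozen_params = d_in * d_out
trainable_params = d_in * rank + rank * d_out

x = [[1.0] * d_in]
y = lora_forward(x, w_frozen, down, up, alpha=8.0)
print(f"trainable: {trainable_params}, frozen: {frozen_params}")
```

In this toy layer the trainable factors hold 512 values against 4,096 frozen ones. The gap widens as the layer dimensions grow while the rank stays small.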

Boosting MoE Training Throughput with Advanced Fusion Kernels

2026-06-15 16:45
🚀 Mixture-of-experts (MoE) models are key in modern AI, enhancing model capacity efficiently. NVIDIA introduces advanced fused MLP kernels that optimize training throughput by addressing memory and synchronization issues. These kernels achieve significant speedups, improving end-to-end performance by up to 93%. Explore how custom kernels can help minimize training bottlenecks in MoE blocks. #AI #MachineLearning #NVIDIA #MoE #DeepLearning
Source: Nvidia Developer Blog
Rachit Garg

Pretrained to Imagine, Fine-Tuned to Act: The Rise of World-Action Models

2026-06-15 12:00
🌟 Exploring the rise of Vision-Language-Action (VLA) and World-Action Models (WAM) in robotics! VLA models adapt pretrained vision-language models to generate actions from visual and language inputs. WAM focuses on predicting scene changes and corresponding actions using a pretrained world model. Key terms include grounding, inverse dynamics, and action chunk, which are essential for connecting language instructions to physical actions. #Robotics #AI #MachineLearning #VLA #WAM
Source: Nvidia Developer Blog
Moritz Reuss

NVIDIA Achieves Leading Agentic Coding Performance on First Agentic AI Benchmark

2026-06-12 21:12
NVIDIA has made significant strides in agentic coding performance with the introduction of AA-AgentPerf, the first benchmark for AI agents. 🤖 This new standard measures how well inference systems handle real-world AI tasks, focusing on concurrent AI agents and their performance. AA-AgentPerf normalizes results for better hardware comparison. NVIDIA's innovative co-design approach has achieved up to 20x improved performance over previous generations. 🚀 #NVIDIA #AI #Benchmarking #Technology...
Source: Nvidia Developer Blog
Eduardo Alvarez

Deploy Long-Context Reasoning and Agentic Workflows with MiniMax M3 on NVIDIA Accelerated Infrastructure

2026-06-12 14:43
🚀 As AI adoption grows, developers face challenges with fragmented pipelines for text, vision, and code. MiniMax M3, available on NVIDIA's infrastructure, addresses this by offering a unified multimodal system for long-context reasoning and agentic workflows. With 428B parameters, it supports applications like long video understanding and extended coding sessions. Notably, its innovative MiniMax Sparse Attention enhances performance, making it significantly faster than previous models....
Source: Nvidia Developer Blog
Anu Srivastava

One-Click Multi-Tenant Security with NVIDIA Quantum InfiniBand

2026-06-11 19:52
NVIDIA Quantum InfiniBand has introduced intent-based security profiles in Unified Fabric Manager (UFM) for streamlined multi-tenant security. With three profiles—General, Bare Metal Cloud, and Secured Bare Metal Cloud—network admins can now auto-configure essential security features in minutes. This reduces deployment time significantly, enhancing efficiency in cloud environments. ⏱️🔒 The architecture emphasizes security across all layers, addressing vulnerabilities often found in...
Source: Nvidia Developer Blog
David Slama

Run DiffusionGemma on NVIDIA for Developer-Ready, High-Throughput Text Generation

2026-06-10 16:16
🚀 Developers focused on real-time AI can benefit from DiffusionGemma, a new text generation model from Google DeepMind. This model generates tokens in parallel, increasing speed and efficiency. It can produce up to 1,000 tokens/sec on NVIDIA H100 GPUs, which enhances user experience while lowering costs. DiffusionGemma supports text and image modalities, with a total of 25.2B parameters. It's optimized for various NVIDIA platforms, making it versatile for different AI applications. #AI...
Source: Nvidia Developer Blog
Anu Srivastava

Designing Production-Ready Battery Energy Storage Systems for AI Factories

2026-06-10 15:00
AI factories are reshaping data-center infrastructure. Unlike traditional setups, they focus on manufacturing intelligence at scale. ⚡️ Battery energy storage systems (BESS) are now key components of this new architecture, enhancing reliability and performance. They help manage power demands efficiently, reducing stress on grids and onsite generation. Learn more about the importance of BESS in AI factories and the considerations for their design. 🔋💡 #AIFactories #EnergyStorage #DataCenters...
Source: Nvidia Developer Blog
Sean James

Delivering Lifecycle Control for AI Infrastructure at Scale with NVIDIA DGX Spark Enterprise Manageability

2026-06-09 19:00
As AI infrastructure grows, enterprise needs for operational maturity are rising. Organizations require systems to be provisionable, observable, secure, and manageable. NVIDIA DGX Spark introduces Enterprise Manageability, providing a complete operational framework from provisioning to retirement, supporting air-gapped deployments. It seamlessly integrates with existing IT workflows, using agentless SSH execution and standardized JSON output for smooth operations. #AI #NVIDIA...
Source: Nvidia Developer Blog
Maitri Taneja

Model Quantization: Turn FP8 Checkpoints into High-Performance Inference Engines with NVIDIA TensorRT

2026-06-09 18:27
Unlock faster inference with NVIDIA TensorRT! 🚀 This article discusses converting FP8-quantized checkpoints into efficient TensorRT engines. This process enhances production deployment, leading to improved throughput and GPU utilization. It details exporting checkpoints to ONNX and compiling them for real-world application, comparing FP8 performance against FP16. Learn more about the quantization workflow and its benefits! #NVIDIA #TensorRT #MachineLearning #AI #Quantization
Source: Nvidia Developer Blog
Ruixiang Wang

Accelerating Federated Learning Research with AI Agents and NVIDIA FLARE Auto-FL

2026-06-09 16:35
🚀 Federated Learning (FL) research often starts with exploring new strategies. The article discusses NVIDIA FLARE Auto-FL, a tool that enhances this process. It automates the testing of FL methods through well-defined benchmarks and structured workflows. This allows researchers to evaluate ideas efficiently while maintaining consistency in results. 📊 Auto-FL helps researchers navigate their experiments, keeping track of outcomes for reproducibility. Learn more about how AI agents can...
Source: Nvidia Developer Blog
Holger Roth

Evaluate Clinical ASR Models Faster with Agent Skills and NVIDIA Nemotron Speech

2026-06-09 15:00
Training speech AI to understand clinical terminology is challenging. Common drug names and medical terms often aren't included in standard speech models. 🏥 Synthetic data generation can address this gap, but accuracy in pronunciation is crucial. Incorrect pronunciations can lead to more problems rather than solutions. NVIDIA's tools support this process, allowing quick creation of clinical benchmarks without the hurdles of real audio collection. 🎤 Clinical ASR is essential for various...
Source: Nvidia Developer Blog
John Jahanipour

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

2026-06-08 18:18
🚀 Training large language models (LLMs) efficiently relies on throughput. Every percentage point of step time can significantly impact training duration and costs. The NVFP4 recipe in TransformerEngine uses sub-byte precision for JAX pretraining, achieving high-throughput 4-bit mixed-precision training on NVIDIA Blackwell without accuracy loss compared to FP8. This post outlines the NVFP4 format's efficiency and introduces a pretraining recipe that enhances performance using innovative...
Source: Nvidia Developer Blog
Max Xu

NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents

2026-06-04 13:02
🚀 NVIDIA introduces the Nemotron 3 Ultra, designed to enhance long-running agents in complex workflows. This model supports reasoning, context maintenance, and tool usage across multiple turns, addressing challenges like increasing token counts and communication costs. With 550 billion parameters, it excels in orchestration and critical reasoning tasks, achieving 5x higher throughput than similar models. #NVIDIA #AI #MachineLearning #LongRunningAgents #Innovation
Source: Nvidia Developer Blog
Chris Alexiuk

Build Personal AI Agents on Windows PCs with New Tools from Microsoft and NVIDIA

2026-06-02 19:00
🌟 Exciting advancements in AI agents are here! Microsoft and NVIDIA have introduced new tools for developers to create personal AI agents on Windows PCs. These agents assist with tasks like coding and content management, enhancing user experience. Key highlights include turnkey agent sandboxing, faster inference, and improved multi-GPU support. Security is also prioritized with Microsoft eXecution Containers, ensuring safe interaction with personal files. NVIDIA’s new RTX Spark family offers...
Source: Nvidia Developer Blog
Annamalai Chockalingam

Deploy Self-Evolving Agents for Faster, More Secure Research with a Hermes Agent and NVIDIA NemoClaw

2026-06-02 16:00
Unlock the potential of AI research with the Hermes Agent and NVIDIA NemoClaw! 🚀 This open-source tool combines internal and public data sources while ensuring security. It learns user preferences over time, improving efficiency in tasks like sales research and customer support. Key features include:
- Easy installation of the NemoClaw stack
- Integration with Slack, Outlook, and GitHub
- Ability to create recurring reports from chat without coding
For setup, a Docker host and API keys are...
Source: Nvidia Developer Blog
Sam Pastoriza

Deploy Agentic-Ready AI at the Edge with Memory Efficiency in NVIDIA JetPack 7.2

2026-06-02 02:00
🚀 Exciting updates in AI deployment! NVIDIA JetPack 7.2 enhances the use of NVIDIA Jetson for real-world applications, focusing on memory efficiency and performance. Key features include:
- One-command deployment with NVIDIA NemoClaw for added privacy and security.
- Multi-Instance GPU support for efficient multiworkload execution.
- Yocto Project support for custom Linux distributions.
These advancements aim to optimize existing Jetson hardware and accelerate development. #NVIDIA #JetPack...
Source: Nvidia Developer Blog
Peilun Tsai

Run Local AI Agents with Faster Models and Multi-Node Clustering on NVIDIA DGX Spark

2026-06-01 22:00
🚀 The emergence of autonomous AI agents is reshaping compute demands, focusing on local execution due to privacy and security concerns. NVIDIA is enhancing this with DGX Spark, allowing developers to run agents on owned hardware using NVIDIA NemoClaw. The streamlined setup process enables quick deployment of these agents, making it easier to manage sensitive data. The latest updates also improve model performance and facilitate multi-node clustering for scaling needs. 🌐🔒 #AI #NVIDIA #DGXSpark...
Source: Nvidia Developer Blog
Maitri Taneja

How to Post-Train Autonomous Vehicle Models in Closed-Loop with NVIDIA Alpamayo

2026-06-01 04:49
🚗 Developing effective autonomous vehicle policies requires a shift from open-loop to closed-loop training. NVIDIA Alpamayo offers tools like AlpaSim and the upcoming AlpaGym to help bridge this gap. These resources enable the training of Vision-Language-Action models that adapt to real-world driving scenarios. Key steps include configuring AlpaGym, defining rewards, and launching closed-loop training to enhance AV performance. #AutonomousVehicles #NVIDIA #AI #MachineLearning #TechInnovation
Source: Nvidia Developer Blog
Boris Ivanovic

Develop Physical AI Reasoning, World, and Action Models with NVIDIA Cosmos 3

2026-06-01 04:43
Unlock the potential of Physical AI with NVIDIA Cosmos 3! 🤖🌍 This new foundation model enables robots and autonomous systems to understand the real world, predict future events, and take action effectively. NVIDIA is open-sourcing Cosmos 3’s models, training scripts, and tools for wider access and reproducibility. Dive into the blog for insights on technical workflows and applications in robotics and smart environments! #PhysicalAI #NVIDIACosmos3 #Robotics #OpenSource #Innovation
Source: Nvidia Developer Blog
Asawaree Bhide

Advancing AI Infrastructure for Agentic AI with NVIDIA DOCA In-Silicon Security

2026-06-01 04:21
The rise of AI is leading to the development of AI factories, which transform data into intelligence for autonomous agents. These infrastructures enhance speed and efficiency in AI training and deployment. 🚀 However, the adoption of agentic AI introduces new security challenges. Traditional security systems are not equipped to handle the complexity and scale of AI factories, making them vulnerable. 🔒 NVIDIA's BlueField DPUs offer a solution with in-silicon security, enhancing protection...
Source: Nvidia Developer Blog
Ofir Arkin

NVIDIA Vera CPU Sets a New Standard for Agentic Workloads in AI Factories

2026-06-01 03:59
NVIDIA's Vera CPU is setting a new standard for AI factories, especially in agentic workloads. The article discusses how AI has evolved through different scaling laws, focusing on pretraining, post-training, and test-time scaling. Vera CPUs enhance agentic AI and reinforcement learning by reducing CPU execution time and increasing task throughput, leading to smarter, more efficient AI systems. #NVIDIA #AIFactories #AgenticAI #ReinforcementLearning #TechInnovation 🤖💻📈
Source: Nvidia Developer Blog
Praveen Menon

NVIDIA DSX OS Delivers Open, Modular Software for Operating AI Factories at Scale

2026-06-01 03:36
NVIDIA introduces the DSX OS, a modular software designed for scaling AI factories efficiently. As AI becomes crucial infrastructure, the DSX platform offers a comprehensive approach to design, simulate, and operate these factories across multiple layers. With open-source components, DSX OS enhances deployment speed and operational reliability, ultimately reducing costs. #NVIDIA #AI #TechInnovation #Software #OpenSource
Source: Nvidia Developer Blog
Warren Barkley

DynoSim: Simulating the Pareto Frontier

2026-05-29 22:31
🚀 Modern LLM serving involves many complex choices, making tuning challenging. Each deployment's factors, like model backend and worker counts, interact in ways that can shift performance bottlenecks. 🔍 DynoSim addresses this issue by providing a workload-driven simulation of NVIDIA's Dynamo stack. It combines various components to accurately simulate the serving process. ⚡ It's designed for speed, achieving simulations significantly faster than real-time, showcasing its efficiency on devices...
Source: Nvidia Developer Blog
Yongming Ding
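
The title above refers to a Pareto frontier: the set of configurations that no other configuration beats on every metric at once. The sketch below illustrates the concept only and does not use DynoSim. The metrics (throughput and latency) and the configuration values are made-up assumptions for illustration.

```python
from typing import List, NamedTuple

class Config(NamedTuple):
    name: str
    throughput: float  # higher is better (assumed metric, e.g. tokens/sec)
    latency: float     # lower is better (assumed metric, e.g. seconds)

def dominates(a: Config, b: Config) -> bool:
    """True if `a` is at least as good as `b` on both metrics and strictly better on one."""
    no_worse = a.throughput >= b.throughput and a.latency <= b.latency
    strictly_better = a.throughput > b.throughput or a.latency < b.latency
    return no_worse and strictly_better

def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Keep only configurations that no other configuration dominates."""
    frontier = [c for c in configs if not any(dominates(o, c) for o in configs)]
    return sorted(frontier, key=lambda c: c.latency)

# Hypothetical results from simulating five serving configurations.
configs = [
    Config("a", 100.0, 0.20),
    Config("b", 180.0, 0.35),
    Config("c", 150.0, 0.40),  # dominated by "b"
    Config("d", 260.0, 0.80),
    Config("e", 90.0, 0.25),   # dominated by "a"
]

frontier = pareto_frontier(configs)
print([c.name for c in frontier])  # → ['a', 'b', 'd']
```

Each surviving configuration represents a different trade-off, so choosing among them depends on the latency target of the deployment.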

How to Automate AI Model Documentation with the NVIDIA MCG Toolkit

2026-05-29 16:00
🚀 As AI models become more complex, the need for thorough documentation increases. The NVIDIA MCG Toolkit addresses this challenge by automating the creation of model cards, which detail model functionality, training data, and limitations. This ensures transparency for users such as developers, policymakers, and risk assessors. The toolkit streamlines documentation through a structured pipeline that quickly generates compliant model cards from source data. Learn more about how it works! #AI...
Source: Nvidia Developer Blog
Pratyusha Maiti
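
As a rough illustration of generating a model card from structured source data, the sketch below validates a metadata dictionary and renders it as text. It does not use the NVIDIA MCG Toolkit. The field names, the `build_model_card` helper, and the output layout are hypothetical, chosen to mirror the three items the summary mentions (functionality, training data, limitations).

```python
# Hypothetical required fields, modeled on the items named in the summary.
REQUIRED_FIELDS = ("name", "functionality", "training_data", "limitations")

def build_model_card(metadata: dict) -> str:
    """Render a plain-text model card, refusing to emit one with missing fields."""
    missing = [field for field in REQUIRED_FIELDS if not metadata.get(field)]
    if missing:
        raise ValueError(f"model card is missing fields: {', '.join(missing)}")
    lines = [f"Model card: {metadata['name']}", ""]
    for field in REQUIRED_FIELDS[1:]:
        title = field.replace("_", " ").capitalize()
        lines.append(f"{title}: {metadata[field]}")
    return "\n".join(lines)

# Made-up example metadata.
metadata = {
    "name": "demo-model",
    "functionality": "Classifies short text snippets by topic.",
    "training_data": "Synthetic examples generated for this illustration.",
    "limitations": "Not evaluated on non-English text.",
}

card = build_model_card(metadata)
print(card)
```

Failing loudly on a missing field, instead of emitting a partial card, is one simple way to keep generated documentation complete.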

Run Step 3.7 Flash on NVIDIA GPUs with Enterprise-Ready Multimodal AI

2026-05-29 00:07
🚀 AI applications are advancing to multimodal systems that integrate text, images, video, and documents. Step 3.7 Flash by StepFun enhances production capabilities on NVIDIA infrastructure. This model features 198B parameters, optimized for high-level reasoning and context management. Developers can access the NVFP4-quantized checkpoint on Hugging Face for improved efficiency. #AIMultimodal #NVIDIA #StepFun #AIInnovation #TechUpdates
Source: Nvidia Developer Blog
Anu Srivastava

NVIDIA Dynamo Snapshot: Fast Startup for Inference Workloads on Kubernetes

2026-05-27 23:09
NVIDIA's latest article highlights the challenges of the cold-start problem in inference deployments on Kubernetes. 🚀 As demand fluctuates, scaling inference replicas is crucial, but cold starts can take several minutes, leaving GPUs idle and increasing the risk of SLA violations during peak traffic. 📈 Understanding these delays is key to improving system responsiveness. #NVIDIA #Kubernetes #CloudComputing #MachineLearning #InferenceWorkloads
Source: Nvidia Developer Blog
Schwinn Saereesitthipitak

NVIDIA Blackwell Sets STAC-AI Record for LLM Inference in Finance

2026-05-27 20:00
NVIDIA's Blackwell has achieved a record in LLM inference for finance, showcasing the power of large language models in analyzing unstructured data for trading insights. 📈💼 The STAC-AI benchmark was developed to assess LLM performance, focusing on the LANG6 tests with models like Llama 3.1 across various datasets. These tests analyze financial documents for investment strategies. Results include batch and interactive mode scenarios, measuring throughput and response times. ⚙️📊 #NVIDIA...
Source: Nvidia Developer Blog
Dan Blanaru

What’s New for Game Developers in NVIDIA RTX: DLSS 4.5 for UE5 and Multilingual AI Characters

2026-05-27 16:59
NVIDIA RTX brings exciting updates for game developers! 🎮 The new NVIDIA ACE enhances AI character capabilities, enabling multilingual and dynamic NPC interactions. This improves immersion in gaming experiences. Additionally, DLSS 4.5 is now available as a UE plugin, featuring Dynamic Multi Frame Generation and a new 6x mode. The NVIDIA RTX Branch of Unreal Engine also received a stability update for better compatibility. #GameDevelopment #NVIDIA #AICharacters #UnrealEngine #DLSS
Source: Nvidia Developer Blog
Phillip Singh

Extract More Kernel Performance with NVIDIA CompileIQ Auto-Tuning

2026-05-26 22:08
NVIDIA CompileIQ addresses a key challenge in performance engineering: optimizing compiler options for specific workloads. With the release of CUDA 13.3, CompileIQ introduces an AI-driven auto-tuning framework. It utilizes evolutionary algorithms to tailor compiler configurations for individual tasks, enhancing GPU performance. This innovation is crucial for maximizing throughput, especially in AI infrastructure, where small performance gains can significantly impact overall application...
Source: Nvidia Developer Blog
Aditya Srikanth

Develop High-Performance GPU Kernels in C++ with NVIDIA CUDA Tile

2026-05-26 21:40
Unlock the power of GPU programming with NVIDIA's CUDA Tile! 🚀 The recently launched CUDA 13.3 allows developers to create optimized GPU kernels in C++ using tile-based abstractions. This feature simplifies GPU programming by automating parallelism and memory management. CUDA Tile C++ supports multi-dimensional arrays and enhances performance across different NVIDIA GPU architectures. It's a game-changer for maximizing hardware capabilities without extensive code rewrites. 💻✨ #NVIDIA #CUDA...
Source: Nvidia Developer Blog
Jonathan Bentz

NVIDIA CUDA 13.3 Enhances GPU Development with Tile Programming in C++, Compiler Autotuning, and Python Updates

2026-05-26 21:39
🚀 NVIDIA has launched CUDA 13.3, introducing Tile programming in C++ for enhanced GPU development. This feature simplifies kernel creation while improving performance and portability across various GPU architectures. 🖥️ CUDA Python 1.0 is also here, ensuring stability with features like green contexts and process checkpointing. ⚡ For performance improvements, the new CompileIQ framework offers up to a 15% speedup on critical kernels. Key updates include official C++23 support and enhanced...
Source: Nvidia Developer Blog
Jonathan Bentz

Run Key Genomics and Protein Folding Workloads Faster with NVIDIA RTX PRO 4500 Blackwell

2026-05-26 16:00
Unlock faster genomic analysis and protein folding with NVIDIA's new RTX PRO 4500 Blackwell! 🧬💻 The advancements in precision medicine now allow for quicker genome sequencing and AI-driven protein structure characterization, significantly enhancing treatment development. NVIDIA Parabricks accelerates data analysis from hours to minutes, helping clinicians make timely decisions in critical settings. 💡 Explore how these innovations are transforming healthcare outcomes! #PrecisionMedicine...
Source: Nvidia Developer Blog
Alejandro Chacon

Synthesize Realistic 3D Medical Images at Scale to Ship Pre‑Trained Models

2026-05-22 16:00
NVIDIA has launched Medical AI for Synthetic Imaging (MAISI) to tackle challenges in 3D medical imaging data. 🏥 This generative model creates high-resolution CT volumes with detailed anatomical segmentation, addressing data scarcity and privacy concerns. The NV-Generate-CTMR framework allows researchers to produce realistic 3D volumes at scale, enhancing medical AI development. Learn more about the new NV-Generate-MR-Brain model for synthetic brain anatomy generation! 🧠✨ #MedicalAI #3DImaging...
Source: Nvidia Developer Blog
Can Zhao

Automating and Optimizing Financial Signal Discovery with Multi-Agent Systems

2026-05-21 18:31
🚀 In quantitative finance, researchers are optimizing the discovery of financial signals using AI. Traditionally, this process involved manual tasks that slowed down research. Now, with the NVIDIA NeMo Agent Toolkit, a new method automates signal discovery. This system uses three specialized agents:
1. **Signal Agent**: Finds potential signals.
2. **Code Agent**: Converts signals into Python code.
3. **Evaluation Agent**: Tests and refines signals.
This multi-agent approach enhances...
Source: Nvidia Developer Blog
Peihan Huo
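
The three-agent flow described above (propose, implement, evaluate) can be sketched with plain Python functions standing in for each agent. This does not use the NeMo Agent Toolkit or any language model. The momentum signals, the toy price series, and the hit-rate score are assumptions for illustration, and the refinement step is left out.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Candidate:
    description: str
    window: int
    score: float = 0.0

def signal_agent() -> List[Candidate]:
    """Stand-in for the Signal Agent: propose candidate signals."""
    return [Candidate(f"{w}-day momentum", w) for w in (2, 3, 5)]

def code_agent(candidate: Candidate) -> Callable[[List[float]], List[float]]:
    """Stand-in for the Code Agent: turn a candidate into executable Python."""
    w = candidate.window

    def momentum(prices: List[float]) -> List[float]:
        # Value k corresponds to day w + k in the price series.
        return [prices[i] - prices[i - w] for i in range(w, len(prices))]

    return momentum

def evaluation_agent(candidate: Candidate, signal_fn, prices: List[float]) -> Candidate:
    """Stand-in for the Evaluation Agent: score how often the signal's sign
    matches the direction of the next day's price move."""
    hits = total = 0
    for k, value in enumerate(signal_fn(prices)):
        day = candidate.window + k
        if day + 1 >= len(prices):
            break
        move = prices[day + 1] - prices[day]
        total += 1
        if (value > 0) == (move > 0):
            hits += 1
    candidate.score = hits / total if total else 0.0
    return candidate

# Toy price series, invented for this example.
prices = [100.0, 101.0, 103.0, 102.0, 104.0, 107.0, 106.0, 108.0, 111.0, 110.0, 113.0]

evaluated = [evaluation_agent(c, code_agent(c), prices) for c in signal_agent()]
ranked = sorted(evaluated, key=lambda c: c.score, reverse=True)
for c in ranked:
    print(f"{c.description}: hit rate {c.score:.2f}")
```

Keeping each stage behind its own function makes it easy to swap in a real agent for any one step without touching the others.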