Nvidia-Developer-Blog | Daily Tech Articles Feed

AI Factories, Physical AI, and Advances in Models, Agents, and Infrastructure That Shaped 2025

2025-12-16 17:30

In 2025, NVIDIA technologies advanced across data center power, compute design, and AI infrastructure. Innovations in model optimization and open models contributed to the evolution of AI agents and physical AI, making intelligent systems easier to deploy. These developments are reshaping how AI is integrated into everyday life. 🤖💻 #AIAdvancements #NVIDIA #TechInnovation #MachineLearning #FutureTech

Source: Nvidia Developer Blog

Michelle Horton

Industry Analysis

Boost GPU Memory Performance with No Code Changes Using NVIDIA CUDA MPS

2025-12-16 17:00

NVIDIA CUDA developers can improve GPU memory performance without code changes using Multi-Process Service (MPS), which lets multiple processes share a GPU and so raises utilization. 🖥️ The new Memory Locality Optimized Partition (MLOPart) feature exposes devices optimized for latency-sensitive applications, so developers can test performance easily. 🔍 MLOPart devices appear as distinct CUDA devices, allowing efficient resource management. They...

Source: Nvidia Developer Blog

Sherwin Nassernia

Technical Deep Dives

Delivering Flexible Performance for Future-Ready Data Centers with NVIDIA MGX

2025-12-15 18:25

🚀 The AI boom is set to accelerate by 2026, pushing enterprise data centers to adapt beyond traditional architectures. NVIDIA MGX's modular reference architecture offers a 6U chassis designed for next-gen compute platforms, including a new liquid-cooled RTX PRO Server. This flexible design supports multiple CPU architectures and enhances serviceability, making maintenance easier. 🌱 #DataCenter #NVIDIA #AI #TechInnovation #Sustainability

Source: Nvidia Developer Blog

Anthony Larijani

Product Announcements

Reducing CUDA Binary Size to Distribute cuML on PyPI

2025-12-15 17:30

🚀 Exciting news for developers! With the 25.10 release, cuML wheels are now available directly from PyPI, simplifying installation. No more complex steps or Conda management—just use pip! 📦 The NVIDIA team has reduced the CUDA library binary size by ~30%, enhancing accessibility and performance. This means faster downloads and lower storage needs. For detailed installation commands and optimization techniques, check out the full article! #cuML #PyPI #CUDA #NVIDIA #PythonDevelopment

Source: Nvidia Developer Blog

Divye Gala

Educational

NVIDIA CUDA-X Powers the New Sirius GPU Engine for DuckDB, Setting ClickBench Records

2025-12-15 17:18

🚀 NVIDIA and the University of Wisconsin-Madison are collaborating to enhance DuckDB with GPU-accelerated analytics through the Sirius engine. DuckDB is gaining traction among major organizations like Microsoft and Databricks due to its efficiency and flexibility. The new Sirius engine leverages NVIDIA CUDA-X libraries for improved performance and query execution. The blog highlights Sirius's architecture and its record-breaking results on the ClickBench analytics benchmark. #NVIDIA #DuckDB...

Source: Nvidia Developer Blog

Xiangyao Yu

Technical Deep Dives

How to Train Scientific Agents with Reinforcement Learning

2025-12-15 14:00

Unlocking the potential of scientific research with AI! 🤖 The article discusses how scientific AI agents can assist researchers by managing literature, planning experiments, and analyzing results, allowing more time for creative discovery. However, building these agents poses challenges, especially in maintaining context and coherence over long tasks. NVIDIA's NeMo framework offers tools like NeMo Gym and NeMo RL to create effective training environments for these AI systems. Notably, Edison...

Source: Nvidia Developer Blog

Christian Munley

Educational

Inside NVIDIA Nemotron 3: Techniques, Tools, and Data That Make It Efficient and Accurate

2025-12-15 14:00

🚀 Exciting developments in AI! Agentic AI systems increasingly rely on cooperating agents such as retrievers and planners, and NVIDIA's Nemotron 3 family is built to power them. The family focuses on efficiency, accuracy, and customization. Key features include a hybrid Mamba-Transformer architecture, multi-environment reinforcement learning, and a 1M-token context window for extensive reasoning. Nemotron 3 Nano is now available, with Super and Ultra versions coming soon. #NVIDIA #AI #MachineLearning #Nemotron3...

Source: Nvidia Developer Blog

Chris Alexiuk

Product Announcements

Automate Kubernetes AI Cluster Health with NVSentinel

2025-12-08 18:00

🚀 Kubernetes is essential for AI workloads, but managing GPU nodes can be complex. NVSentinel addresses these challenges by continuously monitoring GPU health and automatically fixing issues to minimize disruptions. This open-source tool enhances GPU uptime and reliability, reducing downtime from hours to minutes. With NVSentinel, organizations can ensure smoother operations and better productivity in their AI and high-performance computing environments. #Kubernetes #AI #GPU #NVSentinel...

Source: Nvidia Developer Blog

Lalit Adithya

Technical Deep Dives

Optimizing Inference for Long Context and Large Batch Sizes with NVFP4 KV Cache

2025-12-08 17:00

The article discusses NVFP4 KV cache quantization for large-scale inference. By reducing the precision of the cached key and value tensors, this method can cut KV cache memory costs by up to 50%. The smaller cache improves throughput and latency and leaves room for longer context lengths and larger batch sizes. The article also explains the role of the KV cache in language model performance. #AI #Inference #NVIDIA #Quantization #MachineLearning 🤖💡📈

Source: Nvidia Developer Blog

Eduardo Alvarez

Technical Deep Dives
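
To make the memory claim in the NVFP4 KV cache entry above concrete, here is a rough back-of-the-envelope sketch; it is not taken from the article. It assumes the comparison baseline is an FP8 cache, and it counts NVFP4 as 4-bit values plus one 8-bit scale per 16-value block (the small per-tensor scale is ignored). The model shape is hypothetical and chosen only for illustration.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bits_per_value):
    """Bytes needed to cache keys and values for every layer and token."""
    # Factor of 2 covers the separate key and value tensors.
    values = 2 * layers * kv_heads * head_dim * seq_len * batch
    return values * bits_per_value / 8


# Hypothetical model shape (assumption, not from the article).
shape = dict(layers=32, kv_heads=8, head_dim=128, seq_len=131072, batch=4)

FP8_BITS = 8.0
# Assumed NVFP4 cost: 4-bit values plus one 8-bit scale shared by 16 values.
NVFP4_BITS = 4 + 8 / 16

fp8 = kv_cache_bytes(bits_per_value=FP8_BITS, **shape)
nvfp4 = kv_cache_bytes(bits_per_value=NVFP4_BITS, **shape)
saving = 1 - nvfp4 / fp8

print(f"FP8 cache:   {fp8 / 2**30:.1f} GiB")
print(f"NVFP4 cache: {nvfp4 / 2**30:.1f} GiB ({saving:.1%} smaller)")
```

Under these assumptions the block scales keep the saving slightly below one half, which is consistent with the "up to 50%" wording.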

NVIDIA Kaggle Grandmasters Win Artificial General Intelligence Competition

2025-12-05 18:00

🏆 NVIDIA researchers have secured a significant victory in the Kaggle ARC Prize 2025 competition, a key indicator of advancements in artificial general intelligence (AGI). Ivan Sorokin and Jean-Francois Puget, part of the Kaggle Grandmasters of NVIDIA, topped the leaderboard with a score of 27.64%. Their team, NVARC, fine-tuned a 4B model variant that excelled against larger models at a low cost. The ARC-AGI benchmark tests AI's ability to perform abstract reasoning with minimal examples,...

Source: Nvidia Developer Blog

Moon Chung

Event

NVIDIA Grace CPU Delivers High Bandwidth and Efficiency for Modern Data Centers

2025-12-05 17:00

🚀 The NVIDIA Grace CPU has been transforming data centers since its 2023 launch, achieving impressive performance efficiency. Grace combines Arm Neoverse cores with advanced technologies for high bandwidth and energy efficiency. Its single NUMA design simplifies software development by allowing equal memory access across all cores. This architecture benefits cloud environments, enhancing virtual machine performance without the drawbacks of traditional chiplet designs. #NVIDIA #GraceCPU #DataCenters...

Source: Nvidia Developer Blog

Praveen Menon

Technical Deep Dives

NVIDIA CUDA 13.1 Powers Next-Gen GPU Programming with NVIDIA CUDA Tile and Performance Gains

2025-12-04 22:20

🚀 NVIDIA has released CUDA 13.1, a significant update to its GPU programming platform. This version introduces CUDA Tile, which lets developers write kernels that operate on chunks of data called tiles and abstracts away hardware details. Other key features include runtime API exposure of green contexts for better resource management, and new tools for double- and single-precision emulation in NVIDIA cuBLAS. Exciting advancements for both novice and experienced programmers! #NVIDIA #CUDA #GPUs...

Source: Nvidia Developer Blog

Jonathan Bentz

Product Announcements

Simplify GPU Programming with NVIDIA CUDA Tile in Python

2025-12-04 22:20

🚀 NVIDIA has launched CUDA 13.1, introducing tile-based programming for GPUs. This update simplifies GPU programming by allowing developers to write tile kernels in Python, enhancing algorithm efficiency. cuTile Python abstracts hardware specifics, enabling focus on algorithms while the compiler manages thread partitioning. This model is designed for data-parallel GPU kernel authoring, particularly beneficial in AI/ML applications. #NVIDIA #CUDA #GPUProgramming #MachineLearning #Python

Source: Nvidia Developer Blog

Jonathan Bentz

Product Announcements
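
The tile idea in the entry above can be illustrated without a GPU. The sketch below is plain Python, not the cuTile Python API: it multiplies two matrices one tile at a time, so the algorithm is expressed over chunks of data rather than single elements. On a GPU, the entry says the compiler takes over the mapping of tiles to threads. The function name and tile size are illustrative assumptions.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply a (m x k) by b (k x n), visiting the output tile by tile."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):          # tile row of the output
        for j0 in range(0, n, tile):      # tile column of the output
            for k0 in range(0, k, tile):  # tiles along the shared dimension
                # Accumulate this tile's partial products into the output tile.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
print(tiled_matmul(a, b))
```

The result does not depend on the tile size; only the order in which partial sums are accumulated changes.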

Focus on Your Algorithm—NVIDIA CUDA Tile Handles the Hardware

2025-12-04 22:20

🚀 NVIDIA has launched CUDA 13.1, featuring NVIDIA CUDA Tile, which the post presents as a significant advancement for the platform since CUDA's introduction in 2006. This innovation allows developers to write algorithms at a higher level, simplifying code for specialized hardware like tensor cores. Tile programming focuses on processing data in chunks, enhancing compatibility with future architectures. Discover how this evolution supports AI and computational workloads! #NVIDIA #CUDA #AI #Programming #TechInnovation

Source: Nvidia Developer Blog

Jonathan Bentz

Product Announcements

Optimize Data Center Efficiency for AI and HPC Workloads with Power Profiles

2025-12-04 17:00

📈 The demand for computational power is rising, leading to increased energy consumption in data centers. To address this, NVIDIA has introduced energy-optimized power profiles with the Blackwell B200. 🔋 This new software helps users maximize performance while managing power constraints. It offers coarse-grained control for HPC and AI workloads, achieving up to 15% energy savings and a 13% increase in throughput. 🤖 The one-click tuning simplifies the complex process of optimizing GPU settings,...

Source: Nvidia Developer Blog

Pratikkumar Patel

Product Announcements

How to Enhance 3D Gaussian Reconstruction Quality for Simulation

2025-12-03 17:30

Building photorealistic 3D environments for simulation presents challenges, even with advanced methods like 3D Gaussian Splatting. Artifacts such as blurriness and holes can affect visual quality. NVIDIA's Omniverse NuRec addresses this with a generative model called Fixer, which effectively removes these artifacts. The post details how to use Fixer to enhance noisy 3D scenes, specifically for autonomous vehicle simulation. It includes a guide on downloading a sample scene from the...

Source: Nvidia Developer Blog

Wonsik Han

Educational

Accelerating Real-Time Financial Decisions with Quantitative Portfolio Optimization

2025-12-02 18:51

Unlocking the future of financial portfolio optimization! 🚀 This article discusses the challenges of balancing speed and complexity in financial decision-making. The new Quantitative Portfolio Optimization framework aims to overcome these hurdles by transforming slow processes into fast, iterative workflows. With high-performance hardware and NVIDIA cuOpt solvers, it achieves significant speedups, enhancing strategy backtesting and analysis. The CUDA ecosystem further accelerates data...

Source: Nvidia Developer Blog

Peihan Huo

Technical Deep Dives

NVIDIA-Accelerated Mistral 3 Open Models Deliver Efficiency, Accuracy at Any Scale

2025-12-02 18:10

🚀 Exciting news in AI! The Mistral 3 open model family has been launched, offering enhanced accuracy and efficiency for developers and enterprises. This suite features a large multimodal and multilingual model with 675B parameters, alongside smaller high-performance models (3B, 8B, 14B) for diverse applications. All models are trained on NVIDIA Hopper GPUs and available via Mistral AI on Hugging Face, providing versatile deployment options. #AI #MachineLearning #NVIDIA #Mistral3 #TechInnovation

Source: Nvidia Developer Blog

Anu Srivastava

Product Announcements

AWS Integrates AI Infrastructure with NVIDIA NVLink Fusion for Trainium4 Deployment

2025-12-02 16:00

🚀 AWS has announced a collaboration with NVIDIA to enhance AI infrastructure using NVLink Fusion. This integration aims to support the deployment of Trainium4 AI chips and other technologies. 🔗 NVLink Fusion will provide a high-performance, rack-scale platform, addressing the growing complexities of AI workloads while reducing deployment risks. 📈 The partnership focuses on improving networking capabilities and streamlining the development process for custom AI solutions. #AWS #NVIDIA...

Source: Nvidia Developer Blog

Jesse Clayton

Product Announcements

Train Small Orchestration Agents to Solve Big Problems

2025-12-01 23:25

At NVIDIA Research, we're advancing agent design by addressing the challenge of selecting the right tools for tasks. Our new approach involves an "orchestrator" model that supervises other models, considering user preferences like speed, cost, and accuracy. Interestingly, small models can effectively manage this process when properly tuned. Introducing ToolOrchestra, our method for data preparation and reinforcement-learning training to enhance orchestration. #NVIDIA #AI #MachineLearning...

Source: Nvidia Developer Blog

Shizhe Diao

Technical Deep Dives
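
As a rough illustration of how user preferences can change which tool an orchestrator selects, the sketch below uses a hand-written weighted score. This is a stand-in only: the entry above describes ToolOrchestra as training the orchestrator with reinforcement learning, not as a fixed formula. The tool names, their ratings, and the weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    accuracy: float  # 0..1, higher is better
    cost: float      # 0..1 normalized, lower is better
    latency: float   # 0..1 normalized, lower is better


def pick_tool(tools, w_accuracy, w_cost, w_speed):
    """Return the tool with the best preference-weighted score."""
    def score(t):
        return w_accuracy * t.accuracy - w_cost * t.cost - w_speed * t.latency
    return max(tools, key=score)


# Hypothetical tool catalogue (assumption, not from the article).
TOOLS = [
    Tool("large_llm", accuracy=0.95, cost=0.90, latency=0.80),
    Tool("small_llm", accuracy=0.75, cost=0.10, latency=0.15),
    Tool("calculator", accuracy=0.60, cost=0.01, latency=0.01),
]

accuracy_first = pick_tool(TOOLS, w_accuracy=1.0, w_cost=0.05, w_speed=0.05)
budget_first = pick_tool(TOOLS, w_accuracy=0.3, w_cost=1.0, w_speed=0.5)
print(accuracy_first.name, budget_first.name)
```

The same catalogue yields different choices once the weights shift from accuracy toward cost and speed, which is the trade-off a trained orchestrator has to learn.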

Build Efficient Financial Data Workflows with AI Model Distillation

2025-12-01 22:00

Unlock the potential of AI in finance with Model Distillation! 💡 Large language models are transforming quantitative finance for tasks like alpha generation and risk prediction. However, challenges in cost and integration persist. NVIDIA technology offers solutions for continuous model fine-tuning, creating smaller, efficient models that maintain high accuracy while reducing costs. This enables seamless integration into financial workflows. Discover how developers can utilize tested...

Source: Nvidia Developer Blog

Dhruv Desai

Educational

How to Scale Data Generation for Physical AI with the NVIDIA Cosmos Cookbook

2025-12-01 17:00

Unlock the potential of physical AI with the NVIDIA Cosmos Cookbook! 📚 This guide offers step-by-step recipes for generating scalable, high-fidelity synthetic data, addressing the challenges of collecting diverse real-world datasets. 🌍 Key features include video data augmentation techniques using various control modalities like depth, edge, and segmentation, enabling developers to create realistic variations while maintaining consistency. 🤖 Perfect for robotics developers aiming to enhance...

Source: Nvidia Developer Blog

Prachi Mishra

Educational

Making GPU Clusters More Efficient with NVIDIA Data Center Monitoring

2025-11-25 21:00

High-performance computing is growing rapidly, driven by advances in AI and large language models. As GPU demand increases, using GPUs efficiently becomes crucial. The article discusses strategies for minimizing idle GPU waste in large clusters. Key sources of waste include hardware failures, misconfigured jobs, and idle sessions. Each type requires its own solution to raise productivity and reduce operational costs. Effective monitoring and targeted programs can lead to...

Source: Nvidia Developer Blog

Sachin Lakharia

Technical Deep Dives

Making Robot Perception More Efficient on NVIDIA Jetson Thor

2025-11-25 18:00

Building autonomous robots requires efficient visual perception for tasks like obstacle recognition and navigation. 🚀 NVIDIA Jetson platforms, such as Jetson AGX Orin and Thor, combine powerful GPUs with dedicated hardware accelerators to enhance performance while managing power consumption. 🔋 The NVIDIA Vision Programming Interface (VPI) helps developers unlock the full potential of these accelerators, enabling low-latency applications. A development example includes creating a multi-stream...

Source: Nvidia Developer Blog

Chintan Intwala

Educational

Build and Run Secure, Data-Driven AI Agents

2025-11-24 19:49

🚀 As generative AI evolves, organizations require accurate and reliable AI agents tailored to their data. NVIDIA introduces the AI-Q Research Assistant and Enterprise RAG Blueprints, leveraging retrieval-augmented generation (RAG) for enhanced document comprehension and reporting. Deployment involves secure, scalable infrastructure on AWS, utilizing Amazon EKS, OpenSearch, and S3 for optimal performance. Explore how NVIDIA's blueprints harness advanced models for efficient data processing and...

Source: Nvidia Developer Blog

Abdullahi Olaoye

Technical Deep Dives

Model Quantization: Concepts, Methods, and Why It Matters

2025-11-24 19:23

Understanding Model Quantization is essential for optimizing AI performance. This technique allows complex models to run efficiently on limited hardware by reducing the precision of model parameters. Tools like NVIDIA TensorRT and Model Optimizer help simplify this process while preserving accuracy. Explore how quantization can enhance memory usage, inference speed, and energy consumption in AI applications. #ModelQuantization #AI #NVIDIA #DeepLearning #TechTalks

Source: Nvidia Developer Blog

Ruixiang Wang

Educational
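
A minimal sketch of the idea in the quantization entry above: symmetric per-tensor int8 quantization in plain Python. It is a generic textbook scheme, not the TensorRT or Model Optimizer API, and the sample weights are made up for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [x * scale for x in q]


weights = [0.82, -1.27, 0.003, 0.45, -0.9]  # made-up sample values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("codes:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each value is stored in 8 bits instead of 32, and the round-trip error stays within half a quantization step, which is the memory-versus-precision trade the entry refers to.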

Breaking Through Reinforcement Learning Training Limits with Scaling Rollouts in BroRL

2025-11-19 21:51

🚀 Exciting developments in reinforcement learning! NVIDIA Research introduces Broadened Reinforcement Learning (BroRL), a new approach that enhances large language model training. Unlike traditional methods that focus on increasing training steps, BroRL emphasizes increasing exploratory rollouts, allowing for hundreds of rollouts per prompt. This innovative method breaks through performance plateaus seen in previous models, enabling continuous learning while being more data- and compute-...

Source: Nvidia Developer Blog

Jian Hu

Product Announcements
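
One intuition for why many rollouts per prompt can help, offered as a simplified assumption and not as BroRL's actual algorithm or analysis: if each rollout independently solves a hard prompt with a small probability, the chance of seeing at least one success grows quickly with the number of rollouts. The success rate below is hypothetical.

```python
def prob_any_success(p_success, rollouts):
    """Chance that at least one of `rollouts` independent samples succeeds."""
    return 1 - (1 - p_success) ** rollouts


p = 0.01  # hypothetical per-rollout success rate on a hard prompt
for n in (8, 64, 512):
    print(f"{n:4d} rollouts -> P(at least one success) = {prob_any_success(p, n):.3f}")
```

With few rollouts a rare correct answer is usually never sampled, so there is little reward signal to learn from; with hundreds it almost always appears at least once.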

Building Better Qubits with GPU-Accelerated Computing

2025-11-19 17:00

Quantum computing is set to transform various fields, but developing effective qubits remains a challenge due to sensitivity to noise. NVIDIA and Berkeley Lab are advancing this area with GPU-accelerated EDA tools, enhancing the design of quantum chips. Their open-source simulation package, ARTEMIS, has achieved significant milestones in simulating full quantum chips. These innovations help researchers address complex interactions and improve accuracy in chip design, crucial for the future of...

Source: Nvidia Developer Blog

Zhi (Jackie) Yao

Technical Deep Dives

Building Scalable AI on Enterprise Data with NVIDIA Nemotron RAG and Microsoft SQL Server 2025

2025-11-18 20:00

🚀 Exciting developments at Microsoft Ignite 2025! Microsoft announced SQL Server 2025, which integrates with NVIDIA's Nemotron RAG for enhanced AI capabilities. This collaboration allows developers to build secure, high-performance AI applications using data from cloud or on-premises. Key benefits include improved retrieval-augmented generation (RAG) performance and flexibility, addressing common enterprise challenges. The new architecture leverages NVIDIA GPUs for efficient embedding...

Source: Nvidia Developer Blog

Uttara Kumar

Product Announcements

Faster Chemistry and Materials Discovery with AI-Powered Simulations Using NVIDIA ALCHEMI

2025-11-18 17:00

Unlocking faster discoveries in chemistry and materials science is now possible with NVIDIA ALCHEMI. 🧪 Traditional methods are slow and costly, but ALCHEMI introduces two new atomistic simulation services: Batched Conformer Search (BCS) and Batched Molecular Dynamics (BMD). These tools leverage advanced AI to efficiently identify low-energy conformers, enhancing property predictions. The BCS NIM uses machine learning to speed up energy optimization, making it easier for researchers to explore...

Source: Nvidia Developer Blog

Wen Jie Ong

Product Announcements

NVIDIA NVQLink Architecture Integrates Accelerated Computing with Quantum Processors

2025-11-17 22:31

🚀 Quantum computing is evolving through the integration of accelerated computing and quantum processors. NVIDIA's NVQLink architecture supports this by connecting GPU superchips with quantum system controllers, enhancing real-time calibration and error correction. This open platform enables efficient workloads and fosters innovation across various quantum technologies. Discover how NVQLink could transform quantum computing! #QuantumComputing #NVIDIA #AcceleratedComputing #Innovation #TechNews

Source: Nvidia Developer Blog

Shane Caldwell

Technical Deep Dives

Pioneering AI Co-Scientists for Fusion Research and Cancer Treatment

2025-11-17 22:30

AI is transforming scientific research by enabling the generation and analysis of data in new ways. Collaborative AI co-scientists assist researchers in developing hypotheses and experimental plans. They leverage advanced reasoning and interdisciplinary knowledge to accelerate discoveries. NVIDIA and Los Alamos National Laboratory are developing co-scientists for two critical areas: inertial confinement fusion and cancer treatment. These AI models aim to tackle complex scientific challenges...

Source: Nvidia Developer Blog

Geetika Gupta

Industry Analysis

Achieve CUTLASS C++ Performance with Python APIs Using CuTe DSL

2025-11-13 20:30

🚀 CuTe, a key part of CUTLASS 3.x, simplifies data layouts and thread mappings for GPU programming. The new CuTe DSL in CUTLASS 4 allows Python developers to create efficient GPU kernels without the complexities of C++ templates. It ensures consistent performance across NVIDIA GPUs while improving compilation speed and error handling. Explore examples on GitHub to see its capabilities! 💻✨ #CUTLASS #CuTe #Python #GPUProgramming #NVIDIA

Source: Nvidia Developer Blog

Brandon Sun

Technical Deep Dives
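
The layout concept behind the CuTe entry above can be sketched in a few lines. CuTe describes a data layout as a shape paired with strides, and a coordinate maps to a memory offset as the sum of coordinate times stride. The code below is plain Python for illustration only; it does not use the CuTe DSL, and the function name is an assumption.

```python
def layout_index(coord, stride):
    """Map a logical coordinate to a linear offset: sum(coord_i * stride_i)."""
    return sum(c * s for c, s in zip(coord, stride))


shape = (2, 3)      # 2 rows, 3 columns
row_major = (3, 1)  # moving one row skips 3 elements, one column skips 1
col_major = (1, 2)  # moving one row skips 1 element, one column skips 2

offsets = {}
for name, stride in (("row-major", row_major), ("column-major", col_major)):
    offsets[name] = [
        [layout_index((r, c), stride) for c in range(shape[1])]
        for r in range(shape[0])
    ]
    print(name, offsets[name])
```

The same logical 2 x 3 matrix lands at different memory offsets depending only on the strides, so an algorithm written against coordinates can stay unchanged while the layout varies.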

How to Get Started with Neural Shading for Your Game or Application

2025-11-13 19:55

Unlock the future of rendering with neural shading! 🎮✨ For 25 years, real-time rendering has evolved alongside hardware advancements. As traditional methods hit their limits, neural shading offers a new path by integrating AI models into the graphics pipeline. This technique enhances performance and visual fidelity using dedicated AI hardware like NVIDIA’s Tensor Cores. It enables efficient real-time execution of small neural networks in shaders, simplifying complex visual challenges....

Source: Nvidia Developer Blog

Shannon Woods

Educational

NVIDIA Blackwell Architecture Sweeps MLPerf Training v5.1 Benchmarks

2025-11-13 00:08

🚀 The NVIDIA Blackwell architecture has set a new standard by achieving the fastest training times across all MLPerf Training v5.1 benchmarks. This architecture demonstrates significant performance improvements, essential as AI models grow larger and more complex. 📊 Key highlights include the fastest times for various models, including Llama 3.1 and DLRM-DCN, and NVIDIA being the only platform to submit results on every benchmark. NVIDIA's innovation in low-precision data formats, particularly NVFP4, is a critical...

Source: Nvidia Developer Blog

Ashraf Eassa

Industry Analysis

Just Released: Warp 1.10 Expands JAX Interoperability and Performance

2025-11-13 00:07

🚀 Just released: Warp 1.10 enhances JAX interoperability and performance! Key updates include improvements for high-performance GPU simulations, expanded support for Tile programming, and better compatibility with Arm architecture. Explore the latest features to optimize your workflows! #NVIDIA #Warp #JAX #GPU #TechUpdates

Source: Nvidia Developer Blog

Mohammad Mohajerani

Product Announcements

Fusing Communication and Compute with New Device API and Copy Engine Collectives in NVIDIA NCCL 2.28

2025-11-11 00:06

🚀 Exciting news in the tech world! NVIDIA has released NCCL 2.28, enhancing communication and computation efficiency. This update introduces GPU-initiated networking and device APIs, allowing developers to create custom kernels that integrate networking directly into compute tasks. Key features include Copy Engine-based collectives and the NCCL Inspector for better monitoring and profiling. These advancements aim to improve throughput, reduce latency, and maximize GPU utilization in multi-GPU...

Source: Nvidia Developer Blog

Sylvain Jeaugey

Product Announcements

Upcoming Livestream: Build Visual AI Agents with NVIDIA Cosmos Reason and Metropolis

2025-11-10 22:22

Join the upcoming livestream on November 18, where you'll learn to fine-tune the NVIDIA Cosmos Reason VLM to create visual AI agents. 🕕 The session runs from 18:00 to 19:00 (CET). This is a great opportunity for anyone interested in AI and data-driven solutions. Don't miss out! #NVIDIA #AI #Livestream #VisualAI #TechEvent

Source: Nvidia Developer Blog

Tanya Lenz

Event
