Articles by Category: Technical Deep Dives

Improving Quality of Recommended Content through Pinner Surveys

2025-12-05 20:02
Pinterest is enhancing content quality by utilizing user feedback through surveys. 📊 In partnership with the Inspired Internet Pledge, the platform collects ratings on visual appeal to understand what users value. This data is then used to train machine learning models to improve recommendations across Homefeed, Related Pins, and Search. The initiative aims to reduce low-quality content and elevate user experience. 📈✨ Learn more about how Pinterest is putting Pinners first! #Pinterest...
Pinterest Engineering

NVIDIA Grace CPU Delivers High Bandwidth and Efficiency for Modern Data Centers

2025-12-05 17:00
🚀 The NVIDIA Grace CPU has been transforming data centers since its 2023 launch, achieving impressive performance efficiency. Grace combines Arm Neoverse cores with advanced technologies for high bandwidth and energy efficiency. Its single NUMA design simplifies software development by allowing equal memory access across all cores. This architecture benefits cloud environments, enhancing virtual machine performance without the drawbacks of traditional chiplet designs. #NVIDIA #GraceCPU #DataCenters...
Praveen Menon

Treating your agents like microservices

2025-12-05 08:40
Exploring the future of multi-agent architectures, Ryan talks with Guillaume De Saint Marc from Outshift by Cisco. They discuss how treating agents as microservices can enhance scalability and decentralization. Key points include the challenges of current infrastructure and the need for effective communication protocols. Discover more about these emerging technologies and connect with the AGNTCY community. 🚀🔗 #MultiAgentArchitecture #Microservices #Cisco #EmergingTech #AI
Phoebe Sajor

MCP-Powered Financial AI Workflows on Databricks

2025-12-04 23:00
Discover how Model Context Protocol (MCP) and Databricks Agent Bricks are transforming financial data management. 💼 These technologies enable secure, real-time data processing and streamline workflow automation in finance. Stay ahead in the industry with these innovative solutions! 🚀 #FinanceTech #DataAutomation #MCP #Databricks #Innovation

Titans + MIRAS: Helping AI have long-term memory

2025-12-04 19:26
🚀 Exciting advancements in AI! The Titans architecture and MIRAS framework are introduced to enhance AI's long-term memory while it operates. These innovations allow models to process extensive contexts efficiently, updating their memory in real time. Unlike traditional methods, Titans combines the speed of RNNs with the accuracy of transformers, enabling AI to adapt dynamically to new information without requiring offline retraining. #AI #MachineLearning #TechInnovation #LongTermMemory...

KubeVirt’s Architecture: CRDs, Controllers and Daemons

2025-12-04 16:00
🚀 KubeVirt is reshaping how organizations migrate to Kubernetes by bridging legacy virtual machines and modern containers. This architecture extends Kubernetes' capabilities, allowing VMs to coexist with containers on a single platform. KubeVirt introduces Custom Resource Definitions (CRDs) to manage VMs as native resources, simplifying orchestration. The recent eBook, “Running Virtual Machines on Kubernetes,” offers a comprehensive guide on this migration process. 📖 #KubeVirt #Kubernetes...
Janakiram MSV

Architecting efficient context-aware multi-agent framework for production

2025-12-04 00:00
🔍 The AI landscape is evolving, with organizations deploying multi-agent systems for complex tasks. However, managing context has become a significant challenge. 🔧 ADK introduces **Context Engineering**, a new approach that treats context as a separate system. This architecture focuses on efficient processing, compaction, and caching to enhance AI performance. 💡 By enabling scoped context handoffs, this method aims to improve reliability and cost-effectiveness in production environments. #AI...

We Got Claude to Fine-Tune an Open Source LLM

2025-12-04 00:00
🚀 Exciting news for AI developers! Claude now has the ability to fine-tune open-source language models with the new Hugging Face Skills tool. This feature allows users to write training scripts, submit jobs to cloud GPUs, and monitor progress seamlessly. The hf-llm-trainer skill equips Claude with knowledge on model training, GPU selection, and configuration. Learn how to leverage this tool to enhance your AI projects! #ArtificialIntelligence #MachineLearning #OpenSource #HuggingFace #ClaudeAI

How We Debug 1000s of Databases with AI at Databricks

2025-12-03 22:00
At Databricks, AI is transforming database management. The company has implemented an AI-assisted debugging platform that replaces manual operations, significantly cutting down the time spent on database issues. This innovation allows for more efficient management of thousands of databases. Explore the future of database debugging! 💻🤖 #AI #DatabaseManagement #Databricks #Innovation #TechTrends

Combining Rust and Python for High-Performance AI Systems

2025-12-03 21:00
🚀 Python dominates AI and ML with its robust ecosystem, but it faces speed limitations due to its interpreted nature and global interpreter lock. 🌟 Rust offers high performance and memory safety, making it ideal for critical components in AI systems. By combining both, developers can prototype in Python while leveraging Rust for performance-intensive tasks. Learn how this hybrid approach is reshaping AI development and enhancing efficiency! #AI #MachineLearning #Rust #Python #HighPerformance
Zziwa Raymond Ian

Autonomous Observability at Pinterest (Part 1 of 2)

2025-12-03 17:02
At Pinterest, we are enhancing our observability tools to create a unified experience. Traditionally, our systems operated in silos, making it hard to connect logs, metrics, and traces. To address this, we are implementing the Model Context Protocol (MCP) to streamline our observability data. This will enable faster root-cause analysis and empower our teams to build context-aware tools. We are excited to embrace AI in this journey, aiming for a more intelligent observability future! 🚀🔍...
Pinterest Engineering

How Agentforce Achieved 3–5x Faster Response Times While Solving Enterprise-Scale Architectural Complexity

2025-12-03 04:15
🚀 Exciting progress in AI! Krista Hardebeck and her team at Salesforce have successfully implemented the Agentforce service for a major retailer, achieving response times 3–5x faster. They optimized complex architectures while integrating brand-specific conversational AI. Key improvements included reducing reasoning latency by 20 seconds and ensuring reliability in high-volume order interactions. The project emphasized a strong technical foundation, enabling expansion across multiple brands...
Scott Nyberg

Build a SendGrid MCP Server for AI Email Workflows

2025-12-03 00:00
🚀 Learn how to build a SendGrid MCP Server for AI email workflows! This guide offers insights on setting up a scalable email platform, utilizing Twilio's APIs for efficient communication. Discover best practices and tips for enhancing your email capabilities in the AI landscape. #SendGrid #AIWorkflows #Twilio #EmailMarketing #TechTips
Denis Kuria

Accelerating Real-Time Financial Decisions with Quantitative Portfolio Optimization

2025-12-02 18:51
Unlocking the future of financial portfolio optimization! 🚀 This article discusses the challenges of balancing speed and complexity in financial decision-making. The new Quantitative Portfolio Optimization framework aims to overcome these hurdles by transforming slow processes into fast, iterative workflows. With high-performance hardware and NVIDIA cuOpt solvers, it achieves significant speedups, enhancing strategy backtesting and analysis. The CUDA ecosystem further accelerates data...
Peihan Huo

All About Cedar, an Open Source Solution for Fine-Tuning Kubernetes Authorization

2025-12-02 18:00
Cedar is an open-source authorization engine developed by AWS, designed to enhance Kubernetes’ role-based access control (RBAC). While RBAC has been effective since 2017, it has limitations, such as not allowing conditional access or attribute-based controls. Micah Hausler, a principal engineer at AWS, highlighted how Cedar was initially created to solve authorization challenges. It has proven to be a clear and concise language that helps both technical and non-technical users understand...
Heather Joslyn

Run cost-effective AI workloads on OpenShift with AWS Neuron Operator

2025-12-02 16:30
🚀 Enhance your AI workloads with the AWS Neuron Operator on Red Hat OpenShift! This collaboration allows enterprises to run LLM inference and training with AWS Inferentia and Trainium chips, offering up to 70% lower costs per inference. The AWS Neuron Operator simplifies deployment and management of AI devices, optimizing performance and cost efficiency. Key features include automated scheduling, device management, and telemetry collection. Gain flexibility and significant savings while...
Erwan Gallen, Yevgeny Shnaidman, Nenad Peric

Improving MySQL® Cluster Uptime: Designing Advanced Detection, Mitigation, and Consensus with Group Replication

2025-12-02 14:00
At Uber, maintaining high availability is crucial. The article discusses our implementation of MySQL® Group Replication in single-primary mode. This approach has successfully reduced failover time to under 10 seconds, enhancing both reliability and write availability during failures. Learn more about our strategies for improving MySQL® cluster uptime. 🚀🔧 #MySQL #GroupReplication #HighAvailability #TechInnovation #UberEngineering

Automate unique compliance checks with OpenShift and CustomRule

2025-12-02 08:00
Unlock enhanced compliance for Red Hat OpenShift with the new CustomRule feature! 🚀 This article reveals how security teams can automate unique compliance checks, transforming specific security rules into code. It offers practical examples on writing and integrating CustomRules into existing workflows, streamlining the auditing process. Currently in Tech Preview, this feature is designed to help organizations maintain compliance efficiently. Explore more about CustomRules and their potential!...
Vincent Shen, Lance Bragstad

Train Small Orchestration Agents to Solve Big Problems

2025-12-01 23:25
At NVIDIA Research, we're advancing agent design by addressing the challenge of selecting the right tools for tasks. Our new approach involves an "orchestrator" model that supervises other models, considering user preferences like speed, cost, and accuracy. Interestingly, small models can effectively manage this process when properly tuned. Introducing ToolOrchestra, our method for data preparation and reinforcement-learning training to enhance orchestration. #NVIDIA #AI #MachineLearning...
Shizhe Diao

How Okta Scaled From 12 to 1,000 Kubernetes Clusters With Argo CD

2025-12-01 22:00
Okta faced challenges with its Auth0 platform for private cloud customers, prompting a shift to open source GitOps using Argo CD. At KubeCon + CloudNativeCon, engineers Jérémy Albuixech and Kahou Lei shared insights on scaling from 12 to over 1,000 Kubernetes clusters in five years. Argo CD has proven to be an effective solution, supported by a growing community. #Kubernetes #GitOps #Okta #ArgoCD #CloudNative
B. Cameron Gain

Under the Hood of Confluence Race Mode

2025-12-01 17:50
🚗 Exciting news from Atlassian! We've launched Confluence Race Mode, a mini-game that lets teams race cars on whiteboards. This initiative, inspired by Williams Racing, started as a project during our ShipIt hackathon. Research shows that play can boost team productivity by up to 20%. Dive into our blog for insights on the architecture behind Race Mode, including the use of Entity Component Systems and state machines for optimal performance. Try it for yourself: wb.new/racing! 🎮 #Confluence...
Jovana Dunisijevic

Optimizing blood and VFX systems in UE5 for Let Them Come: Onslaught

2025-12-01 17:32
Tuatara Games has successfully optimized thousands of blood decals and FX events in Unreal Engine 5 for their game, Let Them Come: Onslaught. This enhancement aims to ensure smooth and chaotic gameplay, creating an immersive experience for players. Learn more about their techniques and the impact on gameplay! 🎮💥🩸 #GameDevelopment #UnrealEngine5 #LetThemComeOnslaught #TuataraGames #GamingNews

Streamlining Security Investigations with Agents

2025-12-01 16:00
🔒 Slack's Security Engineering team is enhancing its security processes by integrating AI agents. Their security event ingestion pipeline manages billions of daily events, focusing on alert reviews during on-call shifts. The team is refining a prototype for AI-driven investigations to improve efficiency and decision-making. This post is the first in a series detailing their design choices and learnings. Stay tuned for more insights! 🚀 #CyberSecurity #AI #TechInnovation #Slack #SecurityAgents
Dominic Marks

How AI-Driven Refactoring Cut a 2-Year Legacy Code Migration to 4 Months

2025-12-01 15:12
🚀 Exciting advancements in code migration! In a recent article, Lilach Nachmias, Senior Manager of Software Engineering at Salesforce, shared insights on migrating the Own Archive package into Salesforce’s Core infrastructure. This process transformed a two-year effort into just four months through AI-driven refactoring. The migration tackled challenges like undocumented legacy Apex patterns and deep dependency chains, ensuring a fully native product that enhances security and reliability....
Scott Nyberg

Kamera Uses Simulation To Verify Kubernetes Controller Logic

2025-11-28 13:00
🌐 Tim Goodwin, a graduate student from UC Santa Cruz, is exploring the future of Kubernetes as a universal control plane. He developed Kamera, a simulation tool that helps manage and debug Kubernetes controllers without needing a cluster. 📊 Kamera captures the behavior of controllers, enhancing their functionality and making it easier for developers. Goodwin presented this innovation at KubeCon + CloudNativeCon NA 2025. Discover how Kubernetes can manage a variety of systems beyond...
Joab Jackson

Creating a domain-specific NL-to-SQL MCP server

2025-11-28 06:00
🚀 Enterprise data analysis often struggles with turning business questions into SQL queries due to a lack of data analysts. Generative AI, such as large language models (LLMs), offers a potential solution. 💡 However, initial experiments revealed that generic LLMs lack the specific knowledge of business contexts and data semantics, leading to incorrect SQL generation and security issues. 🔍 To address these challenges, a domain-specific NL-to-SQL MCP server was proposed. This system aims to bridge...

How does cgroups v2 impact Java and Node.js in OpenShift 4?

2025-11-27 07:01
Understanding the impact of cgroups v2 on Java and Node.js in OpenShift 4 is essential for developers. This article outlines compatibility concerns and solutions associated with cgroups v2, a Linux kernel feature for resource management in containers. It details how different OpenShift versions handle cgroups, emphasizing the importance of using the latest images for Node.js and OpenJDK. For those using Java, cgroups v2 support was introduced in OpenJDK 8u372 and is present in later versions. Node.js...
Francisco De Melo Junior
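
To make the v1/v2 difference concrete, here is a minimal sketch of how a process could work out which cgroup version it sees and what memory limit applies. It relies on the kernel's standard file layout: `cgroup.controllers` and `memory.max` in the unified v2 hierarchy, and `memory/memory.limit_in_bytes` under v1. The function name `container_memory_limit` and the `root` parameter are illustrative assumptions; this is not code from the article, nor how the JVM or Node.js implement their detection.

```python
from pathlib import Path


def container_memory_limit(root="/sys/fs/cgroup"):
    """Return (cgroup_version, limit_in_bytes), with None when no limit is found.

    Hypothetical helper for illustration; `root` is a parameter so the
    function can be pointed at a test directory.
    """
    base = Path(root)
    if (base / "cgroup.controllers").exists():
        # cgroups v2: a single unified hierarchy; the limit lives in memory.max
        # and the literal string "max" means "no limit".
        limit_file = base / "memory.max"
        if not limit_file.exists():
            return (2, None)
        raw = limit_file.read_text().strip()
        limit = None if raw == "max" else int(raw)
        return (2, limit)
    # cgroups v1: one hierarchy per controller; an unlimited cgroup reports a
    # very large number here rather than a keyword.
    limit_file = base / "memory" / "memory.limit_in_bytes"
    if limit_file.exists():
        return (1, int(limit_file.read_text().strip()))
    return (1, None)


if __name__ == "__main__":
    print(container_memory_limit())
```

A runtime that only knows the v1 path finds no limit on a v2 host and may size its heap from the node's total memory, which is the kind of compatibility concern the summary refers to.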

Turning Handoffs into Handshakes: Integrating Design Systems for AI Prototyping at Scale

2025-11-26 19:17
At Atlassian, we are integrating AI into our design and engineering processes to enhance collaboration and efficiency. 🤝 AI now assists in ideation and prototyping, allowing teams to focus on decision-making rather than repetitive tasks. This shift transforms traditional handoffs into collaborative interactions. ⚙️ By leveraging our design system, AI achieves about 70% accuracy in prototyping from a single screenshot, streamlining workflows and maintaining brand consistency. 📐 We share our...
Laura Huerta

Building AI Agents in Kotlin – Part 2: A Deeper Dive Into Tools

2025-11-26 08:23
Unlock the potential of AI agents with Kotlin! 🚀 In the latest article, we explore extending the capabilities of a basic coding agent using the Koog framework. This includes building an ExecuteShellCommandTool, allowing the agent to run code and learn from real-time feedback. We also discuss the anatomy of a Koog tool and essential safety considerations when enabling command-line execution. Understanding these components is crucial for creating effective tools. #Kotlin #AI #KoogFramework...
Bruno Lannoo

Trusted execution clusters operator: Design and flow overview

2025-11-26 08:01
🔒 Confidential computing enhances cloud-native security by protecting data in use, which is traditionally vulnerable. The trusted execution cluster operator, a Kubernetes-native tool, manages clusters with hardware-based security features like secure enclaves and memory encryption. This ensures that sensitive workloads are accessed only by verified software. Key components include the Trustee for attestation and key management, and the operator automates the configuration of security...
Alice Frosi, Jakob Naucke

Autoscaling vLLM with OpenShift AI model serving: Performance validation

2025-11-26 07:01
🚀 Exciting insights on autoscaling with vLLM in OpenShift AI! This article compares KServe's KEDA-based autoscaling to Knative's concurrency-based approach. Key findings show KEDA's ability to scale effectively under both homogeneous and heterogeneous workloads, maintaining service-level objectives (SLO) better than Knative. 🔍 KEDA adapts to real-time metrics, ensuring efficient resource use and improved request success rates. Dive deeper into the performance results and implications for AI...
Alberto Perdomo
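
As a rough illustration of the two scaling signals being compared (not the article's benchmark code), the sketch below contrasts a metric-ratio calculation with a concurrency-based one. The first function follows the replica formula documented for the Kubernetes Horizontal Pod Autoscaler, which KEDA drives with external metrics. The second is a deliberately simplified stand-in for a concurrency-based autoscaler and should not be read as Knative's actual algorithm. All function names and numbers are assumptions.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """Documented HPA rule: ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)


def concurrency_desired_replicas(in_flight_requests, target_per_replica):
    """Simplified concurrency rule (assumption): enough replicas so that each
    one handles at most `target_per_replica` in-flight requests."""
    return max(1, math.ceil(in_flight_requests / target_per_replica))


if __name__ == "__main__":
    # Made-up numbers: 2 replicas, observed metric 30 against a target of 10.
    print(hpa_desired_replicas(2, 30, 10))
    # Made-up numbers: 45 in-flight requests, target of 10 per replica.
    print(concurrency_desired_replicas(45, 10))
```

With LLM serving, requests can differ widely in cost, so a count of in-flight requests and a latency- or queue-based metric can point to different replica counts for the same traffic.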

Real-time data quality monitoring: Kafka stream contracts with syntactic and semantic test

2025-11-26 00:00
Monitoring data quality is crucial in today's data-driven world. The article discusses the challenges in ensuring reliable data within Kafka streams, which are essential for real-time processing. The Coban platform introduces a solution that allows users to define data contracts, enabling automated quality checks. This helps identify issues in real time and notifies stakeholders promptly. Key components include data contract definitions, automated test execution, and result observability. 📊📈...
Source: Grab Tech
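
The idea of a data contract with syntactic and semantic tests can be sketched in a few lines. This sketch assumes "syntactic" means a field is present with the expected type and "semantic" means its value satisfies a business rule. The contract layout, the field names (`booking_id`, `fare`), and `check_record` are hypothetical and do not reflect the Coban platform's actual contract format.

```python
import re

# Hypothetical contract: per-field expected type plus a value-level rule.
CONTRACT = {
    "booking_id": {
        "type": str,
        "semantic": lambda v: re.fullmatch(r"BK-\d{6}", v) is not None,
    },
    "fare": {
        "type": float,
        "semantic": lambda v: v >= 0,
    },
}


def check_record(record, contract):
    """Return a list of (field, reason) violations for one decoded message."""
    violations = []
    for field, rule in contract.items():
        if field not in record:
            violations.append((field, "missing"))  # syntactic: presence
        elif not isinstance(record[field], rule["type"]):
            violations.append((field, "wrong type"))  # syntactic: type
        elif not rule["semantic"](record[field]):
            violations.append((field, "semantic rule failed"))  # value rule
    return violations


if __name__ == "__main__":
    print(check_record({"booking_id": "BK-123456", "fare": 12.5}, CONTRACT))
    print(check_record({"booking_id": "X", "fare": -1.0}, CONTRACT))
```

In a streaming setup, a check like this would run on each consumed message, with the violations forwarded to whatever alerting channel notifies stakeholders.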

Making GPU Clusters More Efficient with NVIDIA Data Center Monitoring

2025-11-25 21:00
High-performance computing is rapidly growing, driven by advancements in AI and large language models. As GPU demand increases, optimizing GPU efficiency is crucial. A recent article discusses strategies to minimize idle GPU waste in large clusters. Key issues include hardware failures, misconfigured jobs, and idle sessions. Each waste type requires specific solutions to enhance productivity and reduce operational costs. Effective monitoring and targeted programs can lead to...
Sachin Lakharia

Reducing experiment duration with predicted control variates

2025-11-25 16:28
Etsy has significantly enhanced its experimentation process with the evolution of CUPED, reducing average experiment duration by 3 days! 🕒 This variance reduction technique uses pre-experiment data to improve accuracy and speed in measuring outcomes. The introduction of CUPAC further cuts variance, leading to quicker decision-making. These advancements allow teams to conduct more experiments, ultimately enhancing the buyer and seller experience on the platform. #Etsy #CUPED #Experimentation...
Kelly McManus

Reducing experiment duration with predicted control variates

2025-11-25 16:28
Etsy has made significant strides in reducing experiment durations with the evolution of its CUPED technique. 🛍️ CUPED, or Controlled-Experiment Using Pre-Experiment Data, has decreased average experiment time by 3 days by leveraging historical data to enhance accuracy and speed. 📊 The introduction of CUPAC further refines this process, yielding an additional 10% variance reduction, allowing more experiments to be conducted yearly. 🧪 Etsy continues to explore new techniques to enhance its...
Kelly McManus
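
For readers unfamiliar with CUPED, the adjustment can be sketched as follows. This is the standard textbook form, not Etsy's implementation: each outcome y is shifted by theta * (x - mean(x)), where x is the same unit's pre-experiment value and theta = cov(x, y) / var(x). The function name and the synthetic data are illustrative assumptions.

```python
import random
import statistics


def cuped_adjust(y, x):
    """Return CUPED-adjusted outcomes y_i - theta * (x_i - mean(x)).

    y: in-experiment metric per unit; x: pre-experiment covariate per unit.
    """
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    # Sample covariance and variance, both with the n - 1 denominator.
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / statistics.variance(x)
    return [yi - theta * (xi - mean_x) for xi, yi in zip(x, y)]


# Synthetic data (assumption): the outcome is strongly correlated with the
# pre-experiment covariate, which is where CUPED helps most.
rng = random.Random(0)
x = [rng.gauss(10, 2) for _ in range(1000)]
y = [xi + rng.gauss(0, 1) for xi in x]
y_adj = cuped_adjust(y, x)

if __name__ == "__main__":
    print("variance before:", statistics.variance(y))
    print("variance after: ", statistics.variance(y_adj))
```

The adjustment leaves the mean of the metric unchanged while lowering its variance, so the same effect size reaches significance with fewer samples. CUPAC, as usually described, follows the same recipe but uses a model's prediction of the outcome as the covariate x.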

Docker Sandboxes: A New Approach for Coding Agent Safety

2025-11-25 15:00
🚀 Exciting developments in coding agent technology! As coding agents like Claude Code and Gemini CLI become more autonomous, developers face challenges in balancing access and safety. Current options like YOLO Mode and DIY VMs can compromise security or productivity. To address this, a new solution using Docker Sandboxes is being explored. This provides isolation for coding agents while allowing them to operate effectively. The experimental preview includes container-based isolation and broad...
Source: Docker Blog
Srini Sekaran

Building domain-specific LLMs with synthetic data and SDG Hub

2025-11-25 07:00
🌐 Synthetic data generation is transforming how we build large language models (LLMs). By using one model to create training examples for another, teams can fill domain-specific gaps without relying on scarce human data. 🔧 Enter SDG Hub, an open-source toolkit that simplifies synthetic data workflows. It allows users to mix LLM components with traditional data tools, enhancing efficiency and scalability. 📈 The process includes generating synthetic data, fine-tuning models, and deploying...
Shivchander Sudalairaj, Hao Wang, Addie Stevens