JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks in Multi-Model AI Pipelines
JetBrains released Mellum2, op…
How to Speed Up Transformer Training Using NVIDIA Apex (FusedAdam, FusedLayerNorm) and Native torch.amp
In this tutorial, we work thro…
Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent
Hermes Agent already remembers…
Parallax: A Parameterized Local Linear Attention That Keeps Softmax and Adds a Learned Covariance Correction Branch
The Transformer’s attention me…
Build Skill-Augmented AI Agents with SkillNet for Search, Evaluation, Graph Analysis, and Task Planning
In this tutorial, we implement…
Best Text-to-Speech (TTS) Models in 2026: A Benchmark-Based Comparison
Text-to-speech (TTS) moved fast …
Hermes Agent Ships Tool Search for MCP: Anthropic Evals Show 49% to 74% Accuracy Gain on Opus 4
Nous Research’s open-source He…
How to Use AgentTrove: Streaming 1.7M Agentic Traces and Building a Clean ShareGPT SFT Dataset in Python
In this tutorial, we explore A…