Links.
- A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026. Note: local archival copy.
- From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design. Note: local archival copy, updated 2025-12-20.
- You're staring at perf top showing 60% CPU time in pthread_mutex_lock. Your latency is in the toilet. Someone suggests "just use a spinlock" and suddenly your 16-core server is pegged at 100% doing nothing useful. This is the synchronization primitive trap, and most engineers step right into it because nobody explains when each primitive actually makes sense (see the sketch after this list). Note: local archival copy.
- Understanding GRPO and New Insights from Reasoning Model Papers. Note: beyond standard LLMs.
- Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers. Note: beyond standard LLMs.
- And How They Stack Up Against Qwen3. Note: beyond standard LLMs.
- Understanding How DeepSeek's Flagship Open-Weight Models Evolved. Note: thorough article on the DeepSeek architecture.
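
The mutex-versus-spinlock trap from the entry above is easy to reproduce locally. The sketch below is not from the linked article; the thread count, iteration count, and shared counter are illustrative assumptions. It protects the same hot counter once with a pthread mutex and once with a pthread spinlock: contended mutex waiters sleep in the kernel, while spinlock waiters keep their cores busy doing nothing useful.

```c
/*
 * Minimal sketch (assumptions: 4 threads, 1M increments each, a single hot
 * counter). Build with: gcc -O2 -pthread lock_demo.c -o lock_demo
 */
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define ITERS    1000000L

static long counter;
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_spinlock_t spin;

/* Mutex variant: threads that lose the race block and are woken later. */
static void *mutex_worker(void *arg)
{
    (void)arg;
    for (long i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&mtx);
        counter++;
        pthread_mutex_unlock(&mtx);
    }
    return NULL;
}

/* Spinlock variant: threads that lose the race busy-wait, burning CPU. */
static void *spin_worker(void *arg)
{
    (void)arg;
    for (long i = 0; i < ITERS; i++) {
        pthread_spin_lock(&spin);
        counter++;
        pthread_spin_unlock(&spin);
    }
    return NULL;
}

static void run(void *(*fn)(void *), const char *name)
{
    pthread_t t[NTHREADS];
    counter = 0;
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, fn, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    printf("%s: counter = %ld\n", name, counter);
}

int main(void)
{
    pthread_spin_init(&spin, PTHREAD_PROCESS_PRIVATE);
    run(mutex_worker, "mutex");
    run(spin_worker, "spinlock");
    pthread_spin_destroy(&spin);
    return 0;
}
```

Running each variant under time, or watching perf top while it runs, should make the tradeoff visible: how much CPU the spinlock version burns depends on core count, oversubscription, and critical-section length, which is exactly the point of the linked article that no single primitive is the right default.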