Length-Constrained Summarization with GRPO: Reward Signal Ablations on Reddit TL;DR
An ablation study of GRPO reward signals for 64-token Reddit TL;DR summarization across Qwen2.5-0.5B and LFM-2.5-350M on Apple Silicon.
A 942 MB checkpoint. Four Raspberry Pis. ~1.5 min gather. No single point of failure. A deep dive into smoltorrent, a distributed checkpoint sharding system built over raw TCP with replication, SHA-256 integrity verification, mDNS discovery, and Prometheus monitoring.
Build a 4-node Raspberry Pi 4B cluster with a UCTRONICS enclosure, PoE+ HATs, and a TP-Link LS110P PoE switch. Real numbers: 94.4 Mbps per link (100 Mbps switch ceiling), 62.3°C under full 16-core load, and zero throttling at 1800 MHz throughout.
Wire Mac minis into a high-bandwidth local Thunderbolt cluster for distributed training and inference with zero cloud egress cost, low latency, and direct control over cluster networking.