The "X technique reduces tokens by Y%" fad is so old.
Can't believe people get taken in by this.
Charly Wargnier (@DataChaz)
UP TO 95% TOKEN REDUCTION WITH ZERO CODE CHANGES
A Netflix engineer just open-sourced Headroom, and it’s one of the smartest ways I’ve seen to cut LLM costs.
It wraps Cursor or Claude in a local proxy to compress your payload before it hits the LLM:
→ Intelligently shrinks logs, JSON, and code
→ Perfectly preserves logic accuracy
→ Keeps 100% of your data local
→ Stops Opus-tier models from wasting tokens on boilerplate
It already crossed 35K stars, which says a lot.
100% free and open-source.
repo in 🧵↓
— https://nitter.net/DataChaz/status/2067996575817945197#m