Claude bills by tokens, not message count—token growth per conversation follows S…

title: @0x_kaize: Most people blame Claude for strict limits. I blamed Claude too.

Recently I rea...

author: 0x_kaize

content_type: twitter_article

published: 2026-03-29T16:03:52+00:00

source_url: https://x.com/0x_kaize/status/2038286026284667239

word_count: 1096

Most people blame Claude for strict limits. I blamed Claude too.

Recently I realized that Claude do

Recently I realized that Claude doesn't count the number of messages. it counts tokens. All you need to do is use tokens wisely, but not everyone knows how to do that and ends up losing a lot of tokens and money as a result.

I got really into this and put together a list of the best habits that will save you a ton of tokens.

1. Edit your prompt. Don't send a follow-up

When Claude doesn't get your thoughts right, you might feel tempted to send:

1/ “No, I meant [your message]”

2/ “Ugh, that's not what I wanted [your message]”

and so on

Don't do that!

Every subsequent message is added to the conversation history. Claude re-reads ALL of it every turn - burning tokens on context that didn't even help.

Token cost per message = all previous messages + your new one.

> Total = S × N(N+1) / 2

> (S = avg tokens per exchange, N = message count)

At ~500 tokens per exchange:

5 messages: 7.5K tokens

10 messages: 27.5K tokens

20 messages: 105K tokens

30 messages: 232K tokens

Message 30 costs 31x more than message 1

Instead: click Edit on your original message → fix it → regenerate. The old exchange gets replaced, not stacked.

Fix the prompt, don't feed the history.

2. Start a fresh chat every 15–20 messages

In the previous section, I showed how token costs grow with every message.

Ideally, you should start a new chat every 15–20 messages.

Now imagine a chat with 100+ messages. At ~500 tokens per exchange, that's over 2.5 million tokens burned - most of it just re-reading old history.

One developer tracked his usage and found that 98.5% of tokens were spent on re-reading the history. Only 1.5% went toward actually outputting the result.

Aniket Parihar's post on LinkedIn.

When a chat gets long → ask Claude to summarize everything → copy it → new chat → paste as first message.

3. Batch your questions into one message

Many people believe that splitting questions into separate messages leads to better results. Almost always, the opposite is true.

Three separate prompts = three context loads.

One prompt with three tasks = one context load.

You save tokens twice: fewer context reloads, and you stay further from hitting your limit.

Instead of: “Summarize this article”

“Now list the main points”

“Now suggest a headline”

Write: “Summarize this article, list the main points, and suggest a headline.”

Bonus: the answers often turn out better because Claude immediately sees the full picture.

Three questions. One Prompt. Always!

4. Upload recurring files to Projects

If you upload the same PDF to multiple chats, Claude re-tokenizes that document every single time.

Use the Projects feature instead.

Upload your file once → it gets cached. Every new conversation inside that project references it without burning tokens again.

Cached project content doesn't eat into your usage when you access it repeatedly.

If you work with contracts, briefs, style guides, or any long docs - this alone could cut your token spend dramatically.

5. Set up Memory & User Preferences

Every new chat without saved context wastes 3–5 messages on setup: “I’m a marketer, I write in a casual style, I prefer short paragraphs…”

You've probably seen people start every prompt with "Act as a..." - that's tokens burned on repeat. Claude can remember this permanently.

Go to “Settings” → “Memory and User Settings.” Save your role, communication style, and settings once. Claude will automatically apply them to every new chat.

6. Turn off features you're not actively using

Web search, connectors, and “Explore” mode - all of these add tokens to every response, even if you don’t need them.

Writing your own content?

Turn off the “Search and Tools” feature.

The “Advanced Thinking” feature also consumes tokens. Keep it turned off by default. Only turn it on if your first attempt was unsatisfactory.

Rule: if you didn’t turn this feature on intentionally, turn it off.

7. Use Haiku for simple tasks

Grammar checking, brainstorming, formatting, quick translations, short answers - Haiku handles all of this at a much lower cost than Sonnet or Opus.

Choosing the right model is the most important decision you make every day.

Haiku for drafts and simple tasks → frees up 50–70% of your budget for tasks that truly require powerful models.

Mental model: Haiku → quick tasks, low cost

Sonnet → real work, medium cost

Opus → deep thinking, high cost

You don’t need powerful models for simple tasks!

8. Spread your work across the day

The Claude system uses a rolling 5-hour window. It does not reset at midnight - your limit gradually decreases. Messages sent at 9 a.m. will no longer count by 2 p.m.

If you use up your entire limit during a single morning session, most of your daily limit will remain unused.

Divide your day into 2–3 sessions: morning, afternoon, and evening. By the time you return, your previous usage is no longer counted, and you have a

new limit.

9. Work during off-peak hours

Starting March 26, 2026: Anthropic will now use up your 5-hour session limit more quickly during peak hours:

> 5:00 AM to 11:00 AM Pacific Time / 8:00 AM to 2:00 PM Eastern Time on weekdays.

Same query, same chat - but during peak hours, it impacts your limit more.

Claude bills by tokens, not message count—token growth per conversation follows S…

Brief

Why it matters

Key details