Discussion
Isolated_Routes: I like this a lot. An interesting next iteration would be to add functionality that evaluates a user's work for inefficiencies and suggests where they can cut costs. Might be outside the scope of your project, but it could be interesting.
Normal_gaussian: This is cool. I do find the activities a little suspect - it shows 1 turn of planning for me in the last 30 days. I have claude write plans first before every coding session, often using one agent session to plan and output a plan file, and then others to execute on it. I also have several repos dedicated to 'planning' in the sense of what I should do next based on the emails/tickets/bugs etc. I have. In other words - I do all kinds of planning!
jimbokun: So how much did the tokens to build this cost?
ieie3366: "Built this after realizing I was spending ~$1400/week on Claude Code with almost no visibility into what was actually consuming tokens." holy slop. the $200/month plan has NEVER hit rate limits for me and I often run 5+ tabs of concurrent agents in a large 300k LoC codebase
ethan_smith: The $200/month plan throttles you when you hit limits - you just wait in a queue. API usage at $1400/week means unthrottled, parallel execution with no waiting. These are very different use cases, and for teams or heavy automation workflows the API cost can make sense if the time savings justify it.
weird-eye-issue: That's not how it works
cududa: You’ve never used the API version versus the $200 plan and set the two at the exact same task, have you?
weird-eye-issue: I used the API version for quite a while before using a subscription, which I have now used extensively for many months. So, is your claim that they just slow down and queue the subscription version, or are you accusing them of using nerfed models, or is it something else? The only time I ever get some slowness has to do with the models being overloaded and has nothing to do with limits. Those are two separate concepts you seem to be confusing. And luckily, this is pretty rare for me since I don't work during US time zones.
yashjadhav2102: That’s really helpful – token usage is definitely something everybody knows about yet can’t quite put their finger on. This split between non-tool conversations and coding looks rather shocking, like there’s a ton of inefficiency in our interaction with such services. I like the fact that it doesn’t require any LLM to run – nice and simple concept.
R00mi: The 56% conversation vs 21% coding split is a really interesting finding — it lines up with trajectory studies on SWE-bench where ~38% of an agent's actions are pure exploration (grep, find, file reads). The remaining "no-tool" turns are likely the agent digesting what it read and planning its next move. These two costs are linked: the less efficiently the agent localizes, the more thinking turns it needs to piece things together. PatchPilot (ICML 2025) quantified this — localization capability accounts for ~47% of an agent's total improvement. One thing that would be really interesting in your tool: separating exploration turns (grep/find/read) from pure thinking turns, and seeing how the ratio scales with project size. On large monorepos, exploration should blow up non-linearly.
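The split suggested above (exploration turns vs. pure thinking turns) could be sketched roughly as follows. This is only an illustration and not how the tool under discussion works: the turn format (a list of tool names per turn), the `EXPLORATION_TOOLS` set, and both function names are assumptions made for the example.

```python
from collections import Counter

# Assumption: these tool names count as "exploration". Adjust the set to
# whatever tool names actually appear in your own session logs.
EXPLORATION_TOOLS = {"grep", "find", "read"}


def classify_turn(tool_calls):
    """Label one turn given the list of tool names it invoked (hypothetical format)."""
    if not tool_calls:
        # No tools used at all: a pure thinking / conversation turn.
        return "thinking"
    if all(name.lower() in EXPLORATION_TOOLS for name in tool_calls):
        # Every tool call was a search or file read.
        return "exploration"
    # Anything that edits, runs commands, etc.
    return "other"


def turn_ratios(turns):
    """Return the share of turns per label; empty input gives an empty dict."""
    counts = Counter(classify_turn(calls) for calls in turns)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total
            for label in ("thinking", "exploration", "other")}
```

Running `turn_ratios` separately per repository, alongside a size measure such as lines of code, would be one way to check whether the exploration share grows with project size.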