I am not sure this is going to be as direct of a comparison as the title implies. I am not a scientist. I don’t plan to concoct an experiment to test both tools and models against the same task. There are already benchmarks out there, and I don’t think they matter all that much in real life. What do I want to know? Which tools and models FEEL better to use.
I got curious about this almost immediately after Devstral 2 was released. As I am writing this, Mistral is offering free tokens for what seems like nearly unlimited use of their new Vibe-CLI tool. You can also pay for Devstral 2 tokens on OpenRouter, and they are quite inexpensive. Inexpensive enough that I might have paid less by the token for Devstral 2 had I used it instead of my $3 Z.ai Coding Plan. Maybe.

Devstral 2 is a newer model than GLM-4.6, so that gives Mistral a potential edge over Z.ai. On the other hand, Devstral 2 is only a 123B model, while GLM-4.6 is a 355B model. Being nearly three times as big is a huge advantage for GLM-4.6!
Both models come in way behind Claude Opus in this race, but both are much cheaper and at least somewhat faster than what you get with a Claude Code subscription.
NOTE: When I tried Devstral 2, GLM-4.6 was Z.ai’s latest model. They released GLM-4.7 while I was putting the finishing touches on this post.
- Devstral 2 Mistral Vibe CLI Announcement
- Contemplating Local LLMs vs. OpenRouter and Trying Out Z.ai With GLM-4.6 and OpenCode
- Is The $6 Z.ai Coding Plan a No-Brainer?
Who is this blog post for?
It is for people like me. I don’t write code 40 hours per week. That isn’t my job. I have been firing up OpenCode to help me bang out a small coding task roughly once every two or three days. I might be firing it up more often than I need to because my Z.ai subscription is new, shiny, and fun.
I don’t write code often enough to justify paying $20 per month for a Claude Pro subscription, and I certainly don’t code enough to justify $100 per month for Claude Max!
Maybe you write code as occasionally as I do. Maybe you use an LLM to help you configure things like Proxmox, Jellyfin, and nginx in your homelab. Maybe you have a $100 Claude Max subscription at work, but you need something to fill in the gap for your occasional coding needs at home.
I definitely believe that a $6 Z.ai subscription was a no-brainer when I wrote that blog post two weeks ago. Maybe paying by the token for Devstral will wind up being nearly as good, a little faster, and even cheaper.
Go try this vibe coding stuff while it is free!
It looks like Devstral 2 is going to be free to use for the entire month of December. Google’s Gemini-CLI allows 60 requests per hour against their API for free. Qwen-Code can be used with Qwen’s API for free. There are a lot of free ways to use agentic coding interfaces, and not all of them are expiring at the end of the year.
I am sure there are other ways of testing out or even regularly using these sorts of coding tools completely for free. Don’t forget that free things are almost always free for a reason! Mistral is currently free to get you hooked. Other APIs are free when you agree to let them use your data for training.
You might also want to consider where your data is going. I previously talked about how my Z.ai subscription is served from China, and your ethics might not line up with that. This is also true of Alibaba’s Qwen service.
Maybe you would feel better paying a little more for Devstral 2 knowing that they are a French company and their servers are in Europe. Maybe you’d prefer to pay a massive company like Google that is based in the United States.
Thoughts on Vibe and Devstral 2 from a shade-tree programmer
I have a lovely JC Pro Macro Pad on my desk. Nearly everything that I use this macro pad for needed to be redone when I migrated from Ubuntu to Bazzite on my workstation a few weeks ago. This is my Mission Control macro pad, and the keys usually light up in a way that indicates the state of the action. The headphone toggle turns red when the speakers are active, and my espresso machine button turns blue when Home Assistant thinks the espresso machine has cooled down.
I needed a new way to control the state of the LEDs. The Arduino gets grumpy if too many processes try to write to the serial port at the same time, so I had OpenCode with Z.ai write me a pair of scripts. One creates and watches a fifo for new commands that the Arduino already understands, and it ships those over the serial port as they come in.
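The heart of that sort of fifo-to-serial bridge is only a handful of lines. Here is a rough sketch of the idea; the fifo path, serial device, and baud rate are placeholders rather than my actual setup:

```bash
#!/usr/bin/env bash
# Sketch of a fifo-to-serial bridge. The fifo path, serial device, and
# baud rate are placeholders, not my actual setup.
set -euo pipefail

FIFO="${HOME}/.cache/macropad.fifo"
SERIAL="/dev/ttyACM0"

[[ -p "$FIFO" ]] || mkfifo "$FIFO"

# Raw mode keeps the terminal driver from mangling the Arduino's commands.
stty -F "$SERIAL" 115200 raw -echo

# Open the fifo read-write so it never sees EOF when writers come and go,
# and keep one long-lived descriptor on the serial port so only this
# process ever touches the Arduino.
exec 3<> "$FIFO" 4> "$SERIAL"

while IFS= read -r command <&3; do
    printf '%s\n' "$command" >&4
done
```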
The other script is called macroled, and it has the simple job of converting English color names to RGB values then writing the appropriate commands to the fifo.
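macroled boils down to something like this sketch. The key argument, the command format, and the tiny color table are stand-ins, because the real script only has to emit whatever commands the Arduino firmware already understands:

```bash
#!/usr/bin/env bash
# macroled sketch: turn an English color name into an RGB command on the
# fifo. The key argument, command format, and color table are stand-ins
# for whatever the Arduino firmware actually expects.
set -euo pipefail

FIFO="${HOME}/.cache/macropad.fifo"

[[ $# -eq 2 ]] || { echo "usage: macroled <key> <color>" >&2; exit 1; }
key="$1"
color="$2"

case "$color" in
    red)   rgb="255 0 0" ;;
    green) rgb="0 255 0" ;;
    blue)  rgb="0 0 255" ;;
    off)   rgb="0 0 0"   ;;
    *) echo "macroled: unknown color '$color'" >&2; exit 1 ;;
esac

printf 'LED %s %s\n' "$key" "$rgb" > "$FIFO"
```

Calling something like `macroled 3 blue` from any script or keybinding then lights up a key without anything fighting over the serial port.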
I installed Vibe-CLI today, and I asked it to create another script. This one watches my Radeon 9070 XT GPU’s wattage. If the wattage cap is set to 250 or below, it sets the macro pad key to blue. If the cap is over 250, the key turns red. Blue for cool and quiet, red when full power is available. It adds more green to the mix as actual power consumption rises.
I’m not entirely happy with how these colors wind up mixing, but this gives me a visual indicator of both my maximum available GPU performance and how hard I am hitting the GPU.
Devstral 2 did a fantastic job here. We had to go back and forth several times. I decided that I didn’t like the color getting diluted when the GPU was only using 20 or 30 watts, so I asked Vibe to only mix in the green when the GPU goes above 50 watts. I also went back and forth a couple of times swapping colors around and changing maximum brightness.
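I won’t pretend this is exactly the code that Devstral 2 wrote, but the logic it landed on boils down to something like the sketch below. The amdgpu hwmon paths are the part most likely to need adjusting for a different card, and the key number and command format are stand-ins again:

```bash
#!/usr/bin/env bash
# Sketch of the GPU wattage watcher. The hwmon path, the macro pad key
# number, and the LED command format are all assumptions; amdgpu reports
# its power values in microwatts.
set -euo pipefail

FIFO="${HOME}/.cache/macropad.fifo"
KEY=7                                        # placeholder key number
HWMON=(/sys/class/drm/card0/device/hwmon/hwmon*)

while true; do
    # Some kernels expose power1_input instead of power1_average.
    cap_w=$(( $(< "${HWMON[0]}/power1_cap") / 1000000 ))
    draw_w=$(( $(< "${HWMON[0]}/power1_average") / 1000000 ))

    # Blue base color when the card is capped at 250 watts or less,
    # red base color when full power is available.
    if (( cap_w <= 250 )); then
        r=0; b=128
    else
        r=128; b=0
    fi

    # Mix in green once the actual draw climbs past 50 watts,
    # scaled against the headroom up to the cap.
    g=0
    if (( draw_w > 50 )); then
        g=$(( (draw_w - 50) * 128 / (cap_w - 50) ))
        if (( g > 128 )); then g=128; fi
    fi

    printf 'LED %s %s %s %s\n' "$KEY" "$r" "$g" "$b" > "$FIFO"
    sleep 5
done
```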
This was a small task, but small tasks are what I usually need to work on. Vibe and Devstral 2 did as good of a job here as I would expect to see from OpenCode and GLM-4.6.
Is vibe coding OK? How do you define it?
Early on, vibe coding seemed to be used as a derogatory term for non-programmers producing LLM-generated code that they didn’t understand. It doesn’t feel derogatory any longer, and it seems to encompass a wider variety of processes.

I have seen quite a few attempts at definitions, but nobody seems to agree on the boundaries. I personally feel that I am vibe coding when I don’t touch the code in a text editor. I take a peek at most of the shell scripts that OpenCode spits out using cat or less, but I almost never open them in Emacs. I am being a little safer by making sure there are no sneaky rm -rf commands in there, but I’m not changing anything.
I think that counts as vibe coding.
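That peek does not have to be fancy, either. A quick grep for the scariest patterns covers most of what I am looking for:

```bash
# Flag the usual suspects before running a freshly generated script.
grep -nE 'rm -rf|sudo|mkfs|dd if=|curl .*\| *(ba)?sh' new-script.sh || echo "nothing scary found"
```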
OpenCode and Vibe write better shell scripts than I do!
Listen. I can write a good shell script. The fact is, though, that I usually don’t. I hack something together that works. I might sneak in some error checking around the areas that were causing me problems while attempting to get things to work correctly, but most of the short scripts that I write have nearly zero good error checking.
The vibe-coded scripts are more likely to break things down into functions. They’re more likely to check for error codes. They’re more likely to stop and let you know why something didn’t work right when you run them. The vibe-coding machine does a MUCH better job of making the scripts output extra text so you can see what is going on as they run.
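None of this is copied from an actual generated script, but the scaffolding they tend to include, and I tend to skip, usually looks something like this:

```bash
#!/usr/bin/env bash
# The kind of scaffolding the vibe-coded scripts include and mine usually
# skip: strict mode, small functions, loud errors, and progress messages.
set -euo pipefail

log() { printf '[%s] %s\n' "$(date +%H:%M:%S)" "$*"; }
die() { printf 'error: %s\n' "$*" >&2; exit 1; }

require_command() {
    command -v "$1" > /dev/null 2>&1 || die "required command '$1' not found"
}

main() {
    require_command stty
    log "checking for the macro pad"
    # /dev/ttyACM0 is a placeholder for whatever device the script cares about.
    [[ -c /dev/ttyACM0 ]] || die "/dev/ttyACM0 is missing; is the macro pad plugged in?"
    log "macro pad found, carrying on"
}

main "$@"
```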
Is my script going into production on a server? I will put in the effort to do all these things and more. Is the script just setting the color of an LED on my macro pad? I will leave all of this out. The robots will beat me here every time.
Is the Z.ai Coding Lite Plan still a no-brainer?!
I almost had to guess at this, but Z.ai just added a usage view to their subscription dashboard. I have used 26.7 million tokens in the last 30 days.
Devstral 2 is currently free, and it has done a good job for me here, but what about in January when it isn’t free? I see in my OpenRouter account that paid Devstral 2 would cost $0.05 per million input tokens and $0.22 per million output tokens. Assuming Devstral 2 would have matched GLM-4.6 on token count, which is a MASSIVE assumption, I would have paid $1.33 for my input tokens at OpenRouter. I think it is safe to round my output up to a full million tokens, which would only add another $0.22 in that direction.
That adds up to quite a bit less than the $3 that I paid, thanks to the half-price deal, for my first month on my Z.ai Coding Lite plan.
That isn’t QUITE a no-brainer anymore, right?! I’m happy with what I’ve paid for. GLM-4.6 is a more powerful model. I suspect there will be jobs that GLM-4.6 can easily handle where Devstral 2 might fail, and $3 per month isn’t a lot of money. Not only that, but so far my usage is trending upwards.
A part-timer could probably use free API services and tools for the foreseeable future
Everyone wants your money. They all want you to subscribe. They all want to get you hooked on their tool and model.
I suspect that one company or another will have a free coding plan for the next year or two. Alibaba wants you to use Qwen. Google wants you to use Gemini. Mistral wants you to use their new Vibe tool with Devstral 2. If you have a good experience while it is free, then you might become a paying customer when it isn’t. You might even use their models in the programs you’re writing.
I completely understand this. Even with my light use, I have gotten used to OpenCode and GLM-4.6. Devstral was easy to work with using Vibe, but everything felt a little weird. I don’t mind paying $36 or $72 for the year knowing that I will be able to use my preferred tools for the next 12 months.
OpenCode is currently MY preferred tool. You might like Vibe, Qwen-Code, or Claude Code more than OpenCode. You can probably slot Z.ai’s subscription into Vibe or Qwen-Code like you can with Claude Code or OpenCode, but maybe it isn’t just the tool you like. Maybe you feel more comfortable working with the Devstral or Qwen Coder models. That’s fine. You should try everything!
Maybe you shouldn’t limit yourself to just one model!
If you are an occasional user like me, I think it is just fine to lock yourself into a single model for 3, 6, or even 12 months. Especially if the price is right.
What if you actually are an extremely heavy user? Should you just spend $200 every month on the biggest Claude Max subscription? Maybe not!
I just learned about the oh-my-opencode plugin for OpenCode. It is extremely opinionated and absolutely bananas! It is preconfigured to call the best-suited model for each task. It uses GPT-5.2 for design and debugging, Gemini 3 Pro for frontend development, Claude Sonnet 4.5 for documentation and codebase exploration, and Grok Code for fast codebase exploration. That is at least three different APIs or subscriptions.
You might still be better off with separate lesser subscriptions even if you aren’t using oh-my-opencode. I keep reading that Claude is better at implementing straightforward solutions, while OpenAI’s Codex is better at debugging complicated problems. People also seem to feel that Z.ai’s GLM-4.6 is good enough for handling most of the grunt work.
Maybe upgrading from the $20 to the $100 Claude subscription isn’t the best move when you start reaching the 5-hour limit. It might be better to spend $20 on Claude and add a $20 Codex plan to the mix to attack those problems where Claude falls short. You can probably get more than double the work done with Codex when you run out of Claude tokens.
When that still isn’t enough, you can add a Z.ai subscription to the mix. A tool like OpenCode can connect to all three subscriptions, and switching between them is just a few keystrokes away.
If you are already subscribed to the $200 tiers of both Claude and Codex, and you are maxing them out, then none of this applies directly to you. You’re way beyond the audience of this blog post!
The important thing to remember is that you’re not locked in. If you pay for a plan that is undersized, you can always upgrade, and you are free to mix and match. I am excited that I landed on OpenCode, because I can plug it into all sorts of different backends, and I can configure different agents to use whatever API might be appropriate.
Conclusion
The landscape of AI coding tools is changing faster than ever. What was a clear “no-brainer” subscription a month ago now has serious competition from free tiers and pay-per-use models. Devstral 2 with Vibe-CLI has proven to be a capable setup, while OpenCode with GLM-4.7 remains my go-to tool. The key takeaway is that there’s no one-size-fits-all solution. What matters most is finding the combination that fits your workflow, budget, and privacy comfort level.
I’d love to hear about your experiences with these tools. What’s your current AI coding setup, and are you happy with it? Have you tried Vibe-CLI, OpenCode, or similar tools? Are you more concerned with cost, performance, privacy, or ease of use? Come join our Discord community where we discuss AI coding tools, homelab setups, and all things tech. It’s a great place to share your experiments and learn from others navigating the same decisions.
- Contemplating Local LLMs vs. OpenRouter and Trying Out Z.ai With GLM-4.6 and OpenCode
- Is The $6 Z.ai Coding Plan a No-Brainer?
- How Is Pat Using Machine Learning At The End Of 2025?
- Should You Run A Large Language Model On An Intel N100 Mini PC?
- Harnessing The Potential of Machine Learning for Writing Words for My Blog