While I’m still personally skeptical of these tools’ ability to produce a GOOD software engineer, it’s something I should probably test in a limited capacity.
I’ve noticed DeepSeek has a few integrations, both official and hobbyist, with coding tools like Claude Code. Plus, I’d rather not pay £20/mo for any of this stuff, let alone to any AI company NOT linked to the CPC.
I might consider a locally hosted model, but the upfront cost of anything that can run a decently fast, high-parameter model is quite prohibitive. My home server isn’t really set up for good cooling!

All new science and philosophy has been done in China since 2025. Sorry buddy, if you wanted your model to think in English, you gotta build crypto scams or day-trading algos or something that’s obsolete before you even started.
The future can’t even be conceived of in English.
That’s odd. Deepseek put in a lot of effort to prevent language-switching because it significantly degraded results.
I’ve been using and enjoying Zed. It’s like Cursor, but open source and written from scratch in Rust instead of being a VS Code fork. I highly recommend checking it out to see what coding agents are capable of. Zed recommends Claude as the model, and in my experience Claude does work best, but other models work quite well too.
Zed, in my opinion, is the right way to do coding agents. The agent lives in a chat interface inside your editor, and it can whip up an entire project from the ground up, but you still feel like you’re in the driver’s seat: it shows you diffs of what it changed, and you can approve or reject them easily. It feels natural to use, and it’s the best way to actually stay a software engineer while using the tool.
I don’t really like DeepSeek as a coding model (though I do like it in a no-tools chat context), but GLM 4.6 (another open-source Chinese model, made by a company called z.ai) has been very good imo, and it works great in Zed. You can use it through OpenRouter or the official z.ai API. If you just buy tokens directly, it costs pennies and you only pay for what you use. z.ai also has a very cheap monthly plan, around $3 USD, with a lot of usage.
Though, imo, OpenRouter makes the most sense for trying out LLMs, because you can put a few dollars into the service and use it for GLM 4.6, DeepSeek-V3.2, Qwen-Coder, Claude, or any other model and see which ones you like most. New models come out of China regularly and are often quite good (Minimax-M2 just came out and seems super promising), and if your money is in OpenRouter you can try them out easily. Plus, if you ever put $10 into an OpenRouter account, you get 1000 free messages/day across their free models forever, which is the most generous “free” tier I’ve found. And you can use those credits to play around with LLMs as coding agents. I put $10 in my account a few months ago and still haven’t run out with reasonable usage as a coding agent.
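For reference, trying a model this way is just one call against OpenRouter’s OpenAI-compatible endpoint. A minimal sketch; the model slug and env var name are my assumptions, so check openrouter.ai/models for exact IDs:

```python
# Minimal sketch: one chat completion through OpenRouter.
# Model slug and env var name are assumptions -- check openrouter.ai/models.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="z-ai/glm-4.6",  # swap in other slugs to compare models
    messages=[{"role": "user", "content": "Write a Rust function that reverses a string."}],
)
print(resp.choices[0].message.content)
```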
One caveat about OpenRouter is that you should set it up to prioritize the first-party APIs under the hood (e.g. if you want to use GLM 4.6, you can configure OpenRouter to use the official z.ai servers). This is because many providers quantize the models, which makes them worse, and it also means a 3rd party is getting your data rather than the (possibly CPC-affiliated) company that actually makes the model.
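If I’m reading their docs right, you can also pin the provider per request with OpenRouter’s provider routing field. A sketch; the “z-ai” provider slug is my guess, so check the model’s page for the real one:

```python
# Sketch: pin OpenRouter to a first-party provider via its "provider"
# routing field. The "z-ai" slug is an assumption.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="z-ai/glm-4.6",
    extra_body={
        "provider": {
            "order": ["z-ai"],         # try the official host first
            "allow_fallbacks": False,  # fail instead of routing to a 3rd party
        }
    },
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```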
IMO, local models don’t make sense for the average person. A computer capable of running DeepSeek-V3.2 at full precision would cost well over $50k (iirc). Of course, you can’t be sure your data isn’t being mined without running it locally, but I’m just writing open source software for fun, so I don’t really care that much.
Please feel free to ask any questions if you want more info! I have strong opinions about this stuff and I’m happy to share.
> A computer capable of running DeepSeek-V3.2 at full precision would cost well over $50k (iirc).
I saw a Hacker News article about someone running DeepSeek R1 for $6k, although that’s still too expensive IMO
> GLM 4.6
I need to try this.
> Minimax-M2
Kimi K2 Thinking also just came out
Honestly I have not been super impressed with Kimi K2. Maybe the thinking model is better, but in my experience GLM has been much better. I’ll still give it a shot though.
> I saw a Hacker News article about someone running DeepSeek R1 for $6k, although that’s still too expensive IMO
Do you remember what their setup was? My guess would be CPU inference with a metric fuckton of RAM if they were running it unquantized, which could work but would be pretty slow. But for $6k it’d be impossible to buy enough VRAM to run it at full precision on GPUs.
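To put rough numbers on why it has to be system RAM rather than VRAM, a back-of-the-envelope sketch (the param count is DeepSeek-V3/R1’s ~671B; bytes-per-weight are the usual rough figures, not exact):

```python
# Back-of-the-envelope: weights-only memory for a ~671B-param model.
# DeepSeek-V3/R1 are trained in FP8; figures are rough.
params = 671e9

for name, bytes_per_weight in [("FP8 (full precision)", 1.0), ("~4-bit quant", 0.5)]:
    gib = params * bytes_per_weight / 2**30
    print(f"{name}: ~{gib:.0f} GiB of weights, before KV cache")

# FP8 (full precision): ~625 GiB
# ~4-bit quant:         ~312 GiB
```

Either figure is far beyond what $6k buys in VRAM, but within reach of a big-RAM CPU server.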
I also use Zed and hook it up to small Qwen models, like the new 4B 2507 Thinking model, through LM Studio. I just have a 3070 with 8 GB of VRAM and 32 GB of regular RAM to help offload.
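For anyone curious, hooking an editor up to LM Studio just means talking to its local OpenAI-compatible server (port 1234 by default). A rough sketch; the model ID is a guess, so use whatever LM Studio lists for your loaded model:

```python
# Sketch: query LM Studio's local OpenAI-compatible server directly.
# The model ID is an assumption -- check LM Studio's model list.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")  # key is ignored locally

resp = client.chat.completions.create(
    model="qwen3-4b-thinking-2507",  # assumed ID
    messages=[{"role": "user", "content": "Spot the bug: for i in range(10) print(i)"}],
)
print(resp.choices[0].message.content)
```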
Small models leapfrog each other every 6 months or so, kind of like computer hardware and phones. I don’t think you really need to run full 30B or higher models to get use out of them. Bigger models are of course smarter, but if you’re mainly using them as tools for syntax correction, error finding, and small problems like that, vs. asking them to spit out an entire program, the small ones are pretty good.
Fair enough, I must say I haven’t tried local models (tfw no GPU ;_;). I guess my take is that if it costs a tenth of a cent on OpenRouter to use a SOTA open source model, I might as well do that, but I can see the appeal of local models for easier queries.
oh wait you’re meant to use zed with AI? what have I been doing then
I mean, it’s a good editor without those features too, but imo they have a really good implementation of the LLM stuff
There’s a single setting to turn off all the AI integrations if you don’t want them. I like Zed even without them because it’s very fast and lightweight, but I still tend to prefer Kate for the same reason.
It’s been a big focus of Zed’s development, though. Hooking things up to VS Code and other IDEs can be a pain in the ass with tons of extensions, but Zed has built-in support for ollama, LM Studio, etc. for local models. You can also connect it to the APIs for ChatGPT, Claude, etc. if you pay for pro accounts of those.
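Under the hood, that built-in ollama support boils down to ollama’s local REST API. A sketch of what the editor is doing; the model name is an assumption, use whatever you’ve pulled:

```python
# Sketch: what "built-in ollama support" amounts to -- a POST to
# ollama's local REST API. Model name is an assumption.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen2.5-coder:7b",  # assumed; any pulled model works
        "messages": [{"role": "user", "content": "Explain Rust lifetimes in two sentences."}],
        "stream": False,
    },
)
print(resp.json()["message"]["content"])
```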
LLMs are terrible for coding. Keep a personal journal instead. Note down every problem and obstacle you’ve overcome and index them.
For a terminal-native “agentic” solution, I use opencode[1] and love it. This can do all sorts of fancy stuff with tooling and commands. For example, if you tell it to do something, it usually builds the code, checks for errors, and then resolves them. For nvim, I use Avante[2], which is neither great nor terrible. It’s a bit buggy, but it’ll handle boilerplate well and can help with debugging.
Thank you for the suggestions!
One formatting thing: the ^ needs to be closed, at least on Lemmy.
DeepSeek on the aichat CLI program works well. I tried DeepSeek and Kimi K2 with OpenCode and Codex and couldn’t get either of them to work. I’ve only ever really used LLMs through ollama and ChatGPT when the Google SEO crap got to be too much. I’ve experimented with different neovim AI plugins, with varying results.
Most AI text-editor plugins basically just put the chat window in a sidebar and help you copy code between buffers.
- encounter a strange error that you would have searched on SO? The AI probably knows the answer
- not sure what method you’re looking for in some unfamiliar library? The AI probably knows (if it’s a popular enough library)

Basically, it cuts down on the time you’d spend googling.
Then there’s Cursor and editors that support “tooling”, meaning limited access to execute commands on your computer. That gets fucking nuts, because it will crap out something functional. Like, it will generate all the boilerplate for starting an Express server with a database connection and slap a React UI on it.
Running the simpler editor plugins against ollama is perfectly functional. You don’t need super high parameter counts; a 16B model will work fine. The problem will be the startup time, because ollama unloads the model after 5 minutes of inactivity. I haven’t gotten ollama to work with any Cursor-like editors yet.
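One possible workaround, if I’m reading ollama’s API docs right, is its keep_alive request field, which controls how long the model stays resident. A hedged sketch (model name assumed):

```python
# Sketch: pin a model in memory via ollama's keep_alive field. A request
# with no prompt just loads the model; keep_alive=-1 keeps it resident
# instead of unloading after the default 5 minutes.
import requests

requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5-coder:7b",  # assumed; any pulled model works
        "keep_alive": -1,             # or a duration string like "1h"
    },
)
```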
Also, when it comes to running ollama, RAM is the real bottleneck. I’ve found my laptop runs larger ollama models better than my desktop, because it uses system RAM with integrated graphics. It’s not as fast as a dedicated card, but a 64 GB MacBook will run DeepSeek 70B.
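Quick sanity check on that claim, assuming a typical ~4-bit quant (rough numbers):

```python
# Rough check on "a 64 GB MacBook runs a 70B model": at ~4.5 bits per
# weight the weights alone are ~37 GiB, which fits in 64 GB of unified
# memory but not in any consumer GPU's VRAM.
params = 70e9
bits_per_weight = 4.5  # typical 4-bit quant, with overhead

weight_gib = params * bits_per_weight / 8 / 2**30
print(f"~{weight_gib:.0f} GiB of weights, plus a few GiB of KV cache")
```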
Most of these things have pretty good free tiers (abuse them before the AI bubble pops lol). And if you actually exhaust them all, they often support BYOK, so you could end up paying for actual API usage (which could be anywhere from $0.50/month to $500/month lol) instead of flat monthly fees that are just awkward, inscrutable credit piles.
I’ve never heard anything specific about DeepSeek tho, I assume it’s strong on value and a great way to save on cost until you hit a problem that requires bigger thinkier models. Can it use MCP? I know SWE is very spotty on using MCP, even when you explicitly beg it to.
DeepSeek can use MCP over the API if you have an MCP client, although the use case for MCP isn’t as big as advertised. deepseek-reasoner does a great job on those “thinkier” problems, especially if you chain the outputs and provide the right context. I don’t think it’s as good as Claude Sonnet 4.5, but it’s around 30 times cheaper.
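For the “chaining” bit, here’s a minimal sketch against DeepSeek’s OpenAI-compatible API. The prompts are made up, and note that per their docs only the final answers (not the reasoning traces) should be fed back into the next request:

```python
# Sketch: two-step chain with deepseek-reasoner. Prompts are invented;
# only final answers go back into context, never the reasoning trace.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",
    api_key=os.environ["DEEPSEEK_API_KEY"],
)

def ask(messages):
    resp = client.chat.completions.create(model="deepseek-reasoner", messages=messages)
    return resp.choices[0].message.content

question = "Plan a schema migration that splits a users table in two."
plan = ask([{"role": "user", "content": question}])
rollback = ask([
    {"role": "user", "content": question},
    {"role": "assistant", "content": plan},  # chain: prior answer as context
    {"role": "user", "content": "Now write the rollback steps for that plan."},
])
print(rollback)
```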