Just to keep my fellow lemmygradians updated on what AI tools are capable of, and also because I’m pretty stoked for this project.

I put $5 in the deepseek API (sidenote: I like that you have to top up a credit balance and they don’t auto-bill), then downloaded crush. Crush is an agentic coding tool, meaning it basically instructs the LLM to do stuff automatically.

It made me a complete Python script that first downloads all of the ProleWiki content pages into txt files (which also means we can do backups, even if it’s a little hacked together).
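
Since ProleWiki runs on MediaWiki, here’s roughly what that kind of script boils down to. This is a minimal sketch of the idea, not the code the agent actually wrote, and the API URL is an assumption:

```python
# Sketch: list every content page through the MediaWiki API and dump its
# wikitext into a .txt file. Endpoint URL is assumed; adjust it for your instance.
import pathlib
import requests

API = "https://en.prolewiki.org/api.php"  # assumed endpoint
OUT = pathlib.Path("pages")
OUT.mkdir(exist_ok=True)

session = requests.Session()

def all_pages():
    # Walk the full page list, following the API's continuation tokens.
    params = {"action": "query", "list": "allpages", "aplimit": "500", "format": "json"}
    while True:
        data = session.get(API, params=params).json()
        for page in data["query"]["allpages"]:
            yield page["title"]
        if "continue" not in data:
            break
        params.update(data["continue"])

def wikitext(title):
    # Fetch the current revision's raw wikitext for one page.
    data = session.get(API, params={
        "action": "query", "prop": "revisions", "rvprop": "content",
        "rvslots": "main", "titles": title, "format": "json",
    }).json()
    page = next(iter(data["query"]["pages"].values()))
    revisions = page.get("revisions")
    return revisions[0]["slots"]["main"]["*"] if revisions else ""

for title in all_pages():
    safe_name = title.replace("/", "_")
    (OUT / f"{safe_name}.txt").write_text(wikitext(title), encoding="utf-8")
```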

Then with a second script we are running these txts through a (local) LLM to translate them for our French instance. The problem is there are 5000 pages on the EN instance and a grand total of 3 in French, so nobody is interested in joining and writing pages from scratch when you could “just” find them on the EN instance.
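
The translation pass is conceptually just a loop over those txt files. A minimal sketch, assuming the local model is served through an OpenAI-compatible endpoint (koboldcpp, which comes up later, exposes one on port 5001 by default); the system prompt here is only a placeholder for the real one:

```python
# Sketch: feed every downloaded page through a local LLM and save the French
# output alongside it. Endpoint and prompt are placeholders.
import pathlib
import requests

ENDPOINT = "http://localhost:5001/v1/chat/completions"  # assumed local server
SYSTEM_PROMPT = "Translate the following MediaWiki wikitext from English to French..."  # placeholder

SRC = pathlib.Path("pages")
DST = pathlib.Path("pages_fr")
DST.mkdir(exist_ok=True)

def translate(text):
    resp = requests.post(ENDPOINT, json={
        "model": "local",  # most local servers ignore the model name
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        "temperature": 0.2,
    }, timeout=600)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

for src in sorted(SRC.glob("*.txt")):
    out = DST / src.name
    out.write_text(translate(src.read_text(encoding="utf-8")), encoding="utf-8")
```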

For these two scripts (which are running right now) I’ve paid a whopping 67 cents on API. It amounts to a few hours of prompting and then of course waiting for the agent to work.

Cache hits on deepseek are a godsend for agentic work as they’re basically free (less than 2 cents per 1M tokens), and with a codebase you constantly feed it the same code over and over. This is why my cache hit rate is so high.

Compare that to GPT-5, which costs 12 cents per 1M tokens on a cache hit.

What’s pretty amazing (and scary, it’s very scary using crush) is that you can just go do something else while it works and puts everything together. Go have dinner while the agent is on the task, or watch a youtube video.

The third and final script will be used to upload the translated files to the wiki. I still need to think about what exactly I want it to do (write API access is not a problem, the problem is just the logic of it all).
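
The mechanical part of that upload is fairly standard MediaWiki API work, something like the sketch below; the endpoint and bot account are placeholders, and the actual logic (what to do with existing pages, categories, redirects and so on) is the part I still need to figure out:

```python
# Sketch: log in with a bot account, fetch a CSRF token, and create or
# overwrite a page through the MediaWiki edit API.
import requests

API = "https://fr.prolewiki.org/api.php"  # assumed endpoint
session = requests.Session()

def login(user, password):
    # Bot-password login: fetch a login token, then post the credentials.
    token = session.get(API, params={
        "action": "query", "meta": "tokens", "type": "login", "format": "json",
    }).json()["query"]["tokens"]["logintoken"]
    session.post(API, data={
        "action": "login", "lgname": user, "lgpassword": password,
        "lgtoken": token, "format": "json",
    })

def upload_page(title, text):
    # Every edit needs a fresh CSRF token from the same session.
    csrf = session.get(API, params={
        "action": "query", "meta": "tokens", "format": "json",
    }).json()["query"]["tokens"]["csrftoken"]
    session.post(API, data={
        "action": "edit", "title": title, "text": text,
        "summary": "Machine-translated import", "bot": "1",
        "token": csrf, "format": "json",
    })
```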

As for running the translation job if you’re curious, it saves its progress so I can stop and resume any time I want and I estimate around 6-8 days of continuous running to go through everything (there’s a lot of material). Yes we could use an API or even rent a GPU and multithread but eh, I figured I only have to do this once. And there’s a LOT of tokens to translate, you won’t escape that. Even using a cloud API it would probably take a few days of continuous querying.
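
The progress-saving part is nothing fancy conceptually: just a record of which pages are already done, checked before translating and updated after each one. A sketch of the idea (the actual script keeps its own little database):

```python
# Sketch: a tiny resumable-progress record stored as JSON.
import json
import pathlib

PROGRESS = pathlib.Path("progress.json")

def load_done():
    # Pages translated in previous runs; empty set on the first run.
    return set(json.loads(PROGRESS.read_text())) if PROGRESS.exists() else set()

def mark_done(done, page_name):
    # Record one finished page so an interrupted run can pick up where it left off.
    done.add(page_name)
    PROGRESS.write_text(json.dumps(sorted(done), indent=2))
```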

But compare to doing it by hand which, well, we haven’t even started despite the instance existing for 4 years. So it’s basically 4+ years vs 8 days of work.

Later I can adapt this specific script to work on books to bring more exclusive theory to English like we did for the CIA’s Shining Path which was done with what is now an almost obsolete model lol (and I definitely improved the prompting since that one). I might actually redo CIA’s Shining Path with mistral just to see how it differs.

The problem, if anything, is that this is making me learn stuff like git so I can make it FOSS and downloadable, and make it more robust to handle more use cases lol

About crush:

Before I started using crush I didn’t really get what an agent actually did or how it helped. This isn’t just putting a prompt into the web interface and asking it to generate Python code. The agent takes care of everything, including writing functions, writing tests, and fixing bugs. That’s right, this thing fixes its own code automatically.

It calls tools and terminal commands by itself, and can edit files. When it does you get a git-like preview of the lines edited.

To use crush you just prompt the LLM. “Okay now I want to do this, now I want to do that, there’s a bug, here’s the log” and it will work through the problem by itself. It’s scary how fast it does it.

You can extend its capabilities with LSPs and MCPs but I haven’t looked into that yet. Wish it was more user-friendly to set up, but I got there in the end.

Caveats:

deepseek boasts a pretty comfy 128k token context window, but you run through it quickly because it has to read and understand the entire project. Crush handles this (it makes the LLM write to a crush.md file and then restarts the last command sent when the context resets), but you’re still limited. However, with tools like deepseek-ocr, if they ever start integrating it, you potentially have infinite context. Clearly they’re going to come up with something, they’re already working on it. But you won’t be recreating twitter with an LLM yet.

You don’t want a coding-specific fine-tune for this, as it needs to understand the file structure and the readmes. However, I have run into situations where the LLM did stuff it shouldn’t have, for example deleting the database that keeps track of which files we’ve already worked through, because it doesn’t know this is the ‘live’ prod.

Mind you, I’m pretty much cobbling this together so I don’t git it or anything, it’s just a one-time script for our specific needs. I also shouldn’t have put the content files in the same folder as the script; keeping them separate is just good practice. I def recommend keeping two copies of your project if you’re not going to use git: crush works on one copy, and then you can copy the files over to the other folder.

Oh, also, there’s no chance of crush deleting system32, as it opens in a specific folder and can’t leave it. Before running a script it also lets you review the code and asks for permission to run it.

This is not replacing devs. It’s a great addition for non-devs and devs alike. For non-devs it lets us write our scripts and solve our problems. For devs, you spend more time thinking through and planning your app and then hand the writing of it to the LLM. As a designer this speaks to me because we plan things a lot lol. And if you know your stuff, you can steer the LLM away from pitfalls it would fall into if you didn’t specifically prompt against them.

If you also don’t know some libraries or APIs very well it can handle them for you. You can totally give crush working code you wrote yourself, it’s just that it might not be the most efficient way to use it since it could also write that code for you.

Your workflow is basically 3-10x more efficient with this and that’s valuable - take a coffee break while it works, you deserve it. You become more of an engineer than a coder and imo this is where dev work is heading.

Translation work:

As for the translation, which is handled by mistral-3.2-instruct (a 24B model that fits in my 16 GB of VRAM and generates at 15 tokens per second; honestly, good job France, I gotta hand it to you), it’s pretty good but you have to prompt it first. The prompt for this task is ~600 tokens, which is a lot but also not a lot considering I can easily have a 16k context window with this tool.
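
To give an idea of what that prompting looks like, here’s a heavily abridged, made-up version of that kind of system prompt; the real one is ~600 tokens and much more specific:

```python
# Illustrative only: an abridged stand-in for the actual translation prompt.
SYSTEM_PROMPT = """You are translating MediaWiki pages from English to French for an encyclopedia.
- Translate the prose; leave all wikitext markup, templates and references intact.
- Keep [[internal links]] pointing at the same targets; translate only the display text.
- Use formal, encyclopedic French and keep political terminology consistent throughout.
- Output only the translated wikitext, nothing else."""
```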

imo a lot of the “we spend more time fixing the translation than we would have spent doing it ourselves” comes from clients incorrectly prompting stuff (but what else is new lol), translators not necessarily using tools to automate bulk edits, and older models not doing as good a job - deepseek is actually pretty solid at translating because of the thinking, though we didn’t use a thinking model for this task.

Translating the filenames is messier and more prone to hallucinating random characters. I think it’s because it just doesn’t have a lot to work with, you’re asking it to translate 5 seemingly random words. Translating the page content goes much better; some pages that I checked are pretty amazing.

Not all languages work equally well. I used Mistral specifically because it’s French, so we assume it understands French better. Some languages don’t have ‘enough’ presence to be trained on effectively, and others are just not a priority for devs. Chinese LLMs are seemingly better at Persian, for example, but still not ‘great’.

Another thing is it sometimes translates jargon two different ways. It would need a dictionary or something similar that says “this word is always translated as X”. I’m sure this will come, and in fact a simple dictionary is probably an old-school method for LLMs already. But you would also need to build that dictionary, and when you have 5000 pages of content I just don’t know where you would even begin.
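
A sketch of what that could look like: a small glossary that gets added to the system prompt, plus a post-processing pass for terms the model keeps translating inconsistently. The term pairs below are made-up examples, not an actual ProleWiki glossary:

```python
# Sketch: enforce consistent jargon via the prompt and via post-processing.
import re

GLOSSARY = {
    "dialectical materialism": "matérialisme dialectique",
    "proletariat": "prolétariat",
}

def glossary_block():
    # Goes into the system prompt so the model sees the preferred translations.
    lines = [f"- {en} -> {fr}" for en, fr in GLOSSARY.items()]
    return "Always translate these terms exactly as follows:\n" + "\n".join(lines)

def enforce(text, wrong_to_right):
    # Bulk-fix terms the model translated inconsistently, e.g.
    # {"matérialisme dialectal": "matérialisme dialectique"}.
    for wrong, right in wrong_to_right.items():
        text = re.sub(re.escape(wrong), right, text, flags=re.IGNORECASE)
    return text
```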

Even with those caveats it gets us 80-90% of the way there, and the remaining work will be to fix stuff manually as we come across it, or with mass regex edits. If we can get interest in the FR instance with this, as one of our editors has alluded to, then we can also count on crowdsourcing the rest of it over time.

Conclusion:

We’re doing pretty exciting things for 67 cents.

  • tamagotchicowboy@lemmygrad.ml · 6 days ago

    For another Chinese model I really like z.ai’s stuff; you don’t even need to log in to use it as a chatbot, which is awesome. I’m not sure about their API, but coding-wise I can just head over, not even bother logging in, and it’ll get me a functional PowerShell script powered by ollama to organize some files I have lying around.

    I think their flash model’s API is free, but you still have to provide a credit card, which is standard with AI.

        • CriticalResist8@lemmygrad.ml (OP) · 6 days ago

          Definitely need to make my key. Mistral doesn’t ask for a CC either (just a phone #) and their API is completely free with absolutely wild rate limits: 500k tokens per minute and 2 billion tokens of output per month. All their models are available, incl. mistral-large, but you sometimes get hit with rate limits on a given model. If that happens, just provision in your script that it automatically falls back to another model when it gets error 429.
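
          A sketch of that fallback idea, assuming Mistral’s OpenAI-style chat endpoint; the model names are just examples:

          ```python
          # Sketch: try the preferred model first, drop to the next one on HTTP 429.
          import requests

          API = "https://api.mistral.ai/v1/chat/completions"
          MODELS = ["mistral-large-latest", "mistral-small-latest"]  # example names

          def chat(messages, api_key):
              for model in MODELS:
                  resp = requests.post(API, json={"model": model, "messages": messages},
                                       headers={"Authorization": f"Bearer {api_key}"})
                  if resp.status_code == 429:  # rate limited, try the next model
                      continue
                  resp.raise_for_status()
                  return resp.json()["choices"][0]["message"]["content"]
              raise RuntimeError("all models are rate-limited right now")
          ```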

          Only thing I regret is they don’t allow multithreading on the free plan lol.

  • Maeve@lemmygrad.ml · 10 days ago

    Thank you and thank China for making inexpensive, intelligent technology for anyone else to use.

  • Philo_and_sophy@lemmygrad.ml · 9 days ago

    Great work, comrade. I’ve been wanting to do some data work on ProleWiki, but wanted to avoid scraping the site.

    Did/can you put the text dump up onto Hugging Face or GitHub?

    Similarly, curious if you’ve looked into running it through a vector database for RAG calls?

    • CriticalResist8@lemmygrad.ml (OP) · 9 days ago

      We should be handing these files to a dev who has already scraped and standardized the MIA, and who will do the same for PW, so that you get a whole bunch of Marxist theory to do stuff with (for example fine-tuning an LLM).

    • CriticalResist8@lemmygrad.ml (OP) · 9 days ago

      Honestly you kinda just discover this by yourself over time. It’s very daunting at first, but you take it one step at a time and it starts to come naturally. Though crush was explained to me by @[email protected], whom I thank for his time, and the mistral model we use for the translation work was suggested by a fellow PW editor.

      But at the very beginning I asked deepseek to write me a guide about installing local models, and it did. Ask it to:

      • (Important: send it your hardware and OS specs in the first prompt.)
      • check for dependencies you might already have installed
      • do as much as possible through the command line interface (cmd on windows) (this is a huge time saver and you just copy paste)
      • order steps by least effort to most effort (so it starts by checking dependencies)

      Then whenever you run into a bug or a command output just send it “I’m on step 2, here’s the log:” and paste the log.

      Especially since you can’t always use the most recent CUDA and Python versions depending on what you want to install, it takes care of making sure you use the right ones. I previously tried to install some interfaces by myself and gave up at the time lol.

      It has some outdated info bc of knowledge cutoff so it might suggest older models and the most popular interfaces, but it’s fine to start with. You can always uninstall and switch them out later. For example it suggested I use OpenWebUI which is good, but you also have to install Docker on Windows which is a whole thing, then run Docker, then run the UI from Docker, then run ollama… I switched to koboldcpp as sort of a “middle-ground” to more advanced interfaces/engines and it just works out of the box (double click the exe and load a gguf model).

      And if you’re not sure what model to get or how to configure the settings for it, again, ask deepseek. Provide your hardware and ask it “what should I set in kobold to optimize this model”. The info may be wrong, but tbh it’s more of an art than a science anyway, and this gets you playing around with the settings, trying different things out, and starting to understand what they do. The rule of thumb is you want a model whose file size on disk is ~2 GB lower than your VRAM, so that you can fit it entirely in VRAM with some to spare (with 16 GB of VRAM, for example, aim for a file of around 14 GB or less).

      After that the ‘pro’ engine is something like vLLM, but I’m not there yet. With crush you just run one command to install it on your computer, launch it from cmd, select a model provider, enter your API key, and it works. I wouldn’t use it with a local LLM because of the context size limitation; you really need as much context window as you can get for agentic work.

      Same thing with image gen interfaces: I asked deepseek to make the same kind of guide. It even created the Python virtual environment for me, so that’s pretty cool. Then go on civitai, look at top-rated pictures, find some you like, and download the models and LoRAs; the rest you just kinda figure out as you try to push the limits of what you can do with it. The reason I went looking for more realistic models on civitai is that deepseek got me started with SDXL 2.5 (it even provided a direct download link lol), but that model is obsolete by now, so once I’d played around with it enough I went looking for other models.

      Protip: put your engine and models on an SSD and get more than 2TB space lol, you’ll want it (you’re gonna start downloading models left and right and never delete). Loading from SSD takes literally 15 seconds versus >2 minutes from HDD.

        • CriticalResist8@lemmygrad.ml (OP) · 9 days ago

          Lol, I only got into it recently, 3 years after it started, and I already feel like I missed out on a lot of stuff. It just moves so fast, but it’s never too late to get into it. The open-source scene especially has a lot of things going on that you just won’t get from the proprietary names, which is a shame because they do have the talent… I tried gpt-oss, which is OpenAI’s response to deepseek to please the open source community, and I have to admit it runs incredibly well: 50 tokens per second and thinking capabilities (like deepseek) instead of mistral’s 15/s. I just think, what if we could have that for basically every model, instead of them locking everything behind closed source.