Just to keep my fellow lemmygradians updated on what AI tools are capable of, and also because I’m pretty stoked for this project.

I put $5 in the deepseek API (sidenote: I like that you have to top up a credit balance and they don’t auto-bill), then downloaded crush. Crush is an agentic coding tool, meaning it basically instructs the LLM to do stuff automatically.

It made me a complete Python script that first downloads all of the ProleWiki content pages into txt files (which also means we can do backups, even if it’s a little hacked together).
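
Since ProleWiki runs on MediaWiki, here’s roughly what that kind of script boils down to. This is a minimal sketch of the idea, not the code the agent actually wrote, and the API URL is an assumption:

```python
# Sketch: list every content page through the MediaWiki API and dump its
# wikitext into a .txt file. Endpoint URL is assumed; adjust it for your instance.
import pathlib
import requests

API = "https://en.prolewiki.org/api.php"  # assumed endpoint
OUT = pathlib.Path("pages")
OUT.mkdir(exist_ok=True)

session = requests.Session()

def all_pages():
    # Walk the full page list, following the API's continuation tokens.
    params = {"action": "query", "list": "allpages", "aplimit": "500", "format": "json"}
    while True:
        data = session.get(API, params=params).json()
        for page in data["query"]["allpages"]:
            yield page["title"]
        if "continue" not in data:
            break
        params.update(data["continue"])

def wikitext(title):
    # Fetch the current revision's raw wikitext for one page.
    data = session.get(API, params={
        "action": "query", "prop": "revisions", "rvprop": "content",
        "rvslots": "main", "titles": title, "format": "json",
    }).json()
    page = next(iter(data["query"]["pages"].values()))
    revisions = page.get("revisions")
    return revisions[0]["slots"]["main"]["*"] if revisions else ""

for title in all_pages():
    safe_name = title.replace("/", "_")
    (OUT / f"{safe_name}.txt").write_text(wikitext(title), encoding="utf-8")
```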

Then with a second script we are running these txts through a (local) LLM to translate them for our French instance. The problem is there are 5000 pages on the EN instance and a grand total of 3 in French, so nobody is interested in joining and writing pages from scratch when you could “just” find them on the EN instance.
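
The translation pass is conceptually just a loop over those txt files. A minimal sketch, assuming the local model is served through an OpenAI-compatible endpoint (koboldcpp, which comes up later, exposes one on port 5001 by default); the system prompt here is only a placeholder for the real one:

```python
# Sketch: feed every downloaded page through a local LLM and save the French
# output alongside it. Endpoint and prompt are placeholders.
import pathlib
import requests

ENDPOINT = "http://localhost:5001/v1/chat/completions"  # assumed local server
SYSTEM_PROMPT = "Translate the following MediaWiki wikitext from English to French..."  # placeholder

SRC = pathlib.Path("pages")
DST = pathlib.Path("pages_fr")
DST.mkdir(exist_ok=True)

def translate(text):
    resp = requests.post(ENDPOINT, json={
        "model": "local",  # most local servers ignore the model name
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        "temperature": 0.2,
    }, timeout=600)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

for src in sorted(SRC.glob("*.txt")):
    out = DST / src.name
    out.write_text(translate(src.read_text(encoding="utf-8")), encoding="utf-8")
```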

For these two scripts (which are running right now) I’ve paid a whopping 67 cents on API. It amounts to a few hours of prompting and then of course waiting for the agent to work.

Cache hits on deepseek are a godsend for agentic work as they’re basically free (less than 2 cents per 1M tokens), and with a codebase you constantly feed it the same code over and over. This is why my cache hit rate is so high.

Compare that to GPT-5, which costs 12 cents per 1M tokens on a cache hit.

What’s pretty amazing (and scary, it’s very scary using crush) is that you can just go do something else while it works and puts everything together. Go have dinner while the agent is on the task, or watch a youtube video.

The third and final script will be used to upload the translated files to the wiki. I still need to think about what exactly I want it to do (write API access is not a problem, the problem is just the logic of it all).
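
The mechanical part of that upload is fairly standard MediaWiki API work, something like the sketch below; the endpoint and bot account are placeholders, and the actual logic (what to do with existing pages, categories, redirects and so on) is the part I still need to figure out:

```python
# Sketch: log in with a bot account, fetch a CSRF token, and create or
# overwrite a page through the MediaWiki edit API.
import requests

API = "https://fr.prolewiki.org/api.php"  # assumed endpoint
session = requests.Session()

def login(user, password):
    # Bot-password login: fetch a login token, then post the credentials.
    token = session.get(API, params={
        "action": "query", "meta": "tokens", "type": "login", "format": "json",
    }).json()["query"]["tokens"]["logintoken"]
    session.post(API, data={
        "action": "login", "lgname": user, "lgpassword": password,
        "lgtoken": token, "format": "json",
    })

def upload_page(title, text):
    # Every edit needs a fresh CSRF token from the same session.
    csrf = session.get(API, params={
        "action": "query", "meta": "tokens", "format": "json",
    }).json()["query"]["tokens"]["csrftoken"]
    session.post(API, data={
        "action": "edit", "title": title, "text": text,
        "summary": "Machine-translated import", "bot": "1",
        "token": csrf, "format": "json",
    })
```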

As for running the translation job if you’re curious, it saves its progress so I can stop and resume any time I want and I estimate around 6-8 days of continuous running to go through everything (there’s a lot of material). Yes we could use an API or even rent a GPU and multithread but eh, I figured I only have to do this once. And there’s a LOT of tokens to translate, you won’t escape that. Even using a cloud API it would probably take a few days of continuous querying.
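
The progress-saving part is nothing fancy conceptually: just a record of which pages are already done, checked before translating and updated after each one. A sketch of the idea (the actual script keeps its own little database):

```python
# Sketch: a tiny resumable-progress record stored as JSON.
import json
import pathlib

PROGRESS = pathlib.Path("progress.json")

def load_done():
    # Pages translated in previous runs; empty set on the first run.
    return set(json.loads(PROGRESS.read_text())) if PROGRESS.exists() else set()

def mark_done(done, page_name):
    # Record one finished page so an interrupted run can pick up where it left off.
    done.add(page_name)
    PROGRESS.write_text(json.dumps(sorted(done), indent=2))
```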

But compare to doing it by hand which, well, we haven’t even started despite the instance existing for 4 years. So it’s basically 4+ years vs 8 days of work.

Later I can adapt this specific script to work on books to bring more exclusive theory to English like we did for the CIA’s Shining Path which was done with what is now an almost obsolete model lol (and I definitely improved the prompting since that one). I might actually redo CIA’s Shining Path with mistral just to see how it differs.

The problem, if anything, is that this is making me learn stuff like git so I can make it FOSS and downloadable, and make it more robust to handle more use cases lol

About crush:

Before I started using crush I didn’t really get what an agent actually did or how it helped. This isn’t just putting a prompt into the web interface and asking it to generate Python code. The agent takes care of everything, including writing functions, writing tests, and fixing bugs. That’s right, this thing fixes its own code automatically.

It calls tools and terminal commands by itself, and can edit files. When it does you get a git-like preview of the lines edited.

To use crush you just prompt the LLM. “Okay now I want to do this, now I want to do that, there’s a bug, here’s the log” and it will work through the problem by itself. It’s scary how fast it does it.

You can extend its capabilities with LSPs and MCPs but I haven’t looked into that yet. Wish it was more user-friendly to set up, but I got there in the end.

Caveats:

deepseek boasts a pretty comfy 128k token context window, but you run through it quickly because it has to read and understand the entire project. Crush handles this (it makes the LLM write to a crush.md file and then restarts the last command sent when the context resets), but you’re still limited. However, with tools like deepseek-ocr, if they ever start integrating it, you potentially have infinite context. Clearly they’re going to come up with something, they’re already working on it. But you won’t be recreating twitter with an LLM yet.

You don’t want a coding-specific fine-tune for this, as it needs to understand the file structure and the readmes. However, I have run into situations where the LLM did stuff it shouldn’t have, for example deleting the database that keeps track of which files we’ve already worked through, because it doesn’t know this is the ‘live’ prod.

Mind you, I’m pretty much cobbling this together so I don’t git it or anything, it’s just a one-time script for our specific needs. I also shouldn’t have put the content files in the same folder as the script; keeping them separate is just good practice. I def recommend keeping two copies of your project if you’re not going to use git: crush works on one copy, and then you can copy the files over to the other folder.

Oh, also, there’s no chance of crush deleting system32, as it opens in a specific folder and can’t leave it. Before running a script it also lets you review the code and asks for permission to run it.

This is not replacing devs. It’s a great addition for non-devs and devs alike. For non-devs it lets us write our scripts and solve our problems. For devs, you spend more time thinking through and planning your app and then hand the writing of it to the LLM. As a designer this speaks to me because we plan things a lot lol. And if you know your stuff, you can steer the LLM away from pitfalls it would fall into if you didn’t specifically prompt against them.

If you also don’t know some libraries or APIs very well it can handle them for you. You can totally give crush working code you wrote yourself, it’s just that it might not be the most efficient way to use it since it could also write that code for you.

Your workflow is basically 3-10x more efficient with this and that’s valuable - take a coffee break while it works, you deserve it. You become more of an engineer than a coder and imo this is where dev work is heading.

Translation work:

As for the translation, which is handled by mistral-3.2-instruct (a 24B model that fits in my 16 GB of VRAM and generates at 15 tokens per second; honestly, good job France, I gotta hand it to you), it’s pretty good but you have to prompt it first. The prompt for this task is ~600 tokens, which is a lot but also not a lot considering I can easily have a 16k context window with this tool.
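
To give an idea of what that prompting looks like, here’s a heavily abridged, made-up version of that kind of system prompt; the real one is ~600 tokens and much more specific:

```python
# Illustrative only: an abridged stand-in for the actual translation prompt.
SYSTEM_PROMPT = """You are translating MediaWiki pages from English to French for an encyclopedia.
- Translate the prose; leave all wikitext markup, templates and references intact.
- Keep [[internal links]] pointing at the same targets; translate only the display text.
- Use formal, encyclopedic French and keep political terminology consistent throughout.
- Output only the translated wikitext, nothing else."""
```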

imo a lot of the “we spend more time fixing the translation than we would have spent doing it ourselves” comes from clients incorrectly prompting stuff (but what else is new lol), translators not necessarily using tools to automate bulk edits, and older models not doing as good a job - deepseek is actually pretty solid at translating because of the thinking, though we didn’t use a thinking model for this task.

Translating the filenames is messier and more prone to hallucinating random characters. I think it’s because it just doesn’t have a lot to work with, you’re asking it to translate 5 seemingly random words. Translating the page content goes much better; some pages that I checked are pretty amazing.

Not all languages work equally well. I used Mistral specifically because it’s French, so we assume it understands French better. Some languages don’t have ‘enough’ presence to be trained on effectively, and others are just not a priority for devs. Chinese LLMs are seemingly better at Persian, for example, but still not ‘great’.

Another thing is it sometimes translates jargon two different ways. It would need a dictionary or something similar that says “this word is always translated as X”. I’m sure this will come, and in fact a simple dictionary is probably an old-school method for LLMs already. But you would also need to build that dictionary, and when you have 5000 pages of content I just don’t know where you would even begin.
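
A sketch of what that could look like: a small glossary that gets added to the system prompt, plus a post-processing pass for terms the model keeps translating inconsistently. The term pairs below are made-up examples, not an actual ProleWiki glossary:

```python
# Sketch: enforce consistent jargon via the prompt and via post-processing.
import re

GLOSSARY = {
    "dialectical materialism": "matérialisme dialectique",
    "proletariat": "prolétariat",
}

def glossary_block():
    # Goes into the system prompt so the model sees the preferred translations.
    lines = [f"- {en} -> {fr}" for en, fr in GLOSSARY.items()]
    return "Always translate these terms exactly as follows:\n" + "\n".join(lines)

def enforce(text, wrong_to_right):
    # Bulk-fix terms the model translated inconsistently, e.g.
    # {"matérialisme dialectal": "matérialisme dialectique"}.
    for wrong, right in wrong_to_right.items():
        text = re.sub(re.escape(wrong), right, text, flags=re.IGNORECASE)
    return text
```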

Even with those caveats it gets us 80-90% of the way there, and the remaining work will be to fix stuff manually as we come across it, or with mass regex edits. If we can get interest in the FR instance with this, as one of our editors has alluded to, then we can also count on crowdsourcing the rest of it over time.

Conclusion:

We’re doing pretty exciting things for 67 cents.

  • tamagotchicowboy@lemmygrad.ml · 6 days ago

    For another Chinese model I really like z.ai’s stuff; you don’t even need to log in to use it as a chatbot, which is awesome. I’m not sure about their API, but coding-wise I can just head over, not even bother logging in, and it’ll get me a functional PowerShell script powered by ollama to organize some files I have lying around.

    I think their flash model’s API is free, but you still have to provide a credit card, which is standard with AI.

        • CriticalResist8@lemmygrad.ml (OP) · 6 days ago

          Definitely need to make my key. Mistral doesn’t ask for a CC either (just a phone #) and their API is completely free with absolutely wild rate limits: 500k tokens per minute and 2 billion tokens of output per month. All their models are available, incl. mistral-large, but you sometimes get hit with rate limits on a given model. If that happens, just provision in your script that it automatically falls back to another model when it gets error 429.
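
          A sketch of that fallback idea, assuming Mistral’s OpenAI-style chat endpoint; the model names are just examples:

          ```python
          # Sketch: try the preferred model first, drop to the next one on HTTP 429.
          import requests

          API = "https://api.mistral.ai/v1/chat/completions"
          MODELS = ["mistral-large-latest", "mistral-small-latest"]  # example names

          def chat(messages, api_key):
              for model in MODELS:
                  resp = requests.post(API, json={"model": model, "messages": messages},
                                       headers={"Authorization": f"Bearer {api_key}"})
                  if resp.status_code == 429:  # rate limited, try the next model
                      continue
                  resp.raise_for_status()
                  return resp.json()["choices"][0]["message"]["content"]
              raise RuntimeError("all models are rate-limited right now")
          ```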

          Only thing I regret is they don’t allow multithreading on the free plan lol.

  • Maeve@lemmygrad.ml · 10 days ago

    Thank you and thank China for making inexpensive, intelligent technology for anyone else to use.

  • Philo_and_sophy@lemmygrad.ml · 9 days ago

    Great work, comrade. I’ve been wanting to do some data work on ProleWiki, but wanted to avoid scraping the site.

    Did/can you put the text dump up onto Hugging Face or GitHub?

    Similarly, curious if you’ve looked into running it through a vector database for RAG calls?

    • CriticalResist8@lemmygrad.ml (OP) · 9 days ago

      We should be handing these files to a dev who has already scraped and standardized the MIA, and who will do the same for PW, so that you get a whole bunch of Marxist theory to do stuff with (for example fine-tuning an LLM).

    • CriticalResist8@lemmygrad.ml (OP) · 9 days ago

      Honestly you kinda just discover this by yourself over time. It’s very daunting at first, but you take it one step at a time and it starts to come naturally. Though crush was explained to me by @[email protected], whom I thank for his time, and the mistral model we use for the translation work was suggested by a fellow PW editor.

      But at the very beginning I asked deepseek to write me a guide about installing local models, and it did. Ask it to:

      • (Important: send it your hardware and OS specs in the first prompt.)
      • check for dependencies you might already have installed
      • do as much as possible through the command line interface (cmd on windows) (this is a huge time saver and you just copy paste)
      • order steps by least effort to most effort (so it starts by checking dependencies)

      Then whenever you run into a bug or a command output just send it “I’m on step 2, here’s the log:” and paste the log.

      Especially since you can’t always use the most recent CUDA and Python versions depending on what you want to install, it takes care of making sure you use the right ones. I previously tried to install some interfaces by myself and gave up at the time lol.

      It has some outdated info bc of knowledge cutoff so it might suggest older models and the most popular interfaces, but it’s fine to start with. You can always uninstall and switch them out later. For example it suggested I use OpenWebUI which is good, but you also have to install Docker on Windows which is a whole thing, then run Docker, then run the UI from Docker, then run ollama… I switched to koboldcpp as sort of a “middle-ground” to more advanced interfaces/engines and it just works out of the box (double click the exe and load a gguf model).

      And if you’re not sure what model to get or how to configure the settings for it, again, ask deepseek. Provide your hardware and ask it “what should I set in kobold to optimize this model”. The info may be wrong, but tbh it’s more of an art than a science anyway, and this gets you playing around with the settings, trying different things out, and starting to understand what they do. The rule of thumb is you want a model whose file size on disk is ~2 GB lower than your VRAM, so that you can fit it entirely in VRAM with some to spare (with 16 GB of VRAM, for example, aim for a file of around 14 GB or less).

      After that the ‘pro’ engine is something like vLLM, but I’m not there yet. With crush you just run one command to install it on your computer, launch it from cmd, select a model provider, enter your API key, and it works. I wouldn’t use it with a local LLM because of the context size limitation; you really need as much context window as you can get for agentic work.

      Same thing with image gen interfaces: I asked deepseek to make the same kind of guide. It even created the Python virtual environment for me, so that’s pretty cool. Then go on civitai, look at top-rated pictures, find some you like, and download the models and LoRAs; the rest you just kinda figure out as you try to push the limits of what you can do with it. The reason I went looking for more realistic models on civitai is that deepseek got me started with SDXL 2.5 (it even provided a direct download link lol), but that model is obsolete by now, so once I’d played around with it enough I went looking for other models.

      Protip: put your engine and models on an SSD and get more than 2TB space lol, you’ll want it (you’re gonna start downloading models left and right and never delete). Loading from SSD takes literally 15 seconds versus >2 minutes from HDD.

        • CriticalResist8@lemmygrad.ml (OP) · 9 days ago

          Lol, I only got into it recently, 3 years after it started, and I already feel like I missed out on a lot of stuff. It just moves so fast, but it’s never too late to get into it. The open-source scene especially has a lot of things going on that you just won’t get from the proprietary names, which is a shame because they do have the talent… I tried gpt-oss, which is OpenAI’s response to deepseek to please the open source community, and I have to admit it runs incredibly well: 50 tokens per second and thinking capabilities (like deepseek) instead of mistral’s 15/s. I just think, what if we could have that for basically every model, instead of them locking everything behind closed source.