• 1 Post
  • 187 Comments
Joined 3 years ago
Cake day: July 3rd, 2023

  • qqq@lemmy.world to Lemmy Shitpost@lemmy.world · holy moley · edited 9 days ago

    I wager that, for example, most people didn’t vote in California not because they see their candidate as a lost cause, but because they know “their” candidate is sure to carry the state.

    That’s a natural interpretation as well. I wonder if it’d be possible to at least guess at whether it was that or “my person won’t win, so what’s the point”. There are probably many other factors. For example, the “did not vote” map looks surprisingly similar to the CDC’s SVI map: https://www.atsdr.cdc.gov/place-health/php/svi/svi-interactive-map.html. I’m not entirely sure what to make of that; my knee-jerk thought is that you’d see more of “what’s the point, they’re both the same” or “neither side actually cares about my needs” among disenfranchised people in general, combined with possibly more voter-suppression efforts in disenfranchised areas. Would making voting a federal holiday, or making it easier to vote by mail, improve turnout in those areas specifically?
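
    If someone wanted to put a number on that resemblance instead of eyeballing the two maps, a minimal sketch could look like the following. The file and column names are made up; real SVI and turnout data would first have to be joined on a shared county FIPS code.

    ```python
    # Hypothetical check: does county-level non-voting track the CDC's SVI?
    # File names and column names are placeholders, not real datasets.
    import pandas as pd

    svi = pd.read_csv("svi_by_county.csv")          # columns: fips, svi_score
    turnout = pd.read_csv("turnout_by_county.csv")  # columns: fips, pct_did_not_vote

    merged = svi.merge(turnout, on="fips")

    # Spearman rank correlation only assumes a monotonic relationship,
    # which suits two measures reported on different scales.
    corr = merged["svi_score"].corr(merged["pct_did_not_vote"], method="spearman")
    print(f"Spearman correlation: {corr:.2f}")
    ```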


  • qqq@lemmy.world to Lemmy Shitpost@lemmy.world · holy moley · edited 9 days ago

    I’d be interested in an interactive version of this where you could assign a percentage of those votes to the person who lost the state, as a naive proxy for “what would have happened if the people who figured their vote didn’t matter because [D|R] would win anyway had actually voted”. I know it wouldn’t be an actual measure, but it’d be fun to mess with anyway.
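
    As a toy sketch of what that slider would compute (all numbers here are made-up placeholders, not real results):

    ```python
    # Naive what-if: hand a chosen share of non-voters to the statewide loser.
    def reassign(winner_votes: int, loser_votes: int, did_not_vote: int,
                 share_to_loser: float) -> str:
        """Give `share_to_loser` of the non-voters to the loser and re-call the state."""
        new_loser_total = loser_votes + int(did_not_vote * share_to_loser)
        return "flipped" if new_loser_total > winner_votes else "held"

    # A made-up state carried 6.0M to 5.2M, with 4.5M eligible non-voters:
    for share in (0.10, 0.20, 0.30):
        print(f"{share:.0%} of non-voters to the loser -> "
              f"{reassign(6_000_000, 5_200_000, 4_500_000, share)}")
    ```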

    In particular, I find it kind of interesting that CA and TX both went to “did not vote” and are both historically considered “easy wins”.

    This image is just generally interesting because it also turns the idea of swing states around a bit: if neither candidate motivated enough people in all of those states, could we consider them swing states?

  • You’re suggesting that we replace THEM with an agent.

    I am not suggesting we replace anyone, least of all the open source community, so let’s not put words in my mouth.

    I think the current code I see being generated is generally “good enough”. I’m not comparing it to perfect: I’m comparing it to people.

    If this were true, then open source projects would have much less of an issue with pull requests from sloperators.

    That doesn’t follow for me. A good tool in the hands of a crappy user doesn’t suddenly make good output. I specifically said that LLMs write good code in a specific setting, and clearly a random person generating thousands of lines at a time for a project they don’t understand isn’t that setting.

    You seem to be very focused on crappy code generated by people who don’t know what they’re doing. The technology isn’t good enough for that, so yes, it won’t work in that setting; I agree.


  • I’d push back on your point here with a few things:

    The primary one: the code doesn’t need to be perfect or even above average – average is perfectly fine. The idea here is comparing the AI to a human, not to perfection. I see the perfection comparison constantly in AI discussions, and I find it a bit disingenuous.

    I do truly believe what I said above will be possible within my career (I’m in my mid-30s), but it’s not really what I’m worried about right now. I think the current code I see being generated is generally “good enough”. I’m not comparing it to perfect: I’m comparing it to people.

    I read a comment once that still rings true - “Hallucinations” are a misnomer. Everything an LLM puts out is a hallucination; it’s just that a lot of the time, it happens to be accurate. Eliminating that last percentage of inaccurate hallucinations is going to be nearly impossible.

    I don’t see any reason you have to remove all hallucinations to get a good tool for autonomous development: humans aren’t perfect either. We compensate for that with processes and by checking each other’s work, but plenty still falls through the cracks.

    LLMs also have no understanding of context outside the immediate. Satire is completely opaque to them. Sarcasm is lost on them, by and large. And they have no way to differentiate between good and bad output. Or good and bad input, for that matter. Joke pseudocode is just as valid in their training corpus as dire warnings about insecure code.

    Have you seen output in which satirical code is actually included? I’m well aware of things like https://www.anthropic.com/research/small-samples-poison and the potential there. And do you not believe that either (a) these kinds of trivial issues would be caught by a person whose job was just to audit output, or (b) this kind of issue could be caught by specially trained, domain-limited AIs designed to check output?


    To your point, then: what are your thoughts on this project? https://github.com/anthropics/claudes-c-compiler I’m not particularly interested in this use case right now, but it seems more in line with what you’re interested in.

    I think it shows a lot of limitations but also a lot of potential. I don’t personally think the AI needs to get the code perfect on the first go – it has to be compared to humans, and we definitely don’t get it right on the first try either.

    I really really dislike the way it’s being sold as a solution for things it’s in no way a solution for.

    Yes, of course. I think it’s important to look past the blowhards and think about what the technology is actually doing: that’s the perspective I’m trying to talk about this from.


    I didn’t say “trust me bro”, and showing Claude’s submissions is sufficient for analyzing code in the context where I believe it is good: one file at a time and one task at a time. That’s also the scale at which a human is good. You’re welcome to look at the project as a whole to judge the “project quality” as well: it’s open source. But I’m not here to argue: I believe this tech, barely in its infancy, is already quite good and going to get better, and I’m already considering what it will do to my life. If you don’t, that’s fine.


    I’ll add here that I find it very frustrating to talk about these “AI agents” and their code output, because it’s something we’re all close to and have spent a lot of time learning. The concept of “a machine” getting “better than us” so quickly, against the background of an industry champing at the bit to replace humans, makes these discussions inherently difficult and really emotional. I feel genuine sadness when I think about it. If the world were different, we’d probably all be stoked. I don’t want the AI to be better than me, and I currently don’t believe it is, but I think:

    1. My belief doesn’t stop the market. People do believe that it is better than me or at least good enough. This has a real effect on my life and the lives of people I know.
    2. I don’t see any fundamental reason it won’t get better at development. Part of the reason it struggles with large projects is context, and that doesn’t sound like a fundamental engineering constraint to me; it sounds like a memory constraint. Specialization will also make it better and better, I assume.
    3. Even if it is never better than me, it will certainly be more efficient, and eventually the market will consider my time better spent correcting or guiding its output, removing what is, to my mind, the fun part of the work.

    I don’t think my job is on the chopping block today: I don’t do development, I do security work. But I do think it will either be on the chopping block or fundamentally change sooner than I’m comfortable with.