It still can’t count the Rs in strawberry, I’m not worried.
Clearly not the first try ;)
Yes it can
Screenshots please
https://lemmy.world/comment/14735060
https://ibb.co/wVNsn5H
https://ibb.co/HpK5G5Pp
https://ibb.co/sp1wGMFb
https://ibb.co/4wyKhkRH
https://ibb.co/WpBTZPRm
https://ibb.co/0yP73j6G
Note that my tests were via Groq and the R1 70B distilled Llama variant (the second-smartest version, afaik)
Edit 1:
Incidentally… I challenged a coworker to answer the same question. Here is the summarized conversation I had:
Me: “Hey Billy, can you answer a question? In under 3 seconds, answer my following question”
Billy: “sure”
Me: “How many A’s are in abracadabra? 3… 2… 1…”
Billy: “4” (answered in less than 3 seconds)
Me: “nope”
I’m gonna poll the office and see how many people get it right with the same opportunity the AI had.
Edit 2: The second coworker said “6” in about 5 seconds
Edit 3: Third coworker said 4, in 3 seconds
Edit 4: I asked two more people and one of them got it right… but I’m 60% sure she heard me asking the previous coworker. If she didn’t, we’re at 1/5.
I’m probably done with this game for the day.
I’m pretty flabbergasted by the results of my very unscientific experiment, but now I can say (with a mountain of anecdotal juice) that at letter counting, R1 70B is wildly faster and more accurate than humans.
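For the record, here’s the answer everyone was chasing, as a one-line Python check (counting case-insensitively, since the word is lowercase but I asked about capital As):

```python
print("abracadabra".lower().count("a"))  # -> 5
```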
https://lemmy.world/comment/14754440
“Again” so it failed the first time. Got it.
It didn’t, I just wanted a short reply. Though it failed when I asked again in the same chat. But when asked to split the word into two parts, it became sure that the correct answer is 3.
Here is the fun part:
Certainly! Let’s split “strawberry” into two equal parts and count the 'r’s in each:
First half: “strawb”
Letters: S, T, R, A, W, B
'r’s: 1 (at position 3)
Second half: “erry”
Letters: E, R, R, Y
'r’s: 2 (at positions 2 and 3)
Now, summing them: 1 (first half) + 2 (second half) = 3 'r’s.
Wait, this contradicts my earlier correction. It seems I made another mistake. Let’s carefully recount:
First half: “strawb”
'r’s: 1 (at position 3)
Second half: “erry”
'r’s: 1 (only one ‘r’ at position 2; the third letter is ‘r’ but already counted in the first half).
Wait, no—this is getting confusing. Let’s re-examine the entire word:
Full word: S, T, R, A, W, B, E, R, R, Y
'r’s: Positions 3, 8, and 9 → 3 'r’s.
I apologize for the back-and-forth. The correct total is 3 'r’s in “strawberry.” Thank you for your patience as I worked through this! 😊
That’s a lot of processing just to count letters. Hopefully it can add numbers without splitting the number ;)
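For contrast, here’s the entire computation in ordinary code, as a trivial Python check that also confirms the positions the model eventually landed on:

```python
word = "strawberry"
print(word.count("r"))                                 # -> 3
print([i for i, c in enumerate(word, 1) if c == "r"])  # -> [3, 8, 9]
```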
Feel free to ask Google/Bing/your favourite search engine to do the same :P
Search engines are not designed to answer questions. Apples and oranges.
It’s because LLMs don’t work with letters. They work with tokens that are converted to vectors.
They literally don’t see the word “strawberry” in order to count the letters.
Splitting the word probably separates the letters into individual tokens
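That’s easy to check with OpenAI’s tiktoken library (an illustration only; DeepSeek’s actual tokenizer differs, but the principle is the same):

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a GPT-4-era BPE vocabulary

# The model never sees individual letters, only these chunks:
token_ids = enc.encode("strawberry")
print([enc.decode_single_token_bytes(t) for t in token_ids])
# A few multi-letter chunks come out, not ten separate letters, so
# "count the r's" has to be inferred rather than read off the input.
```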
Is this some meme?
No. It literally cannot count the number of R letters in strawberry. It says 2, there are 3. ChatGPT had this problem, but it seems it is fixed. However if you say “are you sure?” It says 2 again.
Ask ChatGPT to make an image of a cat without a tail. Impossible. Odd, I know, but one of those weird AI issues
Because there aren’t enough pictures of tail-less cats out there to train on.
It’s literally impossible for it to give you a cat with no tail because it can’t find enough to copy and ends up regurgitating cats with tails.
Same with an overfilled glass of water: it can’t show you one spilling over because there aren’t enough pictures available for it to copy.
This is why telling a chatbot to generate a picture for you will never be a real replacement for an artist who can draw what you ask them to.
so… with all the supposed reasoning stuff they can do, and supposed “extrapolation of knowledge” they cannot figure out that a tail is part of a cat, and which part it is.
The “reasoning” models and the image generation models are not the same technology and shouldn’t be compared against the same baseline.
The “reasoning” you are seeing is it finding human conversations online and summarizing them
I’m not seeing any reasoning, that was the point of my comment. That’s why I said “supposed”
Oh, that’s another good test. It definitely failed.
There are lots of Manx photos though.
Manx images: https://duckduckgo.com/?q=manx&iax=images&ia=images
I mean, I tested it out, even though I am sure you’re trolling me, and DeepSeek correctly counts the R’s
Not trolling you at all:
https://lemmy.world/comment/14735060
Non-thinking prediction models can’t count the R’s in strawberry due to the nature of tokenization.
However, OpenAI o1 and DeepSeek R1 can both reliably do it correctly.
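That fits the tokenization story above: a reasoning trace usually spells the word out letter by letter first, and spelled-out letters tend to land one per token. A quick sketch with tiktoken again (assuming a cl100k-style BPE vocabulary; other models’ tokenizers differ in detail):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["strawberry", "s t r a w b e r r y"]:
    chunks = [enc.decode_single_token_bytes(t) for t in enc.encode(text)]
    print(f"{text!r} -> {chunks}")
# The compact word arrives as a few multi-letter chunks; the spelled-out
# version arrives at roughly one letter per token, which is countable.
```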