• jacksilver@lemmy.world
    14 hours ago

    I’ve done some work on natural language to SQL, both with older models (like BERT) and current LLMs. It can do alright if there is a good schema and reasonable column names, but otherwise it can break down pretty quickly.

    That’s before you get into the fact that SQL dialects are a really big issue for LLMs to begin with. They all look so similar that I’ve found it common for them to switch between dialects without warning.
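
For anyone wanting to catch that kind of dialect switching, here is a rough sketch of a check that scans generated SQL for syntax belonging to a dialect other than the one you asked for. The marker lists and the `foreign_markers` helper are made up for illustration and are far from complete; a real check would more likely lean on a proper SQL parser.

```python
import re

# Hypothetical and deliberately small marker lists. Real dialects overlap
# much more than this, so treat hits as hints rather than proof.
DIALECT_MARKERS = {
    "tsql": [r"\bSELECT\s+TOP\s+\d+", r"\bGETDATE\s*\("],
    "postgres": [r"::\s*\w+", r"\bILIKE\b"],
    "mysql": [r"`\w+`", r"\bCURDATE\s*\("],
}


def foreign_markers(sql, target):
    """Return {dialect: [matched patterns]} for every dialect except `target`."""
    hits = {}
    for dialect, patterns in DIALECT_MARKERS.items():
        if dialect == target:
            continue  # syntax from the requested dialect is expected
        matched = [p for p in patterns if re.search(p, sql, flags=re.IGNORECASE)]
        if matched:
            hits[dialect] = matched
    return hits


# A query meant for Postgres that slipped in T-SQL's TOP clause.
generated = "SELECT TOP 5 name FROM users WHERE name ILIKE 'a%'"
print(foreign_markers(generated, target="postgres"))
```

An empty result does not mean the query is valid for the target, only that none of the listed markers showed up.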

    • morbidcactus@lemmy.ca
      4 hours ago

      Yeah, I can totally understand that. Genie is Databricks’ one and apparently it’s surprisingly decent at that, but it has access to a governance platform that traces column lineage on top of whatever descriptions and other metadata you give it. I was pretty surprised by the accuracy of some of its auto-generated descriptions, though.

      • jacksilver@lemmy.world
        3 hours ago

        Yeah, the more data you have around the database the better, but that’s always been the issue with data governance - you need to stay on top of that or things start to degrade quickly.

        When the governance is good, the LLM may be able to keep up, but will you know when things start to slip?
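
One crude way to notice the slip is to track how much of the schema still has descriptions at all. The sketch below assumes column metadata is available as a list of dicts with a `name` and an optional `description`; that shape and both helper names are hypothetical, not any particular catalog’s API.

```python
def description_coverage(columns):
    """Fraction of columns with a non-empty description (0.0 for no columns)."""
    if not columns:
        return 0.0
    described = sum(1 for c in columns if (c.get("description") or "").strip())
    return described / len(columns)


def undocumented(columns):
    """Names of columns whose description is missing or blank."""
    return [c["name"] for c in columns if not (c.get("description") or "").strip()]


# Hypothetical metadata export for one table.
columns = [
    {"name": "id", "description": "Primary key"},
    {"name": "col_17", "description": ""},
    {"name": "amt"},
]
print(description_coverage(columns))
print(undocumented(columns))
```

This only measures whether descriptions exist, not whether they are still accurate, so it would catch neglect but not stale or wrong metadata.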