• Sibbo@sopuli.xyz
    ↑ 10 · 6 days ago

    So did anybody try this and want to share their experiences?

    • Does it use lots of CPU, RAM or disk?
    • Are the search results actually good?
    • Does it use a browser extension or something similar to pick up newly visited sites, or do I have to import my history every day?
    • Does it also have a crawler?

    Also, why would I use this over e.g. YaCy?

  • activistPnk@slrpnk.net
    ↑ 4 · 6 days ago

    I currently use the find-grep function in emacs, which is basically: find . -type f -exec grep 'my.*search.*pattern' {} +

    To do PDFs, I use something like find . -type f -iname \*pdf -exec pdfgrep 'my.*search.*pattern' {} +

    My problem is generally when TOKEN1<space>TOKEN2 has a line break between tokens. It’s fucking annoying that grep is line-by-line. I wonder if Hister solves that problem. But from the website, I see no advanced syntax. I would love to search a pattern like word1 w/s word2, which would find cases where word1 and word2 appear in the same sentence. And word1 w/p word2 to match cases where two words are in the same paragraph.
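For the same-paragraph case, standard tools can get part of the way there. The following is only a sketch, not something Hister is known to offer: it assumes paragraphs are separated by blank lines and relies on awk's paragraph mode (an empty `RS`), where each record is a whole paragraph, so line breaks inside a paragraph no longer matter. The function name `same_paragraph` is made up for illustration.

```shell
# Hypothetical helper: print every paragraph that contains both patterns.
# Assumes paragraphs are separated by one or more blank lines.
same_paragraph() {
    # $1, $2: awk extended regular expressions; remaining args: files
    first=$1
    second=$2
    shift 2
    # RS='' switches awk to paragraph mode: one record per blank-line-separated block.
    # Note: -v processes backslash escapes, so keep the patterns free of backslashes.
    awk -v RS='' -v a="$first" -v b="$second" \
        '$0 ~ a && $0 ~ b { print; print "" }' "$@"
}
```

Usage would look like `same_paragraph word1 word2 notes.txt`. A same-sentence (w/s) search is harder, because sentence boundaries are not as easy to detect as blank lines.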

    • cravl@slrpnk.net
      ↑ 2 · 5 days ago

      Replacing line breaks with nulls first is an option. That’s a lot of extra processing for very large blocks of text though.

      Using regular grep is possible with the right flags (with GNU grep, -z treats the whole file as a single line, and it is typically combined with -P and -o), or you could use pcre2grep with the -M flag, which should be packaged for most distros nowadays. See this Stack Overflow article for details.
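As a rough illustration of the plain-grep route, here is a minimal sketch. It assumes GNU grep built with PCRE support (the `-P` flag); other grep implementations may not accept these options. The function name `multiline_grep` is invented for this example.

```shell
# Sketch, assuming GNU grep with PCRE support.
# -P: Perl-compatible regex, so \s also matches a newline
# -z: records end at NUL instead of newline, so a text file is one big record
# -o: print only the matched text instead of the whole record (the whole file)
multiline_grep() {
    # $1: PCRE pattern; remaining args: files
    pattern=$1
    shift
    # With -z the output is NUL-terminated too; turn the NULs into newlines.
    grep -Pzo -- "$pattern" "$@" | tr '\0' '\n'
}
```

Something like `multiline_grep 'TOKEN1\s+TOKEN2' notes.txt` should then find the two tokens even when a line break sits between them. Since the whole file is read as one record, memory use grows with file size.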

      • activistPnk@slrpnk.net
        ↑ 1 · 4 days ago

        pcregrep is not installed by default on Debian, but it’s in the official repos. It seems common to get:

        pcregrep: Too many errors - abandoned.
        pcregrep: Error -8, -21 or -27 means that a resource limit was exceeded.
        pcregrep: Check your regex for nested unlimited loops.
        

        But it will help in many cases, and I can see that it works on sufficiently small files. I noticed that the built-in grep function in emacs can be modified to use pcregrep with -M added instead of grep, which I find quite important because emacs makes it very easy to jump between results. In the end it’s still a hack.