• senkora@lemmy.zip
    4 days ago

    You can store the Merkle trees inside of a SQLite database as extra columns attached to the data.

    That way you get the benefits of a high-level query language and a robust storage layer as well as the cryptographic verification.
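As a rough illustration of the idea, here is a minimal sketch using Python's built-in `sqlite3` and `hashlib` modules. The `entry` table, its column names, and the helper functions are all made up for this example and are not Fossil's actual schema; it also uses a simple hash chain (each row hashes its content plus its parent's hash) rather than a full tree.

```python
import hashlib
import sqlite3

def node_hash(parent_hash, content):
    """Hash a row's content together with its parent's hash."""
    h = hashlib.sha256()
    h.update((parent_hash or "").encode())
    h.update(b"\0")  # separator between parent hash and content
    h.update(content.encode())
    return h.hexdigest()

def open_store():
    # Hypothetical schema: the hash columns sit right next to the data.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE entry ("
        " hash TEXT PRIMARY KEY,"
        " parent_hash TEXT REFERENCES entry(hash),"
        " content TEXT NOT NULL)"
    )
    return db

def append(db, parent_hash, content):
    """Insert a new row and return its hash."""
    h = node_hash(parent_hash, content)
    db.execute("INSERT INTO entry VALUES (?, ?, ?)", (h, parent_hash, content))
    return h

def verify(db):
    """Recompute every hash and check that each referenced parent exists."""
    known = {row[0] for row in db.execute("SELECT hash FROM entry")}
    rows = db.execute("SELECT hash, parent_hash, content FROM entry").fetchall()
    for h, parent, content in rows:
        if node_hash(parent, content) != h:
            return False
        if parent is not None and parent not in known:
            return False
    return True

db = open_store()
root = append(db, None, "first version")
tip = append(db, root, "second version")
print(verify(db))  # True

# Tampering with stored content breaks verification.
db.execute("UPDATE entry SET content = 'tampered' WHERE hash = ?", (root,))
print(verify(db))  # False
```

The data stays queryable with ordinary SQL, and `verify` can be rerun at any time to detect rows whose content no longer matches the stored hash.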

    In fact, there is a version control system called Fossil which does exactly that:

    https://fossil-scm.org/home/doc/trunk/www/fossil-v-git.wiki

    The baseline data structures for Fossil and Git are the same, modulo formatting details. Both systems manage a directed acyclic graph (DAG) of Merkle tree structured check-in objects. Check-ins are identified by a cryptographic hash of the check-in contents, and each check-in refers to its parent via the parent’s hash.

    The difference is that Git stores its objects as individual files in the .git folder or compressed into bespoke key/value pack-files, whereas Fossil stores its objects in a SQLite database file which provides ACID transactions and a high-level query language. This difference is more than an implementation detail. It has important practical consequences.

    […]

    The SQL query capabilities of Fossil make it easier to track the changes for one particular file within a project. For example, you can easily find the complete edit history of this one document, or even the same history color-coded by committer. Both questions are simple SQL queries in Fossil, with procedural code only being used to format the result for display. The same result could be obtained from Git, but because the data is in a key/value store, much more procedural code has to be written to walk the data and compute the result. And since that is a lot more work, the question is seldom asked.
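To give a feel for what "a simple SQL query" might look like here, the sketch below builds a toy repository in SQLite and asks for the history of one file. The `checkin` and `file_change` tables, their columns, and the sample rows (including the placeholder hashes) are invented for illustration; Fossil's real schema is different.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Hypothetical, simplified schema (not Fossil's).
CREATE TABLE checkin (
    id INTEGER PRIMARY KEY,
    hash TEXT,
    committer TEXT,
    ts TEXT
);
CREATE TABLE file_change (
    checkin_id INTEGER REFERENCES checkin(id),
    path TEXT
);
INSERT INTO checkin VALUES
    (1, 'a1', 'alice', '2024-01-01'),
    (2, 'b2', 'bob',   '2024-01-02'),
    (3, 'c3', 'alice', '2024-01-03');
INSERT INTO file_change VALUES
    (1, 'README.md'),
    (2, 'main.c'),
    (3, 'README.md'),
    (3, 'main.c');
""")

def file_history(db, path):
    """Return (timestamp, committer, hash) for every check-in touching path."""
    return db.execute(
        "SELECT c.ts, c.committer, c.hash"
        " FROM checkin AS c"
        " JOIN file_change AS f ON f.checkin_id = c.id"
        " WHERE f.path = ?"
        " ORDER BY c.ts",
        (path,),
    ).fetchall()

for row in file_history(db, "README.md"):
    print(row)
# ('2024-01-01', 'alice', 'a1')
# ('2024-01-03', 'alice', 'c3')
```

The whole question is one join and a filter; the only procedural code left is the loop that prints the rows, which matches the division of labour the quoted passage describes.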