HAMMER: Where SQL meets Shell

Transactional DDL

If you have ever worked with a decent database system such as https://postgresql.org/, you will have encountered (and learned to love) transactional DDL. If you don’t know what I’m talking about, you’re missing out big time and you should definitely try it out. In short, transactional DDL allows you to put practically anything between BEGIN TRANSACTION; …; COMMIT;: not just DML statements (INSERT, UPDATE and DELETE), but also ALTER, CREATE and DROP statements. In Postgres, there are only very few things that you cannot wrap inside a transaction, and most of those follow from common sense (transactions are only valid inside a single database, so you cannot roll back things like DROP DATABASE, for instance). This allows you to create schema migration scripts that are safe: Either they succeed completely, or they don’t—there are no in-betweens.
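
For illustration, here is what such a transaction might look like when typed into psql (the photos table is invented for the example; any DDL could stand in its place):

  BEGIN;
  ALTER TABLE photos ADD COLUMN sha512 text;
  -- We notice half-way through that something is wrong.  Nothing is lost:
  ROLLBACK;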

It is very common to see haphazard schema migration scripts that somehow try to manually undo the damage they have inflicted on the database because of a bug, a subtle incompatibility, schemas that have diverged because someone tampered with them, and so on. This approach ultimately doesn’t work: If your script runs into a situation it didn’t anticipate, then by definition it doesn’t know where it is. And if it doesn’t know where it started, it cannot restore the database to where it was.

But with transactional DDL, I just wrap my entire upgrade script inside BEGIN and COMMIT, and all those problems just… vanish.

So what does that have to do with the HAMMER file system?

In some ways, HAMMER works similarly to a database system: All changes to the file system exist in the context of a transaction, which either succeeds or doesn’t. This is how HAMMER can cope with crashes without having to fsck, because—just like a well-designed DBMS—it simply looks at its log, rolls back any half-completed transaction and is up and running again. You never risk inconsistencies (like disconnected inodes which suddenly become half-hybrid directories and who knows what else); at most, your changes from, say, the last minute will be gone, as if you had never made them.

What’s interesting is that HAMMER allows unprivileged users to muck around with these transactions. It keeps a fine-grained history of all changes to your files and directories and allows you to undo your changes with the aptly-named undo command. Furthermore, hammer history allows you to see how far back in history you can actually go. With everything at default settings, you can go back at least as far as the most recent snapshot.1) If you know what you are doing, you can also re-visit your file system as it was, say, five minutes ago: A healthy amount of black magic inside the VFS code allows you to enter transactions as if they were directories.
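
For a single file, undo -i lists the transaction ids it still knows about, together with their timestamps (file name and ids here are illustrative; the output format matches the directory listing further down):

  % undo -i Birthdays.db
  Birthdays.db: ITERATE ENTIRE HISTORY
          0x0000000201e96640 16-Nov-2016 16:50:40
          0x0000000201e9e690 16-Nov-2016 16:51:07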

What does this look like?

Let’s say I have a directory called ~/Important Stuff/. If I sync it to disk, that means that there will have been a HAMMER transaction containing whatever changes had been pending for that directory, and that this transaction has been committed, i.e. if the power dropped afterwards, ~/Important Stuff/ would at least contain everything up to the sync call.

Let’s assume that my ~/Important Stuff/ looks like this:

% lt
.
|-- Birthdays.db
|-- DO NOT FORGET!.text
`-- other stuff
    `-- secret.text

At any given time, I can ask HAMMER about the most recently committed file system transaction, and it will give me a transaction id.

  % hammer synctid .
  0x0000000201e862e0

hammer synctid will sync the file system, then print the id of the transaction it just committed.

Transaction ids can be explored as if they were directories—if you know the trick:

  % cd @@0x0000000201e862e0
  % pwd
  /home/streichholz/Important Stuff/@@0x0000000201e862e0
  % ls -F1
  Birthdays.db
  DO NOT FORGET!.text
  other stuff/

Now let’s assume that I accidentally overwrite DO NOT FORGET!.text and fill it with garbage. If I happen to know a valid transaction id from just before the change occurred, I can restore the file by cding into the transaction and copying the old file over the new one, using ordinary tools available from the shell.
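
Sticking with the example from above, the whole rescue operation can be as mundane as this (assuming 0x0000000201e862e0 predates the mishap):

  % cp '@@0x0000000201e862e0/DO NOT FORGET!.text' 'DO NOT FORGET!.text'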

BEGIN and ROLLBACK with HAMMER

This is a very powerful concept and allows you to perform some very black magic. It sometimes happens that a database is not as self-contained as it should be. For instance, a database describing a photo collection or scanned images might choose to keep the actual files outside the database, someplace in the file system, for performance reasons. In this case, a naïve schema migration that just changes the database might not be enough.

Suppose it were decided that instead of saving your photos under a file name supplied by the user, they are to be stored under their SHA512 hash. Perhaps we want to prevent a malicious user from crafting file names in such a way as to cause us grief.
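
DragonFly ships a sha512 utility for computing such hashes; assuming it takes the -q flag like its BSD siblings, it prints just the digest (file name made up, digest shortened):

  % sha512 -q holiday.jpg
  f7fbba6e0636f890e56fbbf3283e524c…

With the layout used in the script below, this photo would then live under f/7/fbba6e0636f890e56fbbf3283e524c….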

In addition to migrating the database schema to account for the new file names, we would have to rename the files as well. In the database, we can wrap the whole thing in a transaction, but with the file system, we usually cannot. However, we can use HAMMER transactions to replicate exactly that—and it’s astonishingly easy. Before we perform any changes, we just ask for the transaction id of the file system containing the directory we will be working in. This will be our BEGIN TRANSACTION.

Then we do whatever we have to do: rename files, create new files, split directories and so on. If everything goes well, we’re done; there is nothing further to do. We might want to sync, to make sure that the changes end up on disk. This will be our COMMIT.

If something goes awry, all we have to do is re-visit that old transaction and copy everything in there over the ‘real’ directory. DragonFly provides the cpdup utility for precisely this purpose. Assuming that some big rename went badly, we can just:

  cd @@0x0000000201e862e0
  cpdup -i0 -v . ..

This will delete all files that appeared after the transaction, restore all files whose contents changed after the transaction and in general restore the whole directory tree to where it was before we made our changes. This will be our ROLLBACK—and this is where the magic is.2)

Scripting transactional file system updates

Now we have everything we need to amend our file system update script; it’s just a bunch of very short functions (in zsh):

transactional-fs.zsh
function begin {
    # Grab the current transaction and store it for later for eventual ROLLBACK.
    typeset -g tid=$( hammer synctid . )
}
 
function commit {
    # All’s well.  We could also decide to just do nothing in this case.
    sync
    exit 0
}
 
function rollback { 
    # Ow!  Everything’s on fire.  Quickly restore everything before master comes back.
    cd @@${tid}
    cpdup -i0 -v . ..
    exit 1
}

With these three functions, we can write upgrade scripts so that they either succeed or return to a clean state, without having to manually revert half-done upgrades if something goes wrong:

perform-update.zsh
source transactional-fs.zsh
 
begin
 
for each in **/*(.); do          # the ‘(.)’ glob qualifier picks plain files only
    hash=$(sha512 -q ${each})    # ‘-q’ prints just the digest
 
    # zsh can subscript scalars: fan the file out into
    # <first char>/<second char>/<rest of the hash>.
    first=${hash[1]}
    second=${hash[2]}
    rest=${hash[3,-1]}
    target=${first}/${second}
 
    if [[ ! -d ${target} ]] mkdir -p ${target} || rollback
    mv ${each} ${target}/${rest} || rollback
done
 
commit

This is just a quick proof-of-concept to illustrate my point. I will try to hash out that script a little bit more and to actually test it, but the general idea works—and has already saved me countless headaches.

That was too easy.

FIXME Actually, hammer synctid has some drawbacks and won’t work in this setup. The transaction id that it returns is guaranteed to be valid, but there is no guarantee as to how long it stays that way. If you happen to make some elaborate changes that take more than an instant, it may well be that your transaction has already been pruned by the time you want to roll back to it. If this happens, you’re screwed—but you can at least go back to the latest snapshot. It’s DragonFly, after all. As long as you kept the defaults, there will be a snapshot that is at most a day old (assuming the computer was running for the nightly cleanup cronjob).

This is because of the way the fine-grained history works, which I have not yet fully researched. As far as I know, it is kept for a minute or so, after which only ‘coarse’ history remains, which is still available. After even the coarse history is pruned, only snapshots remain. Otherwise, your HAMMER file systems would grow without bounds.
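
The periods involved are set in the per-PFS configuration that hammer cleanup works from; hammer viconfig shows and edits it. From memory, the defaults look roughly like this (first column: the operation; second: how often it runs; third: retention time for snapshots, a run-time cap for the maintenance jobs)—your version may differ:

  snapshots 1d 60d
  prune     1d 5m
  rebalance 1d 5m
  reblock   1d 5m
  recopy    30d 10m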

Most of the logic remains valid; only the begin procedure needs to find a better transaction id, perhaps by using undo to get ‘safe’ transaction ids.

% undo -i .
.: ITERATE ENTIRE HISTORY
        0x0000000201e96640 16-Nov-2016 16:50:40
        0x0000000201e9e690 16-Nov-2016 16:51:07
new-begin.zsh
function begin {
    # Ensure that pending changes have been flushed to disk.
    sync
 
    # Grab the most recent TID from the undo history.
    typeset -g tid=$(undo -i . | tail -1 | cut -d ' ' -f 1)
 
    # At this point, the tid still contains a leading tab:
    # ‘        0x0000000201e9e690’.
    tid=${tid#$'\t'}
}

This transaction id should be valid for a bit longer than just a few decaseconds.

1) This might not be 100% correct; I still have to research this a little bit more thoroughly.
2) Even though it might seem counterintuitive, cpdup will not be confused by being instructed to mirror a subdirectory onto its own parent. If you did something like this with a real directory, it would most likely end in chaos. But @@0x0000000201e862e0 doesn’t really exist; it is just something conjured up on the fly by the VFS. In other words, the directory inode of the parent won’t contain it, so when cpdup, rsync or whatever traverse the parent, they won’t see it—and thus won’t remove it.
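
You can convince yourself of this with a plain directory listing: the @@ entries never show up, not even with -a (using the example directory from above):

  % ls -a1
  .
  ..
  Birthdays.db
  DO NOT FORGET!.text
  other stuff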