Monday, December 11, 2006

Transactional File System in Windows Vista (Part 2)

In my last post I tried to describe what I think is the single most important issue with the implementation of TxF (and TxR).

Now I want to explain how I think the pre-existing file name-based APIs could have been changed (or could still be changed in future versions) to allow for opting-in Transactions for virtualy any code that works with files, including things like System.Data.DataSet.WriteXml(String).

I take this just as an exercise for my mind, as I am almost sure that someone thought about this solution, but then discarded it for a reason I cannot discern.

First of all, let’s say that all code running inside or on top of Windows, when it needs to access the file system, ends up invoking on of a relatively small set of Win32 APIs. Of those, some are file-name based (like CreateFile, DeleteFile, SetFileAttributes, etc.) and some are file-handle based (like GetFileSize, ReadFile, WriteFile, etc.).

Second, you always have to call first a name-based API to get a handle you can use with a handle-based function.

Third, no new Transacted versions of handle-based APIs were created. Instead, handle-based APIs get transactional behavior only if the handle passed is already associated with a transaction.

(Still, it bugs me why some name-based APIs were not replicated but instead behave like in the beta 2 model, becoming transactional depending on the ambient transaction).

So, you only signal that want to participate in an existing transaction at the time you call name-based APIs. In the current model, the way to do it is calling the transactional version of the function (like CreateFileTransacted, DeleteFileTransacted, SetFileAttributesTransacted, etc).

But what all those functions have in common in the first place, is that they receive a file name as a parameter!

(Edit: What follows is my proposition, not how TxF/TxR works in Vista. Somehow, after some editing I got the text wrong. )

What if we change rules a little: If the file name is prefixed with a moniker like, for instance “txf:” or “txfile:”, then it becomes transactional. You can understand it as designating a new namespace for TxF, one that points to the same file system store, but behaves differently.

There are already many rules about file names. However, for me it makes sense to add just one more in this case. Of course the exact form of the prefix is not important (NTFS seems to prefer other kind of prefixes).

By this plan, calling CreateFile(“txf:foo.txt”…) would be equivalent to calling CreateFileTransacted(“foo.txt”…). Something similar could be done with TxR.

I think this change would integrate perfectly with the model that shipped in Vista, meaning that you could mix and match calls to the Transacted functions with calls to the “normal” functions with the moniker prepended to the file name.

Also, existing client code and code “hidden” across the programming stack would not need to opt-out. It would be automatically be not transactional because its hard-coded file names (or registry key names) would not contain the transactional moniker.

But the real benefit would be that this could enable us to do things like myDataSet.WriteXml(“txf:foo.xml”) or [insert any other function that takes a file name as a parameter](“txf:bar.ext”), without waiting for a revision of them. Virtually all the programming stack could enjoy the benefits of TxF, without requiring modifications.

I can only think of a few functions that try to parse the file name and that could fail on the presence of the prefix, but I think those would be rare exceptions.

What do you think? If you like it, you can vote for this as I entered a closely related suggestion to the .NET Framework team in Microsoft Connect.

No comments:

Moving to MSDN

I haven't decided yet, but it is very likely that I will stop blogging here for some time. For some background, I have moved to the sate...