Aquarium Blog

Thursday, December 14, 2006

Broken Windows (Part 1)

Today at work, someone I respect very much told me that I write very well (in Spanish) and that I am both inspiring and motivating to others. She also said that those were very rare qualities in a developer.

I have seldom been aware of this. Actually I believe that the real underlying fact is that I cannot go to work every morning if I am bored with it and I don't feel at ease. I think I could not keep a boring job for a week, just for the money. So, when I start getting bored, I have to do whatever it takes to make it fun again. People who see me understand very quickly that I work for the fun of it, and that I love what I do. I don't know for sure, but when I see someone who seems to love what she or he does, it helps me keep my own fire alive.

On Tuesday afternoon, I talked on the phone with a very special group of people. Now I wish I could have a second chance to tell them:

See guys? Someone I work with thinks I am inspiring and motivating! John? Are you there? At least you have my blog address... :)

Well, the writing my office mate was referring to is an email I wrote to many developers in our organization about a rather basic topic that we needed to reinforce. I have been involved a lot in code quality initiatives lately. I need to translate the text in order to publish it, so please wait until my next post.

Monday, December 11, 2006

Jon Udell gets assimilated! ;)

I have been a fan of Jon for 18 years. This is great news. I think he will have a very important role in the ongoing Microsoft change.

With Rory and Jon on board, Channel 9/10 looks like the dream team.

Now, let's see how I do it this week...

Transactional File System in Windows Vista (Part 2)

In my last post I tried to describe what I think is the single most important issue with the implementation of TxF (and TxR).

Now I want to explain how I think the pre-existing file name-based APIs could have been changed (or could still be changed in future versions) to let virtually any code that works with files opt in to transactions, including things like System.Data.DataSet.WriteXml(String).

I take this purely as a mental exercise, as I am almost sure that someone already thought of this solution and then discarded it for a reason I cannot discern.

First of all, let’s say that all code running inside or on top of Windows, when it needs to access the file system, ends up invoking one of a relatively small set of Win32 APIs. Of those, some are file-name based (like CreateFile, DeleteFile, SetFileAttributes, etc.) and some are file-handle based (like GetFileSize, ReadFile, WriteFile, etc.).

Second, you always have to call a name-based API first to get a handle you can use with a handle-based function.

Third, no new Transacted versions of handle-based APIs were created. Instead, handle-based APIs get transactional behavior only if the handle passed is already associated with a transaction.

(Still, it puzzles me why some name-based APIs did not get a Transacted twin but instead behave as in the Beta 2 model, becoming transactional depending on the ambient transaction.)

So, you only signal that you want to participate in an existing transaction at the time you call name-based APIs. In the current model, the way to do it is to call the transactional version of the function (like CreateFileTransacted, DeleteFileTransacted, SetFileAttributesTransacted, etc.).

But what all those functions have in common in the first place, is that they receive a file name as a parameter!

(Edit: What follows is my proposal, not how TxF/TxR works in Vista. Somehow, after some editing, I got the text wrong.)

What if we change the rules a little: if the file name is prefixed with a moniker like, for instance, “txf:” or “txfile:”, then the call becomes transactional. You can understand it as designating a new namespace for TxF, one that points to the same file system store, but behaves differently.

There are already many rules about file names. However, for me it makes sense to add just one more in this case. Of course the exact form of the prefix is not important (NTFS seems to prefer other kinds of prefixes).

By this plan, calling CreateFile(“txf:foo.txt”…) would be equivalent to calling CreateFileTransacted(“foo.txt”…). Something similar could be done with TxR.

I think this change would integrate perfectly with the model that shipped in Vista, meaning that you could mix and match calls to the Transacted functions with calls to the “normal” functions with the moniker prepended to the file name.

Also, existing client code and code “hidden” across the programming stack would not need to opt out. It would automatically be non-transactional because its hard-coded file names (or registry key names) would not contain the transactional moniker.

But the real benefit would be that this could enable us to do things like myDataSet.WriteXml(“txf:foo.xml”) or [insert any other function that takes a file name as a parameter](“txf:bar.ext”), without waiting for a revision of them. Virtually all the programming stack could enjoy the benefits of TxF, without requiring modifications.
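A minimal sketch of the prefix idea, written in Python only so that it runs anywhere. Nothing here is a Windows API: `create_file`, `create_file_transacted` and `write_xml` are toy stand-ins, and the "txf:" moniker is the proposal from this post, not something Vista recognizes. The post does not say where the transaction would come from when only a file name is passed, so the sketch assumes a per-thread current transaction.

```python
import threading

TX_PREFIX = "txf:"  # moniker proposed in the post; Windows does not recognize it
_current = threading.local()  # assumption: the transaction is looked up per thread


def set_current_transaction(tx):
    """Hypothetical helper that makes `tx` the current transaction for this thread."""
    _current.tx = tx


def create_file_transacted(name, tx):
    """Toy stand-in for CreateFileTransacted: a handle bound to a transaction."""
    return {"name": name, "transaction": tx}


def create_file(name):
    """Toy stand-in for CreateFile that honors the proposed prefix."""
    if name.startswith(TX_PREFIX):
        tx = getattr(_current, "tx", None)
        if tx is None:
            raise RuntimeError("no transaction available for a txf: name")
        # Strip the moniker and forward to the explicit, transacted entry point.
        return create_file_transacted(name[len(TX_PREFIX):], tx)
    # Unprefixed names stay non-transactional, even while a transaction exists.
    return {"name": name, "transaction": None}


def write_xml(file_name):
    """Hypothetical high-level API that only ever sees a file name."""
    return create_file(file_name)
```

With a current transaction set, `write_xml("txf:foo.xml")` ends up with a handle bound to that transaction, while `write_xml("foo.xml")` behaves exactly as before. The high-level function itself did not change, which is the point of the proposal.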

I can only think of a few functions that try to parse the file name and that could fail on the presence of the prefix, but I think those would be rare exceptions.

What do you think? If you like it, you can vote for this as I entered a closely related suggestion to the .NET Framework team in Microsoft Connect.

Wednesday, December 06, 2006

Transactional File System in Windows Vista

In November 1998, Microsoft Transaction Server was about a year old and SQL Server 7 was just arriving. My task at hand was to code a small CRM-like application in Visual Basic 5. Among other features, it had to upload unstructured documents and keep them linked to rows in a database.

I had one important decision to make: Should those documents be stored in the database itself or in the server file system?

SQL Server 6.5 had a lot of limitations with its lack of row locking and some performance issues with BLOB columns.

On the other hand, the file system lacked transactional capabilities, and I lacked the ability to create a Compensating Resource Manager.

A transactional file system would have been super useful.

In November 2006, eight years later, Windows Vista is available. Transactions were introduced as a new feature of NTFS, named TxF. The Windows Registry is also getting support for transactions in Vista, under the name of TxR.

Before TxF, for instance, if you wanted to get ACID-like behavior from multiple file system operations, you could, but you had to fiddle a lot with temporary files, renaming, etc. With TxF you just issue something like a "begin transaction", then do your stuff in NTFS, and last, you commit or roll back the whole thing.
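The "temporary files, renaming" approach can be sketched as follows, assuming a single file and using Python for portability. The function name `atomic_write_text` is made up for illustration. The new content goes to a temporary file in the same directory, and a final rename swaps it into place, so readers see either the old file or the new one.

```python
import os
import tempfile


def atomic_write_text(path, text):
    """Replace the contents of `path` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same volume) as the
    # target, otherwise the final rename cannot be a single step.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, path)  # the "commit": swap the new file into place
    except BaseException:
        # The "rollback": discard the temporary file, leave the target untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

This covers one file only. Making several such operations succeed or fail together is where the fiddling starts, and that is the part a transaction takes off your hands.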

This way, TxF hides those best practices under the hood and lifts the developer one level of abstraction up when working with files.

Surendra Verma, Developer Manager in the CFS group, explained how TxF/TxR works on Channel 9 some months ago. But it is interesting to note that after the video was recorded, there were major design changes to TxF/TxR.

As Jim Johnson explains in his first, second and third posts, from Beta 2 to RC1 the TxF API changed from an "implicit transaction enlistment" model that worked with the existing Win32 file APIs to a more explicit model for which new "Transacted" versions of some APIs were added.

In the first version, you just did something like:

EnterTransactionScope();
// do whatever file work with your favorite file APIs
ExitTransactionScope();

Everything you did in the middle got automatically enlisted in a thread-specific ambient transaction.

In the new model, you have to do something like this (some function names were invented):

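hTransaction = GetTransactionHandle();
hFile = CreateFileTransacted(... hTransaction ...);
// do whatever, but now using the new *Transacted APIs
CloseHandle(hFile);
CommitTransaction(hTransaction); // closing the handle without this rolls back
CloseHandle(hTransaction);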

The complete listing of APIs that were affected by TxF is here.

If you take a look at it, all the APIs for which a new Transacted version was created are file name-based. Besides, some existing APIs were updated and are now transaction aware, meaning that they acquire transactional behavior in the presence of a file handle that is associated with a transaction (for file handle-based APIs) or in the presence of a thread-level ambient transaction (for yet another group of file name-based APIs).

The reason Microsoft changed models, as explained by Surendra in the discussion of the video on Channel 9, is that the simpler original version had a major drawback:

Between any pair of EnterTransactionScope()/ExitTransactionScope() calls, every single file or registry operation made by any code, even code lost in the middle of the programming stack, was automatically and forcefully enlisted in the ambient transaction, acquiring a behavior that was often not intended at the time such code was written.

You could not opt out.
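A toy model of that drawback, in Python rather than Win32 and with every name invented for illustration (`TransactionScope`, `delete_file`, `library_cleanup`). It only records which operations would be enlisted. The point is that code written without transactions in mind gets pulled in as soon as its caller opens a scope.

```python
import threading

_ambient = threading.local()  # hypothetical per-thread "ambient transaction" slot


class TransactionScope:
    """Toy stand-in for the EnterTransactionScope()/ExitTransactionScope() pair."""

    def __init__(self):
        self.enlisted = []

    def __enter__(self):
        _ambient.current = self
        return self

    def __exit__(self, exc_type, exc, tb):
        _ambient.current = None
        return False  # never swallow exceptions


def delete_file(name):
    """Toy stand-in for a name-based file API; it only records what would happen."""
    tx = getattr(_ambient, "current", None)
    if tx is not None:
        # Enlisted whether or not the calling code ever asked for it.
        tx.enlisted.append(("delete", name))
        return "transacted"
    return "immediate"


def library_cleanup():
    """Code deep in the stack, written long before transactions existed."""
    return delete_file("cache.tmp")
```

Called on its own, `library_cleanup()` deletes immediately. Called inside `with TransactionScope():`, the very same function has its delete enlisted, and nothing in its own code can prevent that.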

So, if implicit transactions mean that current code will break or misbehave, it is good that they abandoned this path.

On the other hand, the main tradeoff of the new version, in my opinion, is that it is "too explicit":

Only those new APIs and those that have been changed will get transactional behavior. So, the hundreds, if not thousands, of higher level APIs that somehow affect the file system, won't get the possibility of having transactional behavior until the whole stack gets updated.

Now, you cannot opt in.

For a .NET developer like me, this means that I have to use a lot of interop, or wait until new versions of System.IO.FileStream, methods like System.IO.File.Delete, and even the brand new APIs in System.IO.Packaging get revised to include the option of using transactions.

I have been thinking of a deceptively simple change they could make to the existing file name-based APIs that could solve this issue. I must be missing something, or they would have implemented it in Vista.

I tried to discuss my idea with Surendra, but he is probably on vacation after shipping Vista :)

I will try to explain the idea in my next post...