Friday, July 06, 2007

Maximum size of the file under source control (continued)

In my previous post I wrote about the delta mechanism used to store file revisions in the TFS database and (somewhat) lamented the lack of configuration options and documentation for it.

Well, it seems that at least configuration is taken care of. As Richard Berg helpfully pointed out in the comments to that post, it is possible to specify a "deltaMaxFileSize" parameter in the web.config file of the Version Control web service (in a default installation, located in C:\Program Files\Microsoft Visual Studio 2005 Team Foundation Server\Web Services\VersionControl).
The value of this key is the maximum file size (in bytes) on which the delta algorithm is performed. For example, the following setting is equivalent to the default (16 MB):

    <add key="deltaMaxFileSize" value="16777216" />
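Since the value is given in bytes, it is easy to slip on the arithmetic when picking a different threshold. Here is a minimal Python sketch that renders the entry for a size given in megabytes; the function name is my own invention for illustration, and it assumes binary megabytes (1 MB = 1024 * 1024 bytes), which is what the default value of 16777216 corresponds to:

```python
def delta_max_file_size_entry(megabytes: int) -> str:
    """Render the web.config entry for a threshold given in binary megabytes.

    Hypothetical helper, not part of TFS: it only does the byte arithmetic.
    """
    size_in_bytes = megabytes * 1024 * 1024
    return f'<add key="deltaMaxFileSize" value="{size_in_bytes}" />'


# 16 MB reproduces the default entry shown above.
print(delta_max_file_size_entry(16))
```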

I did play around with that setting a bit: I created a ZIP file full of small 1 KB files and put it through a check-out/check-in cycle, removing a single file from the archive for each new revision. I gauged the database size by viewing the properties of the TfsVersionControl database in SQL Server Management Studio. The setting worked as expected, and the reverse delta algorithm indeed works amazingly well: with delta enabled, the database size remained effectively constant as I added new revisions of the file.
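For anyone who wants to repeat the experiment, the test data can be generated along these lines. This is only a sketch of the archive side, under my own assumptions: the file names, the random file contents and the choice to drop the last entry are illustrative, and the check-out/check-in steps against TFS are not shown.

```python
import io
import os
import zipfile


def build_archive(file_count: int, file_size: int = 1024) -> bytes:
    """Build an in-memory ZIP holding file_count files of file_size bytes each."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for index in range(file_count):
            # Random contents are an assumption; any small files would do.
            archive.writestr(f"file_{index:04d}.bin", os.urandom(file_size))
    return buffer.getvalue()


def next_revision(archive_bytes: bytes) -> bytes:
    """Return a copy of the archive with its last entry left out."""
    source = zipfile.ZipFile(io.BytesIO(archive_bytes))
    buffer = io.BytesIO()
    with source, zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as target:
        for name in source.namelist()[:-1]:
            target.writestr(name, source.read(name))
    return buffer.getvalue()


# Each call to next_revision() yields the bytes to check in as the new revision.
revision = build_archive(file_count=100)
revision = next_revision(revision)
```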

So, as it turned out, you can configure the algorithm; I'd say it is worth doing if you contemplate storing revisions of large binary files. A word of caution though: as the setting has yet to find its way into the official TFS documentation, it is not supported at the moment.

Update: Important remark (courtesy of Buck Hodges) on side effects of changing the default value: "You'll also want to think really hard about setting the value any larger. The library that does this doesn't consume memory linearly (it's CPU intensive as well). It's not hard to run your server out of memory when you least expect it."
