2011-04-14

Reusability and discoverability of LINQ Queries

I agree with Ayende's point that Repositories and DALs are pretty much a thing of the past.

Usually my LINQ queries contain 3 distinct parts:

  • Some filtering
  • Some transformation
  • Some selection

If I search for LINQ reusability I get some hits where all 3 parts are reused together. But in my experience it is extremely rare that I need the same 3 parts in more than one place.

In my experience it is often the filtering part that is used in many places. I did not have a good solution for this until I read Ayende's post.

After some thought I came up with the following.

linq1

linq2

linq3
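
The original code screenshots (linq1, linq2, linq3) are no longer available, so here is a minimal sketch of the idea, assuming a simple Customer type (the filter names are mine, not from the post): each reusable where clause lives in an extension method on IQueryable<T>.

using System;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
    public string Country { get; set; }
    public bool IsActive { get; set; }
}

public static class CustomerFilters
{
    // Each filter encapsulates only a Where clause, so it stays
    // composable with whatever transformation/selection follows.
    public static IQueryable<Customer> Active(this IQueryable<Customer> customers)
    {
        return customers.Where(c => c.IsActive);
    }

    public static IQueryable<Customer> FromCountry(this IQueryable<Customer> customers, string country)
    {
        return customers.Where(c => c.Country == country);
    }
}

public static class Example
{
    public static void Run(IQueryable<Customer> customers)
    {
        // Filtering is reused and chained; transformation and selection
        // stay local to the query that needs them.
        var names = customers
            .Active()
            .FromCountry("DK")
            .OrderBy(c => c.Name)
            .Select(c => c.Name)
            .ToList();
    }
}
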
Notice how the filtering is reusable and chainable in LINQ. The code also reads very well, and the where clauses are easily discoverable with IntelliSense.

This approach is very lightweight and flexible, just like code should be.

Windows Home Server 2011–How I use it

I recently added a Western Digital hard disk with a new sector size to my Windows Home Server 2003. Within a week it behaved very strangely. Afterwards I found out I had to set a jumper on the disk before using it in a pre-Windows 7 machine.

So I installed a fresh Windows Home Server 2011 on my excellent Tranquil PC BBS. Installation went smoothly and I was up and running. I added my wife's laptop, my laptop, the children's laptop and the MCE to automatic backup.

I really don't want to open any ports in my router; I don't think it's worth the security risk. I think MS should have shipped a free remote access service with WHS that did the relaying. I have used Hamachi before, but it is too slow and unstable unless some ports are open in the router. Now I use Remobo, which works great and is very fast; it's free unless you need to run it as a Windows service. Installation was easy and it just works.

Currently the server sports a 2TB drive and a 1.5TB drive. In the old WHS, Drive Extender made all the data available as a single 3.5TB unit, but the new WHS does not support this. Many have declared WHS dead because this feature was removed, but I don't get it. I actually like the simple, transparent way much better. I only have 2 media types that require TBs of space: DVD rips and Recorded TV. When I have to grow I just add a "Recorded TV 2" share and have my MCE point to that too; no one will notice.

I am a firm believer that most work should be done on the local machine and not over the network. So my wife's documents and all the family pictures are stored on her laptop. She sometimes deletes files or removes text from documents. With WHS I only have a nightly backup, so I needed something to continuously back up files to the home server share. I ended up using AutoVer; it's free and automagically creates a copy of each changed file with a date extension. This costs some extra space, but it is worth it. Now I capture every change made to her files, not just a nightly image.

For off-site backup (you do keep your backup at another location, right?) I use 2 eSATA enclosures with 1.5TB HDDs in them. I keep one at home attached to the server and one at work, locked up. Every Monday I pick up the one at home and swap it with the one at work. This kind of backup works great in case of fire, virus or hacker attacks, since the disk is offline and at another location. With the old WHS I could mount the backup HDD in any other computer and see the files, but the new one hides the data in an unknown format. I liked the old transparent way; now I have to use WHS to restore a file.

I have some critical data, the family pictures/videos, about 60GB of it. So I asked on Super User how I should make a backup of those files to the cloud. The answer was a great product called CloudBerry Backup. It's optimized for S3 usage, so it does not use too much bandwidth or space. All files can be downloaded from Amazon S3 afterwards with any tool, unlike Mozy and the like, where you are dependent on their tools for restores. CloudBerry Backup offered a simple and transparent way of doing cloud backup. Installation and setup was silly easy. I set mine to run every night from 00:00 to 06:00 (it takes a few nights to upload 60GB of data). I get an email report every day specifying total data and how much was transferred. CloudBerry Backup just picks up where it left off the day before; even large files are uploaded in chunks, so no bandwidth is wasted when it resumes the next night.

So there you have it: using a few extra tools I just made my WHS even more valuable to me.

2010-03-17

Embedding a TFS Build label in all assemblies

The default TFS build label does not have an xx.xx.xx.xx pattern, but looks like “MyBuild_20070418.1”. So we can't assign the value to the AssemblyVersion or AssemblyFileVersion attributes; however, the AssemblyInformationalVersion attribute can be used, as it is a string.

<Target Name="AfterGet">
  <!-- Insert build label into assemblies -->
  <Message Text="Modifying AssemblyInfo files under &quot;$(SolutionRoot)&quot;." />

  <Message Text="Undo checkout assemblyinfo files, make sure no files are checked out." />
  <Exec Command="&quot;C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\tf.exe&quot; undo *.* /recursive /noprompt"
        WorkingDirectory="$(MSBuildProjectDirectory)\..\sources" ContinueOnError="true"/>

  <Message Text="Checkout assemblyinfo files" />
  <Exec Command="&quot;C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\tf.exe&quot; checkout &quot;AssemblyInfo.*&quot; -r"
        WorkingDirectory="$(MSBuildProjectDirectory)\..\sources" ContinueOnError="true"/>

  <ItemGroup>
    <AssemblyInfoFilesCs Include="$(SolutionRoot)\**\assemblyinfo.cs" />
  </ItemGroup>

  <Message Text="Adding $(LabelName) to %(AssemblyInfoFilesCs.fullpath)" />

  <WriteLinesToFile
          File="%(AssemblyInfoFilesCs.fullpath)"
          Lines="[assembly: AssemblyInformationalVersion(&quot;$(LabelName)&quot;)]"
          Overwrite="false"/>
</Target>

<Target Name="AfterCompileSolution">
  <Message Text="Undo checkout assemblyinfo files" />

  <Exec Command="&quot;C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\tf.exe&quot; undo *.* /recursive /noprompt"
        WorkingDirectory="$(MSBuildProjectDirectory)\..\sources" ContinueOnError="false"/>
</Target>

With this in place all assemblies (dll, exe) will have an AssemblyInformationalVersion attribute with the label of the TFS build. So it should be a no-brainer to trace a dll to a point in time in TFS Source Control. Just open the dll in Reflector and disassemble it.

reflector

An alternative to AssemblyInformationalVersion would be AssemblyConfiguration, but then we would have to use a task that can do a text replace, as that attribute is usually already defined in AssemblyInfo.cs files.
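
If you would rather read the label programmatically than open Reflector, a minimal sketch (not from the original post) using plain reflection:

using System;
using System.Reflection;

class ShowBuildLabel
{
    static void Main()
    {
        // Reads the AssemblyInformationalVersion of the current assembly.
        Assembly assembly = typeof(ShowBuildLabel).Assembly;
        object[] attributes = assembly.GetCustomAttributes(
            typeof(AssemblyInformationalVersionAttribute), false);

        if (attributes.Length > 0)
        {
            var info = (AssemblyInformationalVersionAttribute)attributes[0];
            Console.WriteLine("Build label: " + info.InformationalVersion);
        }
    }
}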

2010-03-02

SQL Server Restore database with exclusive access

ALTER DATABASE MyDatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE

RESTORE DATABASE MyDatabase FROM DISK = 'X:\Path\mybackup.bak'

In effect this rolls back any pending transactions, and gives you exclusive access to the database.

2010-02-04

Effective Logging

Runtime diagnostics

There are 3 major ways to diagnose code at runtime.

runtime

I think most .NET developers use the debugger every day in Visual Studio, and automated tests (actually mostly unit tests) get a lot of blog space. But interestingly, logging is not mentioned often. In some scenarios one way is more useful than the other two, and sometimes it does not matter which one is used. But all 3 ways have distinct advantages that can be used to ease our busy developer lives.

Advantages

If your application has crashed, it does not matter how many unit tests you have or how good you are with the debugger. A good log will help you identify the problem.

In a live application the performance degradation (the amount of log messages) can be controlled by setting a log level, without even restarting the application. Unit tests do not do much for a live application, and it is usually not wise to attach a debugger to one.

Statistics can be gathered from logs, like key performance indicators, number of exceptions etc. WMI can also be used, but that is actually just another kind of logging (internal vs. external).

Logs (often log files) act as an alternative UI; administrators, supporters etc. can read them and see how the application is being used without attaching to the process.

If there is a good log, there is no reason to call the user and start a “shared screen” session just to see what they are doing.

Messages

messages

Don't just output values; describe what the values mean. When reading the log you don't have the source code to help you - only the previous and next message(s).
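
The messages screenshot is missing; a minimal sketch of the difference, assuming log4net and illustrative parameter names:

using log4net;

class OrderService
{
    static readonly ILog log = LogManager.GetLogger(typeof(OrderService));

    void Submit(int orderId, int retryCount, int maxRetries)
    {
        // Bad: raw values with no meaning attached.
        log.Debug(orderId + " " + retryCount);

        // Good: the message explains what the values mean.
        log.DebugFormat(
            "Submitting order {0} to the payment gateway, attempt {1} of {2}.",
            orderId, retryCount, maxRetries);
    }
}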

Logging as a global error handler

As far as I can tell this is where most applications have some form of logging. The implementation looks different depending on the type of application.

In a WinForms app it looks something like this.

globalerrorhandlingwinforms
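
The screenshot is missing; a minimal sketch of what such a WinForms handler might look like, assuming log4net (the wiring is mine, not from the original post):

using System;
using System.Windows.Forms;
using log4net;

static class Program
{
    static readonly ILog log = LogManager.GetLogger(typeof(Program));

    [STAThread]
    static void Main()
    {
        // Route all unhandled exceptions to the logger.
        Application.ThreadException += (sender, e) =>
            log.Fatal("Unhandled UI thread exception.", e.Exception);
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            log.Fatal("Unhandled non-UI thread exception.", e.ExceptionObject as Exception);

        Application.Run(new Form()); // your main form here
    }
}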

And in an ASP.NET app it could look like this.

globalerrorhandlingweb
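
Again the screenshot is missing; a minimal sketch of the ASP.NET counterpart in Global.asax.cs, assuming log4net:

using System;
using System.Web;
using log4net;

public class Global : HttpApplication
{
    static readonly ILog log = LogManager.GetLogger(typeof(Global));

    protected void Application_Error(object sender, EventArgs e)
    {
        // Server.GetLastError() returns the exception that escaped the request.
        Exception exception = Server.GetLastError();
        log.Fatal("Unhandled exception for request " + Request.RawUrl, exception);
    }
}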

However, this is at the application boundary, and it should be the only place in your code where the base Exception type is caught. All uncaught exceptions should be allowed to bubble up to the global error handler and be properly logged.

Logging is not error handling for the user

errorhandling

Don't log an exception just to re-throw it. But if you actually catch a specific exception type, always log the exception. The fact that the exception was caught could come in handy when diagnosing a problem with the application.
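
A minimal sketch of the two cases, assuming log4net (the types and paths are illustrative):

using System;
using System.IO;
using log4net;

class SettingsReader
{
    static readonly ILog log = LogManager.GetLogger(typeof(SettingsReader));

    // Don't: logging and re-throwing just produces duplicate noise;
    // the global error handler will log it anyway.
    string ReadBad(string path)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (Exception ex)
        {
            log.Error("Could not read settings.", ex);
            throw;
        }
    }

    // Do: when a specific exception is caught and handled, log it,
    // because the handling hides the fact that something went wrong.
    string ReadGood(string path)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (FileNotFoundException ex)
        {
            log.Warn("Settings file not found, falling back to defaults.", ex);
            return string.Empty;
        }
    }
}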

Decision Points

Decision points are where the code branches: if, switch and the like.

baddcisionpoints

Don’t log inside a decision point, and don’t log where there is no decision point. Also don't log values that are not used for the decision.

gooddcisionpoints

Do log before a decision point, and do log the value(s) used to determine which branch is taken. Do log all the values in a single formatted message. There are more decision points than “if”s; logical operators also act as decision points.
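
A minimal sketch of logging before a decision point, with all values in one formatted message (the discount rules are illustrative, log4net assumed):

using log4net;

class DiscountCalculator
{
    static readonly ILog log = LogManager.GetLogger(typeof(DiscountCalculator));

    decimal Calculate(decimal orderTotal, bool isPreferredCustomer, int itemCount)
    {
        // One message, before the branch, with every value the decision uses.
        log.DebugFormat(
            "Calculating discount: orderTotal={0}, isPreferredCustomer={1}, itemCount={2}",
            orderTotal, isPreferredCustomer, itemCount);

        if (isPreferredCustomer && orderTotal > 1000m)
        {
            return orderTotal * 0.10m;
        }

        return itemCount > 10 ? orderTotal * 0.05m : 0m;
    }
}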

Things to log even though it is not a decision point

When the application uses out-of-process resources (the file system, starting a process, etc.), do log the resource used.

outofprocess
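
The screenshot is missing; a minimal sketch, assuming log4net (the report example is mine):

using System.Diagnostics;
using log4net;

class ReportExporter
{
    static readonly ILog log = LogManager.GetLogger(typeof(ReportExporter));

    void OpenReport(string reportPath)
    {
        // Log the exact out-of-process resource before touching it.
        log.InfoFormat("Opening report file \"{0}\" with the default viewer.", reportPath);
        Process.Start(reportPath);
    }
}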

This could help an administrator understand how the application works.

 

Another thing to always log is the use of DateTime.Now.

datetimenow

DateTime.Now problems are often elusive and hard to replicate in a debugging environment; having the exact value helps you understand under what conditions a problem can occur.
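
A minimal sketch, assuming log4net (the maintenance-window rule is illustrative):

using System;
using log4net;

class MaintenanceWindow
{
    static readonly ILog log = LogManager.GetLogger(typeof(MaintenanceWindow));

    bool IsInMaintenanceWindow()
    {
        DateTime now = DateTime.Now;

        // Log the exact value used for the decision so the elusive
        // time-dependent behaviour can be reconstructed from the log.
        log.DebugFormat("Checking maintenance window, DateTime.Now={0:O}", now);

        return now.Hour >= 2 && now.Hour < 4;
    }
}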

 

Points of indirection (like events), where there is no clear path back to the calling code. It can be helpful to see which events are raised and in what order.

indirection
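
The screenshot is missing; a minimal sketch of logging around an event, assuming log4net (the ImportJob type is mine):

using System;
using log4net;

class ImportJob
{
    static readonly ILog log = LogManager.GetLogger(typeof(ImportJob));

    public event EventHandler Completed;

    void OnCompleted()
    {
        // Log that the event is raised; subscribers are resolved at runtime,
        // so the call path is not obvious from the source.
        log.Debug("Raising Completed event.");

        EventHandler handler = Completed;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}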

Assertion

Don't use logging for assertions; a manual test with the debugger or a repeatable automated test is better suited for that.

Message Context

Every log message should have context values that distinguish this usage from others. Often things like:

  • Timestamp
  • Level (debug, trace, info, warning, error, fatal)
  • Thread
  • Principal (username)
  • Url (for web apps)
  • Request hashcode (for web apps)
  • Session hashcode (for web apps)

Using these message contexts, it should be easy to grab interrelated messages for analysis.
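
With log4net, for example, such context can be attached as properties and emitted by the appender's pattern layout (e.g. %date, %thread, %level, %property{user}); a minimal sketch, with property names that are mine:

using log4net;

class RequestLogger
{
    static readonly ILog log = LogManager.GetLogger(typeof(RequestLogger));

    void Handle(string url, string username)
    {
        // Attach context once; every message logged on this thread can
        // include it via %property{user} / %property{url} in the layout.
        ThreadContext.Properties["user"] = username;
        ThreadContext.Properties["url"] = url;

        log.Info("Handling request.");
    }
}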

Conclusion

Don't neglect logging just because you're more familiar with the debugger, or because you and your agile team have the code base covered with automated tests. In some scenarios logging is a vital view into your running application.

Recommended Reading

http://nirajrules.wordpress.com/2008/06/14/blogging-best-practicies/

http://www.kdgregory.com/index.php?page=java.logging

http://esmithy.net/2004/11/09/effective-logging/

http://www.ibm.com/developerworks/java/library/j-logging/

2010-01-27

How to restore to another database with full text index

I have been using this little trick for some time now.

The other day I had to restore a backup to a database with full text indexes.

First we need to know the files included in the backup.

RESTORE FILELISTONLY FROM DISK='C:\backup.bak'

Next we need to move the additional files to another location.

RESTORE DATABASE myDatabase FROM DISK = 'C:\mybackup.bak' WITH REPLACE,
MOVE 'ftname_ft' TO 'c:\newpath\ftname.ft'

2010-01-26

Optimizing build times

Background

I am currently working on a large winforms code base.

It has 60+ Visual Studio projects in a single Solution.

The project dependencies are “V” shaped: many projects depend on a few projects, which depend on even fewer projects. So the many projects can be built in parallel after the few projects have been built in sequence.

Getting started

First of all I needed to get a base measurement of how long the build takes.
I don't trust ReBuild to not cheat, so I deleted all source files on disk and got them again from source control.
Next, Visual Studio does not show build times, so I used MSBuild from the console. It is very easy, because a sln file is an MSBuild file, e.g. “msbuild my-solution.sln”, and the total build time is shown when it finishes.

TimeElapsed 

Analysis

So my build was taking approximately 4 minutes, and I needed to find the low-hanging fruit that would make a big impact with a minimum of work and change.
Adding a parameter to MSBuild helped me do that, e.g. “msbuild my-solution.sln /v:diag”. Now the build took longer, but I could see what it was doing. One thing immediately caught my attention: the creation of a COM wrapper for MSHTML.dll. It took almost a minute; that's 25% of the total time. So I changed the reference from the COM component to the PIA (Primary Interop Assembly).

During my analysis I repeatedly had to delete the local files and get them from source control, and I noticed that my source code files and 3rd party dlls were approximately 200MB. After the build was done the folder was almost 3GB!
The culprit was a little property called “copy local”, automatically set to true for every reference in Visual Studio (except for the system references).
So I set all project references to copy local = false, and the disk writes went down – way down.

BuildSize

But I had created a new problem: I could not start the program any longer, because the main exe did not have all the referenced dlls copied next to it (the Windows services and websites did not start either). So in order to enable “F5” again, I set all the projects' output paths to the same bin folder.

Hardware

My main laptop shipped with a 64GB Samsung SSD, so I needed an additional hard drive for my VMs and growing code base. I ordered an Icy Box enclosure with USB power and eSATA, and a WD VelociRaptor 10000RPM hard drive (the fastest non-SSD out there?). But the greedy VelociRaptor was too much for the USB power, so I ended up ordering a WD Black 7200RPM HDD.

So basically I had an SSD, a fast 7200RPM drive and a 10000RPM drive, so why not let them duke it out :-)
Actually I added another option, a RAM drive like this one: http://stevesmithblog.com/blog/using-ramdisk-to-speed-build-times/

duke-it-out

Build times in seconds, less is better.

The first thing to notice is that the SSD is the slowest of all the disks! Now I am no expert, but the SSD runs the OS and all the applications, so it has other duties too. It is also kind of an early SSD; I think the newer Intel X25 has better performance for builds (lots of small reads and writes). It is a really good disk for my OS and applications; everything runs really smoothly.

The next thing to notice is that the RAM drive's advantage flattens out as there is less disk activity; it looks like something else becomes the bottleneck (maybe the compiler is greedy on the RAM bus?).

The last thing to notice is that the 7200RPM and 10000RPM drives are almost equally fast when building to the same bin folder with copy local = false. And the 7200RPM drive is a lot cheaper.

Conclusion

Remember, these exact changes are NOT universally applicable. But looking for things that take a long time to compile, like the COM reference, or reducing disk reads/writes, is. Going from 4 minutes to under 1 minute is not too bad.