Category Archives: SQL

Intellisense (AutoComplete) Broken in Sql Server 2008 R2

If your Intellisense in SQL Server Management Studio just stopped working, chances are you have applied Visual Studio 2010 SP1. This is because there is a compatibility issue between Microsoft SQL Server 2008 R2 Management Studio (SSMS 2008 R2) and the Transact-SQL Language Service that is included in Visual Studio 2010 SP1.

The good news is there is a fix included in the latest Cumulative Update package for SQL Server 2008 R2 – http://support.microsoft.com/kb/2507770

All you need to do is download the pack, install it, and go back to saving your keystrokes.

Advertisements

Microsoft Parallel Extensions – Hard”core” code made easy!

Ok, so I must admit that I was a little disappointed last night. I was really looking forward to implementing a threaded Consumer / Producer to try and take advantage of our 16 core datawarehouse server. Instead, I ended up changing one line of code – welcome to the world of Microsoft Parallel Extensions.

The Background (feel free to skip this)

At my work, it’s really important for us to be able to detect if people are sharing their accounts. We do this via storing a permanent cookie on the users machine with a “Machine ID” (a guid we generate). When they log in from that machine, we log the id so we know the user was on a particular machine. The more machines, the more likely they are sharing their account.

This worked well, apart from the fact that people tend to clear their cookies, leading to an artificial rise of “Machine IDs” over time. Three years ago, I came up with an algorithm to find out how to work out what machine IDs were due to cookie clearing, and what ones weren’t. So I don’t bore you to death with the details, I will leave that for another post.

The Details

At the core of my problem was the following loop:

foreach (string user in userlist)
{
  // Do some stuff
  // Commit it to the database
}

This was going through around 60,000 users and 4mil+ records and running a pretty complex algorithm on them. It took around 10 minutes to run, but when I looked at the performance monitor, I noticed that the total CPU usage rarely got over 4% and that only one CPU process spiked – hmm, a common sign that threads would help 🙂 One of my pet hates is having to write really complicated and unintelligble code just to “increase” performance, and threads, not matter how hard you try, always end up looking ugly!

The Solution

Some neuron at the back of my brain started firing, and I got this feeling that I knew a better way. F# – nope, that’s not it – would be helpful, but it would take me a long time to convert everything into Functional Programming. A quick google search bore fruit: this article I read Microsoft Parallel Extensions library back in 2007! Microsoft are now planning on releasing this as part of .Net 4.0; but a CTP for 3.5 still exists.

After installing the MSI and then adding the reference to System.Threading, all I needed to do was to make this simple change:

Parallel.ForEach(userlist, (username) =>
{
 // Same code as before
});

It was that simple! Voila, multi-threading code. If you want more control over the threads created, you can use the TaskManagerPolicy object – a simple way to do this is outlined here.

Caveat: You’re code inside the loop has to be thread safe – since none of my code was modifying shared objects, this wasn’t a concern for me.

The Result

TaskMonitory

Beautiful! The task now takes under a minute instead of 10.

Update … From

In the past I have quite often wanted to update values in a table based on values in another table (esp for caching data in DataWarehouses / migrating data between databases). Rather than writing a stored procedure to iterate through both tables, a much more efficient way is to use the From clause in an Update statement (NB: this is vendor specific, in this case, SQL Server). I was walking one of the junior developers through this today – basically what you need to do is start with a select statement creating the relationship between the two (or more) tables you want to combine

select 
      count(distinct r.requestID)
from
      customerRequests r,
      summaryTable st
where
      r.customerID = st.customerID

Then, you just need to change it so that it is an update statement

update
      summaryTable
set
      customerRequests = count(i.requests) — Put all the sets here…
from
      customerRequests i, 
      summaryTable st — Defines relationship
where
      r.customerID = st.customerID

Of course, you can make this more complex (in our case we had three tables updating)