Category Archives: SQL Server

Hiring Software Engineers in Cambridge

Thanks for taking a look. As you’ll have gathered from the headline I’m hiring software engineers to join myself and David Priddle in our new Cambridge office. I’ve put a few salient details below and, if you’d like to find out more, I’d love to hear from you. Please drop me an email at [email protected].

The company is MIG Global, a small-ish London based market research company, operating since the mid-noughties under the name Morar Consulting. The team we’re building in Cambridge is concerned with two of their products:

  • The first is an long-standing survey platform that is being rebuilt and enhanced,
  • The second, and the one I’m hiring for here, is a brand new data visualisation and analysis product.

We’re very much aiming to build a best in class offering so, to that end, I’m looking for two mid- to senior-level software engineers. Ideally you’ll have some full stack experience, and you’ll work with myself and a UX engineer to deliver our new product.

Technology-wise, there are still some decisions to be made, but the overall platform will be C# – and very likely F# – with ASP.NET Core, TypeScript, D3, HTML5, SQL Server, RabbitMQ, and possibly a NoSQL store for caching. Wonderfully, we are not encumbered by the need to support older versions of Internet Explorer.

The office itself will be a small, friendly affair on the Science Park. We’re hoping to be in by the end of this week, at which point you’re welcome to stop by for a tour.

I mentioned that MIG is headquartered in London so naturally you’re probably wondering if you’ll need to spend any time in London. The answer is yes, but it’s likely to average out at once every week or two.

Before I finish, I should also mention package. Salary is obviously going to be dependent on your experience, but I can tell you with certainty that in terms of Cambridge we are extremely competitive. We also offer a good benefits and flexible working.

I’m very happy to provide more detail on any of the above so if you are interested drop me a line. Once again, my email address is [email protected].

If you have a website or a LinkedIn profile, please feel free to include a link, but don’t worry about a CV unless you already have one to hand. Mostly the email is just so I can arrange a time to speak with you, so short and sweet is fine.

Thanks for taking the time to read and consider!

Follow-up: a happy ending to the Visual Studio story – Microsoft team steps in to help

A couple of days ago I published a long post documenting the challenging experience I’d had trying to buy a new Visual Studio cloud subscription:

My sorry tale of trying and failing to buy a Visual Studio cloud subscription

Well, I’m happy to report that, after the efforts of a number of awesome people at Microsoft, I’ve managed to successfully activate my Visual Studio subscription and I’m now up and running again with both Windows 10, Visual Studio and (shortly) SQL Server installed and functioning correctly.

So, this time around, let me tell a happier tale…

It starts when, having seen my plight, John Montgomery got in touch via twitter, looping in Buck Hodges:

John Montgomery and Buck Hodges of Microsoft see my plight on twitter and kindly reach out.

These two are both heavy hitters in Visual Studio and .NET in Redmond. John is Partner Director of Program Management, and Buck Hodges is a Partner Director Software Engineer. Having these guys on the case is already reassuring.

Buck’s initial suggestion didn’t quite work out but after getting back in touch with them he asked me to drop him an email so he could expedite the process. Things then started happening quite quickly.

Buck immediately looped in Andrew Brenner, Mike Tayebi, and Marc Paine to help. Marc is a Principal Software Engineer Manager, and Andrew is a Senior Program Manager.

Marc and Andrew got to work on finding a fix and, later in the evening, Marc emailed me instructions with a workaround they’d come up with. Due to timezone differences, and meetings the following morning, I couldn’t immediately try it out. As soon as I could I gave it a try and was overjoyed to be find that I was now able to assign the subscription to myself via https://manage.visualstudio.com/ in a private browsing session:

I can now see and assign my Visual Studio Professional subscription to myself

I’m not quite home and dry yet but this, in itself, is serious progress. A few minutes later I received the following welcome email to activate my subscription:

Yes! Welcome email from Microsoft at last.

I click the Activate my subscription button (actually I copy the link into another private browsing session) and I’m able to successfully activate my subscription.

Now, when I log in to https://my.visualstudio.com/ I’m can access all my benefits:

Visual Studio subscription downloads.

Visual Studio subscription product keys.

(I’m loving the fact there’s an entry in there for Office 95 Professional, btw.)

I’m able to download and install both Windows 10 and Visual Studio 2015:

Success! BOOM!

Success! BOOM! That’s both of the above installed and running in a Parallels VM. I’m extremely happy. I’m also extremely impressed with the speed of the Windows 10 install – I didn’t time it, but it really was only a few minutes. Very cool.

Marc also tells me that they’ve figured out why I couldn’t see or manage any subscriptions and are discussing a solution so that in future the workaround won’t be necessary, as well as investigating some other failure points I identified. Andrew also spent time going through my previous post creating a list of issues that various teams need to address to avoid other people having a similar experience.

Honestly, I’m so impressed with the way these guys stepped up and helped out. I’d particularly like to thank John, Buck, Marc, and Andrew for all their work and time in getting me unblocked, and for taking ownership over the process.

This is absolutely consistent with my previous experience dealing with people who work for Microsoft. Once you find an in to the right person or group of people, past the seemingly impenetrable corporate exterior, what you find are smart people who really care about what they do and about delivering a great experience to customers, and who will go above and beyond to do that. I know they’re going to find and implement solutions for all the problems I had.

I’d also like to thank the UK licensing support team who, whilst they weren’t equipped to handle these kinds of problems, did try and help out as much as they could, as well as Jeff Lambert (Escalation Engineer), and Trevor Hancock (Senior Escalation Engineer), who got in touch to try and help, and followed up to see how I was getting on.

Lastly, I’d like to thank my friend Elisabeth Blees, who is a Program Manager in the Visual Studio team, and who checked in to see how I was getting on, followed up with Buck and his team, and updated me on what they’d been doing.

So I’m pleased to say I’m up and running, and help from Buck and his team really couldn’t have come at a better time: I’m giving a talk on performance tuning .NET and SQL Server web apps at tomorrow’s DDD event at Microsoft’s UK headquarters in Reading, and now I have everything I need to do that.

Thanks again to all!

Current talk list 2016: web and database performance

It’s that time of the year where, for me, talk proposals are submitted. I also tend to take it as an opportunity to refresh and rework talks.

This year I’ve submitted talks for DDD, DDD North, and NDC London (this one’s a bit of a long shot), and am keeping my eye out for other opportunities. I’ll also be giving talks at the Derbyshire .NET User Group, and DDD Nights in Cambridge in the autumn.

Voting for both DDD and DDD North is now open so, if you’re interested in any of the talks I’ve listed below, please do vote for them at the following links:

Here are my talks. If you’d like me to give any of them at a user group, meetup, or conference you run, please do get in touch.

Talk Title: How to speed up .NET and SQL Server web apps

Performance is a critical aspect of modern web applications. Recent developments in hardware, software, infrastructure, bandwidth, and connectivity have raised expectations about how the web should perform.

Increasingly this attitude is applied to internal line of business apps, and niche sites, as much as to large public-facing sites. Google even bases your search ranking in part on how well your site performs. Being slow is no longer an option.

Unfortunately, problems can occur at all layers and in all components of an application: database, back-end code, systems integrations, local and third party services, infrastructure, and even – increasingly – the client.

Complex apps often have problems in multiple areas. How do you go about tracking them down and fixing them? Where do you begin?

The answer is you deploy the right tools and techniques. The good news is that generally you can do this without changing your development process. Using a number of case studies I’m going to show you how to track down and fix performance issues. We’ll talk about the tools I used to find them, and the fixes that resulted.

That being said, prevention is better than cure, so I’ll also talk about how you can go about catching problems before they make it to production, and monitor to get earlier notification of trouble brewing.

By the end you should have a plethora of tools and techniques at your disposal that you can use in any performance analysis situation that might confront you.

Talk Title: Premature promotion produces poor performance: memory management in the CLR and JavaScript runtimes

The CLR, JVM, and well-known JavaScript runtimes provide automatic memory management with garbage collection. Developers are encouraged to write their code and forget about memory management entirely. But whilst ignorance is bliss, it can also lead to a host of problems further down the line.

With web applications becoming ever more interactive, and the meteoric rise in popularity of mobile browsers, the kind of performance and resource usage issues that once only concerned back-end developers have now become common currency on the client as well.

In this session we’ll look at how these runtimes manage memory and how you can get the best out of them. We’ll discuss the “classic” blunders that can trip you up, and how you can avoid them. We’ll also look at the tools that can help you if and when you do run into trouble, both on the client and the server.

You should come away from this session with a good understanding of managed memory, particularly as it relates to the CLR and JavaScript, and how you can write code that works with the runtimes rather than against them.

Talk Title: Optimizing client-side performance in interactive web applications

Web applications are becoming increasingly interactive. As a result, more code is shifting to the client, and JavaScript performance has become a key factor for many web applications, both on desktop and mobile. Just look at this still ongoing discussion kicked off by Jeff Atwood’s “The State of JavaScript on Android in 2015 is… poor” post: https://meta.discourse.org/t/the-state-of-javascript-on-android-in-2015-is-poor/33889/240.

Devices nowadays offer a wide variety of form factors and capabilities. On top of this, connectivity – whilst widely available across many markets – varies considerably in quality and speed. This presents a huge challenge to anyone who wants to offer a great user experience across the board, along with a need to carefully consider what actually constitutes “the board”.

In this session I’m going to show you how to optimize the client experience. We’ll take an in depth look at Chrome Dev Tools, and how the suite of debugging, data collection and diagnostic tools it provides can help you diagnose and fix performance issues on the desktop and Android mobile devices. We’ll also take a look at using Safari to analyse and debug web applications running on iOS.

Throughout I’ll use examples from https://arcade.ly to illustrate. Arcade.ly is an HTML5, JavaScript, and CSS games site. Currently it hosts a version of Star Castle, called Star Citadel, but I’m also working on versions of Asteroids (Space Rawks!), and Space Invaders (yet to find an even close to decent name). It supports both desktop and mobile play. Whilst this site hosts games the topics I cover will be relevant for any web app featuring a high level of interactivity on the client.

Talk Title: Complex objects and microORMs: an introduction to the Dapper.SimpleLoad and Dapper.SimpleSave extensions for StackExchange’s Dapper microORM

Dapper (https://github.com/StackExchange/dapper-dot-net) is a popular microORM for the .NET framework that provides simple way to map database rows to objects. It’s a great alternative when speed is of the essence, and when you just don’t need the functionality offered by EF.

But what happens when you want to do something a bit more complicated? What happens if you want to join across multiple tables into a hierarchy composed of different types of object? Well,then you can use Dapper’s multi-mapping functionality… but that can quickly turn into an awful lot of code to maintain, especially if you make heavy use of Dapper.

Step in Dapper.SimpleLoad (https://github.com/Paymentsense/Dapper.SimpleLoad), which handles the multi-mapping code for you and, if you want it to, the SQL generation as well.

So far so good, but what happens when you want to save your objects back to the database?

With Dapper it’s pretty easy to write an INSERT, UPDATE, or DELETE statement and pass in your object as the parameter source. But if you’ve got a complex object this, again, can quickly turn into a lot of code.

Step in Dapper.SimpleSave (https://github.com/Paymentsense/Dapper.SimpleSave), which you can use to save changes to complex objects without the need to worry about saving each object individually. And, again, you don’t need to write any SQL.

I’ll give you a good overview of both Dapper.SimpleLoad and Dapper.SimpleSave, with a liberal serving of examples. I’ll also explain their benefits and drawbacks, and when you might consider using them in preference to more heavyweight options, such as EF.

How to quickly convert all NTEXT columns to NVARCHAR(MAX) in a SQL Server database

I was at a client’s earlier today and the question came up of how to convert all NTEXT columns to NVARCHAR(MAX) in their SQL Server databases, and it turns out they have rather a lot of them.

There are a couple of obvious advantages to this conversion:

  1. Online index rebuilds with SQL Server Enterprise Edition become a possibility,
  2. Values are stored in row by default, potentially yielding performance gains.

My response to this was, “Yeah, sure: I can write a script to do that.” Two seconds after I said this I thought, “Hmm, I bet 30 seconds of Googling will provide a script because this must have come up a zillion times before.”

Sure enough, there are some pretty reasonable hits. For example, http://stackoverflow.com/questions/18789810/how-can-i-easily-convert-all-ntext-fields-to-nvarcharmax-in-sql-query.

Buuuuuuuuuuuut as ever, you’d be naive indeed to think that you can just copy and paste code from StackOverflow and have it work first time. Moreover, even with modification, you need to go over it with a fine-toothed comb to make sure you’ve squashed every last bug.

For example, this boned me earlier because I wasn’t paying proper attention:

CASE WHEN is_nullable = 1 THEN 'NOT' ELSE '' END

You can see the logic is reversed from what it should be.

So, anyway, I ended up concocting my own, of which you can find the latest version at https://github.com/bartread/sqlscripts/blob/master/scripts/AlterAllNtextColumnsInDbToNvarcharMax.sql.

There are essentially three phases:

  1. Change the datatype of the NTEXT columns to NVARCHAR(MAX) using ALTER TABLE statements
  2. Pull any values small enough to fit in row out of LOB storage and back into rows using UPDATE statements. Thanks to John Conwell for pointing out the necessity of doing that to realise any performance increase with existing data.
  3. Refresh the metadata for any views using sp_refreshview – this makes sure that, for example, columns listed for them in sys.columns have the correct data type.

Phases 1 and 2 are actually done together in a loop for each NTEXT column in turn, whilst phase 3 is done in a separate loop at the end. I just refresh the metadata for all views because, although I could figure out only the views that depend on the tables, it’s simpler to just do them all and doesn’t take that long. Of course, if you have thousands of views and a relatively small number of NTEXT columns you might want to rethink this. My situation is numbers of tables, views, and NTEXT columns are all of the same order of magnitude so a simple script is fine.

For those of you who don’t have git installed, or aren’t comfortable with DVCS, here’s the full script:

USE _YOUR_DATABASE_NAME_
GO

SET NOCOUNT ON;

-- Set this to 0 to actually run commands, 1 to only print them.
DECLARE @printCommandsOnly BIT = 1;

-- Migrate columns NTEXT -> NVARCHAR(MAX)

DECLARE @object_id INT,
      
@columnName SYSNAME,
      
@isNullable BIT;

DECLARE @command NVARCHAR(MAX);

DECLARE @ntextColumnInfo TABLE (
  
object_id INT,
  
ColumnName SYSNAME,
  
IsNullable BIT
);

INSERT INTO @ntextColumnInfo ( object_id, ColumnName, IsNullable )
  
SELECT  c.object_id, c.name, c.is_nullable
  
FROM    sys.columns AS c
  
INNER JOIN sys.objects AS o
  
ON c.object_id = o.object_id
  
WHERE   o.type = 'U' AND c.system_type_id = 99;

DECLARE col_cursor CURSOR FAST_FORWARD FOR
   SELECT
object_id, ColumnName, IsNullable FROM @ntextColumnInfo;

OPEN col_cursor;
FETCH NEXT FROM col_cursor INTO @object_id, @columnName, @isNullable;

WHILE @@FETCH_STATUS = 0
BEGIN
   SELECT
@command =
      
'ALTER TABLE '
      
+ QUOTENAME(OBJECT_SCHEMA_NAME(@object_id))
           +
'.' + QUOTENAME(OBJECT_NAME(@object_id))
       +
' ALTER COLUMN '
      
+ QUOTENAME(@columnName)
       +
' NVARCHAR(MAX) '
      
+ CASE
          
WHEN @isNullable = 1 THEN ''
          
ELSE 'NOT'
        
END
      
+ ' NULL;';
      
  
PRINT @command;
  
IF @printCommandsOnly = 0
  
BEGIN
       EXECUTE
sp_executesql @command;
  
END

   SELECT @command =
      
'UPDATE '
      
+ QUOTENAME(OBJECT_SCHEMA_NAME(@object_id))
           +
'.' + QUOTENAME(OBJECT_NAME(@object_id))
       +
' SET '
      
+ QUOTENAME(@columnName)
       +
' = '
      
+ QUOTENAME(@columnName)
       +
';'

   PRINT @command;
  
IF @printCommandsOnly = 0
  
BEGIN
       EXECUTE
sp_executesql @command;
  
END

   FETCH NEXT FROM col_cursor INTO @object_id, @columnName, @isNullable;
END

CLOSE col_cursor;
DEALLOCATE col_cursor;

-- Now refresh the view metadata for all the views in the database
-- (We may not need to do them all but it won't hurt.)

DECLARE @viewObjectIds TABLE (
  
object_id INT
);

INSERT INTO @viewObjectIds
  
SELECT o.object_id
  
FROM sys.objects AS o
  
WHERE o.type = 'V';

DECLARE view_cursor CURSOR FAST_FORWARD FOR
   SELECT
object_id FROM @viewObjectIds;

OPEN view_cursor;
FETCH NEXT FROM view_cursor INTO @object_id;

WHILE @@FETCH_STATUS = 0
BEGIN
   SELECT
@command =
      
'EXECUTE sp_refreshview '''
      
+ QUOTENAME(OBJECT_SCHEMA_NAME(@object_id)) + '.' + QUOTENAME(OBJECT_NAME(@object_id))
       +
''';';
      
  
PRINT @command;

   IF @printCommandsOnly = 0
  
BEGIN
       EXECUTE
sp_executesql @command;
  
END

   FETCH NEXT FROM view_cursor INTO @object_id;
END

CLOSE view_cursor;
DEALLOCATE view_cursor;
GO

NOTE: this won’t work where views are created WITH SCHEMABINDING. It will fail at ALTER TABLE for any table upon which schemabound views depend. Instead, to make it work, you have to DROP the views, then do the ALTERs and UPDATEs, then re-CREATE the views. Bit of a PITA but there’s no way around it unfortunately. I didn’t need to worry about this because my client doesn’t use schemabound views.

Of course it goes without saying that you should back up your database before you run any script like this!

To use it you just need to substitute the name of your database where it says _YOUR_DATABASE_NAME_ at the top of the script.

Also, As with automating many tasks in SQL Server, dynamic SQL is a necessity. It’s a bit of a pain in the backside so a @printCommandsOnly mode is advised for debugging purposes, and I’ve switched this on by default. You can copy and paste the commands into a query window, parse them, or even execute them to ensure they work as expected.

Once you’re happy this script does what you want set the value of @printCommandsOnly to 0 and rerun it to actually execute the commands it generates.

You might wonder why I’ve written this imperatively rather than in set-based fashion. Well, it’s not just because I’m a programmer rather than a DBA. In fact the original version, which you can still see if you look at the file’s history, was set-based. It looked pretty much like this:

USE _YOUR_DATABASE_NAME_
GO

-- Migrate columns NTEXT -> NVARCHAR(MAX)

DECLARE @alterColumns NVARCHAR(MAX) = '';
SELECT  @alterColumns = @alterColumns
  
+'ALTER TABLE '
  
+ QUOTENAME(OBJECT_SCHEMA_NAME(c.object_id)) + '.' + QUOTENAME(OBJECT_NAME(c.object_id))
   +
' ALTER COLUMN '
  
+ QUOTENAME(c.Name)
   +
' NVARCHAR(MAX) '
  
+ CASE WHEN c.is_nullable = 1 THEN '' ELSE 'NOT' END + ' NULL;'
  
+ CHAR(13)
   +
'UPDATE '
  
+ QUOTENAME(OBJECT_SCHEMA_NAME(c.object_id)) + '.' + QUOTENAME(OBJECT_NAME(c.object_id))
   +
' SET '
  
+ QUOTENAME(c.Name)
   +
' = '
  
+ QUOTENAME(c.Name)
   +
';' + CHAR(13) + CHAR(13)
FROM    sys.columns AS c
INNER JOIN sys.objects AS o
ON c.object_id = o.object_id
WHERE   o.type = 'U' AND c.system_type_id = 99; --NTEXT

PRINT @alterColumns;

EXECUTE sp_executesql @alterColumns;
GO

-- Update VIEW metadata

DECLARE @updateViews NVARCHAR(MAX) = '';
SELECT @updateViews = @updateViews
  
+ 'EXECUTE sp_refreshview '''
  
+ QUOTENAME(OBJECT_SCHEMA_NAME(o.object_id)) + '.' + QUOTENAME(OBJECT_NAME(o.object_id))
   +
''';' + CHAR(13)
FROM sys.objects AS o
WHERE o.type = 'V'

PRINT @updateViews;

EXECUTE sp_executesql @updateViews;
GO

It’s certainly a lot less code, which is nice. And it doesn’t use CURSORs, which is also nice.

However, it does have problems:

  • The PRINT statement in T-SQL truncates output if it goes beyond a certain length. I don’t know exactly what this length is off the top of my head, but my generated scripts were more than long enough to reach it.
  • The result of this is you can’t copy and paste the complete generated script into another query window, so it might make debugging a bit trickier in some instances.
  • The really problematic thing is that, when something goes wrong, you can’t necessarily relate it back to the exact command that failed, whereas the imperative version makes this easy since each generated command is executed individually.

So I threw away the, on the face of it, "cleverer" and more "elegant" set-based version in favour of the longer, clunkier (but easier to debug) imperative version.

I hope you find it useful and please feel free to submit patches, pull requests, bug reports, feature requests via the main project GitHub page at https://github.com/bartread/sqlscripts. (Turns out I’m building up a small library of handy scripts so I’ll be pushing a few more items into this repo in due course.)

Timing is everything in the performance tuning game: learn to choose the right metrics to hunt down bottlenecks

So much of life is about timing. Just ask David Davis. He was arrested after getting into a scuffle whilst having his hair cut:

David Davis with half a haircut in his police mugshot.

Bad timing, right?

But that’s not really the kind of timing I’m talking about. When you’re performance tuning an application an understanding of timing is crucial to success – it can reveal truth that would otherwise remain masked. In this post I want to cover three topics:

  • The different types of timing data you can collect, and the best way to use them,
  • Absolute versus relative timing measures, and
  • The effect of profiling method (instrumentation versus sampling) on the timing data you collect.

Let’s start off with the first…

Regardless of your processor architecture, operating system, or technology platform most (good) performance profiling software will use the most accurate timer supported by your hardware and OS. On x86 and x64 processors this is the Time Stamp Counter, but most other architectures have an equivalent.

From this timer it’s possible to derive a couple of important metrics of your app’s performance:

  • CPU time – that is, the amount of time the processor(s) spend executing code in threads that are part of your process ONLY – i.e., exclusive of I/O, network activity (e.g., web service or web API calls), database calls, child process execution, etc.
  • Wall clock time – the actual amount of time elapsed executing a particular piece of code, such as a method, including I/O, network activity, etc.

Different products might use slightly different terminology, or offer subtly differing flavours of these two metrics, but the underlying principles are the same. For this post I’ll show the examples using ANTS Performance Profiler but you’ll find that everything I say is also applicable to other performance tools, such as DotTrace, the Visual Studio Profiling Tools, and JProfiler, so hopefully you’ll find it useful.

The really simple sequence diagram below illustrates the differences between CPU time and wall clock time for executing a method called SomeMethod(), which we’ll assume is in a .NET app, that queries a SQL Server database.

Sequence diagram illustrating the difference between wall clock and CPU time.

The time spent actually executing code in SomeMethod() is represented by regions A and C. This is the CPU time for the method. The time spent executing code in SomeMethod() plus retrieving data from SQL Server is represented by regions A, B, and C. This represents the wall clock time – the total time elapsed whilst executing SomeMethod(). Note that, for simplicity’s sake:

  • I’ve excluded any calls SomeMethod() might make to other methods in your code, into the .NET framework class libraries, or any other .NET libraries. Were they included these would form part of the CPU time measurement because this is all code executing on the same thread within your process.
  • I’ve excluded network latency from the diagram, which would form part of the wall clock time measurement.

Most good performance profilers will allow you to switch between CPU and wall clock time. All the profilers I mentioned above support this. Here’s what the options look like in ANTS Performance Profiler; other products are similar:

Timing options in Red Gate's ANTS Performance Profiler

There’s also the issue of time in method vs. time with children. Again the terminology varies a little by product but the basics are:

  • Time in method represents the time spent executing only code within the method being profiled. It does not include callees (or child methods), or any time spent sleeping, suspended, or out of process (network, database, etc.). It follows from this that the absolute value of time in method will be the same regardless of whether you’re looking at CPU time, or wall clock time.
  • Time with children includes time spent executing all callees (or child methods). When viewing wall clock time it also includes time spent sleeping, suspended, and out of process (network, database, etc.).

OK, let’s take a look at an example. Here’s a method call with CPU time selected:

CPU times for method

And here’s the same method call with wall clock time selected:

Wall clock times for method

Note how in both cases Time (ms), which represents time in method, is the same at 0.497ms, but that with wall clock time selected the time with children is over 40 seconds as opposed to less than half a second. We’ll take a look at why that is in a minute. For now all you need to understand is that this is time spent out of process, and it’s the kind of problem that can easily be masked if you look at only CPU time.

All right, so how do you know whether to look at CPU time or wall clock time? And are there situations where you might need to use both?

Many tools will give you some form of real-time performance data as you use them to profile your apps. ANTS Performance Profiler has the timeline; other tools have a “telemetry” view, which shows performance metrics. The key is to use this, along with what you know about the app to gain clues as to where to look for trouble.

The two screengrabs above are from a real example on the ASP.NET MVC back-end support systems for a large B2B ecommerce site. They relate to the user clicking on an invoice link from the customer order page. As you’d expect this takes the user to a page containing the invoice information, but the page load was around 45 seconds, which is obviously far too long.

Here’s what the timeline for that looked like in ANTS Performance Profiler:

ANTS Performance Profiler timeline for navigating from order to invoice page on internal support site.

(Note that I’ve bookmarked such a long time period not because the profiler adds that much overhead, but because somebody asked me a question whilst I was collecting the data so there was a delay before a clicked Stop Live Bookmark!)

As you can see, there’s very little CPU activity associated with the worker process running the site; just one small spike over to the left.

This tells you straight away that the time isn’t being spent on lots of CPU intensive activity in the website code. Look at this:

Call tree viewing CPU time - doesn't look like there's much amiss.

We’re viewing CPU time and there’s nothing particularly horrendous in the call tree. Sure, there’s probably some room for optimisation, but absolutely nothing that would account for the observed 45 second page load.

Switch to wall clock time and the picture changes:

Call graph looking at wall clock time - now we're getting somewhere!

Hmm, looks like the problem might be those two SQL queries, particularly the top one! Maybe we should optimise those*.

Do you see how looking at the “wrong” timing metric masked the problem? In reality you’ll want to use both metrics to see what each can reveal and you’ll quickly get to know which works best in different scenarios as you do more performance tuning.

By the way: for those of you working with Java, JProfiler has absolutely great database support with multiple providers for different RDBMSs. I would highly recommend you check it out.

You may have noticed that throughout the above examples I’ve been looking at absolute measurements of time, in this case milliseconds. Ticks and seconds are often also available, but many tools often offer relative measurements – generally percentages – in some cases as the default unit.

I find relative values often work well when looking at CPU time but that, generally, absolute values are a better bet for wall clock time. The reason for this is pretty simple: wall clock time includes sleeping, waiting, suspension, etc., and so often your biggest “bottleneck” can appear to be a single thread that mostly sleeps, or waits for a lock (e.g., the Waiting for synchronization item in the above screenshots). This will often be something like the GC thread and the problem is, without looking at absolute values, you’ve no real idea how significant the amounts of time spent in other call stacks really are. Switching to milliseconds or (for really gross problems – the above would qualify) seconds can really help.

Let’s talk about instrumentation versus sampling profiling and the effect this has on timings.

Instrumentation is the more traditional of the two methods. It actually modifies the running code to insert extra instructions that collect timing values throughout the code. For example, instructions will be inserted at the start and end of methods and, depending upon the level of detail selected, at branch points in the code, or at points which mark the boundaries between lines in the original source. Smarter profilers need only instrument branch points to accurately calculate line level timings and will therefore impose less overhead in use.

Back in the day this modification would be carried out on the source code, and this method may still be used with C++ applications. The code is modified as part of the preprocessor step. Alternatively it can be modified after compilation but before linking.

Nowadays, with VM languages, such as those that run in the JVM or the .NET CLR, the instrumentation is generally done at runtime just before the code is JITed. This has a big advantage: you don’t need a special build of your app in order to diagnose performance problems, which can be a major headache with older systems such as Purify.

Sampling is available in more modern tools and is a much lower overhead, albeit less detailed, method of collecting performance data. The way it works is that the profiler periodically takes a snapshot of the stack trace of every thread running in the application. It’ll generally do this many times a second – often up to 1,000 times per second. It can then combine the results from the different samples to work out where most time is spent in the application.

Obviously this is only good for method level timings. Moreover methods that execute very quickly often won’t appear in the results at all, or will have somewhat skewed timings (generally on the high side) if they do. Timings for all methods are necessarily relative and any absolute timings are estimates based on the number of samples containing each stack trace relative to the overall length of the selected time period.

Furthermore most tools cannot integrate ancillary data with sampling. For example, ANTS Performance Profiler will not give information about database calls, or HTTP requests, in sampling mode since this data is collected using instrumentation, which is how it is able to tell you – for example – exactly where queries were executed.

Despite these disadvantages, because of its low overhead, and because it doesn’t require modification of app code, sampling can often be used on a live process without the need for a restart before and after profiling, so can often be a good option for apps in production.

The effect of all of this on timing measurements if you’ve opted for sampling rather than instrumentation profiling is that the choice of wall clock time or CPU time becomes irrelevant. This is because whilst your profiler knows the call stack for each thread in every sample, it probably won’t know whether or not the thread was running (i.e., it could have been sleeping, suspended, etc.) – figuring this out could introduce unacceptable overhead whilst collecting data. As a result you’ll always be looking at wall clock time with sampling, rather than have the choice as you do with instrumentation.

Hopefully you’re now equipped to better understand and use the different kinds of timing data your performance profiler will show you. Please do feel free to chime in with questions or comments below – feedback is always much appreciated and if you need help I’d love to hear from you.

*Optimising SQL is beyond the scope of this post but I will cover it, using a similar example, in the future. For now I want to focus on the different timing metrics and what they mean to help you understand how to get the best out of your performance profiler. That being said, your tool might give you a handy hint so it’s not even as if you need to do that much thinking for yourself (but you’ll still look whip sharp in front of your colleagues)…

ANTS Performance Profiler hinting that the problem may be SQL-related.

Just don’t let them get a good look at your screen!

Live Bookmarking in ANTS Performance Profiler: a killer feature to help you zero in on performance problems fast

Last week I was sat with Simon, one of my client’s managers, as he showed me around their new customer support centre web app highlighting slow-loading pages. Simon, along with a couple of others, has been playing guinea pig using the new support centre in his day to day work.

The main rollout is in a few weeks but the performance problems have to be fixed first so support team members don’t spend a lot more time on calls, forcing customers to wait longer on hold before speaking to someone. Potentially bad for costs, customer satisfaction, and team morale!

Simon gave me a list of about a dozen trouble spots and I remoted into their production box to profile them all. I had to collect the results and get off as quickly as possible to avoid too much disruption; I could analyse them later on my own laptop. This gave me plenty of time to hunt down problems and suggest fixes.

I used Red Gate’s ANTS Performance Profiler throughout. One of the many helpful features it includes is bookmarking. You can mark any arbitrary time period in your performance session, give it a meaningful name (absolutely invaluable!), and use that as a shortcut to come back to it later.

For example, here I’ve selected the “Smart search” bookmark I created whilst profiling the support centre:

Timeline with bookmarked region selected.

The call tree shows me the call stacks that executed during the bookmarked time period. Towards the bottom you can see that SQL queries are using the vast majority of time in this particular stack trace:

Call tree showing call stacks within bookmarked region on timeline.

(Identifying SQL as a problem I took these queries and analysed them in more detail using both their execution plans, and SQL Server’s own SQL Profiler. I then suggested more efficient queries that could be used by NHibernate via repository methods.)

Also note we’re looking at Wall-clock time as opposed to CPU time. I won’t talk about the differences in detail here. What you need to understand is that Wall-clock time represents actual elapsed time. This matters because the queries execute in SQL Server, outside the IIS worker process running the site. Under CPU time measurements, which only include time spent in-process, they therefore wouldn’t appear as significant contributors to overall execution time.

Back on point: bookmarking is great as far as it goes, but you have to click and drag on the timeline after the fact to create them yourself. In the midst of an involved profiling session this is a hassle and can be error prone: what if by mistake you don’t drag out the full region you need to analyse? All too easily done, and as a result you can miss something important in your subsequent analysis.

Step in Live Bookmarks.

Basically, whilst profiling, you hit a button to start the bookmark, do whatever you need to do in your app, then hit a button to save the bookmark. Then you repeat this process as many times as you need. No worries about missing anything.

Here’s how it goes in detail:

  1. Start up a profiling session in ANTS Performance Profiler.
  1. Whilst profiling, click Start Bookmark. (To the right of the timeline.)

Start a live bookmark.

  1. Perform some action in your app – in my case I was clicking links to navigate problem pages, populate data, etc.
  1. Click Stop Bookmark.

Stop (and save) a live bookmark.

  1. The bookmark is added to the list on the right hand side of the timeline. It’s generally a good idea to give your bookmark a meaningful name. To do this just click the pencil icon next on the bookmark and type in the new name.

Rename bookmark.

  1. Rinse and repeat as many times as you need.
  1. When you’ve finished, click Stop in the top-left corner of the timeline to stop profiling.

Stop profiling.

It’s a good idea to save your results for later using File > Save Profiler Results, just in case the worst happens, and of course you can analyse them offline whenever you have time.

And that’s it: nice and easy, and very helpful when it comes to in depth performance analysis across a broad range of functionality within your application.

Any questions/comments, please do post below, or feel free to get in touch with me directly.

Video blog: Getting started with relational databases using IntelliJ IDEA 13

Recently I’ve been reacquainting myself with technologies I haven’t had to use in anger for a while. One of these is Java. This has been great because I’ve been able to get back into using my favourite Java IDE: IntelliJ IDEA from JetBrains.

I’ve also started playing around with video blogging, which I’ve been meaning to do for a while. Nowadays tools like the excellent (and free!) Expression Encoder from Microsoft (sadly no longer under active development) make very easy.

Here are the firstfruits of my labours – a quick video on getting started with relational databases using the Database Tools in IntelliJ IDEA 13 Ultimate Edition.

In this video I show you how to use IDEA to work with SQL Server as a data source. On the face of it this might seem like an odd thing to do. That is unless you’re used to working in environments with fairly diverse infrastructure, in which case it’ll seem like an everyday and fairly pedestrian occurrence.

Topics covered:

  • Creating and configuring your SQL Server data source,
  • Installing the appropriate JDBC driver – a process which IDEA automates to make it as easy as possible,
  • The database pane, which is similar to SQL Server Management Studio’s (SSMS) Object Explorer View,
  • The query editor, which provides intellisense/auto-complete, and knows the different SQL dialects of all the RDBMS’s for which IDEA provides explicit support,
  • The results pane; again very similar to SSMS’s.

What you end up with is an experience that offers much of the functionality you’d most commonly use within SSMS, but without having to step outside of the IDE.

If you develop on Windows and you’re used to SSMS you may not find yourself using this particularly often. If, however, you develop on another OS, you don’t have the SQL Server client tools installed, or you don’t want the hassle and overhead of stepping out of your IDE (SSMS is based on Visual Studio so can take a while to start), you should find the functionality in IDEA provides a convenient shortcut for working with SQL Server databases.

Enjoy!