Tag Archives: nvarchar(max)

How to quickly convert all NTEXT columns to NVARCHAR(MAX) in a SQL Server database

I was at a client’s earlier today and the question came up of how to convert all NTEXT columns to NVARCHAR(MAX) in their SQL Server databases, and it turns out they have rather a lot of them.

There are a couple of obvious advantages to this conversion:

  1. Online index rebuilds with SQL Server Enterprise Edition become a possibility,
  2. Values are stored in row by default, potentially yielding performance gains.

My response to this was, “Yeah, sure: I can write a script to do that.” Two seconds after I said this I thought, “Hmm, I bet 30 seconds of Googling will provide a script because this must have come up a zillion times before.”

Sure enough, there are some pretty reasonable hits. For example, http://stackoverflow.com/questions/18789810/how-can-i-easily-convert-all-ntext-fields-to-nvarcharmax-in-sql-query.

Buuuuuuuuuuuut as ever, you’d be naive indeed to think that you can just copy and paste code from StackOverflow and have it work first time. Moreover, even with modification, you need to go over it with a fine-toothed comb to make sure you’ve squashed every last bug.

For example, this boned me earlier because I wasn’t paying proper attention:

CASE WHEN is_nullable = 1 THEN 'NOT' ELSE '' END

You can see the logic is reversed from what it should be.

So, anyway, I ended up concocting my own, of which you can find the latest version at https://github.com/bartread/sqlscripts/blob/master/scripts/AlterAllNtextColumnsInDbToNvarcharMax.sql.

There are essentially three phases:

  1. Change the datatype of the NTEXT columns to NVARCHAR(MAX) using ALTER TABLE statements
  2. Pull any values small enough to fit in row out of LOB storage and back into rows using UPDATE statements. Thanks to John Conwell for pointing out the necessity of doing that to realise any performance increase with existing data.
  3. Refresh the metadata for any views using sp_refreshview – this makes sure that, for example, columns listed for them in sys.columns have the correct data type.

Phases 1 and 2 are actually done together in a loop for each NTEXT column in turn, whilst phase 3 is done in a separate loop at the end. I just refresh the metadata for all views because, although I could figure out only the views that depend on the tables, it’s simpler to just do them all and doesn’t take that long. Of course, if you have thousands of views and a relatively small number of NTEXT columns you might want to rethink this. My situation is numbers of tables, views, and NTEXT columns are all of the same order of magnitude so a simple script is fine.

For those of you who don’t have git installed, or aren’t comfortable with DVCS, here’s the full script:

USE _YOUR_DATABASE_NAME_
GO

SET NOCOUNT ON;

-- Set this to 0 to actually run commands, 1 to only print them.
DECLARE @printCommandsOnly BIT = 1;

-- Migrate columns NTEXT -> NVARCHAR(MAX)

DECLARE @object_id INT,
      
@columnName SYSNAME,
      
@isNullable BIT;

DECLARE @command NVARCHAR(MAX);

DECLARE @ntextColumnInfo TABLE (
  
object_id INT,
  
ColumnName SYSNAME,
  
IsNullable BIT
);

INSERT INTO @ntextColumnInfo ( object_id, ColumnName, IsNullable )
  
SELECT  c.object_id, c.name, c.is_nullable
  
FROM    sys.columns AS c
  
INNER JOIN sys.objects AS o
  
ON c.object_id = o.object_id
  
WHERE   o.type = 'U' AND c.system_type_id = 99;

DECLARE col_cursor CURSOR FAST_FORWARD FOR
   SELECT
object_id, ColumnName, IsNullable FROM @ntextColumnInfo;

OPEN col_cursor;
FETCH NEXT FROM col_cursor INTO @object_id, @columnName, @isNullable;

WHILE @@FETCH_STATUS = 0
BEGIN
   SELECT
@command =
      
'ALTER TABLE '
      
+ QUOTENAME(OBJECT_SCHEMA_NAME(@object_id))
           +
'.' + QUOTENAME(OBJECT_NAME(@object_id))
       +
' ALTER COLUMN '
      
+ QUOTENAME(@columnName)
       +
' NVARCHAR(MAX) '
      
+ CASE
          
WHEN @isNullable = 1 THEN ''
          
ELSE 'NOT'
        
END
      
+ ' NULL;';
      
  
PRINT @command;
  
IF @printCommandsOnly = 0
  
BEGIN
       EXECUTE
sp_executesql @command;
  
END

   SELECT @command =
      
'UPDATE '
      
+ QUOTENAME(OBJECT_SCHEMA_NAME(@object_id))
           +
'.' + QUOTENAME(OBJECT_NAME(@object_id))
       +
' SET '
      
+ QUOTENAME(@columnName)
       +
' = '
      
+ QUOTENAME(@columnName)
       +
';'

   PRINT @command;
  
IF @printCommandsOnly = 0
  
BEGIN
       EXECUTE
sp_executesql @command;
  
END

   FETCH NEXT FROM col_cursor INTO @object_id, @columnName, @isNullable;
END

CLOSE col_cursor;
DEALLOCATE col_cursor;

-- Now refresh the view metadata for all the views in the database
-- (We may not need to do them all but it won't hurt.)

DECLARE @viewObjectIds TABLE (
  
object_id INT
);

INSERT INTO @viewObjectIds
  
SELECT o.object_id
  
FROM sys.objects AS o
  
WHERE o.type = 'V';

DECLARE view_cursor CURSOR FAST_FORWARD FOR
   SELECT
object_id FROM @viewObjectIds;

OPEN view_cursor;
FETCH NEXT FROM view_cursor INTO @object_id;

WHILE @@FETCH_STATUS = 0
BEGIN
   SELECT
@command =
      
'EXECUTE sp_refreshview '''
      
+ QUOTENAME(OBJECT_SCHEMA_NAME(@object_id)) + '.' + QUOTENAME(OBJECT_NAME(@object_id))
       +
''';';
      
  
PRINT @command;

   IF @printCommandsOnly = 0
  
BEGIN
       EXECUTE
sp_executesql @command;
  
END

   FETCH NEXT FROM view_cursor INTO @object_id;
END

CLOSE view_cursor;
DEALLOCATE view_cursor;
GO

NOTE: this won’t work where views are created WITH SCHEMABINDING. It will fail at ALTER TABLE for any table upon which schemabound views depend. Instead, to make it work, you have to DROP the views, then do the ALTERs and UPDATEs, then re-CREATE the views. Bit of a PITA but there’s no way around it unfortunately. I didn’t need to worry about this because my client doesn’t use schemabound views.

Of course it goes without saying that you should back up your database before you run any script like this!

To use it you just need to substitute the name of your database where it says _YOUR_DATABASE_NAME_ at the top of the script.

Also, As with automating many tasks in SQL Server, dynamic SQL is a necessity. It’s a bit of a pain in the backside so a @printCommandsOnly mode is advised for debugging purposes, and I’ve switched this on by default. You can copy and paste the commands into a query window, parse them, or even execute them to ensure they work as expected.

Once you’re happy this script does what you want set the value of @printCommandsOnly to 0 and rerun it to actually execute the commands it generates.

You might wonder why I’ve written this imperatively rather than in set-based fashion. Well, it’s not just because I’m a programmer rather than a DBA. In fact the original version, which you can still see if you look at the file’s history, was set-based. It looked pretty much like this:

USE _YOUR_DATABASE_NAME_
GO

-- Migrate columns NTEXT -> NVARCHAR(MAX)

DECLARE @alterColumns NVARCHAR(MAX) = '';
SELECT  @alterColumns = @alterColumns
  
+'ALTER TABLE '
  
+ QUOTENAME(OBJECT_SCHEMA_NAME(c.object_id)) + '.' + QUOTENAME(OBJECT_NAME(c.object_id))
   +
' ALTER COLUMN '
  
+ QUOTENAME(c.Name)
   +
' NVARCHAR(MAX) '
  
+ CASE WHEN c.is_nullable = 1 THEN '' ELSE 'NOT' END + ' NULL;'
  
+ CHAR(13)
   +
'UPDATE '
  
+ QUOTENAME(OBJECT_SCHEMA_NAME(c.object_id)) + '.' + QUOTENAME(OBJECT_NAME(c.object_id))
   +
' SET '
  
+ QUOTENAME(c.Name)
   +
' = '
  
+ QUOTENAME(c.Name)
   +
';' + CHAR(13) + CHAR(13)
FROM    sys.columns AS c
INNER JOIN sys.objects AS o
ON c.object_id = o.object_id
WHERE   o.type = 'U' AND c.system_type_id = 99; --NTEXT

PRINT @alterColumns;

EXECUTE sp_executesql @alterColumns;
GO

-- Update VIEW metadata

DECLARE @updateViews NVARCHAR(MAX) = '';
SELECT @updateViews = @updateViews
  
+ 'EXECUTE sp_refreshview '''
  
+ QUOTENAME(OBJECT_SCHEMA_NAME(o.object_id)) + '.' + QUOTENAME(OBJECT_NAME(o.object_id))
   +
''';' + CHAR(13)
FROM sys.objects AS o
WHERE o.type = 'V'

PRINT @updateViews;

EXECUTE sp_executesql @updateViews;
GO

It’s certainly a lot less code, which is nice. And it doesn’t use CURSORs, which is also nice.

However, it does have problems:

  • The PRINT statement in T-SQL truncates output if it goes beyond a certain length. I don’t know exactly what this length is off the top of my head, but my generated scripts were more than long enough to reach it.
  • The result of this is you can’t copy and paste the complete generated script into another query window, so it might make debugging a bit trickier in some instances.
  • The really problematic thing is that, when something goes wrong, you can’t necessarily relate it back to the exact command that failed, whereas the imperative version makes this easy since each generated command is executed individually.

So I threw away the, on the face of it, "cleverer" and more "elegant" set-based version in favour of the longer, clunkier (but easier to debug) imperative version.

I hope you find it useful and please feel free to submit patches, pull requests, bug reports, feature requests via the main project GitHub page at https://github.com/bartread/sqlscripts. (Turns out I’m building up a small library of handy scripts so I’ll be pushing a few more items into this repo in due course.)