This blog post is not planned ;-). I am going through some routine work related at this late hour and I need to insert large amount of data into a test table to test some behaviors of indexes. Well I got started off with SSMS and sending multiple batches. This technique of batches is good for small data, but for large data of 100,000’s rows they are going to take lot of time, especially when you use simple INSERT Statements in a batch and execute it multiple times with GO. So I came up with one more solution using Recursive CTE which is effectively fast than the previous one.
Table under investigation -
CREATE TABLE [dbo].[TestTable]( [Id] [int] IDENTITY(1,1) NOT NULL, [Data] [nvarchar](32) NOT NULL ) GO
with Indexes on -
CREATE NONCLUSTERED INDEX IX_Individual_Data ON [dbo].[TestTable] (Data); CREATE CLUSTERED INDEX PK_Individual_Id ON [dbo].[TestTable] (Id);
Now our first approach is with batches – This took me almost 2 mins 54 secs on my Intel Xeon, 2.00Gz and 3GB Windows Server 2012 64 bit machine.
INSERT INTO [dbo].[TestTable] (Data) SELECT CAST('This is Test' + CAST(ABS(CHECKSUM(NEWID())) % 50 AS NVARCHAR) AS NVARCHAR(32)) GO 300000
Second approach is to use a recursive CTE with MAXRECURSION set to 30000. Be careful while using MAXRECURSION, its maximum value is 32,767. Even in this approach we use multiple batches (GO 10), but this approach makes it quick and effective as you have more control on logic you want to have. Below query took me 15 secs.
WITH TestData AS ( -- Anchor Query SELECT 1 As SampleId, CAST('This is Test' AS NVARCHAR(32)) AS SampleData -- Union All UNION ALL -- Recursive Query SELECT td.SampleId + 1, CAST('This is Test' + CAST(ABS(CHECKSUM(NEWID())) % 50 AS NVARCHAR) AS NVARCHAR(32)) FROM TestData td WHERE td.SampleId < 30000 ) INSERT INTO [dbo].[TestTable] (Data) SELECT SampleData FROM TestData OPTION (MAXRECURSION 30000); GO 10
So based on my personal experiences, I would prefer to use CTE rather than Single insert batch approach. Please do let me know if there are any other quick approaches to insert random data into a table for testing purpose. I know other approaches using XML Parsing and BCP Utility, but on a quick note I do not want to spend lot of time in pushing test data,. so I preferred this way. Do please let me know if I missed any other fast methodologies.