Tip to INSERT Large Test Data Effectively in T-SQL

This blog post wasn't planned ;-). I was doing some routine work at this late hour and needed to insert a large amount of data into a test table to test some index behaviors. I started off with SSMS, sending multiple batches. This batching technique is fine for small data sets, but for hundreds of thousands of rows it takes a lot of time, especially when you use a simple INSERT statement in a batch and execute it multiple times with GO. So I came up with another solution using a recursive CTE, which is considerably faster than the first one.

The table under investigation:

CREATE TABLE [dbo].[TestTable](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Data] [nvarchar](32) NOT NULL
)
GO

with indexes on:

CREATE NONCLUSTERED INDEX IX_Individual_Data
    ON [dbo].[TestTable] (Data);

CREATE CLUSTERED INDEX PK_Individual_Id
    ON [dbo].[TestTable] (Id);


Our first approach uses batches. This took almost 2 minutes 54 seconds on my Intel Xeon 2.00 GHz machine with 3 GB of RAM, running Windows Server 2012 64-bit.

INSERT INTO [dbo].[TestTable] (Data)
SELECT CAST('This is Test' + CAST(ABS(CHECKSUM(NEWID())) % 50 AS NVARCHAR(10)) AS NVARCHAR(32));
GO 300000


The second approach is to use a recursive CTE with MAXRECURSION set to 30,000. Be careful with MAXRECURSION: its maximum value is 32,767 (a value of 0 removes the limit). This approach still uses multiple batches (GO 10), but it is quick and effective because you have more control over the logic. The query below took 15 seconds.

WITH TestData AS
(
    -- Anchor member
    SELECT
        1 AS SampleId,
        CAST('This is Test' AS NVARCHAR(32)) AS SampleData

    UNION ALL

    -- Recursive member: append a random number from 0 to 49
    SELECT
        td.SampleId + 1,
        CAST('This is Test' + CAST(ABS(CHECKSUM(NEWID())) % 50 AS NVARCHAR(10)) AS NVARCHAR(32))
    FROM TestData td
    WHERE td.SampleId < 30000
)
INSERT INTO [dbo].[TestTable] (Data)
SELECT SampleData
FROM TestData
OPTION (MAXRECURSION 30000);

GO 10
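
As a side note, the MAXRECURSION ceiling can be avoided entirely with a single set-based insert that uses a cross join as a row source. This is only a minimal sketch of the idea, not part of my timed tests; using sys.all_objects as the row source and the 300,000-row target are my assumptions here:

```sql
-- Sketch: one set-based insert, no recursion and no GO batches.
-- Assumes the cross join of sys.all_objects with itself yields
-- at least 300,000 rows, which it does on a typical SQL Server instance.
INSERT INTO [dbo].[TestTable] (Data)
SELECT TOP (300000)
    CAST('This is Test' + CAST(ABS(CHECKSUM(NEWID())) % 50 AS NVARCHAR(10)) AS NVARCHAR(32))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
GO
```

Because this is a single statement, the optimizer can process it as one set operation instead of 30,000 recursive steps per batch.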


So, based on my personal experience, I would prefer the recursive CTE over the single-insert batch approach. I know of other approaches using XML parsing and the BCP utility, but on a quick note I did not want to spend a lot of time pushing test data, so I preferred this way. Please do let me know if I missed any other fast ways to insert random data into a table for testing purposes.
