What's the fastest way to bulk insert a lot of data in SQL Server (C# client)

Posted by Andrew on Stack Overflow See other posts from Stack Overflow or by Andrew
Published on 2008-08-23T12:53:09Z Indexed on 2010/05/24 19:01 UTC
Read the original article Hit count: 487

Filed under:

sql-server-2005

I am hitting some performance bottlenecks with my C# client inserting bulk data into a SQL Server 2005 database and I'm looking for ways in which to speed up the process.

I am already using the SqlClient.SqlBulkCopy (which is based on TDS) to speed up the data transfer across the wire which helped a lot, but I'm still looking for more.

I have a simple table that looks like this:

 CREATE TABLE [BulkData](
 [ContainerId] [int] NOT NULL,
 [BinId] [smallint] NOT NULL,
 [Sequence] [smallint] NOT NULL,
 [ItemId] [int] NOT NULL,
 [Left] [smallint] NOT NULL,
 [Top] [smallint] NOT NULL,
 [Right] [smallint] NOT NULL,
 [Bottom] [smallint] NOT NULL,
 CONSTRAINT [PKBulkData] PRIMARY KEY CLUSTERED 
 (
  [ContainerIdId] ASC,
  [BinId] ASC,
  [Sequence] ASC
))

I'm inserting data in chunks that average about 300 rows where ContainerId and BinId are constant in each chunk and the Sequence value is 0-n and the values are pre-sorted based on the primary key.

The %Disk time performance counter spends a lot of time at 100% so it is clear that disk IO is the main issue but the speeds I'm getting are several orders of magnitude below a raw file copy.

Does it help any if I:

Drop the Primary key while I am doing the inserting and recreate it later
Do inserts into a temporary table with the same schema and periodically transfer them into the main table to keep the size of the table where insertions are happening small
Anything else?

-- Based on the responses I have gotten, let me clarify a little bit:

Portman: I'm using a clustered index because when the data is all imported I will need to access data sequentially in that order. I don't particularly need the index to be there while importing the data. Is there any advantage to having a nonclustered PK index while doing the inserts as opposed to dropping the constraint entirely for import?

Chopeen: The data is being generated remotely on many other machines (my SQL server can only handle about 10 currently, but I would love to be able to add more). It's not practical to run the entire process on the local machine because it would then have to process 50 times as much input data to generate the output.

Jason: I am not doing any concurrent queries against the table during the import process, I will try dropping the primary key and see if that helps.

~ Andrew

Developer IT

What's the fastest way to bulk insert a lot of data in SQL Server (C# client) - Developer IT

What's the fastest way to bulk insert a lot of data in SQL Server (C# client)

c#

sql

sql-server

sql-server-2005

Related posts about c#

.NET WebRequest.PreAuthenticate not quite what it sounds like

HttpWebRequest and Ignoring SSL Certificate Errors

The dynamic Type in C# Simplifies COM Member Access from Visual FoxPro

Dynamic Type to do away with Reflection

Finding a Relative Path in .NET

Related posts about sql

SQL SERVER – Concat Strings in SQL Server using T-SQL – SQL in Sixty Seconds #035 – Video

SQL SERVER – Concat Function in SQL Server – SQL Concatenation

Error with SQL Server Setup 2012 on Windows 2012

How can I detect which version of SQL (eg SQL 2008 or SQL Azure)

Nested SQL Select statement fails on SQL Server 2000, ok on SQL Server 2005

Categories cloud