SQL SERVER – Update Statistics are Sampled By Default

Filed under:
|
|
|
|
|
|
|
|
|
|
|
|
SQL Statistics

After reading my earlier post SQL SERVER – Create Primary Key with Specific Name when Creating Table on Statistics, I have received another question by a blog reader. The question is as follows:

Question: Are the statistics sampled by default?
Answer: Yes. The sampling rate can be specified by the user and it can be anywhere between a very low value to 100%.

Let us do a small experiment to verify if the auto update on statistics is left on. Also, let’s examine a very large table that is created and statistics by default- whether the statistics are sampled or not.

```USE [AdventureWorks] GO -- Create Table CREATE TABLE [dbo].[StatsTest]( [ID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](100) NULL, [LastName] [varchar](100) NULL, [City] [varchar](100) NULL, CONSTRAINT [PK_StatsTest] PRIMARY KEY CLUSTERED ([ID] ASC) ) ON [PRIMARY] GO -- Insert 1 Million Rows INSERT INTO [dbo].[StatsTest] (FirstName,LastName,City) SELECT TOP 1000000 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Update the statistics UPDATE STATISTICS [dbo].[StatsTest] GO -- Shows the statistics DBCC SHOW_STATISTICS ("StatsTest"PK_StatsTest) GO -- Clean up DROP TABLE [dbo].[StatsTest] GO ```

Now let us observe the result of the `DBCC SHOW_STATISTICS`.

The result shows that Resultset is for sure sampling for a large dataset. The percentage of sampling is based on data distribution as well as the kind of data in the table. Before dropping the table, let us check first the size of the table. The size of the table is 35 MB.

Now, let us run the above code with lesser number of the rows.

```USE [AdventureWorks] GO -- Create Table CREATE TABLE [dbo].[StatsTest]( [ID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](100) NULL, [LastName] [varchar](100) NULL, [City] [varchar](100) NULL, CONSTRAINT [PK_StatsTest] PRIMARY KEY CLUSTERED ([ID] ASC) ) ON [PRIMARY] GO -- Insert 1 Hundred Thousand Rows INSERT INTO [dbo].[StatsTest] (FirstName,LastName,City) SELECT TOP 100000 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Update the statistics UPDATE STATISTICS [dbo].[StatsTest] GO -- Shows the statistics DBCC SHOW_STATISTICS ("StatsTest"PK_StatsTest) GO -- Clean up DROP TABLE [dbo].[StatsTest] GO ```

You can see that Rows Sampled is just the same as Rows of the table. In this case, the sample rate is 100%.

Before dropping the table, let us also check the size of the table. The size of the table is less than 4 MB.

Let us compare the Result set just for a valid reference.

Test 1: Total Rows: 1000000, Rows Sampled: 255420, Size of the Table: 35.516 MB

Test 2: Total Rows: 100000, Rows Sampled: 100000, Size of the Table: 3.555 MB

The reason behind the sample in the Test1 is that the data space is larger than 8 MB, and therefore it uses more than 1024 data pages. If the data space is smaller than 8 MB and uses less than 1024 data pages, then the sampling does not happen. Sampling aids in reducing excessive data scan; however, sometimes it reduces the accuracy of the data as well.

Please note that this is just a sample test and there is no way it can be claimed as a benchmark test. The result can be dissimilar on different machines. There are lots of other information can be included when talking about this subject. I will write detail post covering all the subject very soon.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

Filed under: Pinal Dave, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Statistics

© SQL Authority or respective owner

• SQLAuthority News – Pinal Dave: Blogger, MVP and now Interviewee by Michael J Swart

as seen on SQL Authority - Search for 'SQL Authority'
Michael J. Swart is a very unique person. I have often exchanged emails with him and also used a couple of his scripts in my presentations (with his permission). Every time I conduct spatial database presentation, I always start with his script where he has drawn the wonderful image of Botticelli’s… >>> More

• SQLAuthority News – Uncut and Unedited Video Interview of Pinal Dave

as seen on SQL Authority - Search for 'SQL Authority'
Earlier this year Lohith (@kashyapa) from Bangalore took my ‘Uncut and Unedited’ video interview. It was really fun to answer his questions as it was very different from regular interview. He asked few personal details few technical details and made me show few secrets. I think if you… >>> More

• There seems to be some 'lingering' SSH connections on my server. How do I fix it?

as seen on Server Fault - Search for 'Server Fault'
[[email protected] mike]# w 14:43:35 up 83 days, 1:25, 1 user, load average: 0.00, 0.00, 0.00 USER TTY FROM [email protected] IDLE JCPU PCPU WHAT mike pts/1 dsl-IP.w 14:43 0.00s 0.01s 0.03s sshd: mike [priv] [[email protected] mike]# ps aux | grep ssh root 1350 0.0 0.1… >>> More

• SQLAuthority News – DotNET Challenge of Sorting Generic List

as seen on SQL Authority - Search for 'SQL Authority'
This is a quick announcement of .NET challenge posted by Nupur Dave. She has asked very interesting question. If you are interested in learning .NET and winning iPAD by Red-Gate. I strongly suggest that all of you should attempt the quiz. Here is the question: How to insert an item in sorted generic… >>> More

• SQLAuthority News – SQL Server Technology Evangelists and Evangelism

as seen on SQL Authority - Search for 'SQL Authority'
This is the exact conversation that I had with three people during the recent SQL Server Public Training. Person 1: “Are you an SQL Server Evangelist?” Pinal : “No, but Vinod Kumar is.” Person 1: “Who are you?” Person 2: “He is Pinal, haha!” Person 1: “I… >>> More

• SQL SERVER – Concat Strings in SQL Server using T-SQL – SQL in Sixty Seconds #035 – Video

as seen on SQL Authority - Search for 'SQL Authority'
Concatenating  string is one of the most common tasks in SQL Server and every developer has to come across it. We have to concat the string when we have to see the display full name of the person by first name and last name. In this video we will see various methods to concatenate the strings. SQL… >>> More

• SQL SERVER – Concat Function in SQL Server – SQL Concatenation

as seen on SQL Authority - Search for 'SQL Authority'
Earlier this week, I was delivering Advanced BI training on the subject of “SQL Server 2008 R2″. I had great time delivering the session. During the session, we talked about SQL Server 2010 Denali. Suddenly one of the attendees suggested his displeasure for the product. He said, even though… >>> More

• Error with SQL Server Setup 2012 on Windows 2012

as seen on Server Fault - Search for 'Server Fault'
I am trying to install SQL Server on Windows 2012. I was able to finally get the wizard up and running after making some changes on the server, but now it fails no matter what I do with the following error: TITLE: SQL Server Setup failure. SQL Server Setup has encountered the following error: … >>> More

• Nested SQL Select statement fails on SQL Server 2000, ok on SQL Server 2005

as seen on Stack Overflow - Search for 'Stack Overflow'