When statistics was updated?

Statistics are used by the SQL Server optimizer to choose the efficient plan.  When we don’t have up to date statistics
it may end with SQL server optimizer choosing inefficient query plan. We can use below query to identify when SQL Server
statistics were last updated.

select db_id() as dbid,
  case
    when indid IN (0, 1) then convert (char (12), rows)
    else (select rows from dbo.sysindexes i2 where i2.id =  i.id and i2.indid in (0,1)) -- ''-''
  end as rowcnt,
  case
    when indid IN (0, 1) then rowmodctr
    else convert (bigint, rowmodctr) + (select rowmodctr from dbo.sysindexes i2 where i2.id =  i.id and i2.indid in (0,1))
  end as row_mods,
  case rows when 0 then 0 else convert (bigint,
    case
      when indid IN (0, 1) then convert (bigint, rowmodctr)
      else rowmodctr + (select convert (bigint, rowmodctr) from dbo.sysindexes i2 where i2.id =  i.id and i2.indid in (0,1))
    end / convert (numeric (20,2), (select case convert (bigint, rows) when 0 then 0.01 else rows end from dbo.sysindexes i2 where i2.id =  i.id and i2.indid in (0,1))) * 100)
  end as pct_mod,
  convert (nvarchar, u.name + '.' + o.name) as objname,
  case when i.status&0x800040=0x800040 then 'AUTOSTATS'
    when i.status&0x40=0x40 and i.status&0x800000=0 then 'STATS'
    else 'INDEX' end as type,
  convert (nvarchar, i.name) as idxname, i.indid,
  stats_date (o.id, i.indid) as stats_updated,
  case i.status & 0x1000000 when 0 then 'no' else '*YES*' end as norecompute,
  o.id as objid , rowcnt, i.status
from dbo.sysobjects o, dbo.sysindexes i, dbo.sysusers u
where o.id = i.id and o.uid = u.uid and i.indid between 1 and 254 and o.type = 'U'
order by pct_mod desc, convert (nvarchar, u.name + '.' + o.name), indid
GO

Advertisements

Linked server performance might be affected because of Remote Scan

In linked server queries when you use filtered query (Where clause) filtering may not be remoted, entire data is fetched locally from remote server and filtering happens locally hence will cause severe performance impact. You will find “Remote Scan”  operator in plan in this case. When filtering is remote you will see “Remote query” operator instead of remote scan.
 
How to fix it?
DynamicParameters and NestedQueries in the provider you are using should play a role here (Object exploreràServer objects à Linked servers  àProviders à Choose the provider you are using in linked server (Ex: SQLOLEDB) and enable  DynamicParameters and NestedQueries ). DynamicParameters setting should enable providers to support parameterized queries. If you enable it probably “remote scan” would switch to “remote query” and filtering will be happen on remote server instead of fetching the data locally and filtering.

SQLServer Query optimization

Do you have a slow SQLServer query and wanted to tune it… Tuning slow queries in SQL Server and making them run faster is not complex. Follow the query tunning steps in http://mssqlwiki.com/2012/11/06/tuning-sql-server-query/.

Below is extract from above article. If you would like to read the complete steps read above article.

Before you start troubleshooting the query which is running for longer duration, Identify if the query is slow because it is long waiting (or) Long running (or) Long compiling.

Compile time: Time taken to compile the query. compile time can be identified by looking at the

1. CompileTime=”n” in XML plan

2. SQL Server parse and compile time when Set statistics time on is enabled.

CPU time: Time taken by the query in CPU (Execution time – (compile time+ wait time). CPU time can be identified by looking at the

1. CPU column in profiler.

2. CPU time under SQL Server Execution Times when statistics time on is enabled.

Execution time: Time taken by the query for complete execution ( Execution time +CPU time +Wait time). Total duration of the query can be identified by using the

1.Duration column in profiler

2. SQL Server Execution Times, elapsed times when statistics time on is enabled.

What is long waiting query?

A query is considered to be long waiting query, when it spend most of its time waiting for some resource.

How to identify if the query is long waiting?

Long running query can be identified by comparing the CPU and duration column in profiler (or) CPU and elapsed time when statistics time on is set .

When a query is waiting for a resource (such as lock, network I/O, Page_I/O Etc) it will not consume CPU. So if you see duration being higher than CPU (Difference between Duration and CPU is wait time),It indicates that the query has spent large amount of time waiting for some resource.

Let us see an example of long waiting query. I have collected profiler trace while executing the query.

set statistics io on

set statistics time on

go

–Place your query here

select top 10000 * from a

go

set statistics io off

set statistics time off

go

Look at the Duration and CPU column in the profiler Cpu=256 and duration =1920. So this query has spent majority of time waiting for some resource.

Look at the output of statistics time and statistics I/O in above image.

SQL Server has spent only 2 milliseconds compiling the query and 256 milliseconds on CPU, but the overall duration was 1920 milliseconds so the query has spent maximum time waiting for some resource.

Identify the resource in which this query is waiting on using one of the steps listed below.

1. Look at the wait type column of the sysprocesses for the spid which is executing query while the query is executing.

2. If there is no other activity on the server collect sys.dm_os_wait_stats output before and after the query execution and identify the wait (Will not help in tuning queries running for short duration)

3. Collect XEvent to gather the wait stats of individual query.

Once you identify the resource in which the query is waiting on tune the resource. Most of the times queries would be slow waiting for below resource.

PAGEIOLATCH_* or Write log: This indicates I/O resource bottleneck follow the detailed troubleshooting steps mentioned in This Link to fix the I/O bottleneck. If you find SQL Server spawning excessive I/O Create necessary indexes.

a. Logical reads + Physical reads in statistics I/O output (Refer above image) or Reads and writes in profiler will indicate the I/O posted by this query. If you see very high reads for query compared with the result rest retuned by query it is an indication of missing indexes or bad plan. Create necessary indexes (You can use DTA for index recommendations.).

PAGELATCH_*: This waittype in sysprocesses indicates that SQL Server is waiting on access to a database page, but the page is not undergoing physical IO.

a. This problem is normally caused by a large number of sessions attempting to access the same physical page at the same time. We should Look at the wait resource of the spid The wait_resource is the page number (the format is dbid:file:pageno) that is being accessed.

b. We can use DBCC PAGE to identify object or type of the page in which we have the contention. Also it will help us to determine whether contention is for allocation, data or text.

c. If the pages that SQL Server is most frequently waiting on are in Tempdb database ,check the wait resource column for a page number in dbid 2 Ex(2:1:1 or 1:1:2). Enable TF 1118 and increase the number of TEMPDB data files and size them equally (You may be facing tempdb llocation latch contention mentioned in http://support.microsoft.com/kb/328551)

d. If the page is in a user database, check to see if the table has a clustered index on a monotonic key such as an identity where all threads are contending for the same page at the end of the table. In this case we need to choose a different clustered index key to spread the work across different pages.

LATCH_*: Non-buf latch waits can be caused by variety of things. We can use the wait resource column in sysprocesses to determine the type of latch involved(KB 822101).

a. A very common LATCH_EX wait is due to running a profiler trace or sp_trace_getdata Refer KB 929728 for more information.

b. Auto Grow and auto shrink while query is executed.

c. Queries going for excessive parallelism.

Blocking (LCK*): Use the query in This Link to identify the blocking. Tune the head blocker.

Asynch_network_io (or) network IO: Keep the result set returned by the query smaller. Follow detailed troubleshooting refer This Link

Resource_semaphore waits: Make sure there is no memory pressure on the server Follow steps in This Link for detailed troubleshooting.

SQL Trace: Stop all the profiler traces running on the server. Identify the traces which are running on the server using the query in This Link

Cx packet: Set the Max degree of parallelism. But remember Cxpacket wait type is not always a problem.

a. For servers that have eight or less processors, use the following configuration where N equals the number of processors: max degree of parallelism = 0 to N .

b. For servers that use more than eight processors, use the following configuration: max degree of parallelism = 8.Refer This Link

SOS_SCHEDULER_YIELD : Identify if there is CPU bottleneck on the server. This waiting means that the thread is waiting for CPU.

a. SQL Server worker thread’s Quantum target is 4ms which means the thread(worker) Will ( is expected to) yield back to SQL Server scheduler when it exceeds 4ms and before it yields back it check if there are any other runnable threads, If there is any runnable threads then the thread which is in top of runnable list is scheduled and current thread will go to the tail of the runnable list and will get rescheduled when the other threads which are already waiting in SOS Scheduler (runnable list) finishes its execution or quantum. The time thread spends in runnable list waiting for its quantum is accounted as SOS_SCHEDULER_YIELD. You will see this type when multiple threads are waiting to get CPU cycle. Follow trouble shooting the steps mentioned This Link

Important: In SQL Server instances when there more than 1 CPU it is possible that the CPU is higher than the duration. Because CPU is sum of time spend by query in all the CPU’s when choosing a parallel whereas the duration is actual duration of the query.

What is long running query?

A query is considered to be long running query, when it spend most of its time on CPU and not waiting for some resource.

How to identify if the query is long running ?

Long running query can be identified by comparing the CPU and duration column in profiler (or) CPU and elapsed time when statistics time on is set . If the CPU and duration is close than the query is considered to be long running. If the query is long running identify where the query spend the time ,It could be for compiling or post compilation (For executing the query). compare the duration of the query with CompileTime (XML plan compile time (or) SQL Server parse and compile time when statistics time is on refer above image).

High Compile time:

Compare the duration of the query with Compile Time (XML plan compile time (or) SQL Server parse and compile time when statistics time is on).Compile time will normally be in few millisecond . Follow the below steps if you see high compile time

1. Identify if you have large token perm refer http://support.microsoft.com/kb/927396

2. Create necessary indexes and stats. Tune the query manually (or) in DTA and apply the recommendation

3. Reduce the complexity of query. Query which joins multiple tables (or) having large number of IN clause can taking a while to compile.

4. You can reduce the compile’s by using force parameterization option.

High CPU time:

Compare the duration of the query with Compile Time (XML plan compile time (or) SQL Server parse and compile time when statistics time is on). If the compile time is very low compared to the duration. Then follow the below steps.

1. Update the stats of tables and indexes used by the query (If the stats are up to date Estimated rows and estimated execution will be approximately same in execution plan .If there is huge difference stats are out dated and requires update) .

2. Identify if the query has used bad plan because of parameter sniffing (If the ParameterCompiledValue and ParameterRuntimeValue is different in XML plan). Refer THIS LINK to know more about Parameter Sniffing

3. If updating the stats and fixing the parameter sniffing doesn’t resolve the issue it is more likely optimizer is not able to create efficient plan because of lack of indexes and correct statistics. Run the query which is driving the CPU in database tuning advisor and apply the recommendations. (You will find missing index detail in xml plan but DTA is more efficient).

4. If the query which is running longer and consuming CPU is linked server query try changing the security of linked server to ensure linked server user has ddl_admin or dba/sysadmin on the remote server. More details regarding the issue in THIS LINK.

5. Ensure optimizer is not aborting early and creating bad plan. For details refer THIS LINK

6. Ensure the query which is spiking the CPU doesn’t have plan guides (xml plan will have PlanGuideDB attribute. Also sys.plan_guides will have entries) and query hints(index= or (option XXX join) or inner (Join Hint) join).

7. Ensure that SET options are not changed.

CACHESTORE_SQLCP will increase when you run backup

CACHESTORE_SQLCP increases continuously when we run the backup.
When you run the database/Tlog backup very frequently or for the databases with large number of files/file groups CACHESTORE_SQLCP increases
We can identify the issue by using below queries

Select text,a.*from sys.dm_exec_cached_plans a cross apply sys.dm_exec_sql_text(plan_handle)  order by size_in_bytes desc
Go

Select p.plan_handle,CONVERT (varchar, GETDATE(), 126) AS runtime, LEFT (p.cacheobjtype + ' (' + p.objtype + ')', 35) AS cacheobjtype
,p.usecounts, p.size_in_bytes / 1024 AS size_in_kb, stat.total_worker_time/1000 AS tot_cpu_ms,
stat.total_elapsed_time/1000 AS tot_duration_ms, stat.total_physical_reads, stat.total_logical_writes, stat.total_logical_reads,
LEFT (CASE WHEN pa.value=32767 THEN 'ResourceDb' ELSE ISNULL (DB_NAME (CONVERT (sysname, pa.value)),
CONVERT (sysname,pa.value))END, 40) AS dbname,sql.objectid, CONVERT (nvarchar(50),
CASE WHEN sql.objectid IS NULL THEN NULL ELSE REPLACE (REPLACE (sql.[text],CHAR(13), ' '), CHAR(10), '')END) AS procname,
REPLACE (REPLACE (SUBSTRING (sql.[text], stat.statement_start_offset/2 + 1, CASE WHEN stat.statement_end_offset = -1
THEN LEN (CONVERT(nvarchar(max), sql.[text]))
ELSE stat.statement_end_offset/2 - stat.statement_start_offset/2 + 1 END), CHAR(13), ' '), CHAR(10), ' ') AS stmt_text
FROM sys.dm_exec_cached_plans p OUTER APPLY sys.dm_exec_plan_attributes (p.plan_handle) pa INNER JOIN sys.dm_exec_query_stats stat
ON p.plan_handle = stat.plan_handle OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) AS sql
WHERE pa.attribute = 'dbid' ORDER BY p.plan_handle DESC

Go

 
Resolution

SQL Server 2008
There is a fix for this issue in SQL server 2008 http://support.microsoft.com/kb/961323

SQL Server2005
There is no Fix for SQL Server 2005. You can follow some workarounds
Since all the plans (problematic) are created for MSDB. You can use dbcc flushprocindb(4) —  to flush the plans of msdb without touching the plans of other databases (4 is database id of MSDB) .
You can add a new step to backup job created by maintenance plan to run dbcc flushprocindb(4) every time backup completes. So the plans created in MSDB will be flushed immediately after the backup.

dbcc dbredindex/Alter index (or) update stats which should run first

dbcc dbredindex/Alter index (or) update stats which should run first?

Statistics created manually or by auto create stats will not be updated by ALTER INDEX ALL ON table.

If you run dbcc dbredindex on a table , it will update stats on index indx with 100% sampling, but the stats created by auto create stats and manually created stats are updated with default sampling percentage.

Now choose your self.

 

Update statistics with full scan or sample?

Should I choose full scan while I update the statistics?

In an ideal situation, if data is 100% random and 100% even distributed, 10% sampling should be as accurate as full scan. But any uneven distribution of data can cause inaccuracy both with sampling and even with full scan.
In short, with skewed data, update stats with sampling can be less accurate than full scan ,but important fact is full scan will also not be accurate.

UPDATE STATISTICS WITH SAMPLE take longer then FULLSCAN

UPDATE STATISTICS WITH SAMPLE take longer then FULLSCAN and consumes more tempdb

Recently there was a question in MSDN forum about time and tempdb taken by update statistics with full scan and sampling.
http://social.msdn.microsoft.com/Forums/en/sqldatabaseengine/thread/eee70106-94aa-4140-82bf-654352f82d8f
Question was why full scan completes faster than sampling

I was trying to reproduce this behavior in one of my lab server.

1.When I was updating the statistics with n% sampling for the Clustered Index on Table(Which had clustered and non-clustered index ), SQL Server choses a non-clustered index to do the scan. Here the data scanned/fetched was in the order of the non-clustered index keys. Hence we need to SORT them once based on the clustered index keys to generate the statistics. Sort was consuming time and we also used tempdb to create the worktables.

2 .On the other side when I was using update stats with fullscan SQL Server chooses a “clustered index scan” to do the scan. Since the data was already sorted in clustered index, sort was avoided and update stats completed faster.

Why optimizer used non-clustered index to do the scan when I was using n% sampling ?
As we are using a non-clustered index to scan the rows, we effectively get lot lesser rows into the memory from disk when compared with using a clustered index to scan the rows. The decision about which index to use was cost based. As we know SQL Server is a cost based optimizer it chooses the index that can provide the results with lesser subtree cost.
If I wouldn’t have got that non-clustered index , then sampling might have run faster.
There is no straight answer “It depends“.You have to look at the plan of update stats and troubleshoot/Identify the cause like you do for any other slow queries.