SQL Server assert in Location: purecall.cpp:51

SQL Server assert in purecall.cpp:51

BEGIN STACK DUMP:

spid 231

Location: purecall.cpp:51

Expression: !”purecall”

SPID: 200

Process ID: 5125

Description: Pure virtual function call

Server Error: 17065, Severity: 16, State: 1.

Server SQL Server Assertion: File: <purecall.cpp>, line = 51 Failed Assertion = ‘!”purecall”‘ Pure virtual function call. This error may be timing-related. If the error persists after rerunning the statement, use DBCC CHECKDB to check the database for structural integrity, or restart the server to ensure in-memory data structures are not corrupted.

Possible causes for above assert are

1. Antivirus softwares which detours in sqlserver address space can inject their instruction in sqlserver modules and can cause this Ex. Sophos etc..

Run select * from sys.dm_os_loaded_modules and check if there are DLL’d loaded from Antivirus (Company column will have the AV company name). If you see any antivrus exclude SQLServer from them.

(or)

Run lm command in the dump and see if there are any Antivirus DLL’s loaded in sqlserver process memory.

2. If you don’t see any Antivirus dll then run windows memory diagnostic tool and check if there are any memory problems on your system( %windir%\system32\MdSched.exe).

3. If there is no antivirus or memory errors follow the steps in http://mssqlwiki.com/2012/10/16/sql-server-exception_access_violation-and-sql-server-assertion/

Trace waits in SQLServer (SQLTRACE_BUFFER_FLUSH,TRACEWRITE,SQLTRACE_WAIT_ENTRIES,SQLTRACE_LOCK)

When you run Profiler trace from client systems or on server with large number of events you will see below wait types.

SQLTRACE_WAIT_ENTRIES
SQLTRACE_LOCK
SQLTRACE_BUFFER_FLUSH
TRACEWRITE

There is no way to completely avoid this wait type without stopping all the traces. We can reduce this waitypes by configuring server side trace instead of client side trace.

select * from sys.traces will give you information about all the traces. ( Status column 0 stopped and 1 active)

For the traces collected using profiler you will find a NULL Path. Profiler traces can cause large number of above waittypes.

Thread which raises the trace event is responsible to Get buffers to write event and write event. So collecting the trace on network share or on slow disk or using profiler can slow down the trace write and make the threads wait, sometimes we may also end with dead locked schedulers.

So ideally you have to avoid running profiler when you see below waits and use sp_trace_create if you like to capture the trace and pass the local path for tracefile parameter.

How to view the Space used by each table in database

There are situations in which you may have to list the space used by each table in the database. You can use the query below to get the space used by individual tables.
–Space used by the database

sp_spaceused

–Space used by individual tables in the database

SELECT
(row_number() over(order by a3.name, a2.name)) as SNO,
a3.name AS [schemaname],
a2.name AS [tablename],
a1.rows as row_count,
(a1.reserved + ISNULL(a4.reserved,0))* 8 AS reserved, 
a1.data * 8 AS data,
(CASE WHEN (a1.used + ISNULL(a4.used,0)) > a1.data THEN 
(a1.used + ISNULL(a4.used,0)) - a1.data ELSE 0 END) * 8 AS index_size,
(CASE WHEN (a1.reserved + ISNULL(a4.reserved,0)) > a1.used THEN 
(a1.reserved + ISNULL(a4.reserved,0)) - a1.used ELSE 0 END) * 8 AS unused
FROM
(SELECT 
ps.object_id,
SUM (
CASE
WHEN (ps.index_id < 2) THEN row_count
ELSE 0
END
) AS [rows],
SUM (ps.reserved_page_count) AS reserved,
SUM (
CASE
WHEN (ps.index_id < 2) THEN 
(ps.in_row_data_page_count + ps.lob_used_page_count + 
ps.row_overflow_used_page_count)
ELSE (ps.lob_used_page_count + ps.row_overflow_used_page_count)
END
) AS data,
SUM (ps.used_page_count) AS used
FROM sys.dm_db_partition_stats ps
GROUP BY ps.object_id) AS a1
LEFT OUTER JOIN 
(SELECT 
it.parent_id,
SUM(ps.reserved_page_count) AS reserved,
SUM(ps.used_page_count) AS used
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.internal_tables it ON (it.object_id = ps.object_id)
WHERE it.internal_type IN (202,204)
GROUP BY it.parent_id) AS a4 ON (a4.parent_id = a1.object_id)
INNER JOIN sys.all_objects a2 ON ( a1.object_id = a2.object_id ) 
INNER JOIN sys.schemas a3 ON (a2.schema_id = a3.schema_id)
WHERE a2.type <> 'S' and a2.type <> 'IT'
ORDER BY a3.name, a2.name

FILESTREAM feature is disabled

Restore database (or) SQL Server setup fails in script upgrade mode with below error
{
Msg 5591, Level 16, State 4, Line 2
FILESTREAM feature is disabled.
Msg 3013, Level 16, State 1, Line 2
RESTORE DATABASE is terminating abnormally.
}

Follow the steps in http://msdn.microsoft.com/en-us/library/cc645923.aspx and enable File stream, then restore the database.

If the upgrade for SQL Server is failing with error “ESTREAM feature is disabled” then enable file stream in configuration manager by following above article and start SQL Server by following steps mentioned in

 SQL Server2008/SQL Server2012: Script level upgrade for database ‘master’ failed  and run below statement to change file stream access level

sp_configure filestream_access_level, 2
RECONFIGURE with override

After you executed the above statement start the SQL Server normally.

XP_readerrorlog fails with Failed to open loopback connection.

We might get below error when we open sqlserver errorlog in SSMS or using xp_readerrorlog or sp_readerrorlog

{

Msg 22004, Level 16, State 1, Line 0

Failed to open loopback connection. Please see event log for more information.

Msg 22004, Level 16, State 1, Line 0

error log location not found

}

1. If you get this error in Clustered instance of SQL Server then disable “shared memory protocol” in client configuration.

2. Check if you have wrong alias created and fix them.

3. UAC can also cause this (You might see following message in sqlserver errorlog “Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors”).when you run SSMS run it in elevated mode (right click SSMS–>Run as administrator).

4. OOM condition in sqlsever can also cause this error

5. Executing xp_readerrorlog from DAC connection can also cause this error.

TCP Provider: The semaphore timeout period has expired

TCP Provider: The semaphore timeout period has expired error from SQL Server agent and other applications at times.

1. Disable TCP Chimney.Refer KB:942861

2. If you are in windows 2003 Change the value of the processor affinity to match the number of processors in the system.Follow KB:892100

{
1.Click Start, click Run, type regedit, and then click OK.
2.Expand the following registry subkey:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\NDIS\Parameters
3.Right-click ProcessorAffinityMask, and then click Modify.
4.In the Value data box, type one of the following values, and then click OK:
◦If you have two processors, use the binary value 0b11, or hex value 0x3.
◦If you have three processors, use the binary value 0b111, or hex value 0x7.
◦If you have four processors, use the binary value 0b1111, or hex value 0xF.
5.Quit Registry Editor.
Note The 0x0 or 0xFFFFFFFF values are used to disable the ProcessorAffinityMask entry.
}

3. Check if priority boost is enabled for SQL Server. If yes disable it.

4. Make sure there is no working set trim and system wide memory pressure. You can use second query in significant part of sql server process memory has been paged out to identify and follow the same blog to fix it)

5. Check if paged pool and non-paged is empty. (Event ID:  2019  in event log)

6. If you see this problem in cluster make sure you have set the network priority of “private heart beat” network higher than the “public” network.Refer KB:258750

 

Unable to connect to Microsoft Distributed Transaction Coordinator (MS DTC) to check the completion status of transaction.

An error occurred while recovering database. Unable to connect to Microsoft Distributed Transaction Coordinator (MS DTC) to check the completion status of transaction.

You might get below errors when you try to restore database from SAN snapshot and see a Orphan Msdtc transaction Spid -2

    Error

Msg 6110, Level 16, State 1, Line 1
The distributed transaction with UOW {00000000-0000-0000-0000-000000000000} does not exist.

An error occurred while recovering database Unable to connect to Microsoft Distributed Transaction Coordinator (MS DTC) to check the completion status of transaction (0:12345). Fix MS DTC, and run recovery again.

    Resolution

Change the sp_configure option
sp_configure ‘in-doubt xact resolution’,2
–Presume abort. Any MS DTC in-doubt transactions are presumed to have aborted.
Go
DBCC dbrecover(‘Dbname’);

Addition details on above sp_configure option in http://msdn.microsoft.com/en-us/library/ms179586(v=sql.90).aspx

Note: Use above option with caution

SQL Server integration services fails to start.

If SQL Server integration services fails to start. Create a backup of DTExec.exe.config and MsDtsSrvr.exe.config file  in   “C:\Program Files\Microsoft SQL Server\90\DTS\Binn” and add  <generatePublisherEvidence enabled=”false”/> under runtime tag and the start the SSIS service.

 

 

Here is an example DTExec.exe.config file for SSIS 2005:
<configuration>
<startup>
<requiredRuntime version=”v2.0.50727″/>
</startup>
<runtime>
<generatePublisherEvidence enabled=”false”/>
</runtime>
</configuration>

 

Here is an example msdtssrvr.exe.config file for SSIS 2005:
<?xml version=”1.0″ encoding=”utf-8″ ?>
<configuration>
  <runtime>
    <gcServer enabled=”true”/>
    <generatePublisherEvidence enabled=”false”/>
  </runtime>
  <system.diagnostics>
    <switches>
      <add name=”TraceClientConnections” value=”0″ />
      <add name=”TraceManagementCalls” value=”0″ />
      <add name=”ServerTraceSwitch” value=”0″ />
    </switches>
  </system.diagnostics>
</configuration>

More details about the above tab is mentioned in http://blogs.msdn.com/b/amolravande/archive/2008/07/20/startup-performance-disable-the-generatepublisherevidence-property.aspx

Query to find SQL Server CPU utilization 2012.

Query to find SQL Server CPU utilization 2012. This query will help you to find the CPU utilization of server and SQL. If like to know the steps to bring down the SQL Server CPU utilization follow http://mssqlwiki.com/2012/10/04/troubleshooting-sql-server-high-cpu-usage/

 	
	declare @ms_now bigint
	select @ms_now = ms_ticks from sys.dm_os_sys_info;
select top 15 record_id,
		dateadd(ms, -1 * (@ms_now - [timestamp]), GetDate()) as EventTime, 
		SQLProcessUtilization,
		SystemIdle,
		100 - SystemIdle - SQLProcessUtilization as OtherProcessUtilization
	from (
		select 
			record.value('(./Record/@id)[1]', 'int') as record_id,
			record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') as SystemIdle,
			record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') as SQLProcessUtilization,
			timestamp
		from (
			select timestamp, convert(xml, record) as record 
			from sys.dm_os_ring_buffers 
			where ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
			and record like '%<SystemHealth>%') as x
		) as y 
	order by record_id desc

How to view xml plan as graphical execution plan

Do you want to look XML execution plan as graphical execution plan?

It is simple save the xml plan with .sqlplan extension and then open with management studio.

Similarly to view graphical plan as XML execution plan. Save the file with .XML extension or rename the existing graphical plan with .xml extension

Top queries by total I/O and average I/O in SQL Server

You can use the below queries to identify the top queries by total I/O and average I/O in SQL Server

–Top queries by total I/O


exec sp_executesql @stmt=N'begin try
select top 10 rank() over (order by total_logical_reads+total_logical_writes desc,sql_handle,statement_start_offset ) as row_no
, (rank() over (order by total_logical_reads+total_logical_writes desc,sql_handle,statement_start_offset ))%2 as l1
, creation_time
, last_execution_time
, (total_worker_time+0.0)/1000 as total_worker_time
, (total_worker_time+0.0)/(execution_count*1000) as [AvgCPUTime]
, total_logical_reads as [LogicalReads]
, total_logical_writes as [LogicalWrites]
, execution_count
, total_logical_reads+total_logical_writes as [AggIO]
, (total_logical_reads+total_logical_writes)/(execution_count+0.0) as [AvgIO]
, case when sql_handle IS NULL
then '' ''
else ( substring(st.text,(qs.statement_start_offset+2)/2,(case when qs.statement_end_offset = -1 then len(convert(nvarchar(MAX),st.text))*2 else qs.statement_end_offset end - qs.statement_start_offset) /2 ) )
end as query_text
, db_name(st.dbid) as database_name
, st.objectid as object_id
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(sql_handle) st
where total_logical_reads+total_logical_writes > 0
order by [AggIO] desc
end try
begin catch
select -100 AS row_no
, 1 AS l1, 1 AS creation_time, 1 AS last_execution_time, 1 AS total_worker_time, 1 AS Avg_CPU_Time, 1 AS logicalReads, 1 AS LogicalWrites
, ERROR_NUMBER() AS execution_count
, ERROR_SEVERITY() AS AggIO
, ERROR_STATE() AS AvgIO
, ERROR_MESSAGE() AS query_text
end catch',@params=N''

–Top queries by Average I/O


exec sp_executesql @stmt=N'begin try
select top 10 rank() over (order by (total_logical_reads+total_logical_writes)/(execution_count+0.0) desc,sql_handle,statement_start_offset ) as row_no
, (rank() over (order by (total_logical_reads+total_logical_writes)/(execution_count+0.0) desc,sql_handle,statement_start_offset ))%2 as l1
, creation_time
, last_execution_time
, (total_worker_time+0.0)/1000 as total_worker_time
, (total_worker_time+0.0)/(execution_count*1000) as [AvgCPUTime]
, total_logical_reads as [LogicalReads]
, total_logical_writes as [LogicalWrites]
, execution_count
, total_logical_reads+total_logical_writes as [AggIO]
, (total_logical_reads+total_logical_writes+0.0)/execution_count as [AvgIO]
, case when sql_handle IS NULL
then '' ''
else ( substring(st.text,(qs.statement_start_offset+2)/2,(case when qs.statement_end_offset = -1 then len(convert(nvarchar(MAX),st.text))*2 else qs.statement_end_offset end - qs.statement_start_offset) /2 ))
end as query_text
, db_name(st.dbid) as database_name
, st.objectid as object_id
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(sql_handle) st
where (total_logical_reads+total_logical_writes ) > 0
order by [AvgIO] desc
end try
begin catch
select -100 AS row_no
, 1 AS l1, 1 AS creation_time, 1 AS last_execution_time, 1 AS total_worker_time, 1 AS AvgCPUTime, 1 AS logicalReads, 1 AS LogicalWrites
, ERROR_NUMBER() AS execution_count
, ERROR_SEVERITY() AS AggIO
, ERROR_STATE() AS AvgIO
, ERROR_MESSAGE() AS query_text
end catch',@params=N''

How to audit SQLServer logins?

We can audit SQL Server logins using multiple ways like running a profiler, Logon triggers and creating a audit.

 

Below script will create a audit to capture the login events. We can capture the login in event logs or file.

Below is example to capture login events in file.  Replace D:\Audit\ in below script to a valid folder

 

USE [master]
GO

CREATE SERVER AUDIT [TestAudit]
TO FILE 
(	FILEPATH = N'D:\CASEDATA\'
	,MAXSIZE = 0 MB
	,MAX_ROLLOVER_FILES = 2147483647
	,RESERVE_DISK_SPACE = OFF
)
WITH
(	QUEUE_DELAY = 1000
	,ON_FAILURE = SHUTDOWN
	,AUDIT_GUID = 'b305ddf6-3487-4c77-afc2-fc8cfc357abc'
)
GO

USE [master]
GO

CREATE SERVER AUDIT SPECIFICATION [ServerAuditSpecification-20130205-205300]
FOR SERVER AUDIT [TestAudit]
ADD (SUCCESSFUL_LOGIN_GROUP)
WITH (STATE = OFF)
GO

How to set alternate location for SQL Server dumps?

How to set alternate location for SQL Server dumps?
To change the default location for dump files, you can add the SQLExceptionDumpPath registry key with REG_SZ type under HKLM\Software\Microsoft\<instance path>\Setup  and set the value of this key to the new path. The default location for this is the SQL installation Path\log directory.
Above change does not require restart
HKLM\Software\Microsoft\<instance path>\Setup\MaxDumps    -DWORD to limit maximum number of dumps (0=Unlimited)
HKLM\Software\Microsoft\<instance path>\Setup\MaxFullDumps to limit maximum number of full dumps (0=Unlimited)

The merge process was unable to create a new generation at the ‘Subscriber’

The merge process was unable to create a new generation at the ‘Subscriber’

Merge replication fails with below error

Error:
The merge process was unable to create a new generation at the ‘Publisher’

Unable to decide if MakeGeneration is needed

The merge process failed to execute a query because the query timed out. If this failure continues, increase the query timeout for the process. When troubleshooting,restart the synchronization with verbose history logging and specify an output file to which to write.

 Verbose Error log:
The merge process was unable to create a new generation at the ‘Subscriber’. Troubleshoot by restarting the synchronization with verbose history logging and specify an output file to which to write.
(Source: MSSQL_REPL, Error number: MSSQL_REPL-2147200994)Get help:
http://help/MSSQL_REPL-2147200994
Resolution:
Change the generation_leveling_threshold to 0 in sysmergepublications table in publisher and in subscriber. Do not update the system tables directly.
Follow the steps mentioned in workaround section of http://support.microsoft.com/kb/953568

Update,Insert or Delete on tables at subscriber fails in transactional replication

When you update/insert/delete rows in table in subscriber which is part of transactional replication with updateable subscriber we will end with below error
{
Msg 14126, Level 16, State 3, Procedure sp_check_sync_trigger, Line 28
You do not have the required permissions to complete the operation.

Msg 20512, Level 16, State 1, Procedure sp_MSreplraiserror, Line 8
Updateable Subscriptions: Rolling back transaction.
Msg 3609, Level 16, State 1, Procedure trg_MSsync_upd_State, Line 133

The transaction ended in the trigger. The batch has been aborted.
}
Above error is raised in sp_check_sync_trigger and according to the below logic in “sp_check_sync_trigger”  
If you have a update/delete/insert trigger in a table which is not part of replication (MSreplication_objects) then we raise “You do not have the required permissions to complete the operation.”
{

if not exists (select so.object_id from (dbo.MSreplication_objects as ro join sys.objects as so 
        on ro.object_name = so.name) 
        where so.object_id = @trigger_procid and ro.object_type = ‘T’  
        and (@owner is null or schema_name(so.schema_id) = @owner)) 
    begin 
        raiserror(14126, 16, 3) 
        return 1 

   end
}
Resolution
Identify all the triggers of the table for which you are not able to update/delete/insert rows and identify the trigger which is not part of MSreplication_objects and disable it.
You can use the below query to identify triggers in table which is not part of replication

select * from sys.objects where type=‘TR’ and name not in (select object_name from  MSreplication_objects)
and parent_object_id in (select OBJECT_ID(‘Replace your Table here’))

       

SSMS fails with below error when you connect (or) explore the databases

SQL Server management studio might fail with below error  when you connect (or) explore the databases.
Error in SQL Server 2012
TITLE: Microsoft SQL Server Management Studio ——————————
Object reference not set to an instance of an object. (Microsoft.VisualStudio.Platform.WindowManagement)
——————————
Error in SQL Server 2008 and SQL Server 2005
BUTTONS:
Unable to cast COM object of type ‘System.__ComObject’ to interface type ‘Microsoft.VisualStudio.OLE.Interop.IServiceProvider’. This operation failed because the QueryInterface call on the COM component for the interface with IID ‘{6D5140C1-7436-11CE-8034-00AA006009FA}’ failed due to the following error: No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE)). (Microsoft.VisualStudio.OLE.Interop)

regsvr32 /u ieproxy.dll
regsvr32 ieproxy.dll

Cause
1. You would get above error if ieproxy.dll is not registered properly. Re register Ieproxy.dll.
C:\Program Files\Internet Explorer>regsvr32 /u ieproxy.dll
C:\Program Files\Internet Explorer>regsvr32 ieproxy.dll
C:\Program Files (x86)\Internet Explorer>regsvr32 /u ieproxy.dll
C:\Program Files (x86)\Internet Explorer>regsvr32 ieproxy.dll
2 .Net framework corruption (Repair Dot Net)
3. regsvr32 actxprxy.dll
4. If the problem persists after doing above steps then search the registry for class-id mentioned  in error message (in this error it is 6D5140C1-7436-11CE-8034-00AA006009FA). Once you find the class in registry explore ProxyStubClsid32 and copy the GUID in default key. Search for the copied GUID again in the registry and locate the DLL for which we are failing and register that Dll.

Linked server from SQLServer2012 to SQLServer2000.

How to create Linked server from SQLServer2012 to SQLServer2000.
When you create linked server from SQLServer2012 to SQLServer2000 you cannot use SQL Native client 11 because it cannot connect with  SQL Server 2000
Each version of SQL native client will support the version of SQL Server that it ships with, two earlier versions and two later versions
When you use SQLOLED it will switch to latest SQLNCLI.
The Microsoft SQL Server Native Client OLE DB provider is the provider that is used with SQL Server if no provider name is specified or if SQL Server is specified as the product name. Even if you specify the older provider name, SQLOLEDB, it will be changed to SQLNCLI when persisted to the catalog.
You create a data source and use MSDASQL (Microsoft OLE DB Provider for ODBC) as long as OS supports it.
Sample
sp_addlinkedserver ‘Server\Shiloh’, @provider = N’MSDASQL’, @datasrc=N’SQL2000datasource’, @location=N’System’,@srvproduct=N’MSDASQL’;
Alternatively you can use SQL Server native client 10.0 but it expects “begin distributed transaction” for all the queries.

How to detect low memory conditions in SQL Server using ring buffers output

Use the below query to determine the low memory conditions in SQL Server using the sys.dm_os_ring_buffers It gives the historical memory usage of SQL Server and internal and external memory pressure information .


SELECT CONVERT (varchar(30), GETDATE(), 121) as [RunTime],
dateadd (ms, (rbf.[timestamp] - tme.ms_ticks), GETDATE()) as [Notification_Time],
cast(record as xml).value('(//Record/ResourceMonitor/Notification)[1]', 'varchar(30)') AS [Notification_type],
cast(record as xml).value('(//Record/MemoryRecord/MemoryUtilization)[1]', 'bigint') AS [MemoryUtilization %],
cast(record as xml).value('(//Record/MemoryNode/@id)[1]', 'bigint') AS [Node Id],
cast(record as xml).value('(//Record/ResourceMonitor/IndicatorsProcess)[1]', 'int') AS [Process_Indicator],
cast(record as xml).value('(//Record/ResourceMonitor/IndicatorsSystem)[1]', 'int') AS [System_Indicator],
cast(record as xml).value('(//Record/ResourceMonitor/Effect/@type)[1]', 'varchar(30)') AS [type],
cast(record as xml).value('(//Record/ResourceMonitor/Effect/@state)[1]', 'varchar(30)') AS [state],
cast(record as xml).value('(//Record/ResourceMonitor/Effect/@reversed)[1]', 'int') AS [reserved],
cast(record as xml).value('(//Record/ResourceMonitor/Effect)[1]', 'bigint') AS [Effect],

cast(record as xml).value('(//Record/ResourceMonitor/Effect[2]/@type)[1]', 'varchar(30)') AS [type],
cast(record as xml).value('(//Record/ResourceMonitor/Effect[2]/@state)[1]', 'varchar(30)') AS [state],
cast(record as xml).value('(//Record/ResourceMonitor/Effect[2]/@reversed)[1]', 'int') AS [reserved],
cast(record as xml).value('(//Record/ResourceMonitor/Effect)[2]', 'bigint') AS [Effect],

cast(record as xml).value('(//Record/ResourceMonitor/Effect[3]/@type)[1]', 'varchar(30)') AS [type],
cast(record as xml).value('(//Record/ResourceMonitor/Effect[3]/@state)[1]', 'varchar(30)') AS [state],
cast(record as xml).value('(//Record/ResourceMonitor/Effect[3]/@reversed)[1]', 'int') AS [reserved],
cast(record as xml).value('(//Record/ResourceMonitor/Effect)[3]', 'bigint') AS [Effect],

cast(record as xml).value('(//Record/MemoryNode/ReservedMemory)[1]', 'bigint') AS [SQL_ReservedMemory_KB],
cast(record as xml).value('(//Record/MemoryNode/CommittedMemory)[1]', 'bigint') AS [SQL_CommittedMemory_KB],
cast(record as xml).value('(//Record/MemoryNode/AWEMemory)[1]', 'bigint') AS [SQL_AWEMemory],
cast(record as xml).value('(//Record/MemoryNode/SinglePagesMemory)[1]', 'bigint') AS [SinglePagesMemory],
cast(record as xml).value('(//Record/MemoryNode/MultiplePagesMemory)[1]', 'bigint') AS [MultiplePagesMemory],
cast(record as xml).value('(//Record/MemoryRecord/TotalPhysicalMemory)[1]', 'bigint') AS [TotalPhysicalMemory_KB],
cast(record as xml).value('(//Record/MemoryRecord/AvailablePhysicalMemory)[1]', 'bigint') AS [AvailablePhysicalMemory_KB],
cast(record as xml).value('(//Record/MemoryRecord/TotalPageFile)[1]', 'bigint') AS [TotalPageFile_KB],
cast(record as xml).value('(//Record/MemoryRecord/AvailablePageFile)[1]', 'bigint') AS [AvailablePageFile_KB],
cast(record as xml).value('(//Record/MemoryRecord/TotalVirtualAddressSpace)[1]', 'bigint') AS [TotalVirtualAddressSpace_KB],
cast(record as xml).value('(//Record/MemoryRecord/AvailableVirtualAddressSpace)[1]', 'bigint') AS [AvailableVirtualAddressSpace_KB],
cast(record as xml).value('(//Record/@id)[1]', 'bigint') AS [Record Id],
cast(record as xml).value('(//Record/@type)[1]', 'varchar(30)') AS [Type],
cast(record as xml).value('(//Record/@time)[1]', 'bigint') AS [Record Time],
tme.ms_ticks as [Current Time]
FROM sys.dm_os_ring_buffers rbf
cross join sys.dm_os_sys_info tme
where rbf.ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR' --and cast(record as xml).value('(//Record/ResourceMonitor/Notification)[1]', 'varchar(30)') = 'RESOURCE_MEMPHYSICAL_LOW'
ORDER BY rbf.timestamp ASC
Go

Once you get the output of above query you can use the steps in A significant part of SQL Server process memory has been paged out to troubleshoot the issue further

How to bring the database online when one or more data files is accidentally deleted

How to bring the database online when one or more data files is accidentally deleted.
 
Follow the steps below when you don’t have other options like restoring from backup. This has to be the last step to bring the database online and remember data in file which is deleted will be lost. This might cause inconsistent data.
 
1. Take physical file level backup of all the database files
2) Rename all the data files and log files [Ex :N:\ Data.MDF  to N:\Data.MDF_original ]
3) Create a new empty database with the same physical, logical file names and file groups
4) Now take the missing file OFFLINE
5) Take the database offline and replace the original files and try to bring the database online
 
The database will come online, only the missing file will be in OFFLINE STATE and will not be accessible.
If the missing file belong to separate file group we can remove the file group else you may have to export all the data from this database to new database else we cannot take FULL Database backup as one of the database file is not online for file group.
 
 
 
 
— Create New database
USE [master]
GO
 
CREATE DATABASE [MisTest] ON  PRIMARY
( NAME = N’MisTest’, FILENAME = N’C:\MisTest.mdf’ ,   MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
 FILEGROUP [Secondary]
( NAME = N’MisTestNDF’, FILENAME = N’C:\MisTestNDF.ndf’ ,   MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N’MisTest_log’, FILENAME = N’C:\MisTest_log.LDF’ ,   MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO
 
–Take the database offline
Alter database MisTest set offline
go
 
–Now rename the NDF file [to simulate the missing file scenario]
sp_configure ‘show advanced options’,1
go
reconfigure with override
go
 
sp_configure ‘xp_cmdshell’,1
go
reconfigure with override
go
 
xp_cmdshell ‘rename c:\MisTestNDF.ndf MisTestNDF_1.ndf’
go
 
 
–Try to bring the database online but it fails
Alter database MisTest set online
go
 
Msg 5120, Level 16, State 5, Line 1
Unable to open the physical file “C:\MisTestNDF.ndf”. Operating system error 2: “2(failed to retrieve text for this error. Reason: 15105)”.
Msg 945, Level 14, State 2, Line 1
Database ‘MisTest’ cannot be opened due to inaccessible files or insufficient memory or disk space.  See the SQL Server errorlog for details.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.
 
 
–Take the database offline and detach
Alter database MisTest set offline
go
USE [master]
GO
ALTER DATABASE [MisTest] SET  SINGLE_USER WITH ROLLBACK IMMEDIATE
GO
USE [master]
GO
EXEC master.dbo.sp_detach_db@dbname = N’MisTest’
GO
 
–Now try to attach the database with missing file, but it fails with the below error
USE [master]
GO
CREATE DATABASE [MisTest] ON
( FILENAME = N’C:\MisTest.mdf’ ),
( FILENAME = N’C:\MisTest_log.LDF’ )
,( FILENAME = N’C:\MisTestNDF.ndf’ )
FOR ATTACH
GO
 
Msg 5120, Level 16, State 101, Line 1
Unable to open the physical file “C:\MisTestNDF.ndf”. Operating system error 2: “2(failed to retrieve text for this error. Reason: 15105)”.
 
–Rename the original files
xp_cmdshell ‘rename c:\MisTest.mdf MisTest_1.mdf’
go
xp_cmdshell ‘rename c:\MisTest_log.ldf MisTest_1_log.ldf’
go
 
–Now recreate the Empty database with the same physical & logical file name
CREATE DATABASE [MisTest] ON  PRIMARY
( NAME = N’MisTest’, FILENAME = N’C:\MisTest.mdf’  , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
 FILEGROUP [Secondary]
( NAME = N’MisTestNDF’, FILENAME = N’C:\MisTestNDF.ndf’  , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N’MisTest_log’, FILENAME = N’C:\MisTest_log.LDF’ , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO
 
 
–Mark the missing file as offline
Alter database MisTest
MODIFY FILE (NAME = ‘MisTestNDF’,OFFLINE)
 
–Take the database offline
Alter database MisTest set offline
 
— Replace the original file and bring the database online
xp_cmdshell ‘rename c:\MisTest.mdf MisTest_2.mdf’
go
xp_cmdshell ‘rename c:\MisTest_log.ldf MisTest_2_log.ldf’
go
 
xp_cmdshell ‘rename c:\MisTest_1.mdf MisTest.mdf’
go
xp_cmdshell ‘rename c:\MisTest_1_log.ldf MisTest_log.ldf’
go
Alter database MisTest set online
go
 
— The only problem is that we cannot take the backup of the database as the file is offline
 
backup database Mistest to disk = ‘c:\mistest.bak’
 
Msg 3636, Level 16, State 2, Line 1
An error occurred while processing ‘BackupMetadata’ metadata for database id 8 file id 3.
Msg 3046, Level 16, State 2, Line 1
Inconsistent metadata has been encountered. The only possible backup operation is a taillog backup using the WITH CONTINUE_AFTER_ERROR or NO_TRUNCATE option.
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.
 
–We cannot even remove the file as we get this error
 
USE [MisTest]
GO
ALTER DATABASE [MisTest]  REMOVE FILE [MisTestNDF]
GO
 
Msg 5056, Level 16, State 2, Line 1
Cannot add, remove, or modify a file in filegroup ‘Secondary’ because the filegroup is not online.
 
–Now try to remove the file & file group
 
 
–Try the database back up again but it would fail
 
backup database Mistest to disk = ‘c:\mistest.bak’
go
 
Msg 3636, Level 16, State 2, Line 2
An error occurred while processing ‘BackupMetadata’ metadata for database id 8 file id 3.
Msg 3046, Level 16, State 2, Line 2
Inconsistent metadata has been encountered. The only possible backup operation is a taillog backup using the WITH CONTINUE_AFTER_ERROR or NO_TRUNCATE option.
Msg 3013, Level 16, State 1, Line 2
BACKUP DATABASE is terminating abnormally.
 
 
 
USE [MisTest]
GO
ALTER DATABASE [MisTest]  REMOVE FILE [MisTestNDF]
GO
 
Msg 5056, Level 16, State 2, Line 1
Cannot add, remove, or modify a file in filegroup ‘Secondary’ because the filegroup is not online.
Msg 5042, Level 16, State 7, Line 1
The filegroup ‘Secondary’ cannot be removed because it is not empty.
 
Ignore the above error and proceed further.
 
ALTER DATABASE [MisTest] REMOVE FILEGROUP [Secondary]
GO
 
Now we can take full database backup normally  
 
 

 

System.OutOfMemoryException’ was thrown. (Microsoft.SqlServer.Management.Sdk.Sfc) error when we try to restore the database

When we try to restore database in SQL Server we get below error and restore database fails

Error

{
——————————
TITLE: Microsoft SQL Server Management Studio
——————————
Cannot show requested dialog.
——————————
ADDITIONAL INFORMATION:
Exception of type ‘System.OutOfMemoryException’ was thrown. (Microsoft.SqlServer.Management.Sdk.Sfc)
——————————
BUTTONS:
OK
——————————
}

Cause
We get above error when we have large number of records on backup history table.
Check the count of rows on below table

select count(1) from msdb.dbo.backupfile   with (nolock)
select count(1) from msdb.dbo.backupfilegroup  with (nolock)
select count(1) from msdb.dbo.restorefile  with (nolock)
select count(1) from msdb.dbo.restorefilegroup with (nolock)
select count(1) from msdb.dbo.restorehistory  with (nolock)
select count(1) from msdb.dbo.backupmediafamily with (nolock)

Resolution
We may have to reduce the number of records in the above tables. We can manually delete records based on date (or) If we do not want backup history we can truncate all the above tables.

 

How to configure the size of SQLServer error log

When running SQL Server certain times we may have to limit the size and number of SQL server errorlogs

To limit the size of SQL Server errorlogs add a new registry key GetMaxErrorLogSizeKb under
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSQL.X\MSSQLServer\ErrorLogSizeInKb
By default, this key is absent. Modify the value to the size of errorlog(In Kb) you want to maintain.

To increase the number of log files, add a new registry key “NumErrorLogs” (REG_DWORD) under below location.
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSQL.X\MSSQLServer\
By default, this key is absent. Modify the value to the number of logs that you want to maintain.

To learn more about sp_readerrorlog and its parameters read Beyond XP_READERRORLOG (Parameters of XP_READERRORLOG)

Remote query using a linked server generated memory dump and fails

When you fire a remote query using a linked server it may generate a memory dump and fail with error below
Msg 0, Level 11, State 0, Line 0
A severe error occurred on the current command. The results, if any, should be discarded.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command. The results, if any, should be discarded.
Resolution
1. Apply the fix mentioned in http://support.microsoft.com/default.aspx?scid=kb;EN-US;939965 if you are using OLEDB provider for DB2.
2. Few RDBMS allow index to contain column name twice. Dump the indexes of table which you are querying and check if any index has duplicate column name If you find a duplicate column name remove it. Linked server query will start working.

Lock pages in memory is recommended or not

One of SQL Server expert recently asked  below question in distribution list 
1. LPIM is not recommended?
2. LPIM is not required?
Answer:
When we have LPIM privilege for SQL Server startup account  AWE allocator API’s are used for memory allocation. Memory allocated using  AWE allocator API’s are not part of Process working set ,hence cannot be paged out and not visible in private bytes or working set in task manger and Perfmon àPrivate bytes or Perfmon àworking set.
Advantage: SQL Server working set (BPOOL) cannot be paged by windows  even when there is system wide memory pressure.
Disadvantage: Operating system will starve for memory when there is system wide memory pressure. OS has to completely rely on SQL Server to respond to low memory notification and scale down its memory usage . SQL Server may not respond fast enough to low memory condition at system level because OS is already starving for memory. LPIM prevents only the BPOOL from paging, Non-Bpool components can still be paged and we have critical portions of SQL Server memory like thread stack, SQL Server Images/DLL’s  in Non-Bpool which can still be paged by OS.
So many disadvantage…. But still why do we recommend LPIM in some places?
In earlier versions of windows 2003 when there is system wide memory pressure windows memory manger would trim one-quarter of working set of all the process. Imagine If SQL Server is using 200GB of RAM and there is system wide memory pressure, Windows memory manager would move 50 GB of
SQL Server working set to page file and we would end with performance problems. If LPIM is enabled OS cannot trim.
Imagine there is a faulty application in the server and it leaks memory  fast , It might consume all the memory in the server and windows memory manager might trim all of SQL Server working set.
Known issues in windows like the one in This link can cause windows memory manager to trim the working set of SQL Server process suddenly. Windows has a background process which keeps  writing the contents of working set to page file, so when there is paging only the dirty pages  needs to be moved to the page file others are already backed by back ground process, So paging would be very fast and SQL Server working set would be moved to page file in seconds before SQL Server responds to low memory resource notification from OS causing negative performance.  This link might throw more clarity.

In systems with large amout of memory (Ex: 1 TB )we might get non yielding scheduler situations when allocating memory in conventional memory model. LPIM is only option is this case.
 
LPIM can be used in servers in which it might take long time to identify the cause of the working set trim. It is always suggested to identify the cause of TRIM before choosing LPIM in first place.

When statistics was updated?

Statistics are used by the SQL Server optimizer to choose the efficient plan.  When we don’t have up to date statistics
it may end with SQL server optimizer choosing inefficient query plan. We can use below query to identify when SQL Server
statistics were last updated.

select db_id() as dbid,
  case
    when indid IN (0, 1) then convert (char (12), rows)
    else (select rows from dbo.sysindexes i2 where i2.id =  i.id and i2.indid in (0,1)) -- ''-''
  end as rowcnt,
  case
    when indid IN (0, 1) then rowmodctr
    else convert (bigint, rowmodctr) + (select rowmodctr from dbo.sysindexes i2 where i2.id =  i.id and i2.indid in (0,1))
  end as row_mods,
  case rows when 0 then 0 else convert (bigint,
    case
      when indid IN (0, 1) then convert (bigint, rowmodctr)
      else rowmodctr + (select convert (bigint, rowmodctr) from dbo.sysindexes i2 where i2.id =  i.id and i2.indid in (0,1))
    end / convert (numeric (20,2), (select case convert (bigint, rows) when 0 then 0.01 else rows end from dbo.sysindexes i2 where i2.id =  i.id and i2.indid in (0,1))) * 100)
  end as pct_mod,
  convert (nvarchar, u.name + '.' + o.name) as objname,
  case when i.status&0x800040=0x800040 then 'AUTOSTATS'
    when i.status&0x40=0x40 and i.status&0x800000=0 then 'STATS'
    else 'INDEX' end as type,
  convert (nvarchar, i.name) as idxname, i.indid,
  stats_date (o.id, i.indid) as stats_updated,
  case i.status & 0x1000000 when 0 then 'no' else '*YES*' end as norecompute,
  o.id as objid , rowcnt, i.status
from dbo.sysobjects o, dbo.sysindexes i, dbo.sysusers u
where o.id = i.id and o.uid = u.uid and i.indid between 1 and 254 and o.type = 'U'
order by pct_mod desc, convert (nvarchar, u.name + '.' + o.name), indid
GO

How to identify if the query is using Query hints or plan Guides ?

Generally we create plan guide or query hints to improve the performance of the query, but some times it would create problems if not used carefully.

To identify if the query is using plan guide look at the planguidedb attribute in XML plan. Sys.Plan_guides will also have an entry.
Query hints can be identifies by looking at the query.We will see hints like opton (force order),Inner loop join, index = etc
If you are tunning a slow query and if it is using plan guides or query hints first remove the hints and tune the query normally

Linked server performance might be affected because of Remote Scan

In linked server queries when you use filtered query (Where clause) filtering may not be remoted, entire data is fetched locally from remote server and filtering happens locally hence will cause severe performance impact. You will find “Remote Scan”  operator in plan in this case. When filtering is remote you will see “Remote query” operator instead of remote scan.
 
How to fix it?
DynamicParameters and NestedQueries in the provider you are using should play a role here (Object exploreràServer objects à Linked servers  àProviders à Choose the provider you are using in linked server (Ex: SQLOLEDB) and enable  DynamicParameters and NestedQueries ). DynamicParameters setting should enable providers to support parameterized queries. If you enable it probably “remote scan” would switch to “remote query” and filtering will be happen on remote server instead of fetching the data locally and filtering.

Limiting the number connections to SQL Server database

If you would like to

1. Limit the number connections to SQL Server database.
2. Restrict connections to SQL server from tools like SSMS (or) You dont want non-sysadmins to connect with SQL Server except from application.

We can acheive all this by creating logon triggers.

Sample 1
=======
USE master;
GO
CREATE LOGIN login_test WITH PASSWORD = ‘3KHJ6dhx(0xVYsdf’ MUST_CHANGE,
    CHECK_EXPIRATION = ON;
GO
GRANT VIEW SERVER STATE TO login_test;
GO
CREATE TRIGGER connection_limit_trigger
ON ALL SERVER WITH EXECUTE AS ‘login_test’
FOR LOGON
AS
BEGIN
IF ORIGINAL_LOGIN()= ‘login_test’ AND
    (SELECT COUNT(*) FROM sys.dm_exec_sessions
            WHERE is_user_process = 1 AND
                original_login_name = ‘login_test’) > 3
    ROLLBACK;
END;

Sample 2
=======
This trigger filters restricts other applications to access the database other than the  APPname (replace your application name with APPname)
Declare @role numeric(1)
select @role= IS_SRVROLEMEMBER (‘sysadmin’);
if @role1
begin
if (SELECT COUNT(*) FROM sys.dm_exec_sessions
            WHERE is_user_process = 1 AND
              program_name’APPName’ and
                original_login_name in (select suser_sname()))>0
Rollback;
end;

Note: we can use sp_configure ‘user connections’  to limit the number of connections to SQL Server.

JOB does not exist in the job cache

Error:
Job  does not exist in the job cache: attempting to re-acquire it from the server
Warning,[156] Job does not exist in the job cache
Error 22022 JOB does not exist in the job cache

Cause 1:
1. Expand the SQL Server SSMS
2. select the SQLAGENT
3. Right click and select properties
4.Check the properties in the connection TAB at SQL Server.

Check if SQL Server alias is pointing to different server. If yes change it back to default and restart SQL Server

Cause 2:
In correct MSX/TSX configuration or MSX/TSX

Cleaned up system tables related to MSX-TSX in MSDB and reconfigure MSX/TSX configuration and jobs.
systargetservergroupmembers
systargetservergroups
systargetservers
You can run ‘sp_msx_enlist’ at target server and check the output error if there has some error happened.

Cause 3:
Disable encryption.certificate validation are enabled for connections between master servers and target servers by default.
   a) “Start”->”run”->type “regedit” at ‘GSHJRQA’ server.
   b) Navigate to \HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\\SQLServerAgent\MsxEncryptChannelOptions(REG_DWORD)
   c) Change “MsxEncryptChannelOptions” value from 2 to 0.

The article ‘(null)’ does not exist

Trans Replication is failing with error while adding the subscription

Error:  “The article ‘(null)’ does not exist.”
This seems to be know issue when there is difference  is number of article between sysarticles and msarticles.
– select count(*) from sysarticles on publisher
– select count(*) from MSarticles on distributor

Once you get the articles which exists in one table and not in other use exec sp_droparticle to remove that article and add it back using exec sp_addarticle

Note:We might get this error if we have two tables with same name in databases with case sensitive collation and replicate them (Ex: table and TABLE )

SQLServer Query optimization

Do you have a slow SQLServer query and wanted to tune it… Tuning slow queries in SQL Server and making them run faster is not complex. Follow the query tunning steps in http://mssqlwiki.com/2012/11/06/tuning-sql-server-query/.

Below is extract from above article. If you would like to read the complete steps read above article.

Before you start troubleshooting the query which is running for longer duration, Identify if the query is slow because it is long waiting (or) Long running (or) Long compiling.

Compile time: Time taken to compile the query. compile time can be identified by looking at the

1. CompileTime=”n” in XML plan

2. SQL Server parse and compile time when Set statistics time on is enabled.

CPU time: Time taken by the query in CPU (Execution time – (compile time+ wait time). CPU time can be identified by looking at the

1. CPU column in profiler.

2. CPU time under SQL Server Execution Times when statistics time on is enabled.

Execution time: Time taken by the query for complete execution ( Execution time +CPU time +Wait time). Total duration of the query can be identified by using the

1.Duration column in profiler

2. SQL Server Execution Times, elapsed times when statistics time on is enabled.

What is long waiting query?

A query is considered to be long waiting query, when it spend most of its time waiting for some resource.

How to identify if the query is long waiting?

Long running query can be identified by comparing the CPU and duration column in profiler (or) CPU and elapsed time when statistics time on is set .

When a query is waiting for a resource (such as lock, network I/O, Page_I/O Etc) it will not consume CPU. So if you see duration being higher than CPU (Difference between Duration and CPU is wait time),It indicates that the query has spent large amount of time waiting for some resource.

Let us see an example of long waiting query. I have collected profiler trace while executing the query.

set statistics io on

set statistics time on

go

–Place your query here

select top 10000 * from a

go

set statistics io off

set statistics time off

go

Look at the Duration and CPU column in the profiler Cpu=256 and duration =1920. So this query has spent majority of time waiting for some resource.

Look at the output of statistics time and statistics I/O in above image.

SQL Server has spent only 2 milliseconds compiling the query and 256 milliseconds on CPU, but the overall duration was 1920 milliseconds so the query has spent maximum time waiting for some resource.

Identify the resource in which this query is waiting on using one of the steps listed below.

1. Look at the wait type column of the sysprocesses for the spid which is executing query while the query is executing.

2. If there is no other activity on the server collect sys.dm_os_wait_stats output before and after the query execution and identify the wait (Will not help in tuning queries running for short duration)

3. Collect XEvent to gather the wait stats of individual query.

Once you identify the resource in which the query is waiting on tune the resource. Most of the times queries would be slow waiting for below resource.

PAGEIOLATCH_* or Write log: This indicates I/O resource bottleneck follow the detailed troubleshooting steps mentioned in This Link to fix the I/O bottleneck. If you find SQL Server spawning excessive I/O Create necessary indexes.

a. Logical reads + Physical reads in statistics I/O output (Refer above image) or Reads and writes in profiler will indicate the I/O posted by this query. If you see very high reads for query compared with the result rest retuned by query it is an indication of missing indexes or bad plan. Create necessary indexes (You can use DTA for index recommendations.).

PAGELATCH_*: This waittype in sysprocesses indicates that SQL Server is waiting on access to a database page, but the page is not undergoing physical IO.

a. This problem is normally caused by a large number of sessions attempting to access the same physical page at the same time. We should Look at the wait resource of the spid The wait_resource is the page number (the format is dbid:file:pageno) that is being accessed.

b. We can use DBCC PAGE to identify object or type of the page in which we have the contention. Also it will help us to determine whether contention is for allocation, data or text.

c. If the pages that SQL Server is most frequently waiting on are in Tempdb database ,check the wait resource column for a page number in dbid 2 Ex(2:1:1 or 1:1:2). Enable TF 1118 and increase the number of TEMPDB data files and size them equally (You may be facing tempdb llocation latch contention mentioned in http://support.microsoft.com/kb/328551)

d. If the page is in a user database, check to see if the table has a clustered index on a monotonic key such as an identity where all threads are contending for the same page at the end of the table. In this case we need to choose a different clustered index key to spread the work across different pages.

LATCH_*: Non-buf latch waits can be caused by variety of things. We can use the wait resource column in sysprocesses to determine the type of latch involved(KB 822101).

a. A very common LATCH_EX wait is due to running a profiler trace or sp_trace_getdata Refer KB 929728 for more information.

b. Auto Grow and auto shrink while query is executed.

c. Queries going for excessive parallelism.

Blocking (LCK*): Use the query in This Link to identify the blocking. Tune the head blocker.

Asynch_network_io (or) network IO: Keep the result set returned by the query smaller. Follow detailed troubleshooting refer This Link

Resource_semaphore waits: Make sure there is no memory pressure on the server Follow steps in This Link for detailed troubleshooting.

SQL Trace: Stop all the profiler traces running on the server. Identify the traces which are running on the server using the query in This Link

Cx packet: Set the Max degree of parallelism. But remember Cxpacket wait type is not always a problem.

a. For servers that have eight or less processors, use the following configuration where N equals the number of processors: max degree of parallelism = 0 to N .

b. For servers that use more than eight processors, use the following configuration: max degree of parallelism = 8.Refer This Link

SOS_SCHEDULER_YIELD : Identify if there is CPU bottleneck on the server. This waiting means that the thread is waiting for CPU.

a. SQL Server worker thread’s Quantum target is 4ms which means the thread(worker) Will ( is expected to) yield back to SQL Server scheduler when it exceeds 4ms and before it yields back it check if there are any other runnable threads, If there is any runnable threads then the thread which is in top of runnable list is scheduled and current thread will go to the tail of the runnable list and will get rescheduled when the other threads which are already waiting in SOS Scheduler (runnable list) finishes its execution or quantum. The time thread spends in runnable list waiting for its quantum is accounted as SOS_SCHEDULER_YIELD. You will see this type when multiple threads are waiting to get CPU cycle. Follow trouble shooting the steps mentioned This Link

Important: In SQL Server instances when there more than 1 CPU it is possible that the CPU is higher than the duration. Because CPU is sum of time spend by query in all the CPU’s when choosing a parallel whereas the duration is actual duration of the query.

What is long running query?

A query is considered to be long running query, when it spend most of its time on CPU and not waiting for some resource.

How to identify if the query is long running ?

Long running query can be identified by comparing the CPU and duration column in profiler (or) CPU and elapsed time when statistics time on is set . If the CPU and duration is close than the query is considered to be long running. If the query is long running identify where the query spend the time ,It could be for compiling or post compilation (For executing the query). compare the duration of the query with CompileTime (XML plan compile time (or) SQL Server parse and compile time when statistics time is on refer above image).

High Compile time:

Compare the duration of the query with Compile Time (XML plan compile time (or) SQL Server parse and compile time when statistics time is on).Compile time will normally be in few millisecond . Follow the below steps if you see high compile time

1. Identify if you have large token perm refer http://support.microsoft.com/kb/927396

2. Create necessary indexes and stats. Tune the query manually (or) in DTA and apply the recommendation

3. Reduce the complexity of query. Query which joins multiple tables (or) having large number of IN clause can taking a while to compile.

4. You can reduce the compile’s by using force parameterization option.

High CPU time:

Compare the duration of the query with Compile Time (XML plan compile time (or) SQL Server parse and compile time when statistics time is on). If the compile time is very low compared to the duration. Then follow the below steps.

1. Update the stats of tables and indexes used by the query (If the stats are up to date Estimated rows and estimated execution will be approximately same in execution plan .If there is huge difference stats are out dated and requires update) .

2. Identify if the query has used bad plan because of parameter sniffing (If the ParameterCompiledValue and ParameterRuntimeValue is different in XML plan). Refer THIS LINK to know more about Parameter Sniffing

3. If updating the stats and fixing the parameter sniffing doesn’t resolve the issue it is more likely optimizer is not able to create efficient plan because of lack of indexes and correct statistics. Run the query which is driving the CPU in database tuning advisor and apply the recommendations. (You will find missing index detail in xml plan but DTA is more efficient).

4. If the query which is running longer and consuming CPU is linked server query try changing the security of linked server to ensure linked server user has ddl_admin or dba/sysadmin on the remote server. More details regarding the issue in THIS LINK.

5. Ensure optimizer is not aborting early and creating bad plan. For details refer THIS LINK

6. Ensure the query which is spiking the CPU doesn’t have plan guides (xml plan will have PlanGuideDB attribute. Also sys.plan_guides will have entries) and query hints(index= or (option XXX join) or inner (Join Hint) join).

7. Ensure that SET options are not changed.

How to make clustered instance of SQLServer to listen on Multiple IP address?

To make clustered SQL Server instance listen on multiple I/P’s

Add new IP addres Resource type in SQLGroup.

In the dependency list of SQLserver network name add the the new netwok IP Resource If you want the SQLServer to listen on different port with new IP
Restart the SQL Server network name (It will internally restart SQL Server resources)

1398(There is a time and/or date difference between the client and server.) Backup

Backup database failes with error

“Write on “\\\tes.bak” failed: 1398(There is a time and/or date difference between the client and server.)

Resolution

You will get this error if Kerberos ticket in the SQL Server is expired, restart SQL Server. Also check if Windows Time service is running

Could not obtain information about Windows NT group/user

SQL Server jobs failling with below error.

The job failed.  Unable to determine if the owner (domain\user) of job test has server access (reason: Could not obtain information about Windows NT group/user ‘domain\user’, error code 0x5. [SQLSTATE 42000] (Error 15404)).
From SQL agent logs we see below errors

2011-02-25 01:00:00 – ! [298] SQLServer Error: 15404, Could not obtain information about Windows NT group/user

1. ADD the SQL Agent service account to the Pre-Windows 2000 Compatible Access Group

2. If SQL Server is running under  localsystem add the computer object to the domain group “Pre-Windows 2000 compatible access”

3. verify if xp_logininfo ‘domain\user’ is successful.

4. Make sure account is not locked

Deafault extended event system_health

The default extended event system health captures lot of useful information likeclr_allocation_failure
clr_virtual_alloc_failure
memory_broker_ring_buffer_recorded
memory_node_oom_ring_buffer_recorded
scheduler_monitor_deadlock_ring_buffer_recorded,
scheduler_monitor_non_yielding_iocp_ring_buffer_recorded,
scheduler_monitor_non_yielding_ring_buffer_recorded,
scheduler_monitor_non_yielding_rm_ring_buffer_recorded,
scheduler_monitor_stalled_dispatcher_ring_buffer_recorded,
scheduler_monitor_system_health_ring_buffer_recorded,
connectivity_ring_buffer_recorded
error_reported
security_error_ring_buffer_recorded
xml_deadlock_report

We can verify them using below scripts

 


SELECT *,
CAST([target_data] as XML) [target_data]
FROM sys.dm_xe_session_targets a
INNER JOIN sys.dm_xe_sessions b ON b.[address] = a.[event_session_address]
--WHERE b.[name] = 'system_health'

&nbsp;

CREATE EVENT SESSION [system_health] ON SERVER
ADD EVENT sqlclr.clr_allocation_failure(
    ACTION(package0.callstack,sqlserver.session_id)),
ADD EVENT sqlclr.clr_virtual_alloc_failure(
    ACTION(package0.callstack,sqlserver.session_id)),
ADD EVENT sqlos.memory_broker_ring_buffer_recorded,
ADD EVENT sqlos.memory_node_oom_ring_buffer_recorded(
    ACTION(package0.callstack,sqlserver.session_id,sqlserver.sql_text,sqlserver.tsql_stack)),
ADD EVENT sqlos.scheduler_monitor_deadlock_ring_buffer_recorded,
ADD EVENT sqlos.scheduler_monitor_non_yielding_iocp_ring_buffer_recorded,
ADD EVENT sqlos.scheduler_monitor_non_yielding_ring_buffer_recorded,
ADD EVENT sqlos.scheduler_monitor_non_yielding_rm_ring_buffer_recorded,
ADD EVENT sqlos.scheduler_monitor_stalled_dispatcher_ring_buffer_recorded,
ADD EVENT sqlos.scheduler_monitor_system_health_ring_buffer_recorded,
ADD EVENT sqlos.wait_info(
    ACTION(package0.callstack,sqlserver.session_id,sqlserver.sql_text)
    WHERE ([duration]>(15000) AND ([wait_type]>(31) AND ([wait_type]>(47) AND [wait_type]<(54) OR [wait_type]<(38) OR [wait_type]>(63) AND [wait_type]<(70) OR [wait_type]>(96) AND [wait_type]<(100) OR [wait_type]=(107) OR [wait_type]=(113) OR [wait_type]>(174) AND [wait_type]<(179) OR [wait_type]=(186) OR [wait_type]=(207) OR [wait_type]=(269) OR [wait_type]=(283) OR [wait_type]=(284)) OR [duration]>(30000) AND [wait_type]<(22)))),
ADD EVENT sqlos.wait_info_external(
    ACTION(package0.callstack,sqlserver.session_id,sqlserver.sql_text)
    WHERE ([duration]>(5000) AND ([wait_type]>(365) AND [wait_type]<(372) OR [wait_type]>(372) AND [wait_type]<(377) OR [wait_type]>(377) AND [wait_type]<(383) OR [wait_type]>(420) AND [wait_type]<(424) OR [wait_type]>(426) AND [wait_type]<(432) OR [wait_type]>(432) AND [wait_type]<(435) OR [duration]>(45000) AND ([wait_type]>(382) AND [wait_type]<(386) OR [wait_type]>(423) AND [wait_type]<(427) OR [wait_type]>(434) AND [wait_type]<(437) OR [wait_type]>(442) AND [wait_type]<(451) OR [wait_type]>(451) AND [wait_type]<(473) OR [wait_type]>(484) AND [wait_type]<(499) OR [wait_type]=(365) OR [wait_type]=(372) OR [wait_type]=(377) OR [wait_type]=(387) OR [wait_type]=(432) OR [wait_type]=(502))))),
ADD EVENT sqlserver.connectivity_ring_buffer_recorded(SET collect_call_stack=(1)),
ADD EVENT sqlserver.error_reported(
    ACTION(package0.callstack,sqlserver.database_id,sqlserver.session_id,sqlserver.sql_text,sqlserver.tsql_stack)
    WHERE ([severity]>=(20) OR ([error_number]=(17803) OR [error_number]=(701) OR [error_number]=(802) OR [error_number]=(8645) OR [error_number]=(8651) OR [error_number]=(8657) OR [error_number]=(8902)))),
ADD EVENT sqlserver.security_error_ring_buffer_recorded(SET collect_call_stack=(1)),
ADD EVENT sqlserver.sp_server_diagnostics_component_result(SET collect_data=(1)
    WHERE ([sqlserver].[is_system]=(1) AND [component]<>(4))),
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.event_file(SET filename=N'system_health.xel',max_file_size=(5),max_rollover_files=(4)),
ADD TARGET package0.ring_buffer(SET max_events_limit=(5000),max_memory=(4096))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=120 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
GO

Deferred update instead of direct

Although we dont update the column of unique constraint we might verywell end up with defererred update if you dont pass all the coulumns of PK contraint in where clause of the update query. We have to review all update statements to check if all columns of unique constraint is passed in where clause.

 Assume you fire below query on table with below constraint

 CONSTRAINT [PK_] PRIMARY KEY CLUSTERED
(
 c1  ASC,
 c2 ASC
 )

update table xx set xx=1 where pk1=1 and xx2=2

It is possible that multiple rows will be returned by the where clause in the update,hence breaks the requirements for a direct update (it is considered to be a multirow update because of the where clause)
Althought you did not update PK columns ,your search argument did not specify value for all the columns used to create index ,so you might end with deferred update

CACHESTORE_SQLCP will increase when you run backup

CACHESTORE_SQLCP increases continuously when we run the backup.
When you run the database/Tlog backup very frequently or for the databases with large number of files/file groups CACHESTORE_SQLCP increases
We can identify the issue by using below queries

Select text,a.*from sys.dm_exec_cached_plans a cross apply sys.dm_exec_sql_text(plan_handle)  order by size_in_bytes desc
Go

Select p.plan_handle,CONVERT (varchar, GETDATE(), 126) AS runtime, LEFT (p.cacheobjtype + ' (' + p.objtype + ')', 35) AS cacheobjtype
,p.usecounts, p.size_in_bytes / 1024 AS size_in_kb, stat.total_worker_time/1000 AS tot_cpu_ms,
stat.total_elapsed_time/1000 AS tot_duration_ms, stat.total_physical_reads, stat.total_logical_writes, stat.total_logical_reads,
LEFT (CASE WHEN pa.value=32767 THEN 'ResourceDb' ELSE ISNULL (DB_NAME (CONVERT (sysname, pa.value)),
CONVERT (sysname,pa.value))END, 40) AS dbname,sql.objectid, CONVERT (nvarchar(50),
CASE WHEN sql.objectid IS NULL THEN NULL ELSE REPLACE (REPLACE (sql.[text],CHAR(13), ' '), CHAR(10), '')END) AS procname,
REPLACE (REPLACE (SUBSTRING (sql.[text], stat.statement_start_offset/2 + 1, CASE WHEN stat.statement_end_offset = -1
THEN LEN (CONVERT(nvarchar(max), sql.[text]))
ELSE stat.statement_end_offset/2 - stat.statement_start_offset/2 + 1 END), CHAR(13), ' '), CHAR(10), ' ') AS stmt_text
FROM sys.dm_exec_cached_plans p OUTER APPLY sys.dm_exec_plan_attributes (p.plan_handle) pa INNER JOIN sys.dm_exec_query_stats stat
ON p.plan_handle = stat.plan_handle OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) AS sql
WHERE pa.attribute = 'dbid' ORDER BY p.plan_handle DESC

Go

 
Resolution

SQL Server 2008
There is a fix for this issue in SQL server 2008 http://support.microsoft.com/kb/961323

SQL Server2005
There is no Fix for SQL Server 2005. You can follow some workarounds
Since all the plans (problematic) are created for MSDB. You can use dbcc flushprocindb(4) —  to flush the plans of msdb without touching the plans of other databases (4 is database id of MSDB) .
You can add a new step to backup job created by maintenance plan to run dbcc flushprocindb(4) every time backup completes. So the plans created in MSDB will be flushed immediately after the backup.

dbcc dbredindex/Alter index (or) update stats which should run first

dbcc dbredindex/Alter index (or) update stats which should run first?

Statistics created manually or by auto create stats will not be updated by ALTER INDEX ALL ON table.

If you run dbcc dbredindex on a table , it will update stats on index indx with 100% sampling, but the stats created by auto create stats and manually created stats are updated with default sampling percentage.

Now choose your self.

 

Update statistics with full scan or sample?

Should I choose full scan while I update the statistics?

In an ideal situation, if data is 100% random and 100% even distributed, 10% sampling should be as accurate as full scan. But any uneven distribution of data can cause inaccuracy both with sampling and even with full scan.
In short, with skewed data, update stats with sampling can be less accurate than full scan ,but important fact is full scan will also not be accurate.

Can high number of connection cause SQL Server to refuse connection?

sp_configure “user connections” will decide the maximum number of simultaneous user connections allowed on SQL Server.Default value for user connections is 32767. So unless we hange the value it wouldnt cause problems.

Connections will be refused if we exceeded max worker threads. Look at the SQL Server error log and identify if there are deadlocked scheduler dumps. If there is follow the steps in http://mssqlwiki.com/2010/06/15/how-to-analyze-deadlocked-schedulers-dumps/ identify on what resource your sessions are waiting (Not excatly session. It is thread or session with threads associated with it).

When you have the issue look at sysprocesses for session with Kpid and identify where they are waiting and check if the active sessions (Sessions with KPID) exceeded max worker threads.

sp_send_dbmail blocked and waittype is preemptive_os_getprocaddress

This will happen if you use user transaction when calling sp_send_dbmail. Do not call sp_send_dbmail with in user transaction.If you open a transaction and
send a mail with some attachment in first connection and dont commit . Open a second connection and send a mail using sp_send_dbmail it will not be queued and wait
till the first connection commits.You will also see preemptive_os_getprocaddress wait for second connection

Same value for min server memory and max server memory in SQL server

SQL Server memory management is designed to dynamically adjust the committed memory based on the amount of available memory on the system.

SQL Server uses CreateMemoryResourceNotification to create a memory resource notification object and SQL Server Resource monitor threads calls QueryMemoryResourceNotification every time it runs to identify if there is any notification. If a low memory notification comes from Windows, SQL Server scales down its memory usage.

How much it scales down?
Till “Min server memory”  (If there is continous memory pressure on the system).
What happens when you set Max server memory and min server memory to same value?
Ans:SQL Server will never scale down its memory usage even when there is memory pressure system wide (Lowphysicalmemory notification  set at system level)

What are the affects?
Ans:If LPM is not enabled SQL Server’s working set will be paged. If LPM is enabled system will starve for memory  and non-bpool will be paged.

Refer http://mssqlwiki.com/2012/06/27/a-significant-part-of-sql-server-process-memory-has-been-paged-out/ for more details.

Cap the SQL Server MAX Server Memory after considering the memory required by other applications, Operating system, Drivers , SQL Server Non- Bpool allocations etc. Make sure you have adequate available physical memory even when the system is under heavy load.

Make sure you have all the fixes for working set trim refer http://mssqlwiki.com/sqlwiki/sql-performance/windows-2008-and-windows-2008-r2-known-issues-related-to-working-set-memory/

Invalid column name ‘originator_id’. (Microsoft SQL Server, Error: 207)

Create replication fails on database which is migrated from SQL Server 2005 with error

Error:

ADDITIONAL INFORMATION:
An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)
——————————
Invalid column name ‘originator_id’. (Microsoft SQL Server, Error: 207)

Resolution:
1.  Add originator_id  column to syspublications table in publication db. Run below query in your   publication DB.

    ALTER TABLE dbo.syspublications ADD [originator_id] [int] NULL;

2. Add a column originator_id  into the view dbo.syspublications inside distribution database

3. Add a originator_id  column into IHpublications table in distribution database

UPDATE STATISTICS WITH SAMPLE take longer then FULLSCAN

UPDATE STATISTICS WITH SAMPLE take longer then FULLSCAN and consumes more tempdb

Recently there was a question in MSDN forum about time and tempdb taken by update statistics with full scan and sampling.
http://social.msdn.microsoft.com/Forums/en/sqldatabaseengine/thread/eee70106-94aa-4140-82bf-654352f82d8f
Question was why full scan completes faster than sampling

I was trying to reproduce this behavior in one of my lab server.

1.When I was updating the statistics with n% sampling for the Clustered Index on Table(Which had clustered and non-clustered index ), SQL Server choses a non-clustered index to do the scan. Here the data scanned/fetched was in the order of the non-clustered index keys. Hence we need to SORT them once based on the clustered index keys to generate the statistics. Sort was consuming time and we also used tempdb to create the worktables.

2 .On the other side when I was using update stats with fullscan SQL Server chooses a “clustered index scan” to do the scan. Since the data was already sorted in clustered index, sort was avoided and update stats completed faster.

Why optimizer used non-clustered index to do the scan when I was using n% sampling ?
As we are using a non-clustered index to scan the rows, we effectively get lot lesser rows into the memory from disk when compared with using a clustered index to scan the rows. The decision about which index to use was cost based. As we know SQL Server is a cost based optimizer it chooses the index that can provide the results with lesser subtree cost.
If I wouldn’t have got that non-clustered index , then sampling might have run faster.
There is no straight answer “It depends“.You have to look at the plan of update stats and troubleshoot/Identify the cause like you do for any other slow queries.