Tuesday, November 29, 2011

Why you should not use undocumented commands in production server?

Today I worked on a interesting issue...

There was a Non-yielding scheduler dump generated and continuos Non-yielding scheduler error messages in SQL Server errorlog. This somehow caused Latch timeouts and Buffer Latch timeouts and SQL Server went unresponsive.

When we analyzed the dump, thread which is owning the latch was running the following query and query kept running without coming out of scheduler...

DECLARE cursor1 CURSOR FOR

SELECT name FROM master.dbo.sysdatabases WHERE status&512 = 0
OPEN cursor1
FETCH NEXT FROM cursor1 INTO @DBName
WHILE @@FETCH_STATUS = 0
BEGIN
SET @SQLString = 'USE [' + @DBName + '] DBCC LOGINFO WITH NO_INFOMSGS'
/* Some business logic here */
FETCH NEXT FROM cursor1 INTO @DBName
END
DEALLOCATE cursor1
 
Nothing is wrong with this query but what went wrong?
 
Problem was this instance had a snapshot database (CREATE DATABASE AS SNAPSHOT OF) and as explained in this link http://connect.microsoft.com/SQLServer/feedback/details/292136/running-dbcc-loginfo-on-db-snapshot SPID kept spinning for a spinlock without knowing that there is no log file for the database snapshot which caused Non-yielding scheduler situation in SQL Server.
 
Issue has been fixed in SQL 2008 and solution for SQL 2005 is to exclude the snapshot database name which running DBCC LOGINFO....
 
So this is one more instance to instruct DBA's not to use Undocumented commands in production server which will cause undesired situations in the production server...
 
Also this is an example to quickly generate Non-yielding scheduler and dump in SQL Server :)