(somewhat) Social

Entries in SQL (36)

Tuesday
May102011

T-SQL Tuesday #18–CTE A Simpler Form Of Recursion

It’s T-SQL Tuesday time folks. Bob Pusateri (blog|twitter) is the host and has chosen Common Table Expressions (CTEs) as the topic of the month.

Several ideas came to mind for writing about CTEs, one of the best uses I’ve seen for one recently was to grab the name of the most recent backup file for a database. You’ll have to ask Aaron Nelson (blog|twitter) to hook you up with that one though.

I thought I’d write about an interesting problem that was posed to me in an interview a couple of months ago.

 

Here’s a table

I was given a table with two columns; ManagerID, EmployeeID.

This table was populated with a few values thusly:

USE TempDB
GO
create table #ManagersEmployees (ManagerID int, EmployeeID int)
insert into #ManagersEmployees 
values(1,2), (2,3), (2,4), (2,5), (3,6), (3,7), (3,8)
    , (4,10),(5,11),(5,12), (12,13), (12,14)
GO

I was asked to write a recursive procedure to pull out the manager, employee tree for a given ManagerID.

 

CTEs to the rescue

Having done a little work with CTEs and understanding that I could easily write a recursive query using them I was able to quite quickly put together a script to pull the information needed. By throwing it into a procedure it could quickly and easily be executed.

CREATE PROCEDURE ManagerRecursion_CTE @ManagerID INT
AS
SET NOCOUNT ON
;WITH Managers_CTE (ManagerID, EmployeeID )
AS ( SELECT ManagerID, EmployeeID FROM #ManagersEmployees  
        WHERE ManagerID = @ManagerID
UNION ALL
    SELECT e.ManagerID, e.EmployeeID 
        FROM #ManagersEmployees e 
            INNER JOIN Managers_CTE c on e.ManagerID = c.EmployeeID)
SELECT * FROM Managers_CTE ORDER BY ManagerID, EmployeeID
GO

 

I tested and this worked nicely, it was a simple solution and provided the requested results.

 

That’s not recursion

The trouble is that while the results were not correct I was advised that this was not recursive and did not meet the criteria. Back to the drawing board then.

After a lot more work I came up with the following:

CREATE PROCEDURE ManagerRecursion_NonCTE @ManagerID INT
AS
SET NOCOUNT ON
DECLARE @rowcnt INT, @lastrow INT
DECLARE @Tbl_Results TABLE (rowid INT IDENTITY(1,1), ManagerID INT, EmployeeID INT)
 
INSERT INTO @Tbl_Results (ManagerID, EmployeeID)
SELECT ManagerID, EmployeeID
FROM #ManagersEmployees
WHERE ManagerID = @ManagerID
 
SET @rowcnt = @@ROWCOUNT
SET @lastrow = 0
WHILE @rowcnt > 0
BEGIN
INSERT INTO @Tbl_Results (ManagerID, EmployeeID)
SELECT m.ManagerID, m.EmployeeID
FROM #ManagersEmployees m
    INNER JOIN @Tbl_Results t
        ON m.ManagerID = t.EmployeeID
WHERE rowid > @lastrow
 
SELECT @rowcnt = @@ROWCOUNT
 
SELECT @lastrow = @@IDENTITY - @rowcnt
END
SELECT ManagerID, EmployeeID FROM @Tbl_Results ORDER BY ManagerID, EmployeeID
GO

 

I tested this and got back the same results as with the first procedure with all the values I passed in. Deep breath on this one as it was pushing the limits of what I could produce on the spot in an interview.

 

That’s still not recursion

Again, while the results we correct this was not recursive. It was back to the drawing board once more. This time I had to admit defeat, however did tell the interviewer that I would work on a solution at home and email it in. He gave me his contact information, we completed the rest of the interview and I went home determined to get the right data in the manner that the interviewer wanted.

After a whole bunch of reading and a lot of work I finally came up with correct results in a recursive procedure which I emailed in to get feedback.

CREATE PROC ManagerRecursion @EmpID INT, @InnerLoop INT = 0
AS
BEGIN
    IF @InnerLoop = 0
        BEGIN
        CREATE TABLE #Tbl_Results (ManagerID INT, EmployeeID INT)
        END
            INSERT INTO #Tbl_Results (ManagerID, EmployeeID)
            SELECT ManagerID, EmployeeID FROM #ManagersEmployees WHERE ManagerID = @EmpID
 
            SET @EmpID = (SELECT MIN(EmployeeID) FROM #ManagersEmployees WHERE ManagerID = @EmpID)
 
            WHILE @EmpID IS NOT NULL
            BEGIN
                EXEC ManagerRecursion @EmpID, 1
                SET @EmpID = (SELECT MIN(EmployeeID) FROM #ManagersEmployees WHERE EmployeeID > @EmpID 
                    AND EmployeeID NOT IN (SELECT ManagerID FROM #Tbl_Results)
                    AND EmployeeID IN (SELECT EmployeeID FROM #Tbl_Results))
            END
    IF @InnerLoop = 0   
        BEGIN
        SELECT * FROM #Tbl_Results order by ManagerID, EmployeeID
        DROP TABLE #Tbl_Results
        END      
END
 
GO

 

Unfortunately no feedback was forthcoming. I felt good about providing this solution despite that. I enjoy a challenge and this was certainly one of those.

 

So what was the right way?

That depends on who you ask. For me the first way was the right was. It performed well, the code was clean and easy and required a minimum amount of development. I feel that the solution here was exactly the reason that CTEs were created in the first place.

The second solution was a lot more work, the query got more complex and it does not perform as well as the first.

The final procedure was true recursion in that the procedure calls itself over and over again until all of the results are returned. It’s a loop that makes cursors look like they perform well. It was easily the worst performing of the three.

 

It all goes to show there’s more than one way to get the results you need. It’s also interesting how an example like this shows just how much work the SQL development team have done to help reduce complexity and improve performance.

Monday
May092011

Upcoming Clustering Fun

I’ve a fun week upcoming, I get to do a bunch of installs of SQL 2008 R2 on some Windows 2008 R2 clusters.

I’ve had a lot of experience in building and working with Windows 2003 clusters and deploying SQL 2008, this is a new era for me and means new toys to play with and new things to learn. I’m excited to really get to grips with Windows 2008 clustering, there are some significant differences between it and 2003 which is going to provide some challenges. I’m aiming to perform quite a few build up and tear downs of one cluster over the next couple of weeks to build up my levels of comfort with the newer technology, and work towards getting some nice scripts together to do so.

As a consequence of Windows 2008 R2 being new to me for clustering I decided to pick up Pro SQL Server 2008 Failover Clustering by Allan Hirt (blog|twitter), a Clustering MVP and guru (who has a new whitepaper out about Applying Updates to a Clustered Instance of SQL Server 2008 or SQL Server 2008 R2). Allan was on the MidnightDBA web show a couple of weeks ago (go watch it) where there was lots of conversation on several aspects of High Availability.

I’m still only a quarter of the way through the book but have already found out some fantastic information on things such as the network priority and installing server roles from the command line.

As the build goes along I plan on putting up a couple of posts to track the process and gotchas as well as a final review of Allan’s book, so check back soon.

Friday
May062011

Handling Periods In Database Names With PowerShell

I came across an interesting problem today. We have a vendor who’s tool, for some unknown reason, creates a database with a period in the name (ie Data.Base). Why on earth anyone would do this I don’t know, but it’s happened. I found this out when my PowerShell script to capture database sizes failed.

It seems that PowerShell was not able to handle that period.

 

Reproducing the problem

Create a new database using SSMS:

CREATE DATABASE [Terrible.DBName]

 

Now open up PowerShell and try to query what’s in that database:

dir SQLSERVER:\SQL\localhost\default\databases\Terrible.DBName

 

Doing so gives the error:

Get-ChildItem : Cannot retrieve the dynamic parameters for the cmdlet. SQL Server PowerShell provider error: The number of keys specified does not match the number of keys required to address this object. The number of keys required are: Name.
At line:1 char:4
+ dir <<<<  SQLSERVER:\SQL\localhost\default\databases\Terrible.DBName
    + CategoryInfo          : InvalidArgument: (:) [Get-ChildItem], ParameterBindingException
    + FullyQualifiedErrorId : GetDynamicParametersException,Microsoft.PowerShell.Commands.GetChildItem
   Command

I tried encapsulating the path in quotes and escaping the period with a back tick, both to no avail.

dir 'SQLSERVER:\SQL\localhost\default\databases\Terrible.DBName'    
dir SQLSERVER:\SQL\localhost\default\databases\Terrible`.DBName

 

Fixing the problem

Eventually I found the solution. Changing the period to it’s hex value of %2e allowed me to query correctly.

dir SQLSERVER:\SQL\localhost\default\databases\Terrible%2eDBName  

PeriodInDBName

 

Updating scripts

Now that I found the problem and resolution I’ll be going through and changing my scripts to take this into account.

One thing to note, when trying to use –replace to replace the period in a variable you need to place it within square brackets otherwise the entire string gets replaced with %2e.

The wrong way:

$a = "Some.String"
$a = $a -replace ".", "%2e"
$a

PeriodInDBNameWrongString

 

The right way:

$a = "Some.String"
$a = $a -replace "[.]", "%2e"
$a

PeriodInDBNameRightString

Thursday
May052011

Checking Database Space With PowerShell

Have you ever been asked the question “quick, I need to know how much free space there is in the Nodrap database on server Cromulent”?

Ever wished that you could quickly grab that information?

You might well have a TSQL script that will grab that information for you. I’ve been working on pulling that information centrally, grabbing that data for multiple servers becomes a much simpler task when using PowerShell.

 

Why PowerShell and not TSQL?

For me the big bonus that PowerShell provides me for grabbing this data is that I can pull it quickly and easily, I can also simply run it for multiple machines and the same script works for versions SQL 2000 and up.

 

How big is my database?

Open up your favorite PowerShell ISE (you can even use the one that comes with Windows). If the aren’t already loaded you’ll need to add the SQL snapins

Add-PSSnapin SqlServer* 

 

To get a list of databases, their sizes and the space available in each on the local SQL instance.

dir SQLSERVER:\SQL\localhost\default\databases | Select Name, Size, SpaceAvailable | ft -auto

CheckDBWPS SizeSpace

Very quick and easy. The information here is a little misleading though.

The Size is the size of the database and includes size of the transaction log(s). SpaceAvailable only represents the space available in the data files however, unless we’re looking at SQL 2000 in which case it’s the free space in the data and log files. To make things even more confusing Size is reported in MB and SpaceAvailable in KB.

To get more details we need to look at the transaction log information.

 

How big is my transaction log?

To grab this information we need to go deeper. In this case we will focus on the AdventureWorks database.

dir SQLSERVER:\SQL\localhost\default\databases\AdventureWorks\logfiles | 
    Select Name, Size, UsedSpace | ft -auto

CheckDBWPS LogSpace

 

This information is all in KB which helps have it make a little more sense.

 

Doing the math

Now we have the size of the database, the size of the log and the free space in each it’s some quick math which gives us the sizing information.

  • Size = Database Size in MB
  • Size / 1024 (from log) = Log size in MB
  • UsedSpace / 1024 = Log used in MB
  • (Size – UsedSpace) / 1024 = Log free in MB
  • Size – (Size / 1024) (from log) = Data files size in MB
  • SpaceAvailable / 1024 = Space free in data files
  • Size – (SpaceAvailable / 1024) – (Size / 1024) (from log) = Space used in data files
  • Size – ((SpaceAvailable / 1024) – ((Size – UsedSpace) / 1024) = Space used in data files (SQL 2000)

 

Outputting nice results

Running all this and grabbing the data from different areas can lead to messy results. This is where the PowerShell DataTable come to the rescue.

A DataTable is a PowerShell object, much like a SQL table that can hold data for you.

First you create the object, define and add columns and then add rows before finally returning the data.

Here’s a quick example:

$dataTable = New-Object system.Data.DataTable "SomeStuff"
 
#Specify the column names for the data table
$col1 = New-Object system.Data.DataColumn FunkyText,([string])
$col2 = New-Object system.Data.DataColumn JustANumber,([int])
$col3 = New-Object system.Data.DataColumn TodaysDate,([datetime])
 
#Add the columns to the data table
$dataTable.Columns.Add($col1)
$dataTable.Columns.Add($col2)
$dataTable.Columns.Add($col3)
 
#Create a new row
$row = $dataTable.NewRow()
#Add data for each column in the row
$row.FunkyText = "Something really funky"
$row.JustANumber = 1395
$row.TodaysDate = ((Get-Date ).ToString("yyyy-MM-dd HH:mm:ss"))
#Add the new row to the datatable
$dataTable.Rows.Add($row)
 
#Output the datatable
$dataTable | out-gridview

CheckDBWPS Datatable

 

Putting everything together

Running the following script will pull the data and log information for all the databases on the SQL instance specified in the top parameter. It also handles named instances without modification, and is factored to return the correct information for SQL 2000 as well as higher versions

Note: Run DBCC UPDATEUSAGE for SQL 2000 instances to ensure that the data is accurate

Param ($Servername= 'localhost')
 
 
 
Function Get-DBSizes ($Servername)
{
$Servername
 
#Check to see if it's a name instance, if it is the location will be different
if ($Servername -ilike "*\*") { $BaseLocation = "SQLSERVER:\SQL\$Servername\databases" }
    else { $BaseLocation = "SQLSERVER:\SQL\$Servername\default\databases" }
    
$dblist = dir $BaseLocation -Force | select Name
foreach ($db in $dblist)
{
    
    $location = $db.Name
    $locationFixed = $location -replace "[.]", "%2e"
    #Grab the database information
    $DBInfo = dir $BaseLocation -Force | Where-Object {$_.Name -eq $location; $_.Refresh()} | 
        select Name, 
                size,
                SpaceAvailable,
                CompatibilityLevel
 
    #Pull the log information
    #Use measure-object to sum up sizes in the event that we might have more than one log file
    $logsize = dir $BaseLocation\$locationFixed\logfiles | Where-Object {$_.Name; $_.Refresh()} | Measure-Object -Property Size -Sum 
    $logused = dir $BaseLocation\$locationFixed\logfiles | Where-Object {$_.Name; $_.Refresh()} | Measure-Object -Property UsedSpace -Sum 
    
    $sp = $DBInfo.SpaceAvailable
    $TotalDBSizeMB = $DBInfo.size
    $LogSizeMB = ($logsize.sum / 1024)
    $LogUsedMB = ($logused.Sum / 1024)
    $LogFreeMB = ($LogSizeMB - $LogUsedMB)
    $DataFilesSizeMB = ($TotalDBSizeMB - $LogSizeMB)
    $SpaceAvail = ($dbinfo.SpaceAvailable / 1024)
    
    #Because SQL2000 spaceavailable    includes log space we have to do different calculations depending on version
    #Run DBCC UPDATEUSAGE on your 2000 databases nightly to help ensure this data is accurate
    $Compat = $DBInfo.CompatibilityLevel
    if ($Compat -eq 'Version80') { $DataFileFreeMB = ($SpaceAvail - $LogFreeMB)    }
        else { $DataFileFreeMB = $SpaceAvail }
    
    
    $DataFileUsedMB = ($DataFilesSizeMB - $DataFileFreeMB)
    $DataFilePercentUsed = ($DataFileUsedMB / $DataFilesSizeMB) * 100
    $DataFilePercentFree = 100 - $DataFilePercentUsed 
    $LogPercentUsed = ($LogUsedMB / $LogSizeMB) * 100
    $LogPercentFree = 100 - $LogPercentUsed
    $date = (Get-Date ).ToString("yyyy-MM-dd HH:mm:ss")
    
    #Write the results into the data table
    $row = $dataTable.NewRow()
    $row.ServerName = $Servername
    $row.DatabaseName = $location
    $row.TotalDBSizeMB = $TotalDBSizeMB
    $row.DataFilesSizeMB = $DataFilesSizeMB
    $row.DataFilesUsedMB = $DataFileUsedMB
    $row.DataFilesFreeMB = $DataFileFreeMB
    $row.DataPercentUsed = $DataFilePercentUsed
    $row.DataPercentFree = $DataFilePercentFree
    $row.LogFilesSizeMB = $LogSizeMB
    $row.LogFilesUsedMB = $LogUsedMB
    $row.LogFilesFreeMB = $LogFreeMB
    $row.LogPercentUsed = $LogPercentUsed
    $row.LogPercentFree = $LogPercentFree
    $row.Date = $date
    $dataTable.Rows.Add($row)
    
    #And we are done
}
}
 
 
#Create data table to hold the results
$dataTable = New-Object system.Data.DataTable "Results"
#Specify the column names for the data table
$col1 = New-Object system.Data.DataColumn ServerName,([string])
$col2 = New-Object system.Data.DataColumn DatabaseName,([string])
$col3 = New-Object system.Data.DataColumn TotalDBSizeMB,([int])
$col4 = New-Object system.Data.DataColumn DataFilesSizeMB,([int])
$col5 = New-Object system.Data.DataColumn DataFilesUsedMB,([int])
$col6 = New-Object system.Data.DataColumn DataFilesFreeMB,([int])
$col7 = New-Object system.Data.DataColumn DataPercentUsed,([decimal])
$col8 = New-Object system.Data.DataColumn DataPercentFree,([decimal])
$col9 = New-Object system.Data.DataColumn LogFilesSizeMB,([int])
$col10 = New-Object system.Data.DataColumn LogFilesUsedMB,([int])
$col11 = New-Object system.Data.DataColumn LogFilesFreeMB,([int])
$col12 = New-Object system.Data.DataColumn LogPercentUsed,([decimal])
$col13 = New-Object system.Data.DataColumn LogPercentFree,([decimal])
$col14 = New-Object system.Data.DataColumn Date,([datetime])
#Add the columns to the data table
$dataTable.Columns.Add($col1)
$dataTable.Columns.Add($col2)
$dataTable.Columns.Add($col3)
$dataTable.Columns.Add($col4)
$dataTable.Columns.Add($col5)
$dataTable.Columns.Add($col6)
$dataTable.Columns.Add($col7)
$dataTable.Columns.Add($col8)
$dataTable.Columns.Add($col9)
$dataTable.Columns.Add($col10)
$dataTable.Columns.Add($col11)
$dataTable.Columns.Add($col12)
$dataTable.Columns.Add($col13)
$dataTable.Columns.Add($col14)
 
 
 
#Provide the name of the SQL server that we want to check
 
$WarningPreference = "silentlycontinue"
#Call the function to populate the results
#get-content c:\Temp\serverlist.txt | % {Get-DBSizes $_  } 
Get-DBSizes $Servername
 
#Output the results
$dataTable | Out-GridView
 

Click to embiggen

 

Final note

Download the PS1 file and save it on your machine, from there you can call the script and just pass in the SQL instance to get the results eg:

./Get-DatabaseSize.ps1 localhost
./Get-DatabaseSize.ps1 cromulent\awesomesauce

In a follow up post I’ll show how we apply a minor tweak to the script and have the data loaded into a SQL database.

Tuesday
May032011

Guest Post On Hey, Scripting Guy! Blog

Recently I was contacted by Aaron Nelson (blog|twitter) who provided me an awesome opportunity to write a guest blog post for the Hey, Scripting Guy! Blog. Naturally I jumped at the chance and the post went live this week as a part of a week of SQL related post in honor of SQLRally which is coming up next week in Orlando.SQLRally

My post was on using PowerShell to report on SQL Server backup status. Go have a read and let me know what you think. With the rest of the week being dedicated to SQL and PowerShell don’t forget to keep checking the Hey, Scripting Guy! Blog.