(somewhat) Social

Entries in dashboard (1)

Monday
Jan182010

ourSSD 1 - Is My Node Up?

You know that you are going to hear about it if your SQL Server is down. In fact I'm sure you have all kinds of bells and whistles that will grab your attention the second it happens, pre-empting the user community that will come screaming in your direction like a hoard of banshees being serenaded by bagpipers from hell. What if SQL isn't down though? Let's say that you are running in an active/passive cluster configuration, you think everything is great, you are covered. What happens when your active node takes a dive and it starts to failover like the good little cluster group it is, only to find that the other node is not able to take over control of the cluster because it's not up or not responding correctly? Yeah, you have a problem.          

This scenario is what brought about the first item I'll share for the monitoring with ourSSD. Very simply, are the physical nodes that make up my SQL Server infrastructure up and functional? (and by that I don't mean just pingable)          

How to do this, and do it for free? Not being a developer, knowing no languages I struggled, then, just as the night seemed darkest came a bright shiny Powershell light. This thing is awesome, in fact it was so good that it deserved bold type (it was || close to caps!). The things that you can do with it and the level of simplicity it can bring with the vast number of things that it can do make it an invaluable tool.
    

I'm not going to wax lyrical about Powershell integration with SQL nor give you the full skinny and rundown, there are people and resources way better than I at doing that sort of thing, take these links for example          


Here's what you get from me....          

So what's the deal, is my node up? Pinging will tell us that it's on the network, that can be misleading however. So how about we do something with the server and check the result? This is where WMI and Powershell hold hands and make smiley faces to give us something useful.          

Kick off the SQL Powershell Provider sqlps and let's check to see if you're up. In the Powershell window enter          

Get-WmiObject -class Win32_OperatingSystem -computername .          

 

Check it out, some data about your local computer. You should get your system directory, organization, Windows build number and a couple more things. If you want a slightly more crazy list use the following          



Get-WmiObject -class Win32_OperatingSystem -computername . | SELECT *           

 

Ok, that's a crazy amount of stuff. Thing is, we know that the computer is up (well of course we do, that's where we ran the script). What about another machine, well provided you have the requisite permissions then you can change the period for another computer on the network and get the results from it          



Get-WmiObject -class Win32_OperatingSystem -computername YUKON1 | SELECT *          

 

Hey, what's the deal? I just got an error saying that the RPC Server is unavailable. That's because I don't have a YUKON1 OR because YUKON1 cannot respond to my WMI query. Let's say that YUKON1 does exist, we don't need all that data to know that it's up, just having a result is good enough. Heck, we want to find out the status so let's use that.          



Get-WmiObject -class Win32_OperatingSystem -computername YUKON1 | SELECT Status
Status
-------
OK
    

 

Sweet, it's up and running. This is very simple, very basic and all that we need. Now, how to turn that one little item into something we can use for more than one server at a time.          

 

 

One of the most challenging things I found when first starting out was figuring out how to get and work with a recordset. I didn't want to have to type in stuff constantly, nor did I want to read in things from text files. I'm a DBA afer all, the data should be in the database. This is where Invoke-SqlCmd comes in. First, login to a SQL Server, create a database to hold some of the record stuff (I'm using AdminDB and will throughout all the examples and we'll add a table for the check.          



Use SQLMonitor
GO
CREATE TABLE dbo.PhysicalNodeStatus
 (
  NodeName varchar(25),
  NodeStatus varchar(4),
  CheckTime datetime default getdate()
 )
GO
CREATE TABLE dbo.PhysicalNodes
 (
  NodeName varchar(25)
 )
GO
    

 

Insert into the dbo.PhysicalNodes table some data for a couple of machines (include one or two that might be down or not exist for control purposes). Crank up your favourite text editor and let's work with some data.          



#Get a recordset of NodeNames from the PhysicalNodes table so that we can work with them
$DSNodes = Invoke-SqlCmd -ServerInstance ".\YUKON" -Database "SQLMonitor" -Query "SELECT NodeName from PhysicalNodes ORDER BY NodeName"
#Run the WMI Query for each of the records returned
foreach ($dbn in $DSNodes)
 {
  $nodename = $dbn.NodeName
  $goodbad = Get-WmiObject -class Win32_OperatingSystem -computername $nodename | SELECT Status
  $good = $goodbad.Status
  Write-Host "$nodename $good"
 }
    

 

If you save and run this you should get a nice returned list of servers along with OK if they were good. Of course, if they are not good, for whatever reason, you get the big red ugly error syntax. We generally don't want to see this and so at the very top of the script add $ErrorActionPreference = "SilentlyContinue"
This is good so far, how about something a little more visually interesting and useful. What you can do is use the value of Status in the $good variable and work with that to return something different depending on whether we got an ok result or not.          




$ErrorActionPreference = "SilentlyContinue"

#Get a recordset of NodeNames from the PhysicalNodes table so that we can work with them
$DSNodes = Invoke-SqlCmd -ServerInstance ".\YUKON" -Database "SQLMonitor" -Query "SELECT NodeName from PhysicalNodes ORDER BY NodeName"

#Run the WMI Query for each of the records returned

foreach ($dbn in $DSNodes)
 {
  $nodename = $dbn.NodeName
  $goodbad = Get-WmiObject -class Win32_OperatingSystem -computerName $nodename | SELECT Status
  $good = $goodbad.Status

 

 if ($good -eq "OK")
  {
   Write-Host -BackGroundColor Green -ForeGroundColor Black "$nodename good"
  }
 Else
  {
   Write-Host -BackGroundColor Red "$nodename bad"
  }
 }

 

Run this and now we have a nice list of the nodes and whether the node is good or bad plus some highlighting. Very useful, and great if you want to run manually. The final step though, let's get that information back into the database where we can work with it and do other things like create reports or send emails. Once more Invoke-SqlCmd comes to the rescue. I struggled for an age with using SqlConnection to do this, and was able to get it to work, this is much easier to work with however. For this we are simply going to perform an insert along with the data returned from the WMI query. We end up with a final script thusly...          




$ErrorActionPreference = "SilentlyContinue"

#Get a recordset of NodeNames from the PhysicalNodes table so that we can work with them
$DSNodes = Invoke-SqlCmd -ServerInstance ".\YUKON" -Database "SQLMonitor" -Query "SELECT NodeName from PhysicalNodes ORDER BY NodeName"

#Run the WMI Query for each of the records returned
foreach ($dbn in $DSNodes)
 {
  $nodename = $dbn.NodeName
  $goodbad = Get-WmiObject -class Win32_OperatingSystem -computerName $nodename | SELECT Status
  $good = $goodbad.Status  

 If ($good -eq "OK")
  {
   Write-Host -BackGroundColor Green -ForeGroundColor Black "$nodename good"
   Invoke-SqlCmd -ServerInstance ".\YUKON" -Database "SQLMonitor" -Query "INSERT INTO dbo.PhysicalNodeStatus (NodeName, NodeStatus) VALUES ('$nodename', 'up')"
  }
 Else
  {
   Write-Host -BackGroundColor Red "$nodename bad"
   Invoke-SqlCmd -ServerInstance ".\YUKON" -Database "SQLMonitor" -Query "INSERT INTO dbo.PhysicalNodeStatus (NodeName, NodeStatus) VALUES ('$nodename', 'down')"
  }
}  

 

Run this and check the PhysicalNodeStatus table, you should see the the node, it's status and the time you ran the process. Execute this as many times as you want and check the results are entered correctly for each run.          

 

Comment out the Write-Host lines, throw it in a Powershell Step within a SQL job, schedule and take a step back, check through the results. Boom. Done.