Loading MS Connect To A DB–Part 3: Keeping The Data Current
Tuesday, January 11, 2011 at 9:07PM In Part 2 of this series we finally got some data in the database. Now that there’s data we need to be able to keep it up to date. In particular we want to look for status and vote count changes.
We’re going to work with the same Connect items as was shown in the previous post (634019).
The question is, given that there is no RSS feed how are we going to keep this data current? This is where our friend the Net.WebClient comes to the rescue.
First things first, let’s grab a record to work with.
Pulling Data From the Database
We’re going to be using the Net.WebClient to open the URL associated with the Connect item, then parsing that information with Powershell and updating the database with the relevant details.
To pull in a record we use invoke-sqlcmd:
$workingURLs = invoke-sqlcmd -ServerInstance "." -Database "DenaliConnect" -Query "select ID, URL from dbo.ConnectItems where ID = 634019"
Now we take this recordset and dump the results to variables
$URL = $workingURLs.URL
$ID = $workingURLs.ID
The result is two variables:
$ID = 634019
$URL = http://connect.microsoft.com/SQLServer/feedback/details/634019/enable-controls-to-be-found-by-name-in-report-builder
Looking At The Web Page
With the URL captured it’s time to take a look at the data on the page to see what information is available and how we might get to it. The best way to do this is to open it up in the browser and then look at the source (click image to enbiggen)
Good news here, all of the information I want to grab to stored in meta tags. This makes like so much simpler when it comes to parsing the data.
Parsing The Data
First thing first we’ll pull the page
$PulledURL = (new-object Net.WebClient).DownloadString("$URL")
Now we have the page stored in a variable we can use our old friend REGEX to parse it and pull what we require. Let’s start with the Up-Votes.
Using regex we can pull just this data:
$regexUp = "<meta name=`"Search.MSConnect.Feedback.UpVoteCount[^>]*?/>"
$up = [regex]::Matches($PulledURL, "$regexup") | % { $_.Value } | Select -First 1
$up
which gives us: <meta name="Search.MSConnect.Feedback.UpVoteCount" content="4" />
Hey wow, that’s just what we needed. Using regex again we can pull just the numerical value and get the actual count.
$UpCount = [regex]::Matches($up, "\d{1,5}") | Select Value
$UpCountVal = [int]$UpCount.Value
$UpCountVal
We run this and we get 4. The right number!
Repeating this for the down-votes, validations, workarounds, opened date and status loads up a bunch of variables ready for writing to an update statement.
Updating The Database
We just use the variables to build an update statement
$SQLUpdate = @"
UPDATE dbo.ConnectItems
SET UpVoteCount = $UpCountVal,
DownVoteCount = $downCountVal,
ValidationCount = $validationCountVal,
WorkAroundCount = $WorkAroundCountVal,
OpenedDate = '$OpenedDate',
ItemStatus = '$ActualStatus'
WHERE ID = $ID
"@
$SQLUpdate
A quick invoke-sqlcmd and the database is updated. The voting information is current as are the other counts and the status.
invoke-sqlcmd -ServerInstance "." -Database "DenaliConnect" -Query "$SQLUpdate"
Final Steps
Now that this works fine for a single record it’s easy to turn the pull of the web page and the parsing into a function, then loop through all the currently active items and update their data accordingly.
You can download the full script here.
In the final step we’ll go over loading old data which wasn’t captured in the RSS feed.
- Intro
- Part 1 – Database & Powershell setup
- Part 2 – Parsing the RSS & loading the data
- Part 3 – Keeping the data current
- Part 4 – Loading old data


Reader Comments