(somewhat) Social
« T-SQL Tuesday #14–Resolutions | Main | Loading Microsoft Connect Items To A Database–Intro »
Thursday
Jan062011

Loading MS Connect To A DB–Part 1: DB & Powershell Setup

Database Setup

Step one in loading the data into the database was to create the schema. Already knowing what was going to be stored it was quite simple to create a couple of tables. One to hold the permanent data and another staging table for working with the data.

For the purposes of this I created a new database “DenaliConnect” and used the following script to create the schema: 
1/11/2011 - due to extremely long URLs causing data truncation errors I've increased the URL column from varchar(255) to varchar(1000) 

   1: USE [DenaliConnect]
   2: GO
   3: SET ANSI_NULLS ON
   4: GO
   5: SET QUOTED_IDENTIFIER ON
   6: GO
   7: SET ANSI_PADDING ON
   8: GO
   9: CREATE TABLE [dbo].[ConnectItems_Staging](
  10:     [ID] [int] NOT NULL,
  11:     [Title] [varchar](1000) NULL,
  12:     [URL] [varchar](1000) NOT NULL,
  13:     [ItemStatus] [varchar](1000) NOT NULL,
  14:     [UpVoteCount] [int] NOT NULL,
  15:     [DownVoteCount] [int] NOT NULL,
  16:     [ItemDescription] [nvarchar](max) NOT NULL,
  17:     [Author] [varchar](100) NULL,
  18:     [Modified] [datetime] NOT NULL,
  19: PRIMARY KEY CLUSTERED 
  20: (
  21:     [ID] ASC
  22: )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
  23: ) ON [PRIMARY]
  24: GO
  25: SET ANSI_PADDING OFF
  26: GO
  27:  
  28: SET ANSI_NULLS ON
  29: GO
  30: SET QUOTED_IDENTIFIER ON
  31: GO
  32: SET ANSI_PADDING ON
  33: GO
  34: CREATE TABLE [dbo].[ConnectItems](
  35:     [ID] [int] NOT NULL,
  36:     [Title] [varchar](1000) NULL,
  37:     [URL] [varchar](1000) NOT NULL,
  38:     [ItemStatus] [varchar](1000) NOT NULL,
  39:     [UpVoteCount] [int] NOT NULL,
  40:     [DownVoteCount] [int] NOT NULL,
  41:     [ItemDescription] [nvarchar](max) NOT NULL,
  42:     [Author] [varchar](100) NULL,
  43:     [Modified] [datetime] NOT NULL,
  44:     [ValidationCount] [int] NOT NULL,
  45:     [WorkAroundCount] [int] NOT NULL,
  46:     [OpenedDate] [datetime] NOT NULL,
  47: PRIMARY KEY CLUSTERED 
  48: (
  49:     [ID] ASC
  50: )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
  51: ) ON [PRIMARY]
  52: GO
  53: SET ANSI_PADDING OFF
  54: GO
  55:  
  56: ALTER TABLE [dbo].[ConnectItems_Staging] ADD  DEFAULT ((0)) FOR [UpVoteCount]
  57: GO
  58: ALTER TABLE [dbo].[ConnectItems_Staging] ADD  DEFAULT ((0)) FOR [DownVoteCount]
  59: GO
  60: ALTER TABLE [dbo].[ConnectItems_Staging] ADD  DEFAULT (getdate()) FOR [Modified]
  61: GO
  62: ALTER TABLE [dbo].[ConnectItems] ADD  DEFAULT ((0)) FOR [UpVoteCount]
  63: GO
  64: ALTER TABLE [dbo].[ConnectItems] ADD  DEFAULT ((0)) FOR [DownVoteCount]
  65: GO
  66: ALTER TABLE [dbo].[ConnectItems] ADD  DEFAULT (getdate()) FOR [Modified]
  67: GO
  68: ALTER TABLE [dbo].[ConnectItems] ADD  DEFAULT ((0)) FOR [ValidationCount]
  69: GO
  70: ALTER TABLE [dbo].[ConnectItems] ADD  DEFAULT ((0)) FOR [WorkAroundCount]
  71: GO
  72: ALTER TABLE [dbo].[ConnectItems] ADD  DEFAULT ('01/01/1901') FOR [OpenedDate]
  73: GO

 

This gives us a couple of tables to work with. To get the data from the ConnectItems_Staging to the ConnectItems table I took advantage of the MERGE statement in SQL 2008 and created a stored procedure to perform the load:

   1: SET ANSI_NULLS ON
   2: GO
   3: SET QUOTED_IDENTIFIER ON
   4: GO
   5: CREATE procedure [dbo].[MergeItems]
   6: AS
   7: MERGE dbo.ConnectItems as Target
   8: USING (SELECT ID, Title, URL, ItemStatus, UpVoteCount, DownVoteCount, ItemDescription, Author, Modified FROM dbo.ConnectItems_Staging) AS Source
   9: ON (Target.ID = Source.ID)
  10: WHEN MATCHED THEN
  11:     UPDATE SET Target.UpVoteCount = Source.UpVoteCount, Target.DownVoteCount = Source.DownVoteCount, Target.Modified = Source.Modified, Target.ItemStatus = Source.ItemStatus
  12: WHEN NOT MATCHED BY TARGET THEN
  13:     INSERT  (ID, Title, URL, ItemStatus, UpVoteCount, DownVoteCount, ItemDescription, Author, Modified)
  14:     VALUES (Source.ID, Source.Title, Source.URL, Source.ItemStatus, Source.UpVoteCount, Source.DownVoteCount, Source.ItemDescription, Source.Author, Source.Modified)
  15: ;
  16: GO

 

Unfortunately when it came to the online database which was going to be the backend for the published data we were stuck with running SQL 2005 in which MERGE is not supported. As such I ended up creating another proc to perform this task

   1: SET ANSI_NULLS ON
   2: GO
   3: SET QUOTED_IDENTIFIER ON
   4: GO
   5: CREATE PROCEDURE [dbo].[InsertUpdateItems]
   6: AS
   7: --Update existing records
   8: UPDATE C
   9: SET UpVoteCount = S.UpVoteCount
  10:     , DownVoteCount = S.DownVoteCount
  11:     , Modified = S.Modified
  12:     , ItemStatus = S.ItemStatus
  13: FROM dbo.ConnectItems C
  14:     INNER JOIN dbo.ConnectItems_Staging S
  15:         ON C.ID = S.ID
  16: --Insert new ones        
  17: INSERT INTO dbo.ConnectItems (ID, Title, URL, ItemStatus, UpVoteCount, DownVoteCount, ItemDescription, Author, Modified)
  18: SELECT S.ID, S.Title, S.URL, S.ItemStatus, S.UpVoteCount, S.DownVoteCount, S.ItemDescription, S.Author, S.Modified
  19: FROM dbo.ConnectItems_Staging S
  20:     LEFT OUTER JOIN dbo.ConnectItems C
  21:         ON S.ID = C.ID
  22: WHERE C.ID IS NULL        
  23: GO

 

Powershell Setup

The data pull is performed using Powershell V2 and the PowershellPack (which contains 10 modules to help you get the most out of Powershell). Included in the PowershellPack is the PSRSS module to manage RSS feeds.

I originally tried using the WebClient functionality of Powershell (New-Object System.Net.WebClient) which I use in my MCM Video Download script however I ran into encoding errors on the page. PSRSS eliminated this as a problem and allowed me to move forward.

 

Next Up

In Part 2 I’ll cover pulling the RSS feed from MS Connect and parsing the data for loading into the database.

 

PrintView Printer Friendly Version

EmailEmail Article to Friend

Reader Comments

There are no comments for this journal entry. To create a new comment, use the form below.

PostPost a New Comment

Enter your information below to add a new comment.

My response is on my own website »
Author Email (optional):
Author URL (optional):
Post:
 
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>