(somewhat) Social

Entries in TSQL Tuesday (8)

Tuesday
May102011

T-SQL Tuesday #18–CTE A Simpler Form Of Recursion

It’s T-SQL Tuesday time folks. Bob Pusateri (blog|twitter) is the host and has chosen Common Table Expressions (CTEs) as the topic of the month.

Several ideas came to mind for writing about CTEs, one of the best uses I’ve seen for one recently was to grab the name of the most recent backup file for a database. You’ll have to ask Aaron Nelson (blog|twitter) to hook you up with that one though.

I thought I’d write about an interesting problem that was posed to me in an interview a couple of months ago.

 

Here’s a table

I was given a table with two columns; ManagerID, EmployeeID.

This table was populated with a few values thusly:

USE TempDB
GO
create table #ManagersEmployees (ManagerID int, EmployeeID int)
insert into #ManagersEmployees 
values(1,2), (2,3), (2,4), (2,5), (3,6), (3,7), (3,8)
    , (4,10),(5,11),(5,12), (12,13), (12,14)
GO

I was asked to write a recursive procedure to pull out the manager, employee tree for a given ManagerID.

 

CTEs to the rescue

Having done a little work with CTEs and understanding that I could easily write a recursive query using them I was able to quite quickly put together a script to pull the information needed. By throwing it into a procedure it could quickly and easily be executed.

CREATE PROCEDURE ManagerRecursion_CTE @ManagerID INT
AS
SET NOCOUNT ON
;WITH Managers_CTE (ManagerID, EmployeeID )
AS ( SELECT ManagerID, EmployeeID FROM #ManagersEmployees  
        WHERE ManagerID = @ManagerID
UNION ALL
    SELECT e.ManagerID, e.EmployeeID 
        FROM #ManagersEmployees e 
            INNER JOIN Managers_CTE c on e.ManagerID = c.EmployeeID)
SELECT * FROM Managers_CTE ORDER BY ManagerID, EmployeeID
GO

 

I tested and this worked nicely, it was a simple solution and provided the requested results.

 

That’s not recursion

The trouble is that while the results were not correct I was advised that this was not recursive and did not meet the criteria. Back to the drawing board then.

After a lot more work I came up with the following:

CREATE PROCEDURE ManagerRecursion_NonCTE @ManagerID INT
AS
SET NOCOUNT ON
DECLARE @rowcnt INT, @lastrow INT
DECLARE @Tbl_Results TABLE (rowid INT IDENTITY(1,1), ManagerID INT, EmployeeID INT)
 
INSERT INTO @Tbl_Results (ManagerID, EmployeeID)
SELECT ManagerID, EmployeeID
FROM #ManagersEmployees
WHERE ManagerID = @ManagerID
 
SET @rowcnt = @@ROWCOUNT
SET @lastrow = 0
WHILE @rowcnt > 0
BEGIN
INSERT INTO @Tbl_Results (ManagerID, EmployeeID)
SELECT m.ManagerID, m.EmployeeID
FROM #ManagersEmployees m
    INNER JOIN @Tbl_Results t
        ON m.ManagerID = t.EmployeeID
WHERE rowid > @lastrow
 
SELECT @rowcnt = @@ROWCOUNT
 
SELECT @lastrow = @@IDENTITY - @rowcnt
END
SELECT ManagerID, EmployeeID FROM @Tbl_Results ORDER BY ManagerID, EmployeeID
GO

 

I tested this and got back the same results as with the first procedure with all the values I passed in. Deep breath on this one as it was pushing the limits of what I could produce on the spot in an interview.

 

That’s still not recursion

Again, while the results we correct this was not recursive. It was back to the drawing board once more. This time I had to admit defeat, however did tell the interviewer that I would work on a solution at home and email it in. He gave me his contact information, we completed the rest of the interview and I went home determined to get the right data in the manner that the interviewer wanted.

After a whole bunch of reading and a lot of work I finally came up with correct results in a recursive procedure which I emailed in to get feedback.

CREATE PROC ManagerRecursion @EmpID INT, @InnerLoop INT = 0
AS
BEGIN
    IF @InnerLoop = 0
        BEGIN
        CREATE TABLE #Tbl_Results (ManagerID INT, EmployeeID INT)
        END
            INSERT INTO #Tbl_Results (ManagerID, EmployeeID)
            SELECT ManagerID, EmployeeID FROM #ManagersEmployees WHERE ManagerID = @EmpID
 
            SET @EmpID = (SELECT MIN(EmployeeID) FROM #ManagersEmployees WHERE ManagerID = @EmpID)
 
            WHILE @EmpID IS NOT NULL
            BEGIN
                EXEC ManagerRecursion @EmpID, 1
                SET @EmpID = (SELECT MIN(EmployeeID) FROM #ManagersEmployees WHERE EmployeeID > @EmpID 
                    AND EmployeeID NOT IN (SELECT ManagerID FROM #Tbl_Results)
                    AND EmployeeID IN (SELECT EmployeeID FROM #Tbl_Results))
            END
    IF @InnerLoop = 0   
        BEGIN
        SELECT * FROM #Tbl_Results order by ManagerID, EmployeeID
        DROP TABLE #Tbl_Results
        END      
END
 
GO

 

Unfortunately no feedback was forthcoming. I felt good about providing this solution despite that. I enjoy a challenge and this was certainly one of those.

 

So what was the right way?

That depends on who you ask. For me the first way was the right was. It performed well, the code was clean and easy and required a minimum amount of development. I feel that the solution here was exactly the reason that CTEs were created in the first place.

The second solution was a lot more work, the query got more complex and it does not perform as well as the first.

The final procedure was true recursion in that the procedure calls itself over and over again until all of the results are returned. It’s a loop that makes cursors look like they perform well. It was easily the worst performing of the three.

 

It all goes to show there’s more than one way to get the results you need. It’s also interesting how an example like this shows just how much work the SQL development team have done to help reduce complexity and improve performance.

Monday
Jan102011

T-SQL Tuesday #14–Resolutions

Newly minted MVP Jen McCown (blog|twitter) of MidnightDBA fame is this months T-SQL Tuesday host. Jen wants us to talk about resolutions. My personal laptop is a 1600 x 900. Oh wait, that’s a lousy play on words.

I don’t actually do the whole resolution thing. Sure the day ticks over and the year changes from one to the next but it’s just another day. I know a lot of people have strong feelings about the New Year and potential for new beginnings, I’m just not one of those people.

That being said I do have some goals that I aim to accomplish within the next few months.

 

Certification

I completed my MCITP: Database Administrator 2008 (exams 70-432 and 70-450) last year and have been figuring that I should look at taking the MCITP: Database Development 2008 (exams 70-433 and 70-451). After all I’ve been doing this stuff for a few years now so it shouldn’t be too bad, right? Anyhow, thanks to Microsoft offering their Second Shot program I decided to just go ahead and do so. My first exam is scheduled for next month. Depending on how that one works out I’ll try for the other in July.

 

Presenting

I spoke at my first SQLSaturday in Salt Lake City last year. I really enjoyed it and want to present to at least four more events this year. I’ve already submitted a Powershell session for SQL Saturday #67 in Chicago, and hope to get chosen (there are a lot of submitted sessions already with some amazing speakers so if I don’t get picked there won’t be any disappointment on my part). Unfortunately I’ll be on vacation for the local event in Olympia so will be looking to see what other travel excursions I can take.

I also made the decision that I would submit a session to the PASS Summit this year. I think the odds on my getting selected to present there are even less than they are at SQL Saturday #67 however I won’t be letting that deter me.

 

Blogging

I’ve certainly picked this up more in the last few months after a very wayward start. My posts usually take at least a couple of hours to put together, mainly because I am trying to do some knowledge imparting. Rather than just provide a script I try to explain the methodology behind it in the hopes that it will help you, the reader, to take that and expand upon what I’ve done.

After reading Paul Randal’s (blog|twitter) post the other day “So you want to blog about SQL Server?” I’ve decided I’m going to intersperse some shorter posts alongside my rambling ones just so that I get some regular content out there.

 

Personal

I really have one key personal goal to achieve in the next few months. I’ve been living in the US for about 12 years now. I’m not going to be moving back to England, I have far too many ties here, as such I need to knuckle down and get my citizenship sorted out. That is really my number one thing and takes priority over all the others. Fortunately there’s not much more to do than fill in a few forms, pay a bunch of money and wait to take the Naturalization Test (link goes to a self test on the USCIS site, take it and see how well you do). The test really should be easy to pass but I am going to be studying the heck out of it just to be doubly sure that I do. Not something I want to mess up.

 

Once all these things are over and done with I think I’ll just go ahead and take the MCM exam. You know, just to pass the time…

Monday
Dec132010

T-SQL Tuesday #13–A Mysterious Business

It’s T-SQL Tuesday again and Steve Jones (blog|twitter), Mr SQLServerCentral, is hosting this month. Steve has provided a topic that is near and dear to the heart of pretty much every DBA – “What issues have you had in interacting with the business to get your job done”.

I really didn’t know what I could possibly write about this given that I’ve never had an issue with the business.

Stopped laughing yet?

Here’s an actual business challenge that I have had the fun to deal with in the last few months.

 

What is the maximum number of columns you can have in a table?

It’s a simple little question like that which instantly sets off the alarms in your head, especially when it comes from a Business Analyst. What worried me even more was that this individual had a habit of promising the moon on a stick to users without actually understanding the bigger picture of what might be involved.

The BA was delighted when I told him that the non-wide table limit was 1,024 columns (based upon the Maximum Capacity Specifications in BOL). This took the plotz-meter from rumble to gurgle.

I heard nothing more from the BA for a couple of months so figured it had been just a curiosity that I had answered, then…

 

Please deploy this work order to create a new table

A work order showed up in my queue. A script was attached to create a bunch of new tables. One of these tables was to have 514 columns, another 365 and another 234 along with a few others that were in the 30-40 column range. That’s some pretty wide tables. Each of these tables was to also have a whole host of indexes (more than 40 on one of them) and were expected to contain several million rows of data. There were also FK constraints (well at least there was some DRI).

With an impending sense of doom I asked why we were creating such wide tables and how they were going to be used. I had a hope that these were going to be used in some kind of OLAP solution. It was a faint hope given that we don’t have Analysis Services deployed anywhere.

I was advised that data was going to be loaded into the tables using a “real-time” ETL process to provide reporting functionality to the user base. There would be a bunch of reports written which could be called, there would also be ad-hoc reporting access to a small group of power users.

I could understand if all of these tables were denormalized to the max however to actually get useful data out would require performing a minimum of four joins.

 

The business needs them

When taken back to the development team they stated that they developed this to the requirements that they were given. I can understand and appreciate this, they are working with what they are provided. I have no doubt they would have questioned this requirement and just been asked to proceed.

I performed a code review and sent back my recommendations. Narrower tables, denormalization, fewer indexes (given that they could never cover any of the queries that would be executed). I also explained the huge performance issues that we were going to experience, the impact to the other databases on the server and how we were going to see some quite significant disk issues.

My recommendations were returned with “these need to go in, the business needs them and we have made a commitment to get them in by next week”.

 

You’re there to keep the boss happy

My concerns were obviously not being heeded so it was escalation time. I went to my boss and explained the situation. I went over the concerns I had, the expected performance problems we would experience, the bad design, the lack of denormalization which would have been the ideal solution for what the team was attempting to do.

It sat with the boss for a couple of days. Finally he got back to me. It had gone up the chain and come back down again. The decision was that “the business” needed this and until I could provide hard numbers that this was a problem in production it would go in.

That was a bit of an epiphany moment. I found out that I am not actually there to do the best job that I can. I’m not there to get the most out of each and every server, eek out a little more performance from a SQL instance and get things running all nice and smooth. I came to the realization that I was just there to keep the boss happy.

 

Performance isn’t everything

The new tables went into production. Disk queuing went through the roof, buffer cache went through the floor and my head went through the desk. Tickets were opened because things were performing so poorly, they were rerouted to the BA and closed as “expected behavior”.

I provided perfmon stats and I showed the severe impact that these new tables and processes were having on our environment. I sat down with management and explained what was going on over and over again figuring that some change would be made.

I tracked down a few of the power-users who were a part of “the business” and asked them what they thought of the whole situation. To a person they told me that this was not what they wanted, not what they asked for and not what they needed.

 

These tables still exist. Performance is still lousy. I’ve stopped taking this to management as it wasn’t keeping my boss happy. I would love to say that the business won, but it doesn’t actually look like the business did. It seems more than likely that this was the brainchild of one BA who got the ear of someone important and made their dream a nightmare reality.

Monday
Sep132010

T-SQL Tuesday #10 - Applying Indexes To Replicated Tables

It's T-SQL Tuesday time again! This month Michael J. Swart (blog|twitter) is hosting and has given us the topic of indexes.

I've previously written about how you can find indexes that are not being used and get their sizes. Not being allowed to reuse that meant that I had to come up with something else.

In my previous post on calculating replication schema options I documented a way of being able to change article options in transactional replication so that you could apply all the non-clustered indexes on your subscribing table that exist on the publishing one. But what if you are using replication to offload reporting and you need to create indexes that don't exist on the publishing table?

You could always manually apply these indexes any time that you perform a snapshot but this requires manual work and can easily be forgotten.

Fortunately there is another option, you can have replication automatically apply a post-snapshot SQL script which will create those indexes for you. Here's how...

 

Create a basic publication (snapshot or transactional) and add a table article. Perform the snapshot and compare indexes between the tables (in this example I am using the person.address table from AdventureWorks2008).

After the snapshot here's how the indexes stand

 

For this example we're going to create a new index on Admindb.person.address. To do this first we need to generate the index creation script

USE AdminDB
GO
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[person].[address]') AND name = N'NCIX_Zip')
CREATE NONCLUSTERED INDEX [NCIX_Zip] ON [person].[address]
(
    [PostalCode] ASC
) WITH (ONLINE = ON, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

 

Save this script to C:\ReplicationScripts\CreateReportingIndex.sql

 

Now we change the existing publication so that whenever a snapshot is taken it will apply this script when applied to the subscriber. This change can be done in a couple of ways.

The quick and easy way is to use sp_changepublication and just pass in the publication name, property change and location of the script.

USE AdventureWorks2008
GO
EXEC sp_changepublication
    @publication = 'Post Snapshot Test'
  , @property = N'post_snapshot_script'
  , @value = 'C:\ReplicationScripts\CreateReportingIndex.sql'
  , @force_invalidate_snapshot = 1 --must invalidate existing snapshots for this to take effect
GO

 

This can also be done using the GUI (right click on the publication and select Properties)

 

 

When you next perform a snapshot then the script will be applied. We can check that this actually worked first by looking at replication monitor where it will tell you that the script was applied.

 

And then you can check the table itself and confirm that the index exists

 

 

There are a couple of final things to note.

Usually you will specify a unc path for the SQL script so that you just have to maintain a single copy. You just need to ensure that the subscriber is able to access the post-snapshot script location.

Post-snapshots are not limited to creating indexes, in fact you can do pretty much anything just so long as the Distribution Agent has the access;

  • Send yourself an email to let you know that the snapshot has been applied on the subscriber
  • Apply table permissions
  • Create views

 

If you use replication and find yourself manually reapplying indexes it's worth evaluating whether using post-snapshot scripts might save you some work.

Monday
Jul122010

T-SQL Tuesday #008 - Gettin My Learnin On

This months T-SQL Tuesday is being hosted by Robert Davis (blog|twitter) and he wants to take a look at learning. 

I'm finding this quite appropriate right now given a recent blog post by Buck Woody (blog|blog|blog|twitter) based around creating a "SQL Server Professional Development Plan" (go give that a read by the way).

In his article Buck talks about foundation learning and educating yourself about a possible career and it's opportunities. This took me back, back, back a little further and then back a tad more than that to when I was in school. As a 13 year old kid I was a bit of a smart arse (some things never change). I had gotten to grips with a ZX81 and the ZX Spectrum and could see that computers were the way forward. It had reached a point in my schooling age where we had to choose our options for the next couple of years of study. For me it was a simple choice. I was going to take computers for two years, then go to college and take Computer Science. Sweet!

 

I put in my options and was very excited the next week when I got back my class schedule. I opened my schedule and to my absolute horror I was stuck with taking geography instead. Upon asking the reason for this I was advised that the school felt I would be wasted in the computer class and was much better suited to geography. I still don't know why they called it geography as it all seemed to be about geology.

The bitter taste of geography did not sit well, it really put a complete dampener on education for me and I never did wind up going to college.

 

Many years later I found myself interviewing for a job at a company for overnight work (never fun) which consisted of changing backup tapes and watching some monitors for alarms. If an alarm came in I was to call the oncall person and then they would fix it.

That was 10 years ago (oh and Jorge, Arcserve sucked back then too).

I did the changing of the tapes and I called people when things broke. Here's the problem with that...it's BORING!

 

After a couple of weeks I started engaging with the oncall folks on the issues. Let me help I told them. I'll help with the bad things "I want your drama". My reasoning was that if they helped me understand what was broken and how it could be fixed then I wouldn't need to call anyone the next time. This would allow the team to have an improved quality of life and eliminate the Lady Gaga oncall person, you know the type....

Obligatory GagaYou called, I can't hear a thing.
I have got no service
In the club, you see, see
Wha-Wha-What did you say?
Oh, you're breaking up on me
Sorry, I cannot hear you,
I'm kinda busy.

 

 

Helping these folks out got me a solid grounding in Windows NT 4.0 and 2000.

Now that everyone knew I could do more than stare at flashing lights they asked me to help with the rollouts of the web stuff, so I got started with using SiteServ.

One day the boss came to me and said that I'd been doing some good work, that we had this big upgrade from SQL 7.0 to SQL 2000 that was going to take place and would I like to be involved. My first response was "I'd love to, sounds awesome". My second was "so what's SQL?".

I worked with that team for a few years, they taught me a great deal.

One of my fondest memories of those early days of SQL for me was being made fun of for using Enterprise Manager. I was told in no uncertain terms that if I ever wanted to amount to anything as a DBA that I had to learn T-SQL and I had to use it for the powers of good (ie for admin purposes, not for development).

That has held me in good sway over the years and it's something I still force myself to do today. As much as possible I will use T-SQL to perform a task. Now I'm taking it a step further and I'm using powershell to do things that previously I would never have considered using a scripting language for. As an example, today I took a script I already had, made a very quick change and had something running in 5 minutes that gave me the size of every drive C on all of my servers.

 

In a roundabout way this get's me to my point. Learning. I learn by doing. I have books a plenty, but nothing sinks into my brain until I actually do it.

The old saying is that practice makes perfect. We learn by doing. So go out, get your hands dirty, buy yourself a license for SQL 2008 Developer Edition it's only $50. Once you have it install it somewhere. Corrupt some databases, recovery from the corruption. Create some databases, write some queries, have some fun with it. Once you start playing you'll start learning.

 

Go out, get yer learnin on!