(somewhat) Social
« Estimating Data Compression | Main | SQL Job Notifications »
Monday
May242010

BCP Using Powershell

I have a bunch of stacked up blog posts I need to get out. Heavy 14 hour,  7 day a week workloads while trying to move house is seriously impacting my ability to get these things done. This post in particular has been one I've been wanting to get done in a while. It would have been great to get this out last week in honor of Powershell Week at SQL University, still better late than never huh?

 

The Request

I was approached by "The Boss" and was asked if I could "provide assistance" to the development team and the customer (please note little c). We were about to release a massive internal application upgrade, one that was a fundamental rewrite of an application which had been in production for over 8 years and was central to pretty much every other application my group supports (kind of explains the long works days I've been having).

I was asked if there was a way that I could provide an export of every table in the new OLTP database, on a nightly basis, to a file share. This data was to be used by internal groups so that they could "do their stuff". Persuaded of the critical nature of this export I sat down to figure out how I could best accomplish this.

As we all know, Powershell is hella cool, I even have a Powershell tag on the site, so why not make use of this?

 

The Trouble With Powershell

There is really not much bad you can say about Powershell. My only real beef is that it's majorly addictive. Here's how it works

  • You start doing something simple
  • Realization comes that if you do a little more work then you can make it totally reusable code for something that may come along later.
  • Then while testing you see places where you can make improvements.
  • You make those improvements and see where you can add something.
  • Then you ask why you can't make it more extensible.
  • Ooh, wait, I should really add this, it would be useful later

STOP!!!

 At some point you just have to let it go. Then come back to it later when you have more time to do all the fun things you want to do.

 

The Awesomeness Of Powershell

Below is the script which performs a data export using BCP.The following steps are performed

  • Creates the required directory structure
  • Loops through the tables in the database
  • Dumps the table schema to a file
  • Creates a XML format file for each
  • BCPs out the data
  • Zips the data/format files/schema
  • Copies the zip file to a remote share
  • Download the entire script.

Each step can be enabled/disabled using a flag. I'll do my best to explain the script in parts as we go.

 

The Script

Setting up the variables

First off I'm going to show the variables that are used and can be set for the process. I've commented this part pretty well to show what each variable does.

If you are going to zip and copy the data then you will need to download and install 7-Zip, the free opensource Zip utility. I would also recommend downloading Robocopy if you are going to copy the zip file to another location.

The row delimiter is hex based, I needed something a little more obscure than the regular as some of the tables being exported have columns containing carriage returns/line feeds.

#============================================
#REQUIRED VARIABLES
#============================================
#SQL Server & Local Path Information
$ServerInstance = ".\YUKON"            #SQL Instance where data resides
$BCPRoot = "C:\"                            #The root directory where we will be dumping the data, subdirectories will be created from here
$BCPDir = "AdventureWorksExport"        #Subdirectory used for export
$BCPDatabase = "AdventureWorks"    #Name of the database that we will be dumping object from

#Options: 1 = enabled, 0 = disabled
$ScriptSchema = 1             #creates drop & create schema scripts
$DeleteOldSchema = 0        #deletes the old schema scripts (flag ignored if $ScriptSchema = 1
$ScriptFormatFiles = 1        #creates xml formatfiles for every object (flag ignored if $BCPOutData = 1)
$DeleteOldFormatFiles = 1    #deletes the old formatfiles (flag ignored if $ScriptFormatFiles = 1)
$BCPOutData = 1                #BCPs out the data
$DeleteOldBCP = 1            #Deletes the old BCP'd data (flag ignored if $BCPOutData = 1)
$ZipData = 1                #Zips the exported data (uses 7-zip, required installation)
$RemoteCopy = 1                #Copy the zipped data to remote location
$UseRobocopy = 1            #Uses Robocopy to perform file copy otherwise uses windows copy process
$CleanupLocalFile = 1         #Deletes yesterdays zip file on the local machine
$CleanupRemoteFile = 1         #Deletes yesterdays zip file on the remote machine

#Zip Information (requires 7-zip to be installed)
$7ZipPath = "c:\program files\7-zip\7z.exe "#Change path depending upon location of 7-zip
$ZipType = "-tzip"                            #Type of zip format
$ZipUseAlternateTempPath = 1                #Allow for the use of an alternative temp path (uses c:\windows\temp by default)
$ZipTemp = "-wc:\"                            #Location of temp path (ignored if $ZipUseAlternateTempPath = 0)


#Delimiters used for the formatfiles and BCP
$rowdelim = "0x0D0A7E7E0D0A" #Row delimiter #hex crlf~~crlf
$coldelim = "|^|"             #Column delimiter


#Copy options
$RobocopyPath = "c:\PSTools\robocopy.exe"            #Location of robocopy (ignored if $UseRobocopy = 0)
$CopyDestination = "\\RemoteServer\Fileshare$"            #Remote file location (ignored if $RemoteCopy = 0)


#============================================
#END REQUIRED VARIABLES
#============================================

 

Functions used

The original iteration of this script was one long script, then I found that I was reusing code multiple times and decided that turning the work to functions just made a lot more sense.

 

Function CheckPaths checks that the export directories exist under the root path specified in the variables. If those directories don't exist then they are created.

out-file is used throughout the script to write information to a log.

#============================================
#CHECK OUTPUT PATHS & CREATE IF THEY DON'T EXIST
#============================================
Function CheckPaths ($BCPPath, $BCPRoot, $BCPDir, $LogPath)
    {
    #First thing, check to see if the BCPPath exists, if it doesn't, create it
    if (Test-Path $BCPPath)
        {
            "("+(Get-Date -UFormat "%Y-%m-%d %H:%M") +") Starting Script"  | Out-File $LogPath
            "Base directory exists" | Out-File -Append $LogPath
        }
    else
        {
            "Creating base directory"
            New-Item -Path $BCPRoot -Name $BCPDir -type Directory -Force | Out-Null
            "("+(Get-Date -UFormat "%Y-%m-%d %H:%M") +") Starting Script"  | Out-File $LogPath
            "("+(Get-Date -UFormat "%Y-%m-%d %H:%M") +") Creating base directory"  | Out-File $LogPath
        }
    
    
    #Check for the schema directory
    if (Test-Path "$BCPPath\schema") {"Schema dump path exists" | Out-File -Append $LogPath }
        else
            {
                "Creating schema dump path" | Out-File -Append $LogPath
                New-Item -Path $BCPPath -Name schema -type directory -Force | Out-Null
            }
        
    #Check for the data directory
    if (Test-Path "$BCPPath\data") { "BCP data path exists" | Out-File -Append $LogPath    }
        else
            {
                "Creating BCP data path" | Out-File -Append $LogPath
                New-Item -Path $BCPPath -Name data -type directory -Force | Out-Null
            }
        
    #Check for the formatfiles directory
    if (Test-Path "$BCPPath\formatfiles") {"Formatfile path exists" | Out-File -Append $LogPath    }
        else
            {
                "Creating formatfile path" | Out-File -Append $LogPath
                New-Item -Path $BCPPath -Name formatfiles -type directory -Force | Out-Null
            }    
    }    
   

 

Function ExportFormatFiles creates format files for the table name passed in using the row and column delimiter variables assigned.

 #============================================
#CREATE FORMATFILES
#============================================
Function ExportFormatFiles ($export, $formatfile, $coldelim, $rowdelim, $ServerInstance, $LogPath)
    {
        #create a bcp format file
        "("+(Get-Date -UFormat "%Y-%m-%d %H:%M") +") Creating format file for [$BCPDatabase].$tables" | Out-File -Append $LogPath
        bcp $export format nul -c -x -f $formatfile -t $coldelim -r $rowdelim -T -S $ServerInstance | Out-Null
    }

 

 

Function BCPOutData performs the actual BCP out. It uses the format file created by the ExportFormatFiles function. out-null is used at the end of the bcp command line to prevent the scrolling output as the data is exported.

#============================================
#BCP OUT DATA
#============================================
Function BCPOutData ($export, $formatfile, $coldelim, $rowdelim, $ServerInstance, $LogPath)
    {
        "("+(Get-Date -UFormat "%Y-%m-%d %H:%M") +") BCPing out data for [$BCPDatabase].$tables" | Out-File -Append $LogPath
        bcp $export out $exportfile -f $formatfile -T -S $ServerInstance | Out-Null
    }

 

Function ExecProcess is used to call an external process from within powershell. WaitForExit() allows the script to complete an external process before continuing. We use this when performing the zip and when using robocopy.

#============================================
#CALL EXTERNAL PROCESSES
#============================================
function ExecProcess($Process, $PassedArguments)
    {
        $proc = New-Object     System.Diagnostics.Process
        $proc.StartInfo.FileName = $Process
        $proc.StartInfo.Arguments = $PassedArguments
        $proc.Start()
        $proc.WaitForExit()
    }

 

 

Function ScriptOutSchemas creates two files, one with the DDL to create the tables we are exporting, the other to drop them. Of particular note here are the script options used. NoIdentities allows columns which have identity set on them to be scripted without that option. NoFileGroup removes any reference to filegroups from the create table statement.

#============================================    
#SCRIPT OUT THE SCHEMAS
#============================================
Function ScriptOutSchemas ($scriptopt, $script, $BCPPath, $tables)
    {
        "("+(Get-Date -UFormat "%Y-%m-%d %H:%M") +") Scripting Schema for [$BCPDatabase].$tables" | Out-File -Append $LogPath
        #This generates a drop script for the table
        $scriptopt.ScriptDrops = $True
        $scriptopt.IncludeIfNotExists = $True
        $scriptopt.FileName = "$BCPPath\schema\DropTables.sql"
        $scriptopt.AppendToFile = $True
        $script.Script($tables) + "GO `r" | Out-Null
        
        
        #This generates the create script for the table and turns off Identity, ignores filegroups
        $scriptopt.NoIdentities = $True
        $scriptopt.ScriptDrops = $False
        $scriptopt.NoFileGroup = $true
        $scriptopt.IncludeIfNotExists = $False
        $scriptopt.FileName = "$BCPPath\schema\CreateTables.sql"
        $scriptopt.AppendToFile = $True    
        $script.Script($tables) + "GO `r`r" | Out-Null
    }    

 

Function DeleteOldData is reused most. We check for the existance of the passed in file, if it's there then we get rid of it.

#============================================    
#DELETE OLD DATA
#============================================
Function DeleteOldData ($FileToDelete)
    {
        "("+(Get-Date -UFormat "%Y-%m-%d %H:%M") +") Deleting "+$FileToDelete | Out-File -Append $LogPath
        if (Test-Path $FileToDelete) { Remove-Item $FileToDelete -Force }#| out-null    }
    }

 

 

 

The main script

This comprises several items. First we take all of our defined parameters and build final variables.

#============================================
#Build paths et al
$BCPPath = $BCPRoot+$BCPDir
$LogPath = $BCPRoot+$BCPDir+"\Export"+$TodaysDate+".log"
$TodaysDate = Get-Date -Format "yyyyMMdd"
$YesterdaysDate = Date(Get-Date).AddDays(-1) -Format "yyyyMMdd"

$FormatFileDeletePath = $BCPPath+"\formatfiles\*.xml"
$BCPFileDeletePath = $BCPPath+"\data\*.txt"
$SchemaDeletePath = $BCPPath+"\schema\*.sql"

$ZipFile = $BCPRoot+$BCPDatabase+"_"+$TodaysDate+".zip"
if ($ZipUseAlternateTempPath) {    $arguments = @("a", $ZipType, $ZipFile, $BCPPath, $ZipTemp)    }
    else { $arguments = @("a", $ZipType, $ZipFile, $BCPPath) }
    
$CopyFileName = $BCPDatabase+"_"+$TodaysDate+".zip"    
$CopyArguments = @($BCPRoot, $CopyDestination, $CopyFileName )    
    
$YesterdaysLocalFileName = $BCPRoot+$BCPDatabase+"_"+$YesterdaysDate+".zip"
$YesterdaysRemoteFileName = $CopyDestination+"\"+$BCPDatabase+"_"+$YesterdaysDate+".zip"
#============================================   

 

 

Then we check to see if certain options have been flagged disabled and notify to the screen if so.

#NOTIFICATION ON DISABLED OPTIONS
if (!($ScriptSchema)) {"Schema scripting disabled"  | Out-File -Append $LogPath }
if (!($ScriptFormatFiles)) {"Formatfile creation disabled" | Out-File -Append $LogPath }
if (!($BCPOutData)) {"BCP export disabled"  | Out-File -Append $LogPath }

 

Call the function to check and see if the directory structure exists

#Check that the paths exist for the exports
CheckPaths $BCPPath $BCPRoot $BCPDir $LogPath

 

Delete any old data based upon set flags.

#Delete old schema files if required
if ($ScriptSchema) { DeleteOldData $SchemaDeletePath }
    else { if ($DeleteOldSchema) { DeleteOldData $SchemaDeletePath } }

#Delete old format files if required
if ($ScriptFormatFiles) { DeleteOldData $FormatFileDeletePath }
    else { if ($DeleteOldFormatFiles) { DeleteOldData $FormatFileDeletePath } }

#Delete old BCP data files if required
if ($BCPOutData){ DeleteOldData $BCPFileDeletePath }
    else{ if ($DeleteOldBCP) { DeleteOldData $BCPFileDeletePath }}

 

We're using SMO to do the work within SQL so we initialize that assembly.

#Load SMO & Connect to the SQL Server
[System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SMO') | out-null
$script = new-object("Microsoft.SqlServer.Management.Smo.Scripter")
$s = new-object ('Microsoft.SqlServer.Management.Smo.Server') "$ServerInstance"
$script.Server=$s
$scriptopt = $script.Options


Now the work really begins. We go through all the tables in the database and depending on the set flags we export the schema, create the format files and bcp out the data. As a part of this we check to see if we are trying to do work with a system table, if so we ignore it and move on to the next. Note, each format file and each bcp'd file is named after the table.

$dbs=$s.Databases
#Generate script for all tables
foreach ($tables in $dbs["$BCPDatabase"].Tables)
    {
    #ignore system tables
    if ($tables.IsSystemObject)    {"[$BCPDatabase].$tables is a system object and will be ignored" | Out-File -Append $LogPath }
        else {
                #Gen schema if required
                if ($ScriptSchema) { ScriptOutSchemas $scriptopt $script $BCPPath $tables }

                #BCP out data - if BCP we always create format files
                if ($BCPOutData)
                    {
                        $export = "["+$BCPDatabase+"]."+$tables        
                
                        #set the export paths
                        $formatfile = "`"$BCPPath\formatfiles\"+$tables+".xml`""
                        $formatfile = $formatfile | foreach {$_ -replace "\[", ""} | foreach {$_ -replace "\]", ""}
                
                        ExportFormatFiles $export $formatfile $coldelim $rowdelim $ServerInstance $LogPath
                
                        $exportfile = "`"$BCPPath\data\"+$tables+".txt`""
                        $exportfile = $exportfile | foreach {$_ -replace "\[", ""} | foreach {$_ -replace "\]", ""}
                
                        BCPOutData $export $formatfile $coldelim $rowdelim $ServerInstance $LogPath
                    }
                #if not BCP let's check if we export the formatfiles
                else
                    {
                    if ($ScriptFormatFiles)
                        {
                            $export = "["+$BCPDatabase+"]."+$tables
                            $formatfile = "`"$BCPPath\formatfiles\"+$tables+".xml`""
                            $formatfile = $formatfile | foreach {$_ -replace "\[", ""} | foreach {$_ -replace "\]", ""}
                            
                            ExportFormatFiles $export $formatfile $coldelim $rowdelim $ServerInstance $LogPath
                        }
                }
        }
    }
    
"("+(Get-Date -UFormat "%Y-%m-%d %H:%M") +") Exports complete" | Out-File -Append $LogPath
        

If the zip flag is set then we compress the entire export (data, format files, schema) into a zip file using 7-Zip.

#Zip the exported data
if ($ZipData) { ExecProcess $7ZipPath $arguments }

 

We copy the zipped file over to a remote location. If the robocopy flag is not set then we use a regular file copy.

#Copy the exports to the destination server
if ($RemoteCopy)
    {
        if ($UseRobocopy) {    ExecProcess $RobocopyPath $CopyArguments }
            else { Copy-Item $ZipFile $CopyDestination -Force }
    }    

 

Finally we clean up yesterdays data and delete it from both the local and remote location.

#Delete old data
if ($CleanupLocalFile) { DeleteOldData $YesterdaysLocalFileName }
if ($CleanupRemoteFile) { DeleteOldData $YesterdaysRemoteFileName }



"("+(Get-Date -UFormat "%Y-%m-%d %H:%M") +") Script complete"

    

 

That's it. Done. Below is a link to the full script. Download it, try it, let me know how it works for you. I'm looking to enhance it some down the road by adding things like the use of an xml file for the settings used, being able to query a list of tables to be exported rather than just dump everything in the database. Once I have the time of course...

 

Download the entire script.

PrintView Printer Friendly Version

EmailEmail Article to Friend

References (7)

References allow you to track sources for this article, as well as articles that were written in response to this article.
  • Response
    Response: www.meotin.com
    Thats finally sinking in.
  • Response
    Response: qewKemOJ
    English To SQL - English To SQL - Blog - BCP Using Powershell
  • Response
    English To SQL - English To SQL - Blog - BCP Using Powershell
  • Response
    This is the third installment in a blog series. The previous entry is located here Based on the material
  • Response
    This is the third installment in a blog series. The previous entry is located here Based on the material
  • Response
    Lovely page, Stick to the great work. Thank you so much!
  • Response
    Response: Hollister
    English To SQL - English To SQL - Blog - BCP Using Powershell,See artikkel on kirjutatud täpne aga kui sa tahad, et näha seotud artikleid saate vaadata infot siit:Hollister,

Reader Comments (1)

Hello,

This post is absolutely incredible. It has just helped me massively in writing a script to BCP out a whole bunch of tables and views to files on disk and then BCP them back into tables with SQL Azure, much, much faster than the migration wizard could manage it.

Thank you so much.

Legend!

Martyn.

June 22, 2012 | Unregistered CommenterMartyn Jones

PostPost a New Comment

Enter your information below to add a new comment.

My response is on my own website »
Author Email (optional):
Author URL (optional):
Post:
 
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>