Given the ubiquity and utility of SQL Server, it is the preferred method of data storage for many, from DBAs to developers to data analysts. Often though, there are instances where your data is stored remotely, perhaps in cloud-based services. For instance, you may have created a Google Form, with the responses stored in a Google Spreadsheet. With the CData DataCmdlet for Google Spreadsheets, you get direct access to your Google Spreadsheet from PowerShell scripts. When paired with the sqlps module from Microsoft, you can easily replicate data from your Google Spreadsheet to a SQL Server database. This article provides sample PowerShell scripting for such a replication. The PowerShell script snippets can be put together, in order of appearance, to create a working script, or you can download the attached script file.
Note: There are cmdlets available from CData Software for connecting to 70+ data sources.
Getting Started
To begin, you will want to import the sqlps module and install the GoogleSheetsCmdlets:
#Import sqlps Import-Module sqlps -DisableNameChecking #Install GoogleSheetsCmdlets Install-Module GoogleSheetsCmdlets
With the modules installed, you will next want to active the license for the GoogleSheetsCmdlets. To do so, you can simply call the Connect-GSheets
cmdlet. The cmdlet will request the name of a spreadsheet and then walk you through the licensing process (to activate the free, 30-day trial, simply use TRIAL
as the Product Key).
Querying Google Sheets Data
The first step in replicating the data is pulling the data from the Google Spreadsheet. To do so, we will store a connection object using the Connect-GSheets
cmdlet and then store the data using the Select-GSheets
cmdlet.
<# Variables to set which Google Spreadsheet and single sheet to connect to using the CData DataCmdlets for Google Spreadsheets #> $spreadsheet = 'SPREADSHEET_NAME' $sheet = 'SINGLE_SHEET_NAME' <# Connecting to Google Spreadsheets and reading data from the designated sheet. On the first connection, the user will be prompted to authenticate from a web-browser pop-up. Subsequent connections and requests will be authenticated silently by the module. #> $conn = Connect-GSheets -Spreadsheet $spreadsheet $data = Select-GSheets -Connection $conn -Table $sheet
In order to pull the individual values from the data and build an INSERT
query for SQL Server, we will need to save the names of the columns found in the Spreadsheet.
<# Discovering the name of the columns from the Google Spreadsheet. Note that we exclude the 'Id' column, which is specific to Google Spreadsheets and we exclude the Columns, Connection, and Table columns, which are used when piping data from one DataCmdlet to another (but are likely unecessary in a SQL Server replication of Google Spreadsheets data). #> $columns = ($data | Get-Member -MemberType NoteProperty | select -Property Name).Name | ? {$_ -NotIn @('Columns','Connection','Table','Id')}
With the data and column names from the Google Spreadsheet stored, we are ready to push the data into a SQL Server database.
Inserting Data into the SQL Server Database
For the sake of readability and reusability, you will want to create variables for the information about the SQL Server and user, as required by the sqlps module.
<# Variables used to connect to and send queries to a SQL Server database. #> $sqlServer = 'SERVER\INSTANCE' $sqlDatabase = 'DATABASE' $sqlUser = 'USER' $sqlPassword = 'PASSWORD' $sqlTable = 'TABLE'
With the connection variables configured, we can process the data from Google Sheets, using piping to build an INSERT
query for each row in the Spreadsheet, and push the data into the SQL Server database.
<# Piping the data row by row and building a query based using a comma-separated string of the columns and a comma-separated string of the corresponding values in the row. Each row is inserted individually into the database. ************************************************************** Please note that building queries using string concatenation is highly susceptible to attacks from SQL injection, which the sqlps module reportedly does not protect against. It is up to the user to provide sanitation of SQL inputs. ************************************************************** Lastly, we query the database to view the inserted data. #> $data | % { $row = $_ $values = @() $columns | % { $col = $_ $values += "'" + ($row.$($col) -replace "'", "''") + "'" } $query = "INSERT INTO " + $sqlTable + " (" + ($columns -join ", ") + ") VALUES (" + ($values -join ", ") + ");" Invoke-Sqlcmd -ServerInstance $sqlServer -Database $sqlDatabase -User $sqlUser -Password $sqlPassword -Query $query } $query = "SELECT * FROM " + $sqlTable + ";" Invoke-Sqlcmd -ServerInstance $sqlServer -Database $sqlDatabase -User $sqlUser -Password $sqlPassword -Query $query
Closing Thoughts & Next Steps
At this point, you have a working PowerShell script for replicating data from a Google Spreadsheet to a SQL Server database. This simple script is perfect for data that is neatly organized in the Google Sheet (the data starts in cell A1, the first row contains column headers). If your data is not well-formatted, you can still use the GoogleSheetsCmdlets, but you will need to configure your connection and query in order to account for the formatting of the data. For more information on this process, please refer to the Using Spreadsheets as Tables section in the online Help documentation.
You can also reconfigure the script to incrementally replicate the data, as long as there is a last-modified date/datetime type column in the data. You would simply set the -Where
parameter in the Select-GSheets
cmdlet when you retrieve the data:
$whereClause = 'LastModified > ' + $lastReplicationDatetime $data = Select-GSheets -Connection $conn -Table $sheet -Where $whereClause
CData Software has produced cmdlets for connecting to 70+ BigData, NoSQL, and SaaS sources. For more information, including downloads and free trials, please refer to the DataCmdlets page and start replicating your data to SQL Server today!