As part of my job, I needed to research how a few things work with Synapse and Fabric. The latter includes the former, mostly. I decided to setup a workspace and a do some experimentation.
My first stop was the Stairway to Synapse Analytics. I started with Level 1, since I needed to create something.
In the portal, I created a new Synapse Analytics resource. I won’t go into details, but basically create a new resource and search. Read the article to see how this works. From there, I used a resource group I had for work stuff. I also had to create a new Data Lake Storage Gen2 account. I didn’t bother with the Managed Resource Group.
I added a user account and password and then clicked Create. I went to the summary page, where I could see the small cost for serverless. I clicked “Create” again to start the deployment.
The deployment failed.
I was leaning towards this not being helpful, but I went into the details of the error message and found something interesting. My subscription couldn’t deploy in this region (westeurope).
OK, go back, recreate the resource in UKSouth. That worked fine and I saw my resources.
At this point, I had a resource.
Adding Data
The next step for me was to add some data. I grabbed a few csv files and uploaded them to my storage account. Lots of querying in Synapse is through external tables to flat files, so I picked some files I can query.
Once these were uploaded, I was next interested to see if I could query this data. In my main Synapse resource, I see some endpoints.
I copied the serverless one and then opened SSMS. I put this, my user and pwd, and got connected. I was hoping that @@Version would tell me I was connected to Synapse, but I got the same results I get some Azure SQL Database, albeit with a different timestamp. However, ServerProperty() helps.
That works, what about my data? Let’s try a query from Level 2.
I’ll take this query and run it. I’ve adjusted this from the values in Level 2, which actually uses the Synapse Workspace explorer online.
–retrieve data from csv file
SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'https://synapsesqlprompt.blob.core.windows.net/sqlpromptfs/solar_2024_01.csv',
FORMAT = 'CSV',
HEADER_ROW = TRUE,
PARSER_VERSION = '2.0'
)
WITH (
[Time] DATE,
[System Production (Wh)] VARCHAR (100)
) AS [result]
It fails.
Hmmm, let’s try that in the browser. Here it works.
The error is with a credential.
There is a quickstart online, and that query works. However, there is a note in the query that I need a credential if my file is protected.
The article had this link and used the sample code to create a credential:
CREATE CREDENTIAL [https://synapsesqlprompt.blob.core.windows.net/sqlpromptfs]
WITH IDENTITY='SHARED ACCESS SIGNATURE'
, SECRET = '';
GO
Now I can run my original query and it works.
That was a pretty cool exercise for me to get started. In less than 30 minutes I was able to create a Synapse workspace, add some data, and query it.
Now to learn a bit more about how this works, and to use Flyway to deploy some objects.