Leveraging AI Tools for SQL Queries

Question

Leveraging AI Tools for SQL Queries

Mamta Dalal

SSC Enthusiast

Points: 191
More actions
January 19, 2024 at 12:00 am

#4360898

Comments posted to this topic are about the item Leveraging AI Tools for SQL Queries

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004686 More actions · Answer 1

Not aimed at the author but aimed at the whole concept... Let's look at the first example prompt in the article...

You are an SQL Server database developer. Assume you are working with the AdventureWorks2022 sample database. Write the SQL code to create a view named vwPersonDetails with columns Title FirstName, MiddleName, LastName, and JobTitle from the Person and Employee tables of HumanResources schema.

Including the single spaces, that took 294 characters of typing and esoteric memory of column and table names because there was no use of "intellesense" etc. "Look how easy that was"??? NOT easy compared to something like SQL Prompt or even (ugh!) Intellisense or even click'n'drag from the object explorer. It's easier to type a part of something and have SQL Prompt or Intellisense fill in the rest.

And, even if you didn't have any of that, the you still had to know all the names of the columns and tables and the prompt ended up being a total of 294 characters of typing instead of just the 247 in the code. And, guess what? You don't need to copy and paste the code to run it and you don't need to read an explanation because... you were smart enough to write it in the first place and it didn't take any more knowledge than to build that prompt. In fact, SQL Prompt or Intellisense probably wrote half the code for you.

AI just made extra work here. I'll put a good developer up against even themself and the one that doesn't us AI, is going to get it done first.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

David.Poole SSC Guru Points: 76323 More actions · Answer 2

Adventureworks is a publicly available DB designed to show off SQL Server. As such it has sensible naming standards and PK/FKs.

Is there a capability that would work against DBs that are not visible publicly and whose SQL source code is kept private too?

I was in a client meeting when the observation was made about the value of increasing resources for a particular activity. Basically there are two types

The more you put in the more you get out.
The more you put in beyond a certain point, you don't get anything else out.

Jeff's comment made me think of the 2nd one.

There are AI products out there that can do a discovery piece on your databases and infer PK/FK relationships and others. I am quietly optimistic that these will offer data people a valuable service.

I'm uncomfortable with ChatGPT. I've found that a lot of its answers are believable rather than correct.

LinkedIn Profile

Jonathan AC Roberts SSCoach Points: 18408 More actions · Answer 3

Jeff Moden wrote:

Not aimed at the author but aimed at the whole concept... Let's look at the first example prompt in the article...
You are an SQL Server database developer. Assume you are working with the AdventureWorks2022 sample database. Write the SQL code to create a view named vwPersonDetails with columns Title FirstName, MiddleName, LastName, and JobTitle from the Person and Employee tables of HumanResources schema.
Including the single spaces, that took 294 characters of typing and esoteric memory of column and table names because there was no use of "intellesense" etc. "Look how easy that was"??? NOT easy compared to something like SQL Prompt or even (ugh!) Intellisense or even click'n'drag from the object explorer. It's easier to type a part of something and have SQL Prompt or Intellisense fill in the rest.
And, even if you didn't have any of that, the you still had to know all the names of the columns and tables and the prompt ended up being a total of 294 characters of typing instead of just the 247 in the code. And, guess what? You don't need to copy and paste the code to run it and you don't need to read an explanation because... you were smart enough to write it in the first place and it didn't take any more knowledge than to build that prompt. In fact, SQL Prompt or Intellisense probably wrote half the code for you.
AI just made extra work here. I'll put a good developer up against even themself and the one that doesn't us AI, is going to get it done first.

In the version of ChatGPT I use you can create a list of predefined prompts. So you just select a suitable prompt from a list then type the question. So if you are using it on a regular basis for SQL Server you can set a list of predefined prompts like:

"You are a SQL Server developer and will add comments to the code I provide. The format of comments is [give your example here]".

Then just select that prompt, paste the code in and it will add comments to the code.

Jeff Moden SSC Guru Points: 1004686 More actions · Answer 4

@jonathan-2,

While the AI is quite capable of explaining exactly WHAT the code does programmatically, it's simply not capable of figuring out the WHY for most code. For example, the AI will explain quite nicely that the code is giving a 10% discount to something but it is not generally capable of determining WHY the 10% discount is being given.

Heh... and I couldn't believe the part of the article that said you could check the accuracy of the code by copying it to another AI and asking what it does. Why not just copy the code to SSMS and press the bloody {F5} key in a test environment?

While I think that AI is a huge technological advancement, I think people have already gone stupid in it's use in a lot of areas. This article is a perfect example of how that is already happening.

And, again, that's no reflection on the author of this article.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004686 More actions · Answer 5

David.Poole wrote:

I'm uncomfortable with ChatGPT. I've found that a lot of its answers are believable rather than correct.

Exactly. I summarize that as being "Confidently Incorrect" and, in that area, ChatGPT has come to the point of perfectly matching many people in that aspect... and they make me uncomfortable, as well.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jonathan AC Roberts SSCoach Points: 18408 More actions · Answer 6

Jeff Moden wrote:

David.Poole wrote:
I'm uncomfortable with ChatGPT. I've found that a lot of its answers are believable rather than correct.
Exactly. I summarize that as being "Confidently Incorrect" and, in that area, ChatGPT has come to the point of perfectly matching many people in that aspect... and they make me uncomfortable, as well.

I look at it a bit like having a junior assistant that you can set work to but then have to check the results.

bamage Newbie Points: 1 More actions · Answer 7

AdventureWorks has been around a long time and every aspect of its data model has been crawled and indexed. How well do ChatGPT and Bard do on an undocumented DB, assuming its data model is fairly well built out?

This reply was modified 1 years, 11 months ago by bamage.

thisisfutile Hall of Fame Points: 3669 More actions · Answer 8

I think the "explanation" of the table it gives speaks volumes to its limitations. Explaining:

Title: The title of the person
First name; The first name of the person
etc

These are not helpful and just a waste. A lot of software designers do this. If they have a setting called "Advanced Measurements ON/OFF", when you click the help it says, "Select ON to enable Advanced Measurements". I know this. I want to know what advanced measurements does. If you're not going to explain anything helpful, don't make the "help" available.

Peter Heller Ten Centuries Points: 1000 More actions · Answer 9

Providing the Northwinds2022TSQLV7.SQL metadata and using the SQL Server 2019 on Ubuntu helps with establishing the metadata and which target engine is being used.
I need to write a query to retrieve the 10 lowest orders by OrderPrice for the month of September. Please use the following tables in the query: Sales.[Order] as o, Sales.OrderDetail as od, Sales.Customer as c, and Sales.Shipper as s.
Ensure the query follows these guidelines: 1. Include a literal column named LowestOrderPrices with the value 'Lowest Order Prices'.
1. Create an alias SeptemberOrders by extracting the year and month (YYYY-09) from the o.OrderDate.
2. Derive OrderPrice by calculating the product of od.UnitPrice and od.Quantity.
3. Create an alias DiscountedOrderPrice by applying the discount percentage to the OrderPrice.
4. Calculate DaysFromOrderToShip as the difference in days between o.OrderDate and o.ShippedDate.
5. Explicitly list all column names in the select statement.
6. Order the results by OrderPrice in ascending order and limit the results to the 10 lowest orders.

I really liked your article, thank you. In the explanation using "Order and Limit: The query is ordered by OrderPrice and limited to the 10 lowest orders using OFFSET and FETCH." instead of the proprietary Top 10

Peter Heller Ten Centuries Points: 1000 More actions · Answer 10

Providing the Northwinds2022TSQLV7.SQL metadata and using the SQL Server 2019 on Ubuntu helps with establishing the metadata and which target engine is being used.
I need to write a query to retrieve the 10 lowest orders by OrderPrice for the month of September. Please use the following tables in the query: Sales.[Order] as o, Sales.OrderDetail as od, Sales.Customer as c, and Sales.Shipper as s.
Ensure the query follows these guidelines: 1. Include a literal column named LowestOrderPrices with the value 'Lowest Order Prices'.
1. Create an alias SeptemberOrders by extracting the year and month (YYYY-09) from the o.OrderDate.
2. Derive OrderPrice by calculating the product of od.UnitPrice and od.Quantity.
3. Create an alias DiscountedOrderPrice by applying the discount percentage to the OrderPrice.
4. Calculate DaysFromOrderToShip as the difference in days between o.OrderDate and o.ShippedDate.
5. Explicitly list all column names in the select statement.
6. Order the results by OrderPrice in ascending order and limit the results to the 10 lowest orders.

I really liked your article, thank you. In the explanation using "Order and Limit: The query is ordered by OrderPrice and limited to the 10 lowest orders using OFFSET and FETCH." instead of the proprietary Top 10

Peter Heller Ten Centuries Points: 1000 More actions · Answer 11

Providing the Northwinds2022TSQLV7.SQL metadata and using the SQL Server 2019 on Ubuntu helps with establishing the metadata and which target engine is being used.
I need to write a query to retrieve the 10 lowest orders by OrderPrice for the month of September. Please use the following tables in the query: Sales.[Order] as o, Sales.OrderDetail as od, Sales.Customer as c, and Sales.Shipper as s.
Ensure the query follows these guidelines: 1. Include a literal column named LowestOrderPrices with the value 'Lowest Order Prices'.
1. Create an alias SeptemberOrders by extracting the year and month (YYYY-09) from the o.OrderDate.
2. Derive OrderPrice by calculating the product of od.UnitPrice and od.Quantity.
3. Create an alias DiscountedOrderPrice by applying the discount percentage to the OrderPrice.
4. Calculate DaysFromOrderToShip as the difference in days between o.OrderDate and o.ShippedDate.
5. Explicitly list all column names in the select statement.
6. Order the results by OrderPrice in ascending order and limit the results to the 10 lowest orders.

I really liked your article, thank you. In the explanation using "Order and Limit: The query is ordered by OrderPrice and limited to the 10 lowest orders using OFFSET and FETCH." instead of the proprietary Top 10

Peter Heller Ten Centuries Points: 1000 More actions · Answer 12

This was an excellent article in exploring the iterative prompt engineering. Also, consider looking into https://ollama.com/ for private LLM's or Huggingfaces.com. Look into medium.com for ideas on AI and the vast amount tools out there.

Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
Google Gemma 2 is a high-performing and efficient model by now available in three sizes: 2B, 9B, and 27B.
2B9B
A set of Mixture of Experts (MoE) model with open weights by Mistral AI in 8x7b and 8x22b parameter sizes.
Phi-3 is a family of lightweight 3B (Mini) and 14B (Medium) state-of-the-art open models by Microsoft.
Another amazing tool https://github.com/danielmiessler/fabric