September 24, 2012 at 11:24 pm
Hi,
I have an assignment to check the consistency of SQL server database for different conditions.I am having some 200+ columns i a table i need to populate data using prepare statement in .net and check the conditions how it works for each loop values.I am having loop values starting from 1,5,50,100.... 10000k i will increment the loop value as 1,5,50,100.... 10000k and need to get the start time and end time of each loop, i need to check the consistency of the database.
Value for the loop(1,5,50,100.... 10000k) will be from a xml file or text file
For the 200+ Columns i am having 2PK's.
CONDIDTIONS
1)Whether i need to restart for every loop value say if i run loop value as 1 then i need to restart the system (not the services) and run for loop value 5 and restart and so on.
2)Whether i need to restart for every loop value say if i run loop value as 1 then i need to restart the services and run for loop value 5 and restart services and so on.
3)Whether i can continue my loop value starting from 1,5...10000k without restart or any other process continuously running starting from 1...10000k in single go
why i am asking this is whether SQL server will act as differently when i do the above process and will there be any performance improvement or performance variations from the above steps if i followed. Whether there be same value for all the three steps or will there be variation in from time and end time for each conditions
I need to check for insertion time(time difference b/w from time and then loop condition and then end time) for each loop
Thanks!
September 25, 2012 at 4:23 am
Looping within the SQL Server environment is generally a poor approach. If you need to validate data sets or behavior, I would strongly recommend finding a set-based approach to get it done. T-SQL is a set-based language and doesn't perform loops well at all.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 25, 2012 at 6:01 am
What is Set based approach.How can i check the data insertion time with .net or java as interface.i need to measure the insertion time.Start time then insertion(population of data) then end time. here in population of data i have used .net as my interface
This is my approach
command.CommandText = "SELECT Getdate() ";
StartTime = (DateTime)command.ExecuteScalar();/*Getting the StartTime*/
Console.WriteLine(StartTime.ToString("MM/dd/yyyy hh:mm:ss.fff tt"));
command.CommandText = "Insert into test (Col1,Col2);
command.Prepare();
for (int i = 0; i < count; i++) /*Loop Starts heres*/
{
command.Parameters.AddWithValue("@param1", 0);
command.Parameters.AddWithValue("@param2", 0);
}
command.CommandText = "SELECT Getdate() ";
EndTime = (DateTime)command.ExecuteScalar();/*Getting the EndTime*/
Console.WriteLine(StartTime.ToString("MM/dd/yyyy hh:mm:ss.fff tt"));
here i= 1,5,50,100.... 10000k
I need to collect the data based on the above loop and i am asking suggestion which three conditions is best.
Thanks!
September 25, 2012 at 6:07 am
I'm still very uncomfortable with the loop approach. It's just going to be slow. But, you need to make sure that each command is a single transaction, or, the entire set will wait until you're done through the whole loop to commit. So explicitly start and commit the transaction around the command to do the insert.
Also, instead of capturing start & stop time in the code (unless you're trying to measure the code itself) instead use Extended Events to capture the execution times of the commands.
I still don't completely understand what you're trying to do or why it has to be done in a loop.
Here's an article differentiating the set-based operation.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 25, 2012 at 6:23 am
The SQL server' default is Auto commit which is Read committed transaction,here each command is a single transaction so once the loop run each and every record is auto committed to the database.
I still don't completely understand what you're trying to do or why it has to be done in a loop.
Since i am measuring the total insertion time for bunch of loop values 1,5,50,100.... 10000k.
The article which you have tagged does not contain any insert statement options rather it is having Select statement.I am just using a prepare statement (which may be java ,.net,C#,Vc++ and so on) and measuring the time.
I am just checking what is the time take for different loop values.
Thanks!
September 25, 2012 at 7:23 am
Are you just trying to measure performance in SQL Server?
Most of the people here won't take the approach you're using. I'm not sure what is completely your goal.
If you're using .Net to create the sample data you will insert, then you should take a look at this. Generating Test Data: Part 1 - Generating Random Integers and Floats[/url] and this Generating Test Data: Part 2 - Generating Sequential and Random Dates[/url]
The process of inserting 1000 rows is different if you insert them as a set or if you insert them row by (agonizing) row. The first will perform much better.
By the way, the article that Grant linked doesn't show a set based solution but just another loop. For a better solution, you should take a look at this The "Numbers" or "Tally" Table: What it is and how it replaces a loop.[/url]
Can you show us what is the code sent to SQL Server for your 5 row insertion?
September 25, 2012 at 7:41 am
[Quote] Most of the people here won't take the approach you're using. I'm not sure what is completely your goal. [/Quote]
So what is your suggestion for inserting the data externally not from the database itself. I'm inserting the data and checking the performance for each loop how much time it is taken to insert that's is why i have used prepare statement for each and every statement.
[Quote]Can you show us what is the code sent to SQL Server for your 5 row insertion?[/Quote]
I have already given my approach in the above
connectionString = ("server=MYServerName;Database=MYDB;User id=XXXXX;Password=XXXXXXX;");
then comes the insertion
command.CommandText = "SELECT Getdate() ";
StartTime = (DateTime)command.ExecuteScalar();/*Getting the StartTime*/
Console.WriteLine(StartTime.ToString("MM/dd/yyyy hh:mm:ss.fff tt"));
command.CommandText = "Insert into test (Col1, Col2);
command.Prepare();
for (int i = 0; i < count; i++) /*Loop Starts heres*/
{
command.Parameters.AddWithValue("@param1", 0);
command.Parameters.AddWithValue("@param2", 0);
}
command.CommandText = "SELECT Getdate() ";
EndTime = (DateTime)command.ExecuteScalar();/*Getting the EndTime*/
Console.WriteLine(StartTime.ToString("MM/dd/yyyy hh:mm:ss.fff tt"));
i= 1,5,50,100.... 10000k
Thanks!
September 25, 2012 at 7:50 am
What I was really hoping to see is how the final statement looks when you execute it.
It's different if it looks like this
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
Than if it looks like this
INSERT INTO test(col1, col2) VALUES
( @param1, @param2),
( @param1, @param2),
( @param1, @param2),
( @param1, @param2),
( @param1, @param2)
I'm not sure if I'm clear enough.
If you're just testing, then just test, and test several times to obtain consistent results because times can vary depending on the workload of the server.
September 25, 2012 at 8:13 am
Your loop doesn't insert any data - it just repeatedly overwrites the command's parameters. The only calls to the database are befor the loop begins and after it ends, to get the starttime and the endtime.
So you seem to be measuring the client-side ADO.net cost of repeatedly writing values into parameters, not anything in SQL.
Also, it's not a good idea to call Console.WriteLine inside the measured time.
I agree with previous posts which have suggested looking for a set-oriented approach instead of this one row at a time stuff. Maybe look at ways of batching inserts/updates in ADO.net - I think dataadaptors provide a means of doing this. Or maybe write SQL to accept an XML object as a string, parse that into a collection of rows and insert them in a single statement. Of course to have to write the c# or whatever code to generate the XML, but the performance gains will make it well worth while.
Tom
September 25, 2012 at 8:17 am
[Quote]
INSERT INTO test(col1, col2) VALUES
( @param1, @param2),
( @param1, @param2),
( @param1, @param2),
( @param1, @param2),
( @param1, @param2)
[/Quote]
How you are saying that the above and
[Quote]
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
[/Quote]
are same when i am issuing a loop for each and every record
While i<=2
Begin
INSERT INTO test(Col0, col1, col2) VALUES( i,@param1, @param2)
Set @i=@+1
End
which means that the loop value is incremented and same need to be stored and not as you stated right?
Thanks!
September 25, 2012 at 8:25 am
[Quote]Your loop doesn't insert any data - it just repeatedly overwrites the command's parameters. The only calls to the database are befor the loop begins and after it ends, to get the starttime and the endtime. [/Quote]
how you are saying that loop will not insert the data.I have used
command.CommandText = "Insert into test (Col1, Col2);
command.Prepare();
for (int i = 0; i < count; i++) /*Loop Starts heres*/
{
command.Parameters.AddWithValue("@param1", 0);
command.Parameters.AddWithValue("@param2", 0);
command.ExecuteNonQuery();
command.Parameters.Clear();
}
I have missed the last 2 lines command.ExecuteNonQuery(); and command.Parameters.Clear(); statement in the code.
[Quote]Also, it's not a good idea to call Console.WriteLine inside the measured time.[/Quote]
I am writing the output to a variable and writing it finally to a text file at the end of the work completion. i have specified Console.WriteLine(); in order to show you how it is taken.
Thanks!
September 25, 2012 at 8:36 am
yuvipoy (9/25/2012)
[Quote]INSERT INTO test(col1, col2) VALUES
( @param1, @param2),
( @param1, @param2),
( @param1, @param2),
( @param1, @param2),
( @param1, @param2)
[/Quote]
How you are saying that the above and
[Quote]
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
INSERT INTO test(col1, col2) VALUES( @param1, @param2)
[/Quote]
are same when i am issuing a loop for each and every record
While i<=2
Begin
INSERT INTO test(Col0, col1, col2) VALUES( i,@param1, @param2)
Set @i=@+1
End
which means that the loop value is incremented and same need to be stored and not as you stated right?
Thanks!
What I was saying is that both instructions would do the same thing but not the same way. If you execute your last while loop in sql, it would be the same as the second statement you quoted (it might be slower). In other words, you could change your approach as Tom suggested. Here's something for you to test:
DECLARE @test-2TABLE(
col0int,
col1varchar(15),
col2datetime);
DECLARE @intint,
@param1varchar(15),
@param2datetime2,
@iint
SET @int = 1
SET @param1 = 'MyTest'
SET @param2 = SYSDATETIME()
While @int <= 100000
Begin
INSERT INTO @test-2(Col0, col1, col2) VALUES( @int,@param1, GETDATE())--@param2)
Set @int=@int+1
End ;
SELECT DATEDIFF( MCS, @param2, SYSDATETIME())
--You could show the results but it's time consuming
--SELECT * FROM @test-2
DELETE @test-2 --This would be out of our test
SET @param2 = SYSDATETIME();
WITH Tally(n) AS(
SELECT ROW_NUMBER() OVER( ORDER BY (SELECT NULL))
FROM master.sys.all_columns a, master.sys.all_columns b)
INSERT INTO @test-2(Col0, col1, col2)
SELECT n, @param1, GETDATE()
FROM Tally
WHERE n <= 100000
SELECT DATEDIFF( MCS, @param2, SYSDATETIME())
--Once again, you could show the results but it's time consuming
--SELECT * FROM @test-2
September 25, 2012 at 8:47 am
Thanks for your replay,
As i have stated early [Quote]Inserting the data externally not from the database itself. [/Quote] my aim is not inserting from the database i need to insert data from non database which may be java or .net c# C++ or anything i need to check how for different connection how the times are taken.Now i'm working with C# so i have posted my example with C# where as i am coding on different platforms also.
The code which you have shared here is a internal one which done at the database level not at the C# level.
I'm not check the loop on database i am checking with C# code with data insertion as my preliminary requirement for now.
Thanks!
September 25, 2012 at 8:58 am
yuvipoy (9/25/2012)
Thanks for your replay,As i have stated early [Quote]Inserting the data externally not from the database itself. [/Quote] my aim is not inserting from the database i need to insert data from non database which may be java or .net c# C++ or anything i need to check how for different connection how the times are taken.Now i'm working with C# so i have posted my example with C# where as i am coding on different platforms also.
The code which you have shared here is a internal one which done at the database level not at the C# level.
I'm not check the loop on database i am checking with C# code with data insertion as my preliminary requirement for now.
Thanks!
If you're inserting from procedural code, and looking for performance, you're going to want to look at several options:
1. You will likely want to look into using a table-valued parameter. You could then create an array in your procedural code (or various other ienumerable types) which can then be streamed in and inserted as a set. I've used this approach successfully on 1000-5000 record sets and it's orders of magnitudes faster than single insert statements.
2. if you're talking in large scale inserts - you may do even better by writing it out to an external file anf having SQL pick up and insert the file in bulk. BCP in with a bulk insert can really get some screaming perf.
3. on the .NET side specifically, you could even look at recordset objects. they all for in-memory inserts which then can be "flushed" into the DB as a set. I haven't played with any such object in java, but would be surprised if they didn't have a similar concept.
The loop used in this way is going to consistently give you the worse perf. You also will have nothing but issues with consistency: with the code as is, you have no good way to get the data out if, say, row 57000 contains garbage and can't be inserted into the DB.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
September 25, 2012 at 9:10 am
yuvipoy (9/25/2012)
Thanks for your replay,As i have stated early [Quote]Inserting the data externally not from the database itself. [/Quote] my aim is not inserting from the database i need to insert data from non database which may be java or .net c# C++ or anything i need to check how for different connection how the times are taken.Now i'm working with C# so i have posted my example with C# where as i am coding on different platforms also.
The code which you have shared here is a internal one which done at the database level not at the C# level.
I'm not check the loop on database i am checking with C# code with data insertion as my preliminary requirement for now.
Thanks!
I get your point, you will insert data from the outside. However, I wanted to show you how different would the two approaches be. There're plenty of ways to do it and, as Matt said, a while loop will be the worst.
If you're trying to test how will a server respond under a high load (many users inserting data at the same time) it can be a good excercise. If you're trying how to improve the performance for a single user inserting lots of data, then you should look for alternatives.
Here's another way you might want to check http://www.sqlservercentral.com/articles/T-SQL/63003/
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply