January 23, 2012 at 8:52 am
Hi fellows,
I need a help here.
I have a query to populate some tables.
Since dirty reads are not a problem but the inserted/updated tables cannot block the selects I'm trying to tunne it.
I was amazed the way BEGIN TRAN...COMMIT TRAN changes the response time.
I tested 3 scenarios:
1) BEGIN TRAN
cursor A select from production table
insert in table A
cursor B select from function()
insert into table B
end cursor B
end cursor A
COMMIT TRAN
2) cursor A select from production table
BEGIN TRAN
insert in table A
cursor B select from function()
insert into table B
end cursor B
COMMIT TRAN
end cursor A
3) no begin...commit block
so:
1 - takes 7 seconds, cannot select from A or B duo to blocks
2 - takes 35 seconds, can select from A and B
3 - takes 250 seconds!
Whats happening?
Why the begin tran...commit tran impacts time?
The's any way to speed up the query and allow (dirty) reads in A and B?
January 23, 2012 at 10:28 am
First of all read this article. http://sqlblog.com/blogs/andrew_kelly/archive/2009/04/10/how-dirty-are-your-reads.aspx
It is NOT a magic go faster pill and is fraught with all kinds of issues.
Secondly, If you just need to insert data a much better approach than dirty reads is to get rid of the cursor. I am 99.999999% certain you don't need a cursor for inserts. There are very few times a cursor is required and inserts just isn't one of those times. If you want take a shot at getting your inserts to a set based solution post up some ddl and sample data and we can tackle it.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
January 23, 2012 at 11:02 am
Depending on the details of how your cursor is being declared (updatable vs read-only, fast forward, et al), having an explicit transaction wrapped around the whole process would make it faster, at the cost of eliminating concurrency (nothing else can access that data at the same time).
As mentioned already, it's almost certain you can make it even faster (and more reliable) by eliminating the cursor altogether.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 23, 2012 at 11:10 am
Thanks for the reply Sean,
1. Yes, I agree most of cursors can be replaced by a nice, fast and elegant relational query but that's not the case indeed.
2. I need some work to "recode" it in a reasonable way to make a post and I'll do it eventually. First I'll try to figure why the begin...tran can impact it.
3. No need for precision, dirty reads are allowed.
4. I just want to know if someone can explain this strange behavior, it's a cursor problem?
there are issues using insert/update inside a cursor without a begin...commit?
The cursor are fast forward readonly. The query runs once a day to populate A and B tables, there are no other way to populate these tables as they are used only in selects.
The tables in the cursor select statments are not updated inside the cursor, not even in joins or views.
It just take lots of production tables to squeeze some data to fill separate tables for some OLAP app.
January 23, 2012 at 11:13 am
jcb (1/23/2012)
3. No need for precision, dirty reads are allowed.
But what about duplicate or missing data? It is not just precision that is affected by dirty reads. If you are not in a position to remove the **cough**cursor**cough** then you should at least consider using snapshot isolation instead of nolock dirty reads.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
January 23, 2012 at 12:43 pm
Sean,
"Snapshot isolation transaction failed accessing database 'xxx' because snapshot isolation is not allowed in this database. Use ALTER DATABASE to allow snapshot isolation."
And I'm not allowed to use ALTER DATABASE also...
Well thinq its like querying: "How much thousands of DBAs called "Smith" live in Pasadena?" and you ill see duplicate rows or missing a few ones ill no to be a issue.
Taking several hours to update is a problem, blocking users from querying while it update is a BIG problem!
The cursor cannot be replaced by a relational query duo this very procedural nature.
All I can guess it's about a big commit versus thousands of small auto commits.
If I cannot wait for the big commit I can try create some stag tables and first populate these instead of directly insert in the A,B tables.
It ill add a final step (copy stagA,stagB -> A,B) and some overhead for sure but can solve my bloking problems.
January 23, 2012 at 1:45 pm
"Snapshot isolation transaction failed accessing database 'xxx' because snapshot isolation is not allowed in this database. Use ALTER DATABASE to allow snapshot isolation."
And I'm not allowed to use ALTER DATABASE also...
Sounds like that may not be an option then.
Well thinq its like querying: "How much thousands of DBAs called "Smith" live in Pasadena?" and you ill see duplicate rows or missing a few ones ill no to be a issue.
Taking several hours to update is a problem, blocking users from querying while it update is a BIG problem!
Understood. If the risk of bad data is acceptable and you are not in a position to get rid of the cursor this may be the best you can do at this point.
The cursor cannot be replaced by a relational query duo this very procedural nature.
Not to continue beating a dead horse but an insert is NOT procedural. I get that you probably have somewhat complex coding in that function but it is doing two things. First, scalar functions like this are slower than molasses because the optimizer always treats it as handling a single row. Second, it is preventing you from turning a procedural row by row process into a set based single insert statement.
All I can guess it's about a big commit versus thousands of small auto commits.
If I cannot wait for the big commit I can try create some stag tables and first populate these instead of directly insert in the A,B tables.
It ill add a final step (copy stagA,stagB -> A,B) and some overhead for sure but can solve my bloking problems.
Yes the multiple transactions will lock and release the table for each insert. This makes the overall process slower but allows for other connections to read in between. Given everything you have said this is probably the best option for you at this point.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
January 24, 2012 at 6:25 am
If you really are stuck with a cursor, try Static instead of Fast_Forward Read_Only. Static cursors use more resources in tempdb, but are often faster than other options. They also don't hold locks on the base tables, since they use a temp table (essentially) instead.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 24, 2012 at 10:44 am
thanks for the tip.
I tried it but there's no improvement in performance.
I Also tried FORWARD_ONLY STATIC OPTIMISTIC.
The problem is not the tables the cursor are reading, its the tables its inserting.
I'll need to truncate these tables anyway (at least while I dont create any kind of log/column to flag what rows from the sources tables updated since the batch last runned).
Its taking ~20 minutes to read ~1.2M rows and populate 3 tables.
Since I'm scanning every row from source tables to create every row in the destiny tables it's RBAR for sure.
In fact it's more like a synchronization from a database to another (read only).
I'll post the cursor after I finish some jobs.
January 24, 2012 at 10:48 am
Why not just truncate / insert?
If not possible, why not merge?
January 27, 2012 at 9:59 am
Below is my batch,
If anynone can manage to wipe the cursors and change it to use only relational logic to booster performance I ill be grateful.
/* dbo.Tagger receives a string and returns a table tag(tag varchar(100)) */
GO
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
begin tran
set nocount on
truncate table dbo.tag_register;
/* dropar FKs para permitir truncate */
alter table dbo.tag_register drop constraint [FK_tag_register-tag];
alter table dbo.tag_register drop constraint [FK_tag_register-register];
truncate table dbo.tag;
truncate table dbo.register;
alter table dbo.tag_register add constraint [FK_tag_register-tag] FOREIGN KEY (id_tag) references dbo.[tag] (id_tag);
alter table dbo.tag_register add constraint [FK_tag_register-register] FOREIGN KEY (id_register) references dbo.[register] (id_register);
declare @tags table(tag varchar(100));
declare @tag varchar(100);
declare @id_tag int, @id_register int;
/* CURSOR PFÍSICA */
declare @id_tablePF int, @vc_nome varchar(500), @vc_sobrenome varchar(500), @vc_lema varchar(500), @vc_cat varchar(500)
, @id_municipio int, @id_regional int
DECLARE TagCursor cursor LOCAL FAST_FORWARD READ_ONLY for --FORWARD_ONLY STATIC OPTIMISTIC
select --top 10000
pf.id_tablePF, pf.vc_nome, pf.vc_sobrenome, pf.vc_lema, pfc.vc_tablePF_cat,pfe.id_cidade,pf.id_regional
from dbo.tablePF pf (nolock)
join dbo.tablePF_cat pfc on pfc.id_tablePF_cat = pf.id_tablePF_cat
left join dbo.tablePF_endereco pfe on pfe.id_tablePF = pf.id_tablePF and pfe.it_ec = 1 --se der problema por retornar mais de uma pessoa, vai ser aqui!
OPEN TagCursor
FETCH next from TagCursor into @id_tablePF, @vc_nome, @vc_sobrenome, @vc_lema, @vc_cat, @id_municipio, @id_regional
WHILE (@@FETCH_STATUS = 0)
BEGIN
--begin tran
/* inserir na tabela de registros */
insert into dbo.register(id_from_table,it_table,vc_register,id_municipio,id_regional)
values(@id_tablePF,'P',@vc_nome + isnull(' '+@vc_sobrenome,''),@id_municipio,@id_regional);
set @id_register = @@IDENTITY;
/* procurar tags na fonte do registro */
delete @tags;
insert into @tags select tag from dbo.tagger(@vc_nome);
insert into @tags select tag from dbo.tagger(@vc_sobrenome);
insert into @tags select tag from dbo.tagger(@vc_lema);
if (@vc_cat <> 'Comum')
insert into @tags select tag from dbo.tagger(@vc_cat);
--select distinct * from @tags;
DECLARE SUBTagCursor cursor LOCAL FAST_FORWARD READ_ONLY for
select distinct tag from @tags
OPEN SUBTagCursor
FETCH next from SUBTagCursor into @tag
WHILE (@@FETCH_STATUS = 0)
BEGIN
/* inserir ocorrência da tag */
set @id_tag = (select top 1 id_tag from dbo.tag where vc_tag = @tag);
if (@id_tag is null)
begin
insert into dbo.tag (vc_tag,ocurrences) values (@tag,1);
set @id_tag = @@IDENTITY;
end
else
begin
update dbo.tag set ocurrences = ocurrences +1 where vc_tag = @tag;
end
/* inserir relacionamento tag x registro */
insert into dbo.tag_register(id_register,id_tag)values(@id_register,@id_tag);
FETCH next from SUBTagCursor into @tag
END
close SUBTagCursor;
deallocate SUBTagCursor;
FETCH next from TagCursor into @id_tablePF, @vc_nome, @vc_sobrenome, @vc_lema, @vc_cat, @id_municipio, @id_regional
END
close TagCursor;
deallocate TagCursor;
/* CURSOR INSTITUIÇÃO */
declare @id_tableI int, @vc_tableI_tipo varchar(200)
, @vc_razao_social varchar(100), @vc_tableI varchar(150), @vc_nome_etiq varchar(100)
, @vc_nome_can varchar(100), @vc_sigla varchar(50), @vc_nome_fant varchar(100)
DECLARE TagCursor cursor LOCAL FAST_FORWARD READ_ONLY for --FORWARD_ONLY STATIC OPTIMISTIC
select --top 10000
i.id_tableI, ie.id_municipio, dbo.ObterRegionaltableIId(i.id_tableI) --returns a ID
, it.vc_tableI_tipo
, i.vc_razao_social, i.vc_tableI, i.vc_nome_etiq
, i.vc_nome_can, i.vc_sigla, i.vc_nome_fant
from dbo.tableI i (nolock)
left join dbo.tableI_tipo it on it.id_tableI_tipo = i.id_tableI_tipo
left join dbo.tableI_endereco ie on ie.id_tableI = i.id_tableI and ie.it_ec = 1 --se der problema por retornar mais de uma pessoa, vai ser aqui!
OPEN TagCursor
FETCH next from TagCursor into @id_tableI, @id_municipio, @id_regional
, @vc_tableI_tipo
, @vc_razao_social, @vc_tableI, @vc_nome_etiq
, @vc_nome_can, @vc_sigla, @vc_nome_fant
WHILE (@@FETCH_STATUS = 0)
BEGIN
--begin tran
/* inserir na tabela de registros */
if (@vc_razao_social = '') set @vc_razao_social = null;
if (@vc_tableI = '') set @vc_razao_social = null;
if (@vc_nome_etiq = '') set @vc_razao_social = null;
if (@vc_nome_fant = '') set @vc_razao_social = null;
if (@vc_nome_can = '') set @vc_razao_social = null;
insert into dbo.register(id_from_table,it_table,vc_register,id_municipio,id_regional)
values(@id_tableI,'I',coalesce(@vc_razao_social,@vc_tableI,@vc_nome_etiq,@vc_nome_fant,@vc_nome_can),@id_municipio,@id_regional);
set @id_register = @@IDENTITY;
/* procurar tags na fonte do registro */
delete @tags;
insert into @tags select tag from dbo.tagger(@vc_nome);
insert into @tags select tag from dbo.tagger(@vc_tableI_tipo);
insert into @tags select tag from dbo.tagger(@vc_razao_social);
insert into @tags select tag from dbo.tagger(@vc_tableI);
insert into @tags select tag from dbo.tagger(@vc_nome_etiq);
insert into @tags select tag from dbo.tagger(@vc_nome_can);
insert into @tags select tag from dbo.tagger(@vc_sigla);
insert into @tags select tag from dbo.tagger(@vc_nome_fant);
--select distinct * from @tags;
DECLARE SUBTagCursor cursor LOCAL FAST_FORWARD READ_ONLY for
select distinct tag from @tags
OPEN SUBTagCursor
FETCH next from SUBTagCursor into @tag
WHILE (@@FETCH_STATUS = 0)
BEGIN
/* inserir ocorrência da tag */
set @id_tag = (select top 1 id_tag from dbo.tag where vc_tag = @tag);
if (@id_tag is null)
begin
insert into dbo.tag (vc_tag,ocurrences) values (@tag,1);
set @id_tag = @@IDENTITY;
end
else
begin
update dbo.tag set ocurrences = ocurrences +1 where vc_tag = @tag;
end
/* inserir relacionamento tag x registro */
insert into dbo.tag_register(id_register,id_tag)values(@id_register,@id_tag);
FETCH next from SUBTagCursor into @tag
END
close SUBTagCursor;
deallocate SUBTagCursor;
--commit tran
FETCH next from TagCursor into @id_tableI, @id_municipio, @id_regional
, @vc_tableI_tipo
, @vc_razao_social, @vc_tableI, @vc_nome_etiq
, @vc_nome_can, @vc_sigla, @vc_nome_fant
END
close TagCursor;
deallocate TagCursor;
/* CURSOR tableM */
declare @id_tableM int
DECLARE TagCursor cursor LOCAL FAST_FORWARD READ_ONLY for --FORWARD_ONLY STATIC OPTIMISTIC
select
m.id_tableM, me.id_municipio
, m.vc_razao_social
, m.vc_nome_fant
from dbo.tableM m (nolock)
left join dbo.tableM_endereco me on me.id_tableM = m.id_tableM and me.it_ec = 1
OPEN TagCursor
FETCH next from TagCursor into @id_tableM, @id_municipio
, @vc_razao_social
, @vc_nome_fant
WHILE (@@FETCH_STATUS = 0)
BEGIN
--begin tran
/* inserir na tabela de registros */
if (@vc_razao_social = '') set @vc_razao_social = null;
if (@vc_nome_fant = '') set @vc_razao_social = null;
insert into dbo.register(id_from_table,it_table,vc_register,id_municipio,id_regional)
values(@id_tableM,'M',coalesce(@vc_razao_social,@vc_nome_fant),@id_municipio,null);
set @id_register = @@IDENTITY;
/* procurar tags na fonte do registro */
delete @tags;
insert into @tags select tag from dbo.tagger(@vc_razao_social);
insert into @tags select tag from dbo.tagger(@vc_nome_fant);
--select distinct * from @tags;
DECLARE SUBTagCursor cursor LOCAL FAST_FORWARD READ_ONLY for
select distinct tag from @tags
OPEN SUBTagCursor
FETCH next from SUBTagCursor into @tag
WHILE (@@FETCH_STATUS = 0)
BEGIN
/* inserir ocorrência da tag */
set @id_tag = (select top 1 id_tag from dbo.tag where vc_tag = @tag);
if (@id_tag is null)
begin
insert into dbo.tag (vc_tag,ocurrences) values (@tag,1);
set @id_tag = @@IDENTITY;
end
else
begin
update dbo.tag set ocurrences = ocurrences +1 where vc_tag = @tag;
end
/* inserir relacionamento tag x registro */
insert into dbo.tag_register(id_register,id_tag)values(@id_register,@id_tag);
FETCH next from SUBTagCursor into @tag
END
close SUBTagCursor;
deallocate SUBTagCursor;
--commit tran
FETCH next from TagCursor into @id_tableM, @id_municipio
, @vc_razao_social
, @vc_nome_fant
END
close TagCursor;
deallocate TagCursor;
set nocount off
commit tran
--rollback tran
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
GO
January 27, 2012 at 11:49 am
jcb (1/27/2012)
Below is my batch,If anynone can manage to wipe the cursors and change it to use only relational logic to booster performance I ill be grateful.
It is certainly a lot of work to replace and given the nested cursors it is no surprise it is running slower than molasses. This is probably beyond the scope of a forum. That is several days work at least to make sure it correct.
It is nearly impossible to determine how to even point you in the right direction because there is no description about what you are trying to do, there are functions that need to be understood, no table definitions, etc, etc, etc...
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
January 27, 2012 at 11:56 am
I'd need a lot more than just the script to begin fixing that, as already mentioned.
Table and UDF definitions, definitely, and some insert statements to generate sample data, would start.
I wouldn't be at all surprised to find that the real killer on this is the UDF(s), not the cursor. The cursor is probably slowing it down, but if the UDF is anything other than a simple string function or something like that, if it pulls any data from any tables, or is at all procedural in what it does, it's probably costing more server resources than the cursor is.
Not a performance issue, but you don't need "with (nolock)" if you're already setting the isolation level to read uncommitted. Nolock is redundant in that case.
Anyway, to start debugging/tuning, I'd work on the UDF first. If it was created by the same person/people as the cursor, it probably could be improved.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 27, 2012 at 12:18 pm
Sean,
This is running at 1million rows/20 min.
Not too slow since actually I managed to apply this query only to recently updated/inserted data and for a few hundred rows it runs in less than a second.
I just posted the batch in response to some requests for it (and its a simplified version).
The porpouse of the query is to scan data from some tables and populates [tag], [register] and [tag_register] tables. These tables are used to implement a google like search for a web app.
The SP concerning the google like search ill be published in SSC scripts section next week.
I changed my mind and I guess its possible to remove the cursors at all but I still doubt it ill improve performance.
Thanks Gsquare,
I fear tho exposes to much of the schema for posting the UDFs and tables definitions but I'm working in a simplified version for the SSC forum 😀
January 27, 2012 at 12:42 pm
I know how it goes on not being able to post because of IP policies, security, et al. I was just hoping you could, because we could probably help.
Instead of tags in tables like that, have you guys considered full-text indexing? That handles tags and related concepts very, very well in most cases.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply