March 31, 2010 at 11:51 pm
Hello Every body
I have a table in which there is a column which hold a series of numbers 1-10000 entered by the users but problem is that some numbers are not entered such as 99, 755,6870,..... n so many it is very difficult to find these missing number manually. so is it possible to find these numbers with the help of query??
April 1, 2010 at 3:22 am
You can use tally table. Do a a not in with the tally table and your table.
You can read about tally table here - http://www.sqlservercentral.com/articles/T-SQL/62867/
Ping back if you have any doubt.
- arjun
https://sqlroadie.com/
April 1, 2010 at 7:09 am
There are many ways to do this (tally table as suggested by Arjun, recursive CTE, EXCEPT) here is a demonstration of one of these possible methods (EXCEPT):
-- For demonstration purposes only
USE tempdb;
GO
-- Drop the test table if it exists
IF OBJECT_ID(N'dbo.Test', N'U')
IS NOT NULL
DROP TABLE dbo.Test;
GO
-- Create the test table
CREATE TABLE dbo.Test
(
row_id INTEGER NOT NULL
PRIMARY KEY CLUSTERED
WITH (FILLFACTOR = 100),
);
GO
-- Add 10,000 rows numbered 1-10,000
INSERT dbo.Test
(row_id)
SELECT TOP (10000)
ROW_NUMBER() OVER (
ORDER BY (SELECT 0))
FROM master.sys.columns C1,
master.sys.columns C2,
master.sys.columns C3;
GO
-- Delete 50 random rows
-- and output the row_ids deleted
WITH Random (rn)
AS (
SELECT DISTINCT
TOP (50)
FLOOR(RAND(CHECKSUM(NEWID())) * 10000 + 1)
FROM master.sys.columns C1,
master.sys.columns C2,
master.sys.columns C3
)
DELETE dbo.Test
OUTPUT deleted.row_id
WHERE row_id = ANY
(
SELECT R.rn
FROM Random R
);
GO
WITH Tally (n)
AS (
SELECT TOP (10000)
ROW_NUMBER() OVER (
ORDER BY (SELECT 0))
FROM master.sys.columns C1,
master.sys.columns C2,
master.sys.columns C3
)
-- Find the missing row_ids
SELECT n
FROM Tally
EXCEPT
SELECT row_id
FROM dbo.Test T
-- Tidy up
DROP TABLE dbo.Test;
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
April 1, 2010 at 7:15 am
You might want to check out this recent thread (click here). It seems to be very similiar to what you are asking.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
April 1, 2010 at 7:33 am
Paul, can you explain this? Especially, the ROW_NUMBER() OVER (ORDER BY (SELECT 0)) part. I'm sorry if i am bugging you.
SELECT TOP (10000)
ROW_NUMBER() OVER (
ORDER BY (SELECT 0))
FROM master.sys.columns C1,
master.sys.columns C2,
master.sys.columns C3;
- arjun
https://sqlroadie.com/
April 1, 2010 at 7:46 am
Arjun Sivadasan (4/1/2010)
Paul, can you explain this? Especially, the ROW_NUMBER() OVER (ORDER BY (SELECT 0)) part. I'm sorry if i am bugging you.SELECT TOP (10000)
ROW_NUMBER() OVER (
ORDER BY (SELECT 0))
FROM master.sys.columns C1,
master.sys.columns C2,
master.sys.columns C3;
- arjun
It's a shorthand way to generate a 'virtual' tally table.
It is a CROSS JOIN of three tables which are guaranteed to exist. The ROW_NUMBER just assigns a sequential number to each resulting row. The ORDER BY (SELECT 0) just means that I don't care about the ordering of rows. Row numbers are generated sequentially, and the TOP limits the processing to 10,000 numbers. Note that the TOP stops execution as soon as 10,000 is reached - the full CROSS JOIN is not fully processed, just the first 10,000 rows.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
April 1, 2010 at 7:51 am
I was wondering why you chose sys.columns. Thanks Paul. It makes sense now.
- arjun
https://sqlroadie.com/
April 1, 2010 at 8:21 am
Hehehe ..
The Paul's code is the best!!
!!!
April 1, 2010 at 8:24 am
andersonrj18 (4/1/2010)
You might improve, but this is the idea!
Ha ha very funny...April 1st I get it.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
April 1, 2010 at 9:04 am
Hi guys - great work determining missing rows.
I was wondering what the best way to go from the list of missing rows, to a list containing ranges of missing rows, is.
All the existing structure is true; but instead of results being 1, 3, 4, 5, 7 outputting two columns, [1,1], [3,5], [7,7], etc.
I have a very convoluted process that does the missing row determination as a left join from a known list of ID's; and an indexed Tally table (10mill rows). Taking this list of missing numbers (1,3,4,5,7) I do a self-join on it, with a - 1, to determine the minimum rowid's missing, and update this to a summary table. I do the same again to determine the maximum of the missing group, and update that to the summary table, after joining with a + 1. From this I have a list of range starts and range ends... This is applied to each row in the range begin and end rows by two update creating two tempory tables, with an Identity type id and a list of either the begins or the ends - then joining the missing rows with the results of a joining of the range-begin and range-end on id (as each beginning has a corresponding ending) - where the missing id is between the range-begin and range-end...
it works fast enough and all, but is very messy.
i attach code that can emulate the results I want; but the volume this is live on is up to 4 million rows... that makes it a tricky one.
Any suggestions, hints or ideas welcome...
Thanks
So long, and thanks for all the fishpaste 😉
April 1, 2010 at 9:12 am
Alocyte (4/1/2010)
Hi guys - great work determining missing rows.
Thanks. Why not post this question in a new thread though?
I answered a very, very similar problem (ranges of groups) just recently.
I'll see if I can find it.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
April 1, 2010 at 11:00 am
not me my friend .. you are funny!!
Your code doesn't work properly..
Tell me, what happens if I have two missing row_id in sequence?
765 and 766..
your code will show me just 765, where's the 766 ...humm ..could you explain??
thanks!!
April 1, 2010 at 11:13 am
andersonrj18 (4/1/2010)
Your code doesn't work properly...
Have you run it? Works fine for me 😛
Your cursor code seems to have disappeared. All good!
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
April 1, 2010 at 11:27 am
not worked, you know that!!
This is my code ... it works!!
create table #temp (id int)
declare @cont int
set @cont=1
while @cont <= 1000
begin
insert into #temp values (@cont)
set @cont=@cont+1
end
select t1.id from #temp t1
where not exists
( select 1 from #tb_temporaria t2
where t1.id = t2.id)
drop table #temp
April 1, 2010 at 11:31 am
andersonrj18 (4/1/2010)
not worked, you know that!!This is my code ... it works!!
:laugh: A WHILE loop is no better than your original cursor! :laugh:
Try it on a million rows - your code and mine.
See how you go. 😛
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply