April 10, 2008 at 8:15 pm
I don't know how to explain this clearly, but I have the data set below
with all int data type.
ID year reportdate
1 1998 19980101
1 2000 20001231
1 2003 20030331
drop table table_1
create table table_1
(
ID int NULL,
Year int NULL,
ReportDate int NULL
)
insert into table_1
values(1, 1998, 19980101)
insert into table_1
values(1, 2000, 20001231)
insert into table_1
values(1, 2003, 20030331)
select * from table_1
I need to add the missing year and missing reportdate in between or among
these existing years and date, for it to come out as below. as you can see,
it is kind of tricky because the missing year is easy to add; however, the report
date of the missing year has to be the date of the following year.
I created a function to do this, and it works fine when the id has only one year missing,
or consecutively missing (missing year in order), but it does not work when they
are missing and out of sequence. the while loop only works when
it's to be looped through in order of +1 or -1, but how can you loop
through when you don't know when it would be missing to add them properly?
Any idea or suggestion would be greatly appreciated.
-- Result to be
ID year reportdate
1 1998 19980101
1 1999 19991231
1 2000 20001231
1 2001 20010331
1 2002 20020331
1 2003 20030331
April 10, 2008 at 10:20 pm
How about something like (I hope is right - I haven't got access to a SQL Server instance to check it)
DECLARE @yearlist cursor for select distinct year from table_1
DECLARE @EndYear int, @year INT
OPEN yearlist
FETCH NEXT FROM @yearlist into @year
WHILE @@Fetch_status = 0
BEGIN
SELECT @endyear = Min (Year)
from table_1
where year > @year
if @year + 1 <> @endyear
BEGIN
WJILE @year < @endyear
BEGIN
INSERT INTO table_1 (year, reportdate)
SELECT @year + 1, (@year + 1) * 10000 + CONVERT (INT, Right ((min(year), 4)
FROM Table_1
WHERE Year > @year
SELECT @year = @year + 1
END
fetch next from @yearlist into @year
END
deallocate @yearlist
April 11, 2008 at 8:57 am
thanks very much happycat59 , let me try in my procedure.
Even if it does not work perfectly, you gave me idea and modify
to my needs and some lights at the end of the tunel. Greatly appreciated.
April 11, 2008 at 9:10 am
If you have a lot of these to do - you could use a TALLY table to help build this instead of a cursor. with a fairly sizeable number - this would be lots faster.
A Tally table is something that has an unbroken sequence of numbers in it. Can be VERY useful to keep around for just these kinds of scenarios and lots more.
Here's a quick way to build one:
create table Tally(number as int primary key clustered)
Insert Tally(number)
select top 100000 --pick how many you want
identity(int,1,1)
from sys.all_columns SC1 cross join sys.all_columns SC2
Once you build that your query becomes really easy:
insert table1 (year, reportdate)
select tally.number as [year],
cast(tally.number as varchar(4))+'1231' as reportdate
from tally
LEFT OUTER JOIN table_1 on tally.number=table_1.year
where
tally.number between 1990 and 2010 --pick your daterange here
and table_1.year IS NULL
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 11, 2008 at 12:54 pm
Thanks very much Matt Miller.
I am still working the cursor into my solution
and certainly will try yours as well, because
you are right, if it can be done without cursor
it would be preferred way. But I will see what
best for my problem. Appreciated a lot for
helping me.
April 12, 2008 at 9:05 pm
hi happycat59
your cursor was right, I just had to change the right(year) portion
to right(reportdate) and added one more end.
however, even after it has inserted all the correct years
and date, the cursor will not stop, it just kept running until I manually stop it.
is there anything I missed to do?
thanks again.
April 12, 2008 at 10:04 pm
Matt is correct about the Tally table... but a Tally table isn't really worth a hoot when it comes to performance unless you also build the necessary Clustered Primary Key... please see the script for building a Tally table at the following URL...
http://www.sqlservercentral.com/scripts/Advanced+SQL/62486/
--Jeff Moden
Change is inevitable... Change for the better is not.
April 12, 2008 at 10:30 pm
Thanks Jeff for the link and info. I have never heard of it until Matt wrote it to me.
It is very helpful and agree that better with proper index.
I was be able to replace my function and use the tally to
generate the misisng year, but i'm still having
trouble with generating the associated date
with the criteria that I described. So, still playing around with it
and the cursor solution, so will see.
April 12, 2008 at 11:15 pm
As you're finding out, the hard part is to "smear" the existing data into the new rows for the missing years. This type of smearing is VERY procedural and most folks think the only way to do it is either with a cursor or a While loop. Since you've never heard of the Tally Table before, might as well show you a little SQL Server trick that you've never seen before to replace the cursor. And, on large datasets, it's very fast...
... here's all the code including your original table generator. The details are in the comments in the code...
[font="Courier New"]
--=============================================================================
-- Recreate the posted test data... this is NOT part of the solution.
--=============================================================================
create table table_1
(
ID int NULL,
Year int NULL,
ReportDate int NULL
)
insert into table_1
values(1, 1998, 19980101)
insert into table_1
values(1, 2000, 20001231)
insert into table_1
values(1, 2003, 20030331)
--=============================================================================
-- Using the Tally table as a source of numbers for the missing years, copy
-- all existing rows and make new rows for missing years in a temp table.
--=============================================================================
SELECT IDENTITY(INT,1,1) AS RowNum,
t1.ID, yrs.Year, t1.ReportDate
INTO #AllYears
FROM dbo.Table_1 t1
RIGHT OUTER JOIN
(--==== Returns all years from the first year to the last year
SELECT t.N AS Year
FROM dbo.Tally T --List of numbers
INNER JOIN
(--==== Find first and last years to use as criteria
-- in the inner join with the Tally table
SELECT MIN(Year) AS YearLo,
MAX(Year) AS YearHi
FROM Table_1
) y
ON t.N BETWEEN y.YearLo AND y.YearHi
) yrs
ON t1.Year = yrs.Year
ORDER BY yrs.Year DESC
--===== Add the required clustered key to force the physical order of data
-- for the "smear" update that will follow
ALTER TABLE #AllYears
ADD PRIMARY KEY CLUSTERED (RowNum)
--=============================================================================
-- Using the clustered key to automatically order the data in the correct
-- descending order for Year, "smear" the data from existing rows down
-- into the new rows that are missing data using SQL Server's proprietary
-- UPDATE tablename SET @vaiable = column = expression
--=============================================================================
--===== Declare some variables to remember values for the previous row updated.
DECLARE @PrevID INT --The previous ID
DECLARE @PrevMMDD INT --Right 4 digits of the previous ReportDate
SELECT @PrevID = 0, --Start off w/non-null values that are not in the table.
@PrevMMDD = 0
--===== Do the "smear" update. Using the variables to remember the values from
-- each previous row, add the missing data to the table where the data is
-- missing.
UPDATE #AllYears
SET @PrevID = ID = CASE WHEN ID IS NULL THEN @PrevID ELSE ID END,
@PrevMMDD = CASE WHEN RIGHT(ReportDate,4) IS NULL THEN @PrevMMDD
ELSE RIGHT(ReportDate,4) END,
ReportDate = REPLACE(STR(Year,4)+STR(@PrevMMDD,4),' ','0')
FROM #AllYears
--=============================================================================
-- Display the desired result
--=============================================================================
SELECT ID, Year, ReportDate
FROM #AllYears
ORDER BY ID,ReportDate
--===== Housekeeping so can run the test again from SMS
DROP TABLE #AllYears
DROP TABLE Table_1
[/font]
--Jeff Moden
Change is inevitable... Change for the better is not.
April 12, 2008 at 11:34 pm
p.s.
Longer code is not always slower code. Longer code does the ol' "Divide and Conquer". The code I posted is what they call "Set Based" programming and that doesn't mean doing it all in a single query. It's a powerful way to write code for many reasons...
1. Easier to develop because you "peel on potato at a time" (Divide and Conquer). Solving one problem at a time in a larger task just makes it easier to write code for.
2. Easier to document. Each step is naturally broken out from the whole task and leaves plenty of room for documentation.
3. The documentation makes it easier for the next poor slob (or, maybe you a year from now) to figure out what you're doing in the code for troubleshooting. The managers at work are amazed that research on larger procs to make simple changes has dropped from as much as 2 days to mere minutes just because we forced folks to document their code. And, fewer mistakes in modifications are made to boot.
4. The documentation also makes it easier to develop the code because you don't have to remember what you've already written and you don't need to read/analyze the code to see what you've already done. Pays off big time on larger procs but is just as important on the smaller ones.
And, that's my soapbox for the day 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
April 13, 2008 at 5:51 am
Jeff, your procedure works wonderfully.
Thanks so much for going above-and-beyond for helping
me in creating the proc and for giving me guidance
on doing things the proper ways. You should write a book
about this. I really like the way you comment your proc
and put each part into section. It is very easy to read and
clean. I'm deeply grateful, so thanks x a million.
Have a beautiful Sunday.
April 13, 2008 at 9:18 am
Thanks, SQLBlue. :blush: Very happy it does what you want and I sure do appreciate the compliment... you made my weekend. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
April 16, 2008 at 1:12 pm
Jeff Moden (4/12/2008)
As you're finding out, the hard part is to "smear" the existing data into the new rows for the missing years. This type of smearing is VERY procedural and most folks think the only way to do it is either with a cursor or a While loop. Since you've never heard of the Tally Table before, might as well show you a little SQL Server trick that you've never seen before to replace the cursor. And, on large datasets, it's very fast...... here's all the code including your original table generator. The details are in the comments in the code...
[font="Courier New"]
--=============================================================================
-- Recreate the posted test data... this is NOT part of the solution.
--=============================================================================
create table table_1
(
ID int NULL,
Year int NULL,
ReportDate int NULL
)
insert into table_1
values(1, 1998, 19980101)
insert into table_1
values(1, 2000, 20001231)
insert into table_1
values(1, 2003, 20030331)
--=============================================================================
-- Using the Tally table as a source of numbers for the missing years, copy
-- all existing rows and make new rows for missing years in a temp table.
--=============================================================================
SELECT IDENTITY(INT,1,1) AS RowNum,
t1.ID, yrs.Year, t1.ReportDate
INTO #AllYears
FROM dbo.Table_1 t1
RIGHT OUTER JOIN
(--==== Returns all years from the first year to the last year
SELECT t.N AS Year
FROM dbo.Tally T --List of numbers
INNER JOIN
(--==== Find first and last years to use as criteria
-- in the inner join with the Tally table
SELECT MIN(Year) AS YearLo,
MAX(Year) AS YearHi
FROM Table_1
) y
ON t.N BETWEEN y.YearLo AND y.YearHi
) yrs
ON t1.Year = yrs.Year
ORDER BY yrs.Year DESC
--===== Add the required clustered key to force the physical order of data
-- for the "smear" update that will follow
ALTER TABLE #AllYears
ADD PRIMARY KEY CLUSTERED (RowNum)
--=============================================================================
-- Using the clustered key to automatically order the data in the correct
-- descending order for Year, "smear" the data from existing rows down
-- into the new rows that are missing data using SQL Server's proprietary
-- UPDATE tablename SET @vaiable = column = expression
--=============================================================================
--===== Declare some variables to remember values for the previous row updated.
DECLARE @PrevID INT --The previous ID
DECLARE @PrevMMDD INT --Right 4 digits of the previous ReportDate
SELECT @PrevID = 0, --Start off w/non-null values that are not in the table.
@PrevMMDD = 0
--===== Do the "smear" update. Using the variables to remember the values from
-- each previous row, add the missing data to the table where the data is
-- missing.
UPDATE #AllYears
SET @PrevID = ID = CASE WHEN ID IS NULL THEN @PrevID ELSE ID END,
@PrevMMDD = CASE WHEN RIGHT(ReportDate,4) IS NULL THEN @PrevMMDD
ELSE RIGHT(ReportDate,4) END,
ReportDate = REPLACE(STR(Year,4)+STR(@PrevMMDD,4),' ','0')
FROM #AllYears
--=============================================================================
-- Display the desired result
--=============================================================================
SELECT ID, Year, ReportDate
FROM #AllYears
ORDER BY ID,ReportDate
--===== Housekeeping so can run the test again from SMS
DROP TABLE #AllYears
DROP TABLE Table_1
[/font]
Jeff,
How did you get the code to have the proper color coding? Is there a way to do this automatically when posting or did you have to manually color code everything? It does make it a lot easier to read.
Thanks,
Ian.
"If you are going through hell, keep going."
-- Winston Churchill
April 16, 2008 at 1:20 pm
http://www.Simple-Talk.com (another of RedGate's pages), has an "SQL Pretifier" on it. There's a link on the front page. It will do that kind of color-coding.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 16, 2008 at 1:29 pm
Phil Factor also apparently customized one just for here....:
http://extras.sqlservercentral.com/prettifier/prettifier.aspx
Use the IFCodes version (that's how to format stuff for SSC).
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply