November 28, 2012 at 8:54 am
Hi
I am being supplied with a data fril from a 3rd party piece of software (which is old, very old)
It is supplying the data in the following format:
(current data and expected data further down the page)
-------------------------------------------------------------------------------------------
CREATE TABLE PS_TestForOnline
(
Coding_Source int,
Admit_Date date,
Disch_Date date,
Patient_No nvarchar (8),
Ward varchar (10),
Consultant varchar (15),
Episode_Date date,
Episode_Number varchar (15)
);
INSERT INTO PS_TestForOnline
VALUES('4','23-Nov-12','23-Nov-12','1234567','DSS*CIC','SAR','23-Nov-12', '1234567>1:1');
INSERT INTO PS_TestForOnline
VALUES('50',null,null,'','','',null,'');
INSERT INTO PS_TestForOnline
VALUES('6','21-Nov-12','23-Nov-12','7654321','AG*CIC' ,'RAVV','23-Nov-12', '7654321>2:1');
INSERT INTO PS_TestForOnline
VALUES('35',null,null,'','','',null,'');
INSERT INTO PS_TestForOnline
VALUES('50',null,null,'','','',null,'');
INSERT INTO PS_TestForOnline
VALUES('60',null,null,'','','',null,'');
select * from PS_TestForOnline
drop table PS_TestForOnline
-----------------------------------------------------------------------------------------------
The current data has basically a line of data items. The Coding_Source number on the rows with the blanks relate to the row above. So basically what i need to do is populate any blank rows (where just a Coding_Source is present) with the details from the row above.
These is what i would expect my results to look like.
----------------------------------------------------------------------------------------------
CREATE TABLE PS_TestForOnline_Results
(
Coding_Source int,
Admit_Date date,
Disch_Date date,
Patient_No nvarchar (8),
Ward varchar (10),
Consultant varchar (15),
Episode_Date date,
Episode_Number varchar (15),
);
INSERT INTO PS_TestForOnline_Results
VALUES('4','23-Nov-12','23-Nov-12','1234567','DSS*CIC','SAR','23-Nov-12', '1234567>1:1');
INSERT INTO PS_TestForOnline_Results
VALUES('50','23-Nov-12','23-Nov-12','1234567','DSS*CIC','SAR','23-Nov-12', '1234567>1:1');
INSERT INTO PS_TestForOnline_Results
VALUES('6','21-Nov-12','23-Nov-12','7654321','AG*CIC' ,'RAVV','23-Nov-12', '7654321>2:1');
INSERT INTO PS_TestForOnline_Results
VALUES('35','21-Nov-12','23-Nov-12','7654321','AG*CIC' ,'RAVV','23-Nov-12', '7654321>2:1');
INSERT INTO PS_TestForOnline_Results
VALUES('50','21-Nov-12','23-Nov-12','7654321','AG*CIC' ,'RAVV','23-Nov-12', '7654321>2:1');
INSERT INTO PS_TestForOnline_Results
VALUES('60','21-Nov-12','23-Nov-12','7654321','AG*CIC' ,'RAVV','23-Nov-12', '7654321>2:1');
select * from PS_TestForOnline_Results
drop table PS_TestForOnline_Results
-----------------------------------------------------------------------------------------------
Many thanks in advance for any advice you can give me.
Cheers
November 28, 2012 at 9:05 am
Have you no row number? Or time stamp? Really, to guarantee that the results are correct we need to be able to have some way of determining the row order. I could code something that will work with what you've provided us, but it wouldn't be guaranteed to continue working in production as there is no concept of row order.
November 28, 2012 at 9:09 am
I supppose we could add in a self gentrating row ID as we go along.???
This will eventually be imported from a csv/txt file via the import wizard in SQL Server 2008 ( and ulimatly assigned into a job)
With my limited knowledge of sql, i preseume adding in a row id every time would be doable?
Thanks
November 28, 2012 at 9:11 am
LoosinMaMind (11/28/2012)
I supppose we could add in a self gentrating row ID as we go along.???This will eventually be imported from a csv/txt file via the import wizard in SQL Server 2008 ( and ulimatly assigned into a job)
With my limited knowledge of sql, i preseume adding in a row id every time would be doable?
Thanks
Sure but you still need something to use as an order by otherwise the order is not going to be consistent.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
November 28, 2012 at 9:13 am
Assuming you have a column SeqNum to order the rows by, you can do this
UPDATE t1
SET Admit_Date = ca.Admit_Date,
Disch_Date = ca.Disch_Date,
Patient_No = ca.Patient_No,
Ward = ca.Ward,
Consultant = ca.Consultant,
Episode_Date = ca.Episode_Date,
Episode_Number = ca.Episode_Number
FROM PS_TestForOnline t1
CROSS APPLY (SELECT TOP 1 t2.Admit_Date,t2.Disch_Date,t2.Patient_No,t2.Ward,t2.Consultant,t2.Episode_Date,t2.Episode_Number
FROM PS_TestForOnline t2
WHERE t2.SeqNum < t1.SeqNum
AND t2.Admit_Date IS NOT NULL
ORDER BY t2.SeqNum DESC) ca
WHERE t1.Admit_Date IS NULL;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537November 28, 2012 at 9:19 am
The only form of sequence we have is that the file comes correctly sorted.
We know that the rows are in the right order when we open the file.
????
November 28, 2012 at 9:30 am
Looks like Mark beat me to it, but here were my thoughts.
First, we need a row number in the sample data to guarantee row ordering (as mentioned before). For this, I've modified your sample data: -
CREATE TABLE PS_TestForOnline
(
Pos INT IDENTITY(1,1),
Coding_Source int,
Admit_Date date,
Disch_Date date,
Patient_No nvarchar (8),
Ward varchar (10),
Consultant varchar (15),
Episode_Date date,
Episode_Number varchar (15)
);
INSERT INTO PS_TestForOnline
VALUES('4','23-Nov-12','23-Nov-12','1234567','DSS*CIC','SAR','23-Nov-12', '1234567>1:1');
INSERT INTO PS_TestForOnline
VALUES('50',null,null,'','','',null,'');
INSERT INTO PS_TestForOnline
VALUES('6','21-Nov-12','23-Nov-12','7654321','AG*CIC' ,'RAVV','23-Nov-12', '7654321>2:1');
INSERT INTO PS_TestForOnline
VALUES('35',null,null,'','','',null,'');
INSERT INTO PS_TestForOnline
VALUES('50',null,null,'','','',null,'');
INSERT INTO PS_TestForOnline
VALUES('60',null,null,'','','',null,'');
My initial thought was then to do a sort of smear of the data, like this: -
SELECT a.Coding_Source, ad.Admit_Date, dd.Disch_Date, pn.Patient_No, w.Ward, ed.Episode_Date,
en.Episode_Number
FROM PS_TestForOnline a
CROSS APPLY (SELECT TOP 1 Admit_Date
FROM PS_TestForOnline b
WHERE b.POS <= a.POS
AND b.Admit_Date IS NOT NULL AND b.Admit_Date <> ''
ORDER BY b.POS DESC) ad
CROSS APPLY (SELECT TOP 1 Disch_Date
FROM PS_TestForOnline b
WHERE b.POS <= a.POS
AND b.Disch_Date IS NOT NULL AND b.Disch_Date <> ''
ORDER BY b.POS DESC) dd
CROSS APPLY (SELECT TOP 1 Patient_No
FROM PS_TestForOnline b
WHERE b.POS <= a.POS
AND b.Patient_No IS NOT NULL AND b.Patient_No <> ''
ORDER BY b.POS DESC) pn
CROSS APPLY (SELECT TOP 1 Ward
FROM PS_TestForOnline b
WHERE b.POS <= a.POS
AND b.Ward IS NOT NULL AND b.Ward <> ''
ORDER BY b.POS DESC) w
CROSS APPLY (SELECT TOP 1 Episode_Date
FROM PS_TestForOnline b
WHERE b.POS <= a.POS
AND b.Episode_Date IS NOT NULL AND b.Episode_Date <> ''
ORDER BY b.POS DESC) ed
CROSS APPLY (SELECT TOP 1 Episode_Number
FROM PS_TestForOnline b
WHERE b.POS <= a.POS
AND b.Episode_Number IS NOT NULL AND b.Episode_Number <> ''
ORDER BY b.POS DESC) en;
Which produces: -
Coding_Source Admit_Date Disch_Date Patient_No Ward Episode_Date Episode_Number
------------- ---------- ---------- ---------- ---------- ------------ ---------------
4 2012-11-23 2012-11-23 1234567 DSS*CIC 2012-11-23 1234567>1:1
50 2012-11-23 2012-11-23 1234567 DSS*CIC 2012-11-23 1234567>1:1
6 2012-11-21 2012-11-23 7654321 AG*CIC 2012-11-23 7654321>2:1
35 2012-11-21 2012-11-23 7654321 AG*CIC 2012-11-23 7654321>2:1
50 2012-11-21 2012-11-23 7654321 AG*CIC 2012-11-23 7654321>2:1
60 2012-11-21 2012-11-23 7654321 AG*CIC 2012-11-23 7654321>2:1
Mark's way involves a lot less, so if it works then that is the way to go.
November 28, 2012 at 9:45 am
many thanks
November 28, 2012 at 11:40 am
LoosinMaMind (11/28/2012)
many thanks
I want to ephasize again that there must be some way of guaranteeing the insert order of the original data so the IDENTITY column will actually reflect the correct order. Parallelism must be defeated using OPTION(MAXDOP 1) on the original insert if fro a bulk source, just for starters.
What is the source of the original data? A file? Or is the app writing directly to the table?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply