September 30, 2017 at 12:54 pm
rmcfaden - Saturday, September 30, 2017 9:04 AM
Jason,
OMG! 300K+ records output in 6 seconds versus what I was doing, which took between 1 and 2 minutes.
I've checked a few records so far and they match what I got by outputting and transposing in Excel.
One oddity though: I ran it against a small sample first, similar to the data above, but got confusing results:
OMG! 300K+ records output in 6 seconds versus what I was doing, which took between 1 and 2 minutes.
And now you know why the forum regulars are so fond of that function. A lot of very smart people have put in lots of hours trying to best it over the years.
One oddity though: I ran it against a small sample first, similar to the data above, but got confusing results:
Now that is really strange!!! It's something that's happening in the aggregation...
If you run just this, you'll see that the function is putting out the correct values.
SELECT *
FROM
#TABLEA t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.SplitCSVToTable8K(trk.tValue, ';') sc
Hopefully someone smarter than myself will come along and shed some light on this behavior, because this is the first time that I've seen it myself...
In any case, the fix is a simple kludge that doesn't appear to add any actual cost to the plan.
SELECT
t.PK_ID,
Job_Prod = MAX(t.Job_Prod),
Job_Id = MAX(t.Job_Id),
Actual_Start = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.PK_ID ORDER BY t.PK_ID) -1, MAX(t.Actual_Start)),
actual_Track = STUFF(MAX(CASE WHEN trk.tType = 'A' THEN '0'+sc.Item ELSE '0' END), 1, 1, ''),
pieces_track = STUFF(MAX(CASE WHEN trk.tType = 'P' THEN '0'+sc.Item ELSE '0' END), 1, 1, '')
FROM
#TABLEA t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.SplitCSVToTable8K(trk.tValue, ';') sc
GROUP BY
t.PK_ID,
sc.ItemNumber
ORDER BY
t.PK_ID,
sc.ItemNumber;
PK_ID Job_Prod Job_Id Actual_Start actual_Track pieces_track
----------- ------------ ---------------- ----------------------- ------------- ---------------
1 Virginia VirginiaA 2017-09-24 08:00:00.000 5 1
1 Virginia VirginiaA 2017-09-25 08:00:00.000 4 2
1 Virginia VirginiaA 2017-09-26 08:00:00.000 3 3
1 Virginia VirginiaA 2017-09-27 08:00:00.000 2 4
1 Virginia VirginiaA 2017-09-28 08:00:00.000 1 5
2 Virginia VirginiaB 2017-09-25 08:00:00.000 1 5
2 Virginia VirginiaB 2017-09-26 08:00:00.000 2 4
2 Virginia VirginiaB 2017-09-27 08:00:00.000 3 3
2 Virginia VirginiaB 2017-09-28 08:00:00.000 4 2
2 Virginia VirginiaB 2017-09-29 08:00:00.000 5 1
3 California CaliforniaA 2017-09-24 08:00:00.000 .1 .4
3 California CaliforniaA 2017-09-25 08:00:00.000 .2 .3
3 California CaliforniaA 2017-09-26 08:00:00.000 .3 .2
3 California CaliforniaA 2017-09-27 08:00:00.000 .4 .1
4 California CaliforniaB 2017-09-25 08:00:00.000 .4 .1
4 California CaliforniaB 2017-09-26 08:00:00.000 .3 .2
4 California CaliforniaB 2017-09-27 08:00:00.000 .2 .3
4 California CaliforniaB 2017-09-28 08:00:00.000 .1 .4
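For anyone who hits this later, a likely explanation for the oddity (my reading of it, not something confirmed elsewhere in the thread): MAX() on a varchar compares strings character by character under the collation, and '.' sorts before '0', so for items like '.1' the ELSE '0' branch of the CASE wins the MAX. Prepending '0' to the real item makes the first character a tie, and the STUFF just peels the pad character back off. A minimal sketch of the comparison, outside the main query:

```sql
-- Without the pad: '.' sorts before '0' in the usual collations, so the
-- ELSE '0' branch beats a real item that starts with a decimal point.
SELECT MAX(v) AS UnpaddedMax                  -- '0', not '.1'
FROM (VALUES ('.1'), ('0')) x (v);

-- With the pad: the padded item '0.1' now beats the bare '0', and STUFF
-- strips the leading pad character back out of the winner.
SELECT STUFF(MAX(v), 1, 1, '') AS PaddedMax   -- '.1'
FROM (VALUES ('0' + '.1'), ('0')) x (v);
```

That would also explain why the whole-number test data worked fine: '5' already sorts above '0', so the ELSE branch never won there.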
October 1, 2017 at 4:22 pm
bmg002 - Friday, September 29, 2017 10:55 AM
If I understand your request properly, I THINK that this will do exactly what you are asking, no?
http://sqljason.com/2010/05/converting-single-comma-separated-row.html
It uses CROSS APPLY, and you don't need to worry about the split function.
If that isn't correct, could you post your table creation script, an INSERT script with sample data, the queries you have tried, and your desired output?
EDIT - sorry, didn't see Steve already suggested getting the scripts.
On another note - is this required to be done inside SQL? If not, could you split the string at a different level such as SSRS, SSIS, or some home built application?
Or could you split the string at table insert time in the application and do multiple inserts instead of inserting a comma separated list?
I ask because based on my experience, SQL will very rarely out-perform any other application in terms of string splitting. Fixing the data now I think is good, but you probably also want to fix it at the data insertion level as well to reduce the use of splitting things inside SQL.
The XML conversion method is as slow as, and sometimes slower than, a WHILE loop. It doesn't stand a chance against the DelimitedSplit8K function. Certain "famous" authors have made it appear that the XML method is a good way because of the rather stupid way they've conducted their tests.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 1, 2017 at 4:38 pm
rmcfaden - Saturday, September 30, 2017 9:04 AM
Jason,
OMG! 300K+ records output in 6 seconds versus what I was doing, which took between 1 and 2 minutes.
I've checked a few records so far and they match what I got by outputting and transposing in Excel.
One oddity though: I ran it against a small sample first, similar to the data above, but got confusing results:
USE TEMPDB
GO
IF OBJECT_ID('dbo.TABLEA','U') IS NOT NULL DROP TABLE dbo.TABLEA;
CREATE TABLE TABLEA (
[Job_Prod] varchar(100),
[Job_No] varchar(16),
Act_Start datetime,
actual_Track varchar(1000),
pieces_track varchar(1000)
);
IF OBJECT_ID('TempDb..#TABLEA','U') IS NOT NULL DROP TABLE #TABLEA;
CREATE TABLE #TABLEA (
PK_ID INT NOT NULL IDENTITY(1,1),
[Job_Prod] varchar(100),
[Job_No] varchar(16),
Act_Start datetime,
actual_Track varchar(1000),
pieces_track varchar(1000)
);
INSERT INTO TABLEA VALUES ('Virginia', 'VirginiaA', '2017-09-24 08:00:00.000','5;4;3;2;1','1;2;3;4;5')
INSERT INTO TABLEA VALUES ('Virginia', 'VirginiaB', '2017-09-25 08:00:00.000', '1;2;3;4;5','5;4;3;2;1')
INSERT INTO TABLEA VALUES ('California', 'CaliforniaA', '2017-09-24 08:00:00.000', '.1;.2;.3;.4','.4;.3;.2;.1')
INSERT INTO TABLEA VALUES ('California', 'CaliforniaB', '2017-09-25 08:00:00.000', '.4;.3;.2;.1','.1;.2;.3;.4')
INSERT INTO #TABLEA SELECT [Job_No], [Job_Prod], Act_Start, actual_Track, pieces_track
FROM TABLEA
--SELECT A.[Job_No], A.Job_Prod,
-- SplitA.a.value('.', 'VARCHAR(100)') AS StringA,
-- SplitB.b.value('.', 'VARCHAR(100)') AS StringB
--FROM (SELECT [Job_No], Job_Prod,
-- CAST ('<M>' + REPLACE([actual_Track], ';', '</M><M>') + '</M>' AS XML) AS StringA,
-- CAST ('<M>' + REPLACE([pieces_Track], ';', '</M><M>') + '</M>' AS XML) AS StringB
-- FROM TableA) AS A CROSS APPLY StringA.nodes ('/M') AS SplitA(a)
-- CROSS APPLY StringB.nodes ('/M') AS SplitB(b);
--IF OBJECT_ID(N'#temp1') IS NOT NULL
-- DROP TABLE #temp1
--GO
--SELECT A.[Job_No],
-- Split.a.value('.','VARCHAR(100)') as String1,
-- A.pieces_track
--INTO #temp1
--FROM (SELECT [state], CAST('<M>' + REPLACE([City],',','</M><M>')+'</M>' AS XML) AS String
-- FROM TableA) AS A CROSS APPLY String.nodes('/M') AS Split(a);
--SELECT A.[State],
-- A.String1,
-- Split.a.value('.','VARCHAR(100)') as String2
--FROM (SELECT [state],[String1],CAST('<M>' + REPLACE([City],',','</M><M>')+'</M>' AS XML) AS String
-- FROM #temp1) AS A CROSS APPLY String.nodes('/M') AS Split(a);
SELECT
t.PK_ID,
Job_Prod = MAX(t.Job_Prod),
Job_No = MAX(t.Job_No),
Act_Start = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.PK_ID ORDER BY t.PK_ID) -1, MAX(t.Act_Start)),
actual_Track = MAX(CASE WHEN trk.tType = 'A' THEN sc.Item ELSE '0' END),
pieces_track = MAX(CASE WHEN trk.tType = 'P' THEN sc.Item ELSE '0' END)
FROM
#TABLEA t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY
t.PK_ID,
sc.ItemNumber
ORDER BY
t.PK_ID,
sc.ItemNumber;
PK_ID Job_Prod Job_No Act_Start actual_Track pieces_track
1 VirginiaA Virginia 2017-09-24 08:00:00.000 5 1
1 VirginiaA Virginia 2017-09-25 08:00:00.000 4 2
1 VirginiaA Virginia 2017-09-26 08:00:00.000 3 3
1 VirginiaA Virginia 2017-09-27 08:00:00.000 2 4
1 VirginiaA Virginia 2017-09-28 08:00:00.000 1 5
2 VirginiaB Virginia 2017-09-25 08:00:00.000 1 5
2 VirginiaB Virginia 2017-09-26 08:00:00.000 2 4
2 VirginiaB Virginia 2017-09-27 08:00:00.000 3 3
2 VirginiaB Virginia 2017-09-28 08:00:00.000 4 2
2 VirginiaB Virginia 2017-09-29 08:00:00.000 5 1
3 CaliforniaA California 2017-09-24 08:00:00.000 0 0
3 CaliforniaA California 2017-09-25 08:00:00.000 0 0
3 CaliforniaA California 2017-09-26 08:00:00.000 0 0
3 CaliforniaA California 2017-09-27 08:00:00.000 0 0
4 CaliforniaB California 2017-09-25 08:00:00.000 0 0
4 CaliforniaB California 2017-09-26 08:00:00.000 0 0
4 CaliforniaB California 2017-09-27 08:00:00.000 0 0
4 CaliforniaB California 2017-09-28 08:00:00.000 0 0
Why are you creating #TableA???
--Jeff Moden
Change is inevitable... Change for the better is not.
October 1, 2017 at 11:26 pm
Jeff,
First let me say thank you for taking the time to reply to my post. Thank you to all of you who have taken time to assist a nearly clueless noob.
Creating #TABLEA because I was told it needed a Primary Key to use with the OVER.
The Job_Prod is unique within the "TABLEA" table (which obviously goes by a different name in my db). Can I use it instead, even though it is not numeric or sequential?
Perhaps it would be better to just use the existing table and use ROW_NUMBER as suggested by Jason.
I tried this:
SELECT
--t.PK_ID,
Job_Prod = MAX(t.Job_Prod),
Job_No = MAX(t.Job_No),
TrDate = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.Job_Prod ORDER BY t.Job_Prod) -1, MAX(t.Act_Start)),
[Hours] = Convert(Float,STUFF(MAX(CASE WHEN trk.tType = 'A' THEN '0'+sc.Item ELSE '0' END), 1, 1, '')),
Pieces = Convert(Int,STUFF(MAX(CASE WHEN trk.tType = 'P' THEN '0'+sc.Item ELSE '0' END), 1, 1, ''))
FROM
--#TABLEA t
dbo.TABLEA t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY
t.Job_Prod,
sc.ItemNumber
ORDER BY
t.Job_Prod,
sc.ItemNumber;
This gave me many Null values for many of my dates.
Then I tried:
WITH CTETABLEA AS (
SELECT ROW_NUMBER() OVER (ORDER BY JOB_PROD) PK_ID,
[Job_Prod], [Job_No], Act_Start, actual_Track, pieces_track FROM TABLEA
)
SELECT
--t.PK_ID,
Job_Prod = MAX(t.Job_Prod),
Job_No = MAX(t.Job_No),
TrDate = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.PK_ID ORDER BY t.PK_ID) -1, MAX(t.Act_Start)),
[Hours] = Convert(Float,STUFF(MAX(CASE WHEN trk.tType = 'A' THEN '0'+sc.Item ELSE '0' END), 1, 1, '')),
Pieces = Convert(Int,STUFF(MAX(CASE WHEN trk.tType = 'P' THEN '0'+sc.Item ELSE '0' END), 1, 1, ''))
FROM
--#TABLEA t
CTETABLEA t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY
t.PK_ID,
sc.ItemNumber
ORDER BY
t.PK_ID,
sc.ItemNumber;
Which seems to have gotten rid of the improper NULL dates. Those that remain are associated with no actual Start date and no track strings in the main table.
Is the CTE better than the Temp table?
October 2, 2017 at 7:17 am
I'm still not clear on why the desired output lists data that is not present nor easily determined from the existing data, due to unequal numbers of elements in the two strings that are being split. I can get the date to increment using DATEADD and ROW_NUMBER, but handling the unequal number of elements in each string is not easily accomplished programmatically. This query will produce rows for all the elements that exist in both strings, but not for any element where there's no matching element in the other string. If that's a typo in the original data, let me know. I realize others have already posted what is probably this same code, but wanted to at least get back to this:
CREATE TABLE #TABLEA (
Job_Prod varchar(100),
Job_Id varchar(16),
Actual_Start datetime,
actual_Track varchar(1000),
pieces_track varchar(1000)
);
INSERT INTO #TABLEA (Job_Prod, Job_Id, Actual_Start, actual_Track, pieces_track)
VALUES ('VirginiaA', 'Virginia', '2017-09-24 08:00:00.000', '5;4;3;2;1','1;2;3;4;5;6'),
('VirginiaB', 'Virginia', '2017-09-25 08:00:00.000', '1;2;3;4;5;6','5;4;3;2;1'),
('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4','4;3;2;1'),
('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1','1;2;3;4');
SELECT Job_Prod, Job_ID,
CONVERT(date, DATEADD(day, ROW_NUMBER() OVER(PARTITION BY TA.Job_Prod ORDER BY S1.ItemNumber) - 1, TA.Actual_Start), 101) AS TrDate,
S1.Item AS FirstDateField, S2.Item AS SecondDateField
FROM #TABLEA AS TA
CROSS APPLY dbo.DelimitedSplit8K(TA.actual_Track, ';') AS S1
CROSS APPLY dbo.DelimitedSplit8K(TA.pieces_track, ';') AS S2
WHERE S1.ItemNumber = S2.ItemNumber
ORDER BY Job_Prod;
DROP TABLE #TABLEA;
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
October 2, 2017 at 8:13 am
rmcfaden - Sunday, October 1, 2017 11:26 PM
Jeff,
First let me say thank you for taking the time to reply to my post. Thank you to all of you who have taken time to assist a nearly clueless noob.
Creating #TABLEA because I was told it needed a Primary Key to use with the OVER.
The Job_Prod is unique within the "TABLEA" table (which obviously goes by a different name in my db). Should I be able to use it instead, it not being numeric, nor sequential?
Perhaps it would be better to just use the existing table and use ROW_NUMBER as suggested by Jason.
You're doing very well for someone claiming to be "clueless". At the very least, you're asking the right people for help.
You don't need a PK to use OVER for this. But, if Job_Prod is certain to be unique in TableA, then you already have something that's UNIQUE PER ROW and you don't need to use OVER at all. Use the Job_Prod column in a manner similar to what Jason was suggesting for ROW_NUMBER() OVER.
I'm at work and don't have much time to help right now. I'll take another gander after work.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 2, 2017 at 11:40 am
Jeff Moden - Sunday, October 1, 2017 4:22 PM
The XML conversion method is as slow as, and sometimes slower than, a WHILE loop. It doesn't stand a chance against the DelimitedSplit8K function. Certain "famous" authors have made it appear that the XML method is a good way because of the rather stupid way they've conducted their tests.
I didn't check the forum on the weekend and it looks like you guys made a lot of progress on it.
I just wanted to comment about that 8K splitter function - while I do agree that it performs quite well, I've also read in places that it was outperformed by other things, including XML, in some cases. In those cases, though, I believe it wasn't the most up-to-date version of the 8K splitter. That being said, there isn't a nice automated way that I am aware of to get the latest version when it gets updated, whereas XML is more of a standard and will continue to be supported in the future.
Another nice thing about the XML method is that even though it might not perform as well as the 8K splitter function in some cases, it will work against almost any out-of-the-box SQL install without the need of finding the latest 8K splitter every time you install a new SQL instance. I am not sure if the article I read recently still holds true (https://sqlperformance.com/2012/07/t-sql-queries/split-strings), but it indicates that XML can actually be faster than the 8K delimited splitter that is on SSC for small data sets (here I am referring to the column size, not the number of rows). But that article is a bit old (2012), so it may no longer hold true. Plus, that article is more about splitting varchar(max) and nvarchar(max), which the 8K splitter here can't do. I was not able to find a more recent article comparing the different string-splitting functions, but I would be curious about it and may do some internal testing when I have more time.
I am fairly certain that on similar hardware, a C# application should perform faster, but it would be more work to get the code running, especially if you wanted some form of UI and wanted to make it multithreaded, but even then it wouldn't be horrible to do.
But now I feel I am getting into the weeds on this and will bow out. I'm glad the OP got a solution that works.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
October 2, 2017 at 2:19 pm
bmg002 - Monday, October 2, 2017 11:40 AM
Jeff Moden - Sunday, October 1, 2017 4:22 PM
The XML conversion method is as slow as, and sometimes slower than, a WHILE loop. It doesn't stand a chance against the DelimitedSplit8K function. Certain "famous" authors have made it appear that the XML method is a good way because of the rather stupid way they've conducted their tests.
I didn't check the forum on the weekend and it looks like you guys made a lot of progress on it.
I just wanted to comment about that 8K splitter function - while I do agree that it performs quite well, I've also read in places that it was outperformed by other things, including XML, in some cases. In those cases, though, I believe it wasn't the most up-to-date version of the 8K splitter. That being said, there isn't a nice automated way that I am aware of to get the latest version when it gets updated, whereas XML is more of a standard and will continue to be supported in the future.
Another nice thing about the XML method is that even though it might not perform as well as the 8K splitter function in some cases, it will work against almost any out-of-the-box SQL install without the need of finding the latest 8K splitter every time you install a new SQL instance. I am not sure if the article I read recently still holds true (https://sqlperformance.com/2012/07/t-sql-queries/split-strings), but it indicates that XML can actually be faster than the 8K delimited splitter that is on SSC for small data sets (here I am referring to the column size, not the number of rows). But that article is a bit old (2012), so it may no longer hold true. Plus, that article is more about splitting varchar(max) and nvarchar(max), which the 8K splitter here can't do. I was not able to find a more recent article comparing the different string-splitting functions, but I would be curious about it and may do some internal testing when I have more time.
I am fairly certain that on similar hardware, a C# application should perform faster, but it would be more work to get the code running, especially if you wanted some form of UI and wanted to make it multithreaded, but even then it wouldn't be horrible to do.
But now I feel I am getting into the weeds on this and will bow out. I'm glad the OP got a solution that works.
The articles where you've read that came to that conclusion because of the single-row data they've replicated (I call it "grooved" data). I'm almost done with the articles that explain what should have been an obvious shortcoming in the test data. Further, some of the articles also changed the actual code of the function to fit their needs, and they broke it in the process.
The really bad part about it is that they've provided a test harness that produces the faulty scenario, and most readers apparently don't see the "Devils in the Data" that actually lead people into adopting the XML splitter, which is one of the slowest of them all.
Considering your "findings" on it all above, I'll try to expedite those articles.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 2, 2017 at 4:42 pm
Jeff Moden - Monday, October 2, 2017 8:13 AM
rmcfaden - Sunday, October 1, 2017 11:26 PM
Jeff,
First let me say thank you for taking the time to reply to my post. Thank you to all of you who have taken time to assist a nearly clueless noob.
Creating #TABLEA because I was told it needed a Primary Key to use with the OVER.
The Job_Prod is unique within the "TABLEA" table (which obviously goes by a different name in my db). Should I be able to use it instead, it not being numeric, nor sequential?
Perhaps it would be better to just use the existing table and use ROW_NUMBER as suggested by Jason.
You're doing very well for someone claiming to be "clueless". At the very least, you're asking the right people for help.
You don't need a PK to use OVER for this. But, if Job_Prod is certain to be unique in TableA, then you already have something that's UNIQUE PER ROW and you don't need to use OVER at all. Use the Job_Prod column in a manner similar to what Jason was suggesting for ROW_NUMBER() OVER.
I'm at work and don't have much time to help right now. I'll take another gander after work.
This seems to work just as well as using a temp table or CTE:
SELECT
Job_Prod = MAX(t.Job_Prod),
Job_No = MAX(t.Job_No),
TrDate = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.Job_Prod ORDER BY t.Job_Prod) -1, MAX(t.Act_Start)),
[Hours] = Convert(Float,STUFF(MAX(CASE WHEN trk.tType = 'A' THEN '0'+sc.Item ELSE '0' END), 1, 1, '')),
Pieces = Convert(Int,STUFF(MAX(CASE WHEN trk.tType = 'P' THEN '0'+sc.Item ELSE '0' END), 1, 1, ''))
FROM
TABLEA t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY
t.Job_Prod,
sc.ItemNumber
ORDER BY
t.Job_Prod,
sc.ItemNumber;
I'm guessing that the use of MAX here is to avoid grouping by those items, because the query works equally well as:
SELECT
Job_Prod,
Job_No,
TrDate = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.JOB_PROD ORDER BY t.JOB_PROD) -1, t.Act_Start),
[Hours] = Convert(Float,STUFF(MAX(CASE WHEN trk.tType = 'A' THEN '0'+sc.Item ELSE '0' END), 1, 1, '')),
Pieces = Convert(Int,STUFF(MAX(CASE WHEN trk.tType = 'P' THEN '0'+sc.Item ELSE '0' END), 1, 1, ''))
FROM
SS t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY
t.JOB_PROD,
Job_No,
t.Act_Start,
sc.ItemNumber
ORDER BY
t.JOB_PROD,
sc.ItemNumber;
3 seconds
October 2, 2017 at 4:51 pm
rmcfaden - Monday, October 2, 2017 4:42 PM
Jeff Moden - Monday, October 2, 2017 8:13 AM
rmcfaden - Sunday, October 1, 2017 11:26 PM
Jeff,
First let me say thank you for taking the time to reply to my post. Thank you to all of you who have taken time to assist a nearly clueless noob.
Creating #TABLEA because I was told it needed a Primary Key to use with the OVER.
The Job_Prod is unique within the "TABLEA" table (which obviously goes by a different name in my db). Should I be able to use it instead, it not being numeric, nor sequential?
Perhaps it would be better to just use the existing table and use ROW_NUMBER as suggested by Jason.
You're doing very well for someone claiming to be "clueless". At the very least, you're asking the right people for help.
You don't need a PK to use OVER for this. But, if Job_Prod is certain to be unique in TableA, then you already have something that's UNIQUE PER ROW and you don't need to use OVER at all. Use the Job_Prod column in a manner similar to what Jason was suggesting for ROW_NUMBER() OVER.
I'm at work and don't have much time to help right now. I'll take another gander after work.
This seems to work just as well as using a temp table or CTE:
SELECT
Job_Prod = MAX(t.Job_Prod),
Job_No = MAX(t.Job_No),
TrDate = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.Job_Prod ORDER BY t.Job_Prod) -1, MAX(t.Act_Start)),
[Hours] = Convert(Float,STUFF(MAX(CASE WHEN trk.tType = 'A' THEN '0'+sc.Item ELSE '0' END), 1, 1, '')),
Pieces = Convert(Int,STUFF(MAX(CASE WHEN trk.tType = 'P' THEN '0'+sc.Item ELSE '0' END), 1, 1, ''))
FROM
TABLEA t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY
t.Job_Prod,
sc.ItemNumber
ORDER BY
t.Job_Prod,
sc.ItemNumber;
I'm guessing that the use of MAX here is to avoid grouping by those items, because the query works equally well as:
SELECT
Job_Prod,
Job_No,
TrDate = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.JOB_PROD ORDER BY t.JOB_PROD) -1, t.Act_Start),
[Hours] = Convert(Float,STUFF(MAX(CASE WHEN trk.tType = 'A' THEN '0'+sc.Item ELSE '0' END), 1, 1, '')),
Pieces = Convert(Int,STUFF(MAX(CASE WHEN trk.tType = 'P' THEN '0'+sc.Item ELSE '0' END), 1, 1, ''))
FROM
SS t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY
t.JOB_PROD,
Job_No,
t.Act_Start,
sc.ItemNumber
ORDER BY
t.JOB_PROD,
sc.ItemNumber;
3 seconds
3 Seconds for how many rows?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 2, 2017 at 5:13 pm
Jeff Moden - Monday, October 2, 2017 4:51 PM
rmcfaden - Monday, October 2, 2017 4:42 PM
Jeff Moden - Monday, October 2, 2017 8:13 AM
rmcfaden - Sunday, October 1, 2017 11:26 PM
Jeff,
First let me say thank you for taking the time to reply to my post. Thank you to all of you who have taken time to assist a nearly clueless noob.
Creating #TABLEA because I was told it needed a Primary Key to use with the OVER.
The Job_Prod is unique within the "TABLEA" table (which obviously goes by a different name in my db). Should I be able to use it instead, it not being numeric, nor sequential?
Perhaps it would be better to just use the existing table and use ROW_NUMBER as suggested by Jason.
You're doing very well for someone claiming to be "clueless". At the very least, you're asking the right people for help.
You don't need a PK to use OVER for this. But, if Job_Prod is certain to be unique in TableA, then you already have something that's UNIQUE PER ROW and you don't need to use OVER at all. Use the Job_Prod column in a manner similar to what Jason was suggesting for ROW_NUMBER() OVER.
I'm at work and don't have much time to help right now. I'll take another gander after work.
This seems to work just as well as using a temp table or CTE:
SELECT
Job_Prod = MAX(t.Job_Prod),
Job_No = MAX(t.Job_No),
TrDate = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.Job_Prod ORDER BY t.Job_Prod) -1, MAX(t.Act_Start)),
[Hours] = Convert(Float,STUFF(MAX(CASE WHEN trk.tType = 'A' THEN '0'+sc.Item ELSE '0' END), 1, 1, '')),
Pieces = Convert(Int,STUFF(MAX(CASE WHEN trk.tType = 'P' THEN '0'+sc.Item ELSE '0' END), 1, 1, ''))
FROM
TABLEA t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY
t.Job_Prod,
sc.ItemNumber
ORDER BY
t.Job_Prod,
sc.ItemNumber;
I'm guessing that the use of MAX here is to avoid grouping by those items, because the query works equally well as:
SELECT
Job_Prod,
Job_No,
TrDate = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.JOB_PROD ORDER BY t.JOB_PROD) -1, t.Act_Start),
[Hours] = Convert(Float,STUFF(MAX(CASE WHEN trk.tType = 'A' THEN '0'+sc.Item ELSE '0' END), 1, 1, '')),
Pieces = Convert(Int,STUFF(MAX(CASE WHEN trk.tType = 'P' THEN '0'+sc.Item ELSE '0' END), 1, 1, ''))
FROM
SS t
CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY
t.JOB_PROD,
Job_No,
t.Act_Start,
sc.ItemNumber
ORDER BY
t.JOB_PROD,
sc.ItemNumber;
3 seconds
3 Seconds for how many rows?
353,002
October 2, 2017 at 7:37 pm
Cool. I'm thinking that we can simplify a bit. Using one of the code pieces that you posted from the collective thoughts of the people on this thread, we can actually do away with the ROW_NUMBER(). (Notice how the ItemNumber from the function was used instead.) I'm also not sure what the STUFF was being used for, so I removed that as well. In the process, that gets rid of a concatenation for each STUFF, too.
-- USE TEMPDB
GO
--=================================================================================================
-- Create and Populate the test table. This is NOT a part of the solution.
--=================================================================================================
IF OBJECT_ID('dbo.TableA','U') IS NOT NULL DROP TABLE dbo.TableA;
GO
CREATE TABLE TABLEA
(
Job_Prod VARCHAR(100)
,Job_No VARCHAR(16)
,Actual_Start DATETIME
,Actual_Track VARCHAR(1000)
,Pieces_Track VARCHAR(1000)
)
;
INSERT INTO TABLEA VALUES
('VirginiaA' , 'Virginia' , '2017-09-24 08:00:00.000', '5;4;3;2;1' ,'1;2;3;4;5;6')
,('VirginiaB' , 'Virginia' , '2017-09-25 08:00:00.000', '1;2;3;4;5;6' ,'5;4;3;2;1')
,('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4','4;3;2;1')
,('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1','1;2;3;4')
;
GO
--=================================================================================================
-- Solve the problem using the ItemNumber from the function to create the sequential dates
-- starting with the "Actual_Start" date.
--=================================================================================================
SELECT Job_Prod
,Job_No
,TrDate = DATEADD(dd, sc.ItemNumber-1, t.Actual_Start)
,[Hours] = CONVERT(FLOAT,MAX(CASE WHEN trk.tType = 'A' THEN sc.Item ELSE '0' END))
,Pieces = CONVERT(INT ,MAX(CASE WHEN trk.tType = 'P' THEN sc.Item ELSE '0' END))
FROM dbo.TableA t
CROSS APPLY (VALUES ('A', t.actual_Track)
,('P', t.pieces_track)
) trk (tType, tValue)
CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc
GROUP BY Job_Prod
,t.Job_No
,t.Actual_Start
,sc.ItemNumber
ORDER BY Job_Prod
,Job_No
,sc.ItemNumber
;
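In case the earlier question about the MAX comes up again: after the VALUES unpivot, each (Job_Prod, ItemNumber) group contains exactly one 'A' row and one 'P' row, so the pair of MAX(CASE...) expressions is just the old cross-tab trick for folding those two rows back into two columns. Stripped down to nothing but that mechanism (the literal values here are made up for illustration):

```sql
-- One 'A' row and one 'P' row per group; each CASE returns the real value
-- on its own row and '0' on the other, and MAX keeps the real value.
SELECT ItemNumber
      ,[Hours] = MAX(CASE WHEN tType = 'A' THEN Item ELSE '0' END)
      ,Pieces  = MAX(CASE WHEN tType = 'P' THEN Item ELSE '0' END)
FROM (VALUES (1,'A','5'), (1,'P','1')
            ,(2,'A','4'), (2,'P','2')
     ) v (ItemNumber, tType, Item)
GROUP BY ItemNumber
ORDER BY ItemNumber;
```

This is also where the decimal-point oddity discussed earlier lives: the ELSE '0' only loses to the real item when the real item sorts above '0' in the collation.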
--Jeff Moden
Change is inevitable... Change for the better is not.
October 2, 2017 at 9:09 pm
Jeff Moden - Monday, October 2, 2017 7:37 PM
Cool. I'm thinking that we can simplify a bit. Using one of the code pieces that you posted from the collective thoughts of the people on this thread, we can actually do away with the ROW_NUMBER(). (Notice how the ItemNumber from the function was used instead.) I'm also not sure what the STUFF was being used for, so I removed that as well. In the process, that gets rid of a concatenation for each STUFF, too.
As to the row_number() / PK thing... In my defense, it was after 3:00 AM and I was fighting a bad case of insomnia when I posted my original solution. I assumed (yeah, I know) that the data didn't have uniqueness within the presented data... At least that's my excuse, and I'm sticking with it!
As to the:
I'm also not sure what the STUFF was being used for, so I removed that as well. In the process, that gets rid of a concatenation for each STUFF, too.
That wasn't part of the original solution. I added it after rmcfaden discovered some seriously odd behavior when testing with the following data set... Plug this into your solution... You really do need to see this for yourself... I'm still scratching my head.
INSERT INTO TABLEA VALUES ('Virginia', 'VirginiaA', '2017-09-24 08:00:00.000','5;4;3;2;1','1;2;3;4;5')
INSERT INTO TABLEA VALUES ('Virginia', 'VirginiaB', '2017-09-25 08:00:00.000', '1;2;3;4;5','5;4;3;2;1')
INSERT INTO TABLEA VALUES ('California', 'CaliforniaA', '2017-09-24 08:00:00.000', '.1;.2;.3;.4','.4;.3;.2;.1')
INSERT INTO TABLEA VALUES ('California', 'CaliforniaB', '2017-09-25 08:00:00.000', '.4;.3;.2;.1','.1;.2;.3;.4')
October 3, 2017 at 6:59 am
Jason A. Long - Monday, October 2, 2017 9:09 PM
Jeff Moden - Monday, October 2, 2017 7:37 PM
Cool. I'm thinking that we can simplify a bit. Using one of the code pieces that you posted from the collective thoughts of the people on this thread, we can actually do away with the ROW_NUMBER(). (Notice how the ItemNumber from the function was used instead.) I'm also not sure what the STUFF was being used for, so I removed that as well. In the process, that gets rid of a concatenation for each STUFF, too.
As to the row_number() / PK thing... In my defense, it was after 3:00 AM and I was fighting a bad case of insomnia when I posted my original solution. I assumed (yeah, I know) that the data didn't have uniqueness within the presented data... At least that's my excuse, and I'm sticking with it!
As to the:
I'm also not sure what the STUFF was being used for, so I removed that as well. In the process, that gets rid of a concatenation for each STUFF, too.
That wasn't part of the original solution. I added it after rmcfaden discovered some seriously odd behavior when testing with the following data set... Plug this into your solution... You really do need to see this for yourself... I'm still scratching my head.
INSERT INTO TABLEA VALUES ('Virginia', 'VirginiaA', '2017-09-24 08:00:00.000','5;4;3;2;1','1;2;3;4;5')
INSERT INTO TABLEA VALUES ('Virginia', 'VirginiaB', '2017-09-25 08:00:00.000', '1;2;3;4;5','5;4;3;2;1')
INSERT INTO TABLEA VALUES ('California', 'CaliforniaA', '2017-09-24 08:00:00.000', '.1;.2;.3;.4','.4;.3;.2;.1')
INSERT INTO TABLEA VALUES ('California', 'CaliforniaB', '2017-09-25 08:00:00.000', '.4;.3;.2;.1','.1;.2;.3;.4')
Good Sir, with the quality of the code you write, you never have to justify yourself to me. And, yes... I absolutely did realize that the OP added the STUFF thing. When I said that the code was the result of the "collective" efforts, I did include the OP in that. And, yes... the OP didn't cough up that the Job_Prod column was, in fact, unique until fairly late in the game. Heh... I was just doing the BASF thing... "We don't make the paint, we just make it dry a little quicker". 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
October 3, 2017 at 7:21 am
Jeff Moden - Tuesday, October 3, 2017 6:59 AM
Good Sir, with the quality of the code you write, you never have to justify yourself to me. And, yes... I absolutely did realize that the OP added the STUFF thing. When I said that the code was the result of the "collective" efforts, I did include the OP in that. And, yes... the OP didn't cough up that the Job_Prod column was, in fact, unique until fairly late in the game. Heh... I was just doing the BASF thing... "We don't make the paint, we just make it dry a little quicker". 😉
Actually, Jason added the STUFF thing. That was because, in my testing, I found that oddity with that test data.
I do apologize for not letting everyone in on the uniqueness of the Job_Prod column. It did not dawn on me to mention that. (NOOB)
BTW, I am struggling with which post to acknowledge as the answer.
You guys are awesome and I greatly appreciate all the efforts you have put into assisting me and certainly for all you do for everyone who posts on this site.
I was thinking this morning that I would like to know what you guys know, but then again, if all that knowledge swimming around in your brains leads to insomnia, perhaps I do not want that.