January 5, 2016 at 10:20 pm
I have a table that contains rowdata column. I need to get row_number result set based on when values in rowdata range between starting values of 5% and 8%. Each partition should be based on when 5% ends with different characters at the end of the values. For example I would like to see row_numb field list row_number when the numbers are between 5% and 8%, it should reset to 1 whenever numbers starts with 5 ends with a different letter (x, y,z). Below is how I like the OP to look. I hope I am clear.
idrowdata Row_numb
15001x 1
2600 2
3700 3
4800 4
55002y1
66000 2
7700 3
8800 4
95003z 1
10600 2
11700 3
12800 4
Here is the script to create test tbl:
drop table test
CREATE TABLE [test](
[id] [int] NOT NULL identity,
[rowdata] [nvarchar](10) NULL
)
INSERT test (rowdata) VALUES ('5001x')
INSERT test (rowdata) VALUES ('600')
INSERT test (rowdata) VALUES ('700')
INSERT test (rowdata) VALUES ('800')
INSERT test (rowdata) VALUES ('5002y')
INSERT test (rowdata) VALUES ('6000')
INSERT test (rowdata) VALUES ('700')
INSERT test (rowdata) VALUES ('800')
INSERT test (rowdata) VALUES ('5003z')
INSERT test (rowdata) VALUES ('600')
INSERT test (rowdata) VALUES ('700')
INSERT test (rowdata) VALUES ('800')
GO
January 5, 2016 at 10:34 pm
Quick solution
😎
USE tempdb;
GO
SET NOCOUNT ON;
IF OBJECT_ID(N'dbo.testRN') IS NOT NULL DROP TABLE dbo.testRN
CREATE TABLE dbo.testRN
(
[id] [int] NOT NULL identity,
[rowdata] [nvarchar](10) NULL
)
INSERT INTO dbo.testRN (rowdata)
VALUES ('5001x')
,('600')
,('700')
,('800')
,('5002y')
,('6000')
,('700')
,('800')
,('5003z')
,('600')
,('700')
,('800');
;WITH BASE_DATA AS
(
SELECT
T.id
,T.rowdata
,SUM(SIGN(PATINDEX('%[A-z]%',T.rowdata))) OVER
(
ORDER BY T.id
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS RID
FROM dbo.testRN T
)
SELECT
BD.id
,BD.rowdata
,ROW_NUMBER() OVER
(
PARTITION BY BD.RID
ORDER BY BD.id
) AS Row_numb
FROM BASE_DATA BD;
Output
id rowdata Row_numb
----------- ---------- ----------
1 5001x 1
2 600 2
3 700 3
4 800 4
5 5002y 1
6 6000 2
7 700 3
8 800 4
9 5003z 1
10 600 2
11 700 3
12 800 4
January 6, 2016 at 4:34 am
Many thanks, I will apply it on actual data and let you know if I run into issues.
January 6, 2016 at 8:47 am
Perfect, it worked like a charm! Now I have one last question to ask: The ultimate goal is after we group the data is to extract each subset of data into a new table. For example extract the first subset:
1 5001x 1
2 600 2
3 700 3
4 800 4
into table 1
and extract:
5 5002y 1
6 6000 2
7 700 3
8 800 4
into table 2 and so on.
Thank you!
January 7, 2016 at 1:15 am
lsalih (1/6/2016)
Perfect, it worked like a charm! Now I have one last question to ask: The ultimate goal is after we group the data is to extract each subset of data into a new table. For example extract the first subset:1 5001x 1
2 600 2
3 700 3
4 800 4
into table 1
and extract:
5 5002y 1
6 6000 2
7 700 3
8 800 4
into table 2 and so on.
Thank you!
The insert for the first table would look like this, think you can figure out the rest;-)
😎
;WITH BASE_DATA AS
(
SELECT
T.id
,T.rowdata
,SUM(SIGN(PATINDEX('%[A-z]%',T.rowdata))) OVER
(
ORDER BY T.id
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS RID
FROM dbo.testRN T
)
INSERT INTO dbo.TABLE_01 (ID,ROWDATA,ROW_NUMB)
SELECT
BD.id
,BD.rowdata
,ROW_NUMBER() OVER
(
PARTITION BY BD.RID
ORDER BY BD.id
) AS Row_numb
FROM BASE_DATA BD
WHERE BD.RID = 1;
January 7, 2016 at 4:06 am
Thank you much, I got it! 🙂
January 7, 2016 at 5:01 am
For performance, I would capture the creative query with ROW_NUMBER in a temporary table, add a (clustered, nonunique) index on the row_number, and then copy the data to the new tables based off that temp table. That is probably going to be cheaper than repeating the query voer and over.
January 7, 2016 at 5:43 am
Hugo - I am putting the query in SSIS package, I will be using your recommended method. Thanks for your input.
January 7, 2016 at 5:03 pm
lsalih (1/6/2016)
Perfect, it worked like a charm! Now I have one last question to ask: The ultimate goal is after we group the data is to extract each subset of data into a new table. For example extract the first subset:1 5001x 1
2 600 2
3 700 3
4 800 4
into table 1
and extract:
5 5002y 1
6 6000 2
7 700 3
8 800 4
into table 2 and so on.
Thank you!
I have to ask, please... why are these subsets of data being stored in separate tables? The reason I ask is because there might be a MUCH less expensive solution all the way around but I need to know the original reason to think so or not.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 7, 2016 at 5:19 pm
The reason is converting some old financial data into a new system, the data came in text files from different groups and it was not clean. There are few steps we had to follow to migrate the data. This step was needed as part of the requirement and it was a one time process. The data is now transferred into the new system, with the big help I got here.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply