September 26, 2021 at 5:12 pm
we have detail daily data in a table about members., trying to finding how many new members (by considering ID1, ID2, ID3 columns values) monthly totals.
so we want to find out when is the ID1, ID2, ID3 combination first exist and mark add count for monthly.
CREATE TABLE #MYSRC
(
ID_1 VARCHAR(10),
ID_2 int,
ID_3 VARCHAR(10),
Pay1 varchar(6),
Pay2 varchar(6),
Status varchar(10)
FileDate
)
--- SOME MORE detail INFO about data
--- WE HAVE ABOUT 2 YEARS WORTH OF DAILY FEEDS DATA LOADED TO ONETABLE(MYSRCHISTORY_members) last column FileDate shows what day is the file,
--- USUALLY EACH DAY FILE HAVE ABOUT 30K RECORDS, IN WHICH IT INCLUDES OLD RECORDS AS IS, NEW ENTRIES ADDED (by considering id1,id2,id3 combination) , FOR SOME MEMBERS IT MAY CHANGE OTHER COLUMN VALUES.
--- WE FOUND MEMBER BASED ON THREE id FIELDS, IF ID_1,2,3 CHANGES THEN IT IS CONSIDER AS NEW MEMBER.
--- HERE WOULD LIKE TO KNOW HOW MANY NEWMEMBERS WERE ADDED PER EACH MONTH (MONTHLY AGGREGATION)
---Inserts 5 records in that file, 3 old 2 new (in 2 one is just id change one is
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','EN-25', '09/20/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for Sep
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '09/20/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for Sep
INSERT INTO #MYSRC VALUES ('3122', 250, '3243', '20.75','0.00','EN-25', '09/20/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for Sep
INSERT INTO #MYSRC VALUES ('7120', 250, '3241', '120.75','0.00','EN-50', '09/03/2021') --New RECORDS because of ID1 change,
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '09/03/2021') ---Old RECORDS/member
INSERT INTO #MYSRC VALUES ('1122', 250, '3243', '20.75','0.00','EN-25', '09/02/2021') --old record
INSERT INTO #MYSRC VALUES ('1123', 252, '3244', null,null,'EN-25', '09/02/2021') --old record/member
INSERT INTO #MYSRC VALUES ('1122', 27, '3243', '20.75','0.00','EN-25', '09/01/2021') --New Set of ID1,2,3 not exist in earlier data so count it as new member for SEP
INSERT INTO #MYSRC VALUES ('1123', 252, '3244', null,null,'EN-25', '09/01/2021') --New Set of ID1,2,3 not exist in earlier data so count it as new member for SEP
---- August Data below - FileDate (last column)
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','EN-25', '08/31/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for Aug
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '08/31/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for Aug
INSERT INTO #MYSRC VALUES ('3122', 250, '3243', '20.75','0.00','EN-25', '08/31/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for Aug
INSERT INTO #MYSRC VALUES ('3120', 250, '777', '120.75','0.00','EN-25', '08/30/2021') --New Set of ID1,2,3 not exist in earlier data so count it as new member for Aug
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '08/30/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','EN-25', '08/29/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '08/29/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('3122', 250, '3243', '20.75','0.00','EN-25', '08/29/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '888', '120.75','0.00','EN-25', '08/18/2021') ---same set (id1,2,3) not exist earlier, appeared first in Aug 2021. so count it
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '08/17/2021')--- exist same set (id1,2,3) in earlier data, so not a new member for Aug
INSERT INTO #MYSRC VALUES ('1121', 37, '3242', null,'3.00','FU-25', '08/11/2021') --- New Set of ID1,2,3 not exist in earlier data so count it as new member for AUG
INSERT INTO #MYSRC VALUES ('3122', 36, '100', '20.75','0.00','EN-25', '08/11/2021') ---New Set of ID1,2,3 not exist in earlier data so count it as new member for AUG
---- July Data below
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','EN-25', '07/31/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '07/31/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('9122', 250, '3243', '20.75','0.00','EN-25', '07/31/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '777', '120.75','0.00','EN-25', '07/20/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '07/20/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','EN-25', '07/10/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('6121', 251, '3242', null,'3.00','EN-25', '07/10/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('3122', 250, '3243', '20.75','0.00','EN-25', '07/10/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '888', '120.75','0.00','EN-25', '07/05/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('5121', 251, '3242', null,'3.00','EN-25', '07/05/2021')-- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '07/01/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('4122', 250, '100', '20.75','0.00','EN-25', '07/01/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
---- Jun
INSERT INTO #MYSRC VALUES ('3120', 250, '888', '120.75','0.00','EN-25', '06/05/2021')
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '06/05/2021')
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '06/01/2021')
INSERT INTO #MYSRC VALUES ('3122', 250, '100', '20.75','0.00','EN-25', '06/01/2021') --- assume jun 2021 is Starting population
Expected Output
Date, NewMembersCount
Jun-2021, 0 -- The oldest in the file table can make all 0's or simply just rows count
July-2021, 7 -- 7 new members were added on July 2021
Aug-2021,4 -- 4 new members were added on July 2021
Sep-2021,3 -- 3 new members were added on July 2021
Thank you
ASita
September 26, 2021 at 9:50 pm
I really appreciate what you tried to do with the readily consumable data but it's broken. The table creation is broken and the table to be inserted into is not to be found anywhere.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 26, 2021 at 11:38 pm
Thank you for trying to help Jeff. I am sorry for this wrong table name in source. I corrected table name in my initial post code.
Also here is example of output
Create Table #Output
( DateCol Varchar(10),
NewMemberscount int, -- Total number of new members
ENMembersCount int, -- #mysrc table, "Status" Column (for only new members) first 2 characters is EN then count falls here
FUCMembersCount int -- #mysrc table, "Status" Column (for only new members) first 2 characters is FU then count falls here
)
INSERT INTO #Output Values ('Jun-2021',0,0,0)
INSERT INTO #Output Values ('Jul-2021',7,4,3)
INSERT INTO #Output Values ('Aug-2021',3,2,1)
INSERT INTO #Output Values ('Sep-2021',4,2,2)
Thank you Again in advance, please help
Best Regards
ASiti
September 27, 2021 at 12:01 am
You really need to actually try to run the code you posted before you post it. Heh... you even apparently ignored what I told you what was wrong with the code. It's still broken.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 27, 2021 at 12:36 am
Please use below Code i executed ran without issues
CREATE TABLE #MYSRC
(
ID_1 VARCHAR(10),
ID_2 int,
ID_3 VARCHAR(10),
Pay1 varchar(6),
Pay2 varchar(6),
Status varchar(10),
FileDate DATE
)
--- SOME MORE detail INFO about data
--- WE HAVE ABOUT 2 YEARS WORTH OF DAILY FEEDS DATA LOADED TO ONETABLE(MYSRCHISTORY_members) last column FileDate shows what day is the file,
--- USUALLY EACH DAY FILE HAVE ABOUT 30K RECORDS, IN WHICH IT INCLUDES OLD RECORDS AS IS, NEW ENTRIES ADDED (by considering id1,id2,id3 combination) , FOR SOME MEMBERS IT MAY CHANGE OTHER COLUMN VALUES.
--- WE FOUND MEMBER BASED ON THREE id FIELDS, IF ID_1,2,3 CHANGES THEN IT IS CONSIDER AS NEW MEMBER.
--- HERE WOULD LIKE TO KNOW HOW MANY NEWMEMBERS WERE ADDED PER EACH MONTH (MONTHLY AGGREGATION)
---Inserts 5 records in that file, 3 old 2 new (in 2 one is just id change one is
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','EN-25', '09/20/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for Sep
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '09/20/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for Sep
INSERT INTO #MYSRC VALUES ('3122', 250, '3243', '20.75','0.00','EN-25', '09/20/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for Sep
INSERT INTO #MYSRC VALUES ('7120', 250, '3241', '120.75','0.00','FN-50', '09/03/2021') --New RECORDS because of ID1 change,
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '09/03/2021') ---Old RECORDS/member
INSERT INTO #MYSRC VALUES ('1122', 250, '3243', '20.75','0.00','EN-25', '09/02/2021') --old record
INSERT INTO #MYSRC VALUES ('1123', 252, '3244', null,null,'EN-25', '09/02/2021') --old record/member
INSERT INTO #MYSRC VALUES ('1122', 27, '3243', '20.75','0.00','FN-25', '09/01/2021') --New Set of ID1,2,3 not exist in earlier data so count it as new member for SEP
INSERT INTO #MYSRC VALUES ('1123', 252, '3244', null,null,'EN-25', '09/01/2021') --New Set of ID1,2,3 not exist in earlier data so count it as new member for SEP
---- August Data below - FileDate (last column)
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','EN-25', '08/31/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for Aug
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '08/31/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for Aug
INSERT INTO #MYSRC VALUES ('3122', 250, '3243', '20.75','0.00','EN-25', '08/31/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for Aug
INSERT INTO #MYSRC VALUES ('3120', 250, '777', '120.75','0.00','EN-25', '08/30/2021') --New Set of ID1,2,3 not exist in earlier data so count it as new member for Aug
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '08/30/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','EN-25', '08/29/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '08/29/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('3122', 250, '3243', '20.75','0.00','EN-25', '08/29/2021') -- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '888', '120.75','0.00','EN-25', '08/18/2021') ---same set (id1,2,3) not exist earlier, appeared first in Aug 2021. so count it
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '08/17/2021')--- exist same set (id1,2,3) in earlier data, so not a new member for Aug
INSERT INTO #MYSRC VALUES ('1121', 37, '3242', null,'3.00','FU-25', '08/11/2021') --- New Set of ID1,2,3 not exist in earlier data so count it as new member for AUG
INSERT INTO #MYSRC VALUES ('3122', 36, '100', '20.75','0.00','EN-25', '08/11/2021') ---New Set of ID1,2,3 not exist in earlier data so count it as new member for AUG
---- July Data below
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','EN-25', '07/31/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '07/31/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('9122', 250, '3243', '20.75','0.00','EN-25', '07/31/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '777', '120.75','0.00','EN-25', '07/20/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '07/20/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '3241', '120.75','0.00','FU-25', '07/10/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('6121', 251, '3242', null,'3.00','FU-25', '07/10/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('3122', 250, '3243', '20.75','0.00','EN-25', '07/10/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('3120', 250, '888', '120.75','0.00','EN-25', '07/05/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('5121', 251, '3242', null,'3.00','FU-25', '07/05/2021')-- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '07/01/2021') --- exist same set (id1,2,3) in earlier data, so not a new member for July
INSERT INTO #MYSRC VALUES ('4122', 250, '100', '20.75','0.00','EN-25', '07/01/2021') -- New Set of ID1,2,3 not exist in earlier data so count it as new member for July
---- Jun
INSERT INTO #MYSRC VALUES ('3120', 250, '888', '120.75','0.00','EN-25', '06/05/2021')
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '06/05/2021')
INSERT INTO #MYSRC VALUES ('1121', 251, '3242', null,'3.00','EN-25', '06/01/2021')
INSERT INTO #MYSRC VALUES ('3122', 250, '100', '20.75','0.00','EN-25', '06/01/2021') --- assume jun 2021 is Starting population
September 27, 2021 at 12:37 am
there was a comma and a data type issue, i corrected Jeff. please help. Thank you
September 27, 2021 at 1:12 am
The latest version of the code works just fine now. I also took a look at your expected output. We don't need to create any code to know that your desired output does not match the data because there's only one row in the test data with a status of "FU-25".
--Jeff Moden
Change is inevitable... Change for the better is not.
September 27, 2021 at 1:39 am
Sorry FUs updated. Jeff.
Please let me know. if you need any further clarification.
Data output is for information what output columns expected etc. may not need to exactly match. but around.
Thank you much and apologize for the troublesome code
asiri
September 27, 2021 at 1:53 am
That was just one of the problems with the data v.s. what you wanted for the output.
Anyway, here's the code you need.
WITH cte AS
(
SELECT FileMonth = DATEADD(mm,DATEDIFF(mm,0,FileDate),0)
,Status2 = SUBSTRING(Status,1,CHARINDEX('-',CONCAT(Status,'-'))-1)
,InstanceNum = ROW_NUMBER() OVER (PARTITION BY ID_1,ID_2,ID_3 ORDER BY FileDate)
FROM #MYSRC
)
SELECT DateCol = REPLACE(SUBSTRING(CONVERT(VARCHAR(20),FileMonth,106),4,20),' ','-')
,NewMembersCount = SUM(InstanceNum)
,ENMembersCount = SUM(IIF(Status2 = 'EN',InstanceNum,0))
,FNMembersCount = SUM(IIF(Status2 = 'FU',InstanceNum,0))
FROM cte
WHERE InstanceNum = 1
GROUP BY FileMonth
ORDER BY FileMonth
;
--Jeff Moden
Change is inevitable... Change for the better is not.
September 27, 2021 at 2:05 am
Data output is for information what output columns expected etc. may not need to exactly match. but around.
Thank you much and apologize for the troublesome code
asiri
Just remember... If you can't successfully run what you've posted, neither can we. To be honest, if the poster doesn't care, I don't care and I normally don't say a word. I normally try to run the consumable data code 1 time and, if it doesn't run, I simply move on. There are obviously an exception here and there. The only reason why yours is an exception is because you tried so hard the first time.
Be sure. Run your code before you post it.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 27, 2021 at 4:18 am
Thank you very much Jeff. I noticed one thing if there is none changed in the latest month it is not showing any record for the month it would be great if we can provide just 0's
Once again Thank you jeff
September 27, 2021 at 4:19 am
Totally Understand Jeff. I will be careful and follow your valuable guidelines for sure. and i will be extra cautious.
Thank you & Grateful to you!
September 27, 2021 at 10:35 am
Thank you very much Jeff. I noticed one thing if there is none changed in the latest month it is not showing any record for the month it would be great if we can provide just 0's
Once again Thank you jeff
This should do it for you. You can get the fnTally function from a similarly named link in my signature line below.
I also added a bit of documentation as to what each section of code does.
--===== Get the MIN and MAX FileDate
DECLARE @pStartMonth DATE
,@pEndMonth DATE
;
SELECT @pStartMonth = MIN(FileDate)
,@pEndMonth = MAX(FileDate)
FROM #MYSRC
;
WITH
cteEnumerate AS
(--==== Number the rows for the given date range by unique instance of ID_1,ID_2,ID_3
SELECT FileMonth = DATEADD(mm,DATEDIFF(mm,0,FileDate),0)
,Status2 = SUBSTRING(Status,1,CHARINDEX('-',CONCAT(Status,'-'))-1)
,InstanceNum = ROW_NUMBER() OVER (PARTITION BY ID_1,ID_2,ID_3 ORDER BY FileDate)
FROM #MYSRC
WHERE FileDate >= DATEADD(mm,DATEDIFF(mm,0,@pStartMonth),0)
AND FileDate < DATEADD(mm,DATEDIFF(mm,-1,@pEndMonth) ,0) -- First month after @pEndMonth
)
,cteAggregate AS
(--==== Only the rows with an InstanceNum = 1 are new. Group those by month and count them.
SELECT DateCol = REPLACE(SUBSTRING(CONVERT(VARCHAR(20),FileMonth,106),4,20),' ','-')
,NewMembersCount = SUM(InstanceNum)
,ENMembersCount = SUM(IIF(Status2 = 'EN',InstanceNum,0))
,FUMembersCount = SUM(IIF(Status2 = 'FU',InstanceNum,0))
,FileMonth
FROM cteEnumerate
WHERE InstanceNum = 1
GROUP BY FileMonth
)
,cteCalendar AS
(--==== Create a calendar of starting dates for all months in the desired date range.
SELECT FileMonth = DATEADD(mm,t.N,DATEADD(mm,DATEDIFF(mm,0,@pStartMonth),0))
FROM dbo.fnTally(0,DATEDIFF(mm,@pStartMonth,@pEndMonth))t
)--==== Outer join to the calendar table so dates with no data will produce a "0".
SELECT DateCol = cal.FileMonth
,NewMembersCount = ISNULL(agg.NewMembersCount,0)
,ENMembersCount = ISNULL(agg.ENMembersCount ,0)
,FUMembersCount = ISNULL(agg.FUMembersCount ,0)
FROM cteCalendar cal
LEFT JOIN cteAggregate agg
ON cal.FileMonth = agg.FileMonth
;
The question now is, do you know how it all works so that when someone asks you for a change or a similar report, you can do things on your own?
Also, you'll need to spend some time figuring out a better way because this method isn't sustainable. No matter how big the file grows, you'll always need to look at the entire file to find the first date that someone became a member. My recommendation would be to create a list of members and their "join" date according to the FileDate. That's likely to be a smaller list and it could also be used for other things.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 28, 2021 at 2:11 am
Good Afternoon Jeff,
Thank you very much for detail, it really helps me, i am working on with my data set. I think I can extend this but if there is any tweaks that i am struggling will let you know
Once again Grateful to you Jeff
Best Regards
asita
October 2, 2021 at 2:19 pm
Hello Jeff Good Afternoon,
Can you please assist with above query, how can i see the actual data that is new for that month.
for example May 2021 we have 10 new members how can i see the data for these 10 members for all columns?
Thank you in advance
ASita
Thank you
ASita
Viewing 15 posts - 1 through 15 (of 33 total)
You must be logged in to reply to this topic. Login to reply