Filling the missing rows/gap in the data

  • Hello,

    I have been setting up a database where I can store our sales and activity data, and struggling at some point, since I'm learning on the job. 🙂 But now I ran into an issue as I mentioned below.

    P.S. I ran into an issue with dynamic SQL and pivoting the other day, and the folks here helped me out every time I got stuck. So, I thought that the best way to learn how I can pull this off is to ask from here, since there is no one around me I can ask.

    I have a table as follows where I store the monthly sales. But, there are missing rows, since there are no sales in some months. I've attached a sample data to define what I'm pointing out. As you may see in the screenshot, no sales recorded for the subterritory "IS02" within September, that's why no record displays in the table. So, filling the gaps with the relative records and putting "0" in "UNITS" for the months with no sales is what I need at this point.

    Please refer the Sample_Data_2 too see an illustration about what I need to perform.

    Thanks in advance

  • What you're looking for here is a Calendar/Tally table[/url].

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • seismicbeat (10/31/2016)


    Hello,

    I have been setting up a database where I can store our sales and activity data, and struggling at some point, since I'm learning on the job. 🙂 But now I ran into an issue as I mentioned below.

    P.S. I ran into an issue with dynamic SQL and pivoting the other day, and the folks here helped me out every time I got stuck. So, I thought that the best way to learn how I can pull this off is to ask from here, since there is no one around me I can ask.

    I have a table as follows where I store the monthly sales. But, there are missing rows, since there are no sales in some months. I've attached a sample data to define what I'm pointing out. As you may see in the screenshot, no sales recorded for the subterritory "IS02" within September, that's why no record displays in the table. So, filling the gaps with the relative records and putting "0" in "UNITS" for the months with no sales is what I need at this point.

    Please refer the Sample_Data_2 too see an illustration about what I need to perform.

    Thanks in advance

    Pictures are good but they're tough to execute. If you'd like a coded answer, please see the first link in my signature line below.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/31/2016)


    Pictures are good but they're tough to execute. If you'd like a coded answer, please see the first link in my signature line below.

    😀

    IF OBJECT_ID('TempDB..#testtable','U') IS NOT NULL DROP TABLE #testtable;

    CREATE TABLE #testtable

    (

    COMPANY char(2),

    AREAS char(2),

    PRODUCT char(2),

    MARKET char(2),

    SUBTERRITORIES char(4),

    [YEAR] int,

    [MONTH] int,

    UNITS int

    );

    INSERT#testtable (COMPANY,AREAS,PRODUCT,MARKET,SUBTERRITORIES,YEAR,MONTH,UNITS)

    VALUES('AA','BB','AB','AC','ISO1',2016,8,15),

    ('AA','BB','AB','AC','ISO1',2016,9,22),

    ('AA','BB','AB','AC','ISO2',2016,7,17),

    ('AA','BB','AB','AC','ISO2',2016,8,19),

    ('AA','BB','AB','AC','ISO3',2016,7,36),

    ('AA','BB','AB','AC','ISO3',2016,8,45),

    ('AA','BB','AB','AC','ISO3',2016,9,52),

    ('AA','BB','AB','AC','ISO4',2016,7,6),

    ('AA','BB','AB','AC','ISO4',2016,9,12);

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Was experimenting a bit but could only come up with

    cross join distinct list of COMPANY,AREAS,PRODUCT,MARKET,SUBTERRITORIES with range of YEAR and MONTH (using tally table) and left join original table

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Hello,

    Thanks all for all the help, sorry for not providing you the queries and not showing how I compile the data. The data is provided to me in an .abf file (backup file that can be attached to SSAS). I've setup a Linked Server and established a connection between the database and SSAS to get the data with MDX. I've made some research about OPENQUERY and DRILLTHROUGH operators to query the data from SSAS, eventually I came up with the following queries;

    TRUNCATE TABLE WEEKLY_SALES_RAW

    INSERT INTO WEEKLY_SALES_RAW

    SELECT * FROM OPENQUERY (WEEK,

    '

    DRILLTHROUGH MAXROWS 100000000

    SELECT

    FROM [SSASCUBENAME]

    RETURN

    NAME([$STANDARD ATTRIBUTES].[COMP]) AS COMPANY

    ,NAME([$GEOGRAPHY].[AREAS]) AS AREAS

    ,NAME([$PRODUCT].[PRODUCTS]) AS PRODUCT

    ,NAME([$PRODUCT].[MARKET]) AS MARKET

    ,NAME([$GEO].[SUBTERRITORIES]) AS SUBTERRITORIES

    ,NAME([$PERIOD].[WEEK]) AS WEEK

    ,NAME([$PERIOD].[MONTH]) AS MONTH

    ,[SALES].[UNITS] AS [UNITS]

    ,[SALES].[VALUES] AS [VALUES]

    '

    )

    It retrieves the data in most detailed information from the source, then I perform the following query to modify the data.

    TRUNCATE TABLE WEEKLY_SALES

    INSERT INTO WEEKLY_SALES

    SELECT

    temp.COMPANY,

    temp.AREAS,

    temp.PRODUCT,

    temp.MARKET,

    temp.SUBTERRITORIES,

    CASE

    WHEN temp.MONTH = 'Jan' THEN '01'

    WHEN temp.MONTH = 'Feb' THEN '02'

    WHEN temp.MONTH = 'Mar' THEN '03'

    WHEN temp.MONTH = 'Apr' THEN '04'

    WHEN temp.MONTH = 'May' THEN '05'

    WHEN temp.MONTH = 'Jun' THEN '06'

    WHEN temp.MONTH = 'Jul' THEN '07'

    WHEN temp.MONTH = 'Aug' THEN '08'

    WHEN temp.MONTH = 'Sep' THEN '09'

    WHEN temp.MONTH = 'Oct' THEN '10'

    WHEN temp.MONTH = 'Nov' THEN '11'

    WHEN temp.MONTH = 'Dec' THEN '12'

    END as MONTH,

    temp.YEAR,

    temp.WEEK,

    temp.STARTDATE,

    temp.ENDDATE,

    temp.UNITS,

    temp.VALUE

    FROM

    (

    SELECT

    a.COMPANY,

    a.AREAS,

    a.PRODUCT,

    a.MARKET,

    a.SUBTERRITORIES,

    SUBSTRING(MONTH,1,3) as MONTH,

    SUBSTRING(MONTH,5,4) as YEAR,

    SUBSTRING(WEEK,9,2) as WEEK,

    DATEADD

    (DAY,

    ((CAST(SUBSTRING(WEEK,9,2) AS INT) - 1) * 7) + 3,

    (DATEADD(YEAR, (DATEDIFF(YEAR, '19000101', (SUBSTRING(MONTH,5,4) + '0101'))), '19000101'))) as STARTDATE,

    DATEADD

    (DAY,

    ((CAST(SUBSTRING(WEEK,9,2) AS INT) - 1) * 7) + 9,

    (DATEADD(YEAR, (DATEDIFF(YEAR, '19000101', (SUBSTRING(MONTH,5,4) + '0101'))), '19000101'))) as ENDDATE,

    a.UNITS,

    a.VALUE

    FROM

    WEEKLY_SALES_RAW a

    ) as temp

    The result of the above query is what I showed in the pictures. Now, I can calculate both monthly and weekly sales, but the issue I ran into is, if there are no sales in some subterritories, those records don't show up in the data like I showed in the pictures.

    So, let's suppose no sales recorded for the subterritory "IS02" within September. What I need is filling the gaps and missing records with "0" sales. I hope, I could properly explain what I need.

    I will try to perform the tally table solution that Thom suggested, and thanks David for the Cross Join solution. 🙂

    Thanks

  • Maybe this can shed light on what needs doing:

    CREATE TABLE #TEST_DATA (

    COMPANY char(2),

    AREAS char(2),

    PRODUCT char(2),

    MARKET char(2),

    SUBTERRITORIES char(4),

    [YEAR] int,

    [MONTH] int,

    UNITS int

    );

    INSERT#TEST_DATA (COMPANY, AREAS, PRODUCT, MARKET, SUBTERRITORIES, [YEAR], [MONTH], UNITS)

    VALUES('AA','BB','AB','AC','ISO1',2016,8,15),

    ('AA','BB','AB','AC','ISO1',2016,9,22),

    ('AA','BB','AB','AC','ISO2',2016,7,17),

    ('AA','BB','AB','AC','ISO2',2016,8,19),

    ('AA','BB','AB','AC','ISO3',2016,7,36),

    ('AA','BB','AB','AC','ISO3',2016,8,45),

    ('AA','BB','AB','AC','ISO3',2016,9,52),

    ('AA','BB','AB','AC','ISO4',2016,7,6),

    ('AA','BB','AB','AC','ISO4',2016,9,12);

    DECLARE @START_YEAR AS int = 2016;

    DECLARE @START_MONTH AS int = 1;

    DECLARE @START_DATE AS date = DATEADD(MONTH, @START_MONTH - 1, DATEADD(YEAR, @START_YEAR - 1900, 0));

    WITH NUMBERS AS (

    SELECT 1 AS N UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ),

    TALLY AS (

    SELECT TOP (12) ROW_NUMBER() OVER(ORDER BY N1.N) - 1 AS RN

    FROM NUMBERS AS N1, NUMBERS AS N2

    ),

    ALL_MONTHS AS (

    SELECT T.RN, DATEADD(MONTH, T.RN, @START_DATE) AS MONTH_DATE

    FROM TALLY AS T

    ),

    ALL_GROUPS AS (

    SELECT DISTINCT COMPANY, MARKET, AREAS, SUBTERRITORIES, PRODUCT

    FROM #TEST_DATA

    ),

    ALL_GROUPS_AND_MONTHS AS (

    SELECT *, YEAR(MONTH_DATE) AS THE_YEAR, MONTH(MONTH_DATE) AS THE_MONTH

    FROM ALL_MONTHS, ALL_GROUPS

    )

    SELECT M.MONTH_DATE, M.COMPANY, M.MARKET, M.AREAS, M.SUBTERRITORIES, M.PRODUCT,

    SUM(ISNULL(D.UNITS, 0)) AS TOTAL_UNITS

    FROM ALL_GROUPS_AND_MONTHS AS M

    LEFT OUTER JOIN #TEST_DATA AS D

    ON M.THE_YEAR = D.[YEAR]

    AND M.THE_MONTH = D.[MONTH]

    AND M.COMPANY = D.COMPANY

    AND M.MARKET = D.MARKET

    AND M.AREAS = D.AREAS

    AND M.SUBTERRITORIES = D.SUBTERRITORIES

    GROUP BY M.MONTH_DATE, M.COMPANY, M.MARKET, M.AREAS, M.SUBTERRITORIES, M.PRODUCT

    ORDER BY M.MONTH_DATE, M.COMPANY, M.MARKET, M.AREAS, M.SUBTERRITORIES, M.PRODUCT;

    DROP TABLE #TEST_DATA;

    It assembles the pieces one by one, and then LEFT JOINs from the "all permutation" grouping to the actual data, and you can customize the timeframe using the DECLAREd starting date and the TOP (12) to as many months as needed. If you need to get down to the weekly level, you'll need to add some more logic to handle identifying which week a given data observation falls into, but I think you get the idea...

    EDIT1: small typo: "identify" to "identifying".

    EDIT2: query correction to include PRODUCT field.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • seismicbeat (10/31/2016)


    Hello,

    Thanks all for all the help, sorry for not providing you the queries and not showing how I compile the data. The data is provided to me in an .abf file (backup file that can be attached to SSAS). I've setup a Linked Server and established a connection between the database and SSAS to get the data with MDX. I've made some research about OPENQUERY and DRILLTHROUGH operators to query the data from SSAS, eventually I came up with the following queries;

    If you're pulling this from an SSAS cube, it could be that your MDX query is only returning non-empty cells. It might be easier to just adjust your MDX query to return all cells. It's been ages since I worked with MDX, and I've never worked with it through a linked server, so I can't be much more help than that.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • David Burrows (10/31/2016)


    Jeff Moden (10/31/2016)


    Pictures are good but they're tough to execute. If you'd like a coded answer, please see the first link in my signature line below.

    😀

    IF OBJECT_ID('TempDB..#testtable','U') IS NOT NULL DROP TABLE #testtable;

    CREATE TABLE #testtable

    (

    COMPANY char(2),

    AREAS char(2),

    PRODUCT char(2),

    MARKET char(2),

    SUBTERRITORIES char(4),

    [YEAR] int,

    [MONTH] int,

    UNITS int

    );

    INSERT#testtable (COMPANY,AREAS,PRODUCT,MARKET,SUBTERRITORIES,YEAR,MONTH,UNITS)

    VALUES('AA','BB','AB','AC','ISO1',2016,8,15),

    ('AA','BB','AB','AC','ISO1',2016,9,22),

    ('AA','BB','AB','AC','ISO2',2016,7,17),

    ('AA','BB','AB','AC','ISO2',2016,8,19),

    ('AA','BB','AB','AC','ISO3',2016,7,36),

    ('AA','BB','AB','AC','ISO3',2016,8,45),

    ('AA','BB','AB','AC','ISO3',2016,9,52),

    ('AA','BB','AB','AC','ISO4',2016,7,6),

    ('AA','BB','AB','AC','ISO4',2016,9,12);

    Using the test data above that David Burrows was kind enough to carve out and assuming that there will be at least 1 sale per month someone where (or the company is going out of business fast!), then no Calendar Table or Tally Table is necessary because ALL of the data is already built into the table. We just need to use it.

    WITH cteNonDateInfo AS

    ( --=== Get the unique non-date info.

    SELECT DISTINCT COMPANY, AREAS, PRODUCT, MARKET, SUBTERRITORIES

    FROM #testtable

    )

    , cteDateInfo AS

    ( --=== Get the unique date info assuming that each

    -- month will have at least 1 sale somewhere.

    SELECT DISTINCT YEAR, MONTH

    FROM #testtable

    ) --=== Put it all back together with a nice little CROSS JOIN.

    SELECT ndi.*, di.*, UNITS = ISNULL(tt.UNITS,0)

    FROM cteNonDateInfo ndi

    CROSS JOIN cteDateInfo di

    LEFT JOIN #testtable tt

    ON ndi.COMPANY = tt.COMPANY

    AND ndi.AREAS = tt.AREAS

    AND ndi.PRODUCT = tt.PRODUCT

    AND ndi.MARKET = tt.MARKET

    AND ndi.SUBTERRITORIES = tt.SUBTERRITORIES

    AND di.YEAR = tt.YEAR

    AND di.MONTH = tt.MONTH

    ;

    ... which returns the following...

    COMPANY AREAS PRODUCT MARKET SUBTERRITORIES YEAR MONTH UNITS

    ------- ----- ------- ------ -------------- ----------- ----------- -----------

    AA BB AB AC ISO1 2016 7 0

    AA BB AB AC ISO2 2016 7 17

    AA BB AB AC ISO3 2016 7 36

    AA BB AB AC ISO4 2016 7 6

    AA BB AB AC ISO1 2016 8 15

    AA BB AB AC ISO2 2016 8 19

    AA BB AB AC ISO3 2016 8 45

    AA BB AB AC ISO4 2016 8 0

    AA BB AB AC ISO1 2016 9 22

    AA BB AB AC ISO2 2016 9 0

    AA BB AB AC ISO3 2016 9 52

    AA BB AB AC ISO4 2016 9 12

    (12 row(s) affected)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply