Dimensional table and Stored Procedure

  • To meet these requirements below, I came up with this SQL. Please evaluate my code if it produces the requirement(s) result.

    A dimensional table and procedure that tracks Primary Accounts’ group movements. An

    account moves to different groups that are associated with levels of commission paid.

    Analysts needs to follow how they moved over time. Accounts can move back and forth

    between groups but can only be assigned to one group within a certain time period. The

    Table needs to contain start and end date. A unique Id needs to be created to link to the

    records associated to that time and group. Procedure runs daily and appends the previous

    days data.

     

    I am particular interested what the amount should be here " select top 9223372036854775807 EndDate

    from AccountCommissionHistory"



    CREATE TABLE CommissionGroups (
    GroupID INT PRIMARY KEY,
    GroupName VARCHAR(255) NOT NULL
    );

    CREATE TABLE AccountCommissionHistory (
    ID INT PRIMARY KEY,
    AccountID INT NOT NULL,
    GroupID INT NOT NULL,
    StartDate DATE NOT NULL,
    EndDate DATE,
    FOREIGN KEY (GroupID) REFERENCES CommissionGroups(GroupID)
    );

    CREATE TABLE Accounts (
    AccountID INT PRIMARY KEY,
    AccountName VARCHAR(255) NOT NULL,
    CurrentGroupID INT,
    FOREIGN KEY (CurrentGroupID) REFERENCES CommissionGroups(GroupID)
    );
    ---
    create procedure AppendAccountCommissionHistory
    as
    begin
    begin
    insert into AccountCommissionHistory (
    ID,
    AccountID,
    GroupID,
    StartDate,
    EndDate
    )
    select
    (
    select (coalesce(
    max(ID),
    0
    ) + 1)
    from AccountCommissionHistory
    ),
    AccountID,
    CurrentGroupID,
    dateadd(dd, -1, current_timestamp),
    case
    when (
    select top 07 EndDate
    from AccountCommissionHistory
    where AccountID = Accounts.AccountID
    order by EndDate desc
    ) = dateadd(dd, -1, current_timestamp) then null
    else current_timestamp
    end
    from Accounts
    where (
    select top 807 EndDate
    from AccountCommissionHistory
    where AccountID = Accounts.AccountID
    order by EndDate desc
    ) <> dateadd(dd, -1, current_timestamp);
    end;
    end;;
  • I've added comments to your code below.  Some of the code is invalid and/or doesn't make sense.

    yrstruly wrote:

    create procedure AppendAccountCommissionHistory
    as
    begin
    begin
    insert into AccountCommissionHistory (
    ...
    (
    select (coalesce(
    max(ID),
    0
    ) + 1) --<<--(1) this will not work
    from AccountCommissionHistory
    ),
    AccountID,
    CurrentGroupID,
    dateadd(dd, -1, current_timestamp),
    case
    when (
    select top 07 EndDate
    from AccountCommissionHistory
    where AccountID = Accounts.AccountID
    order by EndDate desc
    ) = dateadd(dd, -1, current_timestamp) then null --<<-- this is not a valid comparison
    else current_timestamp
    end
    from Accounts
    where (
    select top 807 EndDate --<<-- no idea what this is trying to do
    from AccountCommissionHistory
    where AccountID = Accounts.AccountID
    order by EndDate desc
    ) <> dateadd(dd, -1, current_timestamp);
    end;
    end;;

     

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • I would like to select the latest date. I tried this:

    cast

  • Are you trying to make a slowly changing AccountGroup dimension table?

    It might as simple as inserting the current data, then every day you expire changed rows and insert new rows. You can use a merge or upsert.  If you insert current rows with a known default end date, or even a null end date you not need to use a subquery to identify current rows, because the end date tells you which rows are current.

    You appear to be trying to create a value for the ID column in AccountHistory table. Your code will insert the same value for every row which will violate the primary key. I would use a column default (identity) to create the id, but you could always use a GUID with a default constraint.  I don't know if guids are appropriate for a nominal scale, but I'd use a number anyway.

    -- Initialise Table
    DECLARE @EffectiveDate DATE = '2023-01-01',
    @FutureEndDate DATE = '9999-12-31'

    INSERT dbo.AccountCommissionHistory (AccountID, GroupID, StartDate, EndDate)
    SELECT a.AccountID, a.CurrentGroupID, @EffectiveDate, @FutureEndDate
    FROM dbo.Accounts AS A
    GO


    --NEXT DAY AND EVERY DAY
    DECLARE @EffectiveDate DATE = '2023-02-02',
    @FutureEndDate DATE = '9999-12-31'

    --Expire current rows not in source
    UPDATE a
    SET a.EndDate = DATEADD(DAY, -1, @EffectiveDate)
    FROM dbo.AccountCommissionHistory AS a
    WHERE a.EndDate = @FutureEndDate
    AND NOT EXISTS (SELECT 1
    FROM dbo.Accounts AS b
    WHERE b.AccountID = a.AccountID
    AND b.CurrentGroupID = a.GroupID)

    -- Insert new rows
    INSERT dbo.AccountCommissionHistory (AccountID, GroupID, StartDate, EndDate)
    SELECT a.AccountID, a.CurrentGroupID, @EffectiveDate, @FutureEndDate
    FROM dbo.Accounts AS a
    WHERE NOT EXISTS (SELECT 1
    FROM dbo.AccountCommissionHistory AS b
    WHERE b.AccountID = a.AccountID
    AND b.GroupID = a.CurrentGroupID
    AND b.EndDate = @FutureEndDate)
  • Thank you @Ed B. Yes I am trying to create a slowly changing AccountGroup dimension table. I ran your code and it worked without any errors.

    I am also creating a  daily aggregated table or data view along with procedure that tracts the opening and closing

    balances of accounts along with debited and credited amounts. The data view needs to be

    able to roll up to a daily level that will be used to track the total opening and closing balance

    of ALL accounts. The accounts should have an end-of-day balance every day from their first

    transaction, even if they didn’t transact that day. Procedure runs daily and appends the

    previous days data.

    I came up with the SQL and would like your opinion, can there be any improvement/suggestions?

     

    CREATE VIEW daily_aggregated_view AS
    SELECT
    primary_account_id,
    Transaction_Date,
    SUM(CASE WHEN transaction_type.transaction_code = 1 THEN transaction_amount ELSE 0 END) AS debit_amount,
    SUM(CASE WHEN transaction_type.transaction_code = 2 THEN transaction_amount ELSE 0 END) AS credit_amount,
    SUM(CASE WHEN transaction_type.transaction_code = 1 THEN transaction_amount ELSE 0 END -
    CASE WHEN transaction_type.transaction_code = 2 THEN transaction_amount ELSE 0 END)
    OVER (PARTITION BY primary_account_id ORDER BY Transaction_Date) AS opening_balance,
    SUM(CASE WHEN transaction_type.transaction_code = 1 THEN transaction_amount ELSE 0 END -
    CASE WHEN transaction_type.transaction_code = 2 THEN transaction_amount ELSE 0 END) +
    SUM(transaction_amount)
    OVER (PARTITION BY primary_account_id ORDER BY Transaction_Date) AS closing_balance
    FROM
    Transactions
    INNER JOIN transaction_type ON Transactions.primary_type = transaction_type.transaction_code
    GROUP BY
    primary_account_id,
    Transaction_Date;









    --------------------------
    CREATE PROCEDURE daily_aggregate_procedure AS
    BEGIN
    INSERT INTO daily_aggregated_view






    SELECT


    primary_account_id,
    Transaction_Date,
    SUM(CASE WHEN transaction_type.transaction_code = 1 THEN transaction_amount ELSE 0 END) AS debit_amount,
    SUM(CASE WHEN transaction_type.transaction_code = 2 THEN transaction_amount ELSE 0 END) AS credit_amount,
    SUM(CASE WHEN transaction_type.transaction_code = 1 THEN transaction_amount ELSE 0 END -
    CASE WHEN transaction_type.transaction_code = 2 THEN transaction_amount ELSE 0 END)
    OVER (PARTITION BY primary_account_id ORDER BY Transaction_Date) AS opening_balance,
    SUM(CASE WHEN transaction_type.transaction_code = 1 THEN transaction_amount ELSE 0 END -
    CASE WHEN transaction_type.transaction_code = 2 THEN transaction_amount ELSE 0 END) +
    SUM(transaction_amount)
    OVER (PARTITION BY primary_account_id ORDER BY Transaction_Date) AS closing_balance
    FROM
    Transactions
    INNER JOIN transaction_type ON Transactions.primary_type = transaction_type.transaction_code
    WHERE
    Transaction_Date = CAST(GETDATE() AS DATE)
    GROUP BY
    primary_account_id,
    Transaction_Date, transaction_code,transaction_amount;
    END;

    • This reply was modified 1 year, 11 months ago by  yrstruly.
  • I think you might have missed one of the requirements. It states there should be a snapshot for every account for every date even on dates an account has no transactions. I think that by grouping by account and date you will miss dates on which accounts have no transactions.

    I might start by creating a table with one row per account per date since the account's initial transaction, with daily credit and debit columns. The code that initialises the table will be different from the daily stored procedure.  Once the table is established, each day you would insert a row for every account, including those with no transactions that day.

    A view on top of the table could calculate the opening and closing balance for each account on each day.  The code to initialise the daily snapshot table might look something like this. I haven't used running totals logic much, so I don't know if the example is correct, but it might be a decent direction. This is an example, I'm not suggesting these are the only columns and correct data types and clustered index etc. Without data I can't tell if it works either.


    DROP TABLE IF EXISTS dbo.account_daily_snapshot
    CREATE TABLE dbo.account_daily_snapshot
    ( primary_account_id INT NOT NULL,
    Transaction_Date DATE NOT NULL,
    debit_amount DECIMAL(18,2),
    credit_amount DECIMAL(18,2),
    CONSTRAINT PKaccount_daily_snapshot PRIMARY KEY CLUSTERED (primary_account_id, Transaction_Date)
    )

    INSERT dbo.account_daily_snapshot (primary_account_id, Transaction_Date, debit_amount, credit_amount)
    SELECT a.primary_account_id, b.Transaction_Date,
    ISNULL(c.debit_amount,0) AS debit_amount,
    ISNULL(c.credit_amount,0) AS credit_amount
    FROM
    (
    SELECT primary_account_id, MIN(Transaction_Date) AS MinDate
    FROM dbo.Transactions
    GROUP BY primary_account_id
    ) AS a
    JOIN (
    SELECT DISTINCT Transaction_Date
    FROM dbo.Transactions
    ) AS b ON b.Transaction_Date >= a.MinDate
    LEFT OUTER JOIN
    (
    SELECT x.primary_account_id, x.Transaction_Date,
    SUM(CASE WHEN y.transaction_code = 1 THEN x.transaction_amount ELSE 0 END) AS debit_amount,
    SUM(CASE WHEN y.transaction_code = 2 THEN x.transaction_amount ELSE 0 END) AS credit_amount
    FROM dbo.Transactions AS x
    JOIN dbo.transaction_type AS y ON x.primary_type = y.transaction_code
    GROUP BY primary_account_id, Transaction_Date

    ) AS c ON a.primary_account_id = c.primary_account_id AND b.Transaction_Date = c.Transaction_Date

    SELECT primary_account_id,
    Transaction_Date,
    SUM(credit_amount + debit_amount) OVER
    ( PARTITION BY primary_account_id ORDER BY Transaction_Date
    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS opening_balance,
    SUM(credit_amount + debit_amount) OVER
    ( PARTITION BY primary_account_id ORDER BY Transaction_Date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS closing_balance
    FROM dbo.account_daily_snapshot

     

  • Thank you.  I attached some sample data and a data model.

    This is what i got:

    cast

    Attachments:
    You must be logged in to view attached files.
  • I suggest you provide sample data in the form table create and insert statements. I would simplify the tables to include only the relevant columns and make up the data. The example has missing dates because it's based on a single account. In order to get a row for all dates, the transactions table must have at least on row every day. This seems quite plausible, but if it doesn't, then identify the min and max dates and create a series of dates between them.

  • SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE TABLE [dbo].[transaction_type](
    [transaction_type] [nvarchar](255) NULL,
    [transaction_code] [float] NULL
    ) ON [PRIMARY]
    GO

    ---------
    /****** Object: Table [dbo].[Transactions] Script Date: 2023/02/03 23:25:36 ******/
    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE TABLE [dbo].[Transactions](
    [transaction_id] [bigint] NOT NULL,
    [Transaction_Date] [date] NULL,
    [primary_account_id] [int] NULL,
    [parent_account_id] [int] NULL,
    [provider_account_id] [int] NULL,
    [icp_account_id] [int] NULL,
    [group_id] [int] NULL,
    [primary_type] [int] NULL,
    [transaction_amount] [decimal](18, 2) NULL,
    [stock_cost] [decimal](18, 2) NULL,
    [vat] [decimal](18, 2) NULL,
    [superdealer_vat_reg] [decimal](18, 2) NULL,
    [vendor_vat_reg] [int] NULL,
    [vendor_comm] [decimal](18, 2) NULL,
    [superdealer_comm] [decimal](18, 2) NULL,
    [icp_comm] [decimal](18, 2) NULL,
    [psitek_comm] [decimal](18, 2) NULL,
    [vendor_rbc_cost] [decimal](18, 2) NULL,
    [bank_charge] [decimal](18, 2) NULL,
    [bank_charge_owner] [decimal](18, 2) NULL,
    [detail_type_name] [varchar](50) NULL,
    [detail_object_id] [bigint] NULL,
    [primary_account_balance] [decimal](18, 2) NULL,
    [content_type] [int] NULL,
    [reversal_id] [bigint] NULL,
    PRIMARY KEY CLUSTERED
    (
    [transaction_id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
    ) ON [PRIMARY]
    GO



    ----------


    INSERT INTO [dbo].[Transactions]
    ([transaction_id]
    ,[Transaction_Date]
    ,[primary_account_id]
    ,[parent_account_id]
    ,[provider_account_id]
    ,[icp_account_id]
    ,[group_id]
    ,[primary_type]
    ,[transaction_amount]
    ,[stock_cost]
    ,[vat]
    ,[superdealer_vat_reg]
    ,[vendor_vat_reg]
    ,[vendor_comm]
    ,[superdealer_comm]
    ,[icp_comm]
    ,[psitek_comm]
    ,[vendor_rbc_cost]
    ,[bank_charge]
    ,[bank_charge_owner]
    ,[detail_type_name]
    ,[detail_object_id]
    ,[primary_account_balance]
    ,[content_type]
    ,[reversal_id])
    VALUES
    (<transaction_id, bigint,>
    ,<Transaction_Date, date,>
    ,<primary_account_id, int,>
    ,<parent_account_id, int,>
    ,<provider_account_id, int,>
    ,<icp_account_id, int,>
    ,<group_id, int,>
    ,<primary_type, int,>
    ,<transaction_amount, decimal(18,2),>
    ,<stock_cost, decimal(18,2),>
    ,<vat, decimal(18,2),>
    ,<superdealer_vat_reg, decimal(18,2),>
    ,<vendor_vat_reg, int,>
    ,<vendor_comm, decimal(18,2),>
    ,<superdealer_comm, decimal(18,2),>
    ,<icp_comm, decimal(18,2),>
    ,<psitek_comm, decimal(18,2),>
    ,<vendor_rbc_cost, decimal(18,2),>
    ,<bank_charge, decimal(18,2),>
    ,<bank_charge_owner, decimal(18,2),>
    ,<detail_type_name, varchar(50),>
    ,<detail_object_id, bigint,>
    ,<primary_account_balance, decimal(18,2),>
    ,<content_type, int,>
    ,<reversal_id, bigint,>)
    GO


    ------------

    INSERT INTO [dbo].[transaction_type]
    ([transaction_type]
    ,[transaction_code])
    VALUES
    (<transaction_type, nvarchar(255),>
    ,<transaction_code, float,>)
    GO


    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE TABLE [dbo].[Accounts](
    [AccountID] [int] NOT NULL,
    [AccountName] [varchar](255) NOT NULL,
    [CurrentGroupID] [int] NULL,
    PRIMARY KEY CLUSTERED
    (
    [AccountID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    ALTER TABLE [dbo].[Accounts] WITH CHECK ADD FOREIGN KEY([CurrentGroupID])
    REFERENCES [dbo].[CommissionGroups] ([GroupID])
    GO



    -----------------
    INSERT INTO [dbo].[Accounts]
    ([AccountID]
    ,[AccountName]
    ,[CurrentGroupID])
    VALUES
    (<AccountID, int,>
    ,<AccountName, varchar(255),>
    ,<CurrentGroupID, int,>)
    GO


  • We appreciate the DDL.

    Please provide insert scripts with sample data that we can develop & test with, not just bracketed insert templates that fail with errors when executed.

  • Please explain how do i retrieve that?

  • See bac file. https://drive.google.com/file/d/136YZirJHEzhPGomzRc4iLXSLU-w_KI-O/view?usp=share_link

    • This reply was modified 1 year, 11 months ago by  yrstruly.
  • @Ed B Have you looked at my data yet?

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply