SQL Streaks

  • Hi I wish to calculate several flags for the streaks in SQL

    I have added the tables and the desired table

    I hope someone could help

    kind regards

    M

    create table DataTable

    (

    [Site] varchar (5)

    ,[C_Date] date

    ,Series int

    ,LongRun int

    ,Flag Varchar(10)

    )

    Insert into DataTable ( Site,C_Date,Series,LongRun,Flag)

    Values ('A','01-01-2016',1,1,'NULL'),

    ('A','01-02-2016',2,1,'NULL'),

    ('A','01-02-2016',3,1,'NULL'),

    ('A','01-02-2016',NULL,0,'NULL'),

    ('A','01-02-2016',1,1,'NULL'),

    ('A','01-02-2016',2,1,'NULL'),

    ('A','01-02-2016',3,1,'NULL'),

    ('A','01-02-2016',4,1,'NULL'),

    ('A','01-02-2016',5,1,'NULL'),

    ('A','01-02-2016',6,1,'NULL'),

    ('A','01-02-2016',7,1,'LongRunEnd'),

    ('B','01-01-2016',1,1,'NULL'),

    ('B','01-02-2016',2,1,'NULL'),

    ('B','01-02-2016',3,1,'NULL'),

    ('B','01-02-2016',NULL,0,'NULL'),

    ('B','01-02-2016',NULL,0,'NULL'),

    ('B','01-02-2016',NULL,0,'NULL'),

    ('B','01-02-2016',NULL,0,'NULL'),

    ('B','01-02-2016',NULL,0,'NULL'),

    ('B','01-02-2016',NULL,0,'NULL'),

    ('B','01-02-2016',NULL,0,'NULL'),

    ('B','01-02-2016',NULL,0,'NULL')

    Select * from DataTable

    create table DesiredResult

    (

    [Site] varchar (5)

    ,[C_Date] date

    ,Series int

    ,LongRun int

    ,Flag Varchar(20)

    ,Flag2 Varchar(20)

    ,Flag3 Varchar(20)

    )

    Insert into DesiredResult ( Site,C_Date,Series,LongRun,Flag,Flag2,Flag3)

    Values ('A','01-01-2016',1,1,'NULL','NULL','NULL'),

    ('A','01-02-2016',2,1,'NULL','NULL','NULL'),

    ('A','01-02-2016',3,1,'NULL','NULL','NULL'),

    ('A','01-02-2016',NULL,0,'NULL','NULL','NULL'),

    ('A','01-02-2016',1,1,'LongRunStart','LongRun','ReCalc'),

    ('A','01-02-2016',2,1,'NULL','LongRun','ReCalc'),

    ('A','01-02-2016',3,1,'NULL','LongRun','ReCalc'),

    ('A','01-02-2016',4,1,'NULL','LongRun','ReCalc'),

    ('A','01-02-2016',5,1,'NULL','LongRun','ReCalc'),

    ('A','01-02-2016',6,1,'NULL','LongRun','NULL'),

    ('A','01-02-2016',7,1,'LongRunEnd','LongRun','NULL'),

    ('B','01-01-2016',1,1,'NULL','NULL','NULL'),

    ('B','01-02-2016',2,1,'NULL','NULL','NULL'),

    ('B','01-02-2016',3,1,'NULL','NULL','NULL'),

    ('B','01-02-2016',NULL,0,'NULL','NULL','NULL'),

    ('B','01-02-2016',NULL,0,'NULL','NULL','NULL'),

    ('B','01-02-2016',NULL,0,'NULL','NULL','NULL'),

    ('B','01-02-2016',NULL,0,'NULL','NULL','NULL'),

    ('B','01-02-2016',NULL,0,'NULL','NULL','NULL'),

    ('B','01-02-2016',NULL,0,'NULL','NULL','NULL'),

    ('B','01-02-2016',NULL,0,'NULL','NULL','NULL'),

    ('B','01-02-2016',NULL,0,'NULL','NULL','NULL')

    Select * from DesiredResult

    Drop table DataTable

    Drop Table DesiredResult


     

  • Think we need some more logic here. Perhaps you could explain how you get from your sample data to your desired output. What is 'ReCalc', for example, and how it that calculated?

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Hi thank you for your reply,

    they are just textual flags 

    I am calculating times where there are 7 or more 1s in the long run column, this will tell me that there has been a streak of 1's and I  now need to recalculate the central line based on the 1st 5 values (recCalc)
    I hope this makes sense 🙂

    Kind Regards

  • please confirm what version of SQL you are using.....in some of your other posts IIRC you said you were using 2008??
    This is a 2016 forum

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • J Livingston SQL - Tuesday, July 4, 2017 7:40 AM

    please confirm what version of SQL you are using.....in some of your other posts IIRC you said you were using 2008??
    This is a 2016 forum

    Hi, I am using 2016 however the databases are old and only have up to 100 compatibility therefore some functionality such as Lead/Lag wont work for me as the databases are too old. 

    Hope this explains
    Thank you for your reply

  • michelle.mabbs - Tuesday, July 4, 2017 7:44 AM

    J Livingston SQL - Tuesday, July 4, 2017 7:40 AM

    please confirm what version of SQL you are using.....in some of your other posts IIRC you said you were using 2008??
    This is a 2016 forum

    Hi, I am using 2016 however the databases are old and only have up to 100 compatibility therefore some functionality such as Lead/Lag wont work for me as the databases are too old. 

    Hope this explains
    Thank you for your reply

    As you've previously been told, the version of SSMS you are using is not relevant. At all. It's the version of SQL Server installed on the server and, if different, the compatibility mode of the database. Please post in the correct forum in future.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin - Tuesday, July 4, 2017 7:51 AM

    michelle.mabbs - Tuesday, July 4, 2017 7:44 AM

    J Livingston SQL - Tuesday, July 4, 2017 7:40 AM

    please confirm what version of SQL you are using.....in some of your other posts IIRC you said you were using 2008??
    This is a 2016 forum

    Hi, I am using 2016 however the databases are old and only have up to 100 compatibility therefore some functionality such as Lead/Lag wont work for me as the databases are too old. 

    Hope this explains
    Thank you for your reply

    As you've previously been told, the version of SSMS you are using is not relevant. At all. It's the version of SQL Server installed on the server and, if different, the compatibility mode of the database. Please post in the correct forum in future.

    I do apologise I must have missed the point, I will ensure that I post in the 2008 in future

  • michelle.mabbs - Tuesday, July 4, 2017 7:44 AM

    J Livingston SQL - Tuesday, July 4, 2017 7:40 AM

    please confirm what version of SQL you are using.....in some of your other posts IIRC you said you were using 2008??
    This is a 2016 forum

    Hi, I am using 2016 however the databases are old and only have up to 100 compatibility therefore some functionality such as Lead/Lag wont work for me as the databases are too old. 

    Hope this explains
    Thank you for your reply

    Then I would suggest (in the future) posting in the 2008 forums. If you're posting in the 2016 forums it is assumed you are using SQL Server 2016 and therefore can use 2016 compliant SQL.

    Just looking at your data, I noticed that there are several entries for Site A, C_Date 2016-02-01. How do you know which applies to which? There are no "run" ID's. Simply stating "the order I inserting them into the table" is not a valid answer as when doing a SELECT data from a table, the ordering is not guarenteed without an ORDER BY clause. Thus SELECT * FROM DataTable could have the first row with the value as 'B' for Site column.

    Edit: For example, in your sample data, rows 1-3 are identical to rows 5-7. How can we separate these?

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • HI, the ID is simply a row number partitioned by the Site and Ordered by the C_Date column

    Kr
    M

    I have revised the desired table and added in an ID column


    create table DesiredResult

    (

    [Site] varchar (5)

    ,[C_Date] date

    ,ID int

    ,Series int

    ,LongRun int

    ,Flag Varchar(20)

    ,Flag2 Varchar(20)

    ,Flag3 Varchar(20)

    )

    Insert into DesiredResult ( Site,C_Date,ID,Series,LongRun,Flag,Flag2,Flag3)

    Values ('A','01-01-2016',1,1,1,'NULL','NULL','NULL'),

    ('A','01-02-2016',2,2,1,'NULL','NULL','NULL'),

    ('A','01-03-2016',3,3,1,'NULL','NULL','NULL'),

    ('A','01-04-2016',4,NULL,0,'NULL','NULL','NULL'),

    ('A','01-05-2016',5,1,1,'LongRunStart','LongRun','ReCalc'),

    ('A','01-06-2016',6,5,1,'NULL','LongRun','ReCalc'),

    ('A','01-07-2016',7,3,1,'NULL','LongRun','ReCalc'),

    ('A','01-08-2016',8,4,1,'NULL','LongRun','ReCalc'),

    ('A','01-09-2016',9,5,1,'NULL','LongRun','ReCalc'),

    ('A','01-10-2016',10,6,1,'NULL','LongRun','NULL'),

    ('A','01-11-2016',11,7,1,'LongRunEnd','LongRun','NULL'),

    ('B','01-01-2016',1,1,1,'NULL','NULL','NULL'),

    ('B','01-02-2016',2,2,1,'NULL','NULL','NULL'),

    ('B','01-03-2016',3,3,1,'NULL','NULL','NULL'),

    ('B','01-04-2016',4,NULL,0,'NULL','NULL','NULL'),

    ('B','01-05-2016',5,NULL,0,'NULL','NULL','NULL'),

    ('B','01-06-2016',6,NULL,0,'NULL','NULL','NULL'),

    ('B','01-07-2016',7,NULL,0,'NULL','NULL','NULL'),

    ('B','01-08-2016',8,NULL,0,'NULL','NULL','NULL'),

    ('B','01-09-2016',9,NULL,0,'NULL','NULL','NULL'),

    ('B','01-10-2016',10,NULL,0,'NULL','NULL','NULL'),

    ('B','01-11-2016',11,NULL,0,'NULL','NULL','NULL')

    Select * from DesiredResult

  • michelle.mabbs - Tuesday, July 4, 2017 8:11 AM

    HI, the ID is simply a row number partitioned by the Site and Ordered by the C_Date column

    Kr
    M

    I have revised the desired table and added in an ID column

    So what is the ID for 2016-02-01, Site A? (Consider that you use no other columns in your above statement).

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Thom A - Tuesday, July 4, 2017 8:16 AM

    michelle.mabbs - Tuesday, July 4, 2017 8:11 AM

    HI, the ID is simply a row number partitioned by the Site and Ordered by the C_Date column

    Kr
    M

    I have revised the desired table and added in an ID column

    So what is the ID for 2016-02-01, Site A?

    Hi it is 2 , I just created a row number for all of the sites and months

  • Not totally clear what you want but see if this helps

    WITH OrderedSrc AS (
    SELECT Site,C_Date,ID,Series,LongRun,Flag,
       ROW_NUMBER() OVER(PARTITION BY Site ORDER BY ID) -
            ROW_NUMBER() OVER(PARTITION BY Site,LongRun ORDER BY ID) AS grp
    FROM DataTable
    ),
    Grps AS (
    SELECT Site,C_Date,ID,Series,LongRun,Flag,grp,
       ROW_NUMBER() OVER(PARTITION BY Site,LongRun,grp ORDER BY ID) AS rnFwd,
       ROW_NUMBER() OVER(PARTITION BY Site,LongRun,grp ORDER BY ID DESC) AS rnRev
    FROM OrderedSrc)
    SELECT ID,Site,C_Date,Series,LongRun,
         CASE WHEN LongRun = 1 AND rnFwd + rnRev - 1 >= 7 AND rnFwd = 1 THEN 'LongRunStart'
           WHEN LongRun = 1 AND rnFwd + rnRev - 1 >= 7 AND rnRev = 1 THEN 'LongRunEnd' END AS Flag,
         CASE WHEN LongRun = 1 AND rnFwd + rnRev - 1 >= 7 THEN 'LongRun' END AS Flag2,
         CASE WHEN LongRun = 1 AND rnFwd + rnRev - 1 >= 7 AND rnFwd <= 5 THEN 'ReCalc' END AS Flag3
    FROM Grps
    ORDER BY Site,ID;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • That's in your Desired result, not your actual data. For example, is the below correct?
    create table DataTable
      ([Site] varchar(5),
      [C_Date] date,
      Series int,
      ID int,
      LongRun int,
      Flag Varchar(10))
    GO
    INSERT INTO DataTable
    VALUES
      ('A', '20160101', 1,1,1,'NULL'),
      ('A', '20160201', NULL,2,0,'NULL'),
      ('A', '20160201', 1,3,1,'NULL'),
      ('A', '20160201', 2,4,1,'NULL'),
      ('A', '20160201', 2,5,1,'NULL'),
      ('A', '20160201', 3,6,1,'NULL'),
      ('A', '20160201', 3,7,1,'NULL'),
      ('A', '20160201', 4,8,1,'NULL'),
      ('A', '20160201', 5,9,1,'NULL'),
      ('A', '20160201', 6,10,1,'NULL'),
      ('A', '20160201', 7,11,1,'LongRunEnd'),
      ('B', '20160101', 1,1,1,'NULL'),
      ('B', '20160201', NULL,2,0,'NULL'),
      ('B', '20160201', NULL,3,0,'NULL'),
      ('B', '20160201', NULL,4,0,'NULL'),
      ('B', '20160201', NULL,5,0,'NULL'),
      ('B', '20160201', NULL,6,0,'NULL'),
      ('B', '20160201', NULL,7,0,'NULL'),
      ('B', '20160201', NULL,8,0,'NULL'),
      ('B', '20160201', NULL,9,0,'NULL'),
      ('B', '20160201', 2,10,1,'NULL'),
      ('B', '20160201', 3,11,1,'NULL');
    GO

    SELECT *
    FROM DataTable;
    GO

    DROP TABLE DataTable;
    GO

    This is with ID order by C_date and Series. So the run with NULL as the series is now ID 2, and then you have Series 2 twice at ID 4 and 5.

    I assume not, but without actual reliably ordered data, then you can't achieve what you want.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • michelle.mabbs - Tuesday, July 4, 2017 6:56 AM

    This looks so fundamentally wrong that it bothers me. We don’t use flagsin SQL; that was assembly language programming. The concept of a run is in a totally different level of aggregation from the rows in atable. Would you confuse a forest with its individual trees?it’s ahuge conceptual difference.

    The runs ought to be in a VIEW or a query derived from the basic information you have, not put in every row. You have more nulls in one table than I have in an entire schema. First, ask yourself what is the minimum amount of data you have on a single entity. These will be the attributes it looks like whatever this stuff is (you really didn't do good job of naming things) involve something called the site. Now, of course, your table has to have a key. This is by definition, and yet you missed it. I’m going to make a guess that your data probably looks something like this..

    I hope you know there is no such crap as a generic universal “idâ€in RDBMS.

    CREATE TABLE Foobar

    (site_name VARCHAR(5) NOT (site_nameVARCHAR(5) NOT NUNULL. foo_date DATE DEFAULT CURRENT_DATE NOT NULL,foo_dateDATE DEFAULT CURRENT_DATE NOT NULL, foo_seq INTEGER NOT NULL,foo_seq INTEGER NOT NULL,  PRIMARY KEY (site_name, foo_date, foo_seq));PRIMARY KEY (site_name, foo_date, foo_seq));

    INSERT INTO Foobar

    VALUES

    ('A','2016-01-01', 1),('A','2016-02-01', 1),

    ('A','2016-02-01', 2),'A','2016-02-01', 3),

    ('A','2016-02-01', 4),('A','2016-02-01', 5),

    ('A','2016-02-01', 6),('A','2016-02-01', 7),

    ('A','2016-02-01', 8),','2016-02-01', 9),

    ('A','2016-02-01', 10),

    ('B','2016-01-01', 1),.

    ('B','2016-02-01', 1),

    ('B','2016-02-01', 2),

    ('B','2016-02-01', 3),

    ('B','2016-02-01', 4),

    ('B','2016-02-01', 5),

    ('B','2016-02-01', 6),

    ('B','2016-02-01', 7),

    ('B','2016-02-01', 8),

    ('B','2016-02-01', 9),

    ('B','2016-02-01', 10);

    >>This is with ID order by C_date and Series. So the run with NULL as the series is now ID 2, and then you have Series 2 twice at ID 4 and5. <<

    This seems to make no sense to me. We have a create sequence statement inSQL and as you touch each site, you can give it another sequence number within the database engine itself.

    Your problem is now a simple version of the “islands and gaps†idiomin SQL. We will of course never store this information in the table because it’s a different level of aggregation. It will be a few thestart and end of each of your runs.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 - Wednesday, July 5, 2017 2:01 PM

    This looks so fundamentally wrong that it bothers me. We don’t use flagsin SQL; that was assembly language programming. The concept of a run is in a totally different level of aggregation from the rows in atable. Would you confuse a forest with its individual trees?it’s ahuge conceptual difference.

    CREATE TABLE Foobar

    (site_name VARCHAR(5) NOT (site_nameVARCHAR(5) NOT NUNULL. foo_date DATE DEFAULT CURRENT_DATE NOT NULL,foo_dateDATE DEFAULT CURRENT_DATE NOT NULL, foo_seq INTEGER NOT NULL,foo_seq INTEGER NOT NULL,  PRIMARY KEY (site_name, foo_date, foo_seq));PRIMARY KEY (site_name, foo_date, foo_seq));
    INSERT INTO Foobar

    Don't worry...so does the SQL.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply