Code to sum hours and minutes (varchar), 800,000 rows.

  • What is the code to sum this row of data (trip duration in hours and minutes, and about 800,000 rows). Characters are varchar, and I have tried casting based on  a couple of codes I saw here but none worked. Please help.

    Screenshot (24)

  • Did you try this ?

    Declare   @tb table ( varcharcol varchar(10) not null )

    insert into @tb
    values ( '23:01'),( '00:01'),( '00:02'),( '00:03'),( '22:00')

    ;with cteTimes as (
    Select varcharcol, convert(time(0), varcharcol+':00', 108) Timecol
    from @tb
    )
    Select *
    , datediff(hh, cast('00:00:00' as time(0)), Timecol) * 60 + datepart(mi, timecol) as InMinutes
    , sum (datediff(hh, cast('00:00:00' as time(0)), Timecol) * 60 + datepart(mi, timecol) ) over ( partition by 1 ) TotalMinutes
    , dateadd(mi, sum (datediff(hh, cast('00:00:00' as time(0)), Timecol) * 60 + datepart(mi, timecol) ) over ( partition by 1 ) , convert(datetime2(0),'00:00:00',108)) Total_19000101
    from cteTimes

    Just to get you on track ...

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Why are you using a varcharto store the information? Can you have values that are 24 hours or longer? If not, a time(0) would seem more appropriate. If you can, then a numerical data type, like an int that stores the number of tickets (perhaps minutes, based on your image?) would be a better choice. This would avoid any unnecessary conversions and string manipulation to get something simple like an aggregate of the time.

    If you can change the design, I would suggest that you consider it sooner rather than later.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • DROP TABLE IF EXISTS #data;
    CREATE TABLE #data ( trip_duration varchar(30) NOT NULL );
    INSERT INTO #data VALUES
    ('23:01'), ('00:01'), ('05:15'), ('00:45'), ('00:11');

    SELECT
    SUM(DATEDIFF(MINUTE, 0, CAST(trip_duration + ':00' AS time(0)))) / 60 AS total_hours,
    SUM(DATEDIFF(MINUTE, 0, CAST(trip_duration + ':00' AS time(0)))) % 60 AS total_minutes
    FROM #data

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Teewhy wrote:

    What is the code to sum this row of data (trip duration in hours and minutes, and about 800,000 rows). Characters are varchar, and I have tried casting based on  a couple of codes I saw here but none worked. Please help.

    Screenshot (24)

    WHAT is the datatype for that column, please? And can the hours part of that data be larger than 23???

    Edit#1... sorry... missed your indication of that in the original post.  We still need to know if the hours portion can be larger than 23, please.

    Edit#2... Also, what format do you want the answer in??? Decimal hours, total minutes, HHHHHH:MI, or ????

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • DECLARE @sum INT;

    SELECT @sum = 60 * SUM(CAST(PARSENAME(REPLACE(x, ':', '.'), 2) AS INT)) + SUM(CAST(PARSENAME(REPLACE(x, ':', '.'), 1) AS INT))
    FROM (
    VALUES ('23:01'),
    ('00:01'),
    ('05:15'),
    ('00:45'),
    ('00:11')
    ) AS src(x);

    SELECT CONCAT(@sum / 60, ':', @sum % 60);

    N 56°04'39.16"
    E 12°55'05.25"

  • SwePeso wrote:

    DECLARE @sum INT;

    SELECT @sum = 60 * SUM(CAST(PARSENAME(REPLACE(x, ':', '.'), 2) AS INT)) + SUM(CAST(PARSENAME(REPLACE(x, ':', '.'), 1) AS INT))
    FROM (
    VALUES ('23:01'),
    ('00:01'),
    ('05:15'),
    ('00:45'),
    ('00:11')
    ) AS src(x);

    SELECT CONCAT(@sum / 60, ':', @sum % 60);

    You can't add times that way, since decimals will roll over at 99/100 but minutes/hours need to roll over at 59/60.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Are you sure? I am summing at the lowest level, converting hours to minutes.

    Then I am reverting the sum back to hours and minutes (for the sample data 29 hours and 13 minutes).

    • This reply was modified 1 year, 9 months ago by  SwePeso.
    • This reply was modified 1 year, 9 months ago by  SwePeso.

    N 56°04'39.16"
    E 12°55'05.25"

  • It is over  800,000 rows of data. So upon summation, the hours would be expected to run into thousands of hour.

    Secondly, the data was imported from excel in hh:mm format but it is coming up in sql as varchar.

    I want a very simple solution, but the responses so far are quite complex.

    Would appreciate a simple fix.

  •  

    It is over  800,000 rows of data. So upon summation, the hours would be expected to run into thousands of hour.s

    Secondly, the data was imported from excel in hh:mm format but it is coming up in sql as varchar.

    I have tried casting it as int in sql to get around the varchar issue but it is coming up as error.

    I want a very simple solution, but the responses so far are quite complex.

    Would appreciate a simple fix.

  • Teewhy wrote:

    Secondly, the data was imported from excel in hh:mm format but it is coming up in sql as varchar.

    Understood... Such a statement generally means that no value will be greater than 23:59 but I'm not one to make that assumption.  I need a straight up yes or no answer.  Will the original values ever by larger than 23:59?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • yes

  • Then the derivative code from SwePeso's post will correctly return the total number of decimal hours for you.  You only think his code was complex because he included test data in his FROM clause instead of the table that you did not provide the name of.

     SELECT TotalHours = (SUM(CAST(PARSENAME(REPLACE(Trip_Duration,':','.'),2) AS INT))*60 
    +SUM(CAST(PARSENAME(REPLACE(Trip_Duration,':','.'),1) AS INT)))/60.0
    FROM dbo.PutYourTableNameHere
    ;

    PARSENAME is being used as a text splitter here.  REPLACE changes the ":" to "." so that PARSENAME can do the split.

    It can be "DRY"ed out a bit using a CROSS APPLY, thusly...

     SELECT TotalHours = (SUM(CAST(PARSENAME(ca.ReadyForSplit,2) AS INT))*60 
    +SUM(CAST(PARSENAME(ca.ReadyForSplit,1) AS INT)))/60.0
    FROM dbo.PutYourTableNameHere
    CROSS APPLY (VALUES(REPLACE(Trip_Duration,':','.')))ca(ReadyForSplit)
    ;

    Some will tell you that you could use STRING_SPLIT() for this but you're using SQL Server 2019 and there is no guarantee there as to which order the elements returned from the split will be in.  You could also brute force a split using CHARINDEX() but that would no longer be as simple as you seem to be after.

    If there was a guarantee that all of the times were less than 24 hours, this would be even more simple but you've said that the times could be greater than 23:59 and so that simple method won't work here.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks a lot! This was really helpful. And your explanation clarified the code perfectly!

    However, looking at the code (SELECT TotalHours = (SUM(CAST(PARSENAME(REPLACE(Trip_Duration,':','.'),2) AS INT))*60

    +SUM(CAST(PARSENAME(REPLACE(Trip_Duration,':','.'),1) AS INT)))/60.0), what is the specific reason why 2 is in the hour section and 1 in the minute section of the code? (when I flipped the numbers, it didn't return error, but the output was four times more).

    • This reply was modified 1 year, 9 months ago by  Teewhy.
  • PARSENAME is used by Jeff to extract the HOUR portion which is extracted by specifying 2 before a .(dot). Similarly 1 is used to extract the SECONDs portion after the . (dot).

    Look at this link how PARSENAME works

    https://learn.microsoft.com/en-us/sql/t-sql/functions/parsename-transact-sql?view=sql-server-ver16

    =======================================================================

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply