A query to calculate the frequency

  • Hi,

    I'm trying to generate a table for a binomial law based graphic.

    Here is a version of one of my tables :

    IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL

    DROP TABLE #mytable

    CREATE TABLE #mytable

    (

    ProductId INT,

    Value DECIMAL(6,4)

    )

    INSERT INTO @mytable (ProductId, Value)

    SELECT 1, 5.3 UNION ALL

    SELECT 1, 4.7 UNION ALL

    SELECT 1, 2.0 UNION ALL

    SELECT 1, 1.2 UNION ALL

    SELECT 1, 1.3 UNION ALL

    SELECT 1, 4.1 UNION ALL

    SELECT 1, 3.6 UNION ALL

    SELECT 1, 3.7 UNION ALL

    SELECT 1, 3.5 UNION ALL

    SELECT 2, 0.2 UNION ALL

    SELECT 2, 1.5 UNION ALL

    SELECT 2, 0.7 UNION ALL

    SELECT 2, 3.2 UNION ALL

    SELECT 2, 2.6 UNION ALL

    SELECT 2, 1.8 UNION ALL

    SELECT 2, 1.3 UNION ALL

    SELECT 2, 0.9 UNION ALL

    SELECT 2, 1.1 UNION ALL

    SELECT 3, 2.3 UNION ALL

    SELECT 3, 1.5 UNION ALL

    SELECT 3, 1.9 UNION ALL

    SELECT 3, 2.1 UNION ALL

    SELECT 3, 2.2 UNION ALL

    SELECT 3, 3.1 UNION ALL

    SELECT 3, 2.8 UNION ALL

    SELECT 3, 5.0 UNION ALL

    SELECT 3, 1.1 UNION ALL

    SELECT 3, 4.3 UNION ALL

    SELECT 3, 2.2 UNION ALL

    SELECT 3, 1.5 UNION ALL

    SELECT 3, 3.6 UNION ALL

    SELECT 3, 3.2 UNION ALL

    SELECT 3, 4.9 UNION ALL

    SELECT 3, 2.7

    I can't come up with a query that would give me the equivalent of Excel's FREQUENCY function.

    I need the following datas for each Product : min, max, sample size, sqrt(samplesize) and intervals (intervals are obtained through min+(max-min)/sqrt(sample))

    So I can obtain the following table where frequency = (number of values in interval/samplesize)*100 :

    ProductId Interval Frequency

    1 1.2 11%

    1 2.57 22%

    1 3.93 33%

    1 5.3 33%

    ----------------------

    2 0.2 11%

    2 1.2 33%

    2 2.2 33%

    2 3.2 22%

    ----------------------

    3 1.1 6%

    3 2.075 19%

    3 3.05 38%

    3 4.025 19%

    3 5 19%

    Thanks a lot for any help I could get.

    Peter

  • The frequency function in Excel looks to have the first interval be based on what is less than your first data value. In the case of your example, your frequency on that first interval is always going to be 1 (your min value) / N records for that given ProductID. I'm guessing this is not what you want. I think you want to split up the sum of the values starting at the min value, not 0.

    Also, the table definition provided doesn't work so I made some modifications.

    DECLARE @mytable TABLE

    (ProductId INT

    ,Value DECIMAL(6,4))

    INSERT INTO @mytable (ProductId, Value)

    SELECT 1, 5.3 UNION ALL

    SELECT 1, 4.7 UNION ALL

    SELECT 1, 2.0 UNION ALL

    SELECT 1, 1.2 UNION ALL

    SELECT 1, 1.3 UNION ALL

    SELECT 1, 4.1 UNION ALL

    SELECT 1, 3.6 UNION ALL

    SELECT 1, 3.7 UNION ALL

    SELECT 1, 3.5 UNION ALL

    SELECT 2, 0.2 UNION ALL

    SELECT 2, 1.5 UNION ALL

    SELECT 2, 0.7 UNION ALL

    SELECT 2, 3.2 UNION ALL

    SELECT 2, 2.6 UNION ALL

    SELECT 2, 1.8 UNION ALL

    SELECT 2, 1.3 UNION ALL

    SELECT 2, 0.9 UNION ALL

    SELECT 2, 1.1 UNION ALL

    SELECT 3, 2.3 UNION ALL

    SELECT 3, 1.5 UNION ALL

    SELECT 3, 1.9 UNION ALL

    SELECT 3, 2.1 UNION ALL

    SELECT 3, 2.2 UNION ALL

    SELECT 3, 3.1 UNION ALL

    SELECT 3, 2.8 UNION ALL

    SELECT 3, 5.0 UNION ALL

    SELECT 3, 1.1 UNION ALL

    SELECT 3, 4.3 UNION ALL

    SELECT 3, 2.2 UNION ALL

    SELECT 3, 1.5 UNION ALL

    SELECT 3, 3.6 UNION ALL

    SELECT 3, 3.2 UNION ALL

    SELECT 3, 4.9 UNION ALL

    SELECT 3, 2.7

    Modify the code as needed ... I realize I'm taking a guess as to what you would want the data intervals to be. And please feel free to ask questions.

    DECLARE @tally TABLE (N INT) --hopefully you have a real tally table to use

    INSERT INTO @tally(N)

    SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6

    ;WITH cteTemp AS

    (

    SELECT sq.ProductId,

    t.N as intNum,

    cast(sq.minVal + (sq.interval * (t.N - 1)) as decimal(3,2)) as intBegin,

    intEnd = cast(case

    when t.N = sq.countInterval then maxVal

    else sq.minVal + (sq.interval * t.N) - 0.01 end as decimal(3,2)),

    sq.countVal as recCount

    FROM

    (

    SELECT ProductID,

    MIN(Value) as minVal,

    MAX(Value) as maxVal,

    COUNT(VALUE) as countVal,

    SQRT(COUNT(Value)) as countInterval,

    (MAX(Value) - MIN(Value))/SQRT(COUNT(Value)) as interval

    FROM @mytable

    GROUP BY ProductId

    ) sq

    JOIN @tally t

    ON t.N <= sq.countInterval)

    SELECT cte.ProductId,

    cte.intBegin,

    cte.intEnd,

    cast(cast(COUNT(mt.Value) as float)/cte.recCount*100 as decimal(5,2)) as Frequency

    FROM cteTemp cte

    LEFT JOIN @mytable mt

    ON mt.ProductId = cte.ProductId

    AND mt.Value >= cte.intBegin

    AND mt.Value <= cte.intEnd

    GROUP BY cte.ProductId, cte.intNum, cte.intBegin, cte.intEnd, cte.recCount

    ORDER BY ProductId, intNum

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • I updated the OP with the insert statements without quotes from your post.

    And I owe you a big one. It works perfectly and it is even better than Excel's function when dealing with intervals. I did try to use my tally table but I guess I need to study more.

    Peter

  • Wow, pretty impressive. I never considered doing something like that in that way. Thanks for the post.

    The probability of survival is inversely proportional to the angle of arrival.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply