June 20, 2008 at 7:12 am
Jeff Moden (6/20/2008)
rbarryyoung (6/20/2008)
Try this:
SELECT Metal, min(Result), max(Result), count(Result), stdev(Result)
FROM Metals
GROUP BY Metal
By the way, stdev() is the statistical standard deviation, if you want the standard deviation for a population , use stdevp() instead.
Barry... you have to include the "Unit of Measure" in the Group By... 🙂
Hmm...The sarcasm was so thick in most of the posts I didn't detect whether he wanted the unit as part of the group or the calculation. I guess we'll have to wait for confirmation one way or another.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 20, 2008 at 7:20 am
Yeah... I'm afraid I'm a bit guilty there, as well.
elmerbotha... Chris Morris (posted above) thought that you might have more than 1 data point for each UoM for each Metal... I went by the same assumption. Let us know if it's something different.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 20, 2008 at 8:14 am
Jeff: Good point about the UoM. Here is a corrected/improved version:
SELECT Metal, UoM
, min(Result) as [Min]
, max(Result) as [Max]
, count(Result) as [Samples]
, avg(Result) as [Mean]
, stdev(Result) as [Std. Dev.]
FROM Metals
GROUP BY Metal, UoM
As befoe, stdev() is the statistical standard deviation, if you want the standard deviation for a population (unlikely, I think) , use stdevp() instead.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
June 20, 2008 at 8:43 am
I so wanted to see 'execution by SQL' this morning ...
but alas ... no one suggested ## tables instead of # tables ...
so much for the morning entertainment !
RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."
June 20, 2008 at 8:49 am
You missed it... that was in the first response... :hehe:
--Jeff Moden
Change is inevitable... Change for the better is not.
June 20, 2008 at 8:55 am
I just knew that I should have had that second cup of coffee before posting ... oh well ...
RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."
June 20, 2008 at 5:41 pm
elmerbotha (6/20/2008)
I truly feel honored to be graced by your absolutely meaningful and relevant answers. Truly, I am astounded by your infinite wisdom. Without risking being ridiculed once again by a bunch of super famous forum participants, I am going to approach this differently this time. Hopefully someone would be willing to stoop down to my level and give me some insight into how I could possibly approach this problem. Again, I am not proficient in SQL, so please bear with me. If I am at the wrong place, just say so. There are other forums out there.I have a resultset with a bunch of lab results for certain precious metals which looks something like this:
Metal Result Unit of Measurement
Pt 1.26754 %
Pt 1 ppm
Au 4 ppm
Rd 8 %
etc, etc
What I have to do with the above is to per metal:
- Find the min value
- Find the mx value
-
Find the standard deviation
All sarcasm aside, did the solutions we post for you help at all?
--Jeff Moden
Change is inevitable... Change for the better is not.
June 23, 2008 at 4:02 am
Hi Chris, thanks for your reply.
A typical resultset will follow below. I need to per nCount, per metal get the average and standard deviation to ultimately get the min and max values. First, I need to change all the units to the predominant one. Again this is per metal in the resultset. So, if 51 are 'ppm' and 49 are % for Pt, everything have to be converted to 'ppm.
Then, if any of the values of a specific metal is > Avg + (2* SD) OR < Avg - (2*SD) then I need to throw those values out and add others. So, if three values were not within those parameters, I need to add three again. This needs to carry on recursively until all the Ncount values are within the +2SD or -2SD scope. Only then can I work out the Min and Max Values. Herewith a typical resultset. The 'cl_analyte' represents the actual metal. 'RD' is relative density and should be treated as a metal for purpouses of this stored procedure. The 'G/ML' unit is only ficticious and is only assigned to the 'RD' values. I hope this helps. Thanks gain!
cl_analyte cl_result cl_unit
---------- --------------------------------------- -------
B 1.84255 %
BI 4.31752 %
CA 0.00000 PPM
CD 0.00070 PPM
CO 0.27250 PPM
CR 0.15750 PPM
FE 10.34250 PPM
IR 2.10500 PPM
MO 0.11750 PPM
OS 3.32250 PPM
PB 0.73500 %
Pd 100.00000 %
Pt 0.00000 %
Ag 0.00010 %
As 112.94800 PPM
Au 0.00034 %
Bi 1.10100 PPM
Co 109.84800 PPM
Cu 2216.14000 PPM
Fe 4.41600 PPM
Ir 0.07003 %
Ni 6191.24000 PPM
Os 40.69400 PPM
Pb 39.81250 PPM
Pd 0.00021 %
Pt 0.00374 %
RD 1.05070 G/ML
Rh 0.19079 %
Ru 0.25957 %
Sb 2.32150 PPM
Se 4.16300 PPM
Sn 2.69100 PPM
Te 34.05500 PPM
Zn 21.15750 PPM
Ag 0.11880 %
As 1285.61000 PPM
Au 0.33586 %
Bi 177.69100 PPM
Co 383.37600 PPM
Cu 8070.18000 PPM
Fe 6975.48000 PPM
Ir 0.34688 %
Ni 21766.00000 PPM
Os 139.22800 PPM
Pb 1086.26000 PPM
Pd 3.67380 %
Pt 7.24010 %
RD 1.50110 G/ML
Rh 0.90411 %
Ru 1.31575 %
Sb 279.11000 PPM
Se 3119.55000 PPM
Sn 112.54600 PPM
Te 1348.46000 PPM
Zn 66.94100 PPM
Ag 0.00373 %
As 0.00000 %
Au 0.00027 %
Bi 2.27600 PPM
Co 2.74550 PPM
Cu 628.35700 PPM
Fe 4.08800 PPM
Ir 0.00170 %
Ni 33.72850 PPM
Pb 33.01100 PPM
Pd 10.93840 %
Pt 0.01984 %
RD 1.27770 G/ML
Rh 0.00124 %
Ru 0.00088 %
Sb 2.26500 PPM
Se 94.51250 PPM
Sn 44.67850 PPM
Te 9.37650 PPM
Zn 2.80250 PPM
Ag 0.06226 %
As 894.72900 PPM
Au 0.00158 %
Bi 674.76700 PPM
Co 623.94300 PPM
Cu 12613.00000 PPM
Fe 7.48450 PPM
Ir 0.37955 %
Ni 33608.00000 PPM
Pb 1491.72000 PPM
Pd 4.27650 %
Pt 9.22520 %
RD 1.62510 G/ML
Rh 0.94618 %
Ru 1.29805 %
Sb 26.95300 PPM
Se 922.50700 PPM
Sn 86.56950 PPM
Te 145.65200 PPM
Zn 227.60100 PPM
Ag 0.10929 %
As 813.66800 PPM
Au 0.00158 %
Bi 594.25500 PPM
Co 593.14400 PPM
(100 row(s) affected)
June 23, 2008 at 4:06 am
Hi Jeff,
Yes, they have. I am still waiting for an answer from Chris. Hopefully there is a better solution than my temp table approach.
Thanks
June 23, 2008 at 4:20 am
Hi Elmer,
I'm also not an SQL legend by any standards but you might try the following:
Select T.ID, max(T.Amount) as [MAX], min(T.Amount) as [Min],
(Select Sum(TP1.Amount)/count(TP1.ID) from ITR1 TP1 where TP1.ID = T.ID) as [Average]
from ITR1 T
group by T.ReconNum
It seems to simple to be the solution, but might give you some ideas.
June 23, 2008 at 4:52 am
elmerbotha (6/23/2008)
Hi Jeff,Yes, they have. I am still waiting for an answer from Chris. Hopefully there is a better solution than my temp table approach.
Thanks
Hello
Here's your data prepared as a table:
CREATE TABLE #ResultSet (cl_analyte VARCHAR(2), cl_result DECIMAL(12,5), cl_unit VARCHAR(4))
INSERT INTO #ResultSet (cl_analyte, cl_result, cl_unit)
SELECT 'B', 1.84255, '%' UNION ALL
SELECT 'BI', 4.31752, '%' UNION ALL
SELECT 'CA', 0.00000, 'PPM' UNION ALL
SELECT 'CD', 0.00070, 'PPM' UNION ALL
SELECT 'CO', 0.27250, 'PPM' UNION ALL
SELECT 'CR', 0.15750, 'PPM' UNION ALL
SELECT 'FE', 10.34250, 'PPM' UNION ALL
SELECT 'IR', 2.10500, 'PPM' UNION ALL
SELECT 'MO', 0.11750, 'PPM' UNION ALL
SELECT 'OS', 3.32250, 'PPM' UNION ALL
SELECT 'PB', 0.73500, '%' UNION ALL
SELECT 'Pd', 100.00000, '%' UNION ALL
SELECT 'Pt', 0.00000, '%' UNION ALL
SELECT 'Ag', 0.00010, '%' UNION ALL
SELECT 'As', 112.94800, 'PPM' UNION ALL
SELECT 'Au', 0.00034, '%' UNION ALL
SELECT 'Bi', 1.10100, 'PPM' UNION ALL
SELECT 'Co', 109.84800, 'PPM' UNION ALL
SELECT 'Cu', 2216.14000, 'PPM' UNION ALL
SELECT 'Fe', 4.41600, 'PPM' UNION ALL
SELECT 'Ir', 0.07003, '%' UNION ALL
SELECT 'Ni', 6191.24000, 'PPM' UNION ALL
SELECT 'Os', 40.69400, 'PPM' UNION ALL
SELECT 'Pb', 39.81250, 'PPM' UNION ALL
SELECT 'Pd', 0.00021, '%' UNION ALL
SELECT 'Pt', 0.00374, '%' UNION ALL
SELECT 'RD', 1.05070, 'G/ML' UNION ALL
SELECT 'Rh', 0.19079, '%' UNION ALL
SELECT 'Ru', 0.25957, '%' UNION ALL
SELECT 'Sb', 2.32150, 'PPM' UNION ALL
SELECT 'Se', 4.16300, 'PPM' UNION ALL
SELECT 'Sn', 2.69100, 'PPM' UNION ALL
SELECT 'Te', 34.05500, 'PPM' UNION ALL
SELECT 'Zn', 21.15750, 'PPM' UNION ALL
SELECT 'Ag', 0.11880, '%' UNION ALL
SELECT 'As', 1285.61000, 'PPM' UNION ALL
SELECT 'Au', 0.33586, '%' UNION ALL
SELECT 'Bi', 177.69100, 'PPM' UNION ALL
SELECT 'Co', 383.37600, 'PPM' UNION ALL
SELECT 'Cu', 8070.18000, 'PPM' UNION ALL
SELECT 'Fe', 6975.48000, 'PPM' UNION ALL
SELECT 'Ir', 0.34688, '%' UNION ALL
SELECT 'Ni', 21766.00000, 'PPM' UNION ALL
SELECT 'Os', 139.22800, 'PPM' UNION ALL
SELECT 'Pb', 1086.26000, 'PPM' UNION ALL
SELECT 'Pd', 3.67380, '%' UNION ALL
SELECT 'Pt', 7.24010, '%' UNION ALL
SELECT 'RD', 1.50110, 'G/ML' UNION ALL
SELECT 'Rh', 0.90411, '%' UNION ALL
SELECT 'Ru', 1.31575, '%' UNION ALL
SELECT 'Sb', 279.11000, 'PPM' UNION ALL
SELECT 'Se', 3119.55000, 'PPM' UNION ALL
SELECT 'Sn', 112.54600, 'PPM' UNION ALL
SELECT 'Te', 1348.46000, 'PPM' UNION ALL
SELECT 'Zn', 66.94100, 'PPM' UNION ALL
SELECT 'Ag', 0.00373, '%' UNION ALL
SELECT 'As', 0.00000, '%' UNION ALL
SELECT 'Au', 0.00027, '%' UNION ALL
SELECT 'Bi', 2.27600, 'PPM' UNION ALL
SELECT 'Co', 2.74550, 'PPM' UNION ALL
SELECT 'Cu', 628.35700, 'PPM' UNION ALL
SELECT 'Fe', 4.08800, 'PPM' UNION ALL
SELECT 'Ir', 0.00170, '%' UNION ALL
SELECT 'Ni', 33.72850, 'PPM' UNION ALL
SELECT 'Pb', 33.01100, 'PPM' UNION ALL
SELECT 'Pd', 10.93840, '%' UNION ALL
SELECT 'Pt', 0.01984, '%' UNION ALL
SELECT 'RD', 1.27770, 'G/ML' UNION ALL
SELECT 'Rh', 0.00124, '%' UNION ALL
SELECT 'Ru', 0.00088, '%' UNION ALL
SELECT 'Sb', 2.26500, 'PPM' UNION ALL
SELECT 'Se', 94.51250, 'PPM' UNION ALL
SELECT 'Sn', 44.67850, 'PPM' UNION ALL
SELECT 'Te', 9.37650, 'PPM' UNION ALL
SELECT 'Zn', 2.80250, 'PPM' UNION ALL
SELECT 'Ag', 0.06226, '%' UNION ALL
SELECT 'As', 894.72900, 'PPM' UNION ALL
SELECT 'Au', 0.00158, '%' UNION ALL
SELECT 'Bi', 674.76700, 'PPM' UNION ALL
SELECT 'Co', 623.94300, 'PPM' UNION ALL
SELECT 'Cu', 12613.00000, 'PPM' UNION ALL
SELECT 'Fe', 7.48450, 'PPM' UNION ALL
SELECT 'Ir', 0.37955, '%' UNION ALL
SELECT 'Ni', 33608.00000, 'PPM' UNION ALL
SELECT 'Pb', 1491.72000, 'PPM' UNION ALL
SELECT 'Pd', 4.27650, '%' UNION ALL
SELECT 'Pt', 9.22520, '%' UNION ALL
SELECT 'RD', 1.62510, 'G/ML' UNION ALL
SELECT 'Rh', 0.94618, '%' UNION ALL
SELECT 'Ru', 1.29805, '%' UNION ALL
SELECT 'Sb', 26.95300, 'PPM' UNION ALL
SELECT 'Se', 922.50700, 'PPM' UNION ALL
SELECT 'Sn', 86.56950, 'PPM' UNION ALL
SELECT 'Te', 145.65200, 'PPM' UNION ALL
SELECT 'Zn', 227.60100, 'PPM' UNION ALL
SELECT 'Ag', 0.10929, '%' UNION ALL
SELECT 'As', 813.66800, 'PPM' UNION ALL
SELECT 'Au', 0.00158, '%' UNION ALL
SELECT 'Bi', 594.25500, 'PPM' UNION ALL
SELECT 'Co', 593.14400, 'PPM'
Using R Barry Young's code, an intermediate stage in solving your problem is this:
SELECT UPPER(cl_analyte)
, min(PPM) as [Min]
, max(PPM) as [Max]
, count(PPM) as [Samples]
, avg(PPM) as [Mean]
, stdev(PPM) as [Std. Dev.]
FROM (SELECT *,
CASE cl_unit
WHEN 'PPM' THEN cl_result
WHEN '%' THEN cl_result *10000
ELSE 0 END AS 'PPM'
FROM #ResultSet) d
GROUP BY UPPER(cl_analyte)
The FROM (select...) d part is called a derived table and I've used it here because I reckon it makes the UoM conversion easier to visualise. Check that it works independantly of the whole query by running it:
SELECT *,
CASE cl_unit
WHEN 'PPM' THEN cl_result
WHEN '%' THEN cl_result *10000
ELSE 0 END AS 'PPM'
FROM #ResultSet
You need to put an expression in there for converting G/ML to ppm. This could be hardcoded into the CASE block, but better still it would come from a table of the properties of the metals.
If you're throwing unsuitable values away and replacing them with new values, where are the new values coming from, and why aren't they in this set of results?
Do you have an id column in your table of results?
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 23, 2008 at 5:30 am
Thanks a lot! I will see how I can implement all that you've given me. Regarding throwing away useless values and replacing them: The store procedure returning the result gets two parameters, the 1) last n batches for a 2) specific stock code. Remember the initial sample size n cannot be reduced. If the user wanted to check the values against the last 5 batches, the number '5' cannot be reduced. That's why the 'thrown away' values needs to be replaced with new ones. I hope I am not confusing things. When I get the resultset a second, or nth time, I obviously need to take into account the 'thrown away values' so they are not returned again, but rather replaced (or that's how I would think) by the new ones, which might or might not be correct. That's why this needs to be done recursively in my mind until I have found a perfect resultset on which I can base my min and max values.
Basically the query returns the last n records ordered desc by the datetimestamp. The analytes are stored by a batch number and a stock code. The resultset actually looks more correctly like this:
BATCH_NO STOCK_CODE ANALYTE ASSAY_RESULT UNIT
-------------- ---------- ------- --------------------------------------- -----
0805070069 2B04 Ag 0.01000 %
0805070069 2B04 As 13.00000 PPM
0805070069 2B04 Au 0.01000 %
0805070069 2B04 Bi 10.00000 PPM
0805070069 2B04 Cu 2.00000 PPM
0805070069 2B04 Fe 2.00000 PPM
0805070069 2B04 Ir 0.01000 %
0805070069 2B04 Ni 2.00000 PPM
0805070069 2B04 Os 1530.00000 PPM
0805070069 2B04 Pb 5.00000 PPM
0805070069 2B04 Pd 0.01000 %
0805070069 2B04 Pt 0.01000 %
0805070069 2B04 RD 1.15000 G/ML
0805070069 2B04 Rh 0.00100 %
0805070069 2B04 Ru 0.01000 %
0805070069 2B04 Sb 17.00000 PPM
0805070069 2B04 Se 5.00000 PPM
0805070069 2B04 Sn 10.00000 PPM
0805070069 2B04 Te 32.00000 PPM
0805070069 2B04 Zn 1.00000 PPM
0805080070 2B04 Ag 0.01000 %
0805080070 2B04 As 9.00000 PPM
0805080070 2B04 Au 0.01000 %
0805080070 2B04 Bi 10.00000 PPM
0805080070 2B04 Cu 2.00000 PPM
0805080070 2B04 Fe 2.00000 PPM
0805080070 2B04 Ir 0.01000 %
0805080070 2B04 Ni 2.00000 PPM
0805080070 2B04 Os 1770.00000 PPM
0805080070 2B04 Pb 5.00000 PPM
0805080070 2B04 Pd 0.01000 %
0805080070 2B04 Pt 0.01000 %
0805080070 2B04 RD 1.05000 G/ML
0805080070 2B04 Rh 0.00100 %
0805080070 2B04 Ru 0.01000 %
0805080070 2B04 Sb 13.00000 PPM
0805080070 2B04 Se 5.00000 PPM
0805080070 2B04 Sn 7.00000 PPM
0805080070 2B04 Te 24.00000 PPM
0805080070 2B04 Zn 1.00000 PPM
0805090107 2D04 Ag 0.02800 %
0805090107 2D04 As 1810.00000 PPM
etc.
etc.
Thanks once again. Much appreciated.
June 23, 2008 at 8:38 am
elmerbotha (6/23/2008)
When I get the resultset a second, or nth time, I obviously need to take into account the 'thrown away values' so they are not returned again, but rather replaced (or that's how I would think) by the new ones, which might or might not be correct. That's why this needs to be done recursively in my mind until I have found a perfect resultset on which I can base my min and max values.
If you can figure it out and describe what you want to do, then please post it up here - as it stands it sounds like an interesting challenge.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 23, 2008 at 11:23 am
elmerbotha (6/23/2008)
Thanks a lot! I will see how I can implement all that you've given me. Regarding throwing away useless values and replacing them: The store procedure returning the result gets two parameters, the 1) last n batches for a 2) specific stock code. Remember the initial sample size n cannot be reduced. If the user wanted to check the values against the last 5 batches, the number '5' cannot be reduced. That's why the 'thrown away' values needs to be replaced with new ones. I hope I am not confusing things. When I get the resultset a second, or nth time, I obviously need to take into account the 'thrown away values' so they are not returned again, but rather replaced (or that's how I would think) by the new ones, which might or might not be correct. That's why this needs to be done recursively in my mind until I have found a perfect resultset on which I can base my min and max values.Basically the query returns the last n records ordered desc by the datetimestamp. The analytes are stored by a batch number and a stock code. The resultset actually looks more correctly like this:
BATCH_NO STOCK_CODE ANALYTE ASSAY_RESULT UNIT
-------------- ---------- ------- --------------------------------------- -----
0805070069 2B04 Ag 0.01000 %
0805070069 2B04 As 13.00000 PPM
etc.
etc.
Thanks once again. Much appreciated.
I can't say I totally understand what you're trying to do, but since you haven't replied to Chris yet regarding success, let me ask you if you couldn't just use a single table with a couple of work fields added for status, and iteration number. You might want to index them if the workset is going to get pretty big.
Viewing 14 posts - 16 through 28 (of 28 total)
You must be logged in to reply to this topic. Login to reply