June 15, 2023 at 12:27 am
Jeff Moden wrote:Peter "Peso" Larsson wrote a nasty fast Prime Numbers function years ago and I tested it to confirm it. I'll see if I can find it in my archives this week if someone else doesn't find it first.
I think this must be it, it is in the comments on this page https://www.red-gate.com/simple-talk/databases/sql-server/t-sql-programming-sql-server/celkos-summer-sql-stumpers-prime-numbers/
That's the one.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 15, 2023 at 6:38 am
A simplified version is found here
https://www.sqltopia.com/mathematics/prime-number-generation/
N 56°04'39.16"
E 12°55'05.25"
June 15, 2023 at 6:38 am
A simplified version is found here
https://www.sqltopia.com/mathematics/prime-number-generation/
N 56°04'39.16"
E 12°55'05.25"
June 15, 2023 at 12:17 pm
A simplified version is found here https://www.sqltopia.com/mathematics/prime-number-generation/%5B/quote%5D
Very good and a clear explanation of how it works.
June 15, 2023 at 2:02 pm
Each number you test for primality needs O(sqrt(n)*n) calculations, so it will slow down for larger numbers. Also, if you increase @n to 10 times its previous value each run you would expect the elapsed time to go up by a factor of over 30.
Makes sense. The SQL environment I use for online stuff isn't really set up to test absolute performance. My laptop starts gasping on Youtube videos and Azure SQL is using 50 DTU's which is much less cpu than my laptop. At the higher search counts I wasn't getting any results at all with that earlier query. For sure I didn't delete anything. The takeaway was to give it a go in C# which I never did
The SSC topic says "one query" which some of the scripts aren't exactly. Even so performance-wise the allocations are likely to not help and could be counterproductive. Suppose you try to speed it up by getting rid of the 2nd tally function and allocating a temp table to use in its place in the correlated subquery. It makes it slower afaik
This is 2x slower
declare @n int=1000000;
declare @lkup table(n int primary key clustered);
insert @lkup(n)
select fn.N*2+1 from dbo.fnTally(1, sqrt(@n/2)) fn;
select count(*)
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
from @lkup l
where (fn.N*2+1)%l.N=0
and (fn.N*2+1)<>l.N);
versus just this
declare @n int=1000000;
select count(*)
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
from dbo.fnTally(1, sqrt(fn.N/2)) ffn
where (fn.N*2+1)%(ffn.N*2+1)=0
and (fn.N*2+1)<>(ffn.N*2+1));
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
June 15, 2023 at 3:36 pm
Here is a "one query" solution 🙂
Finished in about 10 seconds for prime numbers up to 1 million.
DECLARE @max_prime INT = 1000000;
-- swePeso
SELECT value
FROM (
VALUES (2),
(3),
(5),
(7)
) AS x(value)
WHERE value <= @max_prime
UNION ALL
SELECT x.value
FROM GENERATE_SERIES(30, @max_prime + 29, 30) AS pc
CROSS APPLY (
VALUES (pc.value - 23),
(pc.value - 19),
(pc.value - 17),
(pc.value - 13),
(pc.value - 11),
(pc.value - 7),
(pc.value - 1),
(pc.value + 1)
) AS x(value)
CROSS APPLY GENERATE_SERIES(3, CAST(SQRT(pc.value + 29) AS INT), 2) AS d
WHERE x.value <= @max_prime
GROUP BY x.value
HAVING MIN(x.value % d.value) >= 1;
N 56°04'39.16"
E 12°55'05.25"
June 15, 2023 at 6:00 pm
Here's another one query solution.
Finished in about 3 seconds for prime numbers up to 1 million (on my machine).
I can't test it against SwePeso's solution as I don't have SQL Server 2022 installed so can't use GENERATE_SERIES
DECLARE @Max int = 1000000;
WITH TO_SQRT AS
(
SELECT TOP(CAST(SQRT(@Max)/2 AS bigint))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))*2+1 N
FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS T(N)
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS T1(N)
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS T2(N)
),
S AS
(
SELECT N FROM (VALUES(2),(3),(5),(7),(11),(13),(17),(19),(23),(29),(31),(37),(41),(43),(47),(53),(59)) T(N) -- Add the primes we factored out
UNION ALL
SELECT n
FROM TO_SQRT
WHERE N <= CAST(SQRT(@Max) AS bigint)
AND N%6 IN(1,5)
AND 0 NOT IN (N%5,N%7,N%11,N%13,N%17,N%19,N%23,N%29,N%31,N%37,N%41,N%43,N%47,N%53,N%59)
),
TO_MAX AS
(
SELECT TOP(CAST(@Max/2 AS int))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))*2+1 N
FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS T(N)
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS T1(N)
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS T2(N)
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS T3(N)
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS T4(N)
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS T5(N)
)
SELECT N
FROM (VALUES(2),(3),(5),(7),(11),(13),(17),(19),(23),(29),(31),(37),(41),(43),(47),(53),(59)) T(N) -- Add the primes we factored out
WHERE T.N <= @Max
UNION ALL
SELECT p.N
FROM TO_MAX p
WHERE p.N <= @Max
AND p.N%6 IN(1,5)
AND 0 NOT IN (p.N%5,p.N%7,p.N%11,p.N%13,p.N%17,p.N%19,p.N%23,p.N%29,p.N%31,p.N%37,p.N%41,p.N%43,p.N%47,p.N%53,p.N%59)
AND NOT EXISTS(SELECT *
FROM S
WHERE p.N%S.N=0
AND S.N <= SQRT(p.N))
;
June 15, 2023 at 7:56 pm
@jonathon-2 - I just tested your method and found that it is including 151 additional non-prime numbers in the results. I also tested both version on one of my systems, yours took 8 seconds and @SwePeso's took 6 seconds.
Inserting the data into temp tables instead of to the screen - yours took 3 seconds vs 2 seconds for Peter's.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
June 15, 2023 at 10:04 pm
@Jonathon - I just tested your method and found that it is including 151 additional non-prime numbers in the results. I also tested both version on one of my systems, yours took 8 seconds and @SwePeso's took 6 seconds.
Inserting the data into temp tables instead of to the screen - yours took 3 seconds vs 2 seconds for Peter's.
I appreciate the comparison. I have made the necessary correction to my SQL. In the last line, I replaced the "<" operator with "<=" to ensure that no squares are included in the results.
June 16, 2023 at 12:51 pm
I haven't yet installed SQL Server 2022 and thought it would be possible to write a TVF with the same functionality as SQL 2022's GENERATE_SERIES so I could get SwePeso's SQL Statement to work. Then I thought why not get ChatGPT to write the TVF. So I gave ChatGPT a copy of Jeff Moden's fnTally code and asked it to create a TVF that uses similar way of generating a large quantity of numbers to create a TVF with exactly the same functionality as Microsoft's GENERATE_SERIES function. I pasted in the functional specifications, I had to tell it to make a few corrections but this is what it came up with:
CREATE FUNCTION [dbo].[GENERATE_SERIES]
(
@Start BIGINT,
@Stop BIGINT,
@Step BIGINT = 1
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN WITH
H2(N) AS ( SELECT 1
FROM (VALUES
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
)V(N)) --16^2 or 256 rows
, H4(N) AS (SELECT 1 FROM H2 a, H2 b) --16^4 or 65,536 rows
, H8(N) AS (SELECT 1 FROM H4 a, H4 b) --16^8 or 4,294,967,296 rows
, Tally AS (
SELECT N = ROW_NUMBER() OVER (ORDER BY N)
FROM H8
)
, StepTally AS (
SELECT value = @Start + (@Step * (N - 1))
FROM Tally
)
SELECT TOP(ABS(@Stop - @Start) / ABS(@Step) + 1) value
FROM StepTally
WHERE @Step <> 0
GO
Then to get SwePeso's code to work I just had to change all the GENERATE_SEQUENCE
to dbo.GENERATE_SEQUENCE
June 16, 2023 at 2:40 pm
Why mess with fnTally when what we really need is for Jeffrey to rewrite the query? Ha, then we also get a baseline of the functions. Also how does the 2 tally method with SQRT compare to SwePeso's code?
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
June 16, 2023 at 3:23 pm
Why mess with fnTally when what we really need is for Jeffrey to rewrite the query? Ha, then we also get a baseline of the functions. Also how does the 2 tally method with SQRT compare to SwePeso's code?
I think SQL Server's GENERATE_SERIES is very efficient and fast. You can try a comparison yourself with the dbo function, ChatGPT and I with a bit of help from Jeff Moden's fnTally wrote, and see for yourself.
June 16, 2023 at 4:17 pm
At the moment I don't have access to an instance of SQL 2022 to test with. That could change in the near future tho. For now I was just fishing for information 🙂
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
June 16, 2023 at 7:37 pm
I installed SQL Server 2022 today and changed my version to use GENERATE_SERIES. I also amended Steve Collins algorithm to use it and remove all the multiplications by "*2+1" as GENERATE_SERIES can go up in steps so no need for them, it was also missing the number 2 from the results so I added that. Here is a script of all the single query algorithms submitted in this thread with their timings.
SET STATISTICS TIME ON
DROP TABLE IF EXISTS #swePeso;
DROP TABLE IF EXISTS #JRoberts;
DROP TABLE IF EXISTS #SCollinsfnTally;
DROP TABLE IF EXISTS #SCollinsGENERATE_SERIES;
GO
DECLARE @max_prime INT = 1000000;
PRINT 'Method swePeso **************************************************************************************************'
-- swePeso
SELECT value
into #swePeso
FROM (
SELECT value
FROM (
VALUES (2),
(3),
(5),
(7)
) AS x(value)
WHERE value <= @max_prime
UNION ALL
SELECT x.value
FROM GENERATE_SERIES(30, @max_prime + 29, 30) AS pc
CROSS APPLY (
VALUES (pc.value - 23),
(pc.value - 19),
(pc.value - 17),
(pc.value - 13),
(pc.value - 11),
(pc.value - 7),
(pc.value - 1),
(pc.value + 1)
) AS x(value)
CROSS APPLY GENERATE_SERIES(3, CAST(SQRT(pc.value + 29) AS INT), 2) AS d
WHERE x.value <= @max_prime
GROUP BY x.value
HAVING MIN(x.value % d.value) >= 1
) x
;
GO
PRINT 'JRoberts **************************************************************************************************';
DECLARE @Max int = 1000000;
WITH S AS
(
SELECT value FROM (VALUES(2),(3),(5),(7),(11),(13),(17),(19),(23),(29),(31),(37),(41),(43),(47),(53),(59)) T(value) -- Add the primes we factored out
UNION ALL
SELECT value N
FROM GENERATE_SERIES(3, CAST(SQRT(@Max) AS int) , 2)
WHERE value%6 IN(1,5)
AND 0 NOT IN (value%5,value%7,value%11,value%13,value%17,value%19,value%23,value%29,value%31,value%37,value%41,value%43,value%47,value%53,value%59)
),
A AS
(
SELECT N
FROM (VALUES(2),(3),(5),(7),(11),(13),(17),(19),(23),(29),(31),(37),(41),(43),(47),(53),(59)) T(N) -- Add the primes we factored out
WHERE N <= @Max
UNION ALL
SELECT value N
FROM GENERATE_SERIES(61, @Max, 2) p
WHERE value%6 IN(1,5)
AND 0 NOT IN (value%5,value%7,value%11,value%13,value%17,value%19,value%23,value%29,value%31,value%37,value%41,value%43,value%47,value%53,value%59)
AND NOT EXISTS(SELECT *
FROM S
WHERE p.value%S.value=0
AND S.value <= SQRT(p.value))
)
SELECT N value
INTO #JRoberts
FROM A
GO
PRINT 'Steve Collins fnTally **************************************************************************************************'
declare @n int=1000000;
select fn.N*2+1 value
into #SCollinsfnTally
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
from dbo.fnTally(1, sqrt(fn.N/2)) ffn
where (fn.N*2+1)%(ffn.N*2+1)=0
and (fn.N*2+1)<>(ffn.N*2+1));
GO
PRINT 'Steve Collins amended to use GENERATE_SERIES and get rid of calculations **************************************************************************************************'
DECLARE @Max int = 1000000;
SELECT value
INTO #SCollinsGENERATE_SERIES
FROM (values(2)) p(value)
WHERE value <= @Max
UNION all
SELECT value
FROM GENERATE_SERIES(3, @Max, 2) p
WHERE NOT EXISTS(SELECT *
FROM GENERATE_SERIES(3, CAST(SQRT(p.value) AS int), 2) f
WHERE p.value%f.value=0
AND p.value <> f.value)
;
Method | CPU time (ms) | Elapsed time (ms) | Rows affected |
---|---|---|---|
Steve Collins amended to use GENERATE_SERIES and get rid of calculations | 1969 | 2044 | 78498 |
JRoberts | 2734 | 2773 | 78498 |
Steve Collins fnTally | 4906 | 4953 | 78497 |
swePeso | 7922 | 8051 | 78498 |
The victorious algorithm belongs to Steve, which I refined by incorporating GENERATE_SERIES and eliminating certain calculations. As a result, it achieved completion of 2 seconds for all primes less than 1 million. It's worth noting that my machine may exhibit unique behaviour, so I encourage others to give it a try and share their results.
Viewing 15 posts - 16 through 30 (of 77 total)
You must be logged in to reply to this topic. Login to reply