Calculating Prime Numbers With One Query

  • Steve Collins wrote:

    Ok the proof by induction is acceptable

    If 2 doesn't add or subtract from what's being measured that's a good reason to get rid of it imo

    Yes but the whole point of it is to generate a list of prime numbers, if you exclude 2 it's not meeting the requirement.

    Steve Collins wrote:

    For my projects the upgrade to SQL 2022 is not currently scheduled.  I'm going to look into again

    You can install the developer version free on your personal machine. https://www.microsoft.com/en-gb/sql-server/sql-server-downloads

    Steve Collins wrote:

    What evidence?  Seemingly = none but the lack of performance change

    If you have a WHERE clause with multiple ANDs there is no way to force the order of the evaluation. You have to rely on the SQL Server query optimizer to make an optimal decision.

  • After giving it a few tries lately I've been trying to get GPT-4 to do things which wouldn't otherwise be attempted.  So here is AI's implementation of the Sieve of Eratosthenes algorithm using bit shifting and the GENERATE_SERIES function.  I got it to work using SQL 2022 on the Fiddle site.  Is it faster?  At smallish values I doubt it.  The initial insert is the set of odd numbers and an UPDATE statement is used.  Maybe at some large max value

    In the Sieve of Eratosthenes algorithm implementation provided, bit shifting is used for two main purposes:

    1. Indexing the bit array: Since the bit array only stores odd numbers (excluding even numbers, which are not prime except for 2), bit shifting is used to map the odd numbers to their corresponding index in the bit array. The expression i >> 1 (right shift by 1) effectively divides the odd number i by 2, giving the correct index in the bit array.

    2. Converting bit array index back to the original number: When retrieving the prime numbers from the bit array, bit shifting is used to convert the index back to the original odd number. The expression (Id << 1) + 1 (left shift by 1) effectively multiplies the index by 2 and adds 1, resulting in the original odd number.

    Bit shifting is used in this implementation to efficiently manipulate and access the bit array, which stores information about the primality of odd numbers. It helps reduce memory usage and simplifies the mapping between odd numbers and their corresponding bit array indices.

    https://dbfiddle.uk/JhlPzsFN

    -- Create a table to store the bit array
    CREATE TABLE #BitArray (
    Id INT PRIMARY KEY,
    Bit INT
    );

    -- Initialize the bit array with all bits set to 1
    DECLARE @maxNumber INT = 100;
    INSERT INTO #BitArray (Id, Bit)
    SELECT [value], 1 FROM generate_series(1, (@maxNumber - 1) / 2, 1);

    -- Implement the Sieve of Eratosthenes algorithm using the generate_series function
    WITH PrimeCandidates AS (
    SELECT [value] AS Prime FROM generate_series(3, @maxNumber, 2)
    WHERE [value] * [value] <= @maxNumber
    )
    , Sieve AS (
    SELECT Prime, Prime * Prime AS Multiple FROM PrimeCandidates
    UNION ALL
    SELECT Prime, Multiple + (Prime * 2) FROM Sieve WHERE Multiple + (Prime * 2) <= @maxNumber
    )
    UPDATE #BitArray
    SET Bit = 0
    WHERE Id IN (SELECT Multiple >> 1 FROM Sieve);

    -- Retrieve the prime numbers
    SELECT 2 AS PrimeNumber
    UNION ALL
    SELECT (Id << 1) + 1 AS PrimeNumber FROM #BitArray WHERE Bit = 1;

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • More from ChatGPT-4

    Now, let me explain how the Sieve CTE works and how it fits in with the rest of the query.

    The Sieve of Eratosthenes algorithm works by iteratively marking the multiples of each prime number as non-prime (composite). The remaining unmarked numbers are prime numbers.

    In the provided code, the Sieve CTE generates multiples of prime candidates and marks them as non-prime in the BitArray table. Here's a step-by-step explanation of the code:

    1. The BitArray table is created and initialized with all bits set to 1, representing that all numbers are initially considered prime.

    2. The PrimeCandidates CTE generates a list of odd numbers starting from 3 up to the square root of the maximum number. These are the potential prime numbers that will be used to sieve out the composite numbers.

    3. The Sieve CTE generates multiples of the prime candidates from the PrimeCandidates CTE. It starts with the square of each prime candidate (since all smaller multiples have already been marked as non-prime by smaller prime candidates) and iteratively adds the prime candidate multiplied by 2 to generate the next multiple. This is done recursively using the UNION ALL clause.

    4. The BitArray table is updated by setting the Bit value to 0 for all indices corresponding to the multiples generated by the Sieve CTE. This marks the composite numbers as non-prime.

    5. Finally, the prime numbers are retrieved by selecting the indices from the BitArray table where the Bit value is still 1. The indices are then converted back to the actual prime numbers using the formula (Id << 1) + 1.

    The Sieve CTE plays a crucial role in the algorithm by generating the multiples of prime candidates and marking them as non-prime in the BitArray table. This helps in identifying the prime numbers up to the specified maximum number.

     

     

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Aha the recursive CTE has been replaced with a 3rd usage of the GENERATE_SERIES function

    https://dbfiddle.uk/bhPaKD5f

    -- Create a table to store the bit array
    CREATE TABLE BitArray (
    Id INT PRIMARY KEY,
    Bit INT
    );

    -- Initialize the bit array with all bits set to 1
    DECLARE @maxNumber INT = 100;
    INSERT INTO BitArray (Id, Bit)
    SELECT [value], 1 FROM generate_series(1, (@maxNumber - 1) / 2, 1);

    -- Implement the Sieve of Eratosthenes algorithm using the generate_series function
    WITH PrimeCandidates AS (
    SELECT [value] AS Prime FROM generate_series(3, @maxNumber, 2)
    WHERE [value] * [value] <= @maxNumber
    )
    , Sieve AS (
    SELECT Prime, (Prime * Prime) + (Prime * 2 * Numbers.n) AS Multiple
    FROM PrimeCandidates
    CROSS JOIN (SELECT [value] AS n FROM generate_series(0, (@maxNumber - 1) / 2, 1)) AS Numbers
    WHERE (Prime * Prime) + (Prime * 2 * Numbers.n) <= @maxNumber
    )
    UPDATE BitArray
    SET Bit = 0
    WHERE Id IN (SELECT Multiple >> 1 FROM Sieve);

    -- Retrieve the prime numbers
    SELECT 2 AS PrimeNumber
    UNION ALL
    SELECT (Id << 1) + 1 AS PrimeNumber FROM BitArray WHERE Bit = 1;

    From ChatGPT

    In this updated code, the Sieve CTE uses a CROSS JOIN between the PrimeCandidates CTE and a series of numbers generated by the generate_series function with a proper select list. This generates the multiples of prime candidates without using recursion. The rest of the algorithm remains the same, and the final result is a list of prime numbers up to the specified maximum number.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • As one query

    DECLARE @maxNumber INT = 1000;

    WITH BitArray AS (
    SELECT [value] AS Id, 1 AS Bit FROM generate_series(1, (@maxNumber - 1) / 2, 1)
    )
    , PrimeCandidates AS (
    SELECT [value] AS Prime FROM generate_series(3, @maxNumber, 2)
    WHERE [value] * [value] <= @maxNumber
    )
    , Sieve AS (
    SELECT Prime, (Prime * Prime) + (Prime * 2 * Numbers.n) AS Multiple
    FROM PrimeCandidates
    CROSS JOIN (SELECT [value] AS n FROM generate_series(0, (@maxNumber - 1) / 2, 1)) AS Numbers
    WHERE (Prime * Prime) + (Prime * 2 * Numbers.n) <= @maxNumber
    )
    , UpdatedBitArray AS (
    SELECT Id, Bit, CASE WHEN Id IN (SELECT Multiple >> 1 FROM Sieve) THEN 0 ELSE Bit END AS NewBit
    FROM BitArray
    )
    SELECT 2 AS PrimeNumber
    UNION ALL
    SELECT (Id << 1) + 1 AS PrimeNumber FROM UpdatedBitArray WHERE NewBit = 1;

     

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Steve Collins wrote:

    As one query

    DECLARE @maxNumber INT = 1000;

    WITH BitArray AS (
    SELECT [value] AS Id, 1 AS Bit FROM generate_series(1, (@maxNumber - 1) / 2, 1)
    )
    , PrimeCandidates AS (
    SELECT [value] AS Prime FROM generate_series(3, @maxNumber, 2)
    WHERE [value] * [value] <= @maxNumber
    )
    , Sieve AS (
    SELECT Prime, (Prime * Prime) + (Prime * 2 * Numbers.n) AS Multiple
    FROM PrimeCandidates
    CROSS JOIN (SELECT [value] AS n FROM generate_series(0, (@maxNumber - 1) / 2, 1)) AS Numbers
    WHERE (Prime * Prime) + (Prime * 2 * Numbers.n) <= @maxNumber
    )
    , UpdatedBitArray AS (
    SELECT Id, Bit, CASE WHEN Id IN (SELECT Multiple >> 1 FROM Sieve) THEN 0 ELSE Bit END AS NewBit
    FROM BitArray
    )
    SELECT 2 AS PrimeNumber
    UNION ALL
    SELECT (Id << 1) + 1 AS PrimeNumber FROM UpdatedBitArray WHERE NewBit = 1;

    What is its performance like?

  • I don't know.  Setting STATISTICS ON doesn't work on Fiddle

    Simplified without the 4th CTE

    declare @max_number       int = 100;
    declare @half_max_number int = (@max_number - 1) >> 1;

    with bit_array_cte as (
    select [value] as id, 1 as bit
    from generate_series(1, @half_max_number, 1)),
    prime_candidates_cte as (
    select [value] as prime
    from generate_series(3, cast(sqrt(@max_number) as int), 2)),
    sieve_cte AS (
    select prime, power(prime, 2) + ((prime << 1) * numbers.n) as multiple
    from prime_candidates_cte
    cross join (select [value] as n
    from generate_series(0, @half_max_number, 1)) as numbers
    where power(prime, 2) + ((prime << 1) * numbers.n) <= @max_number)
    select 2 as prime_number
    union all
    select (id << 1) + 1 as prime_number
    from bit_array_cte as b
    where not exists (select 1
    from sieve_cte
    where multiple >> 1 = b.id)
    order by prime_number;

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Steve Collins wrote:

    I don't know.  Setting STATISTICS ON doesn't work on Fiddle

    Simplified without the 4th CTE

    The doubled-up relational symbology isn't floating my boat on an SQL Server forum, either.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    The doubled-up relational symbology isn't floating my boat on an SQL Server forum, either.

    There are functions also as I'm sure you're already aware

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Steve Collins wrote:

    Jeff Moden wrote:

    The doubled-up relational symbology isn't floating my boat on an SQL Server forum, either.

    There are functions also as I'm sure you're already aware

    So why are you using FIDDLE instead of the Developers Edition of SQL Server?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok... I'm a little gobsmacked... I ran your code with zero changes and it's riddled with syntax errors and,yet, it ran without error.

    What is the following supposed to do for example?

    where multiple >> 1 = b.id

    Never mind... bad case of me not being caught up on reading the previous posts on this thread and being behind in my reading about the additions in SQL Server 2023. 🙁

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I also have to admit that I have no clue what something like this is supposed to be doing in T-SQL.  In another language, maybe but not in t-SQL.

    (prime << 1)

    Is there some documentation somewhere on the use of "<<" and ">>" in T-SQL?

    EDIT:  I removed all occurrences of "<<  1" and ">> 1" from your code and it ran but no longer produced the correct output.  I'm surprised (shocked, is more like it) that, especially since SSMS marked them as syntax errors, that the original code actually ran.  What are they used for and why do they appear in your code?  I can't find reference to the operators in any of the SQL Server/T-SQL documentation.

    EDIT AGAIN!:  Never mind.  I've not yet worked with "bit arrays" in SQL Server.  "<< 1" effectively multiplies by 2 and ">> 1" effectively divides by 2 (both much faster that even integer math to do the same).  In hand assembled machine language and the Assembly Language, they're known respectively as "shift left, load zeros" and "shift right, load zeroes" (as opposed to shift with wrap).

    It's also a bit funny that I'm running SSMS version 19 and it marked those bit operations as incorrect syntax.  I'll have to do an update to the latest version and see if that fixes it.

    It would have been good if I'd read your previous posts, as well, 😀 not to mention that I need to catch up on my reading for SQL Server 2022. 🙁

     

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    Ok... I'm a little gobsmacked... I ran your code with zero changes and it's riddled with syntax errors and,yet, it ran without error.

    What is the following supposed to do for example?

    where multiple >> 1 = b.id

    In SQL 2022 they introduced bit shifting operators/functions.  Jonathan said the calculations and *2+1's were possibly causing performance issues so all gone now.  The '>>' symbol is equivalent to the RIGHT_SHIFT function.  That line of code compares the "right shift by 1" of the multiple column to the id column of the bit array cte.  Bits are binary so shifting them alters the value by factors of 2.  "Right shift by 1" of the multiple column is the same as dividing it by 2.  Since these are integers the right shift operator converts the 'multiple' column to the index location of the row in bit_array_cte which contains the primality bit.

    Yes you're right about getting SQL 2022 on my machine.  I'm going to do that.  In meantime though it would be nice to know how the latest code compares to earlier attempts

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Steve Collins wrote:

    Jeff Moden wrote:

    Ok... I'm a little gobsmacked... I ran your code with zero changes and it's riddled with syntax errors and,yet, it ran without error.

    What is the following supposed to do for example?

    where multiple >> 1 = b.id

    In SQL 2022 they introduced bit shifting operators/functions.  Jonathan said the calculations and *2+1's were possibly causing performance issues so all gone now.  The '>>' symbol is equivalent to the RIGHT_SHIFT function.  That line of code compares the "right shift by 1" of the multiple column to the id column of the bit array cte.  Bits are binary so shifting them alters the value by factors of 2.  "Right shift by 1" of the multiple column is the same as dividing it by 2.  Since these are integers the right shift operator converts the 'multiple' column (which is an odd number) to the index location of the row in bit_array_cte which contains the primality bit.

    Yes you're right about getting SQL 2022 on my machine.  I'm going to do that.  In meantime though it would be nice to know how the latest code compares to earlier attempts

    Thanks, Steve.  Yeah... see my edits above, which I posted just before you did and updated a few times after that, possibly overlapping your good post above.  This old dude is seriously behind in his reading.  Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    Steve Collins wrote:

    Jeff Moden wrote:

    Ok... I'm a little gobsmacked... I ran your code with zero changes and it's riddled with syntax errors and,yet, it ran without error.

    What is the following supposed to do for example?

    where multiple >> 1 = b.id

    In SQL 2022 they introduced bit shifting operators/functions.  Jonathan said the calculations and *2+1's were possibly causing performance issues so all gone now.  The '>>' symbol is equivalent to the RIGHT_SHIFT function.  That line of code compares the "right shift by 1" of the multiple column to the id column of the bit array cte.  Bits are binary so shifting them alters the value by factors of 2.  "Right shift by 1" of the multiple column is the same as dividing it by 2.  Since these are integers the right shift operator converts the 'multiple' column (which is an odd number) to the index location of the row in bit_array_cte which contains the primality bit.

    Yes you're right about getting SQL 2022 on my machine.  I'm going to do that.  In meantime though it would be nice to know how the latest code compares to earlier attempts

    Thanks, Steve.  Yeah... see my edits above, which I posted just before you did and updated a few times after that, possibly overlapping your good post above.  This old dude is seriously behind in his reading.  Thanks.

    If they are implementing things like shift-left and shift-right maybe they should also implement unsigned integers.

Viewing 15 posts - 46 through 60 (of 77 total)

You must be logged in to reply to this topic. Login to reply