How to skip the records

  • Hello

    I am not able to write a SQL query which give me to skip certain records. ex. i want to select every 3rd records from the table or some time every 5th record from table and so on. This skip factor i.e. how many records to skip, will be defined by user. Please help me in writing such SQL statement.

    Thanks and Regards

    Rakesh


    rakesh

  • Why not create a table that first examines the table you are going to pull data from and populate a MAX VARIABLE with the # of records.

    Then create a TEMP TABLE and populate it with whatever FACTOR that the end-user wants i.e.  Record 1 would be 3, Record 2 would be Record 1 + FACTOR, etc...

    This could be done in a WHILE LOOP LIKE this SAMPLE:

    DECLARE @i AS INTEGER

    DECLARE @2 AS INTEGER

    DECLARE @m AS INTEGER

    SET     @i = 3

    SET     @2 = @i

    SET     @m = 92

    WHILE @2 < @m

      BEGIN

        PRINT @2

        SET @2 = @2 + @i

      END

    You could then JOIN this X-REF table to your table to retrieve data based on the ID column (ASSUMPTION)



    Good Hunting!

    AJ Ahrens


    webmaster@kritter.net

  • Loops AJ?? Is that how you want to cross the big K????

    Declare @Sample as int

    Declare @OffSet as int

    SET @Sample = 3

    SET @OffSet = 1

    SELECT

    O1.id, O1.Name, count(*) as Rank

    FROM dbo.SysObjects O1 inner join dbo.SysObjects O2 ON O2.id <= O1.id

    GROUP BY O1.id, O1.Name

    HAVING(count(*) - 1) % @Sample = @OffSet

    ORDER BY O1.id

    @Sample is for every record in @Sample, 3 in this case.

    @OffSet is to change the sample. If the offset is 0 you'll get the records 1,4,7... If the offset is 1 you'll get the records 2,5,8...

    Make sure that the id in the join is INDEXED, clustered if possible.

  • Nice Remi.

    Don't know what in the world a user would need with displaying every 3 row in a dataset?

    But nice solution to it.

  • Usually for sampling. Something like an election where you want to get an estimate every X minute without recompiling everything. But there are always other methods to do that which are faster than this one .

  • ahh, okay, I guess I'm just fortunate having not worked on such an application.

  • I never had to either... but I can see the need. Imagine that the server can't keep up with the demand and has to queue to inserts of the votes (imagine they come in has flat text files). You could load those records in a temp table, sample the data there and ship the sample results in a stats table that could show a pretty accurate picture live.

  • Many statistics in Control systems applications are heavily based on the sampling period and it make sense a lot for simulation purposes also!

     


    * Noel

  • Do you have other exemples?

  • I can give you plenty within the realm of Control systems but out side of it I am not sure

     


    * Noel

  • I still want more details... I never worked with those kind of systems.

  • In data adquisition systems (SCADAS) the sampling period can be used to "filter" the noise of slowly moving variables like temperature. You usually do not have the luxury of a 3GHz processor to perform digital filtering nor all shops count with FPGA to build a Hardware prototype of the filter, therefore a very cheap method of determining what will be a "good" sampling rate for a cheap processor is to generate the output for the various speeds you have at hand... Hope this doesn't sound too dense

     

    Another application is to determine how much variation you get from a reference value (stability study) and the samplig rate comes into place again depending on how much error do you need to tolerate or reject

     

    There are others for more heavily math involved but I had very rarely used them (not a math guy ) and they completly transforms equations but I don't even want to go there

    Cheers! 


    * Noel

  • Though I suspect that a selection of records using a random function might be better. Selecting every n seems vulnerable to unintentional biases.

     

    ...

    -- FORTRAN manual for Xerox Computers --

  • That's why I added an offset parameter.

    But this is always a valid option :

    Select top 33 percent from dbo.MyTable order by Newid(), this will be a true random sampling but that's not what was asked in this case .

  • Remi,

    Here I thought my solution was ingenious and then (as usual) I see your's and go WOW.....



    Good Hunting!

    AJ Ahrens


    webmaster@kritter.net

Viewing 15 posts - 1 through 15 (of 33 total)

You must be logged in to reply to this topic. Login to reply