How to select 200k rows faster in SQL

  • jaquino012 (12/14/2015)


    Hi,

    @Jeff Moden

    Yes, I have been working on a simple dictionary with a limited set of words and it is working fine.

    But what if the word list is so big that I can't query it in SQL? That is why I was asking if there is a better way to load 300k words from SQL to my front end.

    And my last resort is to put it in a text file, if that is possible.

    Unless you actually need offline access to the word list, the best thing to do is to leave it in SQL Server. That'll also avoid having to distribute the word list and keep distributed copies "in sync".

    A 300k word list isn't a very large table, though. If push came to shove, you could add SQL Express wherever you want to distribute the list and use that bit of power instead of reinventing the wheel to do lookups on a text file.
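
    As a concrete sketch of that setup (the dbo.WordList name and the varchar(50) width are assumptions, not from the thread), the list stays server-side and each spell check is one parameterized round trip:

        -- Hypothetical table for the 300k-word dictionary
        CREATE TABLE dbo.WordList
        (
            Word varchar(50) NOT NULL   -- assumed maximum word length
        );

        -- One spell-check call from the front end: does the word exist?
        DECLARE @WordToCheck varchar(50) = 'example';

        SELECT 1
        FROM dbo.WordList
        WHERE Word = @WordToCheck;   -- a singleton seek once the unique
                                     -- index discussed below is in place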

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ScottPletcher (12/14/2015)


    Jeff Moden (12/12/2015)


    ScottPletcher (12/10/2015)


    Do you have a unique index on Word? You don't want SQL to have to sort the words every time -- that will be extremely slow. Also, if you allow ad-hoc inserts to the table, be sure to set the fillfactor below 100, and rebuild that index whenever it gets even 10+% fragmented, since you're looking for very good speed.
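
    A minimal sketch of that advice, reusing the hypothetical dbo.WordList table and assuming an index name of IX_WordList_Word:

        -- Unique index on Word, leaving room on each page for ad-hoc inserts
        -- (the thread goes on to debate the exact Fill Factor value).
        CREATE UNIQUE NONCLUSTERED INDEX IX_WordList_Word
            ON dbo.WordList (Word)
            WITH (FILLFACTOR = 90);

        -- Rebuild once fragmentation passes roughly 10%:
        IF EXISTS (SELECT 1
                   FROM sys.dm_db_index_physical_stats
                            (DB_ID(), OBJECT_ID('dbo.WordList'), NULL, NULL, 'LIMITED')
                   WHERE avg_fragmentation_in_percent > 10)
            ALTER INDEX IX_WordList_Word ON dbo.WordList REBUILD;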

    If new words are added only infrequently, using a Fill Factor of anything other than 100 would be a source of slowness and unneeded resource usage. For example, if you use a Fill Factor of 90, then you must read 10 pages to get 9 pages' worth of content. Of course, that only matters for scans and not singleton seeks... but then the Fill Factor and level of fragmentation don't actually matter for singleton seeks anyway.
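
    To make that arithmetic explicit: at a 90% Fill Factor each page is only nine-tenths full, so 9 pages' worth of rows occupy 10 pages:

        SELECT CEILING(9.0 / 0.90) AS PagesToRead;   -- 9 pages of content / 0.90 = 10 pages read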

    Using a fillfactor of 100 is extraordinarily risky for such a short row if, for example, there might ever be a trigger on this table or snapshot isolation on the db. Combined with the chance of a few random insertions, 98 or 99 seems more reasonable to me. I noted above that 10% free space is too high for large tables, but 0% has very real risks too, except on 99.9% read-only tables.

    Gosh. I don't know why anyone would put a trigger on a word-lookup table, nor why a 100% Fill Factor would cause any problems with a trigger. I also don't understand how snapshot isolation would cause problems with a 100% Fill Factor on a table that's mostly read-only.

    I will kowtow and agree that a 98% Fill Factor would probably be OK.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/15/2015)


    Gosh. I don't know why anyone would put a trigger on a word-lookup table, nor why a 100% Fill Factor would cause any problems with a trigger. I also don't understand how snapshot isolation would cause problems with a 100% Fill Factor on a table that's mostly read-only.

    I will kowtow and agree that a 98% Fill Factor would probably be OK.

    My concern is the 14-byte pointer that gets added to each row when an update (or delete) is done under snapshot isolation, or gets added when an after trigger is created on the table. I, too, can't predict if/when a trigger might be needed, even briefly, but, I'd really want to avoid having every page split if it did.
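
    A sketch of the headroom being argued for here, reusing the hypothetical names from above; the effect of the 14-byte version pointer can be watched in the record-size columns of sys.dm_db_index_physical_stats:

        -- Leave ~2% free per page so version pointers don't force page splits:
        ALTER INDEX IX_WordList_Word ON dbo.WordList
            REBUILD WITH (FILLFACTOR = 98);

        -- Record sizes (which grow by 14 bytes per versioned row):
        SELECT index_id, avg_record_size_in_bytes, max_record_size_in_bytes
        FROM sys.dm_db_index_physical_stats
                 (DB_ID(), OBJECT_ID('dbo.WordList'), NULL, NULL, 'DETAILED');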

    SQL DBA, SQL Server MVP (07, 08, 09)
    "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." -- "Norm", on "Cheers"
    Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies."

  • I'll say that I think Access vs. SQL is highly dependent on what you do. If you're checking word by word, the round trips and the load on the server might make SQL Server less attractive than an offline file.

    Access essentially uses the JET engine, or you could use a SQL Express variant like LocalDB that gives you local storage with indexing. Indexing is what makes this fast compared to flat files.

    If you update the dictionary often, then maybe it makes sense to keep it in SQL Server. If you are not updating often and have a lot of spell-check calls, it might make sense to keep a copy of the table's data in Access/LocalDB (one way to seed such a copy is sketched below).

    Workload pattern will matter a lot here. As will network reliability and speed.
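
    If the local-copy route wins out, one way to seed a LocalDB copy (a sketch: the database, file path, and table name are assumptions, and the instance name is the LocalDB default):

        -- Run against LocalDB, e.g.:  sqlcmd -S (localdb)\MSSQLLocalDB -d WordDb -i seed.sql
        CREATE TABLE dbo.WordList
        (
            Word varchar(50) NOT NULL
                CONSTRAINT PK_WordList PRIMARY KEY CLUSTERED   -- indexed, unlike a flat file
        );

        -- Assuming one word per line in an exported text file:
        BULK INSERT dbo.WordList
        FROM 'C:\data\words.txt'
        WITH (ROWTERMINATOR = '\n');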

