'N' number of usage - Tally Table

  • Jeff Moden (10/10/2008)


    ggraber (10/10/2008)


    Nope, I'm not 100% sure I know what you mean by a "Tally CTE." Sounds like you're gonna create a tally table on the fly. Which sounds pretty performance intensive to me.

    That's exactly correct... and, compared to a While loop, it's still blazingly fast. In fact, if you hold your mouth just right while building it, it's very comparable to and sometimes beats a Tally table. Take a look at the following article...

    http://www.sqlservercentral.com/articles/T-SQL/63003/

    ... and do a search within the article for "CTE".

    That's pretty amazing! Thanks Jeff!

  • Jeff Moden (10/10/2008)


    rbarryyoung (10/10/2008)


    And what bothers me most about this is that I know that it is not the extra formatting that is killing my speed, but the built-in formatting that BCP already does. It has always seemed to me that it would be a simple matter for BCP or BULK INSERT to be moderately enhanced to handle the real formatting requirements of a CSV but for some reason, Microsoft has never seen fit to do it.

    I actually said the very same thing at work whilst training some folks, today. If bloody Excel and DTS can do it so easily, why can't BCP? It's been years, but I think I'm gonna have to write an EXE.

    Hey... wait a minute... is there an available DLL in Excel or DTS that could be used in either a CLR or as an extended stored proc?

    (cough)SSIS (cough)

    Sure you could go "steal" or borrow a DLL from those utilities - but why reinvent the wheel? There are lots of ways to get those in. Of course - none of them deal with ragged files particularly well it at all.

    This is one where you don't get a free lunch. I think you end up having to scrub the file first, so that it's "properly formatted" (consistent number of columns being one of the requirements). Whether you then upload WHILE you're scrubbing, or you do those in two stages is an implementation decision (although I found the "two stage" process to perform quicker in my case, due to the ability to bulk load).

    And - given some of the "help" Excel tries to provide (often overriding the data types, and just using whatever it guesses it to be), I'd steer clear of that one. Access' was much better IMO.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • UPDATE: So I spent today optimizing my CSVImporter and I have made some headway: previously it was running about 30-40% as fast as BCP, noow it is running about 60-70% as fast as BCP. It addition to BCP-compatible CSV files at can also handle quoted & unquoted string columns, commas and line-breaks embedded in quotes and rows with ragged-right columns.

    I still haven't addressed the the single-string size limitation yet (right now set at about 1MB). I figure that I'll address that if I ever get it to run fast enough.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • rbarryyoung (10/9/2008)


    Jeff Moden (10/9/2008)


    True enough... but have you ever see it used for anything other than a delimiter? 🙂

    I've seen it used for emoticons-> :Þ

    Jeff Moden (10/9/2008)


    Heh... I agree... but in a data file?

    This is too funny. Check out the screen shot of the Excel SQL application: http://www.sqlservercentral.com/Forums/Topic584077-386-1.aspx?Update=1. (I note that Excel to/from SQL applications is a common source of CSV Import files).

    Is that some kind or weird synchronicity or what?!? :w00t:

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Absolutely no comment on that stupidity.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • In a lame attempt to get this thread back on track, I've used Jeff's Tally table to html encode and decode some text for html and xml outputs:

    SELECT dbo.HTMLEncode('é')

    SELECT dbo.HTMLEncode('happinéss ës a warm blankëtþ') --note the delimiter to end the string is a thorn!

    ALTER FUNCTION HTMLEncode(@OriginalText VARCHAR(8000))

    RETURNS VARCHAR(8000)

    BEGIN

    DECLARE @CleanedText VARCHAR(8000)

    SELECT @CleanedText = ISNULL(@CleanedText,'') +

    CASE

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 32 THEN ' '

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 34 THEN '"'

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 38 THEN '&'

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 60 THEN '<'

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 62 THEN '>'

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 32 AND 127 THEN SUBSTRING(@OriginalText,Tally.N,1)

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) <= 32 THEN '&#' + CONVERT(VARCHAR,ASCII(SUBSTRING(@OriginalText,Tally.N,1)) ) + ';'

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) >= 128 THEN '&#' + CONVERT(VARCHAR,ASCII(SUBSTRING(@OriginalText,Tally.N,1)) ) + ';' END

    FROM dbo.Tally WHERE Tally.N <= LEN(@OriginalText)

    RETURN @CleanedText

    END

    ALTER FUNCTION StripNonNumeric(@OriginalText VARCHAR(8000))

    RETURNS VARCHAR(8000)

    BEGIN

    DECLARE @CleanedText VARCHAR(8000)

    SELECT @CleanedText = ISNULL(@CleanedText,'') +

    CASE

    --ascii numbers are 48(for '0') thru 57 (for '9')

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 48 AND 57 THEN SUBSTRING(@OriginalText,Tally.N,1) END

    FROM dbo.Tally WHERE Tally.N <= LEN(@OriginalText)

    RETURN @CleanedText

    END

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • rbarryyoung (10/26/2008)


    rbarryyoung (10/9/2008)


    Jeff Moden (10/9/2008)


    True enough... but have you ever see it used for anything other than a delimiter? 🙂

    I've seen it used for emoticons-> :Þ

    Jeff Moden (10/9/2008)


    Heh... I agree... but in a data file?

    This is too funny. Check out the screen shot of the Excel SQL application: http://www.sqlservercentral.com/Forums/Topic584077-386-1.aspx?Update=1. (I note that Excel to/from SQL applications is a common source of CSV Import files).

    Is that some kind or weird synchronicity or what?!? :w00t:

    Here at SQLServerCentral, we ARE the critical mass of humanity's thought processes . . .

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

Viewing 7 posts - 61 through 66 (of 66 total)

You must be logged in to reply to this topic. Login to reply