extract all data with week

  • and this is a lot easier to do in Powershell - which could be called from SSIS directly if "boss" insists in SSIS

    variables could be passed from ssis or from a command line/file

    $SQLServer = "servername"  
    $DBName = "dbname"

    $weeks = @(1,2,3,4,5,6,7,52,53) # put all weeks here -- this could also easily be built with another sql
    $yearfrom = "2020-01-01"
    $yearto = "2020-01-01"
    $exportfile = "c:\temp\datafile_--week--.csv"

    foreach ($week in $weeks)
    {
    $SqlQuery = "SELECT * from tablename
    where mydatefield >= convert(datetime, '$($yearfrom)')
    and mydatefield < convert(datetime, '$($yearto)')
    and datepart(week, mydatefield) = $($week)"
    Invoke-Sqlcmd -ServerInstance $SQLServer -Database $DBName -Query $SqlQuery | Export-CSV -Path ($ExportFile).Replace("--week--", $week) -NoTypeInformation
    }
  • Thom A wrote:

    Lots of us can help you, but you need to help us help you. What have you completed so far? Why isn't what you've done working, or where have you got stuck?

    P.s. who is"he"?

    I assumed that "he" was the one year old, who wanted the history of a "table over", whatever that is.  Smart kid.

     

  • frederico_fonseca wrote:

    and this is a lot easier to do in Powershell - which could be called from SSIS directly if "boss" insists in SSIS

    variables could be passed from ssis or from a command line/file

    Frederico, this is a very simple way of achieving the desired outcome, good work!

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • You can do this in Powershell:

    $weeks = @(); 1..53 | ForEach {$weeks += $_};

    Another method would be:

    $yearfrom = "2020-01-01"
    $yearto = "2020-01-01"
    $exportfile = "c:\temp\datafile_--week--.csv"

    1..53 | foreach
    {
    $SqlQuery = "SELECT * from tablename
    where mydatefield >= convert(datetime, '$($yearfrom)')
    and mydatefield < convert(datetime, '$($yearto)')
    and datepart(week, mydatefield) = $($week)"
    Invoke-Sqlcmd -ServerInstance $SQLServer -Database $DBName -Query $SqlQuery | Export-CSV -Path ($ExportFile).Replace("--week--", $week) -NoTypeInformation
    }

    And - you can also do something like this:

    $yearfrom = "2020-01-01"
    $yearto = "2020-01-01"

    1..53 | foreach {
    $exportFile = "C:\Temp\DataFile_$($_).csv";

    $SqlQuery = "SELECT * from tablename
    where mydatefield >= convert(datetime, '$($yearfrom)')
    and mydatefield < convert(datetime, '$($yearto)')
    and datepart(week, mydatefield) = $($week)"
    Invoke-Sqlcmd -ServerInstance $SQLServer -Database $DBName -Query $SqlQuery | Export-CSV -Path $exportFile -NoTypeInformation
    }

     

     

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • lots of ways to skin a cat 🙂

  • My question would be... has anyone run the PowerShell code to make sure it works (with replacements for the table name,  etc, of course)?  I ask because I'm no PowerShell Ninja but I'd love to test this code for performance compared to another method when I get the chance.  Thanks, folks.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  •  

    I ran a test against your JBMTest table - here is the code that was executed and the Get-Date output.

    Get-Date;

    $yearFrom = "2019-01-01";
    $yearTo = "2020-01-01";

    1..53 | foreach {
    $exportFile = "C:\Temp\DataFile_$($_).csv";

    $sqlQuery = "Select *
    From dbo.JBMTest
    Where SomeDateTime >= convert(datetime, '$($yearFrom)')
    And SomeDateTime < convert(datetime, '$($yearTo)')
    And datepart(iso_week, SomeDateTime) = $($_)";

    Invoke-Sqlcmd -ServerInstance $SQLServer -Database $DBName -Query $sqlQuery | Export-CSV -Path $exportFile -NoTypeInformation
    }

    Get-Date;

    Monday, March 30, 2020 05:05:32 PM
    Monday, March 30, 2020 05:05:47 PM

    All 53 files were created in about 15 seconds with an median file size of 2,590KB.  The largest file size was 2,959KB and the smallest file size was 2,522 (the last file was 0KB - no data selected).

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • was about to post the test code - Jeffrey second option has a typo which he corrected on the post above ( = $($_)" - was $($week)") so it would fail)

    and iso_week probably more correct here so good you identified that change

    but my code also works as expected once we add all required week numbers

  • Jeffrey Williams wrote:

    I ran a test against your JBMTest table - here is the code that was executed and the Get-Date output.

    All 53 files were created in about 15 seconds with an median file size of 2,590KB.  The largest file size was 2,959KB and the smallest file size was 2,522 (the last file was 0KB - no data selected).

    would you mind posting the code to create the JBMtest table you used - curious to see how long it take on my pc.

     

  • Awesome.  Thank you both.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The JBMTest table code is on the first page of this thread a little more than half way down but here it is again just to make things easy.

    --===== If the test table already exists, drop it to make reruns in SSMS easier.
    -- The GOs are to make it so you don''t get errors if you need to add or delete columns.
    IF OBJECT_ID('dbo.JBMTest','U') IS NOT NULL
    DROP TABLE dbo.JBMTest
    ;
    GO
    --===== Create and populate a large test table on-the-fly.
    -- "SomeInt" has a range of 1 to 50,000 numbers
    -- "SomeLetters2" has a range of "AA" to "ZZ"
    -- "SomeDecimal" has a range of 10.00 to 100.00 numbers
    -- "SomeDate" has a range of >=01/01/2010 & <01/01/2030 whole dates
    -- "SomeDateTime" has a range of >=01/01/2010 & <01/01/2030 Date/Times
    -- "SomeRand" contains the value of RAND just to show it can be done without a loop.
    -- "SomeHex9" contains 9 hex digits from NEWID()
    -- "SomeFluff" is a fixed width CHAR column just to give the table a little bulk.
    SELECT TOP 10000000
    SomeInt = ABS(CHECKSUM(NEWID())%50000) + 1
    ,SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
    + CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
    ,SomeDecimal = CAST(RAND(CHECKSUM(NEWID())) * 90 + 10 AS DECIMAL(9,2))
    ,SomeDate = DATEADD(dd, ABS(CHECKSUM(NEWID())%DATEDIFF(dd,'2010','2030')), '2010')
    ,SomeDateTime = DATEADD(dd, DATEDIFF(dd,0,'2010'), RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2010','2030'))
    ,SomeRand = RAND(CHECKSUM(NEWID())) --CHECKSUM produces an INT and is MUCH faster than conversion to VARBINARY.
    ,SomeHex9 = RIGHT(NEWID(),9)
    ,SomeFluff = CONVERT(CHAR(170),'170 CHARACTERS RESERVED') --Just to add a little bulk to the table.
    INTO dbo.JBMTest
    FROM sys.all_columns ac1 --Cross Join forms up to a 16 million rows
    CROSS JOIN sys.all_columns ac2 --Pseudo Cursor
    GO
    --===== Add a non-unique Clustered Index to SomeDateTime for this demo.
    CREATE CLUSTERED INDEX IXC_Test ON dbo.JBMTest (SomeDateTime ASC)
    ;
    --===== Display the generated limits of the data in the SomeDateTime column.
    SELECT LoDateTime = MIN(SomeDateTime), HiDateTime = MAX(SomeDateTime) FROM JBMTest
    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GaryV wrote:

    Phil Parkin wrote:

    Jeff Moden wrote:

    GaryV wrote:

    If you don't understand these steps, I'd suggest finding a tutorial on SSIS that includes setting variables, ForEach and Connection Managers.

    That's why I'm asking people to write a definitive article for this on this site... a lot of people summarize steps like these but they don't help beginners because steps like "Create a data flow with a query that ... " might as well be written as "<insert miracle here>".

    I know what you mean. To go into that level of detail for a requirement such as this would create a very long article, however. I wonder what percentage of people would make it all the way through. A similar number to those who made it to the end of A Brief History of Time, I suspect, in today's ADD world.

    It's always interesting how different developers approach the same problem. My (imagined) solution bears a fair resemblance to that described above, but it would not be the same. For performance reasons, I avoid querying SQL Server when I can, so I would not do this, for example: "Create a query that gives the list of dates for which files are desired. Put the result set into an object." Instead, I'd use functions and calls within SSIS to achieve a similar outcome.

    I would not have the confidence in my abilities to call any solution I create 'definitive'. Instead, I create stuff which works as quickly and elegantly as my knowledge allows. There are many smarter developers out there whose own solutions would be 'more definitive' than mine.

    Given the OP's lack of response, I don't know if this is still being sought.  But I agree with Phil, without an example to use, a generic article on how to do something like this would get quite intensive.

    And yes, there's many ways to skin this cat.

    Heh... I agree that without a defining example, it would be a tough nut to crack.  If you go back to my post right after I posted code that makes a shedload of test data, you'll find a defining example to match. 😀

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • thanks Jeff.

    so a bit slower on my pc - I7 4 core , 16GB ram, SSD drive - 29 seconds for same criteria.

    on this type of multiple files output, and where possible, I have experience with some teams using AWK to split the files - beating up using SQL for big number of files (had 1 process outputting 800 files from same table - diff criteria- AWK would take a few seconds compared to a loop similar to the one we did which would take 20 min+)

  • Thanks, Frederico.  I aware of AWK but, pun intended, I've never used it before so I'm a bit AWKward there. 😀

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here's something funny (not really)... I don't know how to run the PowerShell examples from the command prompt.  I've done it plenty of times building a string in T-SQL and making a call to PowerShell to execute it, but I've never executed PoSh from the command prompt.  Help me learn something new here?

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 34 total)

You must be logged in to reply to this topic. Login to reply