February 16, 2023 at 6:21 pm
I'm using SQL statements to query data from multiple tables and exporting the results to CSV files, then importing those files somewhere else.
Is there a way to export the results into CSV files, limiting each file to approximately 100,000 rows? Once the limit is reached, the export should create a new CSV file and increment a number in the filename. So the first file is cusexp_{datetime}_1, then cusexp_{datetime}_2, cusexp_{datetime}_3, etc.
Say the results return 1,000,000 rows. Export to CSV files of approximately 100,000 rows each, with filename format cusexp_{datetime}_{iterationNumber}. Don't split the same [Customer ID] (c.id) across multiple files, so in this case I'd end up with around 10 or 11 CSV files.
Here is a rough sample of what I'm selecting.
select
     [Customer Name]    = c.customername
    ,[Customer ID]      = c.id
    ,[Item Purchased]   = o.itemname
    ,[Item Number]      = o.itemNum
    ,[Item Description] = o.itemDesc
from dbo.mystore.customermstr c
join dbo.mystore.items o
    on c.id = o.customerid
where c.status = 'true';
February 16, 2023 at 6:29 pm
I could also insert the rows into a temp table with a counter column, then export the data from that table, but I'm not sure if that would help.
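Something like this is what I have in mind (a rough sketch only; the #export table and RowNum column are illustrative names):
-- Stage the joined results with a counter column, then export slices of it.
drop table if exists #export;
select
     [Customer Name]  = c.customername
    ,[Customer ID]    = c.id
    ,[Item Purchased] = o.itemname
    -- remaining columns omitted for brevity
    ,RowNum           = row_number() over (order by c.id)
into #export
from dbo.mystore.customermstr c
join dbo.mystore.items o
    on c.id = o.customerid
where c.status = 'true';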
March 10, 2023 at 11:40 pm
You can give these a try:
a) Import Export wizard.
b) Use Excel directly: create a connection to the instance and retrieve the data into Excel itself.
c) Use SSIS.
March 11, 2023 at 5:16 pm
I would add a RANK or DENSE_RANK to your query based on the customer ID and create a table to hold the results. You can then use that ranking column as the start/end for each file with a simple query passing in the start and end rank.
drop table if exists dbo.tmpCustomerOrders;
select
     [Customer Name]    = c.customername
    ,[Customer ID]      = c.id
    ,[Item Purchased]   = o.itemname
    ,[Item Number]      = o.itemNum
    ,[Item Description] = o.itemDesc
    ,CustomerRank       = dense_rank() over (order by c.id)
into dbo.tmpCustomerOrders
from dbo.mystore.customermstr c
join dbo.mystore.items o
    on c.id = o.customerid
where c.status = 'true';
Declare @startRank int = 1
, @endRank int = 100000;
select
     co.[Customer Name]
    ,co.[Customer ID]
    ,co.[Item Purchased]
    ,co.[Item Number]
    ,co.[Item Description]
from dbo.tmpCustomerOrders co
where co.CustomerRank >= @startRank
  and co.CustomerRank < @endRank;
Using whichever tool you want to export the results, build a loop that passes in the start and end rank for each iteration. Get the max value of the customer rank to identify when you have reached the end.
Calculate the increment to be used for each loop iteration, based on either the average number of items purchased or the total number of customers. For example, if the average number of items purchased is 2, increment by 50,000 per loop; or if the total number of customers is 120,000 and you want 12 files, increment by 10,000 per loop.
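For example, here is a minimal T-SQL sketch of that driver logic, assuming the dbo.tmpCustomerOrders table built above (the export call itself belongs to whichever tool you pick, so it appears here only as a comment):
-- Sketch only: walks the rank windows and file numbers; the actual file
-- write is done by your export tool (SSIS, bcp, Export-Csv, etc.).
declare @maxRank   int
      , @increment int
      , @startRank int = 1
      , @fileNum   int = 1;

-- Target ~100,000 rows per file: customers per file = 100,000 divided
-- by the average rows per customer.
select  @maxRank   = max(CustomerRank)
      , @increment = cast(100000.0 * max(CustomerRank) / count(*) as int)
from dbo.tmpCustomerOrders;

if @increment < 1 set @increment = 1;  -- guard against very large customers

while @startRank <= @maxRank
begin
    -- Export rows with CustomerRank >= @startRank and
    -- CustomerRank < @startRank + @increment as cusexp_{datetime}_{@fileNum}.
    set @startRank += @increment;
    set @fileNum   += 1;
end;
The increment calculation is just the "average items purchased" math above expressed in code.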
If this needs to be a repeatable process, I would use SSIS. For an ad hoc process, I would use a PowerShell 7 (or higher) script with Export-Csv.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
March 12, 2023 at 7:41 am
I'm using SQL statements to query data from multiple tables and exporting the results to CSV files, then importing those files somewhere else.
I have to ask "Why" here. What is the ultimate use of those files going to be?
--Jeff Moden
Change is inevitable... Change for the better is not.
March 12, 2023 at 12:03 pm
Based on the 100K-row limit, I would nearly bet this is to generate files to load into a MySQL instance using the "LOAD DATA LOCAL" statement (which is the wrong way to do it for very big loads!).
To the OP: can you reply to both Jeff's question and my own regarding this?