February 16, 2023 at 6:21 pm
I'm using SQL statements to query data from multiple tables and exporting the results to CSV files, then importing those files somewhere else.
Is there a way to export the results into CSV files, limiting each file to approximately 100,000 rows? Once the limit is reached, the export should create a new CSV file and increment a number in the filename. So the first file is cusexp_{datetime}_1, then cusexp_{datetime}_2, cusexp_{datetime}_3, etc.
Say the results return 1,000,000 rows. Export to CSV files of approximately 100,000 rows each, with filename format cusexp_{datetime}_{iterationNumber}. Don't split the same [Customer ID] (c.id) across multiple files, so in this case I'd end up with around 10 or 11 CSV files.
Here is a rough sample of what I'm selecting.
select
     [Customer Name]    = c.customername
    ,[Customer ID]      = c.id
    ,[Item Purchased]   = o.itemname
    ,[Item Number]      = o.itemNum
    ,[Item Description] = o.itemDesc
from dbo.mystore.customermstr c
join dbo.mystore.items o
    on c.id = o.customerid
where c.status = 'true';
February 16, 2023 at 6:29 pm
I could also insert the rows into a temp table with a counter column, then export the data from that table, but I'm not sure if that would help.
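Something like this is what I have in mind (a rough sketch only; the #export table and RowNum column are illustrative names):
-- Stage the joined results with a counter column, then export slices of it.
drop table if exists #export;
select
     [Customer Name]  = c.customername
    ,[Customer ID]    = c.id
    ,[Item Purchased] = o.itemname
    -- remaining columns omitted for brevity
    ,RowNum           = row_number() over (order by c.id)
into #export
from dbo.mystore.customermstr c
join dbo.mystore.items o
    on c.id = o.customerid
where c.status = 'true';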
March 10, 2023 at 11:40 pm
You can give these a try:
a) Import Export wizard.
b) Use Excel directly: create a connection to the instance and retrieve the data into Excel itself.
c) Use SSIS.
March 11, 2023 at 5:16 pm
I would add a RANK or DENSE_RANK to your query based on the customer ID and create a table to hold the results. You can then use that ranking column as the start/end for each file with a simple query passing in the start and end rank.
drop table if exists dbo.tmpCustomerOrders;
select
     [Customer Name]    = c.customername
    ,[Customer ID]      = c.id
    ,[Item Purchased]   = o.itemname
    ,[Item Number]      = o.itemNum
    ,[Item Description] = o.itemDesc
    ,CustomerRank       = dense_rank() over (order by c.id)
into dbo.tmpCustomerOrders
from dbo.mystore.customermstr c
join dbo.mystore.items o
    on c.id = o.customerid
where c.status = 'true';
Declare @startRank int = 1
, @endRank int = 100000;
select
     co.[Customer Name]
    ,co.[Customer ID]
    ,co.[Item Purchased]
    ,co.[Item Number]
    ,co.[Item Description]
from dbo.tmpCustomerOrders co
where co.CustomerRank >= @startRank
  and co.CustomerRank < @endRank;
Using whichever tool you want to export the results, build a loop that passes in the start and end rank for each iteration. Get the max value of the customer rank to identify when you have reached the end.
Calculate the increment to be used for each loop iteration, based on either the average number of items purchased or the total number of customers. For example, if the average number of items purchased is 2, increment by 50,000 per loop; or if the total number of customers is 120,000 and you want 12 files, increment by 10,000 per loop.
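For example, here is a minimal T-SQL sketch of that driver logic, assuming the dbo.tmpCustomerOrders table built above (the export call itself belongs to whichever tool you pick, so it appears here only as a comment):
-- Sketch only: walks the rank windows and file numbers; the actual file
-- write is done by your export tool (SSIS, bcp, Export-Csv, etc.).
declare @maxRank   int
      , @increment int
      , @startRank int = 1
      , @fileNum   int = 1;

-- Target ~100,000 rows per file: customers per file = 100,000 divided
-- by the average rows per customer.
select  @maxRank   = max(CustomerRank)
      , @increment = cast(100000.0 * max(CustomerRank) / count(*) as int)
from dbo.tmpCustomerOrders;

if @increment < 1 set @increment = 1;  -- guard against very large customers

while @startRank <= @maxRank
begin
    -- Export rows with CustomerRank >= @startRank and
    -- CustomerRank < @startRank + @increment as cusexp_{datetime}_{@fileNum}.
    set @startRank += @increment;
    set @fileNum   += 1;
end;
The increment calculation is just the "average items purchased" math above expressed in code.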
If this needs to be a repeatable process, I would use SSIS. For an ad hoc process, I would use a PowerShell 7 (or higher) script with Export-Csv.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
March 12, 2023 at 7:41 am
I'm using SQL statements to query data from multiple tables and exporting the results to CSV files, then importing those files somewhere else.
I have to ask "Why" here. What is the ultimate use of those files going to be?
--Jeff Moden
Change is inevitable... Change for the better is not.
March 12, 2023 at 12:03 pm
Based on the 100K-row limit, I would nearly bet this is to generate files to load into a MySQL instance using the "LOAD DATA LOCAL" statement (which is the wrong way to do it for very big loads!).
To the OP: can you reply to both Jeff's question and my own regarding this?