November 6, 2019 at 6:35 am
Hi
Seeking help on the above question. Because of this question I failed my first interview. Please also provide a couple of examples of how to achieve this result; it would be a big help for my research.
Thanks in advance to anyone who answers!
November 6, 2019 at 8:48 am
That's an easy one to search for yourself, and you'll learn more that way.
Thanks
November 6, 2019 at 9:02 am
What was your answer? Do you know why it was wrong?
What's the job? (I assume it's a job interview you failed...)
Thomas Rushton
blog: https://thelonedba.wordpress.com
November 6, 2019 at 9:40 pm
And... was that all there was to the question? Did they identify the source of the data? Did they identify any limits or extenuating circumstances?
Better than that, did you have anything on your resume that said you know how to work with lots of data or had tables that contained millions of rows or knew how to do ETL or import data or what?
I also have to agree with the others. "Research" means finding stuff out on your own... not having others provide answers to interview questions for you.
I will give you a starting point, though... unless there are some currently unknown limits or additional circumstances, the way to insert a million rows is the same way to insert just one. Limits and additional circumstances will cause variations on that theme.
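To make that less abstract, here's a minimal sketch (the table and column names are made up for illustration) showing that the statement shape doesn't change with the row count:
-- A minimal sketch: the INSERT is the same whether the source holds
-- one row or a million (dbo.SourceTable and dbo.TargetTable are hypothetical).
INSERT INTO dbo.TargetTable (CustomerId, OrderDate, Amount)
SELECT CustomerId, OrderDate, Amount
FROM dbo.SourceTable; -- 1 row or 1,000,000 rows: same statement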
Now that you know that, all you have to do now is be prepared to discuss the many variations. And, if that's all the information they gave you for the question, then they may have dinged you for not asking about limits and additional circumstances.
My answer to such a simply stated question with no additional information offered would have started with "It Depends", followed by the litany of limits, circumstances, and the effects each would have on the code and what the code should contain.
While this type of question might seem a bit unfair, if you were interviewing for a senior position, there are no requirements on the part of the interviewers to be fair because they're looking for the best candidate they can get for the money.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 6, 2019 at 10:58 pm
Here are two methods:
-- Method 1: row-by-row loop (slow; runs a million separate single-row INSERTs)
IF OBJECT_ID('tempdb..#SlowInserts','U') IS NOT NULL
BEGIN
    DROP TABLE #SlowInserts;
END;
GO
CREATE TABLE #SlowInserts (Id int IDENTITY(1,1) NOT NULL, Comment varchar(50) NOT NULL);
GO
SET NOCOUNT ON;
DECLARE @i int = 0;
WHILE @i < 1000000
BEGIN
    INSERT INTO #SlowInserts (Comment)
    SELECT NEWID(); -- NEWID() implicitly converts to a 36-character string
    SET @i += 1;
END;
GO
SELECT COUNT(*) FROM #SlowInserts; -- should return 1000000
-- Method 2: one set-based INSERT (fast)
IF OBJECT_ID('tempdb..#FastInserts','U') IS NOT NULL
BEGIN
    DROP TABLE #FastInserts;
END;
GO
CREATE TABLE #FastInserts (Id int IDENTITY(1,1) NOT NULL, Comment varchar(36) NOT NULL);
GO
-- X1 is a 10-row set; six cross joins of it yield 10^6 = 1,000,000 rows
WITH X1 AS (SELECT * FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) T(C))
INSERT INTO #FastInserts (Comment)
SELECT TOP (1000000) NEWID()
FROM X1 A, X1 B, X1 C, X1 D, X1 E, X1 F;
GO
SELECT COUNT(*) FROM #FastInserts; -- should return 1000000
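The difference is that the first method issues a million separate single-row INSERTs while the second generates all million rows in a single set-based statement. To see the gap for yourself, one quick way (a sketch; wrap it around either method's INSERT batch in SSMS) is:
SET STATISTICS TIME ON; -- elapsed and CPU times appear on the Messages tab
-- run either method's INSERT batch here
SET STATISTICS TIME OFF;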
November 7, 2019 at 7:50 am
Re Jeff's comment "Did they identify any limits or extenuating circumstances?"
I was tasked with importing over 15,000,000 rows of data, first having to delete a massive amount of existing data. The problem was that the insert had to be complete before staff started work that day. The insert was overrunning and causing problems; the solution was to drop the indexes, insert the data, then rebuild the indexes.
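The pattern was roughly along these lines (a sketch only; the table and index names are made up, and disabling/rebuilding the nonclustered indexes is one way to script the drop-and-rebuild idea):
-- Disable the nonclustered indexes before the big insert
-- (don't disable the clustered index; that makes the table unreadable)
ALTER INDEX IX_Transactions_OutletId ON dbo.Transactions DISABLE;

-- Do the large insert; TABLOCK helps SQL Server minimally log it
-- under the right recovery model
INSERT INTO dbo.Transactions WITH (TABLOCK) (OutletId, TranDate, Amount)
SELECT OutletId, TranDate, Amount
FROM dbo.Transactions_Staging;

-- REBUILD also re-enables any disabled indexes
ALTER INDEX ALL ON dbo.Transactions REBUILD;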
November 7, 2019 at 10:52 pm
Re Jeff's comment "Did they identify any limits or extenuating circumstances?"
I was tasked with importing over 15,000,000 rows of data, first having to delete a massive amount of existing data. The problem was that the insert had to be complete before staff started work that day. The insert was overrunning and causing problems; the solution was to drop the indexes, insert the data, then rebuild the indexes.
Cool... I love this kind of feedback.
Just curious... you say that you imported over 15 million rows of data but that you first had to delete a massive amount of existing data. That prompts me to ask some additional questions...
--Jeff Moden
Change is inevitable... Change for the better is not.
November 9, 2019 at 3:21 pm
p.s. The questions I asked above (and possibly more) are the kind of questions (obviously not identical, because there were no deletes) that I would have asked an interviewer who posed the simple question of "How to insert a million records into a table?" with no amplifying information... except that the first two words of my reply would have been "It Depends".
--Jeff Moden
Change is inevitable... Change for the better is not.
November 9, 2019 at 8:31 pm
Jeff, thanks for that... getting a "cool" from you, wow. The problem was that we had to get all till transactions from a large group of outlets; in case there was any breakdown in an outlet's internet connection, the idea was to delete several days' transactions and reload the data. As for how many rows there were afterwards, I honestly cannot remember (this was 2010). It was a clustered index (no way would we have a heap) and, if I remember correctly, we had more than one index. Re your points 5 & 6, as I was only involved in writing the SSIS package for the import, I cannot comment on those.
November 10, 2019 at 12:07 am
Thanks, Kev but... darn it all. I was hoping you remembered more details because it sounds like a wicked interesting problem and I was going to set something up to explore the given method and some of my own. One of the first things I cut my teeth on (circa '96) in SQL was loading shedloads of telephone data.
Anyway, thank you again for the kind feedback and the information that you did remember.
heh... p.s. I never used DTS or SSIS for any of it. I didn't even know such a thing existed back then, and that might not be all bad. 😀 A large part of many jobs in the years after that was to replace SSIS jobs with T-SQL jobs.
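For example, the sort of simple SSIS file-import data flow I'm talking about can often be replaced by a single BULK INSERT (a sketch; the path, table, and file layout here are made up):
BULK INSERT dbo.TillStaging
FROM 'C:\Imports\transactions.csv'
WITH (
    FIRSTROW = 2, -- skip the header row
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK -- allows a minimally logged load under the right recovery model
);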
--Jeff Moden
Change is inevitable... Change for the better is not.