INSERT Statement - How to Handle to Avoid Thousands

  • I have a table with the following four columns:

    [ProjectType]

    [ProjectSubType]

    [ProjectCategory]

    [ProjectSubCategory]

    I need to perform an INSERT statement of the following:

    ProjectType: "Infrastructure"

    ProjectSubType: "ALL"

    ProjectCategory: "Hardware"

    ProjectSubCategory: "RAM"

    The problem here is: when ProjectSubType says "ALL", this isn't an actual value in the database, but what is meant is we need to add a record for every ProjectSubType where ProjectType = "Infrastructure".

    As such, this query: SELECT COUNT(*) FROM [ProjectType] WHERE ProjectType = 'Infrastructure'

    Returns about 4400 records.

    I am sure there is a way to do this without the need for 4400 different INSERT Statements, but the problem is, I am not sure how to go about doing this. Would anyone be able to help?

  • Maybe something like this:

    DECLARE @ProjectType varchar(20) = 'Infrastructure'

    , @ProjectSubType varchar(20) = 'ALL'

    , @ProjectCategory varchar(20) = 'Hardware'

    , @ProjectSubCategory varchar(20) = 'RAM'

    INSERT INTO SomeTable

    SELECT *

    FROM [ProjectType]

    WHERE (ProjectType = @ProjectType OR @ProjectType = 'ALL')

    AND (ProjectSubType = @ProjectSubType OR @ProjectSubType = 'ALL')

    AND (ProjectCategory = @ProjectCategory OR @ProjectCategory = 'ALL')

    AND (ProjectSubCategory = @ProjectSubCategory OR @ProjectSubCategory = 'ALL')

    OPTION( RECOMPILE);

    If you send a percentage sign instead of the word ALL, it can be simplified.

    DECLARE @ProjectType varchar(20) = 'Infrastructure'

    , @ProjectSubType varchar(20) = '%'

    , @ProjectCategory varchar(20) = 'Hardware'

    , @ProjectSubCategory varchar(20) = 'RAM'

    INSERT INTO SomeTable

    SELECT *

    FROM [ProjectType]

    WHERE ProjectType LIKE @ProjectType

    AND ProjectSubType LIKE @ProjectSubType

    AND ProjectCategory LIKE @ProjectCategory

    AND ProjectSubCategory LIKE @ProjectSubCategory;

    On another subject, I would review this db design as you seem to be missing some normalization.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I go to hit reply and see Luis has not only answered the question (better than I was going to), but has also raised my biggest concern, it doesn't sound like this database is structured correctly. You shouldn't need to do 4400 inserts for any one value addition. What happens if they change any other one of the columns? Another 4400 rows? That absolutely implies something is off. While Luis has addressed your immediate question, you have bigger issues that you should focus on.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I realize this database is not normalized, but unfortunately I inherited it when the previous developer left and I was hired to replace him and now I must work around the non-normalization (if that is a word). To normalize the database would be an enormous undertaking at this point and would most likely require an almost complete rewrite of the application code.

    Aside from this, I do have a question. On your second example, where you use the "%", I assume this means a wildcard, as the first example unfortunately won't work, because we do not actually have a ProjectSubType of "All", but I used "All" as a way to avoid writing out all 4400+ examples. It is basically a short cut to mean "Every ProjectSubType" without the requester having to outline all 4400 records.

  • kocheese (3/25/2016)


    I realize this database is not normalized, but unfortunately I inherited it when the previous developer left and I was hired to replace him and now I must work around the non-normalization (if that is a word). To normalize the database would be an enormous undertaking at this point and would most likely require an almost complete rewrite of the application code.

    Be prepared, because once this project is over, you should start the normalized version of this. Basically, to avoid great problems, you'll have to rewrite the application completely at some point.

    Aside from this, I do have a question. On your second example, where you use the "%", I assume this means a wildcard, as the first example unfortunately won't work, because we do not actually have a ProjectSubType of "All", but I used "All" as a way to avoid writing out all 4400+ examples. It is basically a short cut to mean "Every ProjectSubType" without the requester having to outline all 4400 records.

    The percent sign (%) is a wildcard for any string.

    In the first example, I compare the value from the parameter to the column and to the literal 'ALL'. If the value is equal to a value in the column, then it will return the matching rows. If the parameter value is equal to 'ALL', it will return all rows.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thank you for the help, but unfortunately this doesn't seem to work as I expected. Not sure I explained it correctly.

    The table has the following columns:

    [ProjectType]

    [ProjectSubType]

    [ProjectCategory]

    [ProjectSubCategory]

    I know the values of ProjectType, ProjectCategory as they will be static, so the query can be as following:

    As such, this query: SELECT COUNT(*) FROM [ProjectType] WHERE ProjectType = 'Infrastructure' AND ProjectCategory = 'Hardware'. I didn't think I would need to worry about ProjectCategory, but apparently I do.

    But when I try the solution with the wildcard, I don't get any inserts.

    For every row of the above query, I need to add a new value for every ProjectSubCategory to the existing ProjectSubType.

    Which would be something like

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery]', 'Hardware', 'Value1')

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery2]', 'Hardware', 'Value1')

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery3]', 'Hardware', 'Value1')

    So if there are three ProjectCategory rows, I would have three inserts for the same value.

    Then I need to add Value2, so I would need three more INSERT statements as follows:

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery]', 'Hardware', 'Value2')

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery2]', 'Hardware', 'Value2')

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery3]', 'Hardware', 'Value2')

    If I did this manually, there are upwards of 4400 insert statements.

    My most sincere apologies for my original misstatements. Hopefully this helps explain better what I was trying to say the first time.

  • kocheese (3/25/2016)


    So if there are three ProjectCategory rows, I would have three inserts for the same value.

    No, you don't need three inserts. You need one insert statement that inserts 3 rows and that's exactly what my code does.

    When using INSERT INTO, you have two options:

    - Use the VALUES clause to define one row at a time.

    - Use a rowset returned by a SELECT statement or stored procedure execution.

    Both options are largely explained in Books OnLine (BOL): https://msdn.microsoft.com/en-us/library/ms174335.aspx

    Think in sets and let SQL Server worry about the rows.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • kocheese (3/25/2016)


    Thank you for the help, but unfortunately this doesn't seem to work as I expected. Not sure I explained it correctly.

    The table has the following columns:

    [ProjectType]

    [ProjectSubType]

    [ProjectCategory]

    [ProjectSubCategory]

    I know the values of ProjectType, ProjectCategory as they will be static, so the query can be as following:

    As such, this query: SELECT COUNT(*) FROM [ProjectType] WHERE ProjectType = 'Infrastructure' AND ProjectCategory = 'Hardware'. I didn't think I would need to worry about ProjectCategory, but apparently I do.

    But when I try the solution with the wildcard, I don't get any inserts.

    For every row of the above query, I need to add a new value for every ProjectSubCategory to the existing ProjectSubType.

    Which would be something like

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery]', 'Hardware', 'Value1')

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery2]', 'Hardware', 'Value1')

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery3]', 'Hardware', 'Value1')

    So if there are three ProjectCategory rows, I would have three inserts for the same value.

    Then I need to add Value2, so I would need three more INSERT statements as follows:

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery]', 'Hardware', 'Value2')

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery2]', 'Hardware', 'Value2')

    INSERT INTO (ProjectType, ProjectSubType, ProjectCategory, ProjectSubCategory) VALUES ('Infrastructure', '[ValueFromQuery3]', 'Hardware', 'Value2')

    If I did this manually, there are upwards of 4400 insert statements.

    My most sincere apologies for my original misstatements. Hopefully this helps explain better what I was trying to say the first time.

    You would not need 4400 insert statements.A single select statement along with the APPLY clause should suffice.The following example might give you an idea.

    DECLARE @mytable AS TABLE

    (

    projecttype VARCHAR(100),

    projectcategory VARCHAR(100),

    projectsubcategory VARCHAR(100),

    projectsubtype VARCHAR(100)

    )

    INSERT INTO @mytable

    SELECT *

    FROM (VALUES('Infrastructure','Hardware'))AS T1(projecttype,projectcategory)

    CROSS APPLY(VALUES('Value1'),

    ('Value2'))T2(projectsubcategory)

    CROSS APPLY(VALUES('ValueFromQuery'),

    ('ValueFromQuery2'),

    ('ValueFromQuery3'))T3(projectsubtype)

    SELECT * FROM @mytable

    At the right input part of the APPLY clause you need to replace with the select statement that returns rows for 'Value1','Value2' same needs to be done for 'ValueFromQuery', 'ValeFromQuery2' etc...

    --------------------------------------------------------------------------------------------------
    I am just an another naive wannabe DBA trying to learn SQL Server

  • AH. I gotcha now and see what you are saying / doing. Makes sense now. Thank you for the help. Appreciate it

  • kocheese (3/28/2016)


    AH. I gotcha now and see what you are saying / doing. Makes sense now. Thank you for the help. Appreciate it

    And now you know at least one form of a thing called "Relational Multiplication". 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply