July 24, 2020 at 6:06 pm
snip
OLD Server: 16 vCPU, 24 GB RAM
NEW Server: 4 vCPU, 8 GB RAM
Argh! That had not been my understanding. I'm guessing that this is what TL meant that he said that he saw this kind of problem when a new server is "smaller". Since my new server (compared to the old one which is a physical PC) is on a blade center, I understand that my network administrators can just change the server specifications at will. The consultant recommended trying to make the new server like the old server to see if the problem would resolve. My network administrators are not cooperating. They are only agreeing to change the CPUs from 4 vCPU to 8 vCPU and no change in RAM.
I don't know how long this will take, but I feel like changing the server specification is probably going to be the answer, and I didn't want to leave people on this forum hanging for long. (Hence this update.) I'll report back again when I've thoroughly tested the idea of changing the server specs (if I can get enough cooperation from my own people to do that). If this works, I'll come back and mark TLs post as the answer.
I try really hard not to waste people's time. I feel bad that I didn't know this about our server specs sooner. I truly do appreciate *everyone's* help. Thank you so much!
those admins don't really know what makes SQL happy - it would most likely be of more benefit to increase the ram to 12 or 16 than adding the 4 extra cpus.
try and get them to setup a test vm - and try the different combinations to see which one is better.
and if they are struggling with memory they should have bought more - VM hosts should have a hell of a lot of ram - in my shop minimum now is 512 and we have a few big clusters with 3TB of ram - 50% allocation available for vmotion
If they still refuse and if your SQL License is for the physical box (e.g. it is Enterprise edition and all physical cores are licensed) then get them to create a new VM - and get SSIS running on that new server alone, pointing to the db on the other server.
July 24, 2020 at 8:42 pm
Phil Parkin: Thanks for the encouragement and making me not feel so bad. You are right that I have learned a lot. That's the silver lining.
TL and frederico_fonseca: That is soooooo helpful! I'm going back to the network admins with this information. I won't use your names, but I'm going to quote you. Maybe they will listen if it's not me saying it, and since it's from people who have experience (which frankly, my agency does not seem to have). Thank you. 🙂
July 24, 2020 at 9:23 pm
I had been told by people in my agency that my new server was at least good as the old one. However, when the consultant compared the old server to the new one yesterday morning, he shared this with me:
OLD Server: 16 vCPU, 24 GB RAM
NEW Server: 4 vCPU, 8 GB RAM
Argh! That had not been my understanding. I'm guessing that this is what TL meant that he said that he saw this kind of problem when a new server is "smaller".
Lordy... those folks are truly penny-wise and pound foolish. You just don't do this to a production system (or any system, for that matter) until you've proven that it will not cause problems.
I also agree with Phil... you've been very good about taking investigation steps recommended by folks. My hat is off to you for your tenacity.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 24, 2020 at 11:36 pm
Thanks Jeff! I really appreciate the support. When I think about how many *weeks* I have spent trying to figure this problem out (long before I posted here), I feel depressed and a bit like I'm recovering from a trauma. However, it's been fun to have the support of the people of this community. I appreciate how patient people were with me, and I really have learned a lot.
I just realized that I also need to give a shout out to Mr. Brian Gale and Phil Parkin (and probably others) who early on had proposed a memory problem. I thought I knew how much memory was on the server, and I thought my tests had ruled out memory as a problem. Based on what we now know about the new server and based on what you, as SQL experts, are telling me about memory, I think that memory is the problem after all and you were right. Fingers crossed.
July 29, 2020 at 10:34 pm
*** FINAL ANSWER ***
Bottom Line: A change of vCPUs from 4 to 8 fixed the hanging problem.
WHAT ABOUT RAM?
I tried to argue for 16 GB RAM not only to possibly address the problem with the SSIS package, but also to head off any future problems when the users start pounding away at the system. (I used your excellent quotes.) The network admins felt strongly enough about the issue to both deny my request and to bring it to our mutual supervisor. I summarized the issue for the supervisor, explaining that we went from 24 to 8 GB RAM. THAT'S when my local network admins told me that the consultant was wrong and the new server really has 12 GB RAM. The supervisor sided with the network admins on denying any increase in RAM, talking about the whole right-sizing topic you warned me about.
My Thoughts: I think the RAM situation might turn out OK though. 12 GB RAM is a lot better than 8. Then again, the second SSIS package consistently takes 5 minutes to run on the old server and is taking 6 minutes to run on the new server. Hardly the end of the world for a nightly process. But could the increased time be related to insufficient RAM? (The network admins don't think so.) Or still not enough CPUs? I just worry that the increased time for the SSIS package might be indicative of a general performance/configuration problem that might impact users when we go live. And it just feels wrong to "go backward." All previous server upgrades resulted in improved performance. (worry over)
WHAT OTHER WEIRD THING HAPPENED (and resolved)?
I'm going to describe this both for interest sake and in the hopes it can help someone else in the future--as the error message would never in a million years have led me to believe I had a CPU problem.
Background: I thought it was important to stick to one problem in this thread. However, I have two SSIS packages and had two problems. The second problem presented itself when I ran the second package and also in the first package discussed on this thread when I was working on Steve Jone's suggestion of precisely identifying which tables were the problem.
The Problem: For some tables, I got the following error:
"Violation of PRIMARY KEY constraint 'MyTableName'. Cannot insert duplicate key in object 'MyTableName'. The duplicate key value is ( )."
Here's the weird part: I did *not* remove the value from the parentheses. Other than replacing the table name in the error message above, that's exactly what the error message said. The error message has 4 spaces for the value in the parentheses - ie, the value which is supposedly duplicated. The message appears to be saying that the source/Sybase table has duplicate rows with spaces, nulls or maybe empty strings.
I got the same error on several tables, claiming that multiple rows in the source tables had empty/spaces for primary keys. I would get this error for a query that is nothing but a request to download 5,000 rows. Ie, my query did nothing to add/create spaces for primary key values.
I spent some time researching this duplicate key error, and every page I found said that it was just what it sounds like, a duplicate key problem. However, I can say that those tables getting this error absolutely, positively do NOT have spaces/empty strings/nulls for their primary keys. I checked and double-checked. I don't know much about Sybase, but I doubt they allow duplicate values for primary keys even if they did allow spaces. Further, the source tables do not have even a single row with any number of spaces for the primary key field.
This error appeared only for certain tables and only when the package was run from agent. The error did not occur when the package was run through Visual Studio. When run from Visual Studio, I confirmed that all the data correctly downloaded from the source/Sybase table into my local table.
SOLUTION: The error disappeared when the vCPUs on the server were changed from 4 to 8. I have run both packages multiple times now to make sure and the error is still gone. Is this twilight zone material? Or if only I understood about CPUs better, I would understand why the error went away with the CPU change? (No one has to answer that. I'm just sharing my bewilderment and possibly ignorance.)
That's it! Final answer. Do I lose a million dollars? (Joke about American TV game show.)
July 29, 2020 at 11:04 pm
Speaking of tenacity, that's some Awesome feedback! Thank you for taking the time to share what you found all across the board... especially the 4 space duplicate key error. I've never seen such a thing before and would never have guess that adding more CPUs would have fixed the problem... unless they did something else, as well, and didn't tell you. At this point, that wouldn't surprise me at all.
As a bit of a side bar, I just "love" it when supervisors and managers dictate configuration, especially when they "go cheap" without even a test or letting the resident DBA know. Memory is pretty darned cheap these days and they've not come anywhere near the limit even on Standard Edition. So are CPUs except when it comes to software licensing but, I'll say it again, they stripped your box way, way back compared to what it used to be. Wait a bit until the data and connection count grows. More fun to come, I'm afraid.
You're always welcome here, JJ B.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 30, 2020 at 11:15 pm
Thanks Jeff! So fun to get to work with the RBAR guy. 🙂
Viewing 7 posts - 46 through 51 (of 51 total)
You must be logged in to reply to this topic. Login to reply