Reports upgraded to SSRS 2019 on WIN 2019 server is slower compared to SSRS 2008

  • We are upgrading SSRS at QBE from SSSR 2008 on Windows 2008 R2, to SSRS 2019 on Windows 2019 for the Report Server and Catalog DB . We’ve done the upgrade and have hit a re response time issues, where we have noticed degradation.

    Question: why is running the same reports (and fetching the parameters) on SSRS 2019 on Windows 2019 slower than it runs on Windows 2008 R2 with SSRS 2008, as part of our investigation, we have also tried installing SSRS 2019 on Windows Server 2012 and Wins 2012 and Wins 2008 R2 have SSRS 2008 and SSRS 2019 running much faster than SSRS 2019 on Windows 2019 – we need help to understand what changed and how to speed it up please.

    We are looking at three parts:

    Fetching Parameters – Focused on this, as this showed the largest response time degradation

    Running the Report

    Emailing the Report

    The setup we are running:

    SSRS Report Server (Server: TestReportServer1.domain1 and TestReportServer2.domain1) – SSRS 2019, Windows 2019

    Data Source Server (DatabaseServer.domain1) – SQL Server 2016

    SSRS Catalog DB Server (TestServerCatalog.domain2) – SSRS 2019, Windows 2019

    We have not modified any of the .Net versions or the DB connection drivers, we have just run the SSR 2019 installer and used what came with that.

    Our analysis

    Fetching the Parameters:

    We’ve investigated Fetching the Parameters and found that the SQL executed against the Data Source is taking about 25% of the time (500ms of 2200ms total), we saw this in the SQL Profiler trace we ran on the Data Source. We also have Dynatrace setup and in use at the client and see a similar story.

    This is what we see from the DevTools for the requests and the waterfall – most time is from Reportviewer.aspx

    Devtools

     

    This how the server side times look in Dynatrace – The time matches and we can see only 25% of the time is in the SQL statement being executed (the attached profile trace also confirms this)Dynatrace

     

    Request details

     

    Other things we looked into:

    1. Confirmed SSRS config is the same between 2008 and 2019
    2. We tried two upgrade approaches:

      Restoring the Catalog DB – Lift and Shift approach

      Uploading the RDL Files – a manual approach

    3. Ran SSRS 2019 with http instead of https (as https is the default in newer versions) – both configurations run the same

      Kerberos authentication is used, so we verified the setup from Microsoft website, to ensure correct setup

      Disabled customer insights

    4. Verified the Data Centres the Servers sat in, to check for latency, but they all sit in same datacenter (the NonProd DC)

      Windows Server specifications compared 2008 vs 2019 - identical

    5. Windows Server Memory/CPU/Disk and Network consumption between 2008 and 2019 – looked similar, no leaks, no spikes, nothing obvious
    6. Reviewed SSRS Logs, the Execution Log/Log3 – didn’t highlight lower level detail to help

      Reviewed Windows Event Logs – saw nothing

  • If I am understanding your post correctly, switching to Windows Server 2019 is causing the slowness getting data from the SQL instance, correct?

    If so, I would guess the problem is related to something related to the OS not related to SQL.

    The way I read your question is that you have SQL 2019 on win 2019 and it is slow, but you also tested SQL 2019 on Win 2012 and it was faster.  If I am reading this right, I would say the problem is with your Windows 2019 setup and not with SQL specifically.

    You also indicated you ran profiler - are you seeing that the SQL query is taking longer than when requested from the 2008 R2 instance?  if so, I would be looking at the connection string to ensure you are setting that up 100% identical.  Changing the report side without changing the data side should NOT be able to make a query "slower".

    Now, if I read the question wrong and SQL 2019 is slower than 2008 R2 no matter what windows server version you use, my guess for the performance difference would be caching.  I find that post reboot of my SSRS instance (the SQL side and the UI), I find that performance of reports is more sluggish.  But if I let it run for a day or 2, performance improves.  This is expected as a reboot would flush the cache and as reports get run and re-run, the cache would repopulate itself.

    Alternately, if your SSRS UI is on 2019 BUT the SSRS database is older, I would imagine that could cause performance issues.  Note - I do NOT mean the data source, but the ReportServer and ReportServerTempDB databases should be at 2019 (both the instance AND compatibility levels).

    Now, all of the above being said, I don't see any analytics done at the database in your post.  Everything seems to be from the web interface load times.  I would be curious to see the time it is taking for your SQL query to complete for getting the parameter data and I'd like to see what else is using that database.  It could be something like blocking on the Database side that is causing the report to load slowly and changing your isolation level MAY help.  The execution log from SSRS would also be interesting to see if the problem is with data retrieval or rendering.  If it is rendering that is hurting performance, you MAY get a performance boost by using a fully updated Edge.  Using outdated browser (such as Internet Explorer) will likely render wrong or slowly.

    My last thought is that you may have a slow network connection.  Your first screenshot is showing the Edge Developer Tools with cache disabled.  What I would be curious about is how fast it is on the 2008 R2 box with the same settings and output as that first screenshot.  What I notice with that first screenshot is that the larger the file, the slower things went.

    Also, I think your math is a bit off.  The total run time says it is close to 22 seconds, which would be 22,000 ms as 1 second is 1000 ms.  So SQL taking up 500 ms is 1/2 second which is pretty quick in my opinion.  The slow part was downloading that 6.5 MB .hxr file.  By any chance, do you have an antivirus installed on the Win 2019 machine that isn't installed on the 2008 R2 machine?  If the antivirus is doing a scan on the files before transferring them, it could be the antivirus trying to figure out what to do with an .hxr file and causing the slowness that wouldn't exist if the AV was told to skip .hxr files or wasn't configured at all (note I do NOT recommend turning off the AV).

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Hi Brian,

    You're correct, the switch in OS is causing the snowless. We never tested SSRS 2019 on Windows 2008, just windows 2012 (as we went to 2012 for a test, when we found the slowness in windows 2019 vs 2008). We also can't for now or don't want to test on SSSRS 2019 on Windows 2008, as we have to move away from that to a later version anyway, and we know it’s specific to windows 2019 this slowness issue.

    We see 21.3s duration for the ‘Waiting’, TTFB action in Chrome Dev Tools:

    TTFB action in Chrome Dev Tools

    We can also see in Dynatrace (an Application Performance Monitoring Tool), that 21.2s is used in the service (screenshots in the OP), so we know it's not related to report rendering or UI rendering the parameters etc and we enable caching and run the fetching of parameters and also the report multiple times and still get this slowness with the same pattern of seeing it processing on the server side, with HAR file and Dynatrace showing this and then as the below DB TRACE file shows we have multi Data Source DB hits for the fetching of the parameters, but we have the audit logout step too, coming at ~1.5-2s per hit - so there is lost time where the connection is being released (we saw comments on this here)

     

    Here is the trace from (SQL 2019, Windows 2012) - no issues

    SQL 2019, Windows 2012_PROFILER

    Here is the trace from (SQL 2019 and Windows 2019) - with lost time, going slower

    SQL 2019 and Windows 2019_PROFILER

    So taking the durations of the executions for this parameter fetching part and comparing (each have 8 of the same select statements running) [because we ran for the same report to compare apples to apples as close as possible]:

    (SQL 2019, Windows 2019)

    SQL Executions: 3268ms which is 16% of the server processing time (20300) [Taken from Chrome, TTFB]

    Taking the time to Audit Logouts: 16500ms is 81% of the server processing time (20300) [Taken from Chrome, TTFB]

     

    (SQL 2019, Windows 2012)

    If we count to the audit logout step, we get 3239ms so UNKNOWN% of the server process time (didn't capture) [Taken from Chrome, TTFB]

    Taking the time to Audit Logouts: 3742ms is UNKNOW% of the server processing time (didn't capture) [Taken from Chrome, TTFB]

     

    It's not perfect, but it does show us, something is happening on the windows 2019 server, which is slower, much slower.

    Nothing else is using the data source database, as we're the only team in that environment, we monitoring that to make sure.

    Rendering isn't happening, as we are only looking at the server side processing, that what Dev Tools, Dynatrace, and the DB Trace is showing too.

    The amount of data being sent and seen in Chrome Dev Tools was always 6.5mb, the queries are the same, so the same data is being fetched from Data source to Report View etc. so don’t think it’s network from client to report Server, but it could be related to the network from Report Server to the Data Source

    We’ve just done this level of analysis for now with captured logs etc. for the fetching of parameters, we will work on a report and update tomorrow, thanks again Brian – we did it all higher level timings though and know for sure that all 3 parts run slower: Parameter Fetching, Report generation and Emailing Report.

     

    We're also thinking it must be related to the Connection String as you suggested or the frameworks or the differences in the OS, at the OS level. – how to find this is challenging.

    Do you know what else we could do to find why we are losing 1-2s per fetch between the SQL being collected and logging out…

    A colleagues suggested Wireshark might be helpful? – I was thinking of process explorer too see try and capture the DDLs used etc – any other suggestions welcome.

    Thanks community.

  • I am thinking you mean DLL not DDL.  Wireshark would probably help determine if it is network related, and I am still leaning towards something on the network stack between the application and the database.

    The "Logout" operation time is the time from login until logout.  So seeing 2 seconds on the 2019 machine indicates that the transfer of data from the database to the application (SSRS) is taking time or it is getting a lot of bad data between the 2 machines and making a lot of retry requests.  Since it is a VM, I would find it strange that the network would go crazy on just 1 of the VM's, but it could be something silly like Windows is using the wrong NIC driver for the VM infrastructure.  I would probably check that your VM software is up to date AND that it supports Windows 2019.  It likely does, but it never hurts to double check that your VM software works well with Windows Server 2019.  Also, on the idea of updates, are the Windows 2019 AND SQL Server 2019 fully patched?  It could be a known bug that was corrected in an update.

    I would say that wireshark is a good way to go here as it will tell you what is happening on the network level.  Process explorer might help, but I think it would likely tell you it is using a bunch of SSRS specific DLL's and probably http.sys.

    Another thing to check, but is probably not going to lead anywhere, are you viewing the page from your local machine OR from the SSRS server?  AND does the Windows 2019 server have enough free memory to work with?  SSRS operates outside of the SQL Server memory space, so if you only have a small amount of memory free, you may be paging to disk which would be incredibly slow.  If I remember right, Windows Server 2019 would need more memory for the base OS work than Server 2012.

    I am still leaning towards network in which case wireshark will help though.

    Side note - SQL Server 2019 is not supported on Windows Server 2008 R2, so SSRS 2019 on 2008 R2 is probably not going to work either.  What I was curious about is if your 2008 R2 SSRS performance is similar to the 2019 on 2012 one (as you have metrics there).

    Another question - are you doing an upgrade of the OS OR is it a fresh 2019 install?  The reason I ask is that SOMETIMES there are weird things that happen when you do an OS upgrade and a fresh install then migration of the required software will get you better performance.  Bigger pain in the butt as you have to install all of the software on there, but you can get the new machine set up and ready to go without impacting production.  Then you schedule downtime to do the cutover where you move the data over and you are done!  I would try a fresh install of 2019 then toss SSRS onto the box and ONLY have those 2 tools on it to do some testing.  If performance is better, then migrate other things over until the 2019 box has everything you need.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Did you check what the wait types are for the slow queries? Are you using Linked Servers? I think there's a known issue with connecting to databases lower than 2012, assuming you kept the compatibility level at 2008.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply