September 6, 2023 at 2:54 pm
I have a VBScript file fired from a Windows scheduled task that calls a stored procedure. The procedure accepts a database name as a parameter and performs a backup if the proper conditions are met. Primarily, if the database has changed during the preceding specified interval, a backup is done - transaction log, differential, or full, depending on the time of day.
The script uses:
SELECT Max(IsNull(last_user_update,0)) FROM sys.dm_db_index_usage_stats WHERE database_id = DB_ID(@dbname)
to determine when the last change to the database happened. I had this on a 2008 R2 machine, and it had been working there for years. Now, management has 'upgraded' me to a virtual 2019 instance, and this doesn't work. There are two principal databases. When I make a change in a table of one, this returns the correct time and date of that change. When I make a change in the other one, the date/time returned for that second one is correct, but in the process, the date/time returned for the first database becomes Null.
Sometimes I get the correct value for both, but not often, and I can't figure out any pattern for when I get both and when I get only one.
Does this make any sense? Is there some reliable method for getting the date/time when the most recent change was made to ANY table in the database? I know how to get structure changes from sys.objects, and that is incorporated into the SP as well, but those don't happen often, while data changes (obviously) do.
September 6, 2023 at 3:12 pm
Take a look at sys.dm_db_index_usage_stats; it should give you what you want.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 6, 2023 at 3:20 pm
Possibly a silly question, but is there a reason for this obscure backup strategy? I am just trying to figure out why you would want to have backups taken sporadically like that and how you would manage your backup retention policy.
My approach has always been to just "set and forget" with consistent schedules. If I know the system is busy, then it gets nightly full backups and hourly log backups. If the system is slow, then it gets 2 full backups per week and hourly log backups.
My concern with your approach is, let's say that you have it set so that if the time is midnight and data has changed, then it does a full backup; if it is noon, then do a differential; any other hour, do a log backup. Just hypothetically; I don't know how you have it set up. Now you have a system that is busy with data writes (Insert, Update, or Delete) from 1 PM until 10 PM and is idle at any other time of any other day. That system will have a lot of log backups, but your last full backup could be months back. Therefore your RPO is impossible to define, and your RTO is going to be hours, if not days, depending on the size of the database.
MAYBE I misunderstand your backup strategy, but the way I read it is that there is a chance you could have a system with only 1 full backup since the database was created! And if your backup retention policy is set too short, that full backup could already be removed from disk. If that is the case, your RPO is undefined and your RTO is effectively infinite.
For me, I would rather have a reliable and consistent backup strategy so I can say that my RPO is at most 1 hour of data loss and my RTO for the largest database I have is around 30 minutes. If I need to recover ALL my systems, that is approximately 24 hours (unfortunately I had to test that a few years back), but a single-database RTO is approximately 30 minutes or less. With your backup strategy I have no idea how you would define the RPO or RTO.
Now with the above being said, I am not saying you should change backup strategies just because some random guy on the internet said you should.
Now, as for why the values would be null, I am not 100% certain. The official docs from MS say that it should only be cleared if you restart the SQL instance. Any chance that your IT team MAY be restarting the SQL instance on you without telling you? You could probably confirm this by looking at the other columns (last user scan and last user seek or last system seek or last system update). I am uncertain if MAYBE the last user update could get cleared out if there was a last system update such as an index rebuild/reorganize? Not certain if that clears out the user stuff or not.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
September 6, 2023 at 3:46 pm
I didn't want to go into a bunch of unnecessary detail that has nothing to do with my specific problem, but since you ask...
The Windows scheduled task runs every 15 minutes, starting at 6:00 am and ending at midnight. The SP looks at the time to determine what to do (a rough sketch of the branching follows the list):
If it is 15, 30 or 45 minutes past the hour, AND the last recorded change in the database was sometime during the preceding 15-minute interval, a transaction log backup is performed. The backup file is named PaleoData2023-09-05_16-30_Trans.bak
If it is on the hour, AND the last recorded change in the database was sometime during the preceding hour, a differential backup is performed. The backup file is named PaleoData2023-09-05_16_Diff.bak
If it is 23:00, AND the last recorded change in the database was sometime during 'today', a full backup is performed. The backup file is named PaleoData2023-09-05_Full.bak
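In rough T-SQL terms, the branching looks something like this sketch (names and the path here are just placeholders, not the actual SP; the real procedure builds the timestamped file names shown above):

-- Sketch only: @dbname, @file and @LastChange are placeholders.
DECLARE @dbname sysname = N'PaleoData',
        @file nvarchar(260) = N'D:\Backup\PaleoData.bak',  -- hypothetical path
        @LastChange datetime;  -- from the dm_db_index_usage_stats query above

IF DATEPART(minute, SYSDATETIME()) IN (15, 30, 45)
   AND @LastChange >= DATEADD(minute, -15, SYSDATETIME())
    BACKUP LOG @dbname TO DISK = @file;                      -- :15/:30/:45 -> log
ELSE IF DATEPART(hour, SYSDATETIME()) = 23 AND DATEPART(minute, SYSDATETIME()) = 0
   AND @LastChange >= CAST(CAST(SYSDATETIME() AS date) AS datetime)
    BACKUP DATABASE @dbname TO DISK = @file;                 -- 23:00 -> full
ELSE IF DATEPART(minute, SYSDATETIME()) = 0
   AND @LastChange >= DATEADD(hour, -1, SYSDATETIME())
    BACKUP DATABASE @dbname TO DISK = @file WITH DIFFERENTIAL;  -- on the hour -> diff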
Usage of the database is a bit sporadic, and this has allowed me to maintain a tight backup strategy without generating thousands of pointless backups for periods when nothing has changed. The VBScript also compresses the backup file, copies it to a special location and deletes the original non-compressed backup file. All backups are also copied by scheduled tasks to SAN units, and I have a special 'little' 6TB USB drive to which I do a weekly manual backup. That unit is permanently shut off, to keep it safe. I log on to the server, examine it manually for problems, and when a personal examination convinces me that the server has not been compromised, I turn on the USB unit, run a special backup procedure aimed at that unit, and turn it off again. (Yes, I know I'm being paranoid.)
As for this specific problem. No, the IT department is not restarting the server. I am looking at this manually, trying to figure out what is going wrong. I manually change some bit of data in a table and immediately afterwards execute that Select statement. It shows that change (or Null instead of the proper change date/time) right away. I have been screwing with this all day, continuously logged into the machine. Nobody is messing with it. No other activity is taking place on the machine - everything is still back on the 2008 rig, and is likely to stay there until I get this sorted out. IT has been bugging me a bit to just do the switch and figure out the backups later, but that is the road to ruin in my estimation. 'Later' usually translates to 'never', and I just can't see migrating to a new platform where I already KNOW that backups are not working properly.
September 6, 2023 at 5:59 pm
I use the following to determine if log backups are worth it on instances with sporadically used DBs. Full and differential backups always get done on the schedule. This works on SQL2016, SQL2017 and SQL2019. I have not got around to testing it on SQL2022.
SET ANSI_NULLS, QUOTED_IDENTIFIER ON;
GO
CREATE FUNCTION dbo.DoLogBackup
(
    @LogSizeSinceLastLogBackup int  -- threshold in MB
   ,@TimeSinceLastLogBackup int     -- threshold in seconds
)
RETURNS int
AS
BEGIN;
    DECLARE @log_since_last_log_backup_mb int
           ,@log_backup_time int
           ,@RetVal int;

    -- Largest log growth and longest elapsed time since the last log backup,
    -- across all user databases not in SIMPLE recovery that have had at
    -- least one log backup.
    SELECT @log_since_last_log_backup_mb = MAX(L.log_since_last_log_backup_mb)
          ,@log_backup_time = MAX(DATEDIFF(second, L.log_backup_time, CURRENT_TIMESTAMP))
    FROM sys.databases D
        CROSS APPLY sys.dm_db_log_stats(D.database_id) L
    WHERE D.recovery_model_desc <> 'SIMPLE'
        AND L.log_backup_time > '19000101'
        AND D.database_id > 4;

    -- 1 = a log backup is worth taking, 0 = skip this cycle.
    SELECT @RetVal =
        CASE
            WHEN @log_since_last_log_backup_mb >= @LogSizeSinceLastLogBackup
                OR @log_backup_time >= @TimeSinceLastLogBackup
                THEN 1
            ELSE 0
        END;

    RETURN @RetVal;
END;
GO
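The backup job can then call it with whatever thresholds suit the instance; for example (the 50 MB / 900 second values here are just illustrative):

-- 1 = take a log backup, 0 = skip this cycle.
IF dbo.DoLogBackup(50, 900) = 1
    PRINT 'Take a log backup this cycle';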
September 6, 2023 at 6:01 pm
Makes more sense to me now, but I know I still prefer having a bunch of "useless" backups rather than an inconsistent backup strategy. But that's just my opinion. Doesn't solve your problem.
The ONLY thing I can think of is that index maintenance is causing the last_user_update column to be blanked out. This COULD be manual or it COULD be automatic once enough rows have changed or been added. My GUESS is that it is index maintenance. This would be easy to test too, update a table, then rebuild/reorg the index. Does the last_user_update column blank out too? I imagine that it does BUT it probably updates the last_system_update column when that happens. So you could check both of those columns instead of just the one.
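A quick sketch of how you could watch both columns at once (same DMV as your query, with @dbname as you already have it):

-- If last_user_update goes NULL while last_system_update moves,
-- index maintenance becomes the prime suspect.
SELECT MAX(last_user_update)   AS last_user_update
      ,MAX(last_system_update) AS last_system_update
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID(@dbname);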
This would explain why it SOMETIMES works how you expect and SOMETIMES blanks out. And kind of explains why it works in the current system but not the new - new SQL version MAY have different requirements for automated index maintenance OR the old version of SQL may not blank out that column when updating the index.
I mean, that is just a guess.
Also, you must be working with some super confidential data if you go to those lengths to make sure the backup and USB drives are secure, but you could probably remove some of that paranoia by encrypting your backups and compressing them within SQL Server when you create them. That saves the overhead of having to compress them AFTER the fact with 3rd-party tools (which will also hurt your RTO), plus with the backups being encrypted, you don't have to worry as much about them being "stolen", as nobody can decrypt them without the keys.
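Something like this sketch, assuming you had already created a database master key and a server certificate (BackupCert is a made-up name here):

-- Assumes a master key and a certificate named BackupCert already exist.
BACKUP DATABASE PaleoData
TO DISK = N'D:\Backup\PaleoData_Full.bak'  -- hypothetical path
WITH COMPRESSION,
     ENCRYPTION (ALGORITHM = AES_256, SERVER CERTIFICATE = BackupCert);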
I do agree with you though - if the backup strategy is failing, don't migrate. Last thing you want is no backups on the system. I would also want to test the restore process before going live because, as you know, a backup is only "known good" if someone did a restore from it. Just because it was successfully created on disk and passes the validation checks done by SQL doesn't mean that the backup is valid. ONLY after you restore from the backup can you know for certain that the backup is valid.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
September 6, 2023 at 6:12 pm
Thank you, I'll give that a try and see how it works, although the problem I'm running into is that I'm not getting the datetime-last-modified reliably from the system. Other than that, the strategy has worked smoothly for me for around 14 years now.
You're probably right in that the log backups are not necessary, but when I was building this entire system, I was trying to learn as much as I could about the whole backup/restore process, so I built it with as much capability as I could invent with my limited skills of the time. I envisioned tweaking the intervals in the future to suit some more realistic set of requirements, as the system came online and we had gained a better sense of what was actually required, but this worked, nobody bitched and I never came up with any reason to change it.
September 6, 2023 at 6:12 pm
So sorry, obviously I didn't look closely enough.
I can't imagine the db has autoclose on or that the db is taken offline, so, yeah, that is really weird.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 6, 2023 at 6:29 pm
Well, it's consistent in that it makes a backup only if there is a reason to do so. I find it a more sensible strategy, but as you wrote, it's a matter of opinion.
I'll look at the index stuff, but this is happening instantly. Not even time for an index re-org, and there is nothing I can think of that would be triggering such an activity. I manually change a value in a table using SSMS, then click to another window containing this Select statement and run it. The new value (or Null) appears instantly.
As for the data, no, it's not super confidential. It's just a museum collection inventory system, with some scientific data attached - if the entire contents of the database appeared on a public website tomorrow, eyebrows would be raised, but that's about it. Nobody would freak over having the data public. My paranoia stems from fear of crypto-viruses. A sister organization here got infected with one about two years ago, and they're still dealing with the fallout. Nothing sinister, just the usual bit of stupidity - someone opened an email attachment, and it infected the institution's entire network. I want to be CERTAIN that I have clean and recoverable backups in a place that is absolutely not penetrable by a virus, and physically turned-off storage is the best way I know of to achieve that. And yes, I manually test-restore my backups regularly.
I don't ask myself if I'm paranoid. Obviously, I am. I ask myself, am I paranoid ENOUGH? I hope so.
September 6, 2023 at 7:23 pm
OK, I finally found notes on the sys view I was trying to remember:
sys.dm_db_file_space_usage
This time, that should (actually) help you determine what you need to know (you would likely need to store the previous values to compare to the current ones).
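Something along these lines might work as a sketch - SpaceSnapshot is a hypothetical history table you would create yourself, and I'm assuming (not guaranteeing) that allocation changes will track the data changes you care about:

-- Hypothetical snapshot table; compare the latest row to the previous one.
INSERT INTO dbo.SpaceSnapshot (sample_time, file_id, allocated_extent_page_count)
SELECT CURRENT_TIMESTAMP, file_id, allocated_extent_page_count
FROM sys.dm_db_file_space_usage;  -- reports on the current database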
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 6, 2023 at 7:33 pm
Just thought of a small scenario that MAY not impact you, but thought I'd toss this in here too - do you have any heaps? The reason I ask is, if I remember right, heaps won't show up in the index stats DMV. So if someone is making data changes in a HEAP, your backup strategy may miss that data. I say "MAY" because I don't remember with 100% certainty if heaps show up in the index stats DMV, but I would expect they wouldn't unless you explicitly created an NC index on the heap.
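You can check for heaps quickly with something like this (run it in each user database):

-- Heaps are the rows in sys.indexes with type 0 (type_desc = 'HEAP').
SELECT OBJECT_SCHEMA_NAME(i.object_id) AS schema_name
      ,OBJECT_NAME(i.object_id)        AS table_name
FROM sys.indexes i
WHERE i.type = 0
  AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1;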
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
September 6, 2023 at 7:45 pm
I believe heaps do show up, just under index_id 0 rather than 1.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 6, 2023 at 7:49 pm
Quoting the earlier reply: "OK, I finally found notes on the sys view I was trying to remember: sys.dm_db_file_space_usage. This time, that should (actually) help you determine what you need to know (you would likely need to store the previous values to compare to the current ones)."
Great, thank you. I'll have a look in the morning.
September 6, 2023 at 7:53 pm
Quoting the earlier reply: "Just thought of a small scenario that MAY not impact you, but thought I'd toss this in here too - do you have any heaps? [...]"
Thanks for the idea. I don't think I do, or if at all, it's something that is not impacted by routine traffic. And the tables I was manually changing to see the impact on the last update date/time definitely DO have indexes and primary keys. I think all my tables do, but I'll look tomorrow to be sure.