May 8, 2014 at 5:59 am
Hi,
I had to convert huge volumes with datetime fields (given timezone info) to UTC.
Since Sql does not have a native way to do this, I chose to do it with the CLR-SQL way of hosting a .NET assembly in Sql environment.
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Linq;
namespace SqlClr
{
public partial class UserDefinedFunctions
{
[SqlFunction(Name = "GetUTCfromTimeZone")]
public static SqlString GetUTCfromTimeZone(SqlString _dt, SqlString _timeZoneInfo)
{
if (_dt.IsNull || _timeZoneInfo.IsNull)
return _dt;
DateTime dt;
if (!DateTime.TryParse(_dt.Value, out dt))
{
return _dt;
}
// default behaviour
TimeZoneInfo tz = TimeZoneInfo.GetSystemTimeZones().Single(z => z.Id.Equals(_timeZoneInfo.Value, StringComparison.CurrentCultureIgnoreCase));
var utc = new DateTimeOffset(dt, tz.GetUtcOffset(dt));
// Sql-Server DateTime type is implemented based on the Gregorian calendar. It can not handle dates earlier than 01/01/1753.
if (utc.UtcDateTime.Year < 1753)
{
return new SqlString("01/01/1753 00:00:00");
}
return new SqlString(utc.UtcDateTime.ToString("MM/dd/yyyy HH:mm:ss"));
}
}
}
I created a Sql assembly pointing to the CLR dll file, with permission set = unsafe (somehow permission set = safe does not work).
Then, created a user defined function, which points to a method in the CLR-SQL assembly, which takes a datetime string representation, timezone string representation and returns the UTC datetime string representation.
sp_Configure 'CLR Enabled', 1 Reconfigure With Override
go
Alter Database GP_TPS_350 SET TRUSTWORTHY ON
Exec dbo.sp_changedbowner @loginame = N'sa', @map = false
go
If OBJECT_ID('dbo.GetUTCfromTimeZone') Is Not Null
Drop Function GetUTCfromTimeZone
go
If exists (select * from sys.assemblies where name = 'ClrFunctions')
drop assembly ClrFunctions
/*Create assembly from the CLR built dll*/
Create Assembly ClrFunctions From 'C:\SqlClr.dll'
WITH PERMISSION_SET = UNSAFE;
go
/*Create the scalar valued function from the CLR assembly*/
Create Function dbo.GetUTCfromTimeZone
(
@_dt As nvarchar(256),
@_timeZoneInfo As nvarchar(100)
)
Returns nvarchar(256)
As External Name [ClrFunctions].[SqlClr.UserDefinedFunctions].[GetUTCfromTimeZone]
go
Now, using that user defined function in an update statements like:
UPDATE t1
SET
t1.SOME_DATE = dbo.GetUTCfromTimeZone(t1.SOME_DATE, t2.TIMEZONE),
t1.SOME_OTHER_DATE = dbo.GetUTCfromTimeZone(t1.SOME_OTHER_DATE , t2.TIMEZONE)
FROM dbo.SOME_TABLE AS t1
INNER JOIN dbo.USER_INFO AS t2 ON t1.LAST_UPBY_USER_KEY = t2.USER_KEY
on about 50+ tables some of which have over 5 million records, the memory usage goes up to more than 90%.. and the execution takes a more than an hour sometimes..
The execution time is acceptable, but the memory wouldn't clear up until I restart the Sql Server process.
I know, the reason be: because permission set unsafe on the sql assembly, the memory allocation and disposal is not managed by the sql environment..
But when I set the permission set = safe, it yells "Permission set = safe" "trust worthy database" etc when the user defined function is invoked..
Any suggestions would be really appreciated.
Thanks
RPS.
May 8, 2014 at 11:33 pm
renu.iitkgp (5/8/2014)
Hi,I had to convert huge volumes with datetime fields (given timezone info) to UTC.
Since Sql does not have a native way to do this, I chose to do it with the CLR-SQL way of hosting a .NET assembly in Sql environment.
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Linq;
namespace SqlClr
{
public partial class UserDefinedFunctions
{
[SqlFunction(Name = "GetUTCfromTimeZone")]
public static SqlString GetUTCfromTimeZone(SqlString _dt, SqlString _timeZoneInfo)
{
if (_dt.IsNull || _timeZoneInfo.IsNull)
return _dt;
DateTime dt;
if (!DateTime.TryParse(_dt.Value, out dt))
{
return _dt;
}
// default behaviour
TimeZoneInfo tz = TimeZoneInfo.GetSystemTimeZones().Single(z => z.Id.Equals(_timeZoneInfo.Value, StringComparison.CurrentCultureIgnoreCase));
var utc = new DateTimeOffset(dt, tz.GetUtcOffset(dt));
// Sql-Server DateTime type is implemented based on the Gregorian calendar. It can not handle dates earlier than 01/01/1753.
if (utc.UtcDateTime.Year < 1753)
{
return new SqlString("01/01/1753 00:00:00");
}
return new SqlString(utc.UtcDateTime.ToString("MM/dd/yyyy HH:mm:ss"));
}
}
}
I created a Sql assembly pointing to the CLR dll file, with permission set = unsafe (somehow permission set = safe does not work).
Then, created a user defined function, which points to a method in the CLR-SQL assembly, which takes a datetime string representation, timezone string representation and returns the UTC datetime string representation.
sp_Configure 'CLR Enabled', 1 Reconfigure With Override
go
Alter Database GP_TPS_350 SET TRUSTWORTHY ON
Exec dbo.sp_changedbowner @loginame = N'sa', @map = false
go
If OBJECT_ID('dbo.GetUTCfromTimeZone') Is Not Null
Drop Function GetUTCfromTimeZone
go
If exists (select * from sys.assemblies where name = 'ClrFunctions')
drop assembly ClrFunctions
/*Create assembly from the CLR built dll*/
Create Assembly ClrFunctions From 'C:\SqlClr.dll'
WITH PERMISSION_SET = UNSAFE;
go
/*Create the scalar valued function from the CLR assembly*/
Create Function dbo.GetUTCfromTimeZone
(
@_dt As nvarchar(256),
@_timeZoneInfo As nvarchar(100)
)
Returns nvarchar(256)
As External Name [ClrFunctions].[SqlClr.UserDefinedFunctions].[GetUTCfromTimeZone]
go
Now, using that user defined function in an update statements like:
UPDATE t1
SET
t1.SOME_DATE = dbo.GetUTCfromTimeZone(t1.SOME_DATE, t2.TIMEZONE),
t1.SOME_OTHER_DATE = dbo.GetUTCfromTimeZone(t1.SOME_OTHER_DATE , t2.TIMEZONE)
FROM dbo.SOME_TABLE AS t1
INNER JOIN dbo.USER_INFO AS t2 ON t1.LAST_UPBY_USER_KEY = t2.USER_KEY
on about 50+ tables some of which have over 5 million records, the memory usage goes up to more than 90%.. and the execution takes a more than an hour sometimes..
The execution time is acceptable, but the memory wouldn't clear up until I restart the Sql Server process.
I know, the reason be: because permission set unsafe on the sql assembly, the memory allocation and disposal is not managed by the sql environment..
But when I set the permission set = safe, it yells "Permission set = safe" "trust worthy database" etc when the user defined function is invoked..
Any suggestions would be really appreciated.
Thanks
RPS.
Regarding the memory issue and SAFE vs UNSAFE, there is not much you can do:
1) The TimeZoneInfo class is known to have memory leaks which is why it has a Host Protection Attribute on it. See http://msdn.microsoft.com/en-us/library/cc645949(v=sql.100).aspx and look for "TimeZoneInfo" and you will see it listed with a Host Protection Attribute of "MayLeakOnAbort". So I don't think garbage collection has anything to do with it.
2) Because it has a Host Protection Attribute, you cannot use System.TimeZoneInfo in either SAFE or EXTERNAL_ACCESS assemblies. Your only option is UNSAFE, and that is due to the memory leak (as opposed to the memory leak due to the Assembly being set to UNSAFE).
Regarding the memory not being released until the SQL Server process is restarted:
you can free up the memory by causing the App Domain to unload, and that can be achieved by doing an ALTER ASSEMBLY to set the PERMISSION_SET to something other than the current, such as SAFE. Then do another ALTER to set it back to UNSAFE so you can use the function again.
Regarding performance:
1) in the SqlFunction() attribute, add "IsDeterministic = true". That might help the optimizer.
2) Why are you dealing with strings instead of dates? is the field itself a string? The conversion from String to DateTime back to String is part of what is taking so long. If possible, use the SqlDateTime type in the C# code.
Take care,
Solomon..
SQL# — https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
May 9, 2014 at 4:49 am
Regarding the memory issue and SAFE vs UNSAFE, there is not much you can do:
1) The TimeZoneInfo class is known to have memory leaks which is why it has a Host Protection Attribute on it. See http://msdn.microsoft.com/en-us/library/cc645949(v=sql.100).aspx and look for "TimeZoneInfo" and you will see it listed with a Host Protection Attribute of "MayLeakOnAbort". So I don't think garbage collection has anything to do with it.
2) Because it has a Host Protection Attribute, you cannot use System.TimeZoneInfo in either SAFE or EXTERNAL_ACCESS assemblies. Your only option is UNSAFE, and that is due to the memory leak (as opposed to the memory leak due to the Assembly being set to UNSAFE).
Regarding the memory not being released until the SQL Server process is restarted:
you can free up the memory by causing the App Domain to unload, and that can be achieved by doing an ALTER ASSEMBLY to set the PERMISSION_SET to something other than the current, such as SAFE. Then do another ALTER to set it back to UNSAFE so you can use the function again.
Regarding performance:
1) in the SqlFunction() attribute, add "IsDeterministic = true". That might help the optimizer.
2) Why are you dealing with strings instead of dates? is the field itself a string? The conversion from String to DateTime back to String is part of what is taking so long. If possible, use the SqlDateTime type in the C# code.
Hi Solomon,
Thank you for the reply, that is very helpful.
When the memory footprint was so large and it wasn't been freed up, I tried dropping the user defined function and assembly and waited. Nothing happened and then I restarted the sql server service. I will try changing the PERMISSION_SET.
Can you please explain a little what exactly you mean by unloading the app domain? Is the app domain here, a host process which hosts CLR assemblies?
I used SqlString instead of SqlDateTime because, some of those many fields are strings.. Can you think of another alternative? Would it be better if I overload the function definition? (because in that case, return type also has to be different..)
(Also the tables and the fields on which this operation has to be done will be different across various client databases. So, the update statement is generated in a C# console app, using our own metadata layer..)
Thanks and Regards,
Venkat
May 9, 2014 at 6:04 am
i'm thinking you might be able to do this in SQL, can't you?
what am i overlooking here?
i can get the UTC date, and datediff it agaisnt getdate(0 to get my current timezone(or any timezone if i have a table of timezone offsets), and then use DateAdd tohandle the date? then store it as a string with the offset in it?
DECLARE @date DATETIME;
DECLARE @offsetminutes INT
SELECT @date = CONVERT(DATETIME,'1990-08-30 16:07:00.000'); --my kids birthday
SELECT @offsetminutes = DATEDIFF(MINUTE,GETUTCDATE(),GETDATE()) --240 minutes for EST
SELECT
dateadd(minute,@offsetminutes,@date),
CONVERT(datetimeoffset,(dateadd(minute,@offsetminutes,@date)))
Lowell
May 9, 2014 at 6:59 am
renu.iitkgp (5/9/2014)
Hi Solomon,
Thank you for the reply, that is very helpful.
When the memory footprint was so large and it wasn't been freed up, I tried dropping the user defined function and assembly and waited. Nothing happened and then I restarted the sql server service. I will try changing the PERMISSION_SET.
Can you please explain a little what exactly you mean by unloading the app domain? Is the app domain here, a host process which hosts CLR assemblies?
I used SqlString instead of SqlDateTime because, some of those many fields are strings.. Can you think of another alternative? Would it be better if I overload the function definition? (because in that case, return type also has to be different..)
(Also the tables and the fields on which this operation has to be done will be different across various client databases. So, the update statement is generated in a C# console app, using our own metadata layer..)
Thanks and Regards,
Venkat
Hi there.
Create one function using SqlDateTime for input and output parameter types. Then do CONVERT(DATETIME,...) and CONVERT(VARCHAR(30),...) on the way in and out of the UPDATE statements when needed.
App Domains are separate memory / processing spaces that are per Database and per Owner of the Assembly. Unloading it will clear out the memory. Look at the dmv: sys.app_domains (or something like that).
SQL# — https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
May 9, 2014 at 7:02 am
Lowell (5/9/2014)
i'm thinking you might be able to do this in SQL, can't you?what am i overlooking here?
Hey there. I am 99.9% positive that the issue is TIMEZONEOFFSET in SQL Server is not DST aware, hence making historical conversions suspect. That might need to be re-verified as it has been a year or two since I looked into this in detail, but I was wondering the same thing and seem to recall this being the blocker.
Take care,
Solomon...
EDIT:
Actually, yes, that is the issue. TIMEZONEOFFSET requires an actual offset, not a TimeZoneID that would point to info that includes the offset, DST info, etc.
SQL# — https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
May 26, 2014 at 7:52 am
Hi Solomon,
Sorry that I had to get to do a lot of other tasks, could get back to this only today. Thank you for the explanation and the suggestions.
I overloaded the SqlFunction to take SqlDateTime and return SqlDateTime, used IsDeterministic=true and am setting PERMISSION_SET to SAFE and back to UNSAFE again.. The memory still is not being cleared..
I do generate the update script for each table and execute from a C# console application. However, at some point during execution when the table is too huge, the memory usage shot up to 99% and never came down.. everything slowed down and after an hour I had to restart the machine.
Is there anything else that I could be missing? I'm considering stopping and starting SQL SERVER service from the C# application, which I know could be dirty.. Please let me know, if something better could be done.
Thank you
RPS
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply