April 2, 2019 at 3:41 pm
I have mentioned this in two other posts but need to make it very clearly visible, given that fixing the issue will not retroactively fix the mangled data (unlike fixing other display issues, such as bulleted / numbered lists now appearing correctly):
Characters that are not found on Code Page 1252 are not stored correctly. They are at best re-mapped to something similar via "best fit" mappings, or at worst are converted to the default replacement character: "?".
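To see both outcomes for yourself, something along these lines should work as a rough sketch. The three sample characters and the collation name are my own picks for illustration (Latin1_General_100_CI_AS is one of the collations that uses Code Page 1252); they are not taken from the site's code.

```sql
-- Sketch only: sample characters and collation are illustrative assumptions.
-- The COLLATE clause makes the conversion use Code Page 1252 regardless of
-- the current database's default collation.
SELECT src.Original,
       CONVERT(VARCHAR(10), src.Original COLLATE Latin1_General_100_CI_AS) AS AfterVarchar
FROM (VALUES
        (N'é'),   -- exists on Code Page 1252, so it should survive
        (N'Ā'),   -- not on Code Page 1252; candidate for a "best fit" mapping
        (N'☃')    -- not on Code Page 1252, nothing similar; candidate for "?"
     ) AS src(Original);
```

Comparing the two columns row by row shows which characters survive the trip through VARCHAR and which do not.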
Content that was ported from the old site seems to have come over correctly. BUT, any new content, either submitted via the website or even imported (e.g. blog posts), is affected. The following list of characters should be the only non-standard ASCII characters that save correctly based on my testing thus far (they are the ASCII Extended set -- values 128 - 255 -- via Code Page 1252):
€ ‚ ƒ „ … † ‡ ˆ ‰ Š ‹ Œ Ž ‘ ’ “ ” • – — ˜ ™ š › œ ž Ÿ ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
There should only be 5 "unknown character" placeholder glyphs in that set, as Code Page 1252 leaves 5 values undefined (0x81, 0x8D, 0x8F, 0x90, and 0x9D). That list was generated using the following T-SQL:
-- Concatenate the characters for values 128 through 255 of the current code page.
DECLARE @CP1252characters VARCHAR(300) = '';
SELECT TOP (128)
       @CP1252characters += ' ' + CHAR(ROW_NUMBER() OVER (ORDER BY @@MICROSOFTVERSION) + 127)
FROM   master.sys.columns;
PRINT @CP1252characters;
As I mentioned in the "New blog import process mangles blog content" issue, there is clearly a VARCHAR parameter and/or variable and/or column being used somewhere in the saving process.
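As a minimal sketch of why a single VARCHAR link in the chain is enough to cause this, consider the following. All of the names here (@Submitted, @Intermediate, @Posts) are hypothetical and do not come from the site's code, and the example assumes the current database's default collation uses Code Page 1252.

```sql
-- Sketch only: hypothetical names; assumes a Code Page 1252 default collation.
DECLARE @Submitted    NVARCHAR(50) = N'Ā ☃ é';   -- text as typed by the user
DECLARE @Intermediate VARCHAR(50);                -- the suspected weak link
DECLARE @Posts TABLE (Route VARCHAR(20), Content NVARCHAR(50));  -- Unicode-capable destination

-- The implicit NVARCHAR-to-VARCHAR conversion happens on this assignment.
SET @Intermediate = @Submitted;

INSERT INTO @Posts (Route, Content) VALUES ('via VARCHAR', @Intermediate);
INSERT INTO @Posts (Route, Content) VALUES ('NVARCHAR only', @Submitted);

SELECT Route, Content FROM @Posts;
```

Even though the destination column is NVARCHAR, the row that passed through the VARCHAR variable has already been converted by the time it is inserted, and storing it as NVARCHAR afterwards does not bring the original characters back.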
Take care,
Solomon...
P.S. If you are unable to find the issue quickly, I am available for consulting. I'm not being flippant or sarcastic or anything that might be perceived as negative or presumptuous here. Character encoding is a highly complex and tricky topic that trips up most computer folk, even the very intelligent / talented ones. But, it's something I have been researching / specializing in for 6 years now, and am able to assist with if needed / wanted 😺 .
P.P.S. This is actually a re-post. Against my better judgement, I made the mistake of editing my original post and lost it: https://www.sqlservercentral.com/forums/topic/major-issue-to-fix-asap-data-loss-for-characters-not-included-in-code-page-1252
P.P.P.S. I am tempting fate here by editing this post (2nd edit, actually), but I am feeling daring 😼
SQL# — https://SQLsharp.com/ ( SQLCLR library of over 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
April 2, 2019 at 4:06 pm
Noted - we'll dig in as soon as we can and get back to you.
September 14, 2021 at 4:00 pm
This issue was ultimately the result of string literals not being prefixed with an upper-case "N" in the T-SQL code generated by the PHP code (of ProjectNami). I submitted a fix for that to ProjectNami and it was recently merged (hence, official):
https://github.com/ProjectNami/projectnami/pull/422
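For anyone wanting to reproduce the effect of the missing prefix directly in T-SQL, here is a rough sketch. The table variable and sample text are hypothetical (this is not the statement that ProjectNami generates), and it assumes the current database's default collation uses Code Page 1252.

```sql
-- Sketch only: hypothetical table and data; assumes a Code Page 1252 default collation.
DECLARE @Demo TABLE (Id INT IDENTITY(1, 1), Content NVARCHAR(50));

INSERT INTO @Demo (Content) VALUES ('Ā ☃');    -- no prefix: the literal is VARCHAR
INSERT INTO @Demo (Content) VALUES (N'Ā ☃');   -- "N" prefix: the literal is NVARCHAR

-- UNICODE() returns the code point of the first character in each stored value.
SELECT Id, Content, UNICODE(Content) AS FirstCodePoint
FROM   @Demo
ORDER BY Id;
```

The column type is the same for both rows; the only difference is the one-character prefix on the literal, which determines whether the text is converted to the database's code page before it ever reaches the column.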
Fixing that issue on SSC.com, fortunately, does not require any extensive work or upgrade of ProjectNami. It only requires adding that single character to the same line of code (even if that line is at a different line number in the SSC.com version of ProjectNami). Please see the following SSC.com forum post for details:
Take care,
Solomon...