I am sure many of you have worked, or are working, with a legacy system. Some system that was created long before you arrived. It may not have the best design, so you have to work around it since often the design or code can’t be changed. This is a story of a bug fix that “shouldn’t” have caused any issues with any of the other systems.
The basic system overview was a common one. Everything started with the ERP system. There was an ETL process that took the order data from the ERP database and brought it down to the legacy manufacturing database. I had written a new system that used the order data from the legacy database to create shipping rack assignments for the finished goods.
It was discovered that a product type field from the ERP was getting truncated down to 15 characters. This was an issue since the shipping racks were often separated by the product type, so only like product types could be shipped together. The characters that were truncated would have caused a different shipping rack to be used. Since my system was new it was assumed that my system was the one that was improperly truncating the product type data.
It was easy enough to show that the product type data was already truncated in the legacy tables, so my shipping system was off the hook for that issue. Looking at the legacy system, the tables had a datatype set to varchar(25), so data wasn’t getting truncated because of a data type issue in the legacy table. We double checked the ERP system table and found it also supported 25 characters in its table. So the only logical answer was the ETL process was somehow truncating the data.
As the ETL guys looked into the problem, they discovered that for some reason, the ETL table was set to varchar(15). Everyone assured us that there would be no issue fixing this since both the ERP and the legacy tables already supported the 25 character product type. So they went ahead and released the ETL change to production so the product type would come through without any truncation.
Then the problems started to happen. A little known SQL job started to fail. As we investigated the error, we discovered the SQL job was copying the product type into another table whose column was defined as only varchar(15). Thankfully, it was an easy thing to discover even though the SQL job error just said "binary or string truncation". Next, a legacy system began to fail because of some hard coding that had been done around a specific product type. It no longer matched because the full product type was now coming across.
Thankfully, these problems weren’t too hard to diagnose and the production impact was minimal. Still it would have been nice to find these issues in testing. It was kind of crazy that this truncation issue wasn’t found sooner as I believe the ETL system had been truncating the product type for over 10 years!
I guess the moral of the story is: even simple changes that don’t look like they will affect anything, still can and often do affect things that you don’t expect. Share a time when a simple change caused issues for a system you support.