The trade-offs associated with low-code solutions

Low-code solutions often accelerate development and make tasks accessible to people who can’t or don’t want to write their own code. But it’s important to remember that this comes with a trade-off. You often gain shorter development and maintenance time at the cost of limited configuration options and minimal monitoring capabilities. Low-code solutions are great…until they aren’t.

Two laptops are shown. One has several lines of code while the other shows boxes and lines in a low-code solution
Low-code tools speed up development efforts but obscure the actual code, including many configurations.

I’d like to share a recent example of when a low-code solution I built in Azure Data Factory caused a bit of a mess with no way to diagnose or fix the issue. Don’t get me wrong — I am a fan of low-code tools. I spend a large portion of my time working in low-code tools like Azure Data Factory (ADF) and Power BI. I’m sharing this because I think it’s a great example of that trade-off, which many people fail to consider up front. In this case, we did consider the trade-offs. We just encountered the identified risk.

The Situation

I needed to set up some ADF pipelines to incrementally copy files from a third-party SFTP server. New files would be added each day, and those files needed to be copied to an Azure Data Lake Storage account.

This was an easy task using the ADF copy activity and the SFTP connector. I used an Azure integration runtime because we wanted to minimize infrastructure, and because our self-hosted integration runtime (SHIR) had port 22 blocked due to organizational policy. Port 22 is the standard port for SSH, which SFTP runs over, so we needed outbound access to that port. I built the pipeline, published it, associated a schedule trigger, and it ran happily for months.

Then one day, the owner of the SFTP server notified us that they would be upgrading their server. They said our connectivity should not be affected as long as we were using one of the supported key exchange algorithms. We double-checked the ADF documentation and confirmed we should be good.

On the day of the SFTP server upgrade, we started seeing the error below.

"Source=Microsoft.DataTransfer.ClientLibrary.SftpConnector,'Type=Renci.SshNet.Common.SshConnectionException,Message=The connection was closed by the server: No common host key algorithms."

We asked the SFTP server owner to check their logs to see if they could see a problem. We provided our username and an approximate execution time. But because we were using an Azure IR, we couldn’t provide the expected source IP address of our process, just a lot of IP ranges (which are provided in CIDR notation). This is a particularly busy server, so it took them a bit to find our connection attempts.
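To illustrate why a list of ranges is harder to work with than a single address, here is a minimal sketch of how a server owner might test whether a logged source IP falls inside any of the published ranges. The function name and the sample ranges are hypothetical (the ranges are reserved documentation addresses, not real Azure ranges).

```python
import ipaddress


def ip_in_ranges(ip, cidr_ranges):
    """Return True if the address falls inside any of the CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


# Placeholder ranges taken from the RFC 5737 documentation blocks.
# A real check would load the ranges published for the Azure IR region.
example_ranges = ["192.0.2.0/24", "198.51.100.0/25"]

for candidate in ["192.0.2.17", "198.51.100.200", "203.0.113.5"]:
    print(candidate, ip_in_ranges(candidate, example_ranges))
```

With hundreds of ranges and a busy server, every log entry has to be run through a check like this, which is part of why finding our connection attempts took a while.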

In the meantime, we logged a support ticket with Microsoft because it was looking likely that the problem was on the ADF side. The frontline Azure support was not helpful because they aren’t familiar with details of the SFTP connector. They tried their best to look through suggestions in their system, but they were ultimately a waste of time, other than to confirm that we (and apparently they) could not see any more information on the SFTP connection in the ADF logs. They actually told us the only way to troubleshoot was to spin up a server with port 22 unblocked, install the self-hosted integration runtime, and pull the logs from the server. This was against corporate policy, so we did not pursue that suggestion. But support blocked escalating to the product team because we refused to do that. We left our support ticket open so we could get more information.

I spent a bit of time getting up to speed on the details of how the SFTP protocol works. SFTP runs over an SSH connection. There are three steps to establishing a connection:

  1. The client verifies the host server.
  2. The client and server generate a session key together.
  3. The server authenticates the client.

That first step, also called SFTP host key validation, ensures the server’s identity by verifying its public key against a trusted list, preventing man-in-the-middle attacks and ensuring secure connections.
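As a rough sketch of what host key validation looks like on the client side, the snippet below computes an OpenSSH-style SHA-256 fingerprint from a base64-encoded public key and compares it against a trusted set. The function names and the sample key bytes are made up for illustration. This is not how the ADF connector implements the check, which we cannot see.

```python
import base64
import hashlib


def sha256_fingerprint(public_key_b64):
    """Return an OpenSSH-style fingerprint: 'SHA256:' plus unpadded base64 of the digest."""
    key_blob = base64.b64decode(public_key_b64)
    digest = hashlib.sha256(key_blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def is_trusted_host(public_key_b64, trusted_fingerprints):
    """Accept the server only if its key fingerprint is in the trusted set."""
    return sha256_fingerprint(public_key_b64) in trusted_fingerprints


# Hypothetical key material; a real key blob comes from the server during the handshake.
presented_key = base64.b64encode(b"example host key blob").decode("ascii")
trusted = {sha256_fingerprint(presented_key)}

print(is_trusted_host(presented_key, trusted))
```

If the server presents a key whose fingerprint is not in the trusted set, the client should refuse to continue, because it may be talking to an impostor.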

It turns out, our issue was in step 1. There are different host key types, which are not the same as key exchange algorithms. As far as I know, ADF doesn’t have the full list of supported host key types documented anywhere. They have the newly supported algorithms that work with host key fingerprints, but not the full list. The SFTP server owner got back to us and informed us that the logs showed that ADF was trying to connect using rsa-sha1 (the host key algorithm that SSH names ssh-rsa), which was disabled on their new server. That makes sense because NIST formally deprecated SHA-1 for digital signatures in 2011 and disallowed its use for generating them after 2013. OpenSSH 8.2, released on 2020-02-14, announced that the SHA-1 based ssh-rsa signature algorithm would be disabled by default in a future release.
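The failure mode can be sketched in a few lines. During the handshake, the client sends its host key algorithms in order of preference and the server sends the ones it supports. Roughly speaking, the first client preference the server also supports wins. The function below is a simplified illustration of that idea, not the actual SSH.NET or OpenSSH logic, and the algorithm lists are assumptions about what an old client and a hardened server might offer.

```python
def negotiate_host_key_algorithm(client_prefs, server_supported):
    """Pick the first client-preferred algorithm that the server also supports."""
    server_set = set(server_supported)
    for algorithm in client_prefs:
        if algorithm in server_set:
            return algorithm
    raise ConnectionError("No common host key algorithms")


# Assumed offers: an outdated client that only knows the SHA-1 based algorithm,
# and a server that has disabled it in favor of the SHA-2 variants.
old_client = ["ssh-rsa"]
new_client = ["rsa-sha2-512", "rsa-sha2-256", "ssh-rsa"]
hardened_server = ["rsa-sha2-512", "rsa-sha2-256"]

print(negotiate_host_key_algorithm(new_client, hardened_server))

try:
    negotiate_host_key_algorithm(old_client, hardened_server)
except ConnectionError as error:
    print(error)
```

Note that the RSA key itself can stay the same. The names ssh-rsa, rsa-sha2-256, and rsa-sha2-512 differ in the hash used for the signature, which is why a client upgrade alone can fix the problem.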

The ADF SFTP connector code uses SSH.NET (Renci.SshNet) behind the scenes. SSH.NET is a high-performance SSH library for .NET. It is used in the code that runs in the integration runtime. We cannot see or affect its use in the SFTP connector, and its use is not noted anywhere in the ADF documentation. Since our support ticket escalation was blocked, I used my connections to the product team to seek further help. It turns out that they were on an old version of SSH.NET, and their implementation had a problem where it was falling back to rsa-sha1 when rsa-sha2 should have been used.

The SFTP server owner temporarily relaxed the security requirements to allow us to access our data again. And luckily, Microsoft was working on a new version of the SFTP connector that used an upgraded version of the SSH.NET library. The product team connected with the support team, and we were allowed to switch to the preview version of the SFTP connector. After verifying that our pipeline ran successfully for a couple of days, we asked the SFTP server owner to check the logs again. The logs verified that we were now using rsa-sha2-512.

While we were ultimately able to resolve the issue, it took about 12 hours of my consulting time, back channeling to the product team (which not everyone can do), and a very accommodating data vendor who helpfully checked their logs upon request.

Alternatives

This pipeline was part of a larger solution, which probably took me about 6 hours to create, test, and deploy. Add that to the 12 hours of troubleshooting. In 18 hours, I could have easily written a Python script to do the same thing and had control over the way I connected and the libraries and specific versions used. I would also have had more detailed monitoring where I could have easily seen what host key algorithm was used. We likely would have shaved about 10 hours off our troubleshooting time. And we would have spent a bit more time fixing our code if we had encountered the same issue. But at least we could likely fix it.
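For a sense of what that script might look like, here is a minimal sketch that assumes the third-party paramiko library and password authentication. The function names, parameters, and paths are hypothetical. It copies to a local folder and leaves out the upload to Azure Data Lake Storage, retries, and host key verification, all of which a production version would need.

```python
import logging
import os

logging.basicConfig(level=logging.INFO)
# Paramiko's DEBUG output should include details of the SSH handshake,
# which is the visibility we lacked with the Azure IR.
logging.getLogger("paramiko").setLevel(logging.DEBUG)


def select_new_files(entries, last_run_epoch):
    """Return names of files modified after the last run.

    entries is an iterable of (filename, modified_time_epoch) pairs.
    """
    return sorted(name for name, mtime in entries if mtime > last_run_epoch)


def copy_new_files(host, username, password, remote_dir, local_dir,
                   last_run_epoch, port=22):
    """Copy files added since the last run from the SFTP server to local_dir."""
    import paramiko  # third-party; imported here so the helper above has no dependency

    # Refuse the SHA-1 based "ssh-rsa" algorithm instead of silently falling back to it.
    transport = paramiko.Transport(
        (host, port), disabled_algorithms={"keys": ["ssh-rsa"]}
    )
    try:
        # Assumption: password auth. Pass hostkey=... here to verify the server's key.
        transport.connect(username=username, password=password)
        sftp = paramiko.SFTPClient.from_transport(transport)
        # Assumption: remote_dir contains only files, no subdirectories.
        entries = [(a.filename, a.st_mtime) for a in sftp.listdir_attr(remote_dir)]
        for name in select_new_files(entries, last_run_epoch):
            sftp.get(f"{remote_dir}/{name}", os.path.join(local_dir, name))
            logging.info("Copied %s", name)
    finally:
        transport.close()
```

The library version is pinned by whoever owns the script, so an upgrade to pick up newer algorithms is a one-line change rather than a support ticket.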

But the organization would then be responsible for maintaining that code. And we would have spent more maintenance time keeping the environment and libraries up to date. This organization decided to keep the ADF pipeline and hope we didn’t run into further issues because they only had a couple of people who were competent in Python.

The trade-off

Sometimes using a low-code tool means you cannot access logs to identify the root cause of an issue. Sometimes, as in my situation, using a low-code tool means you cannot fix an issue.

In other cases, people who do not write code can accomplish a goal they would not have otherwise been able to achieve. And I definitely would have spent more time initially developing the solution if I had written a Python script compared to my development time in ADF (this is based upon personal skills and experience, so that may vary for you).

Again, I’m not here to tell you to avoid low-code tools. I’m just here to tell you to consider the pros and cons, especially if you are building a mission-critical process. Do you want more control but more development effort? Or do you want to rely on your tool’s vendor to diagnose and solve your issues?
