How to write to a file from SQL Server running on Docker Container on Linux

  • So we have a SQL Server database running in a Docker Container on Linux

    There is a stored procedure which will take in JSON as a parameter and extracts data out into a table

    called dbo.BCPOutput

    Now I can run this from SQL Server Management Studio (in SQLCMD mode)

    :connect 217CAS -U sa -P xxxxxxx
    :SETVAR DatabaseName Metrology
    :OUT /data/readings/Data.csv
    use Metrology;
    select csvline from dbo.bcpOutput

    But I want to be able to execute this code in my stored procedure

    So how can I write to a text file on my local drive from a stored procedure running on docker container on Linux?

    This code was working on windows...but we were able to use xp_cmdshell which only runs on windows

    Also FYI i dont have access the docker container or the linux vm to run stuff from the command line

    So it all has to be done from a stored procedure

  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • My opinion - don't. Firstly, SQLCMD mode is a client side thing (ie part of SSMS) not a server side thing (ie not part of SQL Server Engine) so SQLCMD won't work in a stored procedure - there is literally no way to make that work. Secondly, I wouldn't want that in a stored procedure because anybody that has view definition permissions on that stored procedure now has the details required to connect to 217CAS AS sa!!! This is a huge security risk in my mind and I would NOT be allowing that stored procedure to exist on my systems. My sa passwords (where enabled) are locked down so only a handful of users have visibility to them - the DBA group and that's it.

    Now, as for writing the output to disk, why not use a scheduled job to handle this? One of the output options on a SQL job step is to write the output to disk. Offhand, I have not tried this in linux, but I can't see why it wouldn't work.

    Now, if the stored procedure needs to be called manually or from an application, I think your options are limited to having the application write to disk or having the end user store the results to disk. That being said, if it needs to be run manually, then running the job is just as easy to do as running the stored procedure, so that's still an option. If it needs to be run by an application, you could have the application call a stored procedure that executes as owner then runs the job IF the application doesn't need the output from the stored procedure call. If the application does need the output from the stored procedure, then I think your options become more limited, but in theory you could have the application call the stored procedure you need PLUS call the stored procedure to schedule the job which would write the results to disk as long as there is no data manipulation in the stored procedure. If there is, then you'd need to be more creative with the SP that is scheduling the job so that it waits for the job to finish then would pull out the required data for the application.

    As an alternate solution - you could migrate back to a Windows based OS for your server as it sounds like Linux hosting doesn't suit your needs. You can do Docker inside Windows, so you could argue that the feature set of SQL Server on Linux doesn't meet the needs of your application and that you need a Windows host to run it on.

    As far as I know, those are your only options at this time. That COULD change with a future release of SQL, but I also wouldn't hold my breath. xp_cmdshell is easy to abuse so I use it sparingly. That being said, there are a other commands that are "easy to abuse" when they are misconfigured - linked servers being a big one. I also find that any time I am looking to use xp_cmdshell, it is likely because I am trying to avoid having to do something in the application layer that really should be done at the application layer.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Mr. Brian Gale wrote:

    xp_cmdshell is easy to abuse so I use it sparingly.

    I also find that any time I am looking to use xp_cmdshell, it is likely because I am trying to avoid having to do something in the application layer that really should be done at the application layer.

    On that first statement, not if you set it up correctly.

    On the second part, a lot of people avoid it because it means that SQL Server has to do a little file work.  So, instead, they write an app that beats the hell out of SQL Server and the network between the server and the app to get the data to write to a file and they call that an "improvement". 😀

    And then, on top of it all, they complain about how to schedule it and how to log the action and then make an entry in a log table in SQL Server.

    As you say... "In my opinion". 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Oh I agree with you here Jeff. When things are set up properly, xp_cmdshell can be secure, but I find that a lot of times, it is used and abused. Kind of like cursors. They both have their place, but once you start using it, it can easily get out of hand. And once it gets out of hand it can be hard to get it back under control without drastic changes to processes.

    Now, as for if you should use xp_cmdshell OR an application for a task, I feel that really falls onto the design. If the application NEEDS the results of the stored procedure call, then having the SP write it to disk and the app read it from disk feels like a lot of wasted effort. My approach is always to try to use the "right tool for the job". SQL is a container to hold data and provides me the data and do calculations on the data. That is my primary purpose for the database. There are always exceptions, but that is the primary purpose of the database. As soon as I start needing something more than that, I start looking at other options such as SSRS, SSIS, service broker, 3rd party apps, or even custom built apps. One risk with writing to disk from a stored procedure is how do I ensure that I don't accidentally overwrite any of the data on disk (for example)? or even the OP's original issue - how do you use xp_cmdshell on linux?

    I am not saying "never use xp_cmdshell! it's evil", I'm just saying that it is ALWAYS good to pick the right tool for the job. For example, I wouldn't write a custom app to report on data when I have SSRS installed and running already. I try to pick the right tool for the problem I am presented with, and in some cases xp_cmdshell IS the right tool (I do use it).

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Mr. Brian Gale wrote:

    I'm just saying that it is ALWAYS good to pick the right tool for the job.

    Yep.  Understood.  All I'm saying is that too many people think that xp_CmdShell isn't that tool and so they turn to a thousand other things to replace something that does the job perfectly and with little fan-fare and, frequently, with less stress on the server than a whole lot of those other methods.

    Mr. Brian Gale wrote:

    I find that a lot of times, it is used and abused.

    Oddly enough, I've found that in people's fervor to avoid it, they come up with some really unwieldy stuff and not just on occasion but a whole lot!  And, TBH, a lot of times, it's a whole lot less secure because they end up including logins and passwords in their plain text configuration file or even in the code itself.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I suppose that is a good point Jeff. In this specific scenario where the OP is trying to write the file to the server itself, having the data flow from application (request run stored procedure) to server (start stored procedure) to the application (result of stored procedure) to the server (write to server disk) is a lot of overhead and overkill.

    And I see your point too about security - if you don't know how to secure your SQL instance, do you know how to secure your application code? It may just be moving the problem from team A to team B and if neither team knows how to secure their stuff, you aren't really solving any problems. In my case, team A and team B are the same team - my team. I am just a part of my team (not the boss, I'm more on the "get stuff done" side of things), but I need to take into account both the database side and the application side for what I support and develop. For the things I work on, xp_cmdshell is not really needed. The one exception where I do use it a bit I'm sure people would argue it isn't needed there either, but it is in my restore scripts. I have a few stored procedures that get called when I need to generate restore scripts to refresh a live system to the test environment. I use xp_cmdshell to get the list of backup files then parse through those to find the ones needed in order to do a restore based on the datetime parameter (the restore scripts are designed for point in time restores). I know I could use backupset and backupmediafamily, but then I need to push my restore script out to ALL the SQL instances and I want to keep it all in one place - my "admin" database. So as long as I can restore admin, I can restore any system. I also built into that stored procedure a "schedule" option so if someone requests a database refresh after hours, I can start the script up and it just waits until the scheduled time to fire off. BUT I digress - I am getting really off topic.

    Now, back on topic, a SQL Job could handle the write to disk just as easily as xp_cmdshell IF the job could be scheduled AND it has the added benefit of you being able to run the stored procedure without risking overwriting the existing file on disk if you just want to look at the output. Now, if it can't be scheduled, then you get a lot of extra overhead built in with more points of failure using the job option and a second stored procedure that calls the job, so xp_cmdshell is sounding like the right tool for the job. BUT in the OP's case, xp_cmdshell is not an option - SQL on Linux doesn't support it in 2019. It MAY support it in the future (pretty sure 2022 doesn't support it either), but at the moment it's not an option. BUT even if xp_cmdshell was an option, that first line in the stored procedure is a HUGE red flag for me - connecting to ANY system as sa is a pink flag for me (there are cases where it is required, but they are more rare than a cursor) and connecting from a stored procedure as sa is a HUGE red flag. I know "xp_cmdshell" is not the stuff that is written in the OP - that's SQLCMD, I just wanted to point out that IF xp_cmdshell command had any mention of an sa password, I'd shoot that down going to production faster than a DBA says "it depends". I doubt anyone in the world could convince me that connecting to a SQL instance from an end user process as sa is a good idea. I wouldn't let an application connect as sa; the ONLY users who should use sa are the DBA's, and even they shouldn't be using it unless there is no other option. Best practice is to not even have the account enabled if you don't need to have it enabled.

    Now, as for picking the right tool for the job, that can be quite a challenge to get right as well. I've had cases where the right tool for the job was to do the calculation inside SSRS, while other times it was better to do it inside the TSQL query for example. I've written code that does far more "sanity checks" than it should be simply because the application wasn't coded to do the sanity checks. So, my opinion, sanity checks should be at the application layer so that the SP calls are faster and easier to test since by the time it hits the SP, everything is in a "known possible" state and the SP can just do the work instead of checking if the input makes sense.

    With respect to the original question though, I don't know what the "best solution" is for what they are trying to do. Pushing it out to the application layer feels like it won't work since the OP stated that they do not have access to the server filesystem, so I doubt end users do either. xp_cmdshell is out since it is linux, so my best option is to build a job and a stored procedure to call that job if the job can't be scheduled. Alternately, migrate the server back to Windows and use xp_cmdshell.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • I agree... for the OP, I've not had to work with Linux nor containers.

    Hopefully, this will be a nice "bump" to get someone that has to take a look at this.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hey guys

    Thanks a lot for your  posts on this.

    One option im looking at is to use bash scripting to call bcp.

    I think i can connect my docker container to bash. then issue a bash comand to run bcp

    and output from the table dbo.BCPOutput.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply