Last week Shawn Melton, MVP and dbatools contributor, had an issue running SQL Server on Linux inside the Windows Subsystem for Linux.
Error trying to configure #sqlLinux on the openSUSE app for Windows 10 pic.twitter.com/0Eg5TtV0o5
— Shawn Melton (@wsmelton)
I didn’t want to leave a brother hanging so I spent this morning digging into this a little bit.
Reproducing the Issue
The first thing I had to do was reproduce the issue. So on my Windows 10 test VM I installed the Windows Subsystem for Linux (steps to do so are here) and then installed the Ubuntu app.
Then I fired up a bash shell in WSL and installed SQL Server on Linux for Ubuntu as documented here.
Next, I completed the installation of SQL Server on Linux using mssql-conf. When that program completes, it attempts to start SQL Server on Linux. BOOM! I’m able to reproduce the same error.
Looking at the error, I decided to see if I could run SQL Server on Linux from the shell as the user mssql. This would remove systemd and mssql-conf from the picture. Basically I wanted to see if I could get another, more descriptive, error to pop out. To do that we’ll need to change over to the mssql user with su.
<span style="font-family: Courier;">sudo su mssql -</span>
And then change into the working directory for SQL Server on Linux and try to launch SQL Server.
<span style="font-family: Courier;">cd /var/opt/mssql/<br />/opt/mssql/bin/sqlservr</span>
Now, doing that…generates the same error! Here’s the error in a search engine friendly form 🙂
mssql@DESKTOP:~$ /opt/mssql/bin/sqlservr<br />This program has encountered a fatal error and cannot continue running.<br />The following diagnostic information is available:<br /> Reason: 0x00000003<br /> Message: fd != -1<br /> Stacktrace: 00007f818942d4d3 00007f8188de76ba 00007f81863e73dd<br /> Process: 79 - sqlservr<br /> Thread: 80<br /> Instance Id: 50bd6e1b-8f6c-45b3-939d-2338725d8b4a<br /> Crash Id: d38007c0-48c6-4374-9205-5539333138ff<br /> Build stamp: 5fb3474a5f63ad2f4b7eddadad44a086839721f18a66c5fb5d7cfcce25c0f539<br />This program has encountered a fatal error and cannot continue running.<br />The following diagnostic information is available:<br /> Reason: 0x00000003<br /> Message: fd != -1<br /> Stacktrace: 00007f818942d6ac 00007f8188de76ba 00007f81863e73dd<br /> Process: 81 - sqlservr<br /> Thread: 83<br /> Instance Id: 50bd6e1b-8f6c-45b3-939d-2338725d8b4a<br /> Crash Id: d38007c0-48c6-4374-9205-5539333138ff<br /> Build stamp: 5fb3474a5f63ad2f4b7eddadad44a086839721f18a66c5fb5d7cfcce25c0f539<br />*********** PANIC CORE DUMP GENERATION FAILED **********<br />Attempt to launch handle-crash.sh failed with error 0x0000000C<br />Aborted (core dumped)
Digging a Little Deeper
So now with the same error output, I decided to give it a cursory pass with strace to see if I could find anything that would put us closer to why SQL Server on Linux won’t start when using Windows Subsystem for Linux.
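The exact strace command isn't shown in this post, so the following is only a sketch of how a trace like the one below could be collected and skimmed. The `-f` flag (follow threads and children created by clone) and the `-o` flag (write the trace to a file) are standard strace options. The output path `/tmp/sqlservr.strace` and the helper names `trace_sqlservr` and `failed_calls` are assumptions made for illustration.

```shell
# Hypothetical helper: trace sqlservr and everything it clones.
# -f follows threads/children, -o writes the trace to a file with a PID
# prefix on each line. The default output path is an assumption.
trace_sqlservr() {
    out="${1:-/tmp/sqlservr.strace}"
    strace -f -o "$out" /opt/mssql/bin/sqlservr
}

# Hypothetical helper: print only the system calls that failed.
# strace reports a failure as " = -1 ERRNO_NAME (description)".
failed_calls() {
    grep -E ' = -1 E[A-Z]+' "$1"
}
```

Run as the mssql user from /var/opt/mssql, `trace_sqlservr` followed by `failed_calls /tmp/sqlservr.strace` narrows a long trace down to the calls worth reading first.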
What you see in the strace output is the parent process creating the child sqlservr process and failing. In the first line of output you can see process 137 call clone, which returns ID 139. That is how a parent creates a child in Linux (with the CLONE_THREAD flag, as here, the child is a thread in the same thread group). Then 139 tries to perform some setup operations, like registering signal actions (rt_sigaction) and the corresponding routines to call when each signal is received.
Now the only error I found in the output is the prctl call, which returns invalid argument (EINVAL). This system call performs operations on a process. The option being set, PR_SET_PTRACER, is for the Yama LSM, whose settings normally live in /proc/sys/kernel/yama. That directory doesn’t exist on my Ubuntu WSL installation. I checked my CentOS full VMs and it exists there. I checked a full Ubuntu installation and it’s there too.
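The check described above can be sketched as a small shell function. This is a minimal sketch, not the exact commands used for this post. The function name `yama_status` is hypothetical, and the directory is a parameter only so the same check can be pointed at another location. `ptrace_scope` is the sysctl file that Yama exposes in that directory.

```shell
# Hypothetical helper: report whether the Yama sysctl directory is present.
# Defaults to the standard location, /proc/sys/kernel/yama.
yama_status() {
    yama_dir="${1:-/proc/sys/kernel/yama}"
    if [ -r "$yama_dir/ptrace_scope" ]; then
        echo "present (ptrace_scope=$(cat "$yama_dir/ptrace_scope"))"
    else
        echo "missing"
    fi
}

# Check the running system.
yama_status
```

On a system where the directory is absent, as on the WSL installation here, this prints `missing`.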
After the error, SQL Server calls tgkill and kills itself with the SIGABRT signal. A dump occurs and the program exits.
<strong>137 clone(child_stack=0x7fb1a0feff30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb1a0ff09d0, tls=0x7fb1a0ff0700, child_tidptr=0x<br /></strong>7fb1a0ff09d0) = 139<br />139 set_robust_list(0x7fb1a0ff09e0, 24 <unfinished ...><br />137 fcntl(1, F_SETFL, O_RDONLY|O_APPEND <unfinished ...><br />139 <... set_robust_list resumed> ) = 0<br />137 <... fcntl resumed> ) = 0<br />139 gettid( <unfinished ...><br />137 fcntl(2, F_SETFL, O_RDONLY|O_APPEND <unfinished ...><br />139 <... gettid resumed> ) = 139<br />137 <... fcntl resumed> ) = 0<br />139 rt_sigaction(SIGABRT, {0x7fb1a6a2e470, [ABRT], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, <unfinished ...><br />137 getrlimit(RLIMIT_NOFILE, <unfinished ...><br />139 <... rt_sigaction resumed> {0x7fb1a6a2d290, [ABRT], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />137 <... getrlimit resumed> {rlim_cur=1024, rlim_max=4*1024}) = 0<br />139 rt_sigaction(SIGILL, {0x7fb1a6a2e470, [ILL], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, <unfinished ...><br />137 setrlimit(RLIMIT_NOFILE, {rlim_cur=4*1024, rlim_max=4*1024} <unfinished ...><br />139 <... rt_sigaction resumed> {0x7fb1a6a52790, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fb1a63f1390}, 8) = 0<br />137 <... setrlimit resumed> ) = 0<br />139 rt_sigaction(SIGFPE, {0x7fb1a6a2e470, [FPE], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, <unfinished ...><br />137 gettid( <unfinished ...><br />139 <... rt_sigaction resumed> {0x7fb1a6a52790, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fb1a63f1390}, 8) = 0<br />137 <... gettid resumed> ) = 137<br />139 rt_sigaction(SIGSEGV, {0x7fb1a6a2e470, [SEGV], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, <unfinished ...><br />137 rt_sigprocmask(SIG_BLOCK, [TERM], <unfinished ...><br />139 <... rt_sigaction resumed> {0x7fb1a6a52790, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fb1a63f1390}, 8) = 0<br />137 <... 
rt_sigprocmask resumed> NULL, 8) = 0<br />139 rt_sigaction(SIGBUS, {0x7fb1a6a2e470, [BUS], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, <unfinished ...><br />137 rt_sigtimedwait([TERM], NULL, NULL, 8 <unfinished ...><br />139 <... rt_sigaction resumed> {0x7fb1a6a52790, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fb1a63f1390}, 8) = 0<br />139 rt_sigaction(SIGTRAP, {0x7fb1a6a2e470, [TRAP], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a52790, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fb1a63f1390}, 8) = 0<br />139 rt_sigaction(SIGSYS, {0x7fb1a6a2e470, [SYS], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2d290, [SYS], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGXCPU, {0x7fb1a6a2e470, [XCPU], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2d290, [XCPU], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGXFSZ, {0x7fb1a6a2e470, [XFSZ], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2d290, [XFSZ], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGSTKFLT, {0x7fb1a6a2e470, [STKFLT], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2d290, [STKFLT], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb1a03f0000<br />139 munmap(0x7fb1a03f0000, 4194304) = 0<br />139 mmap(NULL, 8384512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb19fff0000<br />139 munmap(0x7fb19fff0000, 65536) = 0<br />139 munmap(0x7fb1a0400000, 4124672) = 0<br />139 open("/proc/self/status", O_RDONLY) = 41<br />139 fstat(41, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0<br />139 read(41, "Name:\tsqlservr\nState:\tS (sleepin"..., 4096) = 663<br />139 close(41) = 0<br />139 rt_sigaction(SIGABRT, {0x7fb1a6a2d290, [ABRT], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2e470, [ABRT], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGILL, {0x7fb1a6a2d290, [ILL], SA_RESTORER|SA_RESTART, 
0x7fb1a39154b0}, {0x7fb1a6a2e470, [ILL], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGFPE, {0x7fb1a6a2d290, [FPE], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2e470, [FPE], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGSEGV, {0x7fb1a6a2d290, [SEGV], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2e470, [SEGV], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGBUS, {0x7fb1a6a2d290, [BUS], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2e470, [BUS], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGTRAP, {0x7fb1a6a2d290, [TRAP], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2e470, [TRAP], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGSYS, {0x7fb1a6a2d290, [SYS], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2e470, [SYS], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGXCPU, {0x7fb1a6a2d290, [XCPU], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2e470, [XCPU], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGXFSZ, {0x7fb1a6a2d290, [XFSZ], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2e470, [XFSZ], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br />139 rt_sigaction(SIGSTKFLT, {0x7fb1a6a2d290, [STKFLT], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, {0x7fb1a6a2e470, [STKFLT], SA_RESTORER|SA_RESTART, 0x7fb1a39154b0}, 8) = 0<br /><strong>139 prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY) = -1 EINVAL (Invalid argument)<br /></strong>139 prctl(PR_SET_PDEATHSIG, SIG_0) = 0<br />139 open("/proc/self/status", O_RDONLY) = 41<br />139 fstat(41, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0<br />139 read(41, "Name:\tsqlservr\nState:\tS (sleepin"..., 4096) = 663<br />139 close(41) = 0<br />139 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0<br /><strong>139 tgkill(137, 139, SIGABRT) = 0<br /></strong>139 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=137, si_uid=999} ---<br 
/>139 gettid() = 139<br />139 write(2, "Dump collecting thread [139] hit"..., 57) = 57<br />139 exit_group(-1) = ?<br />137 +++ exited with 255 +++<br />138 +++ exited with 255 +++<br />139 +++ exited with 255 +++
What’s Really Happening?
Well, I think something is missing from Windows Subsystem for Linux. Is it the Yama stuff…perhaps. But clearly SQL Server isn’t happy with the environment and kills itself. I haven’t dug into WSL yet and I don’t know how it’s implemented, so there could be something going on at that level too. Generally I don’t write blog posts where I don’t know exactly what’s going on, but I did want to let folks know that SQL Server on Linux doesn’t work on Windows Subsystem for Linux.
The post Attempting to Run SQL on Linux Inside Windows Subsystem for Linux appeared first on Centino Systems Blog.