Crash on node:10-alpine docker image on some systems #1226

Closed
orgads opened this issue Dec 22, 2019 · 6 comments

Comments

@orgads
Contributor

orgads commented Dec 22, 2019

Problem description

The grpc node module crashes on a particular system. I tried to reproduce on another machine and failed, but on this machine it reliably reproduces.

Reproduction steps

docker run -it node:10-alpine sh
mkdir a
cd a
npm init -y
export NODE_TLS_REJECT_UNAUTHORIZED=0
npm i grpc
cd node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-musl
node grpc_node.node
Segmentation fault (core dumped)

The same steps work with node:12-alpine.
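To compare the two images without typing the steps by hand, the reproduction can be scripted. This is a minimal sketch, not part of the original report: the function names `run_repro` and `compare_images` are made up for illustration, and the `*/` glob replaces the hard-coded `node-v64-linux-x64-musl` directory on the assumption that exactly one prebuilt binary directory gets installed per image.

```shell
#!/bin/sh
# Steps from the report, joined so they can run non-interactively via `sh -c`.
# NODE_TLS_REJECT_UNAUTHORIZED=0 is copied from the report as-is.
# Assumption: the glob matches a single prebuilt binary directory.
REPRO_STEPS='
mkdir a && cd a && npm init -y &&
export NODE_TLS_REJECT_UNAUTHORIZED=0 &&
npm i grpc &&
cd node_modules/grpc/src/node/extension_binary/*/ &&
node grpc_node.node
'

# Hypothetical helper: run the steps in a throwaway container of image $1.
run_repro() {
  docker run --rm "$1" sh -c "$REPRO_STEPS"
}

# Hypothetical helper: report ok/failed for every image passed as an argument.
compare_images() {
  for image in "$@"; do
    if run_repro "$image" >/dev/null 2>&1; then
      echo "$image: ok"
    else
      echo "$image: failed (exit $?)"
    fi
  done
}
```

Calling `compare_images node:10-alpine node:12-alpine` on an affected host should make the difference visible. A segmentation fault inside the container would typically surface as exit status 139 (128 + SIGSEGV).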

Environment

  • OS name, version and architecture: Host: Linux Ubuntu 18.04.2 amd64, Docker: 18.09.7
  • Kernel version: 5.0.0-37-generic
  • Node version: 10.18.0
  • Node installation method: Docker
  • If applicable, compiler version: Downloaded the pre-built binary
  • Package name and version: [email protected]

Additional context

(gdb) run grpc_node.node
Starting program: /usr/local/bin/node grpc_node.node
warning: Error disabling address space randomization: Operation not permitted
[New LWP 39]
[New LWP 40]
[New LWP 41]
[New LWP 42]
[New LWP 43]
[New LWP 44]

Thread 7 "node" received signal SIG34, Real-time event 34.
[Switching to LWP 44]
__cp_end () at src/thread/x86_64/syscall_cp.s:29
29      src/thread/x86_64/syscall_cp.s: No such file or directory.
(gdb) bt
#0  __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1  0x00007f64cf437895 in __syscall_cp_c (nr=202, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=<optimized out>, z=0)
    at src/thread/pthread_cancel.c:33
#2  0x00007f64cf436e9f in __futex4_cp (to=0x0, val=-1, op=<optimized out>,
    addr=0x555f5e7a4720 <node::inspector::(anonymous namespace)::start_io_thread_semaphore>) at src/thread/__timedwait.c:52
#3  __timedwait_cp (addr=addr@entry=0x555f5e7a4720 <node::inspector::(anonymous namespace)::start_io_thread_semaphore>, val=val@entry=-1, clk=clk@entry=0,
    at=at@entry=0x0, priv=<optimized out>) at src/thread/__timedwait.c:52
#4  0x00007f64cf439fe1 in sem_timedwait (sem=sem@entry=0x555f5e7a4720 <node::inspector::(anonymous namespace)::start_io_thread_semaphore>, at=at@entry=0x0)
    at src/thread/sem_timedwait.c:23
#5  0x00007f64cf43a055 in sem_wait (sem=sem@entry=0x555f5e7a4720 <node::inspector::(anonymous namespace)::start_io_thread_semaphore>) at src/thread/sem_wait.c:5
#6  0x0000555f5ce88cd2 in uv__sem_wait (sem=0x555f5e7a4720 <node::inspector::(anonymous namespace)::start_io_thread_semaphore>)
    at ../deps/uv/src/unix/thread.c:604
#7  uv_sem_wait (sem=0x555f5e7a4720 <node::inspector::(anonymous namespace)::start_io_thread_semaphore>) at ../deps/uv/src/unix/thread.c:660
#8  0x0000555f5cdf94b0 in node::inspector::(anonymous namespace)::StartIoThreadMain(void*) ()
#9  0x00007f64cf43833e in start (p=0x7f64cca2aab0) at src/thread/pthread_create.c:192
#10 0x00007f64cf43a447 in __clone () at src/thread/x86_64/clone.s:22
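
The trace above stops on SIG34 in a thread that is waiting on a semaphore, not on SIGSEGV, so it may not show the faulting frame. A possible next step, assuming SIG34 here is a real-time signal used internally and is not fatal by itself, is to tell gdb to pass it through and print the backtrace at the next stop. The wrapper name `gdb_skip_sig34` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical wrapper: run a program under gdb, pass SIG34 to the program
# without stopping, then print a backtrace wherever execution stops next.
gdb_skip_sig34() {
  gdb -batch \
    -ex 'handle SIG34 nostop noprint pass' \
    -ex run \
    -ex bt \
    --args "$@"
}
```

Usage inside the container would be `gdb_skip_sig34 node grpc_node.node`.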
@lxzlovelyidiot

Same here on CentOS (kernel 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 Linux) with alpine:3.10 and node-10.16.3.

@murgatroid99
Member

I can't reproduce this failure on either of the listed Alpine docker images.

@vdeturckheim

I have a similar issue with another native addon on Alpine Linux with Node 10. But so far I can only reproduce it in AWS CodeBuild, which runs Ubuntu 14 with an Amazon build of the Linux kernel.

Out of curiosity, @orgads, is the machine where it crashes on AWS?

@orgads
Contributor Author

orgads commented Mar 16, 2020

IIRC it was on Azure.

@vdeturckheim

Weird. I might have a temporary fix (it worked for me during my tests), but I am not clear on its implications: adding the --cap-add=SYS_PTRACE --security-opt seccomp=unconfined flags to docker run.
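
For anyone who wants to try the flags, here is a minimal sketch of how they fit into a `docker run` call. The helper name `run_with_workaround` is made up for illustration. On the implications: `seccomp=unconfined` turns off Docker's default seccomp syscall filter for that container, and `--cap-add=SYS_PTRACE` grants an extra capability, so this loosens the sandbox and is better suited to diagnosis than to production.

```shell
#!/bin/sh
# Flags quoted in the comment above.
WORKAROUND_FLAGS='--cap-add=SYS_PTRACE --security-opt seccomp=unconfined'

# Hypothetical helper: $1 is the image, the remaining arguments are the
# command to run inside the container.
run_with_workaround() {
  image=$1
  shift
  # $WORKAROUND_FLAGS is deliberately unquoted so it splits into separate arguments.
  docker run --rm $WORKAROUND_FLAGS "$image" "$@"
}
```

For example, `run_with_workaround node:10-alpine node -v` starts the container with both flags applied.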

@vdeturckheim

I am coming back to this. We identified an issue with a syscall made during the dlopen sequence on Alpine: it is not allowed on Docker. My colleague made a PR on Moby to fix it, and it has been released in Docker Engine 20.10. I expect this should fix the issue.
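
To check whether a host already runs an engine at or above that release, a small version comparison is enough. This is a sketch under assumptions: the helper name `version_ge` is made up, and it only compares the major and minor components of a dotted version string.

```shell
#!/bin/sh
# Hypothetical helper: succeed when dotted version $1 is >= $2,
# comparing the major component first and then the minor one.
version_ge() {
  a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
  b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
  [ "$a_major" -gt "$b_major" ] ||
    { [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -ge "$b_minor" ]; }
}
```

On a host, the engine version can be read with `docker version --format '{{.Server.Version}}'` and passed as the first argument, for example `version_ge "$(docker version --format '{{.Server.Version}}')" 20.10 && echo "engine is 20.10 or newer"`. The Docker 18.09.7 host from the original report would fail this check.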
