Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.5.10 memory leak regression when used with libcurl #5044

Closed
sbiscigl opened this issue Jan 17, 2025 · 11 comments
Closed

v1.5.10 memory leak regression when used with libcurl #5044

sbiscigl opened this issue Jan 17, 2025 · 11 comments

Comments

@sbiscigl
Copy link
Contributor

Hey from the aws cpp sdk, and we're trying to update our submodule dependencies and we noticed a regression/memory leak when using s2n in conjuction with libcurl and its blocking us from updating our dependencies.

Created a small reproduction

#include <curl/curl.h>
#include <s2n.h>

auto main() -> int {
    s2n_init();
    curl_global_init(CURL_GLOBAL_ALL);
    curl_global_cleanup();
    s2n_cleanup();
    s2n_cleanup_final();
    return 0;
}

Dockerfile that can reproduce it:

# Using offical Amazon Linux 2 image from public ECR
FROM public.ecr.aws/amazonlinux/amazonlinux:2

#Install g++
RUN yum groupinstall "Development Tools" -y

#Install required dependencies
RUN yum install -y curl-devel openssl-devel ninja-build cmake3 rsync gdb libasan

#Link cmake3 correctly
RUN ln -s /usr/bin/cmake3 /usr/bin/cmake

# v1.5.9 works
#RUN git clone -b v1.5.9 https://github.com/aws/s2n-tls.git &&\
#    cd s2n-tls &&\
#    mkdir build &&\
#    cd build &&\
#    cmake3 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS="-ggdb -fsanitize=address" -DCMAKE_INSTALL_PREFIX="/s2n-install" .. && \
#    cmake3 --build . &&\
#    cmake3 --install .

# v1.5.10/main has a memory leak
RUN git clone https://github.com/aws/s2n-tls.git &&\
    cd s2n-tls &&\
    mkdir build &&\
    cd build &&\
    cmake3 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS="-ggdb -fsanitize=address" -DCMAKE_INSTALL_PREFIX="/s2n-install" .. && \
    cmake3 --build . &&\
    cmake3 --install .

# Create test application
RUN mkdir test

# Create CMakeLists.txt
RUN touch /test/CMakeLists.txt &&\
    echo "cmake_minimum_required(VERSION 3.13)" >> /test/CMakeLists.txt &&\
    echo "project(test_s2n_init)" >> /test/CMakeLists.txt &&\
    echo "set(CMAKE_CXX_STANDARD 20)" >> /test/CMakeLists.txt &&\
    echo "find_package(CURL REQUIRED)" >> /test/CMakeLists.txt &&\
    echo "find_package(s2n REQUIRED)" >> /test/CMakeLists.txt &&\
    echo "add_executable(\${PROJECT_NAME} "main.cpp")" >> /test/CMakeLists.txt &&\
    echo "target_include_directories(\${PROJECT_NAME} PRIVATE \${CURL_INCLUDE_DIRS})" >> /test/CMakeLists.txt &&\
    echo "target_link_libraries(\${PROJECT_NAME} PRIVATE \${CURL_LIBRARIES} AWS::s2n)" >> /test/CMakeLists.txt

# Create simple test file
RUN touch /test/main.cpp &&\
    echo "#include <curl/curl.h>" >> /test/main.cpp &&\
    echo "#include <s2n.h>" >> /test/main.cpp &&\
    echo "" >> /test/main.cpp &&\
    echo "auto main() -> int {" >> /test/main.cpp &&\
    echo "    s2n_init();" >> /test/main.cpp &&\
    echo "    curl_global_init(CURL_GLOBAL_ALL);" >> /test/main.cpp &&\
    echo "    curl_global_cleanup();" >> /test/main.cpp &&\
    echo "    s2n_cleanup();" >> /test/main.cpp &&\
    echo "    s2n_cleanup_final();" >> /test/main.cpp &&\
    echo "    return 0;" >> /test/main.cpp &&\
    echo "}" >> /test/main.cpp

# Build test
RUN cd test &&\
    mkdir build &&\
    cd build &&\
    cmake3 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS="-ggdb -fsanitize=address" -DCMAKE_PREFIX_PATH="/s2n-install" .. &&\
    cmake3 --build .

can build and run this to replicate with

docker build -t test-image .
docker run --name test-image test-image /test/build/test_s2n_init

but you should see the stack trace

=================================================================
==1==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 216 byte(s) in 1 object(s) allocated from:
    #0 0xffff966dbdb3 in __interceptor_malloc (/lib64/libasan.so.4+0xdbdb3)
    #1 0xffff9626f35f in CRYPTO_malloc (/lib64/libcrypto.so.10+0x6f35f)
    #2 0xffff962d6d8b in ENGINE_new (/lib64/libcrypto.so.10+0xd6d8b)
    #3 0x40c757 in s2n_rand_init /s2n-tls/utils/s2n_random.c:576
    #4 0x4063c3 in s2n_init /s2n-tls/utils/s2n_init.c:78
    #5 0x40606f in main /test/main.cpp:5
    #6 0xffff96033da3 in __libc_start_main (/lib64/libc.so.6+0x1fda3)
    #7 0x405f87  (/test/build/sdk_usage_workspace+0x405f87)

which is indicative of this change being the cultprit of it.

would guess this is a issue with static state being shared between s2n and libcurl in some way.

@boquan-fang
Copy link
Contributor

Hello Sam,

Thanks for providing all those context. I have pull downed your Dockerfile and can confirm that the leak is caused by PR#4878. I did two git checkouts for the commit hash of PR#4878 and its parent commit. ASAN didn't complain for the parent commit, but did detect memory leaks for PR#4878.

@maddeleine
Copy link
Contributor

maddeleine commented Jan 21, 2025

So far I don't know why the leak started to occur. However it seems that the try-compile feature S2N_LIBCRYPTO_SUPPORTS_ENGINE evaluates to true in the repro. Which means that our custom randomness code is trying to be instantiated/cleaned up. Maybe it was turned off for this platform previously. I'm going to hazard a guess that the libcrypto being used here is Openssl 1.0.2.

@maddeleine
Copy link
Contributor

Leak goes away if you re-order the init/cleanup code:

auto main() -> int {
    curl_global_init(CURL_GLOBAL_ALL);
    s2n_init();
    s2n_cleanup_final();
    curl_global_cleanup();
    return 0;
}

I suspect curl is wiping our randomness engine, maybe replacing it with their own. Therefore our engine pointer memory gets leaked since its being wiped.

@sbiscigl
Copy link
Contributor Author

Leak goes away if you re-order the init/cleanup code

yeah I noticed that too, and I would re-order if I could, but the order of initialization happens at several layers of indirection for us. we initialize the CRT, which initializes s2n, then initialize the http client. we cannot move the crt initialization after the http client as the http client is dependent on CRT stuff. I know that doesnt have to do with s2n, but re-ordering the call order isn't as trivial for us as it would be in this example.

I'm going to hazard a guess that the libcrypto being used here is Openssl 1.0.2.

looking at the libcrypto it finds in the image might not be

bash-4.2# strings /usr/lib64/libcrypto.so | grep "^OpenSSL \S\+ [0-9]\+ \S\+ [0-9]\+"
OpenSSL 1.0.0-fips 29 Mar 2010
OpenSSL 1.0.1e-fips 11 Feb 2013

@maddeleine
Copy link
Contributor

Hmmm wow that is an old version of Openssl. Okay here's my guess about why this started to cause issues. Before the PR, we were gating our custom randomness engine with !defined(OPENSSL_FIPS). We don't want to turn on the custom randomness code if the libcrypto is doing anything fips-related. Now, we are using a slightly more complicated check, here where we both check OPENSSL_FIPS and FIPS_mode(), which ensures that not only is the libcrypto capable of fips, but the libcrypto actually has fips mode turned on. And FIPS_mode() would definitely return false for OpenSSL 1.0.0/1e-fips, because these libraries are so old they aren't actually certified for fips anymore.

Anyways it seems like our code is working as intended, replacing the randomness engine when s2n-tls is built with older libcryptos. There's obviously some sort of incompatibility with our engine code and how curl works. Not sure what we can do about that yet.

@sbiscigl
Copy link
Contributor Author

Hmmm wow that is an old version of Openssl.

It is! but its also what comes with amazon linux 2 by default, so it is something that has to be in compatibility matrix.

@sbiscigl
Copy link
Contributor Author

ok so updating

yum install -y openssl-devel

to

yum install -y openssl11-devel

Will result in the memory leak going away. however this seems like a breaking change to require customers to use a newer version of openssl-devel on AL2. If we do want to say "thats not supported" that will impact a non-zero amount of our customers using the RPM on AL2, so we shouldn't do this lightly.

@sbiscigl
Copy link
Contributor Author

ok was able to reproduce with 1.0.2, attaching the dockerfile here

# Using offical Amazon Linux 2 image from public ECR
FROM public.ecr.aws/amazonlinux/amazonlinux:2

RUN yum groupinstall "Development Tools" -y
RUN yum install -y curl-devel ninja-build cmake3 libasan wget

# lets get openssl 1.0.2
RUN wget https://github.com/openssl/openssl/releases/download/OpenSSL_1_0_2/openssl-1.0.2.tar.gz &&\
    tar -xvzf openssl-1.0.2.tar.gz &&\
    cd openssl-1.0.2 &&\
    ./config &&\
    make &&\
    make install


# v1.5.9 works
#RUN git clone -b v1.5.9 https://github.com/aws/s2n-tls.git &&\
#    cd s2n-tls &&\
#    mkdir build &&\
#    cd build &&\
#    cmake3 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_PREFIX_PATH="/usr/local/ssl/" -DCMAKE_CXX_FLAGS="-ggdb -fsanitize=address" -DCMAKE_INSTALL_PREFIX="/s2n-install" .. && \
#    cmake3 --build . &&\
#    cmake3 --install .

# v1.5.10/main has a memory leak
RUN git clone -b v1.5.10 https://github.com/aws/s2n-tls.git &&\
    cd s2n-tls &&\
    mkdir build &&\
    cd build &&\
    cmake3 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_PREFIX_PATH="/usr/local/ssl/" -DCMAKE_CXX_FLAGS="-ggdb -fsanitize=address" -DCMAKE_INSTALL_PREFIX="/s2n-install" .. && \
    cmake3 --build . &&\
    cmake3 --install .

# # Create test application
RUN mkdir test

# # Create CMakeLists.txt
RUN touch /test/CMakeLists.txt &&\
    echo "cmake_minimum_required(VERSION 3.13)" >> /test/CMakeLists.txt &&\
    echo "project(test_s2n_init)" >> /test/CMakeLists.txt &&\
    echo "set(CMAKE_CXX_STANDARD 20)" >> /test/CMakeLists.txt &&\
    echo "find_package(CURL REQUIRED)" >> /test/CMakeLists.txt &&\
    echo "find_package(s2n REQUIRED)" >> /test/CMakeLists.txt &&\
    echo "add_executable(\${PROJECT_NAME} "main.cpp")" >> /test/CMakeLists.txt &&\
    echo "target_include_directories(\${PROJECT_NAME} PRIVATE \${CURL_INCLUDE_DIRS})" >> /test/CMakeLists.txt &&\
    echo "target_link_libraries(\${PROJECT_NAME} PRIVATE \${CURL_LIBRARIES} AWS::s2n)" >> /test/CMakeLists.txt

# # Create simple test file
RUN touch /test/main.cpp &&\
    echo "#include <curl/curl.h>" >> /test/main.cpp &&\
    echo "#include <s2n.h>" >> /test/main.cpp &&\
    echo "" >> /test/main.cpp &&\
    echo "auto main() -> int {" >> /test/main.cpp &&\
    echo "    s2n_init();" >> /test/main.cpp &&\
    echo "    curl_global_init(CURL_GLOBAL_ALL);" >> /test/main.cpp &&\
    echo "    curl_global_cleanup();" >> /test/main.cpp &&\
    echo "    s2n_cleanup();" >> /test/main.cpp &&\
    echo "    s2n_cleanup_final();" >> /test/main.cpp &&\
    echo "    return 0;" >> /test/main.cpp &&\
    echo "}" >> /test/main.cpp

# # Build test
RUN cd test &&\
    mkdir build &&\
    cd build &&\
    cmake3 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS="-ggdb -fsanitize=address" -DCMAKE_PREFIX_PATH="/s2n-install;/usr/local/ssl/" .. &&\
    cmake3 --build .

We should probably chase down why such a old RPM is in AL2, but looks like even if they were to update to 1.0.2 the issue would be present.

@jouho
Copy link
Contributor

jouho commented Feb 14, 2025

Hello @sbiscigl we have merged changes to address this issue. Please confirm the fix, and feel free to reopen if it persists. Thank you!

@jouho jouho closed this as completed Feb 14, 2025
@sbiscigl
Copy link
Contributor Author

feel free to reopen if it persists

I dont have permissions to re-open, can you please re-open this? the docker file i provided above still replicates the issue on mainline

FROM public.ecr.aws/amazonlinux/amazonlinux:2

RUN yum groupinstall "Development Tools" -y
RUN yum install -y curl-devel ninja-build cmake3 libasan wget

# lets get openssl 1.0.2
RUN wget https://github.com/openssl/openssl/releases/download/OpenSSL_1_0_2/openssl-1.0.2.tar.gz &&\
    tar -xvzf openssl-1.0.2.tar.gz &&\
    cd openssl-1.0.2 &&\
    ./config &&\
    make &&\
    make install

# main has a memory leak
RUN git clone https://github.com/aws/s2n-tls.git &&\
    cd s2n-tls &&\
    mkdir build &&\
    cd build &&\
    cmake3 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_PREFIX_PATH="/usr/local/ssl/" -DCMAKE_CXX_FLAGS="-ggdb -fsanitize=address" -DCMAKE_INSTALL_PREFIX="/s2n-install" .. && \
    cmake3 --build . &&\
    cmake3 --install .

# # Create test application
RUN mkdir test

# # Create CMakeLists.txt
RUN touch /test/CMakeLists.txt &&\
    echo "cmake_minimum_required(VERSION 3.13)" >> /test/CMakeLists.txt &&\
    echo "project(test_s2n_init)" >> /test/CMakeLists.txt &&\
    echo "set(CMAKE_CXX_STANDARD 20)" >> /test/CMakeLists.txt &&\
    echo "find_package(CURL REQUIRED)" >> /test/CMakeLists.txt &&\
    echo "find_package(s2n REQUIRED)" >> /test/CMakeLists.txt &&\
    echo "add_executable(\${PROJECT_NAME} "main.cpp")" >> /test/CMakeLists.txt &&\
    echo "target_include_directories(\${PROJECT_NAME} PRIVATE \${CURL_INCLUDE_DIRS})" >> /test/CMakeLists.txt &&\
    echo "target_link_libraries(\${PROJECT_NAME} PRIVATE \${CURL_LIBRARIES} AWS::s2n)" >> /test/CMakeLists.txt

# # Create simple test file
RUN touch /test/main.cpp &&\
    echo "#include <curl/curl.h>" >> /test/main.cpp &&\
    echo "#include <s2n.h>" >> /test/main.cpp &&\
    echo "" >> /test/main.cpp &&\
    echo "auto main() -> int {" >> /test/main.cpp &&\
    echo "    s2n_init();" >> /test/main.cpp &&\
    echo "    curl_global_init(CURL_GLOBAL_ALL);" >> /test/main.cpp &&\
    echo "    curl_global_cleanup();" >> /test/main.cpp &&\
    echo "    s2n_cleanup();" >> /test/main.cpp &&\
    echo "    s2n_cleanup_final();" >> /test/main.cpp &&\
    echo "    return 0;" >> /test/main.cpp &&\
    echo "}" >> /test/main.cpp

# # Build test
RUN cd test &&\
    mkdir build &&\
    cd build &&\
    cmake3 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS="-ggdb -fsanitize=address" -DCMAKE_PREFIX_PATH="/s2n-install;/usr/local/ssl/" .. &&\
    cmake3 --build .

and as before you can replicate with

docker build -t test-image .
docker run --name test-image test-image /test/build/test_s2n_init

can we use fixing this example as a baseline of what needs to be fixed.

@goatgoose goatgoose reopened this Feb 17, 2025
@boquan-fang
Copy link
Contributor

We have discussed this offline. This fix works for openssl1-0-2-fips. Please let us know if you have other concerns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants