NDK r17 was the last version to include GCC. If you're upgrading from an old NDK and need to migrate to Clang, this doc can help.
If you maintain a custom build system, see the Build System Maintainers documentation.
Clang Optimization Flags has the full details, but if you used -Os to
optimize your code for size with GCC, you probably want -Oz when using
Clang. Although -Os attempts to make code small, it still enables some
optimizations that will increase code size (based on
https://stackoverflow.com/a/15548189/632035). For the smallest possible
code with Clang, prefer -Oz. With -Oz, Chromium actually saw both size
and performance improvements when moving to Clang compared to -Os with
GCC.
Normally the __aligned__ attribute is given an explicit alignment, but
with no value it means “maximum alignment”. The interpretation of
“maximum” differs between GCC and Clang: Clang includes vector types
too, so for ARM, GCC thinks the maximum alignment is 8 (for uint64_t),
but Clang thinks it’s 16 (because there are NEON instructions that
require 16-byte alignment). Normally this shouldn’t matter because
malloc is always at least 16-byte aligned, and mmap regions are page
(4096-byte) aligned. Most code should either specify an explicit
alignment or use alignas instead.
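As a minimal sketch of the advice above, the following shows both ways of stating an alignment explicitly, so that GCC and Clang agree on the result. The type names Vec4 and Block16 and the helper is_aligned are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Standard C++11 spelling: the alignment is stated explicitly, so it does
// not depend on the compiler's idea of "maximum alignment".
struct alignas(16) Vec4 {
  float lanes[4];
};

// GNU attribute spelling with an explicit value. Unlike a bare
// __attribute__((__aligned__)), this means the same thing everywhere.
struct Block16 {
  std::uint8_t bytes[16];
} __attribute__((__aligned__(16)));

static_assert(alignof(Vec4) == 16, "Vec4 must be 16-byte aligned");
static_assert(alignof(Block16) == 16, "Block16 must be 16-byte aligned");

// Hypothetical helper: returns true if ptr is a multiple of alignment bytes.
inline bool is_aligned(const void* ptr, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

With either spelling, the static_assert lines fail at compile time if the alignment ever differs from the one the code expects.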
When targeting Android (but no other platform), GCC passed -Bsymbolic
to the linker by default. This is not a good default, so Clang does not
do that. -Bsymbolic causes the following behavior change:
// foo.cpp
#include <iostream>

void foo() {
  std::cout << "Goodbye, world" << std::endl;
}

void bar() {
  foo();
}

// main.cpp
#include <iostream>

extern void bar();

void foo() {
  std::cout << "Hello, world\n";
}

int main(int, char**) {
  foo();  // Prints "Hello, world".
  // Without -Bsymbolic, prints "Hello, world".
  // With -Bsymbolic, prints "Goodbye, world".
  bar();
}
In addition to differing from the default behavior on every other platform, this prevents symbol interposition (used by tools such as ASan).
You might, however, wish to add -Bsymbolic back manually, because it
can result in smaller ELF files: fewer relocations are needed. If you
want the non--Bsymbolic behavior but would like fewer relocations, you
can get that with -fvisibility=hidden, manually exporting the symbols
you want to be public with the JNIEXPORT macro in JNI code or
__attribute__((visibility("default"))) otherwise. Linker version
scripts are an even more powerful mechanism for controlling exported
symbols, but are harder to use.
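A minimal sketch of the -fvisibility=hidden approach for non-JNI code might look like the following. The macro name MYLIB_EXPORT and the functions mylib_add and detail::add_impl are hypothetical, and the sketch assumes the file is compiled into a shared library with -fvisibility=hidden.

```cpp
#include <cassert>

// Hypothetical convenience macro for this sketch; only the attribute
// itself comes from the compiler.
#define MYLIB_EXPORT __attribute__((visibility("default")))

namespace detail {

// Not annotated, so under -fvisibility=hidden this symbol is hidden and
// calls to it from inside the library need no dynamic relocation.
int add_impl(int a, int b) {
  return a + b;
}

}  // namespace detail

// Explicitly exported: remains visible to users of the shared library
// even when everything else defaults to hidden.
extern "C" MYLIB_EXPORT int mylib_add(int a, int b) {
  return detail::add_impl(a, b);
}
```

Without -fvisibility=hidden the attribute makes no difference, because default visibility is already what unannotated symbols get.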
For many years the problem of adjusting inline assembler to work with
LLVM could be punted down the road by using -fno-integrated-as to fall
back to the GNU Assembler (GAS). With the removal of GNU binutils from
the NDK, such issues will now need to be addressed. We’ve collected
some of the most common issues and their solutions/workarounds here.
GAS doesn’t scope .arch or .arch_extension, so you can have a global
__asm__(".arch foo") that applies to the whole C/C++ source file, just
like a bare .arch or .arch_extension directive would in a .S file. LLVM
scopes these directives to the specific __asm__ in which they occur, so
you’ll need to adapt your inline assembler, or build the whole file for
the relevant arch variant.
GAS lets you use the ADRL pseudoinstruction to get the address of
something too far away for a regular ADR to reference. ADRL expands to
two instructions, and LLVM doesn’t support it, so you’ll need to use a
macro something like this instead:
.macro ADRL reg:req, label:req
add \reg, pc, #((\label - .L_adrl_\@) & 0xff00)
add \reg, \reg, #((\label - .L_adrl_\@) - ((\label - .L_adrl_\@) & 0xff00))
.L_adrl_\@:
.endm
While GAS supports the older divided and newer unified syntax
(selectable via .syntax unified and .syntax divided), LLVM only
supports the newer unified syntax.
As an example of where this matters, LDR takes an optional type suffix
as well as the optional condition code allowed on all instructions. GAS
allows these to come in either order when using divided syntax, but
LLVM only allows them in the canonical order given in the ARM
instruction reference (which is what “unified” syntax means). So
continuing this example, GAS accepts both LDRBEQ and LDREQB, but LLVM
only accepts LDRBEQ (with the condition code at the end, as the
instruction appears in the manual). Most people use this order anyway,
but you’ll have to rearrange any instructions that don’t.
Some ARM instructions have restrictions that make some operands
implicit. For example, the two target registers supplied to LDREXD
must be consecutive. GAS would allow you to write LDREXD R1, [R4]
because the other register must be R2, but LLVM requires both
registers to be explicitly stated, in this case LDREXD R1, R2, [R4].
Switching from Thumb to ARM mode implicitly forces 4-byte alignment
with GAS but doesn’t with LLVM. You may need to use an explicit
.align/.balign/.p2align directive in such cases.
GAS and LLVM implement their own conditional assembly mechanism with
.if ... .endif rather than the C preprocessor’s #if ... #endif. The
equivalent of -DA=B for .if is -Wa,-defsym,A=B, but GAS allowed
--defsym instead of -defsym. LLVM requires -defsym.
You might also prefer to just use the C preprocessor. If your assembly
is in a .S file it is already being preprocessed. If your assembly
is in a file with any other extension (including .s; this is the
difference between .s and .S), you’ll need to either rename it to .S
or use the -x assembler-with-cpp flag to the compiler to override
the file extension-based guess.
GAS ignores a request for obsolete STABS debugging information to be
emitted using .func and .endfunc. Neither GAS nor LLVM actually
supports STABS, but LLVM rejects these meaningless directives. The fix
is simply to remove them.