-
Notifications
You must be signed in to change notification settings - Fork 119
Tutorial #4: DCA against Karroumi 2010 challenge
Here is the fourth tutorial of our series.
Specificities of this tutorial:
- Learning how to focus on the interesting address range by observing visually one trace
- Learning how to reduce the scope of the DPA phase by observing leakage positions and reducing number of traces, for a faster attack
- Learning how to try other targets than the values after the S-boxes
- Learning how to go back from the leakages to the corresponding source code lines in the white-box
The corresponding materials are available at https://github.com/SideChannelMarvels/Deadpool/tree/master/wbs_aes_karroumi2010
As there was no real challenge with a Karroumi implementation, we did ours based on the one of Dušan Klinec available at https://github.com/ph4r05/Whitebox-crypto-AES
You can either compile your own, following https://github.com/SideChannelMarvels/Deadpool/blob/master/wbs_aes_karroumi2010/target/README.md or use the compiled binary https://github.com/SideChannelMarvels/Deadpool/tree/master/wbs_aes_karroumi2010/target/main
The binary is both a generator of white-box tables and an engine to encrypt/decrypt based on those tables (and a BGE attack tool too ^^, which was the whole point of Dušan to implement Karroumi!).
First step is to generate new white-box tables:
./main --create-table tables_karroumi_noextenc.tbl
Note that currently serializing external encodings in those tables does not work. We used an awful hack when we wrote the paper and I hope the upstream code will be fixed soon so that we can try properly external encodings, but for now we won't use external encodings (this doesn't influence the results anyway).
Usage of the newly created tables is as following:
./main --load-tables tables_karroumi_noextenc.tbl --input-files <(echo 6bc1bee22e409f96e93d7e117393172a|xxd -r -p) --out-file >(xxd -p)
3ad77bb40d7a3660a89ecaf32466ef97
So, acquiring a first full trace is done with:
Tracer -t sqlite -- ./main --load-tables tables_karroumi_noextenc.tbl --input-files <(echo 6bc1bee22e409f96e93d7e117393172a|xxd -r -p) --out-file >(xxd -p)
The trace is about 1Gb large.
Fire tracegraph, load the sqlite trace and rescale with the Overview zoom.
The trace is very large but what takes most of the time is the initialization of the Boost libraries and the loading of the serialized tables. But the crypto part we're interested in is the little structure visible on the bottom left. Zoomed in, it gives:
Roughly the AES is around 0x451150-0x4555a0.
Alternatively, to find the range of addresses to trace, one can look at the binary... WBAES::encdec() is the function using the white-box tables.
nm main -S |grep encdec
0000000000451150 0000000000004cdd T _ZN5WBAES6encdecER6_W128Bb
which gives 0x451150-0x455e2d
But seeing the loops, we can probably restrict ourselves to about 0x4512c0-0x452477:
"Real" stack (so after the few tables dynamically loaded on the stack) is after 0x7ffffff68000. You may find this address if you zoom adequately.
With this information, we can mount the attack with the following parameters:
from deadpool_dca import *
def processinput(iblock, blocksize):
return ["--load-tables tables_karroumi_noextenc.tbl --input-files <(echo %0*x|xxd -r -p) --out-file >(xxd -p)" % (2*blocksize, iblock)]
def processoutput(output, blocksize):
return int(''.join([x for x in output.split('\n') if len(x)==32][0]), 16)
filters=[Filter('mem_addr1_rw1', ['R', 'W'], lambda stack_range, addr, size, data: addr < 0x7ffffff68000 and size == 1, lambda addr, size, data: addr & 0xFF, '<B'),
Filter('mem_data_rw1', ['R', 'W'], lambda stack_range, addr, size, data: addr < 0x7ffffff68000 and size == 1, lambda addr, size, data: data, '<B')]
T=TracerPIN('./main', processinput, processoutput, ARCH.amd64, 16, addr_range='0x4512c0-0x452477', filters=filters, shell=True)
T.run(2000)
bin2daredevil(keywords=filters,
configs={'attack_sbox': {'algorithm':'AES', 'position':'LUT/AES_AFTER_SBOX', 'correct_key':'0x2b7e151628aed2a6abf7158809cf4f3c'},
'attack_multinv':{'algorithm':'AES', 'position':'LUT/AES_AFTER_MULTINV', 'correct_key':'0x2b7e151628aed2a6abf7158809cf4f3c'}})
Results will vary with your own randomly generated white-box. It might even be that you can recover the full key at the first step if you're lucky but here is a typical example where the attack may fail on some byte(s) and how to break them. Best results are when using last byte of addresses and considering the highest absolute correlation amongst the bit correlations, so I removed the information about sum(correlations) for clarity:
daredevil -c mem_addr1_rw1_2000_17096.attack_sbox.config |egrep --line-buffered "(Best|0:)"|grep -v "sum"
Best 10 candidates for key byte #0 according to highest abs(bit_correlations):
0: 0x2b peak: 0.384138 <==
Best bit: 4 rank: 0. -0.384138 0x2b 2071
Best 10 candidates for key byte #1 according to highest abs(bit_correlations):
0: 0x41 peak: 0.260993
Best bit: 2 rank: 0. 0.257112 0x7e 3211
Best 10 candidates for key byte #2 according to highest abs(bit_correlations):
0: 0x15 peak: 0.25414 <==
Best bit: 0 rank: 0. -0.229832 0x15 2942
Best 10 candidates for key byte #3 according to highest abs(bit_correlations):
0: 0x16 peak: 0.520946 <==
Best bit: 0 rank: 0. -0.250154 0x16 2496
Best 10 candidates for key byte #4 according to highest abs(bit_correlations):
0: 0x28 peak: 0.619194 <==
Best bit: 0 rank: 0. -0.406104 0x28 2463
Best 10 candidates for key byte #5 according to highest abs(bit_correlations):
0: 0xae peak: 0.757938 <==
Best bit: 0 rank: 0. -0.268071 0xae 2050
Best 10 candidates for key byte #6 according to highest abs(bit_correlations):
0: 0xd2 peak: 0.379755 <==
Best bit: 2 rank: 0. 0.260265 0xd2 3327
Best 10 candidates for key byte #7 according to highest abs(bit_correlations):
0: 0xa6 peak: 0.496979 <==
Best bit: 1 rank: 0. 0.27145 0xa6 2914
Best 10 candidates for key byte #8 according to highest abs(bit_correlations):
0: 0xab peak: 0.34484 <==
Best bit: 0 rank: 0. -0.34484 0xab 2837
Best 10 candidates for key byte #9 according to highest abs(bit_correlations):
0: 0xf7 peak: 0.542857 <==
Best bit: 2 rank: 0. 0.542857 0xf7 2442
Best 10 candidates for key byte #10 according to highest abs(bit_correlations):
0: 0x15 peak: 0.275168 <==
Best bit: 6 rank: 0. 0.275168 0x15 2119
Best 10 candidates for key byte #11 according to highest abs(bit_correlations):
0: 0x88 peak: 0.54094 <==
Best bit: 2 rank: 0. -0.527773 0x88 3264
Best 10 candidates for key byte #12 according to highest abs(bit_correlations):
0: 0x09 peak: 0.500975 <==
Best bit: 0 rank: 0. -0.500975 0x09 3245
Best 10 candidates for key byte #13 according to highest abs(bit_correlations):
0: 0xcf peak: 0.495989 <==
Best bit: 4 rank: 0. -0.495989 0xcf 2819
Best 10 candidates for key byte #14 according to highest abs(bit_correlations):
0: 0x4f peak: 0.393279 <==
Best bit: 1 rank: 0. -0.241375 0x4f 2542
Best 10 candidates for key byte #15 according to highest abs(bit_correlations):
0: 0x72 peak: 0.288065
We see almost all bytes can be properly recovered, only byte #1 cannot be recovered.
We see also that the useful leakages occur roughly between sample 2000 and sample 3350 so we know how to replay the attack on potentially other instances, but faster:
Edit mem_addr1_rw1_2000_17096.attack_sbox.config
index=2000
nsamples=1350
Attack takes now 2s per key byte, so 32s in total.
We may even reduce the number of traces to about 500 (by trial & error), which brings the attack down to 8s.
trace=mem_addr1_rw1_2000_17096.trace 500 17096
guess=mem_addr1_rw1_2000_17096.input 500 16
About the missing key byte #1, it's trivial to brute-force but we can try something else: Karroumi dual ciphers are about alternative affine transformations so we might e.g. try as target the application of the multiplicative inverse to the data, a step normally internal to the S-box. But first, apply the same changes in mem_addr1_rw1_2000_17096.attack_multinv.config to execute this second attack faster.
daredevil -c mem_addr1_rw1_2000_17096.attack_multinv.config |egrep --line-buffered "(Best|0:)"|grep -v sum
Best 10 candidates for key byte #0 according to highest abs(bit_correlations):
0: 0xfc peak: 0.260916
Best bit: 3 rank: 1. 0.250943 0x2b 2085
Best 10 candidates for key byte #1 according to highest abs(bit_correlations):
0: 0x7e peak: 0.502287 <==
Best bit: 0 rank: 0. -0.257893 0x7e 3203
Best 10 candidates for key byte #2 according to highest abs(bit_correlations):
0: 0x15 peak: 0.367983 <==
Best bit: 0 rank: 0. 0.242405 0x15 2927
Best 10 candidates for key byte #3 according to highest abs(bit_correlations):
0: 0x16 peak: 0.746864 <==
Best bit: 0 rank: 0. 0.746864 0x16 2554
Best 10 candidates for key byte #4 according to highest abs(bit_correlations):
0: 0x28 peak: 0.410953 <==
Best bit: 2 rank: 0. 0.255894 0x28 2477
Best 10 candidates for key byte #5 according to highest abs(bit_correlations):
0: 0xae peak: 0.49706 <==
Best bit: 3 rank: 0. -0.49706 0xae 2104
Best 10 candidates for key byte #6 according to highest abs(bit_correlations):
0: 0xd2 peak: 0.409918 <==
Best bit: 0 rank: 0. 0.409918 0xd2 3278
Best 10 candidates for key byte #7 according to highest abs(bit_correlations):
0: 0xa6 peak: 0.54717 <==
Best bit: 1 rank: 0. -0.448853 0xa6 2907
Best 10 candidates for key byte #8 according to highest abs(bit_correlations):
0: 0x32 peak: 0.264
Best 10 candidates for key byte #9 according to highest abs(bit_correlations):
0: 0xf7 peak: 0.511092 <==
Best bit: 1 rank: 0. -0.511092 0xf7 2450
Best 10 candidates for key byte #10 according to highest abs(bit_correlations):
0: 0x6c peak: 0.273054
Best bit: 7 rank: 0. -0.2561 0x15 2165
Best 10 candidates for key byte #11 according to highest abs(bit_correlations):
0: 0x88 peak: 0.531063 <==
Best bit: 3 rank: 0. 0.531063 0x88 3299
Best 10 candidates for key byte #12 according to highest abs(bit_correlations):
0: 0x09 peak: 0.393156 <==
Best bit: 4 rank: 0. -0.2522 0x09 3246
Best 10 candidates for key byte #13 according to highest abs(bit_correlations):
0: 0xcf peak: 0.324364 <==
Best bit: 0 rank: 0. 0.324364 0xcf 2827
Best 10 candidates for key byte #14 according to highest abs(bit_correlations):
0: 0x10 peak: 0.256703
Best bit: 4 rank: 1. 0.242052 0x4f 2541
Best 10 candidates for key byte #15 according to highest abs(bit_correlations):
0: 0x39 peak: 0.257039
Best bit: 0 rank: 1. -0.242115 0x3c 2121
This time a few other bytes were not recovered but we got byte #1! So by combining both attacks we managed to recover the full key. In the rare event one byte is not revealed by an attack or the other, try to brute-force it (it's easy by comparing the relative correlation scores if a candidate seems legit or not) or to target other affine transforms than in the original S-box...
One step further:
In this exercise, for once, we've the source code.
So let's see how to we can go back to the source and pinpoint the lines where the leaks occur.
From daredevil's reports let's find the best leaks (grep "rank: 0"
):
Best bit: 0 rank: 0. -0.292396 0x7e 3203
Best bit: 0 rank: 0. 0.280328 0x15 2927
Best bit: 0 rank: 0. 0.739881 0x16 2554
Best bit: 3 rank: 0. -0.422051 0x28 2471
Best bit: 3 rank: 0. -0.517008 0xae 2104
Best bit: 0 rank: 0. 0.415916 0xd2 3278
Best bit: 1 rank: 0. -0.403762 0xa6 2907
Best bit: 1 rank: 0. -0.52469 0xf7 2450
Best bit: 3 rank: 0. 0.508487 0x88 3297
Best bit: 4 rank: 0. -0.271449 0x09 3246
Best bit: 0 rank: 0. 0.320264 0xcf 2827
Best bit: 4 rank: 0. -0.407483 0x2b 2071
Best bit: 2 rank: 0. 0.317442 0x7e 3219
Best bit: 5 rank: 0. -0.288184 0x15 2909
Best bit: 4 rank: 0. -0.531342 0x16 2498
Best bit: 0 rank: 0. -0.386349 0x28 2463
Best bit: 0 rank: 0. -0.295254 0xae 2050
Best bit: 2 rank: 0. 0.267346 0xd2 3327
Best bit: 4 rank: 0. -0.306637 0xa6 2929
Best bit: 3 rank: 0. -0.30787 0xab 2854
Best bit: 2 rank: 0. 0.514181 0xf7 2442
Best bit: 6 rank: 0. 0.279898 0x15 2119
Best bit: 2 rank: 0. -0.583486 0x88 3264
Best bit: 0 rank: 0. -0.432023 0x09 3245
Best bit: 4 rank: 0. -0.448291 0xcf 2819
Best bit: 1 rank: 0. -0.245292 0x4f 2542
Last column is the sample number. We know how we filtered instructions and how we constructed the traces, so we know to which instructions those leaks correspond. But that would be more interesting to know the source line.
So, first, recompile main
with debug info activated in CMakeLists.txt
:
-set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -std=c++0x")
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -O2 -std=c++0x")
Let's rename that version make_debug
to avoid confusion.
Check if the targeted function is still at the same addresses or not:
nm main -S |grep encdec
0000000000451150 0000000000004cdd T _ZN5WBAES6encdecER6_W128Bb
nm main_debug -S |grep encdec
0000000000451110 0000000000004cdd T _ZN5WBAES6encdecER6_W128Bb
Hmm the function is now 0x40 bytes earlier, so let's adapt the filtering range to 0x451280-0x452437.
There is a function sample2event
to convert a sample number into an instruction and, if debug info is available, a source line. It's based on the special trace .info
that is created when acquiring traces.
So let's acquire one trace again with main_debug
to get the right addresses.
But first delete the existing .info
:
rm *.info
from deadpool_dca import *
def processinput(iblock, blocksize):
return ["--load-tables tables_karroumi_noextenc.tbl --input-files <(echo %0*x|xxd -r -p) --out-file >(xxd -p)" % (2*blocksize, iblock)]
def processoutput(output, blocksize):
return int(''.join([x for x in output.split('\n') if len(x)==32][0]), 16)
filters=[Filter('mem_addr1_rw1', ['R', 'W'], lambda stack_range, addr, size, data: addr < 0x7ffffff68000 and size == 1, lambda addr, size, data: addr & 0xFF, '<B')]
T=TracerPIN('./main_debug', processinput, processoutput, ARCH.amd64, 16, addr_range='0x451280-0x452437', filters=filters, shell=True)
T.run(1)
for sample in [3203, 2927, 2554, 2471, 2104, 3278, 2907, 2450, 3297, 3246, 2827, 2071, 3219, 2909, 2498, 2463, 2050, 3327, 2929, 2854, 2442, 2119, 3264, 3245, 2819, 2542]:
print sample2event(sample, filters[0], '../target/main_debug')
Result is:
(401, [('[R]', 817, 4528849, 140736938848444, 1, 1, 'WBAES.h:202 (discriminator 2)')])
(366, [('[R]', 746, 4529440, 140736938839257, 1, 3, 'WBAES.h:202 (discriminator 2)')])
(320, [('[R]', 663, 4529532, 140736938827493, 1, 15, 'WBAES.h:202 (discriminator 2)')])
(309, [('[R]', 651, 4529061, 140736938824847, 1, 15, 'WBAES.h:202 (discriminator 2)')])
(263, [('[R]', 568, 4529150, 140736938813083, 1, 10, 'WBAES.h:202 (discriminator 2)')])
(410, [('[R]', 826, 4529260, 140736938850783, 1, 4, 'WBAES.h:202 (discriminator 2)')])
(364, [('[R]', 743, 4529346, 140736938838897, 1, 7, 'WBAES.h:202 (discriminator 2)')])
(307, [('[R]', 649, 4528943, 140736938824282, 1, 2, 'WBAES.h:202 (discriminator 2)')])
(413, [('[R]', 829, 4529422, 140736938851634, 1, 3, 'WBAES.h:202 (discriminator 2)')])
(406, [('[R]', 822, 4529073, 140736938849295, 1, 5, 'WBAES.h:202 (discriminator 2)')])
(354, [('[R]', 733, 4528885, 140736938836477, 1, 9, 'WBAES.h:202 (discriminator 2)')])
(259, [('[R]', 564, 4528943, 140736938812046, 1, 7, 'WBAES.h:202 (discriminator 2)')])
(403, [('[R]', 819, 4528943, 140736938848926, 1, 2, 'WBAES.h:202 (discriminator 2)')])
(364, [('[R]', 743, 4529346, 140736938838897, 1, 7, 'WBAES.h:202 (discriminator 2)')])
(313, [('[R]', 655, 4529190, 140736938825927, 1, 10, 'WBAES.h:202 (discriminator 2)')])
(308, [('[R]', 650, 4529036, 140736938825080, 1, 0, 'WBAES.h:202 (discriminator 2)')])
(257, [('[R]', 562, 4528849, 140736938811471, 1, 12, 'WBAES.h:202 (discriminator 2)')])
(416, [('[R]', 833, 4529532, 140736938851875, 1, 1, 'WBAES.h:202 (discriminator 2)')])
(367, [('[R]', 747, 4529521, 140736938839954, 1, 8, 'WBAES.h:202 (discriminator 2)')])
(357, [('[R]', 736, 4529061, 140736938837138, 1, 10, 'WBAES.h:202 (discriminator 2)')])
(306, [('[R]', 648, 4528885, 140736938824171, 1, 4, 'WBAES.h:202 (discriminator 2)')])
(265, [('[R]', 570, 4529190, 140736938813690, 1, 2, 'WBAES.h:202 (discriminator 2)')])
(408, [('[R]', 824, 4529169, 140736938850187, 1, 14, 'WBAES.h:202 (discriminator 2)')])
(406, [('[R]', 822, 4529073, 140736938849295, 1, 5, 'WBAES.h:202 (discriminator 2)')])
(353, [('[R]', 732, 4528849, 140736938836169, 1, 10, 'WBAES.h:202 (discriminator 2)')])
(318, [('[R]', 661, 4529440, 140736938826895, 1, 15, 'WBAES.h:202 (discriminator 2)')])
Where we get for each sample the corresponding event number, followed by informationextracted from the .info
file: memory mode, item, instruction address, memory address, memory size, memory data, and finally the corresponding source line.
Well here all leaks seem to come from the same line, which is actually:
OP8XOR(o1, o2, xtb, res);
It's an inlined macro heavily used in the core of the white-box code, cf https://github.com/ph4r05/Whitebox-crypto-AES/blob/38f1127db2df22c62490f72b846263644c9c3ce8/WBAES.cpp#L171
This toturial ends here, but if you want to go further, you may e.g. inline yourself the macro so you'll know which lines of WBAES.cpp leak and from there, two options:
- Try to fix Karroumi :)
- Make a faster attack out of the source code by dumping immediately the address bytes at those lines to a file, which would be far more efficient than using instrumentation on the binary.
So the binary would embed itself a tool to attack other instances of the tables, which is already what happens with the BGE attack embedded by Dušan in main
:
See option --bench-bge [=arg(=0)] (=0) Benchmarking rounds for AES BGE attack