Use inline asm to generate vctzlsbb
The vec_cnttz_lsbb builtin was implemented incorrectly, and the issue was
fixed in GCC 12, but now we would have to use different builtins depending
on the compiler version. To avoid that, let's use inline asm to generate
the exact instruction we need.
mscastanho committed Apr 6, 2022
1 parent 605301e commit 5490ed4
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion contrib/power/longest_match_power9.c
@@ -11,6 +11,7 @@ local inline int vec_match OF((Bytef* scan, Bytef* match))
 local inline int vec_match(Bytef* scan, Bytef* match)
 {
     vector unsigned char vscan, vmatch, vc;
+    int len;

     vscan = *((vector unsigned char *) scan);
     vmatch = *((vector unsigned char *) match);
@@ -25,7 +26,13 @@ local inline int vec_match(Bytef* scan, Bytef* match)
      * on vc (since we used cmpne), counting the number of consecutive
      * bytes where LSB == 0 is the same as counting the length of the match.
      */
-    return vec_cnttz_lsbb(vc);
+#ifdef __LITTLE_ENDIAN__
+    asm volatile("vctzlsbb %0, %1\n\t" : "=r" (len) : "v" (vc));
+#else
+    asm volatile("vclzlsbb %0, %1\n\t" : "=r" (len) : "v" (vc));
+#endif
+
+    return len;
 }

 uInt ZLIB_INTERNAL _longest_match_power9(deflate_state *s, IPos cur_match)
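The code comment in the diff says that counting consecutive bytes with LSB == 0 in the compare result gives the match length. The following is a minimal scalar sketch of that idea in portable C, not code from the commit. The function name `scalar_match16` and the fixed 16-byte width are assumptions made for illustration. It models the byte-wise compare-not-equal and then counts in memory order, which appears to be why the patch picks `vctzlsbb` on little-endian and `vclzlsbb` on big-endian.

```c
#include <stddef.h>

/* Hypothetical scalar model of vec_match (illustration only, not from the
 * commit). Returns how many of the first 16 bytes of scan and match are
 * equal before the first mismatch, i.e. a value in the range 0..16. */
static int scalar_match16(const unsigned char *scan, const unsigned char *match)
{
    unsigned char vc[16];
    int len = 0;
    size_t i;

    /* Emulate a byte-wise compare-not-equal: 0xFF where the bytes differ,
     * 0x00 where they are equal. */
    for (i = 0; i < 16; i++)
        vc[i] = (scan[i] != match[i]) ? 0xFF : 0x00;

    /* Count consecutive bytes, starting at the first byte in memory,
     * whose least significant bit is 0. Each such byte is a matching
     * position, so the count is the match length. */
    while (len < 16 && (vc[len] & 1) == 0)
        len++;

    return len;
}
```

A model like this could serve as a reference when checking the inline asm path on real POWER9 hardware: for any pair of 16-byte inputs, both should return the same length.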
