Commit eebf986

swolchok authored and facebook-github-bot committed
Don't leave a math puzzle for the compiler in BMI decoder
Summary: We computed `8 * intBytes - 1`, converted that to `intBytes`, and then did a shift by `8 * intBytes - 1`. Saving the shift value directly causes clang to generate shorter code.

Differential Revision: D54440459

fbshipit-source-id: 5f1380f8b38fd706ed91b903d6b212e9f791f626
1 parent a994a83 commit eebf986

1 file changed: +3 -3 lines changed

thrift/lib/cpp/util/VarintUtils-inl.h

Lines changed: 3 additions & 3 deletions
@@ -234,10 +234,10 @@ inline size_t readContiguousVarintMediumSlowU64BMI2(
   }
   // By reset data bits and toggle the continuation bits, the tailing zeros
   // should be intBytes*8-1
-  size_t intBytes =
-      (__builtin_ctzll(continuationBits ^ kContinuationBitMask) >> 3) + 1;
+  size_t maskShift = __builtin_ctzll(continuationBits ^ kContinuationBitMask);
+  size_t intBytes = (maskShift >> 3) + 1;
 
-  uint64_t mask = (1ULL << (8 * intBytes - 1)) - 1;
+  uint64_t mask = (1ULL << maskShift) - 1;
   // You might think it would make more sense to to the pext first and mask
   // afterwards (avoiding having two pexts in a single dependency chain at 3
   // cycles / pop); this seems not to be borne out in microbenchmarks. The
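
The equivalence the summary relies on can be checked in isolation. The following is a minimal sketch, not the Thrift implementation: it assumes `kContinuationBitMask` is the high bit of each of the eight bytes (`0x8080808080808080`), that `continuationBits` comes from a little-endian 64-bit load, and that the varint ends within those eight bytes. `continuationBitsFor`, `maskShiftOf`, `oldForm`, `newForm`, and `MaskResult` are hypothetical names introduced only for this illustration. `__builtin_ctzll` is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed value: the high (continuation) bit of each of the 8 bytes.
constexpr uint64_t kContinuationBitMask = 0x8080808080808080ULL;

// Hypothetical helper: continuation bits of a varint that occupies n bytes
// (1..8) of a little-endian 64-bit load. The first n-1 bytes have their high
// bit set; byte n-1 (the last byte of the varint) has it clear.
inline uint64_t continuationBitsFor(size_t n) {
  uint64_t bits = 0;
  for (size_t i = 0; i + 1 < n; ++i) {
    bits |= 0x80ULL << (8 * i);
  }
  return bits;
}

struct MaskResult {
  size_t intBytes;
  uint64_t mask;
};

// Trailing-zero count of the toggled continuation bits. The XOR is nonzero
// whenever the varint terminates within the 8 bytes; ctz of 0 is undefined.
inline size_t maskShiftOf(uint64_t continuationBits) {
  uint64_t toggled = continuationBits ^ kContinuationBitMask;
  assert(toggled != 0);
  return static_cast<size_t>(__builtin_ctzll(toggled));
}

// Form before the commit: derive intBytes, then rebuild the shift from it.
inline MaskResult oldForm(uint64_t continuationBits) {
  size_t intBytes = (maskShiftOf(continuationBits) >> 3) + 1;
  uint64_t mask = (1ULL << (8 * intBytes - 1)) - 1;
  return {intBytes, mask};
}

// Form after the commit: keep the shift and reuse it for the mask.
inline MaskResult newForm(uint64_t continuationBits) {
  size_t maskShift = maskShiftOf(continuationBits);
  size_t intBytes = (maskShift >> 3) + 1;
  uint64_t mask = (1ULL << maskShift) - 1;
  return {intBytes, mask};
}
```

Under these assumptions the trailing-zero count is `8 * (n - 1) + 7 = 8 * n - 1` for an `n`-byte varint, so both forms produce the same `intBytes` and the same `mask`; the new form simply avoids asking the compiler to prove that `8 * ((x >> 3) + 1) - 1 == x`.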
