Manually cache block attributes in PackBMatrix::pack_unpack_ #702
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary: I noticed from inspecting assembly that we were re-loading the 4 block attributes on each loop iteration. It initially surprised me that the compiler was not able to prove that the loop body doesn't mutate the block (I don't see any function calls and TBAA should mean that it can prove nothing aliases the block). On reflection, it is notable that
T
issigned char
--char*
is the one exception to the strict aliasing rule (see https://en.cppreference.com/w/cpp/language/reinterpret_cast), so the compiler cannot prove thatunpack_buf
andpack_buf
do not aliasblock
and thus it has to re-load fromblock
each time.Reviewed By: jspark1105
Differential Revision: D30976702