[PyTorch] RFC: Add tuple inline storage #64066
I noticed a bunch of time being spent heap-allocating Tuples in the unpickler. 1-, 2-, and 3-element Tuples are apparently common enough that they get their own bytecode instructions, so I decided to try also giving them their own representation. We store up to 3 IValues inline in `Tuple` rather than doing a second heap allocation for a `std::vector<IValue>`. Differential Revision: [D30592622](https://our.internmc.facebook.com/intern/diff/D30592622/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D30592622/)! [ghstack-poisoned]
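The description above is about a small-buffer optimization: elements of 1- to 3-element tuples live in an in-object array inside `Tuple`, so only larger tuples pay for a second heap allocation. A minimal sketch of that idea, with `int` standing in for `IValue` and all names (`SmallElems`, `kInlineCapacity`) purely illustrative rather than PyTorch's:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of inline ("small buffer") element storage: up to kInlineCapacity
// elements are stored in an in-object array; only when the container grows
// past that do we spill to a heap-backed std::vector.
template <typename T, size_t kInlineCapacity = 3>
class SmallElems {
 public:
  SmallElems() = default;

  void push_back(const T& v) {
    if (!usingHeap_ && size_ < kInlineCapacity) {
      inline_[size_++] = v;  // fits inline: no heap allocation
      return;
    }
    if (!usingHeap_) {
      // First growth past the inline capacity: copy the inline
      // elements into the heap vector and switch representations.
      heap_.assign(inline_, inline_ + size_);
      usingHeap_ = true;
    }
    heap_.push_back(v);
    ++size_;
  }

  size_t size() const { return size_; }
  bool onHeap() const { return usingHeap_; }

  const T& operator[](size_t i) const {
    return usingHeap_ ? heap_[i] : inline_[i];
  }

 private:
  T inline_[kInlineCapacity]{};
  std::vector<T> heap_;
  size_t size_ = 0;
  bool usingHeap_ = false;
};
```

The real `Tuple` change is more involved (it deals with `IValue` move semantics and the existing `elements()` API), but the payoff is the same: the common 1-/2-/3-element case touches the allocator once instead of twice.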
```diff
@@ -55,28 +55,25 @@ std::unique_ptr<PropagateGradientsReq> PropagateGradientsReq::fromMessage(
       payload_size,
       *rpc::RpcAgent::getCurrentRpcAgent()->getTypeResolver(),
       message.tensors());
-  std::vector<at::IValue> tupleElements = std::move(*std::move(tuple).toTuple()).elements();
+  auto tupleElements = std::move(*std::move(tuple).toTuple()).elements();

   // Build PropagateGradientsReq.
   TORCH_INTERNAL_ASSERT(tupleElements.size() >= 3);

   // Retrieve retainGraph.
   bool retainGraph = tupleElements.back().toBool();
-  tupleElements.pop_back();

   // Build AutogradMetadata.
   // NOLINTNEXTLINE(cppcoreguidelines-init-variables)
   int64_t autogradContextId, autogradMessageId;
   autogradMessageId = tupleElements.back().toInt();
-  tupleElements.pop_back();
   autogradContextId = tupleElements.back().toInt();
-  tupleElements.pop_back();

   AutogradMetadata autogradMetadata(autogradContextId, autogradMessageId);

   // Retrieve the gradient tensors.
-  std::vector<Variable> grads(tupleElements.size());
-  for (const auto i : c10::irange(tupleElements.size())) {
+  std::vector<Variable> grads(tupleElements.size() - 3);
+  for (const auto i : c10::irange(tupleElements.size() - 3)) {
     grads[i] = std::move(tupleElements[i]).toTensor();
   }
```

Review comments on the removed `pop_back()` calls:

> Why are you removing these?

> Yes, this doesn't seem right.
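The reviewers' concern above can be made concrete with a small standalone illustration (plain `int`s standing in for `IValue`s, and the element layout `{grads..., autogradContextId, autogradMessageId, retainGraph}` assumed from the surrounding code): once the `pop_back()` calls are gone, repeated `.back()` reads all return the same trailing element instead of walking backwards through the last three.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Demonstrates why .back() without pop_back() misreads the trailing
// metadata elements, and a pop-free indexing alternative consistent
// with the size() - 3 loop bound in the diff above.
inline void demoTrailingReads() {
  // {grads..., autogradContextId, autogradMessageId, retainGraph}
  std::vector<int> tupleElements = {10, 20, 30, 1};

  // Original logic: read .back(), then pop, so each read sees a new slot.
  std::vector<int> popped = tupleElements;
  int retainGraph = popped.back(); popped.pop_back();
  int msgId = popped.back(); popped.pop_back();
  int ctxId = popped.back(); popped.pop_back();
  assert(retainGraph == 1 && msgId == 30 && ctxId == 20);

  // With pop_back() removed but .back() kept, every read returns the
  // same last element -- msgId and ctxId would both be wrong.
  int r = tupleElements.back();
  int m = tupleElements.back();
  assert(r == 1 && m == 1);

  // Pop-free alternative: index the trailing elements explicitly.
  const size_t n = tupleElements.size();
  assert(tupleElements[n - 1] == 1);   // retainGraph
  assert(tupleElements[n - 2] == 30);  // autogradMessageId
  assert(tupleElements[n - 3] == 20);  // autogradContextId
}
```

This also shows why the loop bound changed to `tupleElements.size() - 3`: without the pops, the vector still holds the three metadata elements after the gradients.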
Review comment:

> Why do we need that new `TupleElements` class instead of just using a `c10::SmallVector<IValue, 3>`?
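One plausible answer (a hedged sketch, not the PR author's stated rationale): callers such as the `fromMessage()` diff above move tuple elements out as a `std::vector`, so a bespoke wrapper can offer an explicit "materialize as `std::vector`" escape hatch alongside the inline storage, whereas a generic small vector is a distinct type everywhere. The names below (`TinyElems`, `vec()`) are illustrative, with `int` standing in for `IValue`:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative wrapper: small element lists live inline, but an
// rvalue-qualified vec() lets callers that need a std::vector get one,
// moving out the heap buffer when there is one.
class TinyElems {
 public:
  explicit TinyElems(std::vector<int> v)
      : heap_(std::move(v)), inlineSize_(-1) {}

  TinyElems(int a, int b) : inlineSize_(2) {
    inline_[0] = a;
    inline_[1] = b;
  }

  size_t size() const {
    return inlineSize_ >= 0 ? static_cast<size_t>(inlineSize_) : heap_.size();
  }

  // Escape hatch for APIs that require std::vector: copies the inline
  // elements, or moves the heap buffer out without copying.
  std::vector<int> vec() && {
    if (inlineSize_ >= 0) {
      return std::vector<int>(inline_, inline_ + inlineSize_);
    }
    return std::move(heap_);
  }

 private:
  int inline_[3]{};
  std::vector<int> heap_;
  int inlineSize_;  // -1 means heap-backed
};
```

Whether this justifies a new class over `c10::SmallVector<IValue, 3>` is exactly the design question the reviewer is raising.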