Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/2180
The SSD L2 cache passes raw pointers to async parallel threads for set() and get(). Raw pointers are not tracked by tensor ref counts, so if the tensors are deallocated, or their memory allocation changes, before the parallel threads access the raw pointers, the process will crash.
In most cases the tensors are not deallocated while the async parallel threads access them, because futures.wait() is called before the function returns. However, PyTorch may change the memory allocation depending on its internal memory management, so raw pointers without ref count tracking could still end up accessing deallocated objects in rare cases.
This change passes tensor ref counts to the async parallel threads to avoid this case.
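The general idea can be illustrated with a minimal sketch. This is not the FBGEMM code: it assumes a `std::shared_ptr` buffer as a hypothetical stand-in for a ref-counted tensor, and the names `Buffer`, `sum_async`, and `demo` are made up for illustration. The async task captures the owning handle by value, so the buffer stays alive until the task finishes even if the caller releases its own reference first.

```cpp
#include <future>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a ref-counted tensor: shared ownership of a buffer.
using Buffer = std::shared_ptr<std::vector<float>>;

// Launch async work that holds its own reference to the buffer.
// Capturing a raw pointer (buf.get()) here instead would leave the
// task with no ownership and a possible dangling access.
std::future<float> sum_async(Buffer buf) {
  return std::async(std::launch::async, [buf = std::move(buf)]() {
    // The lambda owns one reference, so the buffer is still valid here.
    return std::accumulate(buf->begin(), buf->end(), 0.0f);
  });
}

// The caller drops its reference before the result is read.
float demo() {
  auto buf = std::make_shared<std::vector<float>>(4, 1.5f);
  std::future<float> fut = sum_async(buf);
  buf.reset();       // caller's reference is gone; the task still holds one
  return fut.get();  // safe: the task kept the buffer alive
}
```

With raw pointers, the `buf.reset()` in `demo()` would free the buffer while the task might still be reading it; with the owning capture, the buffer is freed only after the task completes.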
Differential Revision: D87893640