The Wayback Machine - https://web.archive.org/web/20201201074703/https://github.com/NVIDIA/cub/issues/229
Allow segmented problems to have different types for offset iterator #229

Open
devavret opened this issue Oct 29, 2020 · 0 comments

devavret commented Oct 29, 2020

Segmented reduce uses the same template type, OffsetIteratorT, for both the begin and the end offsets:

static CUB_RUNTIME_FUNCTION cudaError_t cub::DeviceSegmentedReduce::Sum
    (   void *              d_temp_storage,
        size_t &            temp_storage_bytes,
        InputIteratorT      d_in,
        OutputIteratorT     d_out,
        int                 num_segments,
        OffsetIteratorT     d_begin_offsets,  ///<<<
        OffsetIteratorT     d_end_offsets,    ///<<<
        cudaStream_t        stream = 0,
        bool                debug_synchronous = false 
    )

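This precludes using Thrust transform iterators to generate the offsets, because the begin and end iterators would then have different types (for example, transform iterators built from different functors). The implementation suggests that the two iterators are only dereferenced, independently and per block, to obtain the begin and end offsets. As long as the dereferenced values have the same type, it should be fine for the iterators themselves to have different types.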

See the attached source for an example use case.
cub_segment_repro.cu.zip
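
To illustrate the request without a GPU, here is a minimal host-only sketch. It is not CUB's implementation and not the attached repro. `CountingIter`, `TransformIter`, `segmented_sum`, and `row_sums_example` are hypothetical stand-ins written for this example. The point is that a segmented sum only reads the two offset sequences by index, so giving them separate template parameters lets the begin and end offsets come from two differently typed generators.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a counting iterator: element i is base + i.
struct CountingIter {
    int base;
    int operator[](std::ptrdiff_t i) const { return base + static_cast<int>(i); }
};

// Hypothetical stand-in for a transform iterator: element i is fn(it[i]).
// Two instances built from different functors have different types.
template <typename Iter, typename Fn>
struct TransformIter {
    Iter it;
    Fn fn;
    int operator[](std::ptrdiff_t i) const { return fn(it[i]); }
};

// Functors that generate row offsets for a row-major matrix.
struct RowBegin { int width; int operator()(int row) const { return row * width; } };
struct RowEnd   { int width; int operator()(int row) const { return (row + 1) * width; } };

// Segmented sum with SEPARATE template parameters for the two offset
// sequences. Each is only indexed to read one integer offset per segment.
template <typename InputIt, typename OutputIt,
          typename BeginOffsetIt, typename EndOffsetIt>
void segmented_sum(InputIt in, OutputIt out, int num_segments,
                   BeginOffsetIt begin_offsets, EndOffsetIt end_offsets) {
    assert(num_segments >= 0);
    for (int s = 0; s < num_segments; ++s) {
        const int begin = begin_offsets[s];
        const int end = end_offsets[s];
        int sum = 0;
        for (int i = begin; i < end; ++i) sum += in[i];
        out[s] = sum;
    }
}

// Sums each row of a 3 x 4 matrix holding the values 1..12.
std::vector<int> row_sums_example() {
    const std::vector<int> data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
    const int width = 4;
    const int num_rows = 3;
    // The two offset generators have different types because their functors differ.
    TransformIter<CountingIter, RowBegin> begins{CountingIter{0}, RowBegin{width}};
    TransformIter<CountingIter, RowEnd> ends{CountingIter{0}, RowEnd{width}};
    std::vector<int> sums(num_rows);
    segmented_sum(data.begin(), sums.begin(), num_rows, begins, ends);
    return sums;
}
```

If `segmented_sum` took a single `OffsetIt` parameter for both sequences, the call in `row_sums_example` would fail template argument deduction, which mirrors the limitation described above.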
