Reduce top memory consumption of dnn module for FP16 precision #13413

Open
dkurt opened this issue Dec 11, 2018 · 4 comments

Comments

@dkurt (Member) commented Dec 11, 2018

For networks executed with DNN_TARGET_OPENCL_FP16 or DNN_TARGET_MYRIAD, we can reduce the top memory consumption of the dnn module by approximately 2x.

Currently, the module loads all weights in FP32 precision and performs an FP32->FP16 conversion for the mentioned targets. For models whose weights are larger than the internal allocations for intermediate blobs, we can reduce the top memory consumption by converting the weights to FP16 during import.
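For illustration, a minimal sketch of the conversion itself, using cv::convertFp16 from the core module (the 1024x1024 weight matrix is a stand-in for real layer weights; the actual change would live inside the importers):

    #include <opencv2/core.hpp>
    #include <iostream>

    int main()
    {
        // Stand-in for layer weights that an importer has loaded in FP32.
        cv::Mat weightsFp32(1024, 1024, CV_32F);
        cv::randu(weightsFp32, -1.0f, 1.0f);

        // Convert to FP16 right away instead of keeping the FP32 copy around.
        cv::Mat weightsFp16;
        cv::convertFp16(weightsFp32, weightsFp16);

        std::cout << "FP32: " << weightsFp32.total() * weightsFp32.elemSize() << " bytes\n"
                  << "FP16: " << weightsFp16.total() * weightsFp16.elemSize() << " bytes\n";

        // The FP16 blob is half the size; releasing the FP32 original early
        // is what lowers the top (peak) memory consumption.
        return 0;
    }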

There are several questions that must be resolved:

  1. Some layers that fail to execute with DNN_BACKEND_OPENCV and DNN_TARGET_OPENCL_FP16 fall back to DNN_TARGET_OPENCL and then to DNN_TARGET_CPU. We need to manage this and keep the FP32 weights for such layers.

  2. Currently it works in this way:

    net = readNet(<model path>);  <-- Importers work here
    net.setPreferableBackend(<backend id>);
    net.setPreferableTarget(DNN_TARGET_OPENCL_FP16);  <-- We specify precision only here
    

    We need to tell the importers about the desired precision earlier.

    2.1. Solution 1: an extra flag to readNet with the target (or backend and target?); see the sketch after this list.
    2.2. Solution 2: create methods such as Net::readFromCaffe and specify the target before import:

    Net net;
    net.setPreferableTarget(DNN_TARGET_OPENCL_FP16);
    net.readFromCaffe(<model path>);
    

    However, this is not as obvious to the user as the first solution.

  3. The precision of this approach won't match the current FP32->FP16 path, because all weight fusions are currently performed in FP32.

  4. It would be great research to study the dynamics of the dnn module's top memory consumption for a single-forward-pass use case across different versions. Several big PRs made changes in this area: opencv/opencv_contrib#1205, #11461 (just the ones I found in my logs), #9389 (closed for a while).
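A rough sketch of how Solution 1 (2.1) could look from the user side; the extra target parameter to readNet is hypothetical and not part of the current API:

    // Hypothetical overload: the target is known at import time, so the
    // importers can convert the weights to FP16 while reading the model:
    //   Net readNet(const String& model, const String& config = "",
    //               const String& framework = "", int targetId = DNN_TARGET_CPU);

    cv::dnn::Net net = cv::dnn::readNet("model.caffemodel", "deploy.prototxt",
                                        "", cv::dnn::DNN_TARGET_OPENCL_FP16);
    net.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
    // A later setPreferableTarget() call would become redundant for the
    // precision choice, though still useful for selecting the execution device.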

@kunakl07 commented Jan 10, 2020

Is this issue still open? I would like to work on it.

@souradeepmajumdar05 commented Sep 13, 2020

Is anybody working on this issue?

@carrycooldude commented Sep 15, 2020

I want to work on this issue, @dkurt.

@dkurt (Member, Author) commented Sep 15, 2020

Feel free to propose a solution by opening a pull request.
