Skip to content

gh-52769: Return list of errors in shutil.rmtree when ignore_errors=True#108255

Closed
richardhob wants to merge 14 commits intopython:mainfrom
richardhob:issue52769-rmtree-onerror
Closed

gh-52769: Return list of errors in shutil.rmtree when ignore_errors=True#108255
richardhob wants to merge 14 commits intopython:mainfrom
richardhob:issue52769-rmtree-onerror

Conversation

@richardhob
Copy link
Contributor

@richardhob richardhob commented Aug 22, 2023

Resolve #52769

This merge request adds an errors return value to the shutil.rmtree, based on the patch written in the #pycon 2013 sprint by andrewg (with r.david.murray's assistance). I have attempted to add documentation to explain the purpose of the errors return value as well.

Example 1: Path that does not exist

>>> errors = shutil.rmtree('My Pretend Path', ignore_errors=True)
>>> print(errors)
[(<built-in function lstat>, 'My Pretend Path', FileNotFoundError(2, 'No such file or directory'))]

Example 2: Read Only folder with a Read Only File

>>> errors = shutil.rmtree('ReadOnlyFolder', ignore_errors=True)
>>> print(errors)
[(<built-in function unlink>, 'ReadOnlyFolder/ReadOnlyFile.txt', PermissionError(13, 'Permission denied')), 
 (<built-in function rmdir>, 'ReadOnlyFolder', OSError(39, 'Directory not empty'))]

The discussion in this issue revolves around the complexity of the onerror function, and what the purpose of the onerror (now onexc) function is.

The conclusion from what I read:

  • shutil.rmtree should not try to delegate the hard work [of removing files and folders, dealing with permissions, etc.] to third party code
  • shutil.rmtree should return failures / errors, similar to how smtp returns a list of mails that could not be sent

Addittional features mentioned that are not implemented but are mentioned:

  • shutil.rmtree could check to make sure the permissions are the same throught the tree
  • shutil.rmtree could check and see if there are any links what will make the function fail

📚 Documentation preview 📚: https://cpython-previews--108255.org.readthedocs.build/

Update `test_rmtree_errors_onexc` to use the provided `errors` return
value from `shutil.rmtree` when ignoring errors (`ignore_errors=True`).
This errors list will contain the platform / implementation dependant
details in which the error occurred ((func, path, exc) from `onexc`).
When shutil.rmtree is provided with *ignore_errors=True* then the
function will return a list of errors as [(function, path, excinfo),
...].
Copy link
Contributor

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the benefit of this. If a user wants to do this, they should just pass in an onexc closure that does this.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@richardhob
Copy link
Contributor Author

How I see it is that the onerror / onexc function is used for one of two things:

  • Fix an error that occurs during shutil.rmtree (for example: change the permission of a file)
  • monitor which errors occur

The argument is that the shutil.rmtree should not delegate fixing errors to 3rd party code. And with that usecase removed, the only thing to use onerror / onexc for is to check out which errors occur (which could be returned as a list).

I would argue that:

>>> errors = shutil.rmtree(path, ignore_errors=True)

Makes more sense than:

>>> errors = []
>>> def my_onexc(*args):
...    errors.append(args)
>>> shutil.rmtree(path, onexc=my_onexc)

But this has been the way that shutil.rmtree has worked for a long time so I'm ok with closing the issue too.

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@hauntsaninja: please review the changes made to this pull request.

@serhiy-storchaka
Copy link
Member

I concur with @hauntsaninja. It is easy to do with the existing error handler argument.

When you try to remove a large read-only three with ignore_errors=True, there will be a lot of errors, and collecting them in a list will take additional time and memory, even when most users do not need this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

shutil.rmtree and os.listdir cannot recover on error conditions

4 participants