-
-
Notifications
You must be signed in to change notification settings - Fork 34.1k
bpo-32002: Refactor C locale coercion tests #4369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
ncoghlan
merged 12 commits into
python:master
from
ncoghlan:bpo-32002-refactor-locale-coercion-test-cases
Dec 16, 2017
Merged
Changes from all commits
Commits
Show all changes
12 commits
Select commit
Hold shift + click to select a range
63c7676
bpo-32002: Refactor C locale coercion tests
ncoghlan 8d4f704
Add NEWS entry
ncoghlan 9f8d9f8
Make the default locale its own subtest
ncoghlan e282db7
Fix method name in comments
ncoghlan 8ff68a8
Fix merge error
ncoghlan 75e585b
Reword NEWS entry
ncoghlan af504ff
Fix doubled word in test method name
ncoghlan 05abe72
Avoid unneeded dict copy in tests
ncoghlan 17364e8
Fix directive name in comment
ncoghlan e61457b
Add issue reference in current Linux special case
ncoghlan 13f06bb
Fix test expectations for Cygwin & Android
ncoghlan 45c1822
Fix the tests for Android
xdegaye File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -14,30 +14,51 @@ | |
| interpreter_requires_environment, | ||
| ) | ||
|
|
||
| # Set the list of ways we expect to be able to ask for the "C" locale | ||
| EXPECTED_C_LOCALE_EQUIVALENTS = ["C", "invalid.ascii"] | ||
|
|
||
| # Set our expectation for the default encoding used in the C locale | ||
| # for the filesystem encoding and the standard streams | ||
|
|
||
| # While most *nix platforms default to ASCII in the C locale, some use a | ||
| # different encoding. | ||
| if sys.platform.startswith("aix"): | ||
| C_LOCALE_STREAM_ENCODING = "iso8859-1" | ||
| elif test.support.is_android: | ||
| C_LOCALE_STREAM_ENCODING = "utf-8" | ||
| else: | ||
| C_LOCALE_STREAM_ENCODING = "ascii" | ||
|
|
||
| # FS encoding is UTF-8 on macOS, other *nix platforms use the locale encoding | ||
| if sys.platform == "darwin": | ||
| C_LOCALE_FS_ENCODING = "utf-8" | ||
| else: | ||
| C_LOCALE_FS_ENCODING = C_LOCALE_STREAM_ENCODING | ||
|
|
||
| # Note that the above is probably still wrong in some cases, such as: | ||
| EXPECTED_C_LOCALE_STREAM_ENCODING = "ascii" | ||
| EXPECTED_C_LOCALE_FS_ENCODING = "ascii" | ||
|
|
||
| # Set our expectation for the default locale used when none is specified | ||
| EXPECT_COERCION_IN_DEFAULT_LOCALE = True | ||
|
|
||
| # Apply some platform dependent overrides | ||
| if sys.platform.startswith("linux"): | ||
| if test.support.is_android: | ||
| # Android defaults to using UTF-8 for all system interfaces | ||
| EXPECTED_C_LOCALE_STREAM_ENCODING = "utf-8" | ||
| EXPECTED_C_LOCALE_FS_ENCODING = "utf-8" | ||
| else: | ||
| # Linux distros typically alias the POSIX locale directly to the C | ||
| # locale. | ||
| # TODO: Once https://bugs.python.org/issue30672 is addressed, we'll be | ||
| # able to check this case unconditionally | ||
| EXPECTED_C_LOCALE_EQUIVALENTS.append("POSIX") | ||
| elif sys.platform.startswith("aix"): | ||
| # AIX uses iso8859-1 in the C locale, other *nix platforms use ASCII | ||
| EXPECTED_C_LOCALE_STREAM_ENCODING = "iso8859-1" | ||
| EXPECTED_C_LOCALE_FS_ENCODING = "iso8859-1" | ||
| elif sys.platform == "darwin": | ||
| # FS encoding is UTF-8 on macOS | ||
| EXPECTED_C_LOCALE_FS_ENCODING = "utf-8" | ||
| elif sys.platform == "cygwin": | ||
| # Cygwin defaults to using C.UTF-8 | ||
| # TODO: Work out a robust dynamic test for this that doesn't rely on | ||
| # CPython's own locale handling machinery | ||
| EXPECT_COERCION_IN_DEFAULT_LOCALE = False | ||
|
|
||
| # Note that the above expectations are still wrong in some cases, such as: | ||
| # * Windows when PYTHONLEGACYWINDOWSFSENCODING is set | ||
| # * AIX and any other platforms that use latin-1 in the C locale | ||
| # * Any platform other than AIX that uses latin-1 in the C locale | ||
| # * Any Linux distro where POSIX isn't a simple alias for the C locale | ||
| # * Any Linux distro where the default locale is something other than "C" | ||
| # | ||
| # Options for dealing with this: | ||
| # * Don't set PYTHON_COERCE_C_LOCALE on such platforms (e.g. Windows doesn't) | ||
| # * Don't set the PY_COERCE_C_LOCALE preprocessor definition on | ||
| # such platforms (e.g. it isn't set on Windows) | ||
| # * Fix the test expectations to match the actual platform behaviour | ||
|
|
||
| # In order to get the warning messages to match up as expected, the candidate | ||
|
|
@@ -47,7 +68,7 @@ | |
| # There's no reliable cross-platform way of checking locale alias | ||
| # lists, so the only way of knowing which of these locales will work | ||
| # is to try them with locale.setlocale(). We do that in a subprocess | ||
| # to avoid altering the locale of the test runner. | ||
| # in setUpModule() below to avoid altering the locale of the test runner. | ||
| # | ||
| # If the relevant locale module attributes exist, and we're not on a platform | ||
| # where we expect it to always succeed, we also check that | ||
|
|
@@ -217,8 +238,9 @@ def _check_child_encoding_details(self, | |
| class LocaleConfigurationTests(_LocaleHandlingTestCase): | ||
| # Test explicit external configuration via the process environment | ||
|
|
||
| def setUpClass(): | ||
| # This relies on setupModule() having been run, so it can't be | ||
| @classmethod | ||
| def setUpClass(cls): | ||
| # This relies on setUpModule() having been run, so it can't be | ||
| # handled via the @unittest.skipUnless decorator | ||
| if not AVAILABLE_TARGETS: | ||
| raise unittest.SkipTest("No C-with-UTF-8 locale available") | ||
|
|
@@ -284,8 +306,8 @@ def _check_c_locale_coercion(self, | |
|
|
||
| if not AVAILABLE_TARGETS: | ||
| # Locale coercion is disabled when there aren't any target locales | ||
| fs_encoding = C_LOCALE_FS_ENCODING | ||
| stream_encoding = C_LOCALE_STREAM_ENCODING | ||
| fs_encoding = EXPECTED_C_LOCALE_FS_ENCODING | ||
| stream_encoding = EXPECTED_C_LOCALE_STREAM_ENCODING | ||
| coercion_expected = False | ||
| if expected_warnings: | ||
| expected_warnings = [LEGACY_LOCALE_WARNING] | ||
|
|
@@ -296,41 +318,47 @@ def _check_c_locale_coercion(self, | |
| "LC_ALL": "", | ||
| } | ||
| base_var_dict.update(extra_vars) | ||
| for env_var in ("LANG", "LC_CTYPE"): | ||
| for locale_to_set in ("", "C", "POSIX", "invalid.ascii"): | ||
| # XXX (ncoghlan): *BSD platforms don't behave as expected in the | ||
| # POSIX locale, so we skip that for now | ||
| # See https://bugs.python.org/issue30672 for discussion | ||
| if locale_to_set == "POSIX": | ||
| continue | ||
| if coerce_c_locale is not None: | ||
| base_var_dict["PYTHONCOERCECLOCALE"] = coerce_c_locale | ||
|
|
||
| # Platforms using UTF-8 in the C locale do not print | ||
| # CLI_COERCION_WARNING when all the locale envt variables are | ||
| # not set or set to the empty string. | ||
| # Check behaviour for the default locale | ||
| with self.subTest(default_locale=True, | ||
| PYTHONCOERCECLOCALE=coerce_c_locale): | ||
| if EXPECT_COERCION_IN_DEFAULT_LOCALE: | ||
| _expected_warnings = expected_warnings | ||
| for _env_var in base_var_dict: | ||
| if base_var_dict[_env_var]: | ||
| break | ||
| else: | ||
| if (C_LOCALE_STREAM_ENCODING == "utf-8" and | ||
| locale_to_set == "" and coerce_c_locale == "warn"): | ||
| _expected_warnings = None | ||
|
|
||
| _coercion_expected = coercion_expected | ||
| else: | ||
| _expected_warnings = None | ||
| _coercion_expected = False | ||
| # On Android CLI_COERCION_WARNING is not printed when all the | ||
| # locale environment variables are undefined or empty. When | ||
| # this code path is run with environ['LC_ALL'] == 'C', then | ||
| # LEGACY_LOCALE_WARNING is printed. | ||
| if (test.support.is_android and | ||
| _expected_warnings == [CLI_COERCION_WARNING]): | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I need to study this a bit more carefully, but I believe this condition may apply for Cygwin as well (or anywhere that the default locale has UTF-8 encoding). |
||
| _expected_warnings = None | ||
| self._check_child_encoding_details(base_var_dict, | ||
| fs_encoding, | ||
| stream_encoding, | ||
| _expected_warnings, | ||
| _coercion_expected) | ||
|
|
||
| # Check behaviour for explicitly configured locales | ||
| for locale_to_set in EXPECTED_C_LOCALE_EQUIVALENTS: | ||
| for env_var in ("LANG", "LC_CTYPE"): | ||
| with self.subTest(env_var=env_var, | ||
| nominal_locale=locale_to_set, | ||
| PYTHONCOERCECLOCALE=coerce_c_locale): | ||
| var_dict = base_var_dict.copy() | ||
| var_dict[env_var] = locale_to_set | ||
| if coerce_c_locale is not None: | ||
| var_dict["PYTHONCOERCECLOCALE"] = coerce_c_locale | ||
| # Check behaviour on successful coercion | ||
| self._check_child_encoding_details(var_dict, | ||
| fs_encoding, | ||
| stream_encoding, | ||
| _expected_warnings, | ||
| expected_warnings, | ||
| coercion_expected) | ||
|
|
||
| def test_test_PYTHONCOERCECLOCALE_not_set(self): | ||
| def test_PYTHONCOERCECLOCALE_not_set(self): | ||
| # This should coerce to the first available target locale by default | ||
| self._check_c_locale_coercion("utf-8", "utf-8", coerce_c_locale=None) | ||
|
|
||
|
|
@@ -349,27 +377,27 @@ def test_PYTHONCOERCECLOCALE_set_to_warn(self): | |
|
|
||
| def test_PYTHONCOERCECLOCALE_set_to_zero(self): | ||
| # The setting "0" should result in the locale coercion being disabled | ||
| self._check_c_locale_coercion(C_LOCALE_FS_ENCODING, | ||
| C_LOCALE_STREAM_ENCODING, | ||
| self._check_c_locale_coercion(EXPECTED_C_LOCALE_FS_ENCODING, | ||
| EXPECTED_C_LOCALE_STREAM_ENCODING, | ||
| coerce_c_locale="0", | ||
| coercion_expected=False) | ||
| # Setting LC_ALL=C shouldn't make any difference to the behaviour | ||
| self._check_c_locale_coercion(C_LOCALE_FS_ENCODING, | ||
| C_LOCALE_STREAM_ENCODING, | ||
| self._check_c_locale_coercion(EXPECTED_C_LOCALE_FS_ENCODING, | ||
| EXPECTED_C_LOCALE_STREAM_ENCODING, | ||
| coerce_c_locale="0", | ||
| LC_ALL="C", | ||
| coercion_expected=False) | ||
|
|
||
| def test_LC_ALL_set_to_C(self): | ||
| # Setting LC_ALL should render the locale coercion ineffective | ||
| self._check_c_locale_coercion(C_LOCALE_FS_ENCODING, | ||
| C_LOCALE_STREAM_ENCODING, | ||
| self._check_c_locale_coercion(EXPECTED_C_LOCALE_FS_ENCODING, | ||
| EXPECTED_C_LOCALE_STREAM_ENCODING, | ||
| coerce_c_locale=None, | ||
| LC_ALL="C", | ||
| coercion_expected=False) | ||
| # And result in a warning about a lack of locale compatibility | ||
| self._check_c_locale_coercion(C_LOCALE_FS_ENCODING, | ||
| C_LOCALE_STREAM_ENCODING, | ||
| self._check_c_locale_coercion(EXPECTED_C_LOCALE_FS_ENCODING, | ||
| EXPECTED_C_LOCALE_STREAM_ENCODING, | ||
| coerce_c_locale="warn", | ||
| LC_ALL="C", | ||
| expected_warnings=[LEGACY_LOCALE_WARNING], | ||
|
|
||
2 changes: 2 additions & 0 deletions
2
Misc/NEWS.d/next/Tests/2017-11-11-16-35-18.bpo-32002.itDxIo.rst
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| Adjust C locale coercion testing for the empty locale and POSIX locale | ||
| cases to more readily adjust to platform dependent behaviour. |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think
""should be included in this list otherwise the case where all the locale envt variables are set to""may not be tested. On Android it is only the tests that havecoerce_c_locale="warn"that must be skipped when all locale envt variables are set to"".There is currently no test run for when none of these variables is defined and AFAIK it seems correct since it is equivalent to having all locale envt variables set to
"". But since it is such a common case, testing with all locale envt variables set to""should be mandatory for all platforms.Maybe
EXPECTED_C_LOCALE_EQUIVALENTScould be replaced withLOCALES_TO_TEST.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. If we go that way, then what I'd actually suggest we do is give the empty locale a dedicated test case, separate from the C locale equivalents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And a dedicated test case for the empty locale would be more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to a dedicated test case for empty locale.