Skip to content

Fix dict unpacking order and improve error messages#7085

Merged
youknowone merged 6 commits intoRustPython:mainfrom
youknowone:extcall
Feb 12, 2026
Merged

Fix dict unpacking order and improve error messages#7085
youknowone merged 6 commits intoRustPython:mainfrom
youknowone:extcall

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Feb 11, 2026

  • Rewrite compile_dict to preserve insertion order with ** unpacking
  • Fix function call error messages to include function name and keyword-only argument counts
  • Fix missing keyword-only argument error to list all missing args
  • Fix starred expression error messages to match CPython
  • Fix "cannot unpack non-iterable" error message
  • Remove resolved expected failure markers in test_extcall and test_unpack_ex

Summary by CodeRabbit

  • Bug Fixes
    • Unified and clearer error messages for missing, positional and keyword-only arguments.
    • Improved error reporting for dictionary unpacking and starred-argument operations, including contextual function names.
    • Clearer TypeError messages when unpacking encounters non-iterable or non-mapping types; single-star call cases now handle starred values more directly.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 11, 2026

📝 Walkthrough

Walkthrough

Refactors dict construction and star-unpack compilation paths in the codegen, tightens runtime error messages for starred/mapping/unpack operations, and adds a function-descriptor helper used across frame and builtin error reporting.

Changes

Cohort / File(s) Summary
Dict & star-unpack compilation
crates/codegen/src/compile.rs
Rewrote dict construction: emit a single BuildMap when no ** unpacking; stream segments and emit DictUpdate for ** entries when present. Simplified single-starred call arg path to directly use CallFunctionEx without temporary tuple/list wrapping.
Runtime framing & unpack errors
crates/vm/src/frame.rs
Added object_function_str helper and threaded func_str into mapping/iteration/unpack error paths to produce function-contextual messages (e.g., "X argument after ** must be a mapping, not Y"). Updated iterate_mapping_keys signature.
Builtin function argument errors
crates/vm/src/builtins/function.rs
Introduced format_missing_args helper and unified missing-argument message construction for positional and keyword-only parameters; now reports combined positional and kw-only counts and names.
Error string updates
crates/codegen/src/error.rs
Refined two error message strings: MultipleStarArgs and InvalidStarExpr texts updated for clarity.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    Client->>Compiler: compile source with dict/** and *args usage
    note right of Compiler: emits bytecode sequences\n(BuildMap, DictUpdate, CallFunctionEx)
    Compiler->>VM: load bytecode
    Client->>VM: invoke function (runtime)
    VM/Frame->>VM/Frame: object_function_str(func) -> func_str
    VM/Frame->>Runtime: handle DictUpdate / iterate_mapping_keys using func_str
    alt mapping is invalid
        Runtime->>Client: raise TypeError("X argument after ** must be a mapping, not Y")
    else star-unpack single tuple-like
        VM->>Runtime: CallFunctionEx with compiled starred value
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

Poem

🐰 I hopped through maps and starred arrays,

Rewrote the paths in tidy ways.
Now BuildMap sings when no ** plays,
And errors name the function’s days.
A little hop — the code behaves!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Fix dict unpacking order and improve error messages' directly summarizes the main changes: rewriting dict construction logic to preserve insertion order with ** unpacking, and enhancing error messages throughout multiple modules.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
crates/vm/src/builtins/function.rs (1)

28-57: Mutating the caller's Vec via pop() is a hidden side-effect; consider a non-destructive approach.

format_missing_args takes &mut Vec solely to pop() the last element for formatting. This works because no caller reuses the vector afterwards, but the mutation is surprising and fragile. You can achieve the same with split_last() on a slice, avoiding the &mut requirement entirely.

♻️ Non-mutating alternative
-fn format_missing_args(
-    qualname: impl core::fmt::Display,
-    kind: &str,
-    missing: &mut Vec<impl core::fmt::Display>,
-) -> String {
-    let count = missing.len();
-    let last = if missing.len() > 1 {
-        missing.pop()
-    } else {
-        None
-    };
-    let (and, right): (&str, String) = if let Some(last) = last {
-        (
-            if missing.len() == 1 {
-                "' and '"
-            } else {
-                "', and '"
-            },
-            format!("{last}"),
-        )
-    } else {
-        ("", String::new())
-    };
-    format!(
-        "{qualname}() missing {count} required {kind} argument{}: '{}{}{right}'",
-        if count == 1 { "" } else { "s" },
-        missing.iter().join("', '"),
-        and,
-    )
-}
+fn format_missing_args(
+    qualname: impl core::fmt::Display,
+    kind: &str,
+    missing: &[impl core::fmt::Display],
+) -> String {
+    let count = missing.len();
+    let (init, and, right) = if count > 1 {
+        let (init, last) = missing.split_last().unwrap();
+        let and = if count == 2 { "' and '" } else { "', and '" };
+        (
+            core::slice::from_ref(last)
+                .iter()
+                .chain(init.iter().take(init.len().saturating_sub(0)))
+                .collect::<Vec<_>>(),
+            and,
+            format!("{last}"),
+        )
+    } else {
+        (missing.iter().collect(), "", String::new())
+    };
+    // Simpler: just use the original slice directly
+    let (init, and, right): (&[_], &str, String) = if let Some((last, rest)) = missing.split_last()
+        && rest.len() >= 1
+    {
+        (rest, if rest.len() == 1 { "' and '" } else { "', and '" }, format!("{last}"))
+    } else {
+        (missing, "", String::new())
+    };
+    format!(
+        "{qualname}() missing {count} required {kind} argument{}: '{}{}{right}'",
+        if count == 1 { "" } else { "s" },
+        init.iter().join("', '"),
+        and,
+    )
+}

Actually, let me simplify — the cleanest version using split_last:

♻️ Cleaner non-mutating version
 fn format_missing_args(
     qualname: impl core::fmt::Display,
     kind: &str,
-    missing: &mut Vec<impl core::fmt::Display>,
+    missing: &[impl core::fmt::Display],
 ) -> String {
     let count = missing.len();
-    let last = if missing.len() > 1 {
-        missing.pop()
-    } else {
-        None
-    };
-    let (and, right): (&str, String) = if let Some(last) = last {
-        (
-            if missing.len() == 1 {
-                "' and '"
-            } else {
-                "', and '"
-            },
-            format!("{last}"),
-        )
+    let (init, and, right): (&[_], &str, String) = if let Some((last, rest)) = missing.split_last()
+        && !rest.is_empty()
+    {
+        (rest, if rest.len() == 1 { "' and '" } else { "', and '" }, format!("{last}"))
     } else {
-        ("", String::new())
+        (missing, "", String::new())
     };
     format!(
         "{qualname}() missing {count} required {kind} argument{}: '{}{}{right}'",
         if count == 1 { "" } else { "s" },
-        missing.iter().join("', '"),
+        init.iter().join("', '"),
         and,
     )
 }

Then update call sites from &mut missing to &missing.

crates/vm/src/frame.rs (3)

1073-1076: func_str is computed eagerly on every DictMerge, even on the happy path.

object_function_str performs get_attr("__qualname__") and get_attr("__module__") (potentially triggering Python-level __getattr__). This work is wasted when no error occurs, which is the common case. The same issue applies in collect_ex_args (lines 2612 and 2633).

Consider deferring the computation with a lazy closure or computing it only inside the error branches.

♻️ Sketch: defer func_str to error paths
-                let callable = self.nth_value(idx + 2);
-                let func_str = Self::object_function_str(callable, vm);
-
                 // Check if source is a mapping
                 if vm
                     .get_method(source.clone(), vm.ctx.intern_str("keys"))
                     .is_none()
                 {
+                    let callable = self.nth_value(idx + 2);
+                    let func_str = Self::object_function_str(callable, vm);
                     return Err(vm.new_type_error(format!(
                         "{} argument after ** must be a mapping, not {}",
                         func_str,

(Apply the same pattern for the duplicate-key branch and the two sites in collect_ex_args.)


2653-2676: Duplicated fallback logic — consider extracting the repr-or-"?" pattern.

Lines 2658–2661 and 2664–2667 are identical. A small helper or local closure would reduce duplication.

♻️ Suggested simplification
     fn object_function_str(obj: &PyObject, vm: &VirtualMachine) -> String {
+        let repr_fallback = || {
+            obj.repr(vm)
+                .map(|s| s.as_str().to_owned())
+                .unwrap_or_else(|_| "?".to_owned())
+        };
         let Ok(qualname) = obj.get_attr(vm.ctx.intern_str("__qualname__"), vm) else {
-            return obj
-                .repr(vm)
-                .map(|s| s.as_str().to_owned())
-                .unwrap_or_else(|_| "?".to_owned());
+            return repr_fallback();
         };
         let Some(qualname_str) = qualname.downcast_ref::<PyStr>() else {
-            return obj
-                .repr(vm)
-                .map(|s| s.as_str().to_owned())
-                .unwrap_or_else(|_| "?".to_owned());
+            return repr_fallback();
         };

As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code."


1334-1337: Repeated non-iterability check pattern across 4 call sites.

The identical not_iterable guard appears at lines 1334–1337, 2634–2637, 2806–2809, and 3020–3023. Consider extracting a small helper method on PyObject or a free function:

fn is_not_iterable(obj: &PyObject, vm: &VirtualMachine) -> bool {
    obj.class().slots.iter.load().is_none()
        && obj
            .get_class_attr(vm.ctx.intern_str("__getitem__"))
            .is_none()
}

This makes the intent self-documenting and ensures future sites stay consistent.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the extcall branch 3 times, most recently from b0eaa37 to 6aa01a4 Compare February 11, 2026 16:01
youknowone and others added 4 commits February 12, 2026 01:01
- Rewrite compile_dict to preserve insertion order with ** unpacking
- Fix function call error messages to include function name and
keyword-only argument counts
- Fix missing keyword-only argument error to list all missing args
- Fix starred expression error messages to match CPython
- Fix "cannot unpack non-iterable" error message
- Remove resolved expected failure markers in test_extcall and
test_unpack_ex
@youknowone youknowone marked this pull request as ready for review February 12, 2026 03:18
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/vm/src/frame.rs`:
- Around line 2806-2815: The current error-rewrite around value.try_to_value(vm)
replaces any TypeError with "cannot unpack non-iterable ..." and therefore can
hide real TypeErrors raised by a custom __iter__; change the condition so the
rewrite only happens when the object truly lacks an __iter__ implementation
(mirror the guard used in ListExtend and collect_ex_args). Concretely, inside
the map_err closure around value.try_to_value(vm), keep the existing check of
e.class().is(vm.ctx.exceptions.type_error) but add a second check that the
value's type does not provide __iter__ (e.g., via the same API used elsewhere
such as checking value.class().has_attr("__iter__") or using
vm.get_attribute_opt(value, "__iter__") == None); only then return
vm.new_type_error(...), otherwise return the original error e. Reference
symbols: elements (Vec<_>), value.try_to_value, vm.new_type_error, and the
analogous guards in ListExtend and collect_ex_args.
🧹 Nitpick comments (2)
crates/codegen/src/compile.rs (1)

6649-6707: Deduplicate pending-pair flush logic.

The BuildMap/DictUpdate sequence is repeated in multiple branches. Consider extracting the common logic so the differing part (whether a dict already exists) is handled once.

♻️ Suggested refactor
 fn compile_dict(&mut self, items: &[ast::DictItem]) -> CompileResult<()> {
     let has_unpacking = items.iter().any(|item| item.key.is_none());

     if !has_unpacking {
         // Simple case: no ** unpacking, build all pairs directly
         for item in items {
             self.compile_expression(item.key.as_ref().unwrap())?;
             self.compile_expression(&item.value)?;
         }
         emit!(
             self,
             Instruction::BuildMap {
                 size: u32::try_from(items.len()).expect("too many dict items"),
             }
         );
         return Ok(());
     }

+    fn flush_pending(
+        compiler: &mut Compiler,
+        elements: u32,
+        have_dict: &mut bool,
+    ) {
+        if elements == 0 {
+            return;
+        }
+        emit!(compiler, Instruction::BuildMap { size: elements });
+        if *have_dict {
+            emit!(compiler, Instruction::DictUpdate { index: 1 });
+        } else {
+            *have_dict = true;
+        }
+    }
+
     // Complex case with ** unpacking: preserve insertion order.
     // Collect runs of regular k:v pairs and emit BUILD_MAP + DICT_UPDATE
     // for each run, and DICT_UPDATE for each ** entry.
     let mut have_dict = false;
     let mut elements: u32 = 0;

     for item in items {
         if let Some(key) = &item.key {
             // Regular key: value pair
             self.compile_expression(key)?;
             self.compile_expression(&item.value)?;
             elements += 1;
         } else {
             // ** unpacking entry
-            if elements > 0 {
-                // Flush pending regular pairs
-                if !have_dict {
-                    emit!(self, Instruction::BuildMap { size: elements });
-                    have_dict = true;
-                } else {
-                    emit!(self, Instruction::BuildMap { size: elements });
-                    emit!(self, Instruction::DictUpdate { index: 1 });
-                }
-                elements = 0;
-            }
+            flush_pending(self, elements, &mut have_dict);
+            elements = 0;
             if !have_dict {
                 emit!(self, Instruction::BuildMap { size: 0 });
                 have_dict = true;
             }
             self.compile_expression(&item.value)?;
             emit!(self, Instruction::DictUpdate { index: 1 });
         }
     }

-    if elements > 0 || !have_dict {
-        if !have_dict {
-            emit!(self, Instruction::BuildMap { size: elements });
-        } else {
-            emit!(self, Instruction::BuildMap { size: elements });
-            emit!(self, Instruction::DictUpdate { index: 1 });
-        }
-    }
+    flush_pending(self, elements, &mut have_dict);
+    if !have_dict {
+        emit!(self, Instruction::BuildMap { size: 0 });
+    }

     Ok(())
 }

As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code."

crates/vm/src/builtins/function.rs (1)

395-444: Duplicate formatting logic for missing-argument error messages.

Lines 417–444 replicate the conjunction-formatting pattern from lines 349–376 almost verbatim (pop last, choose "' and '" vs "', and '", join remaining, format). Consider extracting a small helper (e.g., fn format_missing_args(kind: &str, qualname: &str, names: &mut Vec<…>) -> String) and calling it from both sites.

♻️ Sketch
// Somewhere accessible to both call sites:
fn format_missing_args_error(
    qualname: &str,
    kind: &str, // "positional" or "keyword-only"
    missing: &mut Vec<&impl AsRef<str>>,
) -> String {
    let missing_len = missing.len();
    let last = if missing.len() > 1 { missing.pop() } else { None };
    let (and, right) = if let Some(last) = last {
        (
            if missing.len() == 1 { "' and '" } else { "', and '" },
            last.as_ref(),
        )
    } else {
        ("", "")
    };
    format!(
        "{}() missing {} required {} argument{}: '{}{}{}'",
        qualname,
        missing_len,
        kind,
        if missing_len == 1 { "" } else { "s" },
        missing.iter().map(|m| m.as_ref()).join("', '"),
        and,
        right,
    )
}

As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".

@github-actions
Copy link
Contributor

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[ ] test: cpython/Lib/test/test_extcall.py (TODO: 8)

dependencies:

dependent tests: (no tests depend on extcall)

[x] lib: cpython/Lib/inspect.py
[ ] test: cpython/Lib/test/test_inspect (TODO: 49)

dependencies:

  • inspect

dependent tests: (42 tests)

  • inspect: test_abc test_argparse test_asyncgen test_builtin test_code test_collections test_coroutines test_decimal test_enum test_functools test_generators test_grammar test_inspect test_ntpath test_operator test_patma test_posixpath test_signal test_traceback test_type_annotations test_types test_typing test_unittest test_yield_from test_zipimport
    • asyncio: test_asyncio test_contextlib_async test_logging test_os test_sys_settrace test_unittest
    • bdb: test_bdb
    • dataclasses: test__colorize test_genericalias test_pprint test_regrtest test_zoneinfo
    • importlib.metadata: test_importlib
    • pydoc:
      • xmlrpc.server: test_docxmlrpc test_xmlrpc
    • rlcompleter: test_rlcompleter
    • trace: test_trace

[x] lib: cpython/Lib/pickle.py
[ ] lib: cpython/Lib/_compat_pickle.py
[ ] test: cpython/Lib/test/test_pickle.py (TODO: 21)
[ ] test: cpython/Lib/test/test_picklebuffer.py (TODO: 12)
[ ] test: cpython/Lib/test/test_pickletools.py (TODO: 8)

dependencies:

  • pickle (native: itertools, sys)
    • _compat_pickle
    • _compat_pickle
    • io (native: _io, _thread, errno, sys)
    • types
    • codecs, copyreg, functools, struct

dependent tests: (72 tests)

  • pickle: test_annotationlib test_array test_asyncio test_builtin test_bytes test_bz2 test_codecs test_collections test_concurrent_futures test_coroutines test_csv test_ctypes test_decimal test_defaultdict test_deque test_descr test_dict test_dictviews test_email test_enum test_enumerate test_exceptions test_fractions test_functools test_generators test_genericalias test_http_cookies test_importlib test_inspect test_io test_ipaddress test_iter test_itertools test_list test_logging test_lzma test_memoryio test_memoryview test_opcache test_operator test_ordered_dict test_os test_pathlib test_pickle test_pickletools test_platform test_plistlib test_positional_only_arg test_posix test_random test_range test_set test_shelve test_slice test_socket test_statistics test_str test_string test_trace test_tuple test_type_aliases test_type_params test_types test_typing test_unittest test_uuid test_xml_dom_minicompat test_xml_etree test_zipfile test_zlib test_zoneinfo
    • logging.handlers: test_concurrent_futures

[ ] test: cpython/Lib/test/test_positional_only_arg.py (TODO: 5)

dependencies:

dependent tests: (no tests depend on positional_only_arg)

[ ] test: cpython/Lib/test/test_unpack.py
[ ] test: cpython/Lib/test/test_unpack_ex.py (TODO: 17)

dependencies:

dependent tests: (no tests depend on unpack)

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone youknowone merged commit 7f64acd into RustPython:main Feb 12, 2026
43 of 47 checks passed
@youknowone youknowone deleted the extcall branch February 12, 2026 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant