Implement more ast features by youknowone · Pull Request #6986 · RustPython/RustPython

youknowone · 2026-02-03T15:43:49Z

Summary by CodeRabbit

New Features
- Enhanced AST module with support for function type nodes and improved validation
- Added feature version support to the compile() function for better Python compatibility
- Improved documentation coverage for Python's internal modules and objects
Bug Fixes
- Fixed float integer detection to use exact checks instead of epsilon-based comparisons
- Enhanced string handling for robustness against invalid character codes
Documentation
- Expanded built-in module documentation with comprehensive entries for internal Python classes and functions

coderabbitai · 2026-02-03T15:44:02Z

📝 Walkthrough

Walkthrough

This PR refactors the AST system to add comprehensive validation and representation helpers, changes how builtins are stored from PyDictRef to PyObjectRef throughout the frame and function system, expands documentation entries, and enhances string handling for f-strings and template strings with normalization and escape-sequence warnings.

Changes

Cohort / File(s)	Summary
Builtins Type System `crates/vm/src/builtins/frame.rs`, `crates/vm/src/frame.rs`, `crates/vm/src/builtins/function.rs`, `crates/vm/src/vm/mod.rs`, `crates/vm/src/suggestion.rs`	Changed builtins storage from PyDictRef to PyObjectRef in Frame and ExecutingFrame; updated PyFunction construction to use frame's builtins directly; modified run_code_obj to dynamically resolve builtins from module.builtins; adjusted load_global_or_builtin to handle both dict and non-dict builtins with fallback paths.
AST Core Implementation `crates/vm/src/stdlib/ast/pyast.rs`, `crates/vm/src/stdlib/ast/python.rs`, `crates/vm/src/stdlib/ast/validate.rs`, `crates/vm/src/stdlib/ast/repr.rs`	Introduced impl_base_node! macro for AST node declaration; added comprehensive AST validation module; implemented AST representation helpers with depth-aware truncation; refactored extend_module_nodes to register AST class and populate field types, singletons, and match args; enhanced NodeAst with reduce, replace, and custom initialization logic.
AST Node Type Handling `crates/vm/src/stdlib/ast/basic.rs`, `crates/vm/src/stdlib/ast/constant.rs`, `crates/vm/src/stdlib/ast/elif_else_clause.rs`, `crates/vm/src/stdlib/ast/expression.rs`, `crates/vm/src/stdlib/ast/module.rs`, `crates/vm/src/stdlib/ast/operator.rs`, `crates/vm/src/stdlib/ast/other.rs`	Updated string handling to use interned strings in basic.rs and other.rs; reworked frozenset constant representation; adjusted elif/else range computation and clause construction; enhanced expression handling with NodeExprTemplateStr support and dict key/value validation; removed node location additions in module nodes; added _instance early-return paths in operator nodes.
AST Parameters & Patterns `crates/vm/src/stdlib/ast/parameter.rs`, `crates/vm/src/stdlib/ast/pattern.rs`	Updated merge_positional/keyword_parameter_defaults to accept VM, validate argument counts, and return PyResult; enhanced pattern matching to explicitly type kwd attributes/patterns and validate length consistency between kwd_attrs and kwd_patterns.
AST String & Statement Handling `crates/vm/src/stdlib/ast/string.rs`, `crates/vm/src/stdlib/ast/statement.rs`	Introduced f-string/template-string normalization with adjacent constant merging; added escape-sequence validation and warnings; changed TStringInterpolation.format_spec from TemplateStr to JoinedStr; enhanced function/class definition range computation; improved ImportFrom level validation and Pass statement location handling.
Compile & Parser Integration `crates/vm/src/stdlib/ast.rs`, `crates/vm/src/stdlib/builtins.rs`	Expanded parse entry points with optimize, target_version, and type_comments parameters; added ModFunctionType exposure; introduced parse_func_type and wrap_interactive under parser feature; added AST validation in compile(); enhanced error propagation for parse errors; exposed new compile flags (PY_CF_SOURCE_IS_UTF8, PY_CF_IGNORE_COOKIE, PY_CF_ALLOW_TOP_LEVEL_AWAIT).
Documentation & Spelling `.cspell.dict/cpython.txt`, `crates/doc/generate.py`, `crates/doc/src/data.inc.rs`	Added binop and hexdigit to spelling dictionary; updated doc generation to use strip() instead of inspect.cleandoc(); improved is_child_of module resolution for C modules; massively expanded DB documentation map with Python internal modules, builtin types, and class descriptions.
Numeric & Type Handling `crates/literal/src/float.rs`, `crates/vm/src/builtins/type.rs`	Changed is_integer check in float.rs from epsilon-based tolerance to exact finite check (v.is_finite() && v.fract() == 0.0); refactored PyType.annotations resolution with stricter HEAPTYPE guards and non-dict attribute skipping.
Utility & Platform-Specific `crates/vm/src/stdlib/ctypes/function.rs`, `crates/vm/src/stdlib/ctypes/simple.rs`, `crates/vm/src/stdlib/os.rs`, `extra_tests/snippets/stdlib_types.py`	Replaced direct wchar-to-char casts with guarded u32::try_from conversions in ctypes; fixed OS encoding detection to use byte constants; updated test snippet to add Pass statement and remove explicit location metadata.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant VM
    participant Frame
    participant Module
    participant Builtins

    Client->>VM: run_code_obj(code_obj, scope)
    VM->>Module: resolve builtins from __builtins__
    alt Module has PyModule __builtins__
        Module->>Builtins: dict()
    else Module has direct builtins
        Module->>Builtins: use as-is
    else No __builtins__
        VM->>Builtins: use vm.builtins.dict()
    end
    VM->>Frame: new(builtins: PyObjectRef)
    Frame->>Client: Frame ready (builtins: PyObjectRef)
    Client->>Frame: load_global_or_builtin(name)
    alt builtins is dict
        Frame->>Builtins: fast path dict lookup
    else builtins is non-dict
        Frame->>Builtins: generic path getattr
    end
    Builtins->>Client: value or error

sequenceDiagram
    participant Compiler
    participant Parser
    participant Validator
    participant AST_Module

    Compiler->>Parser: parse(source, mode, feature_version)
    Parser->>Parser: apply target_version defaults
    Parser->>AST_Module: emit AST nodes
    alt mode == "single"
        Validator->>AST_Module: wrap_interactive(Module → Interactive)
    end
    Validator->>Validator: validate_mod(module)
    Validator->>Validator: check context consistency, body non-empty, patterns
    alt validation fails
        Validator->>Compiler: PyResult error
    else validation succeeds
        Validator->>Compiler: validated AST
    end
    Compiler->>Compiler: compile to code object

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

Possibly related PRs

correct builtins type under function #6745: Modifies how function/frame builtins are typed and resolved, changing builtins from PyDictRef to PyObjectRef and adjusting PyFunction construction to handle module-or-dict builtins.
Pyfunction builtins and constructor #5823: Modifies PyFunction builtins handling and constructor, updating how the builtins context is captured and propagated through function creation.
Update _ast_unparse and fix unparse, type_comments #6961: Directly overlaps with AST implementation changes including ast parameter handling, pyast macro generation, statement range computation, and compile() mode/type_comment behavior.

Suggested reviewers

ShaharNaveh
coolreader18

Poem

🐰 Builtins dance from dict to object, so flexible and free,
AST nodes validate their forms, a symphony of three,
F-strings merge their secrets, escape sequences show their way,
A rabbit hops through Frame's new paths, bringing order to the fray! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.72% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The PR title 'Implement more ast features' is generic and vague, failing to convey specific details about the substantial changes made across the AST module and related components.	Consider using a more specific title that highlights the primary changes, such as 'Add Python AST validation, interning, and builtins refactoring' or similar to better reflect the scope of modifications.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-02-03T15:44:17Z

Code has been automatically formatted

The code in this PR has been formatted using:

cargo fmt --all
Please pull the latest changes before pushing again:

git pull origin ast

github-actions · 2026-02-05T02:07:44Z

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] lib: cpython/Lib/ast.py
[x] lib: cpython/Lib/_ast_unparse.py
[x] test: cpython/Lib/test/test_unparse.py
[x] test: cpython/Lib/test/test_type_comments.py (TODO: 15)

dependencies:

ast

dependent tests: (48 tests)

[x] test: cpython/Lib/test/test_builtin.py (TODO: 25)

dependencies:

dependent tests: (no tests depend on builtin)

[ ] test: cpython/Lib/test/test_funcattrs.py (TODO: 5)

dependencies:

dependent tests: (no tests depend on funcattrs)

[ ] lib: cpython/Lib/traceback.py
[ ] test: cpython/Lib/test/test_traceback.py (TODO: 48)

dependencies:

traceback (native: collections.abc, itertools, sys)
- codeop
- io (native: _io, _thread, errno, sys)
- tokenize (native: _tokenize, builtins, itertools, sys)
- _colorize, contextlib, keyword, linecache, textwrap, warnings

dependent tests: (32 tests)

traceback: test_asyncio test_builtin test_code_module test_contextlib test_contextlib_async test_coroutines test_importlib test_listcomps test_pyexpat test_setcomps test_socket test_ssl test_subprocess test_threadedtempfile test_threading test_traceback test_unittest test_with test_zipimport
- code:
  - sqlite3.main: test_sqlite3
- concurrent.futures.process: test_concurrent_futures
- logging.config: test_logging
- py_compile: test_argparse test_cmd_line_script test_compileall test_importlib test_multiprocessing_main_handling test_py_compile test_runpy
- pydoc: test_enum
  - xmlrpc.server: test_docxmlrpc test_xmlrpc

Legend:

[+] path exists in CPython
[x] up-to-date, [ ] outdated

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@crates/vm/src/frame.rs`:
- Around line 1209-1234: The LOAD_BUILD_CLASS handler currently assumes
__builtins__ is a mapping and calls get_item, which causes an uncaught TypeError
when __builtins__ is a module; change the else branch of the
Instruction::LoadBuildClass handling (the branch after
self.builtins.downcast_ref::<PyDict>()) to detect module-style builtins
(downcast_ref::<PyModule>() or equivalent) and call get_attr(identifier!(vm,
__build_class__), vm) for modules, translating AttributeError into a NameError
the same way you handle KeyError for mappings; keep the existing get_item path
for dict/mapping objects. Apply the same fix to the slow-path global/builtin
name lookup routine (the name-lookup code referenced in the review) so it uses
get_attr for module builtins and preserves existing error translation logic for
mapping builtins.

In `@crates/vm/src/stdlib/ast/string.rs`:
- Around line 50-134: The merge logic currently keeps the first constant's
TextRange when collapsing adjacent string literals, so location info becomes the
first-span-only; update normalize_joined_str_parts and
normalize_template_str_parts so that when you append a new constant (in the
branches handling JoinedStrPart::Constant and TemplateStrPart::Constant) you
also extend the pending range to cover both spans (e.g., set pending.2 =
TextRange::new(pending.start(), constant.range.end()) or use the project's
helper like TextRange::cover/join to set the pending range to span from the
original pending start to constant.range.end), and ensure
push_joined_str_literal and push_template_str_literal use that extended range
when building Constant::new_str; apply the identical change for both joined and
template paths so the merged literal range extends to the last fragment.

In `@crates/vm/src/stdlib/ast/validate.rs`:
- Around line 349-354: The NamedExpr handler only checks that named.target is an
ast::Expr::Name but doesn't validate it in Store context; after the existing
type check for ast::Expr::Named (and the matches!(&*named.target,
ast::Expr::Name(_)) check), call validate_expr(vm, &named.target,
ast::ExprContext::Store) to validate the target as an assignment target and
propagate any error, then validate the value with validate_expr(vm,
&named.value, ast::ExprContext::Load) as already done; keep the existing type
error message if the target isn't a Name.

🧹 Nitpick comments (3)

crates/vm/src/stdlib/ast.rs (2)
398-470: Consider handling edge cases in func_type parsing.

The parse_func_type function has good structure but a few observations:

Lines 405-406: optimize and target_version are unused (explicitly ignored with let _ =). This is intentional but worth noting.

Line 437: When extracting right, the offset split_at + 2 assumes -> is always 2 bytes, which is correct for ASCII but the comment/documentation should clarify this.

Lines 457-461: The match arm for other expression types (other => vec![other]) is a catch-all that may accept invalid func_type syntax silently.
💡 Optional: Add validation for unexpected expression types
     let argtypes: Vec<ast::Expr> = match arg_expr {
         ast::Expr::Tuple(tup) => tup.elts,
         ast::Expr::Name(_) | ast::Expr::Subscript(_) | ast::Expr::Attribute(_) => vec![arg_expr],
-        other => vec![other],
+        // CPython accepts any expression as a type annotation
+        other => vec![other],
     };
472-503: Minor: redundant assignment on Line 487.

The line let tag = if tag.is_empty() { "" } else { tag }; is a no-op since tag already equals itself in both branches.
♻️ Simplify redundant conditional
         let tag = rest.trim_start();
-        let tag = if tag.is_empty() { "" } else { tag };
         let node = NodeAst
crates/vm/src/stdlib/ast/python.rs (1)
433-513: Redundant setup in module_exec.

The module_exec function (lines 440-477) repeats the same setup that extend_class already performs in NodeAst. This appears to be setting up the AST class that's registered in the module, which may be a different type object than the one defined via #[pyattr].

If this duplication is intentional (to handle both the pyattr-registered type and the module-registered type), consider adding a comment explaining why.
💡 Add clarifying comment
     pub(crate) fn module_exec(
         vm: &VirtualMachine,
         module: &Py<crate::builtins::PyModule>,
     ) -> PyResult<()> {
         __module_exec(vm, module);
         super::super::pyast::extend_module_nodes(vm, module);

+        // The AST class registered in extend_module_nodes needs its attributes
+        // and methods configured separately from the #[pyattr] NodeAst type.
         let ast_type = module
             .get_attr("AST", vm)?

coderabbitai · 2026-02-05T04:49:49Z

crates/vm/src/frame.rs

            Instruction::LoadBuildClass => {
-                self.push_value(vm.builtins.get_attr(identifier!(vm, __build_class__), vm)?);
+                let build_class =
+                    if let Some(builtins_dict) = self.builtins.downcast_ref::<PyDict>() {
+                        builtins_dict
+                            .get_item_opt(identifier!(vm, __build_class__), vm)?
+                            .ok_or_else(|| {
+                                vm.new_name_error(
+                                    "__build_class__ not found".to_owned(),
+                                    identifier!(vm, __build_class__).to_owned(),
+                                )
+                            })?
+                    } else {
+                        self.builtins
+                            .get_item(identifier!(vm, __build_class__), vm)
+                            .map_err(|e| {
+                                if e.fast_isinstance(vm.ctx.exceptions.key_error) {
+                                    vm.new_name_error(
+                                        "__build_class__ not found".to_owned(),
+                                        identifier!(vm, __build_class__).to_owned(),
+                                    )
+                                } else {
+                                    e
+                                }
+                            })?
+                    };
+                self.push_value(build_class);


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cd crates/vm && wc -l src/frame.rs

Repository: RustPython/RustPython

Length of output: 82

🏁 Script executed:

cd crates/vm && sed -n '1209,1234p' src/frame.rs

Repository: RustPython/RustPython

Length of output: 1413

🏁 Script executed:

cd crates/vm && sed -n '2151,2169p' src/frame.rs

Repository: RustPython/RustPython

Length of output: 927

🏁 Script executed:

cd crates/vm && rg "get_attr" --type rust -B2 -A2 | head -50

Repository: RustPython/RustPython

Length of output: 2773

🏁 Script executed:

cd crates/vm && rg "module_type" --type rust | head -20

Repository: RustPython/RustPython

Length of output: 680

🏁 Script executed:

cd crates/vm && rg "\.class\(\)" --type rust -B1 -A1 | head -30

Repository: RustPython/RustPython

Length of output: 1642

🏁 Script executed:

cd crates/vm && rg "builtins.*module" --type rust -B2 -A2 | head -40

Repository: RustPython/RustPython

Length of output: 2296

🏁 Script executed:

cd crates/vm && rg "get_item" src/frame.rs -B3 -A3 | head -60

Repository: RustPython/RustPython

Length of output: 2874

🏁 Script executed:

cd crates/vm && grep -n "attribute_error\|key_error" src/frame.rs | head -20

Repository: RustPython/RustPython

Length of output: 1161

🏁 Script executed:

cd crates && find . -name "*.rs" -type f -exec grep -l "PyModule" {} \; | grep -E "builtins|module" | head -5

Repository: RustPython/RustPython

Length of output: 205

🏁 Script executed:

cd crates/vm && sed -n '1,100p' src/builtins/module.rs

Repository: RustPython/RustPython

Length of output: 3283

🏁 Script executed:

cd crates/vm && rg "__getitem__" src/builtins/module.rs -B3 -A5

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

cd crates/vm && rg "get_item\|get_attr" src/builtins/module.rs -B2 -A2 | head -40

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

cd crates/vm && sed -n '100,300p' src/builtins/module.rs

Repository: RustPython/RustPython

Length of output: 7033

🏁 Script executed:

cd crates/vm && rg "GetAttr\|Initializer" src/builtins/module.rs -B2 -A8 | head -50

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

cd crates/vm && python3 << 'EOF' import sys # In Python, modules don't support __getitem__ # They support __getattr__ via attribute access # Check the behavior: mod = sys try: mod["path"] # This will fail except TypeError as e: print(f"__getitem__ on module raises: {type(e).__name__}: {e}") try: getattr(mod, "path") # This works print("__getattr__ on module works") except AttributeError as e: print(f"__getattr__ raises: {e}") try: getattr(mod, "nonexistent_attr") except AttributeError as e: print(f"__getattr__ for missing attr raises AttributeError: {type(e).__name__}") EOF

Repository: RustPython/RustPython

Length of output: 237

🏁 Script executed:

cd crates/vm && rg "fn get_item" src/builtins/*.rs -A5 | head -60

Repository: RustPython/RustPython

Length of output: 1137

🏁 Script executed:

cd crates/vm && rg "object.get_item\|obj.get_item" src/ --type rust -B2 -A3 | head -50

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

cd crates && find . -name "*.rs" -type f -exec grep -l "fn get_item" {} \; | head -10

Repository: RustPython/RustPython

Length of output: 205

🏁 Script executed:

cd crates/vm && sed -n '1,100p' src/protocol/object.rs

Repository: RustPython/RustPython

Length of output: 3552

🏁 Script executed:

cd crates/vm && rg "pub fn get_item" src/protocol/object.rs -A10 | head -40

Repository: RustPython/RustPython

Length of output: 530

🏁 Script executed:

cd crates/vm && grep -n "get_item" src/protocol/object.rs | head -20

Repository: RustPython/RustPython

Length of output: 294

🏁 Script executed:

cd crates/vm && sed -n '686,720p' src/protocol/object.rs

Repository: RustPython/RustPython

Length of output: 1504

🏁 Script executed:

cd crates/vm && rg "try_mapping\|try_sequence" src/protocol/object.rs -B2 -A8 | head -50

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

cd crates/vm && rg "\.class\(\)\.is\(vm\.ctx\.types\." src/ --type rust -B1 -A1 | head -40

Repository: RustPython/RustPython

Length of output: 2395

🏁 Script executed:

cd crates/vm && rg "get_attr.*vm\)" src/frame.rs -B2 -A2 | head -30

Repository: RustPython/RustPython

Length of output: 1579

🏁 Script executed:

cd crates/vm && cargo clippy --lib 2>&1 | grep -A5 "frame.rs" | head -30

Repository: RustPython/RustPython

Length of output: 151

🏁 Script executed:

cd crates/vm && cargo fmt --check -- src/frame.rs 2>&1 | head -20

Repository: RustPython/RustPython

Length of output: 604

Handle module-style __builtins__ lookup to avoid TypeError.

When __builtins__ is a module (the common case), calling get_item raises TypeError, which is not caught by the code expecting KeyError. This breaks LOAD_BUILD_CLASS and global/builtin name resolution.

For modules, use get_attr instead (which raises AttributeError on missing attributes, translatable to NameError). Keep get_item for dict/mapping objects.

This applies to both the LOAD_BUILD_CLASS instruction (lines 1209-1234) and the slow-path name lookup (lines 2151-2169).

Suggested fix

@@ - } else { - self.builtins - .get_item(identifier!(vm, __build_class__), vm) - .map_err(|e| { - if e.fast_isinstance(vm.ctx.exceptions.key_error) { - vm.new_name_error( - "__build_class__ not found".to_owned(), - identifier!(vm, __build_class__).to_owned(), - ) - } else { - e - } - })? - }; + } else if self.builtins.class().is(vm.ctx.types.module_type) { + self.builtins + .get_attr(identifier!(vm, __build_class__), vm) + .map_err(|e| { + if e.fast_isinstance(vm.ctx.exceptions.attribute_error) { + vm.new_name_error( + "__build_class__ not found".to_owned(), + identifier!(vm, __build_class__).to_owned(), + ) + } else { + e + } + })? + } else { + self.builtins + .get_item(identifier!(vm, __build_class__), vm) + .map_err(|e| { + if e.fast_isinstance(vm.ctx.exceptions.key_error) { + vm.new_name_error( + "__build_class__ not found".to_owned(), + identifier!(vm, __build_class__).to_owned(), + ) + } else { + e + } + })? + }; @@ - } else { - // Slow path: builtins is not a dict, use generic __getitem__ - if let Some(value) = self.globals.get_item_opt(name, vm)? { - return Ok(value); - } - self.builtins.get_item(name, vm).map_err(|e| { - if e.fast_isinstance(vm.ctx.exceptions.key_error) { - vm.new_name_error(format!("name '{name}' is not defined"), name.to_owned()) - } else { - e - } - }) - } + } else { + // Slow path: builtins is not a dict + if let Some(value) = self.globals.get_item_opt(name, vm)? { + return Ok(value); + } + if self.builtins.class().is(vm.ctx.types.module_type) { + self.builtins.get_attr(name, vm).map_err(|e| { + if e.fast_isinstance(vm.ctx.exceptions.attribute_error) { + vm.new_name_error(format!("name '{name}' is not defined"), name.to_owned()) + } else { + e + } + }) + } else { + self.builtins.get_item(name, vm).map_err(|e| { + if e.fast_isinstance(vm.ctx.exceptions.key_error) { + vm.new_name_error(format!("name '{name}' is not defined"), name.to_owned()) + } else { + e + } + }) + } + }

🤖 Prompt for AI Agents

In `@crates/vm/src/frame.rs` around lines 1209 - 1234, The LOAD_BUILD_CLASS handler currently assumes __builtins__ is a mapping and calls get_item, which causes an uncaught TypeError when __builtins__ is a module; change the else branch of the Instruction::LoadBuildClass handling (the branch after self.builtins.downcast_ref::<PyDict>()) to detect module-style builtins (downcast_ref::<PyModule>() or equivalent) and call get_attr(identifier!(vm, __build_class__), vm) for modules, translating AttributeError into a NameError the same way you handle KeyError for mappings; keep the existing get_item path for dict/mapping objects. Apply the same fix to the slow-path global/builtin name lookup routine (the name-lookup code referenced in the review) so it uses get_attr for module builtins and preserves existing error translation logic for mapping builtins.

coderabbitai · 2026-02-05T04:49:49Z

crates/vm/src/stdlib/ast/string.rs

+fn push_joined_str_literal(
+    output: &mut Vec<JoinedStrPart>,
+    pending: &mut Option<(String, StringLiteralPrefix, TextRange)>,
+) {
+    if let Some((value, prefix, range)) = pending.take()
+        && !value.is_empty()
+    {
+        output.push(JoinedStrPart::Constant(Constant::new_str(
+            value, prefix, range,
+        )));
+    }
+}
+
+fn normalize_joined_str_parts(values: Vec<JoinedStrPart>) -> Vec<JoinedStrPart> {
+    let mut output = Vec::with_capacity(values.len());
+    let mut pending: Option<(String, StringLiteralPrefix, TextRange)> = None;
+
+    for part in values {
+        match part {
+            JoinedStrPart::Constant(constant) => {
+                let ConstantLiteral::Str { value, prefix } = constant.value else {
+                    push_joined_str_literal(&mut output, &mut pending);
+                    output.push(JoinedStrPart::Constant(constant));
+                    continue;
+                };
+                let value: String = value.into();
+                if let Some((pending_value, _, _)) = pending.as_mut() {
+                    pending_value.push_str(&value);
+                } else {
+                    pending = Some((value, prefix, constant.range));
+                }
+            }
+            JoinedStrPart::FormattedValue(value) => {
+                push_joined_str_literal(&mut output, &mut pending);
+                output.push(JoinedStrPart::FormattedValue(value));
+            }
+        }
+    }
+
+    push_joined_str_literal(&mut output, &mut pending);
+    output
+}
+
+fn push_template_str_literal(
+    output: &mut Vec<TemplateStrPart>,
+    pending: &mut Option<(String, StringLiteralPrefix, TextRange)>,
+) {
+    if let Some((value, prefix, range)) = pending.take()
+        && !value.is_empty()
+    {
+        output.push(TemplateStrPart::Constant(Constant::new_str(
+            value, prefix, range,
+        )));
+    }
+}
+
+fn normalize_template_str_parts(values: Vec<TemplateStrPart>) -> Vec<TemplateStrPart> {
+    let mut output = Vec::with_capacity(values.len());
+    let mut pending: Option<(String, StringLiteralPrefix, TextRange)> = None;
+
+    for part in values {
+        match part {
+            TemplateStrPart::Constant(constant) => {
+                let ConstantLiteral::Str { value, prefix } = constant.value else {
+                    push_template_str_literal(&mut output, &mut pending);
+                    output.push(TemplateStrPart::Constant(constant));
+                    continue;
+                };
+                let value: String = value.into();
+                if let Some((pending_value, _, _)) = pending.as_mut() {
+                    pending_value.push_str(&value);
+                } else {
+                    pending = Some((value, prefix, constant.range));
+                }
+            }
+            TemplateStrPart::Interpolation(value) => {
+                push_template_str_literal(&mut output, &mut pending);
+                output.push(TemplateStrPart::Interpolation(value));
+            }
+        }
+    }
+
+    push_template_str_literal(&mut output, &mut pending);
+    output
+}


⚠️ Potential issue | 🟡 Minor

Merged literal ranges should extend to the last fragment.
When adjacent constants are collapsed, the resulting range keeps only the first span; this can skew location info for AST consumers. Consider extending the range while merging.

🛠️ Suggested fix

- if let Some((pending_value, _, _)) = pending.as_mut() { - pending_value.push_str(&value); - } else { - pending = Some((value, prefix, constant.range)); - } + if let Some((pending_value, _, pending_range)) = pending.as_mut() { + pending_value.push_str(&value); + *pending_range = TextRange::new(pending_range.start(), constant.range.end()); + } else { + pending = Some((value, prefix, constant.range)); + } @@ - if let Some((pending_value, _, _)) = pending.as_mut() { - pending_value.push_str(&value); - } else { - pending = Some((value, prefix, constant.range)); - } + if let Some((pending_value, _, pending_range)) = pending.as_mut() { + pending_value.push_str(&value); + *pending_range = TextRange::new(pending_range.start(), constant.range.end()); + } else { + pending = Some((value, prefix, constant.range)); + }

🤖 Prompt for AI Agents

In `@crates/vm/src/stdlib/ast/string.rs` around lines 50 - 134, The merge logic currently keeps the first constant's TextRange when collapsing adjacent string literals, so location info becomes the first-span-only; update normalize_joined_str_parts and normalize_template_str_parts so that when you append a new constant (in the branches handling JoinedStrPart::Constant and TemplateStrPart::Constant) you also extend the pending range to cover both spans (e.g., set pending.2 = TextRange::new(pending.start(), constant.range.end()) or use the project's helper like TextRange::cover/join to set the pending range to span from the original pending start to constant.range.end), and ensure push_joined_str_literal and push_template_str_literal use that extended range when building Constant::new_str; apply the identical change for both joined and template paths so the merged literal range extends to the last fragment.

coderabbitai · 2026-02-05T04:49:50Z

crates/vm/src/stdlib/ast/validate.rs

+        ast::Expr::Named(named) => {
+            if !matches!(&*named.target, ast::Expr::Name(_)) {
+                return Err(vm.new_type_error("NamedExpr target must be a Name".to_owned()));
+            }
+            validate_expr(vm, &named.value, ast::ExprContext::Load)
+        }


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Python AST NamedExpr target Store context validation requirement

💡 Result:

In CPython’s ast, a NamedExpr (the walrus operator) is treated like an assignment target:

NamedExpr.target must be an ast.Name node (not Attribute, Subscript, etc.). The compiler’s AST validator explicitly errors otherwise: “NamedExpr target must be a Name”. [3]

That ast.Name must be in Store context (because it’s being assigned to). This is what the parser produces, and what you must build if you construct the AST yourself: Name(id='x', ctx=Store()). [1]

More generally, assignment targets use Store context (vs Load for reading a value, Del for del). [2]

So, when creating a walrus AST manually, do:

ast.NamedExpr( target=ast.Name(id="x", ctx=ast.Store()), value=... )

Validate NamedExpr targets in Store context.

Python's AST validator requires that NamedExpr.target (the walrus operator) be a Name node in Store context, since it represents an assignment target. The current check only enforces the target is a Name, but does not validate the context. Add context validation to reject invalid ASTs:

Proposed fix

ast::Expr::Named(named) => { if !matches!(&*named.target, ast::Expr::Name(_)) { return Err(vm.new_type_error("NamedExpr target must be a Name".to_owned())); } + validate_expr(vm, &named.target, ast::ExprContext::Store)?; validate_expr(vm, &named.value, ast::ExprContext::Load) }

🤖 Prompt for AI Agents

In `@crates/vm/src/stdlib/ast/validate.rs` around lines 349 - 354, The NamedExpr handler only checks that named.target is an ast::Expr::Name but doesn't validate it in Store context; after the existing type check for ast::Expr::Named (and the matches!(&*named.target, ast::Expr::Name(_)) check), call validate_expr(vm, &named.target, ast::ExprContext::Store) to validate the target as an assignment target and propagate any error, then validate the value with validate_expr(vm, &named.value, ast::ExprContext::Load) as already done; keep the existing type error message if the target isn't a Name.

youknowone force-pushed the ast branch 6 times, most recently from 395cffb to 99862cd Compare February 4, 2026 14:45

youknowone added 10 commits February 5, 2026 10:21

ast __annotations__

30848ec

Implement more ast

ec181d5

optimize

53d5d6c

more macro

95421a9

tstring unparse

32b9eb3

fix win

6486a3c

dict

1d1bed7

fix more

bb9b56a

align validation

29c6672

ast

ebe063b

youknowone force-pushed the ast branch from 99862cd to ebe063b Compare February 5, 2026 02:07

youknowone marked this pull request as ready for review February 5, 2026 04:38

coderabbitai bot reviewed Feb 5, 2026

View reviewed changes

youknowone changed the title ~~ast~~ Implement more ast features Feb 5, 2026

youknowone merged commit 684e880 into RustPython:main Feb 5, 2026
14 checks passed

youknowone deleted the ast branch February 5, 2026 07:20

coderabbitai bot mentioned this pull request Feb 6, 2026

Refactor _ast module for clarity #7031

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement more ast features#6986

Implement more ast features#6986
youknowone merged 10 commits intoRustPython:mainfrom
youknowone:ast

youknowone commented Feb 3, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 3, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

github-actions bot commented Feb 3, 2026

Uh oh!

github-actions bot commented Feb 5, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 5, 2026

Uh oh!

coderabbitai bot Feb 5, 2026

Uh oh!

coderabbitai bot Feb 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

youknowone commented Feb 3, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

github-actions bot commented Feb 3, 2026

Uh oh!

github-actions bot commented Feb 5, 2026

📦 Library Dependencies

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

youknowone commented Feb 3, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 3, 2026 •

edited

Loading