Align psuedo ops to CPython 3.14.2#6846
Conversation
|
Caution Review failedThe pull request is closed. 📝 WalkthroughWalkthroughThis PR refactors attribute and super attribute load instructions across the compiler stack. It introduces encoded opargs for LoadAttr/LoadSuperAttr operations, consolidates multiple pseudo-instruction variants into single encoded forms, removes public re-exports of encoding helpers, and updates bytecode instruction definitions accordingly. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin pseudo-ops |
136d0e1 to
c2ad5eb
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 474-483: The Arg<NameIdx> used for the LoadAttr operand currently
holds an encoded value which is confusing and error-prone; introduce a new
OpArgType called EncodedLoadAttrArg and replace uses of Arg<NameIdx> for
LoadAttr so the type signals that decoding is required. Update the emitter sites
(emit_load_attr, emit_load_attr_method) to produce EncodedLoadAttrArg (using
encode_load_attr_arg), update consumers (stack_effect implementation for
LoadAttr, disassembly, and the load_attr runtime) to accept EncodedLoadAttrArg
and call decode_load_attr_arg before extracting the name index, and adjust any
trait impls or type aliases so Arg<EncodedLoadAttrArg> integrates with the
existing OpArgType system.
In `@crates/stdlib/src/_opcode.rs`:
- Around line 154-156: Remove the public exposure of the internal macro by
deleting or disabling the #[pyattr] export for ENABLE_SPECIALIZATION_FT (and any
ENABLE_SPECIALIZATION) so they are not added to the opcode module; keep the
internal constants locally if needed but do not annotate them with #[pyattr] (or
change to private/internal symbol) and ensure only CPython-public attributes
(e.g., cmp_op, stack_effect, hascompare, opname, opmap, HAVE_ARGUMENT,
EXTENDED_ARG, hasarg, hasconst, hasname, hasjump, hasjrel, hasjabs, hasfree,
haslocal, hasexc) remain exported from the module.
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
932-967: Pseudo-op range alignment looks consistent; do a clean build to avoid stale bytecode artifacts.
Based on learnings, please run a clean build and clear any stale.pyc/target/debug/build/rustpython-*artifacts after bytecode instruction changes.
| Self::LoadFromDictOrDeref(_) => 1, | ||
| Self::StoreSubscr => -3, | ||
| Self::DeleteSubscr => -2, | ||
| Self::LoadAttr { .. } => 0, | ||
| Self::LoadAttr { idx } => { | ||
| // Stack effect depends on method flag in encoded oparg | ||
| // method=false: pop obj, push attr → effect = 0 | ||
| // method=true: pop obj, push (method, self_or_null) → effect = +1 | ||
| let (_, is_method) = decode_load_attr_arg(idx.get(arg)); | ||
| if is_method { 1 } else { 0 } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Locate all LoadAttr construction sites and encode/decode usage
echo "=== LoadAttr construction/usage sites ==="
rg -n -C2 --type=rs 'LoadAttr\s*\{|Instruction::LoadAttr\b' crates/
echo -e "\n=== encode_load_attr_arg and decode_load_attr_arg usage ==="
rg -n -C2 --type=rs 'encode_load_attr_arg|decode_load_attr_arg' crates/Repository: RustPython/RustPython
Length of output: 238
🏁 Script executed:
#!/bin/bash
# Locate all LoadAttr construction sites and encode/decode usage
echo "=== LoadAttr construction/usage sites ==="
rg -n -C2 'LoadAttr\s*\{|Instruction::LoadAttr' crates/
echo -e "\n=== encode_load_attr_arg and decode_load_attr_arg usage ==="
rg -n -C2 'encode_load_attr_arg|decode_load_attr_arg' crates/
echo -e "\n=== Files containing these terms ==="
rg -l 'LoadAttr|encode_load_attr_arg|decode_load_attr_arg' crates/Repository: RustPython/RustPython
Length of output: 7418
🏁 Script executed:
#!/bin/bash
# Search for any direct LoadAttr construction without going through emit functions
echo "=== Search for direct LoadAttr { idx construction (potential bypass) ==="
rg -n 'LoadAttr\s*\{\s*idx\s*:' crates/
echo -e "\n=== Check emit_load_attr calls to confirm all usages go through encoding ==="
rg -n 'emit_load_attr' crates/
echo -e "\n=== Check Arg type definition ==="
rg -n 'pub struct Arg' crates/ -A 5
echo -e "\n=== Check if there are any other places calling Instruction::LoadAttr ==="
rg -n 'Instruction::LoadAttr' crates/ -C 3Repository: RustPython/RustPython
Length of output: 8611
🏁 Script executed:
#!/bin/bash
# Get more context on Arg<T> and OpArgType definition
echo "=== Arg<T> definition and usage ==="
rg -n 'impl.*Arg<' crates/compiler-core/src/bytecode/instruction.rs -A 10 | head -50
echo -e "\n=== OpArgType trait definition ==="
rg -n 'pub trait OpArgType|pub enum OpArgType' crates/ -A 5
echo -e "\n=== NameIdx type definition ==="
rg -n 'type NameIdx|pub.*NameIdx' crates/compiler-core/src/bytecode/instruction.rs | head -10
echo -e "\n=== LoadAttr field type vs actual usage ==="
sed -n '167,169p' crates/compiler-core/src/bytecode/instruction.rs
echo "---"
sed -n '842,846p' crates/compiler-core/src/bytecode/instruction.rsRepository: RustPython/RustPython
Length of output: 1779
🏁 Script executed:
#!/bin/bash
# Search for NameIdx definition
echo "=== NameIdx definition ==="
rg -n 'type NameIdx\s*=|pub.*NameIdx\s*=' crates/ -A 2
echo -e "\n=== Arg.get method implementation ==="
rg -n 'pub fn get\|pub const fn get' crates/compiler-core/src/bytecode/instruction.rs -A 8
echo -e "\n=== How idx.get is used in all LoadAttr contexts ==="
rg -n 'idx\.get\(' crates/ -B 2 -A 1
echo -e "\n=== Check if NameIdx implements OpArgType ==="
rg -n 'impl.*OpArgType.*NameIdx\|NameIdx.*OpArgType' crates/Repository: RustPython/RustPython
Length of output: 13278
🏁 Script executed:
#!/bin/bash
# Find the .get() method on Arg<T>
echo "=== Search for .get method on Arg<T> ==="
rg -n '\.get\s*\(' crates/compiler-core/src/bytecode/instruction.rs -B 5 | head -40
echo -e "\n=== Full Arg implementation ==="
sed -n '1177,1230p' crates/compiler-core/src/bytecode/instruction.rs
echo -e "\n=== Check for OpArgType impl for NameIdx/u32 ==="
rg -n 'impl.*OpArgType' crates/compiler-core/src/bytecode/oparg.rs -A 10 | head -80Repository: RustPython/RustPython
Length of output: 5623
Type safety for encoded LOAD_ATTR args is currently correct but could be improved.
All 2 emission sites (emit_load_attr, emit_load_attr_method) properly use encode_load_attr_arg, and all 3 consumption sites (stack_effect, disassembly, load_attr runtime) properly decode before using the name index. However, the field type Arg<NameIdx> is misleading since it contains an encoded value, not a raw name index. Since Arg<u32>.get() just returns the raw u32 without decoding, a future maintainer could naively use it as a direct index. Consider introducing a dedicated EncodedLoadAttrArg type implementing OpArgType to make the encoding requirement explicit in the type system.
🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 474 - 483, The
Arg<NameIdx> used for the LoadAttr operand currently holds an encoded value
which is confusing and error-prone; introduce a new OpArgType called
EncodedLoadAttrArg and replace uses of Arg<NameIdx> for LoadAttr so the type
signals that decoding is required. Update the emitter sites (emit_load_attr,
emit_load_attr_method) to produce EncodedLoadAttrArg (using
encode_load_attr_arg), update consumers (stack_effect implementation for
LoadAttr, disassembly, and the load_attr runtime) to accept EncodedLoadAttrArg
and call decode_load_attr_arg before extracting the name index, and adjust any
trait impls or type aliases so Arg<EncodedLoadAttrArg> integrates with the
existing OpArgType system.
crates/codegen/src/ir.rs
Outdated
| } | ||
|
|
||
| // Real instructions are already encoded by compile.rs | ||
| if info.instr.real().is_some() { |
There was a problem hiding this comment.
I'd replace the line 214, no need to check twice for if it's real/pseudo
16cfbe1 to
1c95128
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/stdlib/src/_opcode.rs`:
- Around line 320-326: The current get_special_method_names function only
returns the four context-manager names; replace or extend that array to match
CPython's full _Py_SpecialMethods list so callers receive the complete set of
special method names. Locate or create a static Rust slice/const (e.g.,
SPECIAL_METHODS or _PY_SPECIAL_METHODS) containing all CPython special method
strings, then change get_special_method_names to iterate over that slice (using
the existing .into_iter().map(|x| vm.ctx.new_str(x).into()).collect()) to
produce the Vec<PyObjectRef>; ensure the symbol name get_special_method_names
remains unchanged and the returned type stays Vec<PyObjectRef>.
| #[pyfunction] | ||
| fn get_executor(_code: PyObjectRef, vm: &VirtualMachine) -> PyResult<PyObjectRef> { | ||
| // TODO | ||
| fn get_special_method_names(vm: &VirtualMachine) -> Vec<PyObjectRef> { | ||
| ["__enter__", "__exit__", "__aenter__", "__aexit__"] | ||
| .into_iter() | ||
| .map(|x| vm.ctx.new_str(x).into()) | ||
| .collect() | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n crates/stdlib/src/_opcode.rs | sed -n '310,330p'Repository: RustPython/RustPython
Length of output: 754
🏁 Script executed:
cd crates/stdlib && cargo clippy --all-targets 2>&1 | head -100Repository: RustPython/RustPython
Length of output: 174
🏁 Script executed:
rg "get_special_method_names\|__enter__\|__exit__" crates/stdlib/src/_opcode.rs -A 2 -B 2Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
rg "get_intrinsic1_descs|get_intrinsic2_descs" crates/stdlib/src/_opcode.rs -A 6Repository: RustPython/RustPython
Length of output: 639
🏁 Script executed:
rg "class.*Manager|__enter__|__exit__|__aenter__|__aexit__" --type rs | head -30Repository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
cat -n crates/stdlib/src/_opcode.rs | sed -n '1,50p'Repository: RustPython/RustPython
Length of output: 1893
🏁 Script executed:
rg "get_special_method_names" crates/stdlib/Repository: RustPython/RustPython
Length of output: 168
🏁 Script executed:
rg "__enter__|__exit__|__aenter__|__aexit__" crates/stdlib/src/_opcode.rsRepository: RustPython/RustPython
Length of output: 125
🏁 Script executed:
git log --oneline -10 crates/stdlib/src/_opcode.rsRepository: RustPython/RustPython
Length of output: 99
🏁 Script executed:
find crates -name "*test*" -type f | xargs grep -l "special_method\|__enter__\|__exit__" 2>/dev/null | head -5Repository: RustPython/RustPython
Length of output: 159
🏁 Script executed:
rg "def get_special_method_names|special_method" crates/stdlib/src/_opcode.rs -B 5 -A 10Repository: RustPython/RustPython
Length of output: 446
🏁 Script executed:
cat -n crates/stdlib/src/_opcode.rs | sed -n '320,350p'Repository: RustPython/RustPython
Length of output: 776
🌐 Web query:
CPython _opcode module get_special_method_names function
💡 Result:
In CPython, _opcode.get_special_method_names() is a CPython-internal helper that returns a Python list of the interpreter’s “special method” (dunder) names.
Implementation (CPython 3.14): it allocates a list of length SPECIAL_MAX + 1 and fills it with _Py_SpecialMethods[i].name for i = 0..SPECIAL_MAX. The returned list index corresponds to CPython’s internal “special method” id. [1]
API docstring in the generated clinic header: “Return a list of special method names.” [2]
Use:
import _opcode
names = _opcode.get_special_method_names()Sources: [1] [2]
🏁 Script executed:
rg "ASYNC_WITH|WITH_CLEANUP|context.*manager" crates/ --type-list | head -10Repository: RustPython/RustPython
Length of output: 454
🏁 Script executed:
cat -n crates/stdlib/src/_opcode.rs | sed -n '280,330p'Repository: RustPython/RustPython
Length of output: 2069
Incomplete implementation: Returns only context manager special methods, not all special methods.
The function syntactically follows the established pattern (array → .into_iter().map().collect()), but does not match CPython's get_special_method_names(). CPython's implementation returns a comprehensive list of all special method names (indexed from an internal _Py_SpecialMethods array), while this implementation returns only the four context manager methods: __enter__, __exit__, __aenter__, __aexit__. This implementation is incomplete relative to CPython and may cause compatibility issues if callers expect the full set of special method names.
🤖 Prompt for AI Agents
In `@crates/stdlib/src/_opcode.rs` around lines 320 - 326, The current
get_special_method_names function only returns the four context-manager names;
replace or extend that array to match CPython's full _Py_SpecialMethods list so
callers receive the complete set of special method names. Locate or create a
static Rust slice/const (e.g., SPECIAL_METHODS or _PY_SPECIAL_METHODS)
containing all CPython special method strings, then change
get_special_method_names to iterate over that slice (using the existing
.into_iter().map(|x| vm.ctx.new_str(x).into()).collect()) to produce the
Vec<PyObjectRef>; ensure the symbol name get_special_method_names remains
unchanged and the returned type stays Vec<PyObjectRef>.
576421e to
c0bdb9a
Compare
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [+] lib: cpython/Lib/_opcode_metadata.py dependencies:
dependent tests: (13 tests) Legend:
|
Summary by CodeRabbit
Release Notes
New Features
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.