Fix socket.getaddrinfo and upgrade socket to 3.13.11#6564
Fix socket.getaddrinfo and upgrade socket to 3.13.11#6564youknowone merged 7 commits intoRustPython:mainfrom
Conversation
📝 WalkthroughWalkthroughThis pull request extends socket functionality by adding Windows WinSock API support, introducing Linux AF_CAN and AF_ALG address family handling, and implementing new socket methods including Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
1b3b15b to
02ae8cc
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
crates/stdlib/src/socket.rs (2)
146-220: Consider consolidating the cfg attributes.The repeated
#[cfg(target_os = "linux")]on each constant is verbose. Consider grouping them under a single cfg block for maintainability:#[cfg(target_os = "linux")] mod can_bcm_constants { pub const CAN_BCM_TX_SETUP: i32 = 1; // ... } #[cfg(target_os = "linux")] #[pyattr] use can_bcm_constants::*;However, if the current style matches the rest of the codebase or is required by the
pyattrmacro, this is fine as-is.
1891-1906: Potential UB: casting const pointer to mut formsg_iov.Line 1901 casts
iovecs.as_ptr()(const pointer) to*mut _formsg_iov. While this pattern is common and the kernel won't actually mutate the buffers forsendmsg, it's technically undefined behavior in Rust to cast away constness if the memory is accessed.Consider using
as_mut_ptr()with a mutable binding:🔎 Suggested fix
- let iovecs: Vec<libc::iovec> = buffers + let mut iovecs: Vec<libc::iovec> = buffers .iter() .map(|buf| libc::iovec { iov_base: buf.as_ptr() as *mut _, iov_len: buf.len(), }) .collect(); // Set up msghdr let mut msghdr: libc::msghdr = unsafe { std::mem::zeroed() }; - msghdr.msg_iov = iovecs.as_ptr() as *mut _; + msghdr.msg_iov = iovecs.as_mut_ptr();
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
Lib/socket.pyis excluded by!Lib/**Lib/test/test_socket.pyis excluded by!Lib/**
📒 Files selected for processing (1)
crates/stdlib/src/socket.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/stdlib/src/socket.rs
🧠 Learnings (2)
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/stdlib/src/socket.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/stdlib/src/socket.rs
🔇 Additional comments (12)
crates/stdlib/src/socket.rs (12)
66-89: LGTM!The Windows WinSock API additions are correctly imported, and
FROM_PROTOCOL_INFO = -1matches the documented WinSock constant value.
991-1054: LGTM!The logic correctly masks out
SOCK_NONBLOCKandSOCK_CLOEXECflags from the stored socket type, ensuring consistent cross-platform behavior. The timeout initialization properly handles non-blocking sockets by setting timeout to 0.0.
1252-1314: LGTM!The AF_ALG address handling correctly validates string lengths against the
sockaddr_algfield sizes and properly initializes the zeroed structure before copying the type and name strings.
1402-1478: LGTM!The Windows socket reconstruction from
share()bytes is correctly implemented with proper size validation,WSASocketWusage, and error handling. The handle inheritance is properly disabled after creation.
1921-2009: LGTM!The
recvmsgimplementation correctly handles buffer allocation, validates input parameters, and properly parses the received data and ancillary data. The return format matches Python'ssocket.recvmsg()specification.
2011-2045: LGTM!The
parse_ancillary_datafunction correctly handles control message iteration with proper truncation handling whenMSG_CTRUNCoccurs. The data length calculation respects both thecmsg_lenfield and the actual available buffer space.
2088-2093: LGTM!Correctly using
CMSG_LENfor thecmsg_lenfield (actual message length) whileCMSG_SPACEis used for buffer allocation (includes padding). This matches the proper CMSG construction pattern.
2111-2141: LGTM with a note on close() error handling.The
__del__implementation correctly emitsResourceWarningfor unclosed sockets, matching Python's behavior. Theclose()error is appropriately ignored withlet _ =since destructors shouldn't propagate errors.
2305-2408: LGTM!The Windows
ioctlimplementation correctly handles the supported IOCTL commands with proper input validation and error handling. TheTcpKeepalivestruct is correctly marked#[repr(C)]for FFI compatibility.
2410-2433: LGTM!The Windows
sharemethod correctly usesWSADuplicateSocketWto create a shareable socket descriptor and returns the protocol info as bytes, which can be used by another process to recreate the socket.
2801-2825: LGTM!The
getaddrinfofunction correctly applies IDNA encoding to hostnames for internationalized domain name support, and properly validates port strings as UTF-8. This matches Python's behavior for hostname encoding.
234-275: These values are correctly defined per RFC 3542 standard and are the appropriate solution for macOS, where they are not available in the libc crate.
| #[cfg(target_os = "linux")] | ||
| c::AF_CAN => { | ||
| let tuple: PyTupleRef = addr.downcast().map_err(|obj| { | ||
| vm.new_type_error(format!( | ||
| "{}(): AF_CAN address must be tuple, not {}", | ||
| caller, | ||
| obj.class().name() | ||
| )) | ||
| })?; | ||
| if tuple.is_empty() || tuple.len() > 2 { | ||
| return Err(vm | ||
| .new_type_error( | ||
| "AF_CAN address must be a tuple (interface,) or (interface, addr)", | ||
| ) | ||
| .into()); | ||
| } | ||
| let interface: PyStrRef = tuple[0].clone().downcast().map_err(|obj| { | ||
| vm.new_type_error(format!( | ||
| "{}(): AF_CAN interface must be str, not {}", | ||
| caller, | ||
| obj.class().name() | ||
| )) | ||
| })?; | ||
| let ifname = interface.as_str(); | ||
|
|
||
| // Get interface index | ||
| let ifindex = if ifname.is_empty() { | ||
| 0 // Bind to all CAN interfaces | ||
| } else { | ||
| // Check interface name length (IFNAMSIZ is typically 16) | ||
| if ifname.len() >= 16 { | ||
| return Err(vm | ||
| .new_os_error("interface name too long".to_owned()) | ||
| .into()); | ||
| } | ||
| let cstr = std::ffi::CString::new(ifname) | ||
| .map_err(|_| vm.new_os_error("invalid interface name".to_owned()))?; | ||
| let idx = unsafe { libc::if_nametoindex(cstr.as_ptr()) }; | ||
| if idx == 0 { | ||
| return Err(io::Error::last_os_error().into()); | ||
| } | ||
| idx as i32 | ||
| }; | ||
|
|
||
| // Create sockaddr_can | ||
| let mut storage: libc::sockaddr_storage = unsafe { std::mem::zeroed() }; | ||
| let can_addr = | ||
| &mut storage as *mut libc::sockaddr_storage as *mut libc::sockaddr_can; | ||
| unsafe { | ||
| (*can_addr).can_family = libc::AF_CAN as libc::sa_family_t; | ||
| (*can_addr).can_ifindex = ifindex; | ||
| } | ||
| let storage: socket2::SockAddrStorage = unsafe { std::mem::transmute(storage) }; | ||
| Ok(unsafe { | ||
| socket2::SockAddr::new( | ||
| storage, | ||
| std::mem::size_of::<libc::sockaddr_can>() as libc::socklen_t, | ||
| ) | ||
| }) | ||
| } |
There was a problem hiding this comment.
AF_CAN: Use libc::IF_NAMESIZE and handle can_id from tuple.
- Line 1222: Use
libc::IF_NAMESIZEinstead of hardcoded16for consistency. - The error message says tuple can be
(interface, addr)butcan_idfromtuple[1]is never parsed or used.
🔎 Suggested fix for IF_NAMESIZE
- if ifname.len() >= 16 {
+ if ifname.len() >= libc::IF_NAMESIZE {Additionally, if the second tuple element (can_id) is intentionally ignored, consider updating the error message to reflect the actual accepted format, or implement proper can_id handling if it's needed for CAN_RAW sockets.
🤖 Prompt for AI Agents
In crates/stdlib/src/socket.rs around lines 1192-1251, replace the hardcoded
interface-name limit 16 with libc::IF_NAMESIZE and handle the optional second
tuple element (can_id) described in the error text: update the length check to
if ifname.len() >= libc::IF_NAMESIZE, and if the tuple has a second element
parse tuple[1] to an integer (map Python int errors to
vm.new_type_error/vm.new_value_error as appropriate), then assign that value
into the sockaddr_can (convert to the correct unsigned integer type and store it
in the can id field) before creating the SockAddr; alternatively, if you intend
to ignore can_id, change the earlier error message to only advertise
(interface,) as the accepted form instead of (interface, addr).
| #[cfg(target_os = "linux")] | ||
| { | ||
| let family = addr.family(); | ||
| if family == libc::AF_CAN as libc::sa_family_t { | ||
| // AF_CAN address: (interface_name,) or (interface_name, can_id) | ||
| let can_addr = unsafe { &*(addr.as_ptr() as *const libc::sockaddr_can) }; | ||
| let ifindex = can_addr.can_ifindex; | ||
| let ifname = if ifindex == 0 { | ||
| String::new() | ||
| } else { | ||
| let mut buf = [0u8; libc::IF_NAMESIZE]; | ||
| let ret = unsafe { | ||
| libc::if_indextoname( | ||
| ifindex as libc::c_uint, | ||
| buf.as_mut_ptr() as *mut libc::c_char, | ||
| ) | ||
| }; | ||
| if ret.is_null() { | ||
| String::new() | ||
| } else { | ||
| let nul_pos = memchr::memchr(b'\0', &buf).unwrap_or(buf.len()); | ||
| String::from_utf8_lossy(&buf[..nul_pos]).into_owned() | ||
| } | ||
| }; | ||
| return vm.ctx.new_tuple(vec![vm.ctx.new_str(ifname).into()]).into(); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, let's find the extract_address function and check its AF_CAN handling
rg -n "fn extract_address" crates/stdlib/src/socket.rs -A 50 | head -80Repository: RustPython/RustPython
Length of output: 2683
🏁 Script executed:
#!/bin/bash
# Also search for AF_CAN handling in extract_address
rg -n "AF_CAN" crates/stdlib/src/socket.rs -B 3 -A 10Repository: RustPython/RustPython
Length of output: 4019
🏁 Script executed:
#!/bin/bash
# Check the complete get_addr_tuple function around the mentioned lines
sed -n '2535,2580p' crates/stdlib/src/socket.rsRepository: RustPython/RustPython
Length of output: 2317
🏁 Script executed:
#!/bin/bash
# Search for any usage of can_id or sockaddr_can fields in the code
rg -n "can_id|sockaddr_can" crates/stdlib/src/socket.rs -B 2 -A 2Repository: RustPython/RustPython
Length of output: 1164
🏁 Script executed:
#!/bin/bash
# Check if there's any code after line 1215 that handles the second element
sed -n '1215,1240p' crates/stdlib/src/socket.rsRepository: RustPython/RustPython
Length of output: 1366
🏁 Script executed:
#!/bin/bash
# Search the entire codebase for any handling of the second AF_CAN parameter
rg -n "AF_CAN.*\[1\]|\[1\].*interface" crates/Repository: RustPython/RustPython
Length of output: 47
🌐 Web query:
CPython socket module AF_CAN address tuple format
💡 Result:
Short answer: use Python socket address tuples.
- AF_CAN (generic): (interface,) — a 1-tuple with the interface name string (e.g. ("can0",)). Use "" to bind/receive from all CAN interfaces. [1][2]
- CAN_ISOTP: (interface, rx_addr, tx_addr) — interface string plus two unsigned long CAN IDs. Example: ("can0", 0x123, 0x456). [1][2]
- CAN_J1939: (interface, name, pgn, addr) — interface string, 64-bit ECU name, 32-bit PGN, 8-bit address. [1][2]
References: Python socket docs (AF_CAN / protocol-specific tuple formats) and Linux sockaddr_can layout. [1][2]
AF_CAN: Tuple validation accepts but silently ignores second element.
extract_address at line 1201 accepts 1 or 2 element tuples with the error message "AF_CAN address must be a tuple (interface,) or (interface, addr)", but only extracts and uses tuple[0] (interface), completely ignoring any second element. This causes silent data loss: users passing a 2-element tuple will have the second value discarded without error.
CPython's AF_CAN (generic/CAN_RAW) accepts only 1-tuple (interface,). Protocol-specific variants like CAN_ISOTP use different tuple formats entirely (3-tuple) rather than a simple (interface, addr) format. Either reject 2-element tuples explicitly or implement full protocol-specific tuple support; do not silently ignore input.
🤖 Prompt for AI Agents
In crates/stdlib/src/socket.rs around lines 2535-2560 (and update
extract_address at ~line 1201), the AF_CAN handling currently accepts 1- or
2-element tuples but only uses tuple[0], silently discarding the second element;
change the validation to accept only 1-element tuples for AF_CAN and return a
clear error when a 2-element tuple is provided (or implement full
protocol-specific tuple parsing if you intend to support other CAN families).
Concretely: in extract_address ensure that for AF_CAN you reject tuples whose
length != 1 with the message "AF_CAN address must be a tuple (interface,)" and
adjust callers/tests accordingly so any provided 2-element tuple raises an error
instead of being ignored.
Summary by CodeRabbit
New Features
Improvements
✏️ Tip: You can customize this high-level summary in your review settings.