-
Notifications
You must be signed in to change notification settings - Fork 38
feat(c/sedona-geos): Implement ST_ExteriorRing #409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@petern48 can you please review my PR,thank you |
petern48
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. While it seems to work, we need to add some other test cases and do a bit of revising to clean up some unnecessary complexity. It's close, not far off from where it needs to be for us to merge it.
| "POLYGON Z ((0 0 1, 1 0 1, 1 1 1, 0 1 1, 0 0 1))", | ||
| "LINESTRING Z (0 0 1, 1 0 1, 1 1 1, 0 1 1, 0 0 1)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add POLYGON M and POLYGON ZM cases? I think your current code might fail these, at the moment, since I see logic for has_z instead of using our usual pattern.
You can Ctrl + f in this test_functions.py file and search for POLYGON M and POLYGON ZM to find examples of test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can I ask how you figured out we have to use the polygon M and polygon ZM cases for this? I cannot see it in the documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When it comes to geospatial knowledge. You just learn overtime that there's four possible dimensions. X and Y (required), and Z and M (both optional). These cases always existed, but they were less interesting cases for your previous functions which only returned counts of rings. Whether an M or Z dimension existed or not is unlikely to fail for those "counting" functions. We could totally add them, but they were less interesting or unlikely to break things.
For this ST_ExteriorRing, it doesn't just count rings, it returns them. Specifically, the returned geometry should also have Z and M dimensions if their inputs did. It's certainly possible for underlying implementations to forget to implement Z and M support. In that case, the function might right LINESTRING except missing the Z and M dimensions. That's why Z and M tests are more interesting for this function, hence we want to test them and make sure it doesn't test the wrong input.
In contrast, if ST_NumRings completely disregarded Z and M dimensions, it probably would still return the right count, since it doesn't really care about the exact contents of the rings, just the number of them. This intuition isn't quite something contributors would be expected to reason about, so don't worry too much if you can't follow the logic.
| ("MULTIPOINT ((0 0), (1 1))", 0), | ||
| ("MULTILINESTRING ((0 0, 1 1), (2 2, 3 3))", 0), | ||
| ("POINT EMPTY", 0), | ||
| ("MULTIPOINT EMPTY", 0), | ||
| ("LINESTRING EMPTY", 0), | ||
| ("MULTILINESTRING EMPTY", 0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this. For other readers, this is just clean up from a previous PR, as explained here
| @pytest.mark.parametrize( | ||
| ("geom", "expected"), | ||
| [ | ||
| (None, None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| (None, None), | |
| (None, None), | |
| ("POINT EMPTY", None), | |
| ("LINESTRING EMPTY", None), | |
| ("POLYGON EMPTY", 0), | |
| ("MULTIPOINT EMPTY", None), | |
| ("MULTILINESTRING EMPTY", None), | |
| ("MULTIPOLYGON EMPTY", None), | |
| ("GEOMETRYCOLLECTION EMPTY", None), |
As always, I want to test empty versions of each of the geometry types
| ("geom", "expected"), | ||
| [ | ||
| (None, None), | ||
| ("POINT (1 2)", None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ("POINT (1 2)", None), | |
| ("POINT (1 2)", None), | |
| ("MULTIPOINT ((0 0), (1 1))", None), |
I prefer to also test a non-empty version of each geometry type as well. You have all except multipoint already.
| let executor = GeosExecutor::new(arg_types, args); | ||
| let mut builder = BinaryBuilder::with_capacity( | ||
| executor.num_iterations(), | ||
| 100 * executor.num_iterations(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| 100 * executor.num_iterations(), | |
| WKB_MIN_PROBABLE_BYTES * executor.num_iterations(), |
To figure out how to set values like this, I recommend looking at other functions that return a geometry like st_centroid.rs. e.g. here
You'll need to add the import at the top too:
use sedona_geometry::wkb_factory::WKB_MIN_PROBABLE_BYTES;| let mut writer = WKBWriter::new().map_err(|e| { | ||
| DataFusionError::Execution(format!("Failed to create WKB writer: {e}")) | ||
| })?; | ||
|
|
||
| if geos_geom.has_z().unwrap_or(false) { | ||
| writer.set_output_dimension(OutputDimension::ThreeD); | ||
| } | ||
|
|
||
| let wkb = writer.write_wkb(&ring).map_err(|e| { | ||
| DataFusionError::Execution(format!("Failed to convert to wkb: {e}")) | ||
| })?; | ||
|
|
||
| builder.append_value(wkb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it seems like this works (though I think it would fail for M and ZM cases), we should follow existing code patterns instead of introducing this less concise and somewhat more complex logic.
I encourage you to look at st_centroid.rs to see how you can implement the wkb writing more cleanly. Specifically look here and how that's called here
| // Test 2: 3D Polygon (Z-coordinates) | ||
| let z_wkt_in = "POLYGON Z ((0 0 1, 1 0 1, 1 1 1, 0 1 1, 0 0 1))"; | ||
| let z_wkt_out = "LINESTRING Z (0 0 1, 1 0 1, 1 1 1, 0 1 1, 0 0 1)"; | ||
|
|
||
| let result_z = tester.invoke_scalar(z_wkt_in).unwrap(); | ||
| match result_z { | ||
| ScalarValue::Binary(Some(bytes)) | ScalarValue::LargeBinary(Some(bytes)) => { | ||
| let expected_bytes = geos_wkt_to_wkb(z_wkt_out); | ||
| assert_eq!( | ||
| bytes, expected_bytes, | ||
| "Z-coordinates were not preserved correctly" | ||
| ); | ||
| } | ||
| _ => panic!("Expected Binary result for 3D test"), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Genuine question) Do we really need this extra logic for testing POLYGON Z here? I would've thought we could use it in the input_wkt and expected arrays below.
Let's remove this complexity by moving this test case into the arrays below. If that doesn't work. Let me know.
|
Could you also fix the PR title and mention the issue in the PR description, according to my previous comment before: #386 (comment). You should generally try to do it whenever you first open the PR. If you don't do so, it increases the odds of someone else taking up that work too, which would eventually lead to someone (either you or them) going through the work just for their PR to not be merged. If you're afraid of linking the issue earlier because you're not sure you can 100% commit, it's totally okay to link it anyways and later decide to close your PR if you decide you don't want to follow through with it (e.g. if it's too hard or you lose interest). Nothing ever locks you in to being required to finish something. But please do link the issue as soon as you open the PR. For this PR, we're already close to being done anyway. |
Introduced the st_exteriorfile function, which returns the outer boundary line of a given geometry.
This issue is a part of #224