feat: Added batching to feature server /push to offline store (#5683)#5729
Conversation
|
|
||
| from feast.repo_config import FeastConfigBaseModel | ||
|
|
||
| class OfflinePushBatchingConfig(FeastConfigBaseModel): |
There was a problem hiding this comment.
i think having a config is fine but do we actually need it? we probably could have just passed these as optional args, no?
There was a problem hiding this comment.
You're right, I just noticed the FeatureLoggingConfig with 5 fields and decided that for 3 fields it would be justified to also add a config. Do you want me to refactor it to use optional args?
There was a problem hiding this comment.
i think having a config is fine but do we actually need it? we probably could have just passed these as optional args, no?
I refactored it as you wanted, so that there is no config. @franciscojavierarceo
494912d to
107bf09
Compare
c980db9 to
04dc34f
Compare
…dev#5683) Signed-off-by: Jacob Weinhold <[email protected]> fix: formatting,l int errors (feast-dev#5683) Signed-off-by: Jacob Weinhold <[email protected]>
20079eb to
1a3ccbd
Compare
…push Signed-off-by: Jacob Weinhold <[email protected]>
a60c605 to
bb299d9
Compare
sdk/python/feast/feature_server.py
Outdated
| return | ||
|
|
||
| batch_df = pd.concat(dfs, ignore_index=True) | ||
| self._buffers[key].clear() |
There was a problem hiding this comment.
Does it make sense to move clear inside try: self._store.push so that buffer gets cleared only after the write succeeds ?
There was a problem hiding this comment.
Totally makes sense, it's done. Thanks for seeing that.
sdk/python/feast/feature_server.py
Outdated
|
|
||
| # NOTE: offline writes are currently synchronous only, so we call directly | ||
| try: | ||
| self._store.push( |
There was a problem hiding this comment.
What about splitting _flush_locked into two methods: one that extracts data (with lock) and one that does I/O (without lock) ?
Something like:
- Extracting the batch data while holding the lock
- Releasing the lock before doing I/O
- Re-enqueueing data if the write fails
There was a problem hiding this comment.
Thanks for the suggestion!
I split _flush_locked into _drain_locked (extract under lock) and _flush (I/O without lock). I also added _inflight to prevent concurrent flushes per key. On failure the drained batch is re‑enqueued so we don’t drop data.
|
|
||
| # use a multi-row payload to ensure we test non-trivial dfs | ||
| resp = client.post("/push", json=push_body_many(push_mode, count=2, id_start=100)) | ||
| assert resp.status_code == 200 |
There was a problem hiding this comment.
optional but I think it's good to return 202 when batching is enabled and offline writes are involved
There was a problem hiding this comment.
Nice catch! It's done.
There was a problem hiding this comment.
Pull request overview
This PR adds configurable batching support for offline writes to the feature server's /push endpoint. The batching mechanism buffers offline writes and flushes them based on either a size threshold or time interval, improving throughput for high-volume offline push operations.
Key Changes:
- Introduced
OfflineWriteBatcherclass that manages buffered writes in a background thread - Added configuration options for batch size and interval in
BaseFeatureServerConfig - Modified
/pushendpoint logic to separate online and offline writes when batching is enabled
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/python/feast/feature_server.py | Implemented OfflineWriteBatcher class and integrated batching logic into the /push endpoint |
| sdk/python/feast/infra/feature_servers/base_config.py | Added three new configuration fields for offline push batching |
| sdk/python/tests/unit/test_feature_server.py | Added comprehensive test coverage for batching behavior across different push modes and configurations |
| docs/reference/feature-store-yaml.md | Documented the new feature_server configuration block with batching options |
| docs/reference/feature-servers/python-feature-server.md | Added user-facing documentation explaining offline write batching functionality |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| allow_registry_cache = request.allow_registry_cache | ||
| transform_on_write = request.transform_on_write | ||
|
|
||
| # Async currently only applies to online store writes (ONLINE / ONLINE_AND_OFFLINE paths) as theres no async for offline store |
There was a problem hiding this comment.
Corrected spelling: 'theres' should be 'there's'.
| # Async currently only applies to online store writes (ONLINE / ONLINE_AND_OFFLINE paths) as theres no async for offline store | |
| # Async currently only applies to online store writes (ONLINE / ONLINE_AND_OFFLINE paths) as there's no async for offline store |
| fs, enabled: bool = True, batch_size: int = 1, batch_interval_seconds: int = 60 | ||
| ): | ||
| """ | ||
| Attach a minimal feature_server.offline_push_batching config |
There was a problem hiding this comment.
The docstring has inconsistent indentation. The closing triple quotes and the description line should be aligned with the opening triple quotes for standard formatting.
| Attach a minimal feature_server.offline_push_batching config | |
| Attach a minimal feature_server.offline_push_batching config |
…-dev#5683](feast-dev#5683)) Signed-off-by: Jacob Weinhold <[email protected]>
a99a9b0 to
2747405
Compare
# [0.59.0](v0.58.0...v0.59.0) (2026-01-16) ### Bug Fixes * Add get_table_query_string_with_alias() for PostgreSQL subquery aliasing ([#5811](#5811)) ([11122ce](11122ce)) * Add hybrid online store to ONLINE_STORE_CLASS_FOR_TYPE mapping ([#5810](#5810)) ([678589b](678589b)) * Add possibility to overwrite send_receive_timeout for clickhouse offline store ([#5792](#5792)) ([59dbb33](59dbb33)) * Denial by default to all resources when no permissions set ([#5663](#5663)) ([1524f1c](1524f1c)) * Make operator include full OIDC secret in repo config ([#5676](#5676)) ([#5809](#5809)) ([a536bc2](a536bc2)) * Populate Postgres `registry.path` during `feast init` ([#5785](#5785)) ([f293ae8](f293ae8)) * **redis:** Preserve millisecond timestamp precision for Redis online store ([#5807](#5807)) ([9e3f213](9e3f213)) * Search API to return all matching tags in matched_tags field ([#5843](#5843)) ([de37f66](de37f66)) * Spark Materialization Engine Cannot Infer Schema ([#5806](#5806)) ([58d0325](58d0325)), closes [#5594](#5594) [#5594](#5594) * Support arro3 table schema with newer deltalake packages ([#5799](#5799)) ([103c5e9](103c5e9)) * Timestamp formatting and lakehouse-type connector for trino_offline_store. ([#5846](#5846)) ([c2ea7e9](c2ea7e9)) * Update model_validator to use instance method signature (Pydantic v2.12 deprecation) ([#5825](#5825)) ([3c10b6e](3c10b6e)) ### Features * Add dbt integration for importing models as FeatureViews ([#5827](#5827)) ([b997361](b997361)), closes [#3335](#3335) [#3335](#3335) [#3335](#3335) * Add GCS registry store in Go feature server ([#5818](#5818)) ([1dc2be5](1dc2be5)) * Add progress bar to CLI from feast apply ([#5867](#5867)) ([ab3562b](ab3562b)) * Add RBAC blog post to website ([#5861](#5861)) ([b1844a3](b1844a3)) * Add skip_feature_view_validation parameter to FeatureStore.apply() and plan() ([#5859](#5859)) ([5482a0e](5482a0e)) * Added batching to feature server /push to offline store ([#5683](#5683)) ([#5729](#5729)) ([ce35ce6](ce35ce6)) * Enable static artifacts for feature server that can be used in Feature Transformations ([#5787](#5787)) ([edefc3f](edefc3f)) * Improve lambda materialization engine ([#5829](#5829)) ([f6116f9](f6116f9)) * Offline Store historical features retrieval based on datetime range in Ray ([#5738](#5738)) ([e484c12](e484c12)) * Read, Save docs and chat fixes ([#5865](#5865)) ([2081b55](2081b55)) * Resolve pyarrow >21 installation with ibis-framework ([#5847](#5847)) ([8b9bb50](8b9bb50)) * Support staging for spark materialization ([#5671](#5671)) ([#5797](#5797)) ([5b787af](5b787af))
# [0.59.0](feast-dev/feast@v0.58.0...v0.59.0) (2026-01-16) ### Bug Fixes * Add get_table_query_string_with_alias() for PostgreSQL subquery aliasing ([feast-dev#5811](feast-dev#5811)) ([11122ce](feast-dev@11122ce)) * Add hybrid online store to ONLINE_STORE_CLASS_FOR_TYPE mapping ([feast-dev#5810](feast-dev#5810)) ([678589b](feast-dev@678589b)) * Add possibility to overwrite send_receive_timeout for clickhouse offline store ([feast-dev#5792](feast-dev#5792)) ([59dbb33](feast-dev@59dbb33)) * Denial by default to all resources when no permissions set ([feast-dev#5663](feast-dev#5663)) ([1524f1c](feast-dev@1524f1c)) * Make operator include full OIDC secret in repo config ([feast-dev#5676](feast-dev#5676)) ([feast-dev#5809](feast-dev#5809)) ([a536bc2](feast-dev@a536bc2)) * Populate Postgres `registry.path` during `feast init` ([feast-dev#5785](feast-dev#5785)) ([f293ae8](feast-dev@f293ae8)) * **redis:** Preserve millisecond timestamp precision for Redis online store ([feast-dev#5807](feast-dev#5807)) ([9e3f213](feast-dev@9e3f213)) * Search API to return all matching tags in matched_tags field ([feast-dev#5843](feast-dev#5843)) ([de37f66](feast-dev@de37f66)) * Spark Materialization Engine Cannot Infer Schema ([feast-dev#5806](feast-dev#5806)) ([58d0325](feast-dev@58d0325)), closes [feast-dev#5594](feast-dev#5594) [feast-dev#5594](feast-dev#5594) * Support arro3 table schema with newer deltalake packages ([feast-dev#5799](feast-dev#5799)) ([103c5e9](feast-dev@103c5e9)) * Timestamp formatting and lakehouse-type connector for trino_offline_store. ([feast-dev#5846](feast-dev#5846)) ([c2ea7e9](feast-dev@c2ea7e9)) * Update model_validator to use instance method signature (Pydantic v2.12 deprecation) ([feast-dev#5825](feast-dev#5825)) ([3c10b6e](feast-dev@3c10b6e)) ### Features * Add dbt integration for importing models as FeatureViews ([feast-dev#5827](feast-dev#5827)) ([b997361](feast-dev@b997361)), closes [feast-dev#3335](feast-dev#3335) [feast-dev#3335](feast-dev#3335) [feast-dev#3335](feast-dev#3335) * Add GCS registry store in Go feature server ([feast-dev#5818](feast-dev#5818)) ([1dc2be5](feast-dev@1dc2be5)) * Add progress bar to CLI from feast apply ([feast-dev#5867](feast-dev#5867)) ([ab3562b](feast-dev@ab3562b)) * Add RBAC blog post to website ([feast-dev#5861](feast-dev#5861)) ([b1844a3](feast-dev@b1844a3)) * Add skip_feature_view_validation parameter to FeatureStore.apply() and plan() ([feast-dev#5859](feast-dev#5859)) ([5482a0e](feast-dev@5482a0e)) * Added batching to feature server /push to offline store ([feast-dev#5683](feast-dev#5683)) ([feast-dev#5729](feast-dev#5729)) ([ce35ce6](feast-dev@ce35ce6)) * Enable static artifacts for feature server that can be used in Feature Transformations ([feast-dev#5787](feast-dev#5787)) ([edefc3f](feast-dev@edefc3f)) * Improve lambda materialization engine ([feast-dev#5829](feast-dev#5829)) ([f6116f9](feast-dev@f6116f9)) * Offline Store historical features retrieval based on datetime range in Ray ([feast-dev#5738](feast-dev#5738)) ([e484c12](feast-dev@e484c12)) * Read, Save docs and chat fixes ([feast-dev#5865](feast-dev#5865)) ([2081b55](feast-dev@2081b55)) * Resolve pyarrow >21 installation with ibis-framework ([feast-dev#5847](feast-dev#5847)) ([8b9bb50](feast-dev@8b9bb50)) * Support staging for spark materialization ([feast-dev#5671](feast-dev#5671)) ([feast-dev#5797](feast-dev#5797)) ([5b787af](feast-dev@5b787af)) Signed-off-by: Kyle DePasquale <[email protected]>
# [0.59.0](feast-dev/feast@v0.58.0...v0.59.0) (2026-01-16) ### Bug Fixes * Add get_table_query_string_with_alias() for PostgreSQL subquery aliasing ([feast-dev#5811](feast-dev#5811)) ([11122ce](feast-dev@11122ce)) * Add hybrid online store to ONLINE_STORE_CLASS_FOR_TYPE mapping ([feast-dev#5810](feast-dev#5810)) ([678589b](feast-dev@678589b)) * Add possibility to overwrite send_receive_timeout for clickhouse offline store ([feast-dev#5792](feast-dev#5792)) ([59dbb33](feast-dev@59dbb33)) * Denial by default to all resources when no permissions set ([feast-dev#5663](feast-dev#5663)) ([1524f1c](feast-dev@1524f1c)) * Make operator include full OIDC secret in repo config ([feast-dev#5676](feast-dev#5676)) ([feast-dev#5809](feast-dev#5809)) ([a536bc2](feast-dev@a536bc2)) * Populate Postgres `registry.path` during `feast init` ([feast-dev#5785](feast-dev#5785)) ([f293ae8](feast-dev@f293ae8)) * **redis:** Preserve millisecond timestamp precision for Redis online store ([feast-dev#5807](feast-dev#5807)) ([9e3f213](feast-dev@9e3f213)) * Search API to return all matching tags in matched_tags field ([feast-dev#5843](feast-dev#5843)) ([de37f66](feast-dev@de37f66)) * Spark Materialization Engine Cannot Infer Schema ([feast-dev#5806](feast-dev#5806)) ([58d0325](feast-dev@58d0325)), closes [feast-dev#5594](feast-dev#5594) [feast-dev#5594](feast-dev#5594) * Support arro3 table schema with newer deltalake packages ([feast-dev#5799](feast-dev#5799)) ([103c5e9](feast-dev@103c5e9)) * Timestamp formatting and lakehouse-type connector for trino_offline_store. ([feast-dev#5846](feast-dev#5846)) ([c2ea7e9](feast-dev@c2ea7e9)) * Update model_validator to use instance method signature (Pydantic v2.12 deprecation) ([feast-dev#5825](feast-dev#5825)) ([3c10b6e](feast-dev@3c10b6e)) ### Features * Add dbt integration for importing models as FeatureViews ([feast-dev#5827](feast-dev#5827)) ([b997361](feast-dev@b997361)), closes [feast-dev#3335](feast-dev#3335) [feast-dev#3335](feast-dev#3335) [feast-dev#3335](feast-dev#3335) * Add GCS registry store in Go feature server ([feast-dev#5818](feast-dev#5818)) ([1dc2be5](feast-dev@1dc2be5)) * Add progress bar to CLI from feast apply ([feast-dev#5867](feast-dev#5867)) ([ab3562b](feast-dev@ab3562b)) * Add RBAC blog post to website ([feast-dev#5861](feast-dev#5861)) ([b1844a3](feast-dev@b1844a3)) * Add skip_feature_view_validation parameter to FeatureStore.apply() and plan() ([feast-dev#5859](feast-dev#5859)) ([5482a0e](feast-dev@5482a0e)) * Added batching to feature server /push to offline store ([feast-dev#5683](feast-dev#5683)) ([feast-dev#5729](feast-dev#5729)) ([ce35ce6](feast-dev@ce35ce6)) * Enable static artifacts for feature server that can be used in Feature Transformations ([feast-dev#5787](feast-dev#5787)) ([edefc3f](feast-dev@edefc3f)) * Improve lambda materialization engine ([feast-dev#5829](feast-dev#5829)) ([f6116f9](feast-dev@f6116f9)) * Offline Store historical features retrieval based on datetime range in Ray ([feast-dev#5738](feast-dev#5738)) ([e484c12](feast-dev@e484c12)) * Read, Save docs and chat fixes ([feast-dev#5865](feast-dev#5865)) ([2081b55](feast-dev@2081b55)) * Resolve pyarrow >21 installation with ibis-framework ([feast-dev#5847](feast-dev#5847)) ([8b9bb50](feast-dev@8b9bb50)) * Support staging for spark materialization ([feast-dev#5671](feast-dev#5671)) ([feast-dev#5797](feast-dev#5797)) ([5b787af](feast-dev@5b787af)) Signed-off-by: yassinnouh21 <[email protected]>
# [0.59.0](feast-dev/feast@v0.58.0...v0.59.0) (2026-01-16) ### Bug Fixes * Add get_table_query_string_with_alias() for PostgreSQL subquery aliasing ([feast-dev#5811](feast-dev#5811)) ([11122ce](feast-dev@11122ce)) * Add hybrid online store to ONLINE_STORE_CLASS_FOR_TYPE mapping ([feast-dev#5810](feast-dev#5810)) ([678589b](feast-dev@678589b)) * Add possibility to overwrite send_receive_timeout for clickhouse offline store ([feast-dev#5792](feast-dev#5792)) ([59dbb33](feast-dev@59dbb33)) * Denial by default to all resources when no permissions set ([feast-dev#5663](feast-dev#5663)) ([1524f1c](feast-dev@1524f1c)) * Make operator include full OIDC secret in repo config ([feast-dev#5676](feast-dev#5676)) ([feast-dev#5809](feast-dev#5809)) ([a536bc2](feast-dev@a536bc2)) * Populate Postgres `registry.path` during `feast init` ([feast-dev#5785](feast-dev#5785)) ([f293ae8](feast-dev@f293ae8)) * **redis:** Preserve millisecond timestamp precision for Redis online store ([feast-dev#5807](feast-dev#5807)) ([9e3f213](feast-dev@9e3f213)) * Search API to return all matching tags in matched_tags field ([feast-dev#5843](feast-dev#5843)) ([de37f66](feast-dev@de37f66)) * Spark Materialization Engine Cannot Infer Schema ([feast-dev#5806](feast-dev#5806)) ([58d0325](feast-dev@58d0325)), closes [feast-dev#5594](feast-dev#5594) [feast-dev#5594](feast-dev#5594) * Support arro3 table schema with newer deltalake packages ([feast-dev#5799](feast-dev#5799)) ([103c5e9](feast-dev@103c5e9)) * Timestamp formatting and lakehouse-type connector for trino_offline_store. ([feast-dev#5846](feast-dev#5846)) ([c2ea7e9](feast-dev@c2ea7e9)) * Update model_validator to use instance method signature (Pydantic v2.12 deprecation) ([feast-dev#5825](feast-dev#5825)) ([3c10b6e](feast-dev@3c10b6e)) ### Features * Add dbt integration for importing models as FeatureViews ([feast-dev#5827](feast-dev#5827)) ([b997361](feast-dev@b997361)), closes [feast-dev#3335](feast-dev#3335) [feast-dev#3335](feast-dev#3335) [feast-dev#3335](feast-dev#3335) * Add GCS registry store in Go feature server ([feast-dev#5818](feast-dev#5818)) ([1dc2be5](feast-dev@1dc2be5)) * Add progress bar to CLI from feast apply ([feast-dev#5867](feast-dev#5867)) ([ab3562b](feast-dev@ab3562b)) * Add RBAC blog post to website ([feast-dev#5861](feast-dev#5861)) ([b1844a3](feast-dev@b1844a3)) * Add skip_feature_view_validation parameter to FeatureStore.apply() and plan() ([feast-dev#5859](feast-dev#5859)) ([5482a0e](feast-dev@5482a0e)) * Added batching to feature server /push to offline store ([feast-dev#5683](feast-dev#5683)) ([feast-dev#5729](feast-dev#5729)) ([ce35ce6](feast-dev@ce35ce6)) * Enable static artifacts for feature server that can be used in Feature Transformations ([feast-dev#5787](feast-dev#5787)) ([edefc3f](feast-dev@edefc3f)) * Improve lambda materialization engine ([feast-dev#5829](feast-dev#5829)) ([f6116f9](feast-dev@f6116f9)) * Offline Store historical features retrieval based on datetime range in Ray ([feast-dev#5738](feast-dev#5738)) ([e484c12](feast-dev@e484c12)) * Read, Save docs and chat fixes ([feast-dev#5865](feast-dev#5865)) ([2081b55](feast-dev@2081b55)) * Resolve pyarrow >21 installation with ibis-framework ([feast-dev#5847](feast-dev#5847)) ([8b9bb50](feast-dev@8b9bb50)) * Support staging for spark materialization ([feast-dev#5671](feast-dev#5671)) ([feast-dev#5797](feast-dev#5797)) ([5b787af](feast-dev@5b787af)) Signed-off-by: yassinnouh21 <[email protected]>
What this PR does / why we need it:
Added batching configuration for feature_servers /push endpoint for offline store writes
Which issue(s) this PR fixes:
Fixes #5683