feat: Contrib azure provider with synapse/mssql offline store and Azure registry store#3072
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3072 +/- ##
==========================================
+ Coverage 67.86% 76.03% +8.16%
==========================================
Files 176 214 +38
Lines 15507 17897 +2390
==========================================
+ Hits 10524 13608 +3084
+ Misses 4983 4289 -694
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
dba6b1f to
7fdc2ae
Compare
adchia
left a comment
There was a problem hiding this comment.
not needed, but we could theoretically setup CI by using https://docs.microsoft.com/en-us/azure/azure-sql-edge/disconnected-deployment and using test-containers.
| ## Functionality Matrix | ||
|
|
||
| The set of functionality supported by offline stores is described in detail [here](overview.md#functionality). | ||
| Below is a matrix indicating which functionality is supported by the Spark offline store. |
docs/roadmap.md
Outdated
| * [x] [BigQuery source](https://docs.feast.dev/reference/data-sources/bigquery) | ||
| * [x] [Parquet file source](https://docs.feast.dev/reference/data-sources/file) | ||
| * [x] [Synapse source (community plugin)](https://github.com/Azure/feast-azure) | ||
| * [x] [Synapse source (contrib plugin)](https://docs.feast.dev/reference/data-sources/mssql) |
There was a problem hiding this comment.
should this be Azure Synapse + Azure SQL?
| make_tzaware, | ||
| ) | ||
|
|
||
| # Make sure spark warning doesn't raise more than once. |
There was a problem hiding this comment.
as discussed, we can kill this provider probably or use passthroughprovider
|
|
||
| if data_source_type == DataSourceProto.SourceType.CUSTOM_SOURCE: | ||
| data_source.data_source_class_type = "feast.infra.offline_stores.contrib.mssql_offline_store.mssqlserver_source.MsSqlServerSource" | ||
| cls = get_data_source_class_from_type(data_source.data_source_class_type) |
There was a problem hiding this comment.
Instead of hard coding the class type here, I believe it should be declared in the to_proto() function of the MsSqlServerSource class. This is done in case of the other offline contrib stores and also ensures that there are no conflicts with any other future custom sources.
2eaebdd to
8d92603
Compare
| super().__init__(project_name) | ||
| self.tables_created: List[str] = [] | ||
| self.container = SqlServerContainer(user=MSSQL_USER, password=MSSQL_PASSWORD) | ||
| # self.container = fixture_request.getfixturevalue("mssql_container") |
There was a problem hiding this comment.
Remove all the commented line here?
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Show resolved
Hide resolved
sdk/python/feast/type_map.py
Outdated
| pa_type_as_str = str(pa_type).lower() | ||
| if pa_type_as_str.startswith("timestamp"): | ||
| if "tz=" in pa_type_as_str: | ||
| return "DATETIME2" |
There was a problem hiding this comment.
any reason why this is upper case and not lower case like everything else?
docs/reference/data-sources/mssql.md
Outdated
There was a problem hiding this comment.
doesn't look fixed to me? (Microsoft sql table sources)
There was a problem hiding this comment.
oh sorry, I realized I hadn't pushed my changes last night.
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Outdated
Show resolved
Hide resolved
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/mssqlserver_source.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
call out synapse or sql server?
docs/tutorials/azure/README.md
Outdated
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/mssql.py
Outdated
Show resolved
Hide resolved
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Outdated
Show resolved
Hide resolved
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Show resolved
Hide resolved
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Show resolved
Hide resolved
There was a problem hiding this comment.
if you always use the fixture as I suggest above, then this shouldn't be needed
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
can you throw an exception here? e.g. some not implemented error so users know they can't pass a SQL string for MsSQL to generate the entity dataframe?
docs/reference/data-sources/mssql.md
Outdated
There was a problem hiding this comment.
doesn't look fixed to me? (Microsoft sql table sources)
docs/tutorials/azure/README.md
Outdated
| ## 6. Running Feast Azure Tutorials locally without Azure workspace | ||
|
|
||
| * If you are on a free tier instance, you will not be able to deploy the azure deployment because the azure workspace requires VCPUs and the free trial subscription does not have a quota. | ||
| * The workaround is to remove the `Microsoft.MachineLearningServices/workspaces/computes` resource from `fs_snapse_azure_deploy.json` and setting up the environment locally. |
There was a problem hiding this comment.
typo? assuming should be synapse
There was a problem hiding this comment.
thanks for the catch! fixed
35b123d to
452379f
Compare
There was a problem hiding this comment.
pass in the azure-sql-edge image as mentioned above?
There was a problem hiding this comment.
Oh I didn't realize you actually wanted to use this image. Updated.
sdk/python/tests/integration/registration/test_universal_cli.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
you shouldn't need to pass a fixture_request actually. the test environment already has a datasourcecreator
de40738 to
f998d8b
Compare
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
f998d8b to
3d42093
Compare
Signed-off-by: Danny Chiao <[email protected]>
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adchia, kevjumba The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Danny Chiao <[email protected]>
Signed-off-by: Danny Chiao <[email protected]>
Signed-off-by: Danny Chiao <[email protected]>
Signed-off-by: Danny Chiao <[email protected]>
Signed-off-by: Danny Chiao <[email protected]>
|
/lgtm |
# [0.24.0](v0.23.0...v0.24.0) (2022-08-25) ### Bug Fixes * Check if on_demand_feature_views is an empty list rather than None for snowflake provider ([#3046](#3046)) ([9b05e65](9b05e65)) * FeatureStore.apply applies BatchFeatureView correctly ([#3098](#3098)) ([41be511](41be511)) * Fix Feast Java inconsistency with int64 serialization vs python ([#3031](#3031)) ([4bba787](4bba787)) * Fix feature service inference logic ([#3089](#3089)) ([4310ed7](4310ed7)) * Fix field mapping logic during feature inference ([#3067](#3067)) ([cdfa761](cdfa761)) * Fix incorrect on demand feature view diffing and improve Java tests ([#3074](#3074)) ([0702310](0702310)) * Fix Java helm charts to work with refactored logic. Fix FTS image ([#3105](#3105)) ([2b493e0](2b493e0)) * Fix on demand feature view output in feast plan + Web UI crash ([#3057](#3057)) ([bfae6ac](bfae6ac)) * Fix release workflow to release 0.24.0 ([#3138](#3138)) ([a69aaae](a69aaae)) * Fix Spark offline store type conversion to arrow ([#3071](#3071)) ([b26566d](b26566d)) * Fixing Web UI, which fails for the SQL registry ([#3028](#3028)) ([64603b6](64603b6)) * Force Snowflake Session to Timezone UTC ([#3083](#3083)) ([9f221e6](9f221e6)) * Make infer dummy entity join key idempotent ([#3115](#3115)) ([1f5b1e0](1f5b1e0)) * More explicit error messages ([#2708](#2708)) ([e4d7afd](e4d7afd)) * Parse inline data sources ([#3036](#3036)) ([c7ba370](c7ba370)) * Prevent overwriting existing file during `persist` ([#3088](#3088)) ([69af21f](69af21f)) * Register BatchFeatureView in feature repos correctly ([#3092](#3092)) ([b8e39ea](b8e39ea)) * Return an empty infra object from sql registry when it doesn't exist ([#3022](#3022)) ([8ba87d1](8ba87d1)) * Teardown tables for Snowflake Materialization testing ([#3106](#3106)) ([0a0c974](0a0c974)) * UI error when saved dataset is present in registry. ([#3124](#3124)) ([83cf753](83cf753)) * Update sql.py ([#3096](#3096)) ([2646a86](2646a86)) * Updated snowflake template ([#3130](#3130)) ([f0594e1](f0594e1)) ### Features * Add authentication option for snowflake connector ([#3039](#3039)) ([74c75f1](74c75f1)) * Add Cassandra/AstraDB online store contribution ([#2873](#2873)) ([feb6cb8](feb6cb8)) * Add Snowflake materialization engine ([#2948](#2948)) ([f3b522b](f3b522b)) * Adding saved dataset capabilities for Postgres ([#3070](#3070)) ([d3253c3](d3253c3)) * Allow passing repo config path via flag ([#3077](#3077)) ([0d2d951](0d2d951)) * Contrib azure provider with synapse/mssql offline store and Azure registry store ([#3072](#3072)) ([9f7e557](9f7e557)) * Custom Docker image for Bytewax batch materialization ([#3099](#3099)) ([cdd1b07](cdd1b07)) * Feast AWS Athena offline store (again) ([#3044](#3044)) ([989ce08](989ce08)) * Implement spark offline store `offline_write_batch` method ([#3076](#3076)) ([5b0cc87](5b0cc87)) * Initial Bytewax materialization engine ([#2974](#2974)) ([55c61f9](55c61f9)) * Refactor feature server helm charts to allow passing feature_store.yaml in environment variables ([#3113](#3113)) ([85ee789](85ee789))
|
Hi, I am wondering if it is possible to use this for a self-hosted MSSQL Server database as an Offline Store? |
What this PR does / why we need it:
Moving https://github.com/Azure/feast-azure feast_azure_provider package into feast as contrib. Does not move the kubernetes cluster configuration.
Which issue(s) this PR fixes:
Fixes #