Description
We have run into one issue with the Silver layer. It looks like our only option right now is streaming tables. This has caused some challenges because our users would like to be able to use Time Travel on tables in Silver. We also have some technical use cases that would benefit from Time Travel on Silver. While Time Travel on these tables appears to work from a SQL Warehouse, it isn't supported on a cluster/Spark.
In reviewing the code and documentation, it appears that the choice between a materialized view (which should support Time Travel) and a streaming table is based on how you read the underlying table. I think this code in dataflow_pipeline might be our culprit, because it always reads the source through readStream ("readStream.table" when uc_enabled is set, "readStream.load" otherwise). Is there any way we might be able to create a non-streaming option for Silver?
    def get_silver_schema(self):
        """Get Silver table Schema."""
        silver_dataflow_spec: SilverDataflowSpec = self.dataflowSpec
        source_database = silver_dataflow_spec.sourceDetails["database"]
        source_table = silver_dataflow_spec.sourceDetails["table"]
        select_exp = silver_dataflow_spec.selectExp
        where_clause = silver_dataflow_spec.whereClause
        raw_delta_table_stream = self.spark.readStream.table(
            f"{source_database}.{source_table}"
        ).selectExpr(*select_exp) if self.uc_enabled else self.spark.readStream.load(
            path=silver_dataflow_spec.sourceDetails["path"],
            format="delta"
        ).selectExpr(*select_exp)
        raw_delta_table_stream = self.__apply_where_clause(where_clause, raw_delta_table_stream)
        return raw_delta_table_stream.schema
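To make the request more concrete, here is a rough sketch of what a non-streaming option might look like. It is not taken from the library: `read_silver_source` and its `streaming` flag are hypothetical names, and I am assuming that a batch read (`spark.read`) is what would lead to a materialized view rather than a streaming table. I have not tested this inside a pipeline.

```python
def read_silver_source(spark, source_details, select_exp, uc_enabled, streaming=True):
    """Read the Silver source as a stream (current behaviour) or as a batch.

    Hypothetical helper: the `streaming` flag does not exist in the library today.
    """
    # spark.readStream returns a streaming reader, spark.read a batch reader.
    # Both expose .table(...) and .load(path=..., format=...).
    reader = spark.readStream if streaming else spark.read
    if uc_enabled:
        df = reader.table(f"{source_details['database']}.{source_details['table']}")
    else:
        df = reader.load(path=source_details["path"], format="delta")
    return df.selectExpr(*select_exp)
```

With something like this, get_silver_schema (and wherever else the Silver source is read) could pass `streaming=False` when the spec asks for a non-streaming Silver table, and keep the existing behaviour by default.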