Consider a scenario where you accidentally dropped the actual table or instead of deleting a set of records, you updated all the records present in the table.
What will you do? How will you restore your data that has already been deleted/altered?
You must be hoping of going back in time and correcting incorrectly executed statements. Snowflake provides this feature wherein you can get back the data that is present at a particular time. This feature of Snowflake is called Time Travel.
Snowflake Time Travel is a very important tool that allows users to access Historical Data (i.e. data that has been updated or removed) at any point in time in the past. It is a powerful Continuous Data Protection (CDP) feature that ensures the maintenance and availability of historical data.
Query Optimization: As a user, we should not be concerned about optimizing queries because Snowflake on its own optimizes queries by using Clustering and Partitioning.
Secure Data Sharing: Using Snowflake Database, Tables, Views, and UDFs, data can be shared securely from one account to another.
Support for File Formats: Supports almost all file formats: JSON, Avro, ORC, Parquet, and XML are all Semi-Structured data formats that Snowflake can import. Column type — Variant lets the user store Semi-Structured data.
Caching: Caching strategy of Snowflake returns results quickly for repeated queries as it stores query results in a cache within a given session.
Fault Resistant: In case of event failure, Snowflake provides exceptional fault-tolerant capabilities to recover tables, views, databases, schema, and so on.
To query past data.
To make clones of complete Tables, Schemas, and Databases at or before certain dates.
To restore deleted Tables, Schemas, and Databases.
To restore original data that was updated accidentally.
To check consumption over a period of time.
Cloning and Backing up data from previous times.
No additional configurations are required to enable Time Travel, it is enabled by default, with a one-day retention period. Although to configure longer data retention periods, we need to upgrade to Snowflake Enterprise Edition. The retention period can be set to a maximum of 90 days. Based on the retention period, charges will increase.
The below query builds a table with a retention period of 90 days:
create table my_table(col1 number, col2 date) data_retention_time_in_days=90;
The retention period can also be changed using the ‘alter’ query as below:
alter table my_table set data_retention_time_in_days=30;
Time Travel cannot be turned off for accounts, but it can be turned off for individual databases, schemas, and tables by setting data_retention_time_in_days field to 0 using the below query:
alter table my_table set data_retention_time_in_days=0;
Whenever any Data Manipulation Language (DML) query is executed on a table, Snowflake saves prior versions of the Table data for a given period of time depending on the retention period. The previous version of data can be queried using the AT | BEFORE Clause.
Using AT, the user can get data at a given period of time whereas using BEFORE all the data from that point till the end of the retention period can be fetched.
The following SQL extensions have been added to facilitate Snowflake Time Travel:
CLONE: To create a logical duplicate of the object at a specific point in its history.
TIMESTAMP: From a given time (Data & Time) provided.
OFFSET: Time difference from current time till offset provided in seconds.
STATEMENT: Using a Statement ID from the point where the last DML query was fired.
UNDROP: If a table is dropped accidentally, it can be restored using the UNDROP command.
The below query generates a Clone of a Table from the given Date and Time as indicated by the Timestamp:
create table restored_table clone my_table at(timestamp => 'Sat, 09 May 2015 01:01:00 +0300’::timestamp_tz);
The below query creates a Clone of a Schema and all its Objects as they were an hour ago:
create schema restored_schema clone my_schema at(offset => -3600);
The below query pulls Historical Data from a Table from a given Timestamp:
select * from my_table at(timestamp => 'Fri, 26 May 2023 1:30:00 -0700’::timestamp_tz);
The below query pulls Historical Data from a Table that was updated 5 minutes ago:
select * from my_table at(offset => -60*5);
The below query collects Historical Data from a Table up to the given statement’s Modifications (Statement ID):
select * from my_table before(statement => '8e5d0ca9-005a-44f6-a858-b8f5b37c5726');
The below query is used to restore Database EMP:
UNDROP DATABASE EMP
The below query is used to restore Database EMP:
The following graphic from the Snowflake documentation summarizes all the above points visually:
Snowflake preserves the previous state of the data when DML operations are performed. By default, all Snowflake accounts have a standard retention duration of one day which is automatically enabled.
For Snowflake Standard Edition, the Retention Period can be adjusted to 0 from default 1 day for all objects (Temporary & Permanent).
For Snowflake Enterprise Edition (or higher) it gives more flexibility for setting retention period, that is The Retention Time for permanent Databases, Schemas, and Tables can be configured to any number between 0 and 90 days whereas for temporary objects it can be set to 0 from the default 1 day.
The below query sets a retention period of 90 days while creating the table:
create table my_table(col1 number, col2 date) data_retention_time_in_days=90;
Snowflake provides another exciting feature called Fail-safe where historical data can be protected in case of any failure. Fail-safe allows a maximum period of 7 days which begins after the Time Travel retention period ends wherein Historical data can be recovered. Recovering data through Fail-safe can take hours to days and it involves cost.
The number of days historical data is maintained is based on the table type and the Fail-safe period for the table. Transient and temporary tables have no Fail-safe period.
Storage fees are incurred for maintaining historical data during both the Time Travel and Fail-safe periods. The fees are calculated for each 24 hours (i.e. 1 day) from the time the data changed. The number of days historical data is maintained is based on the table type and retention period set for the table.
Snowflake minimizes the amount of storage required for historical data by maintaining only the information required to restore the individual table rows that were updated or deleted. As a result, storage usage is calculated as a percentage of the table that changed. In most cases, Snowflake does not keep a full copy of data. Only when tables are dropped or truncated, full copies of tables are maintained.
To manage the storage costs Snowflake provides two table types: TEMPORARY & TRANSIENT, which do not incur the same fees as standard (i.e. permanent) tables:
Transient tables can have a Time Travel retention period of either 0 or 1 day.
Temporary tables can also have a Time Travel retention period of 0 or 1 day; however, this retention period ends as soon as the table is dropped or the session in which the table was created ends.
Transient and temporary tables have no Fail-safe period.
The maximum additional fees incurred for Time Travel and Fail-safe by these types of tables are limited to 1 day.
The above table illustrates the different scenarios, based on table type.
{{cta('1308a939-5241-47c3-bf0f-864090d8516d')}}
Snowflake Time Travel is a powerful feature that enables users to examine data usage and manipulations over a specific time. Syntax to query with time travel is fairly the same as in SQL Server which is easy to understand and execute. Users can restore deleted objects, make duplicates, make a Snowflake backup, and recover historical data.
Fast-growing tech companies partner with Encora to outsource product development and drive growth. Contact us to learn more about our software engineering capabilities.