1 iceberg getting started
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Jupyter").getOrCreate()
spark
25/06/20 09:57:25 WARN SparkSession: Using an existing Spark session; only runtime SQL configurations will take effect.
SparkSession - in-memory
Load One Month of NYC Taxi/Limousine Trip Data¶
For this notebook, we will use the New York City Taxi and Limousine Commision Trip Record Data that's available on the AWS Open Data Registry. This contains data of trips taken by taxis and for-hire vehicles in New York City. We'll save this into an iceberg table called taxis
.
To be able to rerun the notebook several times, let's drop the table if it exists to start fresh.
%%sql
SHOW CATALOGS
catalog |
---|
demo |
spark_catalog |
%%sql
SHOW DATABASES
namespace |
---|
nyc |
%%sql
CREATE DATABASE IF NOT EXISTS nyc
%%sql
DROP TABLE IF EXISTS nyc.taxis
df = spark.read.parquet("/home/iceberg/data/yellow_tripdata_2021-04.parquet")
df.write.saveAsTable("nyc.taxis")
%%sql
DESCRIBE EXTENDED nyc.taxis
col_name | data_type | comment |
---|---|---|
VendorID | bigint | None |
tpep_pickup_datetime | timestamp_ntz | None |
tpep_dropoff_datetime | timestamp_ntz | None |
passenger_count | double | None |
trip_distance | double | None |
RatecodeID | double | None |
store_and_fwd_flag | string | None |
PULocationID | bigint | None |
DOLocationID | bigint | None |
payment_type | bigint | None |
fare_amount | double | None |
extra | double | None |
mta_tax | double | None |
tip_amount | double | None |
tolls_amount | double | None |
improvement_surcharge | double | None |
total_amount | double | None |
congestion_surcharge | double | None |
airport_fee | double | None |
# Metadata Columns | ||
_spec_id | int | |
_partition | struct<> | |
_file | string | |
_pos | bigint | |
_deleted | boolean | |
# Detailed Table Information | ||
Name | demo.nyc.taxis | |
Type | MANAGED | |
Location | s3://warehouse/nyc/taxis | |
Provider | iceberg | |
Owner | root | |
Table Properties | [created-at=2025-06-20T09:58:21.220162508Z,current-snapshot-id=4656165136069422702,format=iceberg/parquet,format-version=2,write.format.default=parquet,write.parquet.compression-codec=zstd] |
%%sql
SELECT COUNT(*) as cnt
FROM nyc.taxis
cnt |
---|
2171187 |
Schema Evolution¶
Adding, dropping, renaming, or altering columns is easy and safe in Iceberg. In this example, we'll rename fare_amount
to fare
and trip_distance
to distance
. We'll also add a float column fare_per_distance_unit
immediately after distance
.
%%sql
ALTER TABLE nyc.taxis RENAME COLUMN fare_amount TO fare
%%sql
ALTER TABLE nyc.taxis RENAME COLUMN trip_distance TO distance
%%sql
ALTER TABLE nyc.taxis ALTER COLUMN distance COMMENT 'The elapsed trip distance in miles reported by the taximeter.'
%%sql
ALTER TABLE nyc.taxis ALTER COLUMN distance TYPE double;
%%sql
ALTER TABLE nyc.taxis ALTER COLUMN distance AFTER fare;
%%sql
ALTER TABLE nyc.taxis
ADD COLUMN fare_per_distance_unit float AFTER distance
Let's update the new fare_per_distance_unit
to equal fare
divided by distance
.
%%sql
UPDATE nyc.taxis
SET fare_per_distance_unit = fare/distance
%%sql
SELECT
VendorID
,tpep_pickup_datetime
,tpep_dropoff_datetime
,fare
,distance
,fare_per_distance_unit
FROM nyc.taxis
VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | fare | distance | fare_per_distance_unit |
---|---|---|---|---|---|
1 | 2021-04-01 00:00:18 | 2021-04-01 00:21:54 | 25.5 | 8.4 | 3.0357143878936768 |
1 | 2021-04-01 00:42:37 | 2021-04-01 00:46:23 | 5.0 | 0.9 | 5.55555534362793 |
1 | 2021-04-01 00:57:56 | 2021-04-01 01:08:22 | 11.5 | 3.4 | 3.382352828979492 |
1 | 2021-04-01 00:01:58 | 2021-04-01 00:54:27 | 44.2 | 0.0 | None |
2 | 2021-04-01 00:24:55 | 2021-04-01 00:34:33 | 9.0 | 1.96 | 4.591836929321289 |
2 | 2021-04-01 00:19:16 | 2021-04-01 00:21:46 | 4.5 | 0.77 | 5.844155788421631 |
2 | 2021-04-01 00:25:11 | 2021-04-01 00:31:53 | 11.5 | 3.65 | 3.1506848335266113 |
1 | 2021-04-01 00:27:53 | 2021-04-01 00:47:03 | 26.5 | 8.9 | 2.9775280952453613 |
2 | 2021-04-01 00:24:24 | 2021-04-01 00:37:50 | 12.0 | 2.98 | 4.026845455169678 |
1 | 2021-04-01 00:19:18 | 2021-04-01 00:41:25 | 28.0 | 8.9 | 3.1460673809051514 |
2 | 2021-04-01 00:04:25 | 2021-04-01 00:29:22 | 23.5 | 7.48 | 3.1417112350463867 |
2 | 2021-04-01 00:03:07 | 2021-04-01 00:18:02 | 13.5 | 3.39 | 3.9823009967803955 |
2 | 2021-04-01 00:35:44 | 2021-04-01 00:51:06 | 14.0 | 3.51 | 3.9886040687561035 |
2 | 2021-04-01 00:52:32 | 2021-04-01 01:04:41 | 12.5 | 3.42 | 3.654970645904541 |
2 | 2021-04-01 00:28:05 | 2021-04-01 00:47:59 | 33.5 | 12.14 | 2.7594728469848633 |
1 | 2021-04-01 00:39:01 | 2021-04-01 00:57:39 | 32.0 | 11.8 | 2.711864471435547 |
2 | 2021-04-01 00:15:10 | 2021-04-01 00:22:46 | 7.5 | 1.44 | 5.208333492279053 |
2 | 2021-04-01 00:30:46 | 2021-04-01 00:39:52 | 8.5 | 1.65 | 5.151515007019043 |
2 | 2021-04-01 00:48:18 | 2021-04-01 01:06:50 | 25.0 | 8.16 | 3.063725471496582 |
1 | 2021-04-01 00:19:42 | 2021-04-01 00:33:25 | 21.5 | 7.4 | 2.9054055213928223 |
2 | 2021-04-01 00:14:42 | 2021-04-01 00:42:59 | 38.5 | 13.65 | 2.8205127716064453 |
2 | 2021-04-01 00:48:53 | 2021-04-01 01:02:10 | 12.5 | 3.14 | 3.980891704559326 |
2 | 2021-04-01 00:54:51 | 2021-04-01 01:01:47 | 8.0 | 1.89 | 4.232804298400879 |
1 | 2021-04-01 00:17:17 | 2021-04-01 00:43:38 | 42.5 | 15.5 | 2.7419354915618896 |
1 | 2021-04-01 00:24:04 | 2021-04-01 00:56:20 | 52.0 | 20.1 | 2.587064743041992 |
2 | 2021-04-01 00:31:12 | 2021-04-01 00:36:07 | 6.0 | 1.21 | 4.958677768707275 |
2 | 2021-04-01 00:36:46 | 2021-04-01 00:40:07 | 4.5 | 0.5 | 9.0 |
2 | 2021-04-01 00:27:29 | 2021-04-01 00:42:54 | 28.0 | 10.12 | 2.766798496246338 |
1 | 2021-04-01 00:30:59 | 2021-04-01 00:36:40 | 6.0 | 1.0 | 6.0 |
2 | 2021-04-01 00:10:17 | 2021-04-01 00:25:00 | 27.0 | 9.4 | 2.872340440750122 |
2 | 2021-04-01 00:12:31 | 2021-04-01 00:14:32 | 4.5 | 0.89 | 5.056180000305176 |
2 | 2021-04-01 00:22:25 | 2021-04-01 00:39:31 | 16.5 | 4.67 | 3.5331904888153076 |
1 | 2021-04-01 00:39:29 | 2021-04-01 01:02:44 | 52.0 | 17.2 | 3.0232558250427246 |
2 | 2021-04-01 00:36:24 | 2021-04-01 23:39:59 | 45.5 | 16.09 | 2.827843427658081 |
2 | 2021-04-01 00:01:40 | 2021-04-01 00:06:54 | 8.0 | 2.06 | 3.8834950923919678 |
2 | 2021-04-01 00:32:22 | 2021-04-01 00:42:47 | 11.0 | 3.24 | 3.395061731338501 |
2 | 2021-04-01 00:53:54 | 2021-04-01 00:58:37 | 7.5 | 1.93 | 3.8860104084014893 |
2 | 2021-04-01 00:14:08 | 2021-04-01 00:23:22 | 15.0 | 4.77 | 3.1446540355682373 |
1 | 2021-04-01 00:13:34 | 2021-04-01 00:29:53 | 24.0 | 8.3 | 2.891566276550293 |
1 | 2021-04-01 00:20:43 | 2021-04-01 00:52:07 | 52.0 | 18.9 | 2.7513227462768555 |
2 | 2021-04-01 00:46:15 | 2021-04-01 00:52:13 | 6.0 | 1.04 | 5.769230842590332 |
2 | 2021-04-01 00:29:43 | 2021-04-01 00:37:43 | 10.0 | 3.08 | 3.246753215789795 |
2 | 2021-04-01 00:33:29 | 2021-04-01 01:04:31 | 52.0 | 19.46 | 2.6721479892730713 |
1 | 2021-04-01 00:25:43 | 2021-04-01 00:58:50 | 85.0 | 26.3 | 3.2319390773773193 |
1 | 2021-04-01 00:03:42 | 2021-04-01 00:16:54 | 20.0 | 7.0 | 2.857142925262451 |
1 | 2021-04-01 00:32:51 | 2021-04-01 00:44:32 | 12.5 | 3.6 | 3.472222328186035 |
1 | 2021-04-01 00:59:07 | 2021-04-01 01:04:18 | 6.0 | 1.2 | 5.0 |
2 | 2021-03-31 23:57:46 | 2021-04-01 00:01:31 | 5.5 | 1.18 | 4.661016941070557 |
2 | 2021-04-01 00:34:30 | 2021-04-01 00:40:33 | 6.0 | 0.95 | 6.315789699554443 |
2 | 2021-04-01 00:48:51 | 2021-04-01 00:55:10 | 7.0 | 1.44 | 4.861111164093018 |
2 | 2021-04-01 00:19:34 | 2021-04-01 00:24:29 | 6.0 | 1.05 | 5.714285850524902 |
2 | 2021-04-01 00:43:38 | 2021-04-01 01:06:40 | 30.0 | 7.54 | 3.9787797927856445 |
2 | 2021-04-01 00:43:47 | 2021-04-01 01:19:20 | 51.5 | 18.99 | 2.711953639984131 |
1 | 2021-04-01 00:03:09 | 2021-04-01 00:09:11 | 8.5 | 2.4 | 3.5416667461395264 |
2 | 2021-04-01 00:12:20 | 2021-04-01 00:57:11 | 43.0 | 13.27 | 3.240391969680786 |
2 | 2021-03-31 23:56:44 | 2021-04-01 00:01:14 | 5.0 | 0.69 | 7.246376991271973 |
2 | 2021-04-01 00:04:04 | 2021-04-01 00:09:28 | 6.0 | 1.12 | 5.357142925262451 |
2 | 2021-04-01 00:17:47 | 2021-04-01 00:33:39 | 15.0 | 4.08 | 3.6764705181121826 |
2 | 2021-04-01 00:51:32 | 2021-04-01 01:09:05 | 21.0 | 6.34 | 3.312302827835083 |
1 | 2021-04-01 00:15:13 | 2021-04-01 00:36:09 | 28.2 | 0.0 | None |
2 | 2021-04-01 00:01:02 | 2021-04-01 00:07:46 | 7.0 | 1.69 | 4.142011642456055 |
2 | 2021-04-01 00:16:31 | 2021-04-01 00:20:07 | 5.0 | 0.86 | 5.813953399658203 |
2 | 2021-04-01 00:17:14 | 2021-04-01 00:43:30 | 23.0 | 6.72 | 3.422619104385376 |
2 | 2021-04-01 00:07:48 | 2021-04-01 00:33:07 | 34.5 | 12.37 | 2.789005756378174 |
2 | 2021-04-01 00:32:23 | 2021-04-01 00:51:58 | 26.0 | 9.11 | 2.85400652885437 |
2 | 2021-04-01 00:01:58 | 2021-04-01 00:08:13 | 6.5 | 1.33 | 4.887217998504639 |
2 | 2021-04-01 00:16:48 | 2021-04-01 00:23:45 | 9.0 | 2.66 | 3.3834586143493652 |
2 | 2021-04-01 00:08:52 | 2021-04-01 00:21:46 | 17.5 | 5.4 | 3.2407407760620117 |
2 | 2021-04-01 00:53:47 | 2021-04-01 01:14:13 | 19.0 | 5.47 | 3.473491668701172 |
2 | 2021-04-01 00:13:06 | 2021-04-01 00:29:40 | 16.5 | 4.59 | 3.594771146774292 |
2 | 2021-04-01 00:35:02 | 2021-04-01 01:01:27 | 33.0 | 11.19 | 2.949061632156372 |
2 | 2021-04-01 00:05:00 | 2021-04-01 00:17:40 | 12.0 | 3.09 | 3.8834950923919678 |
2 | 2021-04-01 00:54:19 | 2021-04-01 23:25:04 | 12.5 | 3.3 | 3.7878787517547607 |
2 | 2021-04-01 00:11:19 | 2021-04-01 00:12:55 | 3.5 | 0.44 | 7.954545497894287 |
2 | 2021-04-01 00:26:23 | 2021-04-01 00:31:14 | 5.5 | 1.03 | 5.339805603027344 |
2 | 2021-04-01 00:12:04 | 2021-04-01 00:17:26 | 7.5 | 1.78 | 4.2134833335876465 |
2 | 2021-04-01 00:29:03 | 2021-04-01 00:49:47 | 32.0 | 11.41 | 2.8045573234558105 |
1 | 2021-04-01 00:06:19 | 2021-04-01 00:19:18 | 11.5 | 2.8 | 4.107142925262451 |
1 | 2021-04-01 00:31:32 | 2021-04-01 00:36:12 | 6.5 | 1.6 | 4.0625 |
2 | 2021-04-01 00:10:29 | 2021-04-01 00:14:55 | 5.5 | 0.86 | 6.395349025726318 |
2 | 2021-04-01 00:36:45 | 2021-04-01 00:42:05 | 6.0 | 0.91 | 6.593406677246094 |
2 | 2021-04-01 00:29:42 | 2021-04-01 00:58:06 | 52.0 | 17.9 | 2.9050278663635254 |
1 | 2021-04-01 00:01:15 | 2021-04-01 00:15:05 | 17.0 | 5.3 | 3.207547187805176 |
2 | 2021-04-01 00:13:04 | 2021-04-01 00:21:48 | 12.0 | 3.5 | 3.4285714626312256 |
1 | 2021-04-01 00:06:13 | 2021-04-01 00:29:21 | 36.5 | 13.3 | 2.74436092376709 |
2 | 2021-04-01 00:04:50 | 2021-04-01 00:20:32 | 27.0 | 9.69 | 2.7863776683807373 |
1 | 2021-04-01 00:02:44 | 2021-04-01 00:21:21 | 18.0 | 5.4 | 3.3333332538604736 |
1 | 2021-04-01 00:44:08 | 2021-04-01 00:57:16 | 13.0 | 3.5 | 3.7142856121063232 |
2 | 2021-04-01 00:05:48 | 2021-04-01 00:25:39 | 21.5 | 6.59 | 3.262518882751465 |
2 | 2021-04-01 00:00:49 | 2021-04-01 00:12:50 | 12.0 | 3.0 | 4.0 |
1 | 2021-04-01 00:09:05 | 2021-04-01 00:33:26 | 34.0 | 12.0 | 2.8333332538604736 |
1 | 2021-04-01 00:02:07 | 2021-04-01 00:11:47 | 10.5 | 2.6 | 4.038461685180664 |
1 | 2021-04-01 00:41:49 | 2021-04-01 00:42:48 | 3.0 | 0.2 | 15.0 |
2 | 2021-04-01 00:03:37 | 2021-04-01 00:19:46 | 28.5 | 10.41 | 2.7377521991729736 |
1 | 2021-04-01 00:06:18 | 2021-04-01 00:13:35 | 6.5 | 0.9 | 7.222222328186035 |
1 | 2021-04-01 00:19:44 | 2021-04-01 00:24:55 | 5.5 | 1.1 | 5.0 |
2 | 2021-04-01 00:05:45 | 2021-04-01 00:15:21 | -9.0 | 1.96 | -4.591836929321289 |
2 | 2021-04-01 00:05:45 | 2021-04-01 00:15:21 | 9.0 | 1.96 | 4.591836929321289 |
2 | 2021-04-01 00:26:14 | 2021-04-01 00:30:22 | 5.0 | 0.85 | 5.882352828979492 |
2 | 2021-04-01 00:46:33 | 2021-04-01 00:58:41 | 15.5 | 4.47 | 3.4675614833831787 |
Expressive SQL for Row Level Changes¶
With Iceberg tables, DELETE
queries can be used to perform row-level deletes. This is as simple as providing the table name and a WHERE
predicate. If the filter matches an entire partition of the table, Iceberg will intelligently perform a metadata-only operation where it simply deletes the metadata for that partition.
Let's perform a row-level delete for all rows that have a fare_per_distance_unit
greater than 4 or a distance
greater than 2. This should leave us with relatively short trips that have a relatively high fare per distance traveled.
%%sql
DELETE FROM nyc.taxis
WHERE fare_per_distance_unit > 4.0 OR distance > 2.0
There are some fares that have a null
for fare_per_distance_unit
due to the distance being 0
. Let's remove those as well.
%%sql
DELETE FROM nyc.taxis
WHERE fare_per_distance_unit is null
%%sql
SELECT
VendorID
,tpep_pickup_datetime
,tpep_dropoff_datetime
,fare
,distance
,fare_per_distance_unit
FROM nyc.taxis
VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | fare | distance | fare_per_distance_unit |
---|---|---|---|---|---|
2 | 2021-04-01 00:53:54 | 2021-04-01 00:58:37 | 7.5 | 1.93 | 3.8860104084014893 |
2 | 2021-04-01 00:05:45 | 2021-04-01 00:15:21 | -9.0 | 1.96 | -4.591836929321289 |
2 | 2021-04-01 00:16:50 | 2021-04-01 00:22:58 | 7.5 | 1.95 | 3.846153736114502 |
2 | 2021-04-01 00:22:33 | 2021-04-01 00:23:00 | -2.5 | 0.03 | -83.33333587646484 |
1 | 2021-04-01 00:48:37 | 2021-04-01 00:54:30 | 7.5 | 1.9 | 3.9473683834075928 |
2 | 2021-04-01 00:26:08 | 2021-04-01 00:29:36 | 6.5 | 1.79 | 3.6312849521636963 |
2 | 2021-04-01 00:10:58 | 2021-04-01 00:16:00 | 6.5 | 1.66 | 3.9156627655029297 |
2 | 2021-04-01 00:13:14 | 2021-04-01 00:16:13 | -4.5 | 0.73 | -6.164383411407471 |
1 | 2021-04-01 00:20:52 | 2021-04-01 00:27:13 | 8.0 | 2.0 | 4.0 |
2 | 2021-04-01 01:12:20 | 2021-04-01 01:17:19 | 7.0 | 1.92 | 3.6458332538604736 |
2 | 2021-04-01 01:13:21 | 2021-04-01 01:19:26 | 7.5 | 1.88 | 3.9893617630004883 |
2 | 2021-04-01 01:34:44 | 2021-04-01 01:37:56 | -4.5 | 0.65 | -6.92307710647583 |
2 | 2021-04-01 01:40:11 | 2021-04-01 01:40:44 | -3.0 | 0.25 | -12.0 |
1 | 2021-04-01 01:20:36 | 2021-04-01 01:24:39 | 6.5 | 1.7 | 3.8235294818878174 |
1 | 2021-04-01 02:00:45 | 2021-04-01 02:07:54 | 8.0 | 2.0 | 4.0 |
2 | 2021-04-01 02:20:02 | 2021-04-01 02:25:56 | 7.5 | 1.98 | 3.7878787517547607 |
1 | 2021-04-01 02:16:09 | 2021-04-01 02:19:05 | 0.0 | 1.8 | 0.0 |
1 | 2021-04-01 02:16:11 | 2021-04-01 02:23:16 | 8.0 | 2.0 | 4.0 |
2 | 2021-04-01 03:30:12 | 2021-04-01 03:34:12 | 6.5 | 1.76 | 3.6931817531585693 |
1 | 2021-04-01 03:12:19 | 2021-04-01 03:17:22 | 7.0 | 1.8 | 3.8888888359069824 |
2 | 2021-04-01 03:39:52 | 2021-04-01 03:54:10 | -11.5 | 1.91 | -6.020942211151123 |
2 | 2021-04-01 04:45:21 | 2021-04-01 04:50:11 | 6.5 | 1.7 | 3.8235294818878174 |
1 | 2021-04-01 04:56:59 | 2021-04-01 05:02:22 | 7.5 | 2.0 | 3.75 |
1 | 2021-04-01 04:44:55 | 2021-04-01 04:49:22 | 6.0 | 1.5 | 4.0 |
2 | 2021-04-01 05:59:37 | 2021-04-01 06:04:14 | -5.5 | 0.79 | -6.962025165557861 |
2 | 2021-04-01 05:37:22 | 2021-04-01 05:42:04 | 6.5 | 1.69 | 3.846153736114502 |
2 | 2021-04-01 05:26:28 | 2021-04-01 05:30:33 | 6.5 | 1.67 | 3.8922154903411865 |
1 | 2021-04-01 05:45:24 | 2021-04-01 05:50:38 | 7.5 | 1.9 | 3.9473683834075928 |
2 | 2021-04-01 05:35:16 | 2021-04-01 05:41:15 | 7.0 | 1.76 | 3.9772727489471436 |
2 | 2021-04-01 05:37:16 | 2021-04-01 05:42:23 | 7.0 | 1.76 | 3.9772727489471436 |
1 | 2021-04-01 05:46:30 | 2021-04-01 05:51:54 | 7.0 | 1.9 | 3.6842105388641357 |
1 | 2021-04-01 06:27:52 | 2021-04-01 06:32:25 | 7.0 | 1.8 | 3.8888888359069824 |
1 | 2021-04-01 06:23:54 | 2021-04-01 06:30:46 | 8.0 | 2.0 | 4.0 |
2 | 2021-04-01 06:21:02 | 2021-04-01 06:28:30 | 7.5 | 1.98 | 3.7878787517547607 |
2 | 2021-04-01 06:54:28 | 2021-04-01 07:00:00 | 7.0 | 1.82 | 3.846153736114502 |
2 | 2021-04-01 06:41:31 | 2021-04-01 06:46:39 | 6.5 | 1.67 | 3.8922154903411865 |
2 | 2021-04-01 06:32:13 | 2021-04-01 06:36:21 | 6.0 | 1.56 | 3.846153736114502 |
1 | 2021-04-01 06:24:30 | 2021-04-01 06:30:16 | 7.5 | 2.0 | 3.75 |
2 | 2021-04-01 06:45:04 | 2021-04-01 06:50:07 | 7.0 | 1.76 | 3.9772727489471436 |
1 | 2021-04-01 06:27:04 | 2021-04-01 06:31:10 | 6.0 | 1.5 | 4.0 |
1 | 2021-04-01 06:25:35 | 2021-04-01 06:30:46 | 7.5 | 2.0 | 3.75 |
1 | 2021-04-01 06:34:44 | 2021-04-01 06:39:05 | 6.5 | 1.7 | 3.8235294818878174 |
2 | 2021-04-01 06:58:56 | 2021-04-01 07:03:35 | 6.5 | 1.64 | 3.9634146690368652 |
2 | 2021-04-01 06:28:41 | 2021-04-01 06:33:51 | -5.5 | 0.97 | -5.670103073120117 |
2 | 2021-04-01 06:34:59 | 2021-04-01 06:39:25 | 7.0 | 1.88 | 3.7234041690826416 |
2 | 2021-04-01 06:42:04 | 2021-04-01 06:47:26 | 7.0 | 1.78 | 3.932584285736084 |
1 | 2021-04-01 06:32:45 | 2021-04-01 06:39:10 | 7.5 | 1.9 | 3.9473683834075928 |
2 | 2021-04-01 06:40:40 | 2021-04-01 06:46:36 | 7.5 | 1.91 | 3.926701545715332 |
2 | 2021-04-01 06:57:53 | 2021-04-01 06:58:16 | -2.5 | 0.1 | -25.0 |
2 | 2021-04-01 06:05:23 | 2021-04-01 06:09:28 | 6.5 | 1.66 | 3.9156627655029297 |
2 | 2021-04-01 06:52:38 | 2021-04-01 06:59:10 | -6.0 | 0.72 | -8.333333015441895 |
2 | 2021-04-01 07:28:40 | 2021-04-01 07:33:05 | 7.0 | 1.89 | 3.7037036418914795 |
2 | 2021-04-01 07:24:05 | 2021-04-01 07:28:37 | 6.5 | 1.75 | 3.7142856121063232 |
1 | 2021-04-01 07:39:03 | 2021-04-01 07:47:16 | 8.0 | 2.0 | 4.0 |
2 | 2021-04-01 07:44:04 | 2021-04-01 07:48:10 | -3.5 | 0.01 | -350.0 |
2 | 2021-04-01 07:16:17 | 2021-04-01 07:22:38 | 7.5 | 1.92 | 3.90625 |
2 | 2021-04-01 07:08:09 | 2021-04-01 07:14:02 | 7.0 | 1.75 | 4.0 |
1 | 2021-04-01 07:49:47 | 2021-04-01 07:55:45 | 7.0 | 1.8 | 3.8888888359069824 |
2 | 2021-04-01 07:35:24 | 2021-04-01 07:40:55 | 7.5 | 1.91 | 3.926701545715332 |
1 | 2021-04-01 07:17:43 | 2021-04-01 07:20:58 | 5.5 | 1.4 | 3.9285714626312256 |
2 | 2021-04-01 07:08:56 | 2021-04-01 07:13:09 | 6.0 | 1.53 | 3.9215686321258545 |
2 | 2021-04-01 07:00:44 | 2021-04-01 07:04:20 | 7.5 | 1.98 | 3.7878787517547607 |
2 | 2021-04-01 07:08:42 | 2021-04-01 07:12:32 | 7.0 | 1.82 | 3.846153736114502 |
2 | 2021-04-01 07:24:53 | 2021-04-01 07:26:46 | -4.0 | 0.58 | -6.896551609039307 |
2 | 2021-04-01 07:54:51 | 2021-04-01 08:02:32 | 8.0 | 2.0 | 4.0 |
2 | 2021-04-01 07:34:02 | 2021-04-01 07:39:00 | 7.0 | 1.75 | 4.0 |
2 | 2021-04-01 07:08:40 | 2021-04-01 07:15:09 | 7.5 | 1.93 | 3.8860104084014893 |
2 | 2021-04-01 07:41:03 | 2021-04-01 07:46:36 | 6.5 | 1.63 | 3.987730026245117 |
2 | 2021-04-01 07:33:52 | 2021-04-01 07:36:06 | -4.5 | 0.94 | -4.787233829498291 |
2 | 2021-04-01 07:20:30 | 2021-04-01 07:25:09 | 6.5 | 1.65 | 3.939393997192383 |
1 | 2021-04-01 07:01:27 | 2021-04-01 07:07:24 | 7.5 | 1.9 | 3.9473683834075928 |
2 | 2021-04-01 07:16:58 | 2021-04-01 07:17:13 | -2.5 | 0.08 | -31.25 |
2 | 2021-04-01 07:14:50 | 2021-04-01 07:19:35 | 6.5 | 1.66 | 3.9156627655029297 |
2 | 2021-04-01 07:45:48 | 2021-04-01 07:52:20 | 7.5 | 2.0 | 3.75 |
2 | 2021-04-01 07:00:10 | 2021-04-01 07:00:34 | -2.5 | 0.07 | -35.71428680419922 |
2 | 2021-04-01 07:04:34 | 2021-04-01 07:10:10 | 7.0 | 1.83 | 3.825136661529541 |
2 | 2021-04-01 07:23:20 | 2021-04-01 07:30:41 | 7.5 | 2.0 | 3.75 |
1 | 2021-04-01 07:12:14 | 2021-04-01 07:19:02 | 8.0 | 2.0 | 4.0 |
2 | 2021-04-01 07:04:35 | 2021-04-01 07:09:12 | 6.5 | 1.76 | 3.6931817531585693 |
1 | 2021-04-01 07:02:52 | 2021-04-01 07:08:27 | 7.5 | 1.9 | 3.9473683834075928 |
2 | 2021-04-01 07:27:23 | 2021-04-01 07:32:42 | 7.0 | 1.9 | 3.6842105388641357 |
2 | 2021-04-01 07:52:14 | 2021-04-01 07:57:29 | -6.0 | 1.06 | -5.660377502441406 |
1 | 2021-04-01 07:40:41 | 2021-04-01 07:46:59 | 7.5 | 2.0 | 3.75 |
2 | 2021-04-01 07:24:54 | 2021-04-01 07:30:04 | -5.5 | 0.85 | -6.470588207244873 |
2 | 2021-04-01 07:37:05 | 2021-04-01 07:43:31 | -6.0 | 0.58 | -10.344827651977539 |
2 | 2021-04-01 07:51:34 | 2021-04-01 07:51:50 | -2.5 | 0.01 | -250.0 |
2 | 2021-04-01 07:00:43 | 2021-04-01 07:05:50 | 7.0 | 1.82 | 3.846153736114502 |
1 | 2021-04-01 07:13:30 | 2021-04-01 07:13:38 | 2.5 | 0.9 | 2.777777671813965 |
1 | 2021-04-01 08:49:08 | 2021-04-01 08:53:13 | 6.0 | 1.5 | 4.0 |
2 | 2021-04-01 08:35:32 | 2021-04-01 08:40:39 | 7.5 | 1.94 | 3.8659794330596924 |
1 | 2021-04-01 08:13:54 | 2021-04-01 08:21:50 | 8.0 | 2.0 | 4.0 |
2 | 2021-04-01 08:31:29 | 2021-04-01 08:36:29 | 7.0 | 1.8 | 3.8888888359069824 |
2 | 2021-04-01 08:29:21 | 2021-04-01 08:30:04 | -3.0 | 0.32 | -9.375 |
2 | 2021-04-01 08:33:38 | 2021-04-01 08:34:43 | -3.0 | 0.08 | -37.5 |
2 | 2021-04-01 08:01:47 | 2021-04-01 08:07:02 | -5.5 | 0.98 | -5.612245082855225 |
2 | 2021-04-01 08:07:43 | 2021-04-01 08:08:05 | -2.5 | 0.08 | -31.25 |
2 | 2021-04-01 08:29:08 | 2021-04-01 08:32:44 | 5.5 | 1.39 | 3.956834554672241 |
1 | 2021-04-01 08:59:57 | 2021-04-01 09:07:01 | 7.5 | 1.9 | 3.9473683834075928 |
2 | 2021-04-01 08:55:46 | 2021-04-01 09:02:17 | -6.0 | 0.78 | -7.692307472229004 |
2 | 2021-04-01 08:16:46 | 2021-04-01 08:21:00 | 6.5 | 1.67 | 3.8922154903411865 |
%%sql
SELECT COUNT(*) as cnt
FROM nyc.taxis
cnt |
---|
17703 |
Partitioning¶
A table’s partitioning can be updated in place and applied only to newly written data. Query plans are then split, using the old partition scheme for data written before the partition scheme was changed, and using the new partition scheme for data written after. People querying the table don’t even have to be aware of this split. Simple predicates in WHERE clauses are automatically converted to partition filters that prune out files with no matches. This is what’s referred to in Iceberg as Hidden Partitioning.
%%sql
ALTER TABLE nyc.taxis
ADD PARTITION FIELD VendorID
%%sql
DESCRIBE TABLE EXTENDED nyc.taxis;
col_name | data_type | comment |
---|---|---|
VendorID | bigint | None |
tpep_pickup_datetime | timestamp_ntz | None |
tpep_dropoff_datetime | timestamp_ntz | None |
passenger_count | double | None |
RatecodeID | double | None |
store_and_fwd_flag | string | None |
PULocationID | bigint | None |
DOLocationID | bigint | None |
payment_type | bigint | None |
fare | double | None |
distance | double | The elapsed trip distance in miles reported by the taximeter. |
fare_per_distance_unit | float | None |
extra | double | None |
mta_tax | double | None |
tip_amount | double | None |
tolls_amount | double | None |
improvement_surcharge | double | None |
total_amount | double | None |
congestion_surcharge | double | None |
airport_fee | double | None |
# Partition Information | ||
# col_name | data_type | comment |
VendorID | bigint | None |
# Metadata Columns | ||
_spec_id | int | |
_partition | struct<VendorID:bigint> | |
_file | string | |
_pos | bigint | |
_deleted | boolean | |
# Detailed Table Information | ||
Name | demo.nyc.taxis | |
Type | MANAGED | |
Location | s3://warehouse/nyc/taxis | |
Provider | iceberg | |
Owner | root | |
Table Properties | [created-at=2025-06-20T09:58:21.220162508Z,current-snapshot-id=1280087261393946112,format=iceberg/parquet,format-version=2,write.format.default=parquet,write.parquet.compression-codec=zstd] |
Metadata Tables¶
Iceberg tables contain very rich metadata that can be easily queried. For example, you can retrieve the manifest list for any snapshot, simply by querying the table's snapshots
table.
%%sql
SELECT snapshot_id, manifest_list
FROM nyc.taxis.snapshots
snapshot_id | manifest_list |
---|---|
4656165136069422702 | s3://warehouse/nyc/taxis/metadata/snap-4656165136069422702-1-b07eea67-1b9f-4231-acb2-b6eed7ccd24c.avro |
1284970284820445596 | s3://warehouse/nyc/taxis/metadata/snap-1284970284820445596-1-0e8016e0-364d-488b-8f05-d763561d72df.avro |
3983088334768464117 | s3://warehouse/nyc/taxis/metadata/snap-3983088334768464117-1-13bb707e-e812-4b18-937e-55c4639723e0.avro |
1280087261393946112 | s3://warehouse/nyc/taxis/metadata/snap-1280087261393946112-1-319556dc-0c2f-4929-8104-7758597e94e8.avro |
The files
table contains loads of information on data files, including column level statistics such as null counts, lower bounds, and upper bounds.
%%sql
SELECT file_path, file_format, record_count, null_value_counts, lower_bounds, upper_bounds
FROM nyc.taxis.files
file_path | file_format | record_count | null_value_counts | lower_bounds | upper_bounds |
---|---|---|---|---|---|
s3://warehouse/nyc/taxis/data/00000-21-5b6b2036-89db-4c14-9bec-932bd1dda5f5-0-00001.parquet | PARQUET | 17703 | {1: 0, 2: 0, 3: 0, 4: 299, 5: 0, 6: 299, 7: 299, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0, 16: 0, 17: 0, 18: 299, 19: 299, 20: 0} | {1: bytearray(b'\x01\x00\x00\x00\x00\x00\x00\x00'), 2: bytearray(b'@\x88-\xfe\xdd\xbe\x05\x00'), 3: bytearray(b'@\x98\x82 \xde\xbe\x05\x00'), 4: bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00'), 5: bytearray(b'{\x14\xaeG\xe1z\x84?'), 6: bytearray(b'\x00\x00\x00\x00\x00\x00\xf0?'), 7: bytearray(b'N'), 8: bytearray(b'\x01\x00\x00\x00\x00\x00\x00\x00'), 9: bytearray(b'\x01\x00\x00\x00\x00\x00\x00\x00'), 10: bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00'), 11: bytearray(b'\x00\x00\x00\x00\x00 l\xc0'), 12: bytearray(b'\x00\x00\x00\x00\x00\x00\x12\xc0'), 13: bytearray(b'\x00\x00\x00\x00\x00\x00\xe0\xbf'), 14: bytearray(b'\x85\xebQ\xb8\x1e\xd5t\xc0'), 15: bytearray(b'\x00\x00\x00\x00\x00\x00\x04\xc0'), 16: bytearray(b'333333\xd3\xbf'), 17: bytearray(b'R\xb8\x1e\x85\xebYu\xc0'), 18: bytearray(b'\x00\x00\x00\x00\x00\x00\x04\xc0'), 19: bytearray(b'\x00\x00\x00\x00\x00\x00\xf4\xbf'), 20: bytearray(b'\x00\x00\xfa\xc5')} | {1: bytearray(b'\x06\x00\x00\x00\x00\x00\x00\x00'), 2: bytearray(b'@\xa72h9\xc1\x05\x00'), 3: bytearray(b'\xc0X\xfclC\xc1\x05\x00'), 4: bytearray(b'\x00\x00\x00\x00\x00\x00\x18@'), 5: bytearray(b'\x00\x00\x00\x00\x00\x00\x00@'), 6: bytearray(b'\x00\x00\x00\x00\x00\xc0X@'), 7: bytearray(b'Y'), 8: bytearray(b'\t\x01\x00\x00\x00\x00\x00\x00'), 9: bytearray(b'\t\x01\x00\x00\x00\x00\x00\x00'), 10: bytearray(b'\x04\x00\x00\x00\x00\x00\x00\x00'), 11: bytearray(b'\x00\x00\x00\x00\x00\x00 @'), 12: bytearray(b'\x00\x00\x00\x00\x00\x00\x16@'), 13: bytearray(b'\x00\x00\x00\x00\x00\x00\xe0?'), 14: bytearray(b'\x00\x00\x00\x00\x00@U@'), 15: bytearray(b'\x00\x00\x00\x00\x00\x80;@'), 16: bytearray(b'333333\xd3?'), 17: bytearray(b'33333SU@'), 18: bytearray(b'\x00\x00\x00\x00\x00\x00\x04@'), 19: bytearray(b'\x00\x00\x00\x00\x00\x00\xf4?'), 20: bytearray(b'\x00\x00\x80@')} |
Time Travel¶
The history table lists all snapshots and which parent snapshot they derive from. The is_current_ancestor
flag let's you know if a snapshot is part of the linear history of the current snapshot of the table.
%%sql
SELECT *
FROM nyc.taxis.history
made_current_at | snapshot_id | parent_id | is_current_ancestor |
---|---|---|---|
2025-06-20 09:58:28.093000 | 4656165136069422702 | None | True |
2025-06-20 10:00:39.150000 | 1284970284820445596 | 4656165136069422702 | True |
2025-06-20 10:01:37.382000 | 3983088334768464117 | 1284970284820445596 | True |
2025-06-20 10:01:48.206000 | 1280087261393946112 | 3983088334768464117 | True |
You can time-travel by altering the current-snapshot-id
property of the table to reference any snapshot in the table's history. Let's revert the table to it's original state by traveling to the very first snapshot ID.
%%sql --var df
SELECT *
FROM nyc.taxis.history
made_current_at | snapshot_id | parent_id | is_current_ancestor |
---|---|---|---|
2025-06-20 09:58:28.093000 | 4656165136069422702 | None | True |
2025-06-20 10:00:39.150000 | 1284970284820445596 | 4656165136069422702 | True |
2025-06-20 10:01:37.382000 | 3983088334768464117 | 1284970284820445596 | True |
2025-06-20 10:01:48.206000 | 1280087261393946112 | 3983088334768464117 | True |
df.head()
Row(made_current_at=datetime.datetime(2025, 6, 20, 9, 58, 28, 93000), snapshot_id=4656165136069422702, parent_id=None, is_current_ancestor=True)
type(df)
pyspark.sql.dataframe.DataFrame
original_snapshot = df.head().snapshot_id
spark.sql(f"CALL system.rollback_to_snapshot('nyc.taxis', {original_snapshot})")
original_snapshot
4656165136069422702
%%sql
SELECT
VendorID
,tpep_pickup_datetime
,tpep_dropoff_datetime
,fare
,distance
,fare_per_distance_unit
FROM nyc.taxis
VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | fare | distance | fare_per_distance_unit |
---|---|---|---|---|---|
1 | 2021-04-01 00:00:18 | 2021-04-01 00:21:54 | 25.5 | 8.4 | None |
1 | 2021-04-01 00:42:37 | 2021-04-01 00:46:23 | 5.0 | 0.9 | None |
1 | 2021-04-01 00:57:56 | 2021-04-01 01:08:22 | 11.5 | 3.4 | None |
1 | 2021-04-01 00:01:58 | 2021-04-01 00:54:27 | 44.2 | 0.0 | None |
2 | 2021-04-01 00:24:55 | 2021-04-01 00:34:33 | 9.0 | 1.96 | None |
2 | 2021-04-01 00:19:16 | 2021-04-01 00:21:46 | 4.5 | 0.77 | None |
2 | 2021-04-01 00:25:11 | 2021-04-01 00:31:53 | 11.5 | 3.65 | None |
1 | 2021-04-01 00:27:53 | 2021-04-01 00:47:03 | 26.5 | 8.9 | None |
2 | 2021-04-01 00:24:24 | 2021-04-01 00:37:50 | 12.0 | 2.98 | None |
1 | 2021-04-01 00:19:18 | 2021-04-01 00:41:25 | 28.0 | 8.9 | None |
2 | 2021-04-01 00:04:25 | 2021-04-01 00:29:22 | 23.5 | 7.48 | None |
2 | 2021-04-01 00:03:07 | 2021-04-01 00:18:02 | 13.5 | 3.39 | None |
2 | 2021-04-01 00:35:44 | 2021-04-01 00:51:06 | 14.0 | 3.51 | None |
2 | 2021-04-01 00:52:32 | 2021-04-01 01:04:41 | 12.5 | 3.42 | None |
2 | 2021-04-01 00:28:05 | 2021-04-01 00:47:59 | 33.5 | 12.14 | None |
1 | 2021-04-01 00:39:01 | 2021-04-01 00:57:39 | 32.0 | 11.8 | None |
2 | 2021-04-01 00:15:10 | 2021-04-01 00:22:46 | 7.5 | 1.44 | None |
2 | 2021-04-01 00:30:46 | 2021-04-01 00:39:52 | 8.5 | 1.65 | None |
2 | 2021-04-01 00:48:18 | 2021-04-01 01:06:50 | 25.0 | 8.16 | None |
1 | 2021-04-01 00:19:42 | 2021-04-01 00:33:25 | 21.5 | 7.4 | None |
2 | 2021-04-01 00:14:42 | 2021-04-01 00:42:59 | 38.5 | 13.65 | None |
2 | 2021-04-01 00:48:53 | 2021-04-01 01:02:10 | 12.5 | 3.14 | None |
2 | 2021-04-01 00:54:51 | 2021-04-01 01:01:47 | 8.0 | 1.89 | None |
1 | 2021-04-01 00:17:17 | 2021-04-01 00:43:38 | 42.5 | 15.5 | None |
1 | 2021-04-01 00:24:04 | 2021-04-01 00:56:20 | 52.0 | 20.1 | None |
2 | 2021-04-01 00:31:12 | 2021-04-01 00:36:07 | 6.0 | 1.21 | None |
2 | 2021-04-01 00:36:46 | 2021-04-01 00:40:07 | 4.5 | 0.5 | None |
2 | 2021-04-01 00:27:29 | 2021-04-01 00:42:54 | 28.0 | 10.12 | None |
1 | 2021-04-01 00:30:59 | 2021-04-01 00:36:40 | 6.0 | 1.0 | None |
2 | 2021-04-01 00:10:17 | 2021-04-01 00:25:00 | 27.0 | 9.4 | None |
2 | 2021-04-01 00:12:31 | 2021-04-01 00:14:32 | 4.5 | 0.89 | None |
2 | 2021-04-01 00:22:25 | 2021-04-01 00:39:31 | 16.5 | 4.67 | None |
1 | 2021-04-01 00:39:29 | 2021-04-01 01:02:44 | 52.0 | 17.2 | None |
2 | 2021-04-01 00:36:24 | 2021-04-01 23:39:59 | 45.5 | 16.09 | None |
2 | 2021-04-01 00:01:40 | 2021-04-01 00:06:54 | 8.0 | 2.06 | None |
2 | 2021-04-01 00:32:22 | 2021-04-01 00:42:47 | 11.0 | 3.24 | None |
2 | 2021-04-01 00:53:54 | 2021-04-01 00:58:37 | 7.5 | 1.93 | None |
2 | 2021-04-01 00:14:08 | 2021-04-01 00:23:22 | 15.0 | 4.77 | None |
1 | 2021-04-01 00:13:34 | 2021-04-01 00:29:53 | 24.0 | 8.3 | None |
1 | 2021-04-01 00:20:43 | 2021-04-01 00:52:07 | 52.0 | 18.9 | None |
2 | 2021-04-01 00:46:15 | 2021-04-01 00:52:13 | 6.0 | 1.04 | None |
2 | 2021-04-01 00:29:43 | 2021-04-01 00:37:43 | 10.0 | 3.08 | None |
2 | 2021-04-01 00:33:29 | 2021-04-01 01:04:31 | 52.0 | 19.46 | None |
1 | 2021-04-01 00:25:43 | 2021-04-01 00:58:50 | 85.0 | 26.3 | None |
1 | 2021-04-01 00:03:42 | 2021-04-01 00:16:54 | 20.0 | 7.0 | None |
1 | 2021-04-01 00:32:51 | 2021-04-01 00:44:32 | 12.5 | 3.6 | None |
1 | 2021-04-01 00:59:07 | 2021-04-01 01:04:18 | 6.0 | 1.2 | None |
2 | 2021-03-31 23:57:46 | 2021-04-01 00:01:31 | 5.5 | 1.18 | None |
2 | 2021-04-01 00:34:30 | 2021-04-01 00:40:33 | 6.0 | 0.95 | None |
2 | 2021-04-01 00:48:51 | 2021-04-01 00:55:10 | 7.0 | 1.44 | None |
2 | 2021-04-01 00:19:34 | 2021-04-01 00:24:29 | 6.0 | 1.05 | None |
2 | 2021-04-01 00:43:38 | 2021-04-01 01:06:40 | 30.0 | 7.54 | None |
2 | 2021-04-01 00:43:47 | 2021-04-01 01:19:20 | 51.5 | 18.99 | None |
1 | 2021-04-01 00:03:09 | 2021-04-01 00:09:11 | 8.5 | 2.4 | None |
2 | 2021-04-01 00:12:20 | 2021-04-01 00:57:11 | 43.0 | 13.27 | None |
2 | 2021-03-31 23:56:44 | 2021-04-01 00:01:14 | 5.0 | 0.69 | None |
2 | 2021-04-01 00:04:04 | 2021-04-01 00:09:28 | 6.0 | 1.12 | None |
2 | 2021-04-01 00:17:47 | 2021-04-01 00:33:39 | 15.0 | 4.08 | None |
2 | 2021-04-01 00:51:32 | 2021-04-01 01:09:05 | 21.0 | 6.34 | None |
1 | 2021-04-01 00:15:13 | 2021-04-01 00:36:09 | 28.2 | 0.0 | None |
2 | 2021-04-01 00:01:02 | 2021-04-01 00:07:46 | 7.0 | 1.69 | None |
2 | 2021-04-01 00:16:31 | 2021-04-01 00:20:07 | 5.0 | 0.86 | None |
2 | 2021-04-01 00:17:14 | 2021-04-01 00:43:30 | 23.0 | 6.72 | None |
2 | 2021-04-01 00:07:48 | 2021-04-01 00:33:07 | 34.5 | 12.37 | None |
2 | 2021-04-01 00:32:23 | 2021-04-01 00:51:58 | 26.0 | 9.11 | None |
2 | 2021-04-01 00:01:58 | 2021-04-01 00:08:13 | 6.5 | 1.33 | None |
2 | 2021-04-01 00:16:48 | 2021-04-01 00:23:45 | 9.0 | 2.66 | None |
2 | 2021-04-01 00:08:52 | 2021-04-01 00:21:46 | 17.5 | 5.4 | None |
2 | 2021-04-01 00:53:47 | 2021-04-01 01:14:13 | 19.0 | 5.47 | None |
2 | 2021-04-01 00:13:06 | 2021-04-01 00:29:40 | 16.5 | 4.59 | None |
2 | 2021-04-01 00:35:02 | 2021-04-01 01:01:27 | 33.0 | 11.19 | None |
2 | 2021-04-01 00:05:00 | 2021-04-01 00:17:40 | 12.0 | 3.09 | None |
2 | 2021-04-01 00:54:19 | 2021-04-01 23:25:04 | 12.5 | 3.3 | None |
2 | 2021-04-01 00:11:19 | 2021-04-01 00:12:55 | 3.5 | 0.44 | None |
2 | 2021-04-01 00:26:23 | 2021-04-01 00:31:14 | 5.5 | 1.03 | None |
2 | 2021-04-01 00:12:04 | 2021-04-01 00:17:26 | 7.5 | 1.78 | None |
2 | 2021-04-01 00:29:03 | 2021-04-01 00:49:47 | 32.0 | 11.41 | None |
1 | 2021-04-01 00:06:19 | 2021-04-01 00:19:18 | 11.5 | 2.8 | None |
1 | 2021-04-01 00:31:32 | 2021-04-01 00:36:12 | 6.5 | 1.6 | None |
2 | 2021-04-01 00:10:29 | 2021-04-01 00:14:55 | 5.5 | 0.86 | None |
2 | 2021-04-01 00:36:45 | 2021-04-01 00:42:05 | 6.0 | 0.91 | None |
2 | 2021-04-01 00:29:42 | 2021-04-01 00:58:06 | 52.0 | 17.9 | None |
1 | 2021-04-01 00:01:15 | 2021-04-01 00:15:05 | 17.0 | 5.3 | None |
2 | 2021-04-01 00:13:04 | 2021-04-01 00:21:48 | 12.0 | 3.5 | None |
1 | 2021-04-01 00:06:13 | 2021-04-01 00:29:21 | 36.5 | 13.3 | None |
2 | 2021-04-01 00:04:50 | 2021-04-01 00:20:32 | 27.0 | 9.69 | None |
1 | 2021-04-01 00:02:44 | 2021-04-01 00:21:21 | 18.0 | 5.4 | None |
1 | 2021-04-01 00:44:08 | 2021-04-01 00:57:16 | 13.0 | 3.5 | None |
2 | 2021-04-01 00:05:48 | 2021-04-01 00:25:39 | 21.5 | 6.59 | None |
2 | 2021-04-01 00:00:49 | 2021-04-01 00:12:50 | 12.0 | 3.0 | None |
1 | 2021-04-01 00:09:05 | 2021-04-01 00:33:26 | 34.0 | 12.0 | None |
1 | 2021-04-01 00:02:07 | 2021-04-01 00:11:47 | 10.5 | 2.6 | None |
1 | 2021-04-01 00:41:49 | 2021-04-01 00:42:48 | 3.0 | 0.2 | None |
2 | 2021-04-01 00:03:37 | 2021-04-01 00:19:46 | 28.5 | 10.41 | None |
1 | 2021-04-01 00:06:18 | 2021-04-01 00:13:35 | 6.5 | 0.9 | None |
1 | 2021-04-01 00:19:44 | 2021-04-01 00:24:55 | 5.5 | 1.1 | None |
2 | 2021-04-01 00:05:45 | 2021-04-01 00:15:21 | -9.0 | 1.96 | None |
2 | 2021-04-01 00:05:45 | 2021-04-01 00:15:21 | 9.0 | 1.96 | None |
2 | 2021-04-01 00:26:14 | 2021-04-01 00:30:22 | 5.0 | 0.85 | None |
2 | 2021-04-01 00:46:33 | 2021-04-01 00:58:41 | 15.5 | 4.47 | None |
Another look at the history table shows that the original state of the table has been added as a new entry with the original snapshot ID.
%%sql
SELECT *
FROM nyc.taxis.history
made_current_at | snapshot_id | parent_id | is_current_ancestor |
---|---|---|---|
2025-06-20 09:58:28.093000 | 4656165136069422702 | None | True |
2025-06-20 10:00:39.150000 | 1284970284820445596 | 4656165136069422702 | False |
2025-06-20 10:01:37.382000 | 3983088334768464117 | 1284970284820445596 | False |
2025-06-20 10:01:48.206000 | 1280087261393946112 | 3983088334768464117 | False |
2025-06-20 10:17:46.994000 | 4656165136069422702 | None | True |
%%sql
SELECT COUNT(*) as cnt
FROM nyc.taxis
cnt |
---|
2171187 |