Python-v1.6.2 Released

What's Changed
chore: bump version for release by @ion-elgreco in #4562
fix: consolidate table URL parsing for Python and Rust and logic dedup by @khalidmammadov in #4537
feat(core): expose LogicalFileView::partition_values_map as public API by @BoazC-MSFT in #4564
fix: gate python sftp dependency on windows by @ethan-tyler in #4571
docs: move feature table to docs by @plaindocs in #4569
fix: apply merge barrier before duplicate validation by @ethan-tyler in #4577
fix: preserve container values with bad null counts by @ethan-tyler in #4581
chore: bump version to 1.6.2 by @ion-elgreco in #4583

New Contributors
@BoazC-MSFT made their first contribution in #4564

Full Changelog: python-v1.6.1...python-v1.6.2

u/PrideDense2206 — 1 day ago

▲ 11 r/ApacheIceberg+1 crossposts

Testing Iceberg REST catalogs

I spent some time testing writing to Iceberg REST catalogs across vendors. Same code, seven catalogs.

Code's public 👇
github.com/djouallah/te...

u/PrideDense2206 — 4 days ago

▲ 2 r/rust

Rust Engineers in SF or Bay Area

Happy Thursday. I’m working with a few people to kick off a new (in-person) meetup in San Francisco and we’re having trouble figuring out how to reach the broader Rust community.

If you are working with Rust, thinking about getting into Rust, or want to help build up the Rust+AI community, I’d be happy to have you all.

Any ideas would be welcome.

(I feel like Berlin and Europe have attracted all the talent!)

reddit.com

u/PrideDense2206 — 15 days ago

▲ 16 r/DeltaLake+1 crossposts

Follow-up: delta-explain is now more stable. Looking for Delta Lake users willing to test it

Hi there,

a few months ago I posted here about delta-explain, a small tool I was building to inspect Delta Lake pruning and data skipping.

I’ve kept working on it, and it is now in a more stable state. I’m looking for a few people who work with Delta Lake and would be willing to test it on real tables.

delta-explain makes Delta Lake file pruning visible from metadata. Given a table and a predicate, it shows how partition pruning and data skipping affect the set of files that would still need to be scanned. It can be used from the CLI, from a Python script, or as a GitHub Action in a CI pipeline.
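To make the idea concrete, here is a minimal sketch of metadata-based pruning in general. It is not delta-explain's code or API; `FileEntry` and `files_to_scan` are hypothetical names, and it assumes each file carries partition values plus min/max statistics for a single integer column.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    # Hypothetical, simplified view of per-file metadata.
    path: str
    partition: dict
    min_value: int
    max_value: int


def files_to_scan(files, partition_filter, lower, upper):
    """Return paths that survive partition pruning and min/max data skipping.

    The predicate is: partition columns equal partition_filter,
    and lower <= column <= upper.
    """
    kept = []
    for f in files:
        # Partition pruning: drop files whose partition values do not match.
        if any(f.partition.get(k) != v for k, v in partition_filter.items()):
            continue
        # Data skipping: drop files whose [min, max] range cannot overlap the predicate.
        if f.max_value < lower or f.min_value > upper:
            continue
        kept.append(f.path)
    return kept


files = [
    FileEntry("a.parquet", {"date": "2026-01-01"}, 0, 10),
    FileEntry("b.parquet", {"date": "2026-01-01"}, 50, 90),
    FileEntry("c.parquet", {"date": "2026-01-02"}, 0, 100),
]
result = files_to_scan(files, {"date": "2026-01-01"}, 40, 60)
print(result)
```

In this toy table only `b.parquet` remains: `c.parquet` is removed by partition pruning and `a.parquet` by data skipping.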

I’m mainly looking for feedback on the basics. Is the output understandable? Does the installation work smoothly? Are the explanations in the documentation clear enough? Are there situations where the result looks wrong or unclear?

I’d also be interested in technical feedback on edge cases: are there table layouts, predicates, or statistics patterns where a metadata-based pruning explanation would be especially useful, confusing, or easy to misread?

Project: https://github.com/cdelmonte-zg/delta-explain
Documentation: https://cdelmonte-zg.github.io/delta-explain/
PyPI: https://pypi.org/project/delta-explain/

Thanks!

u/LongjumpingOption523 — 15 days ago

▲ 6 r/NineSols

First Hours In

I am loving the art style and direction of the game. In terms of difficulty, this one is tough. I just finished >!General Yingzhao!<, and that was a slaughter. I finally hit my stride about 20 deaths in, but I was wondering whether it is worth doing the playthrough in standard mode and just getting good, or switching to story mode and accepting that my reaction time might be too slow.

Is there a trophy for completing the game in standard mode?

reddit.com

u/PrideDense2206 — 1 month ago

▲ 4 r/Saros

So Close to No Hit on Prophet

I keep trying to get a no hit Biome for the trophy.

Prophet is easily one of my favorite bosses since he is painfully difficult in the early game, and so much fun late game. I got hit by a stray projectile early on…. Boo, but it was still a pretty great fight.

u/PrideDense2206 — 2 months ago

▲ 2 r/Saros

Endgame Bosses

I generally found Priestess to be the more difficult of the endgame fights. The run up to the King was more like a Nightmare Strand as a level (so having done a bunch of those, I felt prepared).

But Priestess required me to think: I need to ring the tuning forks, and I need to parkour over a bunch of obstacles. In contrast, the King was no more difficult than the other bosses (except for the no Second Chance thing).

Anyone else feel like this?

reddit.com

u/PrideDense2206 — 2 months ago

▲ 10 r/DeltaLake+2 crossposts

Integrating the Rust Delta Kernel into ClickHouse

delta.io

u/PrideDense2206 — 2 months ago

▲ 4 r/DeltaLake

Delta Lake Community Sync: IcebergCompatV3 in delta-kernel-rs

Wondering about Apache Iceberg v3 compatibility inside delta-kernel-rs? Check out this video to hear from Songhang.

Want to get involved in the project? Reach out on the Delta Users Slack (https://go.delta.io/slack) through the #delta-kernel channel.

youtu.be

u/PrideDense2206 — 2 months ago

▲ 4 r/DeltaLake

Recording is live for the Delta Lake Community Meetup - 5/19

Thanks to everyone who came out for today's Delta Lake community meetup.

We covered the following:

Changes to the way we are handling GitHub issues and PR backlogs, and the introduction of the "stale" and "not-stale" labels.
An introduction to catalog commits and catalog-managed tables, including the difference between traditional Delta tables (transaction log in the file system) and catalog-managed tables, where commits are resolved and managed via Unity Catalog or any other "catalog-commit" enabled catalog. Ultimately, the Delta table still syncs via the catalog to the file system, but the source of truth for commits and the state of the table resides in the catalog.
When, where, how, and why to use traditional Delta Lake tables vs. catalog-managed tables.
Performance optimizations and updates to delta-rs. See https://github.com/delta-io/delta-rs/blob/main/CHANGELOG.md for rust-v0.32.3 and python-v1.6.0 to follow along with the changes Tyler was discussing.
The unified Delta kernel: what it takes to provide a single kernel for both the Java and Rust ecosystems. Nick dove into the new changes in the Java FFM to discuss how foreign functions and memory change native function calling in a world that was JNI-only for years. This makes it easier to support calling Java functions from Rust and enables the round trip between delta-kernel-rs and the calling engine.

Thanks again for coming out. As always, pop over to https://delta.io/community/ to get involved.

---

If you want to test out catalog managed tables with Apache Spark, please take a look at our unitycatalog-playground project.

Also, if you want to learn more about delta-rs, take a look at this video: https://www.youtube.com/watch?v=i_jwF2sLRFs from Robert Pack and Zach Schuerman.

youtube.com

u/PrideDense2206 — 2 months ago

▲ 1 r/Saros

Halcyon Drop Rate?

I’m wondering if people are finding themselves playing through multiple biomes with maybe one drop or less. Is there a trick to getting it to drop more, or is this more of a grind-it-out thing?

reddit.com

u/PrideDense2206 — 2 months ago

▲ 7 r/DeltaLake+1 crossposts

Delta Lake Community Meetup

Hey everyone, we’ve got another exciting community meetup coming up next week. We’ve got all the details and an area for discussions or questions/answers on GitHub (link below).

See you next week!

https://github.com/delta-io/delta/discussions/6705

luma.com

u/PrideDense2206 — 2 months ago

▲ 9 r/DeltaLake

Delta Grows Up: Writes, Unity Catalog and Time Travel

TL;DR: DuckDB's Delta and Unity Catalog extensions shed their experimental tags — now with writes, Unity Catalog and time travel support.

duckdb.org

u/PrideDense2206 — 2 months ago

▲ 11 r/Saros

This trophy is a nod to Expedition 33

Cause I love that game too.

u/PrideDense2206 — 3 months ago

▲ 6 r/DeltaLake

What are Delta Lake Catalog-Managed Tables?

The next evolution of Delta: Catalog-Managed Tables

The data ecosystem is moving toward a catalog-centric model for managing open table formats. As open catalogs gain adoption, the catalog has emerged as the system of record for table identity, discovery, and authorization.

With Delta Lake 4.1.0, Delta introduces catalog-managed tables, which establish the catalog as the coordinator of table access and source of truth for table state. This simplifies how tables are discovered and secured, enables consistent governance across engines, and unlocks faster performance. The design also aligns Delta with the catalog-managed model pioneered by Iceberg, creating a shared foundation for interoperable, high-performance lakehouse tables.

Unity Catalog is the first open lakehouse catalog to support catalog-managed tables, extending unified governance across any format.

What are Catalog-Managed Tables?

Catalog-managed tables are tables for which the catalog brokers table access as well as stores the table’s latest metadata and commits. Clients reference the table by name, not by path, and use the catalog to resolve the table’s storage location. The catalog also manages concurrency control for proposed writes to a table. Writers leverage the catalog, not object store APIs, for atomic commits.

For more details, see the Delta protocol RFC on GitHub here. See how Unity Catalog implements support for the Catalog-Managed Tables specification here.

Before: Challenges with Delta tables that were managed by the filesystem

Before catalog-managed tables, the filesystem – not the catalog – was the primary authority for table access and changes to table state.

To access filesystem-managed Delta tables, Delta clients first look at the transaction log (_delta_log) stored with the table to determine the latest version. Clients then reconstruct the current state of the table by replaying the log entries, which describe the table’s schema and data files that belong to the table. Once the table state is known, the system reads the relevant data files to answer the query. When writing to the table, clients write new data files to storage and then atomically commit a new transaction log entry via filesystem APIs to advance the table to a new version.
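The read path described above can be sketched in a few lines. This is a simplified illustration, not a real Delta client: `replay_log` and `write_commit` are hypothetical helpers, checkpoints are ignored, and only `add` and `remove` actions are handled. It relies on commits being newline-delimited JSON files named with a zero-padded version number.

```python
import json
import tempfile
from pathlib import Path


def replay_log(log_dir: Path):
    """Return (latest_version, live_files) by replaying JSON commits in version order."""
    live = set()
    version = -1
    # Zero-padded file names sort lexicographically in version order.
    commits = sorted(p for p in log_dir.glob("*.json") if p.stem.isdigit())
    for commit in commits:
        version = int(commit.stem)
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return version, live


def write_commit(log_dir: Path, version: int, actions: list) -> None:
    # Test helper: one JSON action per line.
    body = "\n".join(json.dumps(a) for a in actions)
    (log_dir / f"{version:020d}.json").write_text(body)


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp) / "_delta_log"
    log_dir.mkdir()
    write_commit(log_dir, 0, [{"add": {"path": "part-0.parquet"}}])
    write_commit(
        log_dir,
        1,
        [{"add": {"path": "part-1.parquet"}}, {"remove": {"path": "part-0.parquet"}}],
    )
    latest_version, live_files = replay_log(log_dir)

print(latest_version, live_files)
```

Even in this toy form, every read starts with a directory listing plus one file read per commit, which is the round-trip cost the post refers to.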

Historically, data teams have faced the following challenges with filesystem-managed Delta tables:

Brittle path-based access: Delta clients have to know the exact path of the filesystem-managed table they are accessing, and credentials have to be provisioned directly by the storage system. This tightly couples applications to physical storage locations, so routine changes like table relocation, storage reorganization, or credential rotation could break pipelines and queries.
Risky coarse-grained authorization: Filesystems lack fine-grained access control, so complying with data privacy requirements often requires splitting datasets across multiple tables or storage paths to isolate sensitive fields or records. This leads to duplicated data, fragmented governance, and fragile pipelines.
Unsafe schema changes: Path-based writes can modify table schemas or metadata without validation, potentially introducing incompatible changes that break downstream workloads. This occurs because storage credentials cannot distinguish between clients authorized to write data and those authorized to modify table metadata.
Bottlenecked performance: Replaying the Delta transaction log to resolve a table’s latest state requires multiple calls to the filesystem, which can add 100+ ms to query execution.

Now: Catalog-Managed Delta Tables address these challenges

Catalog-managed tables address the governance and performance challenges by involving the catalog in read, write, and authorization coordination. This way, teams can unlock:

Standardized table discovery: The catalog provides stable logical table identifiers (such as Unity Catalog’s three-level namespace), eliminating the need for clients to depend on physical storage paths for discovery.
Unified governance: The catalog is responsible for granting clients access to data, rather than teams needing to manage fragmented access policies across their storage systems. This dramatically simplifies how data teams ensure engines access their data in a governed manner.
Enforceable constraints: The catalog can authoritatively validate or reject schema and constraint changes, preventing incompatible updates that could compromise data integrity or break downstream workloads.
Faster query planning and faster writes: If a Delta client is trying to access a table, the catalog can directly inform it of the table-level metadata. This skips cloud storage entirely and removes a major source of metadata latency. This feature also opens the door for “inline commits” where the (metadata) content of the commit is sent directly to the catalog.

Catalog-managed Delta tables dramatically simplify how engines discover and access data under consistent governance, all while improving read and write performance. Table state updates are flushed to the filesystem, reinforcing Delta’s openness and portability.

How do Catalog-Managed Tables work?

The Catalog-Managed Tables Delta feature fundamentally changes how Delta tables are discovered, read, and committed to.

Table Discovery

For catalog-managed tables, Delta tables are discovered and accessed through the catalog, not by filesystem paths. Engines must first resolve a table by name via the catalog, establishing table identity, location, and access credentials. This resolution step occurs before the Delta client interacts with the filesystem and determines the rules the client must follow for subsequent reads and writes.

Reads

A catalog-managed table may have commits that have been ratified by the catalog but not yet flushed, or “published”, to the filesystem. Reads therefore begin by getting these latest commits from the catalog, typically via a get_catalog_commits API exposed by the catalog.

If additional history is required, such as older published commits or checkpoints, Delta clients can LIST the filesystem and merge those published commits with the catalog-provided commits to construct a complete snapshot. This split view allows catalogs to always provide the most recent table state while offloading long-term commit storage to the filesystem.
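A rough sketch of that merge step, under stated assumptions: `build_commit_list` is a hypothetical helper, both inputs are plain dictionaries mapping version to file path, and when a version appears in both places the two copies are assumed to be equivalent, so the catalog entry is simply kept.

```python
def build_commit_list(published, catalog_commits):
    """Merge published commits (from a filesystem LIST) with catalog-provided commits.

    Both arguments map version -> path. The result is a contiguous,
    version-ordered list of (version, path) pairs.
    """
    merged = dict(published)
    # Assumption: on overlap the contents are equivalent; keep the catalog's entry.
    merged.update(catalog_commits)
    versions = sorted(merged)
    for prev, cur in zip(versions, versions[1:]):
        if cur != prev + 1:
            raise ValueError(f"gap in commit history between {prev} and {cur}")
    return [(v, merged[v]) for v in versions]


# Hypothetical paths, for illustration only.
published = {v: f"_delta_log/{v:020d}.json" for v in range(3)}
catalog_commits = {2: "staged-2.json", 3: "staged-3.json"}
snapshot_commits = build_commit_list(published, catalog_commits)
print(snapshot_commits)
```

The contiguity check matters because a snapshot built over a missing version would silently drop that commit's changes.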

Writes

Previously, writing to a Delta table involved calling filesystem “PUT-if-absent” APIs to perform atomic writes with mutual exclusion. In this model, the filesystem determined which writes win. While simple and scalable, this approach treated commits as opaque blobs: the filesystem could not inspect commit contents, enforce constraints, or coordinate writes across tables.
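As a small illustration of "PUT-if-absent", the sketch below uses Python's exclusive-create file mode on a local directory as a stand-in for an object store's conditional write. `try_commit` is a hypothetical helper, not part of any Delta library.

```python
import tempfile
from pathlib import Path


def try_commit(log_dir: Path, version: int, payload: str) -> bool:
    """Attempt to create the commit file for `version`; fail if it already exists."""
    target = log_dir / f"{version:020d}.json"
    try:
        # Mode "x" creates the file only if it does not exist yet.
        with open(target, "x") as fh:
            fh.write(payload)
        return True
    except FileExistsError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    first = try_commit(log_dir, 0, '{"commitInfo": {"writer": "A"}}')
    second = try_commit(log_dir, 0, '{"commitInfo": {"writer": "B"}}')
    winner = (log_dir / f"{0:020d}.json").read_text()

print(first, second, winner)
```

The first writer wins and the second must retry at the next version. Note that the storage layer never looks inside the payload, which is the "opaque blob" limitation described above.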

For catalog-managed tables, clients propose commits to the catalog, typically by first staging commits in the filesystem’s <table_path>/_delta_log/_staged_commits directory and then requesting ratification. Staging ensures that readers never observe unapproved commits. The protocol also allows for “inline” commits, where the contents of the commit are sent directly to the catalog, skipping the 100ms+ filesystem write. Staged commits are still performed using optimistic concurrency control to provide transactional guarantees.

Catalogs can also define their own commit APIs, allowing them to accept richer commit payloads, inspect actions and metadata, enforce constraints, and apply catalog-level policies before ratifying a commit.

To unburden catalogs from having to store these ratified commits indefinitely, ratified commits can be periodically “published” to the _delta_log in the filesystem. Once published, catalogs no longer need to retain or serve those commits because clients can easily discover them by listing.
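The propose, ratify, and publish flow can be modeled with a toy in-memory catalog. This is only a sketch under loud assumptions: `ToyCatalog`, `ratify`, and `publish` are hypothetical names and do not correspond to Unity Catalog's or any other catalog's API, and the staged paths are placeholders.

```python
class ToyCatalog:
    """In-memory stand-in for a catalog that ratifies proposed commits."""

    def __init__(self):
        self.latest_version = -1
        self.ratified = {}  # version -> staged path, ratified but not yet published

    def ratify(self, version, staged_path):
        # Optimistic concurrency: only the next expected version can be ratified.
        if version != self.latest_version + 1:
            return False
        self.ratified[version] = staged_path
        self.latest_version = version
        return True

    def publish(self, up_to):
        # After publishing, the catalog no longer needs to retain these commits.
        published = sorted(v for v in self.ratified if v <= up_to)
        for v in published:
            del self.ratified[v]
        return published


catalog = ToyCatalog()
a_ok = catalog.ratify(0, "_staged_commits/0.writer-a.json")
b_ok = catalog.ratify(0, "_staged_commits/0.writer-b.json")  # loses the race
b_retry = catalog.ratify(1, "_staged_commits/1.writer-b.json")
published = catalog.publish(up_to=0)
print(a_ok, b_ok, b_retry, published, catalog.ratified)
```

Writer B's first proposal is rejected because version 0 was already taken, so it retries at version 1. A real catalog could also inspect the commit contents at the `ratify` step before accepting it.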

Evolving open table formats

Catalog-managed Delta tables represent a critical convergence between how data is stored and how it is governed. Open table formats and open catalogs are evolving together so that governance becomes a native property of the table itself rather than an external overlay.

As an added benefit, Delta’s new catalog-oriented design closely resembles that of Iceberg tables. Ultimately, this makes it simpler for practitioners to discover and govern data consistently, regardless of table format.

We are excited to continue collaborating with the ecosystem to evolve Delta with open catalogs so that they deliver performant commits, efficient metadata management, multi-engine interoperability, and unified governance.

>You can read the original blog post over on delta.io

To learn more about Catalog Managed Tables / Catalog Commits, check out our video on YouTube.

u/PrideDense2206 — 3 months ago

▲ 6 r/Silksong

If you are a completionist, then your Silksong journey is going to take you to some interesting places. The First Sinner is hands down one of my favorite boss fights. It leaves no room for error and she even heals if you let her.

Anyone else have a great time in this fight?

u/PrideDense2206 — 3 months ago

▲ 8 r/DeltaLake+1 crossposts

The Delta Lake 4 journey has marked a shift from the file system to the catalog. Each release has deepened support for catalog-managed tables and extended that design philosophy across the Delta ecosystem. Delta Lake 4.2 advances on two fronts: Kernel expands outward with a new Apache Flink connector, streaming improvements, and broader data type support. Catalog-managed tables also mature with atomic operations, schema evolution from SQL, and synchronous UniForm.

New kernel-based Apache Flink Connector

-- Create the clickstream landing table as a Unity Catalog managed table
CREATE TEMPORARY TABLE clickstream_raw (
  event_date STRING,
  event_type STRING,
  user_id STRING
) WITH (
  'connector' = 'delta',
  'table_name' = 'clickstream_raw',
  'unitycatalog.name' = 'prod',
  'unitycatalog.endpoint' = '<endpoint>',
  'unitycatalog.token' = '<token>',
  'partitions' = 'event_date',
  'uid' = 'clickstream-ingest'
);

-- Stream events into the table via Flink SQL
INSERT INTO clickstream_raw VALUES
  ('2026-04-20', 'click',    'user_1'),
  ('2026-04-20', 'purchase', 'user_2'),
  ('2026-04-22', 'click',    'user_4');

Simplified Schema Evolution

INSERT INTO … BY NAME now supports automatic schema evolution when autoMerge is enabled, adding new columns to the table schema as part of the commit. For SQL-first teams, this removes one of the last reasons to drop into a DataFrame notebook just to evolve a schema.

SET spark.databricks.delta.schema.autoMerge.enabled = true;

INSERT INTO prod.consumer.clickstream BY NAME
SELECT event_date, event_type, user_id, device_type
FROM prod.consumer.clickstream_raw
WHERE event_date = '2026-04-23';

Data Type Support

In Delta Kernel, we add support for geospatial, collation, and variant types. Here’s how a clickstream pipeline can push event-specific properties into a single Variant payload:

CREATE TABLE prod.consumer.clickstream_v2 (
  event_date DATE,
  event_type STRING,
  user_id STRING,
  device_type STRING,
  properties VARIANT
)
USING DELTA
PARTITIONED BY (event_date);

INSERT INTO prod.consumer.clickstream_v2 BY NAME
SELECT event_date, event_type, user_id, device_type,
       parse_json(raw_properties) AS properties
FROM prod.consumer.clickstream_raw WHERE event_date = '2026-04-24';

These are just a few highlights from the release. For the full blog post, take a look at https://delta.io/blog/2026-04-17-delta-4-2-released/.

u/PrideDense2206 — 3 months ago

▲ 2 r/DeltaLake

Join us for the first Delta Lake Community Meetup of 2026 on Tuesday, April 21 at 9 AM PT! 🚀

We’re bringing the community together for a deep dive into the ecosystem, infrastructure enhancements, and the future project roadmap. Come get your technical questions answered live by the maintainers.

What we'll cover:

🔹 Latest Delta Lake updates and how the community is evolving

🔹 A technical look at infrastructure enhancements

🔹 The future of Delta Lake: Roadmap insights and a deep dive into Iceberg v4 compatible metadata

🔹 Live Q&A with the community

RSVP 👇

https://luma.com/deltalake-0426

u/PrideDense2206 — 3 months ago