Platform engineering is the discipline of designing and building toolchains and workflows that enable self-service capabilities for software engineering organizations in the cloud-native era.

Platform Engineering

We built a *real-time lakehouse* using *S3 Tables* + *AWS Glue* + *Apache Doris* (<https://doris.apache.org/>) to query Iceberg data in S3 directly: no data copies, no endless ETL jobs, no waiting.

:bricks: Stack:
:one: S3 Tables store data in the Iceberg format, bringing ACID, schema evolution, and time travel to S3 buckets.

:two: AWS Glue Data Catalog serves as the metadata layer, keeping track of table schemas, snapshots, and partitions.

:three: Apache Doris (via VeloDB Cloud) as the compute engine. Doris connects to Glue and queries S3 Tables directly, delivering sub-second analytics and high concurrency, all without data movement.
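
For anyone who wants to see roughly what the three layers look like from the Doris side, here is a minimal sketch in Python that only builds the SQL strings (it does not connect to anything). The catalog, database, and table names are hypothetical, and the property keys follow the Iceberg-on-Glue catalog form as I understand it from the Doris docs. S3 Tables setups may need different properties, so treat the blog post as the source of truth.

```python
def create_catalog_sql(name: str, region: str, access_key: str, secret_key: str) -> str:
    """Build a Doris CREATE CATALOG statement for an Iceberg catalog backed by AWS Glue.

    Assumption: property keys match the Doris Iceberg/Glue catalog docs;
    verify them against your Doris or VeloDB Cloud version.
    """
    props = {
        "type": "iceberg",
        "iceberg.catalog.type": "glue",
        "glue.endpoint": f"https://glue.{region}.amazonaws.com",
        "glue.access_key": access_key,
        "glue.secret_key": secret_key,
    }
    body = ",\n".join(f'    "{key}" = "{value}"' for key, value in props.items())
    return f"CREATE CATALOG {name} PROPERTIES (\n{body}\n);"


def count_rows_sql(catalog: str, database: str, table: str) -> str:
    """Query the Iceberg table in place through the catalog (no data copy)."""
    return f"SELECT COUNT(*) FROM {catalog}.{database}.{table};"


def time_travel_sql(catalog: str, database: str, table: str, as_of: str) -> str:
    """Read the table as of a past timestamp, relying on Iceberg snapshots."""
    return f'SELECT * FROM {catalog}.{database}.{table} FOR TIME AS OF "{as_of}";'


if __name__ == "__main__":
    # Hypothetical names and placeholder credentials, for illustration only.
    print(create_catalog_sql("glue_lake", "us-east-1", "<access_key>", "<secret_key>"))
    print(count_rows_sql("glue_lake", "sales", "orders"))
    print(time_travel_sql("glue_lake", "sales", "orders", "2024-01-01 00:00:00"))
```

Doris speaks the MySQL protocol, so the generated statements can be pasted into any MySQL-compatible client connected to the cluster.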

P.S. Doris can be both a query engine on top of table formats and a real-time data warehouse when you need to materialize and accelerate results.
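
One simple way to picture the "materialize" part is a CREATE TABLE AS SELECT from the lakehouse catalog into Doris's own `internal` catalog. The sketch below only builds that statement as a string. All table and column names are hypothetical, and a real setup might prefer Doris materialized views or explicit distribution and replication properties.

```python
def materialize_sql(target_table: str, source_table: str, select_list: str, group_by: str) -> str:
    """Build a CTAS statement that copies an aggregated result into a Doris internal table.

    Assumption: target_table lives in the built-in `internal` catalog and
    source_table is a fully qualified catalog.database.table name.
    """
    return (
        f"CREATE TABLE {target_table} AS\n"
        f"SELECT {select_list}\n"
        f"FROM {source_table}\n"
        f"GROUP BY {group_by};"
    )


if __name__ == "__main__":
    # Hypothetical daily revenue rollup over an Iceberg table exposed via Glue.
    print(
        materialize_sql(
            target_table="internal.analytics.daily_revenue",
            source_table="glue_lake.sales.orders",
            select_list="order_date, SUM(amount) AS revenue",
            group_by="order_date",
        )
    )
```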

This pattern also applies to many other open-source combinations: table formats like Iceberg and Paimon, catalogs like Unity, Polaris, and Gravitino, and query engines like Spark, Flink, and Trino.
Full demo steps are in the blog post: <https://www.velodb.io/blog/real-time-lakehouse-s3-tables-aws-glue-and-apache-doris>