PyIceberg 0.2.1: Iceberg ❤️ PyArrow & DuckDB

1 year ago

In this video, we demonstrate the new features of PyIceberg 0.2.1. For the demo, we use the docker-spark-iceberg setup that's available here: https://github.com/tabular-io/docker-spark-iceberg
After spinning up the docker-compose setup, the Jupyter notebook will be available at http://localhost:8888/

The notebook "PyIceberg - Getting Started.ipynb" walks you through reading data from an Iceberg table into PyArrow and then into Pandas. The last part demonstrates how to query the Pandas dataset using DuckDB.
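
As a rough idea of what that flow looks like in code, here is a minimal sketch. It is not the notebook's actual content: the catalog name "default", the table name "nyc.taxis", and the column "passenger_count" are assumptions for illustration, and the catalog connection settings are assumed to be configured already (for example in ~/.pyiceberg.yaml or through environment variables).

```python
# Sketch: Iceberg table -> PyArrow -> Pandas -> DuckDB.
# Assumed names (not taken from the video): the catalog "default",
# the table "nyc.taxis", and the column "passenger_count".
CATALOG_NAME = "default"
TABLE_NAME = "nyc.taxis"
QUERY = (
    "SELECT passenger_count, COUNT(*) AS trips "
    "FROM df GROUP BY passenger_count ORDER BY passenger_count"
)


def read_iceberg_as_arrow(catalog_name=CATALOG_NAME, table_name=TABLE_NAME):
    """Scan an Iceberg table and materialize the result as a PyArrow table."""
    from pyiceberg.catalog import load_catalog

    # Connection details are resolved from the PyIceberg configuration.
    catalog = load_catalog(catalog_name)
    table = catalog.load_table(table_name)
    return table.scan().to_arrow()


def query_with_duckdb(arrow_table, sql=QUERY):
    """Convert a PyArrow table to Pandas and query it with DuckDB."""
    import duckdb

    df = arrow_table.to_pandas()
    con = duckdb.connect()    # in-memory DuckDB database
    con.register("df", df)    # expose the DataFrame under the name "df"
    return con.execute(sql).fetchall()
```

With a running catalog, calling query_with_duckdb(read_iceberg_as_arrow()) would return one (passenger_count, trips) row per distinct passenger count. The imports sit inside the functions so that each step only needs the library it actually uses.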

For a complete overview of all the installation options, please refer to the documentation: https://py.iceberg.apache.org/

If you have any questions, please reach out on the Iceberg Slack: https://iceberg.apache.org/community/ or open an issue or pull request on GitHub: https://github.com/apache/iceberg

#iceberg #python #pyarrow #duckdb #tabular #datalake
