Python + Upsolver: Simplified Realtime Data Workflows


One of the powerful things about Python is its ability to connect disparate tools into one integrated development experience. In this talk, we'll explore how to create and run a near real-time pipeline that consumes events from a Kafka topic and transforms the data before landing it in the lake, using Upsolver through its Python SDK. This way we get exactly-once processing, strong ordering, and automatic schema evolution out of the box thanks to the Upsolver engine, but without switching to a different UI or building in SQL alone.
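To make the pipeline pattern described above concrete, here is a minimal conceptual sketch in plain Python: consume a batch of events, enforce exactly-once processing via deduplication, preserve ordering, transform, and land the results in a stand-in "lake". The event shape, the `transform` function, and the in-memory storage are all illustrative assumptions; this is not Upsolver's actual SDK, whose engine handles these guarantees for you.

```python
# Conceptual sketch of the consume -> transform -> land pattern.
# All names and data here are illustrative stand-ins, NOT the Upsolver SDK.

def transform(event):
    """Hypothetical transformation: double the value field."""
    return {"id": event["id"], "ts": event["ts"], "value": event["value"] * 2}

def run_pipeline(events):
    seen = set()   # ids already processed (exactly-once semantics)
    lake = []      # in-memory stand-in for lake storage
    # Sort by timestamp to model strong ordering guarantees.
    for event in sorted(events, key=lambda e: e["ts"]):
        if event["id"] in seen:
            continue  # skip duplicate deliveries
        seen.add(event["id"])
        lake.append(transform(event))
    return lake

# Duplicate and out-of-order events, as a Kafka consumer might see them.
events = [
    {"id": 2, "ts": 20, "value": 5},
    {"id": 1, "ts": 10, "value": 3},
    {"id": 2, "ts": 20, "value": 5},  # duplicate delivery
]
print(run_pipeline(events))
```

In the real pipeline, the deduplication, ordering, and schema handling are the engine's responsibility; the sketch only shows why those guarantees matter when raw Kafka delivery can be duplicated or reordered.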

Presenter: Santona Tuli

Santona Tuli, PhD, began her data journey in fundamental physics, searching through massive event data from particle collisions at CERN to detect rare particles. She then extended her machine learning engineering into natural language processing before shifting focus to product and data engineering for data workflow authoring frameworks. As a Python engineer, she started with the programmatic data orchestration tool Airflow, helping improve its developer experience for data science and machine learning pipelines. Currently at Upsolver, she leads data engineering and science, driving developer research and engagement for the declarative SQL workflow authoring framework. Dr. Tuli is passionate about building end-to-end data and ML pipelines at scale, and about empowering others to do the same.

Note: The microphone wasn't in the best position to record the speaker. I've done my best to amplify the audio, but the result includes some static noise in the background.
