Lakehouse Example with Apache Spark, Minio, Nessie Catalog, Iceberg and Docker

Lakehouse solutions combine the best aspects of the data warehouse and the data lake, offering the comfort of a relational database on big data, and they are becoming more common every day. Today we will build a simple lakehouse example on Docker using entirely open-source components. 1. Components That Make Up the Infrastructure […]

What is Kafka Connect?

Kafka Connect is an integral part of Apache Kafka that integrates other systems with Kafka. For example, it can transfer changes from a database (source) to Kafka and write them from there to another data storage system (sink), giving other applications and services (e.g. a dashboard) access to real-time data. Kafka Connect provides […]
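
The source-to-Kafka-to-sink flow described above can be sketched as a pair of connector definitions. This is a minimal illustration, not the setup from the article: it uses the FileStream example connectors that ship with Apache Kafka as stand-ins for a database source and a storage sink, and the connector names, file paths, topic name and worker address are all assumptions chosen for the example. The payload shape (`name` plus `config`) is what the Kafka Connect REST API accepts on `POST /connectors`.

```python
import json
import urllib.request

# Assumed address of a Kafka Connect worker's REST API (8083 is the default port).
CONNECT_URL = "http://localhost:8083/connectors"


def build_connector_payload(name, connector_class, settings):
    """Build the JSON body expected by POST /connectors."""
    config = {"connector.class": connector_class, "tasks.max": "1"}
    config.update(settings)
    return {"name": name, "config": config}


def submit(payload, url=CONNECT_URL):
    """Register a connector; requires a running Kafka Connect worker."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Source: reads lines from a file and publishes them to a topic
# (a stand-in for a database change feed; path and topic are hypothetical).
source = build_connector_payload(
    "demo-file-source",
    "org.apache.kafka.connect.file.FileStreamSourceConnector",
    {"file": "/tmp/input.txt", "topic": "demo-topic"},
)

# Sink: consumes the same topic and writes records to another file
# (a stand-in for the downstream storage system).
sink = build_connector_payload(
    "demo-file-sink",
    "org.apache.kafka.connect.file.FileStreamSinkConnector",
    {"file": "/tmp/output.txt", "topics": "demo-topic"},
)

if __name__ == "__main__":
    # Print the payloads instead of submitting them, so the sketch runs offline.
    print(json.dumps(source, indent=2))
    print(json.dumps(sink, indent=2))
```

With a worker running, calling `submit(source)` and then `submit(sink)` would register both connectors. Note that the source connector takes a single `topic` while the sink takes `topics`, since a sink may read from several topics at once.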