What assumptions should we make while building this data pipeline? What is the schema of the data? Is it the same as the one in the “Train” dataset?
Do we read from a Kafka stream, or do we push to it?
Can anyone please shed some light on this?