How will ML models be developed and deployed?

As ML models are being developed to cover the entire dataset stored in a big data architecture, I have a few questions.

  1. How do we analyze the whole dataset, and which tools and techniques could be used?
    Since Python/R are limited to analyzing data that fits on a single standalone machine, how could we analyze the entire dataset stored in Hadoop clusters?
  2. How do we develop and train ML models in a big data environment?
    Should we build PySpark/Python/R models on a single standalone machine and deploy them to production?
    Or is there another way to address this scenario?