Month: February 2019
2 posts
How to install Spark (PySpark) on Windows
Apache Spark is a general-purpose cluster computing engine aimed mainly at distributed data processing. In this tutorial, we walk you through the step-by-step process of setting up Apache Spark on Windows.
A look into ETA Problem using Regression in Python – Machine Learning
The term ETA here refers to the estimated completion time of a computational process in general. In particular, this problem concerns estimating the completion time of a batch of long scripts running in parallel.