Linear Regression using Apache Spark ML vs scikit-learn

Matthew Salminen
7 min read · Nov 12, 2023
Image Source: https://www.analyticsvidhya.com/blog/2022/05/an-end-to-end-guide-on-ml-pipeline-using-apache-spark-in-python/

In my last article, I did my best to predict, using machine learning, when the two-hour marathon barrier would be broken. I am not an ML expert by any means, but I wanted to dig as deep as I could into linear, log, and polynomial regression. Although I was able to build a model with the popular scikit-learn library, my prediction didn’t seem to support my…


Matthew Salminen

Marathoner | Trail Runner | Data Engineer | living in Irvine, CA