
Learning Apache Spark 2

Apache Spark provides a number of classification and regression algorithms. The main algorithms are listed as follows.
In machine learning and statistics, classification is the problem of identifying which of a set of categories (sub-populations) a new observation belongs to, on the basis of a training set of observations (or instances) whose category membership is known. In classification, the dependent variable is typically categorical. A very common example is classifying e-mail as spam or not spam. The major algorithms that come with Spark include the following:
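Before turning to specific algorithms, the definition of classification above can be made concrete with a small sketch. The example below is plain Python using only the standard library; it does not use Spark's API, and the function names (`train_naive_bayes`, `classify`) and the four training messages are invented for illustration. It uses a simple multinomial naive Bayes classifier with add-one smoothing, chosen only because it is short enough to read in full.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability for the given text."""
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        # Log prior: how common this category is in the training set.
        score = math.log(count / total_examples)
        # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training set: observations whose category is already known.
training_set = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]

model = train_naive_bayes(training_set)
prediction = classify(model, "claim your free prize")
print(prediction)
```

The training set plays the role of the observations with known category membership, and `classify` assigns a new observation to one of the two categories. A distributed implementation would follow the same train-then-predict pattern over a much larger data set.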
In machine learning and statistics, regression is the process of estimating or predicting a response using a model trained on previous data sets....
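As a minimal illustration of the idea of regression, the sketch below fits a straight line to a handful of past observations by ordinary least squares and then predicts the response for a new input. It is plain Python rather than Spark code; the helper names (`fit_line`, `predict`) and the data points are assumptions made up for this example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(line, x):
    """Estimate the response for a new input using the fitted line."""
    slope, intercept = line
    return slope * x + intercept


# Hypothetical previous data: the response happens to follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

line = fit_line(xs, ys)
estimate = predict(line, 5.0)
print(line, estimate)
```

Unlike the classification case, the predicted response here is a continuous number rather than a category.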