The training points X_i for which λ_i > 0 are called the support vectors. In practice, we frequently encounter training data sets for which the support vectors are a small minority of the training points, especially when the number of training points is much larger than the number of features (i.e., the dimension of the feature space). Explain why the support vectors are the only training points needed to evaluate the decision rule. Then explain why the non-support vectors nonetheless still have some influence on the decision rule: what is the nature of that influence?
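As a numerical sketch of the first part of the question (using scikit-learn's `SVC`, which is an assumption not part of the original problem): the fitted model stores λ_i·y_i only for the support vectors in `dual_coef_`, and the decision function recomputed from the support vectors alone matches the full decision function, since every non-support vector has λ_i = 0 and drops out of the sum f(x) = Σ_i λ_i y_i ⟨X_i, x⟩ + b.

```python
import numpy as np
from sklearn.svm import SVC

# Toy 2-D data: two well-separated clusters labeled -1 and +1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# dual_coef_ holds lambda_i * y_i for the support vectors only;
# training points with lambda_i = 0 are simply absent from it.
f_manual = clf.dual_coef_ @ (clf.support_vectors_ @ X.T) + clf.intercept_
f_sklearn = clf.decision_function(X)

print(np.allclose(f_manual.ravel(), f_sklearn))
print(clf.support_vectors_.shape[0], "support vectors out of", len(X))
```

Running this also shows the phenomenon the problem mentions: the support vectors are a small minority of the 40 training points.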