Support Vector Classifier (SVC) is a form of Support Vector Machine (SVM) that learns to categorize inputs through supervised training. This blog post discusses how to implement an SVC in Python using the scikit-learn module. We will walk through a simple example below.
To start off, import the necessary support modules: numpy, SVC from sklearn.svm, and joblib (bundled with older scikit-learn releases as sklearn.externals.joblib, and available as a standalone package in current ones).
#!/usr/bin/python
import numpy as np
from sklearn.svm import SVC
# sklearn.externals.joblib was removed in scikit-learn 0.23; import joblib directly
import joblib
Next we create two arrays, X and y. X is a list of feature vectors, each of which is itself a list. In the example below, each feature vector has two elements, and there are 8 sample feature vectors. y is the label vector: each element is the label you want the SVC to associate with the corresponding input. In the example below, the feature vector [-2, -2] will be associated with label ‘d’ (the fifth element of y).
# X is a list of feature vectors.
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1], [-2, -2], [8, 4], [5, 3], [4, 8]])
# y is a list of labels. A label can be an integer or a string.
y = np.array(['a', 'a', 'c', 'b', 'd', 'c', 'd', 'e'])
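As a quick sanity check on the pairing between X and y, the sketch below confirms that the two arrays line up row for row and looks up the label stored for [-2, -2]. The names `row` and `label` are illustrative only and are not part of the original script.

```python
import numpy as np

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1],
              [-2, -2], [8, 4], [5, 3], [4, 8]])
y = np.array(['a', 'a', 'c', 'b', 'd', 'c', 'd', 'e'])

# One label per feature vector: 8 rows in X, 8 entries in y.
assert X.shape == (8, 2)
assert y.shape == (8,)

# Locate the row equal to [-2, -2] and read the label at the same index.
row = int(np.flatnonzero((X == [-2, -2]).all(axis=1))[0])
label = y[row]
print(row, label)  # prints: 4 d  (index 4 is the fifth element)
```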
Now we can create an SVC and train it with X and y:
# clf is the SVM classifier
clf = SVC()
# Train the SVC
clf.fit(X, y)
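After fitting, the classifier exposes a few attributes that help confirm training did what we expect. The following is a minimal sketch that repeats the setup so it runs on its own; it inspects `classes_` (the distinct labels seen during training) and `support_vectors_` (the training points the model retained). The variable names `classes` and `n_sv` are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1],
              [-2, -2], [8, 4], [5, 3], [4, 8]])
y = np.array(['a', 'a', 'c', 'b', 'd', 'c', 'd', 'e'])

clf = SVC()
clf.fit(X, y)

# Distinct labels learned from y, in sorted order.
classes = clf.classes_.tolist()
print(classes)

# Number of training points kept as support vectors (at most 8 here),
# each with the same two features as the rows of X.
n_sv = clf.support_vectors_.shape[0]
print(n_sv, clf.support_vectors_.shape[1])
```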
To test the classifier’s ability to generalize, we pick a feature vector that is very close to one of the feature vectors used for training. We do this because we know what the right answer ought to be. For instance, since we know [-2, -2] is associated with label ‘d’, we would expect [-1.8, -2.1] to be assigned label ‘d’ as well.
# Print the prediction - note [-1.8, -2.1] is very close to [-2, -2],
# which maps to 'd' positionally (5th position), so 'd' should be the output.
# predict() expects a 2D array (one row per sample), hence the double brackets.
print(clf.predict([[-1.8, -2.1]]))
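Current scikit-learn releases require the input to predict() to be two-dimensional, with one row per sample. The sketch below shows two ways to satisfy that: reshaping a single vector, and passing several vectors at once. The second query point, [7.5, 4.2], is made up for illustration, and no particular labels are claimed for the output since they depend on the SVC defaults of the installed version.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1],
              [-2, -2], [8, 4], [5, 3], [4, 8]])
y = np.array(['a', 'a', 'c', 'b', 'd', 'c', 'd', 'e'])
clf = SVC().fit(X, y)

# A single sample: reshape the 1D vector into shape (1, 2).
query = np.array([-1.8, -2.1])
single = clf.predict(query.reshape(1, -1))

# Several samples: stack them as rows; one label is returned per row.
batch = np.array([[-1.8, -2.1], [7.5, 4.2]])  # second point is hypothetical
many = clf.predict(batch)

print(single)
print(many)
```

Passing the bare list [-1.8, -2.1] would be read as a 1D array and rejected, which is why the reshape (or the extra pair of brackets) is needed.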
Finally, after the SVC is trained, we want the learning to persist by saving the SVC to a file. This is done using joblib:
# save file as binary
joblib.dump(clf, "mysvm_save.pkl", compress=9)
To test the saved file, load it into another variable, clf2, and test it:
# reload saved file and run the svm
clf2 = joblib.load("mysvm_save.pkl")
print(clf2.predict([[-1.8, -2.1]]))  # 'd' should be the output
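A slightly stronger check than eyeballing one prediction is to confirm that the reloaded classifier agrees with the original on several inputs. The following sketch assumes joblib is installed as a standalone package and writes to a temporary directory so it leaves no file behind; the `queries` array and the `same` flag are illustrative names.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.svm import SVC

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1],
              [-2, -2], [8, 4], [5, 3], [4, 8]])
y = np.array(['a', 'a', 'c', 'b', 'd', 'c', 'd', 'e'])
clf = SVC().fit(X, y)

# Reuse the training points plus the test vector as query inputs.
queries = np.vstack([X, [[-1.8, -2.1]]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mysvm_save.pkl")
    joblib.dump(clf, path, compress=9)  # same call as in the post
    clf2 = joblib.load(path)

# The reloaded model should reproduce the original's predictions.
same = bool(np.array_equal(clf.predict(queries), clf2.predict(queries)))
print(same)
```

Note that a file saved this way is a pickle, so it should only be loaded from a trusted source and ideally with the same scikit-learn version that wrote it.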
Now run the code:
$ python sci2.py
['d']
['d']
Sure enough, the results are as expected.