In [1]:
from lab_tools import CIFAR10, get_hog_image
dataset = CIFAR10('../../extern_data/CIFAR10/')
Pre-loading training data
Pre-loading test data
1. Nearest Neighbor
The following example uses the Nearest Neighbor algorithm on the Histogram of Oriented Gradients (HoG) descriptors in the dataset.
In [2]:
from sklearn.neighbors import KNeighborsClassifier
clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(dataset.train['hog'], dataset.train['labels'])
Out[2]:
KNeighborsClassifier(n_neighbors=1)
- What is the descriptive performance of this classifier?
- Modify the code to estimate the predictive performance.
- Use cross-validation to find the best hyper-parameters for this method.
In [3]:
# -- Your code here -- #
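One possible sketch of the three tasks above. Since `lab_tools` and the CIFAR10 HoG arrays are not available here, scikit-learn's digits set stands in for the features and labels; on the lab data you would pass `dataset.train['hog']` and `dataset.train['labels']` instead (and whatever keys hold the test split, which this sketch only assumes).

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for dataset.train['hog'] / dataset.train['labels'].
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(X_train, y_train)

# Descriptive performance: accuracy on the data the model was fit on.
# With n_neighbors=1 this is (nearly) perfect by construction, since
# every training point is its own nearest neighbour.
train_acc = clf.score(X_train, y_train)

# Predictive performance: accuracy on held-out data.
test_acc = clf.score(X_test, y_test)

# Cross-validation over the main hyper-parameter, the number of neighbours.
grid = GridSearchCV(KNeighborsClassifier(),
                    {'n_neighbors': [1, 3, 5, 7, 9]}, cv=5)
grid.fit(X_train, y_train)
print(train_acc, test_acc, grid.best_params_)
```

The gap between `train_acc` and `test_acc` is exactly why the descriptive score is a misleading estimate of how the classifier will behave on new images.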
2. Decision Trees
Decision Trees classify the data by splitting the feature space according to simple, single-feature rules. Scikit-learn uses the CART algorithm for its implementation of the classifier.
- Create a simple Decision Tree classifier using scikit-learn and train it on the HoG training set.
- Use cross-validation to find the best hyper-parameters for this method.
In [4]:
from sklearn import tree
# --- Your code here --- #
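A possible sketch for this exercise, again using scikit-learn's digits set as a stand-in for the HoG arrays. The hyper-parameter grid below (`max_depth`, `min_samples_leaf`) covers the usual CART knobs, but the specific values are only an illustration.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn import tree

# Stand-in for dataset.train['hog'] / dataset.train['labels'].
X, y = load_digits(return_X_y=True)

# A simple Decision Tree classifier (scikit-learn's CART implementation).
clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# Cross-validate the tree's main regularisation hyper-parameters.
grid = GridSearchCV(
    tree.DecisionTreeClassifier(random_state=0),
    {'max_depth': [5, 10, 20, None],
     'min_samples_leaf': [1, 5, 10]},
    cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

Limiting `max_depth` or raising `min_samples_leaf` trades training accuracy for less over-fitting, which is what the cross-validated score measures.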
3. Random Forests
Random Forest classifiers use multiple decision trees trained on "weaker" datasets (fewer samples and/or fewer features), averaging the results so as to reduce over-fitting.
- Use scikit-learn to create a Random Forest classifier on the CIFAR data.
- Use cross-validation to find the best hyper-parameters for this method.
In [5]:
from sklearn import ensemble
# --- Your code here --- #
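A possible sketch for the Random Forest exercise, with the digits set standing in for the CIFAR HoG data as before. `n_estimators` (number of trees) and `max_features` (features considered per split) are the two hyper-parameters most directly tied to the "weaker datasets" idea described above; the grid values are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn import ensemble

# Stand-in for dataset.train['hog'] / dataset.train['labels'].
X, y = load_digits(return_X_y=True)

# A Random Forest: an ensemble of trees, each fit on a bootstrap sample
# and restricted to a random subset of features at every split.
clf = ensemble.RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

# Cross-validate the ensemble size and the per-split feature subset.
grid = GridSearchCV(
    ensemble.RandomForestClassifier(random_state=0),
    {'n_estimators': [50, 100],
     'max_features': ['sqrt', None]},
    cv=3)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

With `max_features=None` each split sees every feature, so the trees are more correlated; `'sqrt'` decorrelates them, which is usually what makes the averaging effective.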