For our networks to learn anything, we need a dataset that contains inputs and targets. PyBrain provides the pybrain.datasets package for this, and we will use its SupervisedDataSet class for our needs.
The SupervisedDataSet class is used for standard supervised learning. It supports input and target values, whose size we have to specify on object creation:
>>> from pybrain.datasets import SupervisedDataSet
>>> ds = SupervisedDataSet(2, 1)
Here we have created a dataset that supports two-dimensional inputs and one-dimensional targets.
A classic example for neural network training is the XOR function, so let’s build a dataset for it. We do this by adding samples to the dataset one at a time:
>>> ds.addSample((0, 0), (0,))
>>> ds.addSample((0, 1), (1,))
>>> ds.addSample((1, 0), (1,))
>>> ds.addSample((1, 1), (0,))
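The four calls above spell out the XOR truth table by hand. As a minimal sketch, the same (input, target) pairs can also be generated in plain Python. The helper name xor_samples and the variable samples are illustrative and not part of PyBrain, and the sketch deliberately avoids importing PyBrain so that it runs on its own:

```python
from itertools import product


def xor_samples():
    """Return the XOR truth table as (input, target) tuples.

    Hypothetical helper for illustration; not part of PyBrain.
    """
    # product((0, 1), repeat=2) yields (0, 0), (0, 1), (1, 0), (1, 1) in that order.
    return [((a, b), (a ^ b,)) for a, b in product((0, 1), repeat=2)]


samples = xor_samples()
for inp, target in samples:
    # With a PyBrain dataset `ds` created as shown earlier, each pair could be
    # passed on as ds.addSample(inp, target).
    print(inp, target)
```

Generating the pairs this way keeps the inputs and targets consistent with each other, which becomes more useful once the truth table has more than two inputs.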
We now have a dataset with 4 samples in it. We can check that with Python’s idiomatic way of checking the size of something:
>>> len(ds)
4
We can also iterate over it in the standard way:
>>> for inpt, target in ds:
...     print inpt, target
...
[ 0.  0.] [ 0.]
[ 0.  1.] [ 1.]
[ 1.  0.] [ 1.]
[ 1.  1.] [ 0.]
We can access the input and target field directly as arrays:
>>> ds['input']
array([[ 0.,  0.],
       [ 0.,  1.],
       [ 1.,  0.],
       [ 1.,  1.]])
>>> ds['target']
array([[ 0.],
       [ 1.],
       [ 1.],
       [ 0.]])
It is also possible to clear a dataset, which deletes all the values from it:
>>> ds.clear()
>>> ds['input']
array([], shape=(0, 2), dtype=float64)
>>> ds['target']
array([], shape=(0, 1), dtype=float64)
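To make the behaviour shown in this section concrete, here is a toy stand-in written in plain Python. It is a sketch only and is not how PyBrain implements SupervisedDataSet: the class name ToySupervisedDataSet, the method add_sample, and the ValueError raised on a size mismatch are all assumptions made for illustration. It mirrors the ideas covered so far: sizes fixed at creation, samples added one at a time, len(), iteration over (input, target) pairs, and clear().

```python
class ToySupervisedDataSet:
    """Illustrative stand-in for a supervised dataset; NOT PyBrain's implementation."""

    def __init__(self, indim, targetdim):
        # Input and target sizes are fixed when the object is created.
        self.indim = indim
        self.targetdim = targetdim
        self._inputs = []
        self._targets = []

    def add_sample(self, inp, target):
        # Assumption of this sketch: reject samples whose sizes do not match.
        if len(inp) != self.indim or len(target) != self.targetdim:
            raise ValueError("sample does not match dataset dimensions")
        self._inputs.append(tuple(float(x) for x in inp))
        self._targets.append(tuple(float(x) for x in target))

    def __len__(self):
        return len(self._inputs)

    def __iter__(self):
        # Yield (input, target) pairs in insertion order.
        return iter(zip(self._inputs, self._targets))

    def clear(self):
        # Drop all samples but keep the declared dimensions.
        self._inputs = []
        self._targets = []


toy = ToySupervisedDataSet(2, 1)
for inp, target in [((0, 0), (0,)), ((0, 1), (1,)), ((1, 0), (1,)), ((1, 1), (0,))]:
    toy.add_sample(inp, target)

size_before = len(toy)   # number of samples after filling
pairs = list(toy)        # all (input, target) pairs
toy.clear()
size_after = len(toy)    # number of samples after clearing
```

Unlike this toy, the real class exposes its fields as arrays through ds['input'] and ds['target'], so the list-of-tuples storage here is only a simplification.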