Saturday 16 June 2012

Operators that deserve to be better known: part IV

The "Materialize Data" operator can be very useful. It makes a fresh copy of the data, which I take to mean the example set or sets passed to it. Adding this operator can often cure some tricky problems.

As an example, here is a process that runs a simple cross validation on the sample Sonar data set (you may have to change the location for your setup). Inside the cross validation, the intermediate training and test example sets for each fold are remembered. In a later loop, these saved example sets are recalled so they can be inspected. This shows in detail how cross validation works.
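To make the idea concrete outside RapidMiner, here is a minimal Python sketch of what the process does conceptually: split the data into folds, store each fold's training and test sets under a name, and look them up again afterwards. The function names (`k_fold_splits`, `run_and_remember`), the key names, and the ten stand-in rows are all my own assumptions for illustration; this is not the RapidMiner process itself, and it does not shuffle or stratify the way the real operator may.

```python
def k_fold_splits(rows, k):
    """Yield (train, test) lists for each of k folds, in order, without shuffling."""
    base, extra = divmod(len(rows), k)
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first folds
        test = rows[start:start + size]
        train = rows[:start] + rows[start + size:]
        yield train, test
        start += size


def run_and_remember(rows, k):
    """Store every fold's training and test set under a name (hypothetical key scheme)."""
    remembered = {}
    for fold, (train, test) in enumerate(k_fold_splits(rows, k), start=1):
        remembered[f"train_{fold}"] = train
        remembered[f"test_{fold}"] = test
    return remembered


# Stand-in data: ten rows identified only by an id (not the real Sonar data).
rows = [{"id": i} for i in range(10)]
store = run_and_remember(rows, k=5)

# Later loop: recall each saved pair and report its size.
for fold in range(1, 6):
    train = store[f"train_{fold}"]
    test = store[f"test_{fold}"]
    print(f"fold {fold}: {len(train)} training rows, {len(test)} test rows")
```

Each row appears in exactly one test set and in the training sets of all the other folds, which is the property the recalled example sets let you check.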

The "Remember" operator fails to save the correct example set if the preceding "Materialize Data" operator is disabled. In addition, the collection that is recalled later does not display correctly (at least on my 64-bit Windows machine). Enabling the "Materialize Data" operator restores correct operation.
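I do not know what goes on inside RapidMiner here, but the symptom resembles a familiar programming pitfall: storing a reference to data that is later changed, rather than storing a copy. The Python sketch below is only an analogy under that assumption. The function `remember_each_fold` and its `materialize` flag are hypothetical names of mine, not RapidMiner APIs.

```python
import copy


def remember_each_fold(folds, materialize):
    """Store one entry per fold, either as a fresh copy or as a reference to a shared buffer."""
    buffer = []      # a single working list that is reused for every fold
    remembered = []
    for fold in folds:
        buffer.clear()
        buffer.extend(fold)
        if materialize:
            # Analogue of a fresh copy: the stored value no longer depends on the buffer.
            remembered.append(copy.deepcopy(buffer))
        else:
            # Stores a reference: every entry points at the same, still-changing buffer.
            remembered.append(buffer)
    return remembered


folds = [[1, 2], [3, 4], [5, 6]]
without_copy = remember_each_fold(folds, materialize=False)
with_copy = remember_each_fold(folds, materialize=True)

print(without_copy)  # [[5, 6], [5, 6], [5, 6]]
print(with_copy)     # [[1, 2], [3, 4], [5, 6]]
```

Without the copy, every remembered entry ends up showing the last fold, which is the kind of "wrong data saved" effect that making a fresh copy avoids.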