Thursday 19 January 2012

Reading wav files into RapidMiner using R

I prefer RapidMiner for building complex processes because the connected operator view makes it easier to see the big picture. As a result, I tend to bring things into RapidMiner from elsewhere where possible.

An example of this is reading wav files, which is very easy in R. Here's an example process that uses R to do this and imports the results into RapidMiner.

You will need to uncomment the line in the R script to fetch the tuneR library.

You will also have to provide the path to the wav file you want to import.

There may be wav files it cannot read, but handling those is left as an exercise for the reader.
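To give a feel for what "reading a wav file" produces before it reaches RapidMiner, here is a minimal sketch in Python using only the standard library. It is not the R script from the process above; the function name read_wav_samples is made up for illustration, and the sketch assumes uncompressed 16-bit PCM data, rejecting anything else.

```python
import struct
import wave


def read_wav_samples(path):
    """Return (sample_rate, rows); each row holds one sample per channel.

    Assumption: the file is uncompressed 16-bit PCM. Other sample widths
    raise ValueError rather than being decoded incorrectly.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # 16-bit PCM samples are little-endian signed integers,
    # interleaved by channel (left, right, left, right, ...).
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    rows = [values[i:i + channels] for i in range(0, len(values), channels)]
    return sample_rate, rows
```

Each row is one time step, so the result maps naturally onto an example set with one attribute per channel.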

Wednesday 11 January 2012

Convert date to unix timestamp

Use the "Generate Attributes" operator to create an attribute with the following expression, where dateToConvert is the date attribute to convert.

round (date_diff (date_parse (0), dateToConvert,"en","GMT")/1000)
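The expression takes the difference between the date and the epoch (date 0) in milliseconds, divides by 1000 and rounds. As a cross-check outside RapidMiner, the same arithmetic can be sketched in Python; the function name to_unix_timestamp is made up for illustration, and the sketch assumes the input is a timezone-aware datetime.

```python
from datetime import datetime, timezone

# The Unix epoch, which plays the role of date_parse(0) in the expression.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_timestamp(date_to_convert):
    """Return whole seconds since the epoch for a timezone-aware datetime."""
    # Difference in milliseconds, mirroring what date_diff is divided from.
    millis = (date_to_convert - EPOCH).total_seconds() * 1000
    # Milliseconds to seconds, rounded to the nearest whole second.
    return round(millis / 1000)


print(to_unix_timestamp(datetime(2012, 1, 11, tzinfo=timezone.utc)))
```

Python's round sends exact halves to the nearest even number, so a value falling exactly on a half second might round differently from RapidMiner.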

Monday 9 January 2012

Recurrence plots

Recurrence plots are interesting - see this link for a very good Web site that explains them and what they can be used for. In particular, see this link for an animation that explains how they are made.

Inspired by this, I made an example using RapidMiner. It works as follows:
  1. A simple sine series is made.
  2. The "Windowing" operator converts the series into a set of triplets.
  3. The "Multiply" operator makes a copy of the example set.
  4. The "Cross Distances" operator calculates the distances between all the points in the two copies.
  5. The "Filter Examples" operator keeps only the pairs whose distance is below a chosen threshold. The smaller the distance, the closer the points.
To see the result, use the Block plotter and you should get this rather nice-looking picture.
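The same steps can be sketched outside RapidMiner in a few lines of Python. This is an illustration, not an export of the process: the function names, the sine period of 20 samples, the Euclidean distance and the 0.1 threshold are all assumptions made for the example.

```python
import math


def sine_series(n, period=20):
    """Step 1: a simple sine series of n samples."""
    return [math.sin(2 * math.pi * i / period) for i in range(n)]


def windows(series, size=3):
    """Step 2: slide a window along the series to get overlapping triplets."""
    return [tuple(series[i:i + size]) for i in range(len(series) - size + 1)]


def recurrence_points(points, threshold):
    """Steps 3 to 5: compare every point with every point in a copy of the
    set and keep the index pairs whose distance is below the threshold."""
    hits = []
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            if math.dist(p, q) < threshold:  # Euclidean distance (assumed)
                hits.append((i, j))
    return hits


points = windows(sine_series(100))
plot_points = recurrence_points(points, threshold=0.1)
```

Plotting each (i, j) pair in plot_points as a filled cell gives the recurrence plot. For a periodic series such as this one, the kept pairs form diagonal lines spaced one period apart.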

For fun, I downloaded the Google share price and made this even nicer-looking picture (it shows distances less than 0.1).

I suspect that when it comes to applying recurrence plots to share prices, there is more money in the art of looking than in the science of predicting.