Weka

"The overall goal of our project is to build a state-of-the-art facility for developing machine learning (ML) techniques and to apply them to real-world data mining problems. Our team has incorporated several standard ML techniques into a software "workbench" called WEKA, for Waikato Environment for Knowledge Analysis. With it, a specialist in a particular field is able to use ML to derive useful knowledge from databases that are far too large to be analysed by hand. WEKA's users are ML researchers and industrial scientists, but it is also widely used for teaching."

The book produced as part of the Weka project is the one on which COC131 is based. The website includes the Weka software, which will be used in the later workshop sessions and coursework. I strongly recommend downloading the software and playing with it at your leisure. It is Java-based and should therefore run on most common platforms. If you need additional help with WEKA, or want to find out more, the installation comes with a tutorial on the "Explorer" part of WEKA, which is what we will be using in later tutorials.

Tutorial 01 (13/02/09)

Get the Old Faithful data-set (.csv) here
Get the tutorial 01 exercises here
Get the tutorial 01 solutions here
Statistics revision for Tutorial 01 here

Tutorial 02 (20/02/09)

Get the iris data-set (.arff) here
Get the tutorial 02 exercises here

Tutorial 03 (27/02/09)

Get the tutorial 03 exercises here

Tutorial 04 (06/03/09)

Tutorial 03 exercises and clarification of any issues from earlier tutorials

Tutorial 05 (13/03/09)

Get the tutorial 04 exercises here

Tutorial 06 (20/03/09)

Get the flags data-set (.arff) here
Get the whole euro data-set (.arff) here
Get the tutorial 05 exercises here

Tutorial 07 (27/03/09)

Get the weather.nominal data-set (.arff) here
Get the weather.nominal data-set (.csv) here
Get the tutorial 06 exercises here

Tutorial 08 (24/04/09)

I have uploaded our last tutorial so that you can go through the exercises in your own time before the session.
Get the iris training data-set (.arff) here
Get the iris testing data-set (.arff) here
Get the tutorial 07 exercises here

Coursework

Data-sets
Get a whole selection of other data-sets (including the ones needed for the coursework) here. The download is a .jar file, which is an ordinary ZIP archive, so you can extract the data files with your favourite extraction program.
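
If you would rather script the extraction than use a desktop tool, the following is a minimal sketch in Python. It relies only on the fact that a .jar file is a ZIP archive. The function name `extract_data_files`, the archive name `datasets.jar` and the output folder `data` are placeholders of mine, not names used by the course or by Weka, and the choice of `.arff` and `.csv` as the extensions to keep is an assumption about what the archive contains.

```python
import zipfile
from pathlib import Path


def extract_data_files(jar_path, dest_dir, suffixes=(".arff", ".csv")):
    """Extract only the data files from a .jar (ZIP) archive into dest_dir.

    Returns the sorted list of archive member names that were extracted.
    The default suffixes are an assumption; adjust them to suit the archive.
    """
    suffixes = tuple(s.lower() for s in suffixes)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    extracted = []
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            # Skip manifests, class files and anything else that is not data.
            if name.lower().endswith(suffixes):
                jar.extract(name, dest)
                extracted.append(name)
    return sorted(extracted)


# Hypothetical usage (replace the names with your own download and folder):
#   files = extract_data_files("datasets.jar", "data")
#   print(files)
```

The files keep whatever folder layout they have inside the archive, so look in the sub-folders of the output directory before opening them in the WEKA Explorer.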

Questions related to Part B of the coursework can also be directed to me by email.