bigdata_test

A first go at big data: using a naive density estimator and a Bayes classifier.

Theory

There are some really great tutorials (actually lecture notes) by Andrew Moore. You can find them on his website.
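As a rough illustration of the idea, here is a minimal sketch of a Bayes classifier built on a naive density estimator, that is, one that treats the attributes as independent given the class. It is not the code from this repository: the function names `train` and `predict`, the toy weather data, and the add-one (Laplace) smoothing are all assumptions made for the example, and it works on plain Python lists rather than on a database.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute index, class)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    # distinct values seen per attribute, needed for add-one smoothing
    domains = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            domains[i].add(value)
    return class_counts, value_counts, domains


def predict(model, row):
    """Return the class with the highest (log) posterior score for row."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        # log prior P(class)
        score = math.log(n / total)
        for i, value in enumerate(row):
            # naive assumption: multiply per-attribute P(value | class),
            # smoothed so that unseen values do not give probability zero
            count = value_counts[(i, label)][value]
            score += math.log((count + 1) / (n + len(domains[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train(rows, labels)
print(predict(model, ("rainy", "cool")))
```

Working in log space avoids numeric underflow when there are many attributes; the counts themselves are the kind of thing that could equally be computed with `GROUP BY` queries in a database.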

Data

There is a useful collection of data sets, provided by the UCI Machine Learning Repository. You can find it here.

Technology

I'm using PostgreSQL to manage the data. It all runs pretty well on my BeagleBone Black.

Todo

There are still a number of todos and fixes left. For instance, there needs to be some testing and statistics on prediction accuracy. We also need to improve the user-friendliness. Perhaps make a nice HTTP POST / REST interface?
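For the accuracy todo, one simple starting point would be to hold back part of the data, predict its labels, and report the fraction predicted correctly. The sketch below only shows that last step; the function name `accuracy` and the example labels are hypothetical and not taken from this repository.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one labelled example")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical predictions compared with held-out true labels
print(accuracy(["yes", "no", "yes", "no"], ["yes", "no", "no", "no"]))
```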