Machine learning is a hot topic nowadays. ML is a core part of data analysis and an auxiliary tool in many domains (NLP, search engines, e-commerce solutions, etc.). Many ML-related courses are available on Coursera in the “Statistics, Data Analysis, and Scientific Computing” and “Computer Science: Artificial Intelligence, Robotics, Vision” sections. Kaggle holds ML competitions more and more often.
Java has some popular and recognized ML libraries such as Mahout and Weka, but it is much harder to find a high-performance .NET ML library (one that does not run on IKVM.NET).
What is already available in the .NET world?
As Don Syme said, it would be cool to have an independent comparison of the ML libraries that are already available. We need to understand which library suits which needs.
There are 116 Nuget packages for query “machine learning”. Anyone into #fsharp know/reviewed/used these? nuget.org/packages?q=mac…
— Don Syme (@dsyme) December 18, 2012
I also want to mention some of the most promising ones:
- Encog Machine Learning Framework
- WekaSharp: An F# Wrapper for Weka
- AForge.NET: AForge.Neuro and AForge.MachineLearning
- Accord.MachineLearning
- NUML
- ILNumerics
- Infer.NET
- Microsoft Sho
What can we do?
We keep saying that F# is great for data scientists and statisticians, and so it is! We still do not have a mature F# ML library, but we have a lot of posts about ML and a lot of interest in this domain:
- Clifford Champion and F-AI.
- Jorn Nielsen and libml.
- Mathias Brandewinder with the “Machine Learning in Action” post series and source code:
- Reto Matter and his Neural Network series:
- Andy Gordon and Infer.NET:
- Yin Zhu:
- “F# as a Octave/Matlab replacement for Machine Learning” by Gustavo Guerra
- “K-means step by step in F#” by Anton Kropp.
- Chris Smith:
- “Discrete Classification using F# and Fuzzy Logic” by TechNeilogy
It is time to put it all together into FSharp.ML. This can be done in two parts: a complete functional ML framework plus a collection of useful, customizable samples.
Since the title reads .NET, why limit this to F#? There are some alternatives for C#, one of which offers not only popular ML routines but also a whole range of common math support functionality: n-dimensional arrays, all the MATLAB-like convenience, and even 3D plotting: http://ilnumerics.net … I wonder if this could be used from F# as well?
I have chosen F# because:
– It is a great language for such tasks
– I want to propose this to the F# community
– I see that there are people who are interested in it
I do not suggest starting a new ML library from scratch, just collecting everything we already have and continuing to work together.
ILNumerics is an option. I missed it while preparing the post. Thanks for that (already added to the post). It can surely be used from F#.
Sergey,
I am experimenting with SQL Server Analysis Services (SSAS) – Data Mining (will perhaps blog about it later).
There are many high-performance algorithms available in SSAS which are accessible from .NET / F#. See here: http://msdn.microsoft.com/en-us/library/ms175595.aspx
Data Mining models can be scripted with Analysis Management Objects (AMO):
http://msdn.microsoft.com/en-us/library/ms124924(v=SQL.90).aspx
It might be good to have a Data Mining Extensions (DMX) Type Provider for accessing SSAS Data Mining functionality seamlessly from F#:
http://msdn.microsoft.com/en-us/library/ms132058.aspx#BKMK_Queries
It seems that SSAS offers rich capabilities in this area, but most people are not aware of them.