F# IKVM Type Provider

FourEightThree

After reading Sergey Tihon’s post about IKVM and the Stanford Parser, I thought I’d try to create an IKVM type provider.

The first version can be found on my GitHub account here.
The following code shows a simple sample on how to use the type provider.

The above is the same sample included in the source. As you can see, it requires an IKVM distribution on your machine and requires your consuming project to reference IKVM.OpenJDK.Core.

These are very early days for this, so I would be surprised if it works for anything even moderately complex. But I hope to change this to support everything that IKVM supports, assuming I get the time in my day job :). Feel free to have a play and let me know what works and what doesn’t.

The biggest gripe I have at the minute is that the type providers don’t allow direct references…


ServiceStack: New API – F# Sample (Web Service out of a web server)

Two weeks ago in F# Weekly #6, 2013 I mentioned Don Syme’s “F# + ServiceStack – F# Web Services on any platform in and out of a web server” post. There were two samples of using ServiceStack from F#. One of these examples is given on the ServiceStack wiki page, in the Self Hosting section. It is also detailed in Demis Bellot’s “F# Web Services on any platform in and out of a web server!” post.

Unfortunately, this example is already obsolete. Some time ago, ServiceStack released a brand new API that significantly changed the programming approach, especially routing (for details see “ServiceStack’s new API design”). But I am happy to say that you can find an updated example below!

The new design is more strongly typed. In the previous version, IService’s methods returned Object; now a Service method returns the concrete type that is defined by the IReturn<T> interface of the request message.

open System
open ServiceStack.ServiceHost
open ServiceStack.WebHost.Endpoints
open ServiceStack.ServiceInterface

[<CLIMutable>]
type HelloResponse = { Result:string }

[<Route("/hello")>]
[<Route("/hello/{Name}")>]
type Hello() =
    interface IReturn<HelloResponse>
    member val Name = "" with get, set

type HelloService() =
    inherit Service()
    member this.Any (request:Hello) =
        {Result = "Hello, " + request.Name}

//Define the Web Services AppHost
type AppHost() =
    inherit AppHostHttpListenerBase("Hello F# Services", typeof<HelloService>.Assembly)
    override this.Configure container = ignore()

//Run it!
[<EntryPoint>]
let main args =
    let host = if args.Length = 0 then "http://*:8080/" else args.[0]
    printfn "listening on %s ..." host
    let appHost = new AppHost()
    appHost.Init()
    appHost.Start host
    Console.ReadLine() |> ignore
    0
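
To illustrate how the IReturn<T> marker lets the response type follow from the request type, here is a minimal sketch that does not use ServiceStack at all. The IReturn interface below is a hypothetical stand-in for the real one, and the send and handle functions are assumptions made up for this illustration, not part of the ServiceStack API.

```fsharp
// Hypothetical stand-in for ServiceStack's IReturn<T> marker interface
type IReturn<'TResponse> = interface end

type HelloResponse = { Result : string }

type Hello() =
    interface IReturn<HelloResponse>
    member val Name = "" with get, set

// Illustrative in-process dispatcher (assumption, not a ServiceStack call):
// the response type is inferred from the request's IReturn<'TResponse> marker
let send (handle : obj -> obj) (request : IReturn<'TResponse>) : 'TResponse =
    unbox<'TResponse> (handle (box request))

// Untyped handler, similar in spirit to the old Object-returning services
let handle (request : obj) : obj =
    match request with
    | :? Hello as hello -> box { Result = "Hello, " + hello.Name }
    | _ -> failwithf "No service registered for %s" (request.GetType().Name)

// No type annotation is needed: 'response' is inferred as HelloResponse
let response = send handle (Hello(Name = "World"))
printfn "%s" response.Result
```

The caller never mentions HelloResponse, yet response.Result is checked at compile time. This is the same idea the new API relies on when a request type implements IReturn<T>.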

For comparison, the previous version is:

open System
open ServiceStack.ServiceHost
open ServiceStack.WebHost.Endpoints

type Hello = { mutable Name: string; }
type HelloResponse = { mutable Result: string; }
type HelloService() =
    interface IService with
        member this.Any (req:Hello) = { Result = "Hello, " + req.Name }

//Define the Web Services AppHost
type AppHost =
    inherit AppHostHttpListenerBase
    new() = { inherit AppHostHttpListenerBase("Hello F# Services", typeof<HelloService>.Assembly) }
    override this.Configure container =
        base.Routes
            .Add<Hello>("/hello")
            .Add<Hello>("/hello/{Name}") |> ignore

//Run it!
[<EntryPoint>]
let main args =
    let host = if args.Length = 0 then "http://*:1337/" else args.[0]
    printfn "listening on %s ..." host
    let appHost = new AppHost()
    appHost.Init()
    appHost.Start host
    Console.ReadLine() |> ignore
    0

Update: An example of the ServiceStack New API for F# 2.0 users. F# 2.0 does not have the member val syntax (auto-properties) used in the first example, so the properties are written out with explicit backing fields.

open System
open ServiceStack.ServiceHost
open ServiceStack.WebHost.Endpoints
open ServiceStack.ServiceInterface

type Project() =
    let mutable projectID = 0
    let mutable projectName = ""
    let mutable companyName = ""
    let mutable projectStatus = ""
    member this.ProjectID with get() = projectID and set(pid) = projectID <-pid
    member this.ProjectName with get() = projectName and set(pn) = projectName <- pn
    member this.CompanyName with get() = companyName and set(cn) = companyName <- cn
    member this.ProjectStatus with get() = projectStatus and set(ps) = projectStatus <-ps

type ProjectResponse() =
    let mutable projects = List.empty<Project>
    member this.Projects with get() = projects and set(pr) = projects <- pr

[<Route("/Project/{ProjectName}")>]
type ProjectRequest() =
    let mutable projectName = ""
    interface IReturn<ProjectResponse>
    member this.ProjectName with get() = projectName and set(n) = projectName <- n

type ProjectService() =
    inherit Service()
    member this.Any (request:ProjectRequest) =
        ProjectResponse(
             Projects = [Project(ProjectName=request.ProjectName, ProjectID=1, CompanyName="A")])

//Define the Web Services AppHost
type AppHost() =
    inherit AppHostHttpListenerBase("Project F# Services", typeof<ProjectService>.Assembly)
    override this.Configure container = ignore()

//Run it!
[<EntryPoint>]
let main args =
    let host = if args.Length = 0 then "http://*:8080/" else args.[0]
    printfn "listening on %s ..." host
    let appHost = new AppHost()
    appHost.Init()
    appHost.Start host
    Console.ReadLine() |> ignore
    0

F# Image Blurrer

Image blurring is a fairly common task during presentation preparation, for example when you want to show something but hide sensitive information. Of course, you can buy Photoshop, but it is too expensive for such a simple task. You can also download Paint.NET, but this is not an option for an F# geek – it is too easy =). It is much better to write something yourself (binaries are available as well as source code).

open System
open System.IO
open System.Drawing
open System.Drawing.Imaging

let blur (image:Bitmap) blurSize =
    // Work on a copy so that the source bitmap stays untouched
    let blurred = new Bitmap(image.Width, image.Height)
    use graphics = Graphics.FromImage(blurred)
    let rectangle = Rectangle(0, 0, image.Width, image.Height)
    graphics.DrawImage(image, rectangle, rectangle, GraphicsUnit.Pixel)
    for X in 0..image.Width-1 do
        for Y in 0..image.Height-1 do
            // Sum the colour channels over the window around (X, Y), clipped to the image bounds
            let (r,g,b,c) =
                [Math.Max(0, X-blurSize)..Math.Min(image.Width-1, X+blurSize)]
                |> Seq.fold (fun sum x ->
                    [Math.Max(0, Y-blurSize)..Math.Min(image.Height-1, Y+blurSize)]
                    |>  Seq.fold (fun (r,g,b,c) y ->
                        // Pixels are read from the working copy, so already blurred values feed into later ones
                        let p = blurred.GetPixel(x,y)
                        (r + (int)p.R, g + (int)p.G, b + (int)p.B, c+1)
                     ) sum
                 ) (0,0,0,0)
            // Replace the pixel with the average over its window
            blurred.SetPixel(X, Y, Color.FromArgb(r/c, g/c, b/c))
    blurred

[<EntryPoint>]
let main argv =
    try
        printfn "argv = %A" argv
        // Accept 'fileName' alone (blurSize defaults to 3) or 'fileName blurSize'
        let (fileName, blurSize) =
            match argv with
            | [|fileName|] -> (fileName, 3)
            | [|fileName; size|] ->
                match Int32.TryParse(size) with
                | (true, blurSize) when blurSize > 0 -> (fileName, blurSize)
                | _ -> failwithf "Incorrect blurSize '%s'" size
            | _ -> failwith "Incorrect parameters. Please enter 'fileName' and 'blurSize'"
        printfn "FileName:%s\nBlurSize:%d" fileName blurSize
        if not (File.Exists(fileName)) then
            failwithf "File '%s' does not exist." fileName

        // Read the whole file into memory, then decode the bitmap from that stream
        use inputStream = new MemoryStream(File.ReadAllBytes(fileName))
        use source = new Bitmap(Image.FromStream(inputStream))
        printfn "Processing..."
        use result = blur source blurSize
        printfn "Saving..."
        let resultFileName =
            sprintf "%s_%dblurred.jpg" (Path.GetFileNameWithoutExtension(fileName)) blurSize
        result.Save(resultFileName, ImageFormat.Jpeg)
        printfn "Done!"
    with
    | e ->
        printfn "Exception : %s" e.Message
        Console.ReadLine() |> ignore
    0
Original image (fsharp.org)
Blurred image (blurSize=3)
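
The averaging done by the blur function above is essentially a box blur: each pixel becomes the mean of a square window clipped to the image bounds. As a rough sketch of that idea without System.Drawing, the hypothetical boxBlur function below works on a grayscale image stored as a 2D integer array. Unlike the program above, it reads from the untouched input array rather than from the working copy, so treat it as an illustration of the technique rather than a copy of the original code.

```fsharp
// Box blur over a grayscale image stored as a 2D array (illustrative sketch)
let boxBlur (blurSize : int) (pixels : int[,]) : int[,] =
    let width = Array2D.length1 pixels
    let height = Array2D.length2 pixels
    Array2D.init width height (fun x y ->
        let mutable sum = 0
        let mutable count = 0
        // Visit the window around (x, y), clipped to the array bounds
        for nx in max 0 (x - blurSize) .. min (width - 1) (x + blurSize) do
            for ny in max 0 (y - blurSize) .. min (height - 1) (y + blurSize) do
                sum <- sum + pixels.[nx, ny]
                count <- count + 1
        // Integer average of the window
        sum / count)

// A single bright pixel in the middle of a 3x3 image
let sample = array2D [ [0; 0; 0]; [0; 90; 0]; [0; 0; 0] ]
let blurredSample = boxBlur 1 sample
printfn "%A" blurredSample
```

With blurSize 1 the centre pixel is averaged over nine cells (90 / 9 = 10), an edge pixel over six (15) and a corner pixel over four (22 after integer division).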

 

F# Weekly #8, 2013

Welcome to F# Weekly,

F# Heroes at 2013 MVP Global Summit

The greatest event of this week and maybe of the year is 2013 MVP Global Summit. But while our MVPs make history, the new portion of news from this past week is waiting for you =).

News

Blogs

That’s all for now.  Have a great week.

Previous F# Weekly edition – #7

F# Weekly #7, 2013

Welcome to F# Weekly,

One more week passed and a new portion of F# Weekly is waiting to be read.

News

Blogs

That’s all for now.  Have a great week.

Previous F# Weekly edition – #6

NLP: Stanford Named Entity Recognizer with F# (.NET)

Update (2014, January 3): Links and/or samples in this post might be outdated. The latest versions of the samples are available on the new Stanford.NLP.NET site.

All code samples from this post are available on GitHub.

Samples for one more Stanford NLP library have been ported to .NET: the Stanford Named Entity Recognizer (NER).

To compile stanford-ner.jar to a .NET assembly, follow the steps from my post “NLP: Stanford Parser with F# (.NET)”. You can also download an already compiled version from GitHub.

What is Stanford Named Entity Recognizer (NER)?

Stanford NER (also known as CRFClassifier) is a Java implementation of a Named Entity Recognizer. Named Entity Recognition (NER) labels sequences of words in a text which are the names of things, such as person and company names, or gene and protein names. The software provides a general (arbitrary order) implementation of linear chain Conditional Random Field (CRF) sequence models, coupled with well-engineered feature extractors for Named Entity Recognition. (CRF models were pioneered by Lafferty, McCallum, and Pereira (2001); see Sutton and McCallum (2006) for a better introduction.) Included with the download are good 3 class (PERSON, ORGANIZATION, LOCATION) named entity recognizers for English (in versions with and without additional distributional similarity features) and another pair of models trained on the CoNLL 2003 English training data. The distributional similarity features improve performance but the models require considerably more memory.

Read more about Named-entity recognition on Wikipedia.

Let’s play!

So, again, the code is pretty straightforward and easy to read and understand. It looks procedural, with some extra type-casting noise that comes from working with Java collections.

open edu.stanford.nlp.ie
open edu.stanford.nlp.ie.crf
open edu.stanford.nlp.io
open edu.stanford.nlp.ling

open java.util
open System.IO
open IKVM.FSharp

let main file =
    let classifier =
        CRFClassifier.getClassifierNoExceptions(
            @"..\..\..\..\StanfordNLPLibraries\stanford-ner\classifiers\english.all.3class.distsim.crf.ser.gz")
    match file with
    | Some(fileName) ->
        let fileContents = File.ReadAllText(fileName)
        classifier.classify(fileContents).iterator()
        |> Collections.toSeq
        |> Seq.cast<java.util.List>
        |> Seq.iter (fun sentence ->
            sentence.iterator()
            |> Collections.toSeq
            |> Seq.cast<CoreLabel>
            |> Seq.iter (fun word ->
                printf "%s/%O "
                    (word.word())
                    (word.get(CoreAnnotations.AnswerAnnotation().getClass()))
            )
            printfn ""
        )
    | None ->
        let s1 = "Good afternoon Rajat Raina, how are you today?"
        let s2 = "I go to school at Stanford University, which is located in California."
        printfn "%s\n" (classifier.classifyToString(s1))
        printfn "%s\n" (classifier.classifyWithInlineXML(s2))
        printfn "%s\n" (classifier.classifyToString(s2, "xml", true));
        classifier.classify(s2).iterator()
        |> Collections.toSeq
        |> Seq.iteri (fun i coreLabel ->
            printfn "%d\n:%O\n" i coreLabel
        )

Let’s test NER on the text from Don Syme’s wiki page =).

Don Syme is an Australian computer scientist and a Principal Researcher at Microsoft Research, Cambridge, U.K. He is the designer and architect of the F# programming language, described by a reporter as being regarded as “the most original new face in computer languages since Bjarne Stroustrup developed C++ in the early 1980s.”

Earlier, Syme created generics in the .NET Common Language Runtime, including the initial design of generics for the C# programming language, along with others including Andrew Kennedy and later Anders Hejlsberg. Kennedy, Syme and Yu also formalized this widely used system.

He holds a Ph.D. from the University of Cambridge, and is a member of the WG2.8 working group on functional programming. He is a co-author of the book Expert F# 2.0.

In the past he also worked on formal specification, interactive proof, automated verification and proof description languages.

Named-entity recognition result:

Don/PERSON Syme/PERSON is/O an/O Australian/O computer/O scientist/O and/O a/O Principal/O Researcher/O at/O Microsoft/ORGANIZATION Research/ORGANIZATION ,/O Cambridge/LOCATION ,/O U.K./LOCATION ./O He/O is/O the/O designer/O and/O architect/O of/O the/O F/O #/O programming/O language/O ,/O described/O by/O a/O reporter/O as/O being/O regarded/O as/O “/O the/O most/O original/O new/O face/O in/O computer/O languages/O since/O Bjarne/PERSON Stroustrup/PERSON developed/O C/O +/O +/O in/O the/O early/O 1980s/O ./O

Earlier/O ,/O Syme/PERSON created/O generics/O in/O the/O ./O NET/O Common/O Language/O Runtime/O ,/O including/O the/O initial/O design/O of/O generics/O for/O the/O C/O #/O programming/O language/O ,/O along/O with/O others/O including/O Andrew/PERSON Kennedy/PERSON and/O later/O Anders/PERSON Hejlsberg/PERSON ./O Kennedy/PERSON ,/O Syme/PERSON and/O Yu/PERSON also/O formalized/O this/O widely/O used/O system/O ./O

He/O holds/O a/O Ph.D./O from/O the/O University/ORGANIZATION of/ORGANIZATION Cambridge/ORGANIZATION ,/O and/O is/O a/O member/O of/O the/O WG2/O .8/O working/O group/O on/O functional/O programming/O ./O He/O is/O a/O co-author/O of/O the/O book/O Expert/O F/O #/O 2.0/O ./O

In/O the/O past/O he/O also/O worked/O on/O formal/O specification/O ,/O interactive/O proof/O ,/O automated/O verification/O and/O proof/O description/O languages/O ./O
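
The word/TAG output above is easy to post-process. As a small sketch (the parseToken and entities helpers are hypothetical and not part of the Stanford NER API), the following code merges runs of adjacent tokens that share a tag other than O into single entities. It assumes every token contains at least one slash.

```fsharp
// Split a "word/TAG" token on its last '/'
let parseToken (token : string) =
    let i = token.LastIndexOf('/')
    token.Substring(0, i), token.Substring(i + 1)

// Merge runs of adjacent tokens with the same non-"O" tag into one entity
let entities (tagged : string) =
    tagged.Split([| ' ' |], System.StringSplitOptions.RemoveEmptyEntries)
    |> Array.map parseToken
    |> Array.fold (fun (acc : (string * string) list) (word, tag) ->
        match acc with
        | (words, prevTag) :: rest when prevTag = tag && tag <> "O" ->
            (words + " " + word, tag) :: rest
        | _ -> (word, tag) :: acc) []
    |> List.filter (fun (_, tag) -> tag <> "O")
    |> List.rev

let found =
    entities "Don/PERSON Syme/PERSON is/O at/O Microsoft/ORGANIZATION Research/ORGANIZATION ,/O Cambridge/LOCATION ,/O U.K./LOCATION ./O"
found |> List.iter (fun (name, tag) -> printfn "%s: %s" tag name)
```

For this fragment the result is Don Syme (PERSON), Microsoft Research (ORGANIZATION), Cambridge (LOCATION) and U.K. (LOCATION).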

F# Weekly #6, 2013

Welcome to F# Weekly,

Hey, new portion of F# stuff from this past week is already here! Savor these with a cup of coffee.

News

Blogs

That’s all for now.  Have a great week.

Previous F# Weekly edition – #5

NLP: Stanford POS Tagger with F# (.NET)

Update (2014, January 3): Links and/or samples in this post might be outdated. The latest versions of the samples are available on the new Stanford.NLP.NET site.

All code samples from this post are available on GitHub.

Continuing the theme of porting Stanford NLP libraries to .NET, I am glad to introduce one more library – the Stanford Log-linear Part-Of-Speech Tagger.

To compile stanford-postagger.jar to a .NET assembly you need nothing special: just follow the steps from my previous post “NLP: Stanford Parser with F# (.NET)”. You can also download an already compiled version from GitHub.

What is Stanford POS Tagger?

A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns parts of speech to each word (and other token), such as noun, verb, adjective, etc., although generally computational applications use more fine-grained POS tags like ‘noun-plural’.

Read more about Part-of-speech tagging on Wikipedia.

Let’s play!

I was really surprised by the performance of the .NET version of Stanford POS Tagger. It is fast enough! If you do not need advanced syntactic dependencies between the words and part-of-speech information is enough, then do not use Stanford Parser: Stanford POS Tagger is just what you need.

module TaggerDemo

open java.io
open java.util

open edu.stanford.nlp.ling
open edu.stanford.nlp.tagger.maxent;

open IKVM.FSharp
let model = @"..\..\..\..\StanfordNLPLibraries\stanford-postagger\models\wsj-0-18-left3words.tagger"

let tagReader (reader:Reader) =
    let tagger = MaxentTagger(model)
    MaxentTagger.tokenizeText(reader).iterator()
    |> Collections.toSeq
    |> Seq.iter (fun sentence ->
        let tSentence = tagger.tagSentence(sentence :?> List)
        printfn "%O" (Sentence.listToString(tSentence, false))
        )

let tagFile (fileName:string) =
    tagReader (new BufferedReader(new FileReader(fileName)))
let tagText (text:string) =
    tagReader (new StringReader(text))

As you can see, it is really simple to use. We instantiate MaxentTagger and initialize it with the wsj-0-18-left3words.tagger model. After that we load the text, tokenize it into sentences and tag the sentences one by one.

Let’s test tagger on the F# Software Foundation Mission Statement =).

Mission Statement

The mission of the F# Software Foundation is to promote, protect, and advance the F# programming language, and to support and facilitate the growth of a diverse and international community of F# programmers.

Tagging result:

Mission/NNP Statement/NNP 
The/NNP mission/NN of/IN the/DT F/NN #/# Software/NNP Foundation/NNP is/VBZ 
to/TO promote/VB ,/, protect/VB ,/, and/CC advance/NN the/DT F/NN #/# 
programming/VBG language/NN ,/, and/CC to/TO support/VB and/CC facilitate/VB 
the/DT growth/NN of/IN a/DT diverse/JJ and/CC international/JJ community/NN 
of/IN F/NN #/# programmers/NNS ./.
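
The tagged output can be consumed as plain text. As a small sketch (wordsByTag is a hypothetical helper, not part of the tagger API, and Array.groupBy assumes F# 4.0 or later), the code below groups words by their POS tag, which makes it easy to pull out, say, all adjectives. It assumes every token contains at least one slash.

```fsharp
// Group the words of a "word/TAG word/TAG ..." string by tag
let wordsByTag (tagged : string) =
    tagged.Split([| ' '; '\n'; '\r' |], System.StringSplitOptions.RemoveEmptyEntries)
    |> Array.map (fun token ->
        // Split on the last '/' so that tokens such as "#/#" parse correctly
        let i = token.LastIndexOf('/')
        token.Substring(i + 1), token.Substring(0, i))
    |> Array.groupBy fst
    |> Array.map (fun (tag, pairs) -> tag, Array.map snd pairs)
    |> Map.ofArray

let byTag =
    wordsByTag "the/DT growth/NN of/IN a/DT diverse/JJ and/CC international/JJ community/NN"
printfn "Adjectives: %A" byTag.["JJ"]
```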

You can find descriptions of the POS tags here.

NLP: Stanford Parser with F# (.NET)

Update (2014, January 3): Links and/or samples in this post might be outdated. The latest versions of the samples are available on the new Stanford.NLP.NET site.

All code samples from this post are available on GitHub.

Natural Language Processing is another hot topic, just like Machine Learning. It is certainly extremely important, but poorly developed.

What do we have in .NET?

Let’s start with what we already have.

Looks really bad. It is hard to find something really useful. Actually, we have one more option, which is IKVM.NET. With IKVM.NET we should be able to use most Java-based NLP frameworks. Let’s try to import Stanford Parser to .NET.

IKVM.NET overview.

IKVM.NET is an implementation of Java for Mono and the Microsoft .NET Framework. It includes the following components:

  • A Java Virtual Machine implemented in .NET
  • A .NET implementation of the Java class libraries
  • Tools that enable Java and .NET interoperability

Read more about what you can do with IKVM.NET.

About Stanford NLP

The Stanford NLP Group makes parts of our Natural Language Processing software available to the public. These are statistical NLP toolkits for various major computational linguistics problems. They can be incorporated into applications with human language technology needs.

All the software we distribute is written in Java. All recent distributions require Sun/Oracle JDK 1.5+. Distribution packages include components for command-line invocation, jar files, a Java API, and source code.

IKVM .jar to .dll compilation

First of all, we need to download and install IKVM.NET. You can get it from SourceForge. The next step is to download Stanford Parser (the latest version at the time of writing is 2.0.4 from 2012-11-12). Now we need to compile stanford-parser.jar to a .NET assembly. You can do it with the following command:

ikvmc.exe stanford-parser.jar

If you need a strongly named (signed) assembly, then you should do two more steps.

ildasm.exe /all /out=stanford-parser.il stanford-parser.dll
ilasm.exe /dll /key=myKey.snk stanford-parser.il

An unsigned stanford-parser.dll is available on GitHub.

Let’s play!

That’s all! Now we are ready to start playing with Stanford Parser. I want to show one of the standard examples here (ParserDemo.fs); the second one is available on GitHub with the other sources.

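open java.io
open edu.stanford.nlp.ling
open edu.stanford.nlp.``process``
open edu.stanford.nlp.trees
open edu.stanford.nlp.parser.lexparser

let demoAPI (lp:LexicalizedParser) =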
  // This option shows parsing a list of correctly tokenized words
  let sent = [|"This"; "is"; "an"; "easy"; "sentence"; "." |]
  let rawWords = Sentence.toCoreLabelList(sent)
  let parse = lp.apply(rawWords)
  parse.pennPrint()

  // This option shows loading and using an explicit tokenizer
  let sent2 = "This is another sentence.";
  let tokenizerFactory = PTBTokenizer.factory(CoreLabelTokenFactory(), "")
  use sent2Reader = new StringReader(sent2)
  let rawWords2 = tokenizerFactory.getTokenizer(sent2Reader).tokenize()
  let parse = lp.apply(rawWords2)

  let tlp = PennTreebankLanguagePack()
  let gsf = tlp.grammaticalStructureFactory()
  let gs = gsf.newGrammaticalStructure(parse)
  let tdl = gs.typedDependenciesCCprocessed()
  printfn "\n%O\n" tdl

  let tp = new TreePrint("penn,typedDependenciesCollapsed")
  tp.printTree(parse)

let main fileName =
  let lp = LexicalizedParser.loadModel(@"..\..\..\..\StanfordNLPLibraries\stanford-parser\stanford-parser-2.0.4-models\englishPCFG.ser.gz")
  match fileName with
  | Some(file) -> demoDP lp file
  | None -> demoAPI lp

What are we doing here? First of all, we instantiate LexicalizedParser and initialize it with the englishPCFG.ser.gz model, a lexical parser trained on the Penn Treebank corpus. Then we create two sentences: the first from an already tokenized string (a string array in this sample), the second from a plain string using PTBTokenizer. Finally, we parse both sentences with this parser. The resulting output can be found below.

[|"1"|]
Loading parser from serialized file ..\..\..\..\StanfordNLPLibraries\
stanford-parser\stanford-parser-2.0.4-models\englishPCFG.ser.gz ... 
done [1.5 sec].
(ROOT
  (S
    (NP (DT This))
    (VP (VBZ is)
      (NP (DT an) (JJ easy) (NN sentence)))
    (. .)))

[nsubj(sentence-4, This-1), cop(sentence-4, is-2), det(sentence-4, another-3), 
root(ROOT-0, sentence-4)]
(ROOT
  (S
    (NP (DT This))
    (VP (VBZ is)
      (NP (DT another) (NN sentence)))
    (. .)))
nsubj(sentence-4, This-1)
cop(sentence-4, is-2)
det(sentence-4, another-3)
root(ROOT-0, sentence-4)
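
The typed dependencies printed above follow a regular relation(governor-index, dependent-index) shape, so they can be turned into structured data with a regular expression. This is only a sketch: the Dependency record and tryParseDependency function are hypothetical names introduced here, and in real code it would be more robust to read the values from the parser's own dependency objects than to re-parse text.

```fsharp
open System.Text.RegularExpressions

// Hypothetical record for one line of typed-dependency output
type Dependency =
    { Relation : string
      Governor : string * int
      Dependent : string * int }

// Matches lines such as "nsubj(sentence-4, This-1)"
let dependencyPattern = Regex(@"^(\w+)\((.+)-(\d+), (.+)-(\d+)\)$")

let tryParseDependency (line : string) =
    let m = dependencyPattern.Match(line.Trim())
    if m.Success then
        Some { Relation = m.Groups.[1].Value
               Governor = (m.Groups.[2].Value, int m.Groups.[3].Value)
               Dependent = (m.Groups.[4].Value, int m.Groups.[5].Value) }
    else
        None

let parsed =
    [ "nsubj(sentence-4, This-1)"
      "cop(sentence-4, is-2)"
      "det(sentence-4, another-3)"
      "root(ROOT-0, sentence-4)" ]
    |> List.choose tryParseDependency
parsed |> List.iter (printfn "%A")
```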

I want to mention once more that the full source code is available at the fsharp-stanford-nlp-samples GitHub repository. Feel free to use and extend it.