Stanford CoreNLP is available on NuGet for F#/C# devs

Update (2014, January 3): Links and/or samples in this post might be outdated. The latest versions of the samples are available on the new Stanford.NLP.NET site.


Stanford CoreNLP provides a set of natural language analysis tools that take raw English text as input and give the base forms of words, their parts of speech, and whether they are names of companies, people, etc.; normalize dates, times, and numeric quantities; mark up the structure of sentences in terms of phrases and word dependencies; and indicate which noun phrases refer to the same entities. Stanford CoreNLP is an integrated framework, which makes it very easy to apply a bunch of language analysis tools to a piece of text. Starting from plain text, you can run all the tools on it with just two lines of code. Its analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications.

Stanford CoreNLP integrates all Stanford NLP tools, including the part-of-speech (POS) tagger, the named entity recognizer (NER), the parser, and the coreference resolution system, and provides model files for analysis of English. The goal of this project is to enable people to quickly and painlessly get complete linguistic annotations of natural language texts. It is designed to be highly flexible and extensible. With a single option you can change which tools should be enabled and which should be disabled.

Stanford CoreNLP is here and available on NuGet. It is probably the most powerful of all the Stanford NLP Group software packages. Please read the usage overview on the Stanford CoreNLP home page to understand what it can do, how you can configure an annotation pipeline, what steps are available, which models you need, and so on.

I want to say thank you to Anonymous😉 and @OneFrameLink for their contributions and for stimulating me to finish this work.

Please follow the next steps to get started:

Before using Stanford CoreNLP, we need to define the annotation pipeline. For example, annotators = tokenize, ssplit, pos, lemma, ner, parse, dcoref.

The next thing we need to do is create a StanfordCoreNLP pipeline. To instantiate a pipeline, we need to specify all required properties, or at least the paths to all models used by the annotators listed in the annotators string. Before starting with the samples, let’s define some helpers that are used across all code snippets: jarRoot is the path to the folder where we extracted the files from stanford-corenlp-3.2.0-models.jar; modelsRoot is the path to the folder with all model files; ‘!’ is an overloaded operator that converts a model name to the path of the model file under modelsRoot.

let (@@) a b = System.IO.Path.Combine(a,b)
let jarRoot = __SOURCE_DIRECTORY__ @@ @"..\..\temp\stanford-corenlp-full-2013-06-20\stanford-corenlp-3.2.0-models\"
let modelsRoot = jarRoot @@ @"edu\stanford\nlp\models\"
let (!) path = modelsRoot @@ path

Now we are ready to instantiate the pipeline, but we need to do a small trick. The pipeline is configured to use the default model files (for simplicity), and all paths are specified relative to the root of stanford-corenlp-3.2.0-models.jar. To make things easier, we can temporarily change the current directory to jarRoot, instantiate the pipeline, and then change the current directory back. This trick dramatically decreases the number of lines of code.

let props = Properties()
props.setProperty("annotators","tokenize, ssplit, pos, lemma, ner, parse, dcoref") |> ignore
props.setProperty("sutime.binders","0") |> ignore

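// Switch to jarRoot so that the default (relative) model paths resolve,
// then restore the original directory once the pipeline is created
let curDir = System.Environment.CurrentDirectory
System.IO.Directory.SetCurrentDirectory(jarRoot)
let pipeline = StanfordCoreNLP(props)
System.IO.Directory.SetCurrentDirectory(curDir)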
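
The directory switch described above can also be wrapped in a small helper, so that the original directory is restored even if pipeline construction throws. This is only a sketch: withCurrentDirectory is a hypothetical name (it is not part of Stanford.NLP.NET), and it relies only on System.IO.Directory from the .NET base class library.

```fsharp
open System.IO

/// Hypothetical helper (not part of Stanford.NLP.NET): runs `action` with
/// `dir` as the process-wide current directory, then restores the old one.
let withCurrentDirectory (dir: string) (action: unit -> 'a) : 'a =
    let previous = Directory.GetCurrentDirectory()
    Directory.SetCurrentDirectory(dir)
    try
        action ()
    finally
        // Runs both on success and when `action` throws
        Directory.SetCurrentDirectory(previous)
```

With the definitions from this post, the pipeline could then be created as `withCurrentDirectory jarRoot (fun () -> StanfordCoreNLP(props))`. Keep in mind that the current directory is shared by the whole process, so this should not run concurrently with other code that depends on relative paths.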

However, you do not have to do it this way; you can configure all models manually. The number of properties (especially paths to models) that you need to specify depends on the annotators value. Let’s assume for a moment that we are in the Java world and want to configure our pipeline in a custom way. For exactly this case, stanford-corenlp-3.2.0-models.jar contains a properties file (you can find it in the folder with the extracted files), where you can specify new property values outside of code. Most of the properties that we need for configuration are already mentioned in this file, and you can easily understand what is what. But this is not enough to get it working; you also need to look into the source code of Stanford CoreNLP. By the way, some days ago Stanford moved the CoreNLP source code to GitHub, so now it is much easier to browse. The default paths to the models, the property keys, and the mapping from paths to property names are each defined in separate source files there. Thus, you are able to dive deeper into pipeline configuration and do whatever you want. For lazy people, I already have a working sample.

let props = Properties()
let (<==) key value = props.setProperty(key, value) |> ignore
"annotators"    <== "tokenize, ssplit, pos, lemma, ner, parse, dcoref"
"pos.model"     <== ! @"pos-tagger\english-bidirectional\english-bidirectional-distsim.tagger"
"ner.model"     <== ! @"ner\english.all.3class.distsim.crf.ser.gz"
"parse.model"   <== ! @"lexparser\englishPCFG.ser.gz"

"dcoref.demonym"            <== ! @"dcoref\demonyms.txt"
"dcoref.states"             <== ! @"dcoref\state-abbreviations.txt"
"dcoref.animate"            <== ! @"dcoref\animate.unigrams.txt"
"dcoref.inanimate"          <== ! @"dcoref\inanimate.unigrams.txt"
"dcoref.male"               <== ! @"dcoref\male.unigrams.txt"
"dcoref.neutral"            <== ! @"dcoref\neutral.unigrams.txt"
"dcoref.female"             <== ! @"dcoref\female.unigrams.txt"
"dcoref.plural"             <== ! @"dcoref\plural.unigrams.txt"
"dcoref.singular"           <== ! @"dcoref\singular.unigrams.txt"
"dcoref.countries"          <== ! @"dcoref\countries"
"dcoref.extra.gender"       <== ! @"dcoref\namegender.combine.txt"
"dcoref.states.provinces"   <== ! @"dcoref\statesandprovinces"
"dcoref.singleton.predictor"<== ! @"dcoref\singleton.predictor.ser"

let sutimeRules =
    [| ! @"sutime\defs.sutime.txt";
       ! @"sutime\english.holidays.sutime.txt";
       ! @"sutime\english.sutime.txt" |]
    |> String.concat ","
"sutime.rules"      <== sutimeRules
"sutime.binders"    <== "0"

let pipeline = StanfordCoreNLP(props)
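
With this many hand-written paths, a typo is easy to make, and it tends to show up only as a loading error when the pipeline is constructed. A minimal sketch of a pre-flight check is shown below; findMissingModels is a hypothetical helper name, and it uses only System.IO from the .NET base class library.

```fsharp
open System.IO

/// Hypothetical helper: returns those relative paths that do not exist
/// as files under `root`, in the order they were given.
let findMissingModels (root: string) (relativePaths: string list) : string list =
    relativePaths
    |> List.filter (fun relative -> not (File.Exists(Path.Combine(root, relative))))
```

For example, `findMissingModels modelsRoot [ @"ner\english.all.3class.distsim.crf.ser.gz"; @"lexparser\englishPCFG.ser.gz" ]` should return an empty list when both files were extracted correctly.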

As you can see, this option is much longer and harder to get right. I recommend using the first one, especially if you do not need to change the default configuration.

And now the fun part. Everything else is pretty easy: we create an annotation from the text, pass it through the pipeline, and interpret the results.

let text = "Kosgi Santosh sent an email to Stanford University. He didn't get a reply."

let annotation = Annotation(text)
// Run all configured annotators over the text
pipeline.annotate(annotation)
use stream = new ByteArrayOutputStream()
pipeline.prettyPrint(annotation, new PrintWriter(stream))
printfn "%O" (stream.toString())

Certainly, you can also extract all processing results from the annotated text.

let customAnnotationPrint (annotation:Annotation) =
    printfn "-------------"
    printfn "Custom print:"
    printfn "-------------"
    let sentences = annotation.get(CoreAnnotations.SentencesAnnotation().getClass()) :?> java.util.ArrayList
    for sentence in sentences |> Seq.cast<CoreMap> do
        printfn "\n\nSentence : '%O'" sentence

        // Everything below is per sentence, so it stays inside the loop
        let tokens = sentence.get(CoreAnnotations.TokensAnnotation().getClass()) :?> java.util.ArrayList
        for token in (tokens |> Seq.cast<CoreLabel>) do
            let word = token.get(CoreAnnotations.TextAnnotation().getClass())
            let pos  = token.get(CoreAnnotations.PartOfSpeechAnnotation().getClass())
            let ner  = token.get(CoreAnnotations.NamedEntityTagAnnotation().getClass())
            printfn "%O \t[pos=%O; ner=%O]" word pos ner

        printfn "\nTree:"
        let tree = sentence.get(TreeCoreAnnotations.TreeAnnotation().getClass()) :?> Tree
        use stream = new ByteArrayOutputStream()
        tree.pennPrint(new PrintWriter(stream))
        printfn "The first sentence parsed is:\n %O" (stream.toString())

        printfn "\nDependencies:"
        let deps = sentence.get(SemanticGraphCoreAnnotations.CollapsedDependenciesAnnotation().getClass()) :?> SemanticGraph
        for edge in deps.edgeListSorted().toArray() |> Seq.cast<SemanticGraphEdge> do
            let gov = edge.getGovernor()
            let dep = edge.getDependent()
            // relation(governor-index,dependent-index)
            printfn "%O(%s-%d,%s-%d)"
                (edge.getRelation())
                (gov.word()) (gov.index())
                (dep.word()) (dep.index())

The full code sample is available on GitHub. If you run it, you will see the following result:

Sentence #1 (9 tokens):
Kosgi Santosh sent an email to Stanford University.
[Text=Kosgi CharacterOffsetBegin=0 CharacterOffsetEnd=5 PartOfSpeech=NNP Lemma=Kosgi NamedEntityTag=PERSON] [Text=Santosh CharacterOffsetBegin=6 CharacterOffsetEnd=13 PartOfSpeech=NNP Lemma=Santosh NamedEntityTag=PERSON] [Text=sent CharacterOffsetBegin=14 CharacterOffsetEnd=18 PartOfSpeech=VBD Lemma=send NamedEntityTag=O] [Text=an CharacterOffsetBegin=19 CharacterOffsetEnd=21 PartOfSpeech=DT Lemma=a NamedEntityTag=O] [Text=email CharacterOffsetBegin=22 CharacterOffsetEnd=27 PartOfSpeech=NN Lemma=email NamedEntityTag=O] [Text=to CharacterOffsetBegin=28 CharacterOffsetEnd=30 PartOfSpeech=TO Lemma=to NamedEntityTag=O] [Text=Stanford CharacterOffsetBegin=31 CharacterOffsetEnd=39 PartOfSpeech=NNP Lemma=Stanford NamedEntityTag=ORGANIZATION] [Text=University CharacterOffsetBegin=40 CharacterOffsetEnd=50 PartOfSpeech=NNP Lemma=University NamedEntityTag=ORGANIZATION] [Text=. CharacterOffsetBegin=50 CharacterOffsetEnd=51 PartOfSpeech=. Lemma=. NamedEntityTag=O]
(NP (NNP Kosgi) (NNP Santosh))
(VP (VBD sent)
(NP (DT an) (NN email))
(PP (TO to)
(NP (NNP Stanford) (NNP University))))
(. .)))

nn(Santosh-2, Kosgi-1)
nsubj(sent-3, Santosh-2)
root(ROOT-0, sent-3)
det(email-5, an-4)
dobj(sent-3, email-5)
nn(University-8, Stanford-7)
prep_to(sent-3, University-8)

Sentence #2 (7 tokens):
He didn’t get a reply.
[Text=He CharacterOffsetBegin=52 CharacterOffsetEnd=54 PartOfSpeech=PRP Lemma=he NamedEntityTag=O] [Text=did CharacterOffsetBegin=55 CharacterOffsetEnd=58 PartOfSpeech=VBD Lemma=do NamedEntityTag=O] [Text=n’t CharacterOffsetBegin=58 CharacterOffsetEnd=61 PartOfSpeech=RB Lemma=not NamedEntityTag=O] [Text=get CharacterOffsetBegin=62 CharacterOffsetEnd=65 PartOfSpeech=VB Lemma=get NamedEntityTag=O] [Text=a CharacterOffsetBegin=66 CharacterOffsetEnd=67 PartOfSpeech=DT Lemma=a NamedEntityTag=O] [Text=reply CharacterOffsetBegin=68 CharacterOffsetEnd=73 PartOfSpeech=NN Lemma=reply NamedEntityTag=O] [Text=. CharacterOffsetBegin=73 CharacterOffsetEnd=74 PartOfSpeech=. Lemma=. NamedEntityTag=O]
(NP (PRP He))
(VP (VBD did) (RB n’t)
(VP (VB get)
(NP (DT a) (NN reply))))
(. .)))

nsubj(get-4, He-1)
aux(get-4, did-2)
neg(get-4, n’t-3)
root(ROOT-0, get-4)
det(reply-6, a-5)
dobj(get-4, reply-6)

Coreference set:
(2,1,[1,2)) -> (1,2,[1,3)), that is: “He” -> “Kosgi Santosh”

C# Sample

C# samples are also available on GitHub.

Stanford Temporal Tagger (SUTime)


SUTime is a library for recognizing and normalizing time expressions. SUTime is available as part of the Stanford CoreNLP pipeline and can be used to annotate documents with temporal information. It is a deterministic rule-based system designed for extensibility.

There is one more useful thing that we can do with CoreNLP: time extraction. The way we use CoreNLP here is pretty similar to the previous sample. First, we create an annotation pipeline and add all required annotators to it. (Notice that this sample also uses the ‘!’ operator defined at the beginning of the post.)

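let pipeline = AnnotationPipeline()
// Tokenizer and sentence splitter run before the tagger and the time annotator
pipeline.addAnnotator(PTBTokenizerAnnotator(false))
pipeline.addAnnotator(WordsToSentencesAnnotator(false))

let tagger = MaxentTagger(! @"pos-tagger\english-bidirectional\english-bidirectional-distsim.tagger")
pipeline.addAnnotator(POSTaggerAnnotator(tagger))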

let sutimeRules =
    [| ! @"sutime\defs.sutime.txt";
       ! @"sutime\english.holidays.sutime.txt";
       ! @"sutime\english.sutime.txt" |]
    |> String.concat ","
let props = Properties()
props.setProperty("sutime.rules", sutimeRules ) |> ignore
props.setProperty("sutime.binders", "0") |> ignore
pipeline.addAnnotator(TimeAnnotator("sutime", props))

Now we are ready to annotate something. This part is the same as in the previous sample.

let text = "Three interesting dates are 18 Feb 1997, the 20th of july and 4 days from today."
let annotation = Annotation(text)
annotation.set(CoreAnnotations.DocDateAnnotation().getClass(), "2013-07-14") |> ignore
pipeline.annotate(annotation)
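
The sample hard-codes the document date as the string "2013-07-14". If the date comes from a .NET DateTime instead, it can be formatted into the same shape. The sketch below assumes that this yyyy-MM-dd shape is what should be passed, since it is the only form the sample shows; toDocDate is a hypothetical helper name.

```fsharp
open System
open System.Globalization

/// Hypothetical helper: formats a date in the yyyy-MM-dd shape that the
/// sample passes as the document date.
let toDocDate (date: DateTime) : string =
    // Invariant culture keeps the digits and separators stable on any machine
    date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)
```

The value `toDocDate DateTime.Today` could then be passed in place of the literal, so that relative expressions such as "4 days from today" are anchored to the actual current date.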

And finally, we need to interpret the annotation results.

printfn "%O\n" (annotation.get(CoreAnnotations.TextAnnotation().getClass()))
let timexAnnsAll = annotation.get(TimeAnnotations.TimexAnnotations().getClass()) :?> java.util.ArrayList
for cm in timexAnnsAll |> Seq.cast<CoreMap> do
    let tokens = cm.get(CoreAnnotations.TokensAnnotation().getClass()) :?> java.util.List
    let first = tokens.get(0)
    let last = tokens.get(tokens.size() - 1)
    let time = cm.get(TimeExpression.Annotation().getClass()) :?> TimeExpression
    printfn "%A [from char offset '%A' to '%A'] --> %A"
        cm first last (time.getTemporal())

The full code sample is available on GitHub. If you run it, you will see the following result:

18 Feb 1997 [from char offset '18' to '1997'] --> 1997-2-18
the 20th of july [from char offset 'the' to 'July'] --> XXXX-7-20
4 days from today [from char offset '4' to 'today'] --> THIS P1D OFFSET P4D

C# Sample

C# samples are also available on GitHub.


This is a pretty awesome library. I hope you enjoy it. Try it out right now!

There are some other more specific Stanford packages that are already available on NuGet:

51 thoughts on “Stanford CoreNLP is available on NuGet for F#/C# devs”

  1. I’m glad, that I was able to inspire you to complete your work😉

    I got your NuGet package working within minutes. Awesome😉

    Much better than what I did. Until now it wasn’t clear to me, that you can get the models from stanford-corenlp-3.2.0-models.jar by simply unzipping. For my previous implementation (the one I posted on pastebin) I collected the necessary models from the individual NLP packages provided by Stanford and searched on GitHub for the dcoref files. What a waste of time😀

    If you want, you could simply add the hint to simply unzip, like you already did at

    Thanks again. You are helping me a lot to get started in NLP.

  2. Hi,
    I am working on a Farsi (Persian) chatter bot and I have good experience in C#. I find your work really interesting, but I have no experience in J#. Could you please give me a hand with how I can train your version of the Stanford Tagger with Persian data?
    Yours Faithfully,
    Ashkan Sirous

  3. Hi,
    I am trying to test your segmentation program for Chinese using C#. In the console window, I receive a lot of gibberish, which is presumably an encoding problem — perhaps I am doing something wrong.

    Is it possible to have the info that is sent to the console sent to a file, instead?

    Also, is it possible to pass the program one sentence, and receive all of the segmentation information back, rather than sending a whole file at a time. In other words, is there a method to call to send a Chinese string and receive back the segmentation info?

    This is the code I’m using (all in main, of course):

    string fileName = "testdata.txt";

    var props = new Properties();
    props.setProperty("sighanCorporaDict", "c:\\Stanford\\stanford-segmenter-2013-06-20\\data");
    props.setProperty("serDictionary", "c:\\stanford\\stanford-segmenter-2013-06-20\\data\\dict-chris6.ser.gz");
    props.setProperty("testFile", "testdata.txt");
    props.setProperty("inputEncoding", "UTF-8");
    props.setProperty("sighanPostProcessing", "true");

    var segmenter = new CRFClassifier(props);
    segmenter.loadClassifierNoExceptions("c:\\stanford\\stanford-segmenter-2013-06-20\\data\\ctb.gz", props);


    Many thanks for all of your work, and Happy New Year! I look forward to trying this out.


    Jon Rachlin

    1. According to your issue:
      – Newer version of segmenter already available (from 2013-11-12)
      – Could you check the encoding of your file? It should be UTF-8
      – Here is a working sample

      >Is it possible to have the info that is sent to the console sent to a file, instead?
      I have not tried this, but it should be possible. Something like this should work

      >is there a method to call to send a Chinese string and receive back the segmentation info?
      There are several classification methods; you can try them and choose the one that fits your task best. For example, ‘segmenter.classifyToString’ takes text as a string and returns the segmented string.

  4. Hi Sergey,

    Thank you for the awesome tutorial, it helped me a lot! I have a problem however, i’m trying to create the parse tree of my text as you did above, except I don’t want to do it per sentence, but rather the full body of text. I can’t get it to work, as i’m not sure what object to use the “.get(new TreeCoreAnnotations.TreeAnnotation().getClass())” method on. I’ve tried to use it on the annotation object itself but the tree always comes out null.

    Any help would be greatly appreciated!

      1. Hi, thank you for the reply!

        I have noticed that it can become very slow using larger bodies of text, however I want the full tree because I need to be able to determine context with regards to items across the entire body of text. I’ve tried the example above that you gave me, but it doesn’t quite do what I need.

        I think I will find a way around it, possibly using the sentence trees to make a final bigger tree. Thank you for your help though🙂

  5. Hi….
    I have downloaded coreNLP from Nuget. But how to use it? Is there any guide documentation which could lead me to use it?

  6. Hi,
    When I compile example, I receive a lot of exception. There are few of them:

    A first chance exception of type ‘java.lang.InternalError’ occurred in IKVM.OpenJDK.Core.dll
    A first chance exception of type ‘java.lang.reflect.InvocationTargetException’ occurred in Unknown Module.
    A first chance exception of type ‘java.lang.InternalError’ occurred in IKVM.OpenJDK.Core.dll
    A first chance exception of type ‘java.lang.InternalError’ occurred in stanford-corenlp-3.3.1.dll
    An unhandled exception of type ‘java.lang.InternalError’ occurred in stanford-corenlp-3.3.1.dll
    Additional information: unexpected entry: cli.System.TypeLoadException: Could not load type ‘IKVM.Attributes.HideFromReflectionAttribute’ from assembly ‘IKVM.Runtime, Version=7.4.5196.0, Culture=neutral, PublicKeyToken=13235d27fcbfff58’.

    Could you help me?
    Thanks alot for your work.

  7. Hi Sergey,

    I’ve been using your StanfordCoreNLP Nuget package for a couple of months now and everything works fine.

    Due to including another IKVM port into the same project, I upgraded the IVKM reference to version 7.4.5196.0.
    Unfortunately the StanfordCoreNLP Nuget package doesn’t work with the latest IVKM Nuget package.

    I guess this is because “IKVM.Attributes.HideFromReflectionAttribute” was removed. (see

    Upon loading the parser I get an exception (translated into english):
    unexpected entry: cli.System.TypeLoadException: Could not load type “IKVM.Attributes.HideFromReflectionAttribute” in assembly “IKVM.Runtime, Version=7.4.5196.0, Culture=neutral, PublicKeyToken=13235d27fcbfff58”.

    Is it possible that you release a newer version of StanfordCoreNLP referencing the latest IKVM version?

    I see the comments of Oleksandr Motsok, but I think I can’t reference to different versions at the same time?

      1. Hi Sergey,
        thanks for responding and solving my issue so fast😉 My pipeline works as intended again.

        Do you want future issues/questions on your blog or rather on GitHub?
        Have a nice evening😉

  8. Sergey, thanks for your work. I’m trying to simply load the modules via C# and getting a RuntimeIOException loading a tagger model. All the C# sample references go 404.

    Just doing this:
    var props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    props.setProperty("sutime.binders", "0");

    var curDir = System.Environment.CurrentDirectory;
    Environment.CurrentDirectory = @"C:\AI\models\edu\stanford\nlp\models\"; //also tried just the jar location (c:\ai\models).
    nlp = new StanfordCoreNLP(props); // <<< fails here.
    Environment.CurrentDirectory = curDir;

    Any suggestions, or non-404 sample C# skeleton would be deeply appreciated – thanks!

    1. as a follow up – it appears that pos is what’s having an issue. Other models load, but pos doesn’t. – not sure why a load issue would start there.

      1. and… nevermind… apparently it needs to be a subdirectory of the whole rather than separate. (C:\AI\models\stanford-corenlp-full-2014-06-16\stanford-corenlp-3.4-models) rather than higher. It’s loading now – looking forward to exploring it – thanks!

  9. Hi Sergey, I appreciate so much your effort and time!
    Do you have a C# code samples availables for a newbie like me? All current links refer to a page with 404 error code. Thks

  10. Hi Serjey, nice stuff!

    But wondering if you could help me out with a problem running the library. I’ve been trying to use CoreNLP in my C# project. I get the dependencies correctly from NuGet, and I can instantiate the StanfordCoreNLP class (pretty much line for line the C# example you wrote out).

    But when I get to the annotate call, an exception is thrown with the message “Provider not found” from the IKVM.OpenJDK.XML.API.

    I’ve pasted the full stack trace here, if it’s of any help? Any ideas where I have went wrong?

    at javax.xml.transform.TransformerFactory.newInstance()
    at edu.stanford.nlp.time.XMLUtils.printNode(OutputStream out, Node node, Boolean prettyPrint, Boolean includeXmlDeclaration)
    at edu.stanford.nlp.time.XMLUtils.nodeToString(Node node, Boolean prettyPrint)
    at edu.stanford.nlp.time.Timex.init(Element A_1)
    at edu.stanford.nlp.time.Timex..ctor(Element element)
    at edu.stanford.nlp.time.Timex.fromMap(String text, Map map)
    at edu.stanford.nlp.time.TimeExpressionExtractorImpl.toCoreMaps(CoreMap A_1, List A_2, TimeIndex A_3)
    at edu.stanford.nlp.time.TimeExpressionExtractorImpl.extractTimeExpressionCoreMaps(CoreMap annotation, String docDate, TimeIndex timeIndex)
    at edu.stanford.nlp.time.TimeExpressionExtractorImpl.extractTimeExpressionCoreMaps(CoreMap annotation, CoreMap docAnnotation)
    at A_1, CoreMap A_2)
    at A_1, CoreMap A_2, CoreMap A_3)
    at tokens, CoreMap document, CoreMap sentence)
    at A_1, CoreMap A_2, CoreMap A_3)
    at tokens, CoreMap document, CoreMap sentence)
    at tokenSequence, CoreMap doc, CoreMap sentence)
    at edu.stanford.nlp.pipeline.NERCombinerAnnotator.doOneSentence(Annotation annotation, CoreMap sentence)
    at edu.stanford.nlp.pipeline.NERCombinerAnnotator.annotate(Annotation annotation)
    at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(Annotation annotation)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(Annotation annotation)

  11. Hello Sergey,

    I am trying to learn the library. I am using C# with the posted example, but I get the following error. I loaded the package “Stanford.NLP.CoreNLP” (it added IKVM.NET) via nuget and downloaded the code. Unzipped the .jar models. My directory is correct.:

    edu.stanford.nlp.util.ReflectionLoading.ReflectionLoadingException was unhandled
    Message=Error creating edu.stanford.nlp.time.TimeExpressionExtractorImpl
    at edu.stanford.nlp.util.ReflectionLoading.loadByReflection(String className, Object[] arguments)
    at edu.stanford.nlp.time.TimeExpressionExtractorFactory.create(String className, String name, Properties props)
    at edu.stanford.nlp.time.TimeExpressionExtractorFactory.createExtractor(String name, Properties props)
    at props, Boolean useSUTime, Properties sutimeProps)
    at applyNumericClassifiers, Boolean useSUTime, Properties nscProps, String[] loadPaths)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.ner(Properties properties)
    at edu.stanford.nlp.pipeline.AnnotatorFactories.6.create()
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(String name)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(Properties A_1, Boolean A_2, AnnotatorImplementations A_3)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP..ctor(Properties props, Boolean enforceRequirements)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP..ctor(Properties props)
    at ConsoleApplication1.Program.Main(String[] args) in d:\Programming_Code\VisualStudio\visual studio 2013\Projects\AutoWikify\ConsoleApplication1\ConsoleApplication1\Program.cs:line 30
    at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
    at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
    at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
    at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
    at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
    at System.Threading.ThreadHelper.ThreadStart()
    InnerException: edu.stanford.nlp.util.MetaClass.ClassCreationException
    Message=MetaClass couldn’t create public edu.stanford.nlp.time.TimeExpressionExtractorImpl(java.lang.String,java.util.Properties) with args [sutime, {sutime.binders=0, annotators=tokenize, ssplit, pos, lemma, ner, parse, dcoref}]
    at edu.stanford.nlp.util.MetaClass.ClassFactory.createInstance(Object[] params)
    at edu.stanford.nlp.util.MetaClass.createInstance(Object[] objects)
    at edu.stanford.nlp.util.ReflectionLoading.loadByReflection(String className, Object[] arguments)
    InnerException: java.lang.reflect.InvocationTargetException
    at __(Object[] )
    at Java_sun_reflect_ReflectionFactory.FastConstructorAccessorImpl.newInstance(Object[] args)
    at java.lang.reflect.Constructor.newInstance(Object[] initargs, CallerID )
    at edu.stanford.nlp.util.MetaClass.ClassFactory.createInstance(Object[] params)

    Here is my code:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using java.util;
    using edu.stanford.nlp.pipeline;
    using Console = System.Console;

    namespace ConsoleApplication1
    class Program
    static void Main(string[] args)
    // Path to the folder with models extracted from `stanford-corenlp-3.4-models.jar`
    var jarRoot = @"D:\Programming_SDKs\stanford-corenlp-full-2015-01-30\stanford-corenlp-3.5.1-models\";

    // Text for processing
    var text = "Kosgi Santosh sent an email to Stanford University. He didn't get a reply.";

    // Annotation pipeline configuration
    var props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    props.setProperty("sutime.binders", "0");

    // We should change current directory, so StanfordCoreNLP could find all the model files automatically
    var curDir = Environment.CurrentDirectory;
    var pipeline = new StanfordCoreNLP(props);

    // Annotation
    var annotation = new Annotation(text);

    // Result – Pretty Print
    using (var stream = new ByteArrayOutputStream())
    pipeline.prettyPrint(annotation, new PrintWriter(stream));

  12. Hi,
    I’ve fixed the dependencies in pom.xml, but I still get this exception:

    “Unable to resolve “edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger” as either class path, filename or URL”

    My source code is exactly the same as the C# code you provided here. I was wondering if you have any ideas what the problem might be.


  13. Hey,
    I tried the c# code for this and it works fine. But in the output screen, along with the desired result, I am also getting

    “Reading TokensRegex rules from edu/stanford/nlp/models/sutime/defs.sutime.txt
    Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.sutime.txt”

    Is there any way I can get rid of these and only get the output which I am printing?

  14. Could you please give an example of processing the the results in C#. Such as getting the tags as trees and graphs or tokens as list. Something like this.

    // these are all the sentences in this document
    // a CoreMap is essentially a Map that uses class objects as keys and has values with custom types
    List<CoreMap> sentences = document.get(SentencesAnnotation.class);

    for(CoreMap sentence: sentences) {
      // traversing the words in the current sentence
      // a CoreLabel is a CoreMap with additional token-specific methods
      for (CoreLabel token: sentence.get(TokensAnnotation.class)) {
        // this is the text of the token
        String word = token.get(TextAnnotation.class);
        // this is the POS tag of the token
        String pos = token.get(PartOfSpeechAnnotation.class);
        // this is the NER label of the token
        String ne = token.get(NamedEntityTagAnnotation.class);
      }

      // this is the parse tree of the current sentence
      Tree tree = sentence.get(TreeAnnotation.class);

      // this is the Stanford dependency graph of the current sentence
      SemanticGraph dependencies = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class);
    }

  15. Hi Sergey,

    I got the whole description of the sentences but I only need to know whether the sentence is of negative or positive sentiment. From which part of the resultset I could understand that?

  16. Hi,
    i;m novice worker of using the standford for c#…i need a guidance regarding dcoref for my own language..i;m nt getting idea where i start from ..may i have to make my own library to convert in urdu language.???can anyone help me ..

  17. dcoref is another name of simple annotation ??i need jst dcoref code using standford core nlp.

    1. The sample contains the line `props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");`, which configures the annotators that you want to apply to your text.
