SharePoint 2013: Content Enrichment and performance of 3rd party services

There are a lot of cases when people write Content Enrichment service for SharePoint 2013 to integrate it with some 3rd party tagging/enrichment API. Such integration very often leads to huge crawl time degradation, the same happened in my case too.

The first thought was to measure the time that CPES (Content Processing Enrichment Service) spends on calls. I wrapped all my calls to external system in timers and run Full Crawl again. The results were awful, calls were in average from 10 to 30 times slower from CPES than from POC console app. Code was the same and of course perfect, so, probably my 3rd party service was not able to handle such an incredible load that was generated by my super-fast SharePoint Farm.

If you are in the same situation with fast SharePoint, awesome CPES implementation and bad 3rd party service, you probably should try async integration approach described in “Advanced Content Enrichment in SharePoint 2013 Search” by Richard diZerega.

But what if the problem is in my source code… what if execution environment affects execution time of REST calls… Some time later, I found a nice post “Quick Ways to Boost Performance and Scalability of ASP.NET, WCF and Desktop Clients” from Omar Al Zabir. This article describes the notion of Connection Manager:

By default, system.net has two concurrent connections per IP allowed. This means on a webserver, if it’s calling WCF service on another server or making any outbound call to a particular server, it will only be able to make two concurrent calls. When you have a distributed application and your web servers need to make frequent service calls to another server, this becomes the greatest bottleneck

WHOA! That means that it does not matter how fast our SharePoint Search service is. Our CPES executes no more than two calls to 3rd party service at the same time and queues other. Let’s fix this unpleasant injustice in web.config:

 <system.net>
   <connectionManagement>
     <add address="*" maxconnection="128"/>
   </connectionManagement>
 </system.net>

Now performance degradation should be fixed and if your 3rd party service is good enough your crawl time will be reasonable, otherwise you need to go back to advice from Richard diZereg.

SharePoint 2013: Content Enrichment for Large Files

There are a couple of guides on how to write Content Enrichment services for SharePoint 2013. One of them is official MSDN article “How to: Use the Content Enrichment web service callout for SharePoint Server“.

This article advice you two configuration steps to adjust max size of document that will be processed by CPES (Content Processing Enrichment Service).

  1. Modify web.config to accept messages up to 8 MB, and configure readerQuotas to be a sufficiently large value.
    <bindings>
     <basicHttpBinding>
       <!-- The service will accept a maximum blob of 8 MB. -->
       <binding maxReceivedMessageSize = "8388608">
       <readerQuotas maxDepth="32"
         maxStringContentLength="2147483647"
         maxArrayLength="2147483647" 
         maxBytesPerRead="2147483647" 
         maxNameTableCharCount="2147483647" /> 
       <security mode="None" />
       </binding>
     </basicHttpBinding>
    </bindings>
    
  2. Modify SPEnterpriseSearchContentEnrichmentConfiguration.
    $ssa = Get-SPEnterpriseSearchServiceApplication
    $config = New-SPEnterpriseSearchContentEnrichmentConfiguration
    $config.Endpoint = http://Site_URL/ContentEnrichmentService.svc
    $config.InputProperties = "Author", "Filename"
    $config.OutputProperties = "Author"
    $config.SendRawData = $True
    $config.MaxRawDataSize = 8192
    Set-SPEnterpriseSearchContentEnrichmentConfiguration –SearchApplication
    $ssa –ContentEnrichmentConfiguration $config
    

The concept is generally good, but what if you need to process files larger than 8MB? Let’s try to increase this number up to 300Mb for example (I think that ideally this limit should be not less than max file size allowed for your web apps).

Let’s change both values and run full crawl of SharePoint site. After that, if you are lucky, you will see something like that in your “Error Breakdown”:

crawl_errors_001

WAT? Something went wrong, but what it was … Let’s investigate ULS logs on the machine with Search Service. After a couple of unforgettable minutes of reading ULS logs, I’ve found the following error message:

[Microsoft.CrawlerFlow-cb9134ec-91c6-4bac-89f9-a0cc9fe1e481] Microsoft.Ceres.Evaluation.Engine.ErrorHandling.HandleExceptionHelper : Evaluation failure detected: Operator : ContentEnrichment Operator type : ContentEnrichmentClient Error id : 3206 Correlation id : 60ef1afd-038a-4f64-8230-2b2493923f80 Partition id : 0c37852b-34d0-418e-91c6-2ac25af4be5b Message : Failed to send the item to the content processing enrichment service. 49691C90-7E17-101A-A91C-08002B2ECDA9:#9: https://mysite.com/MyDoc.pptx id : ssic://780174 System.ServiceModel.EndpointNotFoundException: There was no endpoint listening
at http://MyServer:8081/ContentProcessingEnrichmentService.svc that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. —> System.Net.WebException: The remote server returned an error: (404) Not Found.
at System.Net.HttpWebRequest.GetResponse()
at System.ServiceModel.Channels.HttpChannelFactory`1.HttpRequestChannel.HttpChannelRequest.WaitForReply(TimeSpan timeout) –

WAT? “The remote server returned an error: (404) Not Found”. Does the service not exist sometimes? How could it be? Let’s go to IIS log (on the machine where your CPES is installed). Path to the IIS logs should look similar to this c:\inetpub\logs\LogFiles\W3SVC3\.

iislog

It is true – CPES sometimes returns 404.13 status. Let’s google what this status code means.

404.13 – Content length too large. The request contains a Content-Length header. The value of the Content-Length header is larger than the limit that is allowed for the server.

Seems that IIS is not ready yet to receive our 300Mb files. There is one more parameter in web config that should be tweaked to  handle really large files, this parameter is maxAllowedContentLength (default value is 30000000, that is ~30Mb). Let’s change it in web.config:

 <system.webServer>
   <security>
     <requestFiltering>
       <requestLimits maxAllowedContentLength="314572800" />
     </requestFiltering>
   </security>
 </system.webServer>

Recrawl your content once again, and Voila, strange errors gone! Enjoy your content enrichment!)

FAST Search Server 2010 for SharePoint Versions

Talbott Crowell's Software Development Blog

Here is a table that contains a comprehensive list of FAST Search Server 2010 for SharePoint versions including RTM, cumulative updates (CU’s), and hotfixes. Please let me know if you find any errors or have a version not listed here by using the comments.

Build Release Component Information Source (Link to Download)
14.0.4763.1000 RTM FAST Search Server Mark   van Dijk 
14.0.5128.5001 October 2010 CU FAST Search Server KB2449730 Mark   van Dijk 
14.0.5136.5000 February 2011 CU FAST Search Server KB2504136 Mark   van Dijk 
14.0.6029.1000 Service Pack 1 FAST Search Server KB2460039 Todd   Klindt
14.0.6109.5000 August 2011 CU FAST Search Server KB2553040 Todd   Klindt
14.0.6117.5002 February 2012 CU FAST Search Server KB2597131 Todd   Klindt
14.0.6120.5000 April 2012 CU FAST Search Server KB2598329 Todd   Klindt
14.0.6126.5000 August 2012 CU FAST Search Server KB2687489 Mark   van Dijk 
14.0.6129.5000 October 2012 CU FAST Search Server KB2760395 Todd…

View original post 245 more words

Declarative authorization in REST services in SharePoint with F# and ServiceStack

This post is a short overview of my talk at Belarus SharePoint User Group at 2013/06/27.

The primary goals were to find an efficient declarative way to specify authorization on REST service (that knows about SharePoint built in security model) and try F# in SharePoint on a real-life problem. Service Stack was selected over ASP.NET Web API because I wanted to find a solution that operates in SharePoint 2010 on .NET 3.5. Continue reading “Declarative authorization in REST services in SharePoint with F# and ServiceStack”

How to determine browser type in JavaScript (for SharePoint 2010 sites)

According to the sad situation in nowadays front-end development, we have to check current browser type and version in JavaScript code and behave differently depend on that. There are many options to do so like this or this. But working in SharePoint 2010 environment you have one more, init.js defines browseris object (see on the picture below) which contains most of required data. Be free to rely on SharePoint in this case.
browseris

Explain rank in Sharepoint 2013 Search

Add your thoughts here… (optional)

Insights into search black magic

Default ranking model in Sharepoint 2013 is completely different from what we’ve seen in FS4SP and is definitely a step forward comparing to SP2010. In a nutshell it uses multiple features (to take into account query terms, it’s proximity; document click-through, authority, anchor text, url depth etc ) which are mixed with help of neural network as a final step. Details can be found in patent claim http://www.google.com/patents/US8296292.

Hopefully there’s a way to bring more light into this black magic. I’ve modified default Display Template and added “Explain Rank” link to each item. This link redirects user to ExplainRank page which hosted under {search_center_url}/_layouts/15/ folder.

1 - starwars

I used

  • d=ctx.CurrentItem.Path
  • q=ctx.DataProvider.$2_3.k  (have found this hidden property using trial & error method)
  • another option is to extract value for q= from QueryText from ctx.ListData.Properties.SerializedQuery which value was<Query Culture=”en-US” EnableStemming=”True” EnablePhonetic=”False” EnableNicknames=”False” IgnoreAllNoiseQuery=”True” SummaryLength=”180″ MaxSnippetLength=”180″ DesiredSnippetLength=”90″ KeywordInclusion=”0″ QueryText=”star wars” QueryTemplate=”{searchboxquery}” TrimDuplicates=”True”…

View original post 150 more words

F# and FAST Search for SharePoint 2010

If you are a SharePoint developer, an Enterprise Search developer or an employee of a large corporation with Global Search through private internal infrastructure then you may be interested in search automation. Deployment of FAST Search Server 2010 for SharePoint (F4SP) is out of the current post’s scope (you can follow TechNet F4SP Deployment Guide if you need).

F# 3.0 comes with feature called “type providers” that helps you to simplify your life in daily routine. For the case of WCF, the Wsdl type provider allows us to automate the proxy generation. Here we need to note that, F# 3.0 works only on the .NET 4.0 and later, but SharePoint 2010 server side runs exclusively on the .NET 3.0 64bit. Let’s see how this works together.

Connecting to the web service

Firstly, we create an empty F# Script file.

#r "System.ServiceModel.dll"
#r "FSharp.Data.TypeProviders.dll"
#r "System.Runtime.Serialization.dll"

open System
open System.Net
open System.Security
open System.ServiceModel
open Microsoft.FSharp.Data.TypeProviders

[<Literal>]
let SearchServiceWsdl = "https://SharePoint2010WebAppUrl/_vti_bin/search.asmx?WSDL"
type SharePointSearch = Microsoft.FSharp.Data.TypeProviders.WsdlService<SearchServiceWsdl>

At this point, the type provider creates proxy classes in the background. The only thing we need to do is to configure the access security. The following code tested on the two SharePoint 2010 farms with NTLM authentication and HTTP/HTTPS access protocols.

let getSharePointSearchService() =
    let binding = new BasicHttpBinding()
    binding.MaxReceivedMessageSize <- 10000000L
    binding.Security.Transport.ClientCredentialType <- HttpClientCredentialType.Ntlm
    binding.Security.Mode <- if (SearchServiceWsdl.StartsWith("https"))
                                 then BasicHttpSecurityMode.Transport
                                 else BasicHttpSecurityMode.TransportCredentialOnly

    let serviceUrl = SearchServiceWsdl.Remove(SearchServiceWsdl.LastIndexOf('?'))
    let service = new SharePointSearch.ServiceTypes.
                        QueryServiceSoapClient(binding, EndpointAddress(serviceUrl))
    //If server located in another domain then we may authenticate manually
    //service.ClientCredentials.Windows.ClientCredential
    //  <- (Net.NetworkCredential("User_Name", "Password"))
    service.ClientCredentials.Windows.AllowedImpersonationLevel
        <- System.Security.Principal.TokenImpersonationLevel.Delegation;
    service

let searchByQueryXml queryXml =
    use searchService = getSharePointSearchService()
    let results = searchService.QueryEx(queryXml)
    let rows = results.Tables.["RelevantResults"].Rows
    [for i in 0..rows.Count-1 do
        yield (rows.[i].ItemArray) |> Array.map (sprintf "%O")]

Building search query XML

To query F4SP we use the same web service as for build-in SharePoint 2010 search, but with a bit different query XML. The last thing that we need to do is to build query.You can find query XML syntax on Microsoft.Search.Query Schema, but it is hard enough to work with it using official documentation. There is a very useful CodePlex project called FAST Search for Sharepoint MOSS 2010 Query Tool which provides a user-friendly query builder interface.

F4SPQueryTool

FAST Query Language (FQL) Syntax

FAST has its own query syntax(FQL Syntax) that can be directly used through SharePoint Search Web Service.

let getFQLQueryXml (fqlString:string) =
  """<QueryPacket Revision="1000">
       <Query>
         <Context>
           <QueryText language="en-US" type="FQL">{0}</QueryText>
         </Context>
         <SupportedFormats Format="urn:Microsoft.Search.Response.Document.Document" />
         <ResultProvider>FASTSearch</ResultProvider>
         <Range>
           <StartAt>1</StartAt>
           <Count>5</Count>
         </Range>
         <EnableStemming>false</EnableStemming>
         <EnableSpellCheck>Off</EnableSpellCheck>
         <IncludeSpecialTermsResults>false</IncludeSpecialTermsResults>
         <IncludeRelevantResults>true</IncludeRelevantResults>
         <ImplicitAndBehavior>false</ImplicitAndBehavior>
         <TrimDuplicates>true</TrimDuplicates>
         <Properties>
           <Property name="Url" />
           <Property name="Write" />
           <Property name="Size" />
         </Properties>
       </Query>
     </QueryPacket>"""
  |> (fun queryTemplate -> String.Format(queryTemplate,fqlString))

let fqlQueryResults =
  """and(string("Functional Programming", annotation_class="user", mode="phrase"),
         or("fileextension":string("ppt", mode="phrase"),
            "fileextension":string("pptx", mode="phrase")))
     AND filter(and(isdocument:1))"""
  |> getFQLQueryXml |> searchByQueryXml

Keyword Query Syntax

FAST also supports native SharePoint Keyword Query Syntax.

let getKeywordQueryXml (keywordString:string) =
  """<QueryPacket Revision="1000">
       <Query>
         <Context>
           <QueryText language="en-US" type="STRING">{0}</QueryText>
         </Context>
         <SupportedFormats Format="urn:Microsoft.Search.Response.Document.Document" />
         <ResultProvider>FASTSearch</ResultProvider>
         <Range>
           <StartAt>1</StartAt>
           <Count>5</Count>
         </Range>
         <EnableStemming>false</EnableStemming>
         <EnableSpellCheck>Off</EnableSpellCheck>
         <IncludeSpecialTermsResults>false</IncludeSpecialTermsResults>
         <IncludeRelevantResults>true</IncludeRelevantResults>
         <ImplicitAndBehavior>false</ImplicitAndBehavior>
         <TrimDuplicates>true</TrimDuplicates>
         <Properties>
           <Property name="Url" />
           <Property name="Write" />
           <Property name="Size" />
         </Properties>
       </Query>
     </QueryPacket>"""
  |> (fun queryTemplate -> String.Format(queryTemplate,keywordString))

let simpleKeywordQueryResults =
  """"Functional Programming" scope:"Documents" (fileextension:"PPT" OR fileextension:"PPTX")"""
  |> getKeywordQueryXml |> searchByQueryXml

Query Syntax Summary

One of the principal differences between two syntaxes is that Keyword Query needs to be converted into FQL on the SharePoint side. Keyword syntax also supports scope conditions, which will be converted into FQL filters. For example “scope:”Documents”” will be translated into ” filter(and(isdocument:1))” (In the case when Documents scope exists in the SharePoint Query Service Application).  Unfortunately, we can not specify SharePoint scope in FQL query.

SharePoint 2013 Development Environment

Talbott Crowell's Software Development Blog

In this post, I’m going to describe my setup and some of the software and system requirements I needed for getting a SharePoint 2013 development environment up and running.  I’ll try to update this post as new releases of the Microsoft Office Developer Tools for Visual Studio 2012 are released or I make major changes or discoveries as time progresses.

sharepoint2013

Development Environment Setup

I’ve been using CloudShare for some of my SharePoint 2013 development ever since I purchased my new MacBook Air 11 inch which is maxed out at 8 GB and running Parallels so I can run Windows 7.  But recently I’ve decided to pull out my old Dell Precision M6500 laptop which has 16 GB to get a SharePoint 2013 development environment set up locally.  This beast is a great laptop for SharePoint development, but it is very heavy (the power supply is heavier then my MacBook Air). …

View original post 1,930 more words

Seven SharePoint Videos from //build/ 2012 Conference