Stormin’ F#

Faisal's space

Apache Storm is a scalable ‘stream computing’ platform that is fast gaining popularity. Hadoop and Storm can share the same cluster and the two complement each other well for different computing needs – batch for Hadoop and near-real-time for Storm.

Storm provides a macro architecture for executing ‘big data’ stream processing ‘topologies’. For example, one can easily increase the parallelism of any node in the Storm topology to suit the performance requirements.

For streaming analytics, however, Storm does not offer much help out of the box. Often one has to write the needed analytic logic from scratch. Wouldn’t it be nice if one could use something like Reactive Extensions (Rx) within Storm components?

Luckily Nathan Marz – the original author of Storm – chose to enable Storm with multi-language support. While Storm itself is written in Clojure and Java, it implements a (relatively simple?) language-independent protocol that can be used with basically…


F# Weekly #4, 2015

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

Videos/Presentations/Courses

Blogs

F# vNext News

New releases

That’s all for now. Have a great week.

Previous F# Weekly edition – #3

F# Weekly #3, 2015

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

Videos/Presentations/Courses

Blogs

F# vNext News

New releases

That’s all for now. Have a great week.

Previous F# Weekly edition – #2

F# Weekly #2, 2015

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

Videos/Presentations/Courses

Blogs

F# vNext News

New releases

That’s all for now. Have a great week.

Previous F# Weekly edition – #1

FAKE Final Targets

Today I discovered a new (for me) FAKE feature called FinalTarget. I was working on an integration with our new product ReportPortal (in beta right now), which collects and visualizes test results, and ran into a problem …

The integration process is the following:

  1. Start a new launch (test session) on the server
  2. Execute all tests: unit & integration tests (in my scenario)
  3. Close the launch on the server

As you can see, I need to close the launch even when tests fail, so I cannot simply stop the build script at the first failure. To my surprise, FAKE provides an elegant solution for this scenario: FinalTarget (a target that, once activated, is executed no matter how the build ends).

FAKE dependencies look like this in my script:

"Clean"
 ==> "RestorePackages"
 ==> "Build"
 =?> ("RP_StartNewLaunch", not <| hasBuildParam "skipTests")
 =?> ("RunUnitTests", not <| hasBuildParam "skipTests")
 =?> ("RunIntegrationTests", hasBuildParam "allTests")
 =?> ("RP_FinishLaunch", not <| hasBuildParam "skipTests")
 ==> "All"
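To see how the conditional `=?>` steps shape the chain, here is a toy model in plain F#. It is only a sketch: the two operators below are simplified stand-ins that build a list of target names (they are not FAKE's implementation), and `planFor` with its set of parameters is a hypothetical replacement for `hasBuildParam`.

```fsharp
// Toy operators that only collect target names in order.
// They mimic the shape of FAKE's ==> and =?>, nothing more.
let (==>) (chain: string list) (next: string) = chain @ [next]

// Append `next` only when the condition holds; otherwise keep the chain as is.
let (=?>) (chain: string list) (next: string, condition: bool) =
    if condition then chain @ [next] else chain

// Hypothetical helper: compute the chain for a given set of build parameters.
let planFor (buildParams: Set<string>) =
    let hasParam name = Set.contains name buildParams
    ["Clean"]
        ==> "RestorePackages"
        ==> "Build"
        =?> ("RP_StartNewLaunch", not <| hasParam "skipTests")
        =?> ("RunUnitTests", not <| hasParam "skipTests")
        =?> ("RunIntegrationTests", hasParam "allTests")
        =?> ("RP_FinishLaunch", not <| hasParam "skipTests")
        ==> "All"

let skipped = planFor (set ["skipTests"])   // no test-related targets at all
let full    = planFor (set ["allTests"])    // every target in the chain
```

In this model, passing "skipTests" drops the launch and test targets together, which is why the same condition is repeated on three of the steps.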

Targets look like this:

Target "RP_StartNewLaunch" (fun _ ->
    ...
    ActivateFinalTarget "RP_FinishLaunch"
)

FinalTarget "RP_FinishLaunch" (fun _ ->
    ...
)

Target "RunUnitTests" (fun _ ->
    ...
)

Target "RunIntegrationTests" (fun _ ->
    ...
)

As you can see, I defined “RP_FinishLaunch” as a FinalTarget instead of a regular Target and activated it when a new launch starts.
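The behaviour is close to a try/finally around the build. Below is a toy model in plain F# that shows the idea. It is a sketch under my own assumptions, not how FAKE works internally: `runBuild`, `activateFinal` and `log` are hypothetical names, and the failing unit tests are simulated.

```fsharp
open System.Collections.Generic

// Records the names of everything that ran, in order.
let log = List<string>()

// Final targets that were activated during the build.
let activatedFinalTargets = List<string * (unit -> unit)>()

// Hypothetical helper: remember a final target so it runs when the build ends.
let activateFinal name body = activatedFinalTargets.Add((name, body))

// Run targets in order. The first failure skips the remaining targets,
// but every final target activated so far still runs.
let runBuild (targets: (string * (unit -> unit)) list) =
    try
        try
            for (name, body) in targets do
                log.Add name
                body ()
            true
        with ex ->
            log.Add ("FAILED: " + ex.Message)
            false
    finally
        for (name, body) in activatedFinalTargets do
            log.Add name
            body ()

let succeeded =
    runBuild [
        ("RP_StartNewLaunch", fun () -> activateFinal "RP_FinishLaunch" ignore)
        ("RunUnitTests", fun () -> failwith "unit tests failed")   // simulated failure
        ("RunIntegrationTests", ignore)
    ]
```

Here "RunIntegrationTests" never runs because the unit tests failed, yet "RP_FinishLaunch" is still the last entry in `log`, so the launch gets closed.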

FAKE is awesome 😉

F# Weekly #1, 2015

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

Videos/Presentations/Courses

This week from F# Advent Calendar in English

This week from F# Advent Calendar in Japanese

Blogs

F# vNext News

New releases

That’s all for now. Have a great week.

Previous F# Weekly edition – #52

Twitter Pulse #fsharp 2014

For the past two years, at the beginning of the year, I have written a post in which I tried to sum up the year's results and find some interesting facts and stats from tweets:

This year was really awesome, a lot of things happened:

So many things occurred that I need your help to find everything that happened or changed this year.

For this purpose, I prepared a data set of tweets starting from Jan 1, 2013, which is available here, and I ask you to help me analyze it.

How to start:

  1. Download the data set and unzip the archive.
  2. Download the latest version of FSharp.Data.
  3. Copy and paste the following code snippet:
#r @"..\packages\FSharp.Data.2.1.1\lib\net40\FSharp.Data.dll"
open FSharp.Data

type Tweets = CsvProvider<"fsharp_2013-2014.csv">
let tweets = Tweets.GetSample()

//TODO: Your awesome analytics here

My sample analysis:

As an example of what you can do with the data, I prepared a calculation of user activity that shows who tweeted the most this year.

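tweets.Rows
// keep only tweets created in 2014
|> Seq.filter (fun x -> x.CreatedDate.Year = 2014)
// group tweets by author and count them
|> Seq.groupBy (fun x -> x.FromUserScreenName)
|> Seq.map (fun (group, items) ->
    (group, Seq.length items))
// most active users first
|> Seq.sortBy (fun (_, cnt) -> -cnt)
// at most 100 users; unlike Seq.take, Seq.truncate does not throw on fewer
|> Seq.truncate 100
|> Seq.iter (fun (group, cnt) ->
    printfn "%s:%d" group cnt)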

When you execute these lines, you will get output in FSI with user names and their tweet counts. You can copy this output to wordle.net and play with the settings to visualize it nicely:
[Image: 2015-01-02_1007, Wordle visualization of the output]
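If you want to try the pipeline before downloading the data set, it also works on a few hand-made records. This is a minimal sketch: the `Tweet` record is a hypothetical stand-in for the CsvProvider row type, the sample rows and user names are made up, and `Seq.countBy` is used as a shortcut for grouping and counting.

```fsharp
open System

// Hypothetical stand-in for a CsvProvider row; only the two columns used above.
type Tweet = { FromUserScreenName: string; CreatedDate: DateTime }

// Made-up sample rows, for illustration only.
let sample =
    [ { FromUserScreenName = "alice"; CreatedDate = DateTime(2014, 3, 1) }
      { FromUserScreenName = "bob";   CreatedDate = DateTime(2014, 5, 9) }
      { FromUserScreenName = "alice"; CreatedDate = DateTime(2014, 7, 2) }
      { FromUserScreenName = "alice"; CreatedDate = DateTime(2013, 12, 31) } ]

// Count tweets per user for one year, most active first, at most `n` users.
let topUsers year n (tweets: seq<Tweet>) =
    tweets
    |> Seq.filter (fun t -> t.CreatedDate.Year = year)
    |> Seq.countBy (fun t -> t.FromUserScreenName)
    |> Seq.sortBy (fun (_, cnt) -> -cnt)
    |> Seq.truncate n            // does not throw when there are fewer than n users
    |> Seq.toList

let result = topUsers 2014 100 sample
```

Wrapping the pipeline in a function like `topUsers` makes it easy to compare years, for example 2013 against 2014.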

Please help me explore the data set and share your results with me on Twitter (@sergey-tihon). I will include your plots and charts at the end of this post. Thank you!