11 Feb 2017
If you follow me on Twitter, you may have noticed a recurring topic lately: Azure Functions. I have found it both useful for many use cases, and simply fun to work with; and it fits pretty nicely with F#. I recently gave a talk at NDC London (the video should be online at some point), where I demoed a small example, trying to fit in as many features as I could, in as little time and code as possible. Someone took up my offer to write a tutorial from the ground up, so I figured, let’s take that example and turn it into a post. It is a demo, so what it does is not particularly useful by itself, but it illustrates many of the features and tricks I found useful, and should be a good starting point to write “real” code.
The app: sending exchange rate updates on Slack
What we will build is an app which will post, on a regular cadence, the latest available USD/GBP exchange rate on Slack. The reason I picked that example is twofold. First, the exchange rate changes often, which will help verify that things are indeed working. Second, it lets us showcase how easy it is to integrate functions to put together a working application.
Before starting with the code itself, we will need two things: exchange rates, and Slack.
For the exchange rate, we will use Yahoo, while it’s still there. Yahoo has a free API for exchange rates, available at the following URL:
http://query.yahooapis.com/v1/public/yql?q=select * from yahoo.finance.xchange where pair in ("GBPUSD")&env=store://datatables.org/alltableswithkeys
This returns an XML document, which looks like this:
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="1" yahoo:created="2017-02-11T19:56:24Z" yahoo:lang="en-US">
So the first part of our job will be to regularly call that URL, and extract the Rate from the XML document.
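As a rough illustration of that extraction step, here is a minimal sketch using System.Xml.Linq. The sample document is hypothetical: only the presence of a Rate element comes from the description above, and the surrounding element names, the sample value, and the function name tryExtractRate are assumptions made for the example.

```fsharp
// In an .fsx script on the full .NET Framework you may need: #r "System.Xml.Linq"
open System.Globalization
open System.Xml.Linq

// Hypothetical sample response: only the <Rate> element matters here,
// the surrounding element names are assumptions for illustration.
let sample =
    """<query><results><rate id="GBPUSD"><Rate>1.2486</Rate></rate></results></query>"""

// Find the first <Rate> element anywhere in the document and parse its value.
// Returns None when the document contains no <Rate> element.
let tryExtractRate (xml: string) : decimal option =
    let doc = XDocument.Parse xml
    doc.Descendants(XName.Get "Rate")
    |> Seq.tryHead
    |> Option.map (fun node ->
        System.Decimal.Parse(node.Value, CultureInfo.InvariantCulture))

printfn "%A" (tryExtractRate sample)
```

Returning an option keeps the "no rate found" case explicit, which is handy when the function runs unattended on a timer.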
Posting to Slack isn’t very difficult either. I created my own personal Slack at mathias-brandewinder, where I can talk to myself quietly, as well as test examples like this one. I then created a webhook, by going to Incoming WebHooks and picking a channel to post to. I created a channel #exchange_rate for the occasion. Once the setup is done, you get a WebHook URL, which looks like https://hooks.slack.com/services/S0meL0ngCrypt1cK3y, where you can now POST JSON messages.
So the second part of our job will be to take that rate, create a JSON message and POST it.
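To make that second part concrete, here is a small sketch of building the message and posting it with HttpClient. Slack incoming webhooks accept a JSON body with a text field. The function names (escapeJson, slackMessage, postToSlack) and the message wording are assumptions for the example, and the URL is the placeholder shown above, not a working endpoint.

```fsharp
// In an .fsx script on the full .NET Framework you may need: #r "System.Net.Http"
open System.Net.Http
open System.Text

// Escape the characters that would break a JSON string literal.
let escapeJson (s: string) =
    s.Replace("\\", "\\\\").Replace("\"", "\\\"").Replace("\n", "\\n")

// Build the JSON payload: a single "text" field holding the message.
let slackMessage (rate: decimal) =
    let text = sprintf "USD/GBP rate: %M" rate
    sprintf """{"text":"%s"}""" (escapeJson text)

// Placeholder URL from the text above; replace it with your own WebHook URL.
let webhookUrl = "https://hooks.slack.com/services/S0meL0ngCrypt1cK3y"

// POST the message (blocking) and return the HTTP status code.
let postToSlack (url: string) (json: string) : int =
    use client = new HttpClient()
    use content = new StringContent(json, Encoding.UTF8, "application/json")
    let response = client.PostAsync(url, content).Result
    int response.StatusCode

// Usage, once webhookUrl points at a real webhook:
// slackMessage 1.2486m |> postToSlack webhookUrl |> printfn "Status: %i"
```

Keeping the message construction separate from the HTTP call makes the payload easy to check without hitting Slack.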
10 Jan 2017
About 2 years ago, I wrote a little application, @fsibot. @fsibot is a Twitter bot which, when it receives a Tweet that is a valid F# expression, will evaluate it and return the result to the sender. Got to code FizzBuzz in an interview? Impress your audience, and send a Tweet from your cell phone to @fsibot:
It was very fun to write, rather pointless, but turned out to be an interesting exercise, which taught me a lot. And, in spite of its simplicity, it’s a decent sample app, which touches on many aspects a real-world app might encounter.
After some hiccups early on, @fsibot has been running pretty smoothly, until I noticed issues recently. Rather than trying to figure out what the hell was going on, I decided to port it over to Azure Functions, which sounded like a better fit for it. While at it, I also made a couple of changes to the bot. If you are interested, you can find the code on GitHub.
25 Sep 2016
The intent of this post is primarily practical. During the Kaggle Home Depot competition, we ended up using the Random Forest implementation of ALGLIB, which worked quite well for us. Taylor Wood did all the work figuring out how to use it, and I wanted to document some of its aspects, as a reminder to myself, and to provide a starting point for others who might need a Random Forest from F#.
The other reason I wanted to do this is that I have been quite interested lately in the idea of developing a DSL to specify a machine learning model, which could be fed to various algorithm implementations via simple adapters. In that context, I thought taking a look at ALGLIB and how they approached data modelling could be useful.
I won’t discuss the Random Forest algorithm itself; my goal here will be to “just use it”. In order to do this, I will be using the Titanic dataset from the Kaggle “Learning From Disaster” competition. I like that dataset because it’s not too big, but it hits many interesting problems: missing data, features of different types, … I will be using it in two ways, for classification (as is usually the case), but also for regression.
Let’s dive into the ALGLIB random forest. The library is available as a nuget package, alglibnet2. To use it, simply reference the assembly #r @"alglibnet2/lib/alglibnet2.dll"; you can then immediately train a random forest, using the alglib.dfbuildrandomdecisionforest method - no need to open any namespace. The training method comes in 2 flavors, alglib.dfbuildrandomdecisionforest and alglib.dfbuildrandomdecisionforestx1. The first one is a specialization of the second one, which takes an additional argument; therefore, I’ll work on the second, most general version.
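Before calling either method, the data has to be laid out the way ALGLIB expects: a 2D array with one row per observation, the feature columns first and the label in the last column. The sketch below only prepares that array, so it runs without the library. The Passenger record and its values are hypothetical stand-ins for the Titanic data, and the commented-out call at the end reflects my reading of the ALGLIB documentation (including the argument order), so check it against the version you reference.

```fsharp
// Hypothetical passenger record, standing in for a row of the Titanic dataset.
type Passenger = { Age: float; Fare: float; Survived: bool }

let passengers = [
    { Age = 22.0; Fare = 7.25; Survived = false }
    { Age = 38.0; Fare = 71.28; Survived = true }
    { Age = 26.0; Fare = 7.92; Survived = true } ]

// One row per observation: feature columns first, label in the last column.
// For classification, the label is a class index (0.0, 1.0, ...).
let trainingSet : float[,] =
    passengers
    |> List.map (fun p -> [ p.Age; p.Fare; (if p.Survived then 1.0 else 0.0) ])
    |> array2D

let sampleSize = Array2D.length1 trainingSet      // number of rows
let features = Array2D.length2 trainingSet - 1    // columns, minus the label

// With alglibnet2 referenced, training would then look roughly like this
// (assumed argument order: data, rows, features, classes, trees,
// features tried per split, proportion of the sample used per tree):
// let info, forest, report =
//     alglib.dfbuildrandomdecisionforestx1(
//         trainingSet, sampleSize, features, 2, 50, 1, 0.5)
```

For regression, the last column would hold the value to predict instead of a class index.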
03 Sep 2016
Today, we’ll close our exploration of Gradient Boosting. First, we looked into a simplified form of the approach, and saw how to combine weak learners into a decent predictor. Then, we implemented a very basic regression tree. Today, we will put all of this together. Instead of stumps, we will progressively fit regression trees to the residuals left by our previous model; and rather than using plain residuals, we will leverage DiffSharp, an F# automatic differentiation library, to generalize the approach to arbitrary loss functions.
I won’t go back over the whole setup again here; instead I will just recap what we have at our disposal so far. Our goal is to predict the quality of a bottle of wine, based on some of its chemical characteristics, using the Wine Quality dataset from the UCI Machine Learning repository. (References: P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.)
Gist available here
We are using a couple of types to model our problem:
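open FSharp.Data

// Typed access to the semicolon-delimited CSV file
type Wine = CsvProvider<"data/winequality-red.csv",";",InferRows=1500>
type Observation = Wine.Row
// A feature extracts a number from an observation
type Feature = Observation -> float
// An example is an observation with its known value
type Example = Observation * float
type Predictor = Observation -> float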
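To show how these types fit together, here is a minimal sketch of a single boosting step on plain residuals. So that it runs without the CSV file, Observation is replaced by a hypothetical one-field record; the sample values, the threshold, and the function names (residuals, learnStump, sumOfSquares) are made up for the example and are not the code from the earlier posts.

```fsharp
// Stand-in for Wine.Row, so the sketch runs without the CSV file.
type Observation = { Alcohol: float }
type Feature = Observation -> float
type Example = Observation * float
type Predictor = Observation -> float

// Made-up sample: (observation, quality) pairs.
let sample : Example list = [
    ({ Alcohol = 9.0 }, 5.0)
    ({ Alcohol = 10.0 }, 5.0)
    ({ Alcohol = 12.0 }, 7.0)
    ({ Alcohol = 13.0 }, 7.0) ]

// Residuals: what the current predictor still gets wrong on each example.
let residuals (predictor: Predictor) (examples: Example list) : Example list =
    examples |> List.map (fun (obs, value) -> obs, value - predictor obs)

// Weak learner: a stump that splits on a feature at a fixed threshold
// and predicts the average target value on each side.
let learnStump (feature: Feature) (threshold: float) (examples: Example list) : Predictor =
    let average xs = if List.isEmpty xs then 0.0 else List.averageBy snd xs
    let low, high = examples |> List.partition (fun (obs, _) -> feature obs <= threshold)
    let lowValue, highValue = average low, average high
    fun obs -> if feature obs <= threshold then lowValue else highValue

// Start from the average quality...
let baseline : Predictor =
    let avg = sample |> List.averageBy snd
    fun _ -> avg

// ...then add a stump fitted to the residuals left by the baseline.
let boosted : Predictor =
    let stump = learnStump (fun obs -> obs.Alcohol) 11.0 (residuals baseline sample)
    fun obs -> baseline obs + stump obs

let sumOfSquares (predictor: Predictor) =
    sample |> List.sumBy (fun (obs, value) -> pown (value - predictor obs) 2)

printfn "baseline: %.2f, boosted: %.2f" (sumOfSquares baseline) (sumOfSquares boosted)
```

On this toy sample the stump captures the residuals exactly, so the error drops to zero after one step; on real data each step only removes part of the error, which is why the process is repeated.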
28 Aug 2016
One of the reasons I use F# so much is that it’s an awesome scripting language to Get Stuff Done. Case in point: this blog. I recently decided to switch from BlogEngine.NET to Jekyll, which meant porting over nearly 9 years of blog posts (about 300), extracting html-formatted content from SQL and converting it to markdown. After a couple of weeks of doing this manually, I realized that at the current cadence, it would take me about a year to complete, and that by then I would probably have lost my mind out of boredom. Time for some automation with F# scripts!