20 May 2026
With a return to dungeon master duties, making maps has made a come back in my
weekend activities, and I started revisiting an old side-project of mine,
drawing mountains in a style similar to topographic maps. The aspect I am
mostly interested in is not the contour lines, but rather the usage of shadows
to visualize the relief.
My goal here is as follows: given a grid of altitudes describing a terrain, can
I draw it and suggest the relief by rendering the shadows of mountains?
I am pretty certain that there must exist solutions to this. I am less
interested in the result than in understanding lights and shadows. In other
words, this project is entirely pointless, except as an exploration exercise!
In aprevious series of posts, I used SVG to draw geometric figures and
found it reasonably enjoyable, so that’s what I decided to use.
One aspect of that previous project wasn’t very pleasant, though: the process
of generating documents by manually running a scripts to create a html file,
and opening it it the browser to see the results. So I figured I would try
something different, and use this as an excuse to give
Bolero, the F# WebAssembly library, a spin.
Initial setup
The setup of Bolero was completely straightforward:
- install the template,
- create a project, using the
--minimal option,
- start the server with
dotnet watch run,
- go to the browser, a basic elmish app is running, with hot-reload.
More...
06 May 2026
This post is a continuation of my exploration of the RANSAC algorithm. In
my previous post, I began investigating if I could auto-tune some of the
input parameters. The first attempt was not a success, but gave me an idea,
which will be today’s post.
As a quick recap, RANSAC is a method to estimate a model in the presence of
noisy data (so-called outliers). The method requires 2 input parameters:
t, “A threshold value to determine data points that are fit well by the
model (inlier)”,
d, “The number of close data points (inliers) required to assert that the
model fits well to the data”.
Rather than having to specify myself these 2 arguments, I would like the model
to figure out good values by itself. In my initial attempt I tried to directly
estimate the proportion of inliers in the dataset. This time, I will try a
different angle: what if I started from pessimistic estimates, assuming many
outliers and very high noise, and progressively tightened up the estimates?
More...
08 Apr 2026
In my previous posts, I looked into the RANSAC algorithm. One thing I
wondered about is, could the algorithm usage be made simpler by automatically
tuning some of its input parameters? That is, rather than requiring the user to
enter parameters, can we derive reasonable values for these parameters from the
data? In this post, I will go over my first attempt. This was not a success,
in that it did not produce immediately useful results, but it gave me a better
understanding of the algorithm, and the process was interesting, so that’s what
this post will focus on.
In the algorithm as described in Wikipedia, 2 parameters stand out in
particular:
t, “A threshold value to determine data points that are fit well by the
model (inlier)”,
d, “The number of close data points (inliners) required to assert that the
model fits well to the data”.
My thinking here was that, given a particular model, I should be able to derive
these numbers from the data itself. t, the threshold value, is about how
far predictions are from the correct value, something we can measure. d is
about how many predictions should be close to their actual values overall. It
is related to a different measure, the proportion of outliers. If I had, say,
20% of outliers in my sample, I would expect roughly 80% of my model
predictions to be close to the correct value.
RANSAC assumes that you, the user, will input these values. What I wondered
about is, if all I had was a model and a dataset, how I would go about
evaluating the value of d. Or, stated differently, could I estimate
p(outliers), the proportion of outliers in a dataset?
More...
18 Mar 2026
In our previous installment, we set up the stage for our exploration of
the RANSAC algorithm, illustrating how traditional linear regression could
perform quite badly on a dataset with many suspect observations. In a nutshell,
the estimation penalizes large prediction errors more heavily, and as a result,
it can over compensate for a few very large errors (so called “outliers”), with
a poor model overall.
This issue is particularly prevalent with “standard” linear regression, because
it evaluates goodness of fit using the square of the prediction errors, which
mechanically penalizes heavily larger errors. However, it applies beyond that
specific case. Most estimation techniques proceed by minimizing some
measurement of prediction error, and, as large errors are worse than small
ones, will usually penalize large errors more heavily, introducing the same
issue of sensitivity to outliers.
The RANSAC algorithm attempts to address that issue, using a few simple ideas
together. In pseudo-code, slightly modified from the Wikipedia version,
the key part of the algorithm goes like this:
- Take a small random subset of the data available, hoping it contains mostly
"clean" observations, so-called inliners,
- Estimate a prediction model on that small sample,
- Use that prediction model on the entire data. If the error on an observation
is too large, discard it as an outlier, otherwise keep the observation as a
potential inliner,
- If the overall number of inliners is too small (i.e. we have too many
outliers), this model is a bad fit, discard it,
- Otherwise estimate a model on the inliners, and return the result.
More...
05 Mar 2026
Recently, as part of a project I am working on, I had to estimate a model to
make some predictions. And, as is usually the case, the data available was not
very good, with a lot of suspect data points, so called “outliers”. Estimating
the parameters of a model becomes tricky then, because a few data points that
are very wrong can have an outsized impact on the parameters, and tilt the
results of the estimation far off the correct values.
This lead me to the RANSAC algorithm, which is designed to handle that
exact problem. And, in my experience, re-implementing an algorithm from scratch
is a great way to really understand how it works, so that’s what I did: you can
find the current code here.
In this post, I will motivate the problem first, demonstrating how outliers
can wreck model estimation. In our next installment, we will go over RANSAC and
see if it fares any better!
Let’s illustrate the problem on a simple example. Imagine that we have a model,
where we observe a value X and a value Y, and that these 2 follow a simple
relationship where
Y = 1 + 0.5 * X
That is, they form a straight line. Now imagine that we have a dataset with a
sample of observations for X and Y, with 2 caveats:
- In half the cases, the observation is completely wrong, and there is no
relationship between
X and Y,
- When the observation is correct, the observation is a bit off, and the
recorded values are
Y = 1 + 0.5 * X + some noise.
Let’s first create a model to represent that:
type Obs = { X: float }
type Parameters = {
Constant: float
Slope: float
}
let predictor (parameters: Parameters) =
fun (obs: Obs) ->
parameters.Constant
+ parameters.Slope * obs.X
let trueParameters = {
Constant = 1.0
Slope = 0.5
}
We have observations Obs and model Parameters.
The “true” model has parameters { Constant = 1.0; Slope = 0.5 }, which we can
use to predict the value Y like so:
predictor trueParameters { X = 1.0 }
val it: float = 1.5
More...