Field notes on integrating an LLM in a Business Workflow

For many reasons, I am not a fan of the current hype around Large Language Models (LLMs). However, a few months ago, I was asked to work on a project to evaluate using LLMs for a practical use case. I figured this would be an interesting opportunity to see for myself what worked and what didn’t, and perhaps even change my mind on the overall usefulness of LLMs.

In this post, I will go over some of the things I found interesting.

Caveat: I have a decent knowledge of Machine Learning, but this was my first foray into LLMs. As a result, this post should not be taken as competent advice on the topic. It is intended as a beginner’s first impressions.

Context

The client - let’s call them ACME Corp - produces and distributes many products all over the world. Plenty of useful information about these products, such as inventory or shipments, is available in a database. Unfortunately, most employees at ACME Corp have neither access to that database, nor a good enough grasp of SQL (or of the database itself), to make use of that information.

The idea, then, was to explore whether, by using LLMs, we could give users a way to access that information in their own language (“what is the current inventory of sprockets model 12345 in Timbuktu”), without the hurdle of writing complex SQL queries. And, because ACME Corp is international, “in their own language” is meant quite literally: the question could be asked in English, as well as in a wide range of other languages.

At a high level, we want something like this:

diagram of the workflow: convert a request to a query and respond

Given the time budget on the project, we did not have the option to fine-tune a model for our domain, and used a “stock” LLM.
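At its core, the workflow in the diagram boils down to three steps: turn the question into SQL, run the SQL, and phrase the result back in the user’s language. As a rough sketch of that shape - purely illustrative, with hypothetical function names and none of the real implementation details:

```fsharp
// Hypothetical sketch of the workflow; all names here are illustrative.
// generateSql would call the LLM with the database schema and the question,
// runQuery would execute the generated SQL against the database, and
// formatAnswer would phrase the raw result in the user's own language.

type Request = { Question: string }
type Answer = { Text: string }

let answer
    (generateSql: string -> string)
    (runQuery: string -> string)
    (formatAnswer: string -> string)
    (request: Request) : Answer =
    request.Question
    |> generateSql   // natural language -> SQL
    |> runQuery      // SQL -> raw result
    |> formatAnswer  // raw result -> response in the user's language
    |> fun text -> { Text = text }
```

Keeping each step a plain function makes it easy to swap out pieces - for instance, replacing the LLM calls with stubs when testing the plumbing.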

More...

Maximum Likelihood estimation with Quipu, part 2

In my previous post, I went over fitting the parameters of a Log-Normal distribution to a sample of observations, using Maximum Likelihood Estimation (MLE) and Quipu, my Nelder-Mead solver. MLE was overkill for the example I used, but today I want to illustrate some more interesting things you could do with MLE, building up from the same base setup.

Let’s do a quick recap first. I will be using the following libraries:

#r "nuget: MathNet.Numerics, 5.0.0"
#r "nuget: MathNet.Numerics.FSharp, 5.0.0"
#r "nuget: Plotly.NET, 5.0.0"
#r "nuget: Quipu, 0.5.2"

Our starting point is a sample of 100 independent observations, generated by a Log-Normal distribution with parameters Mu=1.3 and Sigma=0.3 (which describe the shape of the distribution), like so:

open MathNet.Numerics.Random
open MathNet.Numerics.Distributions

let mu, sigma = 1.3, 0.3
let rng = MersenneTwister 42
let duration = LogNormal(mu, sigma, rng)

let sample =
    duration.Samples()
    |> Seq.take 100
    |> Array.ofSeq

LogNormal distribution and histogram of the sample

If we want to find a distribution that fits the data, we need a way to compare how well two distributions fit the data. The likelihood function does just that: it measures how likely it is that a particular distribution could have generated the sample - the higher the number, the higher the likelihood:

let logLikelihood sample distributionDensity =
    // sum the log-density of every observation: this is the
    // log-likelihood of the sample under that distribution
    sample
    |> Array.sumBy (fun observation ->
        observation
        |> distributionDensity
        |> log
        )
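As a quick sanity check, we can score our sample against the distribution that actually generated it, using the definitions above (MathNet distributions expose their density through the `Density` method):

```fsharp
// Score the sample against the density of the true distribution.
// Uses sample, duration and logLikelihood as defined above.
let trueFit = logLikelihood sample (fun x -> duration.Density x)
```

Any candidate distribution we try should be judged against this same yardstick: a better fit means a higher log-likelihood.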
More...

Maximum Likelihood estimation with Quipu, part 1

Back in 2022, I wrote a post about using Maximum Likelihood Estimation with DiffSharp to analyze the reliability of a production system. Around the same time, I also started developing - and blogging about - Quipu, my F# implementation of the Nelder-Mead algorithm.

The two topics are related. Using gradient descent with DiffSharp worked fine, but wasn’t ideal: for my purposes, it was too slow, and the gradient-based approach was more complex than I needed. This led me to investigate whether a simpler maximization technique like Nelder-Mead would do the job, which in turn led me to develop Quipu.

Fast forward to today: while Quipu is still in pre-release, its core is fairly solid now, so I figured I would revisit the problem, and demonstrate how you could go about using Quipu on a Maximum Likelihood Estimation (or MLE in short) problem.

In this post, we will begin with a simple problem, to set the stage. In the next installment, we will dive into a more complex case, to illustrate why MLE can be such a powerful technique.

The setup

Imagine that you have a dataset recording when a piece of equipment experienced failures. Perhaps you are interested in simulating that piece of equipment, and therefore want to model the time elapsed between failures. As a starting point, you plot the data as a histogram, and observe something like this:

histogram of observations

It looks like observations fall between 0 and 8, with a peak around 3.

What we would like to do is estimate a distribution that fits the data. Given the shape we are observing, a LogNormal distribution is a plausible candidate: it takes only positive values, which we would expect for durations, and its density climbs to a peak and then decreases slowly, which is what we observe here.
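For reference, the density of a LogNormal distribution with parameters $\mu$ and $\sigma$ is:

$$f(x; \mu, \sigma) = \frac{1}{x \, \sigma \sqrt{2 \pi}} \exp \left( - \frac{(\ln x - \mu)^2}{2 \sigma^2} \right), \quad x > 0$$

In other words, if $X$ follows a LogNormal distribution, then $\ln X$ follows a Normal distribution with mean $\mu$ and standard deviation $\sigma$. Estimating the distribution means finding the values of $\mu$ and $\sigma$ that best explain the data.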

More...

Delaunay triangulation with Bowyer-Watson: initial super triangle, revisited

In our last installment, I hit a roadblock. I attempted to implement Delaunay triangulations using the Bowyer-Watson algorithm, followed this pseudo-code from Wikipedia, and ended up with a mostly working F# implementation. Given a list of points, the code produces a triangulation, but occasionally the outer boundary of the triangulation is not convex, displaying bends towards the inside - something that is never supposed to happen for a proper Delaunay triangulation.

While I could not figure out the exact issue, by elimination I narrowed it down a bit. My guess was that the issue was a missing, unstated condition, probably related to the initial super-triangle. As it turns out, my guess was correct.

The reason I know is, a kind stranger on the internet reached out with a couple of helpful links (thank you!):

Bowyer-Watson algorithm: how to fill “holes” left by removing triangles with super triangle vertices
Bowyer-Watson algorithm for Delaunay triangulation fails, when three vertices approach a line

The second link in particular mentions that the Wikipedia page is indeed missing conditions, and suggests that a valid initial super triangle should satisfy the following property:

it seems that one should rather demand that the vertices of the super triangle have to be outside all circumcircles of any three given points to begin with (which is hard when any three points are almost collinear)

That doesn’t look overly complicated: let’s modify our code accordingly, and check if this fixes our problem!
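The property hinges on circumcircle membership tests. For reference, the standard way to check whether a point lies inside the circumcircle of a triangle is a determinant test - this is a generic textbook version, not the code from the post, and the `Point` record mirrors the one used in this series:

```fsharp
type Point = { X: float; Y: float }

// Standard in-circumcircle determinant test: for a triangle (a, b, c)
// listed in counter-clockwise order, the determinant is positive exactly
// when d lies strictly inside the circumcircle of the triangle.
let inCircumcircle (a: Point) (b: Point) (c: Point) (d: Point) =
    let ax, ay = a.X - d.X, a.Y - d.Y
    let bx, by = b.X - d.X, b.Y - d.Y
    let cx, cy = c.X - d.X, c.Y - d.Y
    let det =
        (ax * ax + ay * ay) * (bx * cy - cx * by)
        - (bx * bx + by * by) * (ax * cy - cx * ay)
        + (cx * cx + cy * cy) * (ax * by - bx * ay)
    det > 0.0
```

The quoted condition then amounts to requiring that, for every choice of three sample points, none of the super triangle’s vertices passes this test.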

More...

Delaunay triangulation: hitting an impasse

In recent weeks, I have been making slow but steady progress implementing Delaunay triangulation with the Bowyer-Watson algorithm. However, as I mentioned in the conclusion of my previous post, I spotted a bug. I hoped it would be an easy fix, but so far no such luck: it has me stumped. In this post, I will go over how I approached figuring out the problem, which is interesting in its own right. My hope is that, by talking it through, I might either get an idea, or perhaps someone else will spot what I am missing!

Anyways, let’s get into it. Per the Wikipedia entry:

a Delaunay triangulation […] of a set of points in the plane subdivides their convex hull into triangles whose circumcircles do not contain any of the points; that is, each circumcircle has its generating points on its circumference, but all other points in the set are outside of it. This maximizes the size of the smallest angle in any of the triangles, and tends to avoid sliver triangles.

The Bowyer-Watson algorithm seemed straightforward to implement, so that’s what I did. For those interested, my current code is here, and it mostly works. Starting from a list of 20 random points, I can generate a triangulation like so:

let points =
    let rng = Random 0
    let size = 20
    List.init size (fun _ ->
        {
            X = rng.NextDouble() * 100.
            Y = rng.NextDouble() * 100.
        }
        )
points
|> BowyerWatson.delaunay

With a bit of tinkering, I can render the result using SVG:
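The rendering step might look something like this - a hypothetical sketch, not the actual code from the repository, assuming a `Triangle` record that carries its three corner points (all names here are illustrative):

```fsharp
type Point = { X: float; Y: float }
type Triangle = { A: Point; B: Point; C: Point }

// Render a list of triangles as a minimal SVG document:
// one <polygon> element per triangle, drawn as an outline.
let toSvg (triangles: Triangle list) =
    triangles
    |> List.map (fun t ->
        let corners =
            [ t.A; t.B; t.C ]
            |> List.map (fun p -> $"{p.X},{p.Y}")
            |> String.concat " "
        $"""<polygon points="{corners}" fill="none" stroke="black" />""")
    |> String.concat "\n"
    |> fun body ->
        $"""<svg viewBox="0 0 100 100" xmlns="http://www.w3.org/2000/svg">{body}</svg>"""
```

Saving the resulting string to a file is enough to inspect the triangulation in a browser.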

This looks like what I would expect: all the points are connected in a mesh of triangles.

More...