Four Lessons We Can Learn from the Goats (or the Lions)

I recapitulate the goats, wolves and lions.puzzle, my pencil and paper proof that the solution is 23 and Sascha's proof utilizing the Wolfram language. I think that this puzzle and the different ways how we obtained the solution give relevant hints for the treatment of mathematical problems in general, and, of course, also for the treatment of mathematical finance problems.

Lesson 1: Analytic solutions are the preferred ones.
In the goats, wolves and lions case, as soon as we have deducted the importance of even and odd sums, we can obtain the solution within the blink of an eye, independent if the inital numbers are (17,55,6) or (1000017, 1000055, 1000006).
In finance, e.g., the Black-Scholes closed form solutions for options are certainly (within the Black Scholes world) to be preferred over binomial trees (other reasons why not to use trees are here ) or over finite elements numerical solutions. A more relevant example from finance: Pencil and paper work shows that Americal call options should not be exercised early when no discrete or continuous dividends occur. In this example, analysis removes the tedious work of treating American options.

Lesson 1b: Caveat in the case of inverse problems.
We have seen in the local volatility exanmple that analytic solutions for inverse problems may be dangerous in the presence of noisy data.

Lesson 2: Brute force algorithms should be chosen carefully.
We have presented two ways to solve the goats, wolves and lions problem by brute force: (1) either generate all possible meal sequences (leading to 3^77 words in the alphabet in the worst case) or (2) generate all possible states of animal populations (with 77^3 being an upper bound). Obviously (2) is the more efficient way.
Note that also Sascha's functional program implicitly contains both ways. It was essential for perfomance reasons to clean future states of forests from multiple identical states. This cleaning is possible due to the recombination of the meal trees.
In finance, we know that brute force American Monte Carlo (generate a new starting point of an MC simulation at every exercise date and every state then) leads to exploding computing times. Least squares Monte Carlo (the Longstaff -Schwartz style) is the preferred "soft brute force" algorithm.

Lesson 3: Algorithms beat languages.
Sascha and I obtained similar computation times around a second with his Wolfram code and my Fortran program and we would both have failed with algorithm (1) from Lesson 2. Of course, some languages are more equal than the others, but key are the underlying algoirthms.

Lesson 4: Elegance produces enthusiasm.
Nevertheless (and this is not an easy outing for a real Fortran programmer), Sascha's functional solution is much more elegant and readable than my Fortran DO loops and IF statements.
You don't have to be a programmer for using the UnRisk language and produce the results (values, risk figures) you need for a fit and proper risk management.

A Language To Program Everything

is the title of an article in Wired Magazine UK Edition, May-14. It is about Wolfram Language.

In his post Functional Goats, Wolves and Lions, Sascha provides a first glance how its functional programming style can be used to build a language bottom up.

Knowledge-based programming

This is, how Stephen Wolfram describes it. It comes from the concepts, design and technologies that are  realized in Mathematica (already providing a symbolic language for everything) and Wolfram|Alpha (the computational knowledge engine).

It is about describing things in computational terms.

A symbolic language

The idea of symbolic languages is to represent things in the same way. A symbol can describe a mathematical expression, graph, geometrical object, movie, … and even a program. A representation in the symbolic way means, they are pieces of data.

In the context of the puzzle, Sascha decided, to solve the problem without mathematical shortcuts. For readability.

The algorithmic knowledge base

But for many problems algorithmic maths is indispensable. And knowledge-based programming needs and algorithmic knowledge base. And you are lucky if you have organized things orthogonally. Their description, models and methods. This makes it easy to describe things in computational terms.

Knowledge based programming means: yours system needs a symbolic Language and an Engine implementing it.

UnRisk Financial Language atop Wolfram Language

13 years ago, we made a fundamental decision. To wrap our derivative valuation engine with Mathematica language. We inherited the symbolic language constructs and an algorithmic knowledge base. The other way around: we extended Mathematica into the world of derivative and risk analytics.

From that date on we built UnRisk in a bottom-up fashion. The financial language and the algorithmic base become richer and richer. And it gets easier and easier to make the system platform agnostic, inherently parallel, in the cloud, ….

And any quant developer can do the same. In an extremely short amount of time. Like us.

UnRisk a symbolic and numerical computation engine

To give full explanation on how it works, we have established the UnRisk Academy. One of its reference classes goes to Frankfurt and Zurich soon. A Workout in Computational Finance

Picture from sehfelder

Functional Goats, Wolves and Lions

Using the Wolfram Language as a programming language, we are going to develop a program that solves the goats, wolves and lions problem in a functional programming style. At the time of this writing the only publicly available implementation of the Wolfram language is for the Raspberry Pi.

Unlike procedural and object-oriented programming, functional programming lets you approach problems in a bottom-up fashion. By adding functions, you build the language up toward your program. For the given puzzle, we will just use individual sentences from the specification of the problem and turn these into corresponding functions that build up our program.

"There are three species of animals in a magic forest: lions, wolves and goats. At the very beginning, there are 17 goats, 55 wolves and 6 lions in the forest."

We need a model for a forest in our language. The Wolfram language contains an expression called association which suits our need to model a forest perfectly. It represents an association between keys and values. The keys will be the different species in the forest; the values will be number of animals:
<|"Goats" -> 17, "Wolves" -> 55, "Lions" -> 6|>

"A correct answer contains the number of animals left and a devouring strategy leading to this number."

Because the solution needs to contain the devouring strategy, we also add a "strategy" key to our model of a forest. The value will be list of meals leading to the number of animals in the given forest. This lets us define the initial forest in the following way:
initialForest = <|"Goats"->17,"Wolves"->55,"Lions"->6,"Strategy"->{}|>;

Note that our model of a forest is only complete with respect to the given problem we have to solve. We do not add trees to make the tree huggers happy. We do not add an evil witch, either, although one could argue that any magic forest must be the home of an evil witch.

"A wolf, after having devoured a goat, is transmuted into a lion; a lion, after having devoured a goat, is transmuted into a wolf; and a lion having devoured a wolf becomes a goat."

In the context of the puzzle, a meal is an action that changes the state of the forest. Thus it seems natural to model a meal as a function which returns the new state of a forest given an existing forest state. In functional programming data is immutable, therefore we create a new association expression representing the state of the forest after the meal.

If[forest["Wolves"]>0 && forest["Goats"]>0,

The function WolfDevoursGoat first checks if this form of meal is possible at all and if it is, returns a new forest state by adjusting the animal counters and adding itself to the strategy list. Let's run the function on the initial forest:

In[1]:= WolfDevoursGoat[initialForest]
Out[1]= <|"Goats"->16,"Wolves"->54,"Lions"->7,"Strategy"->{WolfDevoursGoat}|>

We can now define functions for the other two forms of meals LionDevoursGoat and LionDevoursWolf in a similar fashion.

For a given forest, three different form of meals can be applied to it. We add a Meal function which lets us compute all possible forests that may be result from a given forest after a meal:

Meal[forest_Association] := Flatten[{
    WolfDevoursGoat[forest], LionDevoursWolf[forest], LionDevoursGoat[forest]

The possible forests are returned in a list. Let's run the functions on the initial forest:

In[2]:= Meal[initialForest]
Out[2]= {

After the initial meal we are now dealing with a list of forests. Therefore we add another version of the Meal function which also works for a list of forests:

Meal[forests : {_Association ..}] := 
    Flatten[Map[Meal, forests]]

The functions just maps the Meal function for a single forest to all forests in the given list and the flattens out the result. We can now do a nested invocation of the Meal function:

In[3]:= Meal[Meal[initialForest]]
Out[3]= {

Looking at the result, there seem to be similar forests in the result list. After two meals we may end up with 17 goats, 55 wolves and 4 lions in two different ways. The strategies {LionDevoursGoat, LionDevoursWolf} and {LionDevoursWolf, LionDevoursGoat} lead to the same number of animals for each species. Since we only have to return a single strategy that leads to a solution, there is no point in keeping more then one forest with the same number of animals for each species in the list. We thus redefine our Meal function for lists to eliminate duplicates:

    forest1["Wolves"] == forest2["Wolves"] && 
    forest1["Goats"]  == forest2["Goats"] && 
    forest1["Lions"]  == forest2["Lions"]

The ForestEqualQ function is a helper function which compares two forests for equality by extracting the species counts and comparing them with each other. The Meal function now uses the function DeleteDuplicates to eliminate redundant forests.

"After every meal, there is one animal fewer than before; therefore after some time, there is no devouring possible any more."

Our job now is to nest Meal invocations until no devouring is possible any more. A DevouringPossible function can be easily defined in terms of the Meal function we already have. Devouring for a given forest is only possible if the Meal function returns a non-empty set of new forests:


For a list of forests, devouring is possible if for all underlying forests devouring is still possible. Otherwise we must have found the solution. Again we use Map to apply the DevouringPossible to each forest in the list, then we use Apply and And to compute the final Boolean result.


When devouring is no longer possible, we can extract a solution, i.e., a stable forest from the list of forest. We do that with a StableForests function:

StableForests[forests : {_Association ..}] :=
    Select[forests, Composition[Not, DevouringPossible]]

Given these functions the final function to compute the result almost writes itself :


Starting from a single forest, we nest invocations of Meal with NestWhile while devouring is still possible. Then we extract the solution with StableForests.

"A correct answer contains the number of animals left."

In[4]:= FindStableForests[initialForest]
Out[4]= {<|"Goats" -> 0, "Wolves" -> 0, "Lions" -> 23,
  "Strategy" -> {WolfDevoursGoat, WolfDevoursGoat, WolfDevoursGoat,
    WolfDevoursGoat, WolfDevoursGoat, WolfDevoursGoat,
    WolfDevoursGoat, WolfDevoursGoat, WolfDevoursGoat,
    WolfDevoursGoat, WolfDevoursGoat, WolfDevoursGoat,
    WolfDevoursGoat, WolfDevoursGoat, WolfDevoursGoat,
    WolfDevoursGoat, WolfDevoursGoat, LionDevoursWolf,
    WolfDevoursGoat, LionDevoursWolf, WolfDevoursGoat,
    LionDevoursWolf, WolfDevoursGoat, LionDevoursWolf,
    WolfDevoursGoat, LionDevoursWolf, WolfDevoursGoat,
    LionDevoursWolf, WolfDevoursGoat, LionDevoursWolf,
    WolfDevoursGoat, LionDevoursWolf, WolfDevoursGoat,
    LionDevoursWolf, WolfDevoursGoat, LionDevoursWolf,
    WolfDevoursGoat, LionDevoursWolf, WolfDevoursGoat,
    LionDevoursWolf, WolfDevoursGoat, LionDevoursWolf,
    WolfDevoursGoat, LionDevoursWolf, WolfDevoursGoat,
    LionDevoursWolf, WolfDevoursGoat, LionDevoursWolf,
    WolfDevoursGoat, LionDevoursWolf, WolfDevoursGoat,
    LionDevoursWolf, WolfDevoursGoat, LionDevoursWolf,

We see from the output above that 23 lions are left in a stable forest.
On an Intel Core 2 Duo machine, the function takes around 1.5 seconds to deliver the solution.

"A correct answer contains the a devouring strategy leading to this number."

Although the strategy can be easily extracted from the result of the FindStableForests function, we want to deliver the strategy in proper style. Since we are dealing with a magic forest, an appropriate style is to deliver it as a fairy tale:
MakeFairyTale[forest_Association] := With[{
   beginning = {
    "Once upon a time there was a forest.\n",
    "In the forest there lived `Goats` goats, `Wolves` wolves and `Lions` lions.\n"
   middle = TemplateExpression[
      {WolfDevoursGoat -> 
        "Then a wolf devoured a goat and turned into a lion.\n", 
       LionDevoursGoat -> 
        "Then a lion devoured a goat and was transmuted to a wolf.\n",
       LionDevoursWolf -> 
        "Then a lion devoured a wolf and became a goat.\n"}]
   end = TemplateObject[{
      TemplateIf[TemplateSlot["Goats"] > 0, 
        "The remaining `Goats` goats lived happily ever after."]],
      TemplateIf[TemplateSlot["Wolves"] > 0, 
        "The remaining `Wolves` wolves lived happily ever after."]],
      TemplateIf[TemplateSlot["Lions"] > 0, 
        "The remaining `Lions` lions lived happily ever after."]]
    TemplateApply[beginning, forest], 
    TemplateApply[{middle, end}, First[FindStableForests[forest]]]
In[5]:= MakeFairyTale[initialForest]
Out[5]= "Once upon a time there was a forest.
In the forest there lived 17 goats, 55 wolves and 6 lions.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
Then a lion devoured a wolf and became a goat.
Then a wolf devoured a goat and turned into a lion.
The remaining 23 lions lived happily ever after."

You can download a Mathematica notebook which contains all the code presented in this post. This notebook will be executable with Mathematica 10.

"A correct answer contains the proof that this is the maximal possible number of animals left."

We know:
(1) The lion is the king of the jungle.
(2) A jungle is a kind of forest.
(3) 23 is a magic number that movies have been made about.
(4) Thus from (1), (2) and (3), we know that 23 lions must be the maximal possible number of animals left in a magic forest.

Some Observations on the Functional Solution

Being true to functional programming spirit, the solution presented does not use any procedural loop constructs or variables. In the Wolfram Language, function parameters (e.g., forest) which may look like variables are actually constants. The only variable in the whole notebook is initialForest, which was just added as a convenience.

Note that in the solution there isn't a single recursive function invocation. It's possible to solve the given problem with a super terse recursive function, which will make the code look smart but whose drawback is that it will make the code unreadable. Recursion is an overused feature in functional programming. Niklaus Wirth said it best in his classic book "ALGORITHMS+DATA STRUCTURES=PROGRAMS": "The lesson to draw is to avoid the use of recursion when there is an obvious solution by iteration. ... The lesson is that algorithms which by their nature are recursive rather than iterative should be formulated as recursive procedures." For example, modeling a list as a recursive data structure is a very bad idea, which ultimately killed a programming language.

Also note that we deliberately avoided taking any mathematical shortcuts in the given solution. For example, the DevouringPossible function could be written more efficiently by testing if only one species is left. The program is fast enough without applying mathematical optimizations like these and again the program would only become a lot less readable. On top of that, adding unnecessary mathematical optimizations may add subtle bugs which may go undetected over years.

Convection - A Widespread Mechanism in Nature

Last week I derived the PDE starting from an SDE describing a family of short rate models (see here). When we take a look at the resulting PDE we see that it is go the kind of a reaction-diffusion-convection equation. In a blog post a few months ago I explained the effect of diffusion for the example of osmotic pressure (see Diffusion and Osmotic Pressure). Starting with this friday I will make some blog posts to show examples of the physical effect of convection in nature.

Before we can give examples we should make something like a qualitative definition what convection is. Convection is the transfer of internal energy into or out of an object by the physical movement of a surrounding fluid (liquids or a gases). This fluid transfers the internal energy along with its mass. Although the heat is initially transferred between the object and the fluid by conduction, the bulk transfer of energy comes from the motion of the fluid. Convection can occur spontaneously/naturally through the creation of convection cells. Forced convection occurs either when the fluid is propelled across the object or the object is propelled through the fluid.

From a more microscopic view convection is a collective movement of groups or aggregates of molecules within fluids and rheids, either through diffusion or by advection (the term advection sometimes serves as a synonym for convection, but technically, convection covers the sum of transport both by diffusion and by advection - advection itself is the transport mechanism related to the fluid's bulk motion).

We will close todays blog post with a very common example where we observe convection: Cooking a pot of water. The water at the bottom of the pot is close to the flame (heat source) and heats up (red arrows). It rises because it expands and has lower density. Then the water releases heat as it cools down at the top (blue arrows) and sinks down again.  Basically, the moving (circulating) water is a conveyor belt for heat transport from the hot flame to the cooler surface.


Next friday I will give some more examples of convection in nature and how this influences our live.

How I Failed Solving the Goats, Wolves and Lions Problem

The problem description. The solutions 3 ways to solve it.

I took the challenge and thinking that the students participating the mathematical kangaroo contest have about 2 1/2 minutes per problem, I thought 30 min would be fair looking for an elegant solution. Don't looking at the choices at first hand.

And I failed

When you start, It's just you and the problem and your ideas. And your ideas are influenced by your knowledge, skills and tools. I am an Algebraist by education, a kind of abstractonaut  driving things up to the "abstract nonsense" (Recall, the free algebraic structures in the freedom of a financial language).

Not bad for formal language structures, symbolic computation, functor programming, pattern matching programming, ….

However, my try: find a closed form solution. I did suppress that there are states and transitions and that a rule base that describes the constraints is also an analytic solution.

Not surprisingly, I did not find it. Time out

With Andreas' elegant solution you only need to know what even and odd numbers are and a simple inference engine for the rule base deducing new knowledge.

Andreas is sure that children of the age of 9 years could solve the problem with a few hints.

Children are "scientists" as well - outside "official"settings?

Deep, Beautiful and Elegant Explanations

After knowing the solution, I, by chance, looked for the current best sellers in science and maths and found: This Explains Everything - 150 Deep, Beautiful and Elegant Theories of How the Word Works, in short Essays.

There are all sorts of answers to: what is your favorite deep, beautiful or elegant explanation? One just wrote "keep it simple" and then crossed it out. One explained why "elegant=complex".

Edit: I need to confess a litte dishonesty: I started with a strategy (thinking wolves ..):

(1) 6 lions devour 6 goats:  (17, 55, 6) -> (11, 61, 0)
(2) 6 wolves
 devour 6 goats: (11, 61,0) ->(5, 55, 6)
(3) 6 lions devour . uuups

(and said to myself: stop trying, solve! - and the headache began)

BTW, this strategy had worked with 18 goats (not 17) leading to (0, 61, 0). What a nice puzzle!

Picture from sehfelder (the lonesome "solver" and the possible paths ;))

Three Ways to Solve the Goats, Wolves and Lions Problem

I refer to the Easter puzzle on goats, wolves and lions. Thanks for the responses I got.

To obtain the solution, I present three possible ways:
  • a non-feasible brute force solution,
  • a feasible brute force solution, and
  • an elegant, so to say, analytical solution.
Remember that the initial state is 17 goats, 55 wolves, 6 lions.
When two animals of different species are chosen for a meal, these two animals disappear and an animal of the third species is generated. Therefore, there are finitely many possible states and finitely many ways to achieve these states. Devouring stops when there are animals of only one species left.

Non-feasible brute force algorithm
We can describe every devouring step by the animal that is generated in this step. E.g., if a lion devours a wolf, the resulting animal is a goat, and the step is described by the letter G (goat). Therefore, we could generate  all words with fewer than 78 letters (78 is the initial total number of animals) of the three-letter alphabet {G, W, L} and decide which words are the shortest ones leading to the wiping out of two species. In the worst case, this would lead to 3^77 (=5.47 * 10^36)  words to be generated. Of course, there are much fewer possibilities (no word can start with "GGGGGGG" (why?)), but far too many to obtain reasonable computing times.

A feasible brute force algorithm
Instead of bookkeeping the possible  paths, the next algorithm keeps track of the possible states. So, starting from (17, 55, 6) (78 animals), we could obtain (16,54,7), (16, 56, 5) or (18, 54, 5) (77 animals) and so on, checking for 77,76, 75, .... animals, which combinations are attainable and which not. The algorithm stops as soon as a state is achieved which contains two species wtih 0 animals,
This algorithm needs less than one second on a single core and delivers (0, 0, 23) as the final state. One could also derive the devouring strategy leading to those 23 lions left.

The elegant solution
We know:
(1) the final state must contain two zero-species.
(2) Every meal reduces the number of animals in two species by 1 (each of them) and increases the number in the third species by 1.
(3) Thus (following from (2)), the combined number of any zwo species (goats and wolves, goats and lions, wolves and lions) either remains the same after a meal or is reduced by 2.
(4) If the final state should contain two zeros, then any earlier state of the two species to be wiped out must conatin an even sum for these two species. And the initial state as well.
(5) The only even sum obtained in the inital state is that from goats and wolves (17+55=72), Thus, the surviving species has to be lions.
(6) We know from (3) that goats + lions always stay below the initials number of goats + lions, i.e. 23. Therefore, 23 is an upper bound for the number of lions.
(7) If we can devlop a strategy ending up at (0, 0, 23), we are done. We also know that every devouring step has to contain a wolve. Otherwsie (if a lion ate a goat), 23 could not be achieved.

An optimal devouring strategy 
The optimal strategy is not unique. One possibility is:
(1) 17 wolves devour 17 goats:  (17, 55, 6) -> (0, 38, 23)
(2) 19 lions devour 19 wolves: (0, 38, 23) ->(19, 19, 4)
(3) 19 wolves devour 19 goats (19,19, 4) -> (0,0,23).

Next week, I will discuss the relevance of this example to computational finance.

Quants - In the Cage of Strategic Planning

Quants are often forced to be racers at critical paths

This may also be the result of a principle trap - strategic planning

Again, do organizations, quants work for, always know what that means?

A plan may let us sleep well, but it's not a strategy. A strategy is a framework of measures - not the measures themselves.

The trap of time- and cost-based thinking

To build a strategy is a wicked problem. It is about taking risks and facing an unknown future. It is uncomfortable, consequently many turn into a more comfortable activity: strategic planning. Planning is usually based on time- and cost-baesd thinking -  it rarely contributes to group performances.

The trap of self-reference

If you have avoided the planning and cost trap don't forget to focus on the value makers that influence your business most: creators, developers, actors. Don't overestimate capabilities and resources.
This thinking often let see you (too) great opportunities and encourage a bad strategy.

Get out of your comfort zone

It is comfortable to think everybody needs your plans. But you better focus, make your strategy simple, without searching for perfection and explain the rationale and logic explicit. This usually makes you uncomfortable?

But this is what quants need to unfold their ideas and creativity, find new ways to help conserving capital, set the stage of successful investment and risk management, leverage technology and build more quantitative skills.

Picture from sehfelder

A Little About Food Webs and Debt Webs

Andreas' mathematical puzzle Goats, Wolves and Lions reminded me on a general discussion, motivated by the complexity economy thinkers: about analogy from food webs (what eats what) to economic webs (who profits from whom) - debt webs, for example, would provide better insight to avoid bank's (true) systemic risk?

Mark Buchanan suggests a transfer of Google's PageRank algorithm into a DebtRank scheme could help reducing instability of the finical system - How Google's algorithm could fix the financial system.

Motivated by this I have written about Analogies, Metaphors and Big Problems

Who are the Lions, Wolves and Goates in an economic web?

This is not so easy to say, in a DebtRank the bank size is not as important as we might have thought. It is more the effect of transition: how wide would the spread of economies distress be, if a bank failed.

BTW, two physicist of Austria presented this article about how DebRank could be used to design a systemic risk transaction tax. It would maybe change the DebtRank in a way that financial institutions would never need to pay the tax. Lions and wolves may transmute?

DebtRank Value Adjustment?

Another thing comes to my mind. DVA is to reflect the market value of non-performace to the carrying value of fixed income securities issued by a company.

This needs exposure modeling, the prediction of future cash flows of a portfolio with a counter party - that needs the estimation of default potability …

This introduces much more complexity in the valuation space.

What, if we took into account the DebtRank of counter parties? With possible feedback loops?

p.s. the DebtRank metaphor has nothing to do with the solution of the puzzle.

Picture from sehfelder

Goats, Wolves and Lions

This is a puzzle from the Austrian 2014 mathematics kangaroo contest, taken from the students' level (16years and older), which was brought to me by a friend, who is high school teacher. I think, this is really a challenging and nice one. Remember that this puzzle is one of 30 puzzles (albeit the most difficult one), and the contestants have 75 minutes of time for all of them.
I will probably publish my solution on Wednesday. I am waiting for your answers (see below what I mean with an "answer") until Wednesday noon to Andreas Binder.

The puzzle
There are three species of animals in a magic forest: lions, wolves and goats. Wolves can devour goats, and lions can devour wolves and goats. ("The stronger animal eats the weaker one".)  As this is a magic forest, a wolve, after having devoured a goat, is transmuted into a lion; a lion, after having devoured a goat, is transmuted into a wolve; and a lion having devoured a wolve becomes a goat.

At the very beginning, ther are 17 goats, 55 wolves and 6 lions in the forest. After every meal, there is one animal fewer than before; therefore after some time, there is no devouring possible any more.
What is the maximum number of animals who can live in the forest then?

A super-correct answer in my understanding
A correct answer contains
(a) the number of animals left,
(b) a devouring strategy leading to this number
(c) the proof that this is the maximal possible number of animals left.

Two versions of the puzzle: The even more difficult one.
The even more difficult one does not give you possible answers for (a). This is thought for the geniuses among you.

The kangaroo version is a multiple choice test, giving you 5 possible answers. I will write down these possible answers below the pictures. Therefore, geniuses, stop reading now.

Image Source:
Image source:
Image Source:

The multiple choice answers at the math kangaroo were
(A) 1
(B) 6
(C) 17
(D) 23
(E) 35

We Celebrate 555 Posts

About 5 years ago the first post in UnRisk Insight was first set free into the world: the introduction of the UnRiskverse and the evolutionary approach - that still drives UnRisk  development. A summary of the principles can be found here).

Post 555: Interest Rate Models - from SDE to PDE

Write more - learn more

It was the beginning of an exciting journey. We discussed more, wrote more, explained more, learned more - not only in the blog.

The UnRisk Academy was established and after Michael Aichinger and Andreas Binder wrote the book A Workout in Computational Finance, live workouts became an Academy reference class.

The most popular is not always the best?

The blog is read all over the world and starting with  a few page visits we enjoy now about 5000 per month. The reason? More are writing, we strive for publishing each work day, try to present posts in a rhythm and present more knowledge details.

From the all time hit list:

15 Years of Adaptive Integration 2012

Diffusion and Osmotic Pressure  2013

Skateboarding and Computational Finance 2013

VaR of the Jungle 2011

Black vs. Bachelier 2013

The Big Joke of Big Data 2011

In 2014

The series about The City Swap 

The Black Karasinski Calibration Problem

The series on models, like The Jump Diffusion Models

seem to be candidates for an all time high ...

As I personally reflect today on these 5 blogging years, what is amazing to me is how much it changed my working life.

And the blog journey continues ….

Interest Rate Models - From SDE to PDE

After our review of different equity models in the physics's friday blog post series we now turn to the family of interest rate models. Today's model I want to show the derivation of a PDE (partial differential equation) for a short rate model, where the SDE (stochastic differential equation) is given by

We are interested in the change dV of the value of an interest rate instrument V(r , t) in an infinitesimally short time interval dt. We set up a self-replicating portfolio containing two interest rate instruments with different maturities T1 and T2 and corresponding values V1 and V2 and apply Ito's Lemma to obtain


we can get rid of the stochastic terms in the equation above. To avoid arbitrage we use the risk free rate 

and with some rearranging

This equality only holds when both sides of the equation only depend on r and t and not on product specific quantities, thus we get the following PDE for our short rate models

In our next blog post we will take a look on the different terms of this equation. 

Happy Easter

A Good Financial Languages has Freedom, Knowledge and brings Documents to Life

In my recent post The Freedom of a Financial Language I have emphasized on the benefit of a (domain specific) programming language that represents everything in symbolic expressions, for declarative programming, symbolic code manipulation, scripting ….. as a kind of freedom

When I wrote this I recalled the book Meta Math, Gregory Chaitin, about complexity, computability, randomness and incompleteness.

And think of

Computational knowledge

A little more abstractonautics: a Formal Axiomatic System is built of an Alphabet, a Grammar, Axioms, Inference Rules and a proof-checking Algorithm. Mathematical proofs must be formulated in this language and use axioms and the inference rules to create the logic "proof chain".

If the language is a an adequate programming language an automated theorem prover can create a constructive proof, a program that computes results applying the theorem.

Remember the polynomial ring. A system of multivariate polynomial equations can be solved by constructing the so called Groebner basis. The constructive proof that the Groebner bases are result of correct equivalence transformations (of the original system of polynomials) is a program that computes them. Provided, the language is a symbolic language and knows how to do the transformations.

A mathematical theorem that is proven can be pushed into the mathematical knowledge base.

A weaker, but still strong, requirement is: verified and validated.  An adequate programming language helps to automate verification and validation

A a verified and validated program can be pushed into the computational knowledge base. It is a new building block for something higher. The knowledge base becomes richer and richer and higher problems can be solved easier and easier.

Freedom (the language aspect) is complemented by automation (a rich world of algorithms).

But theres is more:

Documents that are programs 

In Language is too Clumsy an Instrument, I have introduced the idea of representing the cores of the term sheet in a document centered declarative (symbolic) language that comes with a vast variety of algorithms.

The vision: term sheets become documents that are computational. The same with financial reports, ...

UnRisk Financial Language is document centered, symbolic, declarative and it has built-in computational knowledge.

Enabling quants to extending it into their problem world with little code.


This Easter weekend, Andreas Binder will introduce a nice and challenging mathematical puzzle about Goats, Wolves and Lions. For recreational maths fans and geniuses.

Picture from sehfelder

Why are Adjoint Problems That Fast?

Last Monday, Hannes pointed out in his post on the Black Karasinski calibration that adjoint techniques are essential to obtain results in reasonable time.

For the sake of simplification, assume that we know the reversion speed and the volatility and we want to calibrate the drift parameter function theta. This problem is somehow similar to the Hull-White curve fitting problem (see also Austria is famous for skiing and calibration) but of higher complexity due to the lack of an analytic solution already for bond prices.

Remember that in both cases (Hull-White and Black-Karasinski), the drift terms are parameters in the respective partial differential equations (PDEs) that have to be solved for the valuation of financial instruments. For the Hull-White system, explicit solutions are available that need only the evaluation of a handful of integrals. In the Black-Karasinski case, the PDE has to be solved by numerical technques with finite elements being the preferred one.

In the calibration problem, the drift term has to be fitted to meet market bond prices (minimize an error functional plus some regularization term). This could either be done by calculate the so-called Gateaux derivatives in the directions of the changes of theta: This is the sensitivity approach which needs, for the calculation of the gradient, as many finite element solutions as there are design variables (i.e. free parameters in theta). The adjoint approach reformulates the problem and needs only one PDE solution for the gradient.

Further reading:
Michael Hinze, Rene Pinnau, Michael Ulbrich, Stefan Ulbrich: Optimization with PDE constraints, Springer 2009.

Similar approches are used in the local volatility calibration.

Left: Naive application of Dupire formula, right: Egger-Engl utilizing regularization and adjoint techniques.

The Freedom of a Financial Language

Mid 70s. My master thesis was about Free Algebraic Structures.

A free polynomial ring over the alphabet {1, x} describes all language constructs of the type (x+x+1+1+1)*(x+x+x +(-1-1-1-1)) + (1+1)*x*x*x … that can be equivalently manipulated by rules of commutativity, associativity, distributivity, an additive and multiplicative neutral and an additive inverse ….  When we have decided over which domain (Integers, ..) the polynomial ring is defined, we can symbolically expand, factorize, ..

Declare and Calculate?

But it does not calculate results. For calculation you need polynomial functions (that are isomorphic to their free structures).

At 70ies we understood a universal algebra as a set A with a collection of i-ary operations on A. Today universal algebra is a theory and those are algebraic structures and operations are thought as functions that take n elements of A and return a single element of A. I like this - it is more algorithmic.

BTW, a universal algebra (in the 70s sense) is like an abstract data type that's interface is free. OOP basics.

Abstract Nonsense?

However, the theory of free algebraic structures has some interesting aspects.  Some computer scientists adopt the algebraic category theory. In short, a Category  is itself a structure of structures and processes, that preserve this structure, are called Functors.

You can find a concise description in the Haskell pages. Haskell does support in-language Functor Programming (other OO languages could do this by Functor Class),

UnRisk Financial Language is a Free Language

UnRisk Financial Language and UnRisk-Q  are atop Wolfram Language and Mathematica. It is a symbolic language and allows for symbolic manipulation before computation.

You can use the vast variety of the built in derivative and risk analytics algorithms of UnRisk / Mathematica or build or instantiate you own financial constructs or methods.

Its high-level structure and the symbolic character allows you for declarative programming, symbolic code manipulation, scripting and what have you.

Quants develop as we develop

In many cases we develop UnRisk in UnRisk Financial Language. Quants can do the same.

Picture from sehfelder

The Black Karasinski Calibration Problem continued

In my last blog I introduced the Black Karasinski model. Today I will present some more details of the two step procedure used for the identification of the model parameters.
  1. inner loop: identification of the time dependent drift function by solving the regularized curve fitting problem given the speed and the volatility parameters
  2. outer loop: identification of the speed and volatility parameter, such that the model prices of caps and swaptions are close to the quoted market prices.
Since the problem of fitting of the drift function is ill posed (see the oscillating behaviour of the drift in the picture), regularization techniques have to be applied.
Formulation as a minimization problem including a penalty term (penalizing the L2 norm or the H1 norm of the solution x), leads to a smooth drift, which is consistent with the initial term structure of interest.

For the calculations in step 2 of the algorithm we minimize the functional

To end up with a calibration algorithm which is not too time consuming (compared to models where analytic formulas are available), for the calculation of the gradient with respect to the time dependent drift parameter an adjoint approach is applied. 
Finally we end up with a calibration algorithm with a computational time < 10 sec.

Goodness of Fit   

                                   market data (red) compared to the model data (green)

5 Reasons Why It's Hard Working in the Data Salt Mines of Business Intelligence

It's not hard to make a picture from a model (it may be expensive). But it's hard to extract a model from a picture. From structured to flat data is easy the inverse is not.

But at the moment we get the impression massive data is the most valuable asset?

Take "Business Intelligence" (or "Trading Intelligence"). IMO, business intelligence is about theories, methodologies and technologies to create meaningful information for better business decisions - like finding the best position on the value / price map of your products.

Approaches can be model, or rule-based or data driven.

Life in The Data Salt Mines

But most of the definitions say something, like: a set of methods and tool to extract meaningful information, such as (understandable and hopefully computational) "models" .. from (raw) data.

That means models are not results of thinking but from data mining techniques such as statistics, fuzzy logic based learning (fuzzy decision tree, fuzzy rule based learning … ), neural networks, kernel methods (like supported vector machines), self organizing maps and what have you. (Why fuzzy? Because it makes the decisions trees and rule bases understandable and computational).

That does not sound like hard work, does it?

Challenges? I select 5.

1. What truth does your data set represent?

A set of data is true if it describes a real behavior without ambiguity. You have perfect data records of the price dynamic of one market segment related to your product class, but you want to approach a new market segment with an extra service?

2. What are raw data?

It's like in the kitchen: chefs do not only use raw ingredients but semifinished things like stocks, sauces, …
Many of the data are "cooked" (a result of a process) - cost, expenses, …are dependent of accounting standards, … In fact only very few data are raw.

3. Do Big Data present more than noise?

Usually big data represent objects in high dimensional parameter spaces and it is practically impossible to capture each possible data point. So you may run into the high-dimension-low-number-of-sampels problem - in those sets large deviations are more attributable to noise - see The Big Joke of Big Data.

4. Does machine learning generalize?

If you have data of partitions that you are interested in and your data present the truth in this partition you just extract different models of their data sets. No problem. But the data collection might be expensive. Therefore you want to extract models that generalize, but machine learning is of bad nature for generalization. See Should quants learn more about deep learning?

5. Are there unified best practices for business decisions?

IMO, you shall not extract best practice from general business patterns but the deep understanding what's going on in your environment. You may need to change strategies and the positioning of your product quite often.

As a result, I want to point out: don't build your business intelligence system by data driven methods only. Use an intelligent mix of model based systems and calibrate them to informative data.

What about trading intelligence? 

In fact, it is my belief that every challenge above is true for high frequency trading, in particular the branch that hunts for patterns in market data.

What about information efficiency of prices? What about risk spectra at various time scales? What about total trading cost? …..

I am not so bad in understanding machine learning, but I don't want to answer these questions - so, I am not a candidate for a job in the data salt mines.

Picture from sehfelder

What Austria is Famous For: Skiing and Model Calibration

In my recent posts on mathematics and skis and the beauty of metal skis, I presented a story from the stone age of industrial mathematics.

The lesson I learnt from that project was that there is no help in having a clever model for an industrial process, when there is (a) no chance to get the model parameters for this process and (b) the parameters in fact are essential for the results.

In the ski production project, (a) and (b) were the case so that we had to stop the project.

The Industrial Mathematics Institute, which is my scientific cradle, so to say, is certainly one of the leading centers in the world for inverse problems (with the big names Engl, Neubauer, Scherzer, Kaltenbacher and several more somehow related to the foundations of nonlineat inverse problems theory), and this is one of the reasons why we at UnRisk are especially proud of our calibration routines.

The Figures show calibration results for curve fitting without regularization and with proper regularization.

Learn more about calibration and inverse problems in the the workout in computational finance.

A remark: Of course, Austria is not only famous for skiing and calibration but also for percussionists, steel mills, machinery for injection moulding, firefighter cars, bodybuilders. Just to name a few.

… Fit the Battle of Econo, Econo, Econo

In All Quants Need Informative Data, I pointed out what some physics problems and economics problems have in common: they have only uninformative data (and need to transform the few data they get into model parameters - by constant recalibration).

This is a blog post of the economist Chris House: Why Are Physicists Drawn to Economics.

And this are the replies from Mark Buchanan: Arrogant physicists - do they think economy is easy? and Noah Smith: Coming into econ from physics (and other fields).

I think, all agree that the problem is "complexity" (in my understanding, a system is complex if it contains at least one subsystem of co-evolution).

But it seems economists over estimate the difference between physics and economics intuition - system behavior versus human behavior - and the physicist's emphasis on lab experiments (the data problem).

Brian Arthur's complexity economics as a different frame work for economic thought seems to be motivated by physics intuition (simplified: positive feed back structures drive emergent systems; market dominance drives market dominance, innovation drives innovation, ...).  In this framework equilibrium (neoclassical) economics is a special case of complexity economics (not the other way around). This is the article.

The difference between system behavior and human behavior? IMO, human behavior changes when humans use models intensively.

In quant finance: up to 1987 when all option traders played the simple Black Scholes game (constant volatility) everything worked well - the headache began with the introduction of the far out of the money options and the exploration of the volatility smile … now models have adapted, but there is still the problem of uninformative data.

And this is, IMO, the kind of meso-layer (remember the discussion about material behavior) between the micro layer of concrete economic transformations and the macro layer of the development of a complete economy. Understand the influence of game rules and derivative economic objects.

You can't predict future but build it.

Still a lot to do for quants: take models from good to great and solve them adequately, extract informative data, calibrate, recalibrate (or even co-calibrate) … all blazing fast.

Picture from sehfelder

Insight to the Black Karasinski Calibration problem

Hi, my name is Johannes Fürst and I am one of the financial engineers in the UnRisk developer team. Together with my colleagues, I have been working on a wide range of computational finance projects. My primary duties are the implementation and improvement of numerical methods and algorithms used for the pricing of instruments and model calibration in the UnRisk software package.

In my first blog post, I would like to give a little insight to the calibration of the Black Karasinski model, which assumes that the short interest rate process follows the stochastic differential equation
Since the distribution of the short rate is assumed to be lognormal, the interest rate never falls below zero. Before complex instruments can be priced, the model parameters have to be calibrated to the current term structure of interest rates and to the market prices of caps and swaptions.

To be able to fit the current term structure of interest, ϑ is chosen to be time dependent, whereas the reversion speed parameter η and the volatility parameter σ - used for the calibration to option data - are chosen to be constant.

Since (even for simple instruments) there is no analytical formula available in this model, numerical methods have to be applied (even in the calibration process). Using the Ito formula and no arbitrage arguments, the pricing equation - which is a parabolic partial differential equation  - can be derived:

The prices of bonds, caps and swaptions are obtained by solving the upper PDE backwards in time, taking into consideration the instrument specific interface conditions and using the instrument specific payoff as terminal condition at the product maturity date T. For the numerical solution of the PDE, we use a finite difference sheme (taking into account the direction of the flow using upwind techniques) on a two dimensional grid with respect to r and t.

More details of the calibration process and results for the Black Karasinski model will be presented in the blog next monday.

Jump Diffusion Models

In my last post I started with an overview of jump models. The first group of models I presented have been the jump - diffusion models.
Jump diffusion models add a jump process to the Brownian motion responsible for the ”normal” propagation of the underlying. In such models, jumps repre- sent rare events like crashes. The Black Scholes model is extened through the addition of a compound Poisson jump process to capture the skewness and tail behaviour observed in real world data. For the jump process a compound Poisson process with an intensity and a jump size distribution f is used. With drift and Brownian motion
we get the following process

for the spot price. Depending on the jump size distribution f we can distinguish between different jump models:
  1. Merton Jump Diffusion Model: The Merton Jump Diffusion model uses a normal distribution for f. This extension adds three additional parameters to the Black Scholes model — the intensity of the compound Poisson process, the jump mean and the standard deviation of the jump size.
  2. Kou Model: Kou modelled the jump distribution f with an asymmetric double exponential distribution to distinguish between up- and down-moves. In the Kou model four additional parameters arise, namely the intensity of the compound Poisson process, the intensity of the exponential distribution for down jumps , the intensity of the exponential distribution for up jumps and the probability for the up jump.

All Quants Need Informative Data

Andreas in The Beauty of Metal Skis describes a very general problem in a prototypical case.

You have a model that works perfect under lab conditions but how to transfer it into the conditions of the real working space?

To me a model is "perfect" in the sense of "speculative realism" if it contains all "influencing" parameters that we know from the behavior of the real wold system (a kind of an isomorphic map). If you also have an isomorphic map from a theorem described by mathematical symbols you have a mathematical model. But this does not say anything about their real quantities.

I divide this mathematically modeled systems into 2 classes: those that allow for the extraction of informative data from the "process" and those that can't.

Work with informative or uninformative data? 

Like in Andreas example it is often about material properties. In the case of metallurgical materials you may obtain the required parameters in the lab or during the process (like in forming …) Wood, composite materials, .. are very difficult.

We discuss this with our quantum physicists, Michael and Stefan, and they say: material research emphasizes on the micro-, meso- and macro-levels (say, the continuum mechanics). You need to dive into the micro level, if you, for example, want to understand crack propagation, creeping, ... in a, say, forming, cutting, welding ... process.

But also in mechanism where you want to control optimal paths (multi axes machine tools, robots, satellites, antenna systems ...), you need to calibrate the control system to the real system behavior. You may have applied an advanced Discrete Mechanics and Optimal Control approach, but the real system has special properties …

Riding the waves of observed data

Shifting from the lab world into the "real"world is calibration. An inverse problem. You will do it wrong if you don't know the traps and how to overcome them (by regularization techniques, ...) or if you have uninformative data.

In The Blank Swan of Metal Forming I pointed out that some physics problems and quant finance problems are not so different.

You have just structured a great new exotic option type. You replicate it by, say, a Heston model that is calibrated to vanilla options. You used the most advanced solvers (and cross-model validation) and did the calibration right.

The fit is so well, but the price may be so wrong, because you don't have market prices of members of your option class. Your market data are too uninformative.

What to do? You need to make provisions for using adequate real time market data (if you get some) in the real trading process. Recalibrate constantly by riding on the price waves.

BTW, the momentary "battle" between the "econophysicist" and the economists is also about this: are physicists problems of better nature to create computational knowledge, because they have always informative data? I will come back to this here.

Picture from sehfelder

The Beauty of Metal Skis

I recapitulate from Mathematics and Skis that the idea was to lay the different layers of the cross-country ski on top of each other with a temperature of maybe 90 degrees centigrade (so that the are thermally expanded), then glue them and then bring them down to snow temperature.

After I had implemented my beam model (in plain FORTRAN 77, of course), we (i.e. the ski manufacturer) produced some skis made of purely metal layers (to be more specific steel and aluminium layers). Why that? The material properties (i.e., the thermal expansion factor and the mechanical stiffness parameters) of metals at these low temperatures (low for metals) are very well known. Hence, we could use the very first experiments to identify the gluing temperature.

With that temperature fixed, we were able to reproduce the results of all metal ski experiments (different number of layers, different thicknesses, different order of materials) within a tolerance of less than 5 percent. The computing time for one experiment was about 3 minutes on a central IBM monster machine at Linz university (with a computing power much lower than today's cellphones).

Image Sourece: European Patent Office, Patent Nr. 0086983. This patent has nothing to do with our developments.

Now it is time for the bad news: When it came to the real skis (made of wood), it turned out that the material parameters of wood used for ski production differed by a factor of 4, even for wood layers produced form the same batch which should have identical parameters. We had asked the ski manufacturer several times if it would be possible to obtain the material parameters in a reliable way and had got the answer "Yes, certainly." Maybe we should have been more insisting.