Message Boards Message Boards

Mathematica 11.3's new DeBruijnSequence: adding new dimensions

GROUPS:

Introduction

The new version of Mathematica (11.3) introduces a whole range of interesting new functionality. Some functions like FindTextualAnswer and FindEquationalProof quite clearly open up new possibilities and provide tools of enormous power. But MMA 11.3 also introduces a huge range of other functionality that might go a bit unnoticed but also is highly interesting. Here I show the example of DeBruijnSequence and DeBruijnGraph which are interesting not only for safe busters.

Suppose we have a lock like this one:

enter image description here

Suppose we have to enter a 4 digit code. There are $10^4=10000$ combinations, starting at 0000, 0001, ..., 9999. Each has 4 digits so if I want to test all combinations I will need 40000 keystrokes.

Let's suppose that the key lock is stupid, and only remembers the last for digits you entered. So if you entered 207689, it would have tested the combinations 2076, 0768, and 7689. So I only needed to type in 6 keystrokes instead of 12. We want to ask the following question:

What is the shortest sequence of digits (keystrokes) that will contain all possible combination of 4 digits as subsequences?

The answer is given by the DeBruijnSequence, named after the Dutch mathematician Nicolaas Govert (Dick) de Bruijn (9 July 1918 – 17 February 2012).

Entity["Person", "NicolaasGovertDeBruijn::3s54p"][{"Image", "BirthDate", "DeathDate"}]

enter image description here

As my explanation of the problem is not good, I recommend this video where someone explains the whole thing much better than me.

A simpler problem

Let's make the whole problem a bit simpler. Suppose I only have two digits 1 and 2 and I want to find all possible combinations of 2.

Tuples[{1, 2}, 2]
(*{{1, 1}, {1, 2}, {2, 1}, {2, 2}}*)

So there are 4 tuples of length 2, which require me to type in 8 keystrokes. Let's see if we can do better if the lock remembers only the last for digits entered. The sequence

1, 1, 2, 2,1

solves this problem. If I partition it and always shift by one I get:

Partition[{1, 1, 2, 2, 1}, 2, 1]
(*{{1, 1}, {1, 2}, {2, 2}, {2, 1}}*)

All combinations of subsequences. It turns out that the sequence 1,1,2,2,1 is a DeBruijnSequence:

DeBruijnSequence[{1, 2}, 2]
(*{1, 1, 2, 2}*)

Note that I have to read the sequence cyclically, i.e. when I reach the end I have to keep typing (length of required sequences)-1 digits from the beginning. In this case the length the subsequences is 2 so I need to add the first digit to the end:

1,1,2,2,1

How can we find such a sequence? Well, one way is to write down all the tuples of two digits:

Tuples[{1, 2}, 2]
(*{{1, 1}, {1, 2}, {2, 1}, {2, 2}}*)

We then look for "overlapping sequences". So for example takes the first element of the tuple list {1,1}, delete its first entry {1} and then take all list that you can construct by adding one digit to the end: {1,1} and {1,2}. This gives two links in a graph {1,1}->{1,1} and {1,1}->{1,2}. We then proceed like that for all other tuples, i.e. we link them to those tuples that can be generated by deleting the first entry and adding any of the available digits:

DeleteCases[Flatten[Outer[If[Drop[#1, 1] == Drop[#2, -1], #1 -> #2] &, tuples, tuples, 1], 1], Null]
(*{{1, 1} -> {1, 1}, {1, 1} -> {1, 2}, {1, 2} -> {2, 1}, {1, 2} -> {2, 2}, {2, 1} -> {1, 1}, {2, 1} -> {1, 2}, {2, 2} -> {2, 1}, {2, 2} -> {2, 2}}*)

The respective graph looks like this:

g = Graph[DeleteCases[Flatten[Outer[If[Drop[#1, 1] == Drop[#2, -1], #1 -> #2] &, tuples, tuples, 1], 1], Null], PlotTheme -> "Scientific"]

enter image description here

Next we need to find a HamiltonianPath, i.e. a path that visits every note exactly once:

hamiltonpath = FindHamiltonianPath[g]
(*{{1, 1}, {1, 2}, {2, 2}, {2, 1}}*)

The rule to construct the DeBruijnSequence is then easy. We use the first entry of the path, i.e. {1,1} and then add one by one the second elements of all tuples in the list:

sequence = Join[hamiltonpath[[1]], hamiltonpath[[2 ;;, 2]]]
(*{1, 1, 2, 2, 1}*)

Note that the DeBruijnSequence that Mathematica gives is one shorter, i.e. cyclic, ours just reconstructed sequence is not!

The Wolfram Language also has a function that directly constructs the Graph:

DeBruijnGraph[2, 2, PlotTheme -> "Scientific"]

enter image description here

This graph looks different from the one we constructed above, but this is just its layout. In fact the graphs are isomorphic:

IsomorphicGraphQ[DeBruijnGraph[2, 2], g]

evaluates to "True".

A slightly more complicated example

Ok. Let's do one more example. We are looking for sequences of length 2 on an alphabet of three digits, 1,2, and 3.

tuples = Tuples[{1, 2, 3}, 2];
g = Graph[DeleteCases[Flatten[Outer[If[Drop[#1, 1] == Drop[#2, -1], #2 -> #1] &, tuples, tuples, 1], 1], Null], PlotTheme -> "Scientific"]

enter image description here

which turns out to be isomorphic to

DeBruijnGraph[3, 2, PlotTheme -> "Scientific"]

enter image description here

so that

IsomorphicGraphQ[DeBruijnGraph[3, 2], g]

gives "True".

Solution to our problem

We are not ready to solve our key lock problem. We are looking for the DeBruijnSequence for string length 4 and on the 10 digits 0,1,2,...,9.

DeBruijnSequence[10, 4]

enter image description here

The sequence is

DeBruijnSequence[10, 4] // Length

10000 long, but it's cyclic so we have to add the first three digits to the end to have all combinations. This means that 10003 keystrokes are enough, which has just saved us nearly 30000 keystrokes!!!! This is fantastic if you want to bust a safe.

For those interested: here is the explanation by a professional.

Note, that this is quite useful for DNA sequencing, too.

Extending the DeBruijnSequence to more dimensions

Of course, a mathematician always asks whether we can generalise this. Let's try to generate a two dimensional DeBruijn sequence, i.e. a torus instead of a cycle like before (remember that the sequence that the Wolfram Langauge generates has to be considered to be cyclic!). Again you can find the main idea on this website.

It turns out that there is, of course, maths for this: Toroidal tilings from de Bruijn-Good cyclic sequences. The algorithm is a bit technical, but at the end we obtain:

deBruijn2D[symbols_, l_, w_] := 
Module[{column1, shifts, solution}, 
column1 = DeBruijnSequence[symbols , l]; 
shifts = DeBruijnSequence[symbols^l, w - 1]; 
solution = Transpose[FoldList[RotateLeft, column1, shifts][[;; -2]]]; 
If[EvenQ[symbols] && w == 2, Join[#, {#[[1]]}] &@Transpose[Join[Transpose[solution], {RotateLeft[Transpose[solution][[1]], 1/2 symbols^l]}]], 
Join[#, {#[[1]]}] &@Transpose[Join[Transpose[solution], {Transpose[solution][[1]]}]]]]

The input slots are:

  • symbols : number of different symbols (will be represented as digits starting at zero.
  • length: length of the sequences; similar to the one dimensional case ("of length 4").
  • width: similar to length but for the second direction; refer to paper.

In fact if length_ is m, width_ is n and we have an alphabet of c numbers, then we obtain an array of size: $c^m \times c^{m(n-1)}$. So here is the DeBruijn tiling in two dimensions, on two digits where the length and width are both 2.

deBruijn2D[2, 2, 2] // MatrixForm

enter image description here

Here is a more colourful representation:

ArrayPlot[deBruijn2D[2, 2, 2] /. {0 -> Red, 1 -> Green}]

enter image description here

This is actually quite useful. Every 2 by 2 square is unique in the plane and all combinations occur. This means that if I know the colours of the tiles in one 2 by 2 square I know where I am on the plane!!!!

More symbols/larger systems

Let's look at a system with 4 symbols and 2 by 2 square tiles.

solution=deBruijn2D[4, 2, 2]

Here is a nice representation of that (in which I have "added" a column/row to take care of the cyclicity):

Grid[Partition[Grid /@ Flatten[Partition[solution /. {0 -> White, 1 -> Green, 2 -> Blue, 3 -> Red}, {2, 2}, 1], 1], 16], Frame -> All]

enter image description here

Modulo bugs in my code, each of these 2-by-2 squares should be "unique". There should be $4^4=256$ different 2-by-2 tiles:

Length[Flatten[Partition[solution2, {2, 2}, 1], 1] // DeleteDuplicates]

gives 256, which is a good sign. Here is another representation without the black grid (which is also non-cyclic):

ImageAssemble[(Graphics[{#, Rectangle[]}] & /@ #) & /@ (solution/. {0 -> White, 1 -> Green, 2 -> Blue, 3 -> Red})]

enter image description here

Of course, everything actually lives on a torus:

imgtiles = ImageAssemble[(Graphics[{#, Rectangle[]}] & /@ #) & /@ (solution[[;; -2]][[All, ;; -2]] /. {0 -> White, 1 -> Green, 2 -> Blue, 3 -> Red})];
img = ImageResize[imgtiles, 775];
imgmatrix = Join[ImageData[img], ImageData[img]];
frames = Table[
ParametricPlot3D[{Cos[u] (3 + Cos[v]), Sin[u] (3 + Cos[v]), Sin[v]}, {u, 0, 2 \[Pi]}, {v, 0, 2 \[Pi]}, 
TextureCoordinateFunction -> ({ #4,  #5} &), PlotStyle -> Directive[Specularity[White, 50], Texture[Image[imgmatrix[[g ;; g + 774, All, All]]]]], 
Axes -> False, Lighting -> "Neutral", Mesh -> None, Boxed -> False, ImageSize -> 900], {g, 1, 775, 7}];
Do[Export["~/Desktop/DeBruijnTorus/frame" <> ToString[1000 + i] <> ".jpg", frames[[i]]], {i, 1, Length[frames]}]

enter image description here

Automatic graphics

We can create another function for the visualisation:

deBruijnPlot[matrix_] := Module[{}, (MatrixPlot[# /. (Rule @@@ Transpose[{#, RandomColor[Length[#]]}] &@
DeleteDuplicates[Sort[Flatten[#]]])]) &@matrix ]

So we get:

deBruijnPlot[deBruijn2D[4, 2, 2]]

enter image description here

Conclusion

DeBruijnSequences are very intriguing mathematical objects with many applications. Here is for example a card trick based on the DeBruijnSequence. They can help to sequence DNA and to know where you are on a plane if all you have is very local information.

The functions that I introduced in this post are not optimised at all. They are relatively slow, and you will have to respect certain rules for length and width. Also the size of the matrices increases very fast as the number of symbols, length and width increase. Some slightly larger systems than the ones shown in the post can be computed, however, and their patterns are intriguing:

deBruijnPlot[deBruijn2D[8, 3, 2]]

enter image description here

They are obviously a consequence of the algorithm that I have used.

It is also quite possible to extend the entire procedure to more than 2 dimensions, but that is for another day....

Cheers,

Marco

POSTED BY: Marco Thiel
Answer
2 months ago

Great stuff Marco! cheers!

POSTED BY: Sander Huisman
Answer
2 months ago

It would be nice to see this in a more readily findable place such as the Wolfram Blog.

POSTED BY: Murray Eisenberg
Answer
2 months ago

@Murray, this is exactly why we created Staff Picks: http://wolfr.am/StaffPicks - that particular group collects blog-like best contributions, that post is already there ;-)

Staff Picks can be also found on the authors profile: http://community.wolfram.com/web/mthiel

@Marco - truly great, thank you! Some folks sent us requests to include this in WL docs!

POSTED BY: Vitaliy Kaurov
Answer
1 month ago

Great post. This function is very useful, although a little difficult to realize when it can help your code.

One thing it is good for is for making efficient experiments with CellularAutomaton, because you can get the minimal list that contains all of the neighborhoods.

POSTED BY: Todd Rowland
Answer
2 months ago

enter image description here - Congratulations! This post is now a Staff Pick as distinguished by a badge on your profile! Thank you, keep it coming!

POSTED BY: Moderation Team
Answer
1 month ago

Group Abstract Group Abstract