Allemang Tables (Introduction, no code)

Allemang Tables (notes added 17 March 2003)

Motivation

There are inefficiencies in our genus calculations:

    1) We treat heap sizes that are theoretically identical (for example, two heap sizes each equal to *2, say, or to a4[0], say) as distinct.
    2) We fail to capture the most general form of our prior knowledge about the behavior of heaps "up to" a specific heap size
    3) Specifically we don't capture period information.  

The idea is to carry some information along to each heap size, just as we did in Sibert-Conway decompositions.  

The form of the information we "carry" is different for genus calculations.

Here's a first shot at the information we want to carry for a heap size:

    1) Whether the game is an adder, and if so, which adder is it.
    2) If not an adder, then a formal symbol a, b, c, d, etc, with "heap period" information (p,s)  (period = p, start = s); rank = p+s
    3) An addition table for the formal symbols, showing for each linear combination of formal symbols reduced modulo the rank,
    what is the genus of that position.
    
If two heaps are identical in the sense of misere reversibility, we want to be sure that we "collapse" them when starting to consider
larger heap sizes.

We need a name for the data structure that is "the information carried."  We'll call it an genus analysis, or simply analysis, for short.

A complete analysis augments an analysis by including global period information, which is a pair (P, S) together with a
table of formal equivalences to symbols a, b, c etc or to adders of length P+S.  Once heap size h >= P+S+1 is reached, the genus of a heap of size
h can be found in the global period table by reducing h modulo P.

The first task is come up with a good internal representation of an analysis.

The second task is to come up with an algorithm for incrementally extending an analysis to the next heap size.  Allemang gives one,
but we need to come up with one ourselves.

Inductively, assume that a correct analysis to heap size h is given.  We want to extend it to heap size h+1.

We're first going to consider positions that involve only one heap of size h+1.

As a preliminary step, we compute the genus of the heap of size h+1.  This is the starting point.  

We'll compute all genus results of positions 1 x (h+1) + (a position involving only heaps of size h or less).

That's a job that itself needs to be broken into pieces.

The first piece is to consider positions P(j) = 1 x (h+1) + j *h[1].  That is a "linear" sequence of genus values. (***)

We need to be able to tell when this sequence of genera repeats as a function of j.  

The moves from P are of two types:  (A) ones that move in the h+1 component, and (B) those that don't.

(A) Suppose there are M1 possible moves from the (h+1) component.  Each of these moves inductively leads
to a position whose genus is directly computable from the partial analysis we've completed so far.  
So we just look those M genera values up and throw them into a bag.

For the (B) moves, we're going to have a fixed number M2 of these, depending on how many options there are from h[1].
For each of those M2 options, we're going to be looking up a value that it is earlier in the linear sequence we are currently
attacking.

So in summary, at each stage j, we have M1+M2 options to look at, where M depends only on the (fixed) number of
moves available from the heap size h+1, and M2 depends on the number of moves from h[1].  And for each such
option, we can easily obtain its genus, either from the inductive analysis or from an earlier term in the sequence
we are currently computing.  

The mex-with-0,1-carrying algorithm is then applied to these M1+M2 options, and we get the genus of the jth value in the
linear sequence we started with at (***)

They differ in their j-portion...

The moves that leave the h+1 component unchanged....

....arggh, how to do this?...
    
Notes added 19 March 2003:

Rather than groping around for the algorithm just yet, we can at least characterize its desired output data structures.

A partial analysis to heap size H has two pieces:

    1) Heap period information for each heap size t that is an element of {1, 2, ... , H}
        Fixing t, this information is captured by two integers (p,s), where s >= 0, p >= 1.  
        It asserts that a position that contains p+s or more heaps of size t can be reduced to s such heaps without changing anything
        Another way of writing this is as an equation  {p+s}.A = s.A, where A is the game consisting of a single pile of size t
        Typical examples of periods that occur within particular games (these are NOT universally applicable identities in general).
        
                s=0, p=1:  Game = *0; all multiples have genus 0^{120}.  All multiples also zero, equation A=0
                s=0, p=2:  Game = *1; genus sequence 0^{120}, 1^{031}, 0^{120}, ... equation A+A = 0
                s=1, p=2:  Game = *2; genus sequence 0^{120}, 2^{20}, 0^{02}, ... equation A+A+A = A
                s=1, p=1:  Game = a[4]=2+2; genera 0^{012}, 0^{02}, 0^{02}, ... equation A+A = A
                
        A basis position [at size t] is one that consists fewer than s+p heaps of size t
                
    2) Addition table information for looking up the genus of each possible linear combination of basis positions.  Since there are
        s+p possible basis positions (0,1,2...s+p-1) for each heap size t. This can be a big table.  We need to be efficient by treating
        adders separately

A complete analysis augments a partial analysis by dropping the dependence on a finite value H.  

    3) Global period information (P,S) reflecting the fact that a heaps of size P+S or greater can be treated as heaps of size S.
    
    From an efficiency and clarity point of view, we're often interested in representing the global period information as a sequence
    of P+S games, where each one is either an adder or a formal symbol A, B, C, etc with known little-p and little-s.

How should we represent this information in Mathematica?  

    It's very simple to represent the heap period information.  It's just a list of length H with entries {p,s}.
    It's also simple to represent the addition table.  It's a multidimensional array with a genus symbol in each entry.
    The global period information can also be represented just as a pair {P,S}, since everything is implied by that.

We're also keenly interested in cases where different heap sizes H are a priori identical, for example, when two heap sizes are both equal
to the game a4[0], say, or to *2, then we definitely want to keep only one of them around in our internal computations.  One way of
handling that is to collapse the information in the move list; ie if heaps 7 and 4 are each equal to *2, say, then we can basically blow away
heap size 7 (replacing all moves to 7 with moves to 4).  We did something like this in our Sibert-Conway notebook, and observed it to
work well.  It's not clear if we should try to do it immediately in this notebook though because it might make things more confusing when
we're first implementing the algorithm.

In implementing the algorithm, we probably want to keep track of which games are adders, and which ones aren't.  Allemang gives a
useful rule:

    "Now, if we are given a sum of games, and we wish to know whether they [sum to an adder], all we have to do is look at
    the summands.  If they are adders, so is the sum; if one of them is not, neither is the sum"
    
What we're most interested in is this passage from H to H+1 in a partial analysis.  We need to put our finger on the algorithm.

In the passage from H to H+1, we'll certainly depend on the correctness of the decomposition to heap size H.  

One of the common operations we'll want to do with analyses is look up outcomes.  A partial analysis to heap size H
is amenable to quickly returning the correct genus for any position that involves arbitrary numbers of heaps (provided
no such heap is larger than size H).  It's just a lookup in the addition table after the appropriate (p,s) reductions have
taken place.

In passing from heap size H to size H+1, we're going to be dependent upon looking up partially completed analyses
to heap size H+1.  What do we mean by that?  Well, we mean that we're going to be wanting not just to look up
results about positions where no heap is bigger than H, but also information about positions that involve some H
terms also.  There are two basic operations in this algorithm:

    1) Compute an outcome
    2) Discover one-dimensional periodicity, collapse, and record it.
    
The input to step 1), computing an outcome, is always an explicit position, and the options are also explicit
positions.  But the process of looking up the outcome for an explicit position will in general first involve doing heap-period
reduction.  This should be done in the partially-completed case just as it is done in the normal case, as much as
possible.  

We might be able to do this in the partial case by carrying along associations between "match vectors" and partial addition tables
of genus values [ie (p,s)-values]. Every match vector has length H+1. The match vector is intended to summarize the
results of already completed genus calculations along 0, 1, or more dimensions simultaneously.  If a match vector has 0 entries
equal to -1, then it simply records the genus of an explicit position, and the associated addition table is empty.  If a match
vector has exactly 1 value equal to -1, say, at position h, then it indicates that its associated addition table records the result of any
position whose heaps exactly match the numbers in the non-minus-1 positions, but are allowed to have ANY number
of heaps of size h.  The addition table is just a list of p+s genus values in that case.

If a match vector has two values equal to -1 at positions h_1 and h_2, then an arbitrary number of heaps of sizes h_1 and
h_2 can be handled, again provided that the remaining heap sizes exactly correspond to the match vector in other positions.  In this
case, the addition table is a two dimensional object.

In this formulation, a "completed" analysis to heap size H+1 corresponds to the construction of a match vector that has all -1 entries.
The associated addition table is then exactly the object that we've previously called a "partial analysis to heap size H", but now it is
to heap size H+1.

So what's the algorithm?

First compute (H+1) + 1, (H+1)+1+1, ....   until periodic.  Record the match vector (-1,0,0,0,....,1) (the last '1' in the H+1 position),
the (linear) addition table.  

Q: How to tell when periodic?
A: Note that all these positions have the exact same number of options.  When there is a repeat in arguments, there is
necessarily a repeat in values.   Suppose we compute a term in this sequence and we find it repeats an earlier term.

A possibly more useful way to think about this:

    There are the same # of options at each step, and each one is of two types.  
    
        a) If it moves in the H component, it's an immediate lookup from an addition table we've
        already computed.  This lookup may or may not be in the 'periodic' portion of the table;
        but we probably will want to know which is the case (periodic lookup, or nonperiodic)
        
        b) If it moves in the non-H component, it's an immediate lookup from an earlier term
        in the series we're currently computing.  This earlier value may or may not be a repeat
        of an eariler value we've computed.  Again, we're probably going to be interested in
        whether or not the lookup was in the periodic portion or not.
        
        c) Finally, we're interested in whether the actual value computed is a repeat of an earlier value.

Regardless of whether the values are repeating, we should be able to stare at the options and our existing addition
tables to determine how many values we need to compute to be guaranteed of the periodicity.  Put another way,
suppose we _observe_ periodicity p, where what we mean by that is we find a full cycle of repeating values p.
This is something we can do after all, just keep track of repeating values.  If a value repeats "inside" the repeating
portion of the addition table, we need to keep calculating.  What we need is a full set of periodic values "inside"
the repeating block, whose "boundaries" we can determine UP FRONT, before any values are computed.

Notes added 20 March 2003

It's simpler than that.  It's like this:

    There's a fixed number of moves that's unchanging with each term in the sequence.
    They are of two types:   those that move in H, and those that don't.
    
    Think about how the form of a given move changes as you look at each new term in the sequence.
    
        if it's a move in H, say h->uv, then the elements of the sequence will look like
        
            u v
            u v 1
            u v 1 1
            u v 1 1 1  etc
            
        if it's not a move in H, the elements will look like
        
            H
            H 1
            H 1 1
            H 1 1 1
            
The first class of positions will be periodic by assumption, and we will be able to give numbers (p,s) such that if you
are looking at the ith term of the sequence and i>=p+s, then you can get the correct answer by reducing i modulo p
until it is first less than or equal to p+s.  And if we're given a bunch of such position sequences, each with their own
(p,s) values, then we can compute their "combined" (p,s)-value so that *all* of them will be simultaneously periodic
from that point onward.

So the algorithm is this: "Algorithm ALPHA"

    Before calculating terms, preprocess the H-moves to determine the "combined" guaranteed (p,s) repeat value.
    This will involve taking maximums on the s values and lcm's of the p's presumably.
    
    With the 'global' (p,s) for H-moves predetermined, start calculating terms, looking for matches.  Each match
    sets up a 'pseudo-period' p.  If a pseudo-period fills itself out to an actual period of length p, AND it continues
    into the combined region (roughly speaking---check boundary conditions here----) , you're done, you've found the
    period.
    
    Finally, with the periodic genus sequence in hand, truncate it back to the 'smallest' (p,s) that works.
    
This algorithm is not quite baked, but it's close.  

It is similar in structure to the genus calculation algorithm itself.

If we were a smarter person, we would be able to codify this algorithm recursively from the ground up.
In other words, we would be able to write the primitive <1,2>-genus algorithm in terms of the <1> genus,
and algorithm ALPHA in terms of the <1,2> algorithm, and not only that, it would always be the same algorithm.
This would be the elegant way to go.

What we're going to do instead is develop the data structures and some simple operation on them to get started

An oracle is a data structure that captures a partial analysis to some heap size.  

An oracle includes information about which games are adders.  For the non adders, it includes information about
the standalone heap period.

oracle :- {octalcode, heapsize, {heapinfo, additiontable}}
heapsize :- Either:
          {x} -  the analysis is partial to heap size x
          {P,S} - the analysis is complete and is periodic starting at heap size S with period P.
          In the complete case, the
heapinfo :- {h1info, h2info, .... , hninfo}  (* heapsize entries, either x or S+P of them *)
hXinfo :-   {x}  or {p, s}.

        If {x}, heap size X is (or can be treated as) an adder of size x
        If {p, s}, heap size X can be treated as periodic of length p, starting at heap size s, when played with
        other heaps

additiontable  :-  an associative array of values that look like

            { {0,0,0,1,0,0,0,2}, {1, {1,4,2,0}} }

That particular entry means that 1 heap of size 4  together with 2 copies of the heap of size 8 has genus 1^1420.

More precisely, each entry in the additiontable is a pair {x,y}, where

    x is a vector that is always zero at heapsizes known to be equivalent to adders
    y is a conventional genus value
    
How many entries are in the addition table?  We need to tabulate values y for each choice of x-components
where the individual x-components have p+s+1 choices of values 0,1,2, p+s.  These values (p+s+1) have to
be multiplied together to obtain size of the complete addition table.

That's just the oracle data structure.  What can we do with it?

Probably the first thing we'll want to do with an oracle is an independent partial verification (IPV)

In an IPV, we verify that the genus values returned by an oracle correspond to independent calculations of
'representative' positions.  Since an oracle specifies the genus of an infinite number of positions, an IPV
can't completely verify that it is correct.  But presumably an IPV might be useful to try to find errors in the oracle.

An IPV could work by first considering positions that have at most one token, then at most two, then three, etc.
Building it up this way will lead to a maximum number of cache hits, and it's something that we can incrementally
extend, and it's also presumably easy to program.

We have two functions that are available to compute genera.  

directGenus[{0,4,5},{0,0,2,0,0,0,1}]   --- ie vector weight representation of G in the second arg
genusPosition[{0,4,5},{3,3,7}]  -- ie explicit heap representation in second arg.

In their implementation, genusPosition converts the explicit heap representation to the vector weight representation and calls
directGenus.

For IPV, we need a stepping mechanism so that we can generate all the partitions.  

The DiscreteMath`Combinatorica` package has a function that enumerates the partitions of n

Partitions[5] returns

{{5},{4,1},{3,2},{3,1,1},{2,2,1},{2,1,1,1},{1,1,1,1,1}}

Having such a convenient function for partition generation, we'll use the genusPosition representation in IPV.

There are
627 partitions of 20
1958 partitions of 25
5604 partitions of 30
14883 partitions of 35
37338 partitions of 40

Let's create the oracle for Kayles {0,7,7} to heap size 1 as an example

{{0,7,7}, {1}, {{1}}, {}}

{octal code, heapsize = 1, heapsize 1 is :1, empty addition table since all heaps are adders}

For a more complicated example, how about .351, a game we solved (we think). We've written it up in our 'autobook' [Graphics:../Images/index_gr_148.gif] paper

{{0,3,5,1},
  {{8,4}},
  {{1},{2},{1},{2},{4},{4,2},{4},{2},{3,1},{2},{3,1}},
  { - additionTable with 4 * 3 * 3 = 36 entries-}
  }

But this representation is inefficient because the heaps of size 9 and 11 can be treated as identical (they are both the formal
symbol B).  We need to come up with a more succinct representation of the addition table in order to handle this.

The heaplist data structure should probably be changed.  It's clumsy because we already have a natural representation for
adders and special games, anyway.  We should use integers for things that really are integers [(p,s) values, for example,]
and use game symbols for things that are games.  An oracle will be more transparent when we treat it that way.

4 April 2003
Here's another improvement to the oracle format.  Why does the oracle need to include the addition table at all?  It doesn't really.  It's enough
just to specify the adders and the (p,s) information for the other positions.  The actual addition table can be recovered
by direct calculation when it's needed.  We do need to keep track of non-adders that can be treated identically, however.
We might consider introducing a new class of formal game symbols g[symbol,p,s] in oracles.  As much as is possible, we'd like
the 'symbols' to be conventional games.  However this isn't really possible, because these symbols are exactly where we cross
the line separating 'real' game values such as n[1] or a4[2] from 'pretended' games A, B, C, etc.  For example, a 'real' game has
well defined options and is always the same mathematical object wherever it appears.  Pretension games have neither well-defined
options, nor are they invariant across analyses.  So it would be a mistake to try to fold them into our formal game notation system.

We should probably use a symbol that won't conflict globally with other values that might be hanging around.

Maybe something that embeds a formal symbol,  octal code game identifier [maybe], and (p,s) information: like P[A, {0,3,5,1}, {4,2}].

I think we'll drop the octal code identifier inside the P[] expressions since it's not going to be useful probably.

These improvements let us write down an oracle with less fuss.

NEW ORACLE FORMAT

oracle :- {octalcode, depthinfo, heaplist}
depthinfo :- either {N}  --- indicating that the oracle is PARTIAL to depth size N
                 or {P,S}  --- indicating that the oracle is COMPLETE with global period P starting at heap size S.
heapinfo :- a list of either N or P+S entries, {g_1, g_2, ... }.
          each entry is either a nim heap n[1],n[2],n[3] or adder a[j] or a formal symbol identifier with period P[A,{p,s}].
         
This oracle format is sufficient to express Allemang-style game solutions as well as his partial results.

If an oracle is in hand, we'll want to do IPVs on it [IPV = independent partial verification].  

A well-definedness check on a oracle makes sure that the oracle returns the same genus value for different
positions that it asserts are identical.  For example, if two heap sizes h, t are asserted to be P[A,{4,2}], say, then
we'll want to know that sums involving h terms can be freely replaced by t terms without changing the resulting
genera.  We probably need an adder check too, of some kind.  How should this work, and is it really a different
process from IPV?

I don't think it is a different process than IPV.   What we should do is

    1) For positions asserted to be adders, confirm that they have the correct single heap genera to their ostensible period [NO DON'T NEED THIS]
    2) For positions asserted to P[X,{p,s}], confirm that they have periodic genera. [PROBABLY DON'T NEED THIS EITHER]
    
For oracle outcome calculation:  ALWAYS USE THE SMALLEST HEAP SIZE X when calculating results
involving multiple terms that look like P[X,{p,s}], for fixed X.  In this way, the oracle will always return a well-defined result.

IPV is where errors, if any, will be caught, because in IPV, we don't reduce to smallest heaps, rather, we directly compute outcomes.
So we have the property that the "oracle" always returns a well defined value, and if the oracle is wrong, we will catch it,
provided we carry out IPV far enough.

Backing up, the basic operations are:

    Given an oracle and explicit position P, return the genus that the oracle asserts for P.
    Given an oracle and explicit position, compare the independently calculated genus of P with the value asserted by the oracle.
    Given a partial oracle to depth d, partially verify it by considering partitions with no part greater than d.

[Graphics:../Images/index_gr_149.gif]

This genusByOracle function presumes a global movelist is available (or generated if not) under the control of logic living in directGenus

here's the oracle format:

oracle :- {octalcode, depthinfo, heaplist}
depthinfo :- either {N}  --- indicating that the oracle is PARTIAL to depth size N
                 or {P,S}  --- indicating that the oracle is COMPLETE with global period P starting at heap size S.
heapinfo :- a list of either N or P+S entries, {g_1, g_2, ... }.
          each entry is either a nim heap n[1],n[2],n[3] or adder a[j] or a formal symbol identifier with period and minimal heap indicator i,  P[A,{p,s}, i].
          

The 'position' should be a list of explict heap sizes.      

The genusShift helper function implements the method at the top of page 411 in WW---given a genus symbol and adder, it returns the
genus of their sum by doing simple nim sums and shifts.

We'll also use a little helper function that we wrote before, lVal[genus, n] that extrapolates a misere genus sequence list to large values if necessary, based on the period two property.
The argument n needs to be 1, 2, 3, etc; it's the desired index sought.

The input genus to genusShift is assumed to be a genus symbol in already normalized form (ie chopLogic has already been applied).
The 'adder' must be one of n[0], n[1],n[2],n[3],a[4],a[5],a[6], .... a[k],  ....     


Converted by Mathematica      August 22, 2003