Generalized Fishy Cycles

From Sudopedia, the Free Sudoku Reference Guide
Jump to navigationJump to search

(sorry for the dollar signs, I haven't found yet how to do LaTeX on this wiki)

Preliminaries

In an incomplete sudoku, we construct a graph as follows. Take a fixed symbol $s$ (in a typical sudoku, $s\in\{1,\ldots,9\}$) which can still be placed somewhere on the board. As vertices, we take all positions in the sudoku where the symbol $s$ can still be placed. Two vertices are connected iff their corresponding positions belong to the same row, column or zone.

Definition 1.1. We call an edge `strong' if the two connected positions are the only ones in their respective row, column or zone. We call an edge `weak' if the two connected positions are not the only ones in their respective row, column or zone.

Remark 1.2. Note that this is not necessarily well-defined, since they can be the only ones in their row/column but not in their zone, or vice versa. In this case, we can immediately remove all other options for symbol $s$ in that zone or row/column, and consider the link as strong.

Remark 1.3. The choice for these names can be explained as follows:

  • If a link is strong, the symbol $s$ will end up on precisely one of the positions corresponding to the vertices of the strong link.
  • If a link is weak, the symbol $s$ may end up on one of the positions corresponding to the vertices of the weak link.

We will mark the vertices where the $s$ is placed with $1$ and the others with $0$, hence a strong link will have $(0,1)$ or $(1,0)$, while a weak link can have $(0,0)$, $(0,1)$ or $(1,0)$. This marking function is denoted $f:V\to\{0,1\}$, the set of all marking functions is denoted $M$.

Notation 1.4. We usally denote a strong link between vertices $v$ and $w$ as $v=w$, and a weak link as $v-w$. For example, a $3$-cycle with two strong links and one weak link can then be written as $v_1-v_2=v_3=v_1$. If we want to denote a link without specifying wether it is weak or strong, we denote $v\sim w$.

Reducing the cycles

Now we will study the cycles in this graph. In a common sudoku, there are many such cycles (hundreds, even thousands in some cases), but only a few turn out to be useful (i.e. give information). We will study in detail which cycles contribute information, and how a human (or computer) can look specifically for these cycles.

But first we have to define information.

Definition 2.1. We obtain information about an edge if \begin{itemize} \item either the edge is strong, and at least one of the markings $(0,1)$ or $(1,0)$ is not possible, \item or the edge is weak, and at least one of the markings $(0,0)$, $(0,1)$ or $(1,0)$ is not possible. \end{itemize} We obtain information from the chain if we obtain information about any of its edges.

Now we can begin our classification, but we start with an important remark which will greatly reduce the number of cases to be handled.

Remark 2.2. A cycle of the form $v_1\sim \cdots \sim v_i=v_{i+1}=v_{i+2}\sim \cdots \sim v_1$ gives information if and only if the shorter cycle $v_1\sim \cdots \sim v_i\sim v_{i+3}\sim \cdots \sim v_1$ does. In other words, we may remove any two consecutive strong links by merging the three respective vertices to one big vertex.

We define the formal term rewriting system $\Phi$ as the system applying the above shortening on the set of (weak-strong) cycles. To find out wether a cycle provides information, it is sufficient to study wether its normal form (with respect to $\Phi$) provides information. Hence, we only need to study cycles in which there are no two consecutive strong links. Now classify these cycles and the information they provide, depening on the parity of the length of the cycle (an invariant in the term rewriting process) and the maximum number of consecutive weak links.

  • If the cycle has odd length:
    • If the cycle has three (or more) consecutive weak links: then no information is available. For every weak link we can attach $(0,0)$ to that weak link and alternate $(0,1)$ and $(1,0)$ over all other edges. As there are three or more consecutive weak links, we can apply this construction to one of these weak links as $(0,0)$ and rotate everything. We can even do this twice, so all links will in the end have had both $(1,0)$ and $(0,1)$ in a valid setup, and all weak links can have $(0,0)$, which gives by definition no information.
    • If the cycle has at least two pairs of two consecutive weak links: then no information is available. Again for every weak link we can attach $(0,0)$ to that link and alternatingly $(0,1)$ and $(1,0)$ to all other links. As there are two pairs of two consecutive links we can make all links have $(1,0)$ and $(0,1)$ in a valid setup, and all weak links can have $(0,0)$, which gives by definition no information.
    • If there is exactly one pair of two consecutive weak links, but no triple of consecutive weak links: there we have information. It is impossible to obtain $0-1-0$ over this pair, since the only way to complete this to a full cycle would be an alternation of $=1-0=1-0...1=0-1=$, but for this the cycle's length has the wrong parity. Since $0-1-1$ and $1-1-0$ are de facto impossible, we can exclude $(0,1)$ and $(1,0)$ respectively on the edges in the pair of consecutive weak links.
    • If there is a weak link, but no pair of two consecutive weak links: the only possibility for this case in a cycle of odd length is when there is one vertex, connected to itself by a weak link (as a reduction case of $v_1-v_2=v_3=v_1$, for example), and in that case the only option is $(0,0)$ for that weak link.
    • If there are no weak links: then the only option is one vertex connected to itself by a strong link. This yields a contradiction, i.e. your sudoku is unsolvable if this happens.
  • If the cycle has even length:
    • If the cycle has three (or more) consecutive weak links: then no information is avalable. Alternating $(1,0)$ and $(0,1)$, and rotating it one position afterwards, we can mark all edges with $(0,1)$ and $(1,0)$ in a legal way. Assigning $0-0-0-1$ or $1-0-0-0$ to the triple of weak links and filling the rest up by alternating $(0,1)$ and $(1,0)$ accordingly, we eliminate any information on these edges. For any other weak link we can assign $0-0$ to it and alternate $(0,1)$ and $(1,0)$ accordingly, to end with a second $(0,0)$ in (one of) the leftover place(s) in the triple of consecutive weak links. Hence there is by definition no information.
    • If the cycle has a pair of consecutive weak links, but no triple: then no information is available. Because the cycle has even length, we either have that the length of the cycle is two (in which case there is obviously no information), or we automatically need a second pair of consecutive weak links. Again alternating $(1,0)$ and $(0,1)$, and rotating it one position afterwards, we can mark all edges with $(0,1)$ and $(1,0)$ in a legal way. Any weak link can be assigned $(0,0)$ and we can alternate $(0,1)$ and $(1,0)$ accordingly, to end with a second $(0,0)$ in one of the pairs of weak links. As there are at least two such pairs, this is always possible. Hence we can get no information in this case.
    • If the cycle has a weak link, but no pair of consecutive weak links: then we do find information. The only way this can happen is by an alternation of strong and weak links, and the only solution consists of alternation $(0,1)$ and $(1,0)$. For all weak links, we find that $(0,0)$ is impossible for that link.
    • If there are no weak links: then no information is available. Note that we are here talking about the empty cycle, coming from reducing $v_1=v_2=v_1$.

Conclusions on unreduced weak-strong cycles

Translating our information back to the unreduced model is simple: if we have a conclusion on a vertex (i.e. all possible markings on one of its incident edges have either $0$ or $1$ on that vertex) and that vertex gets expanded to a pair of strong links, then that expansion must be either from $0$ to $0=1=0$ or from $1$ to $1=0=1$.

The reduced forms from the previous section are handy for the classification and its drawings. However, for some human players it may slow them down slightly if they can't do the reduction in their mind.

How do we avoid using the reduction? Well, instead of looking at consecutive weak links, one can look at the parity of the number of strong links between two weak links. Two weak links are consecutive in the reduced cycle if and only if there is an even number of strong links between them in the original cycle. With this information, one can easily do the exercice of writing up the classification without reduction now.

Finding the useful cycles and translating our conclusions back to the sudoku

There are often many, many cycles in this graph. However, only a few usually give information, so it's crucial to spot the difference quickly. An algorithm to find these cycles with minimal effort could be the following:

For every vertex $v_1$, manually track all cycles starting and ending in $v_1$, but stop the current cycle as soon as one of the following happens:

  • you get two consecutive weak links, in a step different from the last one (i.e. $(v_n,v_1)$ can be weak-weak),
  • you get a weak link, after an even number of strong links since the last weak link, and that new weak link does not end in $v_1$,

hereby we note that the second stop condition is just the unreduced extension of the first, you can also reduce the cycles in your head while tracing them, and only use the first stop condition.

In this way, we skip all cycles that do not give extra information: in its reduced form, such a cycle allows at most one pair of two consecutive weak links, with $v_1$ the middle vertex. This removes most of the information-free cycles (in fact, all of them, except the trivial case where all links are strong), and none of the cycles that can contribute information. This algorithm is easy to implement on a computer, but with some routine it can also be easily done by a human.

Finally, translating our information back to the sudoku is easy:

  • If a vertex is assigned $1$ (i.e. at least one of it's incident vertices is assigned only markings that have a $1$ on that vertex) then we place symbol $s$ on the position corresponding to that vertex.
  • If a vertex is assigned $0$ (i.e. at least one of it's incident vertices is assigned only markings that have a $0$ on that vertex) then we know that symbol $s$ can no longer be placed on the position corresponding to that vertex.
  • If an weak link could not be assigned $(0,0)$, then we know that $s$ has to be placed on one of the two corresponding positions, so we know that $s$ can no longer be placed on any other position on the row/column/zone corresponding to that weak link.

Final remarks

I will end with a short word on the techniques which are generalized by this one. Technically, it also generalizes the trivial techniques such as Full House, Last Digit, Hidden Single and Naked Single. However, these are usually spotted on sight so you will never get any benefit from them being generalized by this cycle procedure.

More interestingly, it generalizes several of the currently known single-digit techniques, including several Fishes and Single Digit Patterns, and some chain techniques such as Fishy Cycles. While harder to apply than most of these individual techniques, I find it is (once used to it) certainly less time-consuming than applying each trick separatedly.