MINISTRY OF

EDUCATION AND TRAINING

VIETNAM ACADEMY OF

SCIENCE AND TECHNOLOGY

GRADUATE UNIVERSITY

OF SCIENCE AND TECHNOLOGY

Nguyễn Thị Nga

SOME MATHEMATICAL ISSUES

BEHIND SUDOKU PUZZLES

MASTER THESIS IN MATHEMATICS

Hanoi - 2018

Confirmation

This thesis was written on the basis of my research work carried out at the Institute of Mathematics, Vietnam Academy of Science and Technology, under the supervision of Dr. Le Xuan Thanh. All results of other authors that are used in this thesis are cited correctly.

September 12, 2018

The author

Nguyen Thi Nga

Acknowledgements

This thesis was conducted and completed at the Institute of Mathematics, under the guidance of Dr. Le Xuan Thanh. On this occasion, I would like to express my gratitude and deep respect to Dr. Le Xuan Thanh, the exemplary teacher who has spent a lot of time and effort guiding me through this thesis. Thanks to his conscientious guidance, my research skills have grown. He introduced me to many seminars and conferences, and helped me a lot in broadening my knowledge. His clear and careful manner has had a significant influence on how I learn, do research, and write scientific documents. He was the one who instilled in me a passion for Applied Mathematics. He is not only a master in advising students but also a very warm person in daily life. I always receive sincere and effective advice from him on professional issues, as well as on my future professional orientation.

I sincerely thank the Center for Postgraduate Training and the Department of Numerical Analysis and Scientific Computing of the Institute of Mathematics, Vietnam Academy of Science and Technology, for creating favorable conditions for me to complete this thesis.

I express my gratitude to everyone from the Institute of Mathematics who devoted themselves to teaching and creating favorable conditions for me to complete my master's course.

I would also like to thank my friends for their companionship and help. I would like to express my heartfelt gratitude to my family for their understanding, patience, and support during my time at the Institute of Mathematics.


Contents

1 INTRODUCTION

2 SUDOKU ENUMERATION
2.1 Completing blocks B1 − B3
2.2 A simple heuristic enumeration method
2.3 An exact enumeration method
2.3.1 Lexicographical catalogue
2.3.2 Enumerating from catalogue
2.4 Conclusions

3 IP FORMULATIONS FOR SUDOKU
3.1 A binary linear programming formulation
3.2 An integer programming formulation
3.3 Conclusions

4 NUMERICAL EXPERIMENTS
4.1 Modeling by ZIMPL
4.1.1 ZIMPL model for (BLP)
4.1.2 ZIMPL model for (NLIP)
4.2 Numerical experiments
4.3 Conclusions

5 CONCLUSIONS

Chapter 1

INTRODUCTION

Sudoku is a famous puzzle, which was originally called Number Place. In Japanese, Sudoku means “single number”. In its well-known format, a Sudoku puzzle can be described as follows. We are given a grid consisting of 9 × 9 squares (cells), which can also be viewed as a composition of nine 3 × 3 blocks (so-called subgrids). The puzzle rules are:

(1) each cell of the grid must be filled with exactly one digit;

(2) only digits from 1 to 9 can be filled in the grid;

(3) each digit appears exactly once in each column, each row, and each 3 × 3 subgrid.

Usually, a Sudoku puzzle is provided with a partially completed grid, and the objective is to completely fill the grid. A well-posed Sudoku puzzle, also called a proper Sudoku, has exactly one solution. Figure 1.1 gives an example of a standard 9 × 9 Sudoku puzzle and a solution for the puzzle.

[Figure: a partially filled 9 × 9 grid and the corresponding completed grid]

Figure 1.1: A Sudoku puzzle (left) and a solution for it (right).
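Rules (1)-(3) can be checked mechanically. The following is a minimal sketch in C, not taken from the thesis; the function name `is_valid_sudoku` and the bitmask encoding are assumptions made for illustration. It tests whether a completely filled grid satisfies the three rules.

```c
#include <stdbool.h>

/* Returns true if g is a completely filled grid obeying rules (1)-(3).
   Hypothetical helper, written for illustration only. */
static bool is_valid_sudoku(int g[9][9]) {
    for (int i = 0; i < 9; i++) {
        /* Bit d of each mask records that digit d has been seen. */
        unsigned row = 0, col = 0, blk = 0;
        for (int j = 0; j < 9; j++) {
            int r = g[i][j];                                      /* j-th cell of row i    */
            int c = g[j][i];                                      /* j-th cell of column i */
            int b = g[3 * (i / 3) + j / 3][3 * (i % 3) + j % 3];  /* j-th cell of block i  */
            if (r < 1 || r > 9 || c < 1 || c > 9 || b < 1 || b > 9)
                return false;                                     /* rules (1) and (2)     */
            row |= 1u << r;
            col |= 1u << c;
            blk |= 1u << b;
        }
        /* Nine cells showing all nine digits means bits 1..9 are set (0x3FE),
           which is rule (3) for row i, column i and block i. */
        if (row != 0x3FEu || col != 0x3FEu || blk != 0x3FEu)
            return false;
    }
    return true;
}
```

Since every row, column and block has exactly nine cells, seeing all nine digits is the same as seeing each digit exactly once, so no separate duplicate check is needed.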

Howard Garns (March 2, 1905 - October 6, 1989), an American architect, is generally considered the father of modern Sudoku. He published the earliest known examples of modern Sudoku in Dell Magazines in 1979. In April 1984, the puzzle was introduced in the Japanese paper Monthly Nikolist. The name of the puzzle in Japanese was “Sūji wa dokushin ni kagiru” (translated “the digits must be single”), and it was afterward abbreviated to Sudoku by Maki Kaji, the president of Nikoli Co. Ltd., a Japanese puzzle manufacturer. Wayne Gould (born July 3, 1945 in Hawera, New Zealand) devised a computer program to rapidly produce Sudoku puzzles. Thanks to his efforts, Sudoku successfully appeared in a local US newspaper. Then, in November 2004, The Times of London began featuring Sudoku. The puzzle rapidly spread to other newspapers as a regular feature.

Despite the simple rules, there are many non-trivial matters and challenging issues behind Sudoku puzzles. These make Sudoku not only popular in daily life but also attractive to many mathematicians. In [6], the authors proved the non-trivial result that the smallest number of clues in a proper 9 × 9 Sudoku puzzle is 17. Concerning complexity, it was shown in [8] that solving Sudoku puzzles of general size $n^2 \times n^2$ (with n ≥ 3) is NP-complete. There are also many variants of Sudoku with different sizes and/or additional constraints (see https://en.wikipedia.org/wiki/Glossary_of_Sudoku for more details). However, in this thesis we only consider Sudoku puzzles of the well-known form described above.

In addition to the theoretical issues, devising efficient mathematical models and finding fast and accurate algorithms for solving Sudoku puzzles are also attractive topics. Many scheduling and timetabling problems have constraints similar to those of Sudoku. So Sudoku is not only a typical problem in many mathematical specialities (such as combinatorics, complexity, combinatorial optimization, ...) but also an important example in mathematical programming (in the sense of both modelling and numerical solution).

The goal of this thesis is to study some mathematical issues behind Sudoku puzzles. We focus mainly on the problem of counting the number of Sudoku grids, and on how to model Sudoku puzzles as mathematical programs. Apart from this chapter, this thesis is organized in 4 more chapters. Chapter 2 is devoted to the problem of enumerating Sudoku grids. In Chapter 3 we present two integer programming models for solving Sudoku puzzles. Chapter 4 presents some numerical experiments with the mathematical models proposed in Chapter 3. Chapter 5 closes this thesis with some conclusions.

Chapter 2

SUDOKU ENUMERATION

The goal of this chapter is to compute the number of possible solutions for Sudoku, i.e., the number of completely filled 9 × 9 grids satisfying the Sudoku rules. We denote this number by $N_S$ for convenience. Surprisingly, no general combinatorial formula for computing that number is known so far. In this chapter, based on the ideas in [2, 4], we give a complete enumeration method for computing $N_S$. For the sake of the enumeration method, we label the blocks of a Sudoku grid by B1 to B9 as shown in Figure 2.1.

B1 B2 B3
B4 B5 B6
B7 B8 B9

Figure 2.1: Labels of blocks in Sudoku grid.

We first start with the number of ways to complete the first three blocks B1 − B3. Section 2.1 shows how to compute that number. This gives the starting point for a simple heuristic method that computes the value of $N_S$ approximately. This method is presented in Section 2.2. Then in Section 2.3 we present an exact method to compute the number $N_S$.

2.1 Completing blocks B1 − B3

We say that a Sudoku grid is of canonical form if the top-left block

B1 is filled as in Figure 2.2.

1 2 3
4 5 6
7 8 9

Figure 2.2: Canonical block.

Let $S_9$ be the group of permutations on the set of nine digits {1, ..., 9}. We say that two completely filled Sudoku grids $G_1$ and $G_2$ are equivalent if $G_2$ is obtained from $G_1$ by applying a permutation $f \in S_9$ to its digits, or shortly speaking, $G_2 = f(G_1)$. Note that every Sudoku grid is equivalent to exactly one canonical Sudoku grid, via the permutation in $S_9$ that maps the digits of its block B1 to the canonical block. Conversely, the $|S_9| = 9!$ permutations map each canonical Sudoku grid to 9! pairwise different Sudoku grids. In other words, if we denote by $N_C$ the number of canonical Sudoku grids, then $N_S = N_C \times 9!$.
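The relabelling that brings a grid into canonical form can be sketched as follows. This is an illustrative C fragment, not part of the thesis code; the name `relabel_to_canonical` and the choice to store only the top band (the three rows containing B1, B2, B3) as a 3 × 9 array are assumptions made here.

```c
/* Relabel a top band (3 rows of 9 digits, covering blocks B1-B3) so that
   block B1 becomes the canonical block 1 2 3 / 4 5 6 / 7 8 9.
   Hypothetical helper, written for illustration only. */
static void relabel_to_canonical(int band[3][9]) {
    int f[10] = {0};                       /* f[d] = new label of digit d */
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            f[band[r][c]] = 3 * r + c + 1; /* cell (r, c) of B1 must become 3r + c + 1 */
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 9; c++)
            band[r][c] = f[band[r][c]];    /* apply the same permutation everywhere */
}
```

The permutation f is read off from block B1 alone, which is why exactly one of the 9! relabellings of a grid is canonical. Applying the same map to all nine rows of a full grid works in the same way.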

Now we concentrate on canonical Sudoku grids. In the following, {a, b, c} indicates the elements a, b, c in any order, and (a, b, c) indicates them in the given order. Since the top row of B1 is (1, 2, 3), there are two cases for the top row of block B2.

• Case 1 (Pure): It consists of the digits of either the second or the third row of B1, i.e., either {4, 5, 6} or {7, 8, 9}.

• Case 2 (Mixed): It consists of three digits mixed from {4, 5, 6} and {7, 8, 9}.

More precisely, all possible top rows of blocks B2 and B3 are shown in Table 2.1.

No.  Top row of B2  Top row of B3  Case
1    {4, 5, 6}      {7, 8, 9}      Pure
2    {7, 8, 9}      {4, 5, 6}      Pure
3    {4, 5, 7}      {6, 8, 9}      Mixed
4    {4, 5, 8}      {6, 7, 9}      Mixed
5    {4, 5, 9}      {6, 7, 8}      Mixed
6    {4, 6, 7}      {5, 8, 9}      Mixed
7    {4, 6, 8}      {5, 7, 9}      Mixed
8    {4, 6, 9}      {5, 7, 8}      Mixed
9    {5, 6, 7}      {4, 8, 9}      Mixed
10   {5, 6, 8}      {4, 7, 9}      Mixed
11   {5, 6, 9}      {4, 7, 8}      Mixed
12   {6, 8, 9}      {4, 5, 7}      Mixed
13   {6, 7, 9}      {4, 5, 8}      Mixed
14   {6, 7, 8}      {4, 5, 9}      Mixed
15   {5, 8, 9}      {4, 6, 7}      Mixed
16   {5, 7, 9}      {4, 6, 8}      Mixed
17   {5, 7, 8}      {4, 6, 9}      Mixed
18   {4, 8, 9}      {5, 6, 7}      Mixed
19   {4, 7, 9}      {5, 6, 8}      Mixed
20   {4, 7, 8}      {5, 6, 9}      Mixed

Table 2.1: Possible top rows of blocks B2 and B3 .
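The split of Table 2.1 into 2 pure and 18 mixed cases can be reproduced by listing the 3-element subsets of {4, ..., 9}. The sketch below is an illustration only; the function name `count_top_rows` is an assumption, not something defined in the thesis.

```c
/* Count the possible top rows of B2 in a canonical grid: the 3-element
   subsets {a, b, c} of {4, ..., 9}. A subset is "pure" if it equals
   {4, 5, 6} or {7, 8, 9}, and "mixed" otherwise.
   Hypothetical helper, written for illustration only. */
static void count_top_rows(int *pure, int *mixed) {
    *pure = 0;
    *mixed = 0;
    for (int a = 4; a <= 9; a++)
        for (int b = a + 1; b <= 9; b++)
            for (int c = b + 1; c <= 9; c++) {
                /* Since a < b < c, the subset is pure exactly when all three
                   digits lie in 4..6 (c <= 6) or all lie in 7..9 (a >= 7). */
                if (c <= 6 || a >= 7)
                    (*pure)++;
                else
                    (*mixed)++;
            }
}
```

Once the top row of B2 is chosen, the top row of B3 is forced to be the remaining three digits of {4, ..., 9}, so the 20 subsets correspond one-to-one to the 20 rows of the table.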

In the first pure case (No. 1 in Table 2.1), blocks B2 and B3 can be completed together as in Figure 2.3. Since each set {a, b, c} corresponds to 3! = 6 permutations of its elements and there are 6 such rows in blocks B2 and B3, this leads to $6^6$ different configurations of B2, B3. The same result holds for the second pure case (No. 2 in Table 2.1).

1 2 3   {4, 5, 6}   {7, 8, 9}
4 5 6   {7, 8, 9}   {1, 2, 3}
7 8 9   {1, 2, 3}   {4, 5, 6}

Figure 2.3: Completing B1 − B3 in the first pure case.

For the first mixed case (No. 3 in Table 2.1), by the Sudoku rules, blocks B2 and B3 can be completed together as in Figure 2.4. Here, a, b, c stand for the digits 1, 2, 3 in some order. Each choice of a determines {b, c}, and there are 3 choices for a ∈ {1, 2, 3}. Again, each row of blocks B2 and B3 in Figure 2.4 corresponds to 3! = 6 permutations of its digits, and there are 6 such rows in these two blocks. Therefore, Figure 2.4 stands for $3 \times 6^6$ different configurations of B2, B3. The same result holds for the other mixed cases (No. 4-20 in Table 2.1).

1 2 3   {4, 5, 7}   {6, 8, 9}
4 5 6   {8, 9, a}   {7, b, c}
7 8 9   {6, b, c}   {4, 5, a}

Figure 2.4: Completing B1 − B3 in the first mixed case.

To summarize, we have 2 pure cases (each gives $6^6$ different configurations of blocks B2 − B3) and 18 mixed cases (each gives $3 \times 6^6$ different configurations of blocks B2 − B3). Therefore, in total, the number of different configurations of blocks B2 − B3 in canonical Sudoku grids is

$$2 \times 6^6 + 18 \times 3 \times 6^6 = 56 \times 6^6 = 2612736.$$

This means that the number of possibilities for the three blocks B1 − B3 (in which B1 is not necessarily in canonical form) is

$$9! \times 2612736 = 948109639680.$$
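The count 2612736 can also be cross-checked by brute force. The sketch below is not the thesis's method, which is the case analysis above. It fills the 18 cells of B2 and B3 one by one with a plain backtracking search, rejecting any digit already present in the same row of the band or in the same block. The function names are hypothetical.

```c
/* Count fillings of blocks B2 and B3 (cells 0..17, row-major over a 3 x 6
   strip) in which no digit repeats within a row of the band or a block.
   row_used[r] and blk_used[b] are bitmasks of the digits already present.
   Hypothetical helpers, written for illustration only. */
static long count_b2b3(int cell, unsigned row_used[3], unsigned blk_used[2]) {
    if (cell == 18)
        return 1;                       /* all 18 cells filled */
    int r = cell / 6;                   /* row within the band */
    int b = (cell % 6) / 3;             /* 0 for B2, 1 for B3  */
    long total = 0;
    for (int d = 1; d <= 9; d++) {
        unsigned bit = 1u << d;
        if ((row_used[r] | blk_used[b]) & bit)
            continue;                   /* digit already in this row or block */
        row_used[r] |= bit;
        blk_used[b] |= bit;
        total += count_b2b3(cell + 1, row_used, blk_used);
        row_used[r] &= ~bit;            /* undo the choice */
        blk_used[b] &= ~bit;
    }
    return total;
}

static long count_canonical_b2b3(void) {
    /* Row r of the canonical B1 already holds the digits 3r+1, 3r+2, 3r+3. */
    unsigned row_used[3] = { 0x00Eu, 0x070u, 0x380u };
    unsigned blk_used[2] = { 0u, 0u };
    return count_b2b3(0, row_used, blk_used);
}
```

Multiplying the result by 9! = 362880 in a 64-bit integer type gives the number of possibilities for B1 − B3 when B1 is arbitrary.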

2.2 A simple heuristic enumeration method

For convenience, we recall the rules of Sudoku puzzles:

(i) each digit from 1 to 9 appears exactly once in each block B1 − B9;

(ii) each digit from 1 to 9 appears exactly once in each row of the Sudoku grid;

(iii) each digit from 1 to 9 appears exactly once in each column of the Sudoku grid.

Each block consists of 9 cells, therefore we have 9! ways to fill the digits 1, ..., 9 into each block. This results in $N_b = (9!)^9$ ways to fill in all blocks B1 to B9 satisfying rule (i).

We know from Section 2.1 that 948109639680 is the number of ways to fill in the three blocks B1 − B3 so that each block contains the digits 1, ..., 9 and each row also contains the digits 1, ..., 9. The same result holds for blocks B4 − B6 and for blocks B7 − B9. Therefore the number of ways to fill in all blocks B1 − B9 satisfying rules (i) and (ii) is $N_r = 948109639680^3$.

So, among the $N_b$ possibilities of filling in blocks B1 − B9 satisfying block rule (i), there are $N_r$ possibilities that also satisfy row rule (ii), which corresponds to a proportion of

$$p = \frac{N_r}{N_b} = \frac{948109639680^3}{(9!)^9}.$$

Similarly, among the $N_b$ possibilities of filling in blocks B1 − B9 satisfying block rule (i), there are $948109639680^3$ possibilities satisfying both block rule (i) and column rule (iii), which corresponds to the same proportion p.

A solution to Sudoku is just one of the $N_b$ grids satisfying rule (i) that has both the row property (ii) and the column property (iii). Assuming that the row and column properties are independent, the total number of solutions to Sudoku would be

$$
\begin{aligned}
N_b \times p^2 &= (9!)^9 \times \left( \frac{948109639680^3}{(9!)^9} \right)^2 \\
&= \frac{948109639680^6}{(9!)^9} \\
&= \frac{(9!)^6 \times 56^6 \times 6^{36}}{(9!)^9} \\
&= \frac{56^6 \times 6^{36}}{(9!)^3} \\
&= 6657084616885512582463.488 \\
&\approx 6.657 \times 10^{21}.
\end{aligned}
$$

In fact, the row property (ii) and the column property (iii) are not independent, so the number computed above cannot be the correct answer. It is not even an integer! However, after the computation in the next section, we will see that this number is really close to the exact answer (the difference is just about 0.2%).
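As a quick numerical check of the estimate above, the closed form 56^6 × 6^36 / (9!)^3 can be evaluated in double precision and compared with the exact count stated in Section 2.3. This is only a floating-point sketch, so it reproduces the leading digits but not the exact decimal expansion. The function names are hypothetical.

```c
/* base raised to a non-negative integer power, by repeated multiplication. */
static double power(double base, int n) {
    double result = 1.0;
    for (int i = 0; i < n; i++)
        result *= base;
    return result;
}

/* Heuristic estimate 56^6 * 6^36 / (9!)^3, in floating point. */
static double heuristic_estimate(void) {
    const double fact9 = 362880.0;                    /* 9! */
    return power(56.0, 6) * power(6.0, 36) / power(fact9, 3);
}

/* Relative gap between the estimate and the exact count N_S. */
static double relative_gap(void) {
    const double exact = 6670903752021072936960.0;    /* N_S from Section 2.3 */
    return (exact - heuristic_estimate()) / exact;
}
```

The relative gap comes out at roughly two parts in a thousand, in line with the 0.2% quoted above.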

2.3 An exact enumeration method

In this section, we will show that the exact number of Sudoku grids is

$$N_S = 6670903752021072936960 \approx 6.671 \times 10^{21}.$$

To get an impression of how big $N_S$ is, let us compute how much computer memory is needed to store all these Sudoku grids. Table 2.2 gives the binary representation of each digit from 1 to 9, which in turn gives the number of bits needed to store each of the digits in computer memory.

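Digit  Expression                                                   Binary representation  Number of bits
1      $1 \times 2^0$                                               $1_2$                  1
2      $1 \times 2^1 + 0 \times 2^0$                                $10_2$                 2
3      $1 \times 2^1 + 1 \times 2^0$                                $11_2$                 2
4      $1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0$                 $100_2$                3
5      $1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0$                 $101_2$                3
6      $1 \times 2^2 + 1 \times 2^1 + 0 \times 2^0$                 $110_2$                3
7      $1 \times 2^2 + 1 \times 2^1 + 1 \times 2^0$                 $111_2$                3
8      $1 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 0 \times 2^0$  $1000_2$               4
9      $1 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 1 \times 2^0$  $1001_2$               4

Table 2.2: Binary representations of digits from 1 to 9.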

Since each digit from 1 to 9 appears exactly 9 times in a Sudoku grid, it follows from Table 2.2 that the number of bits needed to encode a Sudoku grid is

$$9 \times (1 + 2 \times 2 + 3 \times 4 + 4 \times 2) = 225.$$

Following the units of data measurement given in Table 2.3, to store all Sudoku grids we need a computer memory space of

$$\frac{N_S \times 225}{8 \times 1024^5} = 1.66639 \times 10^8 \text{ (petabytes)}.$$

Measure      Symbol  Equivalence
1 byte       1 B     8 bits
1 kilobyte   1 KB    1024 B
1 megabyte   1 MB    1024 KB
1 gigabyte   1 GB    1024 MB
1 terabyte   1 TB    1024 GB
1 petabyte   1 PB    1024 TB

Table 2.3: Units in data measurement.
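The storage estimate can be retraced in a few lines of C. The sketch below is illustrative and its function names are assumptions. It recomputes the bit lengths of Table 2.2, the 225 bits per grid, and the resulting number of petabytes in double precision.

```c
/* Number of bits in the binary representation of a positive integer d. */
static int bit_length(unsigned d) {
    int n = 0;
    while (d > 0) {
        n++;
        d >>= 1;
    }
    return n;
}

/* Bits needed for one grid when every digit 1..9 occurs nine times. */
static int bits_per_grid(void) {
    int bits = 0;
    for (unsigned d = 1; d <= 9; d++)
        bits += 9 * bit_length(d);
    return bits;
}

/* Memory for all N_S grids, in petabytes (1 PB = 1024^5 bytes). */
static double storage_petabytes(void) {
    const double n_s = 6670903752021072936960.0;   /* N_S, as stated above */
    double bytes = n_s * bits_per_grid() / 8.0;
    for (int i = 0; i < 5; i++)
        bytes /= 1024.0;                           /* B -> KB -> MB -> GB -> TB -> PB */
    return bytes;
}
```

The 225-bit figure counts only the digit bits themselves. A fixed-width encoding with 4 bits per cell would use 81 × 4 = 324 bits per grid, which does not change the order of magnitude of the result.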

Now we discuss in detail the computation of $N_S$. Thanks to the discussion in Section 2.1, from now on we restrict our consideration to canonical Sudoku grids. We also know from Section 2.1 that there are 2612736 possible configurations of B2 − B3 in canonical Sudoku grids. We partition the configurations of B2 and B3 into classes as follows.

• Two configurations of B2 and B3 are in the same class if they have the same number of ways of completing to a full valid Sudoku grid.

We then look at operations that do not change the number of ways of completing to a full Sudoku grid. Partitioning the configurations of B2 and B3 into classes helps us to reduce the number of cases that we need to consider among the 2612736 possibilities for B2 − B3.

2.3.1 Lexicographical catalogue

We classify all 2612736 configurations of B2 − B3 as follows.

• Within B2 and within B3, we permute the columns so that the entries in the top row are in increasing order.

• Exchange B2 and B3 if necessary so that B2 is before B3 in lexicographical order with respect to their top rows.

Concerning the action in the second step, when we permute B1, B2 and B3 in a way that moves B1, the block in the top-left position changes, but we can relabel the digits to put it back into canonical form. In particular, if we exchange B2 and B3, then every way of completing B2 − B3 to a full grid gives a unique way of completing B3 − B2 to a full grid (just exchange B5 with B6, and B8 with B9 in the former completion).

Concerning the action in the first step, similarly to the second step, each way of completing a configuration of (B2, B3) to a full Sudoku grid gives a unique way of completing its column permutation to a full Sudoku grid (just perform the same permutation on the columns of the complete grid).

There are 3! ways to permute the columns in each of the blocks B2 and B3 in the first step, so $(3!)^2 = 36$ configurations share the same number of completions. In the second step, exchanging the two blocks B2 and B3 doubles the number of configurations with the same number of completions. Overall, this reduces the 2612736 possibilities mentioned above by a factor of 36 × 2 = 72. Thus, we only need to consider 2612736/72 = 36288 possibilities for our catalogue.
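The two normalization steps can be sketched in C as follows. This is an illustration under the assumption that B2 and B3 are stored together as a 3 × 6 array (columns 0-2 for B2, columns 3-5 for B3). The function names are hypothetical and the representation differs from the packed columns used in Code 1.

```c
/* Swap columns i and j of a 3 x 6 strip holding B2 (columns 0-2) and
   B3 (columns 3-5). Hypothetical helpers, written for illustration only. */
static void swap_columns(int s[3][6], int i, int j) {
    for (int r = 0; r < 3; r++) {
        int t = s[r][i];
        s[r][i] = s[r][j];
        s[r][j] = t;
    }
}

/* Step 1 for one block: sort its three columns (starting at column
   `first`) so that the top-row entries increase. */
static void sort_block_columns(int s[3][6], int first) {
    if (s[0][first] > s[0][first + 1]) swap_columns(s, first, first + 1);
    if (s[0][first + 1] > s[0][first + 2]) swap_columns(s, first + 1, first + 2);
    if (s[0][first] > s[0][first + 1]) swap_columns(s, first, first + 1);
}

/* Bring a configuration of B2-B3 into catalogue form. */
static void normalize_b2b3(int s[3][6]) {
    sort_block_columns(s, 0);            /* step 1 for B2 */
    sort_block_columns(s, 3);            /* step 1 for B3 */
    /* Step 2: the two top rows are sorted and share no digit, so comparing
       their first entries decides the lexicographical order. */
    if (s[0][0] > s[0][3])
        for (int c = 0; c < 3; c++)
            swap_columns(s, c, c + 3);   /* exchange B2 and B3 */
}
```

Each catalogue entry is the image under `normalize_b2b3` of exactly 72 configurations, which is the reduction factor derived above.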

2.3.2 Enumerating from catalogue

At this point, our catalogue consists of 36288 configurations of blocks B2 − B3 that are lexicographically ordered (as described in Section 2.3.1). Our framework for handling the catalogue is given in Algorithm 1.

Algorithm 1 Enumerating from catalogue
 1: Input: Lexicographically ordered catalogue of configurations of B2 − B3
 2: Initialization: Color all configurations in catalogue
 3:     numberOfClass ← 0
 4: while catalogue has some colored configuration do
 5:     Let α be the first colored configuration in catalogue
 6:     Find Cα := the class of all configurations in catalogue that are equivalent to α
 7:     Uncolor all configurations in Cα
 8:     numberOfClass ← numberOfClass + 1
 9:     Compute nα := the number of ways to complete α to a full Sudoku grid
10: end while

After executing Algorithm 1, we get a list C of configurations of B2 − B3, each of which is a representative of its equivalence class. It follows from our discussion that

$$\sum_{\alpha \in C} |C_\alpha| = 36288,$$

and the number of canonical Sudoku grids is computed by

$$N_C = 72 \sum_{\alpha \in C} n_\alpha |C_\alpha|. \qquad (2.1)$$

Computing $|C_\alpha|$ and $n_\alpha$ for each class representative α ∈ C are the key procedures in Algorithm 1, corresponding respectively to the steps in line 6 and line 9 of the algorithm pseudo-code. We now discuss these procedures in detail.
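The coloring loop of Algorithm 1 is independent of how equivalence is tested. The sketch below shows only this loop, with the equivalence test passed in as a callback. The names `count_classes` and `same_class_fn`, the bound `MAX_ITEMS`, and the toy relation `same_mod3` are all assumptions for illustration and do not appear in the thesis code.

```c
#define MAX_ITEMS 64   /* illustrative bound; callers must pass n <= MAX_ITEMS */

/* Caller-supplied test: do items i and j belong to the same class?
   It is assumed to be an equivalence relation. */
typedef int (*same_class_fn)(int i, int j);

/* Coloring loop: returns the number of classes among items 0..n-1 and
   stores in class_size[k] the size of the k-th class found. */
static int count_classes(int n, same_class_fn same, int class_size[]) {
    int coloured[MAX_ITEMS];
    int number_of_classes = 0;
    for (int i = 0; i < n; i++)
        coloured[i] = 1;                          /* line 2: colour everything   */
    for (int a = 0; a < n; a++) {
        if (!coloured[a])
            continue;                             /* already in an earlier class */
        int size = 0;
        for (int j = a; j < n; j++)               /* lines 6-7: find and uncolour the class of a */
            if (coloured[j] && same(a, j)) {
                coloured[j] = 0;
                size++;
            }
        class_size[number_of_classes++] = size;   /* line 8 */
    }
    return number_of_classes;
}

/* Toy equivalence, for demonstration only: same remainder modulo 3. */
static int same_mod3(int i, int j) {
    return i % 3 == j % 3;
}
```

For example, `count_classes(10, same_mod3, sizes)` finds the three residue classes of 0, ..., 9. In the actual enumeration the items would be the 36288 catalogue entries and the class sizes would play the role of $|C_\alpha|$ in (2.1).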

1. Finding the equivalence class of a given configuration in the catalogue

We note that there are four types of operations that can be applied to B1 − B3 while preserving the number of completions from these three blocks to a full grid:

• type 1: permuting the columns within each block B1, B2, B3,

• type 2: permuting the three blocks B1, B2, B3 as whole blocks,

• type 3: permuting the cells within each column of B1, B2, B3 (provided that the obtained configuration does not break the Sudoku rules),

• type 4: relabelling (i.e. applying a permutation f ∈ S9 to a configuration of B1 − B3).

The first two types are discussed in Section 2.3.1, while the operations of type 4 are already mentioned in Section 2.1. These four types of operations suggest the following exhaustive way to find the equivalence class $C_\alpha$ of a chosen configuration α in the catalogue.

Start from the chosen configuration α. Consider all configurations of B1 − B3 obtained from α by applying possible operations of types 1-3. Then for each obtained configuration, we apply an operation of type 4 in order to transform block B1 to canonical form. We furthermore apply operations of types 1-2 in order to transform blocks B2 − B3 to satisfy the lexicographical conditions mentioned in Section 2.3.1, i.e., the entries in the top rows of B2 and B3 are in increasing order and B2 is before B3 in lexicographical order with respect to their top rows. The configuration obtained at the end of these operations belongs to the equivalence class of α.

To better understand this process, let us consider some illustrative

examples.

Example 1. Assume that B1 − B3 are filled as in Figure 2.5. Since

block B1 is in canonical form, and blocks B2 − B3 are in the lexicographical order, this configuration belongs to our catalogue.

1 2 3 4 8 9 5 6 7

4 5 6 1 2 7 3 8 9

7 8 9 3 5 6 1 2 4

Figure 2.5: Configuration 1.


Permuting the first and the second rows of all the blocks, we obtain

the configuration in Figure 2.6.

4 5 6 1 2 7 3 8 9

1 2 3 4 8 9 5 6 7

7 8 9 3 5 6 1 2 4

Figure 2.6: Configuration 1 after permuting the first two rows.

In this configuration, B1 is not in the canonical form, so we need a

relabelling operation to transform it back to canonical form. Relabelling 1 ↔ 4, 2 ↔ 5, 3 ↔ 6 in all the three blocks B1 − B3 , we obtain

the configuration in Figure 2.7.

1 2 3 4 5 7 6 8 9

4 5 6 1 8 9 2 3 7

7 8 9 6 2 3 4 5 1

Figure 2.7: Configuration 1 after permuting the first two rows and

relabelling.

In the configuration in Figure 2.7, block B1 is in canonical form, and blocks B2 − B3 are in lexicographical order. Thus, this configuration belongs to our catalogue. Furthermore, since all operations we have applied are of types 3-4, the configuration in Figure 2.7 has the same number of completions to a full Sudoku grid as Configuration 1 in Figure 2.5. Therefore, these two configurations are in the same equivalence class.

Example 2. Starting from the configuration of B1 − B3 in Figure 2.5 and permuting the first two columns of B1, we get the configuration in Figure 2.8.

2 1 3 4 8 9 5 6 7

5 4 6 1 2 7 3 8 9

8 7 9 3 5 6 1 2 4

Figure 2.8: Configuration 1 after permuting the first two columns of

B1 .

In this configuration, block B1 is not in canonical form. Relabelling 1 ↔ 2, 4 ↔ 5, 7 ↔ 8 in all three blocks B1 − B3, we obtain the configuration in Figure 2.9.

1 2 3 5 7 9 4 6 8

4 5 6 2 1 8 3 7 9

7 8 9 3 4 6 2 1 5

Figure 2.9: Configuration 1 after permuting the first two columns of B1 and relabelling.

In this configuration, block B1 is in canonical form and the top rows of B2 and B3 are each in increasing order, but the top row (5, 7, 9) of B2 comes after the top row (4, 6, 8) of B3. Exchanging B2 and B3 (an operation of type 2) therefore yields a configuration in our catalogue, which also belongs to the equivalence class of the configuration in Figure 2.5.

Example 3. Assume that B1 − B3 are filled as in Figure 2.10. It is

obvious that this is one configuration in our catalogue.

1 2 3 4 5 8 6 7 9

4 5 6 1 7 9 2 3 8

7 8 9 2 3 6 1 4 5

Figure 2.10: Configuration 2.

Permuting the first two columns of B1 , we get the configuration in

Figure 2.11.


2 1 3 4 5 8 6 7 9

5 4 6 1 7 9 2 3 8

8 7 9 2 3 6 1 4 5

Figure 2.11: Configuration 2 after permuting the first two columns of

B1 .

By relabelling 1 ↔ 2, 4 ↔ 5, 7 ↔ 8 we transform B1 back to canonical

form. However, B2 and B3 are not in lexicographical order (see Figure

2.12).

1 2 3 5 4 7 6 8 9

4 5 6 2 8 9 1 3 7

7 8 9 1 3 6 2 5 4

Figure 2.12: Transform configuration in Figure 2.11 to canonical

form.

So we need to permute the first two columns of B2 to get a configuration whose B2 − B3 are in lexicographical order (see Figure 2.13).

1 2 3 4 5 7 6 8 9

4 5 6 8 2 9 1 3 7

7 8 9 3 1 6 2 5 4

Figure 2.13: A configuration obtained from Configuration 2.

By arguments similar to those in the previous examples, this configuration belongs to the equivalence class of Configuration 2.

Example 4. We again consider the configuration of B1 − B3 given in Figure 2.5. We notice the positions of 1 and 4 in B1 and B2. Exchanging the digits 1 and 4 within the first column and within the fourth column, we get the configuration in Figure 2.14.

4 2 3 1 8 9 5 6 7

1 5 6 4 2 7 3 8 9

7 8 9 3 5 6 1 2 4

Figure 2.14: Configuration 1 after permuting positions of 1 and 4 in

the first columns of B1 and B2 .

In the configuration in Figure 2.14, block B1 is not in canonical form. Hence, we relabel 1 ↔ 4 to transform B1 back to canonical form, and obtain the configuration in Figure 2.15.

1 2 3 4 8 9 5 6 7

4 5 6 1 2 7 3 8 9

7 8 9 3 5 6 4 2 1

Figure 2.15: Configuration obtained after relabelling 1 ↔ 4 in Figure

2.14.

The configuration in Figure 2.15 satisfies the two conditions for configurations in our catalogue (that is, block B1 is in canonical form, and blocks B2 − B3 are in lexicographical order). So the configurations in Figure 2.5 and Figure 2.15 are equivalent and belong to the same equivalence class.

2. Completing a given configuration of B1 − B3 to a full Sudoku grid

In Step 9 of Algorithm 1, we need to compute the number of ways to complete a given configuration α of blocks B1 − B3 to a full Sudoku grid. In principle, we can try all possible ways to fill the remaining blocks B4 − B9 and see which are Sudoku grids. However, we can speed up this process by insisting that the first columns of blocks B4 and B7 are lexicographically ordered. More precisely, by permuting the middle three rows of the grid, we can assume that the entries in the first column of B4 are in increasing order, and similarly for B7. We can also exchange the middle three rows and the last three rows of the grid so that B4 is before B7 in lexicographical order with respect to their first columns. By the same arguments as in Section 2.3.1, this speeds up our calculation by a factor of 72.

Mathematically speaking, let $D_\alpha$ be the set of all configurations of B1 − B3, B4, B7 in which B1 − B3 are given by α and B4, B7 are in lexicographical order. Let $n^*_\alpha$ be the number of ways to complete a full Sudoku grid from all configurations in $D_\alpha$. Then we have

$$n_\alpha = 72\, n^*_\alpha. \qquad (2.2)$$

For each configuration in $D_\alpha$ we try all possible ways to fill the remaining blocks B5, B6, B8, B9 by using the idea of a backtracking algorithm. That is, we first list all possibilities of filling the digit 1 into these remaining blocks. Then for each of the possibilities, we list all possibilities of filling the digit 2 into the remaining cells, and so on. If for some digit a we cannot find any possibility for filling the next digit a + 1 into these blocks (without violating the Sudoku rules), then we discard the current possibility of filling the digit a and move on to the next possibility.
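The backtracking idea can be illustrated with a small completion counter. Note that this sketch is a simpler variant, not the digit-by-digit scheme described above and not the thesis code: it fills the empty cells one at a time in row-major order and counts every way of reaching a full grid. The function names are hypothetical, empty cells are assumed to be stored as 0, and the given cells are assumed not to conflict with each other.

```c
/* Can digit d be placed at (r, c) without repeating in its row, column
   or 3 x 3 block? Empty cells hold 0.
   Hypothetical helpers, written for illustration only. */
static int can_place(int g[9][9], int r, int c, int d) {
    for (int k = 0; k < 9; k++) {
        if (g[r][k] == d || g[k][c] == d)
            return 0;
        if (g[3 * (r / 3) + k / 3][3 * (c / 3) + k % 3] == d)
            return 0;
    }
    return 1;
}

/* Count the completions of a partially filled grid, scanning cells in
   row-major order starting from position pos (0..80). */
static long count_completions(int g[9][9], int pos) {
    while (pos < 81 && g[pos / 9][pos % 9] != 0)
        pos++;                         /* skip cells that are already filled */
    if (pos == 81)
        return 1;                      /* every cell filled: one completion  */
    int r = pos / 9, c = pos % 9;
    long total = 0;
    for (int d = 1; d <= 9; d++) {
        if (!can_place(g, r, c, d))
            continue;
        g[r][c] = d;                   /* try d in this cell          */
        total += count_completions(g, pos + 1);
        g[r][c] = 0;                   /* undo the choice (backtrack) */
    }
    return total;
}
```

A call `count_completions(g, 0)` leaves the grid unchanged on return. Such a plain search is only practical when few cells are empty. The numbers $n_\alpha$ are far too large for it, which is why the symmetry reductions of this section matter.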

We now have all the necessary ingredients for Algorithm 1. In Code 1 we give the code for this algorithm in the C++ programming language.

Code 1: C++ code to enumerate equivalent classes in catalogue.

/* This code is written by Nguyen Thi Nga
   based on the code of Ed Russell from
   http://www.afjarvis.staff.shef.ac.uk/sudoku/equiv.c */

/* The aim of this code is to give information of
   44 equivalent classes of Sudoku grids. */

/* Each class consists of B2-B3's configurations that have
   the same number of ways to complete Sudoku grid.
   Block B1 is in canonical form. */

/* A representative (rep for short)
   is a way of filling digits to 6 columns of blocks B2-B3.
   Each of such columns consists of three digits (a, b, c)
   which is represented by an integer of value
   a * 16^2 + b * 16 + c
   (i.e. the integer abc in hexadecimal system).
   We choose the representation in hexadecimal system
   because we may need 4 bits to represent a digit. */

/* Each representative is stored as an array of 7 integers
   rep[0-6], in which the first 6 integers rep[0-5] correspond to
   the hexadecimal representations of 6 columns of B2-B3,
   while rep[6] is the number of B2-B3's configurations
   that map to this representative.
   Furthermore, for the sake of listing the representatives,
   we impose rep[0] < rep[1] < rep[2], rep[3] < rep[4] < rep[5],
   and rep[0,1,2] < rep[3,4,5] (in lexicographical order). */

/* Printed information of each class consists of
   class index, a class representative,
   number of configurations of B2-B3 in the class,
   number of ways to complete Sudoku grids
   from a configuration of B2-B3 in the class. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>   // memcmp is declared here (the original listing used <memory.h>)

#define NREP 36288    // number of representatives (reps)
#define NREPX 22266   // number of reps after a reduction

// Macro to swap the values of two integers
#define SWAP(A, B) { int t = A; A = B; B = t; }

// Shorten the name of data type unsigned int (32 bits)
typedef unsigned int uint32;

// Some arrays
static int rep[NREP][7];
static int colour[NREP];
static uint32 tmpl[9][16];
static int ntmpl[9];

// Some variables
static int grand_total_hi, grand_total_lo;

/* Some miscellaneous utility functions */

// Use an integer to represent a set of three digits:
// the digits are packed in increasing order, 4 bits each
static int pack(int a, int b, int c) {
    int packme[10] = {0};
    int packed = 0;
    int i;
    packme[a] = 1;
    packme[b] = 1;
    packme[c] = 1;
    for (i = 1; i <= 9; i++)
        if (packme[i] == 1)
            packed = packed * 16 + i;
    return packed;
}

/* Function to compare the digit parts of two reps a and b.
   It returns a negative value if a is before b in the order
   used for sorting (memcmp compares the 6 integers byte by byte). */
static int qsrepcmp(const void *a, const void *b) {
    return memcmp(a, b, 6 * sizeof(int));
}

/* Function to re-order the digit part of a representative
   for the sake of listing, i.e. to satisfy conditions
   rep[0] < rep[1] < rep[2], rep[3] < rep[4] < rep[5],
   and rep[0,1,2] < rep[3,4,5] (in lexicographical order). */
static void RepOrder(int *rep) {

    // Re-order the entries in the top row of B2
    if (rep[0] > rep[1])
        SWAP(rep[0], rep[1]);
    if (rep[0] > rep[2])


Contents

1 INTRODUCTION

2 SUDOKU ENUMERATION
    2.1 Completing blocks B1 − B3
    2.2 A simple heuristic enumeration method
    2.3 An exact enumeration method
        2.3.1 Lexicographical catalogue
        2.3.2 Enumerating from catalogue
    2.4 Conclusions

3 IP FORMULATIONS FOR SUDOKU
    3.1 A binary linear programming formulation
    3.2 An integer programming formulation
    3.3 Conclusions

4 NUMERICAL EXPERIMENTS
    4.1 Modeling by ZIMPL
        4.1.1 ZIMPL model for (BLP)
        4.1.2 ZIMPL model for (NLIP)
    4.2 Numerical experiments
    4.3 Conclusions

5 CONCLUSIONS

Chapter 1

INTRODUCTION

Sudoku is a famous puzzle that was originally called Number Place. In Japanese, Sudoku means “single number”. In its well-known format, a Sudoku puzzle can be described as follows. We are given a grid consisting of 9 × 9 squares (cells), which can also be viewed as a composition of nine 3 × 3 blocks (so-called subgrids). The puzzle rules are:

(1) each cell of the grid must be filled with exactly one digit;

(2) only digits from 1 to 9 can be filled in the grid;

(3) each digit appears exactly once in each column, each row, and each 3 × 3 subgrid.

Usually, a Sudoku puzzle is provided as a partially completed grid, and the objective is to fill the grid completely. A well-posed Sudoku puzzle, also called a proper Sudoku, has exactly one solution. Figure 1.1 gives an example of a standard 9 × 9 Sudoku puzzle and a solution for the puzzle.

Figure 1.1: A Sudoku puzzle (left) and a solution for it (right).

Howard Garns (March 2nd, 1905 - October 6th, 1989), an American architect, is generally considered the father of modern Sudoku. He published the earliest known examples of modern Sudoku in Dell Magazines in 1979. In April 1984, the puzzle was introduced in the Japanese paper Monthly Nikolist. The name of the puzzle in Japanese was “Sūji wa dokushin ni kagiru” (translated as “the digits must be single”), and afterward it was abbreviated to Sudoku by Maki Kaji, the president of Nikoli Co. Ltd., a Japanese puzzle manufacturer. Wayne Gould (born July 3rd, 1945 in Hawera, New Zealand) devised a computer program to rapidly produce Sudoku puzzles. Thanks to his efforts, Sudoku successfully appeared in a local US newspaper. Then, in November 2004, The Times of London began featuring Sudoku. The puzzle rapidly spread to other newspapers as a regular feature.

Despite the simple rules, there are many non-trivial matters and challenging issues behind Sudoku puzzles. These make Sudoku not only popular in daily life but also attractive to many mathematicians. In [6], the authors proved the non-trivial result that the smallest number of clues in a proper 9 × 9 Sudoku puzzle is 17. Concerning the complexity issue, it was shown in [8] that solving Sudoku puzzles of general size $n^2 \times n^2$ (with n ≥ 3) is NP-complete. There are also many variants of Sudoku with different sizes and/or additional constraints (see https://en.wikipedia.org/wiki/Glossary_of_Sudoku for more details). However, in this thesis we only consider Sudoku puzzles of the well-known form described above.

In addition to the theoretical issues, efficient mathematical models as well as fast and accurate algorithms for solving Sudoku puzzles are also attractive topics of interest. Many scheduling and timetabling problems have constraints similar to those of Sudoku. So Sudoku is not only a typical problem in many mathematical specialities (such as combinatorics, complexity, combinatorial optimization, . . .) but also an important example in mathematical programming (in the senses of both modelling and numerical solution).

The goal of this thesis is to study some mathematical issues behind Sudoku puzzles. We focus mainly on the problem of counting the number of Sudoku grids, and on how to model Sudoku puzzles as mathematical programs. Apart from this chapter, the thesis is organized in 4 more chapters. Chapter 2 is devoted to the problem of enumerating Sudoku grids. In Chapter 3 we present two integer programming models for solving Sudoku puzzles. Chapter 4 presents some numerical experiments with the mathematical models proposed in Chapter 3. Chapter 5 closes the thesis with some conclusions.

Chapter 2

SUDOKU ENUMERATION

The goal of this chapter is to compute the number of possible solutions for Sudoku, i.e. the number of completely filled 9 × 9 Sudoku grids, which we denote by $N_S$ for convenience. Surprisingly, no general combinatorial formula for computing this number is known so far. In this chapter, based on the ideas in [2, 4], we give a complete enumeration method for computing $N_S$. For the sake of the enumeration method, we label the blocks of a Sudoku grid by B1 to B9 as shown in Figure 2.1.

B1   B2   B3
B4   B5   B6
B7   B8   B9

Figure 2.1: Labels of blocks in Sudoku grid.

We start with the number of ways to complete the first three blocks B1 − B3. Section 2.1 shows how to compute that number. This gives the starting point for a simple heuristic method that computes an approximate value of $N_S$. This method is presented in Section 2.2. Then in Section 2.3 we present an exact method to compute the number $N_S$.

2.1 Completing blocks B1 − B3

We say that a Sudoku grid is of canonical form if the top-left block B1 is filled as in Figure 2.2.

1 2 3
4 5 6
7 8 9

Figure 2.2: Canonical block.

Let $S_9$ be the group of permutations on the set of nine digits {1, . . . , 9}. We say that two completely filled Sudoku grids G1 and G2 are equivalent if G2 is obtained from G1 by applying a permutation $f \in S_9$ to its digits, or in short, G2 = f(G1). Every Sudoku grid is equivalent to exactly one canonical Sudoku grid, namely the one obtained via the permutation that maps the entries of its block B1 to the canonical block. Conversely, the 9! permutations in $S_9$ map a canonical grid to 9! pairwise different grids, because these grids already differ in block B1. Therefore, each canonical Sudoku grid is equivalent to $|S_9| = 9!$ Sudoku grids. In other words, if we denote by $N_C$ the number of canonical Sudoku grids, then $N_S = N_C \times 9!$.

Now we concentrate on canonical Sudoku grids. In the following, {a, b, c} denotes the elements a, b, c in any order, and (a, b, c) denotes them in the indicated order. Since the top row of B1 is (1, 2, 3), there are two cases for the top row of block B2.

• Case 1 (Pure): It consists of the digits of either the second or the third row of B1, i.e., either {4, 5, 6} or {7, 8, 9}.

• Case 2 (Mixed): It consists of three digits taken from both {4, 5, 6} and {7, 8, 9}.

More precisely, all possible top rows of blocks B2 and B3 are shown in Table 2.1.

No.   Top row of B2   Top row of B3   Case
1     {4, 5, 6}       {7, 8, 9}       Pure
2     {7, 8, 9}       {4, 5, 6}       Pure
3     {4, 5, 7}       {6, 8, 9}       Mixed
4     {4, 5, 8}       {6, 7, 9}       Mixed
5     {4, 5, 9}       {6, 7, 8}       Mixed
6     {4, 6, 7}       {5, 8, 9}       Mixed
7     {4, 6, 8}       {5, 7, 9}       Mixed
8     {4, 6, 9}       {5, 7, 8}       Mixed
9     {5, 6, 7}       {4, 8, 9}       Mixed
10    {5, 6, 8}       {4, 7, 9}       Mixed
11    {5, 6, 9}       {4, 7, 8}       Mixed
12    {6, 8, 9}       {4, 5, 7}       Mixed
13    {6, 7, 9}       {4, 5, 8}       Mixed
14    {6, 7, 8}       {4, 5, 9}       Mixed
15    {5, 8, 9}       {4, 6, 7}       Mixed
16    {5, 7, 9}       {4, 6, 8}       Mixed
17    {5, 7, 8}       {4, 6, 9}       Mixed
18    {4, 8, 9}       {5, 6, 7}       Mixed
19    {4, 7, 9}       {5, 6, 8}       Mixed
20    {4, 7, 8}       {5, 6, 9}       Mixed

Table 2.1: Possible top rows of blocks B2 and B3.
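The 20 cases of Table 2.1 can be generated mechanically. The following C++ fragment is a minimal sketch, not part of the thesis code: the names `TopRowCase`, `enumerateTopRows` and `countPure` are introduced here for illustration only. It chooses three of the digits 4, . . . , 9 for the top row of B2, assigns the remaining three to B3, and marks a case as pure when all three digits come from the same row of B1. The cases are produced in increasing order of the chosen digits, which differs from the numbering used in Table 2.1.

```cpp
#include <cassert>
#include <vector>

struct TopRowCase {
    int b2[3];   // digits in the top row of B2 (as a set, listed increasingly)
    int b3[3];   // remaining digits, forming the top row of B3
    bool pure;   // true if the top row of B2 is {4,5,6} or {7,8,9}
};

// Enumerate all ways to split {4,...,9} into the top rows of B2 and B3.
std::vector<TopRowCase> enumerateTopRows() {
    std::vector<TopRowCase> cases;
    for (int a = 4; a <= 9; ++a)
        for (int b = a + 1; b <= 9; ++b)
            for (int c = b + 1; c <= 9; ++c) {
                TopRowCase t{};
                t.b2[0] = a; t.b2[1] = b; t.b2[2] = c;
                int k = 0;
                for (int d = 4; d <= 9; ++d)
                    if (d != a && d != b && d != c) t.b3[k++] = d;
                // Pure: all three digits lie in {4,5,6} or all lie in {7,8,9}.
                t.pure = (c <= 6) || (a >= 7);
                cases.push_back(t);
            }
    return cases;
}

// Count the pure cases among the enumerated ones.
int countPure(const std::vector<TopRowCase>& cases) {
    int n = 0;
    for (const TopRowCase& t : cases)
        if (t.pure) ++n;
    return n;
}
```

Calling `enumerateTopRows()` yields 20 cases, of which `countPure` reports 2, in agreement with the 2 pure and 18 mixed rows of the table.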

In the first pure case (No. 1 in Table 2.1), blocks B2 and B3 can be completed together as in Figure 2.3. Since each set {a, b, c} corresponds to 3! = 6 permutations of its elements and there are 6 rows in blocks B2, B3, this leads to $6^6$ different configurations of B2, B3. The same result holds for the second pure case (No. 2 in Table 2.1).

1 2 3   {4, 5, 6}   {7, 8, 9}
4 5 6   {7, 8, 9}   {1, 2, 3}
7 8 9   {1, 2, 3}   {4, 5, 6}

Figure 2.3: Completing B1 − B3 in the first pure case.

For the first mixed case (No. 3 in Table 2.1), by the Sudoku rules, blocks B2 and B3 can be completed together as in Figure 2.4. Here, a, b, c stand for 1, 2, 3 in some order. Each choice of a determines {b, c}, and there are 3 choices for a ∈ {1, 2, 3}. Again, each row of blocks B2 and B3 in Figure 2.4 corresponds to 3! = 6 permutations of its digits, and there are 6 such rows in these two blocks. Therefore, Figure 2.4 stands for $3 \times 6^6$ different configurations of B2, B3. The same result holds for the other mixed cases (No. 4-20 in Table 2.1).

1 2 3   {4, 5, 7}   {6, 8, 9}
4 5 6   {8, 9, a}   {7, b, c}
7 8 9   {6, b, c}   {4, 5, a}

Figure 2.4: Completing B1 − B3 in the first mixed case.

To summarize, we have 2 pure cases (each gives $6^6$ different configurations of blocks B2 − B3) and 18 mixed cases (each gives $3 \times 6^6$ different configurations of blocks B2 − B3). Therefore, in total, the number of different configurations of blocks B2 − B3 in canonical Sudoku grids is

$$2 \times 6^6 + 18 \times 3 \times 6^6 = 2612736.$$

This means that the number of possibilities for the three blocks B1 − B3 (in which B1 is not necessarily in canonical form) is

$$9! \times 2612736 = 948109639680.$$
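The two counts above are easy to check by machine. The following C++ sketch only reproduces the arithmetic of this section; the function names `ipow`, `factorial`, `countB2B3Canonical` and `countB1B3` are hypothetical helpers introduced here, and all values fit comfortably into 64-bit unsigned integers.

```cpp
#include <cassert>
#include <cstdint>

// Integer power by repeated multiplication (intended for small exponents).
std::uint64_t ipow(std::uint64_t base, int exp) {
    std::uint64_t r = 1;
    for (int i = 0; i < exp; ++i) r *= base;
    return r;
}

// n! for small n.
std::uint64_t factorial(int n) {
    std::uint64_t r = 1;
    for (int i = 2; i <= n; ++i) r *= static_cast<std::uint64_t>(i);
    return r;
}

// 2 pure cases with 6^6 configurations each,
// 18 mixed cases with 3 * 6^6 configurations each.
std::uint64_t countB2B3Canonical() {
    return 2 * ipow(6, 6) + 18 * 3 * ipow(6, 6);
}

// Drop the canonical-form assumption on B1: multiply by 9!.
std::uint64_t countB1B3() {
    return factorial(9) * countB2B3Canonical();
}
```

Here `countB2B3Canonical()` evaluates to 2612736 and `countB1B3()` to 948109639680, matching the two displayed values.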

2.2 A simple heuristic enumeration method

For convenience, we recall the rules of Sudoku puzzles:

(i) each digit from 1 to 9 appears exactly once in each block B1 − B9;

(ii) each digit from 1 to 9 appears exactly once in each row of the Sudoku grid;

(iii) each digit from 1 to 9 appears exactly once in each column of the Sudoku grid.

Each block consists of 9 cells, therefore we have 9! ways to fill the digits 1, . . . , 9 into each block. This results in $N_b = (9!)^9$ ways to fill in all blocks B1 to B9 satisfying rule (i).

We know from Section 2.1 that 948109639680 is the number of ways to fill in the three blocks B1 − B3 so that each block contains the digits 1, . . . , 9 and each row contains the digits 1, . . . , 9. The same result holds for blocks B4 − B6, and for blocks B7 − B9. Therefore the number of ways to fill in all blocks B1 − B9 satisfying rules (i) and (ii) is $N_r = 948109639680^3$.

So, among the $N_b$ possibilities of filling in blocks B1 − B9 satisfying block rule (i), there are $N_r$ possibilities that also satisfy row rule (ii), which corresponds to a proportion of

$$p = \frac{N_r}{N_b} = \frac{948109639680^3}{(9!)^9}.$$

Similarly, among the $N_b$ possibilities of filling in blocks B1 − B9 satisfying block rule (i), there are $948109639680^3$ possibilities that satisfy both block rule (i) and column rule (iii), which corresponds to the same proportion p.

A solution to Sudoku is just one of the $N_b$ grids satisfying rule (i) that has both the row property (ii) and the column property (iii). Assuming that the row and column properties are independent, the total number of solutions to Sudoku would be

$$N_b \times p^2 = (9!)^9 \times \left(\frac{948109639680^3}{(9!)^9}\right)^2 = \frac{948109639680^6}{(9!)^9} = \frac{(9!)^6 \times 56^6 \times 6^{36}}{(9!)^9} = \frac{56^6 \times 6^{36}}{(9!)^3} = 6657084616885512582463.488 \approx 6.657 \times 10^{21}.$$

In fact, the row property (ii) and the column property (iii) are not independent, so the number computed above cannot be the correct answer. It is not even an integer! However, after the computation in the next section, we will see that this number is really close to the exact answer (the difference is only about 0.2%).
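As a quick numerical check of the heuristic, the closed form $56^6 \times 6^{36} / (9!)^3$ can be evaluated in floating point and compared with the exact value of the number of Sudoku grids stated in Section 2.3. The C++ fragment below is an illustrative sketch only; the function names `heuristicEstimate` and `relativeGap` are assumptions introduced here, and `long double` gives an approximation rather than the exact fraction.

```cpp
#include <cassert>
#include <cmath>

// Heuristic estimate N_b * p^2 = 56^6 * 6^36 / (9!)^3, in floating point.
long double heuristicEstimate() {
    const long double fact9 = 362880.0L;  // 9!
    return std::pow(56.0L, 6) * std::pow(6.0L, 36) / std::pow(fact9, 3);
}

// Relative gap between the estimate and the exact count of Sudoku grids
// (the exact value is the one quoted in Section 2.3 of the thesis).
long double relativeGap() {
    const long double exact = 6670903752021072936960.0L;
    return (exact - heuristicEstimate()) / exact;
}
```

The estimate comes out near 6.657 × 10^21, and the relative gap is roughly 0.002, i.e. about 0.2%.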

2.3 An exact enumeration method

In this section, we will prove that the exact number of Sudoku grids is

$$N_S = 6670903752021072936960 \approx 6.671 \times 10^{21}.$$

To get an impression of how big $N_S$ is, let us compute how much computer memory is needed to store all these Sudoku grids. Table 2.2 gives the binary representation of each digit from 1 to 9, which in turn gives the number of bits needed to store each of the digits in computer memory.

Digit   Expression                                      Binary representation   Number of bits
1       1 × 2^0                                         1_2                     1
2       1 × 2^1 + 0 × 2^0                               10_2                    2
3       1 × 2^1 + 1 × 2^0                               11_2                    2
4       1 × 2^2 + 0 × 2^1 + 0 × 2^0                     100_2                   3
5       1 × 2^2 + 0 × 2^1 + 1 × 2^0                     101_2                   3
6       1 × 2^2 + 1 × 2^1 + 0 × 2^0                     110_2                   3
7       1 × 2^2 + 1 × 2^1 + 1 × 2^0                     111_2                   3
8       1 × 2^3 + 0 × 2^2 + 0 × 2^1 + 0 × 2^0           1000_2                  4
9       1 × 2^3 + 0 × 2^2 + 0 × 2^1 + 1 × 2^0           1001_2                  4

Table 2.2: Binary representations of digits from 1 to 9.

Since each digit from 1 to 9 appears exactly 9 times in a Sudoku grid, it follows from Table 2.2 that the number of bits needed to encode a Sudoku grid is

$$9 \times (1 + 2 \times 2 + 3 \times 4 + 4 \times 2) = 225.$$

Following the data measures given in Table 2.3, to store all Sudoku grids we need a computer memory space of

$$\frac{N_S \times 225}{8 \times 1024^5} = 1.66639 \times 10^8 \text{ (petabytes)}.$$

Measure      Symbol   Equivalence
1 byte       1 B      8 bits
1 kilobyte   1 KB     1024 B
1 megabyte   1 MB     1024 KB
1 gigabyte   1 GB     1024 MB
1 terabyte   1 TB     1024 GB
1 petabyte   1 PB     1024 TB

Table 2.3: Units in data measurement.
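The storage estimate can be reproduced with a few lines of code. The sketch below is illustrative and not taken from the thesis; the names `bitsForDigit`, `bitsPerGrid` and `petabytesForAllGrids` are hypothetical. It follows the same accounting as Tables 2.2 and 2.3: each digit is charged the length of its binary representation, and one petabyte is taken as $1024^5$ bytes.

```cpp
#include <cassert>
#include <cmath>

// Number of bits in the binary representation of a digit d >= 1.
int bitsForDigit(int d) {
    int bits = 0;
    while (d > 0) {
        ++bits;
        d /= 2;
    }
    return bits;
}

// Bits for one full grid: every digit 1..9 occurs exactly 9 times.
int bitsPerGrid() {
    int total = 0;
    for (int d = 1; d <= 9; ++d) total += 9 * bitsForDigit(d);
    return total;
}

// Storage for all N_S grids in petabytes (1 PB = 1024^5 bytes, 1 byte = 8 bits).
// N_S is the exact count stated in Section 2.3, here as a floating-point value.
long double petabytesForAllGrids() {
    const long double NS = 6670903752021072936960.0L;
    return NS * bitsPerGrid() / (8.0L * std::pow(1024.0L, 5));
}
```

With these definitions `bitsPerGrid()` returns 225 and `petabytesForAllGrids()` is close to 1.66639 × 10^8.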

Now we discuss in detail the computation of $N_S$. Thanks to the discussion in Section 2.1, from now on we restrict our consideration to canonical Sudoku grids. We also know from Section 2.1 that there are 2612736 possible configurations of B2 − B3 in canonical Sudoku grids. We partition the configurations of B2 and B3 into classes as follows.

• Two configurations of B2 and B3 are in the same class if they have the same number of ways of completing to a full valid Sudoku grid.

We then look at operations that do not change the number of completions to full Sudoku grids. The partition of the configurations of B2 and B3 into classes helps us to reduce the number of cases that we need to consider among the 2612736 possibilities for B2 − B3.

2.3.1 Lexicographical catalogue

We classify all 2612736 configurations of B2 − B3 as follows.

• Within B2 and within B3, we permute the columns so that the entries in the top row are in increasing order.

• We exchange B2 and B3 if necessary so that B2 is before B3 in lexicographical order with respect to their top rows.

Concerning the second step, when we permute B1, B2 and B3 in an arbitrary way, block B1 may change, but we can relabel the digits to put it back into canonical form. In particular, if we exchange B2 and B3, then every way of completing B2 − B3 to a full grid gives a unique way of completing B3 − B2 to a full grid (just exchange B5 with B6, and B8 with B9 in the former grid).

Concerning the first step, each way of completing a configuration of (B2, B3) to a full Sudoku grid similarly gives a unique way of completing its column permutation to a full Sudoku grid (just perform the same permutation on the corresponding columns of the full grid).

There are 3! ways to permute the columns within each of the blocks B2 and B3 in the first step, so $(3!)^2 = 36$ configurations with the same number of completions are represented by a single one. In the second step, exchanging the blocks B2 and B3 halves the number of configurations once more. Overall, the 2612736 possibilities mentioned above are reduced by a factor of 36 × 2 = 72. Thus, we only need to consider 2612736/72 = 36288 possibilities for our catalogue.
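The two classification steps amount to a small normalisation routine. The following C++ sketch is an illustration under assumed data types, not the thesis code: a configuration of B2 − B3 is stored as six columns (the hypothetical types `Column` and `Config`), and the hypothetical function `normalise` sorts the columns inside each block by their top entry and then exchanges the blocks if needed.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// A column of a block (top, middle, bottom entry) and a configuration of
// B2 - B3 given as six columns: cols[0..2] belong to B2, cols[3..5] to B3.
using Column = std::array<int, 3>;
using Config = std::array<Column, 6>;

// Bring a configuration of B2 - B3 to catalogue form.
Config normalise(Config cfg) {
    auto byTop = [](const Column& x, const Column& y) { return x[0] < y[0]; };
    // Step 1: inside each block, order the columns by their top entries.
    std::sort(cfg.begin(), cfg.begin() + 3, byTop);
    std::sort(cfg.begin() + 3, cfg.end(), byTop);
    // Step 2: exchange the blocks if the top row of B3 precedes that of B2.
    // The six top entries are distinct, so comparing the first entries of the
    // two sorted top rows decides their lexicographical order.
    if (cfg[3][0] < cfg[0][0])
        std::swap_ranges(cfg.begin(), cfg.begin() + 3, cfg.begin() + 3);
    return cfg;
}
```

Every configuration in catalogue form is the image of exactly 72 configurations under `normalise` (36 column orders times 2 block orders), which is the reduction 2612736/72 = 36288 used above.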

2.3.2 Enumerating from catalogue

At this point, our catalogue consists of 36288 configurations of blocks B2 − B3 that are lexicographically ordered (as described in Section 2.3.1). Our framework for handling the catalogue is given in Algorithm 1.

Algorithm 1 Enumerating from catalogue
1: Input: Lexicographically ordered catalogue of configurations of B2 − B3
2: Initialization: Color all configurations in catalogue
3: numberOfClass ← 0
4: while catalogue has some colored configuration do
5:     Let α be the first colored configuration in catalogue
6:     Find Cα := the class of all configurations in catalogue that are equivalent to α
7:     Uncolor all configurations in Cα
8:     numberOfClass ← numberOfClass + 1
9:     Compute nα := the number of ways to complete from α to full Sudoku grid
10: end while
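The control flow of Algorithm 1 can be sketched independently of the Sudoku-specific parts. In the C++ fragment below, the names `ClassInfo`, `enumerateClasses` and the callback `findClass` are assumptions made for illustration: the callback stands in for line 6 of the pseudo-code and must return the indices of all catalogue entries equivalent to a given entry. The computation of line 9 is only marked by a comment.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct ClassInfo {
    int representative;  // index of the first configuration of the class
    int size;            // number of catalogue entries in the class
};

// Skeleton of Algorithm 1. findClass(alpha) must return the indices of all
// catalogue entries equivalent to alpha (including alpha itself).
std::vector<ClassInfo> enumerateClasses(
        int catalogueSize,
        const std::function<std::vector<int>(int)>& findClass) {
    std::vector<bool> coloured(catalogueSize, true);  // line 2: colour everything
    std::vector<ClassInfo> classes;                   // line 3: no class found yet
    for (int alpha = 0; alpha < catalogueSize; ++alpha) {
        if (!coloured[alpha]) continue;               // lines 4-5: first coloured entry
        std::vector<int> members = findClass(alpha);  // line 6: class of alpha
        for (int m : members) coloured[m] = false;    // line 7: uncolour the class
        classes.push_back({alpha, static_cast<int>(members.size())});  // line 8
        // line 9: the number of completions of alpha would be computed here
    }
    return classes;
}
```

Because the catalogue is scanned in order and a whole class is uncoloured at once, each class is visited exactly once and its representative is its first entry in the catalogue.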

After executing Algorithm 1, we get a list C of configurations of B2 − B3, each of which is a representative of its equivalence class. It follows from our discussion that

$$\sum_{\alpha \in C} |C_\alpha| = 36288,$$

and the number of canonical Sudoku grids is computed by

$$N_C = 72 \sum_{\alpha \in C} n_\alpha |C_\alpha|. \qquad (2.1)$$

Computing $|C_\alpha|$ and $n_\alpha$ for each class representative α ∈ C are the key procedures in Algorithm 1, corresponding to line 6 and line 9 of the algorithm pseudo-code, respectively. We now discuss these procedures in detail.

1. Finding the equivalence class of a given configuration in the catalogue

We note that there are four types of operations that can be applied to B1 − B3 while preserving the number of completions from these three blocks to a full grid:

• type 1: permuting the columns within each block B1, B2, B3,

• type 2: permuting the blocks B1, B2, B3 as a whole,

• type 3: permuting the cells within each column of B1, B2, B3 (provided that the obtained configuration does not break the Sudoku rules),

• type 4: relabelling (i.e. applying a permutation $f \in S_9$ to a configuration of B1 − B3).

The first two types are discussed in Section 2.3.1, while the operations of type 4 are already mentioned in Section 2.1. These four types of operations suggest the following exhaustive way to find the equivalence class Cα of a chosen configuration α in the catalogue.

Start from the chosen configuration α. Consider all configurations of B1 − B3 obtained from α by applying possible operations of types 1-3. Then to each obtained configuration we apply an operation of type 4 in order to transform block B1 to canonical form. We furthermore apply operations of types 1-2 in order to make blocks B2 − B3 satisfy the lexicographical conditions mentioned in Section 2.3.1, i.e., the entries in the top rows of B2 and B3 are in increasing order and B2 is before B3 in lexicographical order with respect to their top rows. Once these operations have been performed, the last obtained configuration belongs to the equivalence class of α.

To better understand this process, let us consider some illustrative examples.

Example 1. Assume that B1 − B3 are filled as in Figure 2.5. Since block B1 is in canonical form, and blocks B2 − B3 are in lexicographical order, this configuration belongs to our catalogue.

1 2 3   4 8 9   5 6 7
4 5 6   1 2 7   3 8 9
7 8 9   3 5 6   1 2 4

Figure 2.5: Configuration 1.

Exchanging the first and the second rows of all the blocks, we obtain the configuration in Figure 2.6.

4 5 6   1 2 7   3 8 9
1 2 3   4 8 9   5 6 7
7 8 9   3 5 6   1 2 4

Figure 2.6: Configuration 1 after permuting the first two rows.

In this configuration, B1 is not in canonical form, so we need a relabelling operation to transform it back to canonical form. Relabelling 1 ↔ 4, 2 ↔ 5, 3 ↔ 6 in all three blocks B1 − B3, we obtain the configuration in Figure 2.7.

1 2 3   4 5 7   6 8 9
4 5 6   1 8 9   2 3 7
7 8 9   6 2 3   4 5 1

Figure 2.7: Configuration 1 after permuting the first two rows and relabelling.

In the configuration in Figure 2.7, block B1 is in canonical form, and blocks B2 − B3 are in lexicographical order. Thus, this configuration belongs to our catalogue. Furthermore, since all operations we have applied are of types 3-4, the configuration in Figure 2.7 has the same number of completions to a full Sudoku grid as Configuration 1 in Figure 2.5. Therefore, these two configurations are in the same equivalence class.
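Example 1 can be replayed in code. The C++ fragment below is a sketch under assumed names (`Band`, `swapRows` and `relabelToCanonical` are introduced here and do not appear in the thesis code): a band B1 − B3 is stored as three rows of nine digits, a row exchange plays the role of the type 3 operation, and the relabelling of type 4 is read off from the current content of B1.

```cpp
#include <array>
#include <cassert>
#include <utility>

using Band = std::array<std::array<int, 9>, 3>;  // the three rows of B1 - B3

// Type 3 operation used in Example 1: exchange two rows of the band.
Band swapRows(Band b, int r1, int r2) {
    std::swap(b[r1], b[r2]);
    return b;
}

// Type 4 operation: relabel the digits so that B1 becomes the canonical block.
Band relabelToCanonical(const Band& b) {
    std::array<int, 10> f{};  // f[d] is the new label of digit d (index 0 unused)
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            f[b[r][c]] = 3 * r + c + 1;  // the digit found at (r, c) of B1
    Band out{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 9; ++c)
            out[r][c] = f[b[r][c]];
    return out;
}
```

Applying `swapRows` to the band of Figure 2.5 with rows 0 and 1, followed by `relabelToCanonical`, gives the band of Figure 2.7.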

Example 2. Starting from the configuration of B1 − B3 in Figure 2.5 and permuting the first two columns of B1, we get the configuration in Figure 2.8.

2 1 3   4 8 9   5 6 7
5 4 6   1 2 7   3 8 9
8 7 9   3 5 6   1 2 4

Figure 2.8: Configuration 1 after permuting the first two columns of B1.

In this configuration, block B1 is not in canonical form. Relabelling 1 ↔ 2, 4 ↔ 5, 7 ↔ 8 in all three blocks B1 − B3, we obtain the configuration in Figure 2.9.

1 2 3   5 7 9   4 6 8
4 5 6   2 1 8   3 7 9
7 8 9   3 4 6   2 1 5

Figure 2.9: Configuration 1 after permuting the first two columns of B1 and relabelling.

In this configuration, block B1 is in canonical form and the top rows of B2 and B3 are in increasing order. The top row (5, 7, 9) of B2 comes after the top row (4, 6, 8) of B3, so one more operation of type 2 (exchanging B2 and B3) puts blocks B2 − B3 in lexicographical order. The resulting configuration belongs to our catalogue, and it also belongs to the equivalence class of the configuration in Figure 2.5.

Example 3. Assume that B1 − B3 are filled as in Figure 2.10. It is easy to check that this is a configuration in our catalogue.

1 2 3   4 5 8   6 7 9
4 5 6   1 7 9   2 3 8
7 8 9   2 3 6   1 4 5

Figure 2.10: Configuration 2.

Permuting the first two columns of B1, we get the configuration in Figure 2.11.

2 1 3   4 5 8   6 7 9
5 4 6   1 7 9   2 3 8
8 7 9   2 3 6   1 4 5

Figure 2.11: Configuration 2 after permuting the first two columns of B1.

By relabelling 1 ↔ 2, 4 ↔ 5, 7 ↔ 8 we transform B1 back to canonical form. However, the top row of B2 is then not in increasing order, so B2 and B3 are not in lexicographical order (see Figure 2.12).

1 2 3   5 4 7   6 8 9
4 5 6   2 8 9   1 3 7
7 8 9   1 3 6   2 5 4

Figure 2.12: Transforming the configuration in Figure 2.11 to canonical form.

So we need to permute the first two columns of B2 to get a configuration whose blocks B2 − B3 are in lexicographical order (see Figure 2.13).

1 2 3   4 5 7   6 8 9
4 5 6   8 2 9   1 3 7
7 8 9   3 1 6   2 5 4

Figure 2.13: A configuration obtained from Configuration 2.

By arguments similar to those in the previous examples, this configuration belongs to the equivalence class of Configuration 2.

Example 4. We again consider the configuration of B1 − B3 given in Figure 2.5. We notice the positions of 1 and 4 in B1 and B2. Permuting the cells containing the digits 1 and 4 within the first and the fourth columns, we get the configuration in Figure 2.14.

4 2 3   1 8 9   5 6 7
1 5 6   4 2 7   3 8 9
7 8 9   3 5 6   1 2 4

Figure 2.14: Configuration 1 after permuting the positions of 1 and 4 in the first columns of B1 and B2.

In the configuration in Figure 2.14, block B1 is not in canonical form. Hence, we have to relabel 1 ↔ 4 to transform B1 back to canonical form, and we obtain the configuration in Figure 2.15.

1 2 3   4 8 9   5 6 7
4 5 6   1 2 7   3 8 9
7 8 9   3 5 6   4 2 1

Figure 2.15: Configuration obtained after relabelling 1 ↔ 4 in Figure 2.14.

The configuration in Figure 2.15 satisfies the two rules for configurations in our catalogue (that is, block B1 is in canonical form, and blocks B2 − B3 are in lexicographical order). So the configurations in Figure 2.5 and Figure 2.15 are equivalent and belong to the same equivalence class.

2. Completing from a given configuration of B1 − B3 to a full Sudoku grid

In line 9 of Algorithm 1, we need to compute the number of ways to complete a given configuration α of blocks B1 − B3 to a full Sudoku grid. In principle, we can try all possible ways to fill the remaining blocks B4 − B9 and see which are Sudoku grids. However, we can speed up this process by insisting that the first columns of blocks B4 and B7 are lexicographically ordered. More precisely, by permuting the middle three rows of the grid, we can assume that the entries in the first column of B4 are in increasing order, and similarly for B7 (by permuting the last three rows). We can also exchange the middle three rows and the last three rows of the grid so that B4 is before B7 in lexicographical order with respect to their first columns. By the same arguments as in Section 2.3.1, there are 3! × 3! × 2 = 72 such row operations, so this speeds up our calculation by a factor of 72.

Mathematically speaking, let $D_\alpha$ be the set of all configurations of B1 − B3, B4, B7 in which B1 − B3 are given by α and B4, B7 are in lexicographical order. Let $n_\alpha^*$ be the total number of ways to complete the configurations in $D_\alpha$ to full Sudoku grids. Then we have

$$n_\alpha = 72 \, n_\alpha^*. \qquad (2.2)$$

For each configuration in $D_\alpha$ we try all possible ways to fill the remaining blocks B5, B6, B8, B9 by using the idea of a backtracking algorithm. That is, we first list all possibilities of filling the digit 1 into these remaining blocks. Then for each of these possibilities, we list all possibilities of filling the digit 2 into the remaining cells, and so on. If for some digit a we cannot find any possibility of filling the next digit a + 1 into these blocks (without violating the Sudoku rules), then we skip the current possibility of filling the digit a and move on to the next possibility.
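The digit-by-digit backtracking idea can be illustrated on a much smaller instance. The C++ sketch below is not the thesis code: it counts all 4 × 4 Sudoku grids (with 2 × 2 blocks) starting from an empty grid, and the namespace `shidoku` together with the functions `countFrom`, `nextDigit` and `countAll` are names assumed here for illustration. For each digit it places one copy per row, respecting columns and blocks, and moves on to the next digit only when the current digit has been placed in every row.

```cpp
#include <cassert>

namespace shidoku {

const int N = 4;   // side length of the grid
const int B = 2;   // side length of a block
int cell[N][N];    // 0 means empty

long countFrom(int d, int row, bool colUsed[], bool blockUsed[]);

// Start placing digit d; when all digits are placed, one full grid is found.
long nextDigit(int d) {
    if (d > N) return 1;
    bool colUsed[N] = {false};
    bool blockUsed[N] = {false};
    return countFrom(d, 0, colUsed, blockUsed);
}

// Place digit d in rows row..N-1 (one cell per row), then continue with d + 1.
// Returns the number of full grids reachable from the current partial grid.
long countFrom(int d, int row, bool colUsed[], bool blockUsed[]) {
    if (row == N) return nextDigit(d + 1);
    long total = 0;
    for (int col = 0; col < N; ++col) {
        int blk = (row / B) * B + col / B;
        if (cell[row][col] != 0 || colUsed[col] || blockUsed[blk]) continue;
        cell[row][col] = d;                      // try this position
        colUsed[col] = blockUsed[blk] = true;
        total += countFrom(d, row + 1, colUsed, blockUsed);
        cell[row][col] = 0;                      // backtrack
        colUsed[col] = blockUsed[blk] = false;
    }
    return total;  // 0 if digit d cannot be placed in this row
}

// Count all 4 x 4 Sudoku grids, starting from an empty grid.
long countAll() {
    for (int r = 0; r < N; ++r)
        for (int c = 0; c < N; ++c) cell[r][c] = 0;
    return nextDigit(1);
}

}  // namespace shidoku
```

For the 4 × 4 case `shidoku::countAll()` returns 288, the known number of such grids. The 9 × 9 computation described above follows the same pattern but starts from the partially filled configurations in $D_\alpha$ and is far more expensive.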

We now have all necessary ingredients for Algorithm 1. In Code 1 we give the code for this algorithm in the C++ programming language.

Code 1: C++ code to enumerate equivalent classes in catalogue.

/* This code is written by Nguyen Thi Nga
   based on the code of Ed Russell from
   http://www.afjarvis.staff.shef.ac.uk/sudoku/equiv.c */

/* The aim of this code is to give information of
   44 equivalent classes of Sudoku grids. */

/* Each class consists of B2-B3's configurations that have
   the same number of ways to complete Sudoku grid.
   Block B1 is in canonical form. */

/* A representative (rep for short)
   is a way of filling digits to 6 columns of blocks B2-B3.
   Each of such columns consists of three digits (a, b, c)
   which is represented by an integer of value
   a * 16^2 + b * 16 + c
   (i.e. the integer abc in hexadecimal system).
   We choose the representation in hexadecimal system
   because we may need 4 bits to represent a digit. */

/* Each representative is stored as an array of 7 integers
   rep[0-6], in which the first 6 integers rep[0-5] correspond to
   the hexadecimal representations of 6 columns of B2-B3,
   while rep[6] is the number of B2-B3's configurations
   that map to this representative.
   Furthermore, for the sake of listing the representatives,
   we impose rep[0] < rep[1] < rep[2], rep[3] < rep[4] < rep[5],
   and rep[0,1,2] < rep[3,4,5] (in lexicographical order). */

/* Printed information of each class consists of
   class index, a class representative,
   number of configurations of B2-B3 in the class,
   number of ways to complete Sudoku grids
   from a configuration of B2-B3 in the class. */

#include <stdio.h>
#include <stdlib.h>
#include <memory.h>

#define NREP 36288    // number of representatives (reps)
#define NREPX 22266   // number of reps after a reduction

// Define a macro to swap values of two integers
#define SWAP(A, B) { int t = A; A = B; B = t; }

// Shorten the name of data type unsigned int (32 bits)
typedef unsigned int uint32;

// Some arrays
static int rep[NREP][7];
static int colour[NREP];
static uint32 tmpl[9][16];
static int ntmpl[9];

// Some variables
static int grand_total_hi, grand_total_lo;

/* Some miscellaneous utility functions */

// Use an integer to represent a set of three digits
static int pack(int a, int b, int c) {
    int packme[10] = {0};
    int packed = 0;
    int i;
    packme[a] = 1;
    packme[b] = 1;
    packme[c] = 1;
    for (i = 1; i <= 9; i++)
        if (packme[i] == 1)
            packed = packed * 16 + i;
    return packed;
}

/* Function to compare the digit parts of two reps a and b.
   It returns a negative value if
   a is before b in the lexicographical order. */
static int qsrepcmp(const void *a, const void *b) {
    return memcmp(a, b, 6 * sizeof(int));
}

/* Function to re-order the digit part of a representative
   for the sake of listing, i.e. to satisfy conditions
   rep[0] < rep[1] < rep[2], rep[3] < rep[4] < rep[5],
   and rep[0,1,2] < rep[3,4,5] (in lexicographical order). */
static void RepOrder(int *rep) {
    // Re-order the entries in the top row of B2
    if (rep[0] > rep[1])
        SWAP(rep[0], rep[1]);
    if (rep[0] > rep[2])
