In the previous post, I implemented a sudoku solver. Unfortunately, it does not work yet. Let’s fix it!

Debugging the solver in the current state of the code is hard. The test fails because the expected solution is not equal to the actual solution returned, but since I have no way of visualizing the Grid it’s hard to say what the difference is.

Therefore, my approach here would be to write a function to print the grids. This function, with the current design, would naturally live as method of the Grid class. The Grid class seems to have become the place where I dump most of the code and has a number of responsibilities:

  • Hold & fetch grid data (which is the main goal of the class).
  • Find empty cells.
  • Compare equality with other grids.
  • Modify Grid.
  • And now I’d like to add: print Grid for debugging.

This does not sound right. Ideally, the class should have a single responsibility: hold and fetch data about grids. Equality could also be part of this, because it’s an implementation of the concept of equality for the Grid object. Modification also sounds ok as it’s very data related. Other algorithms should not be there though.

The initial mistake was made by adding the search for empty cells to this class, and I shall not repeat it again now. What I actually need is a method to iterate over all cells. With that I can extract findEmptyCell.

Thinking about this new accessor method interface, I’d just return an array of integers. I’m pretty sure in Java there would be a better (lazy) way, but I’m not that into the language and do not have internet access right now. The interface sounds simple enough to me.

I’ll attempt a test for the first:

@Test
public void testIter() throws InvalidGridException {
    Grid grid = new Grid(KNOWN_GRID);
    int i = 0;
    for (int value: grid.iter()) {
        assertEquals(KNOWN_GRID[i], value);
        ++i;
    }
}

Good enough. The code I’m writing to implement this, looks like this:

public int[] iter() {
    return array.clone();
}

Passing! I’m cloning to avoid that the Grid instance is mutated. Now I can move findEmptyCell out. I’m not liking where this is headed: I will have a pointer to an internal array of a class, however supposedly opaque, which is handled outside the class.

I really need to decide on an interface to identify a cell in the Grid. To me, it seems that the index is bad design, and I should really use the coordinates instead. That way, the internal data representation is properly encapsulated and the abstraction is not leaky.

At this point, it might be good to actually refactor before proceeding, instead of trying to fix the solver. If I continue on this road, I’ll end up having to rewrite everything later. I should at least refactor some things right now.

The CellRef Class

Let’s try to design this a bit better. I’ll base my refactoring on the new CellRef class:

public class CellRef {
    public final int x, y;

    CellRef(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

This is what I call a “Data class”, that is a container for data that has no methods attached. You can think of it as a struct as it appears in C.

The first obvious place where I can use this, is converting getCellValue(int, int) to getCellValue(CellRef):

@Test
void testGetCellValue() throws InvalidGridException, OutOfGridException {
    Grid grid = new Grid(KNOWN_GRID);
    assertEquals(9, grid.getCellValue(new CellRef(0, 1)));
    assertEquals(3, grid.getCellValue(new CellRef(7, 4)));
    assertEquals(0, grid.getCellValue(new CellRef(8, 8)));
    assertEquals(8, grid.getCellValue(new CellRef(8, 0)));
    assertEquals(2, grid.getCellValue(new CellRef(0, 8)));
}

(I also changed all other tests and code using this function in a very similar way.)

Then change the method to this:

public int getCellValue(CellRef ref) throws OutOfGridException {
    int x = ref.x;
    int y = ref.y;

    if (x < 0 || x >= GRID_SIZE || y < 0 || y >= GRID_SIZE) {
        throw new OutOfGridException(String.format(
                "Coordinate out of grid bounds: x=%d,y=%d",
                x, y
        ));
    }

    return array[y*GRID_SIZE + x];
}

The tests are still passing. I won the battle but not the war yet.

The Feared Refactoring

Now, in the Solver I need to iterate over rows, columns and blocks. That’s ugly because Solver needs to know what the magic number 9 for the sudoku size. The Solver just wants to check for duplicates. It’s not concerned about the data layout.

I could write a new bunch of methods to fetch rows and columns. Some tests:

void assertArrayEquals(int[] actual, int[] expected) throws AssertionError {
    assertEquals(actual.length, expected.length);
    for (int i = 0; i < actual.length; ++i) {
        assertEquals(actual[i], expected[i]);
    }
}

@Test
void testGetRows() throws InvalidGridException {
    Grid grid = new Grid(KNOWN_GRID);
    int[][] rows = grid.getRows();
    assertArrayEquals(new int[]{0, 1, 2, 3, 4, 5, 6, 7, 8}, rows[0]);
    assertArrayEquals(new int[]{9, 0, 1, 2, 3, 4, 5, 6, 7}, rows[1]);
    assertArrayEquals(new int[]{8, 9, 0, 1, 2, 3, 4, 5, 6}, rows[2]);
    assertArrayEquals(new int[]{7, 8, 9, 0, 1, 2, 3, 4, 5}, rows[3]);
    assertArrayEquals(new int[]{6, 7, 8, 9, 0, 1, 2, 3, 4}, rows[4]);
    assertArrayEquals(new int[]{5, 6, 7, 8, 9, 0, 1, 2, 3}, rows[5]);
    assertArrayEquals(new int[]{4, 5, 6, 7, 8, 9, 0, 1, 2}, rows[6]);
    assertArrayEquals(new int[]{3, 4, 5, 6, 7, 8, 9, 0, 1}, rows[7]);
    assertArrayEquals(new int[]{2, 3, 4, 5, 6, 7, 8, 9, 0}, rows[8]);
}

@Test
void testGetColumns() throws InvalidGridException {
    Grid grid = new Grid(KNOWN_GRID);
    int[][] columns = grid.getColumns();
    assertArrayEquals(new int[]{0, 9, 8, 7, 6, 5, 4, 3, 2}, columns[0]);
    assertArrayEquals(new int[]{1, 0, 9, 8, 7, 6, 5, 4, 3}, columns[1]);
    assertArrayEquals(new int[]{2, 1, 0, 9, 8, 7, 6, 5, 4}, columns[2]);
    assertArrayEquals(new int[]{3, 2, 1, 0, 9, 8, 7, 6, 5}, columns[3]);
    assertArrayEquals(new int[]{4, 3, 2, 1, 0, 9, 8, 7, 6}, columns[4]);
    assertArrayEquals(new int[]{5, 4, 3, 2, 1, 0, 9, 8, 7}, columns[5]);
    assertArrayEquals(new int[]{6, 5, 4, 3, 2, 1, 0, 9, 8}, columns[6]);
    assertArrayEquals(new int[]{7, 6, 5, 4, 3, 2, 1, 0, 9}, columns[7]);
    assertArrayEquals(new int[]{8, 7, 6, 5, 4, 3, 2, 1, 0}, columns[8]);
}

@Test
void testGetBoxes() throws InvalidGridException {
    Grid grid = new Grid(KNOWN_GRID);
    int[][] boxes = grid.getBoxes();
    assertArrayEquals(new int[]{0, 1, 2, 9, 0, 1, 8, 9, 0}, boxes[0]);
    assertArrayEquals(new int[]{3, 4, 5, 2, 3, 4, 1, 2, 3}, boxes[1]);
    assertArrayEquals(new int[]{6, 7, 8, 5, 6, 7, 4, 5, 6}, boxes[2]);
    assertArrayEquals(new int[]{7, 8, 9, 6, 7, 8, 5, 6, 7}, boxes[3]);
    assertArrayEquals(new int[]{0, 1, 2, 9, 0, 1, 8, 9, 0}, boxes[4]);
    assertArrayEquals(new int[]{3, 4, 5, 2, 3, 4, 1, 2, 3}, boxes[5]);
    assertArrayEquals(new int[]{4, 5, 6, 3, 4, 5, 2, 3, 4}, boxes[6]);
    assertArrayEquals(new int[]{7, 8, 9, 6, 7, 8, 5, 6, 7}, boxes[7]);
    assertArrayEquals(new int[]{0, 1, 2, 9, 0, 1, 8, 9, 0}, boxes[8]);
}

I hard-coded the values. It’s a miracle if I got this right. Luckily, the tests won’t pass if I made a mistake. My tentative implementation of getRows is:

public int[][] getRows() {
    List<int[]> rows = new ArrayList<>();
    for (int i = 0; i < GRID_SIZE; ++i) {
        rows.add(new int[]{
                i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8,
        });
    }
    return rows.toArray(new int[][]{});
}

Now, the other methods would be the same, just using a different ways of indexing. I’ll move it all to a private method to avoid duplication:

int[][] collect(int stride, int[] offsets) {
    int[][] groups = new int[GRID_SIZE][GRID_SIZE];
    for (int i = 0; i < GRID_SIZE; ++i) {
        for (int j = 0; j < GRID_SIZE; ++j) {
            int index = i*stride + offsets[j];
            groups[i][j] = array[index];
        }
    }
    return groups;
}

public int[][] getRows() {
    return collect(9, new int[]{0, 1, 2, 3, 4, 5, 6, 7, 8});
}

public int[][] getColumns() {
    return collect(1, new int[]{0, 9, 18, 27, 36, 45, 54, 63, 72});
}

public int[][] getBoxes() {
    return collect(3, new int[]{0, 1, 2, 9, 10, 11, 18, 19, 20});
}

I also ditched the List to avoid some allocations. I’m not concerned about memory usage as we’re talking of a tiny bunch of integers.

The tests for getBoxes are still failing. The reason is that with a simple stride, I cannot represent the indexing of boxes. I need some more sophisticated collect() method. I find it clean and simple to pass a function to compute the next stride.

int[][] collect(Function<Integer, Integer> getStride, int[] offsets) {
    int[][] groups = new int[GRID_SIZE][GRID_SIZE];
    for (int i = 0; i < GRID_SIZE; ++i) {
        for (int j = 0; j < GRID_SIZE; ++j) {
            int index = getStride.apply(i) + offsets[j];
            groups[i][j] = array[index];
        }
    }
    return groups;
}

public int[][] getBoxes() {
    return collect(
        (i) -> (i / 3) * 27 + (i % 3) * 3,
        new int[]{0, 1, 2, 9, 10, 11, 18, 19, 20}
    );
}

This now passes (I also performed the required changes to the other methods).

With this in place, I can now refactor Solver. This is the state of the code right now:

static void validateRows(Grid state) throws
        OutOfGridException,
        InvalidInitialStateException
{
    for (int row = 0; row < 9; ++row) {
        Set<Integer> visited = new HashSet<>();
        for (int i = 0; i < 9; ++i) {
             int value = state.getCellValue(new CellRef(i, row));
             if (value == 0) continue;
             if (visited.contains(value)) {
                 throw new InvalidInitialStateException(String.format(
                         "Row %d contains duplicate value %d",
                         row, value
                 ));
             }
             visited.add(value);
        }
    }
}

This can be easily refactored into:

static void validateRows(Grid state) throws InvalidInitialStateException {
    for (int[] row: state.getRows()) {
        Set<Integer> visited = new HashSet<>();
        for (int value: row) {
             if (value == 0) continue;
             if (visited.contains(value)) {
                 throw new InvalidInitialStateException(String.format(
                         "Row %d contains duplicate value %d",
                         row, value
                 ));
             }
             visited.add(value);
        }
    }
}

But we can do even better, we can move all the logic to a private helper function to avoid duplication, just as we did in Grid.

static void validateUnique(int[][] groups, String setName)
    throws InvalidInitialStateException
{
    for (int i = 0; i < groups.length; ++i) {
        int[] group = groups[i];
        Set<Integer> visited = new HashSet<>();
        for (int value: group) {
            if (value == 0) continue;
            if (visited.contains(value)) {
                throw new InvalidInitialStateException(String.format(
                        "%s %d contains duplicate value %d",
                        setName, i, value
                ));
            }
            visited.add(value);
        }
    }
}

static void validateRows(Grid state) throws InvalidInitialStateException {
    validateUnique(state.getRows(), "Row");
}

static void validateColumns(Grid state) throws InvalidInitialStateException {
    validateUnique(state.getColumns(), "Column");
}

static void validateBoxes(Grid state) throws InvalidInitialStateException {
    validateUnique(state.getBoxes(), "Box");
}

Tests are still shining in green. How cool is it to refactor and being certain the stuff is still working? I remember those gloomy days where I would cross my fingers and hope for the best. Luckily, the testing sun shines now.

With the modifications above, I’m changing the error message a little for the Box case, but overall the code is much shorter and cleaner. I also got rid of that pesky 9, so the coupling is gone!

Finding Empty Cells

I continue to surf the Refactoring wave until it lasts.

I need to extract findEmptyCell from the Grid class, as I determined that I don’t want to have this method in the class. For this, I could convert the iter method at the start of this post. Instead, I’ll go for a simpler, perhaps uglier design, and drop iter from the code. I will just use getRows() to find the empty cell.

But first, I’ll move the tests to the SolverTest class. I’ll move findEmptyCell to Solver as for now I just need it there. I know it could be moved to an extra class for reuse, but for YAGNI reasons and laziness I’ll limit myself to the move.

// Tests moved to SolverTests from GridTests with some
// adaptations (commented below)
@ParameterizedTest
@ValueSource(ints = {0, 1, 77})
void testFindEmptyCell(int index) throws InvalidGridException {
    int[] data = IntStream.generate(() -> 1).limit(81).toArray();
    data[index] = 0;

    Grid grid = new Grid(data);
    // Use RefCell instead of index
    CellRef ref = Solver.findEmptyCell(grid);
    assertEquals(index % 9, ref.x);
    assertEquals(index / 9, ref.y);
}

@Test
void testFindEmptyCellNone() throws InvalidGridException {
    int[] data = IntStream.generate(() -> 1).limit(81).toArray();
    Grid grid = new Grid(data);
    // Expect null instead of -1
    assertEquals(null, Solver.findEmptyCell(grid));
}

That was easy. findEmptyCell will need to be a public method, which is a bit ugly as I’m polluting the interface. But if that enables better testing and easier refactoring later (the method has independent tests that can be moved effortlessly, as I just did), I’m fine with it.

I think that as programmers, we must decide to be pragmatic at times if we want to accomplish something. The point is also to keep improving the code every time instead of trying get it done perfectly.

The return value if there’s no cell is now null. I’m using 9 here again. I’ll have to fix it in the next step, but at least it’s in a test and not in the implementation itself.

Then, I move the method:

public static CellRef findEmptyCell(Grid state) {
    int[][] rows = state.getRows();
    for (int y = 0; y < rows.length; ++y) {
        int[] row = rows[y];
        for (int x = 0; x < row.length; ++x) {
            if (row[x] == 0) {
                return new CellRef(x, y);
            }
        }
    }
    return null;
}

It became static and a little longer, but I’m saving some time by reusing the existing getRows. The tests are passing.

Replacing Values

The last function that still needs to be changed is withCell. As customary, start by fixing the tests to accept CellRef instead of an index:

@Test
void testWithCell() throws InvalidGridException, OutOfGridException {
    CellRef ref = new CellRef(6, 1);

    Grid grid = new Grid(KNOWN_GRID);
    assertEquals(5, grid.getCellValue(ref));

    Grid newGrid = grid.withCell(ref, 7);
    assertEquals(5, grid.getCellValue(ref));
    assertEquals(7, newGrid.getCellValue(ref));
}

@ParameterizedTest
@ValueSource(ints = {-10, -1, 10, 100})
void testWithCellOutOfBounds(int value) throws InvalidGridException {
    Grid grid = new Grid(KNOWN_GRID);
    assertThrows(
            OutOfGridException.class,
            () -> grid.withCell(new CellRef(value, 4), 7)
    );
}

@ParameterizedTest
@ValueSource(ints = {-10, -1, 9, 100})
void testWithCellWrongValue(int value) throws InvalidGridException {
    Grid grid = new Grid(KNOWN_GRID);
    assertThrows(
            InvalidGridException.class,
            () -> grid.withCell(new CellRef(3, 7), value)
    );
}

In the first test, I created a variable for the reference. The others are just simple adaptations of the old tests to use the new interface. I also changed the bounds of the parameters to make sense for the new situation. Now that tests are red as blood, let’s fix the wounds:

public Grid withCell(CellRef ref, int value) throws
    OutOfGridException,
    InvalidGridException
{
    int index = ref.y*GRID_SIZE + ref.x;
    if (index < 0 || index >= array.length) {
        throw new OutOfGridException(
                String.format("Invalid index %d", index)
        );
    }
    if (value < 0 || value > GRID_SIZE) {
        throw new InvalidGridException(
                String.format("Cannot set value %d to cell", value)
        );
    }
    int[] data = array.clone();
    data[index] = value;
    return new Grid(data);
}

I actually just changed the signature and added the first line to compute the index. The out-of-bounds test fails because the code does not work. I can set CellRef(-4, 2) and currently that’s equivalent to CellRef(5, 1). I need to change the code:

public Grid withCell(CellRef ref, int value) throws
    OutOfGridException,
    InvalidGridException
{
    if (ref.x < 0 || ref.x >= GRID_SIZE || ref.y < 0 || ref.y >= GRID_SIZE) {
        throw new OutOfGridException(
                String.format("Invalid grid coordinates (%d, %d)", ref.x, ref.y)
        );
    }
    if (value < 0 || value > GRID_SIZE) {
        throw new InvalidGridException(
                String.format("Cannot set value %d to cell", value)
        );
    }
    int[] data = array.clone();
    int index = ref.y*GRID_SIZE + ref.x;
    data[index] = value;
    return new Grid(data);
}

That’s better, it now works. Another bug caught by testing!

As promise, I also fix the test above by removing that magic 9:

@Test
void testFindEmptyCell() throws
        InvalidGridException,
        OutOfGridException
{
    CellRef expRef = new CellRef(6, 1);

    int[] data = IntStream.generate(() -> 1).limit(81).toArray();
    Grid grid = new Grid(data).withCell(expRef, 0);

    CellRef ref = Solver.findEmptyCell(grid);
    assertEquals(expRef.x, ref.x);
    assertEquals(expRef.y, ref.y);
}

A last step to make everything compile is changing some data types in the solver algorithms to adapt to the new situation.

Printing the Grid!

Going back to the beginning, I wanted to visualize what is happening in the solver. I’ll add a private method to print a Grid content to Solver. I won’t write a test for this as it’s private and only a temporary aid during debugging:

static void printState(Grid state) {
    int[][] rows = state.getRows();
    for (int y = 0; y < rows.length; ++y) {
        if (y != 0 && (y % 3) == 0) System.out.printf("\n");

        int[] row = rows[y];
        for (int x = 0; x < row.length; ++x) {
            if (x != 0 && (x % 3) == 0) System.out.printf(" ");
            System.out.printf("%d ", row[x]);
        }
        System.out.printf("\n");
    }

    System.out.println("---");
}

This, manually tested for some state, prints:

2 0 0  0 0 0  4 0 8
0 0 0  0 3 4  0 2 5
0 7 0  8 0 0  0 0 0

7 0 9  0 0 0  0 3 0
1 0 8  2 4 0  5 7 6
6 0 2  3 1 7  9 0 4

3 6 0  4 9 0  0 0 0
0 2 0  0 7 8  6 0 0
4 0 0  0 0 0  0 5 0
---

Final Thoughts

In this post, I completed a refactoring of the code-base to clean up the classes, move functionality to suitable places and improve the encapsulation.

Am I happy with the result? Not really, but it’s better than it was before for sure. Methods are in better places, and there is now a consistent interface to the Grid class. A lot of places do not break abstraction anymore. There’s still improvements that could be made I guess, but I will move on to debugging the algorithm in the next post.

As usual, you can get the code at Gitlab.