Boggling with Haskell : how to achieve victory over word puzzles

Word grid puzzles are tricky. If only there was a way to solve them quickly… I know! Let’s do it with Haskell!

Prelude

This post was originally published in its initial form in 2017. It has been slightly refactored, but should not be considered an example of professional Haskell development

The basic premise of Boggle is quite simple. Given a $N * N$ grid of letters, find all possible words by drawing a path through the grid, via distinct blocks. Depending on the scoring system, you might want to prioritize time, count, length, rarity… It is clear that a suitable computer helper might be useful in conquering such a game. Big bank of words, superhuman recall.. you might gain quite a staggering victory

In this post, I’ll be showcasing a solution for the problem made in Haskell. Basic familiarity with Haskell and Cabal are assumed from the reader.

On words

Storage

Due to the expectedly large amount of words we will need to store for a reasonably powerful solver, we need to select an appropriate data structure. A list would work, but it would be ludicrously inefficient - even when sorted. We need something that can step letter by letter, without having to seek large amounts of data.

If you thought a tree would be the best solution, you’d be very right - we’ll be using a trie¹ (also known as a prefix tree). Each node of the tree (excluding the root starting node) contains a single letter, and flag to indicate if a word can be terminated at that point. By following directed paths between nodes, valid words can be searched for very efficiently.

Trie

Considering a Boggle answer is a path through distinct blocks on a grid, this gives us a convenient solution: if a path can be built, in lockstep, for a word both in the grid and the trie, it is a valid answer to the puzzle. We simply need to find all such paths, and sort them to be the most optimal for our use-case.

As of the concrete storage format, we’ll go with a very simple one - a letter will start a new node under the current node, # returns to the parent node, and ! marks a complete word at that node.

Dictionary building

Dictionary generation requires additional tooling, out of scope for this post. For your convenience though, I have included a simple form that will generate a dictionary for you, using clientside JS

Let’s code

Data structures

We’ll express a trie as a self-recursive structure. For convenience of implementation, we’ll utilize a simple type - with more effort, we could improve and have a type where you can’t have empty nodes in middle of a tree

-- A trie node may contain a character (but usually only the root node should be missing one), can contain next characters and an indication if the word is complete
data Node = Node {
    nodeChar :: Maybe Char,
    nextCharacters :: [Node],
    isCompleteWord :: Bool
} deriving (Eq, Show)

Loading the data

For the convenience of implementation, we’ll diverge a bit from the pure Haskell world and use strict state threads. One could (with lots of simplification) describe them as mutable state isolated in a function. This requires Data.Text, and Control.Monad.ST. For convenience, we also utilize the monad-loops library for certain helpful primitives

Do observe that the list is reversed at the end - this is to return the order of items back to the order they were presented in the file. It is not critical for functionality, but may help reasoning why results are presented in the order they are in

loadWordTree :: Text.Text -> Node
loadWordTree str = runST $ do
    -- Create a new reference for the string
    data_str <- newSTRef str
    -- Create an empty node
    root_node <- newSTRef (Node Nothing [] False)
    -- Read the subnode in
    readSubnode data_str root_node
    -- And return the now-altered root node
    readSTRef root_node
  where
    readSubnode :: STRef s Text.Text -> STRef s Node -> ST s ()
    readSubnode strRef nodeRef = do
      exitRef <- newSTRef False
      whileM_ (do
                str <- readSTRef strRef
                mustExit <- readSTRef exitRef
                return ((not . Text.null $ str) && not mustExit)
              ) (do
                -- Extract the head
                h <- readSTRef strRef <&> Text.head
                -- Strictly discard the head from the buffer
                modifySTRef' strRef Text.tail

                -- Read current head character
                case h of
                  '#' -> modifySTRef' exitRef (const True) -- We want to exit now
                  '!' -> modifySTRef' nodeRef (\node -> node {isCompleteWord = True}) -- This node now denotes a complete word
                  _   -> do
                   -- Initialize a new node with the character we just read
                   newNode <- newSTRef (Node (Just (Char.toUpper h)) [] False)
                   -- Read the subnode
                   readSubnode strRef newNode
                   -- Alter the current reference
                   readSTRef newNode >>= (\extractedNode -> modifySTRef' nodeRef (\node -> node {nextCharacters = extractedNode:nextCharacters node}))
              )
       -- Now that new changes are not expected, flip the list to its proper order         
      modifySTRef' nodeRef (\node -> node {nextCharacters = reverse (nextCharacters node)})
      return ()

Solver logic

Let’s then work on what exactly we consider a solution, and how to find one. As mentioned before, a path approach is a plausible choice - find a path valid both in the dictionary and the grid

Do note we need the arrays library for this, importing Data.Array as Array

Datatypes

We utilize a simple model - a solution is considered to be a tuple of metadata (word, starting point) and an array of coordinate-direction pairs to be overlaid on a grid

type Solution = (SolutionMeta, Array.Array BGCoord Flag) -- A definition of a solution

data Flag = None | D | U | L | R | UL | UR | DL | DR | Stop deriving (Eq,Show) -- For each point on a grid, in which direction we should step? None indicates a letter should be shown, Stop indicates that the solution ends there

data SolutionMeta = Meta {
  foundWord :: Text.Text, -- What was the word found?
  initialPoint :: BGCoord -- The starting point?
}

type BGCoord = (Int, Int) -- Coordinates used for a grid
type BoggleGrid = Array.Array BGCoord Char -- The datatype of a grid

Helper functions

A few general helper functions are required - direction to flag conversion, bounds checks, applying directions to the solution. Fairly self-explanatory.

-- Define a flag for each direction; a zero move means a stop and end of word
directionToFlag :: (Int, Int) -> Flag
directionToFlag (0,0) = Stop
directionToFlag (-1,0) = L
directionToFlag (1, 0) = R
directionToFlag (0,1) = D
directionToFlag (0,-1) = U
directionToFlag (-1,1) = DL
directionToFlag (1,1) = DR
directionToFlag (-1,-1) = UL
directionToFlag (1,-1) = UR
directionToFlag _ = undefined

-- Checks if some coordinate is in valid bounds
inBounds :: Array.Array BGCoord z -> BGCoord -> Bool
inBounds grid (x,y)
 | x < 0 || y < 0 = False
 | x > w || y > h = False
 | otherwise = True
 where
  (w,h) = (snd . Array.bounds) grid

-- Apply a direction to the grid
applyDirection :: (BGCoord, Array.Array BGCoord Flag) -> BGCoord -> (BGCoord, Array.Array BGCoord Flag)
applyDirection (pos@(posX, posY), baseGrid) direction@(dirX,dirY) = ((posX+dirX, posY+dirY), baseGrid Array.// [(pos, directionToFlag direction)]) 

Finding solutions

What to look for

List comprehensions can have multiple generators and conditions. This is crucial to the iteration, where possibilities are enumerated and filtered out by possibility
- This could very well be refactored into smaller parts and helper functions - whilst this does its job, it is not exactly pretty
Laziness is utilized here for benefit of processing - only when next answer is required, it is generated. Of course though, this will have the disadvantage of being dependent on the exact iteration order of words, as otherwise you’ll have to know all answers to sort them
List transposing has the effect of changing listing solutions starting from the same location to shifting across the board. It is not critical, but may be an useful trait if not sorting by score or any other criteria

Code

-- Returns a lazy list of possible solutions; select only those subnodes where we also have letters in the dictionary.           
generatePathList :: Node -> BoggleGrid -> [Solution]
generatePathList rootNode grid = concat $ List.transpose $ map (\(pos,node) -> findWordsOnPos node grid pos) validStarts
  where validStarts = [ (p,n) | p <- Array.indices grid, n <- nextCharacters rootNode, nodeChar n == Just (grid Array.! p)]

-- Finds the solutions from a single point. It is assumed the first letter is already root node 
findWordsOnPos :: Node -> BoggleGrid -> BGCoord -> [Solution]
findWordsOnPos rootNode grid initialPos = iterateNode rootNode Nothing (Text.singleton (Maybe.fromJust $ nodeChar rootNode)) (initialPos, zeroArray)
 where
  zeroArray = fmap (const None) grid -- A default-by-zero array of proper size

  -- Iteration function; the current node being explored,  the last direction taken if available, the current text, and the path
  iterateNode :: Node -> Maybe BGCoord -> Text.Text -> (BGCoord, Array.Array BGCoord Flag) -> [Solution]
  iterateNode currentNode maybeLastDir currentText point@(pos@(posX,posY), currentPath)
      | maybeLastDir == Just (0,0) = [(Meta currentText initialPos, currentPath)] -- End of path, return a result
      | otherwise = concat pathSelector -- Otherwise return the contents of the path selector
      where
        pathSelector = [ iterateNode nextNode (Just (dX,dY)) (if dX /= 0 || dY /= 0 then currentText `Text.snoc` nodeLetter else currentText) (applyDirection point (dX,dY))  | -- Apply a path for each combination where..
                        nextNode <- currentNode:nextCharacters currentNode,  -- we can select either our current node or any other node
                        dX <- [-1,0,1], dY <- [-1,0,1], -- there's a direction... 
                        gridLetter <- [grid Array.! (posX+dX, posY+dY)], -- must have a valid letter on the grid..
                        (Just nodeLetter) <- [nodeChar nextNode], -- we've now established that we can possibly take the next node. Let's analyze it
                        (dX /= 0 || dY /= 0) && nextNode /= currentNode || dX == 0 && dY == 0 && nextNode == currentNode && isCompleteWord currentNode, -- But either require a nonzero direction and a new node OR stay in place, choose current node and this word being a complete one  
                        inBounds currentPath (posX+dX, posY+dY), -- ensure that we can actually walk there
                        currentPath Array.! (posX+dX, posY+dY) == None, -- our next step must be empty
                        nodeLetter == gridLetter -- and which must naturally match to the chosen node 
                        ]

Binding with text user interface

With the supporting library code all done, we now need to create an interface for it. We’ll go for a relatively simple text-based user interface, operating by prompting for input. Note the utilization of Text library to support parsing arbitrary strings into some supported type (e.g. integer), and overloaded strings for more convenient operations with string literals.

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TupleSections #-}

module Main where

import BoggleUtils
    ( BoggleGrid,
      BGCoord,
      SolutionMeta(Meta),
      Flag(..),
      Node,
      loadWordTree,
      generatePathList )
import Data.Array
    ( Ix(range), (!), bounds, ixmap, listArray, Array )
import Data.Ord ()
import Data.List ( nubBy, sortBy )
import Data.IORef ( modifyIORef', newIORef, readIORef )
import Data.Maybe ()
import Data.STRef ()
import Data.Char ( toUpper )
import qualified Data.Text as T
import qualified Data.Text.IO as T.IO
import Text.Read ( readMaybe )
import System.IO ()
import System.Console.Readline ( readline )
import Control.Monad.ST ()
import Control.Monad.Loops ( untilJust, untilM_ )
import Data.Functor ((<&>))

-- Define a character to show for each flag; first one is normal, 2nd one is a highlighted one
char :: Flag -> (Char, Char)
char None = (' ', ' ')
char D = ('↓', '🢃')
char U = ('↑', '🢁')
char L = ('←', '🢀')
char R = ('→', '🢂')
char UL = ('↖', '🢄')
char UR = ('↗', '🢅')
char DR = ('↘', '🢆')
char DL = ('↙', '🢇')
char Stop = ('◊', '◊')

-- Prompt to exit when exit is typed (returns false), otherwise returns true
promptExit :: String -> IO Bool
promptExit str = untilJust $ do
   res <- promptStr (str ++ " (type 'exit' to exit)")
   return (Just (res /= "exit"))

-- Prompt yes/no on something
promptYN :: String -> IO Bool
promptYN str = untilJust $ do
  res <- promptStr (str ++ " (Y/N)")
  case res of
    "Y" -> return (Just True)
    "y" -> return (Just True)
    "N" -> return (Just False)
    "n" -> return (Just False)
    _   -> return Nothing

-- Prompt something and require it to be a valid result
promptX :: Read a => String -> IO a
promptX str = untilJust $ do {rdln <- untilJust $ readline (str ++ "\n>"); return (readMaybe rdln)}

-- Prompt a string, or possibly hang if unable to provide one
promptStr :: String -> IO String
promptStr str = untilJust $ readline (str ++ "\n>")

-- Generates a rendering of a given grid, using a specialized function for that purpose
printGrid:: Array BGCoord z -> ((BGCoord, z) -> Char) -> IO ()
printGrid grid func = mapM_ (\y -> do {row_func y;putStrLn " "}) (range (0,h))
 where
  -- Params
  (w,h) = snd $ bounds grid
  --
  row_func :: Int -> IO ()
  row_func y = putStrLn $ concatMap ((\coord -> ' ':[func (coord,grid ! coord)]) . (,y)) (range (0,w))

-- Do an IO operation to inquire from the user and generate a grid with that
generateWordGrid :: IO BoggleGrid
generateWordGrid = untilJust $ do
    x :: Int <- promptX "Enter the width of the grid"
    y :: Int <- promptX "Enter the height of the grid"

    if x < 1 || y < 1
      then do {putStrLn "Both sizes must be nonzero!"; return Nothing}
      else do
        str :: T.Text <- promptStr ("Enter the word to parse, reading left-to-right, up-to-down (must be exactly " ++ show (x*y) ++ " characters)") <&> T.pack
        if T.length str /= x*y then do {putStrLn "Word not of correct length!"; return Nothing}
        else
           return (Just (ixmap ((0,0),(x-1,y-1)) (\(x,y) -> (y,x)) $ listArray ((0,0), (y-1,x-1)) $ map toUpper (T.unpack str)))

pathReader :: BoggleGrid -> Node -> IO ()
pathReader grid rootNode = do
    --solutionRef <- newIORef $ generatePathList rootNode grid -- This takes all
    solutionRef <- newIORef $ sortBy (\(Meta word1 _, _) (Meta word2 _, _) -> compare (T.length word2) (T.length word1)) $ nubBy (\(Meta word1 _, _) (Meta word2 _, _) -> word1 == word2) $ generatePathList rootNode grid -- This removes duplicates and sorts by length, but also pulls in everything before displaying anything
    emptyAtStart <- readIORef solutionRef Data.Functor.<&> null
    if emptyAtStart then do {putStrLn "Oops! No solutions were found at all!"; return ()} else
      untilM_ (do
        putStrLn "Found solution!"
        (Meta word initialPos@(posX,posY), solutionGrid) <- readIORef solutionRef Data.Functor.<&> head
        putStrLn $ "Starting from " ++ show posX ++ " steps from left, " ++ show posY ++ " steps down, word '" ++ T.unpack word ++ "'\n"
        printGrid solutionGrid (\(pos, dir) -> if dir == None then grid ! pos else (if pos == initialPos then snd else fst) $ char dir)
       ) (do
        modifyIORef' solutionRef tail
        isEmpty <- readIORef solutionRef Data.Functor.<&> null
        promptExit "Press ENTER to get next word" >>= (\x -> return (not x || isEmpty))
       )

main = do
 putStrLn "Loading the wordlist.."

 wordtext <- T.IO.readFile "./wordlist.dat"
 let worddata = loadWordTree wordtext
 (do
   grid <- generateWordGrid
   putStrLn "Loaded grid!"
   printGrid grid snd
   putStrLn "Finding words"
   pathReader grid worddata
   return ()
  ) `untilM_` (do {bl <- promptYN "Try again?"; return (not bl);})

 return ()

Conclusion

By the end, you should have something like this:

How you utilize this newfound ability of yours is up to you. There are a few mobile games which can be rather effortlessly beaten with this approach.. barring the blatantly cheating possibility of buying more time in some of those tested games. Even the best helper cannot compensate for the fact that other players can simply fork over money and topple your compsci-powered dominance on scoreboards.

It should also be noted that when testing, I wasn’t the only one getting high scores - other users were observed to rack almost equally strong results. This suggests I might not have been the only one with a helper. Some cursory Googling reveals that other developers have also found interest in such an approach; some of the proposed tools looked rather questionable, so their analysis was skipped for time being.

A full copy of work is available to download as a ZIP file