How the wordsearch program works
This note describes how the wordsearch program creates a puzzle.
Initial size of puzzle
I keep the wordsearch puzzles square, there are the same number of rows as columns. It is obvious that this number cannot be less than the length of the longest word, so that is a lower bound.
Further, the number of grid squares on the puzzle is related to the total number of characters in all of the words. So I calculate that number as well, take 80% (some of the letters overlap), and take the square root. Between this latter calculation and the length of the longest word, I get an initial board size.
Ordering word list
The longer words will be harder to fit into the puzzle than the shorter words. So I sort the word list by word length, with the longest words first.
Initial starting position for a word
I don't want to place words starting from the upper left, as that will give a characteristic pattern at that point. So as I place each word I pick a random starting row and column, and then try the word in each of the possible orientations from that starting position. Then I scan the puzzle grid from that initial position until I wrap around back to the starting position.
Attempting to lay in words
I sequence through all of the words, longest first, from random starting positions. If I make it through all of the words, I have a successful puzzle. If not, I back out the previous word, put it in the next possible position, and continue on. Think "Depth First Search", a highly recursive strategy. Each successful puzzle has a score (that I'll talk about in a moment). Once I have a particular board size that has at least one successful puzzle, I take whichever successful puzzle with the best score and publish it for the user.
Some word placements just don't work. They don't meet the constraints for orientation, they don't cross correctly, etc. But of those that do, some are better than others. I count the number of shared grid cells. Those puzzles that share more characters are "better" than others. There are some other simple huristics that go here as well.
Fill in remaining grid cells
The winning puzzle doesn't use all of the grid squares. Because this puzzle generator can be used by different languages, with different alphabets, I choose "fill in" characters from the set of character that the user used in their words. I also put in these fill-in characters in the proportion to the frequency that they are used in the word list. This typically means that there will be far more "e" characters put in than "x" characters.
Because of the random nature of placement, the same word list will not generate the same puzzle every time. Because I don't know anything about the character set, this can generate puzzles in Japanese as easily as in English. (Note that bit took some effort to get there...)
Over the years people have asked for extensions to support different audiences. For instance, a teacher of special-ed students wanted puzzles that only went down and across, and not backwards or diagnoal. That could be handled by setting an input box, and constraining the orientations that were tried. Another user wanted to ensure that no "bad" words got generated. I haven't done that, as I have to get a better vocabulary, and search for all of the bad words in all possible orientations. But it could be done.
By and large, the above strategy builds high quality, compact word search puzzles, in a fashion that has been maintainable over the last twenty years.