Listener 4656: “Co-star” by Lysander
Posted by Steve Tregidgo on 14 May 2021
This puzzle was one of those rare occasions where I needed to turn to technology to help with a tricky endgame, and I thought it was interesting enough to share. I was relieved and thankful that Lysander had made the clues accessible enough for me to cold-solve three quarters of them on a first pass (which is very high for me). That meant I could concentrate on the endgame, in which every letter was to be enciphered to a unique one- or two-character sequence, with all entries being real words.
Thanks to the difference between answer enumerations and the entry cell counts in the grid, it was easy enough to deduce which letters became two letters. 15d, TELLER, had the same answer and entry lengths, so each of its letters (T, E, L and R) mapped to a single letter. 23a, NONE, was entered in five cells, meaning that exactly one of its letters mapped to two; it wasn’t E (because of 15d) and couldn’t be N, as N appears twice and that would make the entry six letters, so it had to be O. In this way I was able to get nearly all of the mapping lengths, if not the actual target letters!
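The length deduction above can be mechanised. Here is a minimal Python sketch, not the code I actually ran: the function name `deduce_lengths` and the brute-force narrowing are assumptions made for illustration. Each letter starts as "one or two cells", and every solved entry rules out the lengths that cannot add up to its cell count.

```python
from itertools import product

def deduce_lengths(entries):
    """entries: list of (answer, cell_count) pairs.
    Returns {letter: set of encoded lengths still possible}."""
    letters = {ch for answer, _ in entries for ch in answer}
    possible = {ch: {1, 2} for ch in letters}
    changed = True
    while changed:
        changed = False
        for answer, cells in entries:
            distinct = sorted(set(answer))
            seen = {ch: set() for ch in distinct}
            # Try every combination of candidate lengths for this entry's letters
            # and keep the ones that fill exactly the right number of cells.
            for combo in product(*(sorted(possible[ch]) for ch in distinct)):
                lengths = dict(zip(distinct, combo))
                if sum(lengths[ch] for ch in answer) == cells:
                    for ch, n in lengths.items():
                        seen[ch].add(n)
            for ch in distinct:
                if seen[ch] != possible[ch]:
                    possible[ch] = seen[ch]
                    changed = True
    return possible

# The two entries discussed above: TELLER in six cells, NONE in five.
lengths = deduce_lengths([("TELLER", 6), ("NONE", 5)])
print(lengths["N"], lengths["O"])
```

The loop repeats until nothing changes, so a fact learned from one entry (E is a single letter) is carried into the others.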
Clashes are normally a feature I dread (because I need all the checking letters I can get my hands on), but the generosity of the clues meant it wasn’t a problem for the initial pass. And in fact clashes are crucial in the endgame, because they allow one to see that (for example) the entry for L clashes with the first of the two letters entered for F, and that in turn clashes with the second of the two letters entered for O. In fact there are only about a dozen unique letters in the final grid, and the clashes provide a map of where each unique letter ends up, and how many there are. Counting those occurrences of unique letters, even without knowing what the letters are, provides a great place to start from if guessing letters: the most common are likely to be among EARIOTNS (the 8 most common letters in English based on a scan of the Concise Oxford Dictionary).
I figured that trying all of those combinations on paper would be time-consuming, and I hadn’t yet understood the puzzle title: I’d thought that maybe “Co-star” was the encoding of something like “x-ray” or “q-tip”, but with no obvious starting word it was a dead end. (Of course it was even more straightforward than that: C maps to ST and O maps to AR, so CO becomes STAR. That would have been a great starting point if I’d realised it early enough!) But still, all those combinations were going to lead to a lot of pencilling in and rubbing out.
I started by writing a grid visualiser. I typed up a list of all the entry details (where they started, which direction they went, how long they were) and filled in any answers I’d already worked out. The visualiser would print them in grid form. That let me see which crossing letters clashed, which would be important later. I made it so I could enter some speculative encipherings, and the visualiser would show me those encoded letters, highlighting any clashes. Like this:

In this image, there are four colours:
- The cyan letters (in lower-case) are the original answer, but duplicated over two cells when I knew a letter encoded to two letters. That let me visualise clashes. On the very first row we can see that the second encoded letter from B is the same as the first encoded letter from J. (The two shades of cyan are for across and down entries.)
- The yellow letters are encoded. The encodings here are wrong; I just made some guesses to check the visualiser worked.
- The green letters are also encoded, but go green when both the across and down entries agree. (In a later iteration I had one go green and the other go dark grey, so I wasn’t reading the pairs in each cell.)
- The red letters are encoded letters that clash, showing the encoding is wrong.
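For anyone wanting to try something similar, the core of a visualiser like the one described above might look like the following Python sketch. It is an illustration under assumptions rather than my actual program: the tuple layout `(row, col, direction, text)`, the function names, and the `!` clash marker (standing in for the colours) are all invented here.

```python
def expand(answer, lengths):
    """Repeat each letter once per cell it occupies, e.g. 'none' -> 'noone'."""
    return "".join(ch * lengths.get(ch, 1) for ch in answer)

def place_entries(entries):
    """entries: list of (row, col, direction, text), direction 'a' or 'd'.
    Returns the letters written into each cell and the set of clashing cells."""
    cells = {}
    for row, col, direction, text in entries:
        dr, dc = (0, 1) if direction == "a" else (1, 0)
        for i, ch in enumerate(text):
            cells.setdefault((row + i * dr, col + i * dc), []).append(ch)
    clashes = {pos for pos, chars in cells.items() if len(set(chars)) > 1}
    return cells, clashes

def render(cells, clashes, height, width):
    """Draw the grid as text: '.' for empty cells, '!' where entries disagree."""
    lines = []
    for r in range(height):
        line = ""
        for c in range(width):
            if (r, c) not in cells:
                line += "."
            elif (r, c) in clashes:
                line += "!"
            else:
                line += cells[(r, c)][0]
        lines.append(line)
    return "\n".join(lines)

# A toy 3x3 grid (not the real puzzle): two entries agree at the top-left
# corner, and two disagree at the top-right corner.
entries = [(0, 0, "a", "cat"), (0, 0, "d", "cow"), (0, 2, "d", "xy")]
cells, clashes = place_entries(entries)
picture = render(cells, clashes, 3, 3)
print(picture)
```

Running `expand` on each answer before placing it reproduces the doubling of letters that are known to encode to two cells.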
Then I adapted the code to figure out the unique target letters (repeating the work I’d already done by hand and confirming I’d found them all), and to print each group with the number of occurrences in the clues I’d solved so far. It then printed digits 0-9 in the grid for the ten largest groups; I didn’t know what letters they represented, but I could see where they were the same. This could have been done by hand, actually, but I was enjoying the programming exercise and didn’t want to go back to pencil and paper just yet!
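Grouping the clashes into unique target letters is a classic job for a union-find structure. The Python sketch below is one way to do it, offered as an assumption about approach rather than a record of my code. A "slot" here is a hypothetical `(answer letter, position within its encoding)` pair, so `("F", 0)` means the first of the two letters entered for F.

```python
def group_slots(clashes):
    """clashes: iterable of (slot_a, slot_b) pairs that share a grid cell.
    Returns a list of sets; each set is one unknown target letter."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in clashes:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for slot in list(parent):
        groups.setdefault(find(slot), set()).add(slot)
    return list(groups.values())

# Clashes mentioned in this post: L meets the first letter of F, which meets
# the second letter of O; the second letter of B meets the first letter of J.
clashes = [
    (("L", 0), ("F", 0)),
    (("F", 0), ("O", 1)),
    (("B", 1), ("J", 0)),
]
groups = sorted(group_slots(clashes), key=len, reverse=True)
for number, group in enumerate(groups):
    print(number, sorted(group))
```

Sorting the groups by size and numbering them gives the digits to print in the grid in place of the unknown letters.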

There are plenty of word patterns here but very few I could use (so 23d starts and ends with the same letter, from group 1? Big deal). The most interesting thing in this image is 9a (second cell from the left in the second row down): a ten-letter entry where the 1st, 4th and 5th letters are the same as each other, and the 2nd, 6th and 9th are the same as each other. There can’t be too many words fitting that pattern. I have a file on my computer containing a dictionary’s worth of words, so I scanned it for all words that look like that using a regular expression. I used the “back reference” feature of regexes: instead of me telling it to look for A in those positions, or B in those positions, or C, or D, and so on, I tell it to match any letter in the first position and to remember what it matched, and then later to match against whatever that first letter was. In the following command, the dot means “match any letter”, the round brackets mean “remember what matched here”, and the \1 and \2 mean “match one of the letters remembered earlier”. (The caret and dollar mean to make it a whole word search, although in this case it didn’t make a difference.)
$ cat UKMAC.txt | grep -Ei '^(.)(.).\1\1\2..\2.$'
rearrested
There was just one word: REARRESTED. I immediately told my program how to encode FOLKSY as REARRESTED (F->RE, O->AR and so on), and changed it a little more to copy the encoded letters wherever it could detect a “clash” of original letters (I had the first F encoding and the second O encoding being R, and I knew that was the same as the L encoding, so the program copied R to all the extra matching cells). Suddenly my grid was three quarters full!
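The grep search above can also be built automatically from a digit pattern like the ones the program printed. This is a small Python sketch under assumptions: `pattern_to_regex` is a made-up helper, and the short word list stands in for the real dictionary file. Like the grep command, it does not insist that different groups are different letters.

```python
import re

def pattern_to_regex(pattern):
    """Turn '12.112..2.' into a regex with back references.
    Digits label cells that must hold the same letter; '.' is any letter."""
    group_number = {}
    parts = []
    for ch in pattern:
        if ch == ".":
            parts.append(".")
        elif ch in group_number:
            # Seen this label before: refer back to what it matched.
            parts.append("\\%d" % group_number[ch])
        else:
            # First time: capture whatever letter appears here.
            group_number[ch] = len(group_number) + 1
            parts.append("(.)")
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)

rx = pattern_to_regex("12.112..2.")
words = ["rearrested", "crosswords", "restarted", "REARRESTED"]
matches = [w for w in words if rx.match(w)]
print(rx.pattern, matches)
```

The generated pattern for `12.112..2.` is the same expression as in the grep command.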

As with just about all the gridfills I do, the final stage was figuring out what words could fit in the partially-filled entries I had in the grid, and then determining which of those matched the definition and wordplay of a clue. This was made harder by the fact that each partial entry could have come from several different partial answers! For example, after another few solves I had SP_R_S for 23d. This could have been SPARES, SPARKS, SPORTS, SPURNS, or a few other words. SPARKS was out as nothing encoded to K, but several of the others were possible so I decoded them: for example, SPARES could be decoded as BOIN, BEFN or BELK (none of which were helpful). BALK -> SPIRES looked very promising as BALK meant “stop short”, but the wordplay didn’t match it — so I kept looking, getting BULK -> SPORES which could be BAULK (also “stop short”) without its A, giving BULK (“majority”). An important Listener lesson: always check the parsing makes sense!
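Decoding a candidate entry back into possible answers can be sketched as a small recursive search. In the Python below, the partial encoding table is an assumption pieced together from the examples in this post (only F->RE and O->AR are stated outright), and the `decode` helper and its `answer_length` parameter are invented for illustration.

```python
def decode(entry, encoding, answer_length=None):
    """Return every answer whose encoding spells out `entry`.
    encoding: {answer letter: one- or two-letter code}."""
    reverse = {code: letter for letter, code in encoding.items()}
    results = []

    def walk(pos, so_far):
        if pos == len(entry):
            if answer_length is None or len(so_far) == answer_length:
                results.append(so_far)
            return
        # Each answer letter accounts for either one or two grid letters.
        for size in (1, 2):
            chunk = entry[pos:pos + size]
            if len(chunk) == size and chunk in reverse:
                walk(pos + size, so_far + reverse[chunk])

    walk(0, "")
    return sorted(results)

# Assumed partial table, inferred from the decodings quoted above.
encoding = {"B": "SP", "E": "A", "F": "RE", "I": "E",
            "K": "ES", "L": "R", "N": "S", "O": "AR"}
candidates = decode("SPARES", encoding, answer_length=4)
print(candidates)  # → ['BEFN', 'BELK', 'BOIN']
```

Filtering by the clue's enumeration matters: without the length check the same entry also decodes to shorter and longer strings that the clue rules out.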

With many thanks to Lysander for an excellent puzzle — I particularly liked the bit at the beginning where I said “how on Earth am I supposed to do that?!”.
Dave Hennings said
Blimey.