View all entries in Article

Programming Challenge: Crypt Kicker

Problem 2.8.4 in Programming Challenges by Steven S. Skiena and Miguel A. Revilla:

Decrypt all words in the input text, based on a dictionary of known words.

Given below are samples of possible input:

  • Dictionary: and, dick, jane, puff, spot, yertle
  • Sample input: bjvg, xsb, hxsn, xsb, qymm, xsb, rqat, xsb, pnetfn

This is a substitution crypto with the constraints imposed by the possible mappings between encrypted and clear-text letters. First thing I thought I'd need was a way to get the plain words that could possibly match a crypto word. So I defined a utility function plain-word-list-from-crypted:

(defun plain-word-list-from-crypted (crypted &optional (dictionary *dictionary*))
  "Extract the words from the dictionary that matches the length of the crypted word."
  (if (not crypted)
      '("")
      (remove-if-not #'(lambda (word) (= (length crypted)
                                            (length word)))
                     dictionary)))

* (plain-word-list-from-crypted "rqat")
("dick" "jane" "puff" "spot")

For convenience, I stuffed away the input (the original text asked to parse input lines):

(defparameter *dictionary* '("and" "dick" "jane" "puff" "spot" "yertle"))
(defparameter *input* '("bjvg" "xsb" "hxsn" "xsb" "qymm" "xsb" "rqat" "xsb" "pnetfn"))

Unification

As you can tell, the plain word list doesn't tell us anything about the compatibility between words. We immediately see that the word puff is not one of the target words, by quickly performing the unification in our heads. Inventing <=> for unifiable and </> for not unifiable, we get (from crypto to plain side):

  • r <=> p
  • q <=> u
  • a <=> f
  • t </> f, as f is already associated with a

By setting up a mapping between each visited letter in the crypto and plain words, we can tell when the unification would fail by looking at the previous mappings set up by the unification engine. So, let's start by doing a simple two-word compatibility test:

(defun words-consistent-p (crypt-word plain-word)
  "Internally consistent?"
  (let ((mapping (make-hash-table))
        (reverse-mapping (make-hash-table)))
    (loop for crypt across crypt-word and plain across plain-word
      unless (gethash crypt mapping)
        do (setf (gethash crypt mapping) plain)

      unless (gethash plain reverse-mapping)
        do (setf (gethash plain reverse-mapping) crypt)

      never (or (not (char= (gethash crypt mapping) plain))
                (not (char= (gethash plain reverse-mapping) crypt))))))

It works by setting up a mapping for each letter in the word, as well as a reverse-mapping. Setting up only a (forward) mapping would work for qymm </> pufF, but give a false positive for qymM </> puff. That's why we need to look at the mappings on the right-hand side as well, i.e. the reverse mapping. That realization behind us, checking the words for consistency is a simple matter of filling the mappings and stop when a conflict is found.

It also gave me some much-needed practice in abusing LOOP.

Goal Solving

By unifying all rules, a sought goal is reached. In our case, that would be a full mapping between encrypted and plain letters.

Now, we can't reach our goal by just looking at word pairs. We also need to build up our constraint table, i.e. letter mapping, and continue solving for the other word pairs under the accumulated constraints. One way -- in fact, most likely the only way -- of doing that is to look at crypto words and the set of reasonable plain words associated with it, specifically, the ones of equal length (just a small readability optimization), and then trying to unify the crypto word with those plain words.

The plan:

  1. Start with the first crypto word in the input
  2. Find all possible plain-text matches for that word
  3. For each word, check the pair for consistency, building up a mapping along the way.
  4. If a pair was found, save the mapping and go back to step 2.
  5. When there are no more crypto words in the input, save the mapping and use it to translate between crypto and plain-text words.

WORDS-CONSISTENT-P needs to know how to communicate the results with its caller. Adding input mappings, we get (along with a helper function):

(defun update-hash-table (this other)
  "Update this with the key-value pairs of other."
  (maphash #'(lambda (k v) (setf (gethash k this) v))
           other)
  this)

(defun words-consistent-p (crypt-word plain-word mapping* reverse-mapping*)
  "Internally consistent?"
  (let ((mapping (make-hash-table))
        (reverse-mapping (make-hash-table)))
    (update-hash-table mapping mapping*)
    (update-hash-table reverse-mapping reverse-mapping*)
    (and
      ;; only return the new mapping if the words were consistent
      (loop for crypt across crypt-word and plain across plain-word
        unless (gethash crypt mapping)
          do (setf (gethash crypt mapping) plain)

        unless (gethash plain reverse-mapping)
          do (setf (gethash plain reverse-mapping) crypt)

        never (or (not (char= (gethash crypt mapping) plain))
                  (not (char= (gethash plain reverse-mapping) crypt))))
      (list mapping reverse-mapping))))

Good. It now takes input mappings, checks the words for compatibility and returns the updated mappings.

Reaching The Goal

We now have a way to use the accumulated mapping in the calling program. Looking back at the proposed algorithm and just writing without doing much thinking, we might end up with something similar to the following almost-but-not-quite-there-yet piece of code:

(defun solve-goal (crypto-words mapping reverse-mapping)
  (if crypto-words
    (dolist (plain-word (plain-word-list-from-crypted (first crypto-words)))
      (when (words-consistent-p plain-word (first crypto-words)
                                mapping reverse-mapping)
          (solve-goal (rest crypto-words)
                      (first (words-consistent-p plain-word (first crypto-words)
                                                 mapping))
                      (second (words-consistent-p plain-word (first crypto-words)
                                                 mapping)))))
    mapping))

Just reading it out loud, I think you'll get something quite similar to the original idea: stop when there are no words left, otherwise iterate through each plain word, and if it could be unified with the crypto word, continue trying to solve the goal for the rest of the input using the accumulated mappings.

It still lacks some details, and could be less inefficient. Let's do something about that:

(defun solve-goal (crypt-words &optional (mapping (make-hash-table))
                   (reverse-mapping (make-hash-table)))
  (if crypt-words
    (dolist (plain-word (plain-word-list-from-crypted (first crypt-words)))
      (let ((dicts (words-consistent-p (first crypt-words) plain-word mapping
                                       reverse-mapping)))
        (format t "~A <=> ~A? ~A~%" (first crypt-words) plain-word (if dicts "Yes" "No"))
        (when dicts
          (let ((final-mapping (solve-goal (rest crypt-words)
                                           (update-hash-table mapping (first dicts))
                                           (update-hash-table reverse-mapping (second dicts)))))
                     (when final-mapping
                       (return final-mapping))))))
    mapping))

Let's run this:

* (solve-goal *input*)
bjvg <=> dick? Yes
xsb <=> and? Yes
hxsn <=> dick? No
hxsn <=> jane? Yes
xsb <=> and? Yes
qymm <=> dick? No
qymm <=> jane? No
qymm <=> puff? Yes
xsb <=> and? Yes
rqat <=> dick? No
rqat <=> jane? No
rqat <=> puff? No
rqat <=> spot? Yes
xsb <=> and? Yes
pnetfn <=> yertle? Yes
#<HASH-TABLE :TEST EQL :COUNT 17 {AC1E799}>

Looks good. To verify the mapping is OK, try to translate the crypto text:

(defun translate-crypt-input (dictionary crypt-words)
  (let (words)
    (dolist (crypt-word crypt-words)
      (push (coerce (loop for c across crypt-word collecting (gethash c dictionary)) 'string)
            words))
    (format nil "~{~A~^ ~}" (nreverse words))))

* (translate-crypt-input (solve-goal *input*) *input*)
"dick and jane and puff and spot and yertle"

Sweet! Let's try it with another data set where the solver needs to do more backtracking:

(defparameter *input** '("rqat" "qymm" "bjvg" "xsb" "hxsn" "xsb" "qymm" "xsb" "rqat" "xsb" "pnetfn"))
* (solve-goal *input**)

rqat <=> dick? Yes
qymm <=> dick? No
qymm <=> jane? No
qymm <=> puff? No
qymm <=> spot? No
rqat <=> jane? No
rqat <=> puff? No
rqat <=> spot? No
NIL

Hey! What happened? Once it has failed, it seems to just continue failing!

Backtracking

Backtracking is the concept of trying to reach a goal with the first item in a data set and continue moving towards that goal by unification, peeling off another item from the input on success, until the first failure. When that happens, back up one step in the input and try again with next item.

As it turns out, we need to pass along a fresh set of mappings for each invocation of SOLVE-GOAL or the backtracking won't work. This concept maps nicely onto a recursive call chain. However, as noted, we can't allow for failures to propagate backwards, so we need to copy the mappings when passing them on.

While we're at it, let's make SOLVE-GOAL clue us in on the current step of the unification process:

(defun copy-and-update-hash-table (this other)
  (let ((new (make-hash-table)))
    (update-hash-table new this)
    (update-hash-table new other)
    new))

(defun solve-goal (crypt-words &optional (mapping (make-hash-table))
                   (reverse-mapping (make-hash-table)) (level 0))
  (if crypt-words
    (dolist (plain-word (plain-word-list-from-crypted (first crypt-words)))
      (let ((dicts (words-consistent-p (first crypt-words) plain-word mapping reverse-mapping))
            (spaces (coerce (loop for i from 0 to (* 2 level) collecting #\Space) 'string)))
        (format t "~A~A <=> ~A? ~A~%" spaces (first crypt-words) plain-word (if dicts "Yes" "No"))
        (when dicts
          (let ((final-mapping (solve-goal (rest crypt-words)
                                           (copy-and-update-hash-table mapping
                                                                       (first dicts))
                                           (copy-and-update-hash-table reverse-mapping
                                                                       (second dicts))
                                           (1+ level))))
                     (when final-mapping
                       (return final-mapping))))))
    mapping))

There! A test run should now give us a proper mapping, if the goal was reached:

* (solve-goal *input**)
 rqat <=> dick? Yes
   qymm <=> dick? No
   qymm <=> jane? No
   qymm <=> puff? No
   qymm <=> spot? No
 rqat <=> jane? Yes
   qymm <=> dick? No
   qymm <=> jane? No
   qymm <=> puff? No
   qymm <=> spot? No
 rqat <=> puff? No
 rqat <=> spot? Yes
   qymm <=> dick? No
   qymm <=> jane? No
   qymm <=> puff? Yes
     bjvg <=> dick? Yes
       xsb <=> and? Yes
         hxsn <=> dick? No
         hxsn <=> jane? Yes
           xsb <=> and? Yes
             qymm <=> dick? No
             qymm <=> jane? No
             qymm <=> puff? Yes
               xsb <=> and? Yes
                 rqat <=> dick? No
                 rqat <=> jane? No
                 rqat <=> puff? No
                 rqat <=> spot? Yes
                   xsb <=> and? Yes
                     pnetfn <=> yertle? Yes
#<HASH-TABLE :TEST EQL :COUNT 17 {AEBBB69}>

It seems so! Let's verify:

* (translate-crypt-input * *input**)
"spot puff dick and jane and puff and spot and yertle"

Nice!

Notes

Only the first goal will be reached, any other (possible) goals are discarded. Also, instead of hash tables, association lists and assoc/rassoc could be used. Less wasting could be accomplished with a stack where the failing result could be popped off. It could be written in a more idiomatic Common Lisp style.

But that's for another day!


Responses

You can follow any responses to this entry through the RSS 2.0 feed.

  1. adeht said on February 5th, 2009 at 12:10 (link)

    Here’s my quick (half an hour?) attempt. Forgive the silly names, it’s late :)

    http://paste.lisp.org/display/74915

    (defparameter *dictionary*
      '("and" "dick" "jane" "puff" "spot" "yertle"))
     
    (defparameter *input*
      '("bjvg" "xsb" "hxsn" "xsb" "qymm" "xsb" "rqat" "xsb" "pnetfn"))
     
    (defun solve ()
      (decrypt *input* (find-mapping *input* *dictionary* (make-mapping))))
     
    (defun find-mapping (input dictionary mapping)
      (cond ((null mapping) nil)
            ((complete-mapping-p mapping input) mapping)
            (t (let ((term (find-unmapped-term input mapping)))
                 (some (lambda (word)
                         (find-mapping (remove term input :test #'equal)
                                       (remove word dictionary :test #'equal)
                                       (extend-mapping term word mapping)))
                       dictionary)))))
     
    (defun complete-mapping-p (mapping input)
      (every (lambda (term)
               (every (lambda (c) (lookup-crypt c mapping)) term))
             input))
     
    (defun make-mapping () '((t . t)))
    (defun lookup-crypt (c mapping) (assoc c mapping))
    (defun lookup-plain (c mapping) (rassoc c mapping))
     
    (defun extend-mapping (term word mapping)
      (when (= (length term) (length word))
        (loop for tc across term
              for wc across word
              for m1 = (lookup-crypt tc mapping)
              for m2 = (lookup-plain wc mapping)
              do (cond ((and m1 m2)
                        (unless (eql m1 m2)
                          (return-from extend-mapping nil)))
                       ((or m1 m2)
                        (return-from extend-mapping nil))
                       (t (push (cons tc wc) mapping))))
        mapping))
     
    (defun find-unmapped-term (input mapping)
      (find-if (lambda (term)
                 (not (complete-mapping-p mapping (list term))))
               input))
     
    (defun decrypt (input mapping)
      (when mapping
        (mapcar (lambda (term) (decrypt-term term mapping)) input)))
     
    (defun decrypt-term (term mapping)
      (map 'string (lambda (c) (cdr (lookup-crypt c mapping))) term))
    
  2. Mikael Jansson said on February 5th, 2009 at 12:51 (link)

    adeht,

    Looks good!

    I did try the alist route (even had the same make-mapping as you did :)) when I was done with the hash-table version, but I was too tired to realize the fail-condition was as easy as comparing the cons pairs of the mapping. So I scrapped that and wrote a piece on what I had working.

    Mind if I add the paste to your comment?

  3. adeht said on February 5th, 2009 at 13:00 (link)

    Hey Mikael,

    I don’t mind at all. Go ahead, and thanks for posting something interesting.

  4. adeht said on February 5th, 2009 at 22:58 (link)

    Hello again Mikael,

    You missed a reference to the alphabet variable in the recursive call to find-mapping.

    Sorry! Thanks for noticing.
    –Mikael

  5. Mikael Jansson » Blog Archive » Jag, en arg LCHF-människa? said on March 5th, 2009 at 19:15 (link)

    [...] Programming Challenge: Crypt Kicker [...]

  6. Alex said on June 24th, 2009 at 22:16 (link)

    http://paste.lisp.org/display/82457

    (defun unify-letters (letter-map letter1 letter2)
      (char= letter2 (internhash letter-map letter1 letter2)))
     
    (defun unify-words (word1 word2 letter-map)
      (when (= (length word1) (length word2))
        (let ((map (copy-hash-table letter-map)))
          (values (every (sp #'unify-letters map _ _) word1 word2) map))))
     
    (defun unify (word1 word2 letter-map)
      (multiple-value-bind (succeded-p map) (unify-words word1 word2 letter-map)
        (when succeded-p
          (maphash (sp #'internhash letter-map _ _) map))))
     
    (defun solve (dictionary input)
      (let ((map (hash)))
        (loop for w1 in input
           do (loop for w2 in dictionary
           do (unify w1 w2 map)))
        map))
     
    (defun purify (input)
      (sort (remove-duplicates (copy-list input) :test #'string=) #'string&gt;))
     
    (defun decrypt (dictionary input)
      (let ((map (solve dictionary input)))
        (mapcar (lambda (word) (map 'string (sp #'gethash _ map) word)) input)))
    

If this is the first time you comment on my site, your post might get held up in the moderation queue. Don't panic, I'll approve it as soon as possible!

Code (i.e. literal) blocks should be wrapped inside <PRE>.

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Close
Powered by ShareThis