I've been playing a lot of Solitaire recently, and for some reason I got the crazy idea of encoding data as a series of playing cards. The idea started as a moderately-complicated encoding scheme resulting in security through obscurity (with a bit of whimsy thrown in) but then I thought a more straightforward approach would be useful in a greater variety of situations. Relative to the complicated scheme, that is.
So anyway, now I've got a method of representing arbitrary binary data as a series of playing cards — or, equivalently, a method of efficiently representing a series of playing cards as a binary stream. Each card is represented by either 5 or 6 bits. Because of this inequity, translating from binary to base 52 will produce the 5-bit cards (namely aces, kings, and queens) twice as frequently as the others.
Of course, when a computer converts from binary to base 52, it still has to internally represent the base 52 in binary. Using six bits to store a card, base 52 becomes a subset of base 64. The computer could just as easily represent one card with an 8-, 16-, or 32-bit value (of which only 6 bits are used), a pair of 8-bit characters, or perhaps a longer text string which might be the name of a graphic representing the card. Apparently, there may be Unicode codepoints representing the 52 playing cards, but I can't quickly find them in the charts.
The following table lists each card's “convenient” 6-bit encoding (convenient because the suit is in the high two bits and the value is in the low four bits) as well as its packed 5- or 6-bit binary form. Interestingly, when the packed form is 6 bits long, it's the same as the convenient 6-bit form. By the way, I designed the card graphics used in the table. Is anyone surprised that they look kind of like highway signs?
Notice that last ace of diamonds is generated from the last two bits of the message itself, plus three imagined zeroes. Because the message ended before completing a valid bit code representing a card, it's padded with at most five zeroes to complete the card. When the sequence of cards is assumed to represent binary data in multiples of eight bits, any leftover bits on the end can be discarded as padding when reconstructing the original binary message. (The alternative is to pad the message again, which would result in an extra null-byte appearing on the end of most reconstructed messages — which sometimes isn't a problem at all.) Interestingly, had I left the exclamation point out of the example, there wouldn't be an issue to point out.
On the other hand, if the binary message is not to be assumed to have byte-aligned length, then some method is needed to encode the exact original length of the binary message. My solution is relatively simple. Remove the last two bits from the binary message and set them aside. Encode the message as usual, but pay attention to how many padding bits are added to the end in the process. Take the number of padding bits (replacing the number zero with eight, and optionally replacing one with nine) and that's the value of a new card, whose suit is determined by the two bits set aside earlier. When the original binary stream is reconstructed, remove and set aside the last card before translating the sequence of cards as normal. Take the lower 3 bits of the 6-bit form of the last card as a number, and discard that many bits from the end of the binary stream already decoded. Finally, take the upper two bits of the 6-bit form of the last card, and append them onto the binary stream. The result is the original binary stream, with no extra padding bits. Of course, this procedure (which adds an extra card of overhead to about 2⁄3 of all messages) is only recommended where the binary message is not guaranteed to end on a byte boundary.
Now, when the original message is a sequence of cards, translating to binary data which must end on a byte boundary could require so much padding that a whole extra card appears on the end of the message. To avoid this, translate the cards to binary as normal but don't immediately pad to a byte boundary. Append a “dummy ” zero bit, and then pad the binary string to the next byte boundary, paying attention to how many pad bits (not counting the dummy) are required. If five or more pad bits were required, set to 1 the lowest bit of the final byte. When decoding, remove the lowest bit from the last byte and set it aside. Convert the resulting binary string to a sequence of cards; if there are leftover bits that don't make a complete card in packed format, discard them. If the set-aside bit is set (1) then discard the last-decoded card from the sequence. The result is the original sequence, without any extra cards that weren't there before.
Text and graphics on this page are copyright ©2010 Vid the Kid. The mapping between playing cards and binary data expressed in the table is hereby released into the public domain.