Base 52

I've been playing a lot of Solitaire recently, and for some reason I got the crazy idea of encoding data as a series of playing cards. The idea started as a moderately-complicated encoding scheme resulting in security through obscurity (with a bit of whimsy thrown in) but then I thought a more straightforward approach would be useful in a greater variety of situations.  Relative to the complicated scheme, that is.

So anyway, now I've got a method of representing arbitrary binary data as a series of playing cards — or, equivalently, a method of efficiently representing a series of playing cards as a binary stream.  Each card is represented by either 5 or 6 bits.  Because of this inequity, translating from binary to base 52 will produce the 5-bit cards (namely aces, kings, and queens) twice as frequently as the others.

Of course, when a computer converts from binary to base 52, it still has to internally represent the base 52 in binary.  Using six bits to store a card, base 52 becomes a subset of base 64.  The computer could just as easily represent one card with an 8-, 16-, or 32-bit value (of which only 6 bits are used), a pair of 8-bit characters, or perhaps a longer text string which might be the name of a graphic representing the card.  Apparently, there may be Unicode codepoints representing the 52 playing cards, but I can't quickly find them in the charts.

Encoding

The following table lists each card's “convenient” 6-bit encoding (convenient because the suit is in the high two bits and the value is in the low four bits) as well as its packed 5- or 6-bit binary form.  Interestingly, when the packed form is 6 bits long, it's the same as the convenient 6-bit form.  By the way, I designed the card graphics used in the table.  Is anyone surprised that they look kind of like highway signs?

Card 6-Bit Packed
Dec Bin Bin
Ace of Spades 1 000001 00000
Two of Spades 2 000010
Three of Spades 3 000011
Four of Spades 4 000100
Five of Spades 5 000101
Six of Spades 6 000110
Seven of Spades 7 000111
Eight of Spades 8 001000
Nine of Spades 9 001001
Ten of Spades 10 001010
Jack of Spades 11 001011
Queen of Spades 12 001100 00110
King of Spades 13 001101 00111
Ace of Diamonds 17 01001 01000
Two of Diamonds 18 010010
Three of Diamonds 19 010011
Four of Diamonds 20 010100
Five of Diamonds 21 010101
Six of Diamonds 22 010110
Seven of Diamonds 23 010111
Eight of Diamonds 24 011000
Nine of Diamonds 25 011001
Ten of Diamonds 26 011010
Jack of Diamonds 27 011011
Queen of Diamonds 28 011100 01110
King of Diamonds 29 011101 01111
Ace of Clubs 33 100001 10000
Two of Clubs 34 100010
Three of Clubs 35 100011
Four of Clubs 36 100100
Five of Clubs 37 100101
Six of Clubs 38 100110
Seven of Clubs 39 100111
Eight of Clubs 40 101000
Nine of Clubs 41 101001
Ten of Clubs 42 101010
Jack of Clubs 43 101011
Queen of Clubs 44 101100 10110
King of Clubs 45 101101 10111
Ace of Hearts 49 110001 11000
Two of Hearts 50 110010
Three of Hearts 51 110011
Four of Hearts 52 110100
Five of Hearts 53 110101
Six of Hearts 54 110110
Seven of Hearts 55 110111
Eight of Hearts 56 111000
Nine of Hearts 57 111001
Ten of Hearts 58 111010
Jack of Hearts 59 111011
Queen of Hearts 60 111100 11110
King of Hearts 61 111101 11111

Example

Text H e l l o I n t e r n e t !
Binary 0 1 0 0 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 0 1 1 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 1 1 0 1 1 1 0 0 1 1 1 0 1 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 0 1 0 0 1 1 0 1 1 1 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 0 0 1 0 0 0 0 1 ---
Cards Two of Diamonds Six of Spades Five of Diamonds Queen of Clubs Queen of Diamonds Ace of Hearts Seven of Hearts Four of Clubs Two of Spades Two of Diamonds Seven of Hearts King of Spades Ace of Clubs Two of Hearts King of Clubs Nine of Spades King of Clubs Queen of Spades Five of Diamonds Four of Hearts Eight of Spades Ace of Diamonds

Length and Padding 

Notice that last ace of diamonds is generated from the last two bits of the message itself, plus three imagined zeroes.  Because the message ended before completing a valid bit code representing a card, it's padded with at most five zeroes to complete the card.  When the sequence of cards is assumed to represent binary data in multiples of eight bits, any leftover bits on the end can be discarded as padding when reconstructing the original binary message.  (The alternative is to pad the message again, which would result in an extra null-byte appearing on the end of most reconstructed messages — which sometimes isn't a problem at all.)  Interestingly, had I left the exclamation point out of the example, there wouldn't be an issue to point out. 

On the other hand, if the binary message is not to be assumed to have byte-aligned length, then some method is needed to encode the exact original length of the binary message.  My solution is relatively simple.  Remove the last two bits from the binary message and set them aside. Encode the message as usual, but pay attention to how many padding bits are added to the end in the process.  Take the number of padding bits (replacing the number zero with eight, and optionally replacing one with nine) and that's the value of a new card, whose suit is determined by the two bits set aside earlier.  When the original binary stream is reconstructed, remove and set aside the last card before translating the sequence of cards as normal.  Take the lower 3 bits of the 6-bit form of the last card as a number, and discard that many bits from the end of the binary stream already decoded.  Finally, take the upper two bits of the 6-bit form of the last card, and append them onto the binary stream.  The result is the original binary stream, with no extra padding bits. Of course, this procedure (which adds an extra card of overhead to about 23 of all messages) is only recommended where the binary message is not guaranteed to end on a byte boundary.

Now, when the original message is a sequence of cards, translating to binary data which must end on a byte boundary could require so much padding that a whole extra card appears on the end of the message.  To avoid this, translate the cards to binary as normal but don't immediately pad to a byte boundary.  Append a “dummy ” zero bit, and then pad the binary string to the next byte boundary, paying attention to how many pad bits (not counting the dummy) are required.  If five or more pad bits were required, set to 1 the lowest bit of the final byte.  When decoding, remove the lowest bit from the last byte and set it aside.  Convert the resulting binary string to a sequence of cards; if there are leftover bits that don't make a complete card in packed format, discard them. If the set-aside bit is set (1) then discard the last-decoded card from the sequence.  The result is the original sequence, without any extra cards that weren't there before.

Intellectual Property

Text and graphics on this page are copyright ©2010 Vid the Kid. The mapping between playing cards and binary data expressed in the table is hereby released into the public domain.