Jailcell RC 4 Paper
Jailcell RC 4 Paper
Jailcell RC 4 Paper
Abstract
The possibility of construction of a cipher that
can be implemented using only paper and a pencil
while still remaining secure against computer
cryptanalytic attacks is investigated. A
modified version of the RC4 cipher that is
believed to meet these requirements is proposed.
This result challenges the belief, widely-held
by the cryptographic community, that paper-and-
pencil ciphers are inherently insecure.
1. Introduction
A scenario is defined in which a cipher is required that can be implemented
without the benefit of a computer. Several existing ciphers are considered for this
application. A modified version of the RC4 cipher is proposed which is believed to be of
practical use in the scenario. The security of this cipher is analyzed in light of known
cryptanalytic attacks against RC4, and the practical paper-and-pencil implementation of
the cipher is considered.
Data Encryption Standard (DES) cipher with an attack complexity of 2 56 using a custom-
designed computer costing less than $250,000, completing the attack in three days [4].
2. Discarded Options
Various ciphers have been previously designed that can operate without the use of
computers. Several such ciphers were considered for use in this scenario and ultimately
rejected for various reasons. Most well-known paper-and-pencil ciphers, such as the
simple substitution cipher, are known to be extremely insecure and were not considered.
3. The Cipher
RC4 (which stands for “Ron’s Code #4” or “Rivest Cipher #4”) is a cipher
developed by Ron Rivest in 1987 for RSA Data Security, Inc. [14] It is a stream cipher,
which means that it generates a string of pseudo-random numbers (called the keystream)
of the same length as the plaintext, and combines the two to obtain the ciphertext.
The operation for combining the two is usually a bitwise XOR (exclusive or), but
for the jail cell cipher, addition (modulo the size of the alphabet) may be used instead,
because it is computationally easier for humans. Any one-to-one function for mapping
the plaintext to the ciphertext using the keystream is equally secure.
Jeremy Lennert Jail Cell Cipher – March 19, 2002 -5-
(1) i = i + 1 mod m
(2) j = j + P[i] mod m
(3) swap P[i] and P[j]
(4) output k = P[P[i] + P[j] mod m]
Where P[x] denotes the value of the xth element of the permutation table P.
with K[i] denoting the ith value of the key. Key values are all in the range [0, N-1].
After this, the i and j counters are once again initialized to zero and the cipher
begins generating the keystream. Therefore, a small change in the secret key produces a
pseudo-random change in the initial permutation table, allowing the cryptographer to use
closely-related keys with much less risk. However, several keys which are weak under
this key-scheduling algorithm have been observed [5, 12].
For JCRC4, the original key schedule was thought to be infeasible due to limits
on human computational speed and accuracy. A different one, based on the idea of
double hashing [2], follows. The key-scheduling algorithm is shown in figure 1.
Keystream generation is shown in figure 2.
Jeremy Lennert Jail Cell Cipher – March 19, 2002 -6-
The initial permutation is derived from an array of key values K, where each
value is a nonzero alphabetical character or its numerical equivalent. For m = 37, K
needs to contain at least 13 values to achieve 64 bits of actual key information.
The first key value K[0] is taken to be the first alphabetical character to be
inserted into the permutation table. It is inserted into the slot of the permutation table
designated by the next key value K[1].
Each successive key value indicates the position relative to the last insertion (mod
m) where the next character should be inserted. For example, if the most recent character
A is inserted into slot 7, and the next key value is 12, then B is inserted into slot (7 + 12
mod 37) = 19. If this slot is occupied (for example, if X is already in slot 19), then the
position is displaced by the same key value again (19 + 12 mod 37 = slot 31). If m is
prime, and key values of 0 are disallowed, then all key values will be relatively prime to
m and therefore all table slots will eventually be probed if no open slot is found.
When the end of the key is reached, key values are reused, starting again with
K[0] (but using it as a displacement this time). Note that because zeroes cannot be used
in the key, zero cannot be the first character placed, nor can the first character be placed
in S[0]. This means that zero will appear in S[0] with 1/36 probability instead of 1/37.
This could be avoided if K[0] and K[1] were allowed to be zero, but if this is the case,
these key values cannot be reused, since zero is not relatively prime to m = 37.
Because RC4 is a stream cipher and the same permutation table configuration
cannot be used to send more than one message, K needs to be modified for each message.
The modification proceeds as follows: K[0] is incremented by 1 (mod m), and K[last] is
rotated to the front of the array (with all other values being moved down the list one slot).
This obviously does not rotate through all possible keys, but it does rotate through (m - 1)
* |K|, which for proposed JCRC4 is at least (36 * 13) = 468 keys, probably enough for
most realistic paper-and-pencil applications. Alternatively, the first thirteen keystream
outputs after the message is encrypted could be used as the key for the next message, but
this requires the cryptographer to memorize a completely new random string for every
message.
For a key of length l, the (l - 1)th output and later are guaranteed to be influenced
by every word of the key, because the i counter forced the generator to utilize at least x
values from the permutation table in the first x outputs, and no value is inserted by the
first key value (that value determines the first value to be inserted). Therefore, if the first
(l - 2) outputs are discarded, the cipher should not be vulnerable to partial-key attacks,
where information about the key is obtained from outputs that are unaffected by some
parts of the key.
However, it should be noted that if an attacker can recover the full state of the
permutation table, reconstruction of the original key is trivial, and so all other messages
sent by rotating the key as described above will then be readable by the attacker.
4. Cipher Security
The security of the JCRC4 cipher is analyzed in light of previous cryptanalytic
attacks against RC4. RC4 distinguishers are considered, as well as attacks on the RC4
generator and original key schedule.
Jeremy Lennert Jail Cell Cipher – March 19, 2002 -7-
4.1 Distinguishers
Distinguishers attempt to distinguish between the keystreams output by a pseudo-
random number generator and a truly random string. The ability to make such a
distinction does not itself constitute an attack, but attacks can often be derived from them,
and they can also be used to help the attacker more accurately guess the plaintext of a
message from knowledge of the ciphertext.
Distinguishers for RC4 are presented in [6] and [7], and some other statistical
anomalies of the RC4 generator are noted in [9]. Golić [7] estimates that his
distinguisher can be used to differentiate between an 8-bit RC4 keystream and a random
string by looking at roughly 240 outputs. Fluhrer and McGrew [6] use information theory
to revise this figure slightly, proving that 2 44.7 outputs are necessary using Golić’s
distinguisher to reduce the chances of false positive and false negatives to 10%. They
also present a better distinguisher that uses the output bigram frequencies from RC4
which can achieve the same accuracy with only 230.6 outputs.
However, even their distinguisher still needs 2 18.62 outputs, or roughly 400,000, to
distinguish between 5-bit RC4 and randomness with the same accuracy (JCRC4 is
slightly larger than 5-bit RC4). It is unlikely that any messages exceeding four hundred
thousand characters will be encrypted with paper and pencil, so this poses no significant
danger to JCRC4.
None of the above have experimentally broken 5-bit RC4, but it seems that it
might be within the reach of a determined attacker with a large budget to crack it in a
matter of days or weeks. However, this is a known-plaintext attack (that is, designed
assuming that the attacker already knows the plaintext and is only trying to recover the
key), and would increase considerably in complexity if the attacker had knowledge only
of the ciphertext. Additionally, depending upon the message, remaining secure for a
matter of days or weeks may be sufficient.
5. Cipher Implementation
The necessary time to implement JCRC4 and its key-scheduling algorithm using
only paper-and-pencil computation is estimated from experimental measurements of
human computation speed. The error occurrence rate in JCRC4 is estimated from the
same experimental data, and procedures for limiting error occurrence are suggested.
To confirm this estimate, high school students were asked to perform a series of
randomly generated two-digit addition problems modulo 37. All students were given
three pages (340 problems) and ten minutes to complete as many as possible. They were
instructed to prioritize accuracy above speed, and were not allowed to use calculators or
other aids. Furthermore, testers were instructed to answer the problems in order, rather
than skipping more difficult problems.
Thirty-seven students took the test. Two tests were removed from the sample due
to failure to answer the problems in order, and one additional test was removed because it
was not signed by the tester. On one test, two problems answered with “37” which
should have been zero were marked correct, since the error was systematic and appeared
to result from a misunderstanding of the modulus operation. In other instances, an
answer larger than 36 was counted wrong even if it was equivalent (mod 37) to the
correct answer.
The speeds ranged from 2.1 problems/minute to 20.9 problems/minute (average =
8.9 problems/minute = 6.7 seconds/problem). The accuracies ranged from 83.9% to
100% (average = 95.0%).
The two math classes which took the test differed significantly in both accuracy
and speed. In the first class, out of nine testers, the speeds ranged from 6.4
problems/minute to 20.9 problems/minute (average = 12.5 problems/minute = 5.31
sec/problem) and the accuracies ranged from 95.7% to 100% (average = 97.7%). The
average speed for this class was very close to the estimated speed. In the second class,
out of twenty-three testers, speeds ranged from 2.1 problems/minute to 13.9
problems/minute (average = 7.3 problems/minute) and the accuracies ranged from 83.9%
to 100% (average = 93.8%).
Exactly one student in each class obtained an accuracy of 100%, but the student to
do so in the first class was faster than the student to do so in the second. On average, the
first class exceeded the speed of the second by 70% of the second class’s speed, and
exceeded their accuracy by an absolute margin of 3.9%.
Two factors were observed which may have contributed to this discrepancy. The
first is that the first class performed the experiment on the same day that they took the
American Mathematics Competition (AMC) test, and so may have been “warmed up”
from the additional mathematics practice earlier in the day. The second is that the second
class performed the experiment during the first period of the school day (prior to 9:00
A.M.) and may have been experiencing some residual sleepiness after awakening.
It should be noted that most of the test subjects had never performed the modulus
operation prior to this experiment. 72% of all errors occurred on problems where the
modulus operation changed the answer, even though such problems made up only 50% of
all problems on the test. It is thought that, with practice, the frequency of errors on these
problems could become as low as the frequency of errors on problems requiring no
modulus. If this occurred (without altering the error rate on problems requiring no
modulus) the average accuracy would rise from 95.0% to 97.2%.
It is thought that this error rate could be further reduced with practice, thought it is
not certain by what margin or with how much practice. Eventually, a practitioner could
construct a mental table for two-digit addition modulo 37, at which point the speed and
accuracy of the operation may become comparable to the speed and accuracy for single-
Jeremy Lennert Jail Cell Cipher – March 19, 2002 - 10 -
100%
98%
96%
94%
92%
90%
88%
86%
84%
82%
0 4 8 12 16 20 24
The worst case scenario in practice is not quite this bad, because each placement
is not independent of the others. For example, it is not possible to experience the
maximum number of collisions for a placement on two consecutive placements.
However, the reuse of key values also subjects the placements to a very mild form of
secondary clustering, which may partially negate such effects. However, in any case, the
worst case scenario cannot be any worse than this, and may be significantly better.
Exhaustive computation reveals that the average number of collisions for a single
placement is always equal to the maximum number of collisions (i - 2) divided by 37
Jeremy Lennert Jail Cell Cipher – March 19, 2002 - 12 -
minus the maximum number of collisions, which indicates that the average number of
total collisions in the key set-up is:
36
i–2
39 – i 65 collisions (average)
i=3
Experimentally, however, out of 500 randomly-selected keys for which the key-
scheduling algorithm was completed by computer, the average number of collisions was
closer to 85. This is likely due to secondary clustering, where the reuse of key values
tends to cause values to be inserted in relative positions more likely to cause longer
chains of collisions.
(1) i = i + 1 mod m
(2) j = j + P[i] mod m
(3) swap P[i] and P[j]
(4) output k = P[P[i] + P[j] mod m]
Because both counters are set each round based, at least in part, upon their values from
the previous round, these errors are propagated throughout the rest of the keystream.
Such an error causes the entire keystream from that point onward to become invalid.
(2) The second occurs when one or more values in the permutation table become
inaccurate. On any round where the i counter points to an erroneous value, the error is
transmitted to the value of the j counter, creating an error of the first type, and
invalidating the entire keystream from that point onward.
On any round where the j counter points to an erroneous value, that word of the
output will be incorrect. Additionally, the incorrect value will be moved to the cell just
probed by the i counter, insuring that it will not be pointed to by i (and propagated
throughout the rest of the keystream) for at least N rounds.
(3) The third type of error occurs when the output for the current round is
miscalculated, or the plaintext character being encrypted is inadvertently shifted by a
value other than the output (the intended shift). This type of error corrupts only that letter
of the message, and cannot propagate.
Because line one only involves a simple increment, it is unlikely that any errors
will occur there. However, if an error did occur on line one, this would create an error of
the first type and corrupt the keystream from that point forward.
If an error occurred on line two while setting the j counter, this would also create
an error of the first type and corrupt the keystream from that point forward.
An error on line three would create an error of the second type, causing one or
two improper values to be entered into the permutation table. This would corrupt the
current output, and might corrupt additional outputs later on (see above).
An error on line four, or while applying the keystream word to the plaintext as a
shift, would result in an error of the third type, corrupting only that character of the
message.
If an error is made while computing the key-scheduling algorithm, an insertion
may be made in the wrong position, creating an error of the second type. Because each
insertion is also used as the starting point for the next insertion, barring a second error
that happens to correct the first, all subsequent values inserted into the permutation table
will also be erroneous.
encryption prior to a permanent error. These figures are made still more troublesome by
the fact that the first 11 outputs need to be discarded for security.
The key schedule requires even more operations. For a key with the calculated
average number of collisions, there exists about a 10% chance that a human with 97.7%
accuracy will compute the initial permutation table without error.
6. Conclusion
A scenario was defined to determine the requirements of a jail cell cipher. It was
determined that most traditional paper-and-pencil ciphers, machine ciphers, one-time
pads, book ciphers, and card ciphers to not meet these requirements. A modified version
of the RC4 cipher was proposed with a reduced permutation table size and a new key-
scheduling algorithm which appears to fulfill all the requirements defined by the
scenario.
Additional cryptanalysis of the key-scheduling algorithm for JCRC4 is needed to
verify its security. Additional research is also required to determine the potential
improvement in the speed and accuracy of human computation with practice. However,
it is apparent that Jail Cell RC4 can be performed using reasonable human computational
abilities in less than an hour without error-checking, and in less than a day with sufficient
error-checking to compensate for the experimentally determined error rate.
There are also no known cryptanalytic attacks which can be used to recover the
key of JCRC4 without nontrivial computational resources. The best known attack is
sufficiently computationally complex that none of the cited implementations [8, 10, 11]
have actually performed the attack, and the attack becomes considerably more difficult if
the plaintext of the encrypted message is unknown to the attacker.
Jeremy Lennert Jail Cell Cipher – March 19, 2002 - 15 -
References
[1] Nikita Borisov, Ian Goldberg, David Wagner; “Intercepting Mobile Communications:
the Insecurity of 802.11”
[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest; Introduction to
Algorithms, The MIT Press, McGraw-Hill Book Company, 1990, p.232-237
[3] Paul Crowley; “Problems with Bruce Schneier’s ‘Solitaire,’” August 13, 2001,
http://www.ciphergoth.org/crypto/solitaire/
[4] Electronic Frontier Foundation; “Cracking DES,” http://www.eff.org/descracker/
January 19, 1999
[5] Scott Fluhrer, Itsik Mantin, Adi Shamir; “Weaknesses in the Key Scheduling
Algorithm of RC4,” Lecture Notes in Computer Science Vol. 2259, SAC 2001
[6] Scott R. Fluhrer, David McGrew; “Statistical Analysis of the Alleged RC4 Stream
Cipher,” Fast Software Encryption Seventh International Workshop, Springer-
Verlag, March, 2000
[7] Jovan Dj. Golić; “Linear Statistical Weakness of Alleged RC4 Keystream Generator,”
EUROCRYPT 1997,
http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/Golic.PDF
[8] Bob Jenkins; “solitaire, cryptonomicon” post on sci.crypt newsgroup, August 11,
1999; source code at http://burtleburtle.net/bob/crypto/partrc4.c and
http://burtleburtle.net/bob/c/brute.c
[9] Robert J. Jenkins Jr.; “ISAAC and RC4,” 1993-1996,
http://burtleburtle.net/bob/rand/isaac.html
[10] Lars R. Knudsen, Willi Meier, Bart Preneel, Vincent Rijmen, Sven Verdoolaege;
“Analysis Methods for (Alleged) RC4,” ASIACRYPT 1998,
http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/Knudsen.ps
[11] S. Mister, S. Tavares; “Cryptanalysis of RC4-like Ciphers,” in the Workshop Record
of the Workshop on Selected Areas in Cryptography (SAC ’98), Aug. 17-18,
1998, p.136-148
[12] Andrew Roos, “A Class of Weak Keys in the RC4 Stream Cipher,”
http://www.achtung.com/crypto/roosattack.txt
[13] John J. G. Savard, “A Cryptographic Compendium,” 1998, 1999, 2000,
http://home.ecn.ab.ca/~jsavard/crypto/entry.htm
[14] Bruce Schneier; Applied Cryptography, Second Edition, John Wiley & Sons, 1996
[15] Bruce Schneier; “The Solitaire Encryption Algorithm,” Appendix, Cryptonomicon
(by Neal Stephenson), Avon Books, 1999, p.911-918
[16] Paul Verhaeghen, Reinhold Kliegl, Ulrich Mayr; “Sequential and coordinative
complexity in time-accuracy functions for mental arithmetic,” Psychology &
Aging Vol. 12(4), Dec., U.S.: American Psychological Assn. 1997, p. 555-564
[17] David Wheeler, Rodger Needham; “TEA, a Tiny Encryption Algorithm,” November
1994, http://www.ftp.cl.cam.ac.uk/ftp/papers/djw-rmn/djw-rmn-tea.html