String Compression Tool
Information
Jelly's strings aren't Python's str type; rather, a Jelly string is a list of characters. Each character is a one-character Python str object. By using operations like + (Addition) or × (Multiplication), you can get multi-character str objects; however, Jelly will treat them as characters, so list operations may not work the way you want (for example, their length is considered 1), and some operations may fail (such as O (Ord)). If you want to represent a single character, you can use ” (Character Literal). This compression tool ignores that distinction and uses strings for everything. Note that a one-character string is different from a character in Jelly (unlike in many languages, e.g. Python); the former is a singleton list of the latter.

Jelly has three ways of representing a string. The first is the trivial representation, “string goes here”. You can create a list of strings using multiple “s, so “hello“world” returns ["hello", "world"] (technically [["h", "e", "l", "l", "o"], ["w", "o", "r", "l", "d"]]). If this string is at the end of the program, you can omit the terminator ”, but note that strings can span multiple lines (it is recommended to use ¶ (Pilcrow) for this purpose), so it has to be the end of the program, not just the end of a link.

The second is for two-character string literals:
⁾ab returns "ab" (["a", "b"]). This is very straightforward.

The final representation is dictionary-compressed strings. These are optimal for English-like strings that contain a lot of words and not much punctuation or other characters. Note that you cannot compress a string if it contains characters that are not newlines or printable ASCII. To decompress a string, it is first converted into a base-250 number where ¡ represents 1 and ż represents 250 (this is the same way that the Base 250 Number Terminator (’) works). Then, this integer is fed through the sss function to convert it to a string. Decompression of the integer begins with the empty string and repeats the following steps until the integer is exhausted (becomes 0); every division replaces the integer with the quotient:

- Divide the integer by 3, and vary the mode based on the remainder.
- If the mode is 0, divide the integer by 96, add 32 to the remainder, and add the character at that codepoint to the string (this is printable ASCII range, except for 127, which is the newline in Jelly's codepage).
- Otherwise, insert a dictionary-compressed word. Two flags are set: swap-case is initially false, and prepend-space is true if and only if the string so far isn't empty.
  - If the mode is 2, divide the integer by 3 again. If the remainder is 0 or 2, set the swap-case flag to true. If the remainder is 1 or 2, toggle the prepend-space flag (insert leading space at the start of the string, or don't insert space between words).
  - Divide the integer by 2. If the integer was even, use the long dictionary (227845 words). If it was odd, use the short dictionary (20453 words). The dictionaries are available here (the file is very long, but you can view the raw file on GitHub). Finally, divide the integer by the length of the dictionary and index the remainder into the dictionary. If the swap-case flag is set, flip the case of the first letter. If the prepend-space flag is set, prepend a space to the word. Then, append the result to the string.
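The steps above can be sketched in Python. This is a minimal illustration, not Jelly's actual sss source: the names from_base250 and decompress are made up here, the two tiny dictionaries are stand-ins for the real ones, and from_base250 assumes the digit values (1 to 250) have already been looked up from the codepage and are given most-significant first.

```python
def from_base250(digits):
    """Combine base-250 digit values (each 1..250) into one integer.

    Assumption: digits are ordered most-significant first.
    """
    n = 0
    for d in digits:
        n = n * 250 + d
    return n


def decompress(n, long_dict, short_dict):
    """Decode an integer into a string by following the listed steps."""
    out = ""
    while n:
        n, mode = divmod(n, 3)
        if mode == 0:
            # Literal character: 0..94 are printable ASCII, 95 is the newline.
            n, code = divmod(n, 96)
            out += "\n" if code == 95 else chr(code + 32)
            continue
        swap_case = False
        prepend_space = out != ""
        if mode == 2:
            n, flag = divmod(n, 3)
            swap_case = flag in (0, 2)
            if flag in (1, 2):
                prepend_space = not prepend_space
        n, use_short = divmod(n, 2)  # even: long dictionary, odd: short
        words = short_dict if use_short else long_dict
        n, index = divmod(n, len(words))
        word = words[index]
        if swap_case:
            word = word[0].swapcase() + word[1:]
        if prepend_space:
            word = " " + word
        out += word
    return out


# Hypothetical stand-in dictionaries; the real ones are far larger.
LONG = ["hello", "world"]
SHORT = ["cat", "dog"]

print(decompress(85, LONG, SHORT))   # hello world
print(decompress(2, LONG, SHORT))    # Hello
print(decompress(195, LONG, SHORT))  # a
```

With these toy dictionaries, 85 decodes to two long-dictionary words (the second gets a space because the string is no longer empty), 2 decodes to a single word in mode 2 with the swap-case flag set, and 195 decodes to one literal character.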
eval it (to allow you to compress multiple strings). As long as you don't intentionally enter code that looks like a normal literal but will brick your browser, nothing abnormal should happen. If the string you want to use contains Jelly string terminators or opening quotation marks, you will need to use ” (Character Literal), as there is no way to insert them into a string literal. This compressor will output a valid literal, but it will probably be shorter to concatenate the necessary characters together in a non-literal chain. There are often ways to shorten a string using built-in string constants, but this compressor only guarantees finding the shortest literal that represents your input.
Jelly cuts characters that aren't in its codepage out of its code. You will need to use Ọ (Chr) on the codepoint to create those characters.
Warning: string compression does not support trailing spaces. The output shown here may not actually work correctly. You will need to add the trailing spaces manually or just use a normal string.
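One possible reason for this limitation, assuming the compressor simply inverts the decoding steps described earlier: a space is the literal character with value 0 in mode 0, so a space at the very end of the string adds nothing to the integer. The helper encode_chars below is hypothetical and encodes literal characters only (no dictionary words).

```python
def encode_chars(text):
    """Encode text using only literal characters (mode 0).

    Assumes every character is printable ASCII or a newline.
    The last character is encoded first so the first one is decoded first.
    """
    n = 0
    for ch in reversed(text):
        code = 95 if ch == "\n" else ord(ch) - 32
        n = (n * 96 + code) * 3  # mode 0: remainder 0 when divided by 3
    return n


print(encode_chars("a"))                        # 195
print(encode_chars("a ") == encode_chars("a"))  # True
```

Both inputs give the same integer, so under this encoding a decoder has no way to tell that a trailing space was ever there.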