CMIYC 2024 Data Sets
This year's plaintexts, hashes, and encrypted files were all built around a scenario inspired by the DEF CON 32 theme. Plaintexts were generated from a number of different ideas and source materials, and by default were spread randomly across the different hash types, with some exceptions.
Several sets of plaintexts were deliberately unreasonable to attack head-on (long phrases, random words strung into a passphrase, etc.). Instead, encrypted files gave away some or all of these long plaintexts, making them crackable for those who figured out the hints in those files.
Plaintexts
Password plaintexts this year were generated using a number of different inspirations for source wordlists, mutations, etc.
Below is a brief description of each group of plaintexts, and a few examples of each:
Idea | Pro | Pro Crack % | Street | Street Crack % | Description |
---|---|---|---|---|---|
Neverskip_10k | 4556 | 35.56 | 4653 | 15.24 | Pulled from the 2021 Neverskip data breach: !@#Jun2021 Pa$$w0rd P0l!cy963 |
Neverskip_dvorak_6k | 2770 | 24.44 | 2771 | 8.88 | Pulled from Neverskip, but then remapped as if typing the same Qwerty key sequence on a Dvorak keyboard: Ecfa0315 Ecfa1604* >dab2012 |
book_ciphers | 300 | 38.00 | 300 | 6.67 | NT hashes of page:line:word tuples; .zip with same password naming the source book (from DEF CON theme reading list) and longer tuples; together form a random long phrase in a high-value hash like bcrypt. Note that for these "regular" published books, we grabbed screenshots from Google Books and OCR'ed them. Transcription errors count as phrase mutations :) Example: mhinson:$NT$f3ac29b4360c122b9d290e75bdd24e85:62:6:2 29:25:6 MadalineHinson.zip -> Hacking_Politics + 62:6:2 29:25:6 41:11:6 14:7:3 52:11:11 mhinson-a:$2a$10$TirIckbvSCnlZE6vSFOyReEEFOjhFn3NqZvd307.QCf8ff8JEQMje Downes that readers activists said |
fanfic_ciphers | 500 | 55.60 | 500 | 10.20 | Similar to book ciphers but using fanfics of books on the list; chapter:line:word instead of page. wdavid:$NT$1fb9ac2f9d653859d35c8044fd3fb6c0:1:15:11 3:15:5 WilliamDavid.zip -> fanfics.Aurora.Reversion_Past_the_Mean + wdavid-a: 1:15:11 3:15:5 0:10:3 1:13:4 wdavid-a:$2a$10$Ykb1PxH3YlLnLEuzXzLHSuMhuX.lBwmVII2V0OketiZvi5BhSLHQu the building his Stop |
transcript_ciphers | 1124 | 61.21 | 1128 | 29.61 | Similar to book ciphers but using MM:SS:wordnum from video transcripts (mostly YouTube) of Cory Doctorow interviews/talks re:Enshittification. mblalock2:$NT$b29e5519135e3a7e2d6fc84f43bd447f:52:37:9 35:07:17 MarybelleBlalock.zip -> transcript_5tJ2vgNPXxs + 52:37:9 35:07:17 40:25:13 38:18:12 28:24:10 mblalock-a:$2a$10$ZSa0bya3TRbTYSPYbDX3RedIExEEY1DK66amBubfQ/LULNHJ7DGaS use that to convinced means |
cmiyc-2014.10chr | 463 | 14.25 | 436 | 6.19 | Return of a data set from CMIYC 2014. Keep your .pot files! Ziberation* hemiYphere! Opposikion! |
cmiyc-2014.1984 | 466 | 2.79 | 458 | 1.31 | Return of another data set from CMIYC 2014. )Smithplesee -choosicec1p AlledSvith |
cmiyc-hibp | 747 | 63.32 | 791 | 38.69 | High-frequency words out of HIBP data sets. These should already be in your wordlists. ......... 927_sveta cheyenne! |
llm_generated | 885 | 14.35 | 840 | 6.79 | What you get when you ask a Llama for security advice. Batches using different themes (books/movies from DEF CON reading list, AI, space, the Matrix, etc.). Hal9000HatesYou PixelPwn NetWork90^! |
media_references | 3187 | 69.06 | 3200 | 5.22 | Cast lists (actor=~username, character=~password, with some stretching if too short), quotes or in-game text, and boardgame text for relevant TV, movies, and games (Star Trek, Cyberpunk 2077, Horizon Zero Dawn, Monopoly, Lie, Cheat & Steal). Advance to Boardwalk. Never fade away. Evelyn Parker Kee1Kee2Kee |
movie_phrases | 4743 | 21.21 | 4716 | 2.06 | Phrases pulled from dialog in movies on DEF CON's watch list - Snowpiercer, Children of Men, Seven Samurai - and then mutated. He was$born in0 Stock footage*of looking =Down on |
rfc-phrases-20k | 9139 | 24.19 | 9157 | 2.64 | Phrases pulled from the subset of RFCs that mention all of: free, open, Internet, and [^n]corporat, and then mutated. the Request_for The second @IMP1 to the=community |
rockyou2024 | 927 | 11.97 | 907 | 8.05 | Extract from Rockyou2024 long words that match livingon|aprayer|minga, then run some static rules against them. aprayerdance28 MINGAGUA1U8@ livingonlighta024 |
unprime | 1123 | 24.04 | 1076 | 7.16 | Scrape words from unprime-day, make phrases; mutate short ones. Amazon’s yearly ,5Audible vStudios0, |
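The Neverskip_dvorak_6k remapping can be reproduced with a simple character translation. The following is a minimal sketch, assuming the standard US QWERTY and US Dvorak layouts; the function name is hypothetical, and the input strings were obtained by inverting the mapping on the published examples rather than taken from the contest source data.

```python
# Characters produced by the same physical keys under US QWERTY and
# US Dvorak (unshifted rows, then shifted rows). Digits, the shifted
# digit symbols, backtick, and backslash sit on the same keys in both
# layouts, so they are left out and pass through unchanged.
QWERTY = (
    "-=" "qwertyuiop[]" "asdfghjkl;'" "zxcvbnm,./"
    "_+" "QWERTYUIOP{}" 'ASDFGHJKL:"' "ZXCVBNM<>?"
)
DVORAK = (
    "[]" "',.pyfgcrl/=" "aoeuidhtns-" ";qjkxbmwvz"
    "{}" '"<>PYFGCRL?+' "AOEUIDHTNS_" ":QJKXBMWVZ"
)

QWERTY_TO_DVORAK = str.maketrans(QWERTY, DVORAK)


def qwerty_keys_on_dvorak(plaintext: str) -> str:
    """Return what appears when the QWERTY key sequence for `plaintext`
    is typed on a keyboard set to Dvorak (hypothetical helper)."""
    return plaintext.translate(QWERTY_TO_DVORAK)


# Inputs below are inferred by inverting the table on the listed examples.
print(qwerty_keys_on_dvorak("Diya0315"))  # Ecfa0315
print(qwerty_keys_on_dvorak("Ehan2012"))  # >dab2012
```

Running an existing wordlist through a translation like this before hashing is one way to attack this group; swapping the two strings in `str.maketrans` gives the inverse mapping for reading cracked plaintexts.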
Hash Types
In addition to various tried-and-true hash types, there were a number of less common or even made-up hash types.
Hash Type | Pro | Pro Crack % | Street | Street Crack % | Prefix | Description | References |
---|---|---|---|---|---|---|---|
bkr256 | 2599 | 0.58 | 2580 | 0.16 | $2k$ | Like bcr256 but with the last character of the salt missing | [none] |
bcr256 | 2738 | 1.13 | 2552 | 0.16 | $2b$ | Hashes generated using Python passlib.hash.bcrypt_sha256 | https://passlib.readthedocs.io/en/stable/lib/passlib.hash.bcrypt_sha256.html |
bcrypt | 3444 | 26.95 | 3547 | 5.89 | $2a$ | Plain bcrypt | https://en.wikipedia.org/wiki/Bcrypt |
saph512 | 2553 | 22.95 | 2635 | 1.71 | {x-isSHA512, 15000} | SAP-H hashes using x-isSHA512 and 15000 rounds | https://github.com/openwall/john/blob/bleeding-jumbo/src/sapH_fmt_plug.c |
rc2 | 3245 | 15.41 | 3217 | 0.50 | $RC2$ | Custom hash built from RC2 using multiple rounds and ciphertext-rotation | Bundle 2 hint file alg_issue.txt Bundle 4 hint file gen_rc2.py |
radmin3 | 248 | 45.97 | 249 | 13.65 | $radmin3$ | RAdmin remote management application hashes. Note, each user with an $radmin3$ hash also used that same plaintext in an expensive hash type (saph512, bcr256, rc2, etc.), giving some of those "for free" after a successful RAdmin crack. | https://www.synacktiv.com/publications/cracking-radmin-server-3-passwords.html |
shiro2 | 3376 | 28.20 | 3321 | 0.09 | $shiro2$ | Apache Shiro application hashes | https://shiro.apache.org/v2/command-line-hasher.html |
sm3crypt | 3331 | 30.83 | 3332 | 0.36 | $sm3$ | SM3crypt hashes used by some Linux distributions | https://gitee.com/src-openeuler/libxcrypt/blob/master/add-sm3-crypt-support.patch |
adsync | 2527 | 33.60 | 2600 | 8.23 | v1;PPH1_MD4 | Password sync hashes used by Azure AD/Entra ID | https://www.dsinternals.com/en/how-azure-active-directory-connect-syncs-passwords/ |
striphash | 3355 | 57.32 | 3331 | 15.16 | [none] | Mangled sha1 hashes where any 0 in the first nibble is dropped | https://github.com/hashcat/hashcat/issues/3833 Bundle 3 hint file maybe_sell_shoes Bundle 5 hint file cigo |
nt | 3517 | 83.34 | 3573 | 38.23 | $NT$ | Standard Windows NTMD4 / NTLM hashes | https://phrack.org/issues/50/8.html#article |
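As an illustration of the striphash row, here is a minimal sketch of one reading of "any 0 in the first nibble is dropped": each digest byte is printed without zero padding, as happens when bytes are formatted with `%x` instead of `%02x`. This reading is an assumption on our part, not confirmed by the table, and the function names are hypothetical.

```python
import hashlib


def strip_hex(hexdigest: str) -> str:
    """Drop the leading '0' of every byte (two-character pair) of a hex
    string. Assumed interpretation of the striphash mangling."""
    pairs = [hexdigest[i:i + 2] for i in range(0, len(hexdigest), 2)]
    # int(...) followed by format(..., "x") prints the byte unpadded.
    return "".join(format(int(pair, 16), "x") for pair in pairs)


def striphash(password: bytes) -> str:
    """SHA-1 of the password, mangled as above (hypothetical helper)."""
    return strip_hex(hashlib.sha1(password).hexdigest())


# SHA-1 of the empty string has three bytes with a zero high nibble
# (0d, 07, 09), so the mangled form is 37 characters instead of 40.
print(striphash(b""))
```

Under this reading the mangling is lossy: a shortened hash no longer says where the zeros were, so a cracker has to apply the same stripping to each candidate digest before comparing rather than trying to restore the original.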
Encrypted Files & Hints
At contest start, bundle 1 (cmiyc-2024_pro_files_1.tar.pgp and cmiyc-2024_street_files_1.tar.pgp for Pro and Street respectively) was made available. At various times throughout the contest, additional bundles were released. These were announced via Mastodon posts.
The bundles contained encrypted files or other hash material that differed between Pro and Street, and hint files and flavor text that were the same across both (except for the occasional detail like a name that corresponded to a specific user in that class's user hash list).
Bundle | Time Offset | Description |
---|---|---|
1 | +00:00 | Initial set of archives: |
2 | +07:09 | |
3 | +10:25 | |
4 | +27:40 | |
5 | +33:47 | |
6 | +35:39 | |
7 | +46:00 | |
Test Hashes
As usual, the test hashes we released during pre-registration were designed to hint at aspects of the upcoming contest. A set of test users had hashes in escalating (but still cheap) algorithms, followed by a final user with a bcrypt hash. The test users' passphrases were successive lines in the theme song for an '80s science show on PBS in the United States. The first two lines could be trivially cracked, which was enough to Google the show and learn the next two lines. A .zip file could be decrypted with the next missing line of that song; it gave a hint on how to find the line of dialog from a particular episode (available on YouTube) that was the final user's passphrase:
user1:e9fa198e766d7b9bdfdb872032b2c6ee:3-2-1
user2:{SSHA}4Sq+ps0Yeo9HvPV45aeF8Z4wWa1vUlpnTkZIRg==:Contact
user3:$1$mjlHQY$7ofnyhSdWnn6HyIMf8GIE0:It's the secret
user4:$1$h2lb97w8$rSnzQB3FHv9o4VXaOTEyx.:It's the moment
Zipfile passphrase: When everything happens
Once decrypted, the zipfile contained:
Do not submit the zip passphrase, it is not worth anything.
user5's password is the sentence spoken by the synthesized voice
introducing itself in episode 101.
And then the final user's bcrypt hash passphrase was:
I am a computer at Bell Laboratories and I am learning to talk.
This sort of back-and-forth was intended to foreshadow the inter-dependencies used in the encrypted archives that revealed plaintexts, the book/transcript ciphers, etc.
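For the book, fanfic, and transcript ciphers, the final step is resolving each tuple against the source text once the encrypted archive has named it. The following is a minimal sketch on toy data: the function names, the page dictionary layout, and 1-based line and word counting are all assumptions (the fanfic example includes chapter 0, so the real indexing may differ per source).

```python
def lookup(pages: dict[int, list[str]], ref: str, base: int = 1) -> str:
    """Resolve one 'page:line:word' reference to a single word.

    `pages` maps a page number to its list of text lines; `base` is the
    assumed index of the first line and first word on a page.
    """
    page, line, word = (int(part) for part in ref.split(":"))
    return pages[page][line - base].split()[word - base]


def decode(pages: dict[int, list[str]], refs: str, base: int = 1) -> str:
    """Resolve a space-separated list of references into a phrase."""
    return " ".join(lookup(pages, ref, base) for ref in refs.split())


# Toy source text standing in for an OCR'ed book.
toy_pages = {
    1: ["the quick brown fox", "jumps over the lazy dog"],
    2: ["pack my box with", "five dozen liquor jugs"],
}

print(decode(toy_pages, "1:2:5 2:1:3 1:1:2"))  # dog box quick
```

Because `int()` accepts leading zeros, the same parsing also handles transcript-style references such as 35:07:17. OCR or transcription errors in the source would still need to be handled as mutations of the resulting phrase.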