KoreLogic's Password Cracking Contest at DEF CON

CMIYC 2024 Data Sets

This year's plaintexts, hashes, and encrypted files were all based around the scenario inspired by the DEF CON 32 theme.

Plaintexts were generated around a number of different ideas, source materials, etc., and then by default were randomly spread across different hash types, with some exceptions.

Several sets of plaintexts were deliberately unreasonable (long phrases, random words strung into a passphrase, etc.) to attack head-on. Instead, encrypted files gave away some or all of these long plaintexts, making them crackable - to those who figured out the hints available in the encrypted files.

Plaintexts

Password plaintexts this year were generated using a number of different inspirations for source wordlists, mutations, etc.

Below is a brief description of each group of plaintexts, and a few examples of each:

Idea Pro Crack % Street Crack % Description
Neverskip_10k 4556 35.56 4653 15.24 Pulled from the 2021 Neverskip data breach:
!@#Jun2021 Pa$$w0rd P0l!cy963
Neverskip_dvorak_6k 2770 24.44 2771 8.88 Pulled from Neverskip, but then remapped as if the Qwerty key sequence had been typed on a Dvorak keyboard:
Ecfa0315 Ecfa1604* >dab2012
book_ciphers 300 38.00 300 6.67 NT hashes of short page:line:word tuple sequences. A .zip encrypted with that same password names the source book (from the DEF CON theme reading list) and gives a longer tuple sequence; looked up in the book, the tuples form a random long phrase used in a high-value hash like bcrypt. Note that for these "regular" published books, we grabbed screenshots from Google Books and OCR'ed them. Transcription errors count as phrase mutations :) Example:
mhinson:$NT$f3ac29b4360c122b9d290e75bdd24e85:62:6:2 29:25:6
MadalineHinson.zip -> Hacking_Politics +
62:6:2 29:25:6 41:11:6 14:7:3 52:11:11
mhinson-a:$2a$10$TirIckbvSCnlZE6vSFOyReEEFOjhFn3NqZvd307.QCf8ff8JEQMje
Downes that readers activists said
fanfic_ciphers 500 55.60 500 10.20 Similar to book ciphers but using fanfics of books on the list; chapter:line:word instead of page.
wdavid:$NT$1fb9ac2f9d653859d35c8044fd3fb6c0:1:15:11 3:15:5
WilliamDavid.zip -> fanfics.Aurora.Reversion_Past_the_Mean +
wdavid-a: 1:15:11 3:15:5 0:10:3 1:13:4
wdavid-a:$2a$10$Ykb1PxH3YlLnLEuzXzLHSuMhuX.lBwmVII2V0OketiZvi5BhSLHQu
the building his Stop
transcript_ciphers 1124 61.21 1128 29.61 Similar to book ciphers but using MM:SS:wordnum from video transcripts (mostly Youtube), of Cory Doctorow interviews/talks re:Enshittification.
mblalock2:$NT$b29e5519135e3a7e2d6fc84f43bd447f:52:37:9 35:07:17
MarybelleBlalock.zip -> transcript_5tJ2vgNPXxs +
52:37:9 35:07:17 40:25:13 38:18:12 28:24:10
mblalock-a:$2a$10$ZSa0bya3TRbTYSPYbDX3RedIExEEY1DK66amBubfQ/LULNHJ7DGaS
use that to convinced means
cmiyc-2014.10chr 463 14.25 436 6.19 Return of a data set from CMIYC 2014. Keep your .pot files!
Ziberation* hemiYphere! Opposikion!
cmiyc-2014.1984 466 2.79 458 1.31 Return of another data set from CMIYC 2014.
)Smithplesee -choosicec1p AlledSvith
cmiyc-hibp 747 63.32 791 38.69 High-frequency words out of HIBP data sets. These should already be in your wordlists.
......... 927_sveta cheyenne!
llm_generated 885 14.35 840 6.79 What you get when you ask a Llama for security advice. Batches using different themes (books/movies from DEF CON reading list, AI, space, the Matrix, etc.).
Hal9000HatesYou PixelPwn NetWork90^!
media_references 3187 69.06 3200 5.22 Cast lists (actor=~username, character=~password, with some stretching if too short), quotes or in-game text, and boardgame text for relevant TV, movies, and games (Star Trek, Cyberpunk 2077, Horizon Zero Dawn, Monopoly, Lie, Cheat & Steal).
Advance to Boardwalk.
Never fade away.
Evelyn Parker
Kee1Kee2Kee
movie_phrases 4743 21.21 4716 2.06 Phrases pulled from dialog in movies on DEF CON's watch list - Snowpiercer, Children of Men, Seven Samurai - and then mutated.
He was$born in0
Stock footage*of
looking =Down on
rfc-phrases-20k 9139 24.19 9157 2.64 Phrases pulled from the subset of RFCs that mention all of: free, open, Internet, and [^n]corporat, and then mutated.
the Request_for
The second @IMP1
to the=community
rockyou2024 927 11.97 907 8.05 Extract from Rockyou2024 long words that match livingon|aprayer|minga, then run some static rules against them.
aprayerdance28 MINGAGUA1U8@ livingonlighta024
unprime 1123 24.04 1076 7.16 Scrape words from unprime-day, make phrases; mutate short ones.
Amazon’s yearly
,5Audible
vStudios0,
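
The Neverskip_dvorak_6k remapping can be reproduced with a plain character translation table. The following is a minimal sketch, assuming the standard US Qwerty and US Dvorak layouts; the names QWERTY, DVORAK, to_dvorak, and from_dvorak are illustrative, and the inputs Diya0315 and Ehan2012 are inferred by inverting the layout on the examples above, not taken from the contest data.

```python
# Characters on a US Qwerty keyboard and what the same physical keys
# produce on a US Dvorak layout (unshifted rows, then shifted rows).
# Digits and shifted digits sit on the same keys in both layouts.
QWERTY = (
    "-=" "qwertyuiop[]" "asdfghjkl;'" "zxcvbnm,./"
    "_+" "QWERTYUIOP{}" 'ASDFGHJKL:"' "ZXCVBNM<>?"
)
DVORAK = (
    "[]" "',.pyfgcrl/=" "aoeuidhtns-" ";qjkxbmwvz"
    "{}" '"<>PYFGCRL?+' "AOEUIDHTNS_" ":QJKXBMWVZ"
)

TO_DVORAK = str.maketrans(QWERTY, DVORAK)
FROM_DVORAK = str.maketrans(DVORAK, QWERTY)


def to_dvorak(qwerty_keys: str) -> str:
    """Return what appears when Qwerty muscle memory hits a Dvorak layout."""
    return qwerty_keys.translate(TO_DVORAK)


def from_dvorak(dvorak_text: str) -> str:
    """Invert the remapping, e.g. to recover the intended Qwerty word."""
    return dvorak_text.translate(FROM_DVORAK)


if __name__ == "__main__":
    for word in ("Diya0315", "Diya1604*", "Ehan2012"):
        print(word, "->", to_dvorak(word))
```

Applying to_dvorak to an existing wordlist (or expressing the same table as a cracking rule) is one way to turn already-cracked Neverskip plains into candidates for this group.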

Hash Types

In addition to various tried-and-true hash types, there were a number of less common or even made-up hash types.

Hash Type Pro Crack % Street Crack % Prefix Description References
bkr256 2599 0.58 2580 0.16 $2k$ Like bcr256 but with the last character of the salt missing [none]
bcr256 2738 1.13 2552 0.16 $2b$ Hashes generated using Python passlib.hash.bcrypt_sha256 https://passlib.readthedocs.io/en/stable/lib/passlib.hash.bcrypt_sha256.html
bcrypt 3444 26.95 3547 5.89 $2a$ Plain bcrypt https://en.wikipedia.org/wiki/Bcrypt
saph512 2553 22.95 2635 1.71 {x-isSHA512, 15000} SAP-H hashes using x-isSHA512 and 15000 rounds https://github.com/openwall/john/blob/bleeding-jumbo/src/sapH_fmt_plug.c
rc2 3245 15.41 3217 0.50 $RC2$ Custom hash built from RC2 using multiple rounds and ciphertext-rotation Bundle 2 hint file alg_issue.txt
Bundle 4 hint file gen_rc2.py
radmin3 248 45.97 249 13.65 $radmin3$ RAdmin remote management application hashes. Note, each user with an $radmin3$ hash also used that same plaintext in an expensive hash type (saph512, bcr256, rc2, etc.), giving some of those "for free" after a successful RAdmin crack. https://www.synacktiv.com/publications/cracking-radmin-server-3-passwords.html
shiro2 3376 28.20 3321 0.09 $shiro2$ Apache Shiro application hashes https://shiro.apache.org/v2/command-line-hasher.html
sm3crypt 3331 30.83 3332 0.36 $sm3$ SM3crypt hashes used by some Linux distributions https://gitee.com/src-openeuler/libxcrypt/blob/master/add-sm3-crypt-support.patch
adsync 2527 33.60 2600 8.23 v1;PPH1_MD4 Password sync hashes used by Azure AD/Entra ID https://www.dsinternals.com/en/how-azure-active-directory-connect-syncs-passwords/
striphash 3355 57.32 3331 15.16 [none] Mangled sha1 hashes where any 0 in the first nibble is dropped https://github.com/hashcat/hashcat/issues/3833
Bundle 3 hint file maybe_sell_shoes
Bundle 5 hint file cigo
nt 3517 83.34 3573 38.23 $NT$ Standard Windows NTMD4 / NTLM hashes https://phrack.org/issues/50/8.html#article
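
The striphash mangling can be sketched as follows. This is an assumption-laden illustration, not the contest generator: it reads "any 0 in the first nibble is dropped" as each digest byte being printed without zero padding (the effect of formatting with "%x" instead of "%02x"), and the names strip_hex, striphash, and matches are hypothetical.

```python
import hashlib


def strip_hex(digest: bytes) -> str:
    """Hex-encode a digest without zero-padding each byte.

    ASSUMPTION: this per-byte "%x" behaviour is my reading of the
    striphash description; a byte such as 0x0a becomes "a", not "0a".
    """
    return "".join(format(byte, "x") for byte in digest)


def striphash(password: str) -> str:
    """SHA-1 of the password, then mangled by strip_hex."""
    return strip_hex(hashlib.sha1(password.encode("utf-8")).digest())


def matches(candidate: str, mangled: str) -> bool:
    """Compare in mangled space, since the dropped zeros cannot be
    restored unambiguously from the mangled string alone."""
    return striphash(candidate) == mangled


if __name__ == "__main__":
    target = striphash("cheyenne!")
    print(len(target), matches("cheyenne!", target))
```

Under this reading, a mangled hash is at most 40 characters long and is usually shorter, which is one way to recognise the format in a hash list.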

Encrypted Files & Hints

At contest start, bundle 1 (cmiyc-2024_pro_files_1.tar.pgp and cmiyc-2024_street_files_1.tar.pgp for Pro and Street respectively) was made available.

At various times throughout the contest, additional bundles were released. These were announced via Mastodon posts.

The bundles contained encrypted files or other hash material that differed between Pro and Street, and hint files and flavor text that was the same across both (except for the occasional detail like a name that corresponded to a specific user in that class's user hash list).

Bundle Time Offset Description
1 +00:00 Initial set of archives:
  • arj.tgz: Bundle of ARJ files, for users who also had regular strong/expensive hashes (bcrypt, shiro2, etc.), in a "circle": usera.arj, encrypted w/a truncated version of usera's password, contained a truncated version of userb's password, which was then used to encrypt userb.arj and so on. Toplevel status text file references Jocko Willink and an expression about redundancy.
  • bundle.zip: A toplevel notes file and individual recovery_userN files. Each recovery_userN file was encrypted with a different passphrase. Tools like zip2john warn about this, but who reads warnings? Individual user's files were encrypted with a 1-3 word phrase, taken from the media_references list above (zero mutations). That person's password (a high-value bcrypt or whatever) was then a longer version of that same quote. The passphrase for bundle.zip itself is every user's passphrase concatenated - i.e. effectively uncrackable.
  • gocryptfs.tar.bz2: A collection of per-user gocryptfs encrypted volumes. Each user's password was split into two overlapping pieces; one piece was the passphrase for their encrypted volume, and the other piece was in a plaintext file inside their volume. Cracking a gocryptfs volume would allow reconstructing their full plaintext passphrase.
2 +07:09
  • alg_issue.txt: First, vague hint about the $RC2$ hashes.
  • leak_astroturf.txt: Flavortext
  • leak_drone.txt: Flavortext, also sets up context for the following PCAP.
  • pro.pcap (or street.pcap): Capture of WPA1-encrypted packets, using a passphrase from RockYou lists. Contained plaintext authentications (telnet, FTP, HTTP, SMTP, etc.) that revealed various users' passwords. ...Except due to a packaging mistake, only the plains in street.pcap corresponded to user hashes, not the ones in pro.pcap. Uh, would you believe it was a deliberate red herring?
3 +10:25
  • maybe_sell_shoes: First hint about the striphash hashes. The filename, the names, the structure of the first paragraph, and the consistent typos are all references to the GitHub issue referenced above.
  • winoneforthe.zip: Bundle of individual per-user .zip files needed to make use of cracked NT hashes' book ciphers. Each .zip file was encrypted with the named user's NT password (a 2- or 3-tuple book cipher), and then contained a text file containing a reference to the source material (book name, Youtube video ID, AO3 fanfic name, etc.) and their full 5-tuple book cipher. Those yielded cracks in high-value hashes (bcrypt, etc.).
4 +27:40
    A zipfile named after a user, encrypted with that user's NT password (which was quite weak), containing:
  • account_metadata.yaml: Metadata about each user. Unlike past years' .yaml files, this did not contain any password hashes. However, it allows unambiguous mappings of real names to usernames - useful for some of the passphrase->encryptedfile->passphrase relationships. And, it gives Department info for every user. Some password generation techniques were tied to only users from a specific Department; some of the other hints files make these explicit ("everyone in my department"), etc.
  • gen_rc2.py: A reference implementation of the $RC2$ hash type. Provides a working, if not fast, way to generate such hashes and check for matches.
  • leak_investors.txt: Flavortext.
  • leak_vuln.txt: Flavortext.
5 +33:47
    A zipfile encrypted with a portion of its own filename, containing:
  • app.log: Fabricated "debug" application logs that reveal the prefix of passwords on successful login along with a checksum of the full password; these were then used in high-value hash types (faster to eliminate most guesses by verifying checksum first, vs guessing everything w/that prefix). Also the occasional login failure where the user typed their password in the username field, resulting in their full plaintext being logged.
  • cigo: Explicit hint about striphash. Also reveals more miserable office politics.
  • leak_blackmail.txt: Flavortext.
  • leak_social2.txt: Flavortext; also implies the use of LLMs which might help finding llm_generated plains.
6 +35:39
  • training_needed.txt: Internal email pointing at the use of book ciphers concentrated in a particular Department, referencing:
    • MoryMoctorow.zip: Example book cipher archive, passphrase in training_needed.txt, contained mmoc.txt.
    • mmoc.txt: Youtube video ID of a Cory Doctorow presentation, and a 5-tuple book cipher.
    • mmoc_passwd: Individual NTLM hash and bcrypt hash for Mory Moctorow; crack with the 2-token book cipher (same as MoryMoctorow.zip) and the full 5-word phrase respectively.
  • research_needed.txt: Hint that a group of users have been digging into Cory Doctorow's presentations and interviews (leading to the transcript_ciphers data set).
  • leak_datamine.txt: Flavortext.
7 +46:00
  • Unlike the others this was not a download bundle; it was a picture of August Dvorak teaching a class, with name and metadata suggesting which users (the "Training" department) have passwords in the Neverskip_dvorak_6k data set.
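
The app.log hint in bundle 5 suggests a two-stage check: reject guesses with the cheap checksum first, and only spend an expensive hash computation on the survivors. Below is a minimal sketch of that idea. The checksum algorithm is not named above, so CRC32 is used purely as a stand-in, and checksum, crack_with_hint, and slow_verify are hypothetical names.

```python
import zlib
from typing import Callable, Iterable, Optional


def checksum(password: str) -> int:
    # ASSUMPTION: CRC32 stands in for whatever checksum app.log recorded.
    return zlib.crc32(password.encode("utf-8"))


def crack_with_hint(
    prefix: str,
    target_checksum: int,
    suffixes: Iterable[str],
    slow_verify: Callable[[str], bool],
) -> Optional[str]:
    """Try prefix + suffix guesses, calling slow_verify only when the
    cheap checksum already matches the logged value."""
    for suffix in suffixes:
        guess = prefix + suffix
        if checksum(guess) != target_checksum:
            continue  # rejected without touching the expensive hash
        if slow_verify(guess):
            return guess
    return None


if __name__ == "__main__":
    secret = "Never fade away."
    expensive_calls = []

    def slow_verify(guess: str) -> bool:
        # Placeholder for a bcrypt/saph512/etc. comparison.
        expensive_calls.append(guess)
        return guess == secret

    found = crack_with_hint(
        "Never ", checksum(secret),
        ["fade", "fade away", "fade away."], slow_verify,
    )
    print(found, len(expensive_calls))
```

In a real run, slow_verify would wrap the actual hash comparison (or the filtered guesses would simply be written out as a much smaller wordlist for the cracking tool).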

Test Hashes

As usual, the test hashes we released during pre-registration were designed to hint at aspects of the upcoming contest. A set of test users had hashes in escalating (but still cheap) algorithms, then a final user with a bcrypt hash.

The test users' passphrases were successive lines in the theme song for an '80s science show on PBS in the United States. The first two lines could be trivially cracked, which was enough to Google and find the show and learn the next two lines. A .zip file could be decrypted with the next missing line of that song; it gave a hint to how to find the line of dialog from a particular episode (available on Youtube) that was the final user's passphrase:

user1:e9fa198e766d7b9bdfdb872032b2c6ee:3-2-1
user2:{SSHA}4Sq+ps0Yeo9HvPV45aeF8Z4wWa1vUlpnTkZIRg==:Contact
user3:$1$mjlHQY$7ofnyhSdWnn6HyIMf8GIE0:It's the secret
user4:$1$h2lb97w8$rSnzQB3FHv9o4VXaOTEyx.:It's the moment

Zipfile passphrase: When everything happens

Once decrypted, the zipfile contained:

Do not submit the zip passphrase, it is not worth anything.

user5's password is the sentence spoken by the synthesized voice
introducing itself in episode 101.

And then the final user's bcrypt hash passphrase was:

I am a computer at Bell Laboratories and I am learning to talk.

This sort of back-and-forth was intended to foreshadow the inter-dependencies used in the encrypted archives that revealed plaintexts, the book/transcript ciphers, etc.
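
As an illustration of the book/transcript cipher dependency, the sketch below turns a tuple string into a phrase. It is a toy under stated assumptions: the source text is a made-up dictionary rather than a real book, line and word numbers are assumed to be 1-based, and parse_tuples, decode, and toy_pages are hypothetical names.

```python
from typing import Dict, List, Tuple


def parse_tuples(cipher: str) -> List[Tuple[int, ...]]:
    """Split '62:6:2 29:25:6' into [(62, 6, 2), (29, 25, 6)]."""
    return [tuple(int(part) for part in item.split(":")) for item in cipher.split()]


def decode(cipher: str, pages: Dict[int, List[str]]) -> str:
    """Look up each page:line:word tuple in `pages` (page -> list of lines).

    ASSUMPTION: lines and words are counted from 1; the contest's exact
    counting rules are not stated here.
    """
    words = []
    for page, line, word in parse_tuples(cipher):
        text_line = pages[page][line - 1]
        words.append(text_line.split()[word - 1])
    return " ".join(words)


if __name__ == "__main__":
    # Made-up stand-in for an OCR'ed book or a video transcript.
    toy_pages = {
        1: ["the quick brown fox", "jumps over the lazy dog"],
        2: ["pack my box with", "five dozen liquor jugs"],
    }
    short_cipher = "1:1:2 2:2:3"        # plays the role of the NT plaintext
    full_cipher = "1:1:2 2:2:3 1:2:4"   # plays the role of the tuples in the .zip
    print(decode(full_cipher, toy_pages))
```

The chain then runs as described above: the short tuple string cracks the NT hash and opens the user's .zip, the .zip names the source and gives the full tuple string, and the decoded phrase is the candidate for the high-value hash.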