This was originally 2 years ago with references to the product and vendor. Unfortunately I got threatening legal letters, and almost lost my job! I’ve now stripped out the vendors references and am putting it up simply because I think others will find my methodologies and write up interesting.
Some software I was working with stored its password using a custom written encoding method – and charge people for professional services who want to decode their passwords. Recently I wanted to migrate users from this system to an alternate one, so I had no choice but to reverse engineer the hash. I’m publishing this here to demonstrate why people shouldn’t invent their own hash without fully understanding the consequences, and as an interesting example of crypto work.
The first step was to locate where the software stores its passwords – which I found was within the windows registry. Within the registry each user has a value containing among other things, the password.
This means that it is possible to change a users password, and observe the output hash – allowing us to build a dictionary for plain text to cipher text mappings. Having tried various entries, I came up with hashes like these below:
Plain | Cipher |
---|---|
a | %DgFf&kJo |
b | %D4Ff&kJo |
c | %DNFf&kJo |
bbbbbb | %DZ()noaw |
bbbbbbbbb | %DZ()noawDZ() |
bbbbbbbbbbbb | %DZ()noawDZ()noaw |
My list of plain to cipher mappings was longer than this, but I noticed a number of things as this stage:
- cipher text always begins with a %, and can therefore be ignored.
- cipher text is at least 8 chars long, and increases by 4 cipher chars for each 3 chars of plain text after that
- the result of encoding the same letter 12 times shows a pattern that repeats for every 8 cipher letters.
The 2nd point in the above was key for solving the problem, as it pointed me towards base64 encoding. The 2nd point I took away from the above was that it was not worth looking at plain text inputs longer than 6 chars.
My next step was to start inputting 6 character passwords, deciding I would focus mainly on repeating characters in my input text, such as ‘aaaaaa’ and ‘bbbbbb’. I would then compare the base64 encoded input, vs the output, such as below.
Plain text | Base64 plain text | Cipher text |
---|---|---|
111111 | MTExMTEx | Whbo7D)f |
222222 | MjIyMjIy | W5)n7lbg |
333333 | MzMzMzMz | WO7m79fh |
444444 | NDQ0NDQ0 | Vxvl6Sja |
From trying out more and more inputs such as the above, I noticed a couple more things, that can be observed in the snipped above.
- repeated chars in the base64 of input text does not result in repeated chars in the cipher output (see 333333 above
- an ‘M’ in column 1 of the input base64’ed resulted in a ‘W’ in the output
- an ‘M’ in column 5 of the input base64’ed resulted in a ‘7’ in the output
- The cipher text has 64 unique symbols in its output , therefore further supporting the base64 theory.
The observations above lead me to suspect this encoding system simply used 8 ‘translation maps’, and working these out would be as simple as inputting enough samples to build a full table of results. This means that with an input of ‘abcdefghi’, ‘a’ would be translated using TM1 (translation map 1), ‘b’ would use TM2, ‘c’ would use TM3, and so on, up until the 9th char of ‘i’ which would use TM1
I quickly wrote a script to build up these translation maps for me, and this is the result:
== TM 1 cipher | $%&()*123456789ABCDEFGHIJLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx plain | RQ baZYfedcTSXWVULKJIPONM w 25 translations found == TM 2 cipher | $%&()*123456789ABCDEFGHIJLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx plain | 23 n ij 1 yz q klmUVWX ST EFGH CD 24 translations found == TM 3 cipher | $%&()*123456789ABCDEFGHIJLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx plain | JI xw NMP ts po lkhg 98 54 1 FE BA dc ZY VU RQ 32 translations found == TM 4 cipher | $%&()*123456789ABCDEFGHIJLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx plain | NMgjihcbaZYnmlkHGFEDCBAPOLKJIXWVUTSRQfedvutsrqpo321 zyxw/+987654 63 translations found == TM 5 cipher | $%&()*123456789ABCDEFGHIJLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx plain | KJIL NMPO o dcfeZYbaVUXWRQTS 25 translations found == TM 6 cipher | $%&()*123456789ABCDEFGHIJLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx plain | 3 12 yz STUVWX CDEFGH 8 ijklmn 24 translations found == TM 7 cipher | $%&()*123456789ABCDEFGHIJLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx plain | FE 98 BA hg lk pots xw 1 54 JI KNM RQ VU ZY dc 32 translations found == TM 8 cipher | $%&()*123456789ABCDEFGHIJLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx plain | WX7456HABCD89+/cdefYZabUVQRSTMNOPIJKLEFG 123wxyzstuvopqrklmnghij 63 translations found 288 / 384 translations found!
With these translation maps, I was able to build a decode method, which in turn revealed that passwords are padded with ‘+’ signs up to 6 letters, or multiples of 3 there after. Whilst my translation maps are not complete, they are sufficiently complete to decode the 500+ passwords I have tried so far. I imagine the reason that the tables are incomplete is because I’m using 7 bit chars for my input, and not even using all of those, for example you can’t have a newline, or delete char in your password.
Just noticed a slight typo:-
“an ‘M’ in column 4 of the input base64’ed resulted in a ‘7’ in the output”
…I think it should be column 5?
great stuff Ian! How long did this take you?
Thanks, fixed!
In the first sentence, change “stores it’s password” to “stores its password”. It is not said as “stores it is password”.
Thanks – fixed.
You’ve fixed it in the second paragraph but not in the first. Although I guess that’s slightly less important than the hack itself. 😀
Well done btw…
Oh yes.. fixed in 2 more places, silly me.
I’ll know for next time that it’s not it’s 😉
Epic.
Mind boggling on how you did that.
Very good mate
Ah you beat me to it 🙂
Although I took the lazy approach of diassembling GMS 🙂
NTMUserReg.DLL
Imagebase+0xF130 = Start of hashing loop 🙂
(quick wingraph screeny: http://img22.imageshack.us/img22/5815/loopyloop.png)
Find more projects like this! hehe
Im trying to find the phrase in this “761269 12766547 82039649 53739689 2994919 22579 83683 650729 402733 402733 3048371 3048371 319373 3167371 30232117 3048371 3150487 2037839 22579 364547 2834761 84923 2211421 120569929 7189583 495917 3167371 70100689 20201263 731099 2969077 31244849 6453583 650729 4630127 6869651 625357 1204601 1100371 120457 131867 597851 546121 385019 3048371 51997837 6869651 51791 99763 1859761 27576671 133511713 3150487 631973 743569” but having no luck, any help is greatly appreciated, thanks