### Working out the THPS2 PS1 filenames

Posted:

**Sat Aug 05, 2017 8:56 pm**First post so it would be unwise of me to post a link right now. I'm bruteforcing the one filename I don't have, and this will take a long time (it's a file from THPS1). You can find it on my GitHub Gist as "THPS2 PS1 (almost) complete filename list".

So for those who don't know, most of the actual data is in cd.wad and the metadata that actually tells you where everything is is cd.hed, which consists of a bunch of 3-uint32_t tuples with the filename_hash, the offset_into_the_wad, and the length. The upside is it's nice and fast to look things up. The downside is it's a nuisance to rip as you lack the filenames.

Here's the hash algorithm. It's a broken version of CRC-32 which may have been "obfuscated"... but for some reason they still kept the usual 0xEDB88320 polynomial so it sticks out like a sore thumb anyway.

Also, huge shoutouts to Graymatter LTI for including a whole bunch of juicy stuff in the PC version that really shouldn't be there, like, ooh, the build logs and the complete symbol table for the PS1 demo version, and all the definitions used in the TRG files which makes reverse-engineering the maps that much easier. But on a more humble note, they at least kept the filenames fully intact on the PC version, which was invaluable for a few sets of filenames.

EDIT: Oops, the comment said "rotate right" - it's a left rotate.

So for those who don't know, most of the actual data is in cd.wad and the metadata that actually tells you where everything is is cd.hed, which consists of a bunch of 3-uint32_t tuples with the filename_hash, the offset_into_the_wad, and the length. The upside is it's nice and fast to look things up. The downside is it's a nuisance to rip as you lack the filenames.

Here's the hash algorithm. It's a broken version of CRC-32 which may have been "obfuscated"... but for some reason they still kept the usual 0xEDB88320 polynomial so it sticks out like a sore thumb anyway.

Code: Select all

```
uint32_t namehash(const char *s)
{
uint32_t csum = (uint32_t)(int32_t)-1;
for(int i = 0; s[i] != '\x00'; i++) {
uint32_t a1 = (uint32_t)(uint8_t)s[i];
// Convert to lower case
if(a1 >= 'A' && a1 <= 'Z') {
a1 += 32;
}
// Broken CRC-32 implementation
//
// Normal CRC-32:
// * Shift right, if carry then XOR by 0xEDB88320
//
// THPS2 Special! Wow! edition:
// * Rotate left, shift original XOR'd val right, if carry then XOR by 0xEDB88320
uint32_t v0 = (a1 ^ csum);
uint32_t bits = (v0 & 0xFF);
// Reference implementation
for(int j = 0; j < 8; j++)
{
// Rotate left
csum = ((csum>>31)&1)|(csum<<1);
// Then apply the polynomial
if((bits & 1) != 0) {
csum ^= 0xEDB88320;
}
// Shift along
bits >>= 1;
}
}
return csum;
}
```

EDIT: Oops, the comment said "rotate right" - it's a left rotate.