Artist and programmer Jason Rohrer had a cool idea. If we XOR an MP3 with some other file (call it A), there will be no statistical correlation between the resulting file and the MP3. He thinks this means that the resulting file (which he calls a “Mono” file) is not a copy or a derivative work of the MP3, so it can be distributed freely. (How, after all, can anyone claim copyright on arbitrary streams of bits?) He wrote a piece of software, Monolith, to demonstrate the concept. He also placed a WAV file in the public domain that he encourages people to use as “file A”. The math works out such that if you have the Mono file and the “file A” that was used to create it, you can get the original MP3 back out.
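To make the round trip concrete, here is a minimal sketch of the XOR step. It is not Monolith's actual code: the function name `xor_bytes`, the stand-in byte strings, and the choice to repeat "file A" when it is shorter than the input are all assumptions made for illustration.

```python
import os


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the matching byte of key.

    Repeating the key when it is shorter than data is an assumption
    for this sketch, not a description of how Monolith behaves.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical stand-ins: a real run would read the MP3 and "file A" from disk.
mp3 = b"stand-in bytes for an MP3 file"
file_a = os.urandom(len(mp3))

mono = xor_bytes(mp3, file_a)        # the "Mono" file that gets distributed
recovered = xor_bytes(mono, file_a)  # XOR with the same "file A" undoes it

assert recovered == mp3
```

XOR is its own inverse, so the same function both creates the Mono file and turns it back into the MP3.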
Whether distributing a “Mono” file made from a copyrighted MP3 and some arbitrary, unknown file is copyright infringement is a bit of a tree-falling-in-the-woods question. Yes, it’s an infringement, I suppose, but because almost nobody can replay the sound from the Mono file, the copyright holder is unlikely to notice.
But if you distribute the Mono file when everybody knows that you’re supposed to use Rohrer’s “file A” as the other file in order to get the MP3 back out, you’re just distributing the MP3 in a different encoding, like distributing it in some encrypted form, or in a ZIP file. Rohrer recognizes that MP3s are copies of sound recordings because “an algorithm exists that can convert an MP3 file into sound mechanistically without any additional information”. The same is true of Mono files. Since everybody knows that you just XOR the Mono file with Rohrer’s freely-distributed “file A” to get the MP3, there’s a well-known algorithm that can turn the Mono file into sound. It fits within the statutory requirement that to infringe a sound recording copyright, one must create “phonorecords or copies that directly or indirectly recapture the actual sounds fixed in the recording.” The Mono files recapture the actual sounds fixed in the recording just as much as the original MP3 does; there’s just an extra step involved.
“Fine,” you respond. “So if you can XOR it with some other file and get a copyrighted work, it’s a copy. So a file full of zeroes is an infringement, because if you XOR it with an MP3, you get the MP3.” Obviously, though, the game is over if the algorithm that gets you the copyrighted work back out of your file needs to crib from the copyrighted work in order to do so.
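The difference between the two cases can be sketched in a few lines. Everything here is hypothetical: the helper `xor_equal_length` and the byte strings are stand-ins, and the deterministic `file_a` merely plays the role of the publicly distributed file.

```python
def xor_equal_length(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical stand-ins for the real files.
mp3 = b"stand-in bytes for a copyrighted MP3"
file_a = bytes((i * 37 + 11) % 256 for i in range(len(mp3)))  # public "file A"
zeros = bytes(len(mp3))                                       # file full of zeroes

mono = xor_equal_length(mp3, file_a)

# Mono file: the decoder needs only the Mono file and the public "file A".
assert xor_equal_length(mono, file_a) == mp3

# Zero file: the only input that yields the MP3 is the MP3 itself.
assert xor_equal_length(zeros, mp3) == mp3

# Without the MP3 in hand, the zero file gives back whatever you feed it.
assert xor_equal_length(zeros, file_a) == file_a
```

In the first case the recording comes out of material the recipient can obtain without the work; in the second, the "decoder" has to be handed the work up front.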
In other words, if a computer can get the copyrighted work back out of a file without seeing the copyrighted work itself, that file is a copy. An MD5 hash isn’t a copy, and neither is any other sort of short audio fingerprint, because the work can’t be recovered from it. But if there’s some way to replay the content from the file, it’s a copy.
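A short sketch of why a hash falls on the other side of the line: an MD5 digest is always 16 bytes, however long the input is, so it cannot hold enough information to rebuild a recording. The byte strings below are hypothetical stand-ins for audio data.

```python
import hashlib

# Hypothetical stand-ins for a short clip and a much longer recording.
short_clip = b"stand-in audio bytes"
long_clip = short_clip * 100_000

short_digest = hashlib.md5(short_clip).digest()
long_digest = hashlib.md5(long_clip).digest()

# Both digests are 16 bytes (128 bits), regardless of input size.
assert len(short_digest) == 16
assert len(long_digest) == 16

# The digests differ, so they can identify a file, but neither contains it.
assert short_digest != long_digest
```

A Mono file, by contrast, is as long as the MP3 it was made from and gives the whole thing back once XORed with “file A”.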