Business Ethics & Corporate Crime Research Universidade de São Paulo

The noise-bombing power of CSAM hash removals: How criminals respond

Author: Carolina Christofoletti


When we talk about Child Sexual Abuse Material (CSAM) hash detection technology, a common criticism is that hashes are highly sensitive to any byte alteration, so that the correction data (the variations that would have to be stored to catch modified copies) grows far too broad, expanding considerably the memory space needed to host it.
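
To make that criticism concrete, here is a minimal sketch in Python (illustrative only, not the pipeline of any actual hash database): flipping a single bit of a file produces a completely different cryptographic digest, so every trivial byte-level variant would need its own entry.

```python
# Minimal illustration: exact (cryptographic) hashes change completely when a
# single bit of the input changes, so a blocklist of exact hashes cannot
# anticipate trivial variations of the same file.
import hashlib

original = b"...stand-in for the bytes of an image file..."  # placeholder content
modified = bytearray(original)
modified[0] ^= 0x01  # flip one bit in the first byte

print(hashlib.sha256(original).hexdigest())
print(hashlib.sha256(bytes(modified)).hexdigest())
# The two digests share no meaningful similarity: each byte-level variant would
# need its own database entry, which is the storage criticism described above.
```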

True, that is a fair point. But, despite that, I would like to explore today a collateral effect of this claim: how do criminals react to it, and how does their behaviour end up guiding the state of the art in the desired direction?

This insight comes from reading Project Arachnid: Online Availability of Child Sexual Abuse Material, the Canadian Centre for Child Protection research report, whose findings are truly startling.

There is something else I would like to comment on, namely the password and the Tor cases, but I will not follow up on those here. On this occasion, what I would like to discuss is the problem of CSAM hash policies as they currently stand.

Ever-living links

Without any legal obligation for Internet Service Providers to hash-check their services, CSAM hash checks arrive, perhaps, at a time when it is already too late: when the URLs have somehow been found and the material is publicly accessible. The fact that criminals refer to this material in very different ways, and that “tricky naming” is sometimes used precisely to evade crawlers, leads to the conclusion that the current model for CSAM detection on online platforms is insufficient.

The mandatory reporting model has become insufficient to deal with the reputational trade-off of CSAM findings: for platforms that claim to be guardians of privacy, something as simple as using third-party technology to “scan” for CSAM turns, the next second, into reputational “damage”, weird as it may seem.

The fact that the mandatory reporting model exists with only an implicit mandate for proper controls puts every website, image-hosting manager and the like into stand-by mode: wait for the removal notice. If notices ever come, we did not report because we had not found anything. If they do not, good: our platform will look as if it were CSAM-free.

Perverse, is it not? Yes, and the background problem is a legal one: we are comfortable with the thesis that companies can be held liable if CSAM is found on their platforms and they lacked prior controls (even though, in this case, we have to deal with “immunity from prosecution” cases).

But we are somewhat resistant to the thesis that website administrators can be held liable because, as the Canadian Centre for Child Protection recently found, CSAM forums keep uploading hash-modified versions of previously known content and, despite the fact that the CSAM nature of the material is visible to everyone, administrators (they too!) keep waiting for the algorithm to come and remove it, or for a removal notice to be issued somewhere.

The result is that third-party organizations and technologies are misappropriated by platforms which, at the current stage, are very comfortable letting cat-and-mouse be played inside their channels. If CSAM hotlines and Law Enforcement Agencies want no CSAM file to be hosted at a given URL any more, they will have to monitor it perpetually: as the term reads, we are talking about “content removal” and not “URL removal”, which means those CSAM links are “activated” and “deactivated” perpetually.

CSAM cases are, as we can see, much harder than de-indexing ones.

More data is still needed to see how often CSAM files return to the URLs where they were once found, and the analysis here would have to look at “duplicate URLs” against which a prior removal request had already been issued.

Criminals are getting smarter

As the Canadian Centre for Child Protection states, criminals are getting smarter: they have learnt that platforms are constantly implementing hash checks as a matter of criminal compliance, and they are starting to “edit” the hashes.

The question is whose responsibility that is: CSAM hotlines and Law Enforcement personnel, who would have to improve their detection systems, or platforms, which would have to start working with visual pattern detection technologies and start mapping what the editing metrics of their own platforms are. For example, the study in question found that criminals are adding “noise” to known CSAM images (images previously removed as such).

Using that “photo-editing” feature, the hash value is changed, but the change is imperceptible to human eyes. Noise, however, is something added with software, by moving sliders (and therefore pattern-identifiable), and it requires the file to have been uploaded somewhere else for editing first. Watch the gatekeepers.
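
To illustrate why visual pattern detection catches what exact hashing misses, here is a small sketch using a toy average-hash as a stand-in for real perceptual hashing technologies (the synthetic “image” and the hash function are both hypothetical, not anything used in the report): low-level noise destroys the cryptographic hash but barely moves the perceptual one.

```python
# Sketch: imperceptible noise breaks an exact hash match, while a toy perceptual
# (average) hash of the same pixels stays almost identical.
import hashlib
import numpy as np

def sha256_hex(pixels: np.ndarray) -> str:
    """Exact hash of the raw pixel bytes."""
    return hashlib.sha256(pixels.tobytes()).hexdigest()

def average_hash(pixels: np.ndarray, size: int = 8) -> int:
    """Toy perceptual hash: block-average down to size x size, threshold at the mean."""
    h, w = pixels.shape
    crop = pixels[: h - h % size, : w - w % size].astype(float)
    ch, cw = crop.shape
    blocks = crop.reshape(size, ch // size, size, cw // size).mean(axis=(1, 3))
    return int("".join("1" if b else "0" for b in (blocks > blocks.mean()).flat), 2)

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)   # stand-in "image"
noise = rng.integers(-2, 3, size=original.shape)                 # barely visible noise
noisy = np.clip(original.astype(int) + noise, 0, 255).astype(np.uint8)

print(sha256_hex(original) == sha256_hex(noisy))                      # False: exact match lost
print(hamming(average_hash(original), average_hash(noisy)))           # small: still recognizable
```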

The noise case is a curious one, because criminals wanted the change to be visually imperceptible. That is not the most obvious way of changing a hash, and here is the tricky point: could it be that criminals wanted the changes to be imperceptible precisely to avoid panic among their gang members? After all, when changes are visually perceptible, the translation is “poisoned file” and “cops”.

But they did not count on the Nietzschean eternal return of everything

When digital images appeared, criminals breathed with relief: finally, their files could be copied without degrading with every new copy. When the hash system was born, criminals were faced with the need to alter those files. As with paper copies, the acceptable number of “editions” is limited, though still large, because after “editing” a file sequentially you damage it: there is nothing left to see any more.

Let us not underestimate the power of hash values in the fight against known CSAM files: they have planted a time bomb on them.

Solving the Trust problem with a central monitor

Hash-checking platforms is, alone, not sufficient to deal with known, hashed CSAM files, even in the very case where they appear in their unmodified, unedited version.

The problem is very similar to the “report button” one. When CSAM hash-matching technologies are operational, the only thing we can guarantee is that, if the same file with the same hash value appears once again on a platform whose database is, at that time, updated to recognize this hash value, a signal will be given to the administrator.

The question is: what if the administrator simply decides to remove it, without giving any further notice of the violation to anyone? What if, instead, the platform decides simply to ignore those findings?

At present, even though mandatory CSAM reporting is an enforced rule in some countries, we are unable to guarantee that those reports are complete. While tackling the “hidden data” problem of new files (whose existence is unknown to hash databases, and therefore to detection systems) would present additional implementation problems, the completeness of hash-based reports could be verified if only hash-match warnings were shared with a third-party organization.

The fact that known-hash detections are shared with someone else might produce the desired effect of having the file removed, by peer pressure, as soon as possible, since matches would, here too, be timestamped.
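
A hedged sketch of what such sharing could look like in practice (the record fields, the receiving organization and the function names are all hypothetical, introduced only to illustrate the idea): every hash match would generate a timestamped record that is both shown to the platform administrator and forwarded to an independent monitor, so that under-reporting becomes auditable.

```python
# Sketch of the "central monitor" idea: each hash match produces a timestamped
# record that the platform cannot silently drop, because a copy goes to an
# independent third party. All names here are illustrative assumptions.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class HashMatchRecord:
    platform: str        # platform where the match occurred
    content_url: str     # URL of the matched content (hypothetical field)
    matched_hash: str    # hash value that triggered the match
    matched_at: str      # UTC timestamp of the detection

def report_match(platform: str, content_url: str, matched_hash: str) -> HashMatchRecord:
    """Build the timestamped record sent both to the admin and to the monitor."""
    record = HashMatchRecord(
        platform=platform,
        content_url=content_url,
        matched_hash=matched_hash,
        matched_at=datetime.now(timezone.utc).isoformat(),
    )
    # In a real deployment this would be transmitted to the third-party
    # organization (e.g. a hotline) over an authenticated channel; here we
    # simply serialize it to show the auditable trail.
    print(json.dumps(asdict(record)))
    return record

report_match("example-platform", "https://example.org/some-removed-item", "ab12...")
```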

Think about it.