Business Ethics & Corporate Crime Research Universidade de São Paulo

Eggbert it: What Facebook's Transparency Center says about how we should read its CSAM Transparency Data

Image retrieved from: Pinterest

Author: Carolina Christofoletti

Link in original: Click here

  • The CSAM data does not exist

I really appreciate Facebook's effort to write such a thing in a Transparency Report: “It might be tempting to read our content actioned metric as an indicator of how effectively we find violations or the impact of those violations on our community. However, the volume of content we take action on is only part of the story.”

Translating it for you: the content actioned data does not, by itself, mean anything. In terms of CSAM, we must distinguish Terms of Service violations from CSAM in the strict sense, where the legal definition applies. I will briefly show you how the data gets poisoned: if you removed, and reported to NCMEC, all the child nudity that appeared on your platform, that does not mean you are properly enforcing your CSAM policies. Nor does it mean that the platform is protected against the infiltration of CSAM files.

Even though this would be a subject for another article, I want to highlight something that seems quite interesting to me: why do we not see how many of the NCMEC reports made by Facebook were confirmed to be CSAM? After all, the fact that reports are sent to NCMEC does not mean that they are accurate, and a high number of unconfirmed reports could reveal, later on, a risk-aversion policy whose burden ends up, more than it should, in the hands of NCMEC's personnel.

“[The amount of content Facebook actions] doesn’t reflect how long it took to detect a violation or how many times users saw that violation while it was on Facebook or Instagram.” […] “After we detect the URL, we remove the 10 million posts.” This proves that my previous guess was right (see here and here).

“As an example, consider a cyberattack during which spammers share 10 million posts featuring the same malicious URL.” (click here to see my previous writings on that) – That means that Facebook or Instagram had 10 million chances of seeing this content but didn’t. URLs and hashes are, I guess, mostly caught only once those values become part of the platform’s detection algorithms – that is, only long after the content was already hosted there.

“This metric can go up or down due to external factors that are out of our control.” – As, for example, CSAM links on Facebook and Instagram being shared in CSAM groups, generating a huge number of hits in short intervals of time.

“After we detect the URL, we remove the 10 million posts. Content actioned would report 10 million pieces of content acted on, an enormous spike. But this number doesn’t necessarily reflect that we got better at acting on spam; it reflects more that spammers decided that month to attack Facebook with unsophisticated spam that was easy to detect” – This is exactly why, when reading a Transparency Report, what matters is the number of single pieces of CSAM content (and not “Child Safety” content in general) that were reported.

In the case of links, this metric should be displayed separately (because the content behind a link can change, and because links are external referrals that should not be included in the platform’s own data, though they should still be mentioned for the purpose of counting how much CSAM content was referred to through Facebook or Instagram).

“Content actioned also doesn’t indicate how much of that spam actually affected users: people might have seen it a few times, or a few hundred or thousand times. (That information is captured in prevalence.)” – The data is measurable, and it is Facebook itself saying so. On a platform that accepts the upload of known hashes with the purpose of then removing them, this is data that would interest me a lot (I have written about it here). After all, if one lets the file be uploaded, one must guarantee that it will be removed before, at the very least, the first view. More than that, on a platform that applies Artificial Intelligence to CSAM detection, this data could explain what happened to content that took a long time to be detected and how those technologies can be improved according to the platform’s needs. This is exactly what I have pointed out in another article.
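To make the point concrete, here is a minimal sketch (my own illustration, not Facebook's actual pipeline) of the two numbers the content actioned metric hides: time to detection and views before removal. The event log and all field names are hypothetical.

```python
from datetime import datetime

# Hypothetical moderation log for a single piece of content.
# Facebook's real data model is not public; these fields are assumptions.
events = [
    {"type": "upload",  "ts": datetime(2021, 5, 1, 10, 0)},
    {"type": "view",    "ts": datetime(2021, 5, 1, 10, 5)},
    {"type": "view",    "ts": datetime(2021, 5, 1, 12, 30)},
    {"type": "view",    "ts": datetime(2021, 5, 2, 9, 15)},
    {"type": "removal", "ts": datetime(2021, 5, 2, 11, 0)},
]

uploaded_at = next(e["ts"] for e in events if e["type"] == "upload")
removed_at  = next(e["ts"] for e in events if e["type"] == "removal")

# "Content actioned" would record this item as exactly 1, no matter what follows.
content_actioned = 1

# The two numbers discussed above, both computable from the same log:
time_to_detection = removed_at - uploaded_at
views_before_removal = sum(
    1 for e in events if e["type"] == "view" and e["ts"] < removed_at
)

print(f"content actioned:     {content_actioned}")
print(f"time to detection:    {time_to_detection}")       # 1 day, 1:00:00
print(f"views before removal: {views_before_removal}")    # 3
```

If the log exists, the metric exists; whether it is published is a transparency choice, not a measurement problem.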

  • Facebook and Instagram data are very different

“On Instagram, [Facebook] removes the whole post if it contains violating content, and counts this as one piece of content actioned, regardless of how many photos or videos there are in the post.” That means that CSAM galleries on Instagram, shared in a single multi-photo post, are counted only once. If that is the policy, the question is whether, once the first-strike requirement is met, the other contents are being reviewed and their eventual violations reported.

But, from my own point of view, a post with one CSAM file, whether posted together with others or not, is a single file. And a post containing a collage of, for example, four CSAM files should be counted as four violations. The fact that criminals needed only the single effort of clicking one button to post multiple contents does not mean there is a single violation. You may agree with me that, in the collage case for example, the platform had four chances of identifying it, “gaining” three additional violations if it ever found them. The same goes for the single post. So, CSAM data should be reported separately, according to the number of images. Otherwise, we are poisoning the data.

Take a look at Facebook now. I cannot figure out why the metric is so different, especially when we are talking about photos and videos, features that also exist on Instagram. “When a Facebook post has multiple photos or videos, we count each photo or video as a piece of content. For example, if we remove two violating photos from a Facebook post with four photos, we would count this as two pieces of content actioned: one for each photo removed. If we remove the entire post, then we count the post as well. So for example, if we remove a Facebook post with four photos, we would count this as five pieces of content actioned: one for each photo and one for the post. If we only remove some of the attached photos and videos from a post, we only count those pieces of content.”
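To see how much the counting rule alone changes the totals, here is a small sketch of the three rules discussed above, applied to the same hypothetical four-image collage post: the Instagram rule (one per post), the Facebook rule (one per removed photo plus one for the whole post), and the per-image rule I argue for. The post object is invented for illustration.

```python
# One hypothetical violating post containing a four-image CSAM collage.
# (Purely illustrative; not real data.)
post = {"violating_photos": 4, "whole_post_removed": True}

def instagram_rule(p):
    # Instagram: the whole post counts as one piece of content actioned,
    # regardless of how many photos or videos it contains.
    return 1 if p["violating_photos"] > 0 else 0

def facebook_rule(p):
    # Facebook: each removed photo/video counts, plus one for the post
    # itself if the entire post is removed.
    return p["violating_photos"] + (1 if p["whole_post_removed"] else 0)

def per_image_rule(p):
    # The per-image proposal: one violation per CSAM image, no more, no less.
    return p["violating_photos"]

print("Instagram rule:", instagram_rule(post))   # 1
print("Facebook rule: ", facebook_rule(post))    # 5
print("Per-image rule:", per_image_rule(post))   # 4
```

The same post yields one, five or four pieces of content actioned depending only on the rule applied.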

Now you might conclude with me that the reason Instagram’s and Facebook’s NCMEC data are so different is not that Instagram’s policies are better enforced, or that the platform has fewer violations than Facebook, but simply that the metrics are different.

  • Only the first strike is being counted

When accounts, groups or events are made inaccessible by Facebook’s moderation system, this counts as a single piece of removed content. According to Facebook, automatically removed content is not included in their metrics. That means, in the case of CSAM, that CSAM pages are being counted as a single removal when, in fact, they are many.

“In the metrics included in the Community Standards Enforcement Report, we only count the content in accounts, Pages, Groups or events that we determined to be violating during our reviews of those objects and that we explicitly took action on. We don’t count any content automatically removed upon disabling the account, Page, Group or event that contained that content.”
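A short sketch of what that rule implies for a hypothetical CSAM page: only the reviewed items that triggered the takedown enter the metric, while everything removed automatically along with the page disappears from the count. The page and its numbers are invented for illustration.

```python
# Hypothetical page taken down after review found two violating items,
# while 48 further items were removed automatically with the page.
page = {
    "explicitly_actioned_items": 2,   # reviewed and found violating
    "auto_removed_items": 48,         # deleted when the page was disabled
}

# What the Community Standards Enforcement Report counts, per the quote above:
reported_metric = page["explicitly_actioned_items"]

# What actually disappeared from the platform:
total_removed = page["explicitly_actioned_items"] + page["auto_removed_items"]

print(f"counted in the report: {reported_metric}")   # 2
print(f"actually removed:      {total_removed}")     # 50
```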

In the case of CSAM pages, this turns out to be an important piece of data: when a single piece of content (e.g. a profile picture or the sharing of a known CSAM hash) leads to the removal of the entire account, is the account checked for any other CSAM content or not? That data remains hidden. And, while we are talking about pages, even though the contents are removed, this changes absolutely everything in terms of reporting (especially where a mandatory reporting rule to NCMEC applies).

  • Single files spread around Facebook’s platforms, or CSAM pages?

Does it change anything if CSAM researchers find out that, in a place like Facebook, Instagram or anywhere else, we are not talking about single CSAM files being occasionally shared between users at such a speed that it produces their high number of NCMEC reports but, in fact, about CSAM groups or even pages?

Yes. Everything. It changes, first of all, the audience data and the way policies (e.g. group policies) should be prepared to deal with this situation. It also changes, from a Trust & Safety perspective, the where-to-look question. After all, if CSAM groups and pages are the case, Trust & Safety teams must be ready to look for patterns in that exact environment. That, by the way, should prove to be an “easier” and more disruptive task than having to look at individuals, one by one, to identify wherever the problem emerges.

Despite that, according to Facebook, “Except for fake accounts on Facebook, this report does not currently include any metrics related to accounts, Pages, Groups or events we took action on—just content within those objects.”

  • Old CSAM URLs remain

Does it change anything if we discover that a great portion of Facebook’s removals under its Child Safety policy enforcement and, more specifically, its CSAM policy enforcement consists of URLs? Everything. Rather than imagery technology, the focus here turns to a textual feature (even though images are not, a priori, excluded from this very same kind of sharing).

How many is the first piece of data that tells us what the composition of the problem is. Secondly, it tells us what the policy standards should be. Who holds the URL data today? Mainly, CSAM hotlines (I have written about it here). Being in direct communication with them should mean, above all, improving the ability to remove CSAM URLs that were first assessed by a trusted partner, whether the URL is active or not. Also, considering that old, inactive URLs can serve as an easy way to find active links if they are indexed, removing every URL that has at any point been related to CSAM is something that should be immediately taken into consideration.

And to avoid platform abuse, the URL policy should be as fast as, or even faster than, the imagery one. Mainly, but not only, because there is, for example, no machine learning that can learn which links host CSAM files and which do not.

But, as things seem to be, URL policies are enforced only from the moment of cataloguing onwards. “When we enforce on URLs, we remove any current or future content that contains those links. We measure how much content we actioned based on if a user attempts to display this content on Facebook.”

The policy is an attempt-based one, meaning that it applies only to URLs that users try, without success, to share after Facebook has catalogued them. Current and future content only.
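A minimal sketch of what such an attempt-based URL metric looks like: a catalogued blocklist, and a counter that only moves when someone tries to display a blocked link after it was catalogued. Content that circulated before the URL was added never shows up in the count. The blocklist, URLs and function names are my own assumptions, not Facebook's.

```python
# Hypothetical blocklist of URLs already catalogued as CSAM-related.
blocked_urls = {"http://example.org/known-bad-link"}

content_actioned = 0  # the attempt-based metric described in the quote

def attempt_to_display(url: str) -> bool:
    """Simulate a user trying to display content containing `url`.
    Returns True if the content is shown, False if it is blocked."""
    global content_actioned
    if url in blocked_urls:
        content_actioned += 1   # counted only because a blocked attempt happened
        return False
    return True                 # unknown URL: shown, and nothing is counted

attempt_to_display("http://example.org/known-bad-link")  # blocked, counted
attempt_to_display("http://example.org/new-link")        # shown, not counted
blocked_urls.add("http://example.org/new-link")          # catalogued only later
attempt_to_display("http://example.org/new-link")        # blocked, counted

print("content actioned:", content_actioned)  # 2: the earlier exposure is invisible
```

Under this counting rule, all the views that happened before cataloguing simply never enter the transparency data.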