Microsoft-developed system may be introduced this year once the complications of handling pictures posted alongside billions of tweets are overcome
Twitter is to introduce a tagging system to prevent child abuse images being posted on its service, which now sees millions of pictures posted among the 2bn tweets every five days.
The intention is to introduce the system, which uses a Microsoft-developed industry standard called "PhotoDNA", later this year if possible.
The move was revealed exclusively to the Guardian as the prime minister, David Cameron, steps up pressure on internet providers and particularly on search engines to block access to images of child abuse online.
Twitter's move has come independently of UK pressure. Microsoft and Facebook already use PhotoDNA: Facebook to monitor images posted to the social network, and Microsoft to monitor images stored on its Skydrive service and those accessible via its Bing search engine.
PhotoDNA works by producing a "hash" – a single number generated from the binary data of a picture or video, and some biometric information in the picture. The method still works even if the image is resized or altered.
When an image is posted, its hash is compared against known images of child abuse which have been flagged by operations such as the UK's Internet Watch Foundation and the US's National Center for Missing & Exploited Children's (NCMEC) Child Victim Identification Program. The latter found 17.3m images of abuse in 2011 – and since 2002 has reviewed more than 65m images and videos of child sexual exploitation reported by the authorities. However, the "hash" database is reckoned to be much smaller.
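The matching step described above can be illustrated with a minimal sketch. It does not show PhotoDNA itself, whose algorithm is not described in this article. It substitutes a much simpler, well-known technique called an "average hash", and every name in it (average_hash, hamming, is_known, the threshold of 5 bits) is a hypothetical choice made for illustration. The idea is the same in outline: reduce a picture to one number that survives resizing, then compare that number against a list of hashes of already-flagged images.

```python
from typing import Iterable, List

# A grayscale image as rows of pixel values (0-255). Illustrative type only.
Image = List[List[int]]


def average_hash(img: Image, size: int = 8) -> int:
    """Return a size*size-bit hash: shrink the image to a size x size grid
    by averaging blocks, then set one bit per cell that is brighter than
    the mean of all cells. NOT PhotoDNA; a simple stand-in."""
    h, w = len(img), len(img[0])
    if h < size or w < size:
        raise ValueError("image must be at least size x size pixels")
    cells = []
    for r in range(size):
        r0, r1 = r * h // size, (r + 1) * h // size
        for c in range(size):
            c0, c1 = c * w // size, (c + 1) * w // size
            block = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_known(img: Image, known_hashes: Iterable[int], max_distance: int = 5) -> bool:
    """True if the image's hash is within max_distance bits of any hash
    in the flagged list (the threshold is an assumed value)."""
    candidate = average_hash(img)
    return any(hamming(candidate, k) <= max_distance for k in known_hashes)
```

Because the comparison allows a few differing bits rather than demanding an exact match, a resized or lightly altered copy of a flagged picture can still be caught, while an unrelated picture should fall well outside the threshold.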
Microsoft developed the system in 2009 with Dartmouth College in the US, and donated the technology to the NCMEC. Facebook began using it in 2011.
"One of the most exciting things that we're working on is implementing PhotoDNA," said Del Harvey, senior director of Twitter's Trust & Safety team. "It's really fantastic that we're making progress on getting that in place. And it's good that others in the industry are working on it, or on implementing it, because this is one of those areas which is not about competition, it's about co-operation. We're trying to keep the user safe."
Harvey worked on preventing child abuse before joining Twitter in 2008. She said that implementing PhotoDNA on Twitter is complicated by the sheer scale and speed of the service. It is also complicated by the involvement of outside companies called Content Delivery Networks (CDNs), which store copies of data posted online at locations closer to users, so they can be downloaded more quickly.
"You think 'we'll just delete the image', but then you face the question of whether it's hosted on a CDN. In that case, how do you make sure it gets flushed out? What if there's a backlog of requests for images to delete? You start to wonder if these things really have to be this complicated just to delete an image – and the answer turns out to be yes, it really does have to be this complicated."
Google said in June that it has used a "hashing" technology to identify child abuse images online since 2008. It does not use PhotoDNA, but its system is compatible with it.