
On facial recognition and retroactive indexing

Anonymizing data is already quite difficult, as shown in this 2015 paper on the reidentifiability of scrubbed credit card metadata. Beyond ineffective anonymization, another disturbing development is the rate at which AI and ML are improving at image recognition. In particular, face recognition is approaching practicality for general-purpose use (see Amazon Rekognition, for example). While these technologies aren't quite there yet, they will inevitably reach that point. Coupled with data sets that are already publicly available, this means that large public image repositories like Imgur will become petri dishes for face recognition data.

These technologies affect existing data retroactively. What is now an unlabeled morass of anonymous pictures could conceivably become a treasure trove for data brokers once the cost of picking out one person's likeness from billions of images becomes easily affordable. This can and should concern anyone who's put up anything incriminating on the internet. At this point in history, that's pretty much everyone.

Toss in unintentional location tagging, and a list of pictures becomes not just a timeline, but also a trace of where you have been and when.

It won't just be the FBI that can access a large database of facial recognition data. It'll be just about anyone from a potential employer to a stalker.

Automatic License Plate Recognition and its kin are already a good source of side income for anyone who can point a camera at a street. Without due regulation in place, face recognition has the potential to be even more lucrative. The best part is that even if facial recognition itself is regulated, collecting unlabeled video footage of public spaces will likely remain viable for a long time to come.

What does all this mean? Soon it will become practical to start with someone's face and locate any and all pictures featuring them, whether those pictures were made public by that person or by others, including ones where the subject was photographed without their knowledge.
