Face Recognition Study - FAQ
Faces of Facebook: Privacy in the Age of Augmented Reality
Alessandro Acquisti (Heinz College, Carnegie Mellon University)
Ralph Gross (Heinz College, Carnegie Mellon University)
Fred Stutzman (Heinz College, Carnegie Mellon University)
DRAFT slides: Faces of Facebook: Privacy in the Age of Augmented Reality, presented at BlackHat Las Vegas, August 4, 2011
MANUSCRIPT: Acquisti, Alessandro; Gross, Ralph; and Stutzman, Fred (2014) "Face Recognition and Privacy in the Age of Augmented Reality," Journal of Privacy and Confidentiality: Vol. 6: Iss. 2, Article 1.
Available at: http://repository.cmu.edu/jpc/vol6/iss2/1
The authors gratefully acknowledge research support from the
National Science Foundation under grant # 0713361, from the US Army Research
Office under contract # DAAD190210389, from the Carnegie Mellon Berkman Fund,
from the Heinz College, and from Carnegie Mellon CyLab.
The authors thank Nithin Betegeri,
Aravind Bharadwaj, Varun
Gandhi, Markus Huber, Aaron Jaech, Ganesh Raj ManickaRaju, Rahul Pandey, Nithin
Reddy, and Venkata Tumuluri
for outstanding research assistantship, and Laura Brandimarte, Samita Dhanasobhon, Nitin Grewal,
Anuj Gupta, Hazel Diana Mary, Snigdha
Nayak, Soumya Srivastava, Thejas Varier, and Narayana Venkatesh for additional assistantship.
Please note: this is a DRAFT document. We will keep
adding Q&As if we receive or read relevant questions about the study in
comments and emails. Please bear with us as we add content and work towards a
final, clean version of this FAQ. Thank you!
Index
1. What is this research about?
2. What were the results of Experiment 1?
3. What were the results of Experiment 2?
4. What were the results of Experiment 3, and how do they relate to "Augmented Reality"?
5. Are these results scalable?
6. What are the implications of this study?
7. Face recognition has been around for a long while, and Web 2.0 companies have deployed it in their tools/applications. What is new about this study?
8. Who supported this research?
9. Were the tests IRB approved?
Summary
We investigated the feasibility of combining publicly
available Web 2.0 data with off-the-shelf face recognition software for the
purpose of large-scale, automated individual re-identification. Two experiments
demonstrated the ability to identify strangers online (on a dating site
where individuals protect their identities by using pseudonyms) and offline (in
a public space), based on photos made publicly available on a social network
site. A third proof-of-concept experiment illustrated the ability to infer
strangers' personal or sensitive information (their interests and Social
Security numbers) from their faces, by combining face recognition, data mining
algorithms, and statistical re-identification techniques. The results highlight
the implications of the inevitable convergence of face recognition technology
and increasing online self-disclosures, and the emergence of "personally
predictable" information. They raise questions about the future of privacy in
an "augmented" reality world in which online and offline data will
seamlessly blend.
General questions
Q. What is this research about?
We studied the consequences and
implications of the convergence of three technologies: face recognition, cloud computing,
and online social networks. Specifically, we investigated whether the
combination of publicly available Web 2.0 data and off-the-shelf
face recognition software may allow large-scale, automated, end-user individual
re-identification. We identified strangers online (across different online
services: Experiment 1), offline (in the physical world: Experiment 2), and
then inferred additional, sensitive information about them, combining face
recognition and data mining, thus blending together online and offline data
(Experiment 3). Finally, we developed a mobile phone application to demonstrate
the ability to recognize and then predict someone's sensitive personal data
directly from their face in real time.
Q. What were the results of Experiment 1?
Experiment 1 was about
online-to-online re-identification. We took unidentified profile photos from a
popular dating site (where people use pseudonyms to protect privacy), compared
them - using face recognition - to identified photos from social networking
sites (namely, we used only the portions of a Facebook profile that can be publicly accessed via
a search engine; we did not even log on to the network itself), and ended up
re-identifying a statistically significant proportion of members of the dating
site.
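A minimal sketch of this matching step, using the open-source face_recognition Python package as a stand-in for the off-the-shelf recognizer used in the study (the file names, gallery, and distance threshold below are illustrative assumptions, not our experimental setup):

    # Online-to-online matching sketch: compare an unidentified photo against
    # identified photos taken from publicly accessible profile pages.
    import face_recognition

    # Unidentified photo, e.g. a pseudonymous dating-site profile picture.
    unknown = face_recognition.load_image_file("dating_profile.jpg")
    unknown_encoding = face_recognition.face_encodings(unknown)[0]

    # Identified photos (hypothetical file names) crawled from public profiles.
    gallery = {"alice": "alice_public_profile.jpg",
               "bob": "bob_public_profile.jpg"}

    names, known_encodings = [], []
    for name, path in gallery.items():
        encodings = face_recognition.face_encodings(
            face_recognition.load_image_file(path))
        if encodings:                      # skip photos with no detectable face
            names.append(name)
            known_encodings.append(encodings[0])

    # Lower distance means more similar faces; 0.6 is the package's default tolerance.
    distances = face_recognition.face_distance(known_encodings, unknown_encoding)
    best = distances.argmin()
    if distances[best] < 0.6:
        print(f"Best candidate: {names[best]} (distance {distances[best]:.2f})")
    else:
        print("No match below threshold")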
Q. What were the results of Experiment 2?
Experiment 2 was about
offline-to-online re-identification. It was conceptually similar to Experiment
1, but we focused on re-identifying students on the campus of a North American
college. We took images of them with a webcam and then compared those shots to
images from Facebook profiles. Using this approach, we re-identified about one
third of the subjects in the experiment.
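The capture-and-match step can be sketched in the same spirit, here assuming OpenCV for the webcam grab and the same open-source matcher as above; the camera index, gallery files, and threshold are illustrative assumptions:

    # Offline-to-online sketch: grab one webcam frame and compare it against
    # identified photos from publicly accessible profiles.
    import cv2
    import face_recognition

    gallery = {"alice": "alice_public_profile.jpg",
               "bob": "bob_public_profile.jpg"}
    names, known = [], []
    for name, path in gallery.items():
        enc = face_recognition.face_encodings(face_recognition.load_image_file(path))
        if enc:
            names.append(name)
            known.append(enc[0])

    cap = cv2.VideoCapture(0)              # default webcam
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("Could not read a frame from the webcam")

    # OpenCV returns BGR frames; face_recognition expects RGB.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    probe = face_recognition.face_encodings(rgb)

    if probe and known:
        distances = face_recognition.face_distance(known, probe[0])
        best = distances.argmin()
        print(f"Closest identified photo: {names[best]} (distance {distances[best]:.2f})")
    else:
        print("No face detected, or empty gallery")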
Q. What were the results of Experiment 3, and how do they relate to "Augmented Reality"?
We use the term augmented reality
in a slightly extended sense, to refer to the merging of online and offline
data that new technologies make possible. If an individual's face in the street
can be identified using a face recognizer and identified images from social
network sites such as Facebook or LinkedIn, then it becomes possible not just
to identify that individual, but also to infer additional, and more sensitive,
information about her, once her name has been (probabilistically) inferred. In
our third experiment, as a proof-of-concept, we predicted the interests and
Social Security numbers of some of the participants in the second experiment.
We did so by combining face recognition with the algorithms we developed in
2009 to predict SSNs from public data. SSNs were just one
example of what it is possible to predict about a person: conceptually, the goal
of Experiment 3 was to show that it is possible to start from an anonymous face
in the street, and end up with very sensitive information about that person, in
a process of data "accretion." In the context of our experiment, it
is this blending of online and offline data - made possible by the convergence
of face recognition, social networks, data mining, and cloud computing - that
we refer to as augmented reality.
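The accretion chain itself can be pictured as a short pipeline. The sketch below is purely structural: the helper names and their interfaces are hypothetical placeholders, not the recognizer or the 2009 SSN-prediction algorithm used in the study:

    # Structural sketch of data "accretion": anonymous face -> candidate name ->
    # publicly available attributes -> sensitive inference. Placeholders only.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class PublicProfile:
        name: str
        hometown: str            # proxy for the state where the SSN was issued
        date_of_birth: str       # e.g. "1990-05-17"
        interests: List[str] = field(default_factory=list)

    def match_face(photo_path: str) -> Optional[PublicProfile]:
        """Step 1 (as in Experiments 1 and 2): match the face against identified,
        publicly crawled photos and return the best candidate's public data."""
        raise NotImplementedError("placeholder for the face-matching step")

    def predict_ssn_prefix(hometown: str, date_of_birth: str) -> Optional[str]:
        """Step 2: statistical re-identification, in the spirit of the 2009 work
        that exploited patterns in how (pre-randomization) SSNs were assigned."""
        raise NotImplementedError("placeholder for the SSN-prediction step")

    def accrete(photo_path: str) -> dict:
        """Chain the steps: start from an anonymous face, end with inferred data."""
        profile = match_face(photo_path)
        if profile is None:
            return {"identified": False}
        return {
            "identified": True,
            "name": profile.name,
            "interests": profile.interests,
            "predicted_ssn_prefix": predict_ssn_prefix(profile.hometown,
                                                       profile.date_of_birth),
        }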
Q. Are these results scalable?
The capabilities of automated face
recognition *today* are still limited - but keep improving. Although our
studies were conducted in the "wild" (that is, with real social
network profile data, webcam shots taken in public, and so forth), they
are nevertheless the output of a controlled set of experiments. The results
of a controlled experiment do not necessarily translate to reality with the same
level of accuracy. However, considering the technological trends in cloud
computing, face recognition accuracy, and online self-disclosures, it is hard
not to conclude that what we presented today as a proof of concept
may tomorrow become as common as everyday
text-based search engine queries.
Q. What are the implications of this study?
Our study is less about face
recognition and more about privacy concerns raised by the convergence of
various technologies. There is no obvious answer or solution to the privacy
concerns raised by widely available face recognition and identified (or
identifiable) facial images. Google's Eric Schmidt observed that, in the
future, young individuals may be entitled to change their names to disown youthful
improprieties. It is much harder, however, to change someone's face. Other than
adapting to a world where every stranger in the street could quite
accurately predict sensitive information about you (such as your SSN, but also your
credit score, or sexual orientation), we need to think about policy solutions
that can balance the benefits and risks of peer-based face recognition.
Self-regulation or opt-in mechanisms are not going to work, since the results
we presented are based on publicly available information.
Q. Face recognition has been around for a long while, and Web 2.0 companies have deployed it in their tools/applications. What is new about this study?
Indeed, in recent times, Google
has acquired Neven Vision, Riya, and PittPatt and deployed
face recognition into Picasa. Apple has acquired Polar Rose, and deployed face
recognition into iPhoto. Facebook has licensed Face.com to enable automated
tagging. So far, however, these end-user Web 2.0 applications are limited in
scope: They are constrained by, and within, the boundaries of the service in
which they are deployed. Our focus, by contrast, was on examining whether the
convergence of publicly available Web 2.0 data, cheap cloud computing, data
mining, and off-the-shelf face recognition is bringing us closer to a world
where anyone may run face recognition on anyone else, online and offline - and
then infer additional, sensitive data about the target subject, starting merely
from one anonymous piece of information about her: the face.
Q. Who supported this research?
This research was supported by the National Science Foundation
(under Grant 0713361) and the U.S. Army Research Office (under Contract
DAAD190210389, through Carnegie Mellon's CyLab). We
also received support from the Carnegie Mellon Berkman Fund, Heinz College, and
CyLab.
Q. Were the tests IRB approved?
Yes, they were approved. As in our
previous studies, no SSNs (or faces) were harmed during the writing of this
paper.