
Stop Collecting Immigrants’ Social Media Data

Social media data in the hands of the DHS is detrimental to privacy, opens the door to discrimination and abuse, and threatens freedom of speech and association.

June 30, 2019
Cross-posted from the New York Times.
Since the 2016 election, Congress has woken up to the consequences of allowing social media companies to hold vast stores of information about hundreds of millions of users and use it for their own purposes. But it continues to close its eyes to the dangers of allowing the Department of Homeland Security to tap into the same well of information for immigration decisions.
The centralization of highly personal information in the hands of this powerful agency is detrimental to privacy, opens the door to discrimination and abuse, and threatens freedom of speech and association.
An errant Facebook comment flagged by an algorithm can mark someone as a security risk, barring the door to a refugee fleeing war or a mother seeking to visit her American children. Despite claims of threats to national security, there is scant payoff. Empirical research shows that the likelihood of getting killed in a terrorist attack by an immigrant or visitor to this country is vanishingly small.
And posts and tweets are often unreliable. People posture, joke, speak in shorthand and use cultural references that are hard for others to interpret. It’s no surprise that the D.H.S.’s own pilot programs show that social media has not been useful in identifying threats.
As my colleagues and I have documented, the D.H.S. is finding ways to use social media data in several programs. It makes its way into the agency’s network of databases through searches of phones and laptops at the border and checks of people applying for visas and immigration benefits. It is used to vet Syrian and Iraqi refugees, as well as some asylum seekers. The D.H.S. has several opaque multimillion-dollar contracts with private data analytics companies like Palantir.
The State Department, D.H.S.’s close partner in visa vetting, is building a registry of social media handles that will make it easier to track what people say online.
Since 2016, travelers from 38 (mostly European) visa waiver countries have been asked to voluntarily provide their social media handles. And since last month, the almost 15 million people who apply for visas to enter the United States each year must disclose all social media handles that they have used in the last five years on 20 major platforms, including Facebook, Instagram, and Twitter.
Americans are caught up in this net too. The D.H.S.’s databases aren’t limited to foreign nationals. And even a foreign national’s social media activity reveals that person’s network of friends, relatives and co-workers, some close and some distant, but all fair game for the D.H.S.
Social media surveillance doesn’t always stop when travelers reach American shores, where their web of local contacts is likely to expand. Last year, Immigration and Customs Enforcement awarded a $100 million contract for continuous monitoring of 10,000 people annually whom it calls high risk, and D.H.S. leadership has made it plain that it is looking for ways to monitor visitors and immigrants inside the United States.
Social media can reveal the most intimate aspects of our lives: whether a person is gay or straight, whether she is a gun owner or a supporter of Planned Parenthood, whether she goes to the mosque on Fridays or to church on Sundays.
While this type of information is not relevant to security, it can be used to go after people the authorities disfavor by refusing them entry to the country, deporting them, targeting them for investigation, sharing their information with a repressive foreign government or just hassling them at the airport.
One of President Trump’s first acts in office was to bar travelers from several predominantly Muslim countries. When the ban was struck down by federal courts, the State Department imposed additional vetting measures that just happened to cover about the same number of people as the ban. The following year, a draft D.H.S. report proposed tagging young Muslim men as “at-risk persons” for intensive screening and continuous monitoring. The administration has gone after those opposing its draconian immigration policies too, using social media to track activists from the southern border to New York City.
The D.H.S.’s own tests show that social media content is an unreliable basis for making judgments about national security risk. A brief prepared for the incoming Trump administration explicitly questioned its utility: In pilot programs it was difficult to match individuals to their social media accounts, and even where a match was found, it was hard to judge whether there were “indicators of fraud, public safety, or national security concern.”
False negatives were a problem too. One program for vetting refugees found that social media did not “yield clear, articulable links to national security concerns,” even for applicants who were identified as potential threats based on other types of screening.
Given the volume of social media information, it’s no surprise that the D.H.S. is looking for algorithms to help. But computers are even worse than humans at making sense of what is said on social media, particularly when it comes to nuance and context. Even the best natural language processing programs generally achieve 70 percent to 75 percent accuracy, which means a quarter or more of posts would be misinterpreted.
Tone and sentiment analysis, which D.H.S. officials have floated as an option, is even less accurate. According to one study, it had a 27 percent success rate in predicting political ideology based on what people post on Twitter.
Accuracy takes a nose-dive when the speech being analyzed is not standard English, which is used to train most tools. The post “Bored af den my phone finna die!!!!” was flagged by an algorithm as Danish with 99.9 percent confidence.
Algorithms simply cannot make the types of judgment calls required in many immigration settings: What information is derogatory? What suggests that someone is a national security threat? Last year, ICE backed away from one automated vetting program after data scientists declared a computer simply could not figure out who would be a “positively contributing member of society,” “make contributions to the national interest” or commit a crime or terrorist act, and could instead easily resort to biased proxies.
It’s all too easy to see how social media information can be used for the Trump administration’s most egregious initiatives. It can tell the government who has criticized American foreign policy, so they can be denied permission to travel here. It can reveal where a child goes to school, allowing ICE agents to lie in wait outside for an undocumented parent.
But we cannot lay blame at the feet of the Trump administration alone. Efforts to leverage social media started during the Obama administration and have been cheered on by many in Congress. When administration officials tout social media monitoring efforts to congressional committees, they are rarely questioned on the implications of accumulating this data or even on the effectiveness of these efforts.
More recently, some members of Congress have raised concerns about particular programs based on media reports about the tracking of activists and protesters. And the hacking of license plate information collected by Customs and Border Protection has prompted calls for better information security. This nibbling at the edges doesn’t grapple with the implications of allowing these data collection programs to proliferate.
It’s time for Congress to conduct a full review of the use of social media in immigration decisions. It should start by requiring the D.H.S. to account for all the ways in which it collects and uses this information, provide objective assessments of its usefulness and explain how it plans to protect the privacy of the millions of people whose information is, or soon will be, in its databases.