Stop Collecting Immigrants’ Social Media Data
Social media data in the hands of the DHS is detrimental to privacy, opens the door to discrimination and abuse, and threatens freedom of speech and association.
June 30, 2019
Cross-posted from the New York Times.
Since the 2016 election, Congress has woken up to the consequences of allowing social media companies to hold vast stores of information about hundreds of millions of users and use it for their own purposes. But it continues to close its eyes to the dangers of allowing the Department of Homeland Security to tap into the same well of information for immigration decisions.
The centralization of highly personal information in the hands of this powerful agency is detrimental to privacy, opens the door to discrimination and abuse, and threatens freedom of speech and association.
An errant Facebook comment flagged by an algorithm can mark someone as a security risk, barring the door to a refugee fleeing war or a mother seeking to visit her American children. Despite claims of threats to national security, there is scant payoff. Empirical research shows that the likelihood of getting killed in a terrorist attack by an immigrant or visitor to this country is vanishingly small.
And posts and tweets are often unreliable. People posture, joke, speak in shorthand and use cultural references that are hard for others to interpret. It’s no surprise that the D.H.S.’s own pilot programs show that social media has not been useful in identifying threats.
As my colleagues and I have documented, the D.H.S. is finding ways to use social media data in several programs. The data makes its way into the agency’s network of databases through searches of phones and laptops at the border and checks of people applying for visas and immigration benefits. It is used to vet Syrian and Iraqi refugees, as well as some asylum seekers. The D.H.S. has several opaque multimillion-dollar contracts with private data analytics companies like Palantir.
The State Department, D.H.S.’s close partner in visa vetting, is building a registry of social media handles that will make it easier to track what people say online.
Since 2016, travelers from 38 (mostly European) visa waiver countries have been asked to voluntarily provide their social media handles. And since last month, the almost 15 million people who apply for visas to enter the United States each year must disclose all social media handles that they have used in the last five years on 20 major platforms, including Facebook, Instagram, and Twitter.
Americans are caught up in this net too. The D.H.S.’s databases aren’t limited to foreign nationals. And even a foreign national’s social media activity reveals that person’s network of friends, relatives and co-workers, some close and some distant, but all fair game for the D.H.S.
Social media surveillance doesn’t always stop when travelers reach American shores, where their web of local contacts is likely to expand. Last year, Immigration and Customs Enforcement awarded a $100 million contract for continuous monitoring of 10,000 people annually that it calls high risk, and D.H.S. leadership has made it plain that it is looking for ways to monitor visitors and immigrants inside the United States.
Social media can reveal the most intimate aspects of our lives: whether a person is gay or straight, whether she is a gun owner or a supporter of Planned Parenthood, whether she goes to the mosque on Fridays or to church on Sundays.
While this type of information is not relevant to security, it can be used to go after people the authorities disfavor by refusing them entry to the country, deporting them, targeting them for investigation, sharing their information with a repressive foreign government or just hassling them at the airport.
One of President Trump’s first acts in office was to bar travelers from several Muslim countries. When the ban was struck down by federal courts, the State Department imposed additional vetting measures that just happened to cover about the same number of people as the ban. The following year, a draft D.H.S. report proposed tagging young Muslim men as “at-risk persons” for intensive screening and continuous monitoring. The administration has gone after those opposing its draconian immigration policies too, using social media to track activists from the southern border to New York City.
The D.H.S.’s own tests show that social media content is an unreliable basis for making judgments about national security risk. A brief prepared for the incoming Trump administration explicitly questioned its utility: In pilot programs it was difficult to match individuals to their social media accounts, and even where a match was found, it was hard to judge whether there were “indicators of fraud, public safety, or national security concern.”
False negatives were a problem too. One program for vetting refugees found that social media did not “yield clear, articulable links to national security concerns,” even for applicants who were identified as potential threats based on other types of screening.
Given the volume of social media information, it’s no surprise that the D.H.S. is looking for algorithms to help. But computers are even worse than humans in making sense of what is said on social media, particularly when it comes to nuance and context. Even the best natural language processing program generally achieves 70 percent to 75 percent accuracy, which means at least a quarter of posts would be misinterpreted.
Tone and sentiment analysis, which D.H.S. officials have floated as an option, is even less accurate. According to one study, it had a 27 percent success rate in predicting political ideology based on what people post on Twitter.
Accuracy takes a nose-dive when the speech being analyzed is not standard English, which is used to train most tools. The post “Bored af den my phone finna die!!!!” was flagged by an algorithm as Danish with 99.9 percent confidence.
Algorithms simply cannot make the types of judgment calls required in many immigration settings: What information is derogatory? What suggests that someone is a national security threat? Last year, ICE backed away from one automated vetting program after data scientists declared a computer simply could not figure out who would be a “positively contributing member of society,” “make contributions to the national interest” or commit a crime or terrorist act, and could instead easily resort to biased proxies.
It’s all too easy to see how social media information can be used for the Trump administration’s most egregious initiatives. It can tell the government who has criticized American foreign policy, so they can be denied permission to travel here. It can reveal where a child goes to school, allowing ICE agents to lie in wait outside for an undocumented parent.
But we cannot lay blame at the feet of the Trump administration alone. Efforts to leverage social media started during the Obama administration and have been cheered on by many in Congress. When administration officials tout social media monitoring efforts to congressional committees, they are rarely questioned on the implications of accumulating this data or even on the effectiveness of these efforts.
More recently, some members of Congress have raised concerns about particular programs based on media reports about the tracking of activists and protesters. And the hacking of license plate information collected by Customs and Border Protection has prompted calls for better information security. This nibbling at the edges doesn’t grapple with the implications of allowing these data collection programs to proliferate.
It’s time for Congress to conduct a full review of the use of social media in immigration decisions. It should start by requiring the D.H.S. to account for all the ways in which it collects and uses this information, provide objective assessments of its usefulness and explain how it plans to protect the privacy of the millions of people whose information is, or soon will be, in its databases.