Report

Social Media Monitoring

Summary: Personal information gleaned from social media posts has been used to target dissent and subject religious and ethnic minorities to enhanced vetting and surveillance.

March 11, 2020
//
May 22, 2019

Introduction

This report was updated in March 2020 to reflect Immigration and Customs Enforcement’s new policy concerning border searches of electronic devices.

The Department of Homeland Security (DHS) is rapidly expanding its collection of social media information and using it to evaluate the security risks posed by foreign and American travelers. This year marks a major expansion. The visa applications vetted by DHS will include social media handles that the State Department is set to collect from some 15 million travelers per year. 1 Social media can provide a vast trove of information about individuals, including their personal preferences, political and religious views, physical and mental health, and the identity of their friends and family. But it is susceptible to misinterpretation, and wholesale monitoring of social media creates serious risks to privacy and free speech. Moreover, despite the rush to implement these programs, there is scant evidence that they actually meet the goals for which they are deployed.

While officials regularly testify before Congress to highlight some of the ways in which DHS is using social media, they rarely give a full picture or discuss either the effectiveness of such programs or their risks. The extent to which DHS exploits social media information is buried in jargon-filled notices about changes to document storage systems that impart only the vaguest outlines of the underlying activities.

To fill this gap, this report seeks to map out the department’s collection, use, and sharing of social media information by piecing together press reports, information obtained through Freedom of Information Act requests, Privacy Impact Assessments, 2 System of Records Notices (SORNs), 3 departmental handbooks, government contracts, and other publicly available documents.

In light of DHS’s expanding use of social media monitoring programs, understanding the ways in which the department exploits social media is critical. Personal information gleaned from social media posts has been used to target dissent and subject religious and ethnic minorities to enhanced vetting and surveillance. Some DHS programs are targeted at travelers, both Americans and those from other countries. And while the department’s immigration vetting programs ostensibly target foreigners, they also sweep up information about American friends, family members, and business associates, either deliberately or as a consequence of their broad scope.

Muslims are particularly vulnerable to targeting. According to a 2011 Pew survey (which was followed by a similar survey in 2017), more than a third of Muslim Americans who traveled by air reported that they had been singled out by airport security for their faith, a practice that presumes a connection between being a devout Muslim and engaging in terrorism, a presumption that has long been debunked. 4 A legal challenge to this practice is pending. 5 According to government documents, one of the plaintiffs, Hassan Shibly, executive director of the Florida chapter of the Council on American-Islamic Relations, was pulled aside for secondary screening at the border at least 20 times from 2004 to 2011. 6 He says he was asked questions like “Are you part of any Islamic tribes?” and “Do you attend a particular mosque?” 7 Shibly’s story is hardly unique. 8

Concerns about such screenings are even more urgent under the Trump administration, which has made excluding Muslims a centerpiece of its immigration agenda through policies such as the Muslim ban and implementation of “extreme vetting” for refugee and visa applicants, primarily those from the Muslim world. 9 A leaked DHS draft report from 2018 suggests that the administration is considering tagging young Muslim men as “at-risk persons” who should be subjected to intensive screening and ongoing monitoring. 10 If implemented, such a policy would affect hundreds of thousands of people. 11 DHS’s social media monitoring pilot programs seem to have focused in large part on Muslims: at least two targeted Syrian refugees, one targeted both Syrian and Iraqi refugees, and the analytical tool used in at least two pilots was tailored to Arabic speakers. 12

More generally, social media monitoring — like other forms of surveillance — will impact what people say online, leading to self-censorship of people applying for visas as well as their family members and friends. The deleterious effect of surveillance on free speech has been well documented in empirical research; one recent study found that awareness or fear of government surveillance of the internet had a substantial chilling effect among both U.S. Muslims and broader U.S. samples of internet users. 13 Even people who said they had nothing to hide were highly likely to self-censor online when they knew the government was watching. 14 As Justice Sonia Sotomayor warned in a 2012 Supreme Court case challenging the warrantless use of GPS tracking technology, “[a]wareness that the Government may be watching chills associational and expressive freedoms. And the Government’s unrestrained power to assemble data that reveals private aspects of identity is susceptible to abuse.” 15

DHS’s pilot programs for monitoring social media have been notably unsuccessful in identifying threats to national security. 16 In 2016, DHS piloted several social media monitoring programs, one run by ICE and five by United States Citizenship and Immigration Services (USCIS). 17 A February 2017 DHS inspector general audit of these pilot programs found that the department had not measured their effectiveness, rendering them an inadequate basis on which to build broader initiatives. 18

Even more damning are USCIS’s own evaluations of the programs, which showed them to be largely ineffective. According to a brief prepared by DHS for the incoming administration at the end of 2016, for three out of the four programs used to vet refugees, “the information in the accounts did not yield clear, articulable links to national security concerns, even for those applicants who were found to pose a potential national security threat based on other security screening results.” 19 The brief does show that USCIS complied with its own rules, which prohibit denying benefits solely on the basis of public-source information — such as that derived from social media — due to “its inherent lack of data integrity.” 20 The department reviewed 1,500 immigration benefits cases and found that none were denied “solely or primarily because of information uncovered through social media vetting.” 21 But this information provided scant insights in any event: out of the 12,000 refugee applicants and 1,500 immigration benefit applicants screened, USCIS found social media information helpful only in “a small number of cases,” where it “had a limited impact on the processing of those cases — specifically in developing additional lines of inquiry.” 22

In fact, a key takeaway from the pilot programs was that they were unable to reliably match social media accounts to the individual being vetted, and even where the correct accounts were found, it was hard to determine “with any level of certainty” the “authenticity, veracity, [or] social context” of the data, as well as whether there were “indicators of fraud, public safety, or national security concern.” 23 The brief explicitly questioned the overall value of the programs, noting that dedicating personnel “to mass social media screening diverts them away from conducting the more targeted enhanced vetting they are well trained and equipped to do.” 24

The difficulties faced by DHS personnel are hardly surprising; attempts to make judgments based on social media are inevitably plagued by problems of interpretation. 25 In 2012, for example, a British national was denied entry at a Los Angeles airport when DHS agents misinterpreted his posting on Twitter that he was going to “destroy America” — slang for partying — and “dig up Marilyn Monroe’s grave” — a joking reference to a television show. 26 As the USCIS pilot programs demonstrate, interpretation is even harder when the language used is not English and the cultural context is unfamiliar. If the State Department’s current plans to undertake social media screening for 15 million travelers are implemented, government agencies will have to be able to understand the languages (more than 7,000) and cultural norms of 193 countries. 27

Nonverbal communications on social media pose yet another set of challenges. As the Brennan Center and 34 other civil rights and civil liberties organizations pointed out in a May 2017 letter to the State Department:

“If a Facebook user posts an article about the FBI persuading young, isolated Muslims to make statements in support of ISIS, and another user ‘loves’ the article, is he sending appreciation that the article was posted, signaling support for the FBI’s practices, or sending love to a friend whose family has been affected?” 28

All of these difficulties, already substantial, are compounded when the process of reviewing posts is automated. Obviously, using simple keyword searches in an effort to identify threats would be useless because they would return an overwhelming number of results, many of them irrelevant. One American police department learned this lesson the hard way when efforts to unearth bomb threats online instead turned up references to “bomb” (i.e., excellent) pizza. 29 Natural language processing, the tool used to judge the meaning of text, is not nearly accurate enough to do the job either. Studies show that the highest accuracy rate achieved by these tools is around 80 percent, with top-rated tools generally achieving 70–75 percent accuracy. 30 This means that 20–30 percent of posts analyzed through natural language processing would be misinterpreted.
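The arithmetic behind that last sentence can be made concrete. The following is a minimal sketch, not a description of any tool DHS uses: it converts the accuracy figures cited above into an expected number of misread posts. The batch size of 1,000 posts is a hypothetical number chosen only for illustration, and the function name is ours.

```python
def expected_misreads(num_posts: int, accuracy: float) -> int:
    """Expected number of posts misinterpreted at a given accuracy rate."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(num_posts * (1.0 - accuracy))


# Hypothetical batch size, for illustration only.
POSTS = 1_000

# Accuracy figures cited in the text: 70-75 percent typical, about 80 percent at best.
for accuracy in (0.70, 0.75, 0.80):
    misread = expected_misreads(POSTS, accuracy)
    print(f"{accuracy:.0%} accurate: about {misread} of {POSTS} posts misread")
```

Even at the best reported accuracy, roughly one post in five would be misread, and the absolute number of errors grows in direct proportion to the number of posts reviewed.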

Algorithmic tone and sentiment analysis, which senior DHS officials have suggested is being used to analyze social media, is even less accurate. 31 A recent study concluded that it could make accurate predictions of political ideology based on users’ Twitter posts only 27 percent of the time, observing that the predictive exercise was “harder and more nuanced than previously reported.” 32 Accuracy plummets even further when the speech being analyzed is not standard English. 33 Indeed, even English speakers using nonstandard dialects or lingo may be misidentified by automated tools as speaking in a different language. One tool flagged posts in English by black and Hispanic users — like “Bored af den my phone finna die!!!!” (which can be loosely translated as “I’m bored as f*** and then my phone is going to die”) — as Danish with 99.9 percent confidence. 34

Crucially — as the USCIS pilot programs discussed above demonstrated — algorithms are generally incapable of making the types of subjective evaluations that are required in many DHS immigration programs, such as whether someone poses a threat to public safety or national security or whether certain information is “derogatory.” Moreover, because these types of threats are difficult to define and measure, makers of algorithms will turn to “proxies” that are more easily observed. But there is a risk that the proxies will bear little or no relationship to the task and that they will instead reflect stereotypes and assumptions. The questioning of Muslim travelers about their religious practice as a means of judging the threat they pose shows that unfounded and biased assumptions are already entrenched at DHS. It would be easy enough to embed them in an algorithm.

Despite these serious shortcomings in terms of effectiveness and critics’ well-founded concerns about the potential for targeting certain political views and faiths, DHS is proceeding with programs for monitoring social media. 35 The department’s attitude is perhaps best summed up by an ICE official who acknowledged that while they had not yet found anything on social media, “you never know, the day may come when social media will actually find someone that wasn’t in the government systems we check.” 36

The consequences of allowing these types of programs to continue unchecked are too grave to ignore. In addition to responding to particular cases of abuse, Congress needs to fully address the risks of social media monitoring in immigration decisions. This requires understanding the overall system by which DHS collects this type of information, how it is used, how it is shared with other agencies, and how it is retained – often for decades – in government databases. Accordingly, this paper maps social media exploitation by the four parts of DHS that are most central to immigration: Customs and Border Protection (CBP), the Transportation Security Administration (TSA), Immigration and Customs Enforcement (ICE), and United States Citizenship and Immigration Services (USCIS). It also examines DHS’s cooperation with the Department of State, which plays a key role in immigration vetting.


Case Studies: Using Social Media to Target First Amendment–Protected Activity

Key Findings

While the ways in which these DHS units use social media vary, our review identified eight common threads.

1. Social media information is collected from travelers, including Americans, even when they are not suspected of any connection to illegal activity.

People planning to travel to the United States are increasingly being asked to provide social media identifiers, such as their Twitter or Instagram handles, enabling the creation of a registry of their online postings. In December 2016, DHS began asking the travelers who come to the United States from countries covered by the Visa Waiver Program — some 23.6 million annually, primarily from Western Europe — to voluntarily provide their social media identifiers. 1 In May 2017, the State Department, as part of implementing the Muslim ban executive order, began requiring some categories of visa applicants — estimated at 65,000 applicants annually — to provide a list of the identifiers they had used on social media platforms within the previous five years. 2 In March 2018, the State Department started to ramp up its efforts, proposing a new rule that would collect social media identifiers from every visa applicant — i.e., the 15 million people who apply for visas each year. 3 The proposal was approved in April 2019, with minor privacy-related changes. 4

Social media data can also be collected via searches of electronic devices, which DHS carries out, often without suspicion of criminal activity, on both American and foreign travelers. The department claims the authority to undertake warrantless searches of these devices not just at points of entry but also in areas in the broad vicinity of the border. 5 These searches are conducted primarily by CBP and ICE. While ICE does not report statistics on these searches, CBP does. Its searches of travelers’ electronic devices at ports of entry have been steadily increasing over the past several years. In fiscal year 2015, CBP searched the devices of 8,503 travelers. 6 By fiscal year 2017, this number had gone up to 30,200, more than three and a half times the 2015 figure. 7 According to ABC News, 20 percent of these searches are carried out on American travelers. 8

Finally, through contracts with various private companies, DHS acquires massive commercial databases of online information, including social media data. 9 Unlike the direct collection of social media handles by DHS and the State Department, there is no assurance that an individual will be accurately connected to a social media profile. The difficulty of matching people to profiles was a major shortcoming of automated monitoring tested by the pilot programs discussed above. 10

2. Social media checks extend to travelers’ family, friends, business associates, and social media contacts.

When DHS checks the social media of someone trying to obtain permission to come to the United States or someone already at or near the border, it inevitably picks up information about people with whom they interact. For example, ICE agents searching a traveler’s smartphone at or near the border can scroll through her Facebook and Twitter accounts and record what they find, including information about friends and family, location data, and sensitive personal information, all without any suspicion of criminal activity. With reasonable suspicion of criminal activity, ICE agents can download the entirety of her social media accounts and go through them later. 11

In addition, CBP agents conducting social media checks for people applying for visa waivers (available to the citizens of 38 countries) can examine not only the applicant’s posts but those of the people who interacted with her on social media (even if uninvited), and may retain information so long as the agent believes it is “relevant” to the waiver decision. 12 The program also allows agents to proactively identify an applicant’s secondary and tertiary contacts who might “pose a potential risk to the homeland” or “demonstrate a nefarious affiliation on the part of the applicant.” 13 Examining contacts and networks may make sense when pursuing someone who is suspected of wrongdoing. But applying this technique to people who are simply seeking to travel opens the door to fishing expeditions for information that can easily be misinterpreted.

Automated analytical tools used by DHS combine social media with other types of information to identify and map possible associations among people and organizations. ICE and CBP both use data systems developed by the data mining 14 company Palantir Technologies, Inc., that are equipped with tools to analyze social networks. 15 However, the reliability of the information ingested by these systems is not verified; DHS has exempted them from the relevant requirements of the Privacy Act, and there are functionally no mechanisms for the individuals whose information is included to challenge the accuracy of the data. 16 According to reports from watchdog groups and the press, these systems are being used by ICE to identify individuals for deportation. 17

These data sets are also used by DHS to undertake broader trend, pattern, and predictive analyses, through a number of systems that are described in this paper. 18 While the privacy impact assessments for these systems often identify the sources of information used in these analyses, there is almost no publicly available information regarding what types of trends or patterns DHS is seeking to identify or how social media information fits into these types of analyses.

3. DHS frequently uses social media information for vague and open-ended evaluations that can be used to target unpopular views or populations.

Our review showed that in many instances — including the Visa Waiver Program and warrantless searches at the border by CBP and ICE — DHS personnel are charged with examining social media to identify information relating to undefined “national security” risks or concerns. 19 Publicly available documents do not indicate what type of information might be regarded as indicative of a national security risk, and it has been reported that at least some agents are uncertain about what type of information would be considered to be suggestive of a national security risk. 20 While agents obviously must have some flexibility to make judgments, the breadth of discretion combined with weak safeguards opens the door for discrimination based on political or religious views.

Social media information forms part of the data set that DHS uses to assign risk assessments to individual travelers through CBP’s Automated Targeting System (ATS). These assessments are highly consequential because they determine who is allowed to enter the country and what level of questioning they are required to undergo. 21 But there is no publicly available information about the accuracy, effectiveness, or empirical basis of risk assessments. 22 In fact, the information that goes into one’s risk assessment need not be “accurate, relevant, timely, [or] complete,” as DHS exempted ATS from these Privacy Act requirements. 23 This is particularly troubling because in other settings, such as the criminal justice system, risk assessments have been shown to disproportionately impact minorities. 24

For example, as of at least 2017, DHS compares refugee and asylum applicant information from social media and other sources against the information provided by an applicant to identify any inconsistencies. Such social media checks are, however, performed only on select populations of asylum seekers and refugees. With the exception of Iraqis and Syrians, these applicant populations have not been publicly identified. 25 However, one prominent refugee organization reported in 2018 that these measures are applied to refugee applicants from the Muslim countries of Egypt, Iran, Iraq, Libya, Mali, Somalia, South Sudan, Sudan, Syria, and Yemen, as well as North Korea. 26 All of the countries covered by the Trump Muslim ban are on this list.

4. DHS is continuously monitoring some people inside the United States and plans to expand these efforts.

DHS is increasingly implementing programs to continuously monitor people inside the United States, where freedom of speech, association, and religion are constitutionally protected. For example, using social media and other sources, ICE monitors students who enter the United States planning to study a “nonsensitive” topic and later change to one the State Department categorizes as “sensitive” (e.g., nuclear physics, biomedical engineering, or robotics). 27 ICE’s Overstay Lifecycle Program targets visitors from a number of unidentified countries to uncover derogatory information for ongoing monitoring, including through social media. And ICE’s planned Visa Lifecycle Vetting Initiative would keep tabs on 10,000 foreign visitors flagged as “high risk” by monitoring their social media activity. 28 As noted earlier, a draft CBP report recommended continuously monitoring young Muslim men while they were in the United States. If implemented, this discriminatory policy would affect hundreds of thousands of people. 29

5. DHS is increasingly seeking and using automated tools to analyze social media.

While the full scope of DHS’s efforts to use algorithms is not known, our research shows that at least three branches of DHS — CBP, ICE, and USCIS — now use automated tools to analyze social media information, either alongside other data or by itself. For instance, CBP’s Analytical Framework for Intelligence has automated analytic capabilities, developed by Palantir, to identify “non-obvious” links among data points, people, and organizations. 30 Similarly, ICE has contracted with the data mining firm Giant Oak to continuously monitor, aggregate, and analyze social media data to provide ICE with prioritized rankings of leads for its overstay enforcement initiatives. 31 The push toward automation raises concerns given the poor track record of automated systems trying to make complicated judgments and the ambiguity of many social media posts, as amply demonstrated by the USCIS pilot programs. 32

6. Social media information collected for one purpose is used by DHS in a range of other contexts, increasing the likelihood of misinterpretation.

The difficulty of interpreting Facebook posts and offhand tweets likely only worsens as they are captured in numerous databases and systems and used for a range of analyses. Empirical research shows that as data becomes further and further removed from the context and aim of its original collection, it is less likely to be useful for secondary analysis. 33

The DHS data architecture is a vast, ambiguous, and highly interconnected system in which social media is available for several types of secondary analyses. For example, when someone applies for a visa waiver to visit the United States, that person is asked to provide his or her social media identifiers, such as a Twitter handle. 34 The CBP officer who evaluates the applicant’s tweets conducts an individualized assessment and has available a range of biographical information that provides context for what the applicant has said on social media. We know from DHS’s own pilot programs discussed above that this type of analysis is mostly unproductive. These problems are exacerbated when the information is used for secondary analyses. The social media identifiers as well as information obtained from CBP border searches also make their way into the Automated Targeting System, where the information can be used to generate “risk assessments” for other individuals altogether. 35 The information in ATS also feeds into numerous other data systems and is used, for example, by TSA to prescreen travelers and to vet visa applicants and people applying for immigration benefits. 36 When already difficult-to-interpret information is taken out of context, the risks of misunderstandings only increase.

7. Social media information collected by DHS is shared with other law enforcement and security agencies under broad standards.

The past two decades have seen a proliferation in information-sharing arrangements among various government agencies and even with foreign governments. With stringent standards and controls, such arrangements can serve valid purposes. But sharing information about people’s political and religious views, especially when gleaned from the ambiguous realms of social media, only expands the possibility of abuse and inappropriate targeting. For example, the CBP program to track and interrogate journalists and activists at the southern border, discussed earlier, was apparently carried out in cooperation with Mexican authorities. 37 The target list showed that Mexican authorities had deported seven people and arrested three others, including nationals of the United States, Honduras, and Spain. 38 These actions by Mexico, which raise serious concerns about the targeting of political speech and organizing, could well have been the result of the sharing of information by CBP. Reporting also indicates that agents from the San Diego office of the Federal Bureau of Investigation (FBI) were involved in the operation, raising concerns about whether CBP intended that the FBI target the Americans on the list for surveillance. 39

Unfortunately, DHS programs generally have low standards for sharing highly personal information, such as that found on social media, and the standards do not differentiate between Americans’ information and that of people from other countries. This information can easily be shared with entities ranging from the Department of State, the FBI, and congressional offices to foreign governments and Interpol. For example, data obtained from CBP searches of travelers’ electronic devices at the border, which can include the full contents of these devices, can be shared with federal, state, tribal, local, or foreign governmental agencies or multilateral government organizations when CBP believes the information could assist enforcement of civil or criminal laws. 40 ICE, too, can disseminate any device information “relating to national security” to law enforcement and intelligence agencies. 41 Information from ICE’s LeadTrac system, which is used to vet and manage leads of suspected overstayers and status violators and includes social media information, can be shared with any law enforcement authorities engaged in collecting law enforcement intelligence “whether civil or criminal.” 42

Information shared with agencies can proliferate even further because DHS frequently does not place limits on re-dissemination. USCIS’s Alien Files system, for example, stores social media information on people applying for immigration benefits (such as a change of status from one type of visa to another, or naturalization) but does not seem to limit re-dissemination. 43 In addition, sometimes databases ingested by DHS do not adequately reflect the dissemination restrictions of the original system. For example, the Department of State databases for visa applications include some modest sharing restrictions, but it does not appear that these restrictions are honored when the State Department information is ingested into CBP’s systems. 44

8. DHS systems retain information for long periods, sometimes in violation of the department’s own rules.

While databases are part of how DHS carries out its functions, its extended retention of large pools of personal information untethered to any suspicion of criminal activity raises serious concerns about privacy risks and misuse of data. 45 In 1976, the Church Committee’s report on surveillance abuses by U.S. intelligence agencies warned: “The massive centralization of . . . information creates a temptation to use it for improper purposes, threatens to ‘chill’ the exercise of First Amendment rights, and is inimical to the privacy of citizens.” 46 The accumulation of data intrudes on people’s privacy by allowing government authorities to know the details of their personal lives. These risks have only become greater because data, including what we say on social media, can be so readily combined and searched.

To manage these types of risks, as well as to ensure that inaccurate and out-of-date information is weeded out, DHS’s data systems incorporate rules that limit the retention of information beyond a specified time frame. Unfortunately, these retention limits are often not carried over from one DHS database to another, so that once social media information is shared (often automatically), it is kept in the receiving database for longer than intended.

For example, CBP’s ATS stores a range of data from various sources, and as DHS admits, it fails to “consistently follow source system retention periods,” instead retaining most information for 15 years by default. 47 This means that information stored in ATS, such as Visa Waiver Program applications that include applicants’ social media identifiers and are supposed to be kept for no more than three years, may be retained for five times that long. 48
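One way to picture the gap is as a rule that a receiving system could apply but that, by DHS’s own account, ATS does not consistently apply: keep a record no longer than the shorter of the source system’s limit and the receiving system’s default. The sketch below is purely illustrative. The function names and the rule itself are assumptions, not a description of how ATS or any other DHS system is built; only the 3-year and 15-year figures come from the text.

```python
from typing import Optional


def effective_retention_years(source_limit: Optional[int], receiving_default: int) -> int:
    """Retention that honors the source system's limit when one exists."""
    if source_limit is None:
        return receiving_default
    return min(source_limit, receiving_default)


def default_only_retention_years(source_limit: Optional[int], receiving_default: int) -> int:
    """Retention when the receiving system ignores the source limit."""
    return receiving_default


SOURCE_LIMIT = 3        # years; Visa Waiver Program application limit cited in the text
RECEIVING_DEFAULT = 15  # years; ATS default cited in the text

honored = effective_retention_years(SOURCE_LIMIT, RECEIVING_DEFAULT)
ignored = default_only_retention_years(SOURCE_LIMIT, RECEIVING_DEFAULT)
print(f"Source limit honored: {honored} years")
print(f"Source limit ignored: {ignored} years ({ignored // honored} times as long)")
```

Under the first rule a three-year record stays a three-year record wherever it is copied; under the second it inherits the receiving system’s 15-year default, which is the outcome described above.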

The lack of respect for retention rules is not limited to particular programs either. Since 2015, data from ATS has been copied in bulk to go into the consolidated DHS-wide Data Framework, a new system that is expected to play an enormous role in the agency’s operations. 49 Because ATS does not abide by source restrictions, the Data Framework likely does not comply with such restrictions either, and will instead rely on data that may be outdated, incorrect, or already deleted from the source system. 50

Moreover, some systems have long retention periods by design. USCIS’s Alien Files — which contain the official record of an individual’s visa and immigration history and may include social media information — are stored for 100 years after the individual’s date of birth. 51 Long retention periods for social media information further exacerbate the risk of misinterpretation. A social media post from 2007 may take on a whole new meaning by 2022, and even more so decades later.

The appendix at the end of this report contains further details on the retention of information in DHS systems.


As the findings above show, DHS incorporates social media into almost all aspects of its immigration operations, from visa vetting to searches of travelers to identifying targets for deportation. Hundreds of thousands or even millions of people, including Americans, are caught up in this net. While some of what the department is doing may well be justified, the scope of its monitoring activities is hidden behind jargon-filled notices and only rarely evaluated. Policymakers and the public need to know the when, why, what, and how behind DHS social media monitoring so that they can make informed judgments about the risk, efficacy, and impact of these initiatives.

Customs and Border Protection

Customs and Border Protection (CBP) is the arm of DHS charged primarily with securing the nation’s borders. CBP uses social media information as part of its review of applications to enter the United States. Social media information is also part of CBP’s preflight risk assessments and watch list screening and is used to develop broader intelligence analysis products. 1 CBP’s reliance on social media to perform these critically important functions is misplaced. DHS’s own pilot programs show that social media information is rarely a reliable basis for making judgments. And the vague standards used to assess social media invite discrimination against certain individuals, such as Muslim travelers and those involved in protest and activism. Unreliable social media information is easily shared within and beyond DHS, exposing personal information to a range of actors and increasing the risk that the data will be used out of context.

1. Visa Vetting

A. Visa Waivers (ESTA Program)

DHS, in consultation with the State Department, administers the Visa Waiver Program, through which citizens of 38 mainly Western European countries can travel to the United States for business or tourism without obtaining a visa. 2 In fiscal year 2017, more than 23 million travelers came to the United States through the program. 3 Travelers from these countries who wish to obtain a visa waiver must complete a mandatory online application on the Electronic System for Travel Authorization (ESTA). 4 The information provided through ESTA is vetted against security and law enforcement databases to determine whether applicants are eligible to travel under the program and to ensure they do not pose a law enforcement or security risk. 5 These travelers are also continually screened in real time. 6

Social media information is increasingly being used in this process to vet for national security concerns, although only one American was killed in a terrorist attack by a traveler on the Visa Waiver Program between 1975 and 2017, according to a study by the Cato Institute. 7 While social media checks were previously used by CBP, the agency added a new question to the forms in December 2016, asking all applicants to voluntarily provide their social media identifiers, such as any usernames and platforms used. 8 If applicants choose to provide identifying information, officers may use it to locate their profiles and accounts when the initial screening indicates “possible information of concern” or “a need to further validate information.” 9

Regardless of whether ESTA applicants have chosen to provide their social media identifiers, CBP officers may still choose to manually check their accounts; it does not appear that the officer must first make a finding of “possible information of concern” or “a need to further validate information” in order to do so. 10 In such instances, in addition to the interpretive issues identified above, it is unclear how CBP officials confirm that they have correctly connected the applicant to the right social media accounts. This was a recurring problem in the pilot programs discussed previously. 11

Publicly available documents do not indicate what types of postings on social media would be considered by CBP to be indicative of a national security threat. 12 But the vagueness of the standards creates the risk that innocuous social media activity will be used as a means of excluding people of certain political or religious beliefs. In a nod to this risk, CBP documents state that information from social media “will not” be the sole basis upon which CBP denies someone entry to the United States. 13 But this restriction may not be particularly effective because CBP could combine one questionable or weak social media “find” with virtually any other information to deny a visa waiver. For example, CBP and other arms of DHS are not permitted to use ethnicity as the sole basis for suspecting an individual is undocumented, but ethnicity combined with other factors — such as appearing nervous — has been used to stop people on suspicion of undocumented status. 14

The social media check can also extend to associates who posted on or interacted with an applicant on their social media profile, which could include Americans and other contacts living in the United States if “relevant to making an ESTA determination.” 15 In addition, CBP uses “link analysis” to proactively identify contacts of applicants (e.g., friends, followers, or “likes”), as well as the applicant’s secondary and tertiary contacts who might “pose a potential risk to the homeland” or “demonstrate a nefarious affiliation on the part of the applicant.” 16 CBP has no qualms about drawing adverse conclusions from things that third parties have posted — rather, it “presumes” that at least some of the information posted on the applicant’s site, including from third parties, is accurate because “individuals generally have some degree of control over what is posted on their sites.” 17

Thus, even if nothing posted by the applicant suggests he or she poses a risk, CBP could still potentially deny a visa waiver based in part on concerns related to a tweet posted by a “friend” or follower, who could easily be someone the applicant does not even know. Unfortunately, unlike some other DHS programs, there is no opportunity for the applicant to address or explain the inferences that CBP draws from their social media.

DHS rules require officers to collect only the minimum personally identifiable information “necessary for the proper performance of their authorized duties.” 18 But according to the 2017 privacy audit of ESTA, DHS’s Privacy Office could not verify whether CBP was adhering to this requirement. 19 Other significant controls — that DHS officers are limited to reviewing publicly available information and must use official DHS accounts to conduct such checks — can be circumvented using a technique called “masked monitoring.” 20 But the circumstances triggering such monitoring and the applicable rules are not publicly available. 21

All social media information about those applying for visa waivers (and potentially about their friends and contacts), as well as other data from ESTA applications and related paperwork, is stored in CBP’s Automated Targeting System (ATS). 22 CBP agents use the information in ATS to assign risk assessments to travelers, which can impact their vetting and questioning at the border. ATS risk assessments and other analyses also feed into a number of watch lists, such as the FBI’s Terrorist Screening Database and TSA Watch Lists, as well as analytical products on trends and threats. 23 In other words, what a person says on social media, which is often context-specific and ambiguous to outsiders, feeds into every aspect of CBP’s work and that of DHS more broadly.

ESTA information, about both applicants and their friends and families, is also disseminated widely to a broad range of entities, including the Departments of Justice and State. 24 As of December 2018, the National Vetting Center (NVC), a presidentially created clearinghouse and coordination center for vetting information, has been involved in ESTA’s work. 25 CBP is required to regularly share ESTA application data with a number of agencies involved in the NVC, including the CIA and the Department of Defense, to be compared against the holdings of those agencies. 26 Beyond the bulk sharing with the NVC, ESTA information sharing with other agencies is not confined to situations in which there is an indication that the traveler has violated the law. Rather, it can take place simply when DHS determines that the information “would assist in the enforcement of civil or criminal matters.” 27 In addition, DHS and the National Counterterrorism Center (NCTC), which is charged with collecting counterterrorism intelligence, have entered into a memorandum of understanding allowing DHS to disclose the entire ESTA data set to the NCTC. 28 This data set would go far beyond information about individuals suspected of any connection to terrorism and would include information gathered during routine interactions with the public (e.g., screening travelers, reviewing immigration benefit applications, issuing immigration benefits). 29

In sum, the ESTA program demonstrates that CBP collects highly personal information available on social media about those applying for visa waivers and the people in their networks. CBP uses this information, which is highly contextual and subject to interpretation, to decide whether an individual poses an undefined “security risk.” All of this information is stored in DHS databases for years and potentially used for a range of purposes, often far removed from the purpose of the initial collection. 30 The information is shared in bulk with the NCTC, and with other law enforcement agencies as long as it could be of “assistance” to them, creating risks to privacy and freedom of speech and association.

B. Visa Applications

The State Department has ramped up its collection of social media information from people applying for visas, which it shares with DHS to be vetted using ATS. 31 In May 2017, the State Department began requiring some categories of visa applicants — estimated at 65,000 per year — to provide the identifiers they used on all social media platforms within the previous five years. 32 It seems likely that this move was aimed primarily at travelers from the Muslim ban countries; the Federal Register notice announcing the rule change indicated that it was being implemented as part of the Muslim ban, and the notice’s estimate of the number of travelers who would be affected by the change approximately matched those affected by the overall ban. 33

In March 2018, the State Department sought to vastly expand the collection of social media identifiers to the approximately 15 million people who apply for visas each year. 34 The Office of Management and Budget (OMB) approved the proposal in April 2019, which means the State Department will begin collecting from nearly all visa applicants their social media identifiers associated with any of 20 listed social media platforms, more than half of which are based in the United States (Facebook, Flickr, Google+, Instagram, LinkedIn, Myspace, Pinterest, Reddit, Tumblr, Twitter, Vine, and YouTube). 35 The other platforms are based in China (Douban, QQ, Sina Weibo, Tencent Weibo, and Youku), Russia (Vkontakte), Belgium (Twoo), and Latvia (Ask.fm). 36 Applicants will also have the option of providing identifiers for platforms not included on the list. 37

As with the DHS social media collection programs described throughout this paper, there is limited information on what the State Department’s review of applicants’ social media activity will entail. We only know that it is meant to enable consular officers to confirm applicants’ identity and adjudicate their eligibility for a visa under the Immigration and Nationality Act. 38 While the notice does state that “the collection of social media platforms and identifiers will not be used to deny visas based on applicants’ race, religion, ethnicity, national origin, political views, gender, or sexual orientation,” this restriction is easily circumvented: a social media post revealing an applicant’s religious or political affiliation may not alone justify denial, but other information in his or her file could easily be used as a pretext, particularly given the broad discretion exercised by consular officials. 39 According to the statement supporting the notice, consular officers will also be directed not to request passwords, violate the applicant’s privacy settings or the platforms’ terms of service, or engage with the applicant on social media; to comply with State Department guidance limiting the use of social media; and to avoid collecting third-party information. 40

The State Department’s expected trove of information will likely be used for a variety of purposes beyond visa vetting. Social media identifiers collected by the State Department will be stored in the Consolidated Consular Database, which is ingested into ATS and becomes available to DHS personnel. 41 Further, that information will be used in coordination with other department officials and partner U.S. government agencies. 42 Indeed, numerous other agencies have access to the visa records system in which applicants’ social media information will be stored, and — along with foreign governments — can obtain information from the system. 43

In sum, the State Department’s collection of social media information, which already includes 65,000 visa applicants (likely those targeted by Trump’s Muslim ban), is on track to begin creating a registry that will include 15 million people after the first year alone. Not only will this information be used by the State Department in undefined ways to make visa determinations, but it will be yet another source of personal information that is funneled into DHS’s many interconnected and far-reaching systems. 44

2. Warrantless Border Searches

CBP conducts warrantless searches at the border of a wide variety of electronic devices, such as phones, laptops, computers, and tablets; many of these searches are likely to result in the collection of social media information. According to CBP, these searches are meant to help uncover evidence concerning terrorism and other national security matters, criminal activity like child pornography and smuggling, and information about financial and commercial crimes. 45 However, CBP documents also describe these searches as “integral” to determining an individual’s “intentions upon entry” and to providing other information regarding admissibility. 46

While some of these searches are conducted manually, CBP also has technical tools for extracting information from these devices, potentially including information stored remotely. 47 It has purchased powerful handheld Universal Forensic Extraction Devices (UFEDs), developed by the Israeli company Cellebrite, which can be plugged into phones and laptops to extract in a matter of seconds the entirety of a device’s memory, including all data from social media applications both on the device and from cloud-based accounts like Facebook, Gmail, iCloud, and WhatsApp. 48

Searches by CBP of travelers’ electronic devices at ports of entry have increased dramatically over the past several years. In fiscal year 2015, 8,503 people had their devices searched. 49 By fiscal year 2017, the number had reached 30,200, more than three and a half times the 2015 figure. 50 According to CBP, these searches do not require a warrant, due to “a reduced expectation of privacy associated with international travel.” 51 Both American and foreign travelers are subjected to these warrantless searches. 52 In 2017, 10 U.S. citizens and one green card holder filed suit challenging warrantless searches of electronic devices at the border. 53 The complaint highlights the intrusiveness of these searches, both for the person being searched and for the traveler’s family, friends, and acquaintances, given the many contact lists, email messages, texts, social media postings, and voicemails that cellphones and laptops often contain. In November 2019, the U.S. District Court in Massachusetts ruled in the case that CBP’s and ICE’s suspicionless searches of electronic devices at ports of entry violate the Fourth Amendment and that these searches require reasonable suspicion that devices contain contraband. 54

Under a January 2018 directive, CBP is permitted to conduct two types of searches: “basic” and “advanced,” both of which would allow collection of information from social media. 55 The 2018 directive changed CBP’s previous, more permissive rule, likely as a partial and belated response to a 2013 federal court decision, United States v. Cotterman. In that case, a federal court of appeals held that the fact that a device was seized at a border did “not justify unfettered crime-fighting searches or an unregulated assault on citizens’ private information,” and required that officers have reasonable suspicion of criminal activity to conduct forensic searches of electronic devices. 56 CBP is permitted to refer travelers to ICE at any stage of the inspection process, at which point ICE’s rules would apply; while ICE also issued a 2018 policy barring the use of advanced searches without reasonable suspicion, it is not yet known how personnel are being directed to implement this policy, meaning that ICE’s searches may in practice be more permissive than CBP’s. 57

Under CBP’s new rules, a basic search permits an agent to view information that “would ordinarily be visible by scrolling through the phone manually.” 58 No suspicion of criminal wrongdoing or national security risk is required for basic searches. For either type of search, agents are prohibited from “intentionally” accessing data that is “solely stored remotely”; only information that is “resident on the device and accessible through the device’s operating system or through other software, tools, or applications” may be viewed. 59 CBP officers are supposed to disable network connectivity or request that the traveler do so (e.g., by switching to airplane mode) prior to the search; they are also supposed to conduct the search in the presence of the traveler in most circumstances, though the individual will not always observe the actual search. 60

Despite these new guidelines, CBP agents will probably still be able to access social media information during a search. If a traveler has social media data downloaded onto his or her device or cached in some way, it is likely accessible even if connectivity is turned off. 61 For example, if a traveler was scrolling through a Twitter or Facebook feed prior to being selected for a search, any loaded data, such as his or her newsfeed, would be accessible on the user’s phone or laptop.

The officer may also request that the traveler provide any passcodes needed to access the contents of a device. 62 Although a traveler can refuse to provide a code, CBP may then keep the device in order to try to access its contents by other means. 63 U.S. citizens must be admitted to the country even if they do not provide passcodes, though their phones may still be held for five days or longer. 64 Noncitizens, however, including visa holders and tourists from visa waiver countries, may be denied entry entirely. 65

An advanced search occurs when an officer connects an electronic device to external equipment, via a wired or wireless connection, to review, copy, or analyze its contents. 66 Advanced searches are highly intrusive, and the tools that CBP has purchased allow it to capture all files and information on the device, including password-protected or encrypted data. 67

Officers are authorized to perform advanced searches if there is reasonable suspicion that one of the laws enforced or administered by CBP has been violated or if there is a “national security concern.” 68 In creating an exception for “national security concerns,” DHS policy departs from the Cotterman decision, which required reasonable suspicion for all forensic searches. While DHS does not define what constitutes a national security concern, national security is an expansive term that could easily swallow up the requirement of suspicion for these highly intrusive searches. The examples listed in the 2018 privacy impact assessment suggest that national security searches will be based on watch lists. However, this category includes not just lists kept by the government — primarily the FBI and DHS — but other lists as well, such as unspecified “government-vetted” watch lists and a “national security-related lookout in combination with other articulable factors as appropriate.” 69 And, of course, these examples are not exhaustive, leaving open the possibility that agents will use the cover of national security to undertake forensic searches even when there is no relevant watch list.

Following both basic and advanced searches, the officer enters notes about the interaction, including “a record of any electronic devices searched,” into TECS, CBP’s primary law enforcement system. 70 This typically includes device details, the type of search performed (basic or advanced), and the “officer’s remarks of the inspection.” 71 CBP may detain a device, or copies of the information it contains, for up to five days, although it can keep a device longer when there are unspecified “extenuating circumstances.” 72 If there is no probable cause to seize and retain a device or the information it contains, the device must be returned to the traveler and any copies destroyed. 73 However, CBP may retain without probable cause any information “relating to immigration, customs, and other enforcement matters,” which seems to allow it to essentially circumvent the probable cause requirement. 74 For instance, information that could be considered useful for determining whether an individual may be permitted to travel to the United States could be stored in the individual’s Alien File for 100 years after their date of birth. 75

Any information that is copied directly from an electronic device during an advanced search (presumably based on probable cause) is stored in ATS, which allows agents to further analyze information collected by comparing it against various pools of data and applying ATS’s analytic and machine learning tools to recognize trends and patterns. 76 CBP may disclose information from electronic device searches to other agencies, both within and outside DHS, if it is evidence of violation of a law or rule that those agencies are charged with enforcing. 77

Notably, a December 2018 DHS inspector general report concluded that CBP had not been following its own standard operating procedures prior to the implementation of the new rules. 78 The report, which was based on a review of CBP’s electronic device searches at ports of entry from April 2016 to July 2017, found that officers frequently did not document searches properly, that they consistently failed to disable network connection prior to search (specifically for cell phones), and that the systems used and data collected during searches were in many cases not adequately managed and secured. 79 For instance, officers often failed to delete travelers’ information stored on the thumb drives used to transfer data to ATS during advanced searches. 80 The report also found that CBP had no performance measures in place to assess the effectiveness of its forensic searches of electronic devices. 81

The 2018 directive instructed CBP to develop and periodically administer an auditing mechanism to ensure that border searches of electronic devices were complying with its requirements. 82 However, the agency has published neither the requirements nor the results of the audits. In February 2019, the Electronic Privacy Information Center (EPIC) sued for the release of this information. 83

Even if the rules are operating as intended, they may also be applied discriminatorily. For instance, Muslim travelers have long been singled out for additional scrutiny because of their faith, 84 which President Trump and his administration have repeatedly and inaccurately connected with “terrorism.” 85 Just months after the new policy was issued, the Council on American-Islamic Relations (CAIR) sued CBP on behalf of a Muslim American woman whose iPhone was seized and its contents imaged when she came home from Zurich. 86 She was also questioned about her travel history and whether she had ever been a refugee. 87 The lawsuit asked CBP to explain what suspicion warranted the forensic search and demanded deletion of the information seized. 88 The government quickly settled, agreeing to delete the data it had seized. 89

In sum, CBP is increasingly deploying its claimed warrantless border search authority to search the electronic devices of both visitors and American travelers. Basic searches conducted without any suspicion of wrongdoing can result in the scrutiny of travelers’ social media information. Advanced searches will result in the collection of huge amounts of personal information, including from social media, about both the person whose device is being searched and that person’s contacts. CBP has stated that it has this broad authority in order to help uncover information related to terrorism and criminal activity and to determine admissibility. But there is little indication in public documents as to what type of content officers should be looking for, especially in deciding whether a traveler can enter the country, allowing for unfocused fishing expeditions. Nor are these searches subject to even minimal safeguards, such as an instruction to avoid making decisions based solely on social media or a prohibition on profiling. And the search is just the start. CBP is permitted to retain information relating to immigration, customs, or other enforcement matters it finds useful, including a copy of the contents of phones and laptops; as discussed further below, the agency may also further analyze the information using unknown tools and algorithms. 90

3. Searches Pursuant to Warrant, Consent, or Abandonment

CBP also collects information from electronic devices in three other situations:

  • When it has a warrant authorized by a judge or magistrate based on probable cause; 91
  • When an officer finds an abandoned device that he or she suspects “might be associated with a criminal act” or was found in “unusual circumstances” (such as between points of entry in the “border zone,” 92 the area within 100 miles of any U.S. boundary in which Border Patrol claims authority to conduct immigration checks 93 ); and
  • When the owner has consented. 94

According to CBP, once the information is determined to be “accurate and reliable,” it is used to support the agency’s border enforcement operations and criminal investigations. 95 DHS materials note that such information is “typically” used only to corroborate evidence already in the agency’s possession. 96

Agents are explicitly allowed to collect information stored in the cloud when spelled out in a warrant or when the owner consents, but it is not clear whether cloud data can be accessed from abandoned devices. 97 A CBP officer or agent can submit devices found in one of the aforementioned scenarios for digital forensic analysis, which is usually undertaken by a team of agents at the intelligence unit for the relevant Border Patrol sector. 98

If the CBP agent determines after conducting one of these examinations that an electronic device holds information that is “relevant” to the agency’s law enforcement authorities, the agent may load all information into a standalone information technology system for analysis. 99 This is the rare database that “may not be connected to a CBP or DHS network.” 100 The tools built into these standalone systems allow CBP to perform various analyses on the collected information. 101 One system, ADACS4, is used to analyze data from electronic devices in order to discover “connections, patterns, and trends” relating to “terrorism” and the smuggling of people and drugs, as well as other activities that threaten border security. 102

CBP retains information associated with arrests, detentions, and removals, including data obtained from electronic devices, for up to 75 years. Even information that does not lead to the arrest, detention, or removal of an individual, and that may be completely irrelevant to DHS’s duties, may be stored for 20 years “after the matter is closed.” 103

The information collected by CBP from electronic devices is frequently disseminated within DHS and to other federal agencies or state and local law enforcement agencies with a need to know, and less frequently to foreign law enforcement partners. 104 In addition to sharing with agencies investigating or prosecuting a violation of law, CBP may also share information for unspecified counterterrorism and intelligence reasons. 105

The CBP search authorities detailed above allow the collection of social media information. While the warrant and consent authorities seem reasonably cabined, the authority to search abandoned devices is quite expansive, especially if it is read to apply to all devices found within 100 miles of U.S. land or coastal borders, where two-thirds of Americans live. 106 It is not clear why the information from these categories of devices is held in a separate database, unconnected to other DHS systems. As with other collection programs, CBP uses the social media information it collects to conduct trend or pattern analyses and shares it with other agencies, raising concerns about how potential misinterpretations and out-of-context information are deployed. 107

4. Analytical Tools and Databases

After CBP personnel collect social media information (including from ESTA and visa applications, from electronic devices searched under their claimed border search authority, and from numerous other sources 108), the data is provided to analysts who conduct one or more of three main types of analyses:

A. Assigning individual risk assessments: comparing an individual’s personally identifiable information against DHS-held sources to assess that person’s level of risk, such as whether the individual or the individual’s associates may present a security threat, in order to determine what level of inspection the individual is required to undergo and whether to allow the individual to enter the country;

B. Trend, pattern, and predictive analysis: identifying patterns, anomalies, and subtle relationships in data to guide operational strategy or predict future outcomes; 109 and

C. Link and network analysis: identifying possible associations among data points, people, groups, entities, events, and investigations. 110

These analytical capabilities are interrelated and interdependent and serve as the backbone of CBP intelligence work. Because the ways in which CBP conducts these analyses and draws conclusions from data depend heavily on interactions among the agency’s various data systems, this section will provide an overview of the key systems and their analytical functions. It shows that the social media information in each of these databases is amassed on the basis of overbroad criteria and without accuracy requirements, shared widely with few or no restrictions, analyzed using opaque algorithms and tools, and often retained longer than the approved retention schedules.

A. Assigning Individual Risk Assessments

The primary system CBP uses for combining and analyzing data, including for assigning risk assessments, is the Automated Targeting System (ATS). There is scant publicly available information regarding the foundation, accuracy, or relevance of these risk assessments; nor do we know whether the factors used in assessments are non-discriminatory. 111 We do know, however, that social media is likely a common source in formulating risk assessments. ATS contains copies of numerous databases and data sets that include social media information, such as CBP’s ESTA, the FBI’s Terrorist Screening Database (TSDB), and data from electronic devices collected during CBP border searches. 112 ATS also appears to ingest social media information directly from commercial vendors. 113 CBP agents use secret analytic tools to combine the information gathered from these various sources, including from social media, to assign risk assessments to travelers, including Americans flying domestically. 114 These assessments may get a person placed on a watch list like the TSDB, 115 and determine whether the person gets a boarding pass or if additional screening is necessary. 116

To be clear, the individuals who are subjected to these measures are not necessarily suspected of a crime or a link to criminal activity. 117 Rather, an individual’s risk level is determined by a profile, which can be influenced by social media information contained in ATS or other databases, as well as ad hoc queries of information on the internet, including queries of social media platforms. 118 Notably, DHS exempted ATS from accuracy requirements under the Privacy Act, so the information that goes into one’s risk assessment need not be correct, relevant, or complete. 119

ATS’s individual risk assessment capabilities are also leveraged by ICE in its enforcement activities against people who have overstayed their visas. ATS receives the names of potential overstays from CBP’s arrivals and departures management system, and ATS automatically vets each name against its records to create a prioritized list based on individuals’ “associated risk patterns.” 120 The prioritized list is then sent to ICE’s lead management system, LeadTrac (discussed further in the ICE Visa Overstay Enforcement section below). 121

It is not clear what standard is used in determining “risk” in these profiles or how exactly social media information is weighted. But it seems likely that ATS’s data mining toolkit, which includes “social network analysis” capabilities that may rely on social media information, is an important part of formulating risk assessments. 122

Risk assessments and other records in ATS are retained for 15 years, unless the information is “linked to active law enforcement lookout records . . . or other defined sets of circumstances,” in which case the information is retained for “the life of the law enforcement matter.” 123 Notably, the most recent ATS privacy impact assessment admits that the system fails to “consistently follow source system retention periods, but instead relies on the ATS-specific retention period of 15 years,” often retaining data for a period that exceeds the data retention requirements of the system from which it originated (for instance, three years for sources from ESTA). 124 Therefore, ATS passes information to partners long after it has been corrected or deleted from other databases.

ATS information, including personally identifiable information, is disseminated broadly within DHS and to other federal agencies, and many DHS officers have direct access to ATS. 125 It is unclear, however, whether risk assessments and the underlying social media data on which they are based may be disseminated beyond ATS.

B. Trend, Pattern, and Predictive Analysis

Essential to the process of assigning risk assessments are the CBP-formulated “rules,” or “patterns” identified as “requiring additional scrutiny,” that CBP personnel use to vet information in ATS in order to evaluate an individual’s risk level. 126 These patterns are based on trend analyses of suspicious activity and raw intelligence, as well as CBP officer experience and law enforcement cases. 127 In addition to assigning risk assessments, ATS is used as a vetting tool by both USCIS (for refugees and applicants for certain immigration benefits) and the Department of State (for visa applicants) and to analyze device data obtained at the border. 128 For each of these functions, CBP agents use ATS to compare incoming information against ATS holdings and apply ATS’s analytic and machine learning tools to recognize trends and patterns. 129

CBP agents also use ATS for preflight screenings (which will be discussed in more detail in the TSA section) to identify individuals who, though not on any watch list, “exhibit high risk indicators or travel patterns.” 130 ATS’s analytic capabilities likely underpin its determinations of “high risk” patterns.

ATS is also central to a DHS-wide “big data” effort, the DHS Data Framework. Similar to ATS in structure and purpose but wider in scope, the Data Framework is an information technology system with various analytic capabilities, including tools to create maps and time lines and analyze trends and patterns. 131

The Data Framework ingests and analyzes huge amounts of data from across the department and from other agencies. 132 Originally the Data Framework was meant to import data sets directly from dozens of source systems and categorize the data in order to abide by retention limits and access restriction policies and to ensure that only particular data sets are subject to certain analytical processes. 133 However, as of April 2015, data sets started being pulled straight from ATS instead of from the source systems, and the Data Framework stopped tagging and categorizing data before running analytics. 134 DHS said this change was merely an “interim process” of mass data transfer in order to expedite its ability to identify individuals “supporting the terrorist activities” in the Middle East. 135 The interim process was originally established to last for 180 days, with the possibility of extensions in 90-day increments. 136 However, the interim period continued for at least three and a half years (April 2015–October 2018), and it is unclear whether it is still ongoing. 137
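The gap between the authorized schedule and the reported duration can be made concrete with some date arithmetic. The sketch below assumes, purely for illustration, that the interim process began on April 1, 2015, and was still running on October 1, 2018 (the source documents give only months, not days). It counts how many 90-day extensions would be needed to cover that span after the initial 180 days.

```python
from datetime import date
import math

# Assumed endpoints: the report gives only months (April 2015 to October 2018),
# so the first day of each month is used here for illustration.
INTERIM_START = date(2015, 4, 1)
LAST_KNOWN_ACTIVE = date(2018, 10, 1)

INITIAL_PERIOD_DAYS = 180   # original authorization
EXTENSION_DAYS = 90         # length of each extension increment


def extensions_needed(start: date, end: date) -> int:
    """Return the minimum number of 90-day extensions needed to cover start..end."""
    elapsed = (end - start).days
    overrun = max(0, elapsed - INITIAL_PERIOD_DAYS)
    return math.ceil(overrun / EXTENSION_DAYS)


elapsed_days = (LAST_KNOWN_ACTIVE - INTERIM_START).days
print(f"Elapsed: {elapsed_days} days")
print(f"Multiple of original period: {elapsed_days / INITIAL_PERIOD_DAYS:.1f}x")
print(f"Minimum 90-day extensions: {extensions_needed(INTERIM_START, LAST_KNOWN_ACTIVE)}")
```

Under these assumed dates, the "interim" arrangement ran for roughly seven times its original 180-day term and would have required at least 13 consecutive extensions.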

The Data Framework’s interim process and its extraction of data directly from ATS are troubling in part because ATS does not comply with the retention schedules of different source systems but rather tends to rely on its own 15-year retention period. 138 By bypassing source systems and extracting information directly from ATS, the interim process creates a risk that outdated or incorrect information, or information that was deleted from its source system many years earlier, will be input into the Data Framework’s classified repository. Hence, information collected from an individual for one purpose — such as screening for the Visa Waiver Program — not only is retained longer than it should be, but is channeled into larger and larger analytical systems for unknown and unrelated purposes.

According to DHS senior leadership, the Data Framework also incorporates “tone” analysis. 139 Purveyors of tone analysis software have made dubious claims about its ability to predict emotional states and aspects of people’s personality on the basis of social media data. 140 These claims, however, have been thoroughly debunked by empirical studies. 141 The unreliability of such software increases dramatically for non-English content, especially when people use slang or shorthand, which is often the case with social media interactions. 142

The Data Framework and its analytical results are used extensively throughout DHS, including by CBP, DHS’s Office of Intelligence and Analysis, TSA’s Office of Intelligence and Analysis, and the DHS Counterintelligence Mission Center. 143 DHS uses the Data Framework’s classified data repository to disseminate information externally, including “bulk information sharing” with U.S. government partners. 144

C. Link and Network Analysis

A central element of CBP network analysis capabilities is the collection of information on a huge number of individuals in order to draw connections among people, organizations, and data. For this purpose, CBP agents use the CBP Intelligence Records System (CIRS) to gather information about a wide variety of individuals, including many who are not suspected of any criminal activity or seeking any type of immigration benefit, such as people who report suspicious activities; individuals appearing in U.S. visa, border, immigration, and naturalization benefit data who could be associates of people seeking visas or naturalization, including Americans; and individuals identified in public news reports. 145 The system stores a broad range of information, including raw intelligence collected by CBP’s Office of Intelligence, data collected by CBP pursuant to its immigration and customs authorities (e.g., processing foreign nationals and cargo at U.S. ports of entry), commercial data, and information from public sources such as social media, news media outlets, and the internet. 146 Notably, the system is exempt from a number of requirements of the Privacy Act that aim to ensure the accuracy of records. 147 Accordingly, it appears that information in CIRS may be ingested, stored, and shared regardless of whether it is accurate, complete, relevant, or necessary for an investigation. There is no public guidance on quality controls for information eligible for inclusion in CIRS. 148

Huge swaths of data from CIRS, ATS, and other systems, including social media information, are then ingested by another database, the Analytical Framework for Intelligence (AFI). 149 AFI provides a range of analytical tools that allow DHS to conduct network analysis, such as identifying links or “non-obvious relationships” between individuals or entities based on addresses, travel-related information, Social Security numbers, or other information, including social media data. 150
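How AFI actually computes such links is not public. As a rough illustration of the general technique, the sketch below pairs up people who share any attribute value. The records, field names, and the `find_links` helper are all hypothetical and are not drawn from AFI or any DHS system.

```python
from collections import defaultdict
from itertools import combinations

# Entirely fictional records; field names are illustrative assumptions.
RECORDS = [
    {"person": "A", "address": "12 Elm St", "phone": "555-0101", "handle": "@alpha"},
    {"person": "B", "address": "12 Elm St", "phone": "555-0199", "handle": "@bravo"},
    {"person": "C", "address": "9 Oak Ave", "phone": "555-0199", "handle": "@charlie"},
    {"person": "D", "address": "4 Pine Rd", "phone": "555-0142", "handle": "@delta"},
]
LINK_FIELDS = ("address", "phone", "handle")


def find_links(records):
    """Map each pair of people to the list of fields on which they share a value."""
    # Group people by each (field, value) combination they appear with.
    by_value = defaultdict(set)
    for rec in records:
        for field in LINK_FIELDS:
            by_value[(field, rec[field])].add(rec["person"])

    # Any two people in the same group are considered "linked" on that field.
    links = defaultdict(list)
    for (field, _value), people in by_value.items():
        for pair in combinations(sorted(people), 2):
            links[pair].append(field)
    return dict(links)


for pair, fields in sorted(find_links(RECORDS).items()):
    print(pair, "share", fields)
```

In this toy data, A and C have nothing in common, yet both are tied to B, which is the kind of indirect connection a link chart would surface. Nothing in such a link indicates wrongdoing, and a single mistyped address or recycled phone number would produce a spurious one.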

It is possible that ATS risk assessments are among the unspecified data transferred from ATS to AFI. 151 In addition, AFI users may upload internet sources and other public and commercial data, such as social media, on an ad hoc basis. 152 The data need only be relevant, a fairly low standard, and the rules allow data of “unclear” accuracy to be uploaded. 153 CBP agents use AFI to search and analyze databases from various sources, including Department of State and FBI databases and commercial data aggregators. 154 Social media information in AFI can be used in ongoing projects and finished intelligence products, which can be disseminated broadly within DHS and to external partners. 155

The data mining firm Palantir — a longtime government contracting partner that helped facilitate one of the National Security Agency’s most sweeping surveillance programs 156 — is intimately involved in AFI’s operation. 157 Documents obtained by the Electronic Privacy Information Center (EPIC) through a Freedom of Information Act (FOIA) request refer to joint “AFI and Palantir data” and state that “data from AFI and Palantir can be shared with other stakeholder[s] and agencies” in compliance with AFI rules. 158 “Palantir data” may refer to personal information about people that Palantir ingests from disparate sources — such as airline reservations, cell phone records, financial documents, and social media — and combines into a colorful graphic that purports to show software-generated linkages between crimes and people. 159

According to an investigation by Bloomberg News, law enforcement agencies may use this “digital dragnet” to identify people who are only very tangentially related to criminal activity: “People and objects pop up on the Palantir screen inside boxes connected to other boxes by radiating lines labeled with the relationship: ‘Colleague of,’ ‘Lives with,’ ‘Operator of [cell number],’ ‘Owner of [vehicle],’ ‘Sibling of,’ even ‘Lover of.’” 160 The value of discovering such linkages in investigations, while much hyped, is open to debate. 161 And as the volume of information grows, so does the risk of error. Given that the information in AFI is not required to be accurate, it is likely that the data from Palantir is similarly unverified. 162 Palantir also supplies AFI’s analytical platform and works extensively with ICE, as discussed later. 163

[Figure: Data Transfer From CIRS & ATS to AFI]

Since 2015, CBP has awarded contracts worth about $3.2 million to Babel Street, an open-source and social media intelligence company, for software licenses and maintenance for the CBP unit that manages AFI, the Targeting and Analysis Systems Program Directorate. 164 According to the company’s website, Babel Street technologies provide access to millions of data sources in more than 200 languages; a number of analytic capabilities, including sentiment analysis in 18 languages; and link analysis. 165 Users can also export data to integrate with Palantir analytic software. 166 CBP likely uses Babel Street’s web-based application, Babel X, which is a multilingual text-analytics platform that has access to more than 25 social media sites, including Facebook, Instagram, and Twitter. 167 There are few details about how Babel Street software is used by CBP and what sorts of social media data it may provide for AFI.

Additionally, ATS and the DHS Data Framework both have their own link and “social network” analysis capabilities, though little is known about how those capabilities function. 168

In sum, while we know that CBP undertakes extensive analyses of social media information, from assessing risk level to predictive and trend analysis to “social network analysis,” we know almost nothing about the validity of these techniques or whether they are using discriminatory proxies. Partnerships with data mining companies such as Palantir raise additional concerns about the incorporation of large pools of unverified data into DHS systems, as well as privacy concerns about allowing a private company access to sensitive personal data. 169 The increasing consolidation of data into CBP’s expansive intelligence-gathering databases, as well as into the DHS Data Framework, further compounds the issues created by DHS’s vague, overbroad, and opaque standards for collection of social media data and its tendency to recycle that data for unknown and potentially discriminatory ends.

Transportation Security Administration

The Transportation Security Administration (TSA) is in charge of security for all modes of transportation — aviation, maritime, mass transit, highway and motor carrier, freight rail, and pipeline — into, out of, and within the United States. 1 Although most visible at airports, TSA also works behind the scenes via its Secure Flight program, which runs passenger records against a variety of watch lists and information held in CBP’s Automated Targeting System (ATS). 2 As with ATS’s risk assessments, very little is publicly known about the scientific foundation and validity of TSA’s security determinations. We do know that many of the lists that TSA uses to vet passengers rely on social media information, with the attendant risks of misinterpretation, and have been widely criticized for being inaccurate. 3

Concerns about TSA’s use of social media information are compounded by the lack of transparency surrounding how individuals are designated as security risks.

TSA’s Secure Flight program collects passenger records from airlines and works in conjunction with CBP’s ATS to flag passengers for enhanced screening or denial of boarding. 4 Secure Flight checks roughly two million passenger records daily against a variety of watch lists. 5 Its automated matching system assigns a percentage score to each record, indicating the confidence level of a match between the passenger and a watch list entry. 6 Those whose scores meet the minimum threshold are identified and subjected to enhanced security screening by on-the-ground TSA personnel. 7 Secure Flight also identifies a (potentially overlapping) category of travelers and their companions called “Inhibited Passengers,” which includes individuals who are confirmed or possible matches to watch lists, as well as individuals about whom DHS possesses “certain derogatory holdings that warrant enhanced scrutiny” or who have “a high probability of being denied boarding.” Both of the latter categories remain undefined. 8
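Secure Flight's matching algorithm and threshold are not public. The sketch below only illustrates the general idea of scoring a passenger name against list entries and flagging anything at or above a cutoff. It uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure, and the watch list entries, the 85 percent threshold, and the `match_score` and `screen` helpers are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical watch list entries and threshold; neither reflects real data
# or the actual (non-public) Secure Flight configuration.
WATCH_LIST = ["John Smith", "Maria Garcia"]
MATCH_THRESHOLD = 85.0  # percent


def match_score(passenger_name: str, listed_name: str) -> float:
    """Return a 0-100 similarity score between two names (case-insensitive)."""
    ratio = SequenceMatcher(None, passenger_name.lower(), listed_name.lower()).ratio()
    return round(ratio * 100, 1)


def screen(passenger_name: str) -> list[tuple[str, float]]:
    """Return watch list entries whose score meets or exceeds the threshold."""
    hits = []
    for listed in WATCH_LIST:
        score = match_score(passenger_name, listed)
        if score >= MATCH_THRESHOLD:
            hits.append((listed, score))
    return hits


for name in ["John Smith", "Jon Smith", "Ana Lopez"]:
    print(name, "->", screen(name) or "no match")
```

In this toy setup a one-letter variant ("Jon Smith") still clears the cutoff, which shows how a different traveler with a similar name could be flagged. Lowering the threshold would catch more spelling variants at the cost of more false matches.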

The watch lists used to designate individuals as security risks include the No Fly and Selectee components of the Terrorist Screening Database (TSDB), TSA Watch Lists, and watch lists derived from ATS’s prescreening of international flights. 9 Social media forms part of the basis for placing individuals on these watch lists, which are described below. 10

1. Watch Lists

A. Terrorist Screening Database

Maintained by the FBI’s Terrorist Screening Center and commonly known as the “terrorist watch list,” the TSDB is the database of individuals whom the government categorizes as being “known” or “suspected” of having ties to terrorism. 11 DHS receives information from the TSDB through the DHS Watchlist Service, which maintains a synchronized copy of the database and disseminates records from it to parts of DHS. 12 The FBI and other federal agencies submitting nominations for the TSDB are encouraged to include social media information as a source for suspicion, even if the information is uncorroborated. 13 The watch list has long been criticized for being bloated and error prone; as of 2016 it included one million names, including those of about 5,000 Americans. 14 The standards for categorizing individuals as “suspected” of ties to terrorism are so broad that even people three degrees removed from a “suspicious” person could be included on the list. 15

[Figure: TSA’s Secure Flight]

The TSDB is the source of the No Fly List and the Selectee List, both of which also rely on broad standards that could allow, for example, the inclusion of individuals who have engaged in civil disobedience. 16 The No Fly List has been the subject of extensive litigation, in which federal courts have criticized the government’s failure to ensure adequate procedures to allow individuals to contest their inclusion on the list. 17

B. TSA Watch Lists

The watch lists created and managed by TSA’s Office of Intelligence and Analysis are also likely to incorporate social media information in at least some cases. 18 These lists are based on information in TSA Intelligence Service Operations Files, which are compiled from TSA security incidents, intelligence provided by other agencies, and broadly from commercial sources and publicly available data; they are used to flag people who are not on another relevant watch list to receive additional scrutiny during travel. 19 One such list, the “95 list,” created in February 2018, includes individuals who make physical contact with a TSA employee or dog, loiter near screening checkpoints, are the subject of a credible threat of violence, or are “publicly notorious.” 20 While some of this information likely comes from agents, it seems that public notoriety and perhaps even the threat of violence are factors that TSA gleans from social media.

C. ATS-Generated Watch Lists

TSA’s Secure Flight also screens passenger records against watch lists derived from ATS’s prescreening of international flights. 21 The ATS prescreening is informed by TSA-crafted rules or “threat-based intelligence scenarios,” which ATS then compares against both passenger records and its plethora of other sources, including social media. ATS identifies individuals for enhanced TSA screening based on “matches” to information found in ATS. 22 Such matches could be based on a profiling rule or based on a passenger’s identifiers, which may include names, phone numbers, or social media handles. 23 ATS compiles its list of matches to share with Secure Flight, including individuals who, though not on any other watch list, “exhibit high risk indicators or travel patterns.” 24 There are no public criteria for what constitutes a high risk indicator or travel pattern that could trigger a flag on ATS.

According to a privacy impact assessment published in April 2019, TSA uses ATS to generate watch lists for TSA’s “Silent Partner” and “Quiet Skies” programs. 25 Little is known about Silent Partner, but according to internal TSA documents, Quiet Skies originally involved undercover federal air marshals shadowing thousands of travelers on flights and through airports, documenting whether travelers use a phone, go to the bathroom, fidget, or have a “cold penetrating stare,” among other behaviors. 26 Following a series of reports by the Boston Globe, TSA announced that it curtailed the Quiet Skies program in December 2018 and will no longer require agents to compile reports on travelers who exhibit routine passenger behaviors. 27 However, TSA now uses ATS, with its numerous social media sources, to create a list of travelers for Quiet Skies. TSA formulates rules for CBP personnel to check against ATS holdings on passengers on outbound international flights and domestic flights subsequent to international flights to create a Quiet Skies List of individuals designated for enhanced screening. 28 A similar Silent Partner List is created for passengers on in-bound international flights. 29 In addition to being designated for enhanced screening, individuals on the Quiet Skies and Silent Partner Lists may be subject to “observation by the TSA Federal Air Marshal Service (FAMS) while the individual is onboard the flight or in the airport.” 30

[Figure: International Flight Data Transfer]

The privacy impact assessment notes that individuals “will remain on the Quiet Skies List for a period of time,” though the period is unspecified. 31 Names of individuals who are flagged in ATS based on matches to TSA’s rules are retained in ATS for seven years, while names of international travelers whose activities do not match the risk patterns are retained for seven days “to conduct additional analysis.” 32 This information can be used for future risk assessments and watch lists. 33

Social media also plays a role in TSA screenings of passengers on domestic flights. 34 For domestic flights, Secure Flight screens airline records using watch lists and unspecified “rules” and then shares the names of watch list matches and other “Inhibited Passengers” with ATS. 35 ATS users then perform comparisons, apply “risk-based rules,” and conduct federated queries to identify pertinent CBP-held information on those travelers, which could include social media information. At the same time, ATS users create a separate list of CBP-identified “Inhibited Passengers” based on analyses of ATS sources, including social media. 36 CBP sends the results of the ATS screening back to Secure Flight, and TSA and CBP personnel compare the Secure Flight and ATS-generated lists of “Inhibited Passengers” via a common dashboard display. 37 TSA agents then make final decisions on enhanced screening and boarding denial, which could be informed by the ATS-held social media records. 38

Additionally, TSA agents use an ATS “decision-support” tool called ATS-Passenger (ATS-P), available on mobile devices through an ATS mobile application, to view information in ATS and create a prioritized list of “potentially high-risk passengers.” 39 According to the most recent privacy impact assessment, TSA personnel can search and filter ATS information by creating “user-defined rules” based on “operational, tactical, intelligence, or local enforcement efforts.” 40 The ability of each user to define his or her own rules — a process about which there is little information publicly available — creates opportunities for discriminatory application. ATS-P also allows users to query other available federal government systems and publicly available information, including social media data. 41 The fact that this system is applied to domestic flights raises the possibility that it could be used to target American travelers on the basis of their political and religious views. 

[Figure: Domestic Flight Data Transfer]

2. TSA PreCheck

TSA also uses Secure Flight to identify low-risk passengers for TSA PreCheck, a fee-based program that allows travelers expedited transit through airports. 42 Secure Flight screens PreCheck applicants against its own information as well as several lists of preapproved low-risk travelers from other agencies and other parts of DHS, including CBP’s Trusted Traveler programs. 43 Since these lists rely on databases that include social media information, it is likely that what people say on social media influences PreCheck designations. 44

Indeed, TSA has sought to highlight social media in its PreCheck screening efforts. In December 2014, the agency announced that it was planning to expand PreCheck by hiring contractors to screen applicants using “risk scoring algorithms using commercial data, including social media and purchase information.” 45 In response to criticism from civil society about the use of social media data and the reliance on private companies to determine security risks, 46 TSA backtracked, issuing a revised proposal that barred bidders from using any available social media for prescreening efforts. 47 In September 2017, TSA awarded an ongoing contract worth more than $22 million to Idemia, a big-data biometrics company, for Universal Enrollment Services, which includes PreCheck enrollment. 48 Idemia captures and submits enrollment data, including biographic, biometric, identity, and citizenship documentation, to the government for vetting and case management purposes. 49 While the contract documents do not indicate that Idemia will use social media information to conduct “security threat assessments” and “identity assurances” for PreCheck, Idemia’s website describes the company’s data mining mission in general as including “geolocations, audit trails and social media conversations.” 50

In sum, TSA’s Secure Flight uses a range of watch lists that rely at least in part on social media information in its preflight screening and decision making, about which very little is known. TSA and CBP also have an extensive information-sharing arrangement in which TSA relies on ATS holdings, which include social media data, to screen “Inhibited Passengers” and to aid in “decision-support” via the ATS mobile application. TSA’s PreCheck also may include the collection and analysis of social media information to designate certain individuals as “low risk.” The use of context-dependent and easily misinterpreted social media in secret analyses raises concerns about the use of discriminatory criteria to target travelers, both domestic and international, as well as the impact on free speech.

U.S. Immigration and Customs Enforcement

Immigration and Customs Enforcement (ICE) investigates cross-border crime and immigration violations. 1 Its activities range from combating child pornography and human trafficking to conducting raids at workplaces and targeting people, including activists, for immigration violations outside courthouses and schools. 2 ICE relies on social media data, which is often unreliable, to support its extremely broad investigative authorities; the agency has also explored expanding its collection of social media data to make dubious and likely discriminatory judgments about whether individuals should be permitted to enter or remain in the country.

ICE has two main branches: Homeland Security Investigations (HSI), which conducts both criminal and civil investigations, and Enforcement and Removal Operations (ERO), which is primarily responsible for detention and deportation. 3 Most of the activities described below are conducted by HSI — the second-largest investigative arm in the federal government 4 — which extracts, consults, and analyzes social media data during its investigations, including vetting and investigating overstay leads and conducting warrantless border searches, as well as in its intelligence-gathering and analysis initiatives. In turn, these investigations inform ERO’s removal operations.

1. Investigations

HSI often relies on social media in conducting an investigation. 5 First, ICE agents may manually collect data from publicly available and commercial sources, including social media, whenever they determine that the information is “relevant for developing a viable case” and “supports the investigative process.” 6 According to privacy impact assessments, such information is meant to be used to verify information that is already in the agency’s possession, such as a target’s current and former places of residence and cohabitants, and to identify other personal property. 7 However, it may also be used “to enhance existing case information” by providing identifying details like date of birth, criminal history, and business registration records. 8

Social media information is also gathered during undercover operations related to criminal investigations, during which agents are permitted to “friend” individuals on social media sites and collect any information they come across as a result. 9 In addition, HSI agents gain access to social media information through other investigatory activities — namely vetting and overstay enforcement initiatives and extractions of data from electronic devices obtained during border searches and investigations — which are discussed in the next sections.

The Investigative Case Management (ICM) system is the primary database that stores information collected by ICE during criminal and civil investigations. 10 ICE agents can use ICM to automatically query a plethora of internal and external systems, as well as to manually search various pools of data and copy and upload the results; the information ICE can query includes results from CBP’s Automated Targeting System (ATS), which contains social media information from a number of sources. 11 ICM data is disseminated within DHS and shared broadly with outside agencies. 12 In addition to wide authority to share information through formal channels with state, local, and federal law enforcement agencies, ICE agents are known to share information informally with individual state or local law enforcement officers. 13 In addition, ICM records that pertain to individuals, or “subject records,” are shared via the Law Enforcement Information Sharing (LEIS) Service, a web-based data exchange platform that allows partner law enforcement agencies to access DHS systems, including but not limited to ICM and TECS, CBP’s primary law enforcement system. 14

ICM was developed by the private data mining company Palantir. 15 According to contract notices, Palantir currently has a contract for work relating to ICM that has so far totaled $51.6 million. 16 Though Palantir’s 2014 proposal for ICM described the system as intended for use by ICE’s investigative branch, HSI, in 2016 DHS disclosed that it is also used by ICE’s deportation branch, ERO, to obtain information “to support its civil immigration enforcement cases.” 17

ICE has also invested in other software systems to enable it to analyze information from social media. For example, in June 2018, it was reported that ICE had signed a $2.4 million contract with Pen-Link, 18 a company offering software to law enforcement that can collect and analyze “massive amounts of social media and internet communications data.” 19 One of the services included in the Pen-Link contract with ICE is Pen-Link X-Net, 20 which collects and analyzes large quantities of internet-based communications data, from an “extensive, ever-growing list of providers,” including social media platforms. 21 Such sweeping collection and analysis is likely to scoop up swaths of irrelevant and unreliable information and risks misinterpreting innocuous connections and patterns as illicit activity.

Finally, West Publishing, a subsidiary of Thomson Reuters, provides HSI with access to the company’s Consolidated Lead Evaluation and Reporting (CLEAR) system, through a 2017 contract worth $20 million. 22 CLEAR combines a wide array of public and proprietary records, including data from social networks and chat rooms, to create “customizable reports, Web Analytics, mapping, and link charts.” 23 According to other contract documents, CLEAR provides essential support to ICE’s ability to investigate criminals and to uphold and enforce customs and immigration law “at and beyond our nation’s borders.” 24 CLEAR also interfaces with information from Palantir as well as with ICE’s main analytical system, FALCON. 25

2. Visa Overstay Enforcement

ICE has identified visa overstays as a serious threat to national security and over the past several years has ramped up its enforcement, tracking travelers who have allegedly remained in the United States beyond the time originally permitted; its efforts have included social media monitoring. 26 While two of the 9/11 hijackers had overstayed their visas, 27 there is little evidence that overstays pose a significant ongoing threat to national security. Research from the Cato Institute shows that the chance of being killed in an attack by a foreign-born terrorist is 1 in 4.1 million for an attacker on a tourist visa and 1 in 73 million for an attacker on a student visa, the two most common overstay categories. 28 Given that the overstay rate in 2017 was 2.06 percent for tourist visas and 4.15 percent for student visas, the chance of being killed by someone overstaying a visa is infinitesimal. 29

At a May 2017 congressional hearing, DHS described the basic process used to vet overstay leads: CBP’s arrivals and departures management system sends potential leads — identified by matching entry and exit records — to ATS, which automatically screens and prioritizes them and sends them over to ICE’s lead management system, LeadTrac. 30 Analysts then vet these leads — against government databases, public indices, and unnamed commercial databases that provide aggregated information from social media and other public sites, as well as through internet searches on social media platforms — to determine whether there is a potential violation that could require a field investigation. 31 According to DHS documents prepared for the incoming administration at the end of 2016, ICE personnel target individuals for overstay enforcement who exhibit “specific risk factors,” which are based in part on “analysis of dynamic social networks.” 32 These analyses of social networks may be informed by the data gathered from social media sites. According to the DHS inspector general, ICE agents do not have policies and guidance on “appropriate system use” of the roughly 17 information technology systems upon which analysts rely for overstay work. 33

In 2014 ICE set up a special unit called the Open-Source Team, which uses a broad range of publicly available information, including social media, to help “locate specific targeted individuals, identify trends and patterns, and identify subtle relationships.” 34 A document obtained by the Brennan Center via FOIA request highlights three Open-Source Team “success stories,” all of which involve individuals from Muslim-majority countries. 35

In August 2016, ICE launched a series of pilot programs that aim to use social media to bolster vetting, lead investigation, and enforcement. 36 One of these programs, the “Domestic Mantis Initiative,” vets leads pulled from the Student and Exchange Visitor Information System (SEVIS) on students who enter the United States planning to study in a “nonsensitive” field and later change to one the State Department categorizes as “sensitive” because of its potential connection to national security–related technology (e.g., nuclear physics, biomedical engineering, and robotics). 37 Using social media and other sources, ICE continuously monitors these students during their time in the United States, although it is not known what would constitute suspicious activity that would cause immigration authorities to take action. 38

Also in August 2016, ICE launched another pilot program, most often referred to as the Overstay Lifecycle program. 39 According to a report by the DHS inspector general, the program screens the social media activity of a category of nonimmigrant visa applicants from certain countries to help uncover “potential derogatory information not found in Government databases”; both the category of applicants and the specific countries involved were redacted from the publicly available report. 40 The report noted that the pilot was to screen social media activity at the time of visa application and to “continue social media monitoring” (during a time frame or process that was redacted from the report, but could extend to the time that subjects were in the United States) using a “web search tool” that analyzes social media data to develop so-called “actionable information.” 41 As with other uses of social media by DHS, it is unclear what types of information would raise flags about visa applicants.

ICE’s Overstay Lifecycle program was designed to supplement PATRIOT, an existing program that screened applicants at 28 visa security posts but did not monitor people who were granted visas and traveled to the United States. 42 The newer program aims to close this gap in enforcement by conducting continuous vetting and monitoring of some visa applicants, from the time they file a visa application to the time they depart from the country or violate their terms of admission, to uncover any “derogatory information.” 43

The visa applicants subject to continuous monitoring would be those who have applied through one of “at least two” specific State Department posts abroad, though the posts are not publicly identified. 44 According to the 2016 DHS report to Congress, these posts would be selected “to complement existing HSI screening efforts and in response to recent global acts of terrorism perpetrated in those countries.” 45 According to the same report, DHS also planned to incorporate social media vetting tools into both PATRIOT and LeadTrac, and modify LeadTrac to ingest information from visa applicants upon entry. 46 It is not clear whether this system change has occurred.

It is clear, however, that ICE has relied heavily on the data mining firm Giant Oak, Inc., to support these programs and will continue to do so in the future. According to publicly available contract notices, in August and September of 2018, both ICE’s Visa Security Program and its Counterterrorism and Criminal Exploitation Unit (CTCEU) contracted with Giant Oak for “open source/social media data analytics.” 47 These contracts are in addition to previous social media data analytics contracts between ICE and Giant Oak. 48 A contract recently obtained by the Brennan Center via FOIA request shows that CTCEU utilizes Giant Oak’s Search Technology tool (GOST) to aid in proactive investigation of national security leads that have incomplete address information or were returned from field investigations unresolved. 49 This tool is used for bulk screening and prioritization of individuals based on “threat level” and continuously monitors and evaluates changes in patterns of behavior over time. According to the CEO of Giant Oak, the tool lets the government know when overall patterns change—for example, when a group of individuals becomes “more . . . prone to violence.” 50

According to the contract, Giant Oak continuously monitors social media and other online sources and returns to CTCEU any information that identifies an individual’s possible location (including location of affiliated organizations), contact information, and employers. 51 Upon “exhaustion” of that so-called tier 1 information, ICE can request a follow-up search for information about the person’s associates (e.g., friends, family members, coworkers) that could help locate an individual. 52 The documents also note that the contract grants a Giant Oak “Social Scientist” access to classified information; he/she “tweaks the algorithms” behind GOST to better serve CTCEU’s needs, and works to further specialize the transliteration and name matching tools for “certain ethnic groups, non-Roman languages and alphabets, or countries of origin.” 53 There is no publicly available information on the scope of ICE’s other contracts with Giant Oak.

3. Extreme Vetting

As detailed below, after sustained opposition from many stakeholders, ICE announced in May 2018 that it had shelved its search for an automated tool for its Extreme Vetting Initiative (now rebranded as the Visa Lifecycle Vetting Initiative). 54 Instead, it has opted to spend $100 million to hire “roughly 180 people to monitor the social media posts of those 10,000 foreign visitors flagged as high-risk, generating new leads as they keep tabs on their social media use.” 55 Monitoring will continue while these individuals are in the United States, although ICE has stated that it would stop if a visitor was granted legal residency. 56 There is no public information on the types of social media posts that ICE considers indicative of risk, but if ICE endeavors to undertake predictive tasks based on the criteria outlined in the first version of this program, there is a high risk that the program can be used in discriminatory ways.

ICE awarded this reimagined, human-centered monitoring contract to SRA International (now CSRA Inc., owned by General Dynamics) in June 2018; several vendors filed challenges, which were ultimately denied by the U.S. Government Accountability Office (GAO). 57 As of May 2019, about $15 million has been awarded to SRA to carry out the contract. 58

As the above discussion makes clear, ICE relies heavily on social media to vet certain categories of individuals at the time of application. It is likely that these are predominantly individuals who are the focus of the Trump administration’s anti-Muslim extreme vetting initiatives. Moreover, the agency is moving toward using social media to monitor and track visa holders and students throughout their stay in the United States, where they would be covered by the First Amendment. It is also evident that the agency intends to rely more and more on software and other automated technologies, which the USCIS pilot programs, discussed earlier, determined were of limited usefulness. 59

Finally, it is worth highlighting that many of the ICE programs described above have been rolled out as pilots. While pilot programs are a useful way to assess new tools, ICE does not seem to systematically measure their effectiveness. It also does not issue privacy impact assessments for most of these activities, which would at least provide a bare minimum of information to illuminate the impacts of ICE programs. Last, public information provided by ICE does not clearly indicate which pilots are still active and how they relate to newer initiatives, leaving the public in the dark about the agency’s activities.


Automated extreme vetting

4. Electronic Device Searches

ICE also collects, extracts, and analyzes information, including social media data, from electronic devices (e.g., cell phones, laptops, tablets, thumb drives) obtained during warrantless border searches and investigations pursuant to search warrant, subpoena, or summons, or provided voluntarily. 60 For the past decade, ICE, like CBP, has invested in Cellebrite Universal Forensic Extraction Devices (UFEDs), hand-held tools that can instantly extract the full contents of any device, including phones, laptops, and hard drives. 61 In recent years ICE has ramped up its purchasing of UFEDs, spending an additional $3.7 million on the tools (which cost between $5,000 and $15,000 each) and licensing since March 2017. 62 Though it is not known precisely in what circumstances and for what purposes ICE personnel use these devices, it is clear that ICE has the capability to easily extract swaths of data, including social media information, from electronic devices. While the searches of devices obtained during investigations are limited by the scope of the relevant search warrant, subpoena, or summons, ICE claims broad authority to search and store data from devices seized at the border, including social media data and other personal information.

A. Warrantless Border Searches

ICE, like CBP, collects information obtained from electronic devices at the border, which it justifies as necessary to supplement its investigations and enforcement of immigration laws, and both agencies draw distinctions between “basic” (manual) and “advanced” (forensic) device searches. 63 Following the May 2018 decision in United States v. Kolsuz, which found that forensic examinations of cell phones at the border require some level of individualized suspicion, ICE issued an internal policy update that month prohibiting the use of advanced searches without reasonable suspicion “in order to limit litigation risk.” 64 Previously, ICE operated under policy guidance issued nearly a decade prior, which allowed agents to “search, detain, seize, retain, and share” electronic devices and any information they contain without individualized suspicion. 65 Formal guidance on the implementation of the May 2018 policy change for advanced searches has yet to be released, and the language of the policy suggests that ICE continues to claim a right of access to information on travelers’ phones and in their social media accounts through basic searches, even where there is no suspicion of wrongdoing. A 2019 federal district court decision — holding that both basic and advanced border searches of electronic devices can reveal a wealth of personal information and therefore require reasonable suspicion that devices contain contraband — may prompt further change, though for now it only governs the activities of CBP and ICE agents in Massachusetts, where the case was filed. 66

ICE claims its authority to search electronic devices at the border derives from statutes passed by the First Congress, such as the Act of August 4, 1790, which grants customs inspection authority over “goods, wares, or merchandise” entering the country. 67 Though the 2009 privacy impact assessment asserts that “travelers’ electronic devices are equally subject to search” as the “merchandise” described in 1790, the amount of sensitive information contained in electronic devices like cell phones is hardly comparable. 68 Indeed, as the Supreme Court noted critically in a recent case, treating cell phones as functionally identical to other physical items of similar size “is like saying a ride on horseback is materially indistinguishable from a flight to the moon.” 69

According to the 2009 ICE directive on border searches of electronic devices, detained devices are typically held for no more than 30 days, unless “circumstances exist that warrant more time.” 70 Copies of the content obtained from devices are stored on either an ICE external hard drive or a computer system, neither of which is connected to a shared or remote network. 71 However, notes from any stage of the search process, typically relating to information that is “relevant” to immigration, customs, or other laws enforced by DHS, 72 can be stored by ICE in “any of their recordkeeping systems,” such as the Intelligence Records System. 73 The standard for relevance is left undefined, leaving ample room to collect a range of innocuous and often personal electronic content.

ICE can disseminate copies of information from an electronic device to federal, state, local, and foreign law enforcement agencies. 74 While ICE must have reasonable suspicion that the information on a device is evidence of a crime in order to share device information with other federal agencies for subject matter assistance, no suspicion is required to ask for technical assistance, which can encompass translation and decryption services. 75 Further, ICE is specifically authorized to disseminate any device information “relating to national security” to law enforcement and intelligence agencies. 76

In short, while ICE’s advanced searches now require reasonable suspicion, ICE can access information stored on devices and from social media with no suspicion of criminal activity by conducting basic searches, which can reveal a wealth of information gleaned from travelers’ social media accounts. It uses this information to support investigations and make admissibility determinations, but also as a broader means of information collection. There are few restrictions on how information obtained from electronic devices is used and disseminated. And the information, including social media identifiers and other personal data, can be stored in any number of ICE’s databases, to which countless people have access, and shared with law enforcement as long as it is considered to “relate” to national security.

B. Extraction and Analysis of Electronic Media

Once ICE has obtained access to electronic devices through a warrantless border search or obtained access to “electronic media” (a slightly broader category that also includes thumb drives, hard drives, other storage devices, etc.) via subpoena 77 or warrant, it can extract and analyze information if the data could be “pertinent” to an investigation or enforcement activity. 78

According to the 2015 Privacy Impact Assessment for the Forensic Analysis of Electronic Media, which encompasses electronic devices obtained during both border searches and investigations, the data extracted and analyzed by ICE could pertain to numerous individuals beyond the person in possession of the device, including witnesses, informants, members of the public, and victims of crimes. 79 Extracted data may also include sensitive personally identifiable information such as medical and financial information, records containing communications such as text messages and emails, and records of internet activity. 80 These records could reveal a host of sensitive data, including medical conditions, political and religious affiliations, and internet browsing preferences.

Information extracted from devices that are obtained during investigations is retained according to a proposed schedule that varies depending on the nature and outcome of the investigation. 81 There is extremely wide authority to disclose information to other agencies — including federal, state, local, and foreign law enforcement counterparts. 82 There also seems to be broad authority for re-dissemination by law enforcement partners. 83

ICE uses a variety of unspecified electronic tools to analyze the media it extracts from devices via its border search authority or obtains during investigations. 84 The 2015 privacy impact assessment lists four types of analyses that agents can conduct using these tools: time frame analyses (to help determine when various activities occurred on a device), data hiding (to find and recover concealed data), application and file analyses (to correlate files to installed applications, examine a drive’s file structure, or review metadata), and ownership and possession reviews (to identify individuals who created, modified, or accessed a given file). 85 The tools also can be used to “highlight anomalies” in the data. 86

Social media information and other data extracted from electronic devices during investigations and border searches are stored in ICE’s Intelligence Records System. 87 That data is then ingested into FALCON-SA, which has a number of analytical capabilities including “social network analysis,” and will be discussed in the Analytical Tools and Databases section below. 88

Thus, based on a low threshold of “pertinence,” ICE uses sophisticated tools to extract social media data from electronic devices that it obtains during border searches and investigations. The extracted data is then subject to a variety of analyses (about which we know little), while notes about the information may be shared widely within and beyond DHS and potentially channeled into other systems for additional analyses. ICE’s extraction of social media data from electronic media is yet another example of how the extensive DHS information-sharing apparatus enables data to be collected for one purpose under a malleable standard and then stored, shared, and reused for secondary purposes.

5. Analytical Tools and Databases

The numerous sources of information gathered by ICE operations and investigations are consolidated into several large databases. The main ICE database for compiling and analyzing social media information is the FALCON Search & Analysis System (FALCON-SA). 89 ICE personnel use FALCON-SA to conduct two kinds of analyses using social media data: trend analysis, or identifying patterns, anomalies, and shifts in data to guide operational strategy or predict future outcomes; 90 and link and network analysis, or identifying connections among individuals, groups, incidents, or activities. 91 This section will describe how FALCON-SA and its source systems enable these processes by gathering and storing information from numerous sources about a wide variety of individuals, disseminating information broadly, and applying unknown analytical tools to draw conclusions that impact ICE operations.

Although FALCON-SA does not itself extract data directly from social media, users can add social media information from other systems to FALCON-SA without restriction, and FALCON-SA automatically ingests data from several other databases that store social media information. 92 For instance, every day, FALCON-SA ingests information from ICE’s Investigative Case Management (ICM) system relating to current or previous law enforcement investigations, as well as ICE and CBP lookout records, 93 which can include records of electronic devices searched at the border, including details gleaned from inspections of social media applications. 94 ICM also transfers to FALCON-SA telecommunications information about subjects of ICE criminal investigations, potential targets, associates of targets, or any individuals or entities who call or receive calls from these individuals. 95 On an ad hoc basis, ATS border crossing data and inbound/outbound shipment records are also uploaded into FALCON-SA. 96 While ICM’s telecommunications data and ATS’s border crossing and shipment data likely do not include social media information, once all these elements are combined with the other sources in FALCON-SA, the aggregation of information may collectively reveal a wealth of details about an individual’s travels, family, religious affiliations, and more. 97

FALCON-SA users are able to combine these various forms of information and apply the system’s built-in trend analysis tools in order to highlight patterns, shifts in criminal tactics, emerging threats, and strategic goals and objectives. 98 These findings are then shared with DHS and ICE leadership, agents, officers, and other employees in the form of law enforcement intelligence products or reports. To produce these reports, FALCON-SA retains “all information that may aid in establishing patterns of unlawful activity,” even if that information may not be strictly relevant or necessary for an investigation, and even if the accuracy of the information is unclear. 99

To support its network analytics functions, FALCON-SA, like CBP’s Analytical Framework for Intelligence, regularly ingests large amounts of information about individuals from another database, the ICE Intelligence Records System (IIRS). 100 IIRS, like the CBP Intelligence Records System, contains information on a wide variety of individuals, including people who are not suspected of any criminal activity or seeking any type of immigration benefit, such as associates of people seeking visas or naturalization, including Americans; people identified in public news reports; and people who have reported suspicious activities or incidents. 101 IIRS also contains electronic data and other information collected during ICE investigations and border searches, likely including social media data extracted from devices. 102 Sources for the system also include records from commercial vendors and publicly available data, such as social media information, which are not required to be relevant or necessary. 103

Information from IIRS, ICM, and ATS is transferred to FALCON-SA, where it informs FALCON-SA’s network analysis, highlighting associations between individuals and data elements. 104 Users can then identify possible connections among existing ICE investigations and create visualizations (e.g., maps, charts, tables) that display connections and relationships among people and enterprises. 105 According to documents obtained by the Electronic Privacy Information Center via FOIA, FALCON has “social network analysis” capabilities that seem to rely on social media data. 106 ICE has not made clear whether ERO agents can directly access FALCON-SA to track down undocumented immigrants, although they can get such information from HSI. 107

Schedule of Data Transfer to FALCON-SA

Notably, the operations of FALCON-SA, which is one of three ICE FALCON modules, 108 are intimately connected with ICE’s contracts with the technology company Palantir. 109 According to the Palantir Licensing Terms and Conditions for FALCON, released in response to a FOIA request, FALCON is based on Palantir’s Gotham platform, a software system unique to ICE that allows the agency to analyze complex data sets containing detailed personal information about individuals. 110 Publicly available contract notices reveal that in November 2018, Palantir began a new one-year, $42.3 million contract with ICE for “FALCON Operations & Maintenance,” which brings the total for such contracts for FALCON to about $94 million. 111 Many aspects of Palantir’s work with ICE — described in further detail in the Investigations section, above — remain undisclosed, such as the privacy protections for personal information, including social media data, that resides in FALCON-SA.

In sum, ICE’s analytical tools aim to fully exploit the broad array of sensitive information, including social media data, collected by ICE agents and other DHS components. FALCON-SA houses social media data, for which there are no accuracy requirements, from numerous sources. This information is subjected to unspecified trend and network analyses, the efficacy of which is not publicly understood. While people seeking immigration benefits bear the brunt of this scrutiny, their American friends, relatives, and business associates are sucked into these repositories of information as well.

U.S. Citizenship and Immigration Services

U.S. Citizenship and Immigration Services (USCIS) processes and adjudicates applications and petitions for a variety of immigration benefits, including adjustment of status (for instance, from a student visa to a green card), naturalization, and asylum and refugee status. 1 USCIS’s Fraud Detection and National Security Directorate (FDNS) performs background checks, processes immigration applications, investigates immigration benefit fraud, and functions as the link between USCIS and law enforcement and intelligence agencies. 2 The ambiguous nature of social media information collected by USCIS raises concerns about how it will be interpreted, especially for Muslims who are the targets of many of these programs. Indeed, while USCIS is expanding these programs, an inspector general report shows that the agency has not evaluated, much less demonstrated, their effectiveness.

1. Vetting

FDNS uses social media in a few contexts relating to its vetting initiatives, primarily to aid in determining an individual’s admissibility or eligibility. 3 In 2014, FDNS started a pilot Social Media Division, which was made permanent in 2016. It was later expanded under an initiative known as FDNS “Enhanced Review.” 4

In 2015 and 2016, USCIS undertook five pilot programs to test the use of social media for screening and vetting. Four programs targeted refugees, and one focused on K-1 (fiancé[e]) visa applicants for adjustment of status. 5 While it is unclear which pilots have continued or been made permanent, public documents show that examining social media has become a key part of vetting refugees and asylum seekers in particular.

A. Vetting for Refugees and Asylum Seekers

According to DHS documents, the Social Media Division of FDNS performs social media vetting on “certain” asylum applications and screens refugee applicant data for “select populations” against publicly available information. 6 In October 2017, the director of USCIS told Congress that the “select populations” included Syrians, and that USCIS was working to refine and expand its use of social media to target additional categories of refugee and asylum applicants. 7 This statement came shortly after the Trump administration announced new “enhanced vetting capabilities” for refugees from 11 countries identified as posing a “higher risk.” 8 The countries were not publicly identified by the administration, but it seems likely that this additional screening is targeted primarily at Muslims: the FDNS “Enhanced Review” was triggered by the Muslim ban executive order. 9 Refugee Council USA, a coalition of organizations focused on refugee protection and resettlement, told CNN that as of January 2018 the list of countries subject to enhanced review included Egypt, Iran, Iraq, Libya, Mali, North Korea, Somalia, South Sudan, Sudan, Syria, and Yemen. 10 Of the earlier social media pilots undertaken by USCIS, at least two focused solely on refugee applicants from Syria, one focused solely on refugee applicants from Syria and Iraq, and at least two used automated tools that were capable only of translating social media posts from Arabic. 11

All refugee applicants, as well as those who gain status through an applicant (e.g., a spouse or child), undergo a variety of checks. 12 “Select applicant populations” are subject to social media checks, during which an FDNS officer looks at social media for information relating to their claim for refugee status or indication of potential fraud, criminal activity, or national security concerns. 13 During such checks, officers initially collect information using a government-affiliated account and username and do not interact with applicants through social media; this process is defined as overt research. When USCIS deems that an application presents a national security or public safety concern and overt research could “compromise the integrity” of an investigation, officers are permitted to use identities that do not identify their DHS or government affiliation in a process known as masked monitoring. 14

A 2015 FDNS memorandum on the use of social media for refugee processing notes that officers will limit collection of information related to First Amendment–protected activities to information that is “reasonably related to adjudicative, investigative, or incident response matters.” 15 The privacy impact assessment for refugee vetting notes that officers may provide the refuge seeker a chance to view and explain a social media posting found during vetting, and that the decision on a refugee’s resettlement or employment eligibility cannot be made solely on the basis of information obtained from social media. 16

As of November 2016, DHS reported that no immigration benefit had been denied “solely or primarily” as a result of information found on social media. 17 In fact, DHS concluded that information found during screening had merely a “limited” impact in “a small number of cases” in which the data was used for developing additional avenues of inquiry, and that social media information had little to no impact in the vast majority of cases. 18 This low “hit” rate raises questions about the value of focusing resources on collecting and analyzing this type of data.

For asylum seekers, DHS officers compare information from social media and other public and commercial sources against the information that applicants provide regarding when they entered the United States, how long they have been in the country, and even when they “encountered harm outside the United States.” 19 Asylum officers are trained to compare public and commercial data with “applicant-reported information”; if they find an inconsistency, they “must confront the applicant with that information” and provide an opportunity to explain it. 20

Although FDNS has tested automated tools to vet the social media of individuals seeking refuge, the extent to which such tools are currently used is not known. In pilot programs related to refugee applications, officers identified serious problems with the tools tested. Some of these were practical problems, such as language limitations (most tools are English-focused) and efforts by social media companies to prevent their platforms from being used as surveillance tools by blocking access to big data feeds. 21 Further, when automated tools were used, officers had to manually review the results just to decipher whether the applicant had been correctly matched to the social media account identified. 22

In reviewing flagged items, FDNS officers are required to check for “national security indicators,” but there seems to be a lack of clarity about what this means. In 2017, two years after the pilot programs were launched, DHS personnel reportedly expressed a need for a definition of what constitutes a “national security indicator in the context of social media.” 23 The DHS inspector general noted a similar problem: his office was unable to evaluate specific policies and procedures for the pilot programs — because none existed. 24 Even more troubling, the inspector general found that DHS had simply failed to measure the effectiveness of the pilot programs, making them unsuitable as models for future initiatives.

According to the refugee program privacy impact assessment, five separate systems retain information for refugee processing, but none are described as containing social media data collected by DHS. 25 For example, the results of background checks, which may be informed by social media information, are stored in the State Department’s refugee case management system, the Worldwide Refugee Admissions Processing System (WRAPS), but only in the form of a check’s outcome (“clear” or “not clear”). 26

However, social media information is kept in a far-reaching system known as the Alien Files (A-Files), which covers every immigrant and some visitors to the United States. 27 USCIS is the main custodian of the system, with ICE and CBP regularly contributing to and using the data contained in it. 28 An individual’s A-File is considered the official record of his or her immigration history and is used by a wide array of agency personnel for legal, fiscal, and administrative needs, such as naturalization and deportation proceedings. 29 A September 2017 notice in the Federal Register made clear that DHS collects and keeps social media information (handles, aliases, associated identifiable information, and search results) relating to immigrants, including legal permanent residents and naturalized citizens. In an email to the news site Gizmodo, DHS stated that “the notice does not authorize USCIS to search the social media accounts of naturalized citizens,” which raises the question of whether other authorities are used to undertake such searches and leaves unaddressed the implications for people who have legal permanent resident status. 30 Regardless of whether new collection occurs, the 100-year A-File retention period means that DHS and other agencies can access and potentially use information gathered from social media long after an immigrant has completed the naturalization process. Despite questions from the press, DHS has not publicly clarified if and how this information could be used in the future.

To summarize, social media is used by USCIS in vetting people who apply for immigration benefits (such as students who become employed and change their visa status, or green card holders who become naturalized), and this information is retained in their A-Files. As discussed above, USCIS itself found that social media monitoring was not particularly helpful when it tested social media vetting for five programs. It has nonetheless proceeded with expanding its use of social media in several contexts, especially the vetting of refugee applicants and asylum seekers. It appears that such uses are focused on checking information provided by applicants, which may be justified for situations in which people seeking such status do not have documentation. But the ambiguous nature of social media raises concerns, as does the apparent targeting of certain — likely Muslim — applicants for such additional screening. Finally, as the inspector general’s evaluation of these programs clearly indicates, DHS has made no effort to evaluate their effectiveness.

B. Vetting for the Controlled Application Review and Resolution Program

Social media reviews are also used in the Controlled Application Review and Resolution Program (CARRP), a secretive FDNS program instituted in 2008 for flagging and processing cases that present “national security concerns.” 31 An individual who is placed on the CARRP track is essentially blacklisted. 32 According to a study by the American Civil Liberties Union (ACLU), CARRP uses vague, overly broad, and discriminatory criteria and disproportionately targets Muslims and individuals from Muslim-majority countries. 33 The program has been challenged in court as “extra-statutory, unlawful, and unconstitutional.” 34 A USCIS briefing book indicates that in July 2016, officers began screening social media accounts for Syrian and Iraqi CARRP cases specifically, though other documents suggest that social media is used to vet other populations as well. 35

Applicants can be referred to CARRP in a variety of ways. Individuals who are flagged as known or suspected terrorists (including anyone in the FBI’s overbroad Terrorist Screening Database, discussed above 36 ) are automatically flagged as a national security concern and put on the CARRP track. 37 People can also be referred to CARRP at any stage of the screening and adjudicative process (e.g., when applying for citizenship or a green card) if they might present a “national security concern.” 38 According to CARRP officer guidance, officers may utilize open-source research, including searching social media information, to identify an indicator of a national security concern. 39 The training handbook lists three broad categories of “non-statutory indicators” officers can consider to be indicative of a national security concern: “employment, training, or government affiliations” (e.g., foreign language expertise); “other suspicious activities” (e.g., unusual travel patterns); and “family members or close associates” (e.g., a roommate, coworker, or affiliate) who have been identified as national security concerns. 40

While these factors could be relevant to national security, they also give USCIS officers great discretion and present serious due process and free speech concerns, particularly in the case of individuals who are in the United States and seeking adjustment of status.

C. Immigration Benefits Determinations

FDNS officers consult social media websites and commercial data sources, including Thomson Reuters’s CLEAR database (discussed in the ICE section, above), during the screening of immigration benefit request forms, applications, or petitions. 41 According to information provided by FDNS to the DHS Privacy Office, data collected from social media during the benefit determination process is stored in the applicant’s A-File, whether or not it was found to be derogatory, but applicants are given the opportunity to explain or refute any “adverse information” found through social media. 42 However, USCIS has not complied with the Privacy Office’s 2012 recommendation to update the privacy impact assessments for several programs, including Deferred Action for Childhood Arrivals (DACA), to reflect that social media is used as a source of information and to address the privacy risks posed by such collection and how they would be mitigated. 43

When someone applies for an immigration benefit (such as naturalization), the applicant’s information is screened against data contained in USCIS, ICE, and other law enforcement databases for eligibility, fraud, and national security concerns. 44 In line with other DHS programs, USCIS is increasingly looking to automate many of the checks that it had previously performed manually. Since June 2017, USCIS and CBP have been working to gradually implement an interagency effort called “continuous immigration vetting.” 45 Through this program, people applying for green cards or naturalization will have the biographical and biometric information they provide, as well as any information received by USCIS thereafter, automatically checked against CBP holdings. These checks will continue until the time of naturalization. 46 This new program is currently intended to uncover “potential national security concerns,” 47 although the recently published privacy impact assessment notes that the agency hopes to expand the process to vet for public safety concerns and fraud as well. 48

Continuous immigration vetting relies on a connection between an existing USCIS screening tool called ATLAS 49 and CBP’s ATS, which ingests and analyzes social media and other data from a wide range of sources. When someone applies for a benefit, or when information about an individual (such as an address) is updated, ATLAS automatically scans for potential matches to derogatory information in other government databases. 50 ATLAS itself analyzes information to detect patterns and trends; for example, it visually displays relationships among individuals on the theory that they could reveal potential ties to criminal or terrorist activity. 51

With continuous immigration vetting, ATLAS also automatically sends any new information it receives over to ATS. ATS checks CBP holdings for matches to information about any individuals who have been flagged as a potential national security threat. 52 But ATS also stores the applicant or benefit holder’s information for future use. Whenever derogatory information associated with an individual is added to a government database, ATS automatically checks for a “match and/or association” to the USCIS information and sends results back to ATLAS. 53 It is not clear that this new system will rely on social media. The privacy impact assessment notes that although ATS connects with multiple data sets, USCIS and CBP have tailored the initiative so that only “relevant” data sets are checked, although these are not identified. 54
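
The data flow described above can be pictured with a minimal sketch. Everything in it is a hypothetical illustration: the class names `VettingStore` and `Subject`, the fields, and the exact-match rule are assumptions made for clarity, since the privacy impact assessment does not identify the data sets checked or explain how a “match and/or association” is computed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Subject:
    """Hypothetical stand-in for the information forwarded about a person."""
    case_id: str
    identifiers: frozenset  # e.g. names or addresses; illustrative only


@dataclass
class VettingStore:
    """Toy model of a system that keeps forwarded records for future checks."""
    subjects: dict = field(default_factory=dict)

    def receive(self, subject: Subject) -> None:
        # Step 1: forwarded information is stored, not just checked once.
        self.subjects[subject.case_id] = subject

    def on_new_derogatory_record(self, identifiers: set) -> list:
        # Step 2: each new derogatory record is compared against every
        # stored subject. Exact identifier overlap is an assumption; the
        # real matching rules are not public.
        return sorted(
            case_id
            for case_id, subject in self.subjects.items()
            if subject.identifiers & identifiers
        )


store = VettingStore()
store.receive(Subject("case-1", frozenset({"addr:12 Elm St"})))
store.receive(Subject("case-2", frozenset({"addr:9 Oak Ave"})))

# A record added later still triggers a check against earlier applicants.
flagged = store.on_new_derogatory_record({"addr:12 Elm St"})
```

The point of the sketch is the ordering: because records are stored on receipt, a person can be flagged by information that did not exist when he or she applied.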

2. Administrative Investigations

FDNS conducts administrative investigations in order to procure additional information that can help determine an individual’s eligibility for an immigration benefit. Administrative investigations seek to verify relationships that are the basis for an individual to receive an immigration benefit, identify violations of the Immigration and Nationality Act, and identify other grounds of admissibility or removability. 55

An officer can decide that an investigation is warranted on the basis of the results of a “manual review,” which can be triggered by three mechanisms: a notification generated by ATLAS (when there is a match to one of its predefined rules), a fraud tip referral from the public or government officials, or a manual referral submitted by USCIS adjudications staff. 56 In order to officially open an administrative investigation after a manual review, the officer must determine that the tip is “actionable.” 57 There are no publicly available criteria for this determination. The relevant privacy impact assessment notes only that investigations are performed due to suspected or confirmed fraud, criminal activity, or public safety or national security concern, or simply when a case is randomly selected for assessments to determine whether benefits have been obtained by fraud. 58 The broadness of these criteria suggests that the bar for opening an investigation is low and largely left to the officer’s discretion.

As is the case with the screening of immigration benefits, FDNS may collect information from public sources, including social media, to serve as an additional check for other information collected during these investigations, support or refute any indication of fraudulent behavior, and identify threats to public safety or national security. 59 By way of example, FDNS is known to check an applicant’s social media to help uncover “sham marriages.” 60 That said, FDNS materials specify that an officer may not deny an immigration benefit, investigate benefit fraud, or identify public safety and national security concerns based solely on public source information. 61 Rather, such information may only be used to identify possible inconsistencies and must be corroborated with authoritative information on file with USCIS prior to taking action. 62 Any information found on a social media site and used during an investigation will be stored in both the applicant’s hard-copy file and in the Fraud Detection and National Security Data System (FDNS-DS), regardless of whether it was found to be derogatory. If the information collected is found to be derogatory, the individual must be given the chance to explain or refute it, as is the FDNS standard with all derogatory information found from publicly available sources. 63

As the above discussion shows, USCIS/FDNS has taken significant steps to incorporate social media into its various vetting and screening activities, including making admissibility and eligibility determinations for certain refugees and asylum seekers and for those placed on the CARRP track. There are questions about whether this vetting disproportionately targets Muslims and those from Muslim-majority countries. In refugee and asylum cases, social media could serve as a source of information for people who don’t have many documents, but it could also serve as a way to weed out people due to ideological, racial, or religious prejudices or on the basis of misinterpretations. Administrative investigations too can use social media, although its use in that context is restricted to verification, and those affected have the opportunity to refute derogatory information. In line with other programs, USCIS is relying more and more on automation to support certain checks and screening processes.

Conclusion

Social media provides a huge trove of information about individuals (their likes and dislikes, their political and religious views, the identity of their friends and family, their health and mental state) that has proved irresistible for security and law enforcement agencies to collect and mine in the name of national security and public safety. Increasingly, DHS is vacuuming up social media information from a variety of sources, ranging from travelers’ electronic devices to commercial databases, and using it to make decisions about who gets to come to the United States and the level of screening to which travelers are subjected. But there are serious questions about these programs: the evidence shows they are not effective in identifying risk, and they open the door to discrimination and the suppression of speech, association, and religious belief. Congress must fulfill its oversight responsibilities and require DHS both to come clean about the full extent of its social media surveillance and to ensure that these programs are based on empirical evidence of effectiveness, safeguard against discrimination, and include robust privacy protections.

Appendix

DHS databases generally have a records retention schedule approved by the National Archives and Records Administration. The following appendix contains details on the retention schedules for the DHS systems that likely store social media data and other sensitive information.

DHS Component

CBP

Database/System

Automated Targeting System (ATS)

Retention Schedule

Risk assessments and other records in ATS are retained for 15 years, unless the information is “linked to active law enforcement lookout records . . . or other defined sets of circumstances,” in which case the information is retained for “the life of the law enforcement matter.” 1 This period may exceed the data retention requirements of the system from which the data originated, and therefore ATS may pass information to partners even after it has been deleted from other CBP databases. 2

DHS Component

CBP

Database/System

Electronic System for Travel Authorization (ESTA)

Retention Schedule

CBP stores information from social media platforms collected during ESTA vetting in ATS. 3 ESTA application and vetting information is retained for a total of three years in active status (one year after the traveler’s two-year travel authorization period expires), at which point the account information is archived for 12 years. 4 Data linked to active law enforcement lookout records continues to be accessible for the duration of the law enforcement activities to which it is related. 5

DHS Component

CBP

Database/System

Analytical Framework for Intelligence (AFI)

Retention Schedule

Finished intelligence products in AFI are retained for 20 years. 6 Unfinished products that don’t contain personal information are retained in AFI for 30 years; those that do must be recertified annually for continued relevance and accuracy. 7

DHS Component

CBP

Database/System

CBP Intelligence Records System (CIRS)

Retention Schedule

Finished intelligence files are retained in CIRS for 20 years and raw, unevaluated information for 30 years. 8

DHS Component

CBP

Database/System

Stand-alone IT systems (e.g., ADACS4)

Retention Schedule

Information collected during CBP’s searches of electronic devices obtained pursuant to warrant, consent, or abandonment is stored in “stand-alone” technology systems (unconnected to other DHS databases). Information associated with arrests, detentions, and removals may be stored for up to 75 years, and information that does not lead to the arrest, detention, or removal of an individual may be stored for 20 years “after the matter is closed.” 9

DHS Component

TSA

Database/System

Secure Flight

Retention Schedule

The passenger data from Secure Flight shared with CBP is deleted within seven days after the flight itinerary for passengers who do not require additional scrutiny. Passenger information on “potential” hits is retained in ATS for 15 years, 8 years after that information is removed from the Secure Flight system. 10 Confirmed matches to a watch list record or other derogatory information are retained for 99 years. 11 Data pertaining to individuals who match TSA’s “rules” for lists such as the Silent Partner and Quiet Skies Lists are retained by both ATS and Secure Flight for seven years. 12 Secure Flight information that is linked to a border security, national security, significant health risk, or counterterrorism matter will be retained in ATS for the life of the matter. 13

DHS Component

TSA

Database/System

TSA Watch Lists

Retention Schedule

TSA Watch List master files are maintained for 30 years after the date of entry. 14

DHS Component

TSA

Database/System

PreCheck

Retention Schedule

TSA retains information on individuals whose PreCheck applications were rejected because of their criminal history and places such individuals on a permanently retained list of passengers who are ineligible for PreCheck. 15 Information pertaining to an individual who is a match to a watch list will be retained for 99 years, or seven years after TSA learns that the individual is deceased, whichever is earlier. 16 Information pertaining to a PreCheck applicant who originally appeared to be a match to a watch list, but who was subsequently determined not to be a match, will be retained for seven years. 17 Information pertaining to an individual approved for PreCheck who no longer participates in the program is retained for one year after the request to stop participation in PreCheck is received. 18

DHS Component

ICE

Database/System

LeadTrac

Retention Schedule

Data stored in LeadTrac is retained for 75 years after the cases to which those records relate are closed. 19

DHS Component

ICE

Database/System

Investigative Case Management (ICM) System

Retention Schedule

Under the proposed schedule, ICM records would be retained for 20 years from the end of the fiscal year in which a case is closed. 20 After 20 years, the information would either be destroyed or retained further under a new retention schedule if deemed necessary. However, cases would be permanently retained if deemed to be of significant “historical interest” (not defined). All ICM records will be treated as permanent records until a records retention schedule is approved. 21

DHS Component

ICE

Database/System

FALCON-Search and Analysis (FALCON-SA)

Retention Schedule

Routinely ingested data is retained in accordance with the approved record retention schedule and SORN of those source systems. FALCON-SA data uploaded in an ad hoc manner, user-created visualizations, and search queries are retained in the system for the same length of time as the associated ICE case file. If there is no associated ICE case number, the retention period is 20 years. 22 Information from ATS pertaining to border crossings that is uploaded into FALCON-SA on an ad hoc basis is retained in FALCON-SA for 15 years after the relevant border crossing. 23

DHS Component

ICE

Database/System

ICE Intelligence Records System (IIRS)

Retention Schedule

ICE is in the process of drafting a proposed record retention schedule for the sources maintained in the ICE Intelligence Records System. 24

DHS Component

USCIS

Database/System

Alien Files (A-Files)

Retention Schedule

An individual’s A-File is retained for 100 years after his or her date of birth. 25

DHS Component

USCIS

Database/System

Fraud Detection and National Security Data System (FDNS-DS)

Retention Schedule

An individual’s record is stored in FDNS-DS for 15 years after the date of his or her last interaction with FDNS personnel. However, records “related” (undefined) to the individual’s A-File are transferred there and retained in accordance with the A-File retention schedule. 26

DHS Component

DHS-Wide

Database/System

DHS Data Framework

Retention Schedule

Data is retained in accordance with the retention schedules of source systems. 27
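
To make the schedules above concrete, the following sketch computes a few of the end dates they imply. The function names and sample dates are hypothetical, and the calculation uses plain calendar-year arithmetic; it illustrates the periods stated in the table and is not a description of how DHS actually computes disposal dates.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def a_file_retention_end(date_of_birth: date) -> date:
    # A-Files: 100 years after the individual's date of birth.
    return add_years(date_of_birth, 100)


def fdns_ds_retention_end(last_interaction: date) -> date:
    # FDNS-DS: 15 years after the last interaction with FDNS personnel.
    return add_years(last_interaction, 15)


def esta_total_retention_years() -> int:
    # ESTA: three years in active status, then 12 years archived.
    return 3 + 12


# Sample dates are invented for illustration.
print(a_file_retention_end(date(1990, 6, 15)))
print(fdns_ds_retention_end(date(2019, 5, 22)))
print(esta_total_retention_years())
```

For someone who naturalizes at age 30, the A-File schedule implies roughly 70 more years of retention, which is the concern raised in the discussion of A-Files above.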

Endnotes for Sidebars