The Trump administration has said that it is reviewing all 55 million people with valid visas to visit, live in, or work in the United States for violations that could lead to deportation, including by reviewing their social media posts. This expansion of “continuous vetting” marks a dramatic escalation in the surveillance of legal immigrants while glossing over the technological and practical difficulties that make accurate universal surveillance of tens of millions of people impossible. It also represents an attack on freedom of speech, forcing millions of visa holders to change their behavior by avoiding political protests, deleting social media posts, and refraining from expressing opinions that someone might deem “anti-American” — even though the First Amendment protects such expression regardless of citizenship.
What “Continuous Vetting” Really Means
Under traditional continuous vetting, visa holders are monitored for signs that they should no longer be allowed to stay in the United States. This is not a one-time background check. In U.S. immigration practice, continuous vetting is an ongoing process, spearheaded by U.S. Citizenship and Immigration Services, that repeatedly re-screens visa applicants and holders against government watchlists, law enforcement and immigration records, and, increasingly, social media content. What’s new is the scale: the State Department has indicated that this expanded process will now apply to all 55 million visa holders.
The State Department’s “Catch and Revoke” initiative, which the agency says uses artificial intelligence to assess and revoke the visas of ostensibly pro-Hamas international students and other foreign visa holders, is another form of continuous vetting. One analysis indicates that it combines social media monitoring, visa status tracking, and automated threat assessments to carry out the task.
In addition, as agencies modernize case systems, they’re closing the gap between detecting events (border crossings, status changes, flagged posts) and triggering enforcement actions. In April, U.S. Immigration and Customs Enforcement commissioned a prototype case-management platform, called ImmigrationOS, to connect identification to removal. These systems represent a fundamental shift from periodic reviews to persistent, event-based surveillance. At the heart of this surveillance apparatus are products like Babel X, an AI-powered text and social media analytics tool that U.S. Customs and Border Protection has licensed since at least 2019. According to internal documentation, it is used to “support targeting and vetting, . . . [and] identify potential derogatory and confirmatory information,” with results stored in CBP systems. Babel Street, which sells Babel X, markets a “persistent search” capability — like a Google Alert on steroids, it continuously monitors online sources for new information about individuals, incorporates publicly and commercially available data, and produces alerts for its government clients.
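In principle, “persistent search” is just an event loop over incoming data: instead of waiting for the next periodic review, every new record about a watched person can generate an alert. The sketch below is a generic illustration of that shift from periodic to event-based checks; it is not based on Babel X, ImmigrationOS, or any actual government system, and every name, identifier, and flag term in it is invented for this article.

```python
# Generic sketch of event-based monitoring vs. periodic review.
# Invented for illustration -- not a description of any real system.

from dataclasses import dataclass

@dataclass
class Event:
    person_id: str   # who the record is about
    source: str      # e.g., "social_media", "border_crossing", "status_change"
    detail: str      # raw content of the new record

WATCHLIST = {"A123", "B456"}          # invented identifiers
FLAG_TERMS = {"protest", "boycott"}   # invented, deliberately vague terms

def alerts(stream):
    """Yield an alert the moment a watched person's event matches a flag term."""
    for event in stream:
        if event.person_id in WATCHLIST and any(
            term in event.detail.lower() for term in FLAG_TERMS
        ):
            yield f"ALERT {event.person_id}: {event.source} -> {event.detail!r}"

incoming = [
    Event("A123", "social_media", "Attending the protest downtown this weekend"),
    Event("C789", "border_crossing", "Re-entered at JFK"),
    Event("B456", "social_media", "Great weather for a picnic"),
]

for a in alerts(incoming):
    print(a)
```

The point is how little it takes: once records flow in continuously, a keyword match against a vague term is enough to produce an enforcement-facing alert, with all of the accuracy problems discussed in the next sections.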
What Continuous Monitoring Can — and Can’t — Do
While monitoring 55 million people sounds like something computers can do automatically, the reality is much messier. Real surveillance systems need massive physical infrastructure: giant server farms, thousands of human analysts, and substantial budgets to keep running. Consider a single benchmark: the NSA’s Utah Data Center, which was built to ingest and analyze massive communications streams and has continued campus expansion into 2025, according to a notice from the Army Corps of Engineers. In drought-stressed years it has drawn tens of millions of gallons of water in a single month, and earlier reporting put its annual power bill on the order of $40 million.
The operational obstacles are significant as well. Tech giants with enormous budgets struggle to monitor their own platforms accurately. Tracking every digital move of 55 million people across the entire internet would require surveillance infrastructure bigger than anything that currently exists, with coordination across dozens of agencies that can barely talk to each other. Even with the White House pushing to eliminate “data silos,” and with DOGE rapidly consolidating Americans’ personal information despite concerns that doing so violates the Privacy Act and leaves sensitive information vulnerable to leaks and hacks, these challenges remain unresolved.
Why Algorithms Cannot Reliably Score Beliefs
The problem compounds when you consider what these systems are often actually asked to detect through automated sentiment analysis: complex concepts like “anti-Americanism” or “anti-Semitism.” These terms are not machine-readable categories; they name sentiments for which there is no agreed-upon definition. While some automated tools offer sentiment analysis, that function only estimates whether language sounds positive or negative; it does not determine a person’s stance toward a specific target, which is the relevant question for categories like “anti-American.” These tools detect tone and patterns, but they don’t resolve contested meanings or infer beliefs. Sarcasm and irony routinely invert the literal meaning of words, further degrading accuracy, a limitation documented across recent survey work on sarcasm detection and stance analysis.
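To make the sentiment-versus-stance distinction concrete, the toy sketch below scores posts by counting positive and negative words, the core move of lexicon-based sentiment analysis. Everything in it (the word lists, the example posts, the scoring) is invented for illustration and has no relation to any deployed system; the point is that a polarity score says nothing about whom the negativity targets, and sarcasm flips the result entirely.

```python
# Toy lexicon-based sentiment scorer: counts positive vs. negative words.
# Illustrative only -- the word lists and posts are invented for this sketch.

POSITIVE = {"great", "love", "wonderful", "proud"}
NEGATIVE = {"terrible", "hate", "awful", "shameful"}

def polarity(text: str) -> int:
    """Return (#positive words - #negative words); >0 reads as 'positive'."""
    words = {w.strip(".,!?;:").lower() for w in text.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)

posts = [
    # Sarcasm: literally "positive" words, but the intent is criticism.
    "Oh great, another wonderful round of layoffs. I love this so much.",
    # Negative words aimed at a policy -- says nothing about being "anti-American".
    "This visa policy is terrible and shameful; I expect better from my government.",
    # Genuinely positive post that still mentions a "negative" topic.
    "Proud of my neighbors for organizing against hate in our town.",
]

for p in posts:
    score = polarity(p)
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    print(f"{label:>8} ({score:+d})  {p}")
```

Running the sketch labels the sarcastic layoff post “positive,” the policy criticism “negative,” and the anti-hate organizing post “neutral.” Production systems use learned models rather than word lists, but the failure mode is the same: they estimate tone, not the target of a belief.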
Even vendors admit these judgments depend on shifting context, contested definitions, and incomplete data. When the Department of Homeland Security floated a plan to algorithmically rate visa applicants’ “positive contribution[s]” and threat to “national interests,” 54 technical experts wrote a letter advising the acting Homeland Security secretary that “no computational method can provide reliable or objective assessments of traits” the government sought to measure. They predicted such ill-defined targets would produce substantial false positives, incorrectly flagging people who posed no threat. DHS ultimately abandoned the proposal.
Language and culture compound the problem. Systems trained on mainstream American English mislabel dialects and speech that mixes multiple languages at much higher rates than they do standard English. Studies of models used to flag slurs or abusive posts show that these automated tools often mislabel ordinary African-American English as “toxic” and become unreliable when the dialect or platform changes, conditions that are common in immigrant online spaces.
A great deal of contemporary political expression is also multimodal: that is, it may include memes, images with text, and coded euphemisms. Text-only classifiers (systems that read the words but ignore the image) miss crucial context like who a subject is, whether a sentence is a joke, or what an image implies beyond the language in it. At the same time, multimodal models (systems that look at both the image and text) remain fragile and inconsistent despite active research.
Finally, the math of rare events guarantees collateral damage at scale. If true cases are extremely rare, screening 55 million people will mostly flag innocents, even with good tools, a risk the technical experts warned DHS about. This is why even a “small” false-positive rate affects millions of people: a 5 percent false-positive rate applied to 55 million people would flag 2.75 million of them, the equivalent of the entire population of the city of Chicago.
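The arithmetic behind that claim, sometimes called the base-rate problem, is easy to run yourself. The numbers below are illustrative assumptions, not official figures: suppose genuine cases occur at a rate of one in 10,000, and grant the screening tool a generous 90 percent detection rate alongside the 5 percent false-positive rate discussed above.

```python
# Base-rate arithmetic for mass screening. All numbers are illustrative
# assumptions for this sketch, not official figures.

population = 55_000_000      # visa holders under review
prevalence = 1 / 10_000      # assumed share of genuine cases (illustrative)
sensitivity = 0.90           # assumed share of genuine cases the tool catches
false_positive_rate = 0.05   # assumed share of innocents wrongly flagged

true_cases = population * prevalence
innocents = population - true_cases

true_positives = true_cases * sensitivity
false_positives = innocents * false_positive_rate
flagged = true_positives + false_positives

print(f"Genuine cases caught:    {true_positives:>12,.0f}")
print(f"Innocent people flagged: {false_positives:>12,.0f}")
print(f"Share of flags that are genuine: {true_positives / flagged:.2%}")
```

Under those assumptions, fewer than two of every thousand flags point to a genuine case; the other roughly 2.7 million flags are innocent people pulled into the net.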
Government testing hasn’t closed these gaps. DHS’s inspector general in 2017 found that social-media screening pilots lacked clear measures of effectiveness and could not support department-wide scaling — hardly a foundation for automating determinations about beliefs. Ultimately, scoring beliefs is not just technically hard. It is poorly defined, highly subjective, and empirically error-prone. Using these outputs to trigger immigration consequences invites inaccurate and biased results at national scale.
When Fear Does the Work: Lessons from the United States and Abroad
People change what they read, search, and say when they believe they’re being monitored. After Edward Snowden’s 2013 revelations about NSA surveillance, page views of Wikipedia articles on privacy-sensitive topics fell significantly. That hesitation carries over into expression. Experimental work has found that when people feel government surveillance is present, they become less willing to speak on social media, even when they would otherwise participate, creating an online “spiral of silence.”
Chilling effects also reach association and community life. Research in Uganda and Zimbabwe has found three consistent outcomes of government surveillance: self-censorship; a reluctance to engage with individuals or organizations believed to be under watch; and an erosion of trust that makes organizing and collective action harder. Similar patterns have been documented in U.S. communities subjected to targeted monitoring. In New York City, for instance, the NYPD’s surveillance of Muslim communities led people to avoid sensitive conversations and groups and practice self-censorship in religious and political life.
Indeed, simply suspecting surveillance exists can be enough to change behavior, regardless of whether the state is actually watching. In Zimbabwe, observers describe online users engaging in routine self-censorship in light of threats and arrests for critical posts; even when capabilities are uneven, the expectation of scrutiny leads people to delete or withhold speech. In Uganda, mandatory SIM registration, social media monitoring, and periodic crackdowns have produced a cumulative chilling effect: Critics and ordinary users alike hold back for fear of legal or professional consequences.
In short, this patchwork of surveillance succeeds precisely because it’s unpredictable. By talking up “AI-driven” vetting and catch-and-revoke, officials plant the idea that any post, follow, or status change could trip a wire. Visa holders can’t know when or how they’re being monitored, so they police themselves. A system doesn’t need to watch everyone to control them; it only needs people to think it can. In the meantime, the government builds the technology for tomorrow.
The risks posed by the administration’s announcement of expanded continuous vetting based on nebulous standards are thus two sides of the same coin. The threat of surveillance is enough to change behavior even without continuous monitoring. And the kind of analysis the government says it wants to do does not work. Instead, these efforts will chill speech and pull innocent people into their net.
Nasser Eledroos is a computer scientist and policy advocate focusing on rules, standards, and guidance for how technology is built and used.