For the study, published in the journal Science Advances, the researchers investigated past misinformation campaigns from China, Russia, and Venezuela that were allegedly waged against the US before and after the 2016 presidential election.
The team identified the patterns these campaigns followed by analysing posts to Twitter and Reddit and the hyperlinks or URLs they included.
After running a series of tests, they found their model was effective in identifying posts and accounts that were part of a foreign influence campaign, including those by accounts that had never been used before.
The researchers hope that software engineers will be able to build on their work to create a real-time monitoring system for exposing foreign influence in American politics.
"What our research means is that you could estimate in real time how much of it is out there, and what they're talking about," said Jacob Shapiro, Professor at the Princeton School of Public and International Affairs.
"It's not perfect, but it would force these actors to get more creative and possibly stop their efforts. You can only imagine how much better this could be if someone puts in the engineering efforts to optimise it."
The team combined data on troll campaigns from Twitter and Reddit with a rich dataset on posts by politically engaged users and average users collected over many years by New York University's Center for Social Media and Politics (CSMaP).
The troll data included publicly available Twitter and Reddit data from Chinese, Russian, and Venezuelan trolls totaling 8,000 accounts and 7.2 million posts from late 2015 through 2019.
The team considered the characteristics of the post itself, such as its timing, its word count, or whether the URL it mentioned belonged to a news website.
They also looked at what they called "metacontent," or how the messaging in a post related to other information shared at that time: for example, whether a URL was among the top 25 political domains shared by trolls.
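To make the two kinds of features concrete, here is a minimal Python sketch of how the post-level characteristics and one "metacontent" signal might be computed. It is an illustration only, not the researchers' code: the function names, the small NEWS_DOMAINS list, and the choice to rank all troll-shared domains (rather than only political ones) are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime
from urllib.parse import urlparse

# Hypothetical list of news domains; the study's actual list is not given here.
NEWS_DOMAINS = {"nytimes.com", "bbc.co.uk", "reuters.com"}


def domain_of(url):
    """Return the lower-cased host of a URL without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def top_troll_domains(troll_urls, k=25):
    """Metacontent reference set: the k domains most often shared by known trolls.

    Assumption: every domain is counted; the study restricted this to political domains.
    """
    counts = Counter(domain_of(u) for u in troll_urls)
    return {d for d, _ in counts.most_common(k)}


def post_features(text, posted_at, url, troll_top):
    """Build a small feature dictionary for one post."""
    domain = domain_of(url) if url else ""
    return {
        "hour_of_day": posted_at.hour,           # timing
        "word_count": len(text.split()),         # length of the post
        "url_is_news": domain in NEWS_DOMAINS,   # content feature
        "url_in_troll_top": domain in troll_top, # metacontent feature
    }
```

A classifier would then be trained on such dictionaries, with the troll reference set rebuilt for each time window so that it reflects what was being shared at that moment.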
The team tested their method extensively, examining performance month to month on five different prediction tasks across four influence campaigns.
According to the study, across almost all of the 463 different tests it was clear which posts were and were not part of an influence operation, meaning that content-based features can indeed help find coordinated influence campaigns on social media.
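Examining performance month to month can be pictured with a short Python sketch like the one below. It assumes each test record carries a month label, the true answer, and the model's prediction, and it scores each month with the F1 measure. The record layout, the function name, and the choice of F1 are assumptions for illustration; the article does not say which metric the team reported.

```python
from collections import defaultdict


def f1_by_month(records):
    """Compute an F1 score per month.

    records: iterable of (month, is_troll, predicted_troll) tuples,
    where the two labels are booleans. Hypothetical layout.
    """
    tallies = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for month, truth, pred in records:
        t = tallies[month]
        if pred and truth:
            t["tp"] += 1      # troll post correctly flagged
        elif pred and not truth:
            t["fp"] += 1      # ordinary post wrongly flagged
        elif truth and not pred:
            t["fn"] += 1      # troll post missed
    scores = {}
    for month, t in sorted(tallies.items()):
        denom = 2 * t["tp"] + t["fp"] + t["fn"]
        # A month with no troll posts and no flags gets 0.0 by convention here.
        scores[month] = 2 * t["tp"] / denom if denom else 0.0
    return scores
```

Tracking a score like this per month, rather than one overall figure, shows whether detection holds up as the campaigns change their content over time.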