Vaccination debates are prone to misinformation and this has been especially true during the COVID-19 pandemic.1,2 A particularly interesting case is the AstraZeneca COVID-19 vaccine, which has had significant media coverage related to the tensions over EU-UK exports, as well as an alleged link to very rare cases of blood clots.
To study media coverage of the AstraZeneca COVID-19 vaccine, we analysed tweets containing ‘#AstraZeneca’ and links in the most frequently retweeted tweets. Additionally, we investigated co-tweet networks and revealed bot activity related to the discourse.
We conducted a Thick Big Data3 analysis of the AstraZeneca hashtag on Twitter. Thick Big Data is a mixed-method analysis, combining computational analysis of large data sets with qualitative insight into specific tweets (selected based on quantitative criteria). We used a Python GetOldTweets3 script to collect 221,922 tweets with ‘#AstraZeneca’ from 1 January 2021 to 22 March 2021. We focused on the 50,080 tweets in the English language. Furthermore, we established the final URLs linked in these tweets by resolving addresses that had been shortened with popular link shorteners.
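The link-resolution step can be sketched roughly as follows. This is a minimal illustration and not the script used in the study: the list of shortener domains, the function name `resolve_final_url` and the caller-supplied `fetch_location` lookup are all assumptions. In practice `fetch_location` would issue an HTTP request and return the redirect target; here it is left abstract so the logic can be read on its own.

```python
from urllib.parse import urlparse

# Hypothetical list of shortener domains; the study does not enumerate them.
SHORTENER_DOMAINS = {"bit.ly", "t.co", "smh.re"}


def resolve_final_url(url, fetch_location, max_hops=10):
    """Follow shortener redirects until a non-shortener URL is reached.

    fetch_location(url) is a caller-supplied function that returns the
    redirect target of a short URL, or None if it cannot be resolved.
    """
    seen = set()
    for _ in range(max_hops):
        host = urlparse(url).netloc.lower()
        # Stop at the first URL that is not a known shortener, or on a loop.
        if host not in SHORTENER_DOMAINS or url in seen:
            break
        seen.add(url)
        target = fetch_location(url)
        if target is None:
            break
        url = target
    return url
```

Capping the number of hops and tracking visited URLs guards against redirect loops, which a naive resolver would follow indefinitely.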
We focused on time periods before and after 7 March 2021 because this was the date when Austrian authorities took a precautionary step to suspend vaccinating with a batch of AstraZeneca’s vaccine.
Furthermore, we studied which media sources were most often linked in frequently retweeted tweets (more than 10 times for both time periods). We also analysed the most liked and most retweeted tweets qualitatively, which enabled us to determine which content receives the largest public exposure and backing.
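As an illustration of this source-counting step, the sketch below tallies linked domains separately for the two periods, keeping only tweets retweeted more than 10 times. The record layout (`date`, `retweets`, `urls`) and the function name `top_domains` are hypothetical, and assigning tweets posted on 7 March itself to the later period is an assumption made here for the example.

```python
from collections import Counter
from datetime import date
from urllib.parse import urlparse

CUTOFF = date(2021, 3, 7)  # date of the Austrian batch suspension
MIN_RETWEETS = 10          # tweets must be retweeted more than this


def top_domains(tweets, top_n=10):
    """Return the most frequently linked domains before and after CUTOFF.

    tweets: iterable of dicts with 'date' (datetime.date), 'retweets' (int)
    and 'urls' (list of already-resolved URL strings).
    """
    counts = {"before": Counter(), "after": Counter()}
    for tweet in tweets:
        if tweet["retweets"] <= MIN_RETWEETS:
            continue
        # Assumption: 7 March itself is counted in the "after" period.
        period = "before" if tweet["date"] < CUTOFF else "after"
        for url in tweet["urls"]:
            host = urlparse(url).netloc.lower()
            if host.startswith("www."):
                host = host[4:]
            if host:
                counts[period][host] += 1
    return {p: c.most_common(top_n) for p, c in counts.items()}
```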
Finally, we conducted Coordination Detection analysis4 to identify cooperative efforts to propagate certain tweets. To detect coordination, we conducted a similarity analysis of tweet corpora using the Min-Hash5 locality-sensitive hashing method, which is well suited to processing large sets of textual data. As a result, we were able to identify similar items based on the Jaccard similarity between sets of the strings’ hashtag n-grams, a commonly used statistical language model that can be used to distinguish text strings. We adopted a 0.8 Jaccard similarity threshold as proposed in the method introduced by Pacheco et al.4
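To make the similarity step concrete, the following is a minimal sketch of n-gram Jaccard similarity and a MinHash estimate of it, written with the Python standard library only. It is not the implementation used in the study: the word-level trigrams, the 128 hash functions and all function names are assumptions chosen for illustration. Only the 0.8 threshold comes from the text.

```python
import hashlib

JACCARD_THRESHOLD = 0.8  # threshold reported in the text


def ngrams(text, n=3):
    """Set of word-level n-grams (n=3 is an assumption of this sketch)."""
    tokens = text.lower().split()
    if len(tokens) < n:
        return {" ".join(tokens)} if tokens else set()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def minhash_signature(items, num_perm=128):
    """MinHash signature: the minimum of each salted hash over the set."""
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "big")  # one salted hash function per seed
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8,
                                salt=salt).digest(), "big")
            for item in items))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions estimates the Jaccard value."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

Two tweets would then be treated as coordinated candidates when their similarity reaches `JACCARD_THRESHOLD`. The signature-based estimate is approximate; its value lies in comparing fixed-length signatures instead of full n-gram sets when the corpus is large.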
To project each tweet’s coordination network, we drew an edge between two accounts with matching tweet text corpus using the co-occurrence analysis functionality of the quanteda package for R. The resulting graph had 4200 nodes representing tweets with a similar or identical corpus, which accounted for >11,000 unique Twitter user handles.
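The projection can be illustrated with a short sketch. Note the swap of tooling: the study used quanteda for R, whereas this example uses plain Python, treats accounts as nodes, and assumes the tweets have already been grouped by similar text. The input layout and the function names `coordination_edges` and `connected_components` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def coordination_edges(tweet_groups):
    """Weighted account-to-account edges from groups of similar tweets.

    tweet_groups: mapping of group id -> list of account handles that
    posted a tweet in that group. Returns {(handle_a, handle_b): weight}.
    """
    edges = defaultdict(int)
    for handles in tweet_groups.values():
        # Every pair of distinct accounts sharing a text gets one edge count.
        for a, b in combinations(sorted(set(handles)), 2):
            edges[(a, b)] += 1
    return dict(edges)


def connected_components(edges):
    """Group accounts into networks, largest first."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)
```

Sorting components by size gives a direct way to pick out the largest and second-largest networks for closer qualitative reading.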
We also performed bot identification with Botometer, a machine-learning platform that computes a bot likelihood score. Botometer derives the score by comparing an account against tens of thousands of labelled inputs in its database.6 Considering the prior coordination detection analysis that we performed, the Complete Automation Probability threshold was set to ≥0.76.
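Applying the threshold is straightforward once scores are available. The sketch below assumes the Complete Automation Probability values have already been retrieved and stored in a dictionary keyed by handle; that layout and the function name `flag_automated` are assumptions, and no Botometer API call is shown.

```python
CAP_THRESHOLD = 0.76  # Complete Automation Probability cut-off from the text


def flag_automated(cap_scores, threshold=CAP_THRESHOLD):
    """Return the handles whose CAP score is at or above the threshold.

    cap_scores: mapping of account handle -> CAP score in [0, 1],
    assumed to have been obtained from Botometer beforehand.
    """
    return {handle for handle, cap in cap_scores.items() if cap >= threshold}
```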
Before 7 March 2021, in the top ten linked sources among the tweets retweeted more than 10 times, four Western news media sources were identified as follows: AFP (ranked no. 18,806 most popular website according to Alexa traffic ranking), Politico.eu (no. 20,905 in Alexa), the Telegraph (no. 1730 in Alexa), and the Guardian (no. 172 in Alexa).
After 7 March 2021, 7 NewsRand, a relatively unpopular Nigerian news website (ranked no. 126,271 in Alexa), was more frequently linked than the four Western news media sources above, which all dropped out of the top ten linked sources. More interestingly, GreatGameIndia (ranked no. 126,482 in Alexa), an Indian website that has previously been described as spreading disinformation and in particular COVID-19 fake news,7 was number 5 before 7 March 2021, increased to number 2 in the period after, and included tweets linking AstraZeneca to eugenics.
RT (formerly Russia Today, ranked no. 312 in Alexa) had a clear lead in both time periods. RT is a state-owned media outlet that has been described as supporting Russian diplomatic goals as an information warfare tool.8 A qualitative analysis of RT links showed that they are predominantly negative when referring to the AstraZeneca vaccine, even though they repeat actual news rather than pure disinformation.
Outside media links, the most retweeted (2656 retweets as of 26 March 2021) tweet overall from the first time period was one by Robert Kennedy Jr, a known anti-vaccine advocate, whose account on Instagram was terminated in February 2021 because of COVID-19 disinformation. The tweet discredited the AstraZeneca vaccine as ‘controversial’, ‘heavily invested in by Bill Gates’ and ‘being rejected over widespread concerns’. The most retweeted (2015 retweets as of 26 March 2021) tweet from the second period was one by Disclose.tv, a site described as involved in disinformation.9
Tweet coordination analysis revealed 10,728 instances of coordination carried out by 1137 unique handles, of which 2278 instances and 616 unique handles are related to automated bot accounts, according to a Botometer Complete Automation Probability score of ≥0.76. The largest coordination network had 451 coordination instances by 111 accounts, of which 74 scored above our threshold on the Botometer. The second-largest network consisted of 37 accounts responsible for 47 coordination instances; of these 37 accounts, 13 can be considered automated (see Fig. 1).
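The proportions implied by these counts can be checked with a few lines of arithmetic. The figures are taken directly from the paragraph above; the variable and function names are illustrative only.

```python
# Counts reported in the text.
overall = {"instances": 10728, "handles": 1137,
           "bot_instances": 2278, "bot_handles": 616}
largest = {"instances": 451, "accounts": 111, "bots": 74}
second = {"instances": 47, "accounts": 37, "bots": 13}


def share(part, whole):
    """Proportion of part in whole, rounded to three decimals."""
    return round(part / whole, 3)


bot_instance_share = share(overall["bot_instances"], overall["instances"])
bot_handle_share = share(overall["bot_handles"], overall["handles"])
largest_bot_share = share(largest["bots"], largest["accounts"])
second_bot_share = share(second["bots"], second["accounts"])

# Average number of coordination instances per account in each network.
largest_per_account = round(largest["instances"] / largest["accounts"], 2)
second_per_account = round(second["instances"] / second["accounts"], 2)
```

On these numbers, roughly two thirds of the accounts in the largest network and about 35% of those in the second-largest exceed the bot threshold, while the largest network averages about four coordination instances per account against just over one in the second.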
The first coordination network connects accounts that tweet about current vaccination situations in Bangladesh, praising the Bangladesh Prime Minister, Sheikh Hasina, for fighting the pandemic with the ‘#ThankYouPM’ hashtag, which is often used on Twitter to refer to India’s Prime Minister, Narendra Modi.
Another topic presented in this coordination network is large vaccine donations by India, one of the largest vaccine producers, to Bangladesh. The Indian government is using vaccine distribution to strengthen its ties through vaccine diplomacy.10
In both cases presented above, automated bot accounts are engaged in political astroturfing11 and are also involved in a co-tweet coordination network. Automated bot accounts spread identical or very similar texts to amplify the reach of a political message: in this case, vaccine diplomacy.
In the second-largest coordinated network, the number of coordination instances per account is much smaller, and the network carries only two coordinated messages. Those messages referred to the official position of the European Medicines Agency regarding the safety and efficacy of the AstraZeneca vaccine.
This network is composed predominantly of the accounts of European Commission employees and high-level officials, even though 35% of these accounts are considered in this analysis to be automated to some extent. Moreover, analysis of the shared links indicated that all the messages pointed to the article ‘Remarks by Commissioner Stella Kyriakides on vaccines’.12 All these messages used a shortened URL from the domain ‘smh.re’, which indicates the use of the employee advocacy software Smarp, which helps employees to coordinate corporate messages through their private accounts. Considering these facts, one can reasonably assume that this network represented a centralised health advocacy communication campaign coordinated by the European Commission’s employees to address increased criticism of AstraZeneca vaccines.
Astroturfing and the coordination of activities in social media can sway public opinion.13 While those actions can be used to disseminate political propaganda, as in the case of coordination activity in Bangladesh, they can also be helpful for health advocacy campaigns, as in the case of European Commission employees. The methods, goals, and degree of automation can differ, but in both cases coordination techniques were used while aiming to appear as organic content from Twitter users. These findings bring new insights into the use of coordinated activity in social media in the context of diplomacy, politics, and health advocacy during the global COVID-19 pandemic.
Our results focus on a short time period and only on one vaccine-related hashtag. Nevertheless, the picture that emerges is deeply troubling. Twitter discourse about #AstraZeneca abounds in misinformation, and the representation of reputable news media sources is, at best, on par with that of the misinforming sources and, at worst, significantly smaller.
Popular fear-mongering tweets are spread not only by individual powerful activists and conspiracy websites but also by state-owned media, supported by bot networks. Given the fact that Russia has a heavy interest in promoting its own Sputnik-V vaccine, both for economic and political reasons, the activity of RT in posting often negative information about #AstraZeneca may be perceived as part of a larger campaign, potentially aimed at discrediting the vaccine. Clearly, India and the EU also see the potential in public health online campaigning.
Our research highlights the following three points. First, professional vaccine misinformation more frequently relies on disproportionate reporting of negative news than on brute disinformation. Second, coordinated networks and bots are routinely used for vaccine communication and, without regulation or clear counteractions from social networks, a socially damaging misinformation arms race is unavoidable. Third, future research should focus on other vaccines as well as on other topics that are covered by the identified coordination networks. Our research is limited to one case on one social network site; thus, it is essential for future studies to have a wider scope. Nevertheless, it is already evident that new interventions, such as continuous monitoring of coordinated networks, are required to detect and reduce misinformation in the public health discourse.