Research Article

Technology acceptance and transparency demands for toxic language classification – interviews with moderators of public online discussion fora

Received 22 Jul 2022, Accepted 10 Jan 2024, Published online: 18 Feb 2024

ABSTRACT

Many online discussion providers consider using algorithm-based moderation software to support their employees in moderating toxic communication. Such technology is also attractive for public discussion providers, including public administration and public service media. To ensure successful implementation, however, it is crucial that moderators can correctly understand and use the software according to context-specific workplace requirements. This exploratory case study sheds light on the technology acceptance of algorithm-based moderation software by moderators in German public administration and public service media. Specifically, we focus on the moderators’ user characteristics and workplace requirement perceptions as preconditions for technology acceptance. We combined twelve structured qualitative interviews with moderators with an enhanced cognitive walkthrough (ECW) of an algorithm-based moderation dashboard. Additionally, the interviews included stimuli of two different transparency mechanisms. Our findings suggest that transparency is one of the most requested characteristics of algorithm-based moderation software and that, when met, it is beneficial for the acceptance of automated content classification in these systems. However, the findings also suggest that different AI perceptions and levels of technology commitment among moderators corresponded with different transparency motives related to the moderation system. We assume that addressing those differing motives with different transparency mechanisms may positively affect technology acceptance.

1. Introduction

Online discussions are constantly threatened by toxic content, such as hate speech, discrediting and discriminatory comments, harassment, and threats of violence (Bormann et al., Citation2022; Masullo et al., Citation2021). While the level of toxicity appears to be increasing, platform providers also face legal obligations to respond to or delete such content (Riedl et al., Citation2021). Therefore, many platform providers consider using algorithm-based moderation software to support human moderators. Such software often employs Machine Learning (ML) algorithms for content classification (Gillespie, Citation2020; Gorwa et al., Citation2020; Jhaver, Birman, et al., Citation2019). These Artificial Intelligence (AI) systems can automatically flag, sort, and filter harmful content and thereby support moderators and community managers in their costly and exhausting work of manual moderation (Wojcieszak et al., Citation2021).

Algorithm-based moderation software is also attractive for online discussions organized by public administration and public service media. By the latter, we mean publicly funded media that are committed to democratic principles such as accessibility, pluralism and independence (Bardoel & Lowe, Citation2007). According to deliberation scholars, these institutions play complementary roles in communicating the will of citizens to the democratic system (Coleman & Blumler, Citation2009). Typical examples of online discussions hosted by such institutions include citizen participation processes, mini publics as well as (news) comment sections or participatory journalism projects. In these discussions, algorithm-based moderation is helpful for at least two reasons: First, due to their central importance for public will formation, these democratic institutions are subject to high normative requirements and democratic standards (Ramsey, Citation2013). The online discussions they host are expected to form an integral part of a public sphere that enables citizens to access relevant information, contribute their views, and thereby fulfill their right to democratic participation (Coleman & Blumler, Citation2009; Janssen & Kies, Citation2005; Ramsey, Citation2013). To ensure this, it is crucial that these institutions provide a non-violent and non-discriminatory environment for online discussions. Second, public organizations face high levels of toxicity because of growing distrust in politics and (public service) media in some parts of the population (Hanitzsch et al., Citation2018; Kirk & Schill, Citation2021). Especially within news media, the number of toxic comments has made various outlets shut down their comment sections for controversial topics or altogether (Green, Citation2018). Here, algorithm-based moderation could be a valuable instrument to maintain online discussions as a means of connecting with the audience.

However, the integration of AI-based moderation software in public administration and public service media is far from being a daily routine. The reasons for this primarily stem from the organization and resources of these public institutions, which generally have few human, financial, and technical resources, and limited digital know-how. This is especially true for public administration in Germany, which is regularly rated as mediocre regarding the level of digital readiness (European Commission, Citation2021). Additionally, moderation of online discussions itself is a controversial issue in these institutions because they are bound to high democratic standards of freedom of speech. In fact, some participants of online discussions perceive moderation as a nontransparent intervention or even as censorship (Mathew et al., Citation2020; Wright, Citation2006). Here, adding AI-based moderation software may create additional challenges because algorithms are commonly considered to be error-prone and opaque (Ananny & Crawford, Citation2018; Gillespie, Citation2018; Gorwa et al., Citation2020; Suzor et al., Citation2019).

To ensure successful integration in public institutions, moderation software must be designed in accordance with these unique workplace requirements of the public sector, acknowledging the socio-technical nature of such algorithm-based moderation processes. A crucial resource in this aligning process is the technology acceptance of moderators working in these public institutions. However, there is limited knowledge on moderators' actual perceptions of algorithm-based moderation software, the workplace requirements that the target group faces, and how these perceptions and requirements affect technology acceptance of algorithm-based moderation software.

The current study investigates relevant user characteristics and workplace requirements for the implementation of algorithm-based moderation software in public administration and public service media. Our research interest is threefold: Regarding relevant user characteristics, we investigate the perception of AI as well as prevalent levels of technology commitment by moderators of public online discussions (RQ1). We then analyze interactions between moderators and algorithm-based moderation systems within a realistic workplace scenario (RQ2). Thereby, we intend to explore how moderators understand and use algorithm-based moderation software for public online discussions and how this potentially affects technology acceptance. Finally, we ask how transparency mechanisms in AI-based moderation software are understood and what they are used for by the target population (RQ3). Thus, the current study intends to reveal nuanced insights into the demands of moderators and their sense-making of transparency mechanisms. Our data comes from a case study of an interdisciplinary research cooperation, in which software developers for citizen participation and social scientists develop and evaluate an algorithm-based moderation dashboard for public online discussions, which is specifically tailored to the target group of moderators from the field of public administration and public service media in Germany. Based on theories on technology acceptance and transparency, we conducted an “Enhanced cognitive walkthrough (ECW)” (Bligård & Osvalder, Citation2013) and qualitative interviews with twelve full- and part-time moderators from both institutions who also had different levels of work experience. We discuss our findings and integrate them into a theoretical scheme that can advance the research on technology acceptance of AI-assisted moderation systems and the role of transparency in this process. Additionally, our study adds the use cases of public administration and public service media to a growing corpus of research that aims at developing transparent and accountable machine learning from a socially situated, agent-centered perspective (Ehsan et al., Citation2021; Holstein et al., Citation2019; Park et al., Citation2021; Veale et al., Citation2018).

2. Moderation of toxic communication online

The following sections define the key concepts used in this study and summarize the state of research on the (algorithm-based) moderation of public online discussions. We then turn to the state of research on technology acceptance to understand which indicators explain the acceptance and successful use of a new technology at the workplace. We find that technology acceptance is influenced by users’ perception of job relevance as well as technology-related perceptions. As a last step, we summarize the literature on whether and how transparency is suitable to foster technology acceptance. These theoretical considerations form the basis of our exploratory empirical design.

2.1. Moderation of public online discussions

In this study, we focus on the moderation of undesired content, specifically toxic communication. In online discussions, toxicity can be defined as a “rude, disrespectful, or unreasonable comment that is likely to make other users leave a discussion” (Risch & Krestel, Citation2020, p. 2). This definition usually includes explicit forms of toxic language, like profanity, insults, or threats. It also expands upon the phenomenon of hate speech, in which group-related misanthropy can take more latent forms like sarcastic, discriminatory or discrediting remarks, and undemocratic appeals (Davidson et al., Citation2017; Risch & Krestel, Citation2020; Stoll et al., Citation2020; Waseem, Citation2016). Previous studies have confirmed that being exposed to toxic language makes readers unwilling to join a discussion (Springer et al., Citation2015). At the same time, it comes along with a variety of additional negative effects, such as triggering ongoing toxic commenting behavior (Cheng et al., Citation2014; Ziegele et al., Citation2018), amplifying polarized and stereotypical thinking (Hsueh et al., Citation2015) as well as increasing distress for readers and moderators (Riedl et al., Citation2020).

A useful approach to tackle the spread of toxicity is moderation. Moderation is an umbrella term to describe the management and governance of content on online platforms (Wright, Citation2006). In practice, moderation can be understood as a two-step process, which consists of (1) the selection of relevant content for moderation along specific criteria and (2) the choice of an appropriate moderative action with regard to a selected content (Heinbach & Wilms, Citation2022). The most common form of moderation is content moderation, which is usually understood as the identification and removal of deviant or inappropriate content, such as spam, harassment, or hate speech. However, there is a growing body of literature looking for a systematization of evolving moderation techniques from various fields of application: Caplan (Citation2018) distinguishes between artisanal, community-reliant and industrial content moderation, which describe various levels of organizational efforts to outsource moderation. Regarding the variety of moderative actions, Gillespie (Citation2022) points to the practice of reducing content visibility by excluding it from recommendation systems, thereby complementing the actions of structuring, like community downvoting, trigger warnings, or temporary holds (Goldman, Citation2021). Within the framework of interactive moderation, moderative actions heavily rely on acts of communication: Moderators actively and visibly participate in the discussions by, for example, requesting compliance with netiquette, answering users’ questions, or contributing additional information to the discussion (Stroud et al., Citation2015; Ziegele & Jost, Citation2020).

Moderation has been widely discussed as a beneficial, even necessary precondition to foster respectful debate online (Edwards, Citation2002; Grimmelmann, Citation2015; Wright, Citation2006; Wright & Street, Citation2007). According to many scholars, discussion without procedure bears the risk of degenerating into a “tyranny of the loudest shouter” (Blumler & Coleman, Citation2001, pp. 17–18). This is especially problematic in online settings that are used as forums for public consultation, such as crowdsourcing opinions and ideas for journalistic content creation or within political participatory processes. Moderation is considered a useful instrument to prevent discussants from attacking, intimidating, or drowning out others, and thereby to empower inclusive, respectful, and ultimately legitimate democratic discourse. At the same time, structuring, sanctioning, or even blocking participants is a sensitive intervention that may restrict individuals’ rights to voice their opinion. Drawing on data from the advocacy project OnlineCensorship.org, Myers West (Citation2018) shows that users experiencing content removal perceive it as a restriction of their capacity for self-expression and communication with others, which leads to frustration and anger and often prompts appeals against the takedown. The sensitivity of moderative actions is also underlined by moderators who state that, especially within the context of public service media, they have a mandate to reflect a wide range of viewpoints and therefore have difficulty justifying the blocking of users (Paasch-Colberg & Strippel, Citation2022). Instead, blocking is considered more of an “ultima ratio” intervention that requires careful, individual assessment and comes with an increased demand for justification towards both the moderated user and the community as a whole. In essence, this makes moderation a valuable, but also a highly resource-intensive task for the providers of online discussions.

2.2. Algorithm-based moderation systems and automated content classification

Moderation systems increasingly rely on automated content classification, especially during the first step of moderation, that is, the selection of content that potentially requires moderation. Gorwa et al. (Citation2020) define algorithmic content moderation “as systems that aim to identify, match, predict, or classify some piece of content (e.g. text, audio, image or video) on the basis of its exact properties or general features, leading to a decision and governance outcome (e.g. removal, geoblocking, account takedown)” (p. 3). The term algorithmic content moderation therefore refers to “hard moderation” (Gorwa et al., Citation2020, p. 3), meaning an automated removal of content, for example deleting a comment because it was classified as spam or hate speech by an algorithm. This definition, however, excludes moderation software that applies automated, algorithm-based classification and sorting as a first step of moderation while leaving the autonomy to choose an appropriate moderative action with human moderators or community managers. In this case, algorithm-based classification does not necessarily lead to removal but can be followed by a variety of actions, such as a temporary exclusion of users from the debate, public sanctioning, reduction of visibility, de-escalation, or even toleration (Gillespie, Citation2022; Goldman, Citation2021). In this paper, we investigate the acceptance of algorithmic classification of toxic language as a recommendation for moderation. We therefore use the term algorithm-based moderation, meaning moderation systems that are supported by algorithmic suggestion or classification before any moderative action (including interactive moderation).

Recent developments in the field of Artificial Intelligence have led to a significant improvement of many Natural Language Processing (NLP) tasks, including the automated classification of content. These developments have also reached providers of online discussion fora, community managers, and moderators of comment sections. It is hoped that AI-based moderation software can support the time-consuming, psychologically demanding, and costly manual identification of harmful, offensive, or toxic content (Pitsilis et al., Citation2018; Risch et al., Citation2019; Stoll et al., Citation2020; Waseem, Citation2016; Zampieri et al., Citation2019) and help moderators to manage the high volume of user content in comment sections. In practice, AI-based classification has already been implemented in moderation software such as SWAT, Conversar.io, or Coral (Beuting, Citation2021; Vox Media, Citation2020). However, the accurate classification of toxic comments, including latent constructs like discriminatory content or sarcasm, is still a challenging issue in ML research (Farha et al., Citation2022; Mandl et al., Citation2021; Risch et al., Citation2021), with high risks of misclassification (York & McSherry, Citation2019). There is also evidence that such algorithms are prone to specific biases. For example, Hede et al. (Citation2021) show that the AI implemented in the Jigsaw Perspective API systematically overestimates toxicity in comments in which identities such as Black, gay, Muslim, or feminist are mentioned. In scenarios where moderation decisions are exclusively bound to algorithmic decision-making, such biases may drastically increase the risk of systematic false removal of comments from online discussions. Therefore, additional human supervision of algorithm-based moderation is preferred especially by providers of public discussions such as governmental online fora or public news media outlets, where there is a high normative demand for free speech without censorship (Wright, Citation2006).
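For illustration, the following minimal sketch shows how such an algorithm-based moderation pipeline could flag potentially toxic comments for human review instead of removing them automatically. It is not the implementation discussed in this study; the model identifier and the flagging threshold are placeholders that would need to be replaced with a real (German) toxicity model and a context-specific cut-off.

```python
# Minimal sketch, not the system studied here: a pretrained text classifier scores
# comments, and anything above a threshold is flagged for a human moderator.
from transformers import pipeline

MODEL_NAME = "path/to/german-toxicity-model"  # placeholder; substitute a real model id
TOXICITY_THRESHOLD = 0.7                      # assumed cut-off for flagging

classifier = pipeline("text-classification", model=MODEL_NAME)

def flag_for_moderation(comments):
    """Return comments whose predicted toxicity exceeds the threshold."""
    flagged = []
    for comment in comments:
        prediction = classifier(comment)[0]   # e.g. {"label": "toxic", "score": 0.91}
        if prediction["label"] == "toxic" and prediction["score"] >= TOXICITY_THRESHOLD:
            flagged.append({"comment": comment, "score": prediction["score"]})
    return flagged

# The flagged comments would appear in a moderation dashboard as recommendations;
# moderators then decide whether to remove, respond to, or tolerate each comment.
```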

3. Fostering technology acceptance

The aim of this paper is to explore moderators’ demands on algorithm-based moderation software for public administration and public service media. We consider moderators’ viewpoint on this software to be a key resource in successful implementation, as they actively shape its usage (or misuse or ignorance respectively) within the work environment. To ensure successful integration of such software in the work environment, it is crucial that moderators approve and use the algorithm-based recommendations in the intended way.

3.1. Technology acceptance and its antecedents

A well-established construct to measure how and why people use new technologies is the technology acceptance model (Davis, Citation1985). According to this model, people's use of technological innovations depends on two beliefs: Perceived ease of use and perceived usefulness in the working context. The former refers to the anticipated effort a person must make to use the system, whereas the latter is defined as the perceived benefit that the use of this system will generate in the workplace (Marangunić & Granić, Citation2015). In numerous extensions of the model, researchers investigated antecedents of these beliefs to actively foster technology acceptance, including job relevance and user characteristics (Marangunić & Granić, Citation2015; Venkatesh, Citation2000). Regarding job relevance, a technology is perceived as relevant when it effectively supports its users in the tasks that they face at their workplace (Venkatesh & Davis, Citation2000). User characteristics include technology commitment and individual machine heuristics.

Technology commitment is a multidimensional concept that measures the individual willingness to use technology (Neyer et al., Citation2012). Its dimensions include, among others, technology competence perception and technology control conviction. Technology competence perception measures people’s perceived self-efficacy regarding the use of technology. Specifically, people are asked whether dealing with technical innovations is usually too much of a challenge for them or whether they fear causing damage by misusing new technology. Technology control conviction measures self-reliance in the appropriation process of new technologies. Here, people are asked whether they are confident learning a new technology by themselves. Machine heuristics are general, stereotypical perceptions of machines, which are invoked when interacting with or evaluating them (Sundar & Kim, Citation2019). These stereotypes can be either positive or negative. Stereotypical perceptions of machines may affect perceptions of algorithm-based systems (Molina et al., Citation2022), as they are used as “mental shortcuts” for evaluation. A similar assumption regarding perceptions of algorithm-based systems underlies the concepts of “Algorithmic Aversion” (Dietvorst et al., Citation2015) and “Algorithmic Appreciation” (Logg et al., Citation2019): While there is a general preference for algorithmic decision-making, especially when it comes to analytical tasks (M. K. Lee, Citation2018; Logg et al., Citation2019), Dietvorst et al. (Citation2015) have shown that humans tend to be more critical of incorrect decisions made by an algorithm than of human mistakes and quickly lose trust after a wrong algorithmic decision. For toxic comment moderation, Wang (Citation2021) showed that readers with positive machine heuristics tended to perceive news outlets as more credible and less biased when an AI moderated the discussions. Additionally, Molina et al. (Citation2022) found that this was also the case for people’s trust in AI classification of content that required moderation. In contrast, participants with a negative machine heuristic were more trusting of human selection of moderation-worthy content. The occurrence of positive or negative machine heuristics can, among others, be explained by other user characteristics: Sundar (Citation2020), for example, argues that machine heuristics are more likely to mediate people’s judgment of algorithm-based software when their familiarity with AI is low. Figure 1 summarizes the antecedents for moderators’ technology acceptance of algorithm-based moderation software according to previous research.

Figure 1. Scheme of antecedents of technology acceptance (own illustration).

3.2. Fostering technology acceptance from a socio-technical perspective

Previous research on technology acceptance underlines the importance of job relevance perception and prevalent user characteristics of the target population using software in their everyday work. A successful implementation of a technology is therefore mainly dependent on users’ perception of its fit to requirements they face in their work environment. Specifically, this refers to the question of how users actually understand and make use of the technology in solving work-related tasks. Technology acceptance research is therefore a good example of a socio-technical approach in technology adoption. Broadly speaking, this research perspective offers a framework to identify critical subsystems to be aligned for joint optimization in technology usage. Next to the technical subsystem, it highlights the significance of a personnel subsystem, the organizational subsystem, and the external environment subsystem (Bélanger et al., Citation2013; Yu et al., Citation2023). A particular emphasis is usually put on the personnel subsystem (Makarius et al., Citation2020). It includes information on users’ skills, self-perception, and worldviews (Yu et al., Citation2023). Regarding technology adoption, such user characteristics may shape individuals’ approach to work with and make use of technical functions. Accordingly, these individual predispositions also shape task-related interactions between users and technology, leading to a particular usage behavior (Niehaus & Wiesche, Citation2021). This perspective is also adopted by research labeled as agent-centered or using a socially situated approach (Ehsan et al., Citation2021; Holstein et al., Citation2019; Park et al., Citation2021; Veale et al., Citation2018).

In this paper, we adopt this approach for the use case of algorithm-based moderation of online discussions in public administration and public service media settings. To this end, we focus on technology acceptance from a moderator-centric perspective with a particular interest in user characteristics as well as workplace requirements, which determine job relevance perception. We further focus on human-machine interaction by analyzing individual sense-making and usage of algorithm-based moderation technology in a realistic workplace scenario. The number of studies on this particular use case as well as transferable work is rather limited, for two reasons: Research dealing with the adoption of AI-assisted technology in the workplace is usually situated in sales, HR, or product team divisions of large companies (Ehsan et al., Citation2021; Holstein et al., Citation2019; Park et al., Citation2021). This research, however, does not adequately depict the requirements that public sector employees face due to a) the sensitivity of the moderation task itself and b) the high legal and democratic standards in public administration and journalism. An example of a study with a closer scope is Jhaver, Birman, et al. (Citation2019), who focused on the needs of (volunteer) moderators of political online discussions on Reddit working with the algorithmic moderation system Automoderator. In the domain of public administration, Veale et al. (Citation2018) interviewed 27 public sector machine learning experts from 5 OECD countries. However, these studies (as well as the aforementioned) have in common that all interviewees either had ML backgrounds or extensive exposure to ML systems in the past. For the target population of moderators in public administration and public service media, however, the question of ML expertise is still an open one. In Germany, moderators of public online discussion fora are often recruited part-time to assist in moderation while continuing to work in attached departments or newsrooms. Accordingly, they may have specific knowledge of the topic discussed as well as of the organization that hosts the discussion. However, they are less likely to have exposure to or knowledge of automated content classification or AI.

To address this research gap of prevalent user characteristics of moderators in the field of public administration employees and public service media, their job relevance assessment, and their human-system interaction, we ask the following research questions. The first research question aims at exploratively examining user-specific characteristics, focusing especially on technology- and AI-related perceptions. To depict and describe the interactions between users and algorithm-based moderation systems, our second research question focuses on collecting impressions from our target population handling such software in a realistic work scenario:

RQ1: How do moderators from the field of public administration and public service media perceive AI and how do they assess their level of technology commitment?

RQ2: What observations can be made while using algorithm-based moderation software in the working context of public administration and public service media?

3.3. Fostering technology acceptance through transparency

Our study additionally intends to explore mechanisms that foster technology acceptance in our target population. Particularly, we assume that lacking transparency of algorithms could pose a challenge to public administration and public service media, because their actions and decisions are evaluated against the backdrop of high democratic standards. Therefore, we explore moderators’ perceptions of AI transparency. Research has defined transparency as the disclosure of information that is relevant for a decision-making process (Turilli & Floridi, Citation2009). Such information could include information on the decision outcome, the decision-making process, and the reasons for the decision (de Fine Licht & de Fine Licht, Citation2020). The term is connected to the concepts of explainability, interpretability, and accountability. Still, while these concepts describe desired aims, transparency is a necessary precondition to reach those goals (Suzor et al., Citation2019; Turilli & Floridi, Citation2009). In terms of algorithm-based moderation, transparency most commonly refers to opening the black box behind algorithmic decision-making (Floridi et al., Citation2018; Gillespie, Citation2018; Gorwa et al., Citation2020) by providing an explanation on why a comment was recommended for removal (Jhaver, Bruckman, et al., Citation2019). An extensive literature corpus has addressed transparency towards platform users (Jhaver, Bruckman, et al., Citation2019; Molina et al., Citation2022; Suzor et al., Citation2019). In contrast, less is known about transparency mechanisms targeted towards moderators (notable exception: Ehsan et al., Citation2021; Jhaver, Birman, et al., Citation2019).

For our study, we explore both moderators’ initial thoughts on transparency as well as their reaction to two specific transparency mechanisms, which are currently among the most frequently used mechanisms for algorithm-based moderation systems (Bunde, Citation2021). For the latter, we draw on a transparency operationalization from Molina et al. (Citation2022) and modify them for our specific use case. In their study on content moderation, they distinguished between two types of transparency mechanisms: The “transparency (only)” mechanism primarily aims to present the rationale behind a classification decision. In their experimental study (which used mock-up stimuli), participants of the transparency condition were provided with a list of words that the classification system used for classification. This is in line with approaches such as feature importance (Bhatt et al., Citation2020), which draws on highlighting the most predictive words for classification decisions. Thereby, decision makers can understand which comment features were most important for classification. Suzor et al. (Citation2019) argued that this kind of transparency mechanism could foster better understanding of the functioning of algorithmic prediction for moderation as well as allow for a greater scope of control and help foster accountability. In comparison, the so-called “interactive transparency” mechanism employs feedback to establish AI-human interaction. In their study, Molina et al. (Citation2022) operationalized this condition by complementing a list of words that were relevant for the classification with the possibility for human moderators to modify this list. Mainly, these changes were made by curating the list, for example, by recommending words for inclusion or exclusion. They argue that this interaction may foster engagement, a sense of agency, and perceived control. This is in line with research that underlines the benefits of interactivity in human and AI collaboration via agency (Diakopoulos & Koliska, Citation2017; Dietvorst et al., Citation2018; Sundar, Citation2020). We add to this interpretation by assuming that moderators may gain insights into the training data of algorithmic classification systems, since they directly participate in their enhancement. Therefore, feedback itself can be understood as a transparency mechanism, since it helps moderators to a) gain insight into the algorithm's training data as well as b) experience and comprehend the algorithm training procedure.
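To make the two mechanisms more concrete, the following sketch uses a simple linear model whose per-word weights can be read off directly, which is one (assumed) way to realize the word-list style of “transparency (only)”; the curated word lists at the end stand in for the feedback loop of “interactive transparency”. The training data, function names, and the omitted retraining step are purely illustrative, not the operationalization used by Molina et al. (Citation2022) or by the dashboard studied here.

```python
# Illustrative sketch of the two transparency mechanisms, using a linear model so
# that per-word contributions to a toxicity decision are directly interpretable.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Assumed toy training data: (comment, label) with 1 = toxic, 0 = non-toxic.
train_texts = ["du bist ein idiot", "danke für den hinweis", "halt die klappe", "guter beitrag"]
train_labels = [1, 0, 1, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(train_texts)
model = LogisticRegression().fit(X, train_labels)

def explain(comment, top_k=3):
    """'Transparency (only)': return the words that contributed most to the toxicity score."""
    vec = vectorizer.transform([comment])
    contributions = vec.toarray()[0] * model.coef_[0]
    terms = vectorizer.get_feature_names_out()
    top = np.argsort(contributions)[::-1][:top_k]
    return [(terms[i], float(contributions[i])) for i in top if contributions[i] > 0]

# 'Interactive transparency': moderators curate a word list that would be fed back
# into the next training round (collected here; retraining itself is omitted).
curated_additions, curated_removals = set(), set()

def suggest_word(word, include=True):
    (curated_additions if include else curated_removals).add(word)

print(explain("halt die klappe du idiot"))  # words most predictive of toxicity in this toy model
```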

We employ these two modified transparency mechanisms in our study. Specifically, we ask which of these two mechanisms employees from public administration and public service journalism favor and what the rationale behind this preference is. We argue that we may find results different from existing research due to very different workplace requirements in the public sector as well as the prevalent AI perceptions among the employees. We assume that user characteristics and specific work requirements may lead to an individual interpretation of the transparency mechanisms offered. This is in line with research on transparent and explainable AI support systems, which emphasizes a socially situated, agent-centered perspective (e.g., Doshi-Velez & Kim, Citation2017; Ehsan et al., Citation2021; Kou & Gui, Citation2020). Suzor et al. (Citation2019) similarly claim that – in order to develop meaningful AI transparency – it is essential “to pay attention to the capacity of different audiences to interpret the information disclosed (…)” (p. 1529). Accordingly, our third research question focuses on both user interpretation and approval of different transparency mechanisms:

RQ3: How do moderators from the field of public administration and public service media understand different transparency mechanisms, and which one do they prefer for their work?

4. Method

To answer our research questions, we conducted qualitative semi-structured interviews with moderators from the field of public administration and public service media. We combined in-depth interviews with enhanced cognitive walkthroughs, which were centered around testing an algorithm-based moderation system in a real-life setting.

4.1. Case description

We based our research on the use case of an algorithm-based moderation dashboard designed for the target population of moderators in German public administration and public service media. The moderation dashboard is developed as part of an interdisciplinary research cooperation of software-developers for citizen participation and social scientists in Germany. The overarching scholarly aim of the project is to train the algorithms for the moderation software and evaluate its suitability for the needs of moderators, in order to facilitate moderation and foster rational, reciprocal, and respectful public debates in the field of public administration and public service media. The moderation dashboard is integrated into an existing discussion platform developed by Liquid Democracy e.V., a non-profit organization specialized in digital citizen participation. The implementation of the modular prototype is under open-source license.

The current version of the moderation software includes a Machine Learning-based classifier (Footnote 1) that predicts the potential toxicity of (German) user comments and alerts the moderators using the moderation dashboard. When the study was conducted, the implemented classifier reached a macro-F1-score of 65% (Footnote 2). Thus, the classifier confronted participants with both correct judgments and misjudgments of the AI. The participants were not provided with the detailed performance scores of the ML classifier but were only informed that the moderation dashboard relied on an AI to evaluate comment toxicity.
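For readers less familiar with the metric: the macro-F1 score averages the per-class F1 values, so the toxic and non-toxic classes are weighted equally regardless of how rare toxic comments are. A minimal sketch with placeholder labels:

```python
# Minimal sketch of how a macro-F1 score is computed on a held-out test set;
# the gold labels and predictions below are placeholders, not study data.
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = toxic, 0 = non-toxic (hypothetical gold labels)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # hypothetical classifier output

# average="macro" computes F1 per class and takes the unweighted mean.
macro_f1 = f1_score(y_true, y_pred, average="macro")
print(f"macro-F1: {macro_f1:.2f}")
```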

4.2. Enhanced cognitive walkthrough and interview

Twelve moderators from public administration and public service media were invited to participate in a simulation, a so-called “Enhanced cognitive walkthrough (ECW)” (Bligård & Osvalder, Citation2013). This is a variation of the classic cognitive walkthrough (Lewis & Wharton, Citation1997), which is suitable for usability testing and evaluation. The test consists of a goal-oriented, realistic simulation in which participants are asked to explore a software interface to solve work-related tasks. At the same time, participants are asked to verbalize their impressions (think-aloud method, Fonteyn et al., Citation1993). The tasks were specifically created to fit the practical scenario of moderating an online discussion. They involved the identification, review, and moderation of toxic comments posted to a political online discussion. While the original walkthrough method only focuses on operation-related tasks, the enhanced cognitive walkthrough incorporates identification-related tasks. Thus, it not only illustrates whether participants can execute an operation in the intended way; it also hints at whether participants know that a function is available and whether the interface provides hints that the function exists (Bligård & Osvalder, Citation2013). Each simulation was accompanied by a participatory observation, in which one member of the research team evaluated whether the tasks had been successfully completed (see appendix D). In case of failure, the ECW additionally documents problem severity.

Our simulation was carried out on an open-source discussion platform by Liquid Democracy e.V. (https://github.com/liqd). As part of the scenario, a real-life political discussion on the topic of equality between men and women was reproduced on the platform. Within this simulation, participants were asked to moderate the public discussion. They were supported by the algorithm-based moderation software which recommended potentially toxic comments for moderation. Our simulation scenario offered examples of correct AI judgments as well as misjudgments. Moderators could either remove a comment, respond to the comment's author publicly, or reject the AI recommendation (see appendix B).

Each ECW was immediately followed by a semi-structured qualitative interview. In a one-on-one scenario, participants were asked to reflect on their experiences with the moderation dashboard and algorithm-based moderation. The interview served to further elaborate on open or interesting points from the ECW and learn more about the perception and acceptance of AI. The interview was structured along a theory-based guideline that included open-ended questions adapted from the theories on technology acceptance and transparency described in previous sections of this study. Accordingly, it included questions addressing constructs such as machine heuristics (Sundar & Kim, Citation2019) and algorithmic aversion (Dietvorst et al., Citation2015), technology commitment (Neyer et al., Citation2012), technology acceptance (Venkatesh & Davis, Citation2000), and usability (Y. Lee & Kozar, Citation2012) (see appendix C). At the end of each interview, participants were exposed to two different mock-up stimuli of transparency mechanisms (see appendix B). Their design was inspired by the operationalization of “transparency (only)” and “interactive transparency” by Molina et al. (Citation2022). Participants were asked to identify and critically reflect on those transparency mechanisms and vote for the one they deemed more suitable for their workplace.

The innovative combination of the two methods provided us with a strong data basis to back our analysis. Firstly, we had a variety of user statements from the walkthrough and the interview, providing us with insights on both spontaneous, authentic reactions to the moderation dashboard as well as thoroughly considered, in-depth answers to our questions. Secondly, we were able to match the users’ performance and our observations with the interviews. This provided insights into patterns between actual performance in usage, underlying perceptions as well as opinions specifically on algorithm-based moderation systems.

4.3. Participants and interviews

Participant recruitment was organized via a pool of public institutions that had declared their interest in testing and evaluating a prototype of the moderation software. Members of the pool qualified for participation if they had experience in executing moderation tasks in online discussions in two key democratic institutions, namely public administration and public service media. The institutions were not part of the research consortium, were not involved in the development of the moderation system, and did not have prior knowledge of it. The call for participation was sent out via personal invitation. Our final sample consists of twelve individuals who were diverse in age, gender identity, occupation, and work experience. Three participants are student assistants for content moderation of public service online news outlets, three are trained moderators for online citizen participation processes, and four work in public administration or federal agencies that are experienced in hosting online discussions. Two participants work for online discussion providers specialized in participation processes in urban planning. Notably, while all participants had experience with the moderation of at least one online discussion, our sample included participants with different levels of experience (part-time student assistants vs. full-time professionals). Moreover, our sample is diverse regarding “full-time moderators” and “organizers with moderation experience”. While the former category comprises participants who were exclusively employed for moderation tasks, organizers acted as moderators but were also responsible for the execution and coordination of participation and user engagement projects. The participants were also diverse regarding their prior exposure to AI-assisted moderation software. Table 1 summarizes the information on participants’ professional background.

Table 1. Information on participants’ professional background regarding content moderation.

The interviews were conducted individually with the participant and one or two members of the research team using video conferencing software. In total, we conducted twelve interviews with an average duration of 80 minutes in August and September 2021. During the simulation, the participants were asked to share their screen. The simulations and interviews were recorded.

4.4. Analysis

In line with the aims of various case studies (Baxter & Jack, Citation2008), this study aims at the identification, description, and generation of assumptions regarding possible connections between the observed constructs. To do so, the analysis of the material was carried out in two steps. The first step revolved around the analysis of the participants’ statements during the walkthrough and the interview. For this purpose, all twelve recorded ECWs and interviews were anonymized and transcribed. The analysis followed the approach of evaluative qualitative content analysis (Kuckartz, Citation2012). The focus of this analysis method is on the case-related assessment, classification, and evaluation of content. First, the interview material was reviewed, and observations were inductively developed and summarized (inductive category formation). This allowed the reduction of the material to the essential content (Mayring, Citation2015). The second screening of the material was structured along these observations. The goal was to critically review the observations again and to substantiate or rebut them with relevant text passages from the interviews. For better documentability, the participants are abbreviated as TN, and each text passage is assigned an id. The second screening also allowed us to identify gradations of the identified constructs among participants and thus to develop hypotheses on possible correlations (Kuckartz, Citation2012). This form of analysis is a particularly effective method of exploring constructs as well as possible interrelationships.

In a second step, the observation material from the ECW was analyzed. To do so, we used the scores on successful completion of the subtasks: For every subtask regarding the identification, review, and moderation of toxic comments, participants were assigned a score from 1 (no successful completion) to 5 (successful completion) by the observing researcher (see appendix D). We created both subtask-related average scores (Footnote 3) and user-related average scores. For the analyses in this paper, we consider only the user-related average scores, which reflect an individual’s average performance across all subtasks and were used as a measure to externally assess participants’ technology commitment.
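For illustration, the following minimal sketch shows how the two aggregate scores could be computed from the 1–5 subtask ratings; the participant labels and ratings are invented, not study data.

```python
# Minimal sketch: subtask-related and user-related average scores from ECW ratings.
from statistics import mean

# scores[participant][subtask] = rating from 1 (no successful completion) to 5 (successful completion)
scores = {
    "TN1": {"identify": 4, "review": 3, "moderate": 5},
    "TN2": {"identify": 5, "review": 5, "moderate": 4},
}

user_avg = {tn: mean(subtasks.values()) for tn, subtasks in scores.items()}
subtask_avg = {task: mean(scores[tn][task] for tn in scores) for task in scores["TN1"]}

print(user_avg)     # user-related averages, used here to externally assess technology commitment
print(subtask_avg)  # subtask-related averages (not analyzed in this paper)
```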

The following results are mostly based on a joint analysis of the participants’ statements during the ECW and the subsequent interviews, and they are backed up by user-related average scores from the ECW. Appendix A provides an overview of all observations drawn from the material as well as selected participant statements. Note that statements were translated into English and slightly modified for better readability.

5. Results

5.1. General perceptions of AI

RQ1 first asked for general perceptions of AI in our target population. Contrary to our expectations, we found that prior exposure to AI was diverse in the field of public administration and journalism. Some of our participants already had contact with moderation software into which automated classification software was integrated [TN8, id114; TN9, id117; TN10, id23]. Two participants used algorithmic recommendation systems within other software for private use [TN3, id5; TN12, id134]. Two participants reported a strong affinity for programming [TN2, id3; TN6, id85]. The other participants reported low or no conscious prior exposure to AI. Interestingly, the AI perceptions of the participants were not necessarily linked to their prior experience with algorithm-based software. Our data suggests three response patterns: The first pattern primarily refers to AI as “Science Fiction.” Participants assigned to this pattern often stated that they drew their AI understanding from fictional literature or movies. Prevalent AI-related questions in this group were more of a pop-cultural nature, e.g., whether AI was comparable to humans, whether AI had feelings, and whether it would become more intelligent than humans in the future [TN1, id2; TN4, id7; TN10, id23; TN12, id26]. Additionally, only the participants assigned to this pattern expressed a general distrust of or even fear towards AI. Response pattern two summarizes answers which referred to AI most commonly as “algorithms.” Participants stated that they expected to get in touch with these algorithms when using the Internet [e.g. TN8, id18] and had ambivalent opinions about them [TN11, id24]. Participants in this response pattern similarly stated that they had limited expertise regarding the functionality of AI and perceived AI as opaque and nontransparent [TN5, id10; TN9, id22]. We additionally observed what one might call responsibility outsourcing: Participants assigned to the second response pattern stated that they would support AI as long as it is programmed by “smart and responsible scientists” [TN5, id10; TN8, id20] or regulated by the government [TN11, id25]. Half of the participants were assigned to this response pattern. A third response pattern referred to AI as “autonomous learning systems” [TN3, id5]. At the same time, the participants assigned to this pattern differentiated between various kinds of algorithms that can be subsumed under the “umbrella term” or “buzzword” AI [TN2, id3; TN7, id15]. Respondents in this group were very open-minded towards AI usage but referred to problems with biased training datasets as well as poor model performance of AI systems [TN2, id4; TN7, id15]. The three response patterns had in common that participants perceived their current AI exposure as passive. Every participant reported a vague feeling of being exposed to AI on a regular basis but had trouble pinpointing when and where this occurred and what the scope or function of this AI was. This aspect of passiveness in AI usage is particularly interesting, since concepts of AI perception like algorithmic aversion or appreciation have so far neglected this dimension of perception. One participant summed it up:

“I can’t think of anything off the top of my head that I am in contact with them [AI], but they are in contact with me.” [TN8, id18]

In contrast, all participants welcomed algorithm-based moderation software, regardless of their general attitude towards AI. The respondents considered AI particularly useful for “assembly-line work” [TN3, id44], meaning repetitive tasks of limited scope [TN10, id72], which are time-consuming and emotionally stressful [TN1, id39; TN5, id51; TN7, id59, id60], and which require fast responses and permanent supervision [TN1, id40]. In the case of moderation, this applies to the pre-selection of comments worthy of removal. From the point of view of public institutions, this is perceived as particularly attractive, because comment sections otherwise risk remaining closed due to limited resources for their supervision [TN11, id75]. Algorithm-based moderation systems help employees to “upscale their work beyond the limits of their capability” [TN5, id52], and even help those who are not professionally trained in moderation practices to conduct moderation tasks [TN5, id52]. This means that moderation tasks can be assigned to the relevant departments without having to coordinate them with trained staff in other departments, such as the press team [TN4, id46]. This greatly reduces the workload for both teams. At the same time, participants confirmed that the hurdles for public institutions in terms of moderating and deleting comments are very high [TN11, id74] and that discourse without censorship is highly valued [TN5, id10; TN10, id71; TN11, id75]. This is also mirrored in their rejection of algorithm-based decision-making: All participants made it clear that they – for various reasons – wished for algorithm-based recommendations for comment moderation while wanting to make the final decision themselves. Nevertheless, all participants wished for algorithm-based support of moderation in their working context, provided that it works reliably [TN3, id44; TN11, id76]. Interestingly, this was also true for participants of response pattern 1, who tended to be critical or even fearful of AI in general. Their answers made it clear that they actually differentiated between algorithms for moderation and artificial intelligence:

I think that there is also the one which is more apparent here, this algorithm-controlled [system], so that it is not so much about self-learning [aspects], but about the evaluation of large amounts of data and also language (…) And I think that’s actually very positive. [TN12, id27]

This is corroborated by the statements of the participants about the functionality of the software they used in the scenario. Most of them assumed that the algorithm worked like a word filter for undesired content rather than a self-learning system [observation 2], even though they had been informed that an AI was implemented within the moderation dashboard. Therefore, we could not observe a spillover effect from general AI-related attitudes onto algorithm-based moderation systems, since the latter were not recognized as such.

5.2. Technology commitment

Next to their general perception of AI, RQ1 also investigated prevalent levels of technology commitment in our target population. During the interviews, participants were asked whether and when they felt confident while working with new technology. In general, our participants reported rather high technology control conviction and technology competence perception. They stated that they regularly familiarize themselves with new systems with a feeling of joy and curiosity [TN2, id81; TN3, id82]. They were able to develop solutions to problems independently, for example by googling, using self-administered tutorials, or trial and error [TN6, id85; TN8, id87; TN9, id88]. The preferred approach to accessing new technology was to “play with it” [TN5, id84]:

“That depends on the feeling of whether I can break something or not. I didn’t have that feeling at all, I couldn’t actually break anything. And that’s why I was able to approach it with the best possible strategy to learn it quickly, which was to simply play. Click on everything, open everything, try it out. You always have the back button. You always have the undo button. And then I’m not afraid at all, and then I can try things out. And then I learn relatively quickly.” [TN5, id84]

As a further information source, we added participants’ self-assessment while using the moderation dashboard during the cognitive walkthrough as well as participants’ success scores, which were assigned by a participant observer. With a few exceptions, the self-assessment matched the external assessment. The highest observed values of technology commitment were found for members of response pattern 2, the lowest for members of response pattern 1.

In sum, we identified high levels of technology commitment as well as heterogeneous attitudes towards AI in the target group of public administration and public service journalism employees. Regarding AI-related perceptions, we identified three response patterns that turned out to be more differentiated than the current concept of algorithmic aversion. Next to positive or negative stereotypes, they included abstract vs. specific ideas of AI functionality as well as the perception of passiveness vs. activeness in AI usage. However, the general approval of algorithm-based moderation software was – at first glance – not related to general attitudes towards AI. It seemed that especially critical participants differentiated between a more abstract idea of AI and the algorithms presented. The latter were not understood as AI, but rather as a word filter for slurs.

5.3. Use of algorithm-based moderation software in working context

RQ2 aimed at systematizing the observations regarding public administration and public service media moderators’ use of algorithm-based moderation software in their working context. To answer this question, we especially draw on the parts of the interview in which participants directly referred to the ECW. Within this simulation, participants were asked to moderate a public discussion. They were supported by an algorithm-based moderation system, which recommended potentially toxic comments for moderation. With a macro-F1-score of about 65%, the classifications contained correct judgments as well as misjudgments. In general, the participants assessed the quality of the AI recommendations very differently. While some participants stated they were rather satisfied with the AI performance, others heavily criticized it [observation 6]. This might be due to different workplace environments or different personal thresholds regarding undesired content. When witnessing misjudgments, some participants stated that these would prompt them to ignore further AI recommendations [TN3, id110; TN8, id114; TN11, id76]:

“And in this respect, after I read three comments where I did not find the assessment correct, I no longer perceived the AI as a helpful evaluation system, so to speak, and then slipped back into the old moderation [pattern]. That I do everything myself and read everything myself and evaluate everything myself.” [TN3, id110]

This may be interpreted as a sign of algorithmic aversion, where seeing a system fail can lead to a drastic deterioration of trust in AI. At the same time, it reflects the importance of the (observed) performance of algorithms for users’ acceptance. Although intuitive, this underlines once again that working correctly and reliably may be one of the core characteristics that users consider when assessing an algorithm’s job relevance and ease of use. The variation in performance assessment, however, shows that performance perceptions of the very same algorithm differ among individuals.

The next observation refers to moderators’ decision-making on whether a comment should be removed or not. We found that almost no moderator relied on their personal impression or the AI’s recommendation alone. Instead, the AI’s recommendations were integrated into a system of several indicators serving as “checks and balances”. One of the most frequently mentioned aspects was whether other discussion participants had also flagged comments as worthy of moderation [TN4, id97; TN5, id98; TN8, id102; TN9, id103]. Additionally, moderators stated that they regularly ask for other moderators’ opinions or cross-check their decision with organizational guidelines [TN7, id101; TN9, id104]. This shows that moderators find it helpful to be guided by several indicators to reduce the risk of wrong removal. Therefore, it is not necessary to fully trust the AI but to be offered several additional indicators to cross-check its assessment. A limited number of participants even assessed the quality of the AI based on dashboard design features or their opinion of the website provider [observation 5]. For example, one participant noticed a spelling error in the dashboard. This strongly deteriorated his perception of the quality of the AI [TN7, id95]:

“For me it raised questions, why was the sentence so crooked? That was a signal for me - maybe it was a coincidence. But […] when I’m already a bit sceptical and then I see something like that, I ask myself if it [the AI] doesn’t get that right, how can it get other things right?” [TN7, id95]

Another participant stated that she would have more confidence in the AI’s decision-making if she had more insight into who the software provider was and how the product had been developed [TN12, id96]:

“(…) It would help me to trust (...) if I could understand how it (...) has been developed (…). I don’t know anything about it. So, I don’t know where the server [is hosted], hardware, software (…). Maybe you can watch a video where it’s explained how it was created or something.” [TN12, id96]

A third participant reflected on his attitudes towards the moderation system provider [TN6, id94]:

“The question is do I have to have trust [the AI]? Or do I have to have confidence in you that you’re providing me with a system that’s actually functional?” [TN6, id94]

Interestingly, this only occurred among participants who either had very abstract knowledge of AI or had some trouble using the dashboard during the simulation. One user actually stated that he had trouble identifying “where the AI was” [TN1, id92]. We therefore assume that people – when they doubt their own judgment of AI or struggle while using it – turn to other visible product characteristics they can assess, such as the dashboard design or the platform provider’s brand. The perception of those features then influences the perceived quality of the AI, even though it is not necessarily related to it. At the same time, these statements underline that algorithm-based moderation systems can indeed be regarded as socio-technical systems, because the AI cannot be separated from the larger social context in which it is applied.

In summary, we conclude that the performance of the ML model is important to moderators, but it is not the only requirement when it comes to practical day-to-day usage. Rather than only assessing the quality of AI recommendations, the participants rely on a variety of indicators, such as other moderators' judgment, their own impression, or even design features and the platform provider's brand, to assess whether they trust an algorithm-based moderation system. When additional factors for such cross-checking are provided, participants are more likely to feel confident using the AI and to accept its misjudgments.

5.4. Transparency of algorithm-based moderation

RQ3 deals with the importance of transparency for algorithm-based moderation in the workplace. Interestingly, all participants brought up the question of transparency on their own, either during the ECW or during the interview [observation 8]. This was especially true for members of response pattern 2, for whom acceptance of AI was inseparably linked to aspects of transparency. However, the participants used the term to describe different objectives they wanted to pursue. For the specific work context, we identified five motives behind the participants’ wish for transparency. The first motive refers to the transparency of the AI towards the moderator. Here, the term transparency means justification. Specifically, it means that the AI should explain which factors were decisive for its assessment in every individual case [TN3, id143; TN10, id156]. The participants suggested doing this by highlighting the most important terms within a comment [TN12, id159]. They wanted this in order to understand the AI, to cross-check whether its assessment matched their personal impression, and to reduce ambiguities [TN9, id103, id155]. One participant emphasized that understanding the AI helped him gain a feeling of safety [TN5, id146]:

“(…) But if I understand [the decision] and also see what it [the AI] can’t do and what its limits are, I feel safe and I gladly accept this help.” [TN5, id146]

Secondly, a particular idea of transparency was connected to decision support. The participants stated that, especially with long comments, they would be interested in being offered a shorter version to reduce their own workload [TN2, id140]. If the AI were transparent, here meaning that it could highlight the most important terms in a comment, the participants would read only those terms to decide whether a comment should be removed. In this way, transparency mechanisms were effectively identified as another route to a more nuanced pre-selection. Thirdly, the participants were interested in the more abstract logic of the AI, meaning the algorithm behind it. However, this was only the case for a small group of participants [TN2, id141; TN6, id147; TN7, id151]. Here, transparency was understood as allowing users to look behind the functionality of the AI, with the clear intention of controlling and improving it. The fourth motive behind transparency concerned communication with moderated users. The participants repeatedly stated that they wished to offer discussants an explanation of why their comments were removed [e.g., TN9, id154]. This form of communication was perceived as appreciative and as preventing potential further conflict [TN2, id142; TN5, id145]. During the interviews, it became apparent that most moderators valued this kind of engaged moderation as an important quality of public political discussions [e.g., TN3, id123; TN5, id146]. As a fifth motive, transparency should also enable communication with other community members, offering an opportunity to explain discussion rules to them [TN6, id149] and thereby prevent further moderation interventions.

In the last step, we presented two mock-ups of transparency mechanisms to the participants. In stimulus 1 (transparency), we added highlighting of the most predictive words within comments that the AI had classified as potentially toxic. In stimulus 2 (interactive transparency), participants were asked whether they agreed with the AI classification overall, in order to enable further AI training. We received mixed but overall positive reactions to our approach of adding further transparency mechanisms to the moderation system. Regarding the first mechanism, participants stated that it corresponded exactly to what they had imagined [TN1, id162; TN2, id163]. According to the participants, this mechanism was best suited to provide a justification for the AI recommendation and a feeling of security, and it offered another form of pre-selection [e.g., TN1, id172]. All participants appreciated this mechanism, mainly because it allowed them to decide faster [e.g., TN4, id175]. The only criticism of highlighting came from one of the participants more knowledgeable about AI, who assumed that highlighting words might convey the impression that the AI is a simple word filter and therefore could not adequately explain the complexity of its functionality. Another participant critically reflected on the fact that the highlighting function narrowed moderators’ view down to just a few words:

“My first reaction is that it is not that simple. I need to read it [the comment], understand the meaning and the context. But after two minutes I don’t feel like it anymore and I’m very grateful for this highlighting, because it makes my job easier. The second text is so long that I actually don’t want to read it. In a real-life situation, I would just block it right away without second thoughts.” [TN5, id176]

Other participants confirmed that this transparency mechanism is particularly attractive for moderators:

“The option with the yellow highlights is indeed a really good indicator for moderators to make a decision: do I reject here, do I block here, or how do I deal with it? From my moderator’s point of view, I would immediately say that the option with the yellow highlights would make more sense to me.” [TN11, id182]

Interactive, or feedback, transparency was also welcomed but received more criticism. One participant saw the danger of confusing the AI if different moderators with different thresholds for comment removal were to train it [TN8, id167]:

“What must be deleted immediately for some, is fully ok for others. I could imagine (…) that this is a bit confusing and (…) inconsistent for the AI, because the AI might think: Wait, yesterday I would have had to delete it, why is it okay today?” [TN8, id167]

Others stated that providing feedback to the AI adds another rather complex task to the workload of moderators, who are already short on time [TN2, id161; TN9, id168]. Four out of twelve participants did not recognize the stimulus as an opportunity to provide feedback to the AI. While this might also hint at poor stimulus construction, it was remarkable that this only happened with participants belonging to response pattern 1, who therefore had only a very abstract, pop-culture-influenced idea of AI. However, participants from response pattern 3 overwhelmingly welcomed this mechanism, as it gave them a chance to assert some control over the AI.

When asked which of the two mechanisms they would prefer working with if they had to choose one, five participants voted in favor of stimulus 1 (transparency) and four participants voted in favor of stimulus 2 (interactive transparency). Three participants explicitly wished for both mechanisms or considered the two interdependent. On the one hand, participants wished to give the AI more elaborate feedback on which words it got wrong [TN3, id174]. On the other hand, especially those who were dissatisfied with the AI’s performance pointed out that highlighting is only useful when a certain model performance can be guaranteed [TN11, id182]. Interestingly, most participants in favor of the feedback function (interactive transparency) were either in a leading position in their department or showed a high sense of (emotional) involvement. One participant even stated that he would differentiate according to whether he was just a staff member or in a leading position [TN6, id94]. These participants emphasized a personal sense of accountability for moderation tasks. In real-life settings, this emotional involvement may well vary greatly between moderators.

Reflecting on these outcomes, we conclude that transparency mechanisms that rely on highlighting the most predictive features in comments are a very practical, low-threshold approach to facilitate understanding of and working with algorithm-based moderation systems. They enable moderators to work faster and also provide information that moderators can use to communicate their decisions to moderated users and the community. Additionally, such highlighting aligns with the image of transparency mechanisms that many participants already had in mind. It is therefore easy to use for participants with different AI perceptions and different levels of technology commitment alike. However, it might also be misleading in terms of educating people about the actual functionality of AI, as highlighting may be an oversimplification. In contrast, interactive transparency functions are highly valued by moderators who either have some knowledge of or affinity towards AI or feel an urgent need to guarantee solid model performance. This was especially appealing for those participants of response pattern 3 who were curious to grasp the functionality behind the AI in order to control it. They even wished for an extension combining both transparency mechanisms. However, other participants with limited AI knowledge or lower levels of technology commitment failed to properly understand this function.
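The study does not specify how the highlighted “most predictive words” in stimulus 1 were computed, so the following is only an illustrative sketch of one common way such a highlighting function could be realized: occlusion-based attribution, in which each word is removed in turn and the resulting drop in the classifier’s toxicity score is treated as that word’s importance. The model name (standing in for the fine-tuned classifier described in Note 1), the function names, and the example comment are assumptions made for this illustration, not part of the tested dashboard.

# Illustrative sketch (Python, Hugging Face Transformers): occlusion-based word
# attribution for a binary toxicity classifier. "bert-base-german-cased" is a
# placeholder; in practice one would load the fine-tuned checkpoint.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "bert-base-german-cased"  # assumption: stands in for the fine-tuned classifier
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)
model.eval()

def toxicity_score(text: str) -> float:
    """Return the probability of the 'toxic' class (assumed to be label index 1)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

def highlight_words(comment: str, top_k: int = 3):
    """Rank words by how much removing them lowers the toxicity score."""
    words = comment.split()
    base = toxicity_score(comment)
    drops = []
    for i, word in enumerate(words):
        occluded = " ".join(words[:i] + words[i + 1:])
        drops.append((word, base - toxicity_score(occluded)))
    return sorted(drops, key=lambda pair: pair[1], reverse=True)[:top_k]

# The words with the largest score drop would be the candidates for highlighting.
print(highlight_words("Beispielkommentar mit einem potenziell beleidigenden Begriff"))

Gradient- or attention-based attribution (e.g., via a library such as Captum) would be an alternative; which method best matches moderators’ expectations remains an open design question.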

6. Discussion

The aim of this paper was to explore how user characteristics and workplace requirements of moderators from the field of public administration and public service media shape the technology acceptance of algorithm-based moderation software. This included 1) perceptions of AI and technology commitment as well as 2) interactions between moderators and algorithm-based moderation systems in the workplace. We also aimed to investigate how transparency mechanisms within the software are understood and used by moderators. To answer these research questions, we combined in-depth qualitative interviews with an enhanced cognitive walkthrough, based on data from an interdisciplinary German case study. Our use case was newly developed AI-assisted moderation software that was tested by twelve German-speaking moderators from the field of public administration and public service media.

Summing up our findings, we identified three patterns of AI perceptions in our sample. The first response pattern depicted AI as science fiction; participants assigned to this pattern had an abstract and pop-culture-influenced idea of AI. The second response pattern centered on the idea of AI as the algorithms that participants encountered during online usage; participants assigned to this pattern held positive as well as negative opinions on AI but expressed concern about the opacity of these algorithms. Participants assigned to the third response pattern defined AI as autonomous learning systems and were the most knowledgeable about AI. In their eyes, relevant AI challenges revolved around poor ML model performance as well as the danger of discrimination due to imbalanced training data. Accordingly, AI perceptions differed not only in terms of positive versus negative attitudes but also drew a more nuanced picture of specific versus abstract ideas of AI and of the challenges connected to these ideas. Unexpectedly, the different response patterns were not associated with prior exposure to AI. Additionally, we found differing, but overall high, self-reported levels of technology control conviction and technology competence conviction among the participants. Interestingly, these different AI perceptions did not seem to affect the acceptance of algorithm-based moderation software overall: all participants welcomed such software, as it was considered suitable for reducing individual workload, speeding up work on time-consuming and annoying tasks, and making moderation easily accessible to untrained staff members.

Ultimately, we found that AI transparency was greatly valued by employees of public administration and public service journalism. At the same time, the term transparency was connected to various objectives that participants wished to satisfy in order to perform work-related tasks. We found that a transparency mechanism that highlights the most predictive features in a comment is a useful, low-threshold, and comprehensive option for meeting the transparency motives of our target group: it was understood as a justification of the AI classification, was considered useful when making moderation decisions, and facilitated communication with moderated users and the community. This was true for all participants, regardless of their individual AI perception or technology commitment. At the same time, we see that such mechanisms could promote the misconception that ML-based text classification works like a word filter. Interactive transparency mechanisms for improving and providing feedback to the AI were welcomed by the more tech-savvy participants as well as by particularly involved participants (either emotionally or by job position). However, this option turned out to be time-consuming and difficult to understand for some.

Drawing on these results, we offer a preliminary theoretical scheme that extends the findings on technology acceptance from the literature review to the use case of moderators working with algorithm-based moderation software in public administration and public service media (Figure 2). In essence, we suggest integrating 1) moderators' specific perceptions of AI, 2) their transparency motives, and 3) the extent to which transparency mechanisms, as system characteristics, meet these transparency-related demands into the explanatory model of technology acceptance.

Figure 2. Exploratory concept of interrelation between AI perceptions, technology commitment, transparency motives and technology acceptance (own illustration).

In line with findings from the literature review, we identified user characteristics and job context requirements that the moderators perceived as crucial for successful technology implementation. Regarding user characteristics, moderators’ level of technology commitment was medium to high. However, rather than descriptions resembling machine heuristics, we identified three more nuanced perception patterns regarding AI. We also found that our sample could be differentiated into moderators and organizers, as well as by the level of emotional involvement participants showed with their moderation task. Regarding job context requirements, public administration and public service media moderators confirmed, in line with our assumptions from the literature review, that high normative standards of free speech without censorship as well as institutions’ limited resources for implementing and maintaining moderation are important to them.

In contrast to these assumptions, our data did not indicate a direct link between these user characteristics and job context requirements and participants’ technology acceptance. This was particularly true for the role of machine heuristics (Dietvorst et al., Citation2015; Sundar & Kim, Citation2019), which were expected to serve as a direct shortcut for evaluating the algorithm-based moderation dashboard. Even though our sample of moderators partly perceived AI as opaque and held negative attitudes towards it, none of the participants completely opposed using AI-based moderation software. Instead, technology-related user characteristics as well as perceptions of job context requirements were connected to differing transparency-related motives that the participants expressed during the interviews. The five transparency motives of justification, decision support, control, communication towards moderated users, and education of the community relate to previous findings on moderators’ attitudes towards moderating toxic comments (Myers West, Citation2018; Paasch-Colberg & Strippel, Citation2022). In particular, some moderators’ desire to communicate their decisions to the moderated users and to educate the community illustrates that moderation consists of much more than just identifying and blocking toxic comments. Rather, it is a social practice in which explanation and education are both intended by the moderators and expected by users (Myers West, Citation2018; Paasch-Colberg & Strippel, Citation2022). To the best of our knowledge, the identification of these differing transparency motives, based on different AI perceptions and technology commitment, is a genuine contribution of the current study. They demonstrate that different backgrounds can result in different transparency demands placed on AI-based moderation software.

Our findings also suggest that transparency mechanisms that meet these transparency motives may positively affect technology acceptance among our target population. Within our use case, the transparency-related motives illustrate and explain moderators' starting points in the sense-making process of working with both algorithm-based moderation software and transparency mechanisms. Specifically, the transparency-related motives helped us gain insights into the reasons why using transparency mechanisms can lead to higher technology acceptance. Accordingly, we assume that achieving a fit between individual transparency motives and the offered transparency mechanisms could maximize gains in technology acceptance. Again, this is in line with research that advocates applying a user-centric perspective in the development of interpretable algorithm-based systems. Discussed under “interpretability” or “social transparency” (Doshi-Velez & Kim, Citation2017; Ehsan et al., Citation2021), such research shows the necessity of considering users’ capabilities when offering explainable AI. In sum, the findings from our use case corroborate these claims by showing that individuals evaluate the usefulness of transparency mechanisms based on their own capabilities as well as the mechanisms’ fit to the job-related tasks they face in their work environment. Further, exploring the differing nuances of moderators’ understanding of and motives regarding transparency was valuable for identifying potential mechanisms behind differing levels of technology acceptance. It is, however, important to mention that our findings cannot tell whether this effect arises from an increase in a) ease of use, b) perceived usefulness in the job context, or c) both.

Ultimately, it may be helpful to illustrate the theoretical scheme and the patterns outlined in Figure 2 by presenting three examples from our case study that show how user characteristics, job context perceptions, transparency motives, preferences for differing transparency mechanisms, and technology acceptance interrelate. The three examples are depicted in Appendix E.

The first example illustrates a type of moderator who understood their role primarily as a moderator executing moderation rules (Moderator), had a more abstract idea of AI (AI perception response pattern 1 or 2), and was particularly aware of the limited resources within the working environment (Limited Resources). These moderators were looking for decision support within transparency mechanisms (transparency motive: decision support) and indicated that the highlighting function (transparency only) was the most suitable one, because they could focus only on the highlighted parts when evaluating a comment. They also indicated that integrating a highlighting function into the dashboard would make it more likely that they would use the dashboard, because it would help them speed up their moderation activities. The second example depicts a type of moderator who likewise understood their role primarily as a moderator executing moderation rules (Moderator), but who also showed signs of high emotional involvement (high emotional involvement: yes) and was particularly aware of normative standards of free speech within online discussions (normative standards of free speech within online discussion). This type also ended up preferring the highlighting function, but for completely different reasons: they appreciated it because it provided a justification of the AI decision for individual cases and because it offered cues they could use to communicate with the moderated user and the community. The third example depicts a type of moderator characterized by particularly high technology commitment as well as a perception of AI as self-learning systems (response pattern 3). Additionally, this type of moderator often took on the role of an organizer, meaning they had a wide range of tasks and responsibilities regarding the discussion/participation process besides moderation (Organizer). These moderators particularly asked for transparency mechanisms to control the inner rule-making of the AI and ended up preferring either the feedback function (interactive transparency) or a combination of feedback and highlighting. In fact, the implementation of such mechanisms was described as essential to the decision of whether to use algorithm-based moderation systems at all.

Taken together, we hope that these patterns help to illustrate how different AI perceptions and levels of technology commitment among moderators describe and explain differing transparency needs as well as differing levels of acceptance of the technology in general and of transparency mechanisms in particular. It is again important to emphasize that these examples are based on our qualitative material and may benefit from further quantitative testing in future studies.

7. Limitations

Our findings are constrained by several limitations. First, due to the exploratory nature and the complex multi-method design of this study, our findings are based on a small sample of twelve individuals and therefore cannot be regarded as representative of moderators in public administration or public service media. Given the small sample, the theoretical framework must be tested in further empirical studies in order to enable generalizable statements about the theoretical relationships it contains.

Second, our observations are closely connected to the use case and the state of development of the tested prototypical moderation system. The prototype was developed with a team of software developers who provide an open-source platform for citizen participation that is commonly used by public administration in Germany. While this certainly helped to provide a realistic platform environment for public administration employees, the prototype is limited by the performance of the implemented language classification algorithm, which might have affected participants’ perception of the algorithm. With a macro-F1-score of 0.65, there is still room for improvement. This low performance may have contributed to the impression of some participants that the AI might work like a word filter or blacklist. Considering the importance of system performance for trust-building in algorithms, it also cannot be ruled out that offering better classification algorithms may change the observed interrelationships between technology acceptance and transparency motives. While the interrelationship between performance perception and transparency demands goes beyond the scope of this study, we consider it a valuable addition to the presented framework and an interesting point of departure for further studies (e.g., Yin et al., Citation2019).

Third, it must be underlined that some of our participants did not fully understand the feedback stimulus. While this was only the case for less tech-savvy participants, it may also derive from the construction of our stimulus. Another source of confusion might be that a feedback function without any further information on the system’s internal representations is not the most intuitive function with regard to a common understanding of transparency. It should therefore be critically assessed whether the feedback function can be appropriately analyzed through the lens of the transparency literature. One indication that it might be useful to do so is that participants considered the function suitable for meeting their transparency-related motives.

Finally, it should be acknowledged that moderators’ technology acceptance is only one of several key resources required for a successful implementation within a work environment. While our study, situated in the context of socio-technical system theory, focused on the analysis of the personnel subsystem (namely moderators), the decision on the use of a particular technology usually lies with the head of the organization (representing the organizational subsystem), who may also consider aspects such as implementation costs, accessibility, or legal requirements. Future research may consider focusing on the interplay of the personnel subsystem and the organizational subsystem in shaping acceptance of AI-based moderation software.

8. Conclusion

Public administration and public service media are increasingly interested in using the potential of algorithm-based software to moderate their online discussions. Against this backdrop, the question of how to foster employees’ technology acceptance in order to enable successful integration into the working environment is gaining importance. However, the current state of research on algorithm-based (moderation) systems does not differentiate the specific requirements that moderators of such institutions face due to high democratic standards. This study addresses this research gap by exploring the technology acceptance of algorithm-based moderation software among employees of these institutions who are tasked with moderating discussions. Specifically, we investigate moderators’ user characteristics and workplace requirement perceptions as preconditions for technology acceptance as well as the role of transparency.

Our case study offers theoretical, practical, and methodological contributions to further research and application. Theoretically, we extend and specify previous models of technology acceptance by adding specific job requirements of public institutions, different patterns of moderators’ AI-related perceptions, and transparency motives, as well as their possible interrelationships. Within this model, we find that moderators’ transparency motives play a crucial role. We are able to identify five different transparency-related demands placed on algorithm-based moderation systems and relate them to different sets of AI-related attitudes and capabilities of the moderators. Beyond the novelty of these transparency motives themselves, our findings suggest that addressing them through different transparency mechanisms could be an effective and innovative way to increase technology acceptance.

From a practical standpoint, our research offers valuable implications for designers of XAI as well as of algorithm-based moderation software for public administration and public service media. These include the finding that highlighting the most predictive features is a suitable default transparency mechanism for the target group, mainly because of its easy accessibility and the institutions’ limited resources. We suggest offering combined mechanisms of highlighting and feedback to smaller organizers who regard themselves as particularly tech-savvy and are willing to “walk the extra mile” to train the AI.

Ultimately, the multi-method approach of an enhanced cognitive walkthrough combined with qualitative interviews offers methodological insights for researchers interested in investigating socio-technical systems in realistic workplace scenarios. The combination of scenario-based task solving, observation, and in-depth interviewing provides a solid data basis that allows extensive insights into users’ sense-making of a software tool. We hope that the approach will prove to be a valuable addition to the methodological toolkit of socio-technically oriented research.


Acknowledgments

This research was conducted in the project “AI-supported collective-social moderation of online discussions” (KOSMO), supported by the Federal Ministry of Education and Research under Grant 01IS19040C to Marc Ziegele. We would like to thank the participants whose substantial feedback and willingness to participate made this study possible, as well as Liquid Democracy e.V. and Institut für partizipatives Gestalten GmbH for their support in conducting the interviews. We would like to thank the anonymous reviewers for their valuable advice.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplemental data

Supplemental data for this article can be accessed online at https://doi.org/10.1080/07370024.2024.2307610.

Additional information

Funding

The work was supported by the Bundesministerium für Bildung und Forschung [01IS19040C].

Notes on contributors

Lena Katharina Wilms

Lena Katharina Wilms is a doctoral researcher at the Department of Social Sciences at Heinrich Heine University Duesseldorf. She develops and evaluates design and software features to improve online participation processes using AI-assisted moderation. She is particularly interested in designing inclusive and transparent democratic online environments.

Katharina Gerl

Katharina Gerl is a postdoctoral researcher at the Düsseldorf Institute for Internet and Democracy (DIID). Her research focuses on the implications of digital technologies for political institutions, political communication and participation. She conducted several studies evaluating the usage of online-based tools by political organizations at Heinrich Heine University.

Anke Stoll

Anke Stoll is a research associate at the Institute of Media and Communication Science at the Ilmenau University of Technology, Germany. Her research focusses on fairness and transparency of AI systems in democratic processes and includes the development and evaluation of machine learning-based methods for online discussions, software for moderation systems, and participation platforms.

Marc Ziegele

Marc Ziegele is an Assistant Professor for Political Online Communication at the Department of Social Sciences at Heinrich Heine University Düsseldorf, Germany. In his research, he studies online incivility, deliberation, journalistic moderation, and media trust in the digital age.

Notes

1 To classify the comments, we fine-tuned a pre-trained BERT model (Bidirectional Encoder Representations from Transformers; Devlin et al., Citation2018) provided by deepset.ai (https://www.deepset.ai/, accessed 21.09.2022) via the Hugging Face library (https://huggingface.co/bert-base-german-cased, accessed 21.09.2022; package: Transformers, version 4.22.1; Wolf, Citation2020).

2 The macro-F1-score is the harmonic mean of the arithmetic means of precision and recall for each class (here the two classes of the binary category toxic language: toxic [1] vs. not toxic [0]). It provides an established and informative indicator for overall model performance of a classifier (Powers, Citation2011).

3 Subtask-related scores were assigned for accessing the moderation dashboard, evaluating the toxicity of comments, executing the moderation action, and controlling the results. These scores helped us to identify functions within the dashboard that hindered participants from using the platform in the intended way. The results served the purpose of usability testing and were forwarded to the software engineering team of the research collaboration to improve the software.
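As a purely illustrative complement to Note 1, the following minimal sketch shows what such a fine-tuning setup can look like with the Hugging Face Transformers Trainer API. The tiny in-memory dataset, label assignments, and hyperparameters are placeholders chosen for this example and do not reflect the training data or configuration actually used in the study.

# Minimal sketch (Python): fine-tuning bert-base-german-cased for binary
# toxic/not-toxic classification with the Hugging Face Trainer API.
# Dataset and hyperparameters are placeholders, not the study's configuration.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "bert-base-german-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

# Placeholder comments with labels: 1 = toxic, 0 = not toxic.
train_data = Dataset.from_dict({
    "text": ["Du bist ein Idiot!", "Danke für den sachlichen Beitrag."],
    "label": [1, 0],
})

def tokenize(batch):
    # Pad/truncate so the default data collator can batch the examples.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="toxicity-model",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=2e-5,
)

trainer = Trainer(model=model, args=args, train_dataset=train_data)
trainer.train()
trainer.save_model("toxicity-model")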
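Spelled out, the macro-F1 variant described in Note 2 first averages precision (P) and recall (R) over the two classes and then takes the harmonic mean of these two averages; the notation below is introduced only for this illustration.

P_{\text{macro}} = \tfrac{1}{2}\,(P_{\text{toxic}} + P_{\text{not toxic}}), \qquad R_{\text{macro}} = \tfrac{1}{2}\,(R_{\text{toxic}} + R_{\text{not toxic}}), \qquad \text{macro-}F_1 = \frac{2\,P_{\text{macro}}\,R_{\text{macro}}}{P_{\text{macro}} + R_{\text{macro}}}

For example, with purely hypothetical values of P_macro = 0.70 and R_macro = 0.61 (chosen only to show the arithmetic), this yields macro-F1 = (2 × 0.70 × 0.61) / (0.70 + 0.61) ≈ 0.65, the order of magnitude reported for the prototype.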

References

  • Ananny, M., & Crawford, K. (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society, 20(3), 973–989. https://doi.org/10.1177/1461444816676645
  • Bardoel, J., & Lowe, G. F. (2007). From public service broadcasting to public service media. The core challenge. In G. F. Lowe & J. Bardoel (Eds.), From public service broadcasting to public service media: RIPE@ 2007 (pp. 9–26). Nordicom, University of Gothenburg.
  • Baxter, P., & Jack, S. (2008). Qualitative case study methodology: Study design and implementation for novice researchers. The Qualitative Report, 13(4), 544–559.
  • Bélanger, F., Watson-Manheim, M. B., & Swan, B. R. (2013). A multi-level socio-technical systems telecommuting framework. Behaviour & Information Technology, 32(12), 1257–1279. https://doi.org/10.1080/0144929X.2012.705894
  • Beuting, S. (2021). Wenn Künstliche Intelligenz das Forum moderiert [When Artificial Intelligence moderates the forum]. Deutschlandfunk. (Retrieved January , 2022) https://www.deutschlandfunk.de/sehr-wahrscheinlich-hass-wenn-kuenstliche-intelligenz-das.2907.de.html?dram:article_id=502849
  • Bhatt, U., Xiang, A., Sharma, S., Weller, A., Taly, A., Jia, Y., Ghosh, J., Puri, R., Moura, J. M. F., & Eckersley, P. (2020, January). Explainable machine learning in deployment. In Proceedings of the 2020 conference on fairness, accountability, and transparency (pp. 648–657). https://doi.org/10.1145/3351095.3375624
  • Bligård, L. O., & Osvalder, A. L. (2013). Enhanced cognitive walkthrough: Development of the cognitive walkthrough method to better predict, identify, and present usability problems. Advances in Human-Computer Interaction, 2013, 1–17. https://doi.org/10.1155/2013/931698
  • Blumler, J. G., & Coleman, S. (2001). Realising democracy online: A civic commons in cyberspace (2). IPPR/Citizens Online Research Publication. https://dlc.dlib.indiana.edu/dlc/bitstream/handle/10535/3240/blumler.pdf?sequence=1
  • Bormann, M., Tranow, U., Vowe, G., & Ziegele, M. (2022). Incivility as a violation of communication norms—A typology based on normative expectations toward political communication. Communication Theory, 32(3), 332–362. https://doi.org/10.1093/ct/qtab018
  • Bunde, E. (2021, January). AI-Assisted and explainable hate speech detection for social media moderators–A design science approach. In Proceedings of the 54th Hawaii International Conference on System Sciences, Honolulu, Hawaii (pp. 1264–1273).
  • Caplan, R. (2018). Content or Context Moderation? Artisanal, Community-Reliant, and Industrial Approaches. Data & Society Research Institute. (Retrieved October , 2022). https://datasociety.net/wpcontent/uploads/2018/11/DS_Content_or_Context_Moderation.pdf
  • Cheng, J., Danescu-Niculescu-Mizil, C., & Leskovec, J. (2014). How community feedback shapes user behavior. In Proceedings of Eighth International AAAI Conference on Weblogs and Social Media (pp. 41–50). https://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8066/8104
  • Coleman, S., & Blumler, J. G. (2009). The Internet and democratic citizenship: Theory, practice and policy. Cambridge University Press.
  • Davidson, T., Warmsley, D., Macy, M., & Weber, I. (2017). Automated hate speech detection and the problem of offensive language. Proceedings of the International AAAI Conference on Web & Social Media, 11(1), 512–515. https://doi.org/10.1609/icwsm.v11i1.14955
  • Davis, F. D. (1985). A technology acceptance model for empirically testing new end-user information systems: Theory and results [Doctoral dissertation, Massachusetts Institute of Technology]. DSpace@MIT. https://dspace.mit.edu/bitstream/handle/1721.1/15192/14927137-MIT.pdf
  • De Fine Licht, K., & de Fine Licht, J. (2020). Artificial intelligence, transparency, and public decision-making. AI & Society, 35(4), 917–926. https://doi.org/10.1007/s00146-020-00960-w
  • Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. https://doi.org/10.48550/arXiv.1810.04805
  • Diakopoulos, N., & Koliska, M. (2017). Algorithmic transparency in the news media. Digital Journalism, 5(7), 809–828. https://doi.org/10.1080/21670811.2016.1208053
  • Dietvorst, B. J., Simmons, J. P., & Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114–126. https://doi.org/10.1037/xge0000033
  • Dietvorst, B. J., Simmons, J. P., & Massey, C. (2018). Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them. Management Science, 64(3), 1155–1170. https://doi.org/10.1287/mnsc.2016.2643
  • Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608. https://doi.org/10.48550/arXiv.1702.08608
  • Edwards, A. R. (2002). The moderator as an emerging democratic intermediary: The role of the moderator in Internet discussions about public issues. Information Polity, 7(1), 3–20. https://doi.org/10.3233/IP-2002-0002
  • Ehsan, U., Liao, Q. V., Muller, M., Riedl, M. O., & Weisz, J. D. (2021, May). Expanding explainability: Towards social transparency in ai systems. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1–19). https://doi.org/10.1145/3411764.3445188
  • European Commission. (2021). The Digital Economy and Society Index (DESI). (Retrieved January , 2022). https://digital-strategy.ec.europa.eu/en/policies/desi
  • Farha, I. A., Oprea, S. V., Wilson, S., & Magdy, W. (2022, July). SemEval-2022 task 6: iSarcasmeval, intended sarcasm detection in English and Arabic. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), Stroudsburg, PA, USA (pp. 802–814). https://doi.org/10.18653/v1/2022.semeval-1.111
  • Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., Luetge, C., Madelin, R., Pagallo, U., Rossi, F., Schafer, B., Valcke, P., & Vayena, E. (2018). AI4People—an ethical framework for a good AI society: Opportunities, risks, principles, and recommendations. Minds and Machines, 28(4), 689–707. https://doi.org/10.1007/s11023-018-9482-5
  • Fonteyn, M. E., Kuipers, B., & Grobe, S. J. (1993). A description of think aloud method and protocol analysis. Qualitative Health Research, 3(4), 430–441. https://doi.org/10.1177/104973239300300403
  • Gillespie, T. (2018). Custodians of the Internet: Platforms, content moderation, and the hidden decisions that shape social media. Yale University Press.
  • Gillespie, T. (2020). Content moderation, AI, and the question of scale. Big Data & Society, 7(2), 1–5. https://doi.org/10.1177/2053951720943234
  • Gillespie, T. (2022). Do not recommend? Reduction as a form of content moderation. Social Media + Society, 8(3), 1–13. https://doi.org/10.1177/20563051221117552
  • Goldman, E. (2021). Content moderation remedies. Michigan Telecommunications and Technology Law Review, 28(1), 1–59. https://doi.org/10.36645/mtlr.28.1.content
  • Gorwa, R., Binns, R., & Katzenbach, C. (2020). Algorithmic content moderation: Technical and political challenges in the automation of platform governance. Big Data & Society, 7(1), 1–15. https://doi.org/10.1177/2053951719897945
  • Green, M. (2018). No comment! Why more news sites are dumping their comment sections. (Retrieved July , 2022) KQED. https://www.kqed.org/lowdown/29720/no-comment-why-a-growing-number-of-news-sites-are-dumping-their-comment-sections
  • Grimmelmann, J. (2015). The virtues of moderation. Yale Journal of Law and Technology, 17(1), 42–109.
  • Hanitzsch, T., Van Dalen, A., & Steindl, N. (2018). Caught in the nexus: A comparative and longitudinal analysis of public trust in the press. The International Journal of Press/politics, 23(1), 3–23. https://doi.org/10.1177/1940161217740695
  • Hede, A., Agarwal, O., Lu, L., Mutz, D. C., & Nenkova, A. (2021). From toxicity in online comments to incivility in American News: Proceed with caution. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Online, (pp. 2620–2630). Association for Computational Linguistics.
  • Heinbach, D., & Wilms, L. (2022). Der Einsatz von Moderation bei #meinfernsehen2021. In F. Gerlach & C. Eilders (Eds.), #meinfernsehen 2021 – Bürgerbeteiligung: Wahrnehmungen, Erwartungen und Vorschläge zur Zukunft öffentlich-rechtlicher Medienangebote (pp. 217–236). Nomos Verlagsgesellschaft mbH & Co. KG.
  • Holstein, K., Wortman Vaughan, J., Daumé III, H., Dudik, M., & Wallach, H. (2019). Improving fairness in machine learning systems: What do industry practitioners need? In Proceedings of the 2019 CHI conference on human factors in computing systems (pp. 1–16). https://doi.org/10.1145/3290605.3300830
  • Hsueh, M., Yogeeswaran, K., & Malinen, S. (2015). “Leave your comment below”: Can biased online comments influence our own prejudicial attitudes and behaviors? Human Communication Research, 41(4), 557–576. https://doi.org/10.1111/hcre.12059
  • Janssen, D., & Kies, R. (2005). Online forums and deliberative democracy. Acta Política, 40(3), 317–335. https://doi.org/10.1057/palgrave.ap.5500115
  • Jhaver, S., Birman, I., Gilbert, E., & Bruckman, A. (2019). Human-machine collaboration for content regulation: The case of reddit automoderator. ACM Transactions on Computer-Human Interaction (TOCHI), 26(5), 1–35. https://doi.org/10.1145/3338243
  • Jhaver, S., Bruckman, A., & Gilbert, E. (2019). Does transparency in moderation really matter? User behavior after content removal explanations on reddit. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW), 1–27. https://doi.org/10.1145/3359252
  • Kirk, R., & Schill, D. (2021). Sophisticated Hate Stratagems: Unpacking the Era of Distrust. American Behavioral Scientist, 68(1), 1–23. https://doi.org/10.1177/00027642211005002
  • Kou, Y., & Gui, X. (2020). Mediating community-AI interaction through situated explanation: The case of AI-Led moderation. Proceedings of the ACM on Human-Computer Interaction, 4(CSCW2), 1–27. https://doi.org/10.1145/3415173
  • Kuckartz, U. (2012). Qualitative inhaltsanalyse. Beltz Juventa.
  • Lee, M. K. (2018). Understanding perception of algorithmic decisions: Fairness, trust, and emotion in response to algorithmic management. Big Data & Society, 5(1), 1–16. https://doi.org/10.1177/2053951718756684
  • Lee, Y., & Kozar, K. A. (2012). Understanding of website usability: Specifying and measuring constructs and their relationships. Decision Support Systems, 52(2), 450–463. https://doi.org/10.1016/j.dss.2011.10.004
  • Lewis, C., & Wharton, C. (1997). Cognitive walkthrough. In M. Helander (Ed.), Handbook of human-computer interaction (pp. 717–732). Elsevier Science BV.
  • Logg, J. M., Minson, J. A., & Moore, D. A. (2019). Algorithm appreciation: People prefer algorithmic to human judgment. Organizational Behavior and Human Decision Processes, 151, 90–103. https://doi.org/10.1016/j.obhdp.2018.12.005
  • Makarius, E. E., Mukherjee, D., Fox, J. D., & Fox, A. K. (2020). Rising with the machines: A sociotechnical framework for bringing artificial intelligence into the organization. Journal of Business Research, 120, 262–273. https://doi.org/10.1016/j.jbusres.2020.07.045
  • Mandl, T., Modha, S., Shahi, G. K., Madhu, H., Satapara, S., Majumder, P., Schaefer, J., Ranasinghe, T., Zampieri, M., Nandini, D., & Jaiswal, A. K. (2021). Overview of the hasoc subtrack at fire 2021: Hate speech and offensive content identification in English and indo-aryan languages. In Forum for Information Retrieval Evaluation, December 13–17, 2021, India.
  • Marangunić, N., & Granić, A. (2015). Technology acceptance model: A literature review from 1986 to 2013. Universal Access in the Information Society, 14(1), 81–95. https://doi.org/10.1007/s10209-014-0348-1
  • Masullo, G. M., Tenenboim, O., & Lu, S. (2021). “Toxic atmosphere effect”: Uncivil online comments cue negative audience perceptions of news outlet credibility. Journalism, 24(1), 101–119. https://doi.org/10.1177/14648849211064001
  • Mathew, B., Illendula, A., Saha, P., Sarkar, S., Goyal, P., & Mukherjee, A. (2020). Hate begets hate: A temporal study of hate speech. Proceedings of the ACM on Human-Computer Interaction, 4(CSCW2), 1–24. https://doi.org/10.1145/3415163
  • Mayring, P. (2015). Qualitative content analysis: Theoretical background and procedures. In A. Bikner-Ahsbahs (Ed.), Approaches to qualitative research in mathematics education (pp. 365–380). Springer.
  • Molina, M. D., Sundar, S. S., & Lee, E.-J. (2022). When AI moderates online content: Effects of human collaboration and interactive transparency on user trust. Journal of Computer-Mediated Communication, 27(4), 1–12. https://doi.org/10.1093/jcmc/zmac010
  • Myers West, S. (2018). Censored, suspended, shadowbanned: User interpretations of content moderation on social media platforms. New Media & Society, 20(11), 4366–4383. https://doi.org/10.1177/1461444818773059
  • Neyer, F. J., Felber, J., & Gebhardt, C. (2012). Entwicklung und Validierung einer Kurzskala zur Erfassung von Technikbereitschaft. Diagnostica, 58(2), 87–99. https://doi.org/10.1026/0012-1924/a000067
  • Niehaus, F., & Wiesche, M. (2021). A socio-technical perspective on organizational interaction with AI: A literature review. ECIS 2021 Research Papers, 156. https://aisel.aisnet.org/ecis2021_rp/156
  • Paasch-Colberg, S., & Strippel, C. (2022). “The boundaries are blurry … ”: How comment moderators in Germany see and respond to hate comments. Journalism Studies, 23(2), 224–244. https://doi.org/10.1080/1461670X.2021.2017793
  • Park, H., Ahn, D., Hosanagar, K., & Lee, J. (2021). Human-AI interaction in human resource management: Understanding why employees resist algorithmic evaluation at workplaces and how to mitigate burdens. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1–15). https://doi.org/10.1145/3411764.3445304
  • Pitsilis, G. K., Ramampiaro, H., & Langseth, H. (2018). Detecting offensive language in tweets using deep learning. arXiv preprint arXiv:1801.04433.
  • Powers, D. M. W. (2011). Evaluation: From precision, recall and F-measure to ROC, informedness, Markedness and correlation. Journal of Machine Learning Technologies, 2(1), 37–63.
  • Ramsey, P. (2013). The search for a civic commons online: An assessment of existing BBC online policy. Media, Culture & Society, 35(7), 864–879. https://doi.org/10.1177/0163443713495079
  • Riedl, M. J., Masullo, G. M., & Whipple, K. N. (2020). The downsides of digital labor: Exploring the toll incivility takes on online comment moderators. Computers in Human Behavior, 107, 1–32. https://doi.org/10.1016/j.chb.2020.106262
  • Riedl, M. J., Naab, T. K., Masullo, G. M., Jost, P., & Ziegele, M. (2021). Who is responsible for interventions against problematic comments? Comparing user attitudes in Germany and the United States. Policy & Internet, 13(3), 433–451. https://doi.org/10.1002/poi3.257
  • Risch, J., & Krestel, R. (2020). Toxic comment detection in online discussions. In B. Agarwal, R. Nayak, N. Mittal, & S. Patnik (Eds.), Deep learning-based approaches for sentiment analysis (pp. 85–109). Springer.
  • Risch, J., Stoll, A., Wilms, L., & Wiegand, M. (2021). Overview of the GermEval 2021 shared task on the identification of toxic, engaging, and fact-claiming comments. In Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments, Düsseldorf, Germany (pp. 1–12).
  • Risch, J., Stoll, A., Ziegele, M., & Krestel, R. (2019). hpiDEDIS at GermEval 2019: Offensive language identification using a German BERT model. In Preliminary proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019), Erlangen, Germany (pp. 403–408).
  • Springer, N., Engelmann, I., & Pfaffinger, C. (2015). User comments: Motives and inhibitors to write and read. Information, Communication & Society, 18(7), 798–815. https://doi.org/10.1080/1369118X.2014.997268
  • Stoll, A., Ziegele, M., & Quiring, O. (2020). Detecting impoliteness and incivility in online discussions: Classification approaches for German user comments. Computational Communication Research, 2(1), 109–134. https://doi.org/10.5117/CCR2020.1.005.KATH
  • Stroud, N. J., Scacco, J. M., Muddiman, A., & Curry, A. L. (2015). Changing deliberative norms on news organizations’ Facebook sites. Journal of Computer-Mediated Communication, 20(2), 188–203. https://doi.org/10.1111/jcc4.12104
  • Sundar, S. S. (2020). Rise of machine agency: A framework for studying the psychology of human–AI interaction (HAII). Journal of Computer-Mediated Communication, 25(1), 74–88. https://doi.org/10.1093/jcmc/zmz026
  • Sundar, S. S., & Kim, J. (2019, May). Machine heuristic: When we trust computers more than humans with our personal information. In Proceedings of the 2019 CHI Conference on human factors in computing systems (pp. 1–9). https://doi.org/10.1145/3290605.3300768
  • Suzor, N. P., West, S. M., Quodling, A., & York, J. (2019). What do we mean when we talk about transparency? Toward meaningful transparency in commercial content moderation. International Journal of Communication, 13(18), 1526–1543.
  • Turilli, M., & Floridi, L. (2009). The ethics of information transparency. Ethics and Information Technology, 11(2), 105–112. https://doi.org/10.1007/s10676-009-9187-9
  • Veale, M., Van Kleek, M., & Binns, R. (2018, April). Fairness and accountability design needs for algorithmic support in high-stakes public sector decision-making. In Proceedings of the 2018 chi conference on human factors in computing systems (pp. 1–14). https://doi.org/10.1145/3173574.3174014
  • Venkatesh, V. (2000). Determinants of perceived ease of use: Integrating control, intrinsic motivation, and emotion into the technology acceptance model. Information Systems Research, 11(4), 342–365. https://doi.org/10.1287/isre.11.4.342.11872
  • Venkatesh, V., & Davis, F. D. (2000). A theoretical extension of the technology acceptance model: Four longitudinal field studies. Management Science, 46(2), 186–204. https://doi.org/10.1287/mnsc.46.2.186.11926
  • Vox Media. (2020). Your community is our priority: Coral features tools and experiences for commenters, moderators, community managers, and journalists alike. (Retrieved January , 2022). https://coralproject.net/tour/
  • Wang, S. (2021). Moderating uncivil user comments by humans or machines? The effects of moderation agent on perceptions of bias and credibility in news content. Digital Journalism, 9(1), 64–83. https://doi.org/10.1080/21670811.2020.1851279
  • Waseem, Z. (2016). Are you a racist or am i seeing things? Annotator influence on hate speech detection on twitter. In Proceedings of 2016 EMNLP Workshop on Natural Language Processing and Computational Social Science, Austin, TX, USA (pp. 138–142).
  • Wojcieszak, M., Thakur, A., Ferreira Gonçalves, J. F., Casas, A., Menchen-Trevino, E., & Boon, M. (2021). Can AI enhance people’s support for online moderation and their openness to dissimilar political views? Journal of Computer-Mediated Communication, 26(4), 223–243. https://doi.org/10.1093/jcmc/zmab006
  • Wolf, T. (2020). Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 38–45). Association for Computational Linguistics.
  • Wright, S. (2006). Government-run online discussion fora: Moderation, censorship and the shadow of Control1. The British Journal of Politics and International Relations, 8(4), 550–568. https://doi.org/10.1111/j.1467-856x.2006.00247.x
  • Wright, S., & Street, J. (2007). Democracy, deliberation and design: The case of online discussion forums. New Media & Society, 9(5), 849–869. https://doi.org/10.1177/1461444807081230
  • Yin, M., Wortman Vaughan, J., & Wallach, H. (2019, May). Understanding the effect of accuracy on trust in machine learning models. In Proceedings of the 2019 chi conference on human factors in computing systems, Glasgow, Scotland UK (pp. 1–12).
  • York, J., & McSherry, C. (2019). Content moderation is broken. Let us count the ways. Electronic Frontier Foundation Deeplinks. (Retrieved January , 2022). https://www.eff.org/deeplinks/2019/04/content-moderation-broken-let-us-count-ways
  • Yu, X., Xu, S., & Ashton, M. (2023). Antecedents and outcomes of artificial intelligence adoption and application in the workplace: The socio-technical system theory perspective. Information Technology & People, 36(1), 454–474. https://doi.org/10.1108/ITP-04-2021-0254
  • Zampieri, M., Malmasi, S., Nakov, P., Rosenthal, S., Farra, N., & Kumar, R. (2019, June). SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval). In Proceedings of the 13th International Workshop on Semantic Evaluation (pp. 75–86).
  • Ziegele, M., & Jost, P. B. (2020). Not funny? The effects of factual versus sarcastic journalistic responses to uncivil user comments. Communication Research, 47(6), 891–920. https://doi.org/10.1177/0093650216671854
  • Ziegele, M., Weber, M., Quiring, O., & Breiner, T. (2018). The dynamics of online news discussions: Effects of news articles and reader comments on users’ involvement, willingness to participate, and the civility of their contributions. Information, Communication & Society, 21(10), 1419–1435. https://doi.org/10.1080/1369118X.2017.1324505