Research Articles

Using an Artificial intelligence chatbot to critically review the scientific literature on the use of Artificial intelligence in Environmental Impact Assessment

Pages 189-199 | Received 17 Aug 2023, Accepted 13 Feb 2024, Published online: 05 Mar 2024

ABSTRACT

There is considerable uncertainty about the role that Artificial Intelligence (AI) might play in Environmental Impact Assessment (EIA), including in research. AI large language model (LLM) chatbots have the potential to increase the efficiency of EIA research, but their outputs raise concerns. This paper investigates the potential time savings achievable using LLM chatbots to undertake a critical review of literature focussing on the use of AI in EIA. Using a combination of ChatGPT and Elicit, literature was reviewed to identify 12 key issues associated with the use of AI in EIA, and this paper was prepared in three and a half days from initial conception. A protocol was developed to assist researchers in fact checking evidence delivered through Elicit (or other machine learning tools), which serves as a novel outcome of this research. Comments from three peer reviewers allowed more objective reflection on the credibility of the LLM chatbot-derived output, on the appropriateness of the time savings, and on the future research needed on the application of LLM chatbots in this context.

1. Introduction

Artificial Intelligence (AI) is a form of digital technology that can be applied in a variety of fields to add value to assessment exercises (Dupps Citation2023). One type of AI is a large language model (LLM), which is an algorithm capable of carrying out language processing tasks. But there is also considerable concern that the use of LLMs could ‘undermine academic integrity’ (Eke Citation2023, p. 100060). Dupps (Citation2023, p. 655) quotes J.K. Rowling in considering the role of LLMs in academic publishing: ‘Never trust anything that can think for itself if you can’t see where it keeps its brain – J.K. Rowling, Harry Potter and the Chamber of Secrets, 1998’. The underlying issue is that users do not know how an LLM works, which raises questions about the validity of its findings. These same concerns were raised by the reviewers of this paper, as explained in the conclusions.

Impact Assessment (IA) ‘is the process of identifying the future consequences of a current or proposed action’ (International Association for Impact Assessment Citationundated). In the field of IA, the increasing application of AI, specifically AI chatbots that answer questions posed to them, has led to the International Association for Impact Assessment (IAIA) beginning the task of developing best practice principles for the use of AI in IA. This will supplement the existing suite of best practice principles that inform IA practitioners (see https://www.iaia.org/best-practice.php). Academic literature is already beginning to refer to the potential for the use of AI in IA (whether good or bad) (see, for example, Bice and Fischer Citation2020; Bond and Dusík Citation2020; Curmally et al. Citation2022; Sandfort et al. Citation2024). Our research has focused on project-level environmental impact assessment (EIA) specifically. This can be justified by EIA being the only form of IA that is globally mandated (Morgan Citation2012; Glasson and Therivel Citation2019), and also by the fact that there are over a hundred different types of impact assessment (Vanclay Citation2015), which significantly complicates any analysis if not simplified. Therefore, a pertinent question is: what are the issues associated with the use of AI in EIA? This question provides the case example for our paper, in which we aim to demonstrate the potential for the use of AI (LLM chatbots specifically) in an academic context:

  1. to reduce the length of time taken to perform a critical evaluation of literature; and

  2. to synthesise current learning and engage in debate about the issues that researchers and practitioners need to take into account when considering the use of AI in EIA.

Whilst no novel outcomes are expected from the application of LLMs to this literature review, the research does aim to examine whether a credible literature review can be conducted as the launching point for further research, and whether this task can be completed more efficiently using such forms of AI. The next section sets out the approach we have used and, more importantly, explains how we have been transparent about the use of AI and the specific text and analysis that has been generated by AI, and how this is distinguished from the text and interpretation of the authors. Section 3 then details the ChatGPT-based analysis of the issues associated with the use of AI in EIA and includes a network diagram illustrating one level of issues with the use of AI in EIA and a second level of evidence for those issues, as identified by ChatGPT. Section 4 modifies the output from section 3 to resolve circular arguments and inconsistencies in the ChatGPT output. This section includes a summary figure which is the key output of this research. Section 5 provides our conclusions and suggestions for ways forward both in terms of the use of LLMs to help academic research efficiency related to EIA, and in terms of the issues inherent in the use of AI in EIA that need to be addressed moving forward. The section benefits from the comments of three reviewers of an initial manuscript that enabled the authors to be more objective about the value of the ChatGPT output.

2. Methods

A number of different AI tools are available to assist in this work. The most commonly known tool (Au Yeung et al. Citation2023) is OpenAI’s ChatGPT, a large language model (LLM) that generates text-based responses to queries typed into a web-based interface (called a chatbot) (Kim et al. Citation2023), with responses based on a database of knowledge available on the web up to a point in time. A number of alternatives to ChatGPT exist, including Google Bard, Microsoft Bing and Perplexity AI, amongst others (Krause Citation2023; Goto and Katanoda Citation2023). Our approach was not to test a broad range of AI tools, or to synthesise understanding of the issues associated with the use of AI in EIA across a range of AI tools. Instead, it was to develop understanding quickly and efficiently. This helps to demonstrate the ability of LLM chatbots to generate a timely analysis (which we would argue is important in a fast-developing field), and helps to ensure EIA researchers are aware of some of the key potential issues before they start to use AI. Thus, we focused on ChatGPT, and used the latest version (ChatGPT 4.0) at the time of writing, acknowledging that a significant limitation of this version was that its knowledge base extended only to September 2021 (this shortcoming has since been resolved), and that it was behind a paywall at the time of writing. In a fast-moving field, this meant that two years of knowledge was missing when the research was undertaken.

ChatGPT was accessed at https://openai.com/gpt-4 and the following question was asked: ‘What are the issues associated with the use of Artificial Intelligence in Environmental Impact Assessment’. For this query and all subsequent queries, the opportunity to regenerate, whereby ChatGPT has a further opportunity to answer the same question, was not used. After several issues had been identified (level 1 issues), a further round of queries was entered into ChatGPT to ask, ‘what is the evidence for [issue x] being an issue associated with the use of artificial intelligence in environmental impact assessment?’ This provided a series of level 2 evidence factors deemed by ChatGPT to justify the identification of the issues.
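For transparency about how such a two-level questioning process could be scripted (the queries in this research were entered manually through the ChatGPT web interface), a minimal sketch is given below. It assumes the openai Python client; the model identifier and the example issue names are illustrative assumptions, not part of the method used here.

```python
# Minimal sketch only: the authors used the ChatGPT web interface, not the API.
# Assumes the openai Python client (pip install openai) and that OPENAI_API_KEY
# is set in the environment; the model identifier is an assumption.
from openai import OpenAI

client = OpenAI()

LEVEL1_QUESTION = ("What are the issues associated with the use of Artificial "
                   "Intelligence in Environmental Impact Assessment")

def ask(question: str) -> str:
    """Send a single question and return the text of the first response."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model identifier
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

# Level 1: elicit the list of issues (parsing the numbered list is omitted here).
level1_text = ask(LEVEL1_QUESTION)
print(level1_text)

# Level 2: for each level 1 issue, ask for the supporting evidence.
for issue in ["data quality and availability", "transparency and interpretability"]:
    evidence = ask(f"What is the evidence for {issue} being an issue associated "
                   "with the use of artificial intelligence in environmental "
                   "impact assessment?")
    print(issue, "->", evidence[:200])
```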

The authors were aware of the ethical conundrum associated with the use of AI to develop text that is then submitted for peer review. Researchers are already using LLM chatbots extensively to help to write academic papers, and ChatGPT often now appears as an author (Stokel-Walker Citation2023). This practice is creating challenges for academic publishers who are responding in different ways, some embracing it as inevitable, others banning the practice (Sample Citation2023). For this paper, submitted to a Taylor & Francis journal, the policy was clarified in February 2023 and includes the statement ‘AI tools must not be listed as an author. Authors must, however, acknowledge all sources and contributors included in their work. Where AI tools are used, such use must be acknowledged and documented appropriately’ (Taylor & Francis Citation2023). The editor in chief of the journal Nature is attributed as saying that the ‘use of AI-generated text without proper citation could be considered plagiarism’ (Stokel-Walker Citation2023, p. 620). We therefore had to decide whether to use ChatGPT to generate results which we then entirely synthesised and interpreted, or to use the text generated verbatim; we chose the latter, again in the interests of transparency, and in the following section (and in supplementary data) any text in Box 2 (and evidence factors in Boxes 2.1–2.12 in supplementary data – see section 3) is a direct quote from the ChatGPT responses to our questions. This separates ChatGPT input from our own analysis – which is all the text outside Box 2 (and all sub-boxes 2.1–2.12). The one exception to this rule relates to citations to provide evidence of the credibility of claims made inside the boxes. Here again we chose to make use of AI to assist in identifying suitable evidence to further promote the potential for timeliness of AI-assisted research. ChatGPT currently produces fake references which cannot be used as evidence (Day Citation2023), so we used the AI machine learning tool Elicit (Ought Citation2023) for this purpose, with co-authors checking to ensure that the results did indeed provide credible evidence for the issue claim made. However, it quickly became apparent that the sources identified by Elicit were frequently unsuitable. Box 1 provides an example of such an inaccuracy.

Box 1. Issues with the use of AI to find evidence to support the credibility of AI issues in EIA using ‘Elicit’

Using ‘Elicit’ to identify scientific literature to support the identification of these issues was problematic. For example, taking the first identified issue and using the query ‘What is the evidence that over-reliance on artificial intelligence is an issue associated with the use of Artificial Intelligence in Environmental Impact Assessment?’ produced a range of references with the following text-based argument: ‘The papers suggest that over-reliance on artificial intelligence (AI) is an issue associated with the use of AI in environmental impact assessment’ (Ought, Citation2023). This was further justified with the text: ‘Pachot and Patissier (Citation2022) notes that while AI can be a valuable tool in the fight against environmental degradation, it is paradoxical that an energy-consuming technology is being used to address ecological challenges. Ligozat et al. (Citation2022) highlights the negative environmental impacts of AI for Green, including greenhouse gas emissions and energy consumption. The authors suggest that assessing the complete net environmental impacts of AI solutions for the environment is crucial. Cortés et al. (Citation2000) discusses the integration of AI techniques into Environmental Decision Support Systems (EDSS) and highlights the importance of quality information for effective environmental protection. Dostatni et al. (Citation2023) examines the use of AI for assessing the pro-environmental practices of companies and suggests that AI-based analysis can reveal significant differences in environmental practices between small, medium, and large enterprises’. However, it is clear on reading these references that none directly link EIA with AI, and none substantiate the claim, other than very indirectly at best.

Thus, to verify the validity of the issues identified by ChatGPT, a protocol was developed as follows:

  1. Refer to the supplementary data file and use the ChatGPT question designed to identify level 2 factors in Elicit. For example, type the question ‘what is the evidence for [level 1 issue x] being an issue associated with the use of artificial intelligence in environmental impact assessment?’ into the search box within Elicit.

  2. Use keywords in Elicit: ‘[AI issue]’ AND ‘artificial intelligence’ AND ‘environmental impact assessment’.

  3. Use the same key words in Scopus and/or Google Scholar.

  4. Adapted human intelligence (authors) search using any or all of Elicit, Scopus and Google Scholar, with additional search terms and/or strings that seem appropriate, recording which search terms and/or strings and which databases found the evidence.

To work efficiently, the human co-authors subdivided the issues between them, and each worked through the Protocol.
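Purely as an illustration of the escalating logic of the protocol (the searches in this research were run manually through the Elicit, Scopus and Google Scholar web interfaces), the four steps can be sketched as follows; search_elicit, search_scopus, search_scholar and is_credible are hypothetical placeholders for those manual searches and for human judgement.

```python
# Illustrative sketch only, not the authors' tooling: the four protocol steps as
# an escalating search over back-ends supplied by the caller. The back-end
# functions and the credibility check are hypothetical placeholders.
def verify_issue(issue, search_elicit, search_scopus, search_scholar, is_credible):
    chatgpt_question = (
        f"What is the evidence for {issue} being an issue associated with the "
        "use of artificial intelligence in environmental impact assessment?"
    )
    keywords = (f'"{issue}" AND "artificial intelligence" '
                'AND "environmental impact assessment"')
    steps = [
        ("1: ChatGPT question in Elicit", search_elicit, chatgpt_question),
        ("2: keywords in Elicit", search_elicit, keywords),
        ("3: keywords in Scopus / Google Scholar", search_scopus, keywords),
        # Step 4 uses author-adapted terms; represented here by a placeholder.
        ("4: adapted author search", search_scholar, keywords + " AND <adapted terms>"),
    ]
    for step_name, search, query in steps:
        for reference in search(query):
            if is_credible(issue, reference):
                # Record which step, query and database produced the evidence
                # (the information reported in Table 1).
                return {"issue": issue, "step": step_name,
                        "query": query, "reference": reference}
    return None  # no credible evidence found for this level 1 issue
```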

In order to synthesise the results from ChatGPT in a digestible format, we develop a network diagram linking all the issues identified. This diagram was drawn by the authors using Miro (https://miro.com/app/dashboard/) simply as a tool to link together the ChatGPT output. We then use our own knowledge and understanding to modify this network diagram, removing circular references (where the evidence provided that a level 2 factor is associated with the use of AI in EIA is simply a repeat of a level 1 issue, or where level 2 factors are repeated across more than one level 1 issue) and interpreting the linkages, to develop our own understanding of the issues associated with the use of AI in EIA; that is, we use the ChatGPT output as a starting point, but have to interpret it to address the inability of ChatGPT to recognise the circular references.

Also, as explained in the introduction, one of the limitations of using ChatGPT to undertake this kind of research is that the algorithms it uses are not known, nor are the databases. Therefore, the completeness of the results remains questionable. No systematic process was undertaken to verify the AI-generated results, as this would run counter to the objective of producing timely research in as short a time period as possible. Instead, adaptations were made to the AI-generated results through a process of snowballing, whereby evidence in the articles identified through the search protocol pointed to other issues, and through the authors’ own knowledge of AI issues identified from wider reading in other fields of enquiry.

Thus, we develop two network diagrams – one which is entirely generated from ChatGPT output and not interpreted in any way by the authors, and a second which addresses the limitations of the output apparent to the authors.
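For readers who prefer a programmatic representation, the two-level structure behind the first (unedited) diagram can be captured in a few lines. The sketch below assumes the networkx library and uses a small illustrative subset of the ChatGPT output; the actual diagrams were drawn manually in Miro.

```python
# Minimal sketch (assumes the networkx library); a small illustrative subset of
# the ChatGPT output is hard-coded. The real diagrams were drawn by hand in Miro.
import networkx as nx

# Level 1 issues and, for each, some of the level 2 evidence factors returned.
chatgpt_output = {
    "Data quality and availability": ["Scientific literature", "Case studies",
                                      "Temporal variability"],
    "Loss of expert intuition": ["Case studies", "Ethical and cultural sensitivity"],
    "Ethical considerations": ["Public perception and trust"],
}

raw = nx.Graph()
for issue, factors in chatgpt_output.items():
    raw.add_node(issue, level=1)
    for factor in factors:
        # Factors are kept distinct per issue at this stage, reproducing the
        # repetition and circularity visible in the unedited output (Figure 1).
        node = f"{issue} :: {factor}"
        raw.add_node(node, level=2)
        raw.add_edge(issue, node)

print(raw.number_of_nodes(), "nodes,", raw.number_of_edges(), "edges")
```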

Whilst one key aim of this paper is to provide valuable knowledge on the issues associated with the use of AI in EIA, we also set out to produce a timely analysis of the issues associated with the use of AI in EIA as efficiently as possible. However, a systematic evaluation of efficiency gains is problematic given that time sheets are not usually associated with academic research. That is, the time taken to write papers on particular topics is highly variable, and the quality of the product tends to vary considerably also. Instead, we set ourselves a target of writing and submitting a paper in the shortest possible time without compromising quality. A final element of the method for the paper was standard peer review. Whilst this process would be undertaken in any case, for this paper we have used it more reflectively to assist in the writing of the conclusions – to stimulate discussion about the use of AI in EIA. Thus, we highlight key issues raised by reviewers in the conclusions and give our own thoughts on them.

3. ChatGPT results

Box 2 provides the ChatGPT output for the initial question asked (see title of Box 2); this represents level 1 issues associated with the use of AI in EIA. Supplementary data is contained in Boxes 2.1-2.12 which provide the ChatGPT responses associated with a request to find evidence that the issue is credible; this represents level 2 factors associated with the issues identified with the use of AI in EIA.

The protocol outlined in section 2 above was followed in order to verify the level 1 issues identified by ChatGPT; the output of following this protocol is detailed in Table 1 (along with an indication of which step in the protocol was reached before a credible reference was found), and the citations that appear in Box 2 are the product of this process. These citations represent the only text in the boxes that is not taken verbatim from the ChatGPT output (ChatGPT 4.0 does not provide citations to support the text delivered, and when prompted only provided questionable citations). No attempt was made to also verify the level 2 factors as it was considered to be superfluous in terms of meeting the objectives of the research.

Table 1. The protocol outputs for each of the issues identified in Box 2 (note that the citations identified all appear in Box 2).

All the issues identified by ChatGPT have been summarised in a network diagram illustrated in Figure 1. This represents the unedited output from the use of AI and indicates a large number of circular references; that is, when asked for evidence that specific issues are associated with the use of AI in EIA, the output tends to repeat issues already identified.

Figure 1. AI issues in EIA as identified by ChatGPT, without author interpretation. The inner green circles represent level 1 issues, connection shown to all level 2 factors (yellow rounded boxes) with dashed lines.


Box 2. ‘What are the issues associated with the use of Artificial Intelligence in Environmental Impact Assessment’

Environmental Impact Assessment (EIA) involves evaluating the potential environmental impacts of a proposed project or development to aid decision-making processes. Integrating Artificial Intelligence (AI) into EIA processes can offer improved efficiency, predictability, and deeper insights. However, the use of AI in EIA also presents several challenges and issues:

  1. **Data Quality and Availability**: AI relies on large datasets for training and analysis. If the data used is incomplete, outdated, or biased, the AI’s predictions and insights could be flawed (Cortés et al. Citation2000). In the context of EIA, poor quality data can lead to incomplete assessments and misinformed decisions (Schibuola and Byer Citation1991; Wang et al. Citation2006; Nadeem and Hameed Citation2008).

  2. **Transparency and Interpretability**: Many advanced AI algorithms, especially deep learning models, can be seen as ‘black boxes’ (Wischmeyer Citation2020; de Fine Licht and de Fine Licht Citation2020). It can be challenging to understand how they derive specific results. In an EIA context, this can make it hard for stakeholders to trust or understand the AI’s conclusions (Curmally et al. Citation2022; Stahl and Leach Citation2023).

  3. **Over-reliance on AI**: There’s a risk that decision-makers might overly rely on AI’s conclusions without critical evaluation (Broussard Citation2018; Campolo and Crawford Citation2020; Khan and Nawaz Chaudhry Citation2023). EIA is a complex process that involves not only data but also local knowledge, socio-cultural considerations, and ethical judgments.

  4. **Loss of Expert Intuition**: Automated processes can potentially reduce the need for human expertise in specific tasks. Over time, this could erode the base of experienced professionals who bring crucial intuition and judgment to the EIA process (Spector and Shanshan Citation2019; Khan and Nawaz Chaudhry Citation2023).

  5. **Ethical Considerations**: AI can unintentionally introduce or perpetuate biases. For instance, if training data for the AI does not adequately represent all ecological or community contexts, the system might not account for them adequately in its assessments (McGovern et al. Citation2022).

  6. **Data Security and Privacy**: As with all digital solutions, AI systems are vulnerable to cyber threats. Data breaches can expose sensitive environmental and community information (Di Minin et al. Citation2021).

  7. **Economic and Social Impacts**: The integration of AI might lead to job losses in certain routine areas of EIA, leading to potential economic and social repercussions (Gurstein Citation1985; Aghion et al. Citation2018; Hagerty and Rubinov Citation2019; Bonab et al. Citation2021; Uren and Edwards Citation2023).

  8. **Regulatory and Legal Issues**: Existing EIA regulations and guidelines might not account for the introduction of AI tools. This can lead to legal ambiguities and challenges in integrating AI outputs into legally binding documents or decisions (Giuffrida Citation2019; Lozo and Onishchenko Citation2021; Makhkamov Citation2022; Pagallo et al. Citation2022).

  9. **Standardization and Validation**: AI models vary widely in their architectures and training processes. Ensuring standardized methods for AI integration in EIA and validating their effectiveness becomes crucial (Curmally et al. Citation2022; Kaur et al. Citation2022; McGovern et al. Citation2022; Schwartz et al. Citation2022; Um et al. Citation2022; Koyamparambath et al. Citation2022).

  10. **Environmental Misrepresentations**: A poorly trained AI might not recognize rare or uncommon environmental phenomena, leading to underestimations or misrepresentations of potential impacts (Cortés et al. Citation2000; Galaz et al. Citation2021; Alemohammad et al. Citation2023).

  11. **Temporal and Spatial Challenges**: AI models trained on past data might not always predict future environmental changes effectively, especially in the context of rapidly changing environmental conditions, such as those driven by climate change (Rizzoli and Young Citation1997; Cortés et al. Citation2000).

  12. **Scalability and Transferability**: AI models trained for specific contexts or regions might not be directly transferable to other scenarios or geographies without additional training or adjustment (Liu and Chih-Wei Citation2009; Francini et al. Citation2023; Mustak et al. Citation2023).

Given these challenges, it’s crucial to strike a balance between leveraging AI’s capabilities in EIA and ensuring that human judgment, expertise, and local context continue to play pivotal roles in the assessment process.

Note: this text is all verbatim from the ChatGPT results based on the query except for the citations – these were added by the authors after following the protocol outlined in section 2.

4. Modified ChatGPT results

A considerable amount of data was generated by ChatGPT, as summarised in Figure 1. Despite the level 1 issues being verified by evidence identified through literature review, it quickly became apparent that there were problems with the level 2 factors. These could be placed into one of three categories:

  1. Circular arguments – a level 1 issue is repeated as a level 2 factor.

  2. Repetition – the same level 2 factor is repeated under two or more level 1 issues.

  3. Bullshit (after Costello Citation2023; see Note 1) – issues or factors identified by ChatGPT are not issues or factors per se; instead they are statements of fact.

To achieve objective 2, the results summarised in Figure 1 need to be amended by the authors to remove circular arguments, repetition and bullshit. Boxes 3 to 5 explain how this was done, in turn, for each of these three categories, by providing some examples of the decision making of the author team. Box 3 provides two examples of how this task was undertaken for circular arguments, Box 4 provides two examples for repetition, and Box 5 provides two examples for bullshit.

Box 3. Dealing with circular arguments in ChatGPT findings – two examples of author interventions

The supplementary data file provides the level 2 factors identified by ChatGPT. Two examples of where some of these level 2 factors replicate level 1 issues (and are therefore circular) are:

  1. The level 1 issue ‘loss of expert intuition’ (see Box 2) has associated with it as a level 2 factor ‘ethical and cultural sensitivity’ (Box 2.4, supplementary data). However, there is already a level 1 issue ‘ethical considerations’ (see Box 2). In this case, the level 2 factor was deleted and the diagram amended to create a link between the level 1 issues ‘loss of expert intuition’ and ‘ethical considerations’. See Figure 2.

  2. The level 1 issue ‘data quality and availability’ (see Box 2) has associated with it as a level 2 factor ‘temporal variability’ (see Box 2.1, supplementary data). In this case, the level 2 factor was deleted and the diagram amended to create a link between the level 1 issues ‘data quality and availability’ and ‘temporal and spatial challenges’. See Figure 2.

Box 4. Dealing with repetition in ChatGPT findings – two examples of author interventions

The supplementary data file provides the level 2 factors identified by ChatGPT. Two examples of where the same level 2 factor is repeated under more than one level 1 issue are:

  1. The level 1 issue ‘over-reliance on AI’ (see Box 2) has associated with it as a level 2 factor ‘public perception and trust’ (see Box 2.3, supplementary data); the level 1 issue ‘ethical considerations’ (see Box 2) also has associated with it as a level 2 factor ‘public perception and trust’ (see Box 2.5, supplementary data). In this case, the surplus level 2 factor was deleted and an additional link added such that both ‘over-reliance on AI’ and ‘ethical considerations’ link to the single level 2 factor ‘public perception and trust’ (see Figure 2). (Note: there are other duplicates of this same level 2 factor that have been dealt with in the same way.)

  2. The level 1 issue ‘temporal and spatial challenges’ (see Box 2) has associated with it as a level 2 factor ‘interdisciplinary challenges’ (see Box 2.11, supplementary data); the level 1 issue ‘standardisation and validation’ (see Box 2) also has associated with it as a level 2 factor ‘interdisciplinary challenges’ (see Box 2.9, supplementary data). In this case, the surplus level 2 factor was deleted and an additional link added such that both ‘temporal and spatial challenges’ and ‘standardisation and validation’ link to the single level 2 factor ‘interdisciplinary challenges’ (see Figure 2).

Box 5. Dealing with bullshit in ChatGPT findings – two examples of author interventions

The supplementary data file provides the level 2 factors identified by ChatGPT. Two examples of where level 2 factors are not genuine factors but rather unsubstantiated statements of fact (i.e. bullshit) are:

  1. ‘Case studies’ is a level 2 factor associated with four separate level 1 issues with the following descriptions:

    1. Level 1 issue ‘data quality and availability’ with level 2 factor: **Case Studies**: Many AI projects that have failed or have not achieved desired performance can be traced back to issues with the data used for training or validation. For instance, a project aiming to predict habitat destruction based on satellite images might not perform well if the training data does not adequately capture the range of habitats in a region or if the data contains many inaccuracies (see supplementary data Box 2.1). The case studies are not specified, so the evidence base cannot be confirmed.

    2. Level 1 issue ‘loss of expert intuition’ with level 2 factor: ‘**Case Studies**: Specific case studies, especially those where AI-driven assessments were later revised or corrected following expert reviews, highlight the complementary nature of AI and human intuition. They serve as evidence that while AI can augment the EIA process, it can’t replace the nuanced understanding and judgment of human experts’ (see supplementary data Box 2.4). The case studies are not specified, so the evidence base cannot be confirmed.

    3. Level 1 issue ‘standardization and validation’ with level 2 factor: ‘**Case Studies**: Real-world examples where AI-driven EIA processes produced conflicting or inaccurate results highlight the challenges posed by a lack of standardization and validation’ (see supplementary data Box 2.9). Again, the case studies are not specified, so the evidence base cannot be confirmed.

    4. Level 1 issue ‘environmental misrepresentations’ with level 2 factor: ‘**Case Studies**: There have been instances where AI-driven assessments or simulations have provided misleading results due to factors like data issues, model limitations, or improper parameter settings’ (see supplementary data Box 2.10). Here a case study is not a specific factor – it is simply a statement of fact that AI-related issues lead to problems which can be identified through case studies.

  2. ‘Scientific literature’ is a level 2 factor associated with 10 separate level 1 issues with the following descriptions:

    1. Level 1 issue ‘data quality and availability’ with level 2 factor: ‘**Scientific Literature**: Research papers and publications often emphasize the importance of quality data. Numerous studies have showcased that models trained on limited or poor-quality data lead to misinterpretations or mispredictions’ (see supplementary data Box 2.1). Here the unspecified scientific literature simply points to an AI-related data quality issue rather than being specific evidence.

    2. Level 1 issue ‘transparency and interpretability’ with level 2 factor: ‘**Scientific Literature**: Many research papers and articles have pointed out the opacity of certain AI models. For example, research on using AI for species identification or habitat assessment has shown that while models can be accurate, understanding precisely why they make specific predictions can be challenging’ (see supplementary data Box 2.2). Here again the scientific literature is not specified and therefore the evidence cannot be verified.

    3. Level 1 issue ‘over-reliance on AI’ with level 2 factor: ‘**Scientific Literature**: Studies and articles discussing the implementation of AI in various disciplines often highlight the importance of human oversight. The consensus in many research papers is that while AI can augment decision-making processes, it shouldn’t entirely replace human judgment’ (see supplementary data Box 2.3). Here again the scientific literature is not specified and therefore the evidence cannot be verified.

    4. The same issue is repeated for seven other level 1 issues, all of which identify scientific literature incompletely as a factor (see Boxes 2.4, 2.5, 2.6, 2.8, 2.9, 2.10, 2.11).

Figure 2. AI issues and factors in EIA as identified by ChatGPT, after author interpretation. There are three concentric circles: 1) the inner green circles represent level 1 issues, connection shown to all level 2 factors (yellow rounded boxes) with lines; connection to other level 1 issues have replaced circular arguments to the same issue that had been listed in level 2; 2) the middle yellow rounded boxes are level 2 factors that are connected to more than one level 1 issue using solid lines; 3) the outer yellow rounded boxes are level 2 factors that connect to a single level 1 issue using dashed lines.


These three problem categories were each dealt with consistently, as detailed in Boxes 3 to 5 and summarised below:

  1. Circular arguments – a level 1 issue is repeated as a level 2 factor. In this case, the superfluous level 2 factor is deleted, with a connection instead made between the level 1 issue of the same name and the level 1 issue that hosted the repeated level 1 issue name as a level 2 factor. In Figure 2, these circular argument replacements are all illustrated with curved solid lines.

  2. Repetition – the same level 2 factor is repeated under two or more level 1 issues. In this case, all the repeated level 2 factors are aggregated into a single level 2 factor, which has multiple links to the level 1 issues that connect to it, illustrated by solid, straight lines.

  3. Bullshit – issues identified by ChatGPT are not issues or factors per se; instead they are statements of (often unsubstantiated) fact. These are simply deleted and do not appear in Figure 2.

These three problem categories would also be experienced through human researcher coding approaches, which progress on the basis of iteration – identifying circular arguments, overlaps, and clearly inappropriate codes.
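For completeness, the three cleaning rules can also be expressed programmatically. The sketch below continues the networkx representation given in section 2 and is illustrative only: the raw graph, the list of level 1 issues and the set of unsubstantiated (‘bullshit’) factor names are supplied by hand, and the actual editing for this paper was done manually in Miro.

```python
# Illustrative sketch of the three cleaning rules, continuing the earlier
# networkx example; the unsubstantiated factor names are supplied by hand.
import networkx as nx

def clean(raw: nx.Graph, level1_issues: list[str], bullshit: set[str]) -> nx.Graph:
    """Apply the three rules to the raw two-level graph."""
    by_lower = {i.lower(): i for i in level1_issues}
    cleaned = nx.Graph()
    cleaned.add_nodes_from(level1_issues, level=1)
    for issue in level1_issues:
        for node in raw.neighbors(issue):
            factor = node.split(" :: ", 1)[1]       # strip the per-issue prefix
            if factor.lower() in bullshit:
                continue                            # rule 3: drop statements of fact
            if factor.lower() in by_lower:
                # Rule 1 (circular argument): link the two level 1 issues instead.
                # Only exact name matches are caught here; near-duplicates such as
                # 'ethical and cultural sensitivity' vs 'ethical considerations'
                # were judged manually by the authors.
                cleaned.add_edge(issue, by_lower[factor.lower()])
            else:
                # Rule 2 (repetition): identical factor names merge into one node.
                cleaned.add_node(factor, level=2)
                cleaned.add_edge(issue, factor)
    return cleaned

# Usage, continuing the illustrative data from the earlier sketch:
#   cleaned = clean(raw, list(chatgpt_output), {"scientific literature", "case studies"})
```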

Figure 2 makes it clear that ChatGPT interprets an ‘issue’ to be a challenge, or potentially an uncertainty, as might be expected. Nevertheless, it remains an important point that ChatGPT was not asked to identify positive aspects of the use of AI, and this has had consequences for the negative framing of the issues identified.

Whilst the caption to Figure 2 explains how the relationships can be interpreted, the key learning point is that the level 1 issues (the inner green circles) are categories of issues. As such, they remain somewhat vague and need further explanation. The level 2 factors (the yellow rounded boxes) are the more precisely defined issues that might lend themselves to specific research tasks in the future; for example, one could ask ‘what is the potential for over-automation to create ethical issues in relation to the use of LLM chatbots in IA?’.

5. Conclusions

The research set out to achieve two objectives:

  1. to reduce the length of time taken to perform a critical evaluation of literature; and

  2. to synthesise current learning and engage in debate about the issues that researchers and practitioners need to take into account when considering the use of AI in EIA.

For the first of these, we set out to use AI tools (ChatGPT 4.0 and Elicit) to produce output useful to the IA community in as short a time period as possible. The article was submitted three and a half days after the research began (although we acknowledge that additional time has been taken up later during the review process, and it is not clear whether the time taken for revision is different from what it would have been had AI not been used). However, in using AI to improve the time efficiency of the research, we have been careful not to simply accept results at face value, but to find evidence for the ChatGPT outputs. Engaging with the AI tools ChatGPT and Elicit took up a small proportion of the time used. The majority of human input was spent on identification of evidence to support the ChatGPT claims, where Elicit was found to be flawed but still, in many cases, a useful means of identifying some relevant literature. The protocol developed to assist researchers in fact checking evidence delivered through Elicit (or other machine learning tools) is a novel outcome of this research which can assist future researchers. The other significant use of time was drawing the network diagram (using Miro, https://miro.com/app/dashboard/), and then interpreting the resulting network diagram to remove inconsistencies and circular references. This interpretation step duplicates standard coding of academic literature by human researchers, which involves an iterative process of checking and improving the codes. Whilst we have no control experiment to use as a benchmark for time taken, it remains the shortest time taken by any of the authors to get from idea to output by a significant margin. Subject to the veracity of the results, this suggests AI tools, such as ChatGPT, can improve the efficiency of research tasks significantly. A question that remains unanswered is what output may have been produced through a systematic literature review based on the use of academic databases. Future research along these lines could help to add clarity over the scale of time saving possible and, more importantly, compare the findings of ‘traditional’ versus AI-assisted literature review approaches.

Thus, the key shortcoming of the use of AI relates to the confidence of the user in the findings, given the uncertainty over the approach taken by ChatGPT. This does point to an interesting future research area related to a comparison of the outcomes of literature reviews when undertaken by AI as opposed to when undertaken by human researchers. LLMs conduct the initial coding on an unknown dataset, using unknown algorithms. Yet human researchers are both fallible and resource constrained, and are subject to database access issues and cognitive limitations when it comes to synthesis. Determining which approach is better is a task in itself, as is determining which approach gains the most confidence of readers. Additional questions raised include:

  1. how well a researcher needs to know the literature before the use of an LLM chatbot to assist becomes credible (or even whether knowledge of the literature is needed at all); and

  2. what kind of knowledge does a researcher require in order to be able to critically assess LLM chatbot output?

Reviewers of the initial manuscript expressed some discomfort with the rapid approach to writing the paper. In particular, concerns were raised that:

  • A focus on rapid output can compromise the quality of research and therefore its robustness.

  • The automated nature of analysis that saves time undermines the value of human thinking and therefore threatens the level of insight and knowledge advancement.

  • Time saving associated with finding references risks some publishing outlets being privileged over others (with no way of knowing whether this is taking place).

We acknowledge that these are valid concerns. Some could potentially be overcome with different questions being asked of the chatbots. But concerns over the lack of insights and unconscious bias towards outlets seem more difficult to simply avoid. These are issues that researchers need to actively consider as they undertake their studies.

On the second objective, subject to the caveat that we cannot say whether a more traditional approach would have yielded the same (or more, or less, comprehensive) findings, Figure 2 strikes us as a useful starting point for those wishing to engage with, or manage, the use of AI in both EIA research and practice. Our independent evidence checks for the veracity of the level 1 issues give us confidence that these are credible, acknowledged issues. Whilst there was repetition and circularity of evidence, these have been addressed in producing Figure 2. This strikes us as representing the current limitations of LLM chatbots used in this way, and it seems likely that LLMs will improve in the future and require less adjustment. Therefore, EIA researchers could begin to address these issues now, with Figure 2 providing a research roadmap of issues that need either to be managed or resolved.

Nevertheless, reviewers expressed some concerns over the output. In particular:

  • Limiting the search for references that validate an issue misses any searches for counter-evidence that could undermine the findings.

  • The fact that the authors had to revise the initial output of ChatGPT was highlighted by reviewers as evidence that the initial output is unhelpful.

  • There was some criticism that the output, in terms of identified issues, was not further processed in terms of the meaning it might have in practice.

  • There is a disconnect between the identification of ‘issues’ and the consequences these might have for planning and commissioning processes for IA.

  • The idea that the learning from ChatGPT was in any way valuable was questioned as being superficial. In particular, a deeper understanding of the limitations and consequences of using AI to manage, for example, spatial data on environmental components, like species abundance, distribution and connectivity, is simply missing at present.

  • The AI-based output produced nothing new.

We acknowledge that these points have some validity. In particular, whilst we only set out to undertake a literature review to identify issues with the use of AI in EIA rather than further consider the broader implications, the comment did lead the authors to reflect on a level of discomfort with the findings, in that it is difficult to feel significant ownership of them. And without that ownership, there is some reticence to further explore their meaning. This could well be problematic if the use of AI in EIA research expands, depending on what the objectives are in each case. We would argue that whilst nothing new will come from currently available chatbots, as they simply search existing knowledge, the same would be true of a literature review. In this case the listing of issues in a single publication might be regarded as being novel and useful in presenting a synthesis of knowledge that has not been published before – but the extent to which a reader will agree is subjective. We have no idea how ChatGPT has interpreted the term ‘AI’ – and so do not know the boundaries placed around the searches conducted. There remains, therefore, considerable uncertainty over the issues identified as being associated with the use of AI in IA, because it is not known what ChatGPT has interpreted as being AI.

A last word on the ethics of what we have done. We have taken the view that the increasing use of AI in research is inevitable, and where the kind of efficiency gains in delivering timely research are possible through the application of AI, it seems foolish to dismiss it. However, we do believe that AI can be used ethically, subject to appropriate transparency, which we have tried to deliver in this paper. The real challenge going forward is knowing where AI has been used and not acknowledged – which is where the ethical lines are crossed. And one reviewer also commented that project proponents already generate plenty of bullshit in EIA that does not adversely affect decision-making, asking whether a more critical lens is being used to view AI output than is directed at some stakeholders in the process – which is perhaps another potential future research area.

Other challenges in the use of AI in EIA-focussed research have been identified through the review process. Perhaps as LLMs become more sophisticated, the criticisms in relation to time savings may become less of an issue, and the LLMs can create space for human thinking that is otherwise unavailable. Other issues raised by reviewers seem to relate to good research practice in general and ensuring that a reliance on LLMs is not leading to poor practice, like not searching for counter-evidence (which we are guilty of in this paper). Ultimately, the exercise points to the need to be aware of how AI can help, and how it can be a threat. There is surely some middle ground where the benefits can be felt without the very real threats highlighted by the reviewers.

Supplemental material

AI of AI in IA_supp data cleaned.docx


Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/14615517.2024.2320591

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Notes

1. Bullshit is defined by the Cambridge Dictionary as being ‘a rude word for complete nonsense or something that is not true’ (https://dictionary.cambridge.org/dictionary/english/bullshit), and has been adopted by some scholars, like Costello (Citation2023), as a philosophical concept.

References

  • Aghion P, Jones BF, Jones CI. 2018. Artificial intelligence and economic growth in the economics of artificial intelligence: an agenda. Chicago: University of Chicago Press.
  • Alemohammad S, Casco-Rodriguez J, Luzi L, Imtiaz Humayun A, Babaei H, LeJeune D, Siahkoohi A, Baraniuk RG. 2023. Self-consuming generative models go mad. doi: 10.48550/arXiv.2307.01850.
  • Au Yeung J, Kraljevic Z, Luintel A, Balston A, Idowu E, Dobson RJ, Teo JT. 2023. AI chatbots not yet ready for clinical use. Front Digit Health. 5:60. doi: 10.3389/fdgth.2023.1161098.
  • Bice S, Fischer TB. 2020. Impact assessment for the 21st century–what future? Impact Assess Proj Apprais. 38(2):89–93. doi: 10.1080/14615517.2020.1731202.
  • Bonab AB, Rudko I, Bellini F. 2021. A review and a proposal about socio-economic impacts of artificial intelligence. In Business Revolution in a Digital Era: 14th International Conference on Business Excellence, ICBE 2020; Bucharest, Romania: Springer p. 251–270.
  • Bond A, Dusík J. 2020. Impact assessment for the twenty-first century – rising to the challenge. Impact Assess Proj Apprais. 38(2):94–99. doi: 10.1080/14615517.2019.1677083.
  • Broussard M. 2018. Artificial unintelligence: how computers misunderstand the world. Cambridge MA: MIT Press.
  • Campolo A, Crawford K. 2020. Enchanted determinism: power without responsibility in artificial intelligence. Engag Sci Technol Soc. 6:1–19. doi: 10.17351/ests2020.277.
  • Cortés U, Sànchez-Marrè M, Ceccaroni L, R-Roda I, Poch M. 2000. Artificial intelligence and environmental decision support systems. Appl Intell. 13(1):77–91. doi: 10.1023/A:1008331413864.
  • Costello E. 2023. ChatGPT and the educational AI chatter: full of bullshit or trying to tell us something? Postdigital Sci Educ. 1–6. doi: 10.1007/s42438-023-00398-5.
  • Curmally A, Sandwidi BW, Jagtiani A. 2022. Chapter 9: Artificial intelligence solutions for environmental and social impact assessments. In: Fonseca A, editor. Handbook of Environmental Impact Assessment. Cheltenham: Edward Elgar Publishing Limited; p. 163–177.
  • Day T. 2023. A preliminary investigation of fake peer-reviewed citations and references generated by ChatGPT. Prof Geogr. 75(6):1024–1027. doi: 10.1080/00330124.2023.2190373.
  • de Fine Licht K, de Fine Licht J. 2020. Artificial intelligence, transparency, and public decision-making. AI Soc. 35(4):917–926. doi: 10.1007/s00146-020-00960-w.
  • Di Minin E, Fink C, Hausmann A, Kremer J, Kulkarni R. 2021. How to address data privacy concerns when using social media data in conservation science. Conserv Biol. 35:437–446. doi: 10.1111/cobi.13708.
  • Dostatni E, Mikołajewski D, Rojek I. 2023. The use of artificial intelligence for assessing the pro-environmental practices of companies. Appl Sci. 13(1):310. doi: 10.3390/app13010310.
  • Dupps WJ Jr. 2023. Artificial intelligence and academic publishing. J Cataract Refr Surg. 49(7):655–56. doi: 10.1097/j.jcrs.0000000000001223.
  • Eke DO. 2023. ChatGPT and the rise of generative AI: threat to academic integrity? J Responsib Technol. 13:100060. doi: 10.1016/j.jrt.2023.100060.
  • Francini M, Salvo C, Vitale A. 2023. Combining deep learning and multi-source GIS methods to analyze urban and greening changes. Sensors. 23(8):3805. doi: 10.3390/s23083805.
  • Galaz V, Centeno MA, Callahan PW, Causevic A, Patterson T, Brass I, Baum S, Farber D, Fischer J, Garcia D. 2021. Artificial intelligence, systemic risks, and sustainability. Technol Soc. 67:101741. doi: 10.1016/j.techsoc.2021.101741.
  • Giuffrida I. 2019. Liability for AI decision-making: some legal and ethical considerations. Fordham L Rev. 88:439.
  • Glasson J, Therivel R. 2019. Introduction to environmental impact assessment. London: Routledge.
  • Goto A, Katanoda K. 2023. Should we acknowledge ChatGPT as an author? J Epidemiol. 33(7):333–334. doi: 10.2188/jea.JE20230078.
  • Gurstein M. 1985. Social impacts of selected artificial intelligence applications: the Canadian context. Futures. 17(6):652–71. doi: 10.1016/0016-3287(85)90018-7.
  • Hagerty A, Rubinov I. 2019. Global AI ethics: a review of the social impacts and ethical implications of artificial intelligence. doi: 10.48550/arXiv.1907.07892.
  • International Association for Impact Assessment. undated. About IAIA, IAIA, [Accessed 26 May 2021]. http://www.iaia.org/about.php.
  • Kaur D, Uslu S, Rittichier KJ, Durresi A. 2022. Trustworthy artificial intelligence: a review. ACM Comput Surv. 55(2):1–38. doi: 10.1145/3491209.
  • Khan M, Nawaz Chaudhry M. 2023. Artificial intelligence and the future of impact assessment. SSRN Electron J. Available at SSRN 4519498. doi: 10.2139/ssrn.4519498.
  • Kim JK, Chua M, Rickard M, Lorenzo A. 2023. Response to letter to the editor re ChatGPT and large language model (LLM) chatbots: the current state of acceptability and a proposal for guidelines on utilization in academic medicine. J Pediatr Urol. 19(5):607. doi: 10.1016/j.jpurol.2023.07.007.
  • Koyamparambath A, Adibi N, Szablewski C, Adibi SA, Sonnemann G. 2022. Implementing artificial intelligence techniques to predict environmental impacts: Case of construction products. Sustainability. 14(6):3699. doi: 10.3390/su14063699.
  • Krause D. 2023. Large language models and generative AI in finance: an analysis of ChatGPT, Bard, and Bing AI. (July 15, 2023).
  • Ligozat A-L, Lefevre J, Bugeau A, Combaz J. 2022. Unraveling the hidden environmental impacts of AI solutions for environment life cycle assessment of AI solutions. Sustainability. 14(9):5172. doi: 10.3390/su14095172.
  • Liu KF-R, Chih-Wei Y. 2009. Integrating case-based and fuzzy reasoning to qualitatively predict risk in an environmental impact assessment review. Environ Model Softw. 24(10):1241–51. doi: 10.1016/j.envsoft.2009.04.005.
  • Lozo O, Onishchenko O. 2021. The potential role of the artificial intelligence in combating climate change and natural resources management: political, legal and ethical challenges. J Nat Resour. 4(3):111–31. doi: 10.33002/nr2581.6853.040310.
  • Makhkamov D. 2022. Modern trends in regulation of environmental and legal relations: digitalization and artificial intelligence. Am J Pol Sci Law Crim. 3(01):41–46. doi: 10.37547/tajpslc/Volume04Issue01-07.
  • McGovern A, Ebert-Uphoff I, John Gagne D, Bostrom A. 2022. Why we need to focus on developing ethical, responsible, and trustworthy artificial intelligence approaches for environmental science. Environ Data Sci. 1:e6. doi: 10.1017/eds.2022.5.
  • Morgan RK. 2012. Environmental impact assessment: the state of the art. Impact Assess Proj Apprais. 30(1):5–14. doi: 10.1080/14615517.2012.661557.
  • Mustak S, Singh D, Kumar Srivastava P. 2023. Advanced remote sensing for Urban and landscape ecology. Singapore: Springer.
  • Nadeem O, Hameed R. 2008. Evaluation of environmental impact assessment system in Pakistan. Environ Impact Assess Rev. 28(8):562–71. doi: 10.1016/j.eiar.2008.02.003.
  • Ought. 2023. Elicit: the AI research Assistant. [Accessed 15 August 2023]. https://ellicit.org.
  • Pachot A, Patissier C. 2022. Towards sustainable artificial intelligence: an overview of environmental protection uses and issues. Green Low-Carbon Econ. arXiv preprint arXiv:2212.11738. doi: 10.47852/bonviewGLCE3202608.
  • Pagallo U, Ciani Sciolla J, Durante M. 2022. The environmental challenges of AI in EU law: lessons learned from the artificial intelligence act (AIA) with its drawbacks. Transforming Gov. 16(3):359–376. doi: 10.1108/TG-07-2021-0121.
  • Rizzoli AE, Young WJ. 1997. Delivering environmental decision support systems: software tools and techniques. Environ Model Softw. 12(2–3):237–49. doi: 10.1016/S1364-8152(97)00016-9.
  • Sample I. 2023. Science journals ban listing of ChatGPT as co-author on papers. The Guardian, 26 January 2023.
  • Sandfort R, Uhlhorn B, Geissler G, Lyhne I, Jiricka-Pürrer A. 2024. AI will change EA practice - but are we ready for it? A call for discussion based on developments in collecting and processing biodiversity data. In Preprint Org. doi: 10.1080/14615517.2024.2318684
  • Schibuola S, Byer PH. 1991. Use of knowledge-based systems for the review of environmental impact assessments. Environ Impact Assess Rev. 11(1):11–27. doi: 10.1016/0195-9255(91)90014-B.
  • Schwartz R, Vassilev A, Greene K, Perine L, Burt A, Hall P. 2022. Towards a standard for identifying and managing bias in artificial intelligence. Gaithersburg, MD: National Institute of Standards and Technology, US Department of Commerce; p. 1270. doi: 10.6028/NIST.SP.1270.
  • Spector JM, Shanshan M. 2019. Inquiry and critical thinking skills for the next generation: from artificial intelligence back to human intelligence. Smart Learn Environ. 6(1):1–11. doi: 10.1186/s40561-019-0088-z.
  • Stahl BC, Leach T. 2023. Assessing the ethical and social concerns of artificial intelligence in neuroinformatics research: an empirical test of the European Union Assessment List for Trustworthy AI (ALTAI)’. AI Ethics. 3(3):745–67. doi: 10.1007/s43681-022-00201-4.
  • Stokel-Walker C. 2023. ChatGPT listed as author on research papers: many scientists disapprove. Nature. 613(7945):620–21. doi: 10.1038/d41586-023-00107-z.
  • Taylor & Francis. 2023. Taylor & Francis clarifies the responsible use of AI tools in academic content creation. Taylor & Francis, [Accessed 8 November 2023]. https://newsroom.taylorandfrancisgroup.com/taylor-francis-clarifies-the-responsible-use-of-ai-tools-in-academic-content-creation/.
  • Um T-W, Kim J, Lim S, Lee GM. 2022. Trust management for artificial intelligence: A standardization perspective. Appl Sci. 12:6022. doi: 10.3390/app12126022.
  • Uren V, Edwards JS. 2023. Technology readiness and the organizational journey towards AI adoption: an empirical study. Int J Inf Manage. 68:102588. doi: 10.1016/j.ijinfomgt.2022.102588.
  • Vanclay F. 2015. Changes in the impact assessment family 2003–2014: implications for considering achievements, gaps and future directions. J Environ Assess Policy Manage. 17(1):1550003. doi: 10.1142/S1464333215500039.
  • Wang YM, Yang JB, Xu DL. 2006. Environmental impact assessment using the evidential reasoning approach. Eur J Oper Res. 174(3):1885–913. doi: 10.1016/j.ejor.2004.09.059.
  • Wischmeyer T. 2020. Artificial intelligence and transparency: opening the black box. In: Wischmeyer T, and Rademacher T, editors. Regulating artificial intelligence. Cham: Springer International Publishing; p. 75–101.