Artificial Intelligence in Arbitration: Evidentiary Issues and Prospects

12 October 2023
This is an Insight article, written by a selected partner as part of GAR’s co-published content.
The artificial intelligence (AI) genie is out of the bottle.
In November 2022, OpenAI released its AI chatbot, ChatGPT. By January 2023, it had amassed over 100 million monthly active users, making it the fastest-growing consumer application in history.[2]
ChatGPT’s latest iteration, GPT-4 (announced in March 2023), has wowed (and terrified) with its apparent displays of human-competitive intelligence in a broad array of fields. It scored in the top 10 per cent of a simulated bar exam, achieved a near-perfect Scholastic Aptitude Test science score and obtained similar results across a wide range of professional and college admission exams.[3] As has been widely publicised, at our firm, Allen & Overy, lawyers are able to use a GPT-4-based platform called Harvey to automate and enhance various aspects of their work.[4] The authors can confirm that the results are as impressive as they are unnerving.
Ask Harvey to prepare a memo on privilege under English law and it will prepare one within seconds (at a level of competence that will surprise many practitioners). Ask it to do so in the style of Donald Trump, or the author of Fifty Shades of Grey, and you will be impressed (and amused) by the results.
This is a remarkable feat. As recently as October 2022, if you had received a coherent memo on a legal topic, this would have been proof of human involvement (if not quite human intelligence). That assumption is now obsolete.
It is natural to wonder: how far will this go? What will AI models be able to do, and how will we humans fit in? Investment in generative AI start-ups is exploding,[5] research breakthroughs in the field are continuing,[6] and even leading AI advocates now express concerns about the speed of progress.[7] These developments seem certain to have profound implications for our society. It is naïve to assume that international arbitration will be immune.
Against that backdrop, this chapter considers some of the ways in which AI models (and, in particular, large language models) may transform the practice of international arbitration. Its focus is on evidence and how AI may, in the imminent and conceivable future, change the way in which parties gather, analyse and present evidence. The chapter’s core hypothesis is that while AI will not replace lawyers, lawyers who use AI will replace those who do not.
But the road will not be without its speed bumps. AI comes with risks and has limitations. For instance, examples abound of ChatGPT (and competitor AIs) confidently asserting falsehoods, known as ‘hallucinations’. Users cannot blindly follow these outputs, as one sorry US lawyer discovered.[8] And so this chapter also considers some of the potential dangers and ethical issues arising from using AI.
Finally, we must note (with humility) that AI’s development is fast-moving and unpredictable. We do not purport to have a crystal ball, nor to have ‘answers’ to the ethical considerations its use will pose. We hope, however, that this chapter will be a thought-provoking addition to what will be an important conversation for the arbitration community in the years to come.
Given AI’s emerging capabilities in analysing and manipulating language – at speed and at scale – it seems obvious that AI could be a powerful tool for identifying, finding and analysing evidence.[9]
This section considers just some of those potential applications, starting with the pre-dispute or early claim development phase, then the pleadings, fact and expert witness and discovery phases, right up to AI’s potential use at the final merits hearing. We also consider the possibility of AI-generated evidence. The potential risks and limitations of these uses are discussed below.
Could AI soon be used to proactively look for evidence of claims? Imagine an AI tool that had access to your contracts, and: (1) automatically reviewed all emails, documents and other data related to the contract’s performance; (2) compiled evidence, and notified you, of possible breaches; and (3) suggested next steps, including diarising deadlines for contractual notices, preparing draft claim letters or drafting talking points for inter-party meetings. Or perhaps you simply want AI to find and compile the relevant evidence so external counsel can advise you on how to proceed.
Sound far-fetched? Maybe less than you think. In March 2023, Microsoft announced Copilot, which envisions integrating GPT-4-based AI across the full suite of Microsoft 365 products (Windows, Outlook, Teams, Word, Excel, etc.) to help automate, accelerate and enhance many aspects of knowledge work. Its promotional video offers an eye-opening glimpse of what knowledge work, including legal work, may look like in the future.[10] Time will tell whether Copilot ultimately lives up to its hype and, if so, over what timescale and whether it can be adapted for legal work in the manner described above. In the meantime, multiple legal tech companies already claim to be able to (or to be developing tools that) automatically review contract compliance, or to identify, synthesise and organise evidence.
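For the curious, the core of the workflow imagined above can be reduced to a surprisingly small triage loop. The sketch below is illustrative only: `ask_llm` is a hypothetical placeholder rather than any real provider’s API, and its output would be a starting point for the lawyer-supervised, iterative review described below, not a substitute for it.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """An email, memo or other record relating to the contract's performance."""
    doc_id: str
    text: str


def ask_llm(prompt: str) -> str:
    """Hypothetical placeholder for a call to a large language model.
    A real tool would wire this to a confidentiality-vetted provider."""
    raise NotImplementedError


def flag_possible_breaches(contract_text: str, documents: list[Document]) -> list[str]:
    """First-pass triage: ask the model, document by document, whether anything
    suggests a possible breach. Flagged documents go to a lawyer for review;
    nothing here is relied on without human verification."""
    flagged = []
    for doc in documents:
        prompt = (
            "You are monitoring contract compliance.\n\n"
            f"Contract:\n{contract_text}\n\n"
            f"Document {doc.doc_id}:\n{doc.text}\n\n"
            "Does this document suggest a possible breach of the contract? "
            "Answer YES or NO, then give a one-sentence reason."
        )
        if ask_llm(prompt).strip().upper().startswith("YES"):
            flagged.append(doc.doc_id)
    return flagged
```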
Not only do these tools offer the possibility of productivity and cost gains (for both clients and lawyers), but they may also provide strategic benefits. The sooner one masters the factual record, the earlier one can identify the key issues in dispute, the strengths and weaknesses of one’s case and the ‘right’ litigation strategy. Parties and lawyers adept at using these tools would be able to act more quickly, and with better information, than those who are not.
But what about when AI gets it wrong – flagging irrelevant evidence, say, or missing issues or deadlines? It is important to remember that humans are far from perfect in the evidence-collection process either, and lawyers do not need to choose between artificial and human intelligence. If these tools can reach an acceptable baseline level of competence, the optimal process would likely combine both, and be iterative: the lawyer reviewing an AI model’s initial selection of evidence, asking further questions of the AI and providing feedback, and the AI calibrating its results accordingly. So long as the lawyer ultimately reviews and assesses the reliability and relevance of the underlying documents flagged as evidence by the AI, these risks should be manageable. As discussed below, the biggest dangers seem to arise when lawyers blindly rely on an AI model’s findings.
Consider the following scenario. It is 9am. You check your email. As expected, the other side’s statement of case arrived at 4.34am, a few hours after the midnight deadline. ‘Those poor souls’, you think to yourself as you sip your morning coffee. The submission, including witness statements, expert reports and exhibits, spans thousands of pages. You take a deep breath and brace yourself for the task ahead. You have a lot of reading to do.
But why not first upload the submission to your ‘AI Assistant’ and ask it for a preliminary analysis?
Now, of course, any lawyer worth their salt will ultimately need to read, and reread, and then re-reread, the submissions. And clients would also need to agree to the use of AI. But a preliminary review would undoubtedly be useful. It would speed up understanding, and help identify evidence and ideas for your defence that may otherwise have been missed. These tools appear to be under development, if not already in use.[11]
But why stop there? A generative AI model could prepare the first draft of pleadings based on a lawyer’s bullet-point outline of the key points, themes and evidence. The first draft would not be perfect, but drafting can be an iterative process. The lawyer’s role could evolve into one similar to an editor’s: providing comments and asking the AI to write and rewrite accordingly.
In principle, voice recognition AI could listen to fact witness interviews and integrate with a generative text AI to prepare first drafts. Could this help prevent distortion or contamination of witness memory?[12] Or would it invite the criticism that the statement is not in the witness’s own words,[13] but those of an AI model? Reasonable minds might differ on these questions, as well as on whether having AI, rather than lawyers, prepare first drafts (as lawyers commonly do in many jurisdictions) is more or less preferable.
AI seems less likely to play this ‘listen in, then draft’ role in expert evidence, given expert evidence is, by its nature, more analytical and less descriptive, and the credibility of the expert and their work product is paramount.
The potential of AI to transform the disclosure phase also seems evident. Rather than running search terms (which tend to generate many false positives), the underlying document requests could be run as prompts in the AI model, which would then review all of the documents and identify those that may be responsive. Advanced AI-driven search technology is already being rolled out for litigation purposes.[14] The potential cost and time savings are obvious.
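The contrast can be sketched as follows, again purely for illustration: the `classify` call stands in for a hypothetical LLM prompt and is not any vendor’s API. Keyword search matches vocabulary; a prompt-based review can apply the document request as drafted.

```python
def keyword_hits(search_terms: list[str], text: str) -> bool:
    """The traditional approach: crude string matching, hence many false
    positives ('delay' matches a lunch invitation as readily as a notice
    of default)."""
    lower = text.lower()
    return any(term.lower() in lower for term in search_terms)


def classify(request: str, text: str) -> str:
    """Hypothetical LLM call: asked whether the document is responsive to
    the document request as drafted, it returns 'RESPONSIVE' or
    'NOT RESPONSIVE'."""
    raise NotImplementedError


def responsive_documents(request: str, corpus: dict[str, str]) -> list[str]:
    """Run the document request itself as the prompt against each document.
    Anything flagged still goes to human review before production."""
    return [doc_id for doc_id, text in corpus.items()
            if classify(request, text) == "RESPONSIVE"]
```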
AI’s possible use during hearings is especially intriguing. Suppose AI listened to the hearing and reviewed the transcript in real time, while simultaneously looking for counterarguments and evidence – both on the record and in the public domain – to what opposing counsel, witnesses or experts were saying.
This kind of tool would be powerful but also dangerous. The risk of missing hallucinations in the heat of a hearing would be particularly acute. Counsel would need to be especially careful not to mislead the tribunal.[15]
Counsel would also need to consider whether the AI would need to be disclosed and agreed to, and whether it complies with any privacy or confidentiality obligations that the parties may have with respect to the hearing. In the US, a tool was developed to help self-represented litigants contest traffic tickets.[16] It involved using smart glasses that recorded court proceedings and dictated responses to the defendant’s ear from a small speaker. But the developers faced threats of criminal charges for (1) illegally recording audio during a live legal proceeding (which is not permitted in federal court and is often prohibited in state courts), and (2) the unauthorised practice of law.
We have discussed using AI to help identify, analyse and present evidence that is otherwise human generated. But what about AI giving evidence directly? An AI robot recently did so before a parliamentary inquiry in the United Kingdom.[17] Admittedly, this was in a particular context (a review of how AI might affect creative industries), but it raises the question as to whether parties in arbitration might one day attempt to use ChatGPT responses or other AI outputs as evidence (e.g., as opinion evidence). For now, at least, it seems far-fetched that a tribunal would accept a ChatGPT or other generative AI output as probative evidence, given the issue of hallucinations and that (as explained below) ChatGPT essentially operates as a sophisticated ‘predictive text’ machine. But might this change if hallucinations reduce over time and society becomes more acclimatised to trusting AI-generated responses?
As clients increasingly integrate AI into their businesses, other instances of AI-generated evidence will no doubt arise. For instance, AI-generated minutes of meetings where the parties dispute what was said.
As discussed, AI offers many potentially advantageous and efficient ways of handling evidence in international arbitration. There are, however, important limitations and risks that cannot be ignored. We focus on four:[18] first, the tendency of AI models to generate hallucinations, errors and inaccuracies; second, client confidentiality and privacy issues; third, more general regulatory risks; and fourth, the risk of fabricated AI evidence.
As noted above, AI will sometimes confidently assert incorrect answers. These hallucinations can even come with fabricated footnotes and sources, including entirely made-up case names. In a field where accuracy and credibility are paramount, this is clearly a cause for concern.
The reasons for AI hallucinations may be varied.
First, AI can only deliver results that are as good as the data it holds. Yet, data may be incomplete – for example, documents may be confidential, unavailable, not known about or not client-approved for upload onto a platform. By way of illustration, in the context of legal research, AI is currently less reliable where it does not have access to subscription-only legal research services or to confidential awards or documents from a commercial arbitration.[19] Data that is incomplete or selective (whether intentionally or unintentionally) will lead to unreliable results being produced by the algorithm.[20]
Conversely, more data does not always guarantee better performance. While it is generally thought that the bigger the pool of sample data, the more accurate the prediction of an AI model should be,[21] more data, especially if it is of low quality, irrelevant or inconsistent, may sometimes introduce more noise or challenges for AI.[22] There is also evidence in the biological sphere that even where nature and nurture are the same, there remains a high degree of randomness.[23]
Data can also contain biases that affect the reliability of results. For instance, at Allen & Overy, we have encountered situations where AI trained mainly on US data misinterpreted UK documents. The AI labelled as ‘positive’ responses that UK readers would recognise as being passive aggressive. These cultural biases are particularly relevant for arbitration, given its international nature.
Second, even with high-quality data sets, hallucinations can still occur because of the way generative AI functions. Consider ChatGPT, which – in simple terms – operates by imitating how humans have written in the past. It has been trained on billions of words of text to detect patterns in language.[24] When asked a question, ChatGPT assesses how those words (and similar ones) have appeared across its training corpus and predicts what the first word in its response should be. It repeats this prediction, one word at a time, until its response is complete.[25] Incredibly, this ‘copycat’ method leads to mostly accurate responses across a great number of fields (as its exam scores attest). But it does not necessarily reflect actual understanding of what it is being asked, or what it is saying, and so the potential for convincing-sounding hallucinations exists.
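The prediction loop can be illustrated with a deliberately crude toy. This is not how GPT-4 works internally – real models use neural networks over vast contexts rather than simple word counts – but the one-word-at-a-time mechanism is the same in spirit:

```python
from collections import Counter, defaultdict

# A tiny 'training corpus'; real models train on billions of words.
corpus = ("the tribunal dismissed the claim the tribunal awarded costs "
          "the claimant appealed the award").split()

# Count, for each word, which words followed it in the training text.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def generate(start: str, length: int = 6) -> str:
    """Emit one word at a time, always choosing the most frequent successor.
    There is no understanding here, only pattern frequency - which is why
    fluent-sounding output can be confidently wrong."""
    words = [start]
    for _ in range(length):
        successors = following.get(words[-1])
        if not successors:
            break
        words.append(successors.most_common(1)[0][0])
    return " ".join(words)

print(generate("the"))  # -> 'the tribunal dismissed the tribunal dismissed the'
```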
Given the way ChatGPT operates, certain accuracy issues will be down to human input, as the answers presented by AI can only be as good as the question posed. Learning how to pose the right questions to put into an AI tool will become as crucial a skill for junior lawyers as drafting and researching.[26]
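By way of illustration (both prompts below are invented for this chapter, not taken from any published guide), compare an under-specified instruction with one that constrains the task:

```python
# Hypothetical prompts, for illustration only.
vague_prompt = "Tell me about privilege."

# The second prompt fixes the governing law, the scope, the output format and
# an instruction not to guess - constraints that tend to reduce (though never
# eliminate) inaccurate or hallucinated answers.
better_prompt = (
    "You are assisting an England-qualified disputes lawyer.\n"
    "Task: summarise the key rules on litigation privilege under English law.\n"
    "Format: numbered points, each identifying the principle relied on.\n"
    "If you are unsure of any point or authority, say so rather than guessing."
)
```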
Notwithstanding these issues, it is important to remember that AI can still outperform many human lawyers in terms of both accuracy and speed. Moreover, work is being carried out to reduce the rate of hallucinations. For instance, data sets and models are being refined for particular use cases, and AI models are able to learn from feedback given by human reviewers.[27] And while it may seem that ChatGPT is merely predicting, rather than reasoning, this may be incorrect. Microsoft’s researchers considered that GPT-4’s exam performances indicated core mental capabilities such as reasoning, creativity and deduction.[28] OpenAI’s CEO, Sam Altman, hopes that ChatGPT will evolve into a ‘reasoning engine over time’, enabling it to separate fact from fiction.[29]
In sum, AI outputs require close scrutiny and verification. They should be treated as first drafts from an inexperienced junior, but one who prefers to concoct an answer rather than confess ignorance. Lawyers who understand this, as well as how to leverage AI’s talents, could enjoy significant efficiency and creativity gains. Those who blindly follow it are courting disaster.
As explained above, AI requires large amounts of data to function well. The uses considered above presuppose that all documents disclosed in, relevant to, and on the record in, the arbitration have been uploaded onto the relevant AI platform. For many clients, this would (understandably) raise alarm bells in terms of confidentiality and data protection.[30] They might wonder who will have access to this data, and for what purpose.
A recent example shows that these concerns are reasonable. A coder at Samsung, in search of a fix for a bug, uploaded lines of confidential code to ChatGPT on two separate occasions.[31] Because ChatGPT takes user inputs to train its model, this code was subsequently reproduced in response to users from other organisations.
Not all AIs operate this way, however. Some AI platforms, such as Harvey, use closed systems whereby any information submitted by a user is secured and cannot be reproduced in future responses. Lawyers who make use of AI will need to be certain that the AI they use maintains the confidentiality of client data.
In addition to the possibility of accidental disclosure by the AI tool, volumes of confidential data stored together inevitably face the risk of cybersecurity breaches, especially where emerging technologies are involved – recall the ‘video-crashers’ who disrupted Zoom conference calls during the early days of the pandemic.[32] Precedent already exists for hacking in arbitration. In 2015, while the arbitration between China and the Philippines regarding disputed territory in the South China Sea was pending, hackers accessed the website of the Permanent Court of Arbitration, reportedly through malware.[33] Similarly, in Caratube v. Kazakhstan, the claimant managed to obtain confidential information that had been leaked from the Kazakh government’s IT system.[34] Initiatives exist to counter these risks – such as the Protocol on Cybersecurity in International Arbitration (updated in 2022), jointly released by the International Council for Commercial Arbitration (ICCA), the New York City Bar Association and the International Institute for Conflict Prevention and Resolution[35] – but they are not currently tailored to risks emanating from the use of AI.
Where technology advances, regulation is sure to follow. Governments and legislators in many jurisdictions are now acting, with AI-specific regulation on the horizon.
This started slowly: in December 2018, the European Commission for the Efficiency of Justice of the Council of Europe adopted an ethical charter on the use of artificial intelligence in judicial systems and their environment.[36] The charter contained five broad principles:
- respect for fundamental rights;
- non-discrimination;
- quality and security;
- transparency, impartiality and fairness; and
- ‘under user control’ (ensuring that users are informed actors, in control of their choices).
The aim was for these principles to be subject ‘to regular application, monitoring and evaluation by public and private actors, with a view to continuous improvement of practices application’.[37] The ethical charter does not appear to have been widely adopted or evaluated as envisaged.
Regulation is, however, incoming, and it is not a given that AI tools will comply with new regulations in all relevant jurisdictions. On 14 June 2023, the European Parliament voted in favour of regulation that will form the basis of Europe’s Artificial Intelligence Act.[38] If adopted in its current form, many popular AI tools would – as things now stand – be rendered non-compliant.[39] Some of these non-compliances may be fixable; others may threaten the underlying viability of the technology (at least according to some AI advocates).[40]
Notably, the proposed EU regulation categorises AI systems that facilitate the administration of justice and democratic processes as high-risk.[41] Under the current draft, high-risk AI systems are subject to strict obligations before they can be put on the market, including: verification that they are based on high-quality data sets to minimise the risks of discriminatory outcomes; appropriate human oversight measures to minimise risk; and demonstrating high levels of robustness, security and accuracy.[42] These requirements may (in the short term) slow the development and, thus, adoption of AI tools intended for legal use; in the longer term, however, these safeguards could potentially help drive adoption, if they bolster public confidence in these tools. We note that it is unclear at this stage whether assisting with the management and analysis of evidence would fall within the remit of administration of justice, or whether the proposed regulation is only intended to address the controversial issue of AI in judicial decision-making.
Another complexity is that approaches to AI regulation may differ widely across jurisdictions. For instance, the UK looks set to take a less formulaic (and potentially more lenient) approach than the European Union. Its current white paper sets out five overarching principles that individual regulators will have to interpret and apply to AI within their remits:[43]
- safety, security and robustness;
- appropriate transparency and explainability;
- fairness;
- accountability and governance; and
- contestability and redress.
It is not yet clear how this will be applied to judicial processes. The US and China also appear to be adopting different approaches.[44]
The onus will be on arbitration practitioners to ensure any use of AI complies with applicable regulations. This may extend beyond regulations in their home jurisdiction, to include the laws of the seat and of the place of enforcement. It is not inconceivable that parties may soon attempt to resist enforcement of an award on the grounds that the other side’s use of AI was illegal under one of the laws applicable to the arbitration. This risk currently looks most likely in certain jurisdictions if AI is used as the decision maker (rather than as a tool for evidence). For example, both the French Civil Code and the Dutch Code of Civil Procedure require an arbitrator to be a human being.[45] This also calls into question whether arbitrators using AI should themselves disclose its use, to avoid concerns, or potential challenges, on the grounds that decision-making has been delegated to AI.[46]
There is also clearly the potential for parties from different jurisdictions to be subject to very different rules regarding which AI tools they can and cannot use. This raises concerns regarding level playing fields and, thus, the legitimacy of the arbitration itself. In the interests of fairness, it may fall upon the international arbitration community to develop common standards, or at the very least, for arbitral tribunals to address these concerns in their first procedural orders. A potential difficulty faced by rule-making initiatives is the speed at which AI is developing, and the risk that as soon as rules are written, they may become obsolete.
At present, arbitration rules and institutions either provide no, or only limited, guidance with respect to AI. For instance, the major arbitration rules do not currently address AI, either as a means to aid disclosure or more generally. Similarly, the most recent revisions of the International Bar Association (IBA) Rules on the Taking of Evidence (the IBA Rules), in 2020, did not make amendments specific to AI, although they did add provisions relating to cyber security and data protection (Article 2.2(e)).[47]
On the other hand, many institutional rules and national arbitration laws encourage parties and tribunals to conduct arbitration efficiently, with regard to costs, with implicit, or sometimes explicit, reference to electronic disclosure. Parties may argue that these references implicitly permit the use of AI to assist in managing evidence. Article 1.5 of the IBA Rules also provides tribunals with flexibility, where the applicable rules are otherwise silent, to ‘conduct the taking of evidence as it deems appropriate, in accordance with the general principles of the IBA Rules of Evidence’. Similarly, in 2022, the ICCA and the IBA’s Joint Task Force on Data Protection in International Arbitration referred to AI as a means of minimising disclosure of personal data.[48]
In August 2023, a task force launched by the Silicon Valley Arbitration and Mediation Center released for public consultation a set of draft principles-based guidelines on the use of AI in international arbitration (the SVAMC Guidelines).[49] The SVAMC Guidelines focus on:
- guidelines applicable to all participants in the arbitration (understanding the uses, limitations and risks of AI applications; safeguarding confidentiality; and disclosure of the use of AI);
- guidelines for parties and party representatives (the duty of competence and diligence in the use of AI; and respect for the integrity of the proceedings and evidence); and
- guidelines for arbitrators (non-delegation of decision-making responsibilities; and respect for due process).
The SVAMC Guidelines also give practical illustrations of compliant and non-compliant applications of AI, albeit emphasising the need for case-specific determination. Although still at an early stage – comments were invited from members of the arbitration community until the end of September 2023 – the SVAMC Guidelines may serve as a blueprint for the ethical and legal framework governing AI in arbitration, with the ultimate aim that they be incorporated into procedural orders via a model clause.
Apart from the few AI-specific regulations and guidelines enumerated above, there is currently limited guidance for the arbitration community on the evidential use of AI in arbitration. In light of the developments summarised above, and the relative regulatory uncertainty, the use of AI tools is likely to become a matter for agreement between the parties – and, failing agreement, for determination by tribunals – imminently. In addition to establishing what tools (if any) the parties may use, the question may arise as to whether parties are obliged to indicate when AI has been used to prepare witness statements or pleadings (as has been required in some US courtrooms, and as envisaged in certain circumstances by the SVAMC Guidelines).[50]
Another issue is the difficulty of identifying whether content has been written by a human or by AI. A number of detection tools exist, but none is fully accurate; current advice is to run text through several of them to increase the chances of a correct identification.[51]
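A sketch of that advice, with the detector functions left as hypothetical placeholders (no real tool’s API is assumed):

```python
from statistics import mean
from typing import Callable

# Each hypothetical detector returns an estimated probability (0.0-1.0)
# that the text was machine-generated.
Detector = Callable[[str], float]


def ensemble_score(text: str, detectors: list[Detector],
                   threshold: float = 0.7) -> tuple[float, bool]:
    """Average several detectors' scores and flag high-scoring text for
    further inquiry. Because no detector is fully accurate, the output is a
    prompt for investigation, never proof of authorship."""
    score = mean(detector(text) for detector in detectors)
    return score, score >= threshold
```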
This issue may become important should unscrupulous parties be tempted to use AI technology to create fake documentary, photographic or video evidence (or should naïve parties be tempted into purchasing such evidence from unscrupulous operators).[52] Forgeries are not a novel issue,[53] especially digital forgeries in the era of low-cost image-editing software such as Photoshop. AI-generated images, however, present a unique challenge, as it can be impossible for the naked eye to detect the forgery.[54] As new forgery methods arise, forgery-detection software follows (admittedly at a slightly slower pace). It remains to be determined how forged evidence can be safeguarded against without leaving the door open for any shrewd defendant to argue that genuine, adverse evidence is in fact fake (the ‘deepfake defence’).[55] To account for this uncertainty in the international arbitration context, perhaps all digital evidence will need to be accompanied by a counsel’s statement of authenticity or an expert opinion confirming that the content has been examined and is authentic and reliable.[56]
The adoption of AI models to analyse evidence is one of the less controversial uses of AI in the context of international arbitration (as compared to, say, decision-making).[57] In theory – though this remains to be fully tested in practice – AI has the potential to perform tasks such as document review, data extraction and anomaly detection with greater speed and accuracy than humans. By using AI, arbitration practitioners can potentially save time and money, reduce errors and focus on the more strategic and creative aspects of their cases.
However, the benefits of AI for evidence management are not without challenges, including ensuring the reliability, security and ethical standards of the technology, as well as gaining the trust and acceptance of clients and lawyers who may be wary of delegating such a crucial aspect of their work to machines.[58] Achieving this optimal balance will require careful attention to the quality and validity of the data underlying AI systems, as well as to the legal and ethical implications of their use. Arbitration practitioners will also need to communicate effectively with their clients and colleagues about the benefits and limitations of AI and ensure that they retain oversight and accountability.
AI in this context is not a threat to the legal profession but rather an opportunity to enhance and transform it. AI cannot replace the human qualities that make lawyers valuable, such as critical thinking, creativity, empathy and advocacy. The focus now should be on understanding how users can achieve the best and most accurate results from AI. Lawyers who embrace AI as a tool to augment their skills and expertise will have a competitive edge over those who resist or ignore it.
[1] Martin Magál is a partner, and Katrina Limond and Alexander Calthrop are senior associates, at Allen & Overy. The authors are grateful to Kamilla Marianayagam, Sze Hian Ng and Saarthak Jain for their assistance with the preparation of this chapter.
[2] Victor Ordonez, Taylor Dunn and Eric Noll, ‘OpenAI CEO Sam Altman says AI will reshape society, acknowledges risks: “A little bit scared of this”’, ABC News, 16 March 2023, https://abcnews.go.com/Technology/openai-ceo-sam-altman-ai-reshape-society-acknowledges/story?id=97897122. See also ‘With 100 Million Users In Just 2 Months, OpenAI’s ChatGPT Becomes The Fastest-Growing App In History’, Eyerys, 6 February 2023, www.eyerys.com/articles/timeline/openai-chatgpt-with-100-million-users-in-2-months.
[3] ‘GPT-4 Technical Report’, OpenAI, 27 March 2023, p. 5, https://cdn.openai.com/papers/gpt-4.pdf.
[4] ‘A&O announces exclusive launch partnership with Harvey’, Allen & Overy, 15 February 2023, www.allenovery.com/en-gb/global/news-and-insights/news/ao-announces-exclusive-launch-partnership-with-harvey.
[5] ‘Generative AI startups raised $1.5B in 2022, up from just $213M in 2020’, ‘A New Frontier: Generative AI in 2022’, Dealroom, 20 December 2022, https://dealroom.co/uploaded/2022/12/Generative-AI-Social-Mini-Report-2022.pdf?x67701.
[6] The number of AI papers published across a range of journals and archives is growing exponentially. See M Krenn, et al., ‘Predicting the Future of AI with AI: High-quality link prediction in an exponentially growing knowledge network’, 23 September 2022, https://arxiv.org/pdf/2210.00881.pdf; ‘Growth in AI and robotics research accelerates’, Nature Index, 12 October 2022, www.nature.com/articles/d41586-022-03210-9.
[7] Geoffrey Hinton (winner of the 2018 Turing Award, and one of the ‘Godfathers of AI’) has argued, ‘We need to find a way to control artificial intelligence before it’s too late.’ Even OpenAI’s CEO, Sam Altman, has acknowledged, ‘We’ve got to be careful here. I think people should be happy that we are a little bit scared of this.’ See EL PAÍS, Geoffrey Hinton, ‘We need to find a way to control artificial intelligence before it’s too late’, 12 May 2023, https://english.elpais.com/science-tech/2023-05-12/geoffrey-hinton-we-need-to-find-a-way-to-control-artificial-intelligence-before-its-too-late.html; Edward Helmore, ‘“We are a little bit scared”: OpenAI CEO warns of risks of artificial intelligence’, The Guardian, 17 March 2023, www.theguardian.com/technology/2023/mar/17/openai-sam-altman-artificial-intelligence-warning-gpt4.
[8] A New York lawyer is facing potential sanctions for citing non-existent cases generated by ChatGPT in a legal brief. See Karen Sloan, ‘A lawyer used ChatGPT to cite bogus cases. What are the ethics?’, Reuters, 30 May 2023, www.reuters.com/legal/transactional/lawyer-used-chatgpt-cite-bogus-cases-what-are-ethics-2023-05-30/.
[9] We noted GPT-4’s impressive exam results above. Other AI models have also demonstrated an ability to outperform humans on specific legal tasks. For instance, a 2018 study commissioned by LawGeex, a Tel Aviv-based software company, pitted 20 experienced US-qualified lawyers against LawGeex’s AI algorithm to review non-disclosure agreements (NDAs), to issue-spot and annotate. The AI algorithm (having been trained on thousands of other NDAs) achieved a 94 per cent accuracy rate at spotting issues, in 26 seconds, compared with an average of 85 per cent accuracy for the lawyers in an average time of 92 minutes. See ‘Comparing the Performance of Artificial Intelligence to Human Lawyers in the Review of Standard Business Contracts’, LawGeex, February 2018, https://images.law.com/contrib/content/uploads/documents/397/5408/lawgeex.pdf.
[10] ‘Introducing Microsoft 365 Copilot – your copilot for work’, Microsoft, 16 March 2023, https://blogs.microsoft.com/blog/2023/03/16/introducing-microsoft-365-copilot-your-copilot-for-work/.
[11] See e.g., CoCounsel by Casetext (https://casetext.com/) and LitiGate (https://www.litigate.ai/).
[12] As investigated by the International Chamber of Commerce Commission Task Force in its 2020 Report: ‘The Accuracy of Fact Witness Memory in International Arbitration: Current Issues and Possible Solutions, 2020’, https://library.iccwbo.org/content/dr/commission_reports/cr_0062.htm#TOC_BKL1_1_4.
[13] In the English court context, Practice Directions 32 (paragraph 18.1) and 57AC (applicable to the Business and Property Courts) now require witness statements to be in witnesses’ own words wherever possible.
[14] See e.g., LitiGate (https://www.litigate.ai/).
[15] Perhaps by informing the tribunal if and how AI has been used in preparing materials filed with the tribunal, if required by the tribunal in its procedural directions. The Court of King’s Bench in Manitoba, Canada, has recently introduced a practice direction requiring this information for materials filed with that Court. See Practice Direction, Court of King’s Bench of Manitoba Re: Use of Artificial Intelligence in Court Submissions, 23 June 2023, www.manitobacourts.mb.ca/site/assets/files/2045/practice_direction_-_use_of_artificial_intelligence_in_court_submissions.pdf.
[16] Bobby Allyn, ‘A robot was scheduled to argue in court, then came the jail threats’, NPR, 25 January 2023, www.npr.org/2023/01/25/1151435033/a-robot-was-scheduled-to-argue-in-court-then-came-the-jail-threats.
[17] Martyn Landi, ‘Ai-Da robot makes history by giving evidence to parliamentary inquiry’, The Independent, 11 October 2022, www.independent.co.uk/news/uk/politics/house-of-lords-technology-liberal-democrat-b2200496.html.
[18] This chapter focuses on risks pertinent to the use of AI in the context of evidence. It therefore does not address more general risks to society and civilisation posed by human-competitive AI systems, for example, in the form of economic and political disruptions.
[19] Maxi Scherer, ‘Artificial Intelligence and Legal Decision-Making: The Wide Open?’ (2019) 36(5) Journal of International Arbitration 539, pp. 554–55. Concerns that confidentiality of awards and documents in commercial arbitration are a considerable obstacle for machine learning are capable of circumvention to an extent by use of anonymised versions of awards.
[20] Jenny Gesley and Viktoria Fritz, ‘Artificial “judges”? – Thoughts on AI in Arbitration Law’ (2021), https://blogs.loc.gov/law/2021/01/artificial-judges-thoughts-on-ai-in-arbitration-law/.
[21] Maxi Scherer, op. cit., p. 554.
[22] Michael Ansaldo, ‘When training AI models, is a bigger dataset better?’ Enterprise.nxt, http://web.archive.org/web/20230318054634/https://www.hpe.com/us/en/insights/articles/when-training-ai-models-is-a-bigger-dataset-better-2207.html.
[23] Gunter Vogt, et al., ‘Production of different phenotypes from the same genotype in the same environment by developmental variation’, J. Exp. Biol. 2008 Feb; 211 (Pt 4): 510–23, https://pubmed.ncbi.nlm.nih.gov/18245627/.
[24] ChatGPT was apparently trained on approximately 300 billion words. See Aparna Iyer, ‘Behind ChatGPT’s Wisdom: 300 Bn Words, 570 GB Data’, Analytics India Magazine, 15 December 2022, https://analyticsindiamag.com/behind-chatgpts-wisdom-300-bn-words-570-gb-data/.
[25] For a detailed explanation of how ChatGPT and other large language models work, see Stephen Wolfram, ‘What is ChatGPT Doing . . . and Why Does It Work?’, 14 February 2023, https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/.
[26] OpenAI has recently published strategies and guidelines for how to ‘prompt’ GPT-4 to increase the accuracy of results. See ‘GPT best practices’, OpenAI, https://platform.openai.com/docs/guides/gpt-best-practices.
[27] Notably, GPT-4 is 40 per cent more likely to produce factual responses than GPT-3.5, based largely on increasing the data set and incorporating human feedback. See ‘GPT-4 is OpenAI’s most advanced system, producing safer and more useful responses’, OpenAI, https://openai.com/gpt-4.
[28] Sébastien Bubeck, et al., ‘Sparks of Artificial General Intelligence: Early experiments with GPT-4’, Microsoft, March 2023, www.microsoft.com/en-us/research/publication/sparks-of-artificial-general-intelligence-early-experiments-with-gpt-4/.
[29] Victor Ordonez, Taylor Dunn and Eric Noll, op. cit.
[30] A detailed analysis of the interplay between AI and data protection is beyond the scope of this chapter; however, further guidance can be found at: UK Information Commissioner’s Office, ‘Guidance on AI and data protection’, updated 15 March 2023, https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/; Council of Europe, ‘Artificial Intelligence and Data Protection’, November 2019, https://edoc.coe.int/en/artificial-intelligence/8254-artificial-intelligence-and-data-protection.html.
[31] Lewis Maddison, ‘Samsung workers made a major error by using ChatGPT’, Techradar, 4 April 2023, www.techradar.com/news/samsung-workers-leaked-company-secrets-by-using-chatgpt.
[32] ‘Arbitration as a “new normal” may hinge on cybersecurity’, LexisNexis, 11 May 2020, www.lexisnexis.co.uk/blog/covid-19/arbitration-as-a-new-normal-may-hinge-on-cybersecurity.
[33] David Turner and Gulshan Gill, ‘Addressing emerging cyber risks: reflections on the ICCA Cybersecurity Protocol for International Arbitration’, Practical Law Arbitration Blog, 17 May 2019, http://arbitrationblog.practicallaw.com/addressing-emerging-cyber-risks-reflections-on-the-icca-cybersecurity-protocol-for-international-arbitration/; Luke Eric Peterson, ‘Permanent Court Of Arbitration Website Goes Offline, With Cyber-Security Firm Contending That Security Flaw Was Exploited In Concert With China-Philippines Arbitration’, Investment Arbitration Reporter, 23 July 2015, www.iareporter.com/articles/permanent-court-of-arbitration-goes-offline-with-cyber-security-firm-contending-that-security-flaw-was-exploited-in-lead-up-to-china-philippines-arbitration/.
[34] John Choong et al., ‘Data protection and cybersecurity in international arbitration remain in the spotlight’, Freshfields Bruckhaus Deringer, 2023, www.freshfields.com/en-gb/our-thinking/campaigns/international-arbitration-in-2023/data-protection-and-cybersecurity-in-international-arbitration-remain-in-the-spotlight/.
[35] The 2020 Cybersecurity Protocol for International Arbitration, jointly released in 2019 by the International Council for Commercial Arbitration (ICCA), New York City Bar Association and the International Institute for Conflict Prevention and Resolution, provides helpful guidelines and examples of information security measures that may be adopted and tailored to a particular arbitration, https://cdn.arbitration-icca.org/s3fs-public/document/media_document/ICCA-reports-no-6-icca-nyc-bar-cpr-protocol-cybersecurity-international-arbitration-2022-edition.pdf.
[36] European Commission for the Efficiency of Justice, ‘European ethical Charter on the use of Artificial Intelligence in judicial systems and their environment’, December 2018, https://rm.coe.int/ethical-charter-en-for-publication-4-december-2018/16808f699c, summarised at www.biicl.org/documents/10496_merethe_eckhardt.pdf.
[37] See ‘European ethical Charter on the use of Artificial Intelligence in judicial systems and their environment’, op. cit., p. 6.
[38] A final version of the law will be negotiated by representatives of the three branches of the EU – the European Parliament, the European Commission and the Council of the European Union. Officials have indicated that they hope to reach a final agreement by the end of 2023.
[39] Rishi Bommasani, et al., ‘Do Foundation Model Providers Comply with the Draft EU AI Act?’, Stanford Center for Research on Foundation Models, 15 June 2023, https://crfm.stanford.edu/2023/06/15/eu-ai-act.html.
[40] While commenting on the EU’s proposed AI legislation, OpenAI CEO Sam Altman noted, ‘If we can comply, we will, and if we can’t, we’ll cease operating . . . We will try. But there are technical limits to what’s possible’ (emphasis added). See Billy Perrigo, ‘OpenAI Could Quit Europe Over New AI Rules, CEO Sam Altman Warns’, Time, 25 May 2023, https://time.com/6282325/sam-altman-openai-eu/.
[41] Proposal for a Regulation of the European Parliament and of the Council, laying down harmonised rules on artificial intelligence (Artificial Intelligence Act), 2021/0106 (COD), Recital 40.
[42] id., Chapter 2.
[43] Department for Science, Innovation & Technology and Office for Artificial Intelligence, ‘Policy paper: A pro-innovation approach to AI regulation’, updated 3 August 2023, www.gov.uk/government/publications/ai-regulation-a-pro-innovation-approach/white-paper, paragraph 38.
[44] Alex Engler, ‘The EU and U.S. diverge on AI regulation: A transatlantic comparison and steps to alignment’, The Brookings Institution, 25 April 2023, www.brookings.edu/articles/the-eu-and-us-diverge-on-ai-regulation-a-transatlantic-comparison-and-steps-to-alignment/; Matt O’Shaughnessy and Matt Sheehan, ‘Lessons from the World’s Two Experiments in AI Governance’, Carnegie Endowment for International Peace, 14 February 2023, https://carnegieendowment.org/2023/02/14/lessons-from-world-s-two-experiments-in-ai-governance-pub-89035.
[45] See French Civil Code, Article 1450, www.parisarbitration.com/wp-content/uploads/2014/02/French-Law-on-Arbitration.pdf; Dutch Code of Civil Procedure, Article 1023, www.dutchcivillaw.com/legislation/civilprocedure044.htm.
[46] Judges have admitted to using AI tools to render rulings; see Luke Taylor, ‘Colombian judge says he used ChatGPT in ruling’, The Guardian, 3 February 2023, www.theguardian.com/technology/2023/feb/03/colombia-judge-chatgpt-ruling; and Ben Cost, ‘Judge asks ChatGPT to decide bail in murder trial’, New York Post, 29 March 2023, https://nypost.com/2023/03/29/judge-asks-chatgpt-for-decision-in-murder-trial/.
[47] The next revision is not expected soon. Revisions have been made to date, on average, each decade (1999, 2010, 2020). According to the IBA website, the 2020 revision was made on the recommendation by survey respondents that a revision should be considered on the 10th anniversary of the 2010 Rules.
[48] ICCA-IBA Joint Task Force on Data Protection in International Arbitration, ‘Roadmap to Data Protection in International Arbitration’, ICCA Reports No. 7, 2022, https://cdn.arbitration-icca.org/s3fs-public/document/media_document/ICCA_Reports_No_7_ICCA-IBA_Joint_Task_Force_on_Data_Protection_in_International_Arbitration.pdf, pp. 48 and 61.
[49] Silicon Valley Arbitration and Mediation Center – AI Task Force, ‘Guidelines on the use of Artificial Intelligence in Arbitration’, 31 August 2023, https://thearbitration.org/wp-content/uploads/2023/08/SVAMC-AI-Guidelines-CONSULTATION-DRAFT-31-August-2023-1.pdf.
[50] At least two US federal judges have ordered that lawyers declare use of generative AI tools in cases that appeared before them; Sara Merken, ‘Another US judge says lawyers must disclose AI use’, Reuters, 8 June 2023, www.reuters.com/legal/transactional/another-us-judge-says-lawyers-must-disclose-ai-use-2023-06-08/; Order in the United States Court of International Trade of the Honourable Stephen Alexander Vaden, Judge, 9 June 2023, www.cit.uscourts.gov/sites/cit/files/Order%20on%20Artificial%20Intelligence.pdf.
[51] Ron Karjian, ‘How to detect AI-generated content’, Techtarget, 2 August 2023, www.techtarget.com/searchenterpriseai/feature/How-to-detect-AI-generated-content; Justin Gluska, ‘How To Check If Something Was Written with AI’, Goldpenguin, updated 11 September 2023, https://goldpenguin.org/blog/check-for-ai-content/.
[52] Microsoft President Brad Smith has described deep fakes as his biggest concern around AI; Diane Bartz, ‘Microsoft chief says deep fakes are biggest AI concern’, Reuters, 25 May 2023, www.reuters.com/technology/microsoft-chief-calls-humans-rule-ai-safeguard-critical-infrastructure-2023-05-25/.
[53] In the 2001 International Court of Justice case Qatar v. Bahrain (Maritime Delimitation and Territorial Questions between Qatar and Bahrain (Qatar v. Bahrain), 2001 ICJ Rep. 40 (16 March 2001), 139 ILR 1), Qatar’s memorial was initially accompanied by 82 forged documents; W Michael Reisman and Christina Skinner, ‘Qatar v. Bahrain: massive forgeries’, Fraudulent Evidence before Public International Tribunals (Cambridge University Press, 2014).
[54] As evidenced by a German artist, who, in April 2023, won the Sony World Photography award for a photograph that he later revealed to be an AI creation; see Paul Glynn, ‘Sony World Photography Award 2023: Winner refuses award after revealing AI creation’, BBC News, www.bbc.co.uk/news/entertainment-arts-65296763; Guy Alon, Azmi Haider and Hagit Hel-Or, ‘Judicial Errors: Fake Imaging and the Modern Law of Evidence’, 21 UIC Rev. Intell. Prop. L. 82 (2022), https://repository.law.uic.edu/cgi/viewcontent.cgi?article=1512&context=ripl.
[55] Victor Tangermann, ‘Reality Is Melting as Lawyers Claim Real Videos Are Deepfakes’, Futurism, 10 May 2023, https://futurism.com/reality-melting-lawyers-deepfakes.
[56] Guy Alon, Azmi Haider and Hagit Hel-Or, op. cit.
[57] Jordan Bakst et al., ‘Artificial Intelligence and Arbitration: A US Perspective’, Dispute Resolution International, Vol. 16, No. 1 (May 2022), www.cov.com/-/media/files/corporate/publications/2022/05/artificial-intelligence-and-arbitration-a-us-perspective_bakst-harden-jankauskas-mcmurrough-morril.pdf.
[58] A recent study found that only 51 per cent of lawyers thought generative AI should be used for legal work; see ‘Generative AI could radically alter the practice of law’, The Economist, 6 June 2023, www.economist.com/business/2023/06/06/generative-ai-could-radically-alter-the-practice-of-law.