Post #8: Into the Abyss: Examining AI Failures and Lessons Learned

Amid the rapid integration of Artificial Intelligence (AI) across diverse sectors, instances of AI initiatives veering off course serve as poignant reminders (and cautionary tales) of the practical perils of AI development and deployment. These episodes underscore the multifaceted risks associated with AI integration and help frame conversations about the societal implications of AI-driven technologies in more practical terms. By examining these cases of AI misalignment, we unearth valuable insights into the immediate ramifications that accompany the integration of AI into our lives. Moreover, such examinations serve as crucial touchstones for policymakers, technologists, and stakeholders seeking to identify the risks that should inform other AI initiatives.

In this week’s essay, we provide examples of misaligned AI, not to cast a shadow over AI entrepreneurial ventures, but to highlight the necessity of comprehensive and ethical governance strategies in the pursuit of responsible AI deployment. In certain cases, the enterprises responsible for misaligned AI systems acted swiftly to address negative societal impacts; where this occurred, we commend them. Emerging technology innovation is not for the faint of heart; we applaud the entrepreneurship at the core of AI innovation and firmly believe that examining the past is a vital step towards ensuring a safer and more ethically conscious future in this age of AI.

IBM Watson for Oncology 

In 2018, IBM’s Watson for Oncology, lauded as a revolutionary tool for AI-enabled personalized cancer treatment, encountered significant setbacks after it produced inaccurate and unsafe treatment recommendations. The system's reliance on synthetic training data, coupled with limited real-world patient data, underscored the critical importance of data quality and diversity in AI-driven healthcare solutions. Because the accuracy and efficacy of the AI-generated recommendations proved insufficient, IBM ultimately decided to discontinue its Watson for Oncology solution. This case exemplifies the imperative for rigorous data validation protocols to generate high-value recommendations, and it shows how overreliance on synthetic data can diminish AI effectiveness and model accuracy. This setback may also be an example of a problem better left to humans: in this case, oncologists with years of specialized training, experience, and highly contextual knowledge of the most complex of systems, the human body.

Amazon’s Algorithmic Hiring Decisions 

In the realm of algorithmic hiring, inaccurate or biased algorithms can perpetuate societal inequalities and create disparate impacts on specific demographic groups. In 2014, Amazon began developing an AI system intended to streamline its recruitment process. The system failed to achieve its desired outcomes and, upon examination, was found to have produced discriminatory results against women. It had been trained to make hiring recommendations based on resumes submitted to Amazon over a ten-year period.

Those resumes, however, came predominantly from male job seekers, and the model learned to favor patterns associated with male candidates. Consequently, Amazon opted to discontinue the AI-driven hiring program. This case warns not only of gender bias; it also underscores the potential for proxy factors, such as zip codes, membership in certain associations, and even interests and hobbies, to inadvertently influence hiring decisions.

Recognizing the ethical implications of algorithmic hiring, policymakers in jurisdictions like the European Union and New York City have classified AI-enabled hiring decisions as high-risk activities necessitating heightened oversight. Such designations emphasize the importance of ensuring that AI programs undergo rigorous scrutiny to prevent discriminatory outcomes. 

Zillow & Air Canada Face Financial Losses 

Instances such as those involving Zillow and Air Canada underscore the tangible financial risks associated with AI inaccuracies, alongside the potential damage to reputation and customer trust. In 2021, Zillow’s AI-supported home-buying algorithm was found to have overestimated the value of homes. The company purchased homes at inflated prices and incurred losses of hundreds of millions of dollars upon resale, leading it to lay off roughly 25% of its workforce and close the associated Zillow Offers division. Similarly, Canada's flagship airline, Air Canada, suffered financial and reputational losses stemming from an adverse tribunal ruling, triggered by its customer-service chatbot, which confabulated erroneous responses to questions about the company’s bereavement fare policy.

These incidents serve as stark reminders of the significant financial and reputational consequences that can arise from flawed AI systems. To mitigate such risks, it is imperative for companies to adopt best practices, such as conducting societal and stakeholder impact assessments during the design phase of AI solution development, in order to flag possible negative outcomes and develop contingency plans.

ShotSpotter – Implicating Personal Freedom 

The misuse of AI-powered gunshot detection technology in criminal justice proceedings underscores the ethical implications of AI applications in high-stakes contexts. The wrongful imprisonment of Michael Williams exemplifies the dire consequences that algorithmic inaccuracies can have for individual freedom and due process. Williams was charged with murder principally on the basis of data from ShotSpotter, a gunshot detection technology that uses AI-supported acoustic sensors to detect and locate gunfire.

Williams was imprisoned for nearly a year before his case was dismissed, after the data extracted from this AI tool was found to be inaccurate and insufficient to support the murder charge. This case underscores the ethical imperative of rigorously testing and validating AI systems, particularly in contexts implicating personal liberties and human rights.

Uber & Tesla’s Autonomous Vehicles 

The fatalities involving autonomous vehicles operated by Uber and Tesla underscore the ethical imperatives of responsible AI development and governance in high-risk domains. In 2018, a self-driving Uber test vehicle in suburban Phoenix struck and killed a pedestrian, the first pedestrian fatality involving a self-driving vehicle. In 2016, the driver of a Tesla Model S was killed when his car, operating on Tesla's Autopilot system, crashed into a tractor-trailer in Florida; and in December 2019, two people were killed in California when a Tesla on Autopilot ran a red light and collided with another vehicle.

These tragedies accentuate the need for stronger AI governance and regulation of the highest-risk AI systems, those implicating personal safety. Moreover, these cases highlight the necessity for greater clarity about organizational and personal accountability when AI goes wrong and people are injured or killed.

Content Platforms - Disinformation and Broader Societal Threats 

The proliferation of AI-generated misinformation on social platforms like Facebook and X has ignited a firestorm of controversy, signaling far-reaching consequences for societal discourse and individual empowerment. In the absence of comprehensive AI governance frameworks, the dissemination of false and misleading content distorts user perceptions and undermines their ability to make informed decisions. The implications of inadequate AI governance extend beyond misinformation, encompassing broader societal threats such as diminished trust in technology, exacerbated social inequalities, compromised privacy and individual rights, and even jeopardized democratic principles. 

*** 

Are your AI governance protocols sufficiently resilient to avert catastrophes akin to the examples outlined in this essay? Please reach out if you are concerned about the negative societal consequences that may result if the AI systems your organization develops go astray. As always, we welcome your input and collaboration. Until next week. 

The Business AI Ethics research team is part of the Edmond & Lily Safra Center’s ongoing effort to promote the application of ethics in practice. Their research assists business leaders in examining the promise and challenges of AI technologies through an ethical lens. Views expressed in these posts are those of the author(s) and do not imply endorsement by ELSCE.