Audit Algorithms

repo: erwanlemerrer/awesome-audit-algorithms


Awesome Audit Algorithms

A curated list of algorithms for auditing black-box algorithms. Nowadays, many algorithms (recommendation, scoring, classification) are operated by third-party providers, without users or institutions having any insight into how they operate on their data. The audit algorithms in this list apply to this setup, coined the "black-box" setup, in which an auditor wants to gain some insight into these remote algorithms.

<img src="https://github.com/erwanlemerrer/awesome-audit-algorithms/blob/main/resources/audit.png" width="600" alt="banner" class="center">

A user queries a remote algorithm (e.g., through available APIs) to infer information about that algorithm.
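As a minimal illustration of this black-box setup (a hypothetical sketch, not taken from any listed paper; the model, the group construction, and all names are illustrative assumptions), an auditor who can only observe input-output pairs might probe a remote classifier for disparity between two input groups:

```python
import random

def remote_model(x):
    # Stand-in for a third-party API: the auditor cannot inspect this code.
    return 1 if 2.0 * x[0] - 1.0 * x[1] > 0 else 0

def audit_disparity(query, n=1000, seed=0):
    """Estimate the positive-output rate for two synthetic input groups
    using only black-box queries (a toy demographic-parity probe)."""
    rng = random.Random(seed)
    rates = []
    for shift in (0.0, 0.5):  # two synthetic "groups", offset on one feature
        hits = sum(query((rng.random() + shift, rng.random())) for _ in range(n))
        rates.append(hits / n)
    return abs(rates[0] - rates[1])

gap = audit_disparity(remote_model)
```

The auditor never sees the model's weights; every conclusion is drawn from query responses alone, which is the common constraint across the papers below.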

Contents

Papers

2026

2025

  • Auditing Pay-Per-Token in Large Language Models - (arXiv) Develops an auditing framework based on martingale theory that enables a trusted third-party auditor who sequentially queries a provider to detect token misreporting.
  • P2NIA: Privacy-Preserving Non-Iterative Auditing - (ECAI) Proposes a mutually beneficial collaboration for both the auditor and the platform: a privacy-preserving and non-iterative audit scheme that enhances fairness assessments using synthetic or local data, avoiding the challenges associated with traditional API-based audits.
  • [The Fair Game: Auditing & debiasing AI algorithms over time](https://www.cambridge.org/core/services/aop-cambridge-core/content/view/9E8408C67F7CE30505122DD1586D9FA2/S3033373325000080a.pdf/the-fair-game-auditing-and-debiasing-ai-algorithms-over-time.pdf) - (Cambridge Forum on AI: Law and Governance) Aims to simulate the evolution of ethical and legal frameworks in society by creating an auditor that sends feedback to a debiasing algorithm deployed around an ML system.
  • Robust ML Auditing using Prior Knowledge - (ICML) Formally establishes the conditions under which an auditor can prevent audit manipulations using prior knowledge about the ground truth.
  • CALM: Curiosity-Driven Auditing for Large Language Models - (AAAI) Frames auditing as a black-box optimization problem whose goal is to automatically uncover input-output pairs of the target LLM that exhibit illegal, immoral, or unsafe behaviors.
  • Queries, Representation & Detection: The Next 100 Model Fingerprinting Schemes - (AAAI) Divides model fingerprinting into three core components, identifying ∼100 previously unexplored combinations of these and gaining insights into their performance.
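As a rough illustration of query-based model fingerprinting (a hypothetical sketch, not any specific scheme from the paper above; all names and the toy scalar models are assumptions), one can compare two black-box models' answers on a fixed set of probe inputs and use the agreement rate as a similarity score:

```python
def fingerprint_match(model_a, model_b, probes):
    """Fraction of probe inputs on which two black-box models agree:
    a toy similarity score for query-based fingerprinting."""
    agree = sum(model_a(x) == model_b(x) for x in probes)
    return agree / len(probes)

def reference(x):   # model the auditor owns
    return x > 0.5

def suspect(x):     # model suspected of being a copy of the reference
    return x > 0.5

def unrelated(x):   # an independently built model
    return x > -3.0

probes = [i / 10 - 1.0 for i in range(21)]  # scalar probe grid over [-1, 1]
```

A real fingerprinting scheme must additionally choose queries that separate copies from independent models, which is exactly the design space the paper decomposes.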

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

  • Measuring Personalization of Web Search - (WWW) Develops a methodology for measuring personalization in Web search results.
  • [Auditing: Active Learning with Outcome-Dependent Query Costs](https://www.cs.bgu.ac.il/~sabatos/papers/SabatoSarwate13.pdf) - (NIPS) Learns a binary classifier while paying only for negative labels.

2012

2008

2005

  • Adversarial Learning - (KDD) Reverse engineering of remote linear classifiers using membership queries.
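In the spirit of membership-query reverse engineering (a toy sketch under stated assumptions, not the paper's actual algorithm: the hidden 2-D linear model, the starting points, and all function names are illustrative), an auditor can bisect between a positive and a negative input to find boundary points, then recover the hyperplane direction from them:

```python
def remote_classifier(x):
    # Hidden linear model: the auditor only ever sees its binary answers.
    # (Illustrative assumption, not a model from the paper.)
    w, b = (3.0, -2.0), 1.0
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

def boundary_point(query, inside, outside):
    """Bisect between a positive ('inside') and a negative ('outside')
    point to locate a point on the decision boundary, using only
    membership queries."""
    for _ in range(60):
        mid = tuple((a + b) / 2 for a, b in zip(inside, outside))
        if query(mid) == 1:
            inside = mid
        else:
            outside = mid
    return inside

def reverse_engineer(query):
    """Estimate the hyperplane direction of a 2-D linear classifier,
    up to scale, from two boundary points found by bisection."""
    p = boundary_point(query, (10.0, 0.0), (-10.0, 0.0))
    q = boundary_point(query, (0.0, -10.0), (0.0, 10.0))
    dx, dy = q[0] - p[0], q[1] - p[1]
    return (dy, -dx)  # normal vector, orthogonal to the boundary segment
```

Each boundary point costs about 60 queries here; the recovered direction is parallel to the hidden weight vector, illustrating how few membership queries a linear model can withstand.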

