Publications

Conference paper

Paulino-Passos G, Toni F, 2021,

Monotonicity and Noise-Tolerance in Case-Based Reasoning with Abstract Argumentation

, Pages: 508-518

Conference paper

Lauren S, Belardinelli F, Toni F, 2021,

Aggregating Bipolar Opinions

, 20th International Conference on Autonomous Agents and Multiagent Systems

Conference paper

Kotonya N, Toni F, 2020,

Explainable Automated Fact-Checking: A Survey

, Barcelona, Spain, 28th International Conference on Computational Linguistics (COLING 2020), Publisher: International Committee on Computational Linguistics, Pages: 5430-5443

A number of exciting advances have been made in automated fact-checking thanks to increasingly larger datasets and more powerful systems, leading to improvements in the complexity of claims which can be accurately fact-checked. However, despite these advances, there are still desirable functionalities missing from the fact-checking pipeline. In this survey, we focus on the explanation functionality -- that is fact-checking systems providing reasons for their predictions. We summarize existing methods for explaining the predictions of fact-checking systems and we explore trends in this topic. Further, we consider what makes for good explanations in this specific domain through a comparative analysis of existing fact-checking explanations against some desirable properties. Finally, we propose further research directions for generating fact-checking explanations, and describe how these may lead to improvements in the research area.

Conference paper

Kotonya N, Toni F, 2020,

Explainable Automated Fact-Checking for Public Health Claims

, 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP(1) 2020), Publisher: ACL, Pages: 7740-7754

Fact-checking is the task of verifying the veracity of claims by assessing their assertions against credible evidence. The vast majority of fact-checking studies focus exclusively on political claims. Very little research explores fact-checking for other topics, specifically subject matters for which expertise is required. We present the first study of explainable fact-checking for claims which require specific expertise. For our case study we choose the setting of public health. To support this case study we construct a new dataset PUBHEALTH of 11.8K claims accompanied by journalist crafted, gold standard explanations (i.e., judgments) to support the fact-check labels for claims. We explore two tasks: veracity prediction and explanation generation. We also define and evaluate, with humans and computationally, three coherence properties of explanation quality. Our results indicate that, by training on in-domain data, gains can be made in explainable, automated fact-checking for claims which require specific expertise.

Conference paper

Lertvittayakumjorn P, Specia L, Toni F, 2020,

FIND: Human-in-the-loop debugging deep text classifiers

, 2020 Conference on Empirical Methods in Natural Language Processing, Publisher: ACL

Since obtaining a perfect training dataset (i.e., a dataset which is considerably large, unbiased, and well-representative of unseen cases) is hardly possible, many real-world text classifiers are trained on the available, yet imperfect, datasets. These classifiers are thus likely to have undesirable properties. For instance, they may have biases against some sub-populations or may not work effectively in the wild due to overfitting. In this paper, we propose FIND, a framework which enables humans to debug deep learning text classifiers by disabling irrelevant hidden features. Experiments show that by using FIND, humans can improve CNN text classifiers which were trained under different types of imperfect datasets (including datasets with biases and datasets with dissimilar train-test distributions).

Conference paper

Albini E, Baroni P, Rago A, Toni F et al., 2020,

PageRank as an Argumentation Semantics

, Biennial International Conference on Computational Models of Argument (COMMA), Publisher: IOS PRESS, Pages: 55-66, ISSN: 0922-6389

Conference paper

Cocarascu O, Stylianou A, Cyras K, Toni F et al., 2020,

Data-empowered argumentation for dialectically explainable predictions

, 24th European Conference on Artificial Intelligence (ECAI 2020), Publisher: IOS Press, Pages: 2449-2456

Today’s AI landscape is permeated by plentiful data and dominated by powerful data-centric methods with the potential to impact a wide range of human sectors. Yet, in some settings this potential is hindered by these data-centric AI methods being mostly opaque. Considerable efforts are currently being devoted to defining methods for explaining black-box techniques in some settings, while the use of transparent methods is being advocated in others, especially when high-stake decisions are involved, as in healthcare and the practice of law. In this paper we advocate a novel transparent paradigm of Data-Empowered Argumentation (DEAr in short) for dialectically explainable predictions. DEAr relies upon the extraction of argumentation debates from data, so that the dialectical outcomes of these debates amount to predictions (e.g. classifications) that can be explained dialectically. The argumentation debates consist of (data) arguments which may not be linguistic in general but may nonetheless be deemed to be ‘arguments’ in that they are dialectically related, for instance by disagreeing on data labels. We illustrate and experiment with the DEAr paradigm in three settings, making use, respectively, of categorical data, (annotated) images and text. We show empirically that DEAr is competitive with another transparent model, namely decision trees (DTs), while also providing naturally dialectical explanations.

Journal article

Baroni P, Toni F, Verheij B, 2020,

On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games: 25 years later Foreword

, ARGUMENT & COMPUTATION, Vol: 11, Pages: 1-14, ISSN: 1946-2166

Conference paper

Čyras K, Karamlou A, Lee M, Letsios D, Misener R, Toni F et al., 2020,

AI-assisted schedule explainer for nurse rostering

, AAMAS, Pages: 2101-2103, ISSN: 1548-8403

We present an argumentation-supported explanation generating system, called Schedule Explainer, that assists with makespan scheduling. Our stand-alone generic tool explains to a lay user why a resource allocation schedule is good or not, and offers actions to improve the schedule given the user's constraints. Schedule Explainer provides actionable textual explanations via an interactive graphical interface. We illustrate our system with a proof-of-concept application tool in a nurse rostering scenario whereby a shift-lead nurse aims to account for unexpected events by rescheduling some patient procedures to nurses and is aided by the system to do so.

Conference paper

Albini E, Rago A, Baroni P, Toni F et al., 2020,

Relation-Based Counterfactual Explanations for Bayesian Network Classifiers

, The 29th International Joint Conference on Artificial Intelligence (IJCAI 2020)

Book chapter

Cocarascu O, Toni F, 2020,

Deploying Machine Learning Classifiers for Argumentative Relations “in the Wild”

, Argumentation Library, Pages: 269-285

Argument Mining (AM) aims at automatically identifying arguments and components of arguments in text, as well as at determining the relations between these arguments, on various annotated corpora using machine learning techniques (Lippi & Torroni, 2016).

Conference paper

Jha R, Belardinelli F, Toni F, 2020,

Formal verification of debates in argumentation theory.

, Publisher: ACM, Pages: 940-947

Conference paper

Cocarascu O, Cabrio E, Villata S, Toni F et al., 2020,

Dataset Independent Baselines for Relation Prediction in Argument Mining.

, Publisher: IOS Press, Pages: 45-52

Conference paper

Lertvittayakumjorn P, Toni F, 2019,

Human-grounded evaluations of explanation methods for text classification

, 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Publisher: ACL Anthology, Pages: 5195-5205

Due to the black-box nature of deep learning models, methods for explaining the models’ results are crucial to gain trust from humans and support collaboration between AIs and humans. In this paper, we consider several model-agnostic and model-specific explanation methods for CNNs for text classification and conduct three human-grounded evaluations, focusing on different purposes of explanations: (1) revealing model behavior, (2) justifying model predictions, and (3) helping humans investigate uncertain predictions. The results highlight dissimilar qualities of the various explanation methods we consider and show the degree to which these methods could serve for each purpose.

Conference paper

Schulz C, Toni F, 2019,

On the responsibility for undecisiveness in preferred and stable labellings in abstract argumentation (extended abstract)

, IJCAI International Joint Conference on Artificial Intelligence, Pages: 6382-6386, ISSN: 1045-0823

Different semantics of abstract Argumentation Frameworks (AFs) provide different levels of decisiveness for reasoning about the acceptability of conflicting arguments. The stable semantics is useful for applications requiring a high level of decisiveness, as it assigns to each argument the label “accepted” or the label “rejected”. Unfortunately, stable labellings are not guaranteed to exist, thus raising the question as to which parts of AFs are responsible for the non-existence. In this paper, we address this question by investigating a more general question concerning preferred labellings (which may be less decisive than stable labellings but are always guaranteed to exist), namely why a given preferred labelling may not be stable and thus undecided on some arguments. In particular, (1) we give various characterisations of parts of an AF, based on the given preferred labelling, and (2) we show that these parts are indeed responsible for the undecisiveness if the preferred labelling is not stable. We then use these characterisations to explain the non-existence of stable labellings.

Conference paper

Cocarascu O, Rago A, Toni F, 2019,

Extracting dialogical explanations for review aggregations with argumentative dialogical agents

, International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), Publisher: International Foundation for Autonomous Agents and Multiagent Systems

The aggregation of online reviews is fast becoming the chosen method of quality control for users in various domains, from retail to entertainment. Consequently, fair, thorough and explainable aggregation of reviews is increasingly sought-after. We consider the movie review domain, and in particular Rotten Tomatoes' ubiquitous (and arguably over-simplified) aggregation method, the Tomatometer Score (TS). For a movie, this amounts to the percentage of critics giving the movie a positive review. We define a novel form of argumentative dialogical agent (ADA) for explaining the reasoning within the reviews. ADA integrates: 1.) NLP with reviews to extract a Quantitative Bipolar Argumentation Framework (QBAF) for any chosen movie to provide the underlying structure of explanations, and 2.) gradual semantics for QBAFs for deriving a dialectical strength measure for movies, as an alternative to the TS, satisfying desirable properties for obtaining explanations. We evaluate ADA using some prominent NLP methods and gradual semantics for QBAFs. We show that they provide a dialectical strength which is comparable with the TS, while at the same time being able to provide dialogical explanations of why a movie obtained its strength via interactions between the user and ADA.

Conference paper

Karamlou A, Cyras K, Toni F, 2019,

Deciding the winner of a debate using bipolar argumentation

, International Conference on Autonomous Agents and MultiAgent Systems, Publisher: IFAAMAS / ACM, Pages: 2366-2368, ISSN: 2523-5699

Bipolar Argumentation Frameworks (BAFs) are an important class of argumentation frameworks useful for capturing, reasoning with, and deriving conclusions from debates. They have the potential to make solid contributions to real-world multi-agent systems and human-agent interaction in domains such as legal reasoning, healthcare and politics. Despite this fact, practical systems implementing BAFs are largely lacking. In this demonstration, we provide a software system implementing novel algorithms for calculating extensions (winning sets of arguments) of BAFs. Participants in the demonstration will be able to input their own debates into our system, and watch a graphical representation of the algorithms as they process information and decide which sets of arguments are winners of the debate.

Journal article

Baroni P, Rago A, Toni F, 2019,

From fine-grained properties to broad principles for gradual argumentation: A principled spectrum

, International Journal of Approximate Reasoning, Vol: 105, Pages: 252-286, ISSN: 0888-613X

The study of properties of gradual evaluation methods in argumentation has received increasing attention in recent years, with studies devoted to various classes of frameworks/methods leading to conceptually similar but formally distinct properties in different contexts. In this paper we provide a novel systematic analysis for this research landscape by making three main contributions. First, we identify groups of conceptually related properties in the literature, which can be regarded as based on common patterns and, using these patterns, we evidence that many further novel properties can be considered. Then, we provide a simplifying and unifying perspective for these groups of properties by showing that they are all implied by novel parametric principles of (either strict or non-strict) balance and monotonicity. Finally, we show that (instances of) these principles (and thus the group, literature and novel properties that they imply) are satisfied by several quantitative argumentation formalisms in the literature, thus confirming the principles' general validity and utility to support a compact, yet comprehensive, analysis of properties of gradual argumentation.

Conference paper

Cyras K, Domínguez J, Karamlou A, Prociuk D, Curcin V, Delaney B, Toni F, Chalkidou K, Darzi A et al., 2019,

ROAD2H: Learning Decision Support System for Low- and Middle-Income Countries.

, Publisher: AMIA

Conference paper

Kotonya N, Toni F, 2019,

Gradual Argumentation Evaluation for Stance Aggregation in Automated Fake News Detection

, 6th Workshop on Argument Mining (ArgMining), Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL, Pages: 156-166

Journal article

Cocarascu O, Toni F, 2018,

Combining deep learning and argumentative reasoning for the analysis of social media textual content using small datasets

, Computational Linguistics, Vol: 44, Pages: 833-858, ISSN: 0891-2017

The use of social media has become a regular habit for many and has changed the way people interact with each other. In this article, we focus on analysing whether news headlines support tweets and whether reviews are deceptive by analysing the interaction or the influence that these texts have on the others, thus exploiting contextual information. Concretely, we define a deep learning method for Relation-based Argument Mining to extract argumentative relations of attack and support. We then use this method for determining whether news articles support tweets, a useful task in fact-checking settings, where determining agreement towards a statement is a useful step towards determining its truthfulness. Furthermore we use our method for extracting Bipolar Argumentation Frameworks from reviews to help detect whether they are deceptive. We show experimentally that our method performs well in both settings. In particular, in the case of deception detection, our method contributes a novel argumentative feature that, when used in combination with other features in standard supervised classifiers, outperforms the latter even on small datasets.

Conference paper

Popescu C, Cocarascu O, Toni F, 2018,

A platform for crowdsourcing corpora for argumentative

, The International Workshop on Dialogue, Explanation and Argumentation in Human-Agent Interaction (DEXAHAI)

One problem that Argument Mining (AM) is facing is the difficulty of obtaining suitable annotated corpora. We propose a web-based platform, BookSafari, that allows crowdsourcing of annotated corpora for relation-based AM from users providing reviews for books and exchanging opinions about these reviews to facilitate argumentative dialogue. The annotations amount to pairwise argumentative relations of attack and support between opinions and between opinions and reviews. As a result of the annotations, reviews and opinions form structured debates which can be understood as bipolar argumentation frameworks. The platform also empowers annotations of the same pairs by multiple annotators and can support different measures of inter-annotator agreement and corpora selection.

Conference paper

Cyras K, Delaney B, Prociuk D, Toni F, Chapman M, Dominguez J, Curcin V et al., 2018,

Argumentation for explainable reasoning with conflicting medical recommendations

, Reasoning with Ambiguous and Conflicting Evidence and Recommendations in Medicine (MedRACER 2018), Pages: 14-22

Designing a treatment path for a patient suffering from multiple conditions involves merging and applying multiple clinical guidelines and is recognised as a difficult task. This is especially relevant in the treatment of patients with multiple chronic diseases, such as chronic obstructive pulmonary disease, because of the high risk of any treatment change having potentially lethal exacerbations. Clinical guidelines are typically designed to assist a clinician in treating a single condition with no general method for integrating them. Additionally, guidelines for different conditions may contain mutually conflicting recommendations with certain actions potentially leading to adverse effects. Finally, individual patient preferences need to be respected when making decisions. In this work we present a description of an integrated framework and a system to execute conflicting clinical guideline recommendations by taking into account patient specific information and preferences of various parties. Overall, our framework combines a patient’s electronic health record data with clinical guideline representation to obtain personalised recommendations, uses computational argumentation techniques to resolve conflicts among recommendations while respecting preferences of various parties involved, if any, and yields conflict-free recommendations that are inspectable and explainable. The system implementing our framework will allow for continuous learning by taking feedback from the decision makers and integrating it within its pipeline.

Conference paper

Baroni P, Borsato S, Rago A, Toni F et al., 2018,

The "Games of Argumentation" web platform

, 7th International Conference on Computational Models of Argument (COMMA 2018), Publisher: IOS Press, Pages: 447-448, ISSN: 0922-6389

This demo presents the web system “Games of Argumentation”, which allows users to build argumentation graphs and examine them in a game-theoretical manner using up to three different evaluation techniques. The concurrent evaluations of arguments using different techniques, which may be qualitative or quantitative, provides a significant aid to users in both understanding game-theoretical argumentation semantics and pinpointing their differences from alternative semantics, traditional or otherwise, to differentiate between them.

Conference paper

Rago A, Baroni P, Toni F, 2018,

On instantiating generalised properties of gradual argumentation frameworks

, SUM 2018, Publisher: Springer Verlag, Pages: 243-259, ISSN: 0302-9743

Several gradual semantics for abstract and bipolar argumentation have been proposed in the literature, ascribing to each argument a value taken from a scale, i.e. an ordered set. These values somewhat match the arguments’ dialectical status and provide an indication of their dialectical strength, in the context of the given argumentation framework. These research efforts have been complemented by formulations of several properties that these gradual semantics may satisfy. More recently a synthesis of many literature properties into more general groupings based on parametric definitions has been proposed. In this paper we show how this generalised parametric formulation enables the identification of new properties not previously considered in the literature and discuss their usefulness to capture alternative requirements coming from different application contexts.

Conference paper

Rago A, Cocarascu O, Toni F, 2018,

Argumentation-based recommendations: fantastic explanations and how to find them

, The Twenty-Seventh International Joint Conference on Artificial Intelligence, (IJCAI 2018), Pages: 1949-1955

A significant problem of recommender systems is their inability to explain recommendations, resulting in turn in ineffective feedback from users and the inability to adapt to users’ preferences. We propose a hybrid method for calculating predicted ratings, built upon an item/aspect-based graph with users’ partially given ratings, that can be naturally used to provide explanations for recommendations, extracted from user-tailored Tripolar Argumentation Frameworks (TFs). We show that our method can be understood as a gradual semantics for TFs, exhibiting a desirable, albeit weak, property of balance. We also show experimentally that our method is competitive in generating correct predictions, compared with state-of-the-art methods, and illustrate how users can interact with the generated explanations to improve quality of recommendations.

Conference paper

Cocarascu O, Cyras K, Toni F, 2018,

Explanatory predictions with artificial neural networks and argumentation

, Workshop on Explainable Artificial Intelligence (XAI)

Data-centric AI has proven successful in several domains, but its outputs are often hard to explain. We present an architecture combining Artificial Neural Networks (ANNs) for feature selection and an instance of Abstract Argumentation (AA) for reasoning to provide effective predictions, explainable both dialectically and logically. In particular, we train an autoencoder to rank features in input examples, and select highest-ranked features to generate an AA framework that can be used for making and explaining predictions as well as mapped onto logical rules, which can equivalently be used for making predictions and for explaining. We show empirically that our method significantly outperforms ANNs and a decision-tree-based method from which logical rules can also be extracted.

Conference paper

Baroni P, Rago A, Toni F, 2018,

How many Properties do we need for Gradual Argumentation?

, AAAI 2018, Publisher: AAAI

The study of properties of gradual evaluation methods in argumentation has received increasing attention in recent years, with studies devoted to various classes of frameworks/methods leading to conceptually similar but formally distinct properties in different contexts. In this paper we provide a systematic analysis for this research landscape by making three main contributions. First, we identify groups of conceptually related properties in the literature, which can be regarded as based on common patterns and, using these patterns, we evidence that many further properties can be considered. Then, we provide a simplifying and unifying perspective for these properties by showing that they are all implied by the parametric principles of (either strict or non-strict) balance and monotonicity. Finally, we show that (instances of) these principles are satisfied by several quantitative argumentation formalisms in the literature, thus confirming their general validity and their utility to support a compact, yet comprehensive, analysis of properties of gradual argumentation.

Conference paper

Baroni P, Borsato S, Rago A, Toni F et al., 2018,

The "Games of Argumentation" Web Platform.

, Publisher: IOS Press, Pages: 447-448

Conference paper

Rago A, Toni F, 2017,

Quantitative Argumentation Debates with Votes for Opinion Polling

, PRIMA, Publisher: Springer, Pages: 369-385, ISSN: 0302-9743

Opinion polls are used in a variety of settings to assess the opinions of a population, but they mostly conceal the reasoning behind these opinions. Argumentation, as understood in AI, can be used to evaluate opinions in dialectical exchanges, transparently articulating the reasoning behind the opinions. We give a method integrating argumentation within opinion polling to empower voters to add new statements that render their opinions in the polls individually rational while at the same time justifying them. We then show how these poll results can be amalgamated to give a collectively rational set of voters in an argumentation framework. Our method relies upon Quantitative Argumentation Debate for Voting (QuAD-V) frameworks, which extend QuAD frameworks (a form of bipolar argumentation frameworks in which arguments have an intrinsic strength) with votes expressing individuals’ opinions on arguments.

Imperial College London

Computational Logic and Argumentation Group
