Search results
Feb 8, 2024 · We highlight that the current trend towards increasing the plausibility of explanations, primarily driven by the demand for user-friendly interfaces, may come at the cost of diminishing their faithfulness. We assert that the faithfulness of explanations is critical in LLMs employed for high-stakes decision-making.
definition of faithfulness. Since LLM explanations mimic human explanations, they often reference high-level concepts in the input question that purportedly influenced the model. We define faithfulness in terms of the difference between the set of concepts that LLM explanations imply are influential and the set that truly are.
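A minimal sketch of the set-based definition above, in Python. The snippet does not say how the difference between the two sets is measured, so the Jaccard overlap used here is an assumption, as are the function name and the example concepts. How the truly influential set is obtained is outside the scope of this sketch.

```python
def concept_mismatch(implied, influential):
    """Compare concepts an explanation implies are influential with
    concepts that truly are (both supplied by the caller).

    Hypothetical helper: the scoring choice (Jaccard overlap) is an
    assumption, not the measure used in the quoted paper.
    """
    implied, influential = set(implied), set(influential)
    # Concepts the explanation cites that did not actually matter.
    overclaimed = implied - influential
    # Concepts that mattered but the explanation never mentions.
    omitted = influential - implied
    union = implied | influential
    # Treat two empty sets as perfect agreement.
    overlap = len(implied & influential) / len(union) if union else 1.0
    return overclaimed, omitted, overlap


# Made-up example sets, for illustration only.
implied = {"symptoms", "age"}
influential = {"symptoms", "gender"}
overclaimed, omitted, overlap = concept_mismatch(implied, influential)
```

Under this reading, an explanation is perfectly faithful when both difference sets are empty; the overlap score is only one of several ways to summarize partial agreement.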
Feb 7, 2024 · In this work, we discuss the dichotomy between faithfulness and plausibility in self-explanations (SEs) generated by LLMs. We argue that while LLMs are adept at generating plausible explanations -- seemingly logical and coherent to human users -- these explanations do not necessarily align with the reasoning processes of the LLMs, raising concerns about their faithfulness.
- arXiv:2402.04614 [cs.CL]
- Computation and Language (cs.CL)
to which an explanation accurately reflects a model’s reasoning process (Jacovi and Goldberg, 2020). In other words, an explanation should not “lie” about the underlying mechanism at work. Explanations that lack faithfulness can be dangerous, especially when they still appear plausible, i.e., convincing to humans. This can mislead the
Feb 7, 2024 · The paper asserts that the faithfulness of explanations is critical in LLMs employed for high-stakes decision-making and calls upon the community to develop novel methods to enhance the faithfulness of self-explanations, thereby enabling transparent deployment of LLMs in diverse high-stakes settings. Large Language Models (LLMs) are deployed as powerful tools for several natural language ...
Apr 21, 2023 · The seven explanatory virtues are: Explanatoriness: Explanations must explain all the observed facts. Depth: Explanations should not raise more questions than they answer. Power: Explanations should apply in a range of similar contexts, not just the current situation in which the explanation is being offered.
Mar 2, 2024 · However, in the context of faithfulness, we must warn against HCI-inspired evaluation, as well: increased performance in this setting is not indicative of faithfulness; rather, it is indicative of correlation between the plausibility of the explanations and the model’s performance.