Search results
Feb 8, 2024 · Evaluating the faithfulness of explanations is a non-trivial problem due to the lack of ground truth explanations. This problem has worsened in the case of self-explanations from LLMs, as the billion-parameter scale and often proprietary nature of LLMs make assessments using saliency maps and other gradient-based methods nearly impossible.
- [2402.04614] Faithfulness vs. Plausibility: On the (Un ...
Jan 2, 2024 · They find that explanations from LLMs such as GPT-3.5 and GPT-4 have low precision, indicating that they mislead humans into forming incorrect mental models. The article reveals the limitations of current methods and shows that optimizing for human preferences such as plausibility may be insufficient for improving counterfactual simulatability.
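The snippet above uses "precision" in the counterfactual-simulatability sense. As a minimal sketch (not the cited paper's implementation), precision can be read as: among counterfactual inputs where a reader of the explanation commits to a guess about the model's answer, the fraction on which that guess matches the model's actual output. The `model_predict` and `simulate` callables below are hypothetical placeholders.

```python
from typing import Callable, List, Optional

def simulatability_precision(
    counterfactuals: List[str],
    explanation: str,
    model_predict: Callable[[str], str],            # hypothetical: the LLM's actual answer on an input
    simulate: Callable[[str, str], Optional[str]],  # hypothetical: a reader's guess given the explanation, or None if unsure
) -> float:
    """Among counterfactual inputs where the explanation-informed simulator
    commits to a guess, return the fraction on which that guess matches the model."""
    guesses = [(x, simulate(x, explanation)) for x in counterfactuals]
    committed = [(x, g) for x, g in guesses if g is not None]
    if not committed:
        return 0.0
    agree = sum(g == model_predict(x) for x, g in committed)
    return agree / len(committed)
```

Low precision under this reading means the explanation leads readers to expect behavior the model does not actually exhibit, i.e., it fosters an incorrect mental model.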
In this work, we discuss the dichotomy between faithfulness and plausibility in SEs generated by LLMs. We argue that while LLMs are adept at generating plausible explanations -- seemingly logical and coherent to human users -- these explanations do not necessarily align with the reasoning processes of the LLMs, raising concerns about their faithfulness.
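To make the dichotomy concrete, here is a hedged toy sketch, assuming a hypothetical black-box `model_predict` function and self-explanations that cite specific input words as decisive. Faithfulness is crudely approximated by whether ablating the cited words actually changes the prediction; a perfectly plausible-sounding explanation can still score poorly on this check.

```python
from typing import Callable, List

def ablation_faithfulness(
    text: str,
    cited_words: List[str],               # words the self-explanation claims were decisive
    model_predict: Callable[[str], str],  # hypothetical black-box predictor
) -> float:
    """Fraction of cited words whose removal flips the model's prediction.
    A plausible explanation gives no guarantee of a high score here."""
    if not cited_words:
        return 0.0
    original = model_predict(text)
    flips = 0
    for w in cited_words:
        ablated = " ".join(tok for tok in text.split() if tok != w)
        if model_predict(ablated) != original:
            flips += 1
    return flips / len(cited_words)
```

This is only an illustrative proxy; for billion-parameter, often proprietary LLMs, even such perturbation-based checks are costly, which is part of why faithfulness evaluation remains open.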
Feb 7, 2024 · Large Language Models (LLMs) are deployed as powerful tools for several natural language processing (NLP) applications. Recent works show that modern LLMs can generate self-explanations (SEs), which elicit their intermediate reasoning steps for explaining their behavior. Self-explanations have seen widespread adoption owing to their conversational and plausible nature. However, there is little ...
- arXiv:2402.04614 [cs.CL]
- Computation and Language (cs.CL)
IBE-Eval can successfully identify the best explanation supporting the correct answers with up to 77% accuracy (+27% above random and +17% over GPT-3.5-as-a-Judge baselines). IBE-Eval is significantly correlated with human judgment, outperforming a GPT-3.5-as-a-Judge baseline in terms of alignment with human preferences.
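The snippet describes IBE-Eval as selecting the explanation that best supports the correct answer. A minimal sketch of such an Inference-to-the-Best-Explanation style selection is shown below; the per-criterion scorers and weights are assumptions for illustration, not the criteria actually used by IBE-Eval.

```python
from typing import Callable, Dict, List

def select_best_explanation(
    candidates: List[str],
    criteria: Dict[str, Callable[[str], float]],  # hypothetical scorers, e.g. {"coherence": ..., "parsimony": ...}
    weights: Dict[str, float],                    # hypothetical criterion weights
) -> str:
    """Return the candidate explanation with the highest weighted criterion score
    (an argmax 'best explanation' selection)."""
    def total(expl: str) -> float:
        return sum(weights.get(name, 1.0) * fn(expl) for name, fn in criteria.items())
    return max(candidates, key=total)
```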
Feb 7, 2024 · The faithfulness of explanations is asserted to be critical for LLMs employed in high-stakes decision-making, and the community is called upon to develop novel methods that enhance the faithfulness of self-explanations, thereby enabling the transparent deployment of LLMs in diverse high-stakes settings.