  1. Towards Safer Large Language Models through Machine Unlearning

    Feb 15, 2024 · To address this gap, we introduce Selective Knowledge negation Unlearning (SKU), a novel unlearning framework for LLMs, designed to eliminate harmful knowledge while …

  2. Towards Safer Large Language Models through Machine Unlearning

    6 days ago · The rapid advancement of Large Language Models (LLMs) has demonstrated their vast potential across various domains, attributed to their extensive pretraining knowledge and …

  3. Rethinking machine unlearning for large language models

    Feb 17, 2025 · We explore machine unlearning in the domain of large language models (LLMs), referred to as LLM unlearning.

  4. Towards Safer Large Language Models through Machine Unlearning

    Feb 15, 2024 · This work introduces a scalable, automated approach to generate high-quality forget sets using language models themselves, and suggests that synthetic datasets offer a …

  5. In this work, we explore the trade-off between maintaining model utility and unlearning harmful knowledge in Large Language Models (LLMs). To tackle this challenge, we introduce SKU, an …

  6. Large Language Models (LLMs) (Brown et al., 2020; Chowdhery et al., 2023; Touvron et al., 2023; Qin et al., 2023) have demonstrated their exceptional ability across various AI applications

  7. We envision LLM unlearning becoming a pivotal element in the life-cycle management of LLMs, potentially standing as an essential foundation for developing generative artificial intelligence...

  8. [2405.15152] Machine Unlearning in Large Language Models

    May 24, 2024 · This paper presents a dual-pronged approach to enhance the ethical and safe behavior of large language models (LLMs) by addressing the issues of harmful responses and …

  9. SAFEERASER, a safety unlearning benchmark for MLLMs, consisting of 3,000 images and 28.8K VQA pairs. We comprehensively evaluate unlearning methods from two perspectives: forget …

  10. SafeEraser: Enhancing Safety in Multimodal Large Language Models ...

    Feb 18, 2025 · To address this issue, we propose SAFEERASER, a safety unlearning benchmark for MLLMs, consisting of 3,000 images and 28.8K VQA pairs. We comprehensively evaluate …

  11. ACL Anthology

    @inproceedings{liu-etal-2024-towards-safer, title = "Towards Safer Large Language Models through Machine Unlearning", author = "Liu, Zheyuan and Dou, Guangyao and Tan, Zhaoxuan …

  12. Towards Safer Large Language Models through Machine Unlearning

    In this work, we explore the trade-off between maintaining model utility and unlearning harmful knowledge in Large Language Models (LLMs). To tackle this challenge, we introduce SKU, an …
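
    The result above frames unlearning as a balance between removing harmful knowledge and preserving model utility. As a generic illustration of that trade-off only (this is not SKU's actual procedure; the model name, placeholder strings, and weighting are assumptions for the sketch), a common formulation in the LLM-unlearning literature pairs gradient ascent on a harmful "forget" example with ordinary training on a benign "retain" example:

```python
# Generic utility-vs-forgetting sketch (not SKU): gradient ascent on a harmful
# "forget" example, gradient descent on a benign "retain" example.
# "gpt2", the learning rate, and retain_weight are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def unlearning_step(forget_text: str, retain_text: str, retain_weight: float = 1.0):
    """One combined update: raise the LM loss on forget_text, lower it on retain_text."""
    forget = tokenizer(forget_text, return_tensors="pt")
    retain = tokenizer(retain_text, return_tensors="pt")

    # Standard next-token (causal LM) loss on each example.
    forget_loss = model(**forget, labels=forget["input_ids"]).loss
    retain_loss = model(**retain, labels=retain["input_ids"]).loss

    # Negative forget term = gradient ascent (forgetting); the retain term
    # keeps behaviour on benign text close to the original model's.
    loss = -forget_loss + retain_weight * retain_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return forget_loss.item(), retain_loss.item()

# Toy usage with placeholder strings; real unlearning would iterate over
# curated forget/retain datasets rather than single sentences.
print(unlearning_step("<example of harmful text to forget>",
                      "<benign text whose behaviour should be retained>"))
```

    In this framing, retain_weight is the knob that trades forgetting against utility; the papers listed here differ mainly in how they construct the forget set and constrain that update.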

  13. SafeEraser: Enhancing Safety in Multimodal Large Language Models ...

    2 days ago · To quantitatively measure over-forgetting mitigated by PD Loss, we propose a new metric called Safe Answer Refusal Rate (SARR).
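
    The snippet above names the Safe Answer Refusal Rate (SARR) but does not define it. The sketch below assumes SARR is simply the fraction of safe (benign) questions the unlearned model refuses to answer, with refusals detected by an illustrative keyword list; both the formula and the keyword list are assumptions, not the paper's specification.

```python
# Hedged SARR sketch: assumed definition is
#   SARR = (# refused answers to safe questions) / (# safe questions),
# with refusals detected by naive keyword matching. The SAFEERASER paper's
# actual metric and refusal detector may differ.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i am sorry", "i won't")

def safe_answer_refusal_rate(answers: list[str]) -> float:
    """Fraction of answers to *safe* questions that look like refusals."""
    if not answers:
        return 0.0
    refused = sum(
        any(marker in answer.lower() for marker in REFUSAL_MARKERS)
        for answer in answers
    )
    return refused / len(answers)

# Example: a model that refuses 1 of 3 benign questions scores ~0.33,
# i.e. noticeable over-forgetting on harmless queries.
print(safe_answer_refusal_rate([
    "Paris is the capital of France.",
    "I'm sorry, but I can't help with that.",
    "Water boils at 100 degrees Celsius at sea level.",
]))
```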

  14. Unforgotten Safety: Preserving Safety Alignment of Large Language ...

    2 days ago · Abstract The safety alignment of large language models (LLMs) is becoming increasingly important with their democratization. In this paper, we study the safety …