[Paper Review] Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding (ACL 2024)
Category: NR
Jun Zhang, Jue Wang, Huan Li, Lidan Shou, Ke Chen, Gang Chen, and Sharad Mehrotra. 2024. Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Lun-Wei Ku, Andre Martins, and Vivek Srikumar (Eds.). Association for Computational Linguistics, Bangkok, Thailand, 11263–11282. https://doi.org/10.18653/v1/2024.acl-long.607
Problem Statement