07-03 QLoRA (T. Dettmers et al., 2023)
https://wikidocs.net/232761
https://www.facebook.com/groups/TensorFlowKR/posts/2043855619288819/?paipv=0&eav=AfbdY6h01SJwQ1mXtjJ8BKHu77BLtnbU6feXjEltFkI77UJjyj39bhwNZdK0HJc8Pfw&_rdr
Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA
https://huggingface.co/blog/4bit-transformers-bitsandbytes?utm_source=pytorchkr
Introduction to 8-bit Matrix Multiplication for transformers at scale
https://dhpark1212.tistory.com/entry/Introduction-to-8-bit-Matrix-Multiplication-for-transformers-at-scale
A trial-and-error log of bitsandbytes issues
https://blog.lablup.com/posts/2023/07/28/bitsandbytes/
Accelerating production-level LLM inference with Text Generation Inference (TGI)
https://medium.com/@nuatmochoi/text-generation-inference-tgi-%EB%A5%BC-%ED%99%9C%EC%9A%A9%ED%95%9C-%ED%94%84%EB%A1%9C%EB%8D%95%EC%85%98-%EB%A0%88%EB%B2%A8-llm-%EC%B6%94%EB%A1%A0-%EA%B0%80%EC%86%8D%ED%99%94-2b5f0641c232
LLM Trend Note (6) LLM.int8(), LoRA, Prefix LM, Sparse transformer, Sparse attention, Model parallelism, Data parallelism
https://questionet.tistory.com/78
P_7. A Survey of Large Language Models (2/3)
https://wikidocs.net/222913
ml-papers/papers/2023/230523 QLoRA.md
https://github.com/rosinality/ml-papers/blob/main/papers/2023/230523%20QLoRA.md
ml-papers/papers/2022/220815 LLM.int8().md
https://github.com/rosinality/ml-papers/blob/main/papers/2022/220815%20LLM.int8%28%29.md
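The common thread in the links above is QLoRA's 4-bit NormalFloat (NF4) quantization: weights are split into small blocks, each block is scaled by its absolute maximum, and every value is snapped to the nearest of 16 fixed quantile levels. As a reading aid, here is a minimal pure-Python sketch of that idea; the level table below is an approximation of the NF4 code in bitsandbytes, not the library's exact implementation.

```python
# Illustrative NF4-style 4-bit block quantization (sketch, not bitsandbytes itself).
# The 16 levels approximate the NormalFloat4 quantiles; exact values are in the
# bitsandbytes source.
NF4_LEVELS = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]

def quantize_block(block):
    """Quantize one weight block: store an absmax scale plus 4-bit indices."""
    absmax = max(abs(x) for x in block) or 1.0
    idxs = [min(range(16), key=lambda i: abs(x / absmax - NF4_LEVELS[i]))
            for x in block]
    return absmax, idxs

def dequantize_block(absmax, idxs):
    """Recover approximate weights from the scale and 4-bit indices."""
    return [NF4_LEVELS[i] * absmax for i in idxs]

weights = [0.3, -0.7, 0.05, 1.2, -0.01, 0.9, -1.1, 0.4]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(codes)     # each index fits in 4 bits (0..15)
print(max_err)   # small reconstruction error relative to absmax
```

QLoRA then keeps the base model frozen in this 4-bit form and trains only small LoRA adapters in 16-bit, which is why it fits large models on a single GPU.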