RNN, Transformer

Effective RNN Training
https://ratsgo.github.io/deep%20learning/2017/10/10/RNNsty/

 

Let's Understand RNNs and LSTMs!
https://ratsgo.github.io/natural%20language%20processing/2017/03/09/rnnlstm/

 

Attention is All You Need (Pytorch)

https://chioni.github.io/posts/transformer2/

 

 

ELECTRA (ICLR 2020)
https://chioni.github.io/posts/electra/

 

 

The Attention Mechanism
https://ratsgo.github.io/from%20frequency%20to%20semantics/2017/10/06/attention/

 

How Self-Attention Works
https://ratsgo.github.io/nlpbook/docs/language_model/tr_self_attention/
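The post above walks through how self-attention is computed step by step. As a rough companion (not code from the linked post), here is a minimal PyTorch sketch of scaled dot-product self-attention; the function name, projection matrices, and tensor sizes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_self_attention(x, w_q, w_k, w_v):
    """x: (batch, seq_len, d_model); w_q, w_k, w_v: (d_model, d_k) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # each (batch, seq_len, d_k)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5    # (batch, seq_len, seq_len)
    weights = F.softmax(scores, dim=-1)              # one attention distribution per query position
    return weights @ v                               # (batch, seq_len, d_k)

# illustrative usage with made-up sizes
x = torch.randn(2, 5, 16)
w_q, w_k, w_v = (torch.randn(16, 8) for _ in range(3))
print(scaled_dot_product_self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([2, 5, 8])
```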

 

Techniques Applied in the Transformer
https://ratsgo.github.io/nlpbook/docs/language_model/tr_technics/

 

The Difference Between Attention and Self-Attention
https://velog.io/@glad415/Attention%EA%B3%BC-Self-Attentiondml-%EC%B0%A8%EC%9D%B4

 

 

[X:AI] Understanding the Transformer Paper
https://rahites.tistory.com/151

 

[Deep Learning] Concept Summary: Language Models, RNN, GRU, LSTM, Attention, Transformer, GPT, BERT

https://velog.io/@rsj9987/%EB%94%A5%EB%9F%AC%EB%8B%9D-%EC%9A%A9%EC%96%B4%EC%A0%95%EB%A6%AC

 

Attention #1: The First Appearance of Attention

https://wigo.tistory.com/entry/attention1

 

[NLP Paper Review] Neural Machine Translation by Jointly Learning to Align and Translate (Attention Seq2Seq)

https://cpm0722.github.io/paper-review/neural-machine-translation-by-jointly-learning-to-align-and-translate

 

Neural Machine Translation by Jointly Learning to Align and Translate

https://curaai00.tistory.com/9

 

 

Paper Title: Neural Machine Translation by Jointly Learning to Align and Translate
https://enfow.github.io/paper-review/sequence-to-sequence/2020/06/28/neural_machine_translation_by_jointly_learning_to_align_and_translate/

 


[NLP | Paper Review] Neural Machine Translation by Jointly Learning to Align and Translate
https://velog.io/@xuio/NLP-%EB%85%BC%EB%AC%B8%EB%A6%AC%EB%B7%B0-NEURAL-MACHINE-TRANSLATIONBY-JOINTLY-LEARNING-TO-ALIGN-AND-TRANSLATE
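The reviews above cover Bahdanau et al.'s additive attention, where the alignment score is e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j). Below is a minimal PyTorch sketch of that score; the variable names and dimensions are illustrative assumptions, not code from the linked reviews.

```python
import torch
import torch.nn as nn

def additive_attention_weights(decoder_state, encoder_states, W_a, U_a, v_a):
    """decoder_state: (batch, d_dec); encoder_states: (batch, src_len, d_enc).
    Returns softmax-normalized alignment weights of shape (batch, src_len)."""
    proj = W_a(decoder_state).unsqueeze(1) + U_a(encoder_states)  # (batch, src_len, d_att)
    scores = v_a(torch.tanh(proj)).squeeze(-1)                    # (batch, src_len)
    return torch.softmax(scores, dim=-1)

# illustrative shapes only
W_a, U_a, v_a = nn.Linear(32, 16, bias=False), nn.Linear(64, 16, bias=False), nn.Linear(16, 1, bias=False)
weights = additive_attention_weights(torch.randn(2, 32), torch.randn(2, 7, 64), W_a, U_a, v_a)
print(weights.shape)  # torch.Size([2, 7])
```

The context vector fed to the decoder is then the weighted sum of the encoder states under these weights.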

 

 

[Attention] Luong Attention Explained
https://hcnoh.github.io/2019-01-01-luong-attention
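For contrast with the additive score above, Luong-style attention uses multiplicative scores such as the "general" form score(h_t, h̄_s) = h_t^T W_a h̄_s. A minimal sketch, with illustrative names and sizes:

```python
import torch
import torch.nn as nn

def luong_general_weights(decoder_state, encoder_states, W_a):
    """decoder_state: (batch, d_dec); encoder_states: (batch, src_len, d_enc)."""
    projected = W_a(encoder_states)                                         # (batch, src_len, d_dec)
    scores = torch.bmm(projected, decoder_state.unsqueeze(-1)).squeeze(-1)  # (batch, src_len)
    return torch.softmax(scores, dim=-1)

# illustrative shapes only: encoder dim 64 projected into decoder dim 32
W_a = nn.Linear(64, 32, bias=False)
print(luong_general_weights(torch.randn(2, 32), torch.randn(2, 7, 64), W_a).shape)  # torch.Size([2, 7])
```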

 

 

[NLP | Paper Review] Sequence to Sequence Learning with Neural Networks
https://velog.io/@xuio/NLP-%EB%85%BC%EB%AC%B8%EB%A6%AC%EB%B7%B0-Sequence-to-Sequence-Learning-with-Neural-Networks

[NLP][Paper Review] Seq2Seq: Sequence to Sequence Learning with Neural Networks

https://supkoon.tistory.com/17

 

Attention is All You Need: A Walkthrough Following the Tensor Shapes
https://deep-deep-deep.tistory.com/183

 

 

16-01 The Transformer
https://wikidocs.net/31379

 

 

[NLP | Paper Review] Attention is All You Need (Transformer)
https://velog.io/@xuio/NLP-%EB%85%BC%EB%AC%B8%EB%A6%AC%EB%B7%B0-Attention-is-all-you-needTransformer

 

Preview 1 Before the Transformer Paper Review (The Flow of Attention, Self-Attention, and Masked Decoder Self-Attention)
https://velog.io/@xuio/Transformer-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0-%EC%A0%84-%ED%94%84%EB%A6%AC%EB%B7%B0

 

Preview 2 Before the Transformer Paper Review (Positional Encoding and Residual Connections)
https://velog.io/@xuio/Transformer-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0-%EC%A0%84-%ED%94%84%EB%A6%AC%EB%B7%B02Positional-Encoding%EA%B3%BC-Residual-Connection

 

Preview 3 Before the Transformer Paper Review (End-to-End Memory and Extended Neural GPU Paper Reviews)
https://velog.io/@xuio/Transformer-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0-%EC%A0%84-%ED%94%84%EB%A6%AC%EB%B7%B03End-to-End-Memory-Extended-Neural-GPU

 

 

Illustrated: Self-Attention

https://towardsdatascience.com/illustrated-self-attention-2d627e33b20a

 

 

The Illustrated Transformer

https://nlpinkorean.github.io/illustrated-transformer/

 

 

[NLP] Transformer

https://amber-chaeeunk.tistory.com/96

 

 

Attention Is All You Need

https://machinereads.wordpress.com/2018/09/26/attention-is-all-you-need/

 

 

Transformer: Attention is All You Need Review

https://vanche.github.io/NLP_Transformer/

 

 

Attention Is All You Need

https://20chally.tistory.com/222

 

 

Attention Is All You Need (NIPS 2017)

https://milkclouds.work/attention-is-all-you-need-nips-2017/

 

 

[Paper Reading] Attention is All You Need

https://everyday-log.tistory.com/entry/%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%94%A9-Attention-is-All-You-Need

 

 

Review: Transformer

https://jhtobigs.oopy.io/transformer

 

 

Thus Spoke the Transformer: "Attention is all you need."

https://blog.promedius.ai/transformer/

 

 

What exactly are keys, queries, and values in attention mechanisms?
https://stats.stackexchange.com/questions/421935/what-exactly-are-keys-queries-and-values-in-attention-mechanisms

 

What are attention mechanisms exactly?
https://stats.stackexchange.com/questions/344508/what-are-attention-mechanisms-exactly

 

On masked multi-head attention and layer normalization in transformer model
https://stats.stackexchange.com/questions/387114/on-masked-multi-head-attention-and-layer-normalization-in-transformer-model
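On the masking question above: in the decoder, each position is prevented from attending to later positions by setting the corresponding scores to -inf before the softmax. A minimal sketch with illustrative names and shapes:

```python
import torch
import torch.nn.functional as F

def causal_self_attention(q, k, v):
    """q, k, v: (batch, seq_len, d_k). Each position may only attend to itself and earlier positions."""
    seq_len, d_k = q.size(1), q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    future = torch.triu(torch.ones(seq_len, seq_len, device=q.device), diagonal=1).bool()
    scores = scores.masked_fill(future, float("-inf"))  # softmax assigns ~0 weight to future tokens
    return F.softmax(scores, dim=-1) @ v
```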

 

Why are residual connections needed in transformer architectures?
https://stats.stackexchange.com/questions/565196/why-are-residual-connections-needed-in-transformer-architectures
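Regarding the residual-connection question above: each attention or feed-forward sublayer in the original Transformer is wrapped as LayerNorm(x + Sublayer(x)). A minimal sketch of that wrapper, assuming the post-layer-norm ordering used in the paper:

```python
import torch.nn as nn

class SublayerConnection(nn.Module):
    """Applies x + Dropout(sublayer(x)) followed by LayerNorm (post-LN, as in the original paper)."""
    def __init__(self, d_model, dropout=0.1):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, sublayer):
        # the residual path lets gradients flow around the sublayer unchanged
        return self.norm(x + self.dropout(sublayer(x)))
```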

 

What stops the network from learning the same weights in multi-head attention mechanism
https://stats.stackexchange.com/questions/373850/what-stops-the-network-from-learning-the-same-weights-in-multi-head-attention-me

 

Meaning of the value matrix in self-attention
https://stats.stackexchange.com/questions/481324/meaning-of-the-value-matrix-in-self-attention

 

What is the role of feed forward layer in Transformer Neural Network architecture?
https://stats.stackexchange.com/questions/485910/what-is-the-role-of-feed-forward-layer-in-transformer-neural-network-architectur
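For the feed-forward question above: the Transformer applies the same two-layer MLP independently at every position, FFN(x) = max(0, xW1 + b1)W2 + b2. A minimal sketch using the paper's default sizes (d_model = 512, d_ff = 2048):

```python
import torch.nn as nn

class PositionwiseFeedForward(nn.Module):
    """FFN(x) = W2 * relu(W1 * x + b1) + b2, applied position-wise."""
    def __init__(self, d_model=512, d_ff=2048, dropout=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x):  # x: (batch, seq_len, d_model)
        return self.net(x)
```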

 

Why are the embeddings of tokens multiplied by √D (note: not divided by √D) in a transformer?
https://stats.stackexchange.com/questions/534618/why-are-the-embeddings-of-tokens-multiplied-by-sqrt-d-note-not-divided-by-sq

 

Formula to compute approximate memory requirements of Transformer models
https://stats.stackexchange.com/questions/563919/formula-to-compute-approximate-memory-requirements-of-transformer-models

 

Why transformer in deep learning is called transformer?
https://stats.stackexchange.com/questions/541498/why-transformer-in-deep-learning-is-called-transformer

 

Why do large LMs use the transpose of the word embeddings matrix in the classification head?
https://stats.stackexchange.com/questions/584685/why-do-large-lms-use-the-transpose-of-the-word-embeddings-matrix-in-the-classifi
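The question above is about weight tying: reusing the (transposed) input embedding matrix as the output projection so that logits = hidden @ E^T. A minimal sketch of the idea, with illustrative names:

```python
import torch.nn as nn

class TiedOutputHead(nn.Module):
    """Shares one (vocab_size, d_model) matrix between the input embedding and the output projection."""
    def __init__(self, vocab_size, d_model):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.out = nn.Linear(d_model, vocab_size, bias=False)
        self.out.weight = self.embed.weight  # weight tying: logits = hidden @ E^T

    def forward(self, hidden):  # hidden: (batch, seq_len, d_model)
        return self.out(hidden)
```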

 

Why use sinusoidal along embedding dimension in positional encoding in transformers?
https://stats.stackexchange.com/questions/497879/why-use-sinusoidal-along-embedding-dimension-in-positional-encoding-in-transform


Transformer model: Why are word embeddings scaled before adding positional encodings?
https://datascience.stackexchange.com/questions/87906/transformer-model-why-are-word-embeddings-scaled-before-adding-positional-encod
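Both questions above concern multiplying the token embeddings by √d_model before adding the positional encodings. A minimal sketch of just that scaling step (the class name is illustrative):

```python
import math
import torch.nn as nn

class ScaledEmbedding(nn.Module):
    """Token embedding multiplied by sqrt(d_model), keeping it on a comparable scale to the positional encoding."""
    def __init__(self, vocab_size, d_model):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.scale = math.sqrt(d_model)

    def forward(self, token_ids):  # (batch, seq_len) -> (batch, seq_len, d_model)
        return self.embed(token_ids) * self.scale
```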

 

 

[토치의 호흡] 07 About Transformer PART 01: A Simple Overview of the Architecture
https://velog.io/@heiswicked/%ED%86%A0%EC%B9%98%EC%9D%98-%ED%98%B8%ED%9D%A1-06-About-Transformer-PART-01-%EA%B0%84%EB%8B%A8%ED%95%9C-%EA%B5%AC%EC%A1%B0-%EC%84%A4%EB%AA%85

 

[토치의 호흡] 08 About Transformer PART 02 "Positional Encoding Layer"
https://velog.io/@heiswicked/%ED%86%A0%EC%B9%98%EC%9D%98-%ED%98%B8%ED%9D%A1-06-About-Transformer-PART-01-PositionalEncodingLayer

 

[토치의 호흡] 09 About Transformer PART 03 "Encoder and EncoderLayer"
https://velog.io/@heiswicked/%ED%86%A0%EC%B9%98%EC%9D%98-%ED%98%B8%ED%9D%A1-09-About-Transformer-PART-03-Encoder-and-EncoderLayer

 

[토치의 호흡] 10 About Transformer PART 04 "In EncoderLayer: Multi Head Attention"
https://velog.io/@heiswicked/%ED%86%A0%EC%B9%98%EC%9D%98-%ED%98%B8%ED%9D%A1-10-About-Transformer-PART-04-In-EncoderLayer-Multi-Head-Attention
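The post above implements multi-head attention inside the encoder layer. As a rough reference (not the post's code), here is a minimal sketch of the head-splitting and recombining step; already-projected q, k, v are assumed and the final output projection W_O is omitted:

```python
import torch
import torch.nn.functional as F

def split_heads(x, n_heads):
    """(batch, seq_len, d_model) -> (batch, n_heads, seq_len, d_model // n_heads)."""
    batch, seq_len, d_model = x.shape
    return x.view(batch, seq_len, n_heads, d_model // n_heads).transpose(1, 2)

def multi_head_attention(q, k, v, n_heads):
    """q, k, v: (batch, seq_len, d_model); returns the concatenated heads, same shape as the input."""
    q, k, v = (split_heads(t, n_heads) for t in (q, k, v))
    d_k = q.size(-1)
    weights = F.softmax(q @ k.transpose(-2, -1) / d_k ** 0.5, dim=-1)
    out = weights @ v                                        # (batch, n_heads, seq_len, d_k)
    batch, _, seq_len, _ = out.shape
    return out.transpose(1, 2).reshape(batch, seq_len, -1)   # concatenate the heads back together
```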

 

[토치의 호흡] 11 About Transformer PART 05 "Classification_by_DIY_TRANSFORMER"
https://velog.io/@heiswicked/%ED%86%A0%EC%B9%98%EC%9D%98-%ED%98%B8%ED%9D%A1-11-About-Transformer-PART-05-NLPClassificationbyDIYTRANSFORMER

 

[Transformer]-1 Why Does Positional Encoding Look the Way It Does?
https://velog.io/@gibonki77/DLmathPE
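The post above asks why the positional encoding uses sines and cosines of different frequencies. For reference, a minimal sketch that builds the table PE(pos, 2i) = sin(pos / 10000^(2i/d_model)), PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)); d_model is assumed to be even:

```python
import math
import torch

def sinusoidal_positional_encoding(max_len, d_model):
    """Returns a (max_len, d_model) table that is added to the token embeddings."""
    position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)          # (max_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float)
                         * (-math.log(10000.0) / d_model))                    # (d_model / 2,)
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions
    return pe
```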

 

[Transformer]-2 How Is Self-Attention Computed, and with What Dimensions?
https://velog.io/@gibonki77/DLmathSA

 

NL-009, Attention Is All You Need (2017-NIPS)
https://ai-information.blogspot.com/2019/03/nl-009-attention-is-all-you-need.html

 

Breaking Down the Attention is All You Need Paper
https://pozalabs.github.io/transformer/

 

The Illustrated Transformer
http://jalammar.github.io/illustrated-transformer/

 

The Annotated Transformer
http://nlp.seas.harvard.edu/annotated-transformer/

 

Chapter 8 Attention and Self-Attention for NLP
https://slds-lmu.github.io/seminar_nlp_ss20/attention-and-self-attention-for-nlp.html

 

 

Attention Is All You Need (Explanation of the Attention Paper)

https://greeksharifa.github.io/nlp(natural%20language%20processing)%20/%20rnns/2019/08/17/Attention-Is-All-You-Need/#4-%EC%99%9C-self-attention%EC%9D%B8%EA%B0%80why-self-attention

 

 

[Paper Summary] Attention is All You Need

https://haystar.tistory.com/77#toc135

 

 

[NLP][Paper Review] Transformer: Attention is All You Need

https://supkoon.tistory.com/21

 
