Attention Is All You Need
https://arxiv.org/abs/1706.03762
Auto-Encoding Variational Bayes
https://arxiv.org/abs/1312.6114
XGBoost: A Scalable Tree Boosting System
https://arxiv.org/abs/1603.02754
Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms
https://aclanthology.org/W02-1001/
Gradient-based learning applied to document recognition
https://ieeexplore.ieee.org/document/726791
Learning RoI Transformer for Oriented Object Detection in Aerial Images
https://arxiv.org/abs/1812.00155
BiBERT: Accurate Fully Binarized BERT
https://arxiv.org/abs/2203.06390
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
https://arxiv.org/abs/2308.13137
Distilling the Knowledge in a Neural Network
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Guaranteeing the Õ(AGM/OUT) Runtime for Uniform Sampling and Size Estimation over Joins
ImageNet Classification with Deep Convolutional Neural Networks
Visualizing and Understanding Convolutional Networks
Dropout: A simple way to prevent neural networks from overfitting
DeepFace: Closing the gap to human-level performance in face verification
Playing Atari with Deep Reinforcement Learning
TensorFlow: Large-scale machine learning on heterogeneous distributed systems
Language Models are Unsupervised Multitask Learners
HuggingFace's Transformers: State-of-the-art Natural Language Processing
Protein complex prediction with AlphaFold-Multimer
Reliable Post hoc Explanations: Modeling Uncertainty in Explainability
Model Compression (Cristian Bucilă et al.)
Adaptive Mixtures of Local Experts
Long Short-Term Memory
Random Forests
Batch normalization: Accelerating deep network training by reducing internal covariate shift
Learning to summarize from human feedback
Training language models to follow instructions with human feedback
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?
Eureka: Human-Level Reward Design via Coding Large Language Models
GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT
The Curse of Recursion: Training on Generated Data Makes Models Forget
AI models collapse when trained on recursively generated data
Leveraging large language models for predictive chemistry