Slurm/Kubernetes/GPU Cluster

SchedMD/slurm
https://github.com/SchedMD/slurm

 

Operating a 600-GPU cluster for machine learning research
https://medium.com/vessl-ai-kr/%EB%A8%B8%EC%8B%A0%EB%9F%AC%EB%8B%9D-%EC%97%B0%EA%B5%AC%EB%A5%BC-%EC%9C%84%ED%95%9C-gpu-600%EC%9E%A5%EC%9D%98-%ED%81%B4%EB%9F%AC%EC%8A%A4%ED%84%B0-%EC%9A%B4%EC%98%81-7fa8ceaae9f4


GPU cluster orchestration, Slurm vs Kubernetes [TalkIT Seminam #55, Leaders Systems]
https://youtu.be/fBZsbOgV_FI?feature=shared

 

Slurm Job Scheduler Basics
https://youtu.be/Juo_mb3otJ0?feature=shared

 


NVIDIA SuperPOD solution for accelerating big data processing: the spread of GPU clusters (What is DEEPOPS?)
https://youtu.be/K1ba9i23k9E?feature=shared

 

Slurm | What is Slurm?
https://haawron.tistory.com/33

 

Slurm | Slurm's user interface
https://haawron.tistory.com/38

 

[Techtonic 2021] Don't let GPUs sit idle! A scheduling method based on job duration prediction (Job Scheduling) - Changju Lee
https://youtu.be/cRNgrP02vIU?feature=shared

 

How to use Slurm, with a side of ipynb...
https://dev-kimke.tistory.com/59

 

Follow-along Slurm setup & explanation, Ubuntu 18.04
https://ai4nlp.tistory.com/25

 

Slurm - Workload manager
https://wycho.tistory.com/63

 

Installing Slurm on CentOS 8
https://bgreat.tistory.com/177

 

[PyTorch] Multi-node multi-GPU training with the distributed package
https://csm-kr.tistory.com/89

 

deepops
https://become-a-developer.tistory.com/entry/deepops

 

Why Kubernetes is the platform for AI, ML, and LLMs
https://nauco.tistory.com/125 

 

Lecture 6: MLOps Infrastructure & Tooling
https://tistory-nari.tistory.com/174

 

Slurm or Kubernetes: which is the right answer?
https://re-code-cord.tistory.com/entry/Slurm%EA%B3%BC-Kubernetes-%EC%A0%95%EB%8B%B5%EC%9D%80

 

 
