Digital Library[ Search Result ]
Search : "[ keyword: model parallelism ]" (2)
An Analysis on Inference Time, Accuracy, Communication, and GPU Memory Usage for Inference Batch of Large Language Models
Changyong Shin, Younghun Go, Yeonho Yoo, Gyeongsik Yang, Chuck Yoo
Vol. 49, No. 10, pp. 1377-1385, Oct. 2024
10.7840/kics.2024.49.10.1377