Modal is a cloud deployment system that is entirely programmatic. There is no YAML; everything is declared in Python:

    import modal

    stub = modal.Stub()  # newer Modal releases call this modal.App

    @stub.function(gpu="a100")  # the GPU is requested per function
    def fit(x):
        import whatever
        whatever.thing()

So think run.house, but they have the infra.

Fine-tuning with Modal: https://github.com/modal-labs/llama-recipes

You can store serverless functions, and Modal can serve the stored functions. Modal also has webhooks, so a front end can call a function for inference. Modal can handle most of the management as well.

Pricing:
13B: 500 tokens/s on a 40 GB A100 ($3.73/hour)
70B: 300 tokens/s on 2 × 80 GB GPUs (2 × $5.59/hour)
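To compare the two pricing lines, it can help to convert them into a cost per million generated tokens. The sketch below is only back-of-the-envelope arithmetic: the function name `cost_per_million_tokens` is made up for this note, and it assumes the quoted throughput is sustained for the whole billed hour and that the GPUs are never idle.

```python
def cost_per_million_tokens(tokens_per_second, hourly_price_per_gpu, gpus=1):
    """Rough cost of one million tokens (hypothetical helper, not a Modal API).

    Assumes the quoted tokens/s is sustained and every billed second is used.
    """
    tokens_per_hour = tokens_per_second * 3600
    hourly_cost = hourly_price_per_gpu * gpus
    return hourly_cost / tokens_per_hour * 1_000_000


# Figures taken from the pricing notes above.
cost_13b = cost_per_million_tokens(500, 3.73)          # one 40 GB GPU
cost_70b = cost_per_million_tokens(300, 5.59, gpus=2)  # two 80 GB GPUs

print(f"13B: {cost_13b:.2f} per million tokens")  # 13B: 2.07 per million tokens
print(f"70B: {cost_70b:.2f} per million tokens")  # 70B: 10.35 per million tokens
```

On these numbers the 70B setup costs roughly five times as much per token as the 13B one. Real costs would be higher whenever the GPUs sit idle or throughput drops below the quoted rate.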
