yorickvp / llava-13b

Visual instruction tuning towards large language and vision models with GPT-4 level capabilities

Train yorickvp/llava-13b

Trainings for this model run on NVIDIA A100 (80GB) GPU hardware, which costs $0.0014 per second (about $5.04 per hour).

You can finetune LLaVA with your own dataset, using LoRA! Training data is passed to cog train with the train_data parameter. Your training dataset should be a zip file with the following structure:

  • ./images/: A folder with training data images.
  • ./data.json: A JSON file that links images to conversations. For details, see the dataset format instructions in the GitHub repository; a minimal sketch of an example entry follows this list.
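As a rough illustration, here is one way to assemble such a zip file in Python. The data.json entry shown loosely follows LLaVA's conversation format (id, image, and a list of from/value turns), but the exact schema is documented in the GitHub repository, so treat those instructions as authoritative; the file names and prompt text below are hypothetical.

import json
import zipfile
from pathlib import Path

# Hypothetical example entry, loosely following LLaVA's conversation
# format; check the dataset format instructions for the exact fields.
data = [
    {
        "id": "0001",
        "image": "0001.jpg",  # relative to the images/ folder
        "conversations": [
            {"from": "human", "value": "<image>\nWhat is shown in this photo?"},
            {"from": "gpt", "value": "A cat sleeping on a windowsill."},
        ],
    }
]

# Write data.json and bundle it with the images/ folder into a zip
# matching the structure described above.
Path("data.json").write_text(json.dumps(data, indent=2))
with zipfile.ZipFile("my-input-images.zip", "w") as zf:
    zf.write("data.json")
    for img in Path("images").glob("*"):
        zf.write(img, arcname=f"images/{img.name}")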

Example code for training:

import replicate

training = replicate.trainings.create(
    # Replace [version_id] with the version of llava-13b to train from.
    version="yorickvp/llava-13b:[version_id]",
    input={
        # Publicly accessible URL of your training data zip file.
        "train_data": "https://my-domain/my-input-images.zip",
    },
    # The model on your account that will receive the trained weights.
    destination="my-name/my-model"
)
print(training)
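Trainings run asynchronously, so the object returned by create will initially be in a "starting" state. As a minimal sketch, assuming the training object from the example above, you could poll for completion like this (replicate.trainings.get is part of the Replicate Python client; the polling interval is arbitrary):

import time
import replicate

# Poll until the training reaches a terminal state.
while True:
    training = replicate.trainings.get(training.id)
    if training.status in ("succeeded", "failed", "canceled"):
        break
    time.sleep(30)  # arbitrary polling interval

print(training.status)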

You can find more information about finetuning image models in the Replicate docs. The tutorial on finetuning SDXL with your own images is a good starting point.