
Exercise#2 Fine Tune Cohere

The intent is not to teach Cohere but to give you general experience with fine-tuning closed-source LLMs.

Objective

Your objective in this exercise is to fine-tune a Cohere model for a multi-label toxicity classification task. The exercise has two parts:

ex-2-objective-comment-tagging

Part-1

Use a multi-label dataset to fine-tune a Cohere model.

Part-2

Try out the fine-tuned model and evaluate the accuracy of its predictions.
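"Accuracy" can mean more than one thing for a multi-label task. The following is a minimal sketch of two common choices, assuming each example's true and predicted labels are available as lists of strings. The function names and the label names used below (`toxic`, `insult`) are illustrative assumptions and are not taken from the exercise notebook.

```python
def exact_match_accuracy(true_labels, predicted_labels):
    """Fraction of examples whose predicted label set equals the true label set."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted lists must have the same length")
    if not true_labels:
        return 0.0
    hits = sum(set(t) == set(p) for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def per_label_accuracy(true_labels, predicted_labels, label):
    """Fraction of examples where the presence or absence of `label` is predicted correctly."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted lists must have the same length")
    if not true_labels:
        return 0.0
    hits = sum((label in t) == (label in p) for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Hypothetical labels, for illustration only.
truth = [["toxic"], ["toxic", "insult"], []]
preds = [["toxic"], ["insult"], []]
print(exact_match_accuracy(truth, preds))
print(per_label_accuracy(truth, preds, "insult"))
```

Exact match is strict: a prediction that gets one of two labels right counts as a miss. Per-label accuracy is more forgiving and shows which labels the model struggles with.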

You MUST be registered with Cohere to carry out fine-tuning. The free plan is good enough for this exercise :)

Cohere fine-tuning dataset requirements

Dataset requirements

A minimum of 40 examples is required
There must be at least 2 unique labels
There must be at least 5 unique examples per label
The validation dataset is optional
If a validation dataset is provided, it must have at least 16 unique examples
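The requirements above can be checked locally before uploading. This is a minimal sketch, not Cohere's own validation logic. It assumes each record is a dict with a `text` field and a `label` field that is either a string or a list of strings; those field names and the function name `check_classify_dataset` are assumptions, so adjust them to match the actual files.

```python
def check_classify_dataset(train, validation=None):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # Requirement: a minimum of 40 training examples.
    if len(train) < 40:
        problems.append(f"need at least 40 training examples, found {len(train)}")

    # Collect the unique texts seen for each label (assumed field names: text, label).
    texts_per_label = {}
    for example in train:
        labels = example["label"]
        if not isinstance(labels, list):
            labels = [labels]
        for label in labels:
            texts_per_label.setdefault(label, set()).add(example["text"])

    # Requirement: at least 2 unique labels.
    if len(texts_per_label) < 2:
        problems.append(f"need at least 2 unique labels, found {len(texts_per_label)}")

    # Requirement: at least 5 unique examples per label.
    for label, texts in sorted(texts_per_label.items()):
        if len(texts) < 5:
            problems.append(f"label {label!r} has only {len(texts)} unique examples")

    # Requirement: an optional validation set needs at least 16 unique examples.
    if validation is not None:
        unique_validation = {example["text"] for example in validation}
        if len(unique_validation) < 16:
            problems.append(
                f"validation set needs at least 16 unique examples, found {len(unique_validation)}"
            )

    return problems
```

Calling `check_classify_dataset(train_records, validation_records)` with the parsed records returns the list of violations, so an empty list suggests the files are likely to pass the dashboard's validation step.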

Part-1 Fine-tune the model

Steps:

Check out the multi-label toxicity dataset
Create a fine-tuning job using the web UI

Open the Cohere dashboard
In the left navigation panel, select Fine-tuning
Select the Classify task

cohere-dashboard-fine-tuning-1

Upload the training and validation datasets

The datasets are already prepared for you in JSONL format
The datasets are loaded and validated before the fine-tuning process kicks off
Location of the dataset:

cohere-jsonl-datasets-location

Initiate fine-tuning using the web UI

ex-2-fine-tuning-endpoints

Provide a name for the fine-tuned model
Follow the instructions to initiate the fine-tuning process

Cohere places the fine-tuning job in a request queue, where it is picked up and processed asynchronously. Per the Cohere documentation, the job may take up to a day to complete, depending on how busy the platform is.

Part-2 Try out the fine-tuned model

The code for invoking the fine-tuned model is already provided.

Open the notebook

ex-2-cohere-evaluate-notebook

Google Colab

Make sure to follow the instructions for setting up packages
Upload the test data file to Colab (./data/toxicity-classifier/multi_label_comment_classification_test_cohere.jsonl)
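Once the file is uploaded, it can be read one JSON object per line. The helper below is a minimal sketch using only the standard library. The name `load_jsonl` is illustrative, and the notebook may load the file differently.

```python
import json


def load_jsonl(path):
    """Read a JSONL file and return its records as a list of dicts."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

For example, `load_jsonl("multi_label_comment_classification_test_cohere.jsonl")` would return the test records, assuming the file was uploaded to the Colab working directory under that name.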

Copy the model ID from the Cohere fine-tuned model dashboard and save it in a temporary file. The fine-tuned model MUST be in the Ready state; otherwise the invocation will fail.

cohere-dashboard-copy-model-id

Try out the model.

The code uses the Cohere REST endpoint for the classification task
It requires you to set the model ID of your fine-tuned model
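To give a sense of what such a REST call looks like, here is a minimal sketch using only the standard library. It is not the notebook's code. The endpoint URL, the `model` and `inputs` body fields, and the bearer-token header reflect the publicly documented Classify API as an assumption; confirm them against the current Cohere API reference. The function names are illustrative.

```python
import json
import urllib.request

# Assumed endpoint; verify against the current Cohere API reference.
COHERE_CLASSIFY_URL = "https://api.cohere.com/v1/classify"


def build_classify_request(api_key, model_id, texts):
    """Build (but do not send) a POST request that classifies `texts` with `model_id`."""
    body = json.dumps({"model": model_id, "inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        COHERE_CLASSIFY_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def classify(api_key, model_id, texts):
    """Send the request and return the parsed JSON response."""
    request = build_classify_request(api_key, model_id, texts)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending makes it easy to inspect the payload before spending API calls, and the model ID copied from the dashboard is passed in as `model_id`.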