The objective is to understand the hyperparameters that various model families support for fine-tuning.
Steps
Go through the fine-tuning documentation for the model (links are provided).
Explore the hyperparameter recommendations for the models.
You may go through the documentation manually, or you may use an AI tool such as ChatGPT or Google NotebookLM to carry out this exercise.
References
Review the documentation on fine-tuning the Gemini family of models, or add the documentation as a source/context to ground the AI tool.
List the adjustable hyperparameters for fine-tuning Gemini models. Keep the response concise.
Provide recommendations for optimal hyperparameter settings for the Gemini family of models. Include specific values or ranges, and explain their impact on model performance where applicable.
Review the documentation on fine-tuning OpenAI models, or add the documentation as a source/context to ground the AI tool.
Google NotebookLM is unable to read content from the OpenAI website. To work around this, use the Paste text option: copy the documentation content to the clipboard and paste it as a source in NotebookLM.
List the adjustable hyperparameters for fine-tuning OpenAI models. Keep the response concise.
Provide recommendations for optimal hyperparameter settings for the OpenAI family of models. Include specific values or ranges, and explain their impact on model performance where applicable.
Review the documentation on fine-tuning the Cohere family of models, or add the documentation as a source/context to ground the AI tool.
List the adjustable hyperparameters for fine-tuning Cohere models. Keep the response concise.
Provide recommendations for optimal hyperparameter settings for the Cohere family of models. Include specific values or ranges, and explain their impact on model performance where applicable.
The objective is to identify the common themes across the recommendations for the Gemini, OpenAI, and Cohere models.
Identify common themes in the fine-tuning recommendations provided by Gemini, OpenAI, and Cohere. Highlight shared practices, strategies, or guidelines, and explain how these align across the different providers.
Gemini
| Parameter | Description |
|---|---|
| Epochs | Number of complete passes through the training dataset |
| Batch size | Number of examples used in one training iteration |
| Learning rate | Controls the adjustment of model parameters during each iteration |
| Learning rate multiplier | Modifies the original learning rate |
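For context, here is a minimal sketch of where these parameters are passed when creating a tuned Gemini model with the google.generativeai Python SDK. The base-model name, dataset, ID, and hyperparameter values below are illustrative assumptions, not recommendations from the documentation.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Illustrative tuning job: epoch_count, batch_size, and learning_rate
# correspond to the Epochs, Batch size, and Learning rate rows above.
operation = genai.create_tuned_model(
    source_model="models/gemini-1.0-pro-001",  # assumed tunable base model
    training_data=[
        {"text_input": "1", "output": "2"},    # toy dataset for illustration
        {"text_input": "3", "output": "4"},
    ],
    id="hyperparameter-demo",        # hypothetical tuned-model ID
    epoch_count=5,                   # passes through the training dataset
    batch_size=4,                    # examples per training iteration
    learning_rate=0.001,             # step size for parameter updates
)
tuned_model = operation.result()     # blocks until tuning completes
print(tuned_model.name)
```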
OpenAI
| Hyperparameter | Description | Adjustment Recommendations |
|---|---|---|
| Epochs | Number of complete passes through the training dataset | Increase by 1-2 for underfitting, decrease by 1-2 for overfitting |
| Learning Rate Multiplier | Modifies the default learning rate | Increase for convergence issues, decrease for stability |
| Batch Size | Number of training examples processed together | No explicit recommendations provided |
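The OpenAI fine-tuning API exposes these three knobs through a hyperparameters object on the job. A minimal sketch using the openai Python SDK follows; the training-file ID and the values shown are placeholders, not tuned settings.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative job: n_epochs, batch_size, and learning_rate_multiplier
# map to the Epochs, Batch Size, and Learning Rate Multiplier rows above.
job = client.fine_tuning.jobs.create(
    model="gpt-3.5-turbo",               # assumed fine-tunable base model
    training_file="file-abc123",         # placeholder uploaded-file ID
    hyperparameters={
        "n_epochs": 3,                   # raise by 1-2 if underfitting
        "batch_size": 8,
        "learning_rate_multiplier": 2,   # lower if training is unstable
    },
)
print(job.id, job.status)
```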
Cohere
| Hyperparameter | Description | Range | Default |
|---|---|---|---|
| epochCount | Number of epochs | 1-100 | 1 |
| batchSize | Number of samples processed per iteration | Command: 8; Light: 8-32 | Command: 8; Light: 8 |
| learningRate | Learning rate | 5.00E-6 to 0.1 | 1.00E-5 |
| earlyStoppingThreshold | Minimum improvement required to continue training | 0-0.1 | 0.01 |
| earlyStoppingPatience | Tolerance for stagnation in loss | 1-10 | 6 |
| evalPercentage | Percentage of dataset used for evaluation | 5-50 | 20 |
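The camelCase names above match the hyperParameters request fields used when customizing Cohere Command models through Amazon Bedrock. Assuming that is the surface in use, a hedged boto3 sketch is shown below; the job name, role ARN, model identifier, and S3 paths are all placeholders.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Illustrative customization job; hyperParameters values are strings and
# mirror the table above (defaults shown where the table provides them).
response = bedrock.create_model_customization_job(
    jobName="cohere-finetune-demo",                      # placeholder
    customModelName="cohere-command-tuned",              # placeholder
    roleArn="arn:aws:iam::123456789012:role/BedrockFinetuneRole",
    baseModelIdentifier="cohere.command-text-v14:7:4k",  # assumed model ID
    trainingDataConfig={"s3Uri": "s3://my-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/output/"},
    hyperParameters={
        "epochCount": "1",
        "batchSize": "8",
        "learningRate": "0.00001",           # 1.00E-5 default from the table
        "earlyStoppingThreshold": "0.01",
        "earlyStoppingPatience": "6",
        "evalPercentage": "20",
    },
)
print(response["jobArn"])
```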
Common themes
| Feature | OpenAI | Cohere |
|---|---|---|
| Starting Point | Default hyperparameters | Default hyperparameters |
| Iterative Adjustment | Recommended | Recommended |
| Epochs | Adjust based on model behavior | Higher for larger, complex datasets |
| Learning Rate | Adjust multiplier for convergence and stability | Dynamic adjustment with validation dataset |
| Batch Size | No explicit recommendations | Model-specific limits and defaults |
| Data Quality | Prioritize quality over quantity | Implicitly emphasized through validation dataset usage |
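The shared loop across providers (start from defaults, evaluate, then nudge one knob at a time) can be sketched in a provider-agnostic way. The thresholds and multipliers below are illustrative assumptions, not values from any vendor's documentation, and run_finetune is a hypothetical stand-in for whichever tuning call you use.

```python
# Provider-agnostic sketch of the iterative-adjustment theme above.
def tune_iteratively(run_finetune, epochs=3, lr_multiplier=1.0, rounds=3):
    """run_finetune(epochs, lr_multiplier) -> (train_loss, val_loss)."""
    for _ in range(rounds):
        train_loss, val_loss = run_finetune(epochs, lr_multiplier)
        gap = val_loss - train_loss
        if gap > 0.1:              # validation much worse: overfitting
            epochs = max(1, epochs - 1)
            lr_multiplier *= 0.5   # back off the learning rate
        elif train_loss > 1.0:     # both losses high: underfitting
            epochs += 1
            lr_multiplier *= 1.5   # push the learning rate up
        else:
            break                  # defaults (or near-defaults) suffice
    return epochs, lr_multiplier
```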