Exercise#1 Use Ollama for hosting

Objective

  1. Set up an Ollama chat application locally

  2. Learn to use Ollama for local development

Pre-requisite

  • Must have the required models pulled. It can take some time for the models to get downloaded; the duration depends on your machine and your internet speed.

ollama run gemma2
ollama run llama3.1
  • Must have Ollama running as a server
ollama serve
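
You can sanity-check both prerequisites before moving on. The snippet below is a minimal sketch in Python, assuming the default server address http://localhost:11434 and the /api/tags endpoint, which lists the models available locally:

import requests

# Default Ollama server address; change it if you customised the host or port
base_url = 'http://localhost:11434'

# /api/tags lists the models that have been pulled locally
response = requests.get(base_url + '/api/tags')

# Print the model names so you can confirm gemma2 and llama3.1 are present
for model in response.json().get('models', []):
    print(model.get('name'))

If the request fails with a connection error, the Ollama server is most likely not running.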

Part-1 Set up an Ollama chat application locally

Step-1 Review the chat applications available
Step-2 Check out the available apps in the Readme.md file, under the section: Community Integrations.
Step-3 Select one (or two) apps that you would like to try
Step-4 Launch Ollama
Step-5 Use the application's instructions to launch & try it

Note: In the lesson video, I used the HTML UI application, so if you are not able to make up your mind on which app to use, go with ‘HTML UI’ - it is easy to follow & install.

Part-2 Learn to use Ollama for local development

In this part of the exercise, you will write code to interact with the models hosted in Ollama. Recall that any HTTP library can be used for these interactions.
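
As a quick illustration that no special client is required, the sketch below uses only Python's standard-library urllib to hit the server's root path, which typically responds with a short status message such as 'Ollama is running' (the rest of this exercise uses the requests library):

import urllib.request

# The root path of the Ollama server returns a plain-text status message
with urllib.request.urlopen('http://localhost:11434') as resp:
    print(resp.read().decode())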

Start by creating a new notebook under [root-folder-for-course]/gen-ai-app-dev-template/[Endpoints]/[ollama-usage.ipynb]. Add the code below and run it.

import requests

# URL for the endpoints
base_url = 'http://localhost:11434'
Step-1 Get model information

The API endpoint to get the model information is /api/show.

# Create the URL for getting model information
url = base_url + '/api/show'

# Query to be sent in body
query = {
  "name": "llama3.1"
}

# Invoke API
response = requests.post(url, json=query)
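
The response from /api/show is JSON metadata describing the model. The exact fields can change between Ollama versions, so the snippet below is only an illustrative way to inspect what comes back rather than a definitive schema:

import json

# Confirm the call succeeded (HTTP 200) and look at the top-level keys
print(response.status_code)

model_info = response.json()
print(list(model_info.keys()))

# Pretty-print the full payload to explore details such as the model's
# parameters, template, and family information (field names may vary)
print(json.dumps(model_info, indent=2))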