Exercise #1: Try out ChromaDB

Objective

Learn to use a custom embedding model with the ChromaDB vector database.

Tasks

1. You will use the Cohere embed-english-light-v2.0 model to create the embeddings.
* Check the [Cohere](https://docs.cohere.com/docs/models#embed) documentation for the embedding model
* Note down the embedding dimension of the model you have selected
2. Check the ChromaDB documentation for the available embedding functions
3. Put together the code and run it
  • Create an embedding function
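import os
import chromadb
from chromadb.utils import embedding_functions

# Assumes the Cohere API key is exported as the COHERE_API_KEY environment variable
COHERE_API_KEY = os.environ["COHERE_API_KEY"]

model_name = 'embed-english-light-v2.0'
embedding_dimension = 1024  # output dimension of embed-english-light-v2.0

cohere_ef = embedding_functions.CohereEmbeddingFunction(
    api_key=COHERE_API_KEY,
    model_name=model_name)
  • Create a collection
# In-memory client; the collection name below is an arbitrary example
client = chromadb.Client()
cohere_collection_name = "cohere_collection"
collection_cohere = client.get_or_create_collection(name=cohere_collection_name, embedding_function=cohere_ef)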
  • Add the documents to collection
corpus = [
  "A man is eating food.", "A man is eating a piece of bread.",
  "The chef is preparing a delicious meal in the kitchen.", "A chef is tossing vegetables in a sizzling pan.",
  "A man is riding a horse.", "A man is riding a white horse on an enclosed ground.",
  "A woman is playing violin.", "A musician is tuning his guitar before the concert.",
  "The girl is carrying a baby.", "The baby is giggling while playing with her toys.",
  "The family is having a picnic under the shady oak tree.", "A group of friends is hiking up the mountain trail.",
  "The mechanic is repairing a broken-down car in the garage.", "The old man is feeding breadcrumbs to the ducks at the pond.",
  "The artist is sketching a beautiful landscape at sunset.", "A man is painting a colorful mural on the city wall.",
  "A team of scientists is conducting experiments in the laboratory.", "A group of students is studying together in the library.",
  "The birds are chirping happily in the morning sun.", "The dog is chasing its tail around the backyard.",
  "A group of children are playing soccer in the park.", "A monkey is playing drums.",
  "A boy is flying a kite in the open field.", "Two men pushed carts through the woods.",
  "A woman is walking her dog along the beach.", "A young girl is reading a book under a shady tree.",
  "The dancer is gracefully performing on stage.", "The farmer is harvesting ripe tomatoes from the vine."
]


# add() requires one unique id per document; metadata is optional
ids = [f"doc-{i}" for i in range(len(corpus))]
collection_cohere.add(documents=corpus, ids=ids)
  • Query the docs
result = collection_cohere.query(
    query_texts=["I like to cook"],  # another query to try: "small child is having fun"
    n_results=3,
)
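
The value returned by `query` is a dictionary whose fields (`ids`, `documents`, `distances`) each hold one inner list per entry in `query_texts`. The sketch below shows one way to unpack the matches for a single query. The helper name `top_matches` is hypothetical, and `sample_result` is a hand-written stand-in with placeholder ids and distances, not real output from Cohere or ChromaDB; in the exercise you would pass in the `result` returned by your own query.

```python
from typing import Any, Dict, List, Tuple


def top_matches(result: Dict[str, Any], query_index: int = 0) -> List[Tuple[str, str, float]]:
    """Return (id, document, distance) tuples for one query text.

    Each field holds one inner list per query text, so the matches
    for the first query live at index 0 of every field.
    """
    ids = result["ids"][query_index]
    docs = result["documents"][query_index]
    distances = result["distances"][query_index]
    return list(zip(ids, docs, distances))


# Hand-written stand-in for a query result. The ids and distance values
# are placeholders chosen for illustration, not real model output.
sample_result = {
    "ids": [["id-1", "id-2", "id-3"]],
    "documents": [[
        "The chef is preparing a delicious meal in the kitchen.",
        "A chef is tossing vegetables in a sizzling pan.",
        "A man is eating food.",
    ]],
    "distances": [[0.1, 0.2, 0.3]],
}

for doc_id, doc, distance in top_matches(sample_result):
    print(f"{doc_id}  {distance:.2f}  {doc}")
```

A smaller distance means a closer match, so the first tuple is the document the collection considers most similar to the query text.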

Solution

Open in local Jupyter Lab environment

The solution to the exercise is available under Part-2 of the notebook: images/vectordb/chroma-basics.png

Open in Google Colab
Open In Colab