Create your own image search (Python)

In this tutorial, we will create an image search: given a text search term and a set of images, we return the similarity of each image to the search term.

📘

How does it work?

To compare a text term and an image, we first create two vectors (in JSON/Python a vector is simply an array of numbers). Each vector contains 512 numbers and describes the text or image in the context of the AI model.

When you have two vectors, their dot product tells you how similar they are. If we then rank all the dot products from highest to lowest, we know which images best match the keyword.
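As a toy example (made-up 3-number vectors for illustration, not real 512-number embeddings), two vectors that point in similar directions produce a large dot product:

import numpy as np

#made-up 3-number "embeddings"; real ones have 512 numbers
text_vec = np.array([0.2, 0.9, 0.4])
image_vec = np.array([0.1, 0.8, 0.5])

#elementwise products summed: 0.02 + 0.72 + 0.20 = 0.94
print(np.dot(text_vec, image_vec))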

The code shown in this tutorial is available on GitHub.

Text search term

Let's choose to search for images with a bicycle in them. We create the vector for our search term using the Vision: Embed Text endpoint.

import requests, json
from dotenv import dotenv_values

#load the API key from a local .env file (see the note below)
corcelKey = dotenv_values(".env")['corcel_apikey']


url = "https://api.corcel.io/vision/embed_text"

payload = {
    "text_prompts": ["bicycle"]
}
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "Authorization": corcelKey
}

response = requests.post(url, json=payload, headers=headers)

#parse the response and pull out the embedding for "bicycle"
jsonresponse = json.loads(response.text)
text_embedding = jsonresponse[0]['text_embeddings'][0]

#print(text_embedding)

This code uses the /vision/embed_text endpoint. We use environment variables to keep the API key out of the source code. To supply your API key, create a file named ".env" in the same directory and add the line corcel_apikey=<your API key>.

The response is parsed as JSON, and the text embedding is extracted into text_embedding.
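The exact response format is best checked against the API reference; based on the indexing above, it is shaped roughly like this (values are illustrative, not real output):

#jsonresponse, as inferred from jsonresponse[0]['text_embeddings'][0]:
#[
#  {
#    "text_embeddings": [
#      [0.013, -0.072, ...]   # one 512-number vector per text prompt
#    ]
#  }
#]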

Images

Let's grab a few images from Corcel for our search:

A few are definitely bicycles, some are weird bicycle-like things that only sort of look like bicycles, and one is a banana.

https://storage.googleapis.com/corcel-images/d36a1306-6973-4d94-9.webp

https://storage.googleapis.com/corcel-images/8da56a62-6129-48c7-b.webp

https://storage.googleapis.com/corcel-images/1dfa33d5-f1e9-42c9-a.webp

https://storage.googleapis.com/corcel-images/74e45f262c5f5748954b617cc2fbc103.webp

https://storage.googleapis.com/corcel-images/2df9fa91-7fac-4dd2-a.webp

https://storage.googleapis.com/corcel-images/6debe5d9-a9ba-45eb-b.webp

Let's download all of these images, base64-encode them, and store the encodings in an array.

import requests
import base64
image_urls = [
    "https://storage.googleapis.com/corcel-images/d36a1306-6973-4d94-9.webp",
    "https://storage.googleapis.com/corcel-images/8da56a62-6129-48c7-b.webp",
    "https://storage.googleapis.com/corcel-images/1dfa33d5-f1e9-42c9-a.webp",
    "https://storage.googleapis.com/corcel-images/74e45f262c5f5748954b617cc2fbc103.webp",
    "https://storage.googleapis.com/corcel-images/2df9fa91-7fac-4dd2-a.webp",
    "https://storage.googleapis.com/corcel-images/6debe5d9-a9ba-45eb-b.webp"
]

b64_images = []

for image in image_urls:
    response = requests.get(image)

    # Check if the request was successful (status code 200)
    if response.status_code == 200:
        # Convert the image content to base64
        base64_image = base64.b64encode(response.content).decode('utf-8')
    else:
        # keep list positions aligned with image_urls; an empty string
        # will not embed, so we flag failures below
        base64_image = ""
    
    b64_images.append(base64_image)
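One caveat: a failed download leaves an empty string in b64_images, which the embedding endpoint will presumably reject. A quick sanity check (a sketch, using the names from the code above) before moving on:

#flag any URLs whose download failed before sending to the API
failed = [url for url, b64 in zip(image_urls, b64_images) if not b64]
if failed:
    print("Failed to download:", failed)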

Create the Vectors

We can send the array of encoded images to Corcel to create the vectors. The vectors are saved in the array image_embedding.

#embed the base64-encoded images into vectors

url = "https://api.corcel.io/vision/embed_image"

payload = {
    "image_b64s": b64_images
}
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "Authorization": corcelKey
}

response = requests.post(url, json=payload, headers=headers)


#parse the response; image_embedding holds one vector per image
jsonresponse = json.loads(response.text)
image_embedding = jsonresponse[0]['image_embeddings']

print(len(image_embedding))
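The printed length should equal the number of images sent (6 here). If you want to make the assumptions explicit, a couple of assertions (a sketch) help catch surprises early:

#one embedding per input image, each with 512 numbers (per the model description above)
assert len(image_embedding) == len(image_urls)
assert len(image_embedding[0]) == 512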

Comparison

We can now compare text_embedding with each of the image embeddings. This requires no further API calls: we simply compute the dot product of the text vector with each image vector. We can then sort the results from highest to lowest and display each image alongside its dot-product value.

import numpy as np
from IPython.display import display, Image

#create the text vector
textVector = np.array(text_embedding)

#store dot products in an array
dotProducts = []

#loop through the image vectors, computing each dot product with the text vector
for imageVector in image_embedding:
    dotProd = np.dot(textVector, imageVector)
    dotProducts.append(dotProd)

#sort the dot products highest to lowest
sorted_indices = np.argsort(dotProducts)[::-1]

#from highest to lowest, show the dot product (with 3 digits after the decimal), and the image.
for index in sorted_indices:
    print(f"similarity : {dotProducts[index]:.3f}")
    display(Image(url=image_urls[index], width=400))
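IPython.display only renders images inside a notebook. If you are running this as a plain script, a simple fallback (a sketch) is to print the ranked URLs instead:

#plain-script alternative: print each similarity with its image URL
for index in sorted_indices:
    print(f"similarity : {dotProducts[index]:.3f}  {image_urls[index]}")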

This displays all of the images. In this case, the order is:

  1. elephant on a bike
  2. mountain biker
  3. lime tricycle
  4. steampunk factory
  5. motorcycle
  6. banana

Addendum - modify to search for similar images

You can search for similar images in much the same way as above: take the vector of your search image and compare it to the vectors of the images you'd like to rank, as in the sketch below.
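A minimal sketch, reusing the variables and the /vision/embed_image endpoint from the code above; search_image_url is a placeholder you would replace with your own image:

#embed the query image with the same endpoint used for the candidates
search_image_url = "https://example.com/query.webp"  # placeholder URL

query_response = requests.get(search_image_url)
query_b64 = base64.b64encode(query_response.content).decode('utf-8')

payload = {"image_b64s": [query_b64]}
response = requests.post("https://api.corcel.io/vision/embed_image",
                         json=payload, headers=headers)
query_vector = np.array(json.loads(response.text)[0]['image_embeddings'][0])

#rank the candidate images by dot product with the query image
scores = [np.dot(query_vector, np.array(v)) for v in image_embedding]
for index in np.argsort(scores)[::-1]:
    print(f"similarity : {scores[index]:.3f}  {image_urls[index]}")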