Create your own image search (Python)
In this tutorial, we will build an image search: we supply a search term and a set of images (six in this example), and get back a similarity score between each image and the search term.
How does it work?
To compare a text term with an image, we first create a vector for each (in JSON/Python, a vector is simply an array of numbers). Each vector has 512 numbers and describes the text or image in the context of the AI model.
When you have two such vectors, their dot product tells you how similar they are: the higher the value, the closer the match. If we rank the dot products from highest to lowest, we know which images best match the search term.
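To make the ranking idea concrete, here is a minimal sketch that uses made-up 3-number vectors in place of real 512-number embeddings. The vectors, the labels, and the `normalize` helper are illustrative assumptions, not output from the API. Scaling each vector to unit length makes the dot product equal to the cosine similarity; whether the API already returns unit-length vectors is not stated here, so the sketch normalizes explicitly.

```python
import numpy as np

def normalize(vector):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    vector = np.asarray(vector, dtype=float)
    return vector / np.linalg.norm(vector)

# Made-up toy vectors (assumption: real embeddings here have 512 numbers).
query = normalize([1.0, 0.0, 0.0])
candidates = {
    "close match": normalize([0.9, 0.1, 0.0]),
    "partial match": normalize([0.5, 0.5, 0.0]),
    "no match": normalize([0.0, 0.0, 1.0]),
}

# Dot product of the query with every candidate.
scores = {label: float(np.dot(query, vec)) for label, vec in candidates.items()}

# Rank the labels from highest to lowest score.
ranking = sorted(scores, key=scores.get, reverse=True)
for label in ranking:
    print(f"{label}: {scores[label]:.3f}")
```

The candidate that points in nearly the same direction as the query ranks first, and the one at right angles to it scores zero.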
The code shown in this tutorial is available on GitHub.
Text search term
Let's choose to search for images with a bicycle in them. We create the vector for our search term using the Vision: Embed Text endpoint.
import requests, json
from dotenv import dotenv_values
corcelKey = dotenv_values(".env")['corcel_apikey']
url = "https://api.corcel.io/vision/embed_text"
payload = {
    "text_prompts": ["bicycle"]
}
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "Authorization": corcelKey
}
response = requests.post(url, json=payload, headers=headers)
jsonresponse = json.loads(response.text)
text_embedding = jsonresponse[0]['text_embeddings'][0]
#print(text_embedding)
This code uses the /vision/embed_text endpoint. We read the API key from a .env file so that it stays out of the source code. To use your own API key, create a file called ".env" in the same directory and add a line that starts with "corcel_apikey=" followed by your key.
The response body is parsed as JSON, and the text embedding is extracted into text_embedding.
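If the request fails (for example, because of a bad API key), indexing straight into the parsed response raises a confusing error. The sketch below shows one way to guard the extraction step. The helper name `extract_text_embedding` and the sample data are hypothetical; the response shape is assumed from the indexing used in this tutorial (a list whose first item holds a 'text_embeddings' list).

```python
def extract_text_embedding(parsed):
    """Return the first text embedding from a parsed embed_text response.

    Assumes the shape used in this tutorial: a list whose first item is a
    dict holding a 'text_embeddings' list with one vector per prompt.
    """
    if not isinstance(parsed, list) or not parsed or not isinstance(parsed[0], dict):
        raise ValueError(f"Unexpected response: {parsed!r}")
    embeddings = parsed[0].get("text_embeddings")
    if not embeddings:
        raise ValueError(f"No text embeddings in response: {parsed!r}")
    return embeddings[0]

# Hypothetical stand-in for json.loads(response.text); a real vector has 512 numbers.
sample_response = [{"text_embeddings": [[0.12, -0.03, 0.57]]}]
text_embedding = extract_text_embedding(sample_response)
print(len(text_embedding))
```

In the tutorial code, this helper would be called with jsonresponse in place of sample_response.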
Images
Let's grab a few images from Corcel for our search.
A few are definitely bicycles; others are unusual bicycles or things that only loosely resemble one, plus a banana.
https://storage.googleapis.com/corcel-images/d36a1306-6973-4d94-9.webp
https://storage.googleapis.com/corcel-images/8da56a62-6129-48c7-b.webp
https://storage.googleapis.com/corcel-images/1dfa33d5-f1e9-42c9-a.webp
https://storage.googleapis.com/corcel-images/74e45f262c5f5748954b617cc2fbc103.webp
https://storage.googleapis.com/corcel-images/2df9fa91-7fac-4dd2-a.webp
https://storage.googleapis.com/corcel-images/6debe5d9-a9ba-45eb-b.webp
Let's download each of these images, base64-encode it, and store the encodings in an array.
import requests
import base64
image_urls = [
    "https://storage.googleapis.com/corcel-images/d36a1306-6973-4d94-9.webp",
    "https://storage.googleapis.com/corcel-images/8da56a62-6129-48c7-b.webp",
    "https://storage.googleapis.com/corcel-images/1dfa33d5-f1e9-42c9-a.webp",
    "https://storage.googleapis.com/corcel-images/74e45f262c5f5748954b617cc2fbc103.webp",
    "https://storage.googleapis.com/corcel-images/2df9fa91-7fac-4dd2-a.webp",
    "https://storage.googleapis.com/corcel-images/6debe5d9-a9ba-45eb-b.webp"
]
b64_images = []
for image in image_urls:
    response = requests.get(image)
    # Check if the request was successful (status code 200)
    if response.status_code == 200:
        # Convert the image content to base64
        base64_image = base64.b64encode(response.content).decode('utf-8')
    else:
        base64_image = ""
    b64_images.append(base64_image)
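The loop above appends an empty string when a download fails, and that empty string is later sent to the embedding endpoint; how the endpoint treats an empty string is not stated here. One alternative, sketched below, is to skip failed downloads while keeping the list of URLs and the list of encodings aligned. The helper name `encode_downloads`, the example.com URLs, and the placeholder bytes are all hypothetical.

```python
import base64

def encode_downloads(downloads):
    """Base64-encode downloaded images, skipping any that failed.

    `downloads` is a list of (url, content) pairs, where content is the raw
    image bytes, or None if the download failed. Returns two parallel lists.
    """
    kept_urls = []
    b64_images = []
    for url, content in downloads:
        if content is None:
            continue  # skip failed downloads so both lists stay aligned
        kept_urls.append(url)
        b64_images.append(base64.b64encode(content).decode("utf-8"))
    return kept_urls, b64_images

# Hypothetical data: placeholder bytes instead of real image content.
downloads = [
    ("https://example.com/a.webp", b"fake image bytes"),
    ("https://example.com/b.webp", None),
]
kept_urls, b64_images = encode_downloads(downloads)
print(kept_urls, len(b64_images))
```

With this approach, the later display step would index into kept_urls rather than image_urls, so each similarity score is shown next to the right image.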
Create the Vectors
We can send the array of encoded images to Corcel to create the vectors. The vectors are saved in the array image_embedding.
#encode the B64 images into vectors
url = "https://api.corcel.io/vision/embed_image"
payload = {
    "image_b64s": b64_images
}
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "Authorization": corcelKey
}
response = requests.post(url, json=payload, headers=headers)
jsonresponse = json.loads(response.text)
image_embedding = jsonresponse[0]['image_embeddings']
print(len(image_embedding))
Comparison
We can now compare text_embedding with the image embeddings. This does not require any API calls; we just compute the dot product of the text vector with each of the image vectors. Then we sort the results from highest to lowest and display the images with their dot product values.
import numpy as np
from IPython.display import display, Image
# create the text vector
textVector = np.array(text_embedding)
# store the dot products in an array
dotProducts = []
# loop through the image vectors, compute the dot product with the text vector, and append it to dotProducts
for imageVector in image_embedding:
    dotProd = np.dot(textVector, imageVector)
    dotProducts.append(dotProd)
# sort the dot products highest to lowest
sorted_indices = np.argsort(dotProducts)[::-1]
# from highest to lowest, show the dot product (with 3 digits after the decimal) and the image
for index in sorted_indices:
    print(f"similarity : {dotProducts[index]:.3f}")
    display(Image(url=image_urls[index], width=400))
This displays all of the images. In this case, the order is:
- Elephant on a bike
- Mountain biker
- Lime tricycle
- Steampunk factory
- Motorcycle
- Banana
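The comparison loop can also be written as a single matrix-vector product, which computes every dot product in one step. This is a sketch of that alternative, not the tutorial's own code: the function name `rank_images` is hypothetical, and the 2-number vectors are made up to stand in for the 512-number embeddings.

```python
import numpy as np

def rank_images(text_vector, image_vectors):
    """Return (order, scores): image indices sorted by dot product, highest first."""
    text_vector = np.asarray(text_vector, dtype=float)
    image_matrix = np.asarray(image_vectors, dtype=float)  # shape: (n_images, dim)
    scores = image_matrix @ text_vector  # one dot product per image row
    order = np.argsort(scores)[::-1]     # indices from highest to lowest score
    return order, scores

# Made-up vectors standing in for text_embedding and image_embedding.
order, scores = rank_images([1.0, 0.0], [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]])
for index in order:
    print(f"image {index}: {scores[index]:.3f}")
```

For six images the speed difference is negligible, but the vectorized form keeps the code short as the image collection grows.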
Addendum - modify to search for similar images
You can search for similar images in much the same way as above: take the vector of your search image and compare it with the vectors of the images you want to search.
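As a rough sketch of that idea, the function below picks one image vector as the query and ranks the remaining ones by dot product, leaving the query itself out of the results. The name `similar_to` and the 2-number vectors are hypothetical; in practice the vectors would be image embeddings like those stored in image_embedding.

```python
import numpy as np

def similar_to(query_index, image_vectors, top_k=3):
    """Return indices of the top_k images most similar to the image at query_index."""
    vectors = np.asarray(image_vectors, dtype=float)
    scores = vectors @ vectors[query_index]  # dot product with every image
    # Highest score first, skipping the query image itself.
    order = [int(i) for i in np.argsort(scores)[::-1] if i != query_index]
    return order[:top_k]

# Made-up vectors standing in for real image embeddings.
library = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [0.8, 0.6]]
print(similar_to(0, library, top_k=2))
```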