Image Recognition using LLMs

Usage

llm_image_recognition(
  llm_model = "qwen2.5vl",
  image = system.file("img/test_img.jpg", package = "kuzco"),
  recognize_object = "face",
  backend = "ellmer",
  additional_prompt = "",
  provider = "ollama",
  ...
)

Arguments

llm_model

the name of a local LLM that has been pulled from Ollama

image

the path to a local image file in jpeg, jpg, or png format

recognize_object

an item you want the LLM to look for

backend

either 'ollamar' or 'ellmer'. Note that 'ollamar' only suggests structured outputs, while 'ellmer' enforces them

additional_prompt

text to append to the image prompt

provider

ignored when backend = 'ollamar'. When backend = 'ellmer', provider refers to the ellmer::chat_* providers and can be used to switch from "ollama" to another provider such as "perplexity"

...

a pass-through for other generate arguments and model arguments such as temperature. Set the temperature to 0 for more deterministic output

Value

a data frame with the columns object_recognized, object_count, object_description, and object_location
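
Example

The following is a minimal sketch of how the arguments above might be combined, not an official example from the package. It assumes kuzco is installed and that an Ollama server is running locally with the "qwen2.5vl" model already pulled. The helpers is_supported_image() and recognize_in_image() are hypothetical names introduced here for illustration; they are not part of kuzco.

```r
# Hypothetical pre-check (not part of kuzco): TRUE when the path ends in one
# of the image extensions listed for the `image` argument.
is_supported_image <- function(path) {
  grepl("\\.(jpe?g|png)$", path, ignore.case = TRUE)
}

# Hypothetical wrapper (not part of kuzco): validate the path, then call
# llm_image_recognition() with the documented defaults for backend/provider.
recognize_in_image <- function(path, object, model = "qwen2.5vl") {
  stopifnot(is_supported_image(path), file.exists(path))
  kuzco::llm_image_recognition(
    llm_model = model,
    image = path,
    recognize_object = object,
    backend = "ellmer",   # per Arguments: enforces structured outputs
    provider = "ollama"
  )
}

# Usage (requires kuzco plus a running Ollama server, so it is left commented):
# img <- system.file("img/test_img.jpg", package = "kuzco")
# res <- recognize_in_image(img, "face")
# res[, c("object_recognized", "object_count",
#         "object_description", "object_location")]
```

The extension check is only a convenience so that an unsupported file fails fast before any model call is made; whether kuzco performs its own validation is not stated here.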