Customized Vision using LLMs

Usage

llm_image_custom(
  llm_model = "qwen2.5vl",
  image = system.file("img/test_img.jpg", package = "kuzco"),
  backend = "ellmer",
  system_prompt =
    "You are a terse assistant specializing in computer vision image sentiment.\n                   You are short and to the point. You only respond if the user supplies an image.\n                   You will observe the image and return JSON specific answers.\n                   Return as JSON\n                   Do not include backticks or 'json' within your answer but purely the json.\n                   Do not return NULL, all fields must be complete.\n                   Do not return the exact examples given but fill out the template,\n                   supply your own new original answer every time. ",
  image_prompt = "please return JSON for image according to the example format supplied",
  example_df = data.frame(image_sentiment = "positive", image_score = 0.6,
    sentiment_description = "image envokes a positive emotional response."),
  provider = "ollama",
  ...
)

Arguments

llm_model: the name of a local LLM that has been pulled with Ollama, for example "qwen2.5vl"
image: the path to a local image file in JPEG (.jpeg, .jpg) or PNG format
backend: either 'ollamar' or 'ellmer'
system_prompt: an overarching description of the assistant. The prompt should tell the LLM to return JSON; kuzco handles the conversions to and from JSON.
image_prompt: additional instructions sent with the image, for anything you want to remind the LLM about
example_df: an example data.frame showing the LLM what you want returned. Note that it is converted to JSON before being passed to the LLM.
provider: ignored when backend = 'ollamar'. When backend = 'ellmer', provider refers to the ellmer::chat_* providers and can be used to switch from "ollama" to other providers such as "perplexity".
...: passed through as additional generate arguments and model arguments, such as temperature
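
The example_df argument is described as being converted to JSON for the LLM. The sketch below illustrates what such a conversion can look like. It uses jsonlite as a stand-in; whether kuzco uses jsonlite internally, and the exact JSON shape it sends, are assumptions not confirmed by this page.

```r
library(jsonlite)

# Template modeled on the default example_df shown under Usage.
example_df <- data.frame(
  image_sentiment = "positive",
  image_score = 0.6,
  sentiment_description = "image evokes a positive emotional response."
)

# Assumption: the template is serialized to JSON before the LLM sees it.
example_json <- toJSON(example_df)

# A JSON reply with the same fields can be parsed back into a data.frame.
round_trip <- fromJSON(example_json)
```

The column names of example_df become the JSON field names, so choosing descriptive column names is likely the main lever for shaping the response.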

Value

a customized result whose fields follow example_df, giving the caller control over what is returned
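
Examples

The following is a minimal sketch of calling llm_image_custom() with a custom template. The template columns (object_count, main_object, scene_description) and the wrapper name describe_objects are hypothetical and chosen only for illustration. Running it assumes kuzco is installed, an Ollama server is running, and the "qwen2.5vl" model has been pulled locally.

```r
# Hypothetical template for a custom task; these column names are
# illustrative and not part of kuzco.
object_template <- data.frame(
  object_count = 3,
  main_object = "dog",
  scene_description = "three dogs playing in a park."
)

# Wrapper so the model is only contacted when the function is called.
# The default system_prompt is left unchanged here; it is worded for
# sentiment, so a custom task may need its own system_prompt.
describe_objects <- function(image_path) {
  kuzco::llm_image_custom(
    llm_model = "qwen2.5vl",
    image = image_path,
    backend = "ellmer",
    image_prompt = "please return JSON for image according to the example format supplied",
    example_df = object_template,
    provider = "ollama"
  )
}

# Uncomment to run against the test image shipped with the package:
# result <- describe_objects(system.file("img/test_img.jpg", package = "kuzco"))
```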