feat(img): implement rudimentary image API #31399

chipsenkbeil · 2024-11-29T22:01:21Z

Status

This PR serves as a nice feasibility study but has grown too large to consider merging. So it is temporarily parked while a suitable API is being discussed.

Feature & Implementation Breakdown

The following is both a tracker and detailing of the features supported by this PR, what they look like, and a visual of each implementation (provider) in order to support as many terminals as possible, including on Windows. Monitor this top-level comment for the current API structure and visual examples of the PR in action.

Core API Structure

The API follows a simplistic, object-oriented design similar to vim.system(). You specify an image, optionally eagerly loading its data, and then display it within neovim. The information about where and how it is displayed is provided through a single table.

Each image, once shown, returns a placement, which represents the specific instance of the image on screen. This is to support the separation of an image and its data from the details and management of displaying it.

The image API makes use of an internal Promise<T> class in order to support both synchronous and asynchronous operation, meaning that images can be loaded, shown, hidden, and updated asynchronously in order to be efficient, but also can manually be waited upon at any stage.

A static image you can place within neovim

-- Supports loading PNG images into memory
local img = assert(vim.ui.img.load("/path/to/img.png"):wait())

-- Supports lazy-loading image, deferring to a provider
local img = vim.ui.img.new("/path/to/img.png")

-- Supports specifying an image and explicitly providing the data
local img = vim.ui.img.new({ bytes = "...", filename = "/path/to/img.png" })

-- Once created, can be shown, returning an object representing the instance
local placement = img:show():wait() -- Places in top-left of editor with default size
local placement = img:show({ pos = { x = 4, y = 8, unit = 'cell' }):wait()
local placement = img:show({ relative = "cursor" }):wait()

Support deleting the image placed within neovim

local img = vim.ui.img.new({ filename = "/path/to/img.png" })
local placement = assert(img:show():wait())

-- Supports hiding the specific instance of the image
placement:hide()

Dynamically resize/move an image

local img = vim.ui.img.new({ filename = "/path/to/img.png" })
local placement = assert(img:show({ pos = { x = 1, y = 2, unit = "cell" } }):wait())

-- Supports updating a displayed image with a new position
placement:update({ pos = { x = 5, y = 6, unit = "cell" } }):wait()

-- Supports resizing a displayed image
placement:update({ size = { width = 10, height = 5, unit = "cell" } }):wait()

Asynchronous support for image operations

img:show({ ... })
    :on_ok(function(placement)
        -- Use the placement once it has been confirmed as shown
    end)
    :on_fail(function(err)
        -- Do something with the error that occurred
    end)
    :on_done(function(err, placement)
        -- When either ok or fail happens
    end)

-- The same applies for all APIs:
--
--    img.load() -> returns Promise<Image>
--    img:show() -> returns Promise<Placement>
--    placement:show() -> returns Promise<true>
--    placement:hide() -> returns Promise<true>
--    placement:update() -> returns Promise<true>
--
-- Each Promise<T> supports chaining callbacks for individual
-- conditions of success or failure as well as combining the two
-- together.
--
-- The on_* methods can be called multiple times and each
-- callback will be invoked when finished.
--
-- You can also still choose to wait in a synchronous fashion
-- using `:wait()` which supports supplying a specific timeout
-- in milliseconds.

Image Manipulation & Inspection

Leveraging ImageMagick for the functionality, this section describes features needed to manipulate and inspect images. This is particularly needed for sixel, which would normally require advanced image decoding to convert to an RGB format and then packaged in sixel's format. ImageMagick supports features like cropping and resizing, which we expose through the convert method.

Additionally, provides a retrieval API called identify (modeled after magick identify) to inspect images without needing advanced header parsing per image format in order to learn details like the true image format (not just reading the file extension) and pixel dimensions of the image.

Manipulation an image

-- Making use of ImageMagick, you can convert an image
-- to different formats, resize it, crop it, and more
--
-- And like all other APIs, this returns a promise
img:convert({ 
    format = 'jpeg', -- Supports a variety of formats, including sixel!
    size = { width = 50, height = 30, unit = 'pixel' },
}):on_done(function(err, data)
    -- By default, this returns the data of the image
    -- instead of updating the instance or saving it
    -- to disk
    --
    -- Use img:convert({ out = '/path/to/img.jpeg' })
    -- to write to disk instead
end)

Retrieve information about an image

-- Retrieving information also makes use of ImageMagick
-- to learn things like image dimensions and format
img:identify({ format = true, size = true }):on_done(function(err, info)
    -- Info is a table containing the information requested by each
    -- field marked true
    --
    -- In this case, info.format == "PNG" and
    -- info.size == vim.ui.img.utils.Size { width = 50, height = 30, unit = 'pixel' }
end)

Kitty Provider

The 1st protocol supported is kitty graphics protocol. Aside from the obvious kitty, other graphics-accelerated terminals like Ghostty make use of this protocol to display images. This protocol only requires magick when attempting to load an image other than PNG, which will be converted to PNG and placed in ~/.cache/nvim/imgs/{sha256(filename)}.png.

Integration of kitty as a provider
```
vim.o.imgprovider = 'sixel'
```
neovim-img-kitty.mp4

Sixel Provider

The 2nd protocol supported is sixel image protocol. Windows Terminal, in particular, makes use of this protocol to display images. This protocol requires magick in order to convert images into the sixel format, as well as all operations like resizing and cropping.

Integration of sixel as a provider
```
vim.o.imgprovider = 'sixel'
```
neovim-img-sixel-windows.mp4

iterm2 Provider

The 3rd protocol supported is iterm2 image protocol. WezTerm provides a GPU-accelerated implementation of this protocol. This protocol only requires magick in order to crop images. Resizing images does not require magick, and iterm2 supports multiple image formats.

Integration of iterm2 as a provider
```
vim.o.imgprovider = 'iterm2'
```
neovim-img-iterm2.mp4

3rd Party Providers

To enable GUIs and other image protocols to easily slide into neovim's API abstraction, a singular API is exposed through vim.ui.img.providers.new, which implementers will use to specify show and hide functions at a minimum. Each provider supports operations when loading, unloading, checking supported to see if the provider can operate in the current environment, and (if possible) specifying a more efficient update method.

The methods supported, show, hide, and update take a callback to support asynchronous operation, and must use it to communicate when the operation is finished. The methods load and unload are considered synchronous and should be fast to operate.

Abstraction for 3rd parties to support images in neovim

-- Providers implement a small API to support showing and hiding images
vim.ui.img.providers['neovide'] = vim.ui.img.providers.new({
    ---(Optional) Called to initialize the provider.
    ---@param ... any arguments for the specific provider upon loading
    load = function(...)
      -- Implement here
    end,

    ---(Optional) Called to cleanup the provider.
    unload = function()
      -- Implement here
    end,

    ---(Optional) Reports whether this provider is supported in the current environment.
    ---@param on_supported? fun(supported:boolean) callback when finished checking
    supported = function(on_supported)
        -- Implement here
    end,

    ---Displays an image, returning (through callback) an id tied to the instance.
    ---@param img vim.ui.Image image data container to display
    ---@param opts vim.ui.img.Opts specification of how to display the image
    ---@param on_shown? fun(err:string|nil, id:integer|nil) callback when finished showing
    ---@return integer id unique identifier connected to the displayed image (not vim.ui.Image)
    show = function(img, opts, on_shown)
        -- Implement here
    end,

    ---Hides one or more displayed images.
    ---@param ids integer[] list of displayed image ids to hide
    ---@param on_hidden fun(err:string|nil, ids:integer[]|nil) callback when finished hiding
    hide = function(ids, on_hidden)
        -- Implement here
    end,

    ---(Optional) Updates an image, returning (through callback) a refreshed id tied to the instance.
    ---If not specified, neovim will invoke `hide(id)` followed by `show(img, opts, on_updated)`.
    ---@param id integer id of the displayed image to update
    ---@param opts vim.ui.img.Opts specification of how to display the image
    ---@param on_updated? fun(err:string|nil, id:integer|nil) callback when finished updating
    ---@return integer id unique identifier connected to the displayed image (not vim.ui.Image)
    update = function(id, opts, on_updated)
        -- Implement here
    end,
})

Example of using the neovide 3rd party once specified above

-- Load an image to display, nothing different here
local img = assert(vim.ui.img.load("/path/to/img.png"):wait())

-- Set the global provider, which will be used for all image operations by default
vim.o.imgprovider = 'neovide'

-- Display the image, which will use the neovim provider
local placement = assert(img:show():wait())

-- Alternatively, you can manually create a placement and specify the provider
-- This will not be shown yet until specifically invoked
local placement = img:new_placement({ provider = 'neovide' })
placement:show({ size = { ... } }):wait()

Maybe in scope for this PR

Injection of an image into a buffer within neovim (tracking its movement properly, possibly through the use of extmarks)
1. Support raw placement of an image on top of text in a buffer, but have the image scroll with the buffer
2. Support injecting an image into the buffer and reflowing text around it

Out of Scope for this PR

Video/gif support (there are reasons why this would be neat, but not a dealbreaker if we want to exclude this from neovim core)

runtime/lua/vim/img/_image.lua

runtime/lua/vim/img/_terminal.lua

kovidgoyal · 2024-12-01T03:15:39Z

On Sat, Nov 30, 2024 at 01:29:25PM -0800, Chip Senkbeil wrote: @justinmk heads up, one complexity that we'll punt for now is supporting non-PNG images. I think we can write a pretty straightforward decoder for BMP & GIF, but JPEG is very complex and would /probably/ need a specialized C function to do it with the assistance of a JPEG-oriented library. This is in order to get RGB or RGBA data. @kovidgoyal I'm assuming my understanding of pixel formats is correct in that if we fed in any other image format that was not PNG, using `f=100` would not work, and we'd need to instead decode the base64 image data, figure out the format (i.e. bmp, jpeg, etc) and then extract a 24-bit RGB or a 32-bit RGBA set of data to feed in order for your protocol to work.

Yes, correct. You can use either imagemagick or the statically compiled kitten binary that comes as part of kitty to do this.

I don't know what iterm2's graphics protocol supports as I've only tested with png and I don't see anything mentioned on their doc page. I also don't know what sixel supports or how it works since I haven't read the documentation yet, but I imagine given the age of sixel that we'd need to support image decoding of some kind to break out rgb/rgba data.

sixel supports nothing, you need to convert every image format to the sixel format and transmit that.

runtime/lua/vim/_editor.lua

runtime/lua/vim/img.lua

runtime/lua/vim/img/_backend.lua

runtime/lua/vim/img/_terminal.lua

runtime/lua/vim/img/_image.lua

runtime/lua/vim/img/_backend.lua

runtime/lua/vim/img/_image.lua

justinmk · 2025-05-21T10:29:15Z

runtime/lua/vim/ui/img.lua

+  vim.validate('opts', opts, { 'string', 'table' })
+  if type(opts) == 'table' then
+    vim.validate('opts.bytes', opts.bytes, 'string', true)
+    vim.validate('opts.filename', opts.filename, 'string')


"file" is common in vim.fs. Eventually we will have URIs, and "file" is more compatible with that.

Suggested change

vim.validate('opts.filename', opts.filename, 'string')

vim.validate('opts.file', opts.file, 'string')

justinmk · 2025-05-21T10:33:11Z

runtime/lua/vim/ui/img.lua

+---Will throw an error if unable to load the bytes of the file.
+---Works without ImageMagick.
+---@return boolean
+function M:is_png()


strange for a boolean getter to throw. can this be format():'png'|nil instead, where nil means "unknown"? or maybe format():'png'|'error'|nil

or perhaps is_png():boolean|nil

Yeah, can do that instead. Do you expect to support other formats other than PNG down the line? If so, one of the format functions makes sense. If not, the is_png makes more sense.

let's assume PNG.

justinmk · 2025-05-21T10:34:01Z

runtime/lua/vim/ui/img.lua

+---Returns information about the image by invoking ImageMagick's identify.
+---This will be some uppercase string like 'PNG'.
+---@param opts? vim.ui.img.IdentifyOpts
+---@return vim.ui.img.utils.Promise<vim.ui.img.Metadata>
+function M:identify(opts)


this is definitely out of scope for this PR

Only reason it existed was to easily find any image format's dimensions (width & height). If we just stick with PNG, I know how to read that from the file header. So, no problem.

justinmk · 2025-05-21T10:34:28Z

runtime/lua/vim/ui/img.lua

+---Converts an image using ImageMagick, returning the bytes of the new image.
+---
+---If `background` is specified, will convert alpha pixels to the background color (e.g. #ABCDEF).
+---If `crop` is specified, will crop the image to the specified pixel dimensions.
+---If `format` is specified, will convert to the image format, defaulting to png.
+---If `out` is specified, will treat as output file and write to it, and promise returns empty str.
+---If `size` is specified, will resize the image to the desired size.
+---@param opts? vim.ui.img.ConvertOpts
+---@return vim.ui.img.utils.Promise<string>
+function M:convert(opts)


out of scope

Not a problem. If we did want sixel or need cropping for iterm2 (if we brought it in), then we would need this. Or we'd have to provide or write a PNG decoder.

requiring PNG seems reasonable. converting to PNG is a different topic, which could be covered in some other api in the future. but the caller of this api is expected to provide PNG.

Yes, I think supporting png only is reasonable. It also makes storing the images in memory more efficient, which we will need to do, since Kitty has a limited buffer size and images can be evicted.

What makes the evection even worse, is that on tmux with pass through, we don't have a way to know when that happens or how much quota is used. So, I think the only safe way to do it, is to re-upload the image to kitty each time a new image is brought to view.

@fredizzimo in snacks they use a least recently used queue to know which images to re-upload. I was going to implement that, but held off when discussion started back.

Or we transmit again every time, but that definitely has noticeable performance considerations from what I saw with iterm2 and sixel.

LRU out of scope for this PR? Retransmitting out of scope for this PR?

I think we only need to retransmit when a new image appears, not for every movement, but yes it could still be annoying when scrolling up and down.

LRU, with some size limit could work. The only thing we really have to be careful with is when the image is displayed in tmux or similar, in the other cases, neovim has full control. And the same thing applies to unique id collisions, not much we can do about that in a tmux context.

Since this does not affect the interface, I would vote for out of scope.

For LRU we have vim.ringbuf. But yes, let's save performance improvements for a followup, not here.

justinmk · 2025-05-21T10:36:27Z

runtime/lua/vim/ui/img.lua

+---Creates a placement of this image that is not yet visible.
+---@param opts? {provider?:string}
+---@return vim.ui.img.Placement
+function M:new_placement(opts)
+  return require('vim.ui.img.placement').new(self, opts)


seems overwrought to represent "placements" as objects which need to be created. I would expect something like M:set_pos.

Positions are called "pos". Avoid a new "placement" concept.

Maybe the naming is wrong on this - was following kitty's naming - but the concept is that the image is the file path and (optionally) data of the image itself and a placement is the reference to the image managed by the provider in the terminal (or UI) of which there may be multiple. Maybe we just don't make this a public method or we change the name, but M:set_pos makes me think that I'm moving the vim.ui.Image instead of one of the displayed versions on the terminal.

Maybe we just don't make this a public method or we change the name,

If it can be private for now, let's do that.

but M:set_pos makes me think that I'm moving the vim.ui.Image instead of one of the displayed versions on the terminal.

Is the intention to avoid creating a new image for each placement? Can we instead just say "a new placement requires new image". So an image object has one position, and cannot be placed in multiple places. That might be "inefficient", but so what?

The use cases for showing the same image multiple times, is probably quite limited. I can only really think of demo type effects, games and repeated UI elements. So, I think limiting the interface to only support images would be fine.

Furthermore, I'm pretty sure this can be optimized on the provider side. It can calculate the hash of the image data and properties and internally share it and keep track of the ids.

The use cases for showing the same image multiple times, is probably quite limited. I can only really think of demo type effects, games and repeated UI elements. So, I think limiting the interface to only support images would be fine.

Totally fair! I don't have a use case for same image repeated, and should be able to merge placement and image without much trouble.

Furthermore, I'm pretty sure this can be optimized on the provider side. It can calculate the hash of the image data and properties and internally share it and keep track of the ids.

Yep! I already do that within the sixel and iterm2 providers, and it works well. So agreed the optimization can be on the provider side. 😄

Will make the change along with the other requests.

chipsenkbeil · 2025-05-22T13:11:20Z

Thanks @justinmk and @fredizzimo for locking down discussion and changes. I'll get to these soon.

So it's clear what I'll do:

Remove ImageMagick
Remove iterm2 and sixel providers
Rename filename param to file and bytes param to data within Image
Remove crop option when showing an image
Remove Region as it was only needed for cropping
Remove Size and Position and flatten those parameters in the Opts table as we no longer need to carry around unit type
Merge Image and Placement
Remove supported check from providers as they don't function in neovim (can't capture CSI or kitty responses)
Remove screen size code (was temporary)
Move promise to _promise internally
Update PNG function to return boolean or nil
Wrap img import within ui.lua via the defer @justinmk mentioned
Remove imgprg and imgprovider and hard code kitty.
Keep the printio-test bin per @justinmk feedback

What I won't do unless you tell me you want it now:

Convert provider to be synchronous
Remove promise code in favor of libuv/luv style (ironically, what I started with)
Change to use the nvim_* discussed by @fredizzimo
Add LRU to kitty provider to track which images to retransmit
Update kitty provider to transmit every time you show the image

Anything I missed? Typing from my phone, so going from memory.

justinmk · 2025-05-22T13:45:20Z

Remove Size and Position and flatten those parameters in the Opts table as we no longer need to carry around unit type

Yes. It can be done in a way that allows us to re-add those in the future, right? If I'm wrong, can you show a small example of the function signature before/after.

What I won't do unless you tell me you want it now:

Convert provider to be synchronous
Remove promise code in favor of libuv/luv style (ironically, what I started with)

Not required here, but hopefully in a followup. Async should ideally be opt-in via an optional callback param.

lewis6991 · 2025-05-22T13:48:10Z

Remove promise code in favor of libuv/luv style (ironically, what I started with)

I would prefer this. I'm working on adding vim.async and porting luv style will be easier.

fredizzimo · 2025-05-22T16:05:08Z

We still need to discuss the exact API semantics, especially regarding the size parameter. That was the main reason I opened my API proposal, to have a specification. The removal of the absolute position support wasn't even included in the initial version.

As far as I can see the exact behaviour is unspecified here, but from based on the actual implementation and interface we have the following. Size is either optional or present, and if present both width and height are required. And when present, it's passed directly along to Kitty.

This means.

Not present - The image is displayed with the size of the source image
Present - The image is stretched/shrunk so that both the width and height fit the columns and rows exactly
There's no way to control how big the image, while at the same time keeping the aspect ratio.

This seemed OK to me when I though Kitty worked differently, and we would have more control being able to specify pixel sizes, preserving the aspect ratio. I was counting on the helper functions, which were unspecified to take care of the details. But that would still only have worked for a single TUI/GUI case.

So, this is what I specified instead here #34093
For buffer/inline positions both dimensions are always required, otherwise only one. Furthermore, there's a keep_aspect field that defines the behaviour when both dimensions are specified, with the idea that one of the dimensions is always filled. I thought that this was doable without knowing the cell aspect ratio, but it's actually not so I'm revising my suggestion:

Regardless of the placement mode, you can specify non, one or both of the dimensions. If both are specified, the keep_aspect ratio field determines if both dimensions should be used fully or only the one with the best fit. If none are specified, the number of columns and rows is automatically determined.

Bonus: minimum maximum support like snacks.

As mentioned, we will need to know the cell aspect ratio and/or size according to the following:

Non specified and absolute positioning - No need unless we want to be able to query how many cells the image actually is
Non specified and inline positioning - We need the exact cell size so that we can allocate room for the image in the buffer
One specified and absolute positioning - No need, kitty supports scaling accordingly unless we want to query
One specified and inline positioning - We only need the aspect ratio
Both specified - We need the aspect ratio to determine which size to use
Bonus - We need aspect ratio

Note that the cell aspect ratio can be virtual, for example 1:2, and the images would still be displayed with the correct proportions. It's just that the reserved area might be bigger or smaller than intended. So, in that case no terminal query support is needed, and no decision needs to be taken about which terminal is the main one. This approach also works well when multiple different UIs are connected.

Therefore, I suggest that we always require at least one dimension for now in this initial version, because terminal query support and designating a primary UI, which need a lot of design thought by itself, is still not implemented.

The z-index behaviour also needs specification

Do we copy the Kitty one?

Finally, you can specify the image z-index, i.e. the vertical stacking order. Images placed in the same location with different z-index values will be blended if they are semi-transparent. You can specify z-index values using the z key. Negative z-index values mean that the images will be drawn under the text. This allows rendering of text on top of images. Negative z-index values below INT32_MIN/2 (-1,073,741,824) will be drawn under cells with non-default background colors. If two images with the same z-index overlap then the image with the lower id is considered to have the lower z-index. If the images have the same z-index and the same id, then the behavior is undefined.

That is the only thing we can implement with the Kitty protocol I think. But it's hard to do in Neovide, since it fights against the natural order of floating windows in Neovim
2. Do we allow only positive indices? Meaning that the images will always be on top of all other Neovim elements. The z-index has no relation to the window z-index at all. At least this interpretation is much more doable
3. Or do we leave it undefined? I don't like this at all, even if Neovide itself would benefit from an interpretation where the z-index follows the window z-index, or even more fine grained. But the compatibility issues are worrying me.

I suggest that we implement 2.

fredizzimo · 2025-05-22T16:12:29Z

I also suggest that we battle test this against snacks.nvim. It has an absolute positioning mode to support wezterm.

fredizzimo · 2025-05-22T16:20:03Z

Remove promise code in favor of libuv/luv style (ironically, what I started with)

I would prefer this. I'm working on adding vim.async and porting luv style will be easier.

Couldn't we avoid async support altogether by externalizing the file loading, and pass a data buffer containing the PNG file data to the API? I see zero reasons why anything in here should ever block or require calculation.

fredizzimo · 2025-05-22T21:09:50Z

runtime/lua/vim/ui/img/providers/kitty.lua

+
+  -- Capture old cursor position, hide the cursor, and move to the
+  -- position where the image should be displayed
+  local pos = opts:position():to_cells()


If the position is not meant to be dynamically updated when the mouse or window moves, I think this calculation should be moved out of the provider. And the provider gets the precalculated position.

I had previously assumed that it was up to the provider to dynamically watch and update this, partially because of the mouse movement demo. But looking at the code, that does not appear to be the case.

If could be interesting though if it is updated dynamically by the provider, for example Neovide could move the image pixel by pixel with the mouse, and the image would follow smoothly with an animated window.

Hm... Maybe it has to be on the provider side after all, in the case of multigrid with custom window placement, only the provider will know the location it should be displayed.

In any case, it needs to be specified if it's supposed to dynamically update or not

It gets even more tricky than that. Since the provider processes this asynchronously, the cursor or mouse might have moved since the request, which results in another image location than intended. So, I think it has to be either continuously on the provider side, or immediately when the request is done outside the provider.

That means that multigrid UIs that make their own placement breaks, but that's the case with a lot of things, so I don't think it's a huge problem. These kinds of UIs probably need their own specialized plugins anyway. Also with inline placement, none of this is a problem, other than that the image will never follow the mouse or cursor automatically.

This is batched together and not sent immediately. That's what the writer does when it calls flush(). So it aggregates:

Saving the cursor

Hiding the cursor

Moving to a new location

Sending the display request

Restoring the cursor position

Showing the cursor

All of that is sent at once, so the jump of the cursor and return to its original position (wherever that may be) is really quick.

If the position is not meant to be dynamically updated when the mouse or window moves, I think this calculation should be moved out of the provider. And the provider gets the precalculated position.

This is a static location sent to the provider. The call to position() just figures out the row and column based on the relative field. So, not a dynamic moving position. Every time you call position() it will return the same value unless you change the opts.

It's a helper function attached to the opts so each provider doesn't need to implement that calculation itself.

I'm talking about this sequence and the positions specified by the placement

The user clicks somewhere

It takes x amount of time until the request is processed by the provider, which might be on a remote machine. With UI updates being batched.

At this point the Neovim cursor or mouse pointer might have moved

Oh, I see. Yeah, to avoid that, we'd need to calculate it on the spot before sending to the provider.

One way to fix this, would be to remove the support for opts.Relative, and only support editor for now, I'm not sure how much convenance functionality the protocol should provide, those are all easy to do on the caller side.

And as I mentioned, another option is to make it dynamic, meaning it updates in real-time when the cursor or mouse is moved. In that case it would also do something that can't be done otherwise.

@fredizzimo I just submitted a bunch of cleanup, and a distinct commit to try to solve this. I just updated the data structure that is fed to the provider to just have a row, col, optional width/height, and optional z.

The user-provided opts get mapped to the internal opts used by the provider. This gets calculated at the point in time when the user wants to show/update an image, and the provider can assume it always has the precalculated opts. Thoughts?

Yes, I think this part looks good now. We can possibly add real-time update support with an option in the future if needed.

justinmk · 2025-05-23T15:01:26Z

We still need to discuss the exact API semantics, especially regarding the size parameter.

The details that you pointed out are useful to track in a (concise) issue. If we find things that need to be changed, we can quickly adjust.

The point is to have something that works on master branch NOW instead of endlessly debating it and not having real-world feedback from master branch users.

I suggest that we always require at least one dimension for now in this initial version,

ok.

Do we allow only positive indices? Meaning that the images will always be on top of all other Neovim elements. The z-index has no relation to the window z-index at all.

sure. start with the most restricive thing. constraints can always be relaxed later, but the opposite is much harder.

fredizzimo · 2025-05-23T22:12:17Z

runtime/lua/vim/ui/img/opts.lua

+
+---Calculates and returns the position dictated by `relative` and `pos`.
+---@return vim.ui.img.utils.Position
+function M:position()


The indexing of this function is inconsistent and unclear. Personally, I think that all of them should be 0-indexed, since we are specifying offsets.

Currently, you need to specify 1-indexed positions, even for the ones that are commented as 0-indexed, since the terminal cursor position is 1-indexed

Well, technically you can specify both 0 and 1 to mean the edge of the editor, but to leave one space you need to pass 2

Another note regarding this. Snacks win could use relative = win, but in practice has to use relative = editor, since it needs to take the border into account.

And now that we have vim.ui.img.InternalOpts for the provider, the indexing need to be specified for that as well. Right now, in my implementation, I'm assuming a 1-based index, since the Kitty provider use it directly.

fredizzimo · 2025-05-23T22:49:59Z

I can't find a function that deletes an image to free up memory, I can only see hide for placements.

Even if we combine images and placements, I think there needs to be a function to free up the memory. I also think that hiding an image without deleting the data should still be possible.

chipsenkbeil · 2025-05-24T03:49:51Z

@justinmk @fredizzimo @gpanders @clason I've updated the PR to follow through with the items I listed to make it just kitty and clean up/remove a lot of the code we talked about. I hope this makes it easier to review going forward!

@fredizzimo I haven't updated docs or added checks to assert that one or both width & height are specified yet.

Do we allow only positive indices? Meaning that the images will always be on top of all other Neovim elements. The z-index has no relation to the window z-index at all. At least this interpretation is much more doable

When I implemented sixel and iterm2, this is essentially what I did because it would be near impossible to get them to behave the same way as kitty when it comes to negative indices. I didn't stop anyone from creating them with negative indices, but it would just behave the same way as positive.

I can't find a function that deletes an image to free up memory, I can only see hide for placements.

Even if we combine images and placements, I think there needs to be a function to free up the memory. I also think that hiding an image without deleting the data should still be possible.

Are you referring to the data loaded into the image class? If so, it's a public field, why can't the user just do img.data = nil to remove it if they want to keep the image instance itself around?

fredizzimo · 2025-05-24T10:34:15Z

I can't find a function that deletes an image to free up memory, I can only see hide for placements.
Even if we combine images and placements, I think there needs to be a function to free up the memory. I also think that hiding an image without deleting the data should still be possible.

Are you referring to the data loaded into the image class? If so, it's a public field, why can't the user just do img.data = nil to remove it if they want to keep the image instance itself around?

I'm thinking more about the provider side, without deletion on the user side we are forcing all providers to implement some kind of eviction system. Even with the automatic eviction on the Kitty side, it's not a good idea to keep the buffer full, since other applications that shows images will suffer.

But maybe the planned fifo system could do it two ways, to also send deletion requests and keep a fixed size allocation on the provider side at all times? That has the advantage of the user not having to do any management themselves.

I'm not sure if it's possible to use the lua __gc hooks with luajit, but even if it is, it's probably not too well suitable, due to its non-deterministic behaviour.

If it's going to be done on the FIFO side, no actions are required for this PR.

fredizzimo · 2025-05-24T11:22:11Z

I also suggest that we battle test this against snacks.nvim. It has an absolute positioning mode to support wezterm.

I made a working implementation of that yesterday, but I need to redo it with the new API changes, probably done later today. After that I will share it. It works both with the kitty protocol and a custom branch of Neovide.

It works, but is not really practical due to the zindex limitations. As predicted, images meant to be displayed in the main buffer, are always displayed on top of floating windows. Maybe some clever tricks with dynamically hiding and showing images could be done, but my focus will be on inline image support.

Edit: Snacks already does the dynamic hiding and showing part, I just didn't call the API hide properly

Problem: many terminals support some form of image protocol to display graphics; however, nvim provides no integration with any protocol available today. This has resulted in multiple plugins (e.g. image.nvim, snacks.nvim) providing solutions with even more plugins adopting (e.g. orgmode.nvim, neorg). Solution: provide a rudimentary API that integrates with common graphics protocols including kitty, sixel, and iterm2. This API should be extensible enough to cover all three protocols while also being easy to adapt for external GUIs like neovide. In addition, the API should be developed with performance and efficiency in mind, meaning taking advantage of asynchronous design and caching where possible. Overview: the following outlines the code points of interest that make up the image API implementation. 1. `vim.ui.img` serves as the entrypoint into the API, where images can be loaded and manipulated. Manipulation of images includes converting between formats, resizing, and cropping of images leveraging the `magick` binary, which is defined through the new option `imgprg`. 2. `vim.ui.img.opts` serves to extract the specifications of how and where an image should be displayed within a UI. 3. `vim.ui.img.placement` houses the logic for individual instances of an image displayed within a UI. It manages scheduling and batching of requests to show, hide, and update on screen. 4. `vim.ui.img.providers` both serves as the location where new providers are created and also where they are stored. Each of the builtin providers - kitty, iterm2, and sixel - can be found within `vim.ui.img.providers.*`. 5. `vim.ui.img.utils` contains a mixture of data structures and functions core to the needs of the image API. a. `vim.ui.img.utils.position` is an abstraction of an x & y coordinate (aka column & row) that also contains a unit definition - pixel or cell - in order to support converting between the two coordinate systems. b. `vim.ui.img.utils.promise` provides a simplistic data structure to collect a value or an error at a later date. It is heavily used throughout the image API internals and also is seen publicly as a return value in top-level APIs like `vim.ui.img.load()`. c. `vim.ui.img.utils.region` is an abstraction of a rectangular dimension of x, y, width, and height. It also contains a unit definition - pixel or cell - in order to support converting between the two coordinate systems. d. `vim.ui.img.utils.screen` exposes the ability to determine the pixel width and height of the editor, which is critical towards converting between pixel and cell units. e. `vim.ui.img.utils.size` is an abstraction of a width and height. It also contains a unit definition - pixel or cell - in order to support converting between the two coordinate systems.

@justinmk

1. Remove ImageMagick 2. Remove iterm2 and sixel providers 3. Rename filename param to file and bytes param to data within Image 4. Remove crop option when showing an image 5. Remove Region as it was only needed for cropping 6. Remove Size and Position and flatten those parameters in the Opts table as we no longer need to carry around unit type 7. Merge Image and Placement 8. Remove supported check from providers as they don't function in neovim (can't capture CSI or kitty responses) 9. Remove screen size code (was temporary) 10. Move promise to _promise internally 11. Update PNG function to return boolean or nil 12. Wrap img import within ui.lua via the defer @justinmk mentioned 13. Remove imgprg and imgprovider and hard code kitty. 14. Keep the printio-test bin per @justinmk feedback

…ge scheduling of update

fredizzimo · 2025-05-24T19:43:32Z

runtime/lua/vim/ui/img.lua

+---end)
+---```
+---@return vim.ui.img._Promise<true>
+function M:hide()


I think this function could internally check if the image is visible or not, so that hide can be called multiple times, it makes the API more pleasant to use.

fredizzimo · 2025-05-24T19:53:04Z

runtime/lua/vim/ui/img.lua

+---```
+---@param opts? vim.ui.img.Opts
+---@return vim.ui.img._Promise<true>
+function M:update(opts)


Update and show work almost the opposite way of what I would expect.

On one hand, show is used for initially showing an image, but on the other hand it can also be thought of a pair to hide. And in that case, I would like to be able to just re-use the existing options without specifying anything.

I think we could use only one function, but support three modes

nil opts - keep the existing configuration

partial opts - merge the configuration

full opts - replace the configuration

The difference between 2. and 3. could for example be expressed with a special field update=true to select mode 3, otherwise mode 2. is used.

fredizzimo · 2025-05-24T21:20:59Z

There does not seem to be a way to override the provider in the latest version, so I'm unable to test it against Neovide. I also think that the image itself shouldn't store any references to the provider as ideally the same image should display on all connected UIs simultaneously.

But I do have my hacks to snacks.nvim here https://github.com/fredizzimo/snacks.nvim/tree/nvim-api, which allows it to work with this API in terminals with Kitty protocol support.

The images are stretched, since the automatic aspect ratio calculation when both with and height is specified described here #31399 (comment) isn't implemented. And that can't be done on the snacks side, since the decisions of using width or height depends on the terminal aspect ratio.

The lack of querying basic image information like width and height is missing, so that's done on the snacks side for now.

All the API usage there is synchronous with wait, as that's still how I think that's how it should work. The provider could deal with all the asynchronous support internally. At most the loading could be asynchronous, since that needs to load the header to be able to provide the image information.

Edit: Only the picker and raw buffer display works, the doc for markdown documents needs some fixes.
Edit2: The doc display works now. but the image is displayed in the wrong position. I don't have time to check if what's the cause is right now, but likely somewhere in my snacks changes. Update: That bug exists in the main version of snacks as well, if a hover window is moved, then the image does not follow. It's just worse in my version since the windows are always big due to lack of terminal query support.

fredizzimo · 2025-05-25T21:16:37Z

I'm sorry, but I think that my previous suggestion of allowing either the width or the height to be unspecified wasn't a wise one.

Allowing that is still not enough for many practical use cases, as I found out when changing snacks.nvim. Many will need access to the actual cell geometry to determine how big the images should be displayed.

Therefore, I suggest that the size decision as a whole is externalized to the plugin, using whatever comes out of #32408. So, for now I suggest that both the width and height are required.

I'm afraid that specifying what aspect ratio is used for automatically calculating the unspecified dimension would be a bit premature, when we don't fully understand the requirements yet. It's easier to add support for that later, than to have to modify the specification and maybe break plugins that were made against the previous spec. I'm mostly worried about how multiple simultaneously connected UIs will affect it.

If #32408, is kept general so that plugins can query the cell information of all connected UIs, then the plugins have all the information needed to experiment and come up with the best solution, that can be adopted in the core later.

This specification should not affect this PR very much, other than specifying that both dimensions are required. There's no query support, and the whole existing implementation relies on the Kitty protocol doing its magic. So other than that, everything works as before.

I would still rather see that keeping the aspect ratio is the default when both dimensions are specified, than stretching/shrinking like currently implemented. But that requires determining the cell geometry on the provider side, which is an easier problem, since hopefully there will be separate providers for every UI connected. For now, that could be documented as an implementation bug.

This comment was marked as outdated.

Sign in to view

chipsenkbeil force-pushed the feat/ImageApi branch 3 times, most recently from 9964ad6 to 630f852 Compare November 30, 2024 03:00

chipsenkbeil changed the title ~~Implement image API for #30889~~ feat(img): add vim.img.protocol() to detect preferred graphics protocol Nov 30, 2024

chipsenkbeil changed the title ~~feat(img): add vim.img.protocol() to detect preferred graphics protocol~~ Implement image API for #30889 Nov 30, 2024

chipsenkbeil force-pushed the feat/ImageApi branch 2 times, most recently from fe0d5a8 to 33ee581 Compare November 30, 2024 04:27

chipsenkbeil changed the title ~~Implement image API for #30889~~ feat(img): implement image API Nov 30, 2024

chipsenkbeil changed the title ~~feat(img): implement image API~~ feat(img): implement image API for absolute positions Nov 30, 2024

ribru17 reviewed Nov 30, 2024

View reviewed changes

runtime/lua/vim/img/_image.lua Outdated Show resolved Hide resolved

chipsenkbeil mentioned this pull request Nov 30, 2024

image API #30889

Open

lewis6991 reviewed Nov 30, 2024

View reviewed changes

runtime/lua/vim/img/_terminal.lua Outdated Show resolved Hide resolved

chipsenkbeil force-pushed the feat/ImageApi branch 4 times, most recently from babd349 to dc51bf3 Compare November 30, 2024 20:56

This comment was marked as outdated.

Sign in to view

chipsenkbeil force-pushed the feat/ImageApi branch from dc51bf3 to ce818a9 Compare December 1, 2024 20:24