AI Art Application and Improvements Handbook: Difference between revisions

[unreviewed revision]

Revision as of 10:08, 14 December 2023

This AI Art Application and Improvements Handbook is intended to help people create free, useful media for the public domain using AI art generators, with a special focus on post-editing and getting things done in practice.

Prompts

Which prompts work best differs by AI generator. The promptomania prompt builder is a good place to get started with prompts and doubles as a cheatsheet of art styles one could use. It is missing many styles, but it may become more complete over time and is good enough for learning purposes. Many sites such as openart.ai and playgroundai.com let you browse filterable and searchable images along with their prompts, which you can build upon and learn from. Here is a further comprehensive resource.

Misgenerations and creating improved versions

As you can see below, there are still some issues with these images. People with better AI art skills may be able to generate much better images. Usually some slight manual editing is needed.

Moreover, over time these images could be improved by their uploaders or other people, for example using:

the Clipdrop cleanup tool
inpainting (requires some skills)
AI art web platforms' face restoration
upscaling features
manually editing the images in image editors like GIMP or Photoshop
…

If you can improve an existing image or an image you uploaded earlier on Wikimedia Commons, upload it as a new version, not as a separate new file. If the image has text, it can be removed via the listed ways. However, to prevent text from appearing anywhere in the image, it is best to use negative prompts, although that can be problematic, for example when you'd like to generate a street scene with store texts visible in the background. This is a good example of a specific skill to learn when generating AI art: creating texts that fit neatly into the image.

You need to keep adjusting the prompt until you get good results. At some point it is sometimes better to just generate a new image from the same prompt rather than adjust the prompt further. Make sure the seed is set to random rather than fixed, unless you want the new image to look like the one just generated.
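The seed advice can be sketched in a few lines of Python. This is only an illustration: the function name pick_seed and the seed range are assumptions, and real generators may use a different range or expose the seed differently.

```python
import random

# Assumed upper bound for seeds; the real range depends on the generator.
MAX_SEED = 2**32 - 1


def pick_seed(previous_seed=None, keep_similar=False, rng=None):
    """Choose the seed for the next generation (hypothetical helper).

    keep_similar=True reuses the previous seed so the next image resembles
    the last one; otherwise a fresh random seed is drawn.
    """
    if keep_similar:
        if previous_seed is None:
            raise ValueError("no previous seed to reuse")
        return previous_seed
    rng = rng or random.Random()
    return rng.randint(0, MAX_SEED)


# Reuse the seed only when a similar image is wanted.
similar = pick_seed(previous_seed=1234, keep_similar=True)
fresh = pick_seed(previous_seed=1234)
```

Reusing the seed together with an unchanged prompt and unchanged parameters is what keeps the result close to the previous image; changing any of them can still alter the output.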

You can also generate a new image from the image just generated via img2img and then put the earlier image underneath the newly generated one as a layer in GIMP. Then cut out parts of the upper layer so that the earlier image is visible in the places where you'd like it to be.
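The same layering idea can be scripted instead of done by hand in GIMP. The following is a minimal sketch using the Pillow library, assuming both images have the same size and that the areas to cut out are simple rectangles. The function name blend_layers and the solid-colour stand-in images are made up for illustration.

```python
from PIL import Image, ImageDraw


def blend_layers(upper, lower, reveal_boxes):
    """Show `lower` through rectangular holes cut into `upper`."""
    if upper.size != lower.size:
        raise ValueError("both layers must have the same size")
    # Mask: 255 keeps the upper layer, 0 lets the lower layer show through.
    mask = Image.new("L", upper.size, 255)
    draw = ImageDraw.Draw(mask)
    for box in reveal_boxes:
        draw.rectangle(box, fill=0)
    return Image.composite(upper, lower, mask)


# Solid colours stand in for the img2img result (upper) and the earlier image (lower).
upper = Image.new("RGB", (64, 64), (200, 30, 30))
lower = Image.new("RGB", (64, 64), (30, 30, 200))
result = blend_layers(upper, lower, [(16, 16, 47, 47)])
```

For real images, a hand-painted greyscale mask usually gives better results than rectangles, since it can follow the outline of the region you want to keep.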

Negative prompt

If you see things in your generated image that you don't want there, or anticipate that the AI generator may add them or misunderstand your prompt in certain ways, add these as negative prompts.

Good negative prompts to use when you generate:

humans: extra fingers (TBA)
rooms: picture frame, frames
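As an illustration, the per-subject terms listed above can be kept in a small lookup table and joined into one negative prompt. This is only a sketch: the helper build_negative_prompt is hypothetical, and the comma-separated format is an assumption that may not match every generator.

```python
# Terms taken from the list above; extend the table as you find more.
NEGATIVE_TERMS = {
    "humans": ["extra fingers"],
    "rooms": ["picture frame", "frames"],
}


def build_negative_prompt(subjects, extra_terms=()):
    """Join the negative terms for the given subjects into one string."""
    terms = []
    for subject in subjects:
        terms.extend(NEGATIVE_TERMS.get(subject, []))
    terms.extend(extra_terms)
    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(terms))
    return ", ".join(unique)


negative = build_negative_prompt(["humans", "rooms"], extra_terms=["text"])
```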

Parameters

Some images have their parameters specified. A step count of around 40 often yields the best results. Setting the prompt strength too high, for example above 10, makes it more difficult to get a good picture.
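These rules of thumb can be written down as a small check to run before generating. The sketch below is not tied to any particular generator: the function name review_parameters is hypothetical, and the tolerance around 40 steps is an assumption chosen only for illustration.

```python
RECOMMENDED_STEPS = 40      # from the text: around 40 often works best
MAX_PROMPT_STRENGTH = 10    # from the text: above 10 tends to cause trouble
STEP_TOLERANCE = 15         # assumption: how far from 40 still counts as "around"


def review_parameters(steps, prompt_strength):
    """Return a list of notes about settings that deviate from the rules of thumb."""
    notes = []
    if abs(steps - RECOMMENDED_STEPS) > STEP_TOLERANCE:
        notes.append(
            f"step count {steps} is far from the suggested {RECOMMENDED_STEPS}"
        )
    if prompt_strength > MAX_PROMPT_STRENGTH:
        notes.append(
            f"prompt strength {prompt_strength} is above {MAX_PROMPT_STRENGTH}"
        )
    return notes


for note in review_parameters(steps=100, prompt_strength=12):
    print(note)
```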

Differences between generators

Stable Diffusion is open source, so it is the generator recommended and focused on here. However, as of 2023 Midjourney often generates better images, and DALL-E probably does as well in some or many cases. One difference between SD and DALL-E, for example, is that in SD prompts are phrased like tags rather than whole sentences.
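To make the tag-versus-sentence difference concrete, here is a small sketch that assembles a tag-style prompt from a list of keywords. The helper tags_to_prompt and both example prompts are made up for illustration and do not come from any generator's documentation.

```python
def tags_to_prompt(tags):
    """Join keywords into a comma-separated, tag-style prompt."""
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    return ", ".join(cleaned)


# Sentence-style phrasing, as one might write it for a sentence-oriented generator.
sentence_prompt = "A rainy cyberpunk street scene at night in high detail"

# Tag-style phrasing of the same idea.
tag_prompt = tags_to_prompt(["street scene", "cyberpunk", " rainy night ", "", "high detail"])
```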

Applications where AI art can be useful

Paleoart of the ancient past

AI art can be used to create realistic-looking scenes that depict the past as it may have looked to the best of our knowledge, for example including high-resolution depictions of extinct ancient organisms.

For such images, img2img techniques can be used.

File:Demonstration of AI-restoration of prehistoric scenes - a dinosaur in a fern forest (Migmanychion).jpg
Base image 1 (see WMC cat Paleoart)
Base image 2
File:Hesperocyon AI reconstruction.png
File:Hesperocyon reconstruction v2.png
Base image 1

In the first example, parts of the leg were cut out so that it looks like the organism is walking through the fern.

It may also be possible to use tools like DreamBooth to train AIs on a set of images or even 3D models that depict ancient organisms, such as a species of dinosaur.

Do not use images in Commons:Category:Inaccurate paleoart as a base, and add Commons:Template:Factual accuracy to any image you know is inaccurate.

Most currently available paleoart only depicts the extinct organism (e.g. a dinosaur), but does not place it into an environment of the flora and fauna theorized to have existed at the same time. Those images that do are usually low resolution. One exception is this image, which shows what such scenes could look like.

Hominids lived in caves and/or did not have a civilizational lifestyle for hundreds of thousands of years. Despite that, there was not a single high-resolution image in the public domain depicting how daily life may have looked, or is thought to have looked, for most of human existence, or at least none on WMC. This changed with the emergence of advanced AI image generators in the 2020s. The two images below are two of very few in Commons:Category:Ancient humans in art:

Caricatures and public characters

In the 2020s it became more easily possible to create artworks using public characters due to the emergence of AI art generators like Stable Diffusion.

This:

democratized the creation of caricatures and political art
enabled problematic online misinformation
enabled humorous art using known characters, including fictional characters (prime example: 'Harry Spotter')

It works well with some specific public characters without any kind of extra training. Some of them, such as Vladimir Putin, are well known to be easy to generate in realistic-looking ways.

One example use case is to generate a portrait of a person with a background that is linked to that person, such as art illustrating scientific theories for scientists or art styles for artists, as the Vincent van Gogh image hints at.

At the same time, it can be a problem to use specific characters in specific environments: for example, the generators may generate that person multiple times rather than only once, or also make the person show up in picture frames. This may change with future generators where you can, for example, specify where the person is located or how often they appear. Keep that in mind when creating your prompts. There are also many options for solving such issues beyond negative prompts, such as cutting the generated person out and placing them into another image.

Others require fine-tuning using tools and techniques like DreamBooth. The first image below was made with Stable Diffusion/Imagine without any kind of extra training and the second used DreamBooth; the second looks much more realistic regarding Jimmy Wales' face:

Humorous societally- and historically- critical art
After DreamBooth training

The reasons why some famous characters do not look realistic with current models without extra training are unknown, and this may change over time.

It also allowed the creation of videos with public characters:

These abilities have touched on the sensitivities of some religious people and on the worries of political elites regarding democratized political art.

And it also enables democratized artistic depictions of historic public characters which can e.g. be used for humorous images, higher-resolution portraits, innovative/creative combinations, or realistic AI art for historical scenes:

More or less the first CCBY artwork/illustration of the Rendlesham Forest UFO incident (could be improved to more closely match the descriptions)

It can also be used to create art depicting people not commonly featured in high-quality art, such as specific scientists, who are usually not the subject of art and fiction, with exceptions such as the movies 'The Theory Of Everything' and 'Oppenheimer':

Mendeleev
Aristotle

Historical scenes

AI art can be used to create realistic-looking scenes that depict the past either as it may have looked to the best of our knowledge or as stories depict it. The latter may also include images for imaginary stories of the past, illustrating in more visual ways what the imaginaries of past people may have looked like.

Whether or not there are still some minor glitches may not matter very much when you're interested in visualizing in high resolution, for example, what ordinary daily life as experienced by average people may have looked like, or when creating the first image of some historical event that is in the public domain rather than locked away.

Using tools like DreamBooth, one can train AIs on a set of images of a historical figure. Below are some examples, which may deviate somewhat from how Ferdinand II of Naples (Ferrandino d'Aragona) looked at an older age according to the artistic drawing that is the first image here and the second image, which was drawn a whole hundred years after he died:

AI art generators usually make people look better, so, as you can see, the results may often deviate from existing images of a character. However, if you provide more training data or the AI generator is well trained on the character (which is sometimes the case for some currently famous people), then the characters may look more realistic.

Instead of making the file focus on the character, it would be better to focus on the historical event or the historical scene. For example, the image could portray at high resolution what a village in the Middle Ages may realistically have looked like.

It can also be used to create high-resolution realistic images of historical figures in realistic or unrealistic settings.

As just explained, AI generators still have problems generating faces, among other issues. Please keep that in mind, since correcting this can require significant skills and may limit the usefulness or realism of the images.

Images can also focus on historical events entirely without any kind of historic character, realistic or not, in the foreground.

Educational games

AI art can be used to generate the images for board games, for example for the cards. These can be educational games or otherwise useful. Note that in such cases you should only generate the image, not full cards, because the text, for example, will be gibberish.

Objects and topics for which no free media is available

For example, it can show what pulp science fiction comics looked like, what a science fiction subgenre is about, or what its styles and themes are.

Illustrating contents of books

Illustrating the world of the book 'The Windup Girl'
Children's book "13th prophecy"
Ubik
Ubik (similar to some covers and depicting a main subject of the book with the text gibberish fixed)

Styles merged and adopted

Art meant to depict a civilization adopting an old art style in a modern scifi context

Illustrating technologies, ideas and concepts

Especially useful if no other or only low-quality images are available for the concept

Illustration of artifacts / colonies on comets in the context of technosignatures
Illustration of a technological innovation (could also be used to illustrate prototypes / mock-ups)
Illustration of a mythological being adapted to modern scifi as done earlier in some studies
Illustration of 'science fantasy' and arcology
Illustration of 'science fantasy'

Illustration of 'science fantasy'
Solarpunk sustainable city design illustration
Microdrones in art and science fantasy illustration

Microdrones in art and science fantasy illustration
High-resolution illustration of contemporary post-apocalyptic art
Concept of robotic aliens
Illustration of the cyberpunk genre, a street scene without e.g. neon lights which are present in most if not all comparable free media depicting the genre
Concept of a collapsed civilization on another planet
Green city urban planning illustration

@@ Line 146: / Line 146: @@
 File:Ferrandino d'Aragona, albarello.jpeg|~1477 childish portrait
 File:Ferrandino.jpg|Died in 1496; image made in 1596
+File:Re Ferrandino d'Aragona e la moglie Giovannella di Napoli 01.jpg
+File:Re Ferrandino d'Aragona e la moglie Giovannella di Napoli 02.jpg
+File:Re Ferrandino d'Aragona e la moglie Giovannella di Napoli 03.jpg
+File:Re Ferrandino d'Aragona e la moglie Giovannella di Napoli 04.jpg
+File:Re Ferrandino d'Aragona e la moglie Giovannella di Napoli 05.jpg
+File:Re Ferrandino d'Aragona e la moglie Giovannella di Napoli 06.jpg
+File:Re Ferrandino d'Aragona e la moglie Giovannella di Napoli 07.jpg
+File:Re Ferrandino d'Aragona alla battaglia di Seminara 01.jpg
+File:Re Ferrandino d'Aragona alla battaglia di Seminara 02.jpg
+File:Re Ferrandino d'Aragona alla battaglia di Seminara 03.jpg
+File:Re Ferrandino d'Aragona alla battaglia di Seminara 04.jpg
+File:Re Ferrandino d'Aragona alla battaglia di Seminara 05.jpg
+File:Re Ferrandino d'Aragona alla battaglia di Seminara 06.jpg
+File:Re Ferrandino d'Aragona alla battaglia di Seminara 07.jpg
 </gallery>
 AI art generators usually make people look better so as you can see it may often deviate from existing images of a character. However, if you provide more training data or the AI generator is well trained on it (which is sometimes the case for some currently famous people), then the characters may look more realistic.