Once I began considering AI as a possible solution to my problem, I scoured the Web for information. There were already a ton of videos out there, and I watched quite a few. I looked at a lot of the AI-generated art, as well.
Here’s a note I sent to the guys helping me, after I perused a recommended site looking for “models” of different stuff (I was new to this and didn’t understand most of the terminology):
“I looked for character models; go figure, it was all badass busty macho chicks, sexy elves, and NSFW.
I clicked "illustration"... badass busty macho chicks, sexy elves, and NSFW.
OK, sez I, and clicked "landscapes." There was some random dude in a suit, a couple actual landscapes; and hundreds of badass busty macho chicks, sexy elves, and NSFW.
"Cars" did have some cars--all imports.
"Retro" actually had some styles I thought might be useful, and downloaded. But mostly badass busty macho chicks, sexy elves, and NSFW.
"Sci-fi" had some robots and dudes in futuristic clothing; but mostly guess what?
Hmm. There might be a pattern here, sez I.
I've kind of run into this with both human artists and the online AI art programs. No matter what you ask for, they want to give you macho chicks. Maybe once I learn to "train" SD, I'll have to train it not to assume every character needs to be one. Because apparently, that's what 99% of people using AI have been asking for.”
The impression I got was that AI was a toy for rendering actresses in combat gear. Everything was basically a one-off poster. No storytelling efforts.
Well, there was a guy who used it to make a “comic book.” I watched his video. There was no plot. The entire book was random images of a cartoon dog (in the style of Robert Crumb, I guess) on random bizarre backgrounds. Dude actually printed this thing on paper and bound it. I wish I had his disposable income.
There are pothead artfags at university who would probably consider such a creative work important experimental art. But this is not the sort of project I was interested in.
I needed the software to be able to generate consistent characters and objects for multiple panels. Linear stories with beginnings, middles, and ends require continuity that an audience can follow.
Early on, I experimented to see what Stable Diffusion 1.5 could do, and typed in a prompt to make it generate an image of one of my superhero characters. Here are four of the results I got:
This was not gonna be easy.
Just so you know, I did not ask for Siamese twins, mangled hands and feet, lazy Down Syndrome eyes, mutant feet and legs, or hands and arms put on backwards. These images should give you a clue as to why, to get an acceptable image, you have to rewrite your positive and negative prompts in between generations umpteen times, tweak about a thousand settings about a thousand times, use inpainting and outpainting on the generated images about a bazillion times…and then use an image editor to manually fix stuff, because the software (“artificial intelligence”) JUST. WILL. NOT. DO. IT.
If you see an AI generated image that looks great and there’s nothing wonky or bizarre in it, it is extremely unlikely that somebody simply typed a prompt, and out popped that picture. Getting a usable image out of AI is a lot of work.
Almost as much work as getting a Fiverr artist to draw what you paid them to draw.
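For anyone curious, here’s a rough sketch of the kind of generate-and-reroll loop I’m describing, written against the Hugging Face diffusers library with a Stable Diffusion 1.5 checkpoint. The prompt wording, seeds, and settings are made-up examples for illustration, not my actual workflow:

```python
# A minimal sketch of the prompt-and-reroll loop: same prompt, new seeds,
# until something comes out that's worth fixing up in an image editor.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Hypothetical prompts; the negative prompt is where you beg for normal anatomy.
positive = "comic book panel, male superhero in a red costume, full body, city rooftop"
negative = "extra limbs, extra fingers, deformed hands, fused bodies, blurry, watermark"

for seed in range(4):
    image = pipe(
        positive,
        negative_prompt=negative,
        generator=torch.Generator("cuda").manual_seed(seed),
        num_inference_steps=30,
        guidance_scale=7.5,
    ).images[0]
    image.save(f"superhero_attempt_{seed}.png")
```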
While researching what Stable Diffusion can do, I learned a few different ways to “train models.”
What does that even mean?
Let’s say I want to use Sylvia Lipshitz as a character in sequential art. I feed the program pictures of Ms. Lipshitz at different angles, at different depths of focus, wearing different clothes and hairstyles, with different facial expressions, in different poses, with different lighting and different backgrounds. After that, I can ask it to generate images of her at different angles, in different poses, etc.
If the software can now generate a character with the Lipshitz face, then I have successfully trained the program on the Sylvia Lipshitz model.
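To make that concrete, here’s a rough sketch of what using a successfully trained character model would look like in code, again with the diffusers library. The LoRA file name and the trigger word are hypothetical, since I obviously never got this far:

```python
# A sketch of generating panels with a trained character LoRA, assuming the
# training actually produced a file. "sylvia_lipshitz" is a hypothetical
# trigger word that would have been baked in during training.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load the trained character model (hypothetical folder and file name).
pipe.load_lora_weights("./loras", weight_name="sylvia_lipshitz_lora.safetensors")

# The trigger word tells the model to draw the trained face; everything else
# in the prompt changes from panel to panel.
image = pipe(
    "sylvia_lipshitz, three-quarter view, sitting in a diner booth, warm lighting",
    negative_prompt="deformed, extra fingers, extra limbs, blurry",
    num_inference_steps=30,
).images[0]
image.save("sylvia_panel_01.png")
```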
The most efficient way to train models at the time was by creating a LoRA (Low-Rank Adaptation). There were tutorials on how to do this a few different ways. At least one of them required installing a user interface called Kohya SS. One by one, I followed these tutorials, installed the suggested additional software, and tried to get a model trained. I tried each method several times. I followed instructions exactly. To the letter. I never got an error message.
I never got a model, either.
Much like a young man going out into the real world after being raised by the boob tube, government schools, and a single mother, I did everything I was told would work, but none of it worked.
Waddya do then?
Stable Diffusion has no user manual, and there’s no help line to call. I reached out to my tech-savvy friends. They didn’t know why it didn’t work. I went to the comment threads of the guys with the YouTube tutorials, explained the problem, and asked what I should try. No response.
Well, sometimes there was a response. Imagine you’re a brain surgeon who has just successfully finished an operation, but your patient has a diminished ability to recognize patterns (or whatever) now. You retrace your steps several times and confirm you did everything exactly the way you were supposed to, but the problem persists. Out of desperation, you turn to an “expert” in the field. You explain the situation thoroughly, precisely, in detail, and ask if he has any insights. He responds by telling you how to wash your hands.
“Washing hands before surgery is very important. Make sure you use soap and water. There ya go. You’re welcome.”
That is the sort of guidance I got.
I joined Discord groups overseen by some of these YouTube tutors, and asked for help there.
“Gosh, that sucks. It should have trained a model for ya. Did you try washing your hands, first?”
I joined a support message board (or whatever they’re called—very similar to the old BBS) and sought help there. No luck.
I’ve explained this issue in a few paragraphs. You probably read it in 30 seconds or so. But actually going through the process, with all the rinsing and repeating, in between my day job and all the other stuff that needed to be done in life, cost me over a year of beating my head against this AI wall.
There were other neat extensions/add-ons/whatever, too. One of them was ControlNet, which helps you pose characters (assuming you have a trained model of a character). According to one video, it can even convert black-and-white images to color. Well, that wouldn’t work for me, either.
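For reference, here’s roughly what the ControlNet posing workflow looks like in code, again sketched with the diffusers library and an OpenPose-conditioned ControlNet for Stable Diffusion 1.5. The pose image file name is hypothetical; it would be a stick-figure skeleton showing where you want the limbs to go:

```python
# A sketch of posing a character with ControlNet: the pose skeleton constrains
# the figure's position while the prompt describes everything else.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

pose = load_image("panel_03_pose.png")  # hypothetical OpenPose skeleton image
image = pipe(
    "superhero in a red costume, dynamic action pose, comic book panel",
    image=pose,
    negative_prompt="deformed, extra limbs, extra fingers, blurry",
    num_inference_steps=30,
).images[0]
image.save("panel_03_draft.png")
```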
All my investment of time and money (including the high-performance computer) was for naught.
It looked like mastering Clip Studio Paint and Blender, and learning how to draw the art myself (the right way this time, hopefully), was the only unblocked road for me.
But reality check: I had put all my other creative efforts on hold for years trying to get this ball rolling. Getting my graphic novels published (or even just starting on the illustration) was likely going to take a few more years.
My Great American Novel was collecting dust on the back burner, still unfinished. It wouldn’t have gone back on the back burner if it hadn’t been for this frustrating detour into graphic novels. Also, it was a labor of love. It was important to me, fun to write, gratifying, therapeutic, cathartic. I didn’t want it to remain unfinished. I had the juice to finish it. And the best part? I didn’t need anybody’s help or collaboration, nor did I need to pay any money to finish it or publish it.
To be continued…
I have had similar issues. I found a great guy on Fiverr after having to pass on several. You can see his work on my YouTube channel by watching a Gravity Keyper vid. I'm currently considering Reallusion (3D images, easy to learn) and some other character-creation software. The best AI images came from Firefly.
Thanks for the comment, Gio. Yeah. "AI" art has come a long way in the last few years, but it's still not there, yet.