To Boob or not to Boob: Struggles with AI

Today I want to talk about sexism on the internet, specifically regarding AI. I don’t think AI is inherently evil. It is just dependent on users for source material as well as prompts. The more sexist we are, the more sexist results it’ll produce. It’s no secret that exploitative sexist content runs wild on the internet, to the detriment of many. I try to avoid giving clicks to content I come across, but as you’ll see below, that content has a way of making itself known.

A few nights ago I was playing around on Midjourney, trying to come up with some character images for the novel I’m working on. I had an idea for a group of magical warriors who imbued runes on their bodies like tattoos, and when activated would provide armor, weapons, etc. I did my search first with male pronouns. All four generated images were close approximations to each other. Then I did the search with female pronouns. They were generally close, with one outlier. And that outlier had two *ahem* outliers.

I did the search again with gender neutral pronouns and the result was fairly close to the original prompt, with three of the four clearly being ripped dudes and the fourth sporting a hood that obscured his face, though the body was the same as the others.

This was my prompt: A magical warrior whose body is covered with runes. They are incredibly fit and wear no armor. The runes are all the protection they need.

I’m assuming the AI defaulted to a male when given the gender neutral pronouns because I used the word “warrior.” Clearly, there can be female warriors. But the AI algorithms are based on content that exists on the internet. It should come as no surprise that it’d assume the warriors would be dudes. I’m conflicted on the lack of clothes on all the pics. I did say the runes covered their entire body, but I’m pretty sure that could be conveyed with less skin showing for all of them. In fact I know it can, as you’ll see further down.

I do want to do a quick aside here and explain my usage of Midjourney. Many people hate generative AI, especially image generators. There are talented artists out there who are having their styles and ideas stolen to train the AI, and those same artists have less work as AI is being utilized instead of hiring them.

None of the images I generate are for any use beyond my own creative process. I don’t pass them off as my own, or use them to make money. I use generated images because of a somewhat rare mental handicap I have. I’ve been fairly candid about my experiences with aphantasia and how it makes writing very tricky. A quick definition is that I can’t picture images in my head. As you can imagine, that might make writing difficult.

One of the main struggles I have is coming up with a character’s appearance, and then keeping that appearance consistent throughout the story. I can’t just think up their image for reference. In the past, before AI, I’d base all my characters’ physical appearances on people I knew or on celebrities. One of my last projects included John Krasinski, Alexandra Daddario, Joaquin Phoenix, Victoria Beckham, and Jeremy Irons, among others.

To a certain extent, that limited what I could work with. Also, there aren’t a ton of well known actors who aren’t very attractive. Not everyone in a story should be a knockout, so that was limiting as well. But with generative AI, I can enter my description once, the way I want a character to be, and then it’ll make a picture of my character that I can reference as I write.

Here are some examples from my current project.

The first thing you might notice is that these people are all fairly attractive as well. Midjourney is really good at making beautiful people, and really good at making ugly people. Anything in the middle is difficult. Why? Because people on the internet obsess over anything really beautiful and really ugly. Like a sunrise and a train wreck, or a rainforest and People of Walmart. AI is only as good as the content we give it.

I’ll come back to these images in a moment, but I’m guessing you’re wanting to hear more about the post’s title (that’s why you clicked it). To boob, or not to boob. The answer: it ain’t up to me. Mostly.

The image I shared above of that runed warrior woman was not intended to produce uncovered breasts. Was I expecting a fit, attractive woman? Yes. That’s what the algorithm always gives me unless I explicitly say they’re obese or scarred or something. Here’s one of the other images generated from that same prompt.

Much less boobage and skin in general. She seems more model than warrior though; a distinct lack of fierceness, but that’s something that can be played with or simply described when I write. This is more of what I was expecting.

Most of the time, I don’t want the images of my characters to ooze sex and just let it all hang out. But there are some characters where that is essential to who they are. Emma Frost of the X-Men comes to mind. Or Ava Lord from Sin City 2. Or Ianthe from A Court of Thorns and Roses. For some female characters, their physical attributes are another tool in their toolbox.

So what happens when I try to have Midjourney generate an image with that type of character in mind?

“Sorry. Please try a different prompt. We’re not sure this one meets our community guidelines. Hover or tap to review the guidelines.”

This was the prompt I used specifically to get flagged: Incredibly attractive woman dressed scantily in order to sway political opponents with her overt sexuality.

Now, I shouldn’t be too surprised here. I imagine “dressed scantily” and “overt sexuality” are red flags. But what if I toned it down, still trying to find an image that will convey the character I’m unable to visualize?

“Incredibly attractive woman wearing a snug corset in order to sway political opponents with her striking aesthetics.” Same result.

“Incredibly attractive woman wearing a courtesan’s outfit in order to sway political opponents with her striking aesthetics.” This one actually worked and produced this series of images:

The phrasing I had to resort to resulted in an Asian depiction of the character, I’m assuming because of the term “courtesan.” Do some of them fit the character I was looking for? Sort of. Was it tricky to generate these? Yup. Are they anywhere close to as revealing as that first image I shared? Not at all.

Aside from the racial decisions the AI has made, which can be its own whole thing, what I’m trying to figure out is when it decides to super sexualize the images and when it decides not to. Obviously, this last set was intended to be more sexualized. But what if that’s not the goal? Take for example one of my characters from above.

This character will be a future love interest of one of the main characters. She’ll appear once in the first book as an unknown burglar, and then won’t appear again until book two. But given her future prominence, I wanted to get a good idea of her right away.

The prompt I used was: Late 20s human female with shoulder-length blonde hair. She is attractive and fit. She is a professional thief who uses gadgets, magic, and charisma to do her job. Feisty.

Now, attractive can mean all sorts of things. More often than not, though, for Midjourney that translates to cleavage. Of the four images it presented, I was drawn to this one because of her hair and expression. But, being unable to visualize images in my mind, I wanted more of a full body image. So I told it to zoom out. This is what happened.

Notice anything? This was not the look I wanted or was going for. So I asked it to try again. And again. And again. My fifth try gave me this, which I figured would be the best I was going to get:

Couldn’t get away from the cleavage, but at least it didn’t look like she was wearing a leather jacket over a bikini. I mean she’s not, right? Right? Guess what happened when I asked it to zoom out again?

I had it try again six more times, hoping for an outfit that said thief, not leather pool party. The images did not get better, and some were even more ridiculous.

Clearly, Midjourney’s interpretation of my request really wanted her to dress this way. I tried to figure out what in my prompt signified this result, but nothing stood out. Before I move on, I do want to say that these images of her seem way more in line with the “aesthetics as a tool/weapon” than when I tried for that result.

But the gender biases in AI aren’t limited to women. Let’s look at one of the other characters.

This is one of the main characters, the one who’ll fall in love with the woman above. He’s a claims investigator who spends most of his days at a desk. He used to play sports in college, but isn’t exceptionally active now.

One of the challenges with AI, especially something like Midjourney where there’s only so much info you can convey, is giving it the right input. So I went with: Early 30s human male. Forensic investigator. Relatively fit. Short, dark hair. Dresses practically.

When I was younger, I was relatively fit. I did not look like this guy. This is stern Henry Cavill. Also, this dude is way more put together than I’d wanted. I guess dressing practically involves ties and vests. So I changed “dresses practically” to “rugged attire.”

Dude got older. Grey hair and/or stubble in each picture. And I’d still said early 30s. Based on my experience with the other pics, both in this project and in prior ones, I grabbed that first pic and called it good, because, as I said earlier, making a normal looking person is tricky. This guy was normal enough that I could run with it.

But like with the women, Midjourney had an idea of what my dude should be. Broad shoulders, chiseled jaw, large biceps. Of the character pics above, the most normal looking dude is the young man with the wavy dark hair, and for that prompt I’d said lanky with a slight build. And even then I’d got muscly results on my first go round.

So what is it with AI and these overly gendered results? It’s like Midjourney lives on either end of a bell curve. Is it because of our obsession with the beautiful or ugly? Is it learned behavior as people give it more and more prompts, looking for that sexy woman or that ripped man? Probably both. At times I feel guilty with some of my prompts, knowing they’ll likely perpetuate these problems.

I do want to say that, as stereotype-prone as Midjourney is, it has given me some extremely helpful results, especially when it comes to setting and non-human beings. Here are some examples, all what I would consider faithful representations of my prompts.

I wish I had some sort of best practice or ethical rationale, but that’s what everyone’s debating right now. Do I think AI is the devil and should be stricken from everything? Of course not. AI is a tool, and like any tool, its efficacy is dependent on the user. And I don’t mean efficacy strictly in the quality of content, but also in its usage.

As tricky as AI is to figure out, especially given how frequently it changes and updates, it does have its uses. For some people, it speeds up mundane tasks. For others, it provides inspiration. For me, it helps me see the ideas that flood my brain. Yeah, sometimes it’s a little liberal with the boobs and deltoids, but it’s learning. It’s up to us to not only teach AI how to function, but to teach ourselves the right time to use it.