Saturday, 14 January 2023

Becoming Borg

The kind of elf you'll find in any Wizards of the Coast book, but AI-generated
Once again a post from Monsters and Manuals about AI has prompted... thoughts.

I agree entirely with this statement:
The real 'threat' posed by AI is not that it will replace us, but that we will come more closely to resemble robots in our thoughts, behaviours, and opinions.
You can already see similar transformations play out in the worlds of pop music and movies: the ability to (relatively) simply produce anything within the bounds of our imaginations has resulted in a regression to the mean, creating soulless pap to appeal to the lowest common denominator. Generations who grow up knowing nothing other than these mediated mediocre media come to accept them as the norm, which will shape their own expectations, thoughts, and behaviour (this has already happened to expectations of sex as a result of easily-available hardcore pornography).

It's depressing, but it's going to happen. And there's nothing we can do to stop it.

But it won't happen for a while (and it might make sense for anyone writing about this topic to first have a play with the tools and discover the current state of the art).

Regular readers of this blog will know that I've posted a fair bit about AI (I really didn't set out to do this but... rabbit holes), and I have set up The Mycoleum shop for T-shirts etc. featuring images shamelessly generated by AI.

This is the flowering of a lifelong quest to get computers to surprise me. When I started programming in the early 80s, my best friend was the random number generator. I used it to choose and combine, and was delighted every time the computer gave back something which I hadn't explicitly asked for...

I lapped up Steven Levy's book Artificial Life when it came out in 1992. I played with flocking algorithms and developed variations on the Sorcerer's Cave board game, generating random labyrinths and caves filled with arbitrary monsters and treasure. Sadly these, and my visual "flashperiments", are no longer usable since Steve Jobs killed off the App Store's biggest rival, Adobe Flash. But a couple of years ago I resurrected Flashperiments in the form of a JavaScript music visualiser, used to create this video for my friend Will's track Ice (inspired by the Anna Kavan novel of the same name):

Most recently, I created the Twitter bot Nanodeities, which populates templates with selections from a long list of possibilities. By nesting templates within templates within templates within... turtles, the possibilities are virtually limitless, and yet each is generated by combining words and phrases which I typed out myself.
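For the curious, the nested-template idea can be sketched roughly as follows. This is a minimal illustration and not the actual Nanodeities code: the grammar entries, the `{slot}` syntax, and the function names `expand` and `generate` are all made up for the example.

```python
import random
import re

# Hypothetical grammar: none of these entries come from the real bot.
# Each key is a slot name; each value is a list of options, and an option
# may itself contain further {slots} (templates within templates).
GRAMMAR = {
    "deity": ["{name}, {title} of {domain}"],
    "name": ["Ulm", "Quosh", "Irb"],
    "title": ["{adjective} god", "{adjective} goddess", "patron"],
    "adjective": ["lesser", "forgotten", "damp"],
    "domain": ["{thing}", "{thing} and {thing}"],
    "thing": ["lost socks", "mildew", "stale bread"],
}

SLOT = re.compile(r"\{(\w+)\}")


def expand(template, grammar, rng):
    """Replace every {slot} with a random option, expanding nested slots too."""
    def fill(match):
        choice = rng.choice(grammar[match.group(1)])
        return expand(choice, grammar, rng)  # recurse into the chosen option
    return SLOT.sub(fill, template)


def generate(grammar, rng=None, start="deity"):
    """Produce one finished phrase, starting from the given slot."""
    if rng is None:
        rng = random.Random()
    return expand("{" + start + "}", grammar, rng)


if __name__ == "__main__":
    for _ in range(3):
        print(generate(GRAMMAR))
```

Every word in the output was typed by hand into the grammar; only the combination is left to chance. Adding one more option to any list multiplies the number of possible results, which is where the "virtually limitless" feeling comes from.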

AI is the realisation of my lifetime's dream: finally, a computer can awe me with its unexpected creativity. Or rather, "creativity". Anyway, I can finally get a computer to go to places I truly would never have dreamed of.

And this phrase, "would never have dreamed of", is key. In the world of the Mycoleum, unpredictability is a feature. For most commercial creators, it is a bug.

The Mycoleum works because it is a world of unimaginable fantasy, inscrutable, understood only by the mushroom-folks who live there. It's almost impossible to express just how surprised I am by the computer's responses to my prompts. But what becomes clearer, with each picture that I try to coax out of it, is how little of my prompt it "understands". This is best demonstrated with examples (all made using Midjourney).

I noticed today that somebody on the Yoon Suin reddit had posted an AI image of a crab-man. So I decided to make one of a slug-man. Below are the images generated, labelled with the prompts that generated them (minus my special sauce that keeps the style consistent with my other AI images):
The decadent slug-man ruler of Yoon Suin
The computer seems to have completely ignored the "slug" part of my prompt here, and gone with the slightly racist assumption that "Yoon Suin" must be some Fu Manchu-type character. I'll ditch that bit, and focus more on making my ruler more slug-like...
A half-man half-slug emperor
Still not seeing a lot of slug in this; wearing a slug is not the same as being a slug. Let's be more specific about the body parts I'm after...
An emperor with the torso and head of a man and the hindquarters of a slug
Oh FFS. I've no idea what's going on here. Perhaps I should ditch the "man" part entirely because, come to think of it, I'm not really after a slug-centaur, just some kind of slug-with-personality. Let's try...
A slug emperor
 You call those slugs??? I see frog, I see tortoise (at a stretch, if I squint, it could be a snail). I see aphid, iguana... bugger all slug. At this point I have zero expectations of success, but I'll see what happens if I add a bit of human back into the mix...
A humanoid slug emperor.
No, no, no, no, no, no, no. NO!

This "artificial intelligence" simply does not speak my language. It recognises the odd word, but not how sentences work. I am painfully reminded of the time I outsourced the building of a website to India and what I got back was... well, I've no idea what it was, everything got lost in translation. (I should add that this was over 20 years ago, things have changed a lot; in fact, I am collaborating with Bangalore-based Jayaprakash Satyamurthy on King Arthur vs Devil Kitty, and I would recommend him without hesitation to anyone looking for editorial assistance; his novel Strength of Water and his short story collection Weird Tales of a Bangalorean are also excellent).

The slug-emperor attempts were portraits, which is actually one of the things that AI does best (see elf-maiden pic above). As soon as other elements are introduced, you're more likely to roll a 1 on a d10,000 than you are to get what you want. Describing a scene and having the AI render anything even vaguely approaching what you had in mind: forget it. Yesterday I spent an hour trying to get it to draw an oak sapling growing beneath a gorse bush. It couldn't even draw a gorse bush. Even with very simple bog standard D&D prompts, it fails: 
D&D magic-user casting fireball at an orc
When I started writing the adventure Ship of Theseus (now Ship of Thorsus) I had planned to use royalty-free art from oldbookillustrations.com. The images I sourced were scarcely representative of the adventure content, and they were also used across many other OSR games and zines, but they were free. Once I started playing with AI art, I figured I could use that to generate images which were also free, but which were original and which more closely matched the adventure content. Could the AI draw a Viking ship whose every inch of surface bristled with vicious thorns? Could it heck as like.

There is certainly an art to crafting prompts for the AI, but even the most artful prompt-crafter in the world would struggle to get an AI to paint a scene like "the slugman emperor of Yoon Suin riding on a palanquin borne by crabmen through dusty streets lined with obsequious crowds of humans; the emperor bears a sceptre and an ivory back-scratcher".
the slugman emperor of Yoon Suin riding on a palanquin borne by crabmen through dusty streets lined with obsequious crowds of humans; the emperor bears a sceptre and an ivory back-scratcher.
This is the current state of play. But it will change. How long before an AI is able to approximate a scene in the manner in which its overlord intends? It could be anything between a year and twenty years, but I would estimate around 5.

Once that happens, the Borg will have arrived.
 
By the way, over at my personal blog I've just posted a look back over the (mostly) non-gaming aspects of my last year. I also have a new email newsletter.

5 comments:

  1. I agree with that regression to the mean argument up to a point. At the same time, it makes the same kind of generational and condescending argument that every generation seems to make, acknowledging how their actions negatively impact the subsequent generation, while simultaneously not allowing them any agency of their own and essentially pre-determining them as intellectually and creatively invalid.

    The reality is just that most people are uninteresting and uninspired, but the effect that having more tools to more easily create art of any kind, whether that's AI tools, or the internet as a medium, or platforms designed to facilitate independent production such as itch.io, etc., on the overall creative cultural output, goes both ways; there will be more uninspired art, sure, but there will also be more inspired art, of a kind we likely lack the imagination to predict, because by its nature creativity is difficult to predict (and this is also why ML and AI art does not replace creative expression, just changes its nature- hopefully in a way that is ultimately more additive than subtractive, which I am inclined to believe will be true but which I acknowledge is not a certainty).

    Similar to some of the points you expressed, it's like how pop music and billboard charts or whatever have gotten more homogeneous, even as the variety of genres and the number of performers out there, whether on youtube or spotify or soundcloud or tiktok or wherever, has grown far more than the popular consensus has homogenized. The bell curve may seem wider if you're only looking at the center, but when you rescale it to account for everything happening at the tails, I question whether that is actually the case.

  2. I agree with a lot of this - the glitchiness of the medium is where the action is at - as soon as we get it to work better it will produce garbage of course.

    Kenny G practices 6 hours a day, of course, searching for a perfect tone, untouched by anything but his skill.

    I often have a fantasy that in fact the Uber-Operating System (lower case U I guess, robots, not rideshares) will be perfected, but that humanity, or a blueprint for humanity, will survive within the confines of Temple OS

    https://en.wikipedia.org/wiki/TempleOS

    Must take issue though with:

    "The ability to (relatively) simply produce anything within the bounds of our imaginations has resulted in a regression to the mean, creating soulless pap to appeal to the lowest common denominator ..."

    This is simply not really possible to quantify at this time - who knows what's bubbling up.

    I don't think the computers (upstream) are really the threat to the Underground though, but rather downstream: the ubiquity of information and the facility and speed with which capital can co-opt the "next big thing". Or even so gentle a master as over-access: ubiquitous choice limiting immersion and the concentrated skill so wrenched from the static.

  3. I hadn't heard of TempleOS before - that's nuts!

    I agree that we don't know yet what's bubbling up, and I also think that the new tools have enabled some really wonderful stuff that wasn't possible before, but I think it's likely that the mainstream will become ever more formulaic and standardised.

  4. Having thought all this over a little bit more, I would propose that said effects occur in several distinct and antagonistic phases: perhaps we, team humanity, have an inedible tendency toward patois and trade tongue - but for the same reason, when exposed to GENUINE robot brain (or some facsimile thereof), we will make some crazy crazy things.

    1 - tool doesn't understand what's being asked of it - wild times!

    2 - tool understands now - boring boring - our dreams are pretty substandard, to be honest, at least with an n of 8 billion

    3 - tool does the questions - even worse

    4 - tool's questions become completely inscrutable - there is something more akin to a cultural exchange going on: zippy and energetic!

    Either the tool's shallowness or its depth lets the grass grow. In between, it's asphalt.

    1. That sounds about right. Annoyingly, Midjourney AI recently moved to a new improved version. The results are much slicker, and much more boring. Ah well, it was fun while it lasted.
