

Parts of the article might not be correctly converted. For best experience, go to the Tor site.
http://ttauyzmy4kbm5yxpujpnahy7uxwnb32hh3dja7uda64vefpkomf3s4yd.onion




CGI vs AI

September 13, 2023


I showed my movie Moria's Race at a gas station once. A woman came by, saw it and asked me, "Is it AI?". I obviously cringed. Because it wasn't AI; it was CGI. And there is a difference. Well, technically there was one AI tool used: the open-image-denoise denoiser that comes preinstalled with Blender. But it's not really AI. Or is it?

See, AI stands for Artificial Intelligence. So things like generating images from text input, which requires intelligence, can be done today with AI, where the intelligence is artificial. But CGI, on the other hand, stands for Computer Generated Imagery... Wait! Huh?

It's funny how something like Stable Diffusion, even though called AI, is better expressed by the name CGI. While something like Avatar, even though called CGI, is better expressed by some other term that perhaps doesn't exist, because people were okay with CGI. Yes, technically there is the term Motion Capture, or as the producer of the movie calls it, E-Motion Capture, since it's an improved version that captures emotions rather than just motions. But James Cameron, the director of the films, calls the process Performance Capture instead. And the funny thing is, all three of those terms completely avoid the second half of the process: the putting together of the scene in software and then rendering it out into pictures. Something that until now has been referred to as CGI.

So what should we call things? Well... Let's speculate.





We got a sale!




Whether Moria's Race is CGI, AI or something else entirely, I'm not sure. I definitely worked very hard for the movie to be made. Even though technically one AI tool was used for the movie ( open-image-denoise ). So I don't consider it in the same category as something made with Stable Diffusion. I consider it more like a normal film. Though shorter in length.

A few days ago, somebody actually purchased the movie. The funny thing is, as far as I know, that person saw the movie before purchasing. And made the purchase because the movie turned out to be good enough for it. Which kind of flips every preconception of business and throws the copyright industry out of the window. Anyway, here is the movie.


Moria's Race






"Artificial Intelligence"




First of all, "Artificial Intelligence" is way too broadly used. I mean, an algorithm playing chess is technically "Artificial Intelligence". But what we are talking about is nicely summarized by the term "Machine Learning". Which is, in and of itself, a very interesting term.

So the intelligence here comes not in the form of knowing how to make a picture based on mere text input, but rather the ability to get to that point from zero, based on some examples. And therefore Stable Diffusion is not "Machine Learning" per se, but rather a product of "Machine Learning". A machine that knows nothing is not intelligent. Because this machine is literally a representation of a Tabula Rasa. While on the other hand a learned machine is more intelligent. Therefore making it more "Artificial Intelligence" and less "Machine Learning".

So I think, not to confuse the process and the result of "Machine Learning", while also not confusing intelligence programmed by somebody with intelligence learned, we can use some sort of new term. There is "Machine Learning Algorithm". But that presents the learning as happening in real time. Which is possible. I'm not saying that those algorithms cannot keep learning while being in use. But if it's something like open-image-denoise, which comes preinstalled in Blender and doesn't learn, so to speak, where you have a normal cycle of updates instead, then it's not a "Machine Learning Algorithm"; it's a "Machine Learned Algorithm".
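The process/result distinction can be shown in miniature with a toy sketch (purely illustrative; nothing here resembles how open-image-denoise or Stable Diffusion are actually built): fitting a line from noisy examples is the "learning", and the frozen weights you ship afterwards are the "learned" algorithm.

```python
import numpy as np

# Toy illustration of "Machine Learning" (the training process)
# versus a "Machine Learned Algorithm" (the frozen result you ship).
# The data and the function being learned are made up for the example.

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 200)
y = 2 * x + 1 + rng.normal(0, 0.05, 200)  # noisy samples of y = 2x + 1

# --- Machine LEARNING: derive the weights from examples ---
A = np.stack([x, np.ones_like(x)], axis=1)
w, b = np.linalg.lstsq(A, y, rcond=None)[0]

# --- Machine LEARNED: the deployed algorithm is just these frozen numbers ---
def deployed(x_new):
    return w * x_new + b  # no learning happens here, only application

print(round(w, 2), round(b, 2))  # roughly 2.0 and 1.0
```

A tool like the Blender denoiser ships only the second half: the weights are baked in, and they change through normal software updates, not through use.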





Generating vs Filtering "Machine Learned Algorithms"




There is a difference between such "Machine Learned Algorithms" as open-image-denoise and Stable Diffusion. One receives an input of a noisy picture and outputs a clean picture without noise. The other receives a vague description in text and outputs a detailed image based on this description. You can see the severity of the difference. And I think the difference here is in how artistic the algorithm is.

For example, open-image-denoise doesn't generate details. It just removes the noise. You can look at both images and think that it does generate some details. But those details are present in the data fed into the algorithm. Since 3D software can feed things like the Z-buffer and normal maps into it, if not the whole model, to help it know about details that didn't show up in the render. And thus it can more correctly recreate the scene. But it is still very dependent on the input itself. And the more noise there is in the picture originally, the less it can clean it properly.
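To make the "filtering" idea concrete, here is a minimal sketch of a non-learned denoising filter (a plain 3x3 mean filter; nothing like the neural network inside open-image-denoise, and without the auxiliary Z-buffer and normal inputs): it can only average samples that are already there, never invent new detail.

```python
import numpy as np

# A naive denoising filter: replace each pixel with the average of its
# 3x3 neighborhood. This only illustrates the "filtering" idea from the
# article; open-image-denoise itself is a machine-learned denoiser.

def mean_filter(img):
    """Average each pixel with its 3x3 neighborhood (edges replicated)."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for yy in range(h):
        for xx in range(w):
            out[yy, xx] = padded[yy:yy + 3, xx:xx + 3].mean()
    return out

rng = np.random.default_rng(0)
clean = np.full((32, 32), 0.5)                    # a flat gray "render"
noisy = clean + rng.normal(0, 0.1, clean.shape)   # Monte Carlo-like noise
denoised = mean_filter(noisy)

# The filter smooths what it is given; every value in the output is a
# combination of values already present in the input.
print(noisy.std(), denoised.std())
```

Notice that the filter is fully predictable: the same input always yields the same output, and nothing appears that was not derivable from the input pixels.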

On the other hand, Stable Diffusion basically makes up the picture almost entirely alone. Where with open-image-denoise somebody had to make the picture only for the algorithm to clean it, Stable Diffusion makes the picture from scratch. Nobody needs to make anything. Somebody just needs to type some words. So we have a clear difference between the two. And the difference is how artistic the algorithm is.

Stable Diffusion is very artistic. Open-image-denoise is not artistic at all. Open-image-denoise is just a mere filter. And in Blender it is represented as a filter: in the compositing nodes, under the Filter category, you can find the Denoise node. While if something like Stable Diffusion were in Blender's compositor, it would come under a Generate category.

So we have "Generative Machine Learned Algorithms" and "Filtering Machine Learned Algorithms". They are two different beasts doing completely different things. But how about something like Google Deep Dream, which takes an input of a picture and outputs a different picture? It sounds like a filter, but in between it generates new imagery based roughly on shapes it recognizes in the picture that was fed into it. Is it "Generative" or "Filtering"? Is it both? Does the presence of one cancel the name of the other? And if so, which cancels which?

Some people argue that they made the art themselves. Will they be lying if they used merely "Filtering Machine Learned Algorithms"? I asked this question on Mastodon not so long ago. I presented the idea of how open-image-denoise operates. And I received a unanimous vote that open-image-denoise, and "Filtering Machine Learned Algorithms" in general, are okay. And so Moria's Race, the movie where I used it, can be considered as done by me.

But then what if I used something more extreme? What if I used Google Deep Dream? It is technically a "Filtering Machine Learned Algorithm". Is it still me making the movie? Or, because it generates detail, was it the algorithm that made the movie?

So okay, if it does generate detail, it is only a "Generative Machine Learned Algorithm". And so for it to be a "Filtering Machine Learned Algorithm" it should be merely filtering and never generating. Or there could be a term such as "Filtering and Generative Machine Learned Algorithm" that would describe both actions at the same time. But then something like translating text into images is also a form of very extreme filtering where a lot is generated.

Or look at, say, a lens flare filter. It is technically generating new stuff based on what is in the input image. So is it as bad as Google Deep Dream? No, it's not. And intuitively that makes sense. But why?

I think the answer might be in who actually controls the output. Can you predict exactly what the algorithm will do to your picture? Or is it so generative that it will come up with stuff that is not under your control? Look at it like this. On the Wikipedia page about Google Deep Dream, they have this image:




It's the Mona Lisa re-imagined as a dog. Did the person who ran the algorithm intend it to look like a dog? Or was the algorithm itself the one that chose to do it? With something like a denoiser, you know what you want to get and the algorithm delivers it. The same is true for a lens flare filter, or a blur filter, or an algorithm that deletes a green screen and puts a different image in the background.

But then something like Stable Diffusion is the same kind of filter. You know what you want to get. Perhaps you want a picture of an astronaut riding a horse on the moon. And it does it. You get the picture.




So how is it, then, that Stable Diffusion is apparently the evil algorithm? Is it all based on it taking away jobs? If you think about it, that does make sense.

Somebody was going to be paid to either draw a picture of a guy on a horse, or stage it, or make it using photo-editing software. But now the person who needs it just types it into a prompt and gets it immediately and way cheaper. So the job goes away.

On the other hand, open-image-denoise is not taking a job, but rather shortening the time for a different algorithm. It takes over a job that was already done by a computer. Before Blender had a denoiser, the only way to get a decent image was to render more samples. To render for a longer time. Now we can stop mid-way and ask open-image-denoise to magically make it look as if we had waited through the other half as well. Nobody suffers from it. Everybody is only happier. While with Stable Diffusion, somebody lost a job.
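The render-longer trade-off is just Monte Carlo averaging, and a toy simulation (with made-up numbers, not real renderer output) shows why stopping mid-way leaves visible noise for a denoiser to clean: the error only shrinks like one over the square root of the sample count.

```python
import numpy as np

# Toy model of path-tracing noise: a pixel's value is an average of
# random light samples, so its error falls off as 1 / sqrt(samples).
# The sample distribution below is invented for the illustration.

rng = np.random.default_rng(42)
TRUE_VALUE = 0.5  # the pixel's fully converged brightness

def render_pixel(samples):
    """Average `samples` noisy light samples, as a path tracer would."""
    return rng.normal(TRUE_VALUE, 0.2, samples).mean()

# Measure the pixel error at several sample counts.
errors = {
    n: np.std([render_pixel(n) for _ in range(2000)])
    for n in (16, 64, 256)
}
for n, e in errors.items():
    print(n, round(e, 4))  # quadrupling the samples roughly halves the error
```

This is why a denoiser saves so much time: getting the last bit of noise out by brute force means multiplying the sample count, and thus the render time, again and again.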

So here the term "Artificial Intelligence" actually comes back. We substitute a human's real intelligence with a machine's artificial one. An algorithm that is there to replace a skill. So what is it, a "Skill Substituting Machine Learned Algorithm"? Okay!

So then what does that make open-image-denoise? You can argue that it is reducing time and is therefore "Optimizing" or something. But then how about any other machine learned or hand-coded algorithm that shortens the time to make something that is not skill based, but rather just tedious to do? For example, what about physics simulation algorithms? Some are machine learned. And yes, you can argue that a person would otherwise animate the physics. But then nobody actually wants to. It's tedious and mathematical.

We develop software so that tedious things we don't enjoy doing get done for us by our powerful machines. We do not like to check every word we write for grammar, for example. We use a grammar checking tool that is now in every text editor imaginable. I use one right now as I type this text. And we understand the value of it. We use calculators instead of adding and multiplying on a piece of paper. Software is there to ease things for us. To reduce the painful, in order for the pleasurable to stay.

But I think we have reached a point where we don't value art at all. To the point where making it becomes the new tedious. Where making it becomes the thing that we want to delegate to the computer. Because all we want is to merely enjoy art. But then what joy is there if the art was made by a computer?

Artists do art not because they have to. Not because it's their job. Even if it is their job, they do it because it is what they want to be doing. Because making art is fun. It is entertainment in and of itself. It is something that people want to do. And so for those people it is strange that somebody wants to skip the process entirely. Where is the joy in "Machine Generated" art? There is none. It's just content!





Content




The word "Content" is now a red flag for a lot of people due to how effortlessly it is used by executives of media companies to describe whatever they are producing without paying an ounce of attention to what that is.

Facebook and YouTube, for example, are both companies grown large on the concept of "User-generated Content", meaning that those who run the website do not need to populate it with value; the users of the website will populate it with value themselves. But here we get a rather stark difference of perspective.

For those who run those websites ( and for executives, say, in Hollywood ) what is being produced is of no importance. They just want something to be produced. Better if this something also makes them money. But that's about it. Filmmakers, on the other hand, emphasize the importance of every project. Why a project was chosen is as important as how it performed. They respect the projects themselves and value the art of making them. While in big corporations, what is being produced, or how, is not important. If they can get it more cheaply, they will. But even that is a tedious job for them. Thinking about hiring a producer for a film project is a job that they would rather delegate to a computer. I'm not even talking about making the movie itself.

For those people, "Generative Machine Learned Algorithms" are the thing they want the most. That would be the perfect program to take over their tedious, uninteresting work. Imagine just having a server that every second outputs a new original movie idea, writes the script and makes all the shots. The costs are only electricity and an internet connection. The tedious job is reduced to its minimum. And on the other side a steady flow of "Original Content" is ready to be shoved into people's minds on demand.

On the other hand the artists, those who enjoy crafting, want to craft. They want to come to a place where they can create. I would be lying if I told you that I didn't enjoy the making of Moria's Race. Every time I chose a shot I was giggling with excitement and anticipation of how people would react to it. Every little nuance that I put in was thought about, and I enjoyed thinking about it. Yes, there was a lot of tedious bullshit. For example, the walk animations. Oh my god. I need to automate those. And you can see in the movie that I didn't take too much interest in the walk animations. They look kind of meh compared to the rest of the film. And I avoided them as much as I could. But hell, I enjoyed most of the rest of it.

I feel weirdly jealous of people who are afraid of "Generative Machine Learned Algorithms". Of people who have jobs that they like so much that they cannot imagine substituting them. I work regular, low paid, tedious jobs. I have been a cashier twice. I have been a warehouse worker. I stood at a conveyor belt and did the same motion a million times a day. All those made me money, but none of them I liked doing. I return home to do what I like. To write this article, which I enjoy writing immensely. I would not generate it with ChatGPT. Are you kidding me? I enjoyed coming home to animate Moria's Race. I will enjoy making the next project, whatever it will be.

Perhaps I am conditioned to dislike what I do for money. Maybe it's from parents who were raised during the Soviet Union. Maybe I'm insecure about myself. There was a chance to get into a film festival in Israel with my movie. And I didn't do it in time. Maybe next year. A friend of mine recommended doing it because it would allow me to meet people with whom I could make a real cinema picture. And finally leave the warehouse. I like it. It sounds amazing. But at the same time it sounds unreal. Too good to be true. Making art in a tiny room, nearly starving, which nobody will enjoy later. That sounds realistic. Depressing, sure. But realistic. Perhaps I need to work on myself psychologically to allow myself to consider doing what I enjoy for a living.

For the last few years I rationalized not working a job I enjoy by thinking that it is impossible to do ethically. I like programming, but the companies hiring develop proprietary software. I like making films, but they distribute them with DRM. I didn't even think that there might be an option to make a movie that will make money but will be under a Creative Commons license. I've talked to a few people who understand the business and they say that it is possible. Perhaps the license should be added only after the movie runs in the cinema, so the cinema will not run away with the movie. But it is possible. Making Free Software and being paid for it is also possible. Perhaps I just have to believe in it first, to find an offer that I'd like.

Strange how an article that I started by trying to understand the difference between CGI and AI ended up as a kind of therapy. I feel so good that I have written this rambling nonsense. Whatever the result might be, I enjoyed the process a lot.

Happy Hacking!!!