

Cartwheel Co-Founder Andrew Carr: AI Tools Allow Animators To Be ‘Empowered To Make More Things’
Artificial intelligence tools that can control animation software like Maya or Unreal likely have a better chance at disrupting the animation industry than generative video tools such as Sora or Runway. Cartwheel, a new AI animation tool that appears to have that disruptive potential, has recently been released to the public.
Instead of using AI to generate final frames of animation, Cartwheel uses text prompts to have the AI control rigs, and the data can be sent to a 3d program of your choice for further refinement. The tool is still in its early days, but since this is exactly the kind of thing that might have a big impact on the future of animation and vfx (for better or worse), we wanted to learn more about what it actually does.
The company behind the tool recently raised a new round of funding worth $10 million, and its backers include DreamWorks Animation co-founder and former CEO Jeffrey Katzenberg (through his venture capital fund WndrCo).
In the interview below, Cartwheel co-founder Andrew Carr speaks about the capabilities of the new tool, why former Pixar animators are involved in its development, and how it might affect artists in the animation industry. Carr (formerly of OpenAI) is a self-described math and science guy, but his interest in classical piano from a young age created a love for the arts that he now hopes to merge with math, science, and AI. (This interview was originally published in Life in the Machine, a great newsletter about tech in animation by veteran director Matt Ferguson, and is being reprinted here in partnership.)
Cartoon Brew: What made you decide to get into this space, and how would you describe the company’s origin story?
Andrew Carr: I was a scientist at OpenAI working on program synthesis — it’s a jargon term — it essentially means code that writes code. So, I was on a team doing this and I was looking around at other tools that were up and coming. DALL-E, the first image generator from OpenAI, was on the horizon and we were testing this tool internally. But I felt extremely frustrated every time I used it because the output was pixels, and I couldn’t modify it after the fact.
So, I had this program synthesis tool we were building, and I was frustrated with the image generator and I thought, “I wonder if I can put this ‘AI that writes code’ into Blender.” So I hooked it all up and I asked Blender — Hey, make this lighting set up, make these things do that. And it was able to do it! Most importantly, after the fact I could go into the 3d viewport and change whatever I wanted.
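For readers curious what that looks like in practice, here is a minimal sketch of the kind of script a code-writing model might emit for a lighting request. The specific lights and values are hypothetical, but the calls are standard Blender Python (bpy).

```python
# Minimal sketch: the kind of Blender Python a code-generating model might emit
# for a prompt like "add a warm key light and a soft fill over the subject."
# The specific objects and values here are hypothetical examples.
import bpy

# Key light: a warm spot aimed toward the origin
bpy.ops.object.light_add(type='SPOT', location=(4.0, -4.0, 5.0))
key = bpy.context.object
key.data.energy = 1000.0
key.data.color = (1.0, 0.85, 0.7)

# Fill light: a soft area light from the opposite side
bpy.ops.object.light_add(type='AREA', location=(-3.0, 3.0, 4.0))
fill = bpy.context.object
fill.data.energy = 300.0
fill.data.size = 2.0
```

Because the output is a script rather than pixels, everything it creates stays an ordinary, editable object in the 3d viewport, which is the editability Carr describes here.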
Light bulbs were going off for me. I saw the ability to move quickly and then have complete creative control. So, I ended up leaving OpenAI to pursue this idea. I knew there was something here.
A few months later, OpenAI emailed me and said, “We tried to hire this guy and he pitched us exactly what you went to go build. You should probably talk to him.” That’s my co-founder, Jonathan Jarvis. He’s a brilliant animator and product designer. He did animation for Google, ran his own studio. Brilliant man. We met up and connected over this idea of editability at the core.

One day over breakfast we were talking and I said, "What's the hardest thing that we could try and solve in animation?" Immediately, no hesitation, he said, "3d characters. It's super tough but people love it. There's almost infinite demand for 3d character animated content." That was the genesis for Cartwheel. This was in 2022.
You seem to be taking an interesting approach here because as a director when I look at video models like Sora or Runway they can do all these impressive things, but the minute you want to go in and change something, it falls apart.
Right. I always call it the AI slot machine. You pull the lever and you hope it gives you what you want. It's going to give you something fantastic, but the chances of it matching your vision seem very low. And that's wrong. We should give animators the tools to tell the stories that they want to tell, not the stories that the AI guesses.
Do you see Cartwheel as being mostly a professional product, something more consumer-facing, or something in-between?
Our initial target group of customers is people who are paid to do this today — 3d professionals. We've noticed that the industry is buckling under the load of demand and the studio greenlight process is kind of broken, but the tools haven't kept up with that. Individual animators cannot make animated films. Even small teams can't really do it. [Oscar-winning animated film] Flow is amazing, but it took quite some time.
We’re targeting this professional group of people who want to do both smaller and bigger projects. However, I think a side effect of making it easier is that the adjacent folks who are creative but don’t do animation today can use it. It doesn’t require the two years of schooling to pick it up like you would if you’re doing Unreal or Maya, but if you do know Unreal and Maya, we slot right into that workflow and just supercharge it.
Eventually I would love for everyone to be an animator in the same way that we’re kind of all photographers with our smartphones. Kind of, right? I would love for the world to exist where animation is everywhere, and everyone uses it for communication. But then there’s still, of course, the top-of-the-top experts who work in this field.
Currently with Cartwheel you can work exclusively in the web app, or you can export what you do and use it in Maya, Blender, or whatever. It seems to me that the second approach is how the tool could potentially be used in production.
Yes.
Do you always see it working in that intermediary kind of way? Do you see a world in which you would work directly inside a program like Maya?
Yeah, so we are building out an API, which would allow you to take the motion and pipe it directly into your own experiences. We would love to integrate with all these tools. The easiest way today to see if we're building in the right direction is to download the motion and see if it's useful.
Got it. Can you tell me a bit about the process of training the model for Cartwheel?
Great question. If you look across the AI landscape you have different AI models that do different things. So ChatGPT is mostly a text model, although there are some images now. Midjourney is an image model. Runway does video. Then there's a new data type, which is 3d motion. This data is different than text and different than images.
So, you look around the internet and you say, where can I find this data? I jokingly call it ethically sourced. We've purchased and licensed motion capture libraries. We've paid animators to animate these things. We've done mo-cap ourselves. I've even gotten in the suit and danced around myself. So today it is mostly trained on motion capture data.
Critically, one thing we've done is clean that up, so the splines that come out of the system are not mo-cap splines, because that would not be useful. And for those who are unaware, mo-cap splines have a key on every frame. And once you try to edit that curve, it just falls apart. So we've taken great care to not output a key on every frame. We output beautiful curves with great tangents that you can manipulate in your system.
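Cartwheel hasn't published how its cleanup works, but the basic idea of collapsing a key-on-every-frame curve into a sparse, editable one can be illustrated with a simple greedy simplification that keeps a key only where straight interpolation between its neighbors would miss the motion by more than a tolerance. The function below is an illustrative sketch, not Cartwheel's algorithm.

```python
# Illustrative only: a recursive simplification (Ramer-Douglas-Peucker style)
# that keeps a key only where the sampled curve deviates from a straight
# interpolation between endpoints by more than a tolerance. Real spline cleanup
# also fits tangents; this just shows how per-frame keys collapse to a sparse set.
def simplify_keys(frames, values, tolerance=0.01):
    """frames, values: per-frame samples of a single animation channel."""
    if len(frames) <= 2:
        return list(zip(frames, values))

    first, last = 0, len(frames) - 1
    # Find the sample farthest from the straight line between the endpoints.
    max_err, max_idx = 0.0, first
    for i in range(first + 1, last):
        t = (frames[i] - frames[first]) / (frames[last] - frames[first])
        interpolated = values[first] + t * (values[last] - values[first])
        err = abs(values[i] - interpolated)
        if err > max_err:
            max_err, max_idx = err, i

    if max_err <= tolerance:
        # Everything in between is predictable: keep only the endpoints.
        return [(frames[first], values[first]), (frames[last], values[last])]

    # Otherwise keep the worst offender and recurse on both halves.
    left = simplify_keys(frames[: max_idx + 1], values[: max_idx + 1], tolerance)
    right = simplify_keys(frames[max_idx:], values[max_idx:], tolerance)
    return left[:-1] + right  # drop the duplicated shared key
```

On a typical ease-in, ease-out channel this collapses hundreds of per-frame keys into a handful, which is what makes the resulting curve practical to edit by hand.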
Did you have animators go in and clean up that mo-cap data or is that something that you’re automating?
We do both. Our Pixar animators on the team informed the process and they’ve looked at thousands of hours of automated cleanup splines to dial this thing in. But when you download generated motion, it’s cleaned up automatically by our system.
I actually wanted to ask about the former Pixar artists you’ve hired. I’m curious how they’ve pushed against the technology and how you’ve adjusted based on their feedback.
It’s amazing. We’ve hired Cat Hicks and Neil Helm, 15-year directing animators, brilliant folks, very kind. The short version is they wanted to build the tools they wish always existed. And we’re taking that to heart. They have put the Pixar tools through the paces over the years and they know what good tools are. One of the things that we take into account is how would a professional use this tool?

For example, we have built a full IK/FK rig, with follow spaces and all the good things that go onto any custom character, on any motion that you can download. That was one thing that they said we need and cannot launch without. And me, as a scientist, I'm like, "You just get the rotations and it's all forward kinematics and you're fine." And they said, "No chance. No one's going to use this."
So, it’s little things like that that elevate the experience from a fun toy to a powerful professional tool being able to actually slot into production.
One thing that we’re going to be starting really soon for folks who are interested in the Cartwheel community is Zoom chats with the animators on our team showing you how to use the tool and answer your questions. We’re going to be trying to make sure the communication is strong. Because that matters here. A lot.
It feels important to connect with animators and what they actually need in the day-to-day work of creating shots.
Yeah, that’s right.

One thing I noticed playing around with the tool is that the motion library and the ability to combine pre-animated motions together into a shot seems to be core. Was the animation in the library created generatively with AI, or was it animated by your team?
It’s a mixture of both. We’ve generated a lot of that motion and a lot of it is motion capture that we’ve cleaned up. One thing we’ve observed is animators don’t actually care if the motion is AI generated. The scientist in me cares, but animators want great motion wherever it comes from. We’re trying to serve that desire.
Another thing that we’re launching very soon is video reference upload. So you can record yourself or upload a reference video. It’s almost like rotoscope, but we’ll extract the 3d motion from there. So yes, we try to have this three-pronged approach. You can type it in and generate, you can search for it, and very soon you can upload reference to get the motion that you care about out of our tool.
Right. And obviously the generative stuff can get a bit wonky at times.
It’s hallucination at its purest form.
Which is totally understandable. Is there a process where that improves over time? How do you get stuff to work more consistently?
The motion generation system is in its infancy and there’s lots of crazy [results]. There are many ways we could imagine improving this over time. One thing we’re not going to do is train on uploaded motions. That’s not part of our company culture. We’re not going to train on your data. But you could easily imagine once we launch this reference video upload system that we ourselves go out and find reference video, label it, extract the 3d motion, and use that to continuously improve the model.
We have an amazing suite of artist data labelers who help us. They look at the motion, they watch it, they describe it. That process is ongoing, giving us beautiful, dense captions that are descriptive of the motion. Once we get more and more of those, then the model better understands the intent of the users when they ask questions.
A lot of criticism of AI-generated art and video is that there's a certain sameness – a sanding off of the edges that can make it feel generic. What are you doing to make sure that your animation doesn't just end up being the average of everything, but instead that you can get something interesting and unique?
It’s a little bit more of a science question, so without getting too much into the jargon, there’s two steps to training these models. The first is called pre-training where you train on all this beautiful motion capture and all this hand-animated stuff. If you start there and finish there, then you get this sameness and this sort of averaging of all the big pile of data. There’s another step that’s called post-training where you inject the life of the motion. It’s just simply a very small hand-curated set of beautiful stylized motion with lots of anticipation, lots of secondary motions, and so on. You really force the model at the end to respect that quality. This is an open research question that is not yet solved by anyone. We’re not there yet. We have a long way to go, but there are promising first results.
You mentioned Flow earlier which is a very different style of animation from Across the Spider-Verse, which is very different from a Pixar movie. Can you see a world in which you could tune your model to the style of the show that you’re making?
I love that. The short answer is emphatically yes! There’s two ways to do it. If you are a studio and you have a large history of motion that you’ve done for different shows, you can work with us to get a custom model. We don’t advertise that anywhere, but we’d love to work with anyone who wants a custom model. That involves additional training, access to the model and stuff. It’s a little bit more complicated.
And there’s a second option, which is right around the corner, is simply where you upload three or four or five references — a walk cycle, an idling cycle, maybe some performance motion. Because of how the model itself is structured you can prompt with examples in motion and then all the generations thereafter will be closer to the style of this motion. [It] does not require additional training.
I can imagine even a 2d animated film that’s trying to make the transition into 3d, they have all this beautiful 2d motion. Put it through our video reference uploader, train a custom style model, and then suddenly you can generate 3d versions of this beautiful 2d style that’s existed forever. I am very excited for that future.
As you know, a lot of artists working in animation are really nervous that tools like yours are going to automate their jobs away. In the past you have mentioned that Cartwheel could make the work 100 times faster. Jeffrey Katzenberg, who is one of your investors, has said he thinks animated movies could be made with just 10% of the current workforce. So it’s obviously pretty scary for people working in animation. What’s your message to people who are worried right now?
I believe that the demand for animation is so much higher than the supply that it’s unreal. We don’t see that demand because it’s just too expensive today, but if it were slightly easier then I think animation is going to be everywhere. We’re going to see it in all of our communication and all of our ads.
If you think back to the early days of design, you used to need an entire design firm to make a poster. You'd have someone to do the typesetting and someone to do the drawing and so on. With the advent of Photoshop, now every team has a designer and there's so much more design being done. Yes, we don't have huge design firms in the same way we used to, but there's so much more work for designers. And these designers can accomplish so much more — a single person can make a great print ad.
I want to live in a world where the individual is more empowered. Today we live in a world where the studio holds a lot of the power and they greenlight and hire and fire sort of at will and the animators are unfortunately not treated as well as they should be. I want to flip that script — empower the animators and give them the tools they need so that they can create more.
I don’t know what’s going to happen in the future. I can’t pretend to predict what’s going to come, but we’re building the tools for the animator to be empowered to make more things. That’s our goal here at Cartwheel.
So you see a world with smaller teams, more independent artists, and these tools are empowering individual animators to do that?
Both ends of this — I think smaller teams can do more and I think larger teams can do more. We’re currently stuck in this middle ground of animation where you have to be of a certain size to do anything and I would love to spread that out to more people.
Thanks for taking the time to talk with me. Is there something I didn’t ask that you want to mention?
Just one thing briefly. I’ve said this implicitly, but I love to say it explicitly. We’re very open to feedback. Please use the tool and tell us what you like. Tell us what you don’t like. We have a Discord, we have emails, you can tweet at me, whatever you need. I have a vision in my head for what this tool needs to be, but I’m very open to being wrong. So please, please, please reach out. Tell me what you like, what you don’t like.
The interview has been lightly edited for length.