Machine Learning for Production

MOD Tech Labs
9 min read · Sep 25, 2020

This is a transcript taken from a series of lightning talks focused on modern content creation techniques during SIGGRAPH 2020. Enjoy!

I am Tim Porter, CTO and Co-Founder of MOD Tech Labs, and we'll be going over the many different ways that we use machine learning. Not so much in the sense of "this is a convolutional neural network" or "this is deep learning," but to show the different ways machine learning can be used and how it can help people going through production.

So, a little bit about me: I am a serial entrepreneur. I've worked at Volumation, and also at Underminer Studios for coming up on five years now. We did a lot of XR work for a bunch of different companies, everything from Microsoft to Quantum Rehabilitation, which is the second largest wheelchair manufacturer in the world. That was actually a pretty cool project. We took on the actual VR driving controls: if you have a joystick or anything like that, we put that into the VR experience. So as you went to physically drive the chair, you could either sit in the chair, or they made a very special version of the controls that could come off and let people drive the VR experience. Technically, it drove Windows, but it was pretty cool.

We did work for KPMG on data visualization, things like that. And then of course Volumation was a volumetric capture and processing solution, much more service-based than what MOD does, which is fully automated.

I graduated with a Bachelor of Science from Full Sail University. I've also worked in games and movies. The last movie I worked on was Alice in Wonderland: Through the Looking Glass. The last game I worked on was probably some of the Cars stuff that came out on iOS and Android. Although, I still see games that I worked on, or that use tech I built, come out on a regular basis.

As far as MOD goes, we received backing through SputnikATX, and Quansight Futures also came in on this round. My Co-Founder and I have both received Intel Top Innovator awards over the last three years as part of the Intel innovation team. And then there's the City of Austin Innovation Award, which is super cool because we didn't expect it to happen. We went up against some really big companies and were pretty surprised when we ended up winning. That was a lot of fun!

So, what is the core of MOD Tech Labs? Obviously, the biggest thing with VFX in general is just how long it takes to do things. That's because most of the tasks are fairly manual; even where there are automated systems, there are so many stopgaps in between. So we build a lot of automated tools with universal input and output on a very private, secure cloud, which is actually hosted in the same facility that runs DoD processes and things like that.

What is also really interesting is how we're leveraging machine learning and distributing it across our cluster, which is made up of very large systems. Our storage array alone is 11 systems with over a terabyte of RAM, and that's just the storage side. On the processing side we have over 100 GPUs and over 200 CPUs. Anyway, I could talk all day about that!

Our current feature list includes things like unoptimized mesh creation, Sketchfab visualization, point cloud visualization, glTF integration, voxelization, automated color correction, mesh creation, decimation, cleaning, remeshing, texture baking, secondary maps, and texture optimization. A lot of these are available for everyone to use on our website, modtechlabs.com. Some of them are in beta, so you might have to request access.

Then some of the features that are coming up are much more interesting: automated delighting, deshadowing, image resizing, sharpening, cutout tools, auto rigging, and full FACS and wrap remeshing. One of my favorites is FBX file encoding optimization, which produces a single .fbx file for an entire volumetric capture sequence, with the textures, meshes, and animation all in one, which is super cool. There's also full green screen correction, automatic rotoscoping, and things like that.

Obviously, a lot of this is super heavy machine learning, and most of it is currently in development, so we'll discuss some of those, and then some of the ways we are using machine learning in the current tools we're showcasing, through videos and Sketchfab and things like that.

This (above) is an automated texture optimization tool that we have. It uses machine learning to preserve image quality as the texture is scaled down. So it's a fairly interesting little tool. Here's the Sketchfab version (below):

The first mesh in that lineup uses an 8K texture, followed by 4K, 2K, and 1K. I think the big thing that amazes people is that even for people with really good eyesight, it's really hard to tell a massive quality difference between all of these. You're literally going to 1/64th the pixel count from 8K to 1K, so that's a lot of fun.
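
For a concrete sense of the scale involved, here is a rough Python sketch of the pixel-count arithmetic behind that 1/64th figure, plus a plain Lanczos resize with Pillow as a baseline comparison. This is not the ML-based optimizer described above, just the naive approach it is meant to improve on; the file names are placeholders.

```python
# Pixel-count arithmetic behind the "1/64th" claim, plus a plain Lanczos
# downscale with Pillow as a simple baseline. This is NOT MOD's ML-based
# optimizer -- just an ordinary resize for comparison.
from PIL import Image

RESOLUTIONS = {"8K": 8192, "4K": 4096, "2K": 2048, "1K": 1024}

for name, size in RESOLUTIONS.items():
    ratio = (8192 * 8192) / (size * size)
    print(f"{name}: {size}x{size} px, 1/{int(ratio)} of the 8K pixel count")

def downscale(path_in: str, path_out: str, size: int) -> None:
    """Naive Lanczos resize; a learned optimizer would preserve detail better."""
    img = Image.open(path_in)
    img.resize((size, size), Image.LANCZOS).save(path_out)

# downscale("texture_8k.png", "texture_1k.png", RESOLUTIONS["1K"])
```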

The next one (above) is our mesh decimator. This one does a lot of work to ensure that the actual edges and silhouette are kept. Most decimators do a really poor job at that, but we capture images based on how we want the mesh to come out. So as the asset goes in, it basically does a scan of the scan, then runs that again to keep the silhouette. It ends up doing a balancing act between the two meshes: the perfectly optimized mesh and the silhouette-preserving one, and that balance is what produces the result we get, which is once again super cool. You can see that in this mesh, here (below):

If you zoom in close on the figure of the guy centered in the fountain, you can see what this decimator does. Most decimators will not provide quality like this, and the reason we were able to do it is the strength of machine learning. If you look at the figure, there is less than half the amount of mesh there: 4.49 million triangles down to 2 million, and the quality is obviously phenomenal. So, that is a fun one.
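
For reference, here is a minimal sketch of conventional quadric-edge-collapse decimation using Open3D, the kind of generic approach most decimators take. MOD's silhouette-aware, ML-guided decimator is not public, so this is only a baseline to compare against; the file names and triangle target are placeholders.

```python
# Baseline quadric decimation with Open3D, shown only for comparison.
# Standard quadric decimation tends to soften silhouettes and hard edges,
# which is exactly the weakness the ML-guided approach above addresses.
import open3d as o3d

def decimate(path_in: str, path_out: str, target_tris: int = 2_000_000) -> None:
    mesh = o3d.io.read_triangle_mesh(path_in)
    print(f"input: {len(mesh.triangles):,} triangles")
    simplified = mesh.simplify_quadric_decimation(
        target_number_of_triangles=target_tris
    )
    simplified.compute_vertex_normals()
    print(f"output: {len(simplified.triangles):,} triangles")
    o3d.io.write_triangle_mesh(path_out, simplified)

# decimate("fountain_raw.ply", "fountain_2m.ply", target_tris=2_000_000)
```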

Automated remeshing: this one was a really fun one for me to build. It does flow mapping along the entire surface. You'll want to notice the quadrangulation and the flow lines that run across it. Even looking at this mesh (above in the video thumbnail) and seeing how the lines run across, it understands a certain amount about not only the topology, but the topology flow and how to produce that flow to get a really awesome result. So it ends up getting flow that goes across the front of the model. If you look at the lapel of the jacket on the bust, it does a really good job of keeping a sharp edge there.

What's weird about it is that normally when you do a remesh or optimization, you say, "I want X number of polygons," and this isn't quite like that. Instead, you say, "I want this much crease," and, "I want the edges to stay this true to the original," and that's how we get an asset out.

So it's a little bit weird to work with as a coder or as a person on the back end. But when you deal with it on our platform, you state what percentage you want, and it sends that through and does the math for you. The UI is really simple. This one is kind of fun like that.
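
As a rough illustration of that parameter style, here is a hypothetical sketch of how a single UI percentage might be mapped onto crease and edge-fidelity controls. The names and the mapping are invented for illustration and are not MOD's actual API.

```python
# Hypothetical illustration of the parameter style described above: the
# remesher is driven by a crease tolerance and an edge-fidelity target rather
# than a polygon count. These names and numbers are invented, not MOD's API.
from dataclasses import dataclass

@dataclass
class RemeshSettings:
    crease_angle_deg: float   # dihedral angle above which an edge is kept sharp
    edge_fidelity: float      # 0..1, how closely edges must track the original

def settings_from_percentage(quality_pct: float) -> RemeshSettings:
    """Map a single UI slider (0-100) onto the two underlying controls."""
    q = max(0.0, min(100.0, quality_pct)) / 100.0
    # Higher quality -> keep creases at shallower angles and demand tighter edges.
    return RemeshSettings(
        crease_angle_deg=60.0 - 40.0 * q,  # 60 deg at q=0 down to 20 deg at q=1
        edge_fidelity=0.5 + 0.5 * q,
    )

print(settings_from_percentage(85))  # RemeshSettings(crease_angle_deg=26.0, edge_fidelity=0.925)
```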

Above, you'll notice the first model is the big ol' original raw asset. You can see how high-res it is: 1.4 million tris. The second model (MOD Movie Quality Remesh) goes down to 87,000! If you look at the shirt, the quality that's there provides just a wonderful mesh, especially if you're not using this as a massive hero asset. The third model goes down to 28,000, once again the same kind of thing. Then the last one ends up as a really, really low-res mesh (13,000 tris). So if you're trying to use that in a game or something similar, you'll bake in the secondary maps to keep the quality you're looking for.

Now this, here (above), is automated texture reprojection. This one actually does a considerable amount of cage work. Normally, when you do a reprojection, you say, "I want a cage that is this big, this many units." The work on this was very specifically to figure out how big the cage should be, so it does a considerable amount of work based on the original images and other setups, looking at different angles, to make sure the cage is the appropriate size.
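
For comparison, here is a simple heuristic sketch of the manual starting point the tool replaces: picking a cage offset as a fraction of the mesh's bounding-box diagonal and pushing vertices out along their normals. MOD's tool infers the cage from the capture images and camera angles instead; the loader call commented out below is hypothetical.

```python
# A simple heuristic for choosing a reprojection/baking cage offset from mesh
# extents. This is the usual manual starting point, not MOD's image-driven
# cage estimation.
import numpy as np

def estimate_cage_offset(vertices: np.ndarray, fraction: float = 0.01) -> float:
    """Return a cage push-out distance as a fraction of the bounding-box diagonal."""
    bbox_min = vertices.min(axis=0)
    bbox_max = vertices.max(axis=0)
    diagonal = float(np.linalg.norm(bbox_max - bbox_min))
    return diagonal * fraction

def build_cage(vertices: np.ndarray, normals: np.ndarray, offset: float) -> np.ndarray:
    """Push every vertex out along its normal to form the baking cage."""
    return vertices + normals * offset

# verts, norms = load_mesh_somehow()   # hypothetical loader returning (N,3) arrays
# cage = build_cage(verts, norms, estimate_cage_offset(verts))
```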

Texture Reprojection — this one’s a really cool one. You can see in the Sketchfab (below), the original texture is on the left side. Then the remake is on the right, and it provides you the ability to do whatever you want to do with your UVs — which is super important.

And lastly, automated glTF playback. This one (above) doesn’t necessarily have a whole bunch of machine learning in it other than how we end up doing the mesh generation on it.

Something that's a little bit different about our mesh generation compared to a lot of others is not only how we find our cameras in three-dimensional space, but also how we use that camera information to provide secondary features. Once we get past our dense reconstruction step, that's how we get higher quality based on the original images, which is something you tend to lose when you go through a process (especially most photogrammetric processes). You'll get an initial high-res mesh, and you're like, "Okay, cool, I want to optimize it some…" and then what do you do with those details?

One of the big ways a lot of studios handle it is they go in and hand-remap over all of it with various brushes. We've gotten away from that hand-remap setup, and this one just automates the process. And this, of course, is just glTF on top of it. These (in the video) are super optimized assets: the first one is only 10 MB, and the second one is less than 100 MB for a full body sequence... that's it.
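
As a rough illustration of the packaging side, here is a minimal sketch of exporting an optimized mesh with its baked texture as a single binary glTF (.glb) using trimesh. This is a generic export, not MOD's automated playback pipeline; it assumes the mesh already has UVs, and the file names are placeholders.

```python
# Package an optimized mesh and its baked texture into one .glb with trimesh.
# Generic sketch only -- not MOD's pipeline; assumes the mesh carries UVs.
import trimesh
from PIL import Image

def export_glb(mesh_path: str, texture_path: str, out_path: str) -> None:
    mesh = trimesh.load(mesh_path, force="mesh")
    image = Image.open(texture_path)
    # Attach the baked texture via the mesh's existing UV coordinates.
    mesh.visual = trimesh.visual.TextureVisuals(uv=mesh.visual.uv, image=image)
    mesh.export(out_path)  # .glb bundles geometry and texture in a single file
    print(f"wrote {out_path}")

# export_glb("body_frame_000_decimated.obj", "body_frame_000_baked.png", "frame_000.glb")
```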

So, obviously, keep in touch! You can send an email: tim@modtechlabs.com.

We have a lot of really cool stuff coming out, like motion blur reduction, as well as machine learning color grading, where we use a convolutional neural network that goes through and understands what an asset is. For example, it takes someone's skin tone, realizes it is the same asset over time, and then applies the style transfer throughout all of your scenes. So if you want neutral grading across everything, it does a wonderful job at that. Then we also have a style transfer tool coming out as well: you feed it both the image you want to color grade and the one you want the grading to come from, and it does an intelligent combination of the two.
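
As a point of reference for what the CNN improves on, here is a classical Reinhard-style color transfer sketch that only matches global color statistics between a source frame and a reference grade, using OpenCV and NumPy. It has none of the temporal or semantic awareness described above; the file names are placeholders.

```python
# Classical stand-in for learned color grading: Reinhard-style statistics
# matching in LAB space. Matches per-channel mean/std of the reference grade,
# with no awareness of what the pixels actually are.
import cv2
import numpy as np

def transfer_grade(source_path: str, reference_path: str, out_path: str) -> None:
    src = cv2.cvtColor(cv2.imread(source_path), cv2.COLOR_BGR2LAB).astype(np.float32)
    ref = cv2.cvtColor(cv2.imread(reference_path), cv2.COLOR_BGR2LAB).astype(np.float32)

    src_mean, src_std = src.mean(axis=(0, 1)), src.std(axis=(0, 1))
    ref_mean, ref_std = ref.mean(axis=(0, 1)), ref.std(axis=(0, 1))
    graded = (src - src_mean) / (src_std + 1e-6) * ref_std + ref_mean

    graded = np.clip(graded, 0, 255).astype(np.uint8)
    cv2.imwrite(out_path, cv2.cvtColor(graded, cv2.COLOR_LAB2BGR))

# transfer_grade("shot_frame.png", "reference_grade.png", "shot_frame_graded.png")
```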

Additionally, we are giving away $500 worth of credits if you sign up and use code MODtalks.

Thank you very much. I definitely look forward to everyone going on our YouTube and Sketchfab. We will be releasing a whole bunch of Unity assets soon so that everybody can play with things individually. Thank you!


MOD Tech Labs

Enabling production studios to bring immersive video content to life with fast and affordable SaaS processing. Learn more by visiting www.modtechlabs.com