By Bijan Bowen

Testing locally run Open Sora AI Video Generation

As I mentioned in my recent post about locally run portrait animation, the thing I am most excited about in the generative AI space is the ability to generate videos locally, without subscriptions or censorship. The difference between my current self and the version of myself who wrote that article 11 days ago is that I have since discovered a locally run AI video generation repository: Open Sora.

The Open Sora GitHub Repo

When I first learned about this repo from a post on the LocalLLaMA subreddit, I eagerly browsed through it and was excited by the high-quality 720p video samples it contained. Unfortunately for me, I then realized that these samples were generated using an Nvidia H100, an 80 GB GPU with a price comparable to a new Japanese sedan. Since my local LLM machine uses two Nvidia 3090 Tis, I searched for a way to run any sort of generation on my setup and came across a page that mentioned the ability to generate a 4-second 240p video using a 3090. While this may not sound enticing to most, those who share my fantasy of uncensored offline video generation will appreciate my excitement at the prospect of running this on my local setup.
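To put the gap between those 720p samples and a 240p clip in perspective, here is a rough back-of-the-envelope sketch. The 16:9 frame sizes (1280x720 and 426x240) are my assumption for what "720p" and "240p" mean here, and raw pixel count is only a loose proxy for how much work the model has to do, not a measurement of VRAM use.

```python
# Rough comparison of per-frame pixel counts.
# Assumption: both resolutions are 16:9 (1280x720 and 426x240);
# the repo may use slightly different frame sizes.
RESOLUTIONS = {
    "240p": (426, 240),
    "720p": (1280, 720),
}


def pixels_per_frame(name: str) -> int:
    """Return width * height for a named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


ratio = pixels_per_frame("720p") / pixels_per_frame("240p")
print(f"720p has about {ratio:.1f}x the pixels of 240p per frame")
```

Under those assumptions, each 720p frame carries roughly nine times as many pixels as a 240p frame, which gives some feel for why the showcase samples came from a much larger card.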

The single-GPU CLI prompt
The parallel-GPU CLI prompt

I followed the GitHub instructions to install everything and had a bit of trouble, though that was down to some obscure issues with my SSH session into the local LLM machine and not inherent to the repo itself. Once everything was installed, I attempted to run the command-line inference script, but for some reason I could not get it to work with only one card. Fortunately, the option to use multiple GPUs was available, and through it I was able to get my first generation. I chose to start by trying to recreate the results from the backprop page, using the prompts listed there for my examples.
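For anyone trying this at home, a quick first check when a run behaves differently on one card versus two is to see what the driver actually reports. The following is a minimal sketch and not part of the Open Sora repo: it assumes `nvidia-smi` is installed and on the PATH, and the helper names `parse_gpu_list` and `list_gpus` are hypothetical ones I made up for illustration.

```python
import subprocess


def parse_gpu_list(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory.total' CSV lines into (name, MiB) pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so a comma in the name would not break parsing.
        name, memory = (part.strip() for part in line.rsplit(",", 1))
        gpus.append((name, int(memory)))
    return gpus


def list_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for each GPU's name and total memory in MiB."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=name,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_list(result.stdout)
```

Calling `list_gpus()` on the machine in question returns one entry per visible card, which makes it easy to confirm that both GPUs show up and how much memory each one has before launching a generation.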

A bustling city transitioning from day to night

The first prompt was for a bustling city transitioning from day to night, and the output was definitely impressive. Having finished my testing, I still consider this one the most impressive in terms of "realism." Next, I tried the art museum tour, which began in a very "trippy" manner before settling into a reasonable walk through the halls of the generated museum. Following this, I generated a magical girl transformation sequence, which looked very similar to the test result on the backprop page. Finally, I generated an epic Shonen fight scene, which was a bit of a letdown compared to the demo result I had seen, as it did not really contain any human-like forms until the last second or so of the clip.

A magical girl transformation

Now that I had some verification that my setup was performing similarly to one I assume was set up with more rigor and care, I began, to no one's surprise, to experiment with some more "obscure" and NSFW prompts. I won't go through every single prompt here, but I will say that some of the results were absolutely hilarious, while others were absolutely terrifying. While I won't show the more NSFW results, I will say that anyone with a 24 GB GPU who intends to use this for "adult" purposes may need to spring for an H100 unless their interest in that area centers around horror films and extreme contortion.

A more "mature" video generation

While it may be easy to look at the results produced here and dismiss them, I could not be more excited about the future of locally generated videos. I believe that in the near future, people will spend a lot of time generating their own content using tools like this, and being able to do so without needing another subscription is a great thing. I also read a feature request from an Apple ML engineer asking for an MLX backend, which would make it possible to run this on Apple silicon machines and could open up the possibility of generating much higher-quality videos without needing a data center GPU. I am looking forward to watching this evolve and will continue playing around with it on my local machine.

You can view the video for this article on my YouTube channel.

