If you want to databend video, you won't get very far by opening a video file in a text editor and typing at random; this usually results in an unplayable file. You can get away with hacking at the raw bytes in a hex editor, but small changes won't make much of a difference: replacing a single byte is a much smaller edit in a video than in an image, since video files are usually much larger. So you'll need to make large changes, like using find and replace to swap large numbers of bytes across the entire file. As explained in the avoiding MP3 frame headers page, though, this can quickly end up swapping bytes the file needs in order to play. Still, you can achieve some interesting results with enough patience and trial and error. Back in 2009, while working closely with my graduate advisers Ben Chang and Jon Cates, I wondered if it would be possible to hack the video codecs themselves rather than the video file.
A codec (or coder-decoder) is a program (or part of a program) that creates and interprets media files. I've mentioned multiple times throughout these appendices that media files are rarely one-to-one mappings of bytes the way a text file is (with our uncompressed BMP file being a rare exception). Instead, a codec takes the raw bytes (pixels, sound buffers, etc.) and encodes them into a smaller, compressed series of bytes, which keeps the overall file size small. There are different methods and philosophies for how best to maintain the integrity of the original data while using as few bytes as possible. Techniques like Huffman encoding (see my Huffman hacking page) represent information with fewer bits than it would take by default (for example, every instance of the byte 01100011 might become 010). Other techniques involve removing colors from an image or video that we're not likely to notice are missing, or removing frequencies from a sound, like the particularly high or low frequencies humans aren't great at picking up anyway. Some techniques are unique to video, for example inter-frame coding, a form of temporal compression where each frame is compared to the ones that come before it so that it can "borrow" its predecessor's pixel info and store only the differences. The glitch art style known as "datamoshing" exploits this particular form of compression, either by removing the "i-frames" (the initial reference frames) so that subsequent frames borrow pixels from the wrong reference frame, or by duplicating frames multiple times so that each frame keeps borrowing the same set of offset pixels, creating a "pixel bleeding" or "blooming" effect.
In 2010 I created the first release of my Glitch Codec Tutorial, where I cloned a copy of my operating system (Ubuntu Linux) onto a series of DVDs, which I used to conduct a series of workshops. Participants would stick the DVDs into their laptops and restart their computers. Rather than booting into their own operating system (stored on their hard drive), they would boot into mine, stored on the DVD. Everyone essentially had a copy of my desktop, set up for hacking video codecs, on their machines. This meant that rather than databending the video files, we could hack the code that creates those video files, such that every video we ran through the hacked compression algorithms would come out glitched by default. In 2011 I created an online version of the Glitch Codec Tutorial, allowing participants to download an .ISO file which they could burn onto a DVD themselves and follow the workshop through a series of online videos. Needless to say, DVDs became obsolete soon after, and though I updated the tutorial to run off of a USB stick, the operating system I was using back then (a very old version of Ubuntu at this point) also became obsolete and eventually stopped working on most systems.
The purpose of creating the DVD/USB was to make it easier for beginners to jump into hacking video codecs, but it ultimately made the tutorial very difficult to maintain. I now regularly get emails from disappointed glitch artists who can no longer get the ISO running on their computers. For that reason I've decided to create this update to the Glitch Codec Tutorial, where rather than sharing an "easy to get started" (but difficult to maintain) ISO of my preconfigured environment, I'll share the process for replicating this configuration on your own computer. This will make this version of the Glitch Codec Tutorial much more sustainable, but it comes with the trade-off of no longer being beginner friendly. As such, this is more of an intermediate level tutorial which requires knowledge of the command line. If you're unfamiliar with the command line (aka terminal), I have introductory notes on another one of my class websites. In any event, I'll do my best to be clear and verbose rather than assuming you have years of experience working in a terminal. That said, I will be assuming you are in a Unix-style terminal, which is the default on Mac and Linux. On Windows the default is the "command prompt", but modern versions of Windows now come with a Linux or bash terminal preinstalled (if not you can install it).
In addition to having an app for writing code (like Atom, Sublime or VSCode), you'll need to have a couple of command line apps installed before we begin. You'll need git, a version control system programmers use to track changes to their projects. We won't be doing much programming, but git is also used to download the source code of open source software projects (see the page on machine code if you're not familiar with the concept of source code), which is what we'll be using it for. You will also need to install yasm, an assembler, which is a tool used for compiling (assembly) source code into machine code, and which we'll need in order to build an app we can run. To check if you've already got these installed, launch a terminal and run git --version as well as yasm --version. If both commands return a version number, then you have them installed. If instead you get some kind of error like "command 'yasm' not found", then you need to install them. On Mac and Windows you can install git by downloading an installer from the git website. On Linux you can install it from the terminal by running sudo apt-get install git. Yasm can be installed the same way on Linux, by running sudo apt-get install yasm. I assume (though have not tested) that if you're using the new Linux terminal in Windows you can install it the same way (in fact I'll be assuming for the remainder of this tutorial that the Linux instructions translate over to the new built-in Linux terminal in Windows). On Mac you can install yasm using homebrew (but if you don't already have homebrew you'll need to install that as well).
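If you'd like to check both dependencies in one go, here's a minimal sketch of how that might look. The function name check_tool is made up for this example; it only reports whether a command can be found and does not install anything.

```shell
#!/bin/sh
# Report whether a command line tool can be found on your PATH.
# "check_tool" is a hypothetical helper name, not part of git or yasm.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: installed"
    else
        echo "$1: missing"
        return 1
    fi
}

# The two dependencies this tutorial relies on.
for tool in git yasm; do
    check_tool "$tool" || true
done
```

Anything reported as missing would still need to be installed using the instructions above.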
Once you have those two dependencies installed on your system, we can move on to the next step of downloading and compiling the source code for FFmpeg, a command line program for working with media files. FFmpeg is used regularly by programmers to convert videos from one file type to another, but it's also capable of doing pretty much anything a graphical video editing program can do. In fact, I know many graphical video apps that use FFmpeg under the hood. We won't be using any of FFmpeg's fancy video editing functionality though; we just want to use it to run videos through codecs. FFmpeg is also open source, which means the original source code for the application is available for us to work with. This is key, because we'll be hacking the codecs (which are themselves part of the FFmpeg source code) and then compiling a version of FFmpeg with those hacked codecs built right in.
We're going to download the FFmpeg source code using git in the terminal, but it's important that your terminal is currently inside the folder you want to download FFmpeg into. You can check what folder (aka directory) you are inside of by running pwd (which stands for "print working directory") in your terminal. If that's not the folder you want to be in, you can use cd to "change directory". For example, say you want to work on your desktop. You can 'cd' into it by running cd /home/username/Desktop on Linux or cd /Users/username/Desktop on Mac (where you would replace "username" with your own username). Once there, it would be a good idea to create a folder for all the work we're going to do. Let's call it "gct3" (for this new v3 of the Glitch Codec Tutorial). You can create that folder like you always do, or you could use the terminal and run mkdir gct3. Once it's created, let's 'cd' into this new folder by running cd gct3.
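The create-then-enter step can also be chained together. This is just a sketch: enter_workdir is a name I made up for this example, and it treats whatever folder your terminal is currently in as the parent of gct3.

```shell
#!/bin/sh
# Create a gct3 folder inside the given parent folder, move into it,
# and print where we ended up. "enter_workdir" is a hypothetical helper.
enter_workdir() {
    # -p means mkdir won't complain if gct3 already exists;
    # && means each step only runs if the one before it succeeded.
    mkdir -p "$1/gct3" && cd "$1/gct3" && pwd
}

# Use the folder the terminal is currently in as the parent.
enter_workdir "$(pwd)" || echo "could not create gct3 in $(pwd)"
```

If you cd to your desktop first, the printed path should end in Desktop/gct3.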
Once inside the gct3 folder (use pwd to confirm), we can download the FFmpeg source code:
git clone https://git.ffmpeg.org/ffmpeg.git
When it's finished downloading you should see a folder called "ffmpeg" inside gct3. Next we'll create another couple of folders, one called "glitchcodec" (mkdir glitchcodec) and another called "bin" (mkdir bin). Your gct3 folder should now have those three folders inside it. We're now ready to compile FFmpeg. First cd into the ffmpeg folder by running cd ffmpeg. Once there, run the configuration file like this on Linux:
./configure --enable-nonfree --prefix=/home/username/Desktop/gct3/glitchcodec
Or like this on Mac (in both cases, replacing "username" with your own username like before):
./configure --enable-nonfree --prefix=/Users/username/Desktop/gct3/glitchcodec
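If you'd rather not type the path by hand, here's a minimal sketch of one way to build the --prefix value from the folder you're already in. It assumes you run it from inside the ffmpeg folder, and prefix_for is a helper name made up for this example. It only prints the configure command so you can look it over before running it yourself.

```shell
#!/bin/sh
# Given the path of the ffmpeg folder, print the path of the sibling
# glitchcodec folder. "prefix_for" is a hypothetical helper name.
prefix_for() {
    # dirname strips the last part of the path ("ffmpeg"), leaving gct3.
    echo "$(dirname "$1")/glitchcodec"
}

# Print (but do not run) the configure command for the current folder.
echo "./configure --enable-nonfree --prefix=$(prefix_for "$(pwd)")"
```

For example, prefix_for /home/username/Desktop/gct3/ffmpeg prints /home/username/Desktop/gct3/glitchcodec.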
Once the configuration has finished, run make. This process will take a few minutes. Once it's complete, run make install, which should take a little less time. When that's finished you should see that some new folders were created inside the "glitchcodec" folder. If not, you may have mistyped something in the initial configuration step. You can revisit commands you previously entered into your terminal by pressing the up arrow. Find the configure command from before and compare it closely to the command above (make sure you have the path correct for your particular system: it should match the output of pwd, replacing the last part "ffmpeg" with "glitchcodec"). Assuming the new folders did get created inside your glitchcodec folder, the last thing we need to do to complete our setup is create a symlink inside our gct3/bin folder to our compiled "glitchcodec" app like this:
ln -s /home/username/Desktop/gct3/glitchcodec/bin/ffmpeg /home/username/Desktop/gct3/bin/glitchcodec
Or like this on Mac (again, replacing "username" with your own username):
ln -s /Users/username/Desktop/gct3/glitchcodec/bin/ffmpeg /Users/username/Desktop/gct3/bin/glitchcodec
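Before moving on, it can be worth confirming that both the compiled app and the symlink ended up where we expect. Here's a rough sketch of such a check. GCT and check_setup are names I made up for this example, and the default path assumes the Desktop location used above, so adjust it to wherever your gct3 folder actually lives.

```shell
#!/bin/sh
# GCT is a hypothetical variable for this sketch: the path to your gct3 folder.
GCT="${GCT:-$HOME/Desktop/gct3}"

# Report whether the compiled app and the symlink exist inside a gct3 folder.
check_setup() {
    if [ -x "$1/glitchcodec/bin/ffmpeg" ]; then
        echo "binary: ok"
    else
        echo "binary: missing (re-check the --prefix path, then run make install)"
    fi
    if [ -L "$1/bin/glitchcodec" ]; then
        # readlink prints the path the symlink points at.
        echo "symlink: ok -> $(readlink "$1/bin/glitchcodec")"
    else
        echo "symlink: missing (re-run the ln -s command)"
    fi
}

check_setup "$GCT"
```

If the symlink line shows a path that doesn't match the binary's location, delete the link inside gct3/bin and create it again.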
Now comes the fun part! We've finished reproducing the Glitch Codec Tutorial environment (which I used to distribute on the DVD) on our own computer. The "intermediate" phase is done, and at this point you could follow along with the same steps I lay out in the original video series. But since you're here, let's keep going. Before we glitch stuff, it's always a good idea to make sure things are working like they're supposed to. Let's cd into the bin folder. Assuming you are still inside the ffmpeg folder, you can run cd ../bin to go back a directory and then into bin. You'll also want to copy some video files into the bin folder, or download some off the internet. There are loads of online tools and browser plugins for downloading videos from streaming sites like YouTube, but because we're using the terminal I'll recommend youtube-dl, my favorite command line tool for downloading videos from the Internet. Let's assume you have a video called "cat.mp4" inside your bin folder (which is your current working directory, the one pwd prints). Run the following:
./glitchcodec -i cat.mp4 cat2.mp4
After a bit of processing (the longer the video file, the longer this takes) you should see a new file called "cat2.mp4" appear inside your bin folder. If it plays exactly the same as the original cat.mp4, then things are working! At this moment our "glitchcodec" app is exactly the same as ffmpeg (you can run any/all ffmpeg commands with it).
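Since we'll be running that same command over and over, here's a small sketch of how the sanity check might be scripted. check_output is a name made up for this example. The -y flag is a standard ffmpeg option that overwrites an existing output file without asking, which saves you from answering the overwrite prompt each time cat2.mp4 already exists.

```shell
#!/bin/sh
# Confirm that an encode produced a non-empty file.
# "check_output" is a hypothetical helper name.
check_output() {
    if [ -s "$1" ]; then
        echo "$1: ok ($(wc -c < "$1" | tr -d ' ') bytes)"
    else
        echo "$1: missing or empty"
        return 1
    fi
}

# Only attempt the encode if the symlink and the input video are both here.
if [ -x ./glitchcodec ] && [ -f cat.mp4 ]; then
    ./glitchcodec -y -i cat.mp4 cat2.mp4 && check_output cat2.mp4
fi
```

A non-empty file only tells you the encode finished. You still need to play cat2.mp4 to see whether it looks like the original.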
We can now start hacking the codec source files. Using a code editor (like Atom, Sublime or VSCode), open up a file called "h263data.c" located inside the libavcodec folder of the ffmpeg folder (gct3/ffmpeg/libavcodec/h263data.c). This codec file, written in the programming language C, is used by default when compressing .mp4 files. You should notice various sections with numbers in them, like the ones on lines 34 and 35 (the line numbers may differ in newer versions of the source), which look like:
const uint8_t ff_h263_intra_MCBPC_code[9] = { 1, 1, 2, 3, 1, 1, 2, 3, 1 };
const uint8_t ff_h263_intra_MCBPC_bits[9] = { 1, 3, 3, 3, 4, 6, 6, 6, 9 };
Go ahead and change some of those numbers at random. Now that we've made a change to the source, we'll need to recompile our glitchcodec. Rather than having to cd back and forth between our bin folder and our ffmpeg folder, it might be a good idea to open a second terminal where the working directory is ffmpeg and keep your other terminal inside of bin. In the terminal that's inside ffmpeg, run make install to recompile the glitchcodec. Then, back in the terminal that's inside bin, let's run the same command from before:
./glitchcodec -i cat.mp4 cat2.mp4
This time our cat2.mp4 video should have been automatically glitched! If it's not, you may need to change a few more of those numbers inside the h263data.c file. We can now repeat this process, hacking some of the other codecs inside of ffmpeg/libavcodec. For example, we can hack the prores codec (ffmpeg/libavcodec/proresdata.c) and then, like before, recompile by running make install in the ffmpeg terminal. Then, in the bin terminal, we can run our cat video through this particular codec (instead of the default h263) by selecting it like this (note the .mov extension on the output file):
./glitchcodec -i cat.mp4 -c:v prores cat3.mov
That's pretty much it. You can now keep repeating this process: hacking the source code of a particular codec, recompiling the app and running a video through it.
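Random edits can easily leave you with a codec file you can't remember how to undo, so it may help to keep an untouched copy of each file before hacking it. Here's a minimal sketch of that idea. backup_source and restore_source are names I made up for this example, and the path at the bottom assumes you run it from inside the ffmpeg folder.

```shell
#!/bin/sh
# Keep an untouched copy of a source file next to it (only the first time,
# so a later call can't overwrite the clean copy with a hacked one).
backup_source() {
    [ -f "$1.orig" ] || cp "$1" "$1.orig"
}

# Put the untouched copy back, undoing every hack made to the file.
restore_source() {
    cp "$1.orig" "$1"
}

# Back up the h263 tables before editing them, if the file is here.
if [ -f libavcodec/h263data.c ]; then
    backup_source libavcodec/h263data.c
fi
```

After running restore_source libavcodec/h263data.c you'd still need to run make install again before the glitchcodec goes back to normal. Because the ffmpeg folder was downloaded with git, running git diff inside it should also show you every number you've changed so far.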