Multimedia
- Introduction
- Audio
- Graphics
- RGB
- Bitmap Format
- Image Compression
- Image File Formats
- “Enhance”
- Video Compression
- Video File Formats
- 3D Video
Introduction
- Odds are you use it every day, but what is it?
Audio
- Computers are good at recording, playing back, and generating audio
- Uses different file formats
    - File formats are just a way of storing 0s and 1s on disk so that certain software knows how to interpret it
 
- MIDI
    - Way of storing musical notes for certain songs
    - Can do this for different instruments
        - Programs can render the notes for these instruments
 
 
- GarageBand
    - Included with macOS
    - This is the Star Wars theme in MIDI
        - Doesn’t sound quite as good as the actual version
        - Computer synthesizes the notes
            - Not an actual recording
        - Computer interprets notes in the MIDI file
 
 
 
- MIDI is common in the digital workspace among musicians who wish to share music with each other.
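To make the difference between stored notes and a recording concrete, here is a minimal sketch of how a program might turn note numbers into sound. It assumes the conventional tuning in which MIDI note 69 is the A at 440 Hz, a sample rate of 44,100, and a plain sine wave as the "instrument". The three-note `song` list is a hypothetical example, not a real MIDI file.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (assumed value for this sketch)

def midi_note_to_frequency(note: int) -> float:
    """Convert a MIDI note number to Hz, assuming note 69 = A at 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

def synthesize(note: int, seconds: float) -> list[float]:
    """Render one note as a sine wave: a list of samples between -1 and 1."""
    frequency = midi_note_to_frequency(note)
    count = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * frequency * i / SAMPLE_RATE) for i in range(count)]

# Hypothetical "song": (MIDI note number, duration in seconds) pairs.
song = [(60, 0.5), (64, 0.5), (67, 1.0)]

# The computer generates every sample from the notes; nothing was recorded.
samples = [s for note, seconds in song for s in synthesize(note, seconds)]
print(len(samples))  # 88200 samples for two seconds of audio
```

The file only needs to store the three pairs in `song`; the 88,200 samples are computed at playback time, which is why a synthesized rendition sounds different from a recorded performance.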
- Humans typically like to hear music performed and recorded by humans
    - File formats for recorded music include:
        - AAC
        - MIDI
        - MP3
        - WAV
 
 
- WAV is an early sound format, but still used
    - Uncompressed data storage allowing high quality
 
- MP3
    - File format for audio that uses compression
        - Significantly reduces how many bits are necessary to store a song
        - Discards 0s and 1s that humans can’t necessarily hear
            - True audiophiles may disagree
        - Trade-off between optimizing storage space and sacrificing quality
        - This compression is said to be lossy
            - Losing quality in the compression process
 
 
 
- AAC
    - Similar to MP3
    - You may see this format when you download a song from iTunes
 
- Streaming services such as Spotify don’t transfer a file to you but rather stream bits of information to you
- How do we think about the quality of these formats?
    - Sampling frequency
        - Number of times per second we take a digital snapshot of what a person would hear
    - Bit depth
        - Number of bits used for each of these snapshots
    - Sampling frequency × bit depth = number of bits necessary to store one second of music
- Audio file formats allow you to modify what these parameters are
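The sampling frequency and bit depth arithmetic can be worked through in a few lines. This is a minimal sketch: the values 44,100 snapshots per second and 16 bits per snapshot are assumed example parameters, and the calculation covers a single uncompressed channel of audio.

```python
def bits_per_second(sampling_frequency: int, bit_depth: int) -> int:
    """Bits needed to store one second of uncompressed, single-channel audio."""
    return sampling_frequency * bit_depth

def bytes_for_duration(sampling_frequency: int, bit_depth: int, seconds: int) -> int:
    """Bytes needed for a clip of the given length (8 bits per byte)."""
    return bits_per_second(sampling_frequency, bit_depth) * seconds // 8

# Assumed example parameters: 44,100 snapshots per second, 16 bits each.
print(bits_per_second(44_100, 16))          # 705600 bits per second
print(bytes_for_duration(44_100, 16, 180))  # 15876000 bytes for three minutes
```

Lowering either parameter shrinks the file, at the cost of fewer or coarser snapshots.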
 
Graphics
- A graphic, what we see with multimedia, is really just a grid of pixels arranged in rows and columns
    - All image file formats are rectangular in nature, though transparent pixels can make images appear to take on other shapes
- In the simplest form, each of the dots, or pixels, is represented by 0s and 1s
- To create a file format, we just need to determine a mapping
 
- This image is only black and white, so how to represent color?
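The idea of a mapping from bits to pixels can be sketched as follows. The 8×8 pattern and the choice that `1` means a filled pixel and `0` an empty one are assumptions made for illustration; they are not taken from any particular file format.

```python
# Hypothetical 8x8 image: one bit per pixel, one string per row.
BITS = [
    "00111100",
    "01000010",
    "10100101",
    "10000001",
    "10100101",
    "10011001",
    "01000010",
    "00111100",
]

def render(rows: list[str], on: str = "#", off: str = ".") -> str:
    """Apply the mapping: draw `on` for each 1 bit and `off` for each 0 bit."""
    return "\n".join(
        "".join(on if bit == "1" else off for bit in row) for row in rows
    )

print(render(BITS))
```

Any program that agrees on the same mapping can reproduce the same picture from the same 64 bits.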
RGB
- RGB stands for Red Green Blue
    - With information giving an amount of red, an amount of green, and an amount of blue, you can tell a computer how to colorize pixels
- None of the colors yields a black pixel
- All of the colors yields a white pixel
- In between these two options is where we get all sorts of colors
 
- Consider the three bytes: 11111111 00000000 00000000
    - If we interpret these bytes to represent colors, it appears we want all of the red, none of the green, and none of the blue
- These 24 bits (3 bytes =  3 x 8 bits = 24 bits) represent the color we know as red!
        - If a computer wanted to represent this color, it would store these 24 bits
 
 
- Consider the three bytes: 00000000 11111111 00000000
    - Green
 
- Consider the three bytes: 00000000 00000000 11111111
    - Blue
 
- Consider the three bytes: 00000000 00000000 00000000
    - Black
 
- Consider the three bytes: 11111111 11111111 11111111
    - White
 
- Can get many color variations by mixing the above colors in different quantities
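As an illustration of reading three bytes as amounts of red, green, and blue, here is a small sketch. The function name `parse_rgb` is made up for this example.

```python
def parse_rgb(bits: str) -> tuple[int, int, int]:
    """Interpret three space-separated binary bytes as (red, green, blue)."""
    red, green, blue = (int(byte, 2) for byte in bits.split())
    return red, green, blue

print(parse_rgb("11111111 00000000 00000000"))  # (255, 0, 0): all red
print(parse_rgb("00000000 00000000 00000000"))  # (0, 0, 0): black
print(parse_rgb("11111111 11111111 11111111"))  # (255, 255, 255): white
print(parse_rgb("11111111 11111111 00000000"))  # (255, 255, 0): red mixed with green
```

Each channel ranges from 0 (none of that color) to 255 (all of it), so mixing different amounts gives the in-between colors.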
- When we talk about image formats, we typically don’t talk in terms of binary but rather something called hexadecimal (base-16, contains 16 digits)
    - 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f
- 0 is the smallest number we can represent in a single digit
- f is the largest number (value of 15) we can represent in a single digit
 
- Consider the 8 bits: 1111 1111
    - Each hexadecimal digit represents four bits
- One hexadecimal digit can represent the first four bits, another can represent the second four
        - Represent something with eight symbols using only two!
 
- 1111 is the decimal number 15, which is f
- Therefore, 1111 1111 in hexadecimal is ff
 
- Red can thus be represented in hexadecimal as ff 00 00
- Green can be represented in hexadecimal as 00 ff 00
- Blue can be represented in hexadecimal as 00 00 ff
- A lot of graphical editing software, such as Photoshop, uses hexadecimal to represent colors
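The conversion from bits to hexadecimal can be sketched like this: split each byte into two groups of four bits and look up one hexadecimal digit for each group. The helper names are hypothetical.

```python
HEX_DIGITS = "0123456789abcdef"

def byte_to_hex(bits: str) -> str:
    """Convert eight bits such as '11111111' into two hexadecimal digits."""
    first_four, second_four = bits[:4], bits[4:]
    return HEX_DIGITS[int(first_four, 2)] + HEX_DIGITS[int(second_four, 2)]

def rgb_to_hex(red: int, green: int, blue: int) -> str:
    """Format three 0-255 channel values as space-separated hex pairs."""
    return f"{red:02x} {green:02x} {blue:02x}"

print(byte_to_hex("11111111"))  # ff
print(rgb_to_hex(255, 0, 0))    # ff 00 00 (red)
print(rgb_to_hex(0, 0, 255))    # 00 00 ff (blue)
```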
Bitmap Format
- This background for Windows XP was a bitmap file (.bmp)

- A mapping or grid of bits much like the smiley face from before
- Zooming in on this image shows that it is just a grid of dots

- Notice the pixelation
- Much like with audio, in the world of images you have discretion over how many bits to use
    - How many bits to represent each pixel’s color?
    - Resolution is another factor
        - An image that is only 100 pixels, when scaled up, only duplicates the existing limited information, resulting in a blotchy image
        - Would be better to start with an image that has a higher resolution (more pixels)
 
 
- A lot of repeated colors, so it seems silly to represent each color with the same number of bits
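The point about scaling up a low-resolution image can be seen in a short sketch. It assumes the simplest possible approach, in which each pixel is just copied `factor` times in each direction; real software may use smoother methods, but none of them can add detail that was never stored.

```python
def scale_up(pixels: list[list[int]], factor: int) -> list[list[int]]:
    """Enlarge an image by repeating every pixel `factor` times in each direction."""
    scaled = []
    for row in pixels:
        wide_row = [value for value in row for _ in range(factor)]
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled

tiny = [
    [0, 1],
    [1, 0],
]
for row in scale_up(tiny, 2):
    print(row)
# [0, 0, 1, 1]
# [0, 0, 1, 1]
# [1, 1, 0, 0]
# [1, 1, 0, 0]
```

The result has four times as many pixels but exactly the same information, which is why it looks blocky.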
Image Compression
- Graphical file formats can often be compressed
- Can be done lossily or losslessly
    - With audio, we threw away audio information that the human ear can’t necessarily hear
        - This is lossy compression; throwing information away
 
- Using fewer bits to represent the same information is lossless compression
 
- Lossless compression
- There is a lot of repeated blue in the first image
    - Using the same 24 bits to represent each pixel!
 
- The second image is compressed and not what a user would see
    - The first column contains the color that the rest of the row (scan line) should have
        - Image contains instructions on how to repeat the color in a particular row
 
- When a color is encountered that isn’t in the first column (the apple in this case), the instructions would list the colors for each non-repeated pixel
- This uses fewer bits but keeps the original information recoverable
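One common way to store "repeat this color" instructions is run-length encoding, sketched below for a single scan line. This is an illustration of the general idea and not necessarily the exact scheme pictured above; the color names stand in for 24-bit values.

```python
def compress_row(row: list[str]) -> list[tuple[str, int]]:
    """Collapse each run of identical colors into a (color, count) pair."""
    runs: list[list] = []
    for color in row:
        if runs and runs[-1][0] == color:
            runs[-1][1] += 1          # same color as before: extend the run
        else:
            runs.append([color, 1])   # new color: start a new run
    return [(color, count) for color, count in runs]

def decompress_row(runs: list[tuple[str, int]]) -> list[str]:
    """Rebuild the original scan line from its (color, count) pairs."""
    return [color for color, count in runs for _ in range(count)]

row = ["blue"] * 5 + ["red", "green"] + ["blue"] * 3
runs = compress_row(row)
print(runs)  # [('blue', 5), ('red', 1), ('green', 1), ('blue', 3)]
print(decompress_row(runs) == row)  # True: nothing was lost
```

Ten pixels shrink to four pairs, and decompressing gives back exactly the original row, which is what makes the compression lossless.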
 
- Lossy compression
- This is a .jpg photograph that is somewhat compressed, though it is not easy to tell
- Let’s say we want to compress this image further so that we can share it without going over a social media platform’s limit
- It contains more complicated patterns of colors, so let’s try a lossy compression resulting in the following:
- Lossy compression means that I won’t be able to get that original image back
    - The compression throws away bits of information
        - “Does the sky really need this many shades of blue?”
        - “Does this leaf really need this many shades of green?”
            - Replaces bits with only a few colors giving an approximation
        - I will not be able to know how clear the sky used to be from this information
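The "fewer shades of blue" idea can be sketched as color quantization: snapping each channel to one of a few allowed values. This is a simplified stand-in for illustration and is not how JPEG itself works internally. The pixel values and the choice of four levels per channel are assumptions.

```python
def quantize_channel(value: int, levels: int) -> int:
    """Snap a 0-255 channel value to the nearest of `levels` evenly spaced values."""
    step = 255 / (levels - 1)
    return round(round(value / step) * step)

def quantize_pixel(pixel: tuple[int, int, int], levels: int = 4) -> tuple[int, int, int]:
    """Quantize the red, green, and blue channels of one pixel."""
    red, green, blue = pixel
    return (
        quantize_channel(red, levels),
        quantize_channel(green, levels),
        quantize_channel(blue, levels),
    )

# Three hypothetical, slightly different shades of sky blue.
sky = [(100, 150, 230), (102, 152, 228), (98, 149, 231)]
approximation = [quantize_pixel(pixel) for pixel in sky]
print(approximation)  # [(85, 170, 255), (85, 170, 255), (85, 170, 255)]
```

All three shades collapse into one, so there is no way to tell from the result which shade each pixel originally had.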
 
 
 
Image File Formats
- BMP
    - Originally used by Windows
- Not super common these days
 
- GIF
    - Low quality images
        - Only supports 8-bit color
 
    - Often used for memes
        - Can be animated
            - Like a video file with only a few images
 
 
- JPEG
    - Supports 24-bit color
    - Uses lossy compression
        - Can minimize amount of compression to create high quality photos
 
 
- PNG
    - High quality graphics
- Supports 24-bit color
 
- All these formats ultimately have a limited amount of information
    - Ultimately, they just store the pixels and colors captured when the image was taken
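The difference between 8-bit and 24-bit color comes down to how many distinct values the bits can represent, which is quick to compute. The function name is hypothetical.

```python
def color_count(bits_per_pixel: int) -> int:
    """Number of distinct colors that fit in the given number of bits."""
    return 2 ** bits_per_pixel

print(color_count(8))   # 256 colors with 8-bit color
print(color_count(24))  # 16777216 colors with 24-bit color
```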
 
“Enhance”
- Popular culture commonly abuses what it means to be a multimedia format
    - “Enhancing” supposedly means making an image as clear as possible no matter what format it was saved in
 
- David shows a clip of characters “enhancing” an image
    - The characters zoom into a pixelated frame of a video and somehow clear it up to see a reflection
        - Video is just a whole bunch of images being shown to us quickly (24 frames per second, etc.)
 
- The pixelated image only contains information for those pixels
        - There is no way to obtain a clear image unless the original image was already at a high resolution
 
 
- David contrasts this with a more self-aware clip from Futurama
Video Compression
- You can think of a video format as similar to a flip book
- Video formats are just a bunch of images shown quickly in succession to create the illusion of motion
    - Not all of the information is necessarily stored as PNG, JPEG, or GIF files, or even as images at all
- Algorithms and mathematics can help go from one frame to another
 
- Opportunities for compression
    - Can leverage same image compression techniques for each frame (intra-frame coding)
    - Background of multiple frames can contain redundant information
    - Compare current frame and next frame of video and determine what has changed (inter-frame coding)
        - Store these differences
        - Key frames store a snapshot in time to remember what the video looks like
        - In each subsequent frame, remember only what has changed
            - Using algorithms and math, the background is redrawn
        - Key frames are stored at multiple points in the video to guarantee that frames can be recovered
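The key-frame-plus-differences idea can be sketched with tiny one-dimensional "frames". This is a deliberately simplified illustration, assuming every frame has the same number of pixels and that only one key frame is stored; real codecs are far more sophisticated.

```python
Frame = list[int]

def diff(previous: Frame, current: Frame) -> dict[int, int]:
    """Record only the pixels that changed: position -> new value."""
    return {i: new for i, (old, new) in enumerate(zip(previous, current)) if old != new}

def apply_changes(previous: Frame, changes: dict[int, int]) -> Frame:
    """Rebuild the next frame from the previous frame plus its changes."""
    frame = list(previous)
    for position, value in changes.items():
        frame[position] = value
    return frame

def encode(frames: list[Frame]) -> tuple[Frame, list[dict[int, int]]]:
    """Store the first frame whole (the key frame) and differences after it."""
    return list(frames[0]), [diff(a, b) for a, b in zip(frames, frames[1:])]

def decode(key_frame: Frame, deltas: list[dict[int, int]]) -> list[Frame]:
    """Replay the differences on top of the key frame to recover every frame."""
    frames = [list(key_frame)]
    for changes in deltas:
        frames.append(apply_changes(frames[-1], changes))
    return frames

# A 9 "moves" left across an unchanging background of 0s.
video = [[0, 0, 0, 0, 9], [0, 0, 0, 9, 0], [0, 0, 9, 0, 0]]
key_frame, deltas = encode(video)
print(deltas)  # [{3: 9, 4: 0}, {2: 9, 3: 0}]
print(decode(key_frame, deltas) == video)  # True
```

Each difference stores two pixels instead of five, because the unchanged background is never repeated.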
 
 
Video File Formats
- In the world of video, there are more solutions on how to store information
- Video file formats are containers
    - A container is a digital wrapper in which you can put multiple types of data
    - Can include a video track, an audio track, a secondary audio track (for different languages), closed captions, …
 
- AVI
    - Commonly used in Windows
 
- DivX
- Matroska
    - Open source container meant to be more versatile
 
- MP4
    - Pretty much universal in all browsers
 
- QuickTime
    - Commonly used in macOS
 
- Codecs
    - Ways of storing and encoding information
    - For video:
        - H.264
        - MPEG-4 Part 2
        - …
    - For audio:
        - Can be standalone files or tracks in a container!
        - AAC
        - MP3
        - …
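The relationship between containers, tracks, and codecs can be modeled as a small data structure. This is only a conceptual sketch: the class names, the fields, and the `"text"` label for the caption track are hypothetical and do not reflect how any real container is laid out on disk.

```python
from dataclasses import dataclass, field

@dataclass
class Track:
    kind: str            # "video", "audio", or "captions"
    codec: str           # how this track's data is encoded, e.g. "H.264"
    language: str = ""   # optional, e.g. for alternate audio tracks

@dataclass
class Container:
    format: str                                   # e.g. "MP4"
    tracks: list[Track] = field(default_factory=list)

    def tracks_of(self, kind: str) -> list[Track]:
        """Return every track of one kind, such as all audio tracks."""
        return [track for track in self.tracks if track.kind == kind]

# Hypothetical movie file: one video track, two audio languages, captions.
movie = Container("MP4", [
    Track("video", "H.264"),
    Track("audio", "AAC", "en"),
    Track("audio", "AAC", "es"),
    Track("captions", "text", "en"),  # "text" is a placeholder label
])
print([track.language for track in movie.tracks_of("audio")])  # ['en', 'es']
```

The container says what is inside the file; each track's codec says how that piece is encoded.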
 
 
3D Video
- Increasingly, 3D formats are becoming more common
- This is a 360 degree image of Sanders Theatre
    - A spherical image
    - Looks distorted in 2D
    - Like flattening a globe
 
 
- Images can contain metadata
    - Information that viewers can’t see
- Tells programs, applications, and browsers how to display the image
 
- With sensors on a headset, users can experience virtual reality
- More file formats are still on the horizon, but ultimately all of them boil down to storing 0s and 1s!