The internet is filled with guides and how-tos for getting video onto your iPhone. According to the specs, the iPhone supports H.264, Baseline profile, level 3.0. Translated, this means:
- No B-frames
- No CABAC
- No weighted predictions
- No 8×8 DCT
- Max resolution around 640×640 (technically 1620 MacroBlocks, 16×16 each)
- Max 25fps at that resolution (technically 40500 MacroBlocks per second)
- Max 10Mbps
The iPhone imposes some extra limitations:
- Max 640×480, 30fps
- Max 2.5Mbps
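The macroblock limits above are easy to check by hand, but a small helper makes it quicker to try other resolutions. This is a minimal sketch that only covers the two macroblock limits quoted in the list (1620 per frame, 40500 per second); it ignores bitrate and the iPhone's own extra limits, and the function names are made up for illustration.

```python
import math

# Level 3.0 limits as quoted in the list above.
MAX_MBS_PER_FRAME = 1620
MAX_MBS_PER_SECOND = 40500


def macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover a width x height frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def fits_level_3_0(width, height, fps):
    """True if the frame size and frame rate stay within both macroblock limits."""
    mbs = macroblocks(width, height)
    return mbs <= MAX_MBS_PER_FRAME and mbs * fps <= MAX_MBS_PER_SECOND


print(macroblocks(640, 480))         # 1200
print(fits_level_3_0(640, 480, 30))  # True  (36000 macroblocks per second)
print(fits_level_3_0(640, 640, 30))  # False (48000 macroblocks per second)
```

So 640×480 at 30fps fits within level 3.0, while a 640×640 frame only fits at 25fps or less.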
Most guides on the internet additionally force the number of reference frames down to 1 (ffmpeg’s -refs parameter), although I could not find any spec sheet imposing this limit. So I decided to test this.
Testing the number of usable reference frames isn’t easy. For starters, you can’t force an encoder to use a certain frame as a reference; you can only allow it to use up to a given number of reference frames if it sees fit. However, by hand-picking the input frames, you can create a frame sequence that will almost certainly use the reference frame you want. Simply take N frames from different shots and concatenate them in a loop, so that frame N is exactly the same as frame 0, frame N+1 the same as frame 1, and so on. The encoder will try to minimize the residual image by choosing the reference frame that is most similar. Since the N frames are totally different, the encoder will almost certainly choose frame I-N as the reference for frame I.
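The looping trick can be illustrated without any video at all. The sketch below only tracks which source picture appears at each position; it does not encode anything, and the helper names are hypothetical. It shows that the most recent identical frame always sits exactly N positions back, which is the frame an encoder would be expected to pick.

```python
def looped_sequence(num_shots, total_frames):
    """For each output frame, the index of the source picture shown there."""
    return [i % num_shots for i in range(total_frames)]


def nearest_identical_frame(sequence, i):
    """Index of the most recent earlier frame identical to frame i, or None."""
    for j in range(i - 1, -1, -1):
        if sequence[j] == sequence[i]:
            return j
    return None


seq = looped_sequence(num_shots=6, total_frames=18)
print(seq[:8])                           # [0, 1, 2, 3, 4, 5, 0, 1]
print(nearest_identical_frame(seq, 10))  # 4, i.e. frame I-N with I=10, N=6
print(nearest_identical_frame(seq, 3))   # None: the first loop has no match
```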
Upon decoding the stream, the decoder will fetch the referenced frame from its Decoded Picture Buffer (DPB) and work from there. That is, as long as the referenced frame is still available. If the referenced frame is no longer available, the decoding will fail. Depending on the decoder, it might stop, crash, or attempt to cover up its mistake. The iPhone implements the last option. It seems to use a “wrong” reference frame and work from there.
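The failure mode described above can be mimicked with a toy model. The sketch below assumes a plain sliding-window DPB in which every decoded frame is kept as a reference and the oldest one is dropped first; real decoders (the iPhone's included) may manage the buffer differently. The function name and parameters are made up for illustration.

```python
from collections import deque


def reference_available(total_frames, ref_distance, dpb_size):
    """For each frame, report whether frame (i - ref_distance) is still in the DPB.

    Frames in the first loop have no earlier copy to refer to and are
    treated as decodable (assumed to be coded without that reference).
    """
    dpb = deque(maxlen=dpb_size)  # the oldest frame falls out automatically
    result = []
    for i in range(total_frames):
        needed = i - ref_distance
        result.append(needed < 0 or needed in dpb)
        dpb.append(i)  # the decoded frame becomes a reference itself
    return result


# A loop of 6 different frames decodes cleanly with a 6-frame DPB ...
print(all(reference_available(30, ref_distance=6, dpb_size=6)))  # True
# ... but a loop of 7 frames needs a reference that was already evicted.
print(all(reference_available(30, ref_distance=7, dpb_size=6)))  # False
```

Under these assumptions, the longest loop that still decodes without artifacts equals the number of frames the decoder actually keeps.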
After some trial and error, it seems that the iPhone keeps 6 frames for reference, which is what a level 3.0 decoder should do (6.59 according to the spec).
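The arithmetic behind that figure can be sketched as follows, assuming 8-bit 4:2:0 video (1.5 bytes per pixel), frame dimensions that are multiples of 16, and the usual cap of 16 frames. The level 3.0 DPB size is listed as 3037.5 in the spec's level table; the 6.59 quoted above matches reading that as units of 1000 bytes, while a reading in units of 1024 bytes gives 6.75. Both constants are passed in explicitly because the two readings agree at 640×480 but can differ for smaller frames such as 512×288.

```python
DPB_BYTES_1000 = 3037500  # 3037.5 read as units of 1000 bytes
DPB_BYTES_1024 = 3110400  # 3037.5 read as units of 1024 bytes


def dpb_frames(width, height, dpb_bytes):
    """Fractional number of 4:2:0 frames that fit in dpb_bytes."""
    frame_bytes = width * height * 3 // 2  # luma plus two quarter-size chroma planes
    return dpb_bytes / frame_bytes


def max_ref_frames(width, height, dpb_bytes):
    """Whole frames that fit in the DPB, capped at 16."""
    return min(int(dpb_frames(width, height, dpb_bytes)), 16)


print(round(dpb_frames(640, 480, DPB_BYTES_1000), 2))  # 6.59
print(round(dpb_frames(640, 480, DPB_BYTES_1024), 2))  # 6.75
print(max_ref_frames(640, 480, DPB_BYTES_1000))        # 6
print(max_ref_frames(512, 288, DPB_BYTES_1000))        # 13
print(max_ref_frames(512, 288, DPB_BYTES_1024))        # 14
```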
The above was for 640×480 frames. Since the DPB is specified in bytes, smaller frame sizes should result in more reference frames. My iPhone 3G 3.0.1, however, does NOT play this correctly. This is the result with 512×288 video (the spec says 13 reference frames in the DPB):
Strangely, an iPod Touch plays it just fine… So here I’d like to call for help:
- Your hardware type (2G, 3G, 3GS, Touch, …)
- Your model number (Settings -> General -> About -> Model)
- Your version number (Settings -> General -> About -> Version)
- Does it play with or without artifacts
- Anything else that might be important
- iPhone 2G (MB346F) 3.1.2 NOT OK
- iPhone 3G (MB496NF) 3.0.1 NOT OK
- iPhone 3G (MB501DN) 3.1.2 NOT OK
- iPhone 3G (MB489NF) 3.1.2 NOT OK
- iPod Touch 1st gen (MB376C) 3.1.2 NOT OK
- iPod Touch 1st gen (MA623ZD) 3.1.2 NOT OK
- iPod Touch 2nd gen (MB528KN) 2.2.1 OK
- iPod Touch 3rd gen (MC008MF) 3.1.2 OK
- iPhone 3GS (MC132NF) 3.1.2 OK, confirmed twice