Trying to record real-time audio to MP4 containing AAC
2018-06-07 15:56:30
New to this forum, hoping I can get some help.

I have a real-time media server which supports a number of RTP media stream formats. I need to record these to MP4 (we also support video, though that is not important for this issue) or M4A containing AAC. The resulting file must be playable both for real-time streaming over RTP and in standard media players, including browsers.

The input RTP can use various codecs and various sample rates (a limited set from 8 kHz to 48 kHz), may contain 1 or 2 channels, etc. Whatever the input, it is depacketized and decoded to 16-bit linear PCM in 10 msec chunks / frames. I am attempting to use libfdk-aac for encoding and libavformat for creating the MP4 or M4A container and writing the frames to it.

My limited understanding is that I need to encode to AAC-LC to allow the resulting file to be played back in standard players. It seems the libfdk-aac encoder, configured for LC, is limited to frames of 1024 samples. Is this true? I'm used to other encoders, which tend to work in multiples of 10 msec frame sizes, and those sizes work well within the real-time architecture we use.

Using the default of 1024, the resulting MP4 contains frames of varying time lengths / sample sizes depending on the input sample rate:

- For 16 kHz / mono input frames (160 samples every 10 msec), the resulting MP4 frames are 1280 samples or 80 msec.
- For 48 kHz / mono input frames (480 samples every 10 msec), the resulting MP4 frames are 1920 samples or 40 msec.

I'm not sure whether this makes sense, or whether there is some way of configuring the encoder so that the resulting frames are 10 or 20 msec no matter the input sampling rate. The relevant encoder parameter appears to be:

AACENC_GRANULE_LENGTH = 0x0105, /*!< Core encoder (AAC) audio frame length in samples:
                                     - 1024: Default configuration.
                                     - 512: Default LD/ELD configuration.
                                     - 480: Optional length in LD/ELD configuration. */

Any feedback or pointers to additional info would be greatly appreciated.

Thanks in advance,
Bob
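
To make the frame-size mismatch concrete, here is a minimal sketch of one common way to bridge 10 msec PCM chunks and a fixed 1024-sample encoder frame: buffer the chunks and hand a frame to the encoder only when 1024 samples per channel have accumulated. It assumes the encoder wants exactly 1024 samples per channel per call. The names pcm_fifo, pcm_fifo_init, pcm_fifo_feed, frame_cb and aac_lc_frame_duration_us are hypothetical helpers made up for illustration, not part of libfdk-aac or libavformat, and the callback only stands in for the real encode-and-mux step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AAC_LC_FRAME_LEN 1024   /* assumed samples per channel per AAC-LC frame */
#define MAX_CHANNELS     2

/* Hypothetical accumulator for interleaved 16-bit PCM. */
typedef struct {
    int16_t buf[AAC_LC_FRAME_LEN * MAX_CHANNELS];
    size_t  filled;             /* samples per channel currently buffered */
    int     channels;
} pcm_fifo;

/* Stand-in for "encode this frame and write the packet to the container". */
typedef void (*frame_cb)(const int16_t *frame, int channels, void *user);

static void pcm_fifo_init(pcm_fifo *f, int channels)
{
    assert(channels >= 1 && channels <= MAX_CHANNELS);
    f->filled = 0;
    f->channels = channels;
}

/* Feed n samples per channel (e.g. one 10 msec chunk). Calls cb once for
 * every completed 1024-sample frame and returns how many frames completed. */
static size_t pcm_fifo_feed(pcm_fifo *f, const int16_t *in, size_t n,
                            frame_cb cb, void *user)
{
    size_t frames = 0;
    while (n > 0) {
        size_t room = AAC_LC_FRAME_LEN - f->filled;
        size_t take = n < room ? n : room;
        memcpy(f->buf + f->filled * (size_t)f->channels, in,
               take * (size_t)f->channels * sizeof(int16_t));
        f->filled += take;
        in += take * (size_t)f->channels;
        n -= take;
        if (f->filled == AAC_LC_FRAME_LEN) {
            if (cb)
                cb(f->buf, f->channels, user);
            f->filled = 0;
            frames++;
        }
    }
    return frames;
}

/* Duration of one 1024-sample frame in microseconds (integer division). */
static long long aac_lc_frame_duration_us(int sample_rate)
{
    return (AAC_LC_FRAME_LEN * 1000000LL) / sample_rate;
}
```

With this arrangement the recorder keeps receiving 10 msec chunks at its usual cadence, and the encoder is simply invoked less often: a 1024-sample frame spans 64 msec at 16 kHz and roughly 21.3 msec at 48 kHz, so the frame duration in the file follows from the sample rate rather than from the 10 msec input size. Any samples still buffered when recording stops would need to be padded or flushed, which this sketch does not handle.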