Hello, this is my first post on Hydrogenaudio. I registered because I can't find the solution to my problem on my own, and I hope to find some experts here who are willing to help.
Here is the situation:
Task: Encode microphone data on the fly to Opus.
What works: Encoding and decoding. Playback of the decoded raw (PCM) output in Audacity is OK.
What does not work: The Ogg stream, i.e. playback of the resulting Opus file.
What I have done: I adapted the code from opusenc.c in opus-tools, so I thought the packaging would be easy. But the Ogg stream is corrupt: Firefox, for example, does not show the progress bar but an error like "can't play video, because file is damaged". When I analyse the file with opusinfo.exe, it reports that the file is corrupt due to missing data (holes in stream).
First, a question about my understanding of Ogg. As far as I understand the page header, the byte at offset 26 gives the number of segments on the page (let's say X). As an example, let X be 3. The next 3 bytes (offsets 27, 28, 29) then contain the sizes (the lacing values?) of the three segments on that page, for example FF, FF and FE, which sum to 764 (decimal). Am I correct that there are exactly 764 bytes of data between offset 29 and the next Ogg page, which starts with "OggS"? I ask because I sometimes count a few bytes more or fewer. I checked this by selecting the octets in Hexedit: when the popup info for my selection showed the hex value of 764 (0x2FC), the end of the selection was not just before "OggS" as I had expected.
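To make the arithmetic I have in mind concrete, here is a minimal sketch of how I would compute the expected page size from the header. `PageSizes` and `ogg_page_sizes` are names made up for this post and are not part of libogg. The sketch assumes the buffer starts exactly at a page's "OggS" capture pattern and does not check the CRC.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper type (not part of libogg): sizes of one Ogg page.
struct PageSizes {
    std::size_t header_len; // 27 fixed bytes + one lacing byte per segment
    std::size_t body_len;   // sum of all lacing values
};

// Reads the segment table of the page that starts at data[0].
// Returns false if the buffer does not begin with a complete page header.
bool ogg_page_sizes(const std::uint8_t* data, std::size_t len, PageSizes& out)
{
    if (len < 27 || std::memcmp(data, "OggS", 4) != 0)
        return false;

    const std::size_t nsegs = data[26];       // segment count at offset 26
    if (len < 27 + nsegs)
        return false;

    std::size_t body = 0;
    for (std::size_t i = 0; i < nsegs; ++i)   // lacing values start at offset 27
        body += data[27 + i];

    out.header_len = 27 + nsegs;
    out.body_len = body;
    return true;
}
```

For the example above (3 segments with lacing values FF, FF, FE) this gives a header length of 30 and a body length of 764, so I would expect the next "OggS" at offset 30 + 764 = 794 from the start of the page.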
Here is my code to encode audio data.
The length of "pcm" is typically 2*960 bytes (sample size = 16 bit), meaning 960 samples (which is also the frame_size).
The code is basically an adaptation of the encoding part of the main loop in opusenc.c. I just removed the parts that read samples from a WAV file, because I don't have a WAV file but a continuous input of microphone samples.
opus_int32
OpusHandler::encode(char* pcm,      /* in, raw audio data, say from mic */
                    int frame_size  /* in, frame size in samples; 16-bit sample size in pcm.
                                     * frame_size = 960 */
                   )
{
    if (debug_each_packet)
        qDebug() << "in OpusHandler::encode()";
    int size_segments, cur_frame_size;
    id++;
    cur_frame_size = this->frame_size_; // should be 960
    // Assemble little-endian byte pairs into 16-bit samples.
    for ( int i = 0; i < channels_ * frame_size ; i++ )
    {   // pcm_ is the opus_int16 buffer handed to opus_encode() below (960 samples)
        pcm_[i] = ((static_cast<unsigned char>( pcm[2*i+1] ) << 8) & 0xFF00) |
                  (static_cast<unsigned char>( pcm[2*i] ) & 0xFF);
    }
    // Zero-pad a short input up to the full frame size.
    if (frame_size < cur_frame_size){
        for(int i = frame_size /* *channels */;
            i < cur_frame_size /* *channels */;
            i++ ){
            pcm_[i] = 0;
        }
    }
    // nbBytes: length of encoded packet (in bytes)
    /*opus_int32*/
    nbBytes = opus_encode(
                  opusEnc_,
                  pcm_,            /* in */  /* opus_int16 aka short */
                  cur_frame_size,  /* in */  /* int */
                  cbits,           /* out */ /* unsigned char */
                  MAX_PACKET_SIZE  /* in */  /* opus_int32 aka int */
              );
    if (nbBytes<0)
    {
        fprintf(stderr, "Encoding failed: %s. Aborting.\n", opus_strerror(nbBytes));
        return -1;
    }
    // --------------------------------------------------------
    //ogg_write_raw_opus_to_ogg_stream();
    nb_encoded += cur_frame_size;
    enc_granulepos += cur_frame_size;//*48000/coding_rate;
    total_bytes += nbBytes;
    size_segments = (nbBytes+255)/255;
    /*Flush early if adding this packet would make us end up with a
      continued page which we wouldn't have otherwise.*/
    while((((size_segments<=255)&&(last_segments+size_segments>255))||
           (enc_granulepos-last_granulepos>max_ogg_delay))&&
#ifdef OLD_LIBOGG
          ogg_stream_flush(&os, &og)){
#else
          ogg_stream_flush_fill(&os, &og,255*255)){
#endif
        if(ogg_page_packets(&og)!=0)last_granulepos=ogg_page_granulepos(&og);
        last_segments-=og.header[26];
        ret=oe_write_page(&og, fout);
        if(ret!=og.header_len+og.body_len){
            fprintf(stderr,"Error: failed writing data to output stream\n");
            exit(1);
        }
        bytes_written+=ret;
        pages_out++;
    }
    /*The downside of early reading is if the input is an exact
      multiple of the frame_size you'll get an extra frame that needs
      to get cropped off. The downside of late reading is added delay.
      If your ogg_delay is 120ms or less we'll assume you want the
      low delay behavior.*/
    // if((!op.e_o_s)&&max_ogg_delay>5760){
    //     nb_samples = inopt.read_samples(inopt.readdata,input,frame_size);
    //     total_samples+=nb_samples;
    //     if(nb_samples<frame_size)eos=1;
    //     if(nb_samples==0)op.e_o_s=1;
    // } else nb_samples=-1;
    op.packet=cbits;
    op.bytes=nbBytes;
    op.b_o_s=0;
    op.granulepos=enc_granulepos;
    // if(op.e_o_s){
    //     /*We compute the final GP as ceil(len*48k/input_rate). When a resampling
    //       decoder does the matching floor(len*input/48k) conversion the length will
    //       be exactly the same as the input.*/
    //     op.granulepos=((original_samples*48000+rate-1)/rate)+header.preskip;
    // }
    op.packetno=2+id;
    ogg_stream_packetin(&os, &op);
    last_segments+=size_segments;
    /*If the stream is over or we're sure that the delayed flush will fire,
      go ahead and flush now to avoid adding delay.*/
    while((op.e_o_s||(enc_granulepos+(frame_size*48000/coding_rate)-last_granulepos>max_ogg_delay)||
           (last_segments>=255))?
#ifdef OLD_LIBOGG
          /*Libogg > 1.2.2 allows us to achieve lower overhead by
            producing larger pages. For 20ms frames this is only relevant
            above ~32kbit/sec.*/
          ogg_stream_flush(&os, &og):
          ogg_stream_pageout(&os, &og)){
#else
          ogg_stream_flush_fill(&os, &og,255*255):
          ogg_stream_pageout_fill(&os, &og,255*255)){
#endif
        if(ogg_page_packets(&og)!=0)last_granulepos=ogg_page_granulepos(&og);
        last_segments-=og.header[26];
        ret=oe_write_page(&og, fout);
        if(ret!=og.header_len+og.body_len){
            fprintf(stderr,"Error: failed writing data to output stream\n");
            exit(1);
        }
        bytes_written+=ret;
        pages_out++;
    }
    // --------------------------------------------------------
    // encodedData_.append(reinterpret_cast<char*>(cbits),nbBytes);
    // decode(cbits, nbBytes, frame_size);
    return nbBytes;
}
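For reference, this is how I read the `size_segments` arithmetic used in the function above, as a small standalone sketch. `lacing_values_for` is a name invented for this post and is not an opusenc or libogg function. It assumes the usual Ogg lacing rule: a packet is split into 255-byte segments and ends with a lacing value below 255, so a packet whose size is an exact multiple of 255 needs one extra lacing value of 0.

```cpp
#include <cstddef>

// Hypothetical helper: number of lacing values (segments) that a packet of
// packet_bytes bytes occupies in an Ogg page's segment table.
// Same arithmetic as size_segments = (nbBytes+255)/255 in my encode().
constexpr std::size_t lacing_values_for(std::size_t packet_bytes)
{
    return (packet_bytes + 255) / 255;
}

// A 764-byte packet is laced as FF, FF, FE: three segments.
static_assert(lacing_values_for(764) == 3, "FF FF FE");
// A 765-byte packet is laced as FF, FF, FF, 00: four segments.
static_assert(lacing_values_for(765) == 4, "FF FF FF 00");
```

So a single page can carry at most 255 lacing values in total, which is what the `last_segments` bookkeeping in the two flush loops is tracking.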
Besides the encoding, I also save the raw microphone input to a file.
What really puzzles me is that if I convert that raw file with the precompiled opusenc.exe, the resulting file plays.
If I encode the same data on the fly with the code mostly copied from opusenc (see above), it won't play. What am I missing here?
This is how I used opusenc.exe:
opusenc rec.raw rec.opus --raw --raw-chan 1 --raw-bits 16 --raw-endianness 0 --raw-rate 48000
output:
48 kHz 1 channel
Preskip 312
Wrote: 34932 bytes, 179 packets, 6 pages
...
I attached two files:
- rec.opus (encoded with opusenc.exe from my pcm raw data)
- test.opus (encoded by my application)
I'd be very grateful if you could help me solve this.