encoded opus-file by adapted code of opusenc.c does not play

Hello, this is my first post on Hydrogenaudio. I registered because I can't find the solution to my problem on my own, and I hope to find some experts here who are willing to help.
Here is the situation:

Task:                      Encode microphone data on the fly to opus.
What works:            Encoding, decoding, playback of decoded raw (pcm) output with Audacity is OK.
What does not work: The Ogg stream: the resulting opus file does not play.
What I have done:    I adapted the code from opusenc.c from opustools, so I thought the packaging would be easy. But the Ogg stream is corrupt, e.g. Firefox does not show the progress bar but an error like "can't play video, because file is damaged". opusinfo.exe reports that the file is corrupt due to missing data (holes in stream).

First, a question about my understanding of Ogg. As far as I understand the page header, the byte at offset 26 gives the number of segments on the page (let's say X). As an example, let X be 3. The next 3 bytes (offsets 27, 28, 29) then contain the sizes (aka lacing values?) of the three segments on that page, for example FF, FF and FE (764 decimal in total). Am I correct that there are exactly 764 bytes of data between offset 29 and the start of the next Ogg page, which begins with "OggS"? I ask because I sometimes count a few bytes more or fewer. I checked this by selecting the octets in Hexedit: when the selection info showed the hex value of 764 (dec), the end of the selection was not just before "OggS" as I had expected.
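
The layout described above can be checked in code. The following is a minimal sketch, assuming the standard Ogg page layout (a 27-byte fixed header whose last byte, at offset 26, is the segment count, followed by the segment table and then the body). The function name `ogg_page_size` is made up for illustration and is not part of libogg.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not a libogg function): returns the total size in
// bytes that the Ogg page starting at data[0] claims to have (fixed header +
// segment table + body), or 0 if the buffer is too short to hold the header
// and segment table or does not start with the "OggS" capture pattern.
std::size_t ogg_page_size(const std::uint8_t* data, std::size_t len)
{
    const std::size_t kFixedHeader = 27;            // bytes 0..26
    if (len < kFixedHeader || std::memcmp(data, "OggS", 4) != 0)
        return 0;

    const std::size_t segments = data[26];          // number of lacing values
    if (len < kFixedHeader + segments)
        return 0;

    std::size_t body = 0;                           // sum of the lacing values
    for (std::size_t i = 0; i < segments; ++i)
        body += data[kFixedHeader + i];

    return kFixedHeader + segments + body;
}
```

For the example in the post (3 segments with lacing values FF, FF, FE), the body is 764 bytes and the next "OggS" should start 27 + 3 + 764 = 794 bytes after the current one. If the distance found in the file differs from that, bytes were added or lost after libogg produced the page.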

Here is my code to encode audio data.
The length of "pcm" is typically 2*960 bytes (sample size = 16 bit), meaning 960 samples (which is also the frame_size).
The code is basically an adaptation of the encoding part of the main loop in opusenc.c. I just removed the parts that read samples from a wav file, because I don't have a wav file but a continuous input of microphone samples.

Code:
opus_int32
OpusHandler::encode(char* pcm,      /* in, raw audio data, say from mic */
                    int frame_size  /* in, frame_size in samples; 16 bit sample size in pcm.
                                      * frame_size = 960 */
                    )
{
    if (debug_each_packet)
        qDebug() << "in OpusHandler::encode()";

    int size_segments, cur_frame_size;
    id++;
    cur_frame_size = this->frame_size_; // should be 960

    for ( int i = 0; i < channels_ * frame_size ; i++ )
    {  // assemble little-endian 16-bit samples; pcm_ must hold opus_int16 values, since it is passed to opus_encode()
        pcm_[i] = ((static_cast<unsigned char>( pcm[2*i+1] ) << 8) & 0xFF00) |
                  (static_cast<unsigned char>( pcm[2*i]  )      & 0xFF);
    }

    if (frame_size < cur_frame_size){
        for(int i = frame_size    /* *channels */;
                i < cur_frame_size /* *channels */;
                i++ ){
            pcm_[i] = 0;
        }
    }

    // nbBytes: length of encoded packet (in bytes)
    /*opus_int32*/
    nbBytes = opus_encode(
                        opusEnc_,
                        pcm_,          /* in */  /* opus_int16 aka short */
                        cur_frame_size, /* in */  /* int */
                        cbits,          /* out */  /* unsigned char */
                        MAX_PACKET_SIZE /* in */  /* opus_int32 aka int */
                        );
    if (nbBytes<0)
    {
      fprintf(stderr, "Encoding failed: %s. Aborting.\n", opus_strerror(nbBytes));
      return -1;
    }

    // --------------------------------------------------------
   
    //ogg_write_raw_opus_to_ogg_stream();
   
    nb_encoded                  += cur_frame_size;
    enc_granulepos              += cur_frame_size;//*48000/coding_rate;
    total_bytes                += nbBytes;
    size_segments                = (nbBytes+255)/255;

    /*Flush early if adding this packet would make us end up with a
      continued page which we wouldn't have otherwise.*/
    while((((size_segments<=255)&&(last_segments+size_segments>255))||
          (enc_granulepos-last_granulepos>max_ogg_delay))&&
#ifdef OLD_LIBOGG
          ogg_stream_flush(&os, &og)){
#else
          ogg_stream_flush_fill(&os, &og,255*255)){
#endif
      if(ogg_page_packets(&og)!=0)last_granulepos=ogg_page_granulepos(&og);
      last_segments-=og.header[26];
      ret=oe_write_page(&og, fout);
      if(ret!=og.header_len+og.body_len){
        fprintf(stderr,"Error: failed writing data to output stream\n");
        exit(1);
      }
      bytes_written+=ret;
      pages_out++;
    }

    /*The downside of early reading is if the input is an exact
      multiple of the frame_size you'll get an extra frame that needs
      to get cropped off. The downside of late reading is added delay.
      If your ogg_delay is 120ms or less we'll assume you want the
      low delay behavior.*/
//    if((!op.e_o_s)&&max_ogg_delay>5760){
//      nb_samples = inopt.read_samples(inopt.readdata,input,frame_size);
//      total_samples+=nb_samples;
//      if(nb_samples<frame_size)eos=1;
//      if(nb_samples==0)op.e_o_s=1;
//    } else nb_samples=-1;


    op.packet=cbits;
    op.bytes=nbBytes;
    op.b_o_s=0;
    op.granulepos=enc_granulepos;
//    if(op.e_o_s){
//      /*We compute the final GP as ceil(len*48k/input_rate). When a resampling
//        decoder does the matching floor(len*input/48k) conversion the length will
//        be exactly the same as the input.*/
//      op.granulepos=((original_samples*48000+rate-1)/rate)+header.preskip;
//    }
    op.packetno=2+id;
    ogg_stream_packetin(&os, &op);
    last_segments+=size_segments;

    /*If the stream is over or we're sure that the delayed flush will fire,
      go ahead and flush now to avoid adding delay.*/
    while((op.e_o_s||(enc_granulepos+(frame_size*48000/coding_rate)-last_granulepos>max_ogg_delay)||
          (last_segments>=255))?
#ifdef OLD_LIBOGG
    /*Libogg > 1.2.2 allows us to achieve lower overhead by
      producing larger pages. For 20ms frames this is only relevant
      above ~32kbit/sec.*/
          ogg_stream_flush(&os, &og):
          ogg_stream_pageout(&os, &og)){
#else
          ogg_stream_flush_fill(&os, &og,255*255):
          ogg_stream_pageout_fill(&os, &og,255*255)){
#endif
      if(ogg_page_packets(&og)!=0)last_granulepos=ogg_page_granulepos(&og);
      last_segments-=og.header[26];
      ret=oe_write_page(&og, fout);
      if(ret!=og.header_len+og.body_len){
        fprintf(stderr,"Error: failed writing data to output stream\n");
        exit(1);
      }
      bytes_written+=ret;
      pages_out++;
    }


    // --------------------------------------------------------

//    encodedData_.append(reinterpret_cast<char*>(cbits),nbBytes);

//    decode(cbits, nbBytes, frame_size);

    return nbBytes;
}

Besides the encoding, I also save the raw input of the mic to a file.
What really puzzles me is that if I convert this raw file with the precompiled opusenc.exe, the result plays.
If I encode the same data on the fly with the code mostly copied from opusenc (see above), it won't play. What am I missing here?

How I used opusenc.exe:
opusenc rec.raw rec.opus --raw --raw-chan 1 --raw-bits 16 --raw-endianness 0 --raw-rate 48000
output:
48 kHz 1 channel
Preskip 312
Wrote: 34932 bytes, 179 packets, 6 pages
...


I attached two files:
- rec.opus  (encoded with opusenc.exe from my pcm raw data)
- test.opus (encoded by my application)

I'd be very happy if you could help me solve this.

Reply #1
So your segment tables are messed up.  They are weird anyway.  Did you intend to have only 5 short packets on each page?  The last segment on each page looks like the suspicious one, doesn't look like the start of another packet.  Maybe just a random byte when you wrote too much for the page?

Reply #2
A bad segment table would be a plausible reason, but I don't know how that could happen or how I could influence it. I put frames into an Ogg packet, send it to the stream, and then libogg builds the pages. I do not touch the internals of libogg in any way.

You are right to ask whether I intended that small page size. I was checking whether it would have a positive effect, but it did not. The size was controlled via max_ogg_delay, which I set to 4800 instead of 48000.
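
The link between max_ogg_delay and the 5 packets per page noted in Reply #1 can be worked out from the flush condition in the posted code. This is a simplified model only, assuming coding_rate is 48000, a page has just been flushed (last_granulepos equals the current position), and the 255-segment limit is never reached. The function name `packets_per_page` is hypothetical, and frame_size is assumed to be positive.

```cpp
// Simplified model of the delay-based flush in the posted encode() routine.
// It is not libogg code: it only counts how many packets are submitted
// before the test
//   enc_granulepos + frame_size - last_granulepos > max_ogg_delay
// becomes true, starting right after a flush.
int packets_per_page(long max_ogg_delay, int frame_size)
{
    long since_flush = 0;   // enc_granulepos - last_granulepos, in samples
    int packets = 0;
    for (;;) {
        since_flush += frame_size;   // one more packet handed to the stream
        ++packets;
        if (since_flush + frame_size > max_ogg_delay)
            return packets;          // the flush loop would fire here
    }
}
```

Under these assumptions, `packets_per_page(4800, 960)` gives 5 and `packets_per_page(48000, 960)` gives 50, so the small pages follow from the reduced delay setting rather than from a fault in the stream.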

Reply #3
Problem solved. It was so easy in the end. In my initialization method I had left out the "b" in the mode string when opening the FILE*, so the output file was not opened in binary mode.
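
For readers who hit the same symptom: on Windows, a FILE* opened in text mode writes every 0x0A byte as 0x0D 0x0A, which adds bytes inside the Ogg pages. That would be consistent with the extra bytes counted in the hex editor and with the "holes" opusinfo reported. The sketch below shows the binary-mode open; the helper name `write_binary` and the file name used with it are made up for illustration.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: writes len bytes to path in binary mode, then reopens
// the file and returns its size in bytes, or -1 on any error.
long write_binary(const char* path, const unsigned char* buf, std::size_t len)
{
    std::FILE* f = std::fopen(path, "wb");   // "wb", not "w": no newline translation
    if (!f)
        return -1;
    const std::size_t written = std::fwrite(buf, 1, len, f);
    if (std::fclose(f) != 0 || written != len)
        return -1;

    f = std::fopen(path, "rb");              // read back in binary mode as well
    if (!f)
        return -1;
    if (std::fseek(f, 0, SEEK_END) != 0) {
        std::fclose(f);
        return -1;
    }
    const long size = std::ftell(f);
    std::fclose(f);
    return size;
}
```

With "wb", the size on disk equals the number of bytes passed to fwrite, even when the data contains 0x0A bytes. With "w" on Windows the file would come out larger whenever such a byte occurs in a page.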