Topic: aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #50
Quote
Under Debian I built the code with the SSE3 patches without a problem. Encoding works like a charm. But when decoding I get a segfault in the function ov_read_filter of vorbisfile. Could it be because the new function "ov_read_float2pcm" is not referenced?

ov_read_float2pcm is only used internally by the Lancer code. It's not meant to be exported, so that should not be a problem.

Looking at the ov_read_filter code I noticed that the Lancer code omits vorbis_fpu_setround and vorbis_fpu_restore when reading non-stereo material. This looks like a bug.

Any chance you were trying to decode mono or multi-channel material when you got the crash? Anyway, attached is a patch that re-adds those function calls. Please check whether it fixes the issue.
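To illustrate why the missing pair matters: a float-to-integer conversion that depends on the current rounding mode needs that mode set before the loop and restored afterwards, on every code path. The sketch below is not the libvorbis or Lancer code. It uses the SSE control-register macros from `<xmmintrin.h>` as a stand-in for `vorbis_fpu_setround` / `vorbis_fpu_restore`, assumes an x86 or x86-64 target, and `float_to_pcm16` is a hypothetical helper name.

```c
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics; x86/x86-64 only */

/* Hypothetical helper, not libvorbis code: convert n float samples in
   [-1, 1] to 16-bit PCM. cvtss2si rounds according to MXCSR, so the
   rounding mode is saved, set, and restored around the loop. */
static void float_to_pcm16(const float *in, int16_t *out, int n)
{
    unsigned int saved = _MM_GET_ROUNDING_MODE();   /* save caller's mode */
    _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);

    for (int i = 0; i < n; i++) {
        /* Scale to the 16-bit range and convert with the current mode. */
        int v = _mm_cvtss_si32(_mm_set_ss(in[i] * 32768.0f));
        if (v > 32767)  v = 32767;    /* clip to the int16 range */
        if (v < -32768) v = -32768;
        out[i] = (int16_t)v;
    }

    _MM_SET_ROUNDING_MODE(saved);                   /* restore on exit */
}
```

If one branch (say, the non-stereo one) skipped the set/restore bracket, its output would depend on whatever rounding mode the caller happened to leave behind.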

Reply #51

Thanks for the patch, but unfortunately the issue is still the same. The program in question is the latest SuperTuxKart, version 0.9.5, on Debian jessie x64. At the end of startup, when the game waits for user input and should start playing the theme music, the program segfaults in libvorbisfile:
Quote
[ 6173.561481] traps: supertuxkart[5123] general protection ip:7fe6fde686f9 sp:7fe6edc23040 error:0 in libvorbisfile.so.3.3.7[7fe6fde60000+b000]
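The `ip` value and the `[base+size]` pair in a kernel "traps:" line like the one above can be turned into an offset inside the library, which is easier to match against a disassembly than the absolute address. A minimal sketch of that arithmetic follows; `fault_offset` is a hypothetical helper, and the result equals the file offset only under the assumption that the mapping starts at file offset 0.

```c
#include <stdint.h>

/* Hypothetical helper: offset of a faulting instruction inside its
   mapped object, given the "ip" and the mapping base and size from a
   kernel "traps:" line. Returns 0 and leaves *offset untouched if ip
   lies outside [base, base + size); returns 1 otherwise. */
static int fault_offset(uint64_t ip, uint64_t base, uint64_t size,
                        uint64_t *offset)
{
    if (ip < base || ip - base >= size)
        return 0;
    *offset = ip - base;
    return 1;
}
```

With the values from the log (ip `0x7fe6fde686f9`, base `0x7fe6fde60000`, size `0xb000`), the offset is `0x86f9`.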

The music files of SuperTuxKart are all 2-channel. To dig deeper I did some debugging under Eclipse CDT and found that the program segfaults and stops at the following instruction (see the disassembly below):

Quote
00007ffff72e060b:  movaps (%rcx,%rdi,4),%xmm0
The above looks like SSE code.
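One property of `movaps` is relevant here: its memory operand must be 16-byte aligned, and a misaligned operand raises a general-protection fault rather than an ordinary page fault. Whether that is what happened in this crash is an assumption, since the thread does not confirm it. The sketch below shows how such an alignment precondition could be checked in C; both helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is aligned to a 16-byte boundary, which is what an
   aligned SSE load/store such as movaps requires of its memory operand. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: number of leading floats to process with scalar
   (or unaligned) code before p reaches a 16-byte boundary. Assumes p is
   at least 4-byte aligned, as a float pointer normally is. */
static size_t floats_until_aligned16(const float *p)
{
    size_t misalign = (uintptr_t)p & 15u;   /* bytes past the boundary */
    return misalign ? (16u - misalign) / sizeof(float) : 0;
}
```

A debugger check along these lines would be to inspect the base register of the faulting load (here `%rcx`) and see whether its low four bits are zero.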

The full disassembly listing follows:

Code:
................ ...
264                  update();
00000000009b324a:  mov -0x8(%rbp),%rax
00000000009b324e:  mov (%rax),%rax
00000000009b3251:  add $0x38,%rax
00000000009b3255:  mov (%rax),%rax
00000000009b3258:  mov -0x8(%rbp),%rdx
00000000009b325c:  mov %rdx,%rdi
00000000009b325f:  callq *%rax
265              }  // updateFaster
00000000009b3261:  leaveq
00000000009b3262:  retq
00000000009b3263:  nop
269              {
                  MusicOggStream::update():
00000000009b3264:  push %rbp
00000000009b3265:  mov %rsp,%rbp
00000000009b3268:  push %rbx
00000000009b3269:  sub $0x38,%rsp
00000000009b326d:  mov %rdi,-0x38(%rbp)
271                  if (m_pausedMusic || m_soundSource == ALuint(-1))
00000000009b3271:  mov -0x38(%rbp),%rax
00000000009b3275:  movzbl 0x3e4(%rax),%eax
00000000009b327c:  test %al,%al
00000000009b327e:  jne 0x9b328f <MusicOggStream::update()+43>
00000000009b3280:  mov -0x38(%rbp),%rax
00000000009b3284:  mov 0x3dc(%rax),%eax
00000000009b328a:  cmp $0xffffffff,%eax
00000000009b328d:  jne 0x9b3294 <MusicOggStream::update()+48>
274                      return;
00000000009b328f:  jmpq 0x9b349e <MusicOggStream::update()+570>
277                  int processed= 0;
00000000009b3294:  movl $0x0,-0x24(%rbp)
278                  bool active= true;
00000000009b329b:  movb $0x1,-0x11(%rbp)
280                  alGetSourcei(m_soundSource, AL_BUFFERS_PROCESSED, &processed);
00000000009b329f:  mov -0x38(%rbp),%rax
00000000009b32a3:  mov 0x3dc(%rax),%eax
00000000009b32a9:  lea -0x24(%rbp),%rdx
00000000009b32ad:  mov $0x1016,%esi
00000000009b32b2:  mov %eax,%edi
00000000009b32b4:  callq 0x989aa0 <alGetSourcei@plt>
282                  while(processed--)
00000000009b32b9:  jmpq 0x9b3374 <MusicOggStream::update()+272>
286                      alSourceUnqueueBuffers(m_soundSource, 1, &buffer);
00000000009b32be:  mov -0x38(%rbp),%rax
00000000009b32c2:  mov 0x3dc(%rax),%eax
00000000009b32c8:  lea -0x28(%rbp),%rdx
00000000009b32cc:  mov $0x1,%esi
00000000009b32d1:  mov %eax,%edi
00000000009b32d3:  callq 0x988ee0 <alSourceUnqueueBuffers@plt>
287                      if(!check("alSourceUnqueueBuffers")) return;
00000000009b32d8:  mov -0x38(%rbp),%rax
00000000009b32dc:  mov $0x11b3790,%esi
00000000009b32e1:  mov %rax,%rdi
00000000009b32e4:  callq 0x9b360c <MusicOggStream::check(char const*)>
00000000009b32e9:  xor $0x1,%eax
00000000009b32ec:  test %al,%al
00000000009b32ee:  je 0x9b32f5 <MusicOggStream::update()+145>
00000000009b32f0:  jmpq 0x9b349e <MusicOggStream::update()+570>
289                      active = streamIntoBuffer(buffer);
00000000009b32f5:  mov -0x28(%rbp),%edx
00000000009b32f8:  mov -0x38(%rbp),%rax
00000000009b32fc:  mov %edx,%esi
00000000009b32fe:  mov %rax,%rdi
00000000009b3301:  callq 0x9b34a6 <MusicOggStream::streamIntoBuffer(unsigned int)>
00000000009b3306:  mov %al,-0x11(%rbp)
290                      if(!active)
00000000009b3309:  movzbl -0x11(%rbp),%eax
00000000009b330d:  xor $0x1,%eax
00000000009b3310:  test %al,%al
00000000009b3312:  je 0x9b333c <MusicOggStream::update()+216>
293                          ov_time_seek(&m_oggStream, 0);
00000000009b3314:  mov -0x38(%rbp),%rax
00000000009b3318:  add $0x18,%rax
00000000009b331c:  pxor %xmm0,%xmm0
00000000009b3320:  mov %rax,%rdi
00000000009b3323:  callq 0x989430 <ov_time_seek@plt>
294                          active = streamIntoBuffer(buffer);//now there really should be data
00000000009b3328:  mov -0x28(%rbp),%edx
00000000009b332b:  mov -0x38(%rbp),%rax
00000000009b332f:  mov %edx,%esi
00000000009b3331:  mov %rax,%rdi
00000000009b3334:  callq 0x9b34a6 <MusicOggStream::streamIntoBuffer(unsigned int)>
00000000009b3339:  mov %al,-0x11(%rbp)
297                      alSourceQueueBuffers(m_soundSource, 1, &buffer);
00000000009b333c:  mov -0x38(%rbp),%rax
00000000009b3340:  mov 0x3dc(%rax),%eax
00000000009b3346:  lea -0x28(%rbp),%rdx
00000000009b334a:  mov $0x1,%esi
00000000009b334f:  mov %eax,%edi
00000000009b3351:  callq 0x988d50 <alSourceQueueBuffers@plt>
298                      if (!check("alSourceQueueBuffers")) return;
00000000009b3356:  mov -0x38(%rbp),%rax
00000000009b335a:  mov $0x11b37de,%esi
00000000009b335f:  mov %rax,%rdi
00000000009b3362:  callq 0x9b360c <MusicOggStream::check(char const*)>
00000000009b3367:  xor $0x1,%eax
00000000009b336a:  test %al,%al
00000000009b336c:  je 0x9b3374 <MusicOggStream::update()+272>
00000000009b336e:  nop
00000000009b336f:  jmpq 0x9b349e <MusicOggStream::update()+570>
282                  while(processed--)
00000000009b3374:  mov -0x24(%rbp),%eax
00000000009b3377:  lea -0x1(%rax),%edx
00000000009b337a:  mov %edx,-0x24(%rbp)
00000000009b337d:  test %eax,%eax
00000000009b337f:  setne %al
00000000009b3382:  test %al,%al
00000000009b3384:  jne 0x9b32be <MusicOggStream::update()+90>
301                  if (active)
00000000009b338a:  cmpb $0x0,-0x11(%rbp)
00000000009b338e:  je 0x9b345d <MusicOggStream::update()+505>
304                      SFXManager::checkError("before source state");
00000000009b3394:  lea -0x12(%rbp),%rax
00000000009b3398:  mov %rax,%rdi
00000000009b339b:  callq 0x988b50 <_ZNSaIcEC1Ev@plt>
00000000009b33a0:  lea -0x12(%rbp),%rdx
00000000009b33a4:  lea -0x20(%rbp),%rax
00000000009b33a8:  mov $0x11b37f3,%esi
00000000009b33ad:  mov %rax,%rdi
00000000009b33b0:  callq 0x988fc0 <_ZNSsC1EPKcRKSaIcE@plt>
00000000009b33b5:  lea -0x20(%rbp),%rax
00000000009b33b9:  mov %rax,%rdi
00000000009b33bc:  callq 0x9b6b5c <SFXManager::checkError(std::string const&)>
00000000009b33c1:  lea -0x20(%rbp),%rax
00000000009b33c5:  mov %rax,%rdi
00000000009b33c8:  callq 0x98a410 <_ZNSsD1Ev@plt>
00000000009b33cd:  lea -0x12(%rbp),%rax
00000000009b33d1:  mov %rax,%rdi
00000000009b33d4:  callq 0x987b90 <_ZNSaIcED1Ev@plt>
307                      alGetSourcei(m_soundSource, AL_SOURCE_STATE, &state);
00000000009b33d9:  mov -0x38(%rbp),%rax
00000000009b33dd:  mov 0x3dc(%rax),%eax
00000000009b33e3:  lea -0x2c(%rbp),%rdx
00000000009b33e7:  mov $0x1010,%esi
00000000009b33ec:  mov %eax,%edi
00000000009b33ee:  callq 0x989aa0 <alGetSourcei@plt>
308                      if (state != AL_PLAYING)
00000000009b33f3:  mov -0x2c(%rbp),%eax
00000000009b33f6:  cmp $0x1012,%eax
00000000009b33fb:  je 0x9b345b <MusicOggStream::update()+503>
312                          count++;
00000000009b33fd:  mov 0xf116d5(%rip),%eax        # 0x18c4ad8 <_ZZN14MusicOggStream6updateEvE5count>
00000000009b3403:  add $0x1,%eax
00000000009b3406:  mov %eax,0xf116cc(%rip)        # 0x18c4ad8 <_ZZN14MusicOggStream6updateEvE5count>
313                          if (count<10)
00000000009b340c:  mov 0xf116c6(%rip),%eax        # 0x18c4ad8 <_ZZN14MusicOggStream6updateEvE5count>
00000000009b3412:  cmp $0x9,%eax
00000000009b3415:  jg 0x9b3430 <MusicOggStream::update()+460>
315                                        "Source state: %d", state);
00000000009b3417:  mov -0x2c(%rbp),%eax
00000000009b341a:  mov %eax,%edx
00000000009b341c:  mov $0x11b3808,%esi
00000000009b3421:  mov $0x11b366f,%edi
00000000009b3426:  mov $0x0,%eax
00000000009b342b:  callq 0x990a3e <Log::warn(char const*, char const*, ...)>
316                          alGetSourcei(m_soundSource, AL_BUFFERS_PROCESSED, &processed);
00000000009b3430:  mov -0x38(%rbp),%rax
00000000009b3434:  mov 0x3dc(%rax),%eax
00000000009b343a:  lea -0x24(%rbp),%rdx
00000000009b343e:  mov $0x1016,%esi
00000000009b3443:  mov %eax,%edi
00000000009b3445:  callq 0x989aa0 <alGetSourcei@plt>
317                          alSourcePlay(m_soundSource);
00000000009b344a:  mov -0x38(%rbp),%rax
00000000009b344e:  mov 0x3dc(%rax),%eax
00000000009b3454:  mov %eax,%edi
00000000009b3456:  callq 0x989fd0 <alSourcePlay@plt>
00000000009b345b:  jmp 0x9b349e <MusicOggStream::update()+570>
323                                            "twice in a row.");
00000000009b345d:  mov $0x11b3840,%esi
00000000009b3462:  mov $0x11b366f,%edi
00000000009b3467:  mov $0x0,%eax
00000000009b346c:  callq 0x990a3e <Log::warn(char const*, char const*, ...)>
00000000009b3471:  jmp 0x9b349e <MusicOggStream::update()+570>
00000000009b3473:  mov %rax,%rbx
00000000009b3476:  lea -0x20(%rbp),%rax
00000000009b347a:  mov %rax,%rdi
00000000009b347d:  callq 0x98a410 <_ZNSsD1Ev@plt>
00000000009b3482:  jmp 0x9b3487 <MusicOggStream::update()+547>
00000000009b3484:  mov %rax,%rbx
00000000009b3487:  lea -0x12(%rbp),%rax
00000000009b348b:  mov %rax,%rdi
00000000009b348e:  callq 0x987b90 <_ZNSaIcED1Ev@plt>
00000000009b3493:  mov %rbx,%rax
00000000009b3496:  mov %rax,%rdi
00000000009b3499:  callq 0x988680 <_Unwind_Resume@plt>
325              }  // update
00000000009b349e:  add $0x38,%rsp
00000000009b34a2:  pop %rbx
00000000009b34a3:  pop %rbp
00000000009b34a4:  retq
00000000009b34a5:  nop
329              {
                  MusicOggStream::streamIntoBuffer(unsigned int):
00000000009b34a6:  push %rbp
00000000009b34a7:  mov %rsp,%rbp
00000000009b34aa:  push %r12
00000000009b34ac:  push %rbx
00000000009b34ad:  sub $0xac70,%rsp
00000000009b34b4:  mov %rdi,-0xac78(%rbp)
00000000009b34bb:  mov %esi,-0xac7c(%rbp)
331                  const int isBigEndian = (IS_LITTLE_ENDIAN ? 0 : 1);
00000000009b34c1:  movzbl 0xf2439c(%rip),%eax        # 0x18d7864 <IS_LITTLE_ENDIAN>
00000000009b34c8:  test %al,%al
00000000009b34ca:  je 0x9b34d3 <MusicOggStream::streamIntoBuffer(unsigned int)+45>
00000000009b34cc:  mov $0x0,%eax
00000000009b34d1:  jmp 0x9b34d8 <MusicOggStream::streamIntoBuffer(unsigned int)+50>
00000000009b34d3:  mov $0x1,%eax
00000000009b34d8:  mov %eax,-0x18(%rbp)
333                  int  size = 0;
00000000009b34db:  movl $0x0,-0x14(%rbp)
337                  while(size < m_buffer_size)
00000000009b34e2:  jmpq 0x9b357d <MusicOggStream::streamIntoBuffer(unsigned int)+215>
340                                        isBigEndian, 2, 1, &portion);
00000000009b34e7:  mov $0xac44,%eax
00000000009b34ec:  sub -0x14(%rbp),%eax
00000000009b34ef:  mov -0x14(%rbp),%edx
00000000009b34f2:  movslq %edx,%rdx
00000000009b34f5:  lea -0xac70(%rbp),%rcx
00000000009b34fc:  lea (%rcx,%rdx,1),%rsi
00000000009b3500:  mov -0xac78(%rbp),%rdx
00000000009b3507:  lea 0x18(%rdx),%rdi
00000000009b350b:  sub $0x8,%rsp
00000000009b350f:  mov -0x18(%rbp),%edx
00000000009b3512:  lea -0x20(%rbp),%rcx
00000000009b3516:  push %rcx
00000000009b3517:  mov $0x1,%r9d
00000000009b351d:  mov $0x2,%r8d
00000000009b3523:  mov %edx,%ecx
00000000009b3525:  mov %eax,%edx
00000000009b3527:  callq 0x9882d0 <ov_read@plt>
00000000009b352c:  add $0x10,%rsp
00000000009b3530:  mov %eax,-0x1c(%rbp)
342                      if(result > 0)
00000000009b3533:  cmpl $0x0,-0x1c(%rbp)
00000000009b3537:  jle 0x9b3541 <MusicOggStream::streamIntoBuffer(unsigned int)+155>
343                          size += result;
00000000009b3539:  mov -0x1c(%rbp),%eax
00000000009b353c:  add %eax,-0x14(%rbp)
00000000009b353f:  jmp 0x9b357d <MusicOggStream::streamIntoBuffer(unsigned int)+215>
345                          if(result < 0)
00000000009b3541:  cmpl $0x0,-0x1c(%rbp)
00000000009b3545:  jns 0x9b357b <MusicOggStream::streamIntoBuffer(unsigned int)+213>
346                              throw errorString(result);
00000000009b3547:  mov $0x8,%edi
00000000009b354c:  callq 0x989600 <__cxa_allocate_exception@plt>
00000000009b3551:  mov %rax,%rbx
00000000009b3554:  mov -0x1c(%rbp),%edx
00000000009b3557:  mov -0xac78(%rbp),%rax
00000000009b355e:  mov %rax,%rsi
00000000009b3561:  mov %rbx,%rdi
00000000009b3564:  callq 0x9b36a8 <MusicOggStream::errorString(int)>
00000000009b3569:  mov $0x98a410,%edx
00000000009b356e:  mov $0x11b3b30,%esi
00000000009b3573:  mov %rbx,%rdi
00000000009b3576:  callq 0x988590 <__cxa_throw@plt>
348                              break;
00000000009b357b:  jmp 0x9b358a <MusicOggStream::streamIntoBuffer(unsigned int)+228>
337                  while(size < m_buffer_size)
00000000009b357d:  cmpl $0xac43,-0x14(%rbp)
00000000009b3584:  jle 0x9b34e7 <MusicOggStream::streamIntoBuffer(unsigned int)+65>
351                  if(size == 0) return false;
00000000009b358a:  cmpl $0x0,-0x14(%rbp)
00000000009b358e:  jne 0x9b3597 <MusicOggStream::streamIntoBuffer(unsigned int)+241>
00000000009b3590:  mov $0x0,%eax
00000000009b3595:  jmp 0x9b3603 <MusicOggStream::streamIntoBuffer(unsigned int)+349>
353                  alBufferData(buffer, nb_channels, pcm, size, m_vorbisInfo->rate);
00000000009b3597:  mov -0xac78(%rbp),%rax
00000000009b359e:  mov 0x3c8(%rax),%rax
00000000009b35a5:  mov 0x8(%rax),%rax
00000000009b35a9:  mov %eax,%edi
00000000009b35ab:  mov -0xac78(%rbp),%rax
00000000009b35b2:  mov 0x3e0(%rax),%esi
00000000009b35b8:  mov -0x14(%rbp),%ecx
00000000009b35bb:  lea -0xac70(%rbp),%rdx
00000000009b35c2:  mov -0xac7c(%rbp),%eax
00000000009b35c8:  mov %edi,%r8d
00000000009b35cb:  mov %eax,%edi
00000000009b35cd:  callq 0x988360 <alBufferData@plt>
354                  check("alBufferData");
00000000009b35d2:  mov -0xac78(%rbp),%rax
00000000009b35d9:  mov $0x11b387b,%esi
00000000009b35de:  mov %rax,%rdi
00000000009b35e1:  callq 0x9b360c <MusicOggStream::check(char const*)>
356                  return true;
00000000009b35e6:  mov $0x1,%eax
................. ...
00007ffff72e0503:  (bad)
00007ffff72e0504:  cvtsd2si %xmm0,%eax
00007ffff72e0508:  cmp $0xffff8000,%eax
00007ffff72e050d:  cmovl %r9d,%eax
00007ffff72e0511:  cmp $0x7fff,%eax
00007ffff72e0516:  cmovg %edi,%eax
00007ffff72e0519:  add $0x4,%rdx
00007ffff72e051d:  add $0x8000,%ax
00007ffff72e0521:  mov %ax,(%rcx)
00007ffff72e0524:  add %rbp,%rcx
00007ffff72e0527:  cmp %rsi,%rdx
00007ffff72e052a:  jne 0x7ffff72e04f8 <ov_read_filter+728>
00007ffff72e052c:  add $0x2,%r10
00007ffff72e0530:  add $0x8,%r11
00007ffff72e0534:  cmp %r13,%r10
00007ffff72e0537:  jne 0x7ffff72e04e6 <ov_read_filter+710>
00007ffff72e0539:  jmpq 0x7ffff72e03ac <ov_read_filter+396>
00007ffff72e053e:  mov $0xffffffffffffff7d,%rax
00007ffff72e0545:  jmpq 0x7ffff72e03fa <ov_read_filter+474>
00007ffff72e054a:  cmp $0x2,%rbp
00007ffff72e054e:  je 0x7ffff72e05e2 <ov_read_filter+962>
00007ffff72e0554:  test %rbp,%rbp
00007ffff72e0557:  jle 0x7ffff72e03ac <ov_read_filter+396>
00007ffff72e055d:  add %rbp,%rbp
00007ffff72e0560:  mov 0x28(%rsp),%r11
00007ffff72e0565:  lea 0x0(,%r8,4),%r12
00007ffff72e056d:  lea (%r15,%rbp,1),%r13
00007ffff72e0571:  mov %r15,%r10
00007ffff72e0574:  mov $0xffff8000,%r9d
00007ffff72e057a:  movss 0xc56(%rip),%xmm1        # 0x7ffff72e11d8
00007ffff72e0582:  mov $0x7fff,%edi
00007ffff72e0587:  mov (%r11),%rdx
00007ffff72e058a:  test %r8,%r8
00007ffff72e058d:  mov %r10,%rcx
00007ffff72e0590:  lea (%rdx,%r12,1),%rsi
00007ffff72e0594:  jle 0x7ffff72e05d0 <ov_read_filter+944>
00007ffff72e0596:  nopw %cs:0x0(%rax,%rax,1)
00007ffff72e05a0:  movss (%rdx),%xmm0
00007ffff72e05a4:  mulss %xmm1,%xmm0
00007ffff72e05a8:  cvtss2sd %xmm0,%xmm0
00007ffff72e05ac:  cvtsd2si %xmm0,%eax
00007ffff72e05b0:  cmp $0xffff8000,%eax
00007ffff72e05b5:  cmovl %r9d,%eax
00007ffff72e05b9:  cmp $0x7fff,%eax
00007ffff72e05be:  cmovg %edi,%eax
00007ffff72e05c1:  add $0x4,%rdx
00007ffff72e05c5:  mov %ax,(%rcx)
00007ffff72e05c8:  add %rbp,%rcx
00007ffff72e05cb:  cmp %rsi,%rdx
00007ffff72e05ce:  jne 0x7ffff72e05a0 <ov_read_filter+896>
00007ffff72e05d0:  add $0x2,%r10
00007ffff72e05d4:  add $0x8,%r11
00007ffff72e05d8:  cmp %r13,%r10
00007ffff72e05db:  jne 0x7ffff72e0587 <ov_read_filter+871>
00007ffff72e05dd:  jmpq 0x7ffff72e03ac <ov_read_filter+396>
00007ffff72e05e2:  mov 0x28(%rsp),%rax
00007ffff72e05e7:  mov %r8d,%esi
00007ffff72e05ea:  mov 0x8(%rax),%rdx
00007ffff72e05ee:  mov (%rax),%rcx
00007ffff72e05f1:  mov %r8d,%eax
00007ffff72e05f4:  and $0xfffffff0,%eax
00007ffff72e05f7:  cltq
00007ffff72e05f9:  test %rax,%rax
00007ffff72e05fc:  jle 0x7ffff72e097f <ov_read_filter+1887>
00007ffff72e0602:  movaps 0xba7(%rip),%xmm2        # 0x7ffff72e11b0 <parm.7799>
00007ffff72e0609:  xor %edi,%edi
00007ffff72e060b:  movaps (%rcx,%rdi,4),%xmm0
00007ffff72e060f:  movaps 0x10(%rcx,%rdi,4),%xmm4
00007ffff72e0614:  mulps %xmm2,%xmm0
00007ffff72e0617:  movaps (%rdx,%rdi,4),%xmm1
00007ffff72e061b:  mulps %xmm2,%xmm4
00007ffff72e061e:  movaps 0x10(%rdx,%rdi,4),%xmm3
00007ffff72e0623:  mulps %xmm2,%xmm1
00007ffff72e0626:  cvtps2dq %xmm0,%xmm0
00007ffff72e062a:  mulps %xmm2,%xmm3
00007ffff72e062d:  cvtps2dq %xmm4,%xmm4
00007ffff72e0631:  packssdw %xmm4,%xmm0
00007ffff72e0635:  cvtps2dq %xmm1,%xmm1
00007ffff72e0639:  cvtps2dq %xmm3,%xmm3
00007ffff72e063d:  packssdw %xmm3,%xmm1
00007ffff72e0641:  movdqa %xmm0,%xmm3
00007ffff72e0645:  punpckhwd %xmm1,%xmm0
00007ffff72e0649:  punpcklwd %xmm1,%xmm3
00007ffff72e064d:  movaps %xmm0,0x10(%r15,%rdi,4)
00007ffff72e0653:  movaps %xmm3,(%r15,%rdi,4)
00007ffff72e0658:  add $0x8,%rdi
00007ffff72e065c:  cmp %rdi,%rax
00007ffff72e065f:  jg 0x7ffff72e060b <ov_read_filter+1003>
00007ffff72e0661:  cmp %r8,%rax
00007ffff72e0664:  jge 0x7ffff72e03af <ov_read_filter+399>
00007ffff72e066a:  mov %r8,%r13
00007ffff72e066d:  sub %rax,%r13
00007ffff72e0670:  lea -0x8(%r13),%r10
00007ffff72e0674:  shr $0x3,%r10
00007ffff72e0678:  add $0x1,%r10
00007ffff72e067c:  lea 0x0(,%r10,8),%rdi
00007ffff72e0684:  mov %rdi,(%rsp)
00007ffff72e0688:  lea -0x1(%r8),%rdi
00007ffff72e068c:  sub %rax,%rdi
00007ffff72e068f:  cmp $0x6,%rdi
00007ffff72e0693:  jbe 0x7ffff72e077f <ov_read_filter+1375>
00007ffff72e0699:  lea 0x0(,%rax,4),%r9
00007ffff72e06a1:  movaps 0xb58(%rip),%xmm5        # 0x7ffff72e1200
00007ffff72e06a8:  xor %edi,%edi
00007ffff72e06aa:  xor %r11d,%r11d
00007ffff72e06ad:  movaps 0xb5c(%rip),%xmm4        # 0x7ffff72e1210
00007ffff72e06b4:  lea (%rcx,%r9,1),%r12
00007ffff72e06b8:  lea (%rdx,%r9,1),%rbp
00007ffff72e06bc:  add %r15,%r9
00007ffff72e06bf:  movaps 0xb5a(%rip),%xmm3        # 0x7ffff72e1220
00007ffff72e06c6:  add $0x1,%r11
00007ffff72e06ca:  movups (%r12,%rdi,1),%xmm0
00007ffff72e06cf:  movups 0x10(%r12,%rdi,1),%xmm1
00007ffff72e06d5:  mulps %xmm5,%xmm0
00007ffff72e06d8:  mulps %xmm5,%xmm1
00007ffff72e06db:  minps %xmm4,%xmm0
00007ffff72e06de:  minps %xmm4,%xmm1
00007ffff72e06e1:  maxps %xmm3,%xmm0
00007ffff72e06e4:  maxps %xmm3,%xmm1
00007ffff72e06e7:  cvttps2dq %xmm0,%xmm0
00007ffff72e06eb:  movdqa %xmm0,%xmm2
00007ffff72e06ef:  cvttps2dq %xmm1,%xmm1
00007ffff72e06f3:  punpcklwd %xmm1,%xmm0
00007ffff72e06f7:  punpckhwd %xmm1,%xmm2
00007ffff72e06fb:  movdqa %xmm0,%xmm1
00007ffff72e06ff:  punpcklwd %xmm2,%xmm0
00007ffff72e0703:  punpckhwd %xmm2,%xmm1
00007ffff72e0707:  movups 0x10(%rbp,%rdi,1),%xmm2
00007ffff72e070c:  mulps %xmm5,%xmm2
00007ffff72e070f:  punpcklwd %xmm1,%xmm0
00007ffff72e0713:  movups 0x0(%rbp,%rdi,1),%xmm1
00007ffff72e0718:  mulps %xmm5,%xmm1
00007ffff72e071b:  minps %xmm4,%xmm2
00007ffff72e071e:  minps %xmm4,%xmm1
00007ffff72e0721:  maxps %xmm3,%xmm2
00007ffff72e0724:  maxps %xmm3,%xmm1
00007ffff72e0727:  cvttps2dq %xmm2,%xmm2
00007ffff72e072b:  cvttps2dq %xmm1,%xmm1
00007ffff72e072f:  movdqa %xmm1,%xmm6
00007ffff72e0733:  punpcklwd %xmm2,%xmm1
00007ffff72e0737:  punpckhwd %xmm2,%xmm6
00007ffff72e073b:  movdqa %xmm1,%xmm2
00007ffff72e073f:  punpcklwd %xmm6,%xmm1
00007ffff72e0743:  punpckhwd %xmm6,%xmm2
00007ffff72e0747:  punpcklwd %xmm2,%xmm1
00007ffff72e074b:  movdqa %xmm0,%xmm2
00007ffff72e074f:  punpckhwd %xmm1,%xmm0
00007ffff72e0753:  punpcklwd %xmm1,%xmm2
00007ffff72e0757:  movups %xmm0,0x10(%r9,%rdi,1)
00007ffff72e075d:  movups %xmm2,(%r9,%rdi,1)
00007ffff72e0762:  add $0x20,%rdi
00007ffff72e0766:  cmp %r11,%r10
00007ffff72e0769:  ja 0x7ffff72e06c6 <ov_read_filter+1190>
00007ffff72e076f:  mov (%rsp),%rdi
00007ffff72e0773:  add %rdi,%rax
00007ffff72e0776:  cmp %rdi,%r13
00007ffff72e0779:  je 0x7ffff72e03af <ov_read_filter+399>
00007ffff72e077f:  movss (%rcx,%rax,4),%xmm4
00007ffff72e0784:  lea 0x0(,%rax,4),%rdi
00007ffff72e078c:  movss 0xa44(%rip),%xmm0        # 0x7ffff72e11d8
00007ffff72e0794:  mulss %xmm0,%xmm4
00007ffff72e0798:  movss 0xa3c(%rip),%xmm2        # 0x7ffff72e11dc
00007ffff72e07a0:  movss 0xa38(%rip),%xmm1        # 0x7ffff72e11e0
00007ffff72e07a8:  movss (%rdx,%rax,4),%xmm3
00007ffff72e07ad:  mulss %xmm0,%xmm3
00007ffff72e07b1:  minss %xmm2,%xmm4
00007ffff72e07b5:  minss %xmm2,%xmm3
00007ffff72e07b9:  maxss %xmm1,%xmm4
00007ffff72e07bd:  maxss %xmm1,%xmm3
00007ffff72e07c1:  cvttss2si %xmm4,%r9d
00007ffff72e07c6:  mov %r9w,(%r15,%rax,4)
00007ffff72e07cb:  cvttss2si %xmm3,%r9d
00007ffff72e07d0:  mov %r9w,0x2(%r15,%rdi,1)
00007ffff72e07d6:  lea 0x1(%rax),%r9
00007ffff72e07da:  cmp %r9,%r8
00007ffff72e07dd:  jle 0x7ffff72e03af <ov_read_filter+399>
00007ffff72e07e3:  movss 0x4(%rcx,%rdi,1),%xmm4
00007ffff72e07e9:  mulss %xmm0,%xmm4
00007ffff72e07ed:  movss 0x4(%rdx,%rdi,1),%xmm3
00007ffff72e07f3:  mulss %xmm0,%xmm3
00007ffff72e07f7:  minss %xmm2,%xmm4
00007ffff72e07fb:  minss %xmm2,%xmm3
00007ffff72e07ff:  maxss %xmm1,%xmm4
00007ffff72e0803:  maxss %xmm1,%xmm3
00007ffff72e0807:  cvttss2si %xmm4,%r9d
00007ffff72e080c:  mov %r9w,0x4(%r15,%rdi,1)
00007ffff72e0812:  cvttss2si %xmm3,%r9d
00007ffff72e0817:  mov %r9w,0x6(%r15,%rdi,1)
00007ffff72e081d:  lea 0x2(%rax),%r9
00007ffff72e0821:  cmp %r9,%r8
00007ffff72e0824:  jle 0x7ffff72e03af <ov_read_filter+399>
00007ffff72e082a:  movss 0x8(%rcx,%rdi,1),%xmm4
00007ffff72e0830:  mulss %xmm0,%xmm4
00007ffff72e0834:  movss 0x8(%rdx,%rdi,1),%xmm3
00007ffff72e083a:  mulss %xmm0,%xmm3
00007ffff72e083e:  minss %xmm2,%xmm4
00007ffff72e0842:  minss %xmm2,%xmm3
00007ffff72e0846:  maxss %xmm1,%xmm4
00007ffff72e084a:  maxss %xmm1,%xmm3
00007ffff72e084e:  cvttss2si %xmm4,%r9d
00007ffff72e0853:  mov %r9w,0x8(%r15,%rdi,1)
00007ffff72e0859:  cvttss2si %xmm3,%r9d
00007ffff72e085e:  mov %r9w,0xa(%r15,%rdi,1)
00007ffff72e0864:  lea 0x3(%rax),%r9
00007ffff72e0868:  cmp %r9,%r8
00007ffff72e086b:  jle 0x7ffff72e03af <ov_read_filter+399>
00007ffff72e0871:  movss 0xc(%rcx,%rdi,1),%xmm4
00007ffff72e0877:  mulss %xmm0,%xmm4
00007ffff72e087b:  movss 0xc(%rdx,%rdi,1),%xmm3
00007ffff72e0881:  mulss %xmm0,%xmm3
00007ffff72e0885:  minss %xmm2,%xmm4
00007ffff72e0889:  minss %xmm2,%xmm3
00007ffff72e088d:  maxss %xmm1,%xmm4
00007ffff72e0891:  maxss %xmm1,%xmm3
00007ffff72e0895:  cvttss2si %xmm4,%r9d
00007ffff72e089a:  mov %r9w,0xc(%r15,%rdi,1)
00007ffff72e08a0:  cvttss2si %xmm3,%r9d
00007ffff72e08a5:  mov %r9w,0xe(%r15,%rdi,1)
00007ffff72e08ab:  lea 0x4(%rax),%r9
00007ffff72e08af:  cmp %r9,%r8
00007ffff72e08b2:  jle 0x7ffff72e03af <ov_read_filter+399>
00007ffff72e08b8:  movss 0x10(%rcx,%rdi,1),%xmm4
00007ffff72e08be:  mulss %xmm0,%xmm4
00007ffff72e08c2:  movss 0x10(%rdx,%rdi,1),%xmm3
00007ffff72e08c8:  mulss %xmm0,%xmm3
00007ffff72e08cc:  minss %xmm2,%xmm4
00007ffff72e08d0:  minss %xmm2,%xmm3
00007ffff72e08d4:  maxss %xmm1,%xmm4
00007ffff72e08d8:  maxss %xmm1,%xmm3
00007ffff72e08dc:  cvttss2si %xmm4,%r9d
00007ffff72e08e1:  mov %r9w,0x10(%r15,%rdi,1)
00007ffff72e08e7:  cvttss2si %xmm3,%r9d
00007ffff72e08ec:  mov %r9w,0x12(%r15,%rdi,1)
00007ffff72e08f2:  lea 0x5(%rax),%r9
00007ffff72e08f6:  cmp %r9,%r8
00007ffff72e08f9:  jle 0x7ffff72e03af <ov_read_filter+399>
00007ffff72e08ff:  movss 0x14(%rcx,%rdi,1),%xmm4
00007ffff72e0905:  add $0x6,%rax
00007ffff72e0909:  mulss %xmm0,%xmm4
00007ffff72e090d:  movss 0x14(%rdx,%rdi,1),%xmm3
00007ffff72e0913:  mulss %xmm0,%xmm3
00007ffff72e0917:  cmp %rax,%r8
00007ffff72e091a:  minss %xmm2,%xmm4
00007ffff72e091e:  minss %xmm2,%xmm3
00007ffff72e0922:  maxss %xmm1,%xmm4
00007ffff72e0926:  maxss %xmm1,%xmm3
00007ffff72e092a:  cvttss2si %xmm4,%r9d
00007ffff72e092f:  mov %r9w,0x14(%r15,%rdi,1)
00007ffff72e0935:  cvttss2si %xmm3,%r9d
00007ffff72e093a:  mov %r9w,0x16(%r15,%rdi,1)
00007ffff72e0940:  jle 0x7ffff72e03af <ov_read_filter+399>
00007ffff72e0946:  movss 0x18(%rcx,%rdi,1),%xmm3
00007ffff72e094c:  mulss %xmm0,%xmm3
00007ffff72e0950:  mulss 0x18(%rdx,%rdi,1),%xmm0
00007ffff72e0956:  minss %xmm2,%xmm3
00007ffff72e095a:  minss %xmm0,%xmm2
00007ffff72e095e:  maxss %xmm1,%xmm3
00007ffff72e0962:  maxss %xmm2,%xmm1
00007ffff72e0966:  cvttss2si %xmm3,%eax
00007ffff72e096a:  mov %ax,0x18(%r15,%rdi,1)
00007ffff72e0970:  cvttss2si %xmm1,%eax
00007ffff72e0974:  mov %ax,0x1a(%r15,%rdi,1)
00007ffff72e097a:  jmpq 0x7ffff72e03af <ov_read_filter+399>
00007ffff72e097f:  xor %eax,%eax
00007ffff72e0981:  jmpq 0x7ffff72e0661 <ov_read_filter+1089>
00007ffff72e0986:  nopw %cs:0x0(%rax,%rax,1)
                  ov_read:
00007ffff72e0990:  sub $0x10,%rsp
00007ffff72e0994:  pushq $0x0
00007ffff72e0996:  pushq $0x0
00007ffff72e0998:  pushq 0x28(%rsp)
00007ffff72e099c:  callq 0x7ffff72dad20 <ov_read_filter@plt>
00007ffff72e09a1:  add $0x28,%rsp
00007ffff72e09a5:  retq
00007ffff72e09a6:  nopw %cs:0x0(%rax,%rax,1)
                  ov_read_float:
00007ffff72e09b0:  mov 0x80(%rdi),%eax
00007ffff72e09b6:  cmp $0x1,%eax
00007ffff72e09b9:  jle 0x7ffff72e0ab9 <ov_read_float+265>
00007ffff72e09bf:  push %r15
00007ffff72e09c1:  push %r14
00007ffff72e09c3:  mov %rsi,%r15
00007ffff72e09c6:  push %r13
00007ffff72e09c8:  push %r12
00007ffff72e09ca:  mov %rcx,%r14
00007ffff72e09cd:  push %rbp
00007ffff72e09ce:  push %rbx
00007ffff72e09cf:  movslq %edx,%r13
00007ffff72e09d2:  mov %rdi,%rbx
00007ffff72e09d5:  lea 0x240(%rdi),%rbp
00007ffff72e09dc:  sub $0x28,%rsp
00007ffff72e09e0:  lea 0x18(%rsp),%r12
00007ffff72e09e5:  jmp 0x7ffff72e0a0f <ov_read_float+95>
00007ffff72e09e7:  nopw 0x0(%rax,%rax,1)
00007ffff72e09f0:  mov %rbx,%rdi
00007ffff72e09f3:  callq 0x7ffff72dc480 <_fetch_and_process_packet.constprop.10>
00007ffff72e09f8:  cmp $0xfffffffe,%eax
00007ffff72e09fb:  je 0x7ffff72e0a90 <ov_read_float+224>
00007ffff72e0a01:  test %eax,%eax
00007ffff72e0a03:  jle 0x7ffff72e0aa8 <ov_read_float+248>
00007ffff72e0a09:  mov 0x80(%rbx),%eax
00007ffff72e0a0f:  cmp $0x4,%eax
00007ffff72e0a12:  jne 0x7ffff72e09f0 <ov_read_float+64>
00007ffff72e0a14:  mov %r12,%rsi
00007ffff72e0a17:  mov %rbp,%rdi
00007ffff72e0a1a:  callq 0x7ffff72dacd0 <vorbis_synthesis_pcmout@plt>
00007ffff72e0a1f:  movslq %eax,%rdx
00007ffff72e0a22:  test %rdx,%rdx
00007ffff72e0a25:  je 0x7ffff72e09f0 <ov_read_float+64>
00007ffff72e0a27:  mov 0x68(%rbx),%rdi
00007ffff72e0a2b:  mov %rdx,0x8(%rsp)
00007ffff72e0a30:  callq 0x7ffff72dad10 <vorbis_synthesis_halfrate_p@plt>
00007ffff72e0a35:  test %r15,%r15
00007ffff72e0a38:  mov %eax,%r12d
00007ffff72e0a3b:  mov 0x8(%rsp),%rdx
00007ffff72e0a40:  je 0x7ffff72e0a4a <ov_read_float+154>
00007ffff72e0a42:  mov 0x18(%rsp),%rax
00007ffff72e0a47:  mov %rax,(%r15)
00007ffff72e0a4a:  cmp %r13,%rdx
00007ffff72e0a4d:  mov %rbp,%rdi
00007ffff72e0a50:  cmovle %rdx,%r13
00007ffff72e0a54:  mov %r13d,%esi
00007ffff72e0a57:  callq 0x7ffff72dadb0 <vorbis_synthesis_read@plt>
00007ffff72e0a5c:  mov %r13,%rax
00007ffff72e0a5f:  mov %r12d,%ecx
00007ffff72e0a62:  shl %cl,%rax
00007ffff72e0a65:  add %rax,0x78(%rbx)
00007ffff72e0a69:  test %r14,%r14
00007ffff72e0a6c:  je 0x7ffff72e0a77 <ov_read_float+199>
00007ffff72e0a6e:  mov 0x90(%rbx),%eax
00007ffff72e0a74:  mov %eax,(%r14)
00007ffff72e0a77:  add $0x28,%rsp
00007ffff72e0a7b:  mov %r13,%rax
00007ffff72e0a7e:  pop %rbx
00007ffff72e0a7f:  pop %rbp
00007ffff72e0a80:  pop %r12
00007ffff72e0a82:  pop %r13
00007ffff72e0a84:  pop %r14
00007ffff72e0a86:  pop %r15
00007ffff72e0a88:  retq
00007ffff72e0a89:  nopl 0x0(%rax)
00007ffff72e0a90:  add $0x28,%rsp
00007ffff72e0a94:  xor %eax,%eax
00007ffff72e0a96:  pop %rbx
00007ffff72e0a97:  pop %rbp
00007ffff72e0a98:  pop %r12
00007ffff72e0a9a:  pop %r13
00007ffff72e0a9c:  pop %r14
00007ffff72e0a9e:  pop %r15
00007ffff72e0aa0:  retq
00007ffff72e0aa1:  nopl 0x0(%rax)
00007ffff72e0aa8:  add $0x28,%rsp
00007ffff72e0aac:  cltq
00007ffff72e0aae:  pop %rbx
00007ffff72e0aaf:  pop %rbp
00007ffff72e0ab0:  pop %r12
00007ffff72e0ab2:  pop %r13
00007ffff72e0ab4:  pop %r14
00007ffff72e0ab6:  pop %r15
00007ffff72e0ab8:  retq
00007ffff72e0ab9:  mov $0xffffffffffffff7d,%rax
00007ffff72e0ac0:  retq
00007ffff72e0ac1:  data32 data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
                  ov_crosslap:
00007ffff72e0ad0:  cmp %rsi,%rdi
00007ffff72e0ad3:  je 0x7ffff72e1128 <ov_crosslap+1624>
00007ffff72e0ad9:  mov 0x80(%rdi),%eax
00007ffff72e0adf:  cmp $0x1,%eax
00007ffff72e0ae2:  jle 0x7ffff72e1135 <ov_crosslap+1637>
00007ffff72e0ae8:  push %rbp
00007ffff72e0ae9:  mov %rsp,%rbp
00007ffff72e0aec:  push %r15
00007ffff72e0aee:  push %r14
00007ffff72e0af0:  push %r13
00007ffff72e0af2:  push %r12
00007ffff72e0af4:  mov %rsi,%r12
00007ffff72e0af7:  push %rbx
00007ffff72e0af8:  mov %rdi,%rbx
00007ffff72e0afb:  sub $0x48,%rsp
00007ffff72e0aff:  cmpl $0x1,0x80(%rsi)
00007ffff72e0b06:  jg 0x7ffff72e0b2c <ov_crosslap+92>
00007ffff72e0b08:  jmpq 0x7ffff72e112b <ov_crosslap+1627>
00007ffff72e0b0d:  nopl (%rax)
00007ffff72e0b10:  mov %rbx,%rdi
00007ffff72e0b13:  callq 0x7ffff72dbee0 <_fetch_and_process_packet.constprop.11>
00007ffff72e0b18:  cmp $0xfffffffd,%eax
00007ffff72e0b1b:  je 0x7ffff72e0b26 <ov_crosslap+86>
00007ffff72e0b1d:  mov %eax,%edx
00007ffff72e0b1f:  shr $0x1f,%edx
00007ffff72e0b22:  test %dl,%dl
00007ffff72e0b24:  jne 0x7ffff72e0b61 <ov_crosslap+145>
00007ffff72e0b26:  mov 0x80(%rbx),%eax
00007ffff72e0b2c:  cmp $0x4,%eax
00007ffff72e0b2f:  jne 0x7ffff72e0b10 <ov_crosslap+64>
00007ffff72e0b31:  lea 0x240(%r12),%r13
00007ffff72e0b39:  nopl 0x0(%rax)
00007ffff72e0b40:  cmpl $0x4,0x80(%r12)
00007ffff72e0b49:  je 0x7ffff72e0b70 <ov_crosslap+160>
00007ffff72e0b4b:  mov %r12,%rdi
00007ffff72e0b4e:  callq 0x7ffff72dbee0 <_fetch_and_process_packet.constprop.11>
00007ffff72e0b53:  cmp $0xfffffffd,%eax
00007ffff72e0b56:  je 0x7ffff72e0b40 <ov_crosslap+112>
00007ffff72e0b58:  mov %eax,%edx
00007ffff72e0b5a:  shr $0x1f,%edx
00007ffff72e0b5d:  test %dl,%dl
00007ffff72e0b5f:  je 0x7ffff72e0b40 <ov_crosslap+112>
00007ffff72e0b61:  lea -0x28(%rbp),%rsp
00007ffff72e0b65:  pop %rbx
00007ffff72e0b66:  pop %r12
00007ffff72e0b68:  pop %r13
00007ffff72e0b6a:  pop %r14
00007ffff72e0b6c:  pop %r15
00007ffff72e0b6e:  pop %rbp
00007ffff72e0b6f:  retq
00007ffff72e0b70:  xor %esi,%esi
00007ffff72e0b72:  mov %r13,%rdi
00007ffff72e0b75:  callq 0x7ffff72dacd0 <vorbis_synthesis_pcmout@plt>
00007ffff72e0b7a:  test %eax,%eax
00007ffff72e0b7c:  je 0x7ffff72e0b4b <ov_crosslap+123>
00007ffff72e0b7e:  mov $0xffffffff,%esi
00007ffff72e0b83:  mov %rbx,%rdi
00007ffff72e0b86:  callq 0x7ffff72dae00 <ov_info@plt>
00007ffff72e0b8b:  mov $0xffffffff,%esi
00007ffff72e0b90:  mov %r12,%rdi
00007ffff72e0b93:  mov %rax,%r14
00007ffff72e0b96:  callq 0x7ffff72dae00 <ov_info@plt>
00007ffff72e0b9b:  mov %rbx,%rdi
00007ffff72e0b9e:  mov %rax,%r15
00007ffff72e0ba1:  callq 0x7ffff72dac20 <ov_halfrate_p@plt>
00007ffff72e0ba6:  mov %r12,%rdi
00007ffff72e0ba9:  mov %eax,-0x48(%rbp)
00007ffff72e0bac:  callq 0x7ffff72dac20 <ov_halfrate_p@plt>
00007ffff72e0bb1:  mov %eax,%r12d
00007ffff72e0bb4:  movslq 0x4(%r14),%rax
00007ffff72e0bb8:  xor %esi,%esi
00007ffff72e0bba:  mov %r14,%rdi
00007ffff72e0bbd:  lea 0x1e(,%rax,8),%rax
00007ffff72e0bc5:  and $0xfffffffffffffff0,%rax
00007ffff72e0bc9:  sub %rax,%rsp
00007ffff72e0bcc:  lea 0xf(%rsp),%rax
00007ffff72e0bd1:  and $0xfffffffffffffff0,%rax
00007ffff72e0bd5:  mov %rax,-0x50(%rbp)
00007ffff72e0bd9:  callq 0x7ffff72daca0 <vorbis_info_blocksize@plt>
00007ffff72e0bde:  mov -0x48(%rbp),%ecx
00007ffff72e0be1:  mov %eax,%r9d
................. ...

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #52
OK, I will try to reproduce it next week.

This looks like SSE code generated by the compiler (i.e. not the Lancer code). What compiler version and CFLAGS did you use to build the Vorbis libraries?

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #53
IMHO it looks like code from ov_read_float2pcm(); it would crash if ov_read_filter() calls it with unaligned arguments.

You could change all _mm_load_ps calls in ov_read_float2pcm() to _mm_loadu_ps (and maybe _mm_store_si128 to _mm_storeu_si128) and test again.

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #54
Thanks Enzo and lvqcl,

I'm using the standard gcc 4.9.2 compiler + export CFLAGS="-msse3" (CPU : AMD 64 X2 5000xp)

lvqcl, I will try out your advice today or later this week.


aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #55
Great, it works now! Thank you very much, lvqcl and Enzo!

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #56
BTW: did you change only _mm_load_ps to _mm_loadu_ps? Or was the _mm_store_si128 -> _mm_storeu_si128 change also necessary?

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #57
I just reproduced this issue as well. Both _mm_load_ps and _mm_store_si128 need to be changed. ov_read_filter has no control over the buffer that receives the data, so it must be assumed to be unaligned.

Regarding the source, the internal Vorbis synthesis buffer is 16-byte aligned, but the pointer returned by vorbis_synthesis_pcmout may point to the middle of a 16-byte block if the number of samples read before is not a multiple of four; it is then only 4- or 8-byte aligned. That's why _mm_loadu_ps is needed.
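
To illustrate the point about alignment, here is a minimal sketch of a float-to-16-bit conversion loop that uses only unaligned loads and stores. It is not the Lancer code: float_to_pcm16_unaligned is a hypothetical helper, and the scaling by 32768 with saturation is an assumption made for the example. It needs only SSE2 and falls back to plain C where SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper (not from the patches): converts n floats in the
 * nominal range [-1, 1] to 16-bit PCM with saturation. Neither src nor dst
 * has to be 16-byte aligned. */
void float_to_pcm16_unaligned(const float *src, int16_t *dst, size_t n)
{
    size_t i = 0;
    assert(src != NULL && dst != NULL);

#if defined(__SSE2__)
    {
        const __m128 scale = _mm_set1_ps(32768.0f);
        for (; i + 8 <= n; i += 8) {
            /* _mm_loadu_ps accepts any address; _mm_load_ps would fault
             * on a pointer that is not 16-byte aligned. */
            __m128 lo = _mm_mul_ps(_mm_loadu_ps(src + i), scale);
            __m128 hi = _mm_mul_ps(_mm_loadu_ps(src + i + 4), scale);
            /* Convert to int32, then pack to int16 with signed saturation. */
            __m128i packed = _mm_packs_epi32(_mm_cvtps_epi32(lo),
                                             _mm_cvtps_epi32(hi));
            /* Same story for the store: the caller's buffer may be unaligned. */
            _mm_storeu_si128((__m128i *)(dst + i), packed);
        }
    }
#endif

    /* Scalar tail (and the whole job when SSE2 is not available). */
    for (; i < n; i++) {
        float v = src[i] * 32768.0f;
        if (v > 32767.0f) v = 32767.0f;
        if (v < -32768.0f) v = -32768.0f;
        dst[i] = (int16_t)(v < 0.0f ? v - 0.5f : v + 0.5f);
    }
}
```

Calling it with src and dst deliberately offset by one element (for example in + 1 and out + 1) exercises exactly the misaligned case that crashed with the aligned intrinsics.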

Thanks lvqcl for your hint! I've updated the patches accordingly.

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #58
BTW: did you change only _mm_load_ps to _mm_loadu_ps? Or was the _mm_store_si128 -> _mm_storeu_si128 change also necessary?


I changed both directly, to be sure ...

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #59
Will this get posted on rarewares?

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #60
Yes.

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #61
I've amended the code, but I'm away this weekend so I'm afraid the compiles won't get updated until early next week.

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #62
Thanks. 

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #63
Any news about new compiles John?

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #64
Hello,

are the current builds on rarewares stable? Or should we wait for an updated build from john33?

Best regards

akapuma

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #65
Is it possible to upload this to GitHub or somewhere else, with an easy explanation of how to apply the patches to the source? I have mostly compiled using Arch's makepkg and Debian's system, which have everything preconfigured. Thanks!

On a side note, why can't Xiph include this in the official version? Is the problem that the instructions may not be available on older processors? If so, couldn't it be programmed to detect which CPU is in use and fall back to generic code?
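
For what it's worth, the kind of runtime detection asked about here can be sketched in C. This is only an illustration and not how libvorbis or the Lancer patches are organized: scale_generic, scale_sse3 and pick_scale are hypothetical names, and the "SSE3" variant is a stand-in that just forwards to the generic code. The check relies on the GCC/Clang builtin __builtin_cpu_supports, which is assumed to be available only on x86 targets here; elsewhere the sketch always picks the generic path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature shared by all variants of one routine. */
typedef void (*scale_fn)(float *data, size_t n, float gain);

/* Generic C version: runs on any CPU. */
void scale_generic(float *data, size_t n, float gain)
{
    size_t i;
    assert(data != NULL || n == 0);
    for (i = 0; i < n; i++)
        data[i] *= gain;
}

/* Stand-in for an optimized variant. In a real build this would contain
 * SSE3 intrinsics and live in a file compiled with -msse3; here it simply
 * forwards to the generic code so the sketch runs everywhere. */
void scale_sse3(float *data, size_t n, float gain)
{
    scale_generic(data, n, gain);
}

/* Returns 1 if the running CPU reports SSE3, 0 otherwise. */
int cpu_has_sse3(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse3") ? 1 : 0;
#else
    return 0;  /* unknown compiler or architecture: stay on the safe path */
#endif
}

/* Choose the implementation once, at run time. */
scale_fn pick_scale(void)
{
    return cpu_has_sse3() ? scale_sse3 : scale_generic;
}
```

A caller would do something like scale_fn f = pick_scale(); f(buf, len, 0.5f); so one binary works on old and new processors alike.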

aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #66
By "Arch's makepkg" do you mean this AUR package? If yes, then it already has all patches applied. The PKGBUILD will also tell you how to apply the patch; for generating one, just look at man patch.

Re: aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #67
By "Arch's makepkg" do you mean this AUR package? If yes, then it already has all patches applied. The PKGBUILD will also tell you how to apply the patch; for generating one, just look at man patch.
I had missed that they had updated the package. I am not using Arch at the moment so I will look in the PKGBUILD like you said. Thanks!

Re: aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #68
I've amended the code, but I'm away this weekend so I'm afraid the compiles won't get updated until early next week.
hi john, any news regarding the new compiles?

Re: aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #69
Hello.

Just wanted to thank you for the patches. I've compiled them on FreeBSD with clang 3.4.1, and encoding is on average 22% faster than the baseline (using the same flags, -O3 -march=native).

The files are not bit-identical, e.g.:

Code:
Prior:

Done encoding file "audio_07.ogg"

        File length:  9m 03.0s
        Elapsed time: 0m 17.7s
        Rate:         30.7803
        Average bitrate: 239.9 kb/s

ogginfo

AO; aoTuV [20110424] (based on Xiph.Org's libVorbis)

Vorbis stream 1:
        Total data length: 16289845 bytes
        Playback length: 9m:03.333s
        Average bitrate: 239,850479 kb/s

Post:

Done encoding file "audio_07.ogg"

        File length:  9m 03.0s
        Elapsed time: 0m 13.7s
        Rate:         39.6077
        Average bitrate: 239.8 kb/s

ogginfo

BS; LancerMod(SSE3) (based on aoTuV [20110424])

Vorbis stream 1:
        Total data length: 16289693 bytes
        Playback length: 9m:03.333s
        Average bitrate: 239,848240 kb/s

That was expected though. Thanks!
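
As a quick check of the numbers for this one file, the elapsed times above can be turned into a percentage with a few lines of C. The helper names time_saved_percent and speedup_factor are made up for this sketch. With 17.7 s before and 13.7 s after, the time saved comes out at about 22.6%, which corresponds to a speedup of roughly 1.29x.

```c
#include <assert.h>

/* Percentage of wall-clock time saved when a job that took before_s
 * seconds now takes after_s seconds. */
double time_saved_percent(double before_s, double after_s)
{
    assert(before_s > 0.0);
    return (before_s - after_s) / before_s * 100.0;
}

/* The same change expressed as a throughput factor (old time / new time). */
double speedup_factor(double before_s, double after_s)
{
    assert(after_s > 0.0);
    return before_s / after_s;
}
```

The two figures are easy to mix up: "22% faster" in the sense of time saved is not the same as 22% more throughput.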

Re: aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #70
Has anybody tried compiling oggenc2 aoTuV with the normal Visual Studio compiler instead of ICC?
What would be the expected speed penalty?

Re: aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #71
Hello. Can anyone help? Where can I get the xmmlib sources for Ubuntu Linux?

Code:
gcc: error: .libs/xmmlib.o: No such file or directory

Building aoTuV without patches works.



Re: aoTuV Patches, Vorbis 1.3.5 and Lancer

Reply #74
Building fails because of "--disable-shared --enable-static". I guess it would be useful if Enzo could apply a fix for the static build.
Sorry, I've seen this only now. The patches are now updated to fix this.