Re: ADC (Adaptive Differential Coding) My Experimental Lossy Audio Codec
Reply #82 – 2024-11-04 16:10:39
I thank everyone for the tests (except for someone who doesn't deserve respect for the ridiculous comments he makes).

The -b26 option makes no sense in this test: the bitrate is set to 128 kbps at minimum, and the input is 8 kHz audio, which is already terrible in quality. I respect the choice of tests, although it is considerably reductive.

In my own tests with EAQUAL - Evaluation of Audio Quality ( https://github.com/spxnn/eaqual - https://www.rarewares.org/files/others/eaqual.zip ), using "Sopranino Recorder Concerto, RV 443_ Allegro" by Vivaldi as the test file, MP3 at 192 kbps gives:

Resulting ODG: -0.16
Resulting DIX: 2.32
BandwidthRef 17511.4298
BandwidthTest 15909.6910
NMR -15.6350
WinModDiff1 5.0334
ADB -0.8332
EHS 0.2262
AvgModDiff1 5.1856
AvgModDiff2 9.0777
NoiseLoud 0.0853
MFPD 1.0000
RDF 0.0041

while ADC, also at 192 kbps, gives:

Resulting ODG: -3.05
Resulting DIX: -1.26
BandwidthRef 11930.2636
BandwidthTest 11913.7601
NMR -2.9485
WinModDiff1 12.0720
ADB 1.8283
EHS 0.2967
AvgModDiff1 10.4023
AvgModDiff2 26.8317
NoiseLoud 0.3622
MFPD 1.0000
RDF 0.7832

There is certainly a gap in the handling of CBR/ABR. But with the options -tx -q12, ADC gives for example:

Resulting ODG: -0.88
Resulting DIX: 1.03
BandwidthRef 14253.3514
BandwidthTest 14252.6861
NMR -13.1225
WinModDiff1 4.6411
ADB 1.3101
EHS 0.2593
AvgModDiff1 3.6089
AvgModDiff2 8.0902
NoiseLoud 0.1116
MFPD 1.0000
RDF 0.0512

ODG (Objective Difference Grade): Measures perceived fidelity on a scale from 0 (imperceptible difference) down to -4 (very annoying degradation). A higher (closer to zero) ODG indicates better quality.

DIX (Distortion Index): Combines the individual distortion measures listed here into a single overall value, from which the ODG is derived. Higher values indicate better quality, as the results above show: the highest DIX goes with the best ODG.
BandwidthRef and BandwidthTest: BandwidthRef is the bandwidth of the reference (original) signal in Hz, i.e. the highest frequency with significant content in the original audio. BandwidthTest is the same measurement after compression and decompression. A BandwidthTest lower than BandwidthRef suggests a loss of high-frequency information.

NMR (Noise-to-Mask Ratio): Reflects the audibility of the noise introduced by compression. A more negative NMR means that the added noise lies further below the masking threshold of the audio signal and is therefore less perceptible.

WinModDiff1: The windowed modulation difference, i.e. the difference between the temporal envelope modulation of the reference and test signals, averaged with a sliding window. Lower values indicate that the envelopes match more closely.

ADB (Average Distorted Block): Describes how severe the distortion is, on average, in the frames where the model considers it detectable. Lower values are better.

EHS (Harmonic Structure of the Error): Measures how much harmonic structure the error signal has. Lower values are better, since a structured, tonal error tends to be more audible than a noise-like one.

AvgModDiff1 and AvgModDiff2: Both are averages of the modulation difference between the reference and test envelopes over the whole file. They differ in how the difference is weighted, not in time scale. Lower values are better.

NoiseLoud (Noise Loudness): Measures the perceived loudness of the noise introduced by the codec. Lower values indicate that the noise is less noticeable to the human ear.

MFPD (Maximum Filtered Probability of Detection): The maximum over time of the smoothed probability that a listener detects a difference. A value of 1.0 means that, according to the model, the difference is almost certainly detectable at some point in the file. Lower values are better.
RDF (Relative Disturbed Frames): The fraction of frames in which the noise clearly exceeds the masking threshold in at least one frequency band. Lower values are better.

I won't give up, whatever you say. I'm not looking for innovation, just a different way to compress audio.

Thanks for providing the link to EAQUAL; I will inspect it closely. (The project seems to have been last active 7 years ago.) You could also try other audio evaluation scores such as PSNR, SDR, SI-SDR, MAE, NRMSE and MDA, all available in librempeg.
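For anyone who wants to try the simpler sample-domain scores mentioned above without librempeg, here is a minimal NumPy sketch. It uses common textbook definitions, which may differ in detail (normalization, peak value) from what librempeg computes, and the function names are made up for this example. It assumes the reference and the decoded signal are already loaded as float arrays with the same length and sample rate and are time-aligned; any codec delay has to be removed first or every score will look far worse than it should. MDA is left out because I am not sure which definition it refers to. Unlike ODG, none of these scores model masking, so they say little about perceived quality.

```python
import numpy as np

def mae(ref, test):
    """Mean absolute error between reference and test samples."""
    return float(np.mean(np.abs(ref - test)))

def nrmse(ref, test):
    """RMSE divided by the peak-to-peak range of the reference.
    Other normalizations (mean, standard deviation) are also in use."""
    rmse = np.sqrt(np.mean((ref - test) ** 2))
    return float(rmse / (np.max(ref) - np.min(ref)))

def psnr_db(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB; peak=1.0 assumes float audio in [-1, 1]."""
    mse = np.mean((ref - test) ** 2)
    return float(10.0 * np.log10(peak ** 2 / mse))

def sdr_db(ref, test):
    """Signal-to-distortion ratio in dB: reference energy over error energy."""
    err = ref - test
    return float(10.0 * np.log10(np.sum(ref ** 2) / np.sum(err ** 2)))

def si_sdr_db(ref, test):
    """Scale-invariant SDR in dB: the test signal is first projected onto the
    reference, so a pure gain change does not affect the score."""
    alpha = np.dot(test, ref) / np.dot(ref, ref)
    target = alpha * ref
    err = test - target
    return float(10.0 * np.log10(np.sum(target ** 2) / np.sum(err ** 2)))

# Synthetic stand-in for a real reference/decoded pair (hypothetical data):
# one second of a 440 Hz sine at 48 kHz, plus a small cosine error term.
sample_rate = 48000
t = np.arange(sample_rate) / sample_rate
ref = np.sin(2.0 * np.pi * 440.0 * t)
decoded = ref + 0.01 * np.cos(2.0 * np.pi * 440.0 * t)

for name, fn in [("MAE", mae), ("NRMSE", nrmse), ("PSNR dB", psnr_db),
                 ("SDR dB", sdr_db), ("SI-SDR dB", si_sdr_db)]:
    print(f"{name:10s} {fn(ref, decoded):.4f}")
```

In the synthetic example the error has 1/100 of the amplitude of the signal, so the SDR comes out at about 40 dB. For PSNR, SDR and SI-SDR higher is better; for MAE and NRMSE lower is better.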