I need to extract the level values from a WAV file: the plain sound data as a collection of numbers.
stats [-b bits|-x bits|-s scale] [-w window-time]

Display time domain statistical information about the audio channels; audio is passed unmodified through the SoX processing chain. Statistics are calculated and displayed for each audio channel and, where applicable, an overall figure is also given.

For example, for a typical well-mastered stereo music file:

[example output, shown as an image in the manual]

DC offset, Min level, and Max level are shown, by default, in the range ±1. If the -b (bits) option is given, then these three measurements will be scaled to a signed integer with the given number of bits; for example, for 16 bits, the scale would be −32768 to +32767. The -x option behaves the same way as -b except that the signed integer values are displayed in hexadecimal. The -s option scales the three measurements by a given floating-point number.

Pk lev dB and RMS lev dB are standard peak and RMS level measured in dBFS. RMS Pk dB and RMS Tr dB are peak and trough values for RMS level measured over a short window (default 50ms).

Crest factor is the standard ratio of peak to RMS level (note: not in dB).

Flat factor is a measure of the flatness (i.e. consecutive samples with the same value) of the signal at its peak levels (i.e. either Min level, or Max level). Pk count is the number of occasions (not the number of samples) that the signal attained either Min level, or Max level.

The right-hand Bit-depth figure is the standard definition of bit-depth, i.e. bits less significant than the given number are fixed at zero. The left-hand figure is the number of most significant bits that are fixed at zero (or one for negative numbers) subtracted from the right-hand figure (the number subtracted is directly related to Pk lev dB).

For multi-channel audio, an overall figure for each of the above measurements is given and derived from the channel figures as follows: DC offset: maximum magnitude; Max level, Pk lev dB, RMS Pk dB, Bit-depth: maximum; Min level, RMS Tr dB: minimum; RMS lev dB, Flat factor, Pk count: average; Crest factor: not applicable.

Length s is the duration in seconds of the audio, and Num samples is equal to the sample-rate multiplied by Length. Scale Max is the scaling applied to the first three measurements; specifically, it is the maximum value that could apply to Max level. Window s is the length of the window used for the peak and trough RMS measurements.
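To make a few of the figures described above concrete, here is a rough Python sketch that computes DC offset, Min/Max level, peak and RMS level in dBFS, and crest factor for one channel. It is not SoX's implementation; the function name channel_stats and the scaling by 2^(bits-1) for the bits argument are my own assumptions, chosen to match the description of -b in the manual text.

```python
import math


def channel_stats(samples, bits=None):
    """Compute a few stats-style figures for one channel.

    `samples` are floats in the range -1..1. If `bits` is given, the first
    three figures are scaled by 2**(bits-1) (an assumption based on the
    manual's description of -b, not taken from the SoX source).
    """
    if not samples:
        raise ValueError("need at least one sample")

    n = len(samples)
    dc = sum(samples) / n                     # mean value = DC offset
    lo, hi = min(samples), max(samples)
    peak = max(abs(lo), abs(hi))              # largest magnitude
    rms = math.sqrt(sum(s * s for s in samples) / n)

    def to_db(x):
        # dBFS relative to full scale 1.0; silence maps to -infinity
        return 20 * math.log10(x) if x > 0 else float("-inf")

    scale = 1.0 if bits is None else float(2 ** (bits - 1))
    return {
        "dc_offset": dc * scale,
        "min_level": lo * scale,
        "max_level": hi * scale,
        "pk_lev_db": to_db(peak),
        "rms_lev_db": to_db(rms),
        # plain ratio, not in dB; undefined for pure silence
        "crest_factor": peak / rms if rms > 0 else float("nan"),
    }
```

For a square-like signal such as [0.5, -0.5, 0.5, -0.5], peak and RMS are both 0.5, so the crest factor comes out as 1 and both levels are about -6 dBFS.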
Thank you. The problem was that knowing the WAV file structure doesn't show me how to extract the values from the data chunk. That is the reason for this thread. Furthermore, I need a solution for Unix.
I am looking for a command that extracts the 01001101 10111110 00001111 11110101 (the bytes of the data chunk) ... that reads out the sample values. Only the sound data. I don't know what more I should explain.
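One way to get at only the data chunk on Unix, without parsing the header by hand, is Python's standard wave module, which skips the header and hands back the raw sample bytes. The sketch below is a minimal example under some assumptions: the file is uncompressed 16-bit PCM, and the function name read_samples is made up for illustration.

```python
import struct
import wave


def read_samples(path):
    """Return (channels, sample_rate, frames) for a 16-bit PCM WAV file.

    `frames` is a list of tuples, one signed integer per channel.
    """
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        width = wav.getsampwidth()          # bytes per sample
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())  # data chunk contents only

    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    # WAV stores 16-bit samples as little-endian signed integers
    count = len(raw) // 2
    values = struct.unpack("<%dh" % count, raw[:count * 2])

    # Samples are interleaved: L R L R ... for stereo
    frames = [values[i:i + channels] for i in range(0, count, channels)]
    return channels, rate, frames
```

Printing each frame on its own line, for example with `for frame in frames: print(*frame)`, gives one line per sample time with the left value first and the right value second for a stereo file.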
sample-data5.txt 2 channels (stereo)
Left channel then Right channel on same line.
Sample Rate: 44100 Hz. Sample values on linear scale.
Length processed: 100 samples 0.00227 seconds.

-0.00183 0.18970
-0.00186 0.14175
-0.02280 0.14066
-0.01584 0.16815
-0.02466 0.11380
-0.08188 0.06107
-0.09854 0.05853
-0.14084 0.03131
-0.20316 0.03732
-0.14594 0.12439
-0.07410 0.19080
-0.13168 0.18982
-0.22549 0.20508
-0.19412 0.27438
-0.15729 0.28787
-0.24057 0.23837
-0.23859 0.23926
-0.13712 0.25116
-0.11496 0.21194
-0.09512 0.19464
-0.00565 0.20648
0.04703 0.21762
0.07034 0.26285
0.09454 0.23392
0.01279 0.12692
.....
465 -97 -221 -29 208 -158 -143 128 -29 -210 -145 259 21 -69 97 51 -133 -110 405 -24 -325 -115 164 13 -188 188 162 -61 -243 216 177 -220 -91 27 13 -228 196 197 -182 -28 163 149 -193 2 116
@saratoga: The command wvunpack -r in.wv output.samples creates a raw file.
Later I will try to get binary data with a hex-to-binary converter.
.. the actual sample data, ideally as 01010111
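If the goal is the samples written out as 0s and 1s, a separate hex-to-binary step may not be needed. Here is a small Python sketch that turns a signed integer sample into its two's-complement bit string, and also shows the bytes in the order they would sit in a 16-bit WAV data chunk (little-endian, low byte first). The helper names to_bits and to_file_bytes are invented for this example, and 16-bit samples are assumed.

```python
import struct


def to_bits(value, width=16):
    """Two's-complement bit string of a signed integer sample."""
    limit = 1 << (width - 1)
    if not -limit <= value < limit:
        raise ValueError("sample does not fit in %d bits" % width)
    # Masking maps negative values onto their two's-complement pattern
    return format(value & ((1 << width) - 1), "0%db" % width)


def to_file_bytes(value):
    """Bits of a 16-bit sample in on-disk byte order (low byte first)."""
    return " ".join(format(b, "08b") for b in struct.pack("<h", value))


print(to_bits(465))         # 0000000111010001
print(to_bits(-97))         # 1111111110011111
print(to_file_bytes(465))   # 11010001 00000001
```

The first two lines show the value as a single 16-bit number; the last line shows the same sample as the two bytes you would see when dumping the file byte by byte.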