HydrogenAudio

Lossy Audio Compression => Ogg Vorbis => Ogg Vorbis - Tech => Topic started by: Garf on 2002-05-12 11:04:25

Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-12 11:04:25
It occurred to me today that there is a problem with the current ReplayGain spec, or rather, with my proposal for doing it in Vorbis.

The issue is combining replaygain and clipping prevention.

If applying the replaygain would cause the track to clip, clipping prevention kicks in, and reduces the level. This will make the output loudness different from the ideal, 'equal' level.
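The interaction described above can be sketched in a few lines of C. This is only an illustration under assumptions: the function name `clip_safe_scale` is hypothetical, the gain is taken as a linear factor (the caller is assumed to have converted dB with 10^(dB/20)), and the peak is the largest absolute sample value with 1.0 meaning full scale.

```c
/* Linear scale factor actually applied once clipping prevention kicks in.
   scale: desired linear gain from ReplayGain plus preamp (1.0 = unchanged).
   peak:  largest absolute sample value of the material (1.0 = full scale).
   Both the name and the linear-domain interface are assumptions. */
double clip_safe_scale(double scale, double peak)
{
    /* If the scaled peak would exceed full scale, fall back to the
       largest factor that still fits, which is 1/peak. */
    if (peak > 0.0 && scale * peak > 1.0)
        return 1.0 / peak;
    return scale;
}
```

With a desired factor of 2.0 and a peak of 0.8, the function returns 1.25 instead of 2.0, so the track ends up roughly 4 dB quieter than the "equal loudness" level. That difference is the problem being discussed.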

When running in radio/track mode, there is no way around this, since you don't know in advance what you are going to encounter. The best you can do is set the default level low enough that you can hope it'll never happen. I believe this was the idea (among possibly other things) behind setting the default level to K-20 in the new MPC decoders? (Frank?)

If the implementation in the current Vorbis players is correct, a similar effect can be reached by setting the preamp in the plugin to -6dB or so.

In album gain mode, you could avoid this for the entire album you're listening to, since you already ReplayGain-processed the tracks as a group and thus know what is coming up. However, my current proposal poses problems for doing this: you would need to open all files that belong to the album, read their peak values, remember the largest, and use that as the peak value for the individual tracks.

This is what I originally envisioned; however, looking back, it is ugly and cumbersome, and it may not even be possible in some player/plugin architectures.

I think the correct solution would probably be to store an album-peak value.
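For illustration, the value such a tag would hold is simply the largest of the per-track peaks, computed once by the tool that already scans the whole album. A minimal sketch, with a hypothetical function name:

```c
#include <stddef.h>

/* Album peak as the maximum of the per-track peak values.
   track_peaks holds one peak per track (1.0 = full scale).
   The function name is hypothetical. */
double album_peak(const double *track_peaks, size_t count)
{
    double peak = 0.0;
    size_t i;

    for (i = 0; i < count; i++)
        if (track_peaks[i] > peak)
            peak = track_peaks[i];
    return peak;
}
```

Doing this at tagging time means a player never has to open the other files of the album during playback.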

It would be trivial to implement in the ReplayGain tools, and require only minimal changes in the players without all the ugliness the current method would require (which isn't done correctly by anyone anyway).

The disadvantage is that it requires another tag. However, since the Vorbis people seem to have gotten a bit more enthusiastic about ReplayGain lately, perhaps that isn't so much of a problem.

I believe it's valuable to do this, as it may pose a real problem in practice. Moreover, the proposal as it is now is broken by design in this regard, and I'd prefer to fix it while it's still fixable.

Also, the ReplayGain proposal on David's site doesn't mention anything about this? Is there another way to address this problem?

There are two other issues with the current spec that I'd like to discuss while it's still possible.

1) Change RG_* into REPLAYGAIN_*
This was proposed by Segher, with the idea that someone who looks at the tags and doesn't know what they are can at least google them to find out, whereas the current 'RG' leaves you clueless. I think this idea is valuable and good.

2) Source/version tag
I didn't include one originally because I saw no way to keep it consistent if you allow the user to edit the tags (you can't require them to know the spec...), and because I didn't see the RG calculations being improved for quite a while. Unfortunately, Frank Klemm has already proven me wrong on the latter. I don't see a way to make such a tag actually _work_ though.

I'd like feedback from everyone about all of this. Is it worthwhile to change the current proposal and fix some of the above issues?

--
GCP
Title: Flaw in ReplayGain spec
Post by: SometimesWarrior on 2002-05-12 13:42:29
I think an album (peak) gain value would be good to have in Vorbis. It's already in Musepack, and I use it whenever I transcode MPC to MP3 for my portable player.

Of course, with Vorbis, one shouldn't have to transcode for portable listening purposes (eventually), but as long as an album gain value is not hard to implement, it might as well be supported, right?

And I like the RG_* --> REPLAYGAIN_* change, since I remember reading from the MAC 2.90 thread that many people think the human-readability of Vorbis tags (or is that Ogg tags?) is a big deal. But since I don't have a single Vorbis file on my computer currently, it wouldn't affect my collection in any way, whereas it might be a pain for those who have already encoded and replaygained their music files with Vorbis. Is it possible to have both tags supported, or is that just a bad idea? If not, then how hard would it be to automatically convert all Vorbis files on a user's hard drive from RG_* to REPLAYGAIN_*?

EDIT: whoops, I meant album peak value, not album gain.
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-12 13:47:06
Quote
Originally posted by SometimesWarrior
I think an album gain value would be good to have in Vorbis.


It already is in Vorbis too. My proposal is about album peak tags. (Which MPC has, but the ReplayGain spec doesn't)

Quote
Is it possible to have both tags supported, or is that just a bad idea? If not, then how hard would it be to automatically convert all Vorbis files on a user's hard drive from RG_* to REPLAYGAIN_*?


Simple change in the ReplayGain tool to make it autoconvert all files.

--
GCP
Title: Flaw in ReplayGain spec
Post by: Case on 2002-05-12 15:58:04
Seems like Garf has valid points. I don't believe it's too late to add the missing album peak value, but the Vorbis people are probably not too happy about changing the names of tags. Actually, it's probably not necessary to change RG_ to REPLAYGAIN_. I'd assume every decoder mentions replaygain somewhere; at least the Winamp plugin has replaygain settings right in the first config screen, so it takes quite a lot to miss the meaning.
About the version tag, I personally don't think it's necessary. If loudness is incorrect one can always re-apply replaygain to such files.
Title: Flaw in ReplayGain spec
Post by: DSPguru on 2002-05-12 17:12:42
Quote
Originally posted by Garf
It already is in Vorbis too. My proposal is about album peak tags.
you might wanna know about a tag i have already defined. it stores the album maximal gain (1/peak), and it is used to apply one-pass normalization (mainly for LittleWing).

the tag is already supported by Peter's in_vorbis plugin, and should also be supported in the next version of OggDS.


technical details :
all ogg's created by BeSweet/LittleWing will have the following comment :
LWING_GAIN=xxxxx
examples :
LWING_GAIN=1.234, LWING_GAIN=12.34, LWING_GAIN=123.4.

as you can see, the comment's length is constant (16bytes), and the point is floating.


Shouldn't this suffice ?

Dg.
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-12 18:12:02
Quote
Originally posted by DSPguru

shouldn't this suffice ?


It's the same thing, I think, except that IMHO the fixed length requirement violates the Vorbis specs (tags are required to be human-readable and editable, which I don't think a fixed length tag complies with), and that it has a quite unfortunate name. (LWING says just as much as RG, and it's usable by a lot more than just your app)

Edit:

Also, the precision of your tag is not enough, see this thread:

http://www.hydrogenaudio.org/forums/showthread.php?s=&threadid=769&perpage=25&pagenumber=2

--
GCP
Title: Flaw in ReplayGain spec
Post by: DSPguru on 2002-05-12 18:31:00
unfortunate name ? come-on..
hehe..

btw, the fixed length isn't a MUST. the reason i set a fixed length was to ease a little bit the decoding implementation (it can be easily parsed without libvorbis).
if you feel like extending the definition into a variable length, it's okay by me .

as for "human-readable" tags,
this information is technical.  technical users can read it, non-technical users won't care about it. therefore, i don't think it matters.


anyway, you don't have to adopt it. just wanted to let you know .


Dg.
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-12 23:22:15
Quote
Originally posted by DSPguru

as for "human-readable" tags,
this information is technical.  technical users can read it, non-technical users won't care about it. therefore, i don't think it matters.


The issue is that for something that looks like text, it's not a very good idea to have it behave like binary data....like breaking if a space gets added or some decimals deleted.
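As a rough illustration of what "behaving like text" could mean for a reader of such a tag, the sketch below parses a gain value while tolerating extra spaces, fewer decimals and an optional "dB" suffix. The function name `parse_gain_db` is hypothetical and the accepted syntax is an assumption, not something any spec in this thread prescribes.

```c
#include <ctype.h>
#include <stdlib.h>

/* Parses values such as "-6.00 dB", "+1.2dB" or " -7 " into *gain_db.
   Returns 1 on success and 0 if the text is not a number optionally
   followed by "dB". Name and accepted syntax are assumptions. */
int parse_gain_db(const char *value, double *gain_db)
{
    char *end;
    double v = strtod(value, &end);   /* strtod skips leading whitespace */

    if (end == value)
        return 0;                     /* no number found */
    while (isspace((unsigned char)*end))
        end++;
    if ((end[0] == 'd' || end[0] == 'D') && (end[1] == 'b' || end[1] == 'B'))
        end += 2;                     /* optional unit */
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return 0;                     /* trailing junk */
    *gain_db = v;
    return 1;
}
```

Note that `strtod` depends on the C locale for the decimal separator, so a real player would probably want to pin that down.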

--
GCP
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-13 11:41:19
Another thing would be renaming AUDIOPHILE to DISK or ALBUM and renaming RADIO to TRACK.

It's much clearer what they actually do, whereas now you at least need to have read the replaygain specs to understand what they are about.

I dislike the fact that it's different from the original replaygain proposal, but the added clarity may very well make up for that.

--
GCP
Title: Flaw in ReplayGain spec
Post by: 2Bdecided on 2002-05-13 14:31:10
You know my thoughts on most of this, but my opinion is...


1) Change RG_* into REPLAYGAIN_*

good idea - do it ASAP. If you find the tag and don't know what it is, a websearch for RG won't help - but a websearch for REPLAYGAIN will.


2) Source/version tag

Yes - useful to include - and I won't say "I told you so" ;-)


3) include album peak value

yes again - great idea


4) renaming AUDIOPHILE to DISK or ALBUM and renaming RADIO to TRACK

You probably know my reservations about this (or at least Frank does!) but in the face of common understanding, it does seem sensible to do as you propose. rename radio to track, and rename audiophile > album.

btw, it's compact disc, not disk - I believe it's because of English vs American spellings, and European vs American dominance of CD vs PC development, that we have compact disc and hard disk drive. Whatever the reason, the confusion over spelling is a good enough reason to use "album" instead of disc/disk. Anyway, if you take something off a tape, it's still an album, but not a disc!

I think the word album goes back to the start of the last century, when collections of 78rpm discs were bound into cardboard albums to hold entire classical works, which required many discs at 3-5 minutes per side.

Cheers,
David.

http://www.replaygain.org/ (http://www.replaygain.org/)

P.S. I apologise for that website being very out of date. The more websites I write, the less time I have to spend keeping each one up-to-date!
Title: Flaw in ReplayGain spec
Post by: john33 on 2002-05-13 14:31:22
Interesting how things run full circle!!

I proposed this some time ago to be used in oggdec and was told that the vorbis dev team was very much against the addition of another tag!!!

Ah well, nothing new in life I guess!

john33
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-13 15:24:22
Quote
Originally posted by 2Bdecided

2) Source/version tag 

Yes - useful to include - and I won't say "I told you so" ;-)


I hate to bust your enthusiasm, but as long as I see no way to make this work correctly, I'm not going to include it :naughty:

The problem remains that it will be possible to change these values in a non-replaygain capable tag editor. When that happens, the value in the field is worse than useless, it'll be wrong!

If I don't include the tag, the worst that can happen is that when a new version comes out the people that really want to re-replaygain their entire collection will have to have a bit more patience. 

IMHO, that's no comparison to the current problem (the lack of an album peak can totally break replaygain on a quietly recorded album with great dynamics). I'm not proposing adding a tag because I think it's cool to have another one; I'm proposing it because without this tag replaygain simply won't work!

In a theoretical ideal world the album peak value is not needed: it can be determined easily by reading in the peak tags of the rest of the album. In practice, it simply won't work because of the way most players are designed (not to mention the added complexity).

Quote
3) include album peak value

yes again - great idea


Might want to mention it on replaygain.org somewhere.

Quote
4) renaming AUDIOPHILE to DISK or ALBUM and renaming RADIO to TRACK

You probably know my reservations about this (or at least Frank does!) but in the face of common understanding, it does seem sensible to do as you propose. rename radio to track, and rename audiophile > album. 


I don't know what your reservations are...all I remember is Frank saying ' I don't listen to the radio and I'm not an audiophile '

Thanks for the 'album' clarification.

--
GCP
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-13 15:27:42
Quote
Originally posted by john33
Interesting how things run full circle!! 

I proposed this some time ago to be used in oggdec and was told that the vorbis dev team was very much against the addition of another tag!!!

Ah well, nothing new in life I guess! 

john33


Amazing isn't it? All feedback from the vorbis side so far has been ' we think your ideas are fine '.

Big difference from the first time it was proposed: ' Three tags?!?!?! Are you nuts ?!?!?!'

--
GCP
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-13 16:04:33
On the practical side, I'm considering

REPLAYGAIN_ALBUM_GAIN=-6.00 dB
REPLAYGAIN_TRACK_GAIN=-7.00 dB
REPLAYGAIN_ALBUM_PEAK=1.12443
REPLAYGAIN_TRACK_PEAK=1.04343

vs

REPLAYGAIN=ALBUM:-6.00dB; TRACK:-7.00dB; ALBUM_PEAK:1.12443; TRACK_PEAK:1.04343

Advantage: only one tag
Disadvantage: harder to parse

--
GCP
Title: Flaw in ReplayGain spec
Post by: Frank Klemm on 2002-05-13 17:36:20
Quote
Originally posted by Garf
On the practical side, I'm considering

REPLAYGAIN_ALBUM_GAIN=-6.00 dB
REPLAYGAIN_TRACK_GAIN=-7.00 dB
REPLAYGAIN_ALBUM_PEAK=1.12443
REPLAYGAIN_TRACK_PEAK=1.04343

REPLAYGAIN=ALBUM:-6.00dB; TRACK:-7.00dB; ALBUM_PEAK:1.12443; TRACK_PEAK:1.04343

Advantage: only one tag
Disadvantage: harder to parse


4 == sscanf ( buffer, "REPLAYGAIN=ALBUM:%lfdB; TRACK:%lfdB; ALBUM_PEAK:%lf; TRACK_PEAK:%lf %c", &ga, &gt, &pa, &pt, &trash )

Useful if exactly this form is enforced.
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-13 17:59:09
Quote
Originally posted by Frank Klemm


4 == sscanf ( buffer, "REPLAYGAIN=ALBUM:%lfdB; TRACK:%lfdB; ALBUM_PEAK:%lf; TRACK_PEAK:%lf %c", &ga, &gt, &pa, &pt, &trash )

Useful if exactly this form is enforced.


Reordering of the field names breaking all parsers? Bad idea.
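To make the objection concrete: a parser for the single-tag form that survives reordering has to look up each field by name rather than by position. The following is only a sketch under assumptions; `struct rg_values` and `rg_parse_combined` are hypothetical names, and the input is taken to be the text after "REPLAYGAIN=".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the four values of the combined tag. */
struct rg_values {
    double album_gain_db, track_gain_db, album_peak, track_peak;
};

/* Parses e.g. "ALBUM:-6.00dB; TRACK:-7.00dB; ALBUM_PEAK:1.12443;
   TRACK_PEAK:1.04343" with the fields in any order.
   Returns 1 only if all four fields were found, else 0. */
int rg_parse_combined(const char *s, struct rg_values *out)
{
    static const char *const keys[4] = {
        "ALBUM", "TRACK", "ALBUM_PEAK", "TRACK_PEAK"
    };
    double *slots[4];
    unsigned found = 0;

    slots[0] = &out->album_gain_db;
    slots[1] = &out->track_gain_db;
    slots[2] = &out->album_peak;
    slots[3] = &out->track_peak;

    for (;;) {
        size_t i;

        while (*s == ' ' || *s == ';')      /* skip separators */
            s++;
        if (*s == '\0')
            break;
        for (i = 0; i < 4; i++) {
            size_t n = strlen(keys[i]);
            /* The ':' check keeps "ALBUM" from matching "ALBUM_PEAK". */
            if (strncmp(s, keys[i], n) == 0 && s[n] == ':') {
                char *end;
                *slots[i] = strtod(s + n + 1, &end);
                if (end == s + n + 1)
                    return 0;               /* no number after the key */
                found |= 1u << i;
                s = end;
                break;
            }
        }
        if (i == 4)
            return 0;                       /* unknown field name */
        while (*s != '\0' && *s != ';')     /* skip a unit such as "dB" */
            s++;
    }
    return found == 0xFu;
}
```

This is noticeably more code than four independent name=value lookups, which is the trade-off listed under "harder to parse".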

--
GCP
Title: Flaw in ReplayGain spec
Post by: john33 on 2002-05-13 22:12:16
KISS rules, I think!

john33
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-15 15:33:33
There has been a huge discussion about this on #vorbis, and it wasn't nice.

Basically, the vorbis people seem to want to throw out this system altogether.

Monty basically thinks everything but the radio gain value should be thrown out. His arguments were that you can 'clip' this value to take the peak into account, and that the audiophile settings are redundant because they can be inferred from combining the tags of the individual tracks.

If you look at my post that started the thread, you'll see the reason I proposed a change was exactly that that is impossible in most players, and adds a lot of unnecessary complexity in those where it could be done.

Losing the radio peaks means losing ReplayGain functionality when you manually reduce preamp (i.e. increase headroom). Losing the album peaks causes problems with _all_ of the above. Moreover, it seems that this tag is wanted for purposes outside of ReplayGain.

When Monty was done, the Xiph.org CEO went as far as to say that Vorbis should drop ReplayGain support completely, as 'it's really a player issue', going on to explain it can be done by letting the players do all calculations and storing the results in a database etc etc...

Needless to say, I'm not happy with this. The entire goal of my proposal was to make ReplayGain a) work b) easy to support in players. The above nukes both of these goals, and without them, I don't see ReplayGain for Vorbis gaining a lot of support, if any.

I don't really know what to do with my proposal either. It doesn't seem a good idea to continue it if the Vorbis people are so heavily opposed to it, but on the other hand, a lot of people, (including me) want this feature, and it just plain _works_ right now. Updating it to the changes I proposed here would be trivial in both the players and tools.

The players only need to change their reading of RG_* into REPLAYGAIN_*, AUDIOPHILE/RADIO into ALBUM/TRACK and use the REPLAYGAIN_ALBUM_PEAK tag when it's present and we're in album gain mode instead of the REPLAYGAIN_TRACK_PEAK one.
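The peak selection part of that change might look like the sketch below. The struct, enum and function names are hypothetical; only the rule itself (use the album peak when it is present and album mode is active, otherwise the track peak) comes from the description above.

```c
/* Playback modes, named after the proposed TRACK/ALBUM terminology. */
enum rg_mode { RG_MODE_TRACK, RG_MODE_ALBUM };

/* Peak information read from a file's tags (hypothetical layout). */
struct rg_peaks {
    int    has_album_peak;   /* nonzero if REPLAYGAIN_ALBUM_PEAK was present */
    double album_peak;
    double track_peak;
};

/* Peak value the clipping prevention should use for the given mode. */
double rg_peak_for_mode(const struct rg_peaks *p, enum rg_mode mode)
{
    if (mode == RG_MODE_ALBUM && p->has_album_peak)
        return p->album_peak;
    return p->track_peak;
}
```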

The tools need to change RG_* into REPLAYGAIN_, AUDIOPHILE/RADIO into ALBUM/TRACK, write out the new tag, and include an autoconvert option which converts files from the old format.
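The renaming step of such an autoconvert option could be sketched as a simple lookup. The old field names used here (RG_RADIO, RG_AUDIOPHILE, RG_PEAK) are assumptions, since the thread only gives the RG_* prefix and the AUDIOPHILE/RADIO terms; the function names are hypothetical as well. The comparison ignores case because Vorbis comment field names are case-insensitive.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive comparison of two ASCII field names. */
static int same_name(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Returns the new-style field name for an old-style one, or NULL if the
   field is not a ReplayGain field. The old names are assumed, not taken
   from the thread. */
const char *rg_new_name(const char *old_name)
{
    static const char *const map[][2] = {
        { "RG_RADIO",      "REPLAYGAIN_TRACK_GAIN" },
        { "RG_AUDIOPHILE", "REPLAYGAIN_ALBUM_GAIN" },
        { "RG_PEAK",       "REPLAYGAIN_TRACK_PEAK" },
    };
    size_t i;

    for (i = 0; i < sizeof map / sizeof map[0]; i++)
        if (same_name(old_name, map[i][0]))
            return map[i][1];
    return NULL;
}
```

The new REPLAYGAIN_ALBUM_PEAK value has no old counterpart, so a converter would still have to compute it from the track peaks of the album.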

So, if anyone has an interest in keeping using the current Vorbis ReplayGain, the current suggested format looks like:

REPLAYGAIN_ALBUM_GAIN=-6.43 dB
REPLAYGAIN_TRACK_GAIN=+1.20 dB
REPLAYGAIN_ALBUM_PEAK=1.12443
REPLAYGAIN_TRACK_PEAK=1.04343

But beware! Because this needs the obviously unacceptable amount of four tags, the Vorbis people will come and haunt you at night for such blasphemy.

Peter, Magnus and zinx, I'll leave it up to you whether you want to update your stuff to the new proposal, or wait for the Vorbis people to come up with a working proposal of their own.

Personally, I've had enough of this for the next two months. I have to fight with David over each tag I leave out, and I have to fight with the Vorbis people over each tag that gets in. This is no situation to be in and I see no reason to keep wasting my time with it.

--
GCP
Title: Flaw in ReplayGain spec
Post by: john33 on 2002-05-15 16:33:26
This is really rather sad!

I thought that part of the idea of the vorbis tags was flexibility,etc. I also thought that the relative lack of rigidity compared to ID3 was to allow people to 'create' tags for their own use. With the currently expounded philosophy it makes you wonder why they bothered! Might just as well have stayed with ID3!!!

I don't really understand why it should be such a big deal.

john33
Title: Flaw in ReplayGain spec
Post by: sam on 2002-05-15 16:34:07
I think your ideas are great, Garf - clear, simple and easy to use. The database idea is a strange one, not very object-oriented! If the gain feature were supported as a standard in ogg as you describe, Garf, it would seriously increase the chance of me choosing ogg.
Title: Flaw in ReplayGain spec
Post by: Lear on 2002-05-15 18:32:56
Only including radio gain does indeed seem to be a step backwards. In a way I can see Monty's point that the radio gain is the only required value (which is true; perhaps peak is good too). But having each and every player calculate the rest (and perhaps store it somewhere) isn't very good, and won't promote adoption in a lot of players.

You might then just as well say that each and every player should calculate the gain as well! Possibly differently (just compare ReplayGain for Vorbis and MPC, which do differ slightly).

And let's not forget that 2Bdecided (that's the ReplayGain "author", isn't it?) thinks that the album value *should* be possible to change by the user.

So it would be better to make it easy for the players (and users), IMHO. Having four tags is a bit much perhaps, but there is no other place but tags at the moment, so that will have to do. However, putting all in one tag (as Garf suggested) would be kind of nice. Code to parse that could be included and isn't very complicated anyway. But it doesn't make the player code "as simple as possible"...

Anyway, I'm fully prepared to update VorbisGain, once it is decided how to do it. At least if Winamp and XMMS will support the "new" standard.

Monty might not like it (or support it in Vorbis, via libs or included tools), but he can hardly stop us from doing it.

(Gotta run...)
Title: Flaw in ReplayGain spec
Post by: Case on 2002-05-15 19:34:02
I have even better idea. Let's store all the tags in player databases, that way also they can be handled by players where they really belong, after all, those are responsible for showing them. Actually players are responsible also for playing sound, maybe players could store the vorbis data in local databases. That would keep Ogg files really small, they would only need to store some hash numbers to access database.
Title: Flaw in ReplayGain spec
Post by: HotshotGG on 2002-05-15 20:56:37
Quote
it seems that this tag is wanted for purposes outside of ReplayGain.


What exactly had they been planning on using it for?

I definitely think replaygain is a good idea, but if the tag is going to be needed for something more important than it, then I would think otherwise.
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-15 21:57:06
Quote
Originally posted by HotshotGG


What exactly had they been planning on using it for?

I definitely think replaygain is a good idea, but if the tag is going to be needed for something more important than it, then I would think otherwise.


Read the thread from the start. I explain in the first message why it is needed for ReplayGain. In DSPguru's message you can see that this tag is useful for other tools.

--
GCP
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-15 22:04:46
Quote
Originally posted by Lear

You might then just as well say that each and every player should calculate the gain as well! Possibly differently (just compare ReplayGain for Vorbis and MPC, which do differ slightly).


This is in fact what Paradox (Xiph CEO) said should be done.

Quote
And let's not forget that 2Bdecided (that's the ReplayGain "author", isn't it?) thinks that the album value *should* be possible to change by the user.


This is still possible if the tag is stored in the database. But it becomes a problem when you switch players.

Quote
However, putting all in one tag (as Garf suggested) would be kind of nice. Code to parse that could be included and isn't very complicated anyway. But it doesn't make the player code "as simple as possible"...


My basic question is: why is it bad to have four tags? Adding complication to reduce the number of tags only makes sense to me if there is a good reason to have as few tags as possible.

--
GCP
Title: Flaw in ReplayGain spec
Post by: john33 on 2002-05-15 22:13:07
The ReplayGain values are the 'property' of a particular track, or tracks in the case of Album settings. The notion of storing them anywhere other than as part of the track to which they belong is complete and utter nonsense.

I repeat, what is the big deal? The overhead of carrying the values within the tag structure is so small as to be inconsequential. What is all the fuss about? It must be better to carry the values within the encoded file than to attempt to replicate the values everywhere that they may be used.

john33
Title: Flaw in ReplayGain spec
Post by: JohnV on 2002-05-15 22:17:43
Also, even if there's going to be replaygain calculation support in many players (very unlikely), you probably have to do the calculations with each player separately.. and each player probably uses a different format for storing the replaygain data..

I can't understand what's the real reason behind this? Why on earth replaygain data can't be saved to the tag?
Title: Flaw in ReplayGain spec
Post by: gnoshi on 2002-05-16 01:14:45
Couple of thoughts..

I use replaygain; I use ogg; I like them both.
Cool.

On tags in general, what if all 'special' tags were prefixed with something to mark them as 'not designed for human digestion'? That doesn't necessarily mean they're not human readable, just that it would be kind of silly to human-edit them.

Also, I like the 4 fields. I can understand your frustration, Garf, (as much as someone who is not trying to do the same thing can)  but is it truly necessary to get Xiph endorsement of replaygain in order to use it? I mean, it is preferable no doubt, but is it essential? Or would things just get less happy-friendly if you just went ahead and did it separately?

I mean, I can understand the idea of using an external database, I don't even mind it, but it is not much help for things like burning individual files to a CD if you want to burn 50 tracks, but not the database for all 5000 tracks you may have. From this perspective, I think tags on the individual files are better.
Of course, in this case, Album gain may be unimportant; but that said, if you don't want it then you can always ignore it. That dozen or so bytes is not going to fill the hard drive notably quicker (or at least, one would hope not).

Just a few thoughts on the table, sorry for any ranting; caffeine levels are a tad high this morning.

gnoshi

btw. garf: In case I didn't get the message across, I really like your work on replaygain. WD+THNX.
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-16 08:06:30
Quote
Originally posted by gnoshi
On tags in general, what if all 'special' tags were prefixed with something to mark them as 'not designed for human digestion'? That doesn't necessarily mean they're not human readable, just that it would be kind of silly to human-edit them.


The tags are human-readable (though you'll have problems understanding what exactly they do if you are not familiar with RG), and it makes sense for a human to edit them.

Quote
Also, I like the 4 fields. I can understand your frustration, Garf, (as much as someone who is not trying to do the same thing can)  but is it truly necessary to get Xiph endorsement of replaygain in order to use it? I mean, it is preferable no doubt, but is it essential? Or would things just get less happy-friendly if you just went ahead and did it separately?


It would be a possibility, but it's obviously hardly ideal, especially since at least monty really wants RG support for vorbis too, so it will get added to Vorbis eventually. I want to avoid a split over this.


Quote
I mean, I can understand the idea of using an external database, I don't even mind it, but it is not much help for things like burning individual files to a CD if you want to burn 50 tracks, but not the database for all 5000 tracks you may have. From this perspective, I think tags on the individual files are better.
Of course, in this case, Album gain may be unimportant; but that said, if you don't want it then you can always ignore it. That dozen or so bytes is not going to fill the hard drive notably quicker (or at least, one would hope not).


The database approach does not work, and that is my problem with it. It adds way too much complexity on the player side. Imagine what it would be like on a _portable_, for example.

--
GCP
Title: Flaw in ReplayGain spec
Post by: Lear on 2002-05-20 09:52:48
Quote
Originally posted by Garf
So, if anyone has an interest in keeping using the current Vorbis ReplayGain, the current suggested format looks like:

REPLAYGAIN_ALBUM_GAIN=-6.43 dB 
REPLAYGAIN_TRACK_GAIN=+1.20 dB 
REPLAYGAIN_ALBUM_PEAK=1.12443 
REPLAYGAIN_TRACK_PEAK=1.04343 


OK, I've mostly changed VorbisGain now, including a way to convert the old format to the new (haven't tested it yet though, so it isn't ready for release). But there's one thing I've been thinking about (and that has been mentioned in this discussion): Change the "gain" to "level" or something. I.e., don't store the relative gain, but the absolute volume level.

The reason is that you may want different "target levels" in different situations (e.g., use the "standard" 83 dB when playing on your home computer with a 24-bit soundcard, but 89 dB when using a portable player). And it wasn't without reason that Snelg decided on 89 dB as the default for MP3Gain...

Besides, the peaks describe a property of the file. Level would do the same.  (And do you really need the "REPLAYGAIN_" prefix then? )

Anyway, after thinking a bit about it, it seems like the right thing to do. The --target-level option doesn't really belong in VorbisGain IMHO; the player is a better place (the fact that the Vorbis Winamp plugin already has a pre-amp option speaks in favour of that, I'd say).

Comments please.
Title: Flaw in ReplayGain spec
Post by: 2Bdecided on 2002-05-20 11:19:51
Is almost the same discussion happening in two threads here?


Lear,

First, please read www.replaygain.org (http://www.replaygain.org) especially the bit about reference level.

You see, the "level" as you want to define it is not intrinsically the level of the track. Why not? Because you'll get a (completely) different value depending on how you measure it! So, to be meaningful, you have to say "the level, measured <this way> is x dB". But even this isn't a great idea!

The Replay Gain Proposal suggests a way of measuring the level, AND defines a reference level. Already Frank Klemm has improved the way of measuring the track's "level". However, because there's a reference level for him to tie this to, he can do this without breaking compatibility. BUT if you only say "this is the level of this track when calculated by this method" then using any other (better?) method would break the whole system.

So, you can't say "the track is x dB loud", because you haven't said how you measured it, or what you're measuring it relative to.

You can say "the track needs to be x dB louder to match an agreed reference", because the measurement method doesn't matter when you're talking about relative (rather than absolute) levels (so long as it works quite well!), and you have a defined reference.


In short: you have to specify a reference level or a method of calculation. That's why you need to say "this is a replay gain tag".

It's not just my ego ;-)

Cheers,
David.
Title: Flaw in ReplayGain spec
Post by: Lear on 2002-05-20 13:39:29
Quote
Originally posted by 2Bdecided
Is almost the same discussion happening in two threads here?


Yep, Garf started this, then Dibrom started that vote thing...

Quote
You see, the "level" as you want to define it is not intrinsically the level of the track. Why not? Because you'll get a (completely) different value depending on how you measure it! So, to be meaningful, you have to say "the level, measured <this way> is x dB". But even this isn't a great idea!

The Replay Gain Proposal suggests a way of measuring the level, AND defines a reference level. Already Frank Klemm has improved the way of measuring the track's "level". However, because there's a reference level for him to tie this to, he can do this without breaking compatibility. BUT if you only say "this is the level of this track when calculated by this method" then using any other (better?) method would break the whole system.


Why? What's the difference in storing e.g., +6.5 or 76.5? In the first case you increase the volume by 6.5 dB, in the second you increase the volume by 83 - 76.5 = 6.5 dB (assuming 83 was set as the "reference level" in the player)...

That's what I'm suggesting: the calculation of e.g. 83 - 76.5 should be done in the player. That's all.

As you say on www.replaygain.org (http://www.replaygain.org): "So, we send the pink noise signal through the ReplayGain program, and store the result (let's call it ref_Vrms)." Which means the reference level isn't an absolute. It depends on the method used to measure the reference signal (the only absolute). So, regardless of method used to calculate the level, it would have to be calibrated against the pink noise signal. I'm not suggesting to change that. My suggestion is simply that the output (when using the reference signal) should be 83 (as an absolute value) rather than 0 (as a relative value).

Quote
So, you can't say "the track is x dB loud", because you haven't said how you measured it, or what you're measuring it relative to.


OK, so the "REPLAYGAIN_" prefix is useful. But then that wasn't the important part of my suggestion anyway... It was just an idea to make the tag names a little shorter.
Title: Flaw in ReplayGain spec
Post by: 2Bdecided on 2002-05-20 17:50:24
All other things being equal, assuming a complete understanding of replaygain in the player, both are equivalent.

So we can store either. ..


"relative": the player does this:

a) read in replay gain value
b) apply replay gain value


"absolute": the player does this:

a) read in replay gain value
b) subtract 83 from this value
c) apply the resulting value
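
The two player procedures above can be sketched roughly as follows. This is only an illustration: the function names and the REFERENCE_LEVEL_DB constant are made up here and do not come from any actual player or plugin source.

```python
# Illustrative sketch only; names are hypothetical, not taken from a real player.
REFERENCE_LEVEL_DB = 83.0  # the reference level discussed in this thread


def gain_from_relative_tag(stored_gain_db):
    """'relative': the tag already holds the adjustment, so apply it as-is."""
    return stored_gain_db


def gain_from_absolute_tag(stored_value_db, reference_db=REFERENCE_LEVEL_DB):
    """'absolute' as listed above: subtract 83 from the stored value first."""
    return stored_value_db - reference_db


def db_to_scale(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


# A stored -3 dB ("relative") and a stored 80 dB ("absolute") lead to the same gain.
relative_result = gain_from_relative_tag(-3.0)
absolute_result = gain_from_absolute_tag(80.0)
```

Either way the player ends up multiplying the samples by db_to_scale(-3.0); the "absolute" form just needs one extra subtraction.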



I originally thought about storing the "absolute" value (rather than the "relative" adjustment) but several people pointed out that storing the adjustment is easier - and I agreed. So that's the way it is. You've got to pick one or the other, so I picked the option that made player implementation easier.


Quote
My suggestion is simply that the output (when using the reference signal) should be 83 (as an absolute value) rather than 0 (as a relative value). 


But they're both relative, aren't they? Decibels (dB) are, by definition, a relative measurement. The 83 isn't just "83", like you could say the length of a piece of string is "83cm" and that's that. It's equivalent to the perceived loudness of a -20dB FS (relative to a full scale sinewave) RMS pink noise signal replayed via an SMPTE RP 200 calibrated audio system.


My way, you don't put an 83 in the replay gain calculation, and you don't put an 83 in the player calculation either. Your way, you could well add it at one end, and then subtract it at the other! Why?!


and finally...

Quote
Besides, the peaks describe a property of the file. Level would do the same. 


If you store what you suggest (e.g. 80dB instead of -3dB, for example) then you're NOT storing a property of the file. You're storing the gain required to make the file average 83dB perceived loudness in a calibrated system. Rather than storing a property, you're still storing an adjustment. A loud file will have a lower value, whether you add 83 to it or not! So, you'd have to take the "-3dB", switch the sign (+3dB), add it to 83dB, and store this (86dB) as the perceived loudness of the file. Then, to play it back, you take this value (86dB), and subtract it from the required level (e.g. 83dB-86dB=-3dB), and then apply this gain change to the file. BUT LOOK! We're just where we started - with the number -3dB! So why bother?!


To make the stored value a property of the file (i.e. truly absolute, not relative) you have to remove the calibration step (stage 4, if you refer to the explanation on the website). It's the calibration step that causes different calculations (e.g. mine and Frank's) to fall on the same scale - take this away, and you've got a big disadvantage: no prospect of improving the calculation.


I hope this clarifies the situation. Yes, you could equally well store the value with 83 added to it or not, just so long as everyone understood which you were doing. But since NOT adding it means you then DON'T have to subtract it again, that makes most sense. It's also what has been happening for a year, so there seems no reason to change now. Also, it's still a relative value: an instruction to turn up or turn down the file to match a reference level. If you remove this reference level, then you can state something explicitly about the file, but you lose the calibration, making it harder to change the calculation while retaining a compatible scale.

In short: adding 83 makes little difference (but is no use, and a small amount of trouble); changing to a true representation of "what's in the file" will cause many problems for little gain (pardon the pun).


Cheers,
David.
http://www.David.Robinson.org/
Title: Flaw in ReplayGain spec
Post by: Lear on 2002-05-20 18:16:41
I'll try to keep this short, since we seem to agree on most things now.

Quote
Originally posted by 2Bdecided
All other things being equal, assuming a complete understanding of replaygain in the player, both are equivalent.


Then we agree on that part at least. Now the question is: why do it the "absolute" way?

Quote
Originally posted by 2Bdecided
"absolute": the player does this:

a) read in replay gain value
b) subtract 83 from this value
c) apply the resulting value


b) should rather read something like this: "subtract a user-configurable value, which defaults to 83, from this value". If that value isn't configurable, then there's indeed not much use in doing it like this.

Quote
Originally posted by 2Bdecided
In short: adding 83 makes little difference (but is no use, and a small amount of trouble); changing to a true representation of "what's in the file" will cause many problems for little gain (pardon the pun).


But there's another problem now. MP3Gain defaults to a "target level" of 89 dB, not 83. To be "compatible" with this, VorbisGain does the same, but also allows that to be changed (--target-level option). Then you can have different files where the gain is based on different target levels, and you can't tell what the level is just by looking at the tags. To avoid that, store the "absolute" level rather than the change needed to get to a certain target level. I.e., move the "--target-level" option from VorbisGain to the player (which is a better place, IMO).
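
To illustrate the ambiguity, here is a rough sketch. The function names and the 92 dB measurement are hypothetical, chosen only to show the arithmetic; this is not how MP3Gain or VorbisGain are actually written.

```python
# Hypothetical sketch of the target-level ambiguity; names and numbers are made up.

def tool_gain(measured_level_db, target_level_db):
    """What a tagging tool would store as *_GAIN for a chosen target level."""
    return target_level_db - measured_level_db


def player_gain(stored_level_db, target_level_db):
    """If the level itself were stored, the player would pick the target."""
    return target_level_db - stored_level_db


measured = 92.0  # assumed measurement of one track

# Same track, tagged with two different --target-level settings:
gain_for_89 = tool_gain(measured, 89.0)  # -3.0
gain_for_83 = tool_gain(measured, 83.0)  # -9.0

# With a stored level, the choice of target moves into the player:
played_at_89 = player_gain(measured, 89.0)  # -3.0
```

Looking only at the two gain tags, nothing tells you which target each one was computed against.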
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-20 18:45:43
Quote
Originally posted by Lear

OK, I've mostly changed VorbisGain now, including a way to convert the old format to the new (haven't tested it yet though, so it isn't ready for release).


A little note: Vakor (Michael Smith), the vorbis-tools maintainer, has started work on adapting the tool to fit into the vorbis-tools set. He had some issues with the portability of the tool and found some bugs as well. You might want to give him a ring and sync up your improvements. (Dunno the email by heart but should be easy to find)

Quote
Besides, the peaks describe a property of the file. Level would do the same.  (And do you really need the "REPLAYGAIN_" prefix then? )


I changed RG into REPLAYGAIN to (almost literally) make it possible for someone who has never heard of it to enter the term in google and get to the right place. You need to at least have heard of ReplayGain to make sense of most of the tags, so it must be clear what the tags belong to. The length (if reasonable) is less of an issue than making clear what the tags are about.

My preference for storing a relative gain instead of an "absolute" level follows similar reasoning. It's more clear from looking at the kind of tag what it is about.

REPLAYGAIN_ALBUM_GAIN=-6.00dB

vs

REPLAYGAIN_ALBUM_LEVEL=83.00dB

IMHO, it's more clear from the first tag that somehow the volume has to be changed by -6dB.

There is no real technical reason to prefer one over another. I prefer whichever is clearer.

I agree the level adjustment doesn't belong in the tool but in the players. But (*chimes*) it already _is_ there. That is what the preamp (or headroom) slider is for.

--
GCP
Title: Flaw in ReplayGain spec
Post by: sam on 2002-05-20 19:27:42
Quote
Originally posted by Lear
But there's another problem now. MP3Gain defaults to a "target level" of 89 dB, not 83. To be "compatible" with this, VorbisGain does the same, but also allows that to be changed (--target-level option). Then you can have different files where the gain is based on different target levels, and you can't tell what the level is just by looking at the tags. To avoid that, store the "absolute" level rather than the change needed to get to a certain target level. I.e., move the "--target-level" option from VorbisGain to the player (which is a better place, IMO).


I'd choose the "absolute" method, zero is a more natural origin for me so there's no need to all agree on an 83 origin. Also, storing a +/- could confuse a few: Does -6 mean I need to decrease the volume of this file by 6 to get to the 83, or does it mean this file is 6dB quieter than the 83 standard? At least by storing 89 I'd know 'how loud it is' right away.
Title: Flaw in ReplayGain spec
Post by: Lear on 2002-05-20 19:36:35
Quote
Originally posted by Garf

A little note: Vakor (Michael Smith), the vorbis-tools maintainer, has started work on adapting the tool to fit into the vorbis-tools set. He had some issues with the portability of the tool and found some bugs as well. You might want to give him a ring and sync up your improvements. (Dunno the email by heart but should be easy to find)


Michael Smith? I thought Stan Seibert was going to do it; after all, he's the one I've been in contact with regarding VorbisGain... Oh well, I've e-mailed both now.

Quote
There is no real technical reason to prefer one over another. I prefer whichever is clearer.


True, but it isn't very clear what the result of the change is...

Quote
I agree the level adjustment doesn't belong in the tool but in the players. But (*chimes*) it already _is_ there. That is what the preamp (or headroom) slider is for.


Yep. So if *_GAIN is to remain (and not be changed to *_LEVEL), I'll remove --target-level and mention in the manual that pre-amp can be used to change the "result level". Considering the amount of support here of my suggestion (i.e., none at all ), I guess that's what I'll do. I'll stick with 89 as a "target level" though.
Title: Flaw in ReplayGain spec
Post by: sam on 2002-05-20 19:42:38
Quote
Originally posted by Lear
Considering the amount of support here of my suggestion (i.e., none at all ), I guess that's what I'll do. I'll stick with 89 as a "target level" though.


I support ya! I'm ok with the arguments on the other side, but for Joe Public "this track is 89dB loud" is probably more easily understood than "this file needs to be adjusted by this much".

I used MP3Gain and I like the feel of the "absolute" value. I'm just a user tho.
Title: Flaw in ReplayGain spec
Post by: 2Bdecided on 2002-05-21 10:32:33
sam,

that's just the point. the track isn't 89dB loud. If you store REPLAYGAIN_TRACK_LEVEL = 89dB as Lear suggests, what you actually mean is "this track needs to be played at 89dB to make it 83dB loud."*

Whereas REPLAYGAIN_TRACK_GAIN = +6dB means "increase this track by 6dB (to match the reference level)"

In either case, if you want it louder or quieter still, you'll just adjust the pre-amp and that value will be added to it.

David.

P.S. If you really wanted to store something that represents the "level", you have to store something else (see previous post) And that something else makes it even more complicated than * above!
Title: Flaw in ReplayGain spec
Post by: matthijsln on 2002-05-21 10:54:47
Quote
Originally posted by 2Bdecided
It's the calibration step that causes different calculations (e.g. mine and Frank's) to fall on the same scale - take this away, and you've got a big disadvantage: no prospect of improving the calculation.


BTW, does Frank's calculation method supersede the original one?

I'm going to update my Winamp plugin this week, so should I use Frank's one, the original, or give an option to choose between both?

Matthijs
Title: Flaw in ReplayGain spec
Post by: Case on 2002-05-21 11:06:20
Quote
Originally posted by matthijsln
I'm going to update my Winamp plugin this week, so should I use Frank's one, the original, or give an option to choose between both?

Only Frank's plugin has replaygain support.
Title: Flaw in ReplayGain spec
Post by: 2Bdecided on 2002-05-21 12:33:50
Case,

I think you misunderstood his question. Or else I did.


Matt,

Frank emailed me about one of his improvements, and then said it didn't work and rejected/changed it. He's made at least one more improvement which he has kept. But the only way to see the changes is to look at his source code in the latest mppdec bundle. (I can't program so it means nothing to me!). I think I'd use his latest version, if I were you - but check with him first!

Cheers,
David.
Title: Flaw in ReplayGain spec
Post by: Case on 2002-05-21 13:02:26
Quote
Originally posted by 2Bdecided
I think you misunderstood his question.

Very possible. But the Winamp plugin doesn't know which method was used for the replaygain calculation, so it has no effect here. If one wishes to use the official replaygain spec instead of Frank's enhanced version, one has to undefine KLEMM in the sources and compile ReplayGain.exe again.
Title: Flaw in ReplayGain spec
Post by: sam on 2002-05-21 13:20:30
Quote
Originally posted by 2Bdecided
that's just the point. the track isn't 89dB loud. If you store REPLAYGAIN_TRACK_LEVEL = 89dB as Lear suggests, what you actually mean is "this track needs to be played at 89dB to make it 83dB loud."*

Whereas REPLAYGAIN_TRACK_GAIN = +6dB means "increase this track by 6dB (to match the reference level)"


The way I was thinking is due to my experience with MP3Gain. When it says Radio Gain is 99 it sounds loud, and when it says Radio Gain is 80 it sounds quiet. Are we measuring from the other end of the scale here - like my 89 (which I take as loud) is really -89dB, and add on the 83 to give the -6, the final adjustment to make it play at 83?

My main point tho was that for me, the average user, a "TRACK_LOUDNESS=x" where 0<=x<=~100 is a lot simpler. I take two tracks, one with x=90 and another with x=80, I can see right away that the first is 'louder'. Maybe an abstract scale is better?
Title: Flaw in ReplayGain spec
Post by: 2Bdecided on 2002-05-21 14:02:14
OT: That's really bizarre - we both live in Essex, and your girlfriend and my wife both do cross stitch! Anyway...


Yes, I see it makes sense from a user's point of view to see a louder file having a bigger number. But hopefully the user will never have to look at the value - the whole process should just happen "in the background".

And yet, ignoring Lear's actual proposal (which is back to front, but in fairness I don't think he realised this when he suggested it), both methods are conceptually sound. You either:

(1) store "this file will sound x dB loud in a calibrated system", and the player says "hey - I want it y dB loud, so I'll scale it y-x dB".

or

(2) store "this file should be x dB louder", and the player says "hey - I want it y dB louder still, so I'll scale it y+x dB".

Of course, in the second, "y" is optional, though recommended to be +6dB.
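
As a rough sketch of why (1) and (2) come out the same, assuming a file that measures 3 dB louder than the 83 dB reference (the function names and example numbers are mine, purely for illustration):

```python
# Illustrative sketch; function names and the example file are hypothetical.

def scale_from_level(stored_loudness_db, wanted_loudness_db):
    """Method (1): tag holds x = loudness in a calibrated system; scale by y - x."""
    return wanted_loudness_db - stored_loudness_db


def scale_from_gain(stored_gain_db, preamp_db=0.0):
    """Method (2): tag holds x = suggested change; scale by y + x (y is optional)."""
    return preamp_db + stored_gain_db


# Example file: 86 dB loud in a calibrated system, i.e. a -3 dB gain tag.
level_result = scale_from_level(86.0, 89.0)  # want 89 dB: 89 - 86
gain_result = scale_from_gain(-3.0, 6.0)     # +6 dB pre-amp: 6 + (-3)
```

Both calls ask for the same +3 dB change; only the bookkeeping differs.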


Is anyone who is coding this proposing that it should be changed? I've heard Lear - what about Frank and Garf?


My worry against changing it is (a) confusion, and (b) in some file formats, the values are stored as binary data (not ASCII comment tags!), and it's more compact to store values between +/- 30, rather than values between 60 and 110 (approx). Unless, of course, you subtract a pre-set number from that second set of values before storing them - but then you're back to where we are now!

I do not think it's an option to (for example) store (1) "level" in Vorbis and (2) "gain" in mpc. That would just be asking for trouble and confusion. So unless BOTH the mpc and vorbis implementations agree to change, they should BOTH DEFINITELY stay as they are!

Another reason against (1) is that almost no one will have a calibrated system - to them 83 dB or 89 dB is just a (meaningless) number. Whereas "6dB louder than suggested" is still just a number, at least it gives you some idea of what you're doing. To know what 89dB means, you have to know it's 6dB louder than what's suggested. But still OK. In contrast, 100dB (which sounds nice and loud) just won't work (user thinks: "why not - my system can output that power"), whereas "+18dB above what's recommended" does sound like you're going to overload it!


btw, the SMPTE RP 200 spec was changed from 85 to 83. I doubt they'll change it again (and actually the two specs are equivalent) - but if it were changed, (1) would be wrong, whereas (2) would be right.

work to do...

Cheers,
David.
Title: Flaw in ReplayGain spec
Post by: sam on 2002-05-21 14:22:50
Quote
Originally posted by 2Bdecided
OT: That's really bizarre - we both live in Essex, and your girlfriend and my wife both do cross stitch! Anyway...


Oh dear, I've been rumbled! ...more bizarre, I'm at Imperial College doing a Mathematics PhD in Dynamical Systems (not that you'd tell from my posts...) and from what I can figure you're at Essex Uni.

Quote
Yes, I see it makes sense from a user's point of view to see a louder file having a bigger number. But hopefully the user will never have to look at the value - the whole process should just happen "in the background".


Yeah that's my only real point. Either way the two methods are the 'same' in the end.

Quote

it's more compact to store values between +/- 30, rather than values between 60 and 110 (approx). 


Fair point.

Quote

I do not think it's an option to (for example) store (1) "level" in Vorbis and (2) "gain" in mpc. That would just be asking for trouble and confusion. So unless BOTH the mpc and vorbis implementations agree to change, they should BOTH DEFINITELY stay as they are!


Yeah, 100% with you on this. Although the +/- means everyone would have to stick to 83 I guess.

Quote

Another reason against (1) is that almost no one will have a calibrated system - to them 83 dB or 89 dB is just a (meaningless) number. Whereas "6dB louder than suggested" is still just a number, at least it gives you some idea of what you're doing. To know what 89dB means, you have to know it's 6dB louder than what's suggested. But still OK. In contrast, 100dB (which sounds nice and loud) just won't work (user thinks: "why not - my system can output that power"), whereas "+18dB above what's recommended" does sound like you're going to overload it!


Now that you have raised that ~100dB, which sort of looks like a good value to pick for someone who doesn't realise quite what is going on, I think my idea should be dropped totally on this point alone!

Anyway, I hope I've added something to the Gain argument by bringing up a few points.
Title: Flaw in ReplayGain spec
Post by: Lear on 2002-05-21 14:30:33
Quote
Originally posted by 2Bdecided
sam,

that's just the point. the track isn't 89dB loud. If you store REPLAYGAIN_TRACK_LEVEL = 89dB as Lear suggests, what you actually mean is "this track needs to be played at 89dB to make it 83dB loud."*


No, I mean it like this: "this track is 89 dB loud when measured by a method that says the reference pink noise is 83 dB loud". I.e., 6 dB louder than the reference (and the gain value would then be -6 dB with a 83 dB reference).

Quote
Whereas REPLAYGAIN_TRACK_GAIN = +6dB means "increase this track by 6dB (to match the reference level)"


But from this information you can't tell for sure what the reference level is. Which only is a problem if the target level can be specified in the program calculating the difference.  (Which is the case at the moment.)

Quote
In either case, if you want it louder or quieter still, you'll just adjust the pre-amp and that value will be added to it.


Really, my suggestion was simply that I thought it would be clearer - for the user - to specify a value like "89" rather than "+6". I.e., the reference level would be explicit rather than implicit.
Title: Flaw in ReplayGain spec
Post by: Case on 2002-05-21 14:41:29
Quote
Originally posted by Lear
Really, my suggestion was simply that I thought it would be clearer - for the user - to specify a value like "89" rather than "+6". I.e., the reference level would be explicit rather than implicit.

Actually I don't see how this would work with your proposal. If someone raised the number in the LEVEL tag, he would only get a quieter file on playback. That's because the player thinks the file is louder and needs to be attenuated to sound like the reference 83dB signal. If someone wishes to change the target level, it should be done in the player.
Title: Flaw in ReplayGain spec
Post by: 2Bdecided on 2002-05-21 16:17:07
I apologise if I've sounded harsh to anyone in this discussion. I didn't have all the answers at the start. It's nearly a year since I thought all this through, and I've had to keep going back to replaygain.org to check what I originally came up with! So I've been re-learning as this thread has grown.

Apart from the few sensible (and many trivial) reasons I've thought of to stay with what I originally proposed, I think the confusion we've managed to generate is reason enough to leave it as it is! :-)

Cheers,
David.
Title: Flaw in ReplayGain spec
Post by: Lear on 2002-05-21 16:25:25
And Case saw a quite good reason to keep it the way it is (and that's what I've based my latest changes to VorbisGain on).
Title: Flaw in ReplayGain spec
Post by: 2Bdecided on 2002-05-21 16:53:10
Great!

back to the issue in hand... what are the vorbis people saying at the moment about RG tags?
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-21 18:25:59
Quote
Originally posted by sam

Also, storing a +/- could confuse a few: Does -6 mean I need to decrease the volume of this file by 6 to get to the 83, or does it mean this file is 6dB quieter then the 83 standard.


Since the tag is called 'gain' and not 'level', it should be obvious...

--
GCP
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-21 18:47:53
Quote
Originally posted by 2Bdecided
Great!

back to the issue in hand... what are the vorbis people saying at the moment about RG tags?


Beats me. I'm no longer willing to discuss this issue on IRC. If there are arguments against the proposal, they are free to discuss them here. Paradox said he would keep following this thread earlier on, so if he keeps his promise, maybe now is a good time for him to comment as CEO of Xiph. (Edit:I've emailed him and asked him to read up on the thread)

Discussing on IRC is tiring, troublesome to time and has been totally nonconstructive. Sometimes fallacies are thrown against the proposal that I cannot refute instantly at 3am. (Not to mention the annoyance of people putting their fingers in their ears and starting to cry 'neener neener' against your arguments). Contrary to email, this board is open for anyone's opinion and I think it's an excellent place to discuss this.

I'm glad we've settled how things should look. If I got it correctly, this is the way the tags look now:

REPLAYGAIN_ALBUM_GAIN=-6.43 dB
REPLAYGAIN_TRACK_GAIN=+1.20 dB
REPLAYGAIN_ALBUM_PEAK=1.12443
REPLAYGAIN_TRACK_PEAK=1.04343
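
For player authors, a minimal sketch of how these four tags might be read and combined with clipping prevention. The function names are hypothetical, and limiting the scale to 1/peak is just one simple way to do clipping prevention, assumed here for illustration; it is not taken from any existing plugin.

```python
# Hypothetical sketch: parse the tags above and derive a playback scale factor.

def parse_replaygain(comments):
    """Turn 'KEY=value' comment strings into a dict of floats (REPLAYGAIN_* only)."""
    values = {}
    for line in comments:
        key, _, raw = line.partition("=")
        key = key.strip().upper()
        if not key.startswith("REPLAYGAIN_"):
            continue
        raw = raw.strip()
        # Gain values carry a "dB" suffix; peak values are plain numbers.
        if raw.lower().endswith("db"):
            raw = raw[:-2].strip()
        values[key] = float(raw)
    return values


def playback_scale(gain_db, peak, preamp_db=0.0):
    """Linear scale for the samples, reduced if applying it would clip."""
    scale = 10.0 ** ((gain_db + preamp_db) / 20.0)
    if peak > 0.0 and scale * peak > 1.0:
        scale = 1.0 / peak  # assumed clipping prevention: cap so the peak hits 1.0
    return scale


tags = parse_replaygain([
    "REPLAYGAIN_ALBUM_GAIN=-6.43 dB",
    "REPLAYGAIN_TRACK_GAIN=+1.20 dB",
    "REPLAYGAIN_ALBUM_PEAK=1.12443",
    "REPLAYGAIN_TRACK_PEAK=1.04343",
])

# Album mode uses the album pair, so every track on the album gets the same scale.
album_scale = playback_scale(tags["REPLAYGAIN_ALBUM_GAIN"], tags["REPLAYGAIN_ALBUM_PEAK"])
# Track/radio mode uses the track pair.
track_scale = playback_scale(tags["REPLAYGAIN_TRACK_GAIN"], tags["REPLAYGAIN_TRACK_PEAK"])
```

With a stored album peak, the album-mode scale stays the same for the whole album without the player having to open the other files first.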

Lear, if updating the tool is finished and debugging and testing done, please give Peter P (WinAmp) and zinx at xmms.org a ring and ask them to update their player plugins as well (and explain the changes needed); awaiting a Vorbis reaction, we will want to update the players that already support the old proposal in any case.

--
GCP
Title: Flaw in ReplayGain spec
Post by: Emmett_v2 on 2002-05-21 19:48:10
Quoth Garf:

"Beats me. I'm no longer willing to discuss this issue on IRC. If there are arguments against the proposal, they are free to discuss them here. Paradox said he would keep following this thread earlier on, so if he keeps his promise, maybe now is a good time for him to comment as CEO of Xiph."

I've kind of been hanging back, looking at what's been going on here, seeing the best way to go about getting replaygain implemented.

Here's my current problem:

I have a very small staff, and they're very busy working on 1.0 of Vorbis and the Vorbis spec. I think it's clear to even the most casual observer that although replaygain support is important and useful, there are other, more pressing needs at the moment. That's number one, and that's outside the realm of even discussing replaygain.

Number two, I still am not convinced that adding replaygain tags is the best solution to implement replaygain. I am convinced that it would be the easiest solution, but I am not convinced that it would be the right one. Let me go back to my original point a little bit.

1.0 (and the spec) is due to be published very soon. Implementing a quick-and-dirty solution for replaygain is likely not a good idea, for a couple reasons. One, it's not a good idea to define temporary solutions in a 1.0 release. Two, if we do something we're going to change later, that means that you'll be bugging your player authors twice.

Unfortunately, there are pressing issues for Vorbis that are more important than replaygain support. I'm aware that it's important to a lot of people (including myself), but we don't have the time to implement it in the best possible way. When we have the time to implement replaygain data (which will likely be in a metadata stream), it will be done, I promise.

I'm very sorry if people are let down by this, but it's what we have to do. For every ten people that are screaming for replaygain support, there are a hundred people screaming for a specification. People shouldn't be screaming at all, but they do, at us, all the time.

Please remember that we are basically volunteers. We work on this stuff full-time when we could easily go out and find well-paying jobs elsewhere. We do this because we love it, and because we think it's important. People tend to lose sight of the fact that we do good work, and we give it away.

I understand that a lot of the people in the current discussion want replaygain implemented because they want what's best for the format, and I appreciate that immensely. I agree with them, and when they take the time to openly discuss their needs without resorting to insults, it behooves me to listen. Thanks to all the helpful posters in this thread.

That's about all for now. I'll keep reading.

Emmett Plant
CEO, Xiph.org Foundation
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-21 21:03:02
First, thanks for the speedy reply!

Quote
Originally posted by Emmettfish

I have a very small staff, and they're very busy working on 1.0 of Vorbis and the Vorbis spec. I think it's clear to even the most casual observer that although replaygain support is important and useful, there are other, more pressing needs at the moment. That's number one, and that's outside the realm of even discussing replaygain.


I realize this; a full devotion of the devteam towards RG support is not for today or even the near or foreseeable future. This is why I started working on it in the first place.

Quote
Number two, I still am not convinced that adding replaygain tags is the best solution to implement replaygain. I am convinced that it would be the easiest solution, but I am not convinced that it would be the right one. Let me go back to my original point a little bit. 

I agree the tags are likely not the best place to store it - I already stated that in the thread above as well.

Quote
1.0 (and the spec) is due to be published very soon. Implementing a quick-and-dirty solution for replaygain is likely not a good idea, for a couple reasons. One, it's not a good idea to define temporary solutions in a 1.0 release. Two, if we do something we're going to change later, that means that you'll be bugging your player authors twice.


There is a lot of sense in not wanting to release a temporary solution, but it should not be generalized. If that is the case, we might not release 1.0 with the unavoidable preecho problems as well. Better wait till we finish 2.0 with wavelets so we won't have to bug the player authors twice.

That is not what we want, is it?

A temporary solution makes sense when there is no real solution in sight for the foreseeable future. If you want to attach warning signs to it 'WARNING TEMPORARY HACK', feel free to do so. But don't just shoot it down without proposing a sensible alternative.

I don't feel it's justified to call the current proposal quick-n-dirty. The vorbis implementation predates your involvement with Xiph and it is based on the by now quite matured work of David Robinson. It's a complete solution whose only drawback is that the data is not stored in the best place imaginable - necessarily so, since there is no other place for now and for the time to come.

The temporary solution is already out there. ReplayGain has been in use with Vorbis for half a year now. It was unofficial at first, but since there was interest from the Vorbis side as well, it got semi-official with talks of support in the libs. It wasn't clear to me there was a problem with the current method until things erupted on IRC (that was after several core Vorbis developers emailed me they were ok with the proposed change, even). Things went from 'we're ok with this and like it' to 'it totally sucks and should die ASAP'.

I'm now in the unfortunate situation that the tools and player support are already out there, the proposal being ok with Vorbis developers or not. That's not my fault - the reason it got adopted by the players was because the player authors _wanted_ it. For the same reason, bugging them to change it is no problem at this point. (More so because the changes are minimal and we wrote most code ourselves anyway)

I proposed a change to the temporary hack because I saw a problem with it. I can leave it unchanged and not support it any longer, but it's not going to die. The feature is too important for some users. It will stay in use until Xiph comes with its own official implementation that is at least as good. If that's going to happen, I would at least like the hack to be as clean as possible, hence the proposed change.

The disagreement on what info (album/track/peak) should be stored is a non-issue because of this. It is clear at this point Xiph does not support the current proposal as something official at all, so no agreement is strictly needed - the proposal will be replaced at some unspecified point of time in the future anyway.

At the point Xiph does its own ReplayGain support it will quickly become obvious what is needed or not - if the official support is significantly more problematic (or misses essential features) than the temporary hack (which, IMHO, will happen unless Monty revises some of his opinions) then I will leave it up to you to draw the obvious conclusions.

Considering the choice right now is between doing nothing and leaving a somewhat broken proposal out there, or correcting it and having almost-but-just-not-completely-perfect support out there right now, which can work until Vorbis finishes ReplayGain support itself (i.e. not for _quite_ a while), I will have to choose the latter.

The only real problem that will result from this is that player authors will not be inclined to support the temporary hack taking this into account. Considering the two most important players (WinAmp/XMMS), which are the ones I use, have already adopted the temporary proposal, it's not a big problem. The others can do so if they feel like it, or might be inclined to if user pressure to do so exists.

The net result will be that for most users Vorbis will effectively have ReplayGain support _now_, even if it is nonofficial.

The nice side effect is that there will be less pressure on the Vorbis team to finish their own implementation.

The only bad thing is that work is going to be duplicated to some extent. Not a problem for me, nor do I suspect it will be for Lear. As long as it is made clear to player authors the solution is temporary, it is completely up to them to go for it or not.

Quote
Please remember that we are basically volunteers. We work on this stuff full-time when we could easily go out and find well-paying jobs elsewhere. We do this because we love it, and because we think it's important. People tend to lose sight of the fact that we do good work, and we give it away.


You don't have to tell us that - most of the people that have been involved in this are in exactly the same situation, with the difference that they receive zip for their efforts.

--
GCP
Title: Flaw in ReplayGain spec
Post by: Emmett_v2 on 2002-05-21 22:19:02
Quoth Garf:

"I don't feel it's justified to call the current proposal quick-n-dirty. The vorbis implementation predates your involvement with Xiph and it is based on the by now quite matured work of David Robinson. It's a complete solution whose only drawback is that the data is not stored in the best place imaginable - necessarily so, since there is no other place for now and for the time to come."

I didn't mean to imply that RG was quick-and-dirty, only the proposed implementation. This is a common theme, I'm finding. People assume that I don't like RG, that I think it's useless, that I've never tried it, etc. Balderdash. My primary concern is that the current proposed implementation puts data where data don't belong.

Let me repeat this again. I don't have a problem with replaygain. I think it's useful. My problem is not with the methodology used to do what it does, my problem is that the proposed implementation does not jibe with our standard.

And one more time, for those who missed it. I think that replaygain serves a powerful and useful purpose. My problem is not with what it does or what need it serves, my problem lies in the proposed implementation.

I hope that's clear now.

There is a tremendous amount of work that needs to be done, and this particular issue is driving me to distraction. At the end of the day, I have to look at our mission statement and see where things fit. It's my job to facilitate and manage the creation, production and maintenance of Open Multimedia. That's my primary concern.

I feel that a lot of people don't recognize this, and they bang on the 'I want my favorite feature implemented right now' door, with little concern that there may be other things that are simply more important. They're like the Comic Book Guy on the Simpsons. If we don't implement what they want right away, it's 'Worst Codec Ever.'

I feel fairly secure that at the end of the day, the people who really want this feature will implement it themselves, for themselves. After all, they've been doing it for a while now already. This is great, but there is a driving desire to see RG implemented on a much larger scale. That's okay, and definitely understandable (it's useful), but it's not my primary objective.

Emmett Plant
CEO, Xiph.org Foundation
Title: Flaw in ReplayGain spec
Post by: john33 on 2002-05-21 23:11:38
I have no wish to get heavily involved in this discussion, but one thing is clear to me. The current and, for the near future, proposed solution holds the data in a place and in a way that lends itself very readily to conversion at some later date. Since it also, so far as I can see, causes no real problem where it currently resides, it is surely a fairly elegant, if temporary, solution?

Maybe I am misunderstanding something here and, no doubt, someone will point that out if it is the case, but since no other immediate solution seems to be on the table, this one works and can later be amended if appropriate without too much pain.
Title: Flaw in ReplayGain spec
Post by: HotshotGG on 2002-05-22 00:25:46
Quote
You don't have to tell us that - most of the people that have been involved in this are in exactly the same situation, with the difference that they receive zip for their efforts.


This is true, but a lot of people rule that out or just forget about it.

I don't have much to say about the ReplayGain spec; it works either way for me.
Title: Flaw in ReplayGain spec
Post by: Lear on 2002-05-22 15:09:45
Quote
Originally posted by Garf
Lear, if updating the tool is finished and debugging and testing done, please give Peter P (WinAmp) and zinx at xmms.org a ring and ask them to update their player plugins as well (and explain the changes needed); awaiting a Vorbis reaction, we will want to update the players that already support the old proposal in any case.


Well, the code passes a basic test now, but I guess some more testing would be in order, even though there isn't much new code.  I'll email them (john33 included) this weekend at the latest.
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-22 18:19:16
Quote
Originally posted by Emmettfish
Quoth Garf:
I didn't mean to imply ...[snip]... I hope that's clear now.


I have said at least three times now in this thread I agree the data is not put where it belongs, so you don't have to try to convince me of that in every post you write.

Quote
I feel fairly secure that at the end of the day, the people who really want this feature will implement it themselves, for themselves. After all, they've been doing it for a while now already. 


...and we will continue doing it, thank you very much ;-)

--
GCP
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-05-22 18:24:52
Quote
Originally posted by john33
I have no wish to get heavily involved in this discussion, but one thing is clear to me. The current and, for the near future, proposed solution holds the data in a place and in a way that lends itself very readily to conversion at some later date. Since it also, so far as I can see, causes no real problem where it currently resides, it is surely a fairly elegant, if temporary, solution?

Maybe I am misunderstanding something here and, no doubt, someone will point that out if it is the case, but since no other immediate solution seems to be on the table, this one works and can later be amended if appropriate without too much pain.


Thank you for restoring my faith in mankind.

--
GCP
Title: Flaw in ReplayGain spec
Post by: john33 on 2002-05-22 18:45:47
To Lear - Thanks.

And, to Garf - You're welcome!
Title: Flaw in ReplayGain spec
Post by: lijil on 2002-06-22 22:17:21
Although my technical audio knowledge and programming abilities are not up to the level of most of those posting in this thread, I have read it with great interest. I would just like to ask why the following replay gain tag format is not preferable to the four separate tags format:

eg. REPLAYGAIN=trackpeak;trackgain;albumpeak;albumgain
eg. REPLAYGAIN=0.00000000;000.00;0.00000000;000.00

eg. REPLAYGAIN=1.23456789;-12.34;1.23000000;+02.00

It is still human readable, with the keyword REPLAYGAIN being enough for users new to replay gain to get more info on what the tag does. I don't see why the four values have to be labeled (or 'sub labeled'), most users will never edit them manually as text in the tag editor, and audiophiles would have no problem figuring it out. Whether you make the field size fixed (pos/neg sign required, zero padding, etc.) is really not that big of an issue, but if it were fixed it would IMO help to standardize the replay gain tag even further, just by setting some sort of template.

Thanks!
Title: Flaw in ReplayGain spec
Post by: Garf on 2002-06-23 08:36:13
Quote
Originally posted by lijil

It is still human readable, with the keyword REPLAYGAIN being enough for users new to replay gain to get more info on what the tag does. I don't see why the four values have to be labeled (or 'sub labeled'), most users will never edit them manually as text in the tag editor, and audiophiles would have no problem figuring it out. Whether you make the field size fixed (pos/neg sign required, zero padding, etc.) is really not that big of an issue, but if it were fixed it would IMO help to standardize the replay gain tag even further, just by setting some sort of template.


1) easier to parse
2) field _must_ be editable in a tag editor that does not know about ReplayGain
3) field _cannot_ be fixed-size
4) field _cannot_ be dependent on exact formatting
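
To illustrate the four points above, here is a minimal sketch of how a player might read four separate tags without depending on field width, sign padding or a trailing unit. The tag names (REPLAYGAIN_TRACK_GAIN and so on) and the helper names are assumptions for illustration; the thread itself only speaks of REPLAYGAIN_* tags.

```python
import re

# Assumed tag names; the thread only refers to them as REPLAYGAIN_*.
FIELDS = (
    "REPLAYGAIN_TRACK_PEAK",
    "REPLAYGAIN_TRACK_GAIN",
    "REPLAYGAIN_ALBUM_PEAK",
    "REPLAYGAIN_ALBUM_GAIN",
)

# Optional sign, then digits with an optional fraction (or a bare fraction).
_NUMBER = re.compile(r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)")


def parse_value(text):
    """Return the first number found in a tag value, or None if there is none."""
    match = _NUMBER.search(text)
    return float(match.group()) if match else None


def read_replaygain(comments):
    """Collect ReplayGain values from a list of 'NAME=value' comment strings.

    Field names are matched case-insensitively, unknown fields are skipped,
    and values that contain no number are ignored rather than raising.
    """
    result = {}
    for comment in comments:
        name, sep, value = comment.partition("=")
        if not sep:
            continue  # not a NAME=value pair
        name = name.strip().upper()
        if name in FIELDS:
            parsed = parse_value(value)
            if parsed is not None:
                result[name] = parsed
    return result


comments = [
    "TITLE=Example",
    "replaygain_track_gain=-6.2 dB",   # hand-edited, lower case, with unit
    "REPLAYGAIN_TRACK_PEAK=0.98",
    "REPLAYGAIN_ALBUM_GAIN= +02.00",   # padded, leading space
]
values = read_replaygain(comments)
```

Each field stands alone here, so a missing album peak or a value retyped by hand in a generic tag editor does not affect the other three. A single packed REPLAYGAIN=a;b;c;d field would need a positional split, which fails as soon as one value is dropped or reordered.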

--
GCP
Title: Flaw in ReplayGain spec
Post by: mijj on 2002-07-22 02:16:18
This discussion on VorbisGain suggests there may be a lot more utility to the tags than just carrying bits of user information.

Has there been any thought of there being more than one type of tag?

I.e. 
- Variable tags
(such as the ones that exist now - act as labels attached to the music - casually editable by the average person and only for human consumption)
+
- Constant tags
(information of specialised interest to some users, and intimately related to the way the music is machine interpreted - could there be other tags (apart from gain) which could control the way the information is interpreted by a decoder here? - the information is only adjustable after software has opened a lock - thus it can't be casually changed by a novice, such as myself messing with the gain tags because I was fresh and innocent and thought they couldn't be important because they were amongst all those other editable tags that were purely cosmetic)
Title: Flaw in ReplayGain spec
Post by: Lear on 2002-07-22 11:22:12
Quote
Originally posted by mijj
This discussion on VorbisGain suggests there may be a lot more utility to the tags than just carrying bits of user information.

Has there been any thought of there being more than one type of tag?


Sort of. There have been discussions about adding a metadata stream to Ogg Vorbis files, for storing data more suitable for machine interpretation, or that is otherwise unsuitable to store as a tag (such as lyrics, I guess). That would be a bit like the constant tags you suggest.
Title: Flaw in ReplayGain spec
Post by: Jon Ingram on 2002-07-22 12:42:04
Quote
Sort of. There have been discussions about adding a metadata stream to Ogg Vorbis files, for storing data more suitable for machine interpretation, or that is otherwise unsuitable to store as a tag (such as lyrics, I guess). That would be a bit like the constant tags you suggest.

Yep, and it's more general than that.
People have wanted a general metadata stream for ages, but there have been no serious discussions about where it should go, what it should look like, and what it should contain. You can expect these discussions to be long, vitriolic, tedious, and almost completely pointless.
There is a need to define some information stream, particularly with Ogg being used as an AVI replacement (including multiple audio streams, multi-language subtitles, etc.). Either something is hashed out quickly, or the people that are actually *using* the format (via the closed-source DirectShow filters) will just establish a de facto standard, and we'll just have to live with it.
Title: Flaw in ReplayGain spec
Post by: mijj on 2002-07-22 17:57:15
< ... mijj contributes with a confidence and confusion born of innocence and ignorance ...>

... erm ... how about if you were able to include Java code as a hidden tag (and ignorable) - so you could enable a sort of user definable pre-processing facility.  E.g. use it to allow for scrambling and unscrambling voice messages.  ... allow calls to internet pages so you could be bugged by advertising while you play that particular Vorbis file. (.. erk!).  ... Use the Java code to generate event triggers based on the coded sound for synchronisation with external processes... etc.
Title: Flaw in ReplayGain spec
Post by: rjamorim on 2002-07-22 20:01:18
Quote
Originally posted by mijj
< ... mijj contributes with a confidence and confusion born of innocence and ignorance ...>

... erm ... how about if you were able to include Java code as a hidden tag (and ignorable) - so you could enable a sort of user definable pre-processing facility.   E.g. use it to allow for scrambling and unscrambling voice messages.  ... allow calls to internet pages so you could be bugged by advertising while you play that particular Vorbis file. (.. erk!).  ... Use the Java code to generate event triggers based on the coded sound for synchronisation with external processes... etc.


That would be very insecure. People could start adding malicious Java code to their oggs and uploading them to FTP servers. (Or sharing them.)
Title: Flaw in ReplayGain spec
Post by: smok3 on 2002-07-23 03:08:03
a question:

when turning on RG in the mpc winamp decoder, it will turn down the volume even for songs which don't have the RG tags (probably to some reference level?), which seems like the correct action. the vorbis decoder does not do the same: it plays files without tags at the original 'loudness'. why is that so?

p.s. in_vorbis.dll is v1.2 b22 and VorbisGain v0.30.
Title: Flaw in ReplayGain spec
Post by: mijj on 2002-07-26 16:25:20
... and speaking of tags ...

... how come whoever-it-was decided not to include track and album rms value tags?  They could have been useful on playback too. (I think)
Title: Flaw in ReplayGain spec
Post by: Case on 2002-07-26 19:21:46
Quote
Originally posted by smok3
when turning on RG in the mpc winamp decoder, it will turn down the volume even for songs which don't have the RG tags (probably to some reference level?)

This is only true if you have over 14 dB headroom. At K-14 the volume will be identical to the original.
Title: Flaw in ReplayGain spec
Post by: greenirft on 2002-08-27 16:27:33
I don't quite understand what the deal with the tags is:

Vorbis (and Xiph) do not wish to have the REPLAYGAIN_* tags be official tags, but why does this really matter? Are you suggesting calculating replaygain immediately after encoding (essentially putting calculation into oggenc)?

I would think that IF you were calculating right after encoding that wouldn't be needed. The encoder should simply encode the files, and once encoded a separate program could calculate the replaygain stuff. The separate program could add the tags, and as long as the player and plugin developers agreed to a standard, all would be well.

The original suggestions about changes to the tags were nice though: they make the tags easier to read, and if someone were to download a file and notice the REPLAYGAIN tags they could easily get more information (except if you notice what people are sharing, most people don't even bother with doing ID3 properly)
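
For the player side of such an agreement, here is a minimal sketch of turning the stored gain and peak into a playback scale factor, with the clipping prevention discussed at the start of this thread. The tag names, function names and the preamp parameter are assumptions for illustration, not what any existing plugin actually does.

```python
def select_values(tags, album_mode):
    """Pick (gain_db, peak) for track or album mode; either may be None.

    `tags` maps assumed names such as REPLAYGAIN_ALBUM_GAIN to floats.
    """
    prefix = "REPLAYGAIN_ALBUM_" if album_mode else "REPLAYGAIN_TRACK_"
    return tags.get(prefix + "GAIN"), tags.get(prefix + "PEAK")


def playback_scale(gain_db, peak, preamp_db=0.0, prevent_clipping=True):
    """Return the linear factor to multiply decoded samples by.

    gain_db: stored gain in dB, or None if the file has no tag.
    peak:    stored peak as a fraction of full scale, or None.
    """
    if gain_db is None:
        return 1.0  # untagged file: play at the original level
    scale = 10.0 ** ((gain_db + preamp_db) / 20.0)
    # Clipping prevention: if the scaled peak would exceed full scale,
    # lower the scale so that the peak lands exactly on full scale.
    if prevent_clipping and peak and scale * peak > 1.0:
        scale = 1.0 / peak
    return scale


tags = {
    "REPLAYGAIN_TRACK_GAIN": 6.0,
    "REPLAYGAIN_TRACK_PEAK": 0.9,
    "REPLAYGAIN_ALBUM_GAIN": -3.0,
    "REPLAYGAIN_ALBUM_PEAK": 1.0,
}
track_scale = playback_scale(*select_values(tags, album_mode=False))
album_scale = playback_scale(*select_values(tags, album_mode=True))
```

In album mode every track of the album carries the same album gain and album peak, so the same scale is used for all of them and clipping prevention cannot change the relative levels within the album. That is the reason for storing an album peak rather than scanning every file at playback time.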
Title: Flaw in ReplayGain spec
Post by: SometimesWarrior on 2002-08-27 20:57:12
I love how this thread keeps getting resurrected from the dead 

Where's rjamorim's Batman image when you need it?