HydrogenAudio

Hydrogenaudio Forum => Site Related Discussion => Topic started by: hans-jürgen on 2002-12-22 16:20:33

Title: HTML header meta-tags
Post by: hans-jürgen on 2002-12-22 16:20:33
When I look at the header of the HTML source, I can find some meta-tags that raise some questions for me:

[META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"]

Is this character set the default setting for all pages in Hydrogen Audio? It might explain why I sometimes have difficulties with extended characters of some posters like apostrophes or the Euro symbol. Perhaps another character set would be better like UTF-8? My browser MSIE 5.5 is set to ISO-8859-1 = Western Europe right now, by the way...

[META name="keywords" content="Hydrogen Audio, Audio, Compression, Audio Compression, MP3, Ogg Vorbis, encoder, codec, decoder, open source, EAC, CBR, VBR, music, settings, switches, portable, samples, listening test, ripping, audiophile, hi-fi, headphones, transparency, MPC, AAC, lossless, transcode, Fraunhofer, FhG, LAME, Psytel, Musepack"]

I miss e.g. MP4 in these keywords while others are redundant like mentioning "audio compression" twice or not separating "ogg" and "vorbis" with a comma. By the way, search robots usually don't index so many keywords as meta-tags (if at all), so it would be better to clean them up, i.e. deleting the very unspecific ones and putting the important ones like the file extensions at the beginning of the tag.

[META NAME="ROBOTS" CONTENT="ALL"]

I'm not sure this is necessary at all if you simply want the search robots to index the whole website from A to Z. But this also makes me wonder if anybody has ever subscribed www.hydrogenaudio.org to the most prominent search engines like Google, Altavista and Yahoo? Because if you don't, no search robot will ever come to index your site. And this seems to be the case with Hydrogen Audio, as I never get an HA forum posting among the first results when searching the web for an audio-compression-related question.
Title: HTML header meta-tags
Post by: CiTay on 2002-12-22 17:18:00
I wrote most of the header.

Quote
Is this character set the default setting for all pages in Hydrogen Audio? It might explain why I sometimes have difficulties with extended characters of some posters like apostrophes or the Euro symbol. Perhaps another character set would be better like UTF-8?


Yes, all pages have that header. ISO-8859-1 is Latin-1 and covers English as well as most Western European languages, and is in fact the default charset in most browsers. Also, English is the presupposed language here. But you are right, the Euro symbol sometimes causes problems... we should see how UTF-8 behaves.

Quote
I miss e.g. MP4 in these keywords while others are redundant like mentioning "audio compression" twice or not separating "ogg" and "vorbis" with a comma.


Hehe, how did I know that you'd miss MP4. I compiled this list in 5 minutes, and JohnV or Dibrom made some additions. "audio compression" is mentioned twice, because "audio", "compression" and "audio compression" can be regarded as three different keywords.

Quote
By the way, search robots usually don't index so many keywords as meta-tags (if at all), so it would be better to clean them up, i.e. deleting the very unspecific ones and putting the important ones like the file extensions at the beginning of the tag.


Hmm, really? I've seen much bigger lists. Anyway, good suggestion.

Quote
But this also makes me wonder if anybody has ever subscribed www.hydrogenaudio.org to the most prominent search engines


I'm not certain about this... I somehow presumed it, but never really thought about it. Dibrom or JohnV can probably answer this.
Title: HTML header meta-tags
Post by: hans-jürgen on 2002-12-22 20:12:59
Quote
Quote
Is this character set the default setting for all pages in Hydrogen Audio? It might explain why I sometimes have difficulties with extended characters of some posters like apostrophes or the Euro symbol. Perhaps another character set would be better like UTF-8?

Yes, all pages have that header. ISO-8859-1 is Latin-1 and covers English as well as most Western European languages, and is in fact the default charset in most browsers. Also, English is the presupposed language here. But you are right, the Euro symbol sometimes causes problems... we should see how UTF-8 behaves.

That becomes difficult if someone doesn't use the same character set in his postings and inserts characters that do not exist in ISO-8859-1 or sit at different positions in its table. "Honz...", for example, uses some Mozilla version that probably doesn't default to this character set, and types apostrophes with a different key than the usual one. When I reply to him, the apostrophes in his quotes come out garbled as weird numbers once the post appears on HA (but not in my browser cache while I'm writing my answer, so I only notice the problem after I've already posted my reply).

The same must happen with the Euro sign, because ISO-8859-1 doesn't include this symbol (ISO-8859-15 would be the correct extended character set).
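
To make the charset point concrete, here is a minimal Python sketch that checks which of the three charsets mentioned in this thread can represent the Euro sign and a typographic (curly) apostrophe. The helper names are made up for illustration. The last call shows one plausible source of the "weird numbers": a character that doesn't fit the page charset being replaced by a numeric character reference. Whether that is what actually happened with the forum software here is an assumption.

```python
# Sketch only: helper names are hypothetical, not part of any forum software.
EURO = "\u20ac"              # Euro sign
CURLY_APOSTROPHE = "\u2019"  # typographic apostrophe (right single quotation mark)


def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def as_numeric_reference(text: str, charset: str) -> bytes:
    """Encode text, replacing unencodable characters with &#NNNN; references."""
    return text.encode(charset, errors="xmlcharrefreplace")


if __name__ == "__main__":
    for charset in ("iso-8859-1", "iso-8859-15", "utf-8"):
        print(charset,
              "euro:", can_encode(EURO, charset),
              "curly apostrophe:", can_encode(CURLY_APOSTROPHE, charset))
    # Assumed mechanism behind the garbled quotes: a numeric reference
    # standing in for a character that ISO-8859-1 cannot hold.
    print(as_numeric_reference(CURLY_APOSTROPHE, "iso-8859-1"))
```

Note that ISO-8859-15 only helps with the Euro sign; the curly apostrophe is missing from both ISO-8859 variants, so of the three only UTF-8 covers both characters.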

Quote
Quote
I miss e.g. MP4 in these keywords while others are redundant like mentioning "audio compression" twice or not separating "ogg" and "vorbis" with a comma.

Hehe, how did I know that you'd miss MP4. I compiled this list in 5 minutes, and JohnV or Dibrom made some additions. "audio compression" is mentioned twice, because "audio", "compression" and "audio compression" can be regarded as three different keywords.


I also miss mp3PRO and WMA, by the way... And you don't have to list combined expressions like the one above separately, because the search engine will do this job if someone explicitly types "audio compression" into the search field. The same is true for "ogg" and "vorbis", so you don't need another "ogg vorbis" in the keyword list. Lowercase letters are enough, too, because the search robots don't give a damn.
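
The cleanup suggested above (lowercase everything, split combined expressions into single words, drop repeats) can be sketched in a few lines of Python. This is only an illustration of the suggestion, not a statement about how any search engine treats keywords; the function name and the shortened sample list are made up for the example.

```python
def clean_keywords(raw: str) -> list[str]:
    """Lowercase a comma-separated keyword list, split phrases into
    single words, and drop duplicates while keeping the original order."""
    seen = set()
    cleaned = []
    for phrase in raw.split(","):
        for word in phrase.lower().split():
            if word not in seen:
                seen.add(word)
                cleaned.append(word)
    return cleaned


if __name__ == "__main__":
    # Shortened sample taken from the start of the current keyword tag.
    sample = "Hydrogen Audio, Audio, Compression, Audio Compression, MP3, Ogg Vorbis"
    print(", ".join(clean_keywords(sample)))
    # prints: hydrogen, audio, compression, mp3, ogg, vorbis
```

Splitting every phrase is a blunt rule: expressions such as "listening test" or "open source" arguably belong together, so a real cleanup would probably keep a few of them by hand.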

Quote
Quote
By the way, search robots usually don't index so many keywords as meta-tags (if at all), so it would be better to clean them up, i.e. deleting the very unspecific ones and putting the important ones like the file extensions at the beginning of the tag.

Hmm, really? I've seen much bigger lists. Anyway, good suggestion.


Me too, but only because most people don't know about these meta-tags, simply don't care, or think keywords are the most important part of designing a website for a high ranking in the important search engines. It's much more important that the listed keywords appear in the body of an HTML page as often as possible, because some search engines index only this part or compare header and body to avoid high rankings for spammer sites and the like.

Quote
Quote
But this also makes me wonder if anybody has ever subscribed www.hydrogenaudio.org to the most prominent search engines

I'm not certain about this... I somehow presumed it, but never really thought about it. Dibrom or JohnV can probably answer this.


If they've missed this, or it was done long ago, it doesn't do any harm to repeat it now and then. Just don't repeat it within a week or two, because the search engines would consider that an attempt to push the ranking of a spammer site; they would block the URL and it wouldn't appear as a search result at all.
Title: HTML header meta-tags
Post by: Andavari on 2002-12-22 23:46:29
Another keyword that should be implemented is "Discussion."

Doing a Google search for "audio discussion"
http://www.google.com/search?q=audio+discussion has some rather unimpressive results.
Title: HTML header meta-tags
Post by: smok3 on 2002-12-23 00:31:18
HA was posted to Google, but as already mentioned in some thread, there are some problems with indexing the Invision board.
(example of a bad result, searching for 'vorbis replaygain': http://www.google.com/search?q=vorbis+replaygain+site%3Ahydrogenaudio.org )

edit, some rumble is here:
http://forums.invisionpower.com/index.php?act=ST&f=30&t=35034&hl=google
http://forums.invisionpower.com/index.php?act=ST&f=7&t=36547&hl=google
Title: HTML header meta-tags
Post by: hans-jürgen on 2002-12-23 09:17:12
Quote
HA was posted to Google, but as already mentioned in some thread, there are some problems with indexing the Invision board.
edit, some rumble is here:
http://forums.invisionpower.com/index.php?act=ST&f=30&t=35034&hl=google

Thanks. In the first thread, "sallam" suggested a modification to the IBF source that deals with the session IDs normally inserted into the URL for every user (unless the user explicitly disables this in his profile). Sallam published it here (http://www.ibresource.com/?pg=db&mod=775), but apparently nobody tested it except him. Nevertheless, he claimed that switching off the session IDs entirely made the Google search robot index his complete forum after the modification.

Maybe it would also be possible to disable the session-ID insertion for guests only (not for registered members) in the IBF source code, because guests shouldn't need those IDs or cookies anyhow. Then a search robot, acting like a guest, could reach the forum postings and index them without a problem, because it wouldn't stumble across those session IDs in the URLs.
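
To illustrate the guests-only idea, here is a minimal Python sketch of a URL filter that removes the session parameter for guests and leaves member links alone. It is not the IBF code (which is not written in Python) and not sallam's modification. The parameter name "s", the function name, and the example.org URL are all assumptions made for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the session ID travels in a query parameter called "s".
SESSION_PARAM = "s"


def strip_session_id(url: str, is_guest: bool) -> str:
    """Return url without the session parameter for guests (and thus for
    search robots, which browse as guests); member URLs are left unchanged."""
    if not is_guest:
        return url
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != SESSION_PARAM]
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    link = "http://example.org/index.php?s=0123abcd&act=ST&f=30&t=35034"
    print(strip_session_id(link, is_guest=True))
    print(strip_session_id(link, is_guest=False))
```

With the session ID gone, every guest sees the same stable URL for a topic, which is what a search robot needs in order to index a page once instead of meeting a new address on every visit.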
Title: HTML header meta-tags
Post by: smok3 on 2002-12-25 17:17:02
session cookies are also annoying for people, not only for google imho; urls that are too long or that need editing every time are also listed in the top ten of major web design mistakes (from the local computer mag).

edit: luckily this can be turned off in favor of client side cookies.
edit2: my guess is that the default setting is server side cookies ON; if that defaulted to OFF, google would have a chance of indexing this board (not sure if the 'forced' client side cookies will work for the spider(s) tho?)