HTML header meta-tags

When I look at the header of the HTML source, I can find some meta-tags that raise some questions for me:

[META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"]

Is this character set the default setting for all pages in Hydrogen Audio? It might explain why I sometimes have difficulties with extended characters of some posters like apostrophes or the Euro symbol. Perhaps another character set would be better like UTF-8? My browser MSIE 5.5 is set to ISO-8859-1 = Western Europe right now, by the way...

[META name="keywords" content="Hydrogen Audio, Audio, Compression, Audio Compression, MP3, Ogg Vorbis, encoder, codec, decoder, open source, EAC, CBR, VBR, music, settings, switches, portable, samples, listening test, ripping, audiophile, hi-fi, headphones, transparency, MPC, AAC, lossless, transcode, Fraunhofer, FhG, LAME, Psytel, Musepack"]

I miss e.g. MP4 in these keywords while others are redundant like mentioning "audio compression" twice or not separating "ogg" and "vorbis" with a comma. By the way, search robots usually don't index so many keywords as meta-tags (if at all), so it would be better to clean them up, i.e. deleting the very unspecific ones and putting the important ones like the file extensions at the beginning of the tag.

[META NAME="ROBOTS" CONTENT="ALL"]

I'm not sure if this is necessary at all, if you just want the search robots to simply index the whole website from A-Z. But this also makes me wonder if anybody has ever subscribed www.hydrogenaudio.org to the most prominent search engines like Google, Altavista and Yahoo? Because if you don't, no search robot will ever come to index your site. And this seems to be the case with Hydrogen Audio, as I never get a HA forum posting as one of the first results when searching the web for some audio compression related question.
ZZee ya, Hans-Jürgen
BLUEZZ BASTARDZZ - "That lil' ol' ZZ Top cover band from Hamburg..."
INDIGO ROCKS - "Down home rockin' blues. Tasty as strudel."

Reply #1
I wrote most of the header.

Quote
Is this character set the default setting for all pages in Hydrogen Audio? It might explain why I sometimes have difficulties with extended characters of some posters like apostrophes or the Euro symbol. Perhaps another character set would be better like UTF-8?


Yes, all pages have that header. ISO-8859-1 is Latin-1; it covers English as well as most Western European languages, and is in fact the default charset in most browsers. Also, English is the presupposed language here. But you are right, the Euro symbol sometimes causes problems... we should see how UTF-8 behaves.

Quote
I miss e.g. MP4 in these keywords while others are redundant like mentioning "audio compression" twice or not separating "ogg" and "vorbis" with a comma.


Hehe, how did I know that you'd miss MP4? I compiled this list in 5 minutes, and JohnV or Dibrom made some additions. "audio compression" is mentioned twice because "audio", "compression" and "audio compression" can be regarded as three different keywords.

Quote
By the way, search robots usually don't index so many keywords as meta-tags (if at all), so it would be better to clean them up, i.e. deleting the very unspecific ones and putting the important ones like the file extensions at the beginning of the tag.


Hmm really, i've seen much bigger lists. Anyway, good suggestion.

Quote
But this also makes me wonder if anybody has ever subscribed www.hydrogenaudio.org to the most prominent search engines


I'm not certain about this... i somehow presumed it, but never really thought about it. Dibrom or JohnV can probably answer this.

Reply #2
Quote
Quote
Is this character set the default setting for all pages in Hydrogen Audio? It might explain why I sometimes have difficulties with extended characters of some posters like apostrophes or the Euro symbol. Perhaps another character set would be better like UTF-8?

Yes, all pages have that header. ISO-8859-1 is Latin-1; it covers English as well as most Western European languages, and is in fact the default charset in most browsers. Also, English is the presupposed language here. But you are right, the Euro symbol sometimes causes problems... we should see how UTF-8 behaves.

Then it will become difficult if someone uses a different character set in their postings and inserts characters that either do not exist in ISO-8859-1 or sit at different positions in its table. "Honz...", for example, uses some Mozilla version that probably doesn't default to this character set, and he uses a different key for apostrophes than the usual one. So when I reply to him, the apostrophes in his quotes get garbled into weird numbers once the post appears on HA (but not in my browser while I'm writing my answer, so I only notice the problem after I've already posted my reply).
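One plausible explanation for the "weird numbers" (an assumption; nothing in this thread confirms what the forum software actually does) is that characters outside ISO-8859-1 get replaced with numeric character references somewhere between the browser and the stored post. A minimal Python sketch of that effect, using a curly apostrophe (U+2019) as the hypothetical offending character:

```python
def to_latin1_with_refs(text):
    """Encode text as ISO-8859-1, replacing anything the charset
    cannot represent with a numeric character reference."""
    encoded = text.encode("iso-8859-1", errors="xmlcharrefreplace")
    return encoded.decode("iso-8859-1")

# U+2019 is a typographic apostrophe; it has no slot in ISO-8859-1.
sample = "Honz\u2019s post"
converted = to_latin1_with_refs(sample)
print(converted)
```

If the page then shows that reference as literal text instead of interpreting it, the reader sees numbers where the apostrophe should be.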

The same must happen with the Euro sign, because ISO-8859-1 doesn't know this symbol (ISO-8859-15 would be the correct extended character set).
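To make the Euro point concrete, here is a small Python sketch that tries to encode the Euro sign in the character sets discussed in this thread. Windows-1252 (`cp1252`) is included as an extra, on the assumption that some Windows browsers treat pages labelled ISO-8859-1 as Windows-1252; the thread itself doesn't mention it.

```python
EURO = "\u20ac"  # the Euro sign

def try_encode(text, charset):
    """Return the encoded bytes, or None if the charset cannot represent the text."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None

charsets = ("iso-8859-1", "iso-8859-15", "cp1252", "utf-8")
results = {cs: try_encode(EURO, cs) for cs in charsets}

for charset, encoded in results.items():
    print(charset, "->", encoded)
```

ISO-8859-1 has no slot for the symbol at all, while ISO-8859-15, Windows-1252 and UTF-8 each encode it differently, so poster and reader have to agree on the charset for it to display correctly.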

Quote
Quote
I miss e.g. MP4 in these keywords while others are redundant like mentioning "audio compression" twice or not separating "ogg" and "vorbis" with a comma.

Hehe, how did I know that you'd miss MP4? I compiled this list in 5 minutes, and JohnV or Dibrom made some additions. "audio compression" is mentioned twice because "audio", "compression" and "audio compression" can be regarded as three different keywords.


I also miss mp3PRO and WMA, by the way... And you don't have to list combined expressions like the one above separately, because the search engine will do this job if someone explicitly types "audio compression" into the search field. The same is true for "ogg" and "vorbis", so you don't need another "ogg vorbis" in the keyword list. Lowercase letters are enough, too, because the search robots don't give a damn.
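The cleanup suggested above (lowercase everything, split combined expressions into single words, drop duplicates) could be sketched roughly like this in Python. The function name `clean_keywords` is made up for illustration, and the input is just the first few entries of the keyword list quoted at the top of the thread:

```python
def clean_keywords(raw):
    """Lowercase the keyword list, split multi-word phrases into
    single words, and drop duplicates while preserving order."""
    seen = set()
    cleaned = []
    for phrase in raw.split(","):
        for word in phrase.lower().split():
            if word not in seen:
                seen.add(word)
                cleaned.append(word)
    return cleaned

raw = "Hydrogen Audio, Audio, Compression, Audio Compression, MP3, Ogg Vorbis"
keywords = clean_keywords(raw)
print(", ".join(keywords))
```

This only removes redundancy. Deciding which keywords are too unspecific, or which should come first, still needs a human.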

Quote
Quote
By the way, search robots usually don't index so many keywords as meta-tags (if at all), so it would be better to clean them up, i.e. deleting the very unspecific ones and putting the important ones like the file extensions at the beginning of the tag.

Hmm really, i've seen much bigger lists. Anyway, good suggestion.


Me too, but only because most people don't know about these meta-tags, simply don't care, or think keywords are the most important part of designing a website for a high ranking in the important search engines. It's much more important that the listed keywords appear in the body of an HTML page as often as possible, because some search engines index only this part, or compare header and body to avoid giving high rankings to spammer sites and the like.
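A rough way to check a page against this advice is to count how often each meta keyword actually occurs in the body text. The Python sketch below is only an illustration of that header/body comparison, not how any particular search engine works; the function name and the sample text are made up.

```python
import re

def keyword_body_counts(keywords, body_text):
    """Count how often each keyword occurs as a whole word in the body text."""
    words = re.findall(r"[a-z0-9]+", body_text.lower())
    return {kw: words.count(kw.lower()) for kw in keywords}

body = "LAME is an MP3 encoder. Many MP3 files are made with LAME."
counts = keyword_body_counts(["mp3", "lame", "musepack"], body)
print(counts)
```

A keyword with a count of zero appears in the header only, which is exactly the mismatch described above.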

Quote
Quote
But this also makes me wonder if anybody has ever subscribed www.hydrogenaudio.org to the most prominent search engines

I'm not certain about this... i somehow presumed it, but never really thought about it. Dibrom or JohnV can probably answer this.


If they've missed this, or if it was done long ago, it doesn't do any harm to repeat it now and then. But don't resubmit within a week or two, because the search engines would treat that as an attempt to push the ranking of a spammer site; the URL would get blocked and wouldn't appear as a search result at all.
ZZee ya, Hans-Jürgen


Reply #4
HA was submitted to Google, but as already mentioned in some thread, there are some problems with indexing the Invision Board.
( example of a bad result (searching for 'vorbis replaygain') - http://www.google.com/search?q=vorbis+repl...drogenaudio.org )

edit, some rumble is here:
http://forums.invisionpower.com/index.php?...35034&hl=google
http://forums.invisionpower.com/index.php?...36547&hl=google
PANIC: CPU 1: Cache Error (unrecoverable - dcache data) Eframe = 0x90000000208cf3b8
NOTICE - cpu 0 didn't dump TLB, may be hung

Reply #5
Quote
HA was submitted to Google, but as already mentioned in some thread, there are some problems with indexing the Invision Board.
edit, some rumble is here:
http://forums.invisionpower.com/index.php?...35034&hl=google

Thanks. In the first thread, "sallam" suggested a modification to the IBF source dealing with the session-ids that are normally inserted into the URL for every user (unless the user explicitly disables this in their profile). Sallam published it here, but obviously nobody tested it except him. Nevertheless, he claimed that switching off the session-ids globally made the Google search robot index his complete forum after the modification.

Maybe it would also be possible to disable the session-id insertion for guests only (not for registered members) in the source code of IBF, because guests shouldn't need those ids or cookies etc. anyhow. Then a search robot, acting like a guest, could reach the forum postings and index them without a problem, because it wouldn't stumble across those session-ids in the URLs.
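The guests-only idea could look roughly like the Python sketch below. This is not IBF code, and it is not sallam's modification either. The parameter name `s`, the example URL and the function name are all assumptions made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: the session-id travels in a query parameter called "s".
SESSION_PARAM = "s"

def strip_session_id(url, is_guest):
    """Remove the session-id parameter from a URL, but only for guests."""
    if not is_guest:
        return url  # registered members keep their session-id
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != SESSION_PARAM
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))

url = "http://forum.example/index.php?s=0123abcd&showtopic=42"
guest_url = strip_session_id(url, is_guest=True)
member_url = strip_session_id(url, is_guest=False)
print(guest_url)
print(member_url)
```

A robot crawling as a guest would then always see the same URL for a given topic, instead of a new one for every session.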
ZZee ya, Hans-Jürgen

Reply #6
Session cookies are also annoying for people, not only for Google, imho. URLs that are too long or that need editing every time are also listed in the top ten of major web design mistakes (from the local computer mag).

edit: luckily this can be turned off in favor of client side cookies.
edit2: my guess is that server-side cookies are ON by default; if that defaulted to OFF, Google would have a chance of indexing this board (not sure if the 'forced' client-side cookies will work for the spider(s), though?)