HydrogenAudio

Hydrogenaudio Forum => General Audio => Topic started by: Frank Klemm on 2004-02-19 16:57:35

Title: A HA.org Sample Database?
Post by: Frank Klemm on 2004-02-19 16:57:35
Thread split from here (http://www.hydrogenaudio.org/forums/index.php?showtopic=18728&)
-------------------------------------------------------------
Task #1 that I want/must/should/can give away is the XMMS plugin.
An important second task is maintaining of test samples.

All lossy encoders benefit from such a web page:
AAC, AC3, Lame, Musepack, Ogg Vorbis, MP2, dts, ...

Currently there is a very old test sample page for gpsycho:

http://www.mp3dev.org/mp3/gpsycho/quality.html

I think such a page is very important. Otherwise developers must
waste time hunting for test samples in web forums, where they appear
and disappear very quickly. Most of the test samples can no longer be
downloaded 2 or 3 days later.

This is a mess and annoying (at least for people like Frank Klemm).

The other problem is that when you download these samples,
you end up with a mess on your hard disks: thousands of files
with names like goldc.pac, Hex.pac, jo3.ape, dr4.pac, t1.pac, track07.pac,
lalaw.pac, 11.wav, lust.pac, hvmh3.shn, beo.pac.
You no longer know the source of these files, and you no longer know
which encoder problems they show.

I have 2.5 GB (4 full CDs) of files with such names. I downloaded
them at some time from somewhere for an unknown purpose, but after 6 or 12 or
24 months I don't know much about these files.

I can back up these 4 CDs and give them to CHJW, because I can't upload 2.5 GB
of data.

This job must be done by someone with useful ears (he/she must make
some checks to avoid too much spam in this list), broadband
internet access and a lot of web space, and he/she should be able to send
the test samples on request via snail mail (5..10 €).

Code:
Listening Test Samples Page
~~~~~~~~~~~~~~~~~~~~~~~~~~~

For Downloads you need a special HydrogenAudio account, which you can get for free.
There is a download limit of 5 MByte per day (except for active developers).
For further information see [download rights].

Samples   [1-50]  [51-100]  [101-150]  [151-186]

------------------------------------------------------------------------------------

This page:
 [1]   [2]   [3] ...
                          ...  [50]

------------------------------------------------------------------------------------

Test Sample #0001

Reported by: Anonymous
Reported on: Mar 20 2003

Source: CD
- Album: Der kleine Hobbit (1980)
   EAN/UPC: 9-783895-841675
   Katalog: 167-6
   LC: ?
   Copyright: 1980 WDR Köln /1986 Hörverlag Stuttgart
- Artist: -
- Media: CD 4
- Track: 2
- Offset: 0:00.00
- Title: Folge 8
- Sample rate: 44100 Hz, 2x16 bit
- Length: 5.4 sec

Contents
 This is a radio play. It is mastered with a lot of noise. There is a lot of background noise,
 which needs a lot of bits.

Typical errors for encoders:
- Lame: You hear coloration of the background noise and sibilant distortions in the s-o transition of the first word.
- Musepack: ...
- FAAC: ...
- Nero AAC: ...

[Download (FLAC file) 0001_Hobbit.flac]

------------------------------------------------------------------------------------

Test Sample #0002

Reported by: Anonymous
Reported on: Mar 22 2003

Source: CD

- Album: Midnite Vultures (1999)
    EAN/UPC: 6-06949-05272-0
    Katalog: 4905272
    LC: LC 07266;
    Copyright: 1999 Geffen Records Inc.
- Artist: Beck
- Media: CD 1
- Track: 4
- Title: Get Real Paid
- Sample rate: 44100 Hz, 2x16 bit
- Offset: 3:51.74 (near the end)
- Length: 8.2 sec

Contents
 Signal is extremely transient. The attacks are spaced about 16 ms apart.

Typical errors for encoders:
- Lame: ...
- Musepack: ...
- Nero AAC: ...

[Download (FLAC file) 0002_Beck.flac]

------------------------------------------------------------------------------------

...

------------------------------------------------------------------------------------

This page:
 [1]   [2]   [3] ...
                          ...  [50]

------------------------------------------------------------------------------------

Samples   [1-50]  [51-100]  [101-150]  [151-186]
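
One entry of the mock-up above could be held as a simple record. This is only a rough sketch in Python; the class name and all field names are invented for illustration and do not belong to any existing HA.org system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TestSample:
    """One entry of the listening test samples page (hypothetical field names)."""
    number: int
    reported_by: str
    reported_on: str              # free-form date string, as in the mock-up
    album: str
    artist: Optional[str]
    track: int
    offset: str                   # position inside the track, "m:ss.ff"
    length_sec: float
    contents: str
    # Encoder name -> description of the typical artifact
    encoder_notes: Dict[str, str] = field(default_factory=dict)

    def download_name(self, short_name: str) -> str:
        # Follows the "0001_Hobbit.flac" pattern from the mock-up.
        return f"{self.number:04d}_{short_name}.flac"


sample = TestSample(
    number=2, reported_by="Anonymous", reported_on="Mar 22 2003",
    album="Midnite Vultures (1999)", artist="Beck", track=4,
    offset="3:51.74", length_sec=8.2,
    contents="Signal is extremely transient.",
)
print(sample.download_name("Beck"))
```

A fixed record like this would also make it easy to render every entry with the same layout, whatever ends up hosting the page.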
Title: A HA.org Sample Database?
Post by: Jan S. on 2004-02-19 20:58:15
Quote
An important second task is maintaining of test samples.

What about using the HA wiki (http://doc.hydrogenaudio.org/wikis/hydrogenaudio/FrontPage) for a test sample page? And offering the test samples by bit torrent?
Title: A HA.org Sample Database?
Post by: menno on 2004-02-19 22:44:11
Quote
Quote
An important second task is maintaining of test samples.

What about using the HA wiki (http://doc.hydrogenaudio.org/wikis/hydrogenaudio/FrontPage) for a test sample page? And offering the test samples by bit torrent?

Hmm, I don't think bittorrent will work very well on files that are only downloaded now and then.

I can host with unlimited bandwidth on audiocoding.com if you want, but it's only 100 MB webspace, I suppose that will be filled quite quickly.

Menno
Title: A HA.org Sample Database?
Post by: ChristianHJW on 2004-02-19 22:48:46
IMO it's a good idea to start such a samples page, and not only for MPC. It could be valuable for the whole community. matroska has lately got support from a nice guy, his nick is atomic, and he has a server connected with a 100 Mbps line, and we have 15 GB of space on there. I will ask him whether he would consider doing this for MPC ....
Title: A HA.org Sample Database?
Post by: ViPER1313 on 2004-02-19 22:52:01
Bittorrent also sucks for all us college students who can only dl things at 1kbps because uploads are firewalled. Also, what about already existing sites such as http://www.ff123.net/samples.html ?
Title: A HA.org Sample Database?
Post by: Frank Klemm on 2004-02-19 22:55:15
Quote
Quote
Quote
An important second task is maintaining of test samples.

What about using the HA wiki (http://doc.hydrogenaudio.org/wikis/hydrogenaudio/FrontPage) for a test sample page? And offering the test samples by bit torrent?

Hmm, I don't think bittorrent will work very well on files that are only downloaded now and then.

I can host with unlimited bandwidth on audiocoding.com if you want, but it's only 100 MB webspace, I suppose that will be filled quite quickly.

Menno

I expect about 1 GByte of data and download volumes of around 20...30 GByte/month.
The base pages (HTML) can be on HA; the PCM files can be split across multiple hosts.

Something must be done to keep people who are not interested in testing from
downloading the files only "to have the files".

I would propose that users must register before downloading becomes possible.
Active developers have unlimited access; other people have a download limit per day.
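
The per-day limit could be tracked roughly like this. It is a minimal sketch in Python, assuming the 5 MByte figure from the mock-up page; the class and its names are hypothetical, and registration itself is not handled here.

```python
from collections import defaultdict
from datetime import date

# 5 MByte per day, taken from the mock-up page (an assumption, not a decided value)
DAILY_LIMIT_BYTES = 5 * 1000 * 1000


class DownloadQuota:
    """Tracks bytes downloaded per registered user per day (hypothetical helper)."""

    def __init__(self, developers=()):
        self.developers = set(developers)
        self.used = defaultdict(int)      # (user, day) -> bytes already downloaded

    def allow(self, user, size_bytes, day=None):
        """Return True and book the download if the user may fetch size_bytes."""
        day = day or date.today()
        if user in self.developers:
            return True                   # active developers are not limited
        key = (user, day)
        if self.used[key] + size_bytes > DAILY_LIMIT_BYTES:
            return False
        self.used[key] += size_bytes
        return True
```

A refused download does not count against the quota, so a user who is denied a large file can still fetch a smaller one on the same day.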

When I remove the binary stuff from www.uni-jena.de, I also have 90 MByte for audio files.
Title: A HA.org Sample Database?
Post by: Jan S. on 2004-02-19 22:55:49
The wiki can still be used for the page to make it easy for ppl to add comments and new samples.
But I agree that bit torrent is not a very good solution for many small files...
I have no idea if HA can host these by itself or external hosting would be needed.
Title: A HA.org Sample Database?
Post by: Frank Klemm on 2004-02-20 00:36:58
Quote
Bittorrent also sucks for all us college students who can only dl things at 1kbps because uploads are firewalled. Also, what about already existing sites such as http://www.ff123.net/samples.html ?

Three remarks:

The files should not have a well-known extension. A lot of proxies forbid access to files
with the extensions ".WAV" and ".MP3". I know at least one proxy which does that.
.AIFF, .PAC, .FLAC, .APE are allowed.

Some people can only access via port 21 (ftp) and 80 (http).

I also know some URLs where sample files can be found. But the problem is that only
10...20% of the useful sample files can be found at such stable long-term URLs.
Most samples are only temporarily available, or the URL can only be found by reading
all the web forums.
Title: A HA.org Sample Database?
Post by: Florian on 2004-02-20 09:08:47
I could host about 700 MB of the test-samples on anytag.de, but I've only ~ 20-30 GB traffic left per month.

~ Florian
Title: A HA.org Sample Database?
Post by: donovansmith on 2004-02-20 10:03:24
I think I can host 200MB or so of stuff, maybe a bit more. I have 10GB of total bandwidth for my hosting plan, and my sites have yet to exceed even 400MB of bandwidth although they are becoming busier so might need 1GB for them in the next few months. It'd be at donovansmith.us or lowcostaudio.org, you can choose the domain name, or I can get you a subdomain at donovansmith.us. If this can be of any use for either samples and/or web pages feel free to let me know. I appreciate your work on MPC very much, Mr. Klemm
Title: A HA.org Sample Database?
Post by: Frank Klemm on 2004-02-20 11:45:47
Quote
I think I can host 200MB or so of stuff, maybe a bit more. I have 10GB of total bandwidth for my hosting plan, and my sites have yet to exceed even 400MB of bandwidth although they are becoming busier so might need 1GB for them in the next few months. It'd be at donovansmith.us or lowcostaudio.org, you can choose the domain name, or I can get you a subdomain at donovansmith.us. If this can be of any use for either samples and/or web pages feel free to let me know. I appreciate your work on MPC very much, Mr. Klemm

A possible solution would be:

- Base pages are on HA
- Files are spread over multiple hosts
- Maintainer has a local copy of all the distributed files ...
- ... and tests the availability of the distributed files from time to time (every 2 weeks)
- files which become dead are placed on HA.org
- maybe the test can be done automatically
- files should be compressed losslessly
- md5 hash and file length of the files should be available, so you can quickly test the correctness of your file
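
The md5 and file length check from the last point could look roughly like this. It is a minimal sketch in Python using only the standard library; the function names are made up for illustration, and the expected values are assumed to come from the samples page.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 16):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sample(path, expected_md5, expected_length):
    """Check a downloaded sample against the published length and md5 hash."""
    # The length check is cheap, so do it first and skip hashing on a mismatch.
    if os.path.getsize(path) != expected_length:
        return False
    return file_md5(path) == expected_md5.lower()
```

The same two functions would serve the maintainer's periodic test: fetch each mirrored file, run `verify_sample` on it, and flag the mirrors that fail.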
Title: A HA.org Sample Database?
Post by: Jan S. on 2004-02-20 12:12:41
Quote
Quote
I think I can host 200MB or so of stuff, maybe a bit more. I have 10GB of total bandwidth for my hosting plan, and my sites have yet to exceed even 400MB of bandwidth although they are becoming busier so might need 1GB for them in the next few months. It'd be at donovansmith.us or lowcostaudio.org, you can choose the domain name, or I can get you a subdomain at donovansmith.us. If this can be of any use for either samples and/or web pages feel free to let me know. I appreciate your work on MPC very much, Mr. Klemm

A possible solution would be:

- Base pages are on HA
- Files are spread over multiple hosts
- Maintainer has a local copy of all the distributed files ...
- ... and tests the availability of the distributed files from time to time (every 2 weeks)
- files which become dead are placed on HA.org
- maybe the test can be done automatically
- files should be compressed losslessly
- md5 hash and file length of the files should be available, so you can quickly test the correctness of your file

I can take on the task of checking/maintaining the files.
I suggest we use FLAC and md5 (md5 can just be uploaded to the wiki) as Klemm suggested.
Frank Klemm: If you would make an example of a test file at the wiki that would be very nice:
http://doc.hydrogenaudio.org/wikis/hydrogenaudio/FrontPage <-- here under downloads I put a TestSample page you and everybody else can add to.
Title: A HA.org Sample Database?
Post by: Frank Klemm on 2004-02-20 13:39:26
Quote
Quote
Quote
I think I can host 200MB or so of stuff, maybe a bit more. I have 10GB of total bandwidth for my hosting plan, and my sites have yet to exceed even 400MB of bandwidth although they are becoming busier so might need 1GB for them in the next few months. It'd be at donovansmith.us or lowcostaudio.org, you can choose the domain name, or I can get you a subdomain at donovansmith.us. If this can be of any use for either samples and/or web pages feel free to let me know. I appreciate your work on MPC very much, Mr. Klemm

A possible solution would be:

- Base pages are on HA
- Files are spread over multiple hosts
- Maintainer has a local copy of all the distributed files ...
- ... and tests the availability of the distributed files from time to time (every 2 weeks)
- files which become dead are placed on HA.org
- maybe the test can be done automatically
- files should be compressed losslessly
- md5 hash and file length of the files should be available, so you can quickly test the correctness of your file

I can take on the task of checking/maintaining the files.
I suggest we use FLAC and md5 (md5 can just be uploaded to the wiki) as Klemm suggested.
Frank Klemm: If you would make an example of a test file at the wiki that would be very nice:
http://doc.hydrogenaudio.org/wikis/hydrogenaudio/FrontPage (http://doc.hydrogenaudio.org/wikis/hydrogenaudio/FrontPage) <-- here under downloads I put a TestSample page you and everybody else can add to.

Task #3 is the WinAMP 2/3/5 plugin:

- Does anybody know a valid e-mail address for Case? mobiili.net bounces.
- Is Case still interested in WinAMP plugin development?
- Who is interested in WinAMP plugin development and maintenance?

Case has the latest source of the WinAMP plugin; I gave the code away
more than 1 1/2 years ago.
Title: A HA.org Sample Database?
Post by: john33 on 2004-02-20 14:29:59
Case's homepage indicates: cse@sci.fi
Title: A HA.org Sample Database?
Post by: dev0 on 2004-02-20 14:48:54
I'm not sure if the wiki would be optimal for such a database. Maybe a specialized solution (anyone up for coding something in PHP or Perl?), which would allow adding/editing/deleting/searching samples/mirrors/comments using a fixed layout and interface.
Title: A HA.org Sample Database?
Post by: Seed on 2004-02-20 14:54:22
1. I think several hosts should share the load, across several countries, to make
it easier for those who have slower connections to certain continents or pay-free
access within their ISP/country. Some redundancy will be needed, meaning that
the samples will have to be hosted by at least 2 web sites, to avoid the
problem of one host going down and lack of availability.

2. I can offer 3 GB of space with unlimited bandwidth on a very fast European site,
but files will have to be uploaded there through me, and not directly by those who
submit the samples. I don't think I have the time to do this, if many samples are
submitted. I think atomic should be contacted, as Chris suggested. A site or
several sites with access to certain chosen devs/volunteers is indeed the
appropriate way to do it.

3. BitTorrent can still be used as a back-up system for those who are not limited to
ports 80/21 or are not behind a firewall. The entire collection of samples can be
divided into 15-20 chunks based on either artist/track name or based on the type
of problem (transients in one package, for example) and those 15-20 packages
can be torrented through several dedicated members of the community who can
keep their BitTorrent client open. This should take quite a lot of the load off the
main hosts, and it would allow us to have more sources for the collection, if the
dedicated web sites are down, for whatever reason.

4. AIFF, for example, should not be accepted. Only lossless formats that can be
decoded on all OSs. Shorten = yes. FLAC = yes. AIFF = no (a waste of space).
Title: A HA.org Sample Database?
Post by: music_man_mpc on 2004-02-20 15:12:14
Quote
4. AIFF, for example, should not be accepted. Only lossless formats that can be
decoded on all OSs. Shorten = yes. FLAC = yes. AIFF = no (a waste of space).

I think to keep it simple they should all be in one format.  I would suggest FLAC as it seems to be what is most commonly used here at HA.org.
Title: A HA.org Sample Database?
Post by: dev0 on 2004-02-20 15:13:29
I agree with all the points Seed mentioned and would like to add further considerations:

1. There should be a fixed filenaming scheme. Right now most people, who regularly post samples, use their own schemes or rather meaningless names like track1.flac. I'd suggest artistwithoutspaces-songnospaces.sample20sec.flac. No spaces are used for compatibility and ease of use reasons.

2. Contributors should be able to add/remove their own mirrors, so even people without lots of webspace (many ISPs include 5-10MB for a personal homepage) could help spread the load.

3. Metadata shouldn't be a requirement, but a recommendation. TrackGain could be useful too, since it gives an indication about the nature of the signal.
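
The naming scheme from point 1 could be generated automatically. A small sketch in Python, assuming that "no spaces" is taken to mean keeping only ASCII letters and digits; the function name is invented for illustration.

```python
import re


def sample_filename(artist, title, seconds):
    """Build a name like artistwithoutspaces-songnospaces.sample20sec.flac."""
    def squash(text):
        # Drop everything except ASCII letters and digits (an assumption;
        # names with umlauts or other non-ASCII letters would lose characters).
        return re.sub(r"[^A-Za-z0-9]", "", text)

    return f"{squash(artist)}-{squash(title)}.sample{int(seconds)}sec.flac"


print(sample_filename("Beck", "Get Real Paid", 20))
# prints "Beck-GetRealPaid.sample20sec.flac"
```

For old samples of unknown origin the artist and title would have to be placeholders, so the scheme can only be a recommendation for those.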

dev0
Title: A HA.org Sample Database?
Post by: smok3 on 2004-02-20 16:02:57
i can provide small hosting - about 20 megs, and about 1 Gig monthly limit for the samples, if that is of any use...

edit: iam not aware of any good way to limit the bandwidth btw. (without root rights that is)
Title: A HA.org Sample Database?
Post by: rjamorim on 2004-02-20 16:15:53
Quote
I'm not sure if the wiki would be optimal for such a database. Maybe a specialized solution (anyone up for coding something in PHP or Perl?), which would allow adding/editing/deleting/searching samples/mirrors/comments using a fixed layout and interface.

Another problem with using the wiki, IMO, is that it would probably be hard to hack a download limiter into it (Klemm's idea of allowing registered users downloading 5 samples per day, Developers download as much as they want, non-registered users download nothing...)

Quote
4. AIFF, for example, should not be accepted. Only lossless formats that can be decoded on all OSs. Shorten = yes. FLAC = yes. AIFF = no (a waste of space).


Erm... AIFF is not even lossless, AIFF is Apple's audio container pretty much as WAV is Microsoft's audio container. You can have inside PCM, ADPCM, MP3...

And AIFF can be "decoded" on all mainstream OSs. On Windows: Winamp, nearly every audio editor, QuickTime, (foobar?)... on Mac: QuickTime, iTunes... and on Linux, anything using libsndfile. Converting it to Wav is a snap as well.
Title: A HA.org Sample Database?
Post by: dev0 on 2004-02-20 16:28:29
Still FLAC should be used, since we should all be able to agree on the fact that lossless compression is better than no compression.
Title: A HA.org Sample Database?
Post by: Seed on 2004-02-20 16:34:16
I meant that all collected samples must be compressed, and this is why AIFF should
not be accepted. I've been using this file format for 13 years, so I know very well
what it is, kthx.

dev0's suggestion is perfect. FLAC is the one file format everyone can deal with and
it offers decent compression ratios.
Title: A HA.org Sample Database?
Post by: Jan S. on 2004-02-20 16:38:25
Quote
I'm not sure if the wiki would be optimal for such a database. Maybe a specialized solution (anyone up for coding something in PHP or Perl?), which would allow adding/editing/deleting/searching samples/mirrors/comments using a fixed layout and interface.

I'm trying to make as much use for the wiki as possible. With a good template/example I think the wiki would work out fine.
IMO a new site would be overshooting the mark a bit. I don't think it is gonna be that big and used that much. So that much hosting is probably not needed either IMO.

Quote
1. I think several hosts should share the load, across several countries, to make
it easier for those who have slower connections to certain continents or pay-free
access within their ISP/country. Some redundancy will be needed, meaning that
the samples will have to be hosted by at least 2 web sites, to avoid the
problem of one host going down and lack of availability.

Wouldn't it be enough if one person downloads all files and checks the links every 2 weeks or something?

Quote
1. There should be a fixed filenaming scheme. Right now most people, who regularly post samples, use their own schemes or rather meaningless names like track1.flac. I'd suggest artistwithoutspaces-songnospaces.sample20sec.flac. No spaces are used for compatibility and ease of use reasons.

I think this is problematic since a lot of the problem samples are old and from unknown source: fatboy, castanets etc.
We should encourage a useful filenaming scheme and tagging though.
A description of the problem can be saved in a tag also.


Quote
Another problem with using the wiki, IMO, is that it would probably be hard to hack a download limiter into it (Klemm's idea of allowing registered users downloading 5 samples per day, Developers download as much as they want, non-registered users download nothing...)


Do you think BW would actually be a problem? If the files are spread around several hosts I find it hard to believe that the load would be very high.
--------

I suggest we do the following:

What do you think of this?
Title: A HA.org Sample Database?
Post by: rjamorim on 2004-02-20 16:43:22
Quote
Do you think BW would actually be a problem? If the files are spread around several hosts I find it hard to believe that the load would be very high.

I expect there will be bandwidth spikes whenever a developer calls for testing. (considering people will actually respond to the call this time). That could kill several of the small hosts.
Title: A HA.org Sample Database?
Post by: Lefungus on 2004-02-20 16:43:40
Why not keep it simple ?
Everyone could access the pages with information for each sample.
Only developers could download directly samples.
Everyone else could download with torrents a package that is updated each week, or each month.

For those behind proxies, if they're not developers, maybe some will host as mirrors the full package
Title: A HA.org Sample Database?
Post by: Jan S. on 2004-02-20 16:49:44
Quote
Quote
Do you think BW would actually be a problem? If the files are spread around several hosts I find it hard to believe that the load would be very high.

I expect there will be bandwidth spikes whenever a developer calls for testing. (considering people will actually respond to the call this time). That could kill several of the small hosts.

Do ppl normally delete their samples after a round of testing?
I think that ppl that normally test encoders would already have the samples, or at least a lot of the usual samples. Should they not, I don't think it will take too long before they have most of the samples. And if they want them all they can use torrent IMO.
I thought the biggest load would be when new problem samples are found.

But this is just guessing so I don't know who is right.

Quote
Why not keep it simple ?
Everyone could access the pages with information for each sample.
Only developers could download directly samples.
Everyone else could download with torrents a package that is updated each week, or each month.

For those behind proxies, if they're not developers, maybe some will host as mirrors the full package

Yes but do you think it is worth the effort to set up a system like that?
How often do you go hunting for test samples?
I simply don't understand if the load is gonna be that huge.
Title: A HA.org Sample Database?
Post by: danchr on 2004-02-20 17:32:59
Quote
Task #3 is the WinAMP 2/3/5 plugin

Might I suggest that you consider having the necessary decoder portions extracted into a library? This way, you could have people less skilled with C and audio coding take care of the less complicated parts of writing a plug-in.
Title: A HA.org Sample Database?
Post by: g0a on 2004-02-20 18:35:20
I could do any php / database coding and host the site..
Title: A HA.org Sample Database?
Post by: Florian on 2004-02-20 19:17:26
I've just set up a basic listening test samples page (http://www.musepack.net/samples/).

~ Florian

Edit: changed url to point to the musepack site
Title: A HA.org Sample Database?
Post by: music_man_mpc on 2004-02-20 21:24:26
Quote
I've just set up a basic listening test samples page (http://www.anytag.de/samples/).

~ Florian

That is beautifully set up!  Seems to be exactly what was suggested thus far.
Title: A HA.org Sample Database?
Post by: Jan S. on 2004-02-21 18:57:19
ok. Since I really want to use the HA wiki for something useful I stole most of the idea from Ganymed and made a wiki page:

http://doc.hydrogenaudio.org/wikis/hydrogenaudio/TestSamples

What do you think?
Title: A HA.org Sample Database?
Post by: CiTay on 2004-02-21 19:04:03
Quote
ok. Since I really want to use the HA wiki for something useful I stole most of the idea from Ganymed and made a wiki page:

http://doc.hydrogenaudio.org/wikis/hydrogenaudio/TestSamples (http://doc.hydrogenaudio.org/wikis/hydrogenaudio/TestSamples)

What do you think?

Not bad. But I think it should look more structured, maybe with a table or so.
Title: A HA.org Sample Database?
Post by: Volcano on 2004-02-21 19:08:14
Quote
I've just set up a basic listening test samples page (http://www.anytag.de/samples/).

Very nice!

--

I can also offer to host some of the samples if need be. I'm currently using about 2 of the 50 GB of bandwidth I have available every month, and there's ample space on the server to host a fair amount of samples (I have 500 MB available, I suppose I could provide about 150 MB which already makes for about 50 samples).

Regards,

Dominic
Title: A HA.org Sample Database?
Post by: music_man_mpc on 2004-02-21 21:23:51
Hmmm, I do have an unusual internet connection: DSL with 1 Mbit/s upstream and unlimited upstream bandwidth (I am hardly using upstream at all right now). I could host all of the samples on an FTP server (3 or 4 GB of HD space is insignificant as far as I am concerned), if no one minds the potential instability of a Windows-based system. Alternatively, if someone could give me some tips on setting up a Linux box, I have a spare Pentium 200 lying around (if I can get it working).
Title: A HA.org Sample Database?
Post by: tigre on 2004-02-21 23:01:28
Quote
ok. Since I really want to use the HA wiki for something useful I stole most of the idea from Ganymed and made a wiki page:

http://doc.hydrogenaudio.org/wikis/hydrogenaudio/TestSamples (http://doc.hydrogenaudio.org/wikis/hydrogenaudio/TestSamples)

What do you think?

Looks good (Ganymed's suggestion too of course).

I've got an additional suggestion:
For me - and I assume for many other people who don't have English as their 1st language - it's sometimes hard to express artifacts I hear in (English) words, as well as it can be hard to understand others' descriptions.
Because of this a "Definitions" part of the database would be good, where words like "pre-echo", "smearing", "ringing" are explained and linked to obvious examples.
Title: A HA.org Sample Database?
Post by: Jan S. on 2004-02-21 23:09:25
Quote
Quote
ok. Since I really want to use the HA wiki for something useful I stole most of the idea from Ganymed and made a wiki page:

http://doc.hydrogenaudio.org/wikis/hydrogenaudio/TestSamples (http://doc.hydrogenaudio.org/wikis/hydrogenaudio/TestSamples)

What do you think?

Looks good (Ganymed's suggestion too of course).

I've got an additional suggestion:
For me - and I assume for many other people who don't have English as their 1st language - it's sometimes hard to express artifacts I hear in (English) words, as well as it can be hard to understand others' descriptions.
Because of this a "Definitions" part of the database would be good, where words like "pre-echo", "smearing", "ringing" are explained and linked to obvious examples.

These definitions should be found at the GlossaryPage (http://doc.hydrogenaudio.org/wikis/hydrogenaudio/GlossaryPage) where some already are (they seem to be very technical explanations ATM though).
Title: A HA.org Sample Database?
Post by: Frank Klemm on 2004-02-22 03:24:21
Quote
I meant that all collected samples must be compressed, and this is why AIFF should
not be accepted. I've been using this file format for 13 years, so I know very well
what it is, kthx.

dev0's suggestion is perfect. FLAC is the one file format everyone can deal with and
it offers decent compression ratios.

AIFF = Audio Interchange File Format
Typical extensions: AIF, AIFF, (AIFC)
Big Endian
Audio container format of Apple Macintosh computers.

RIFF = Resource Interchange File Format
Typical extensions: WAV, AVI, RIFF, RIF
Little Endian
Container format of Microsoft PCs

In both formats you can store everything from G.723 to uncompressed 64 channel, 24 bit, 192 kHz audio.
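
The difference between the two containers shows up in the first 12 bytes of a file. This is a small sketch in Python, not a full parser; the function name and the tuple it returns are chosen only for illustration.

```python
import struct


def identify_container(header: bytes):
    """Return (container, form_type, chunk_size) from the first 12 bytes of a file."""
    if len(header) < 12:
        return ("unknown", "", 0)
    chunk_id, form_type = header[0:4], header[8:12]
    if chunk_id == b"RIFF":
        # RIFF stores its sizes little endian.
        (size,) = struct.unpack("<I", header[4:8])
        return ("RIFF", form_type.decode("ascii", "replace"), size)
    if chunk_id == b"FORM" and form_type in (b"AIFF", b"AIFC"):
        # AIFF/AIFC store their sizes big endian.
        (size,) = struct.unpack(">I", header[4:8])
        return ("AIFF", form_type.decode("ascii"), size)
    return ("unknown", "", 0)


wav_header = b"RIFF" + struct.pack("<I", 36) + b"WAVE"
print(identify_container(wav_header))
```

Neither header says whether the audio inside is PCM or something compressed; that is stored in a later chunk ("fmt " in WAV, "COMM" in AIFF/AIFC).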
Title: A HA.org Sample Database?
Post by: westgroveg on 2004-02-22 05:09:50
I think it would be useful to categorize problem samples by format/encoder, so if one was looking for MPC, 1.14b problem samples they could go directly to the MPC section.
Title: A HA.org Sample Database?
Post by: dev0 on 2004-02-22 09:07:07
Quote
I've got an additional suggestion:
For me - and I assume for many other people who don't have English as their 1st language - it's sometimes hard to express artifacts I hear in (English) words, as well as it can be hard to understand others' descriptions.
Because of this a "Definitions" part of the database would be good, where words like "pre-echo", "smearing", "ringing" are explained and linked to obvious examples.

ff123's training page (http://ff123.net/training/training.html) has always been a great help for me.
Title: A HA.org Sample Database?
Post by: dev0 on 2004-02-23 08:45:21
Are there any more opinions regarding Wiki-based vs. Web-based?

I really like Ganymed's draft and it comes reasonably close to what I imagined, but before work on it can continue we have to decide on one solution.
Title: A HA.org Sample Database?
Post by: ChristianHJW on 2004-02-23 09:22:19
I checked with Atomic, and he is OK with using his server for these samples, as long as the URL is not made public (he fears legal problems over content distribution).

This gives us a fast server with plenty of space, and from one single location. Jan S. if you want to maintain this sample page, please contact me via PM or on IRC ( hard to get hold of me lately, sorry about that ).....
Title: A HA.org Sample Database?
Post by: Jan S. on 2004-02-23 13:21:22
Quote
Jan S. if you want to maintain this sample page, please contact me via PM or on IRC ( hard to get hold of me lately, sorry about that ).....

No I am not interested in anything not within the framework of HA.
I want to use the wiki as much as possible but everybody else seems to dislike it.
Title: A HA.org Sample Database?
Post by: SacRat on 2004-02-24 06:39:43
A centralized sample archive is definitely a good idea.
If it's created I could submit a couple of samples I used for myself.
So is there anyone, willing to maintain such a thing?
Title: A HA.org Sample Database?
Post by: ChristianHJW on 2004-02-25 00:47:12
Quote
A centralized sample archive is definitely a good idea.
If it's created I could submit a couple of samples I used for myself.
So is there anyone, willing to maintain such a thing?

Well, we are currently searching for a volunteer for this. Jan S. seems to be only available if MPC becomes a Hydrogenaudio Project, which will not be the case. I personally believe HA.org should be a neutral place, and not show any tendencies or support for a specific format, but oh well ......
Title: A HA.org Sample Database?
Post by: joey_m on 2004-02-25 00:59:06
I think Jan S. was only referring to the samples database, not MPC...

I've read the terms and conditions for my free hosting plan at 1and1, and as far as I can see, there would be no problem hosting some 200 MBytes of files (of course, I'm restricted by the 5 GB/month traffic limit), so if you guys are still in search for a few mirrors to distribute the load a bit, I'd be glad to help out.


Cheers, Joey.
Title: A HA.org Sample Database?
Post by: SacRat on 2004-02-26 05:43:12
Well, why not try this idea:
create a site with a basic sample set, define submission and download rules (descriptions, etc...) and let it live by itself. Kinda self-organizing.
Don't know if it would work, but it's way better than nothing...
btw, why MD5? I think that ZIPing FLACs would work as well: you won't be able to extract corrupted files.
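
A minimal sketch of the two checks being compared, using only Python's standard library. The helper names and the sample bytes are made up for illustration. The ZIP check relies on the CRC-32 stored in the archive and tells you the download is not damaged; a published MD5 additionally lets you confirm that your file is the same one listed in the database.

```python
import hashlib
import io
import zipfile


def md5_of(data: bytes) -> str:
    """Hex MD5 digest, to be compared against the value published with the sample."""
    return hashlib.md5(data).hexdigest()


def make_zip(name: str, payload: bytes) -> bytes:
    """Build an in-memory ZIP (stored, not compressed) holding one file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


def zip_is_intact(archive: bytes) -> bool:
    """True if every member passes the CRC-32 check stored in the archive."""
    try:
        with zipfile.ZipFile(io.BytesIO(archive)) as zf:
            return zf.testzip() is None
    except zipfile.BadZipFile:
        return False


def flip_byte(data: bytes, index: int) -> bytes:
    """Simulate transfer damage by inverting one byte."""
    damaged = bytearray(data)
    damaged[index] ^= 0xFF
    return bytes(damaged)


if __name__ == "__main__":
    payload = b"hypothetical sample bytes " * 4  # stands in for a FLAC file
    archive = make_zip("sample.flac", payload)
    damaged = flip_byte(archive, archive.find(payload) + 5)
    print("published MD5:", md5_of(payload))
    print("intact archive passes:", zip_is_intact(archive))
    print("damaged archive passes:", zip_is_intact(damaged))
```

Both approaches catch a damaged download; only the published checksum also catches a different file uploaded under the same name.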
Title: A HA.org Sample Database?
Post by: mpcfiend on 2004-02-26 06:43:15
Simpler solution: link .torrents in the database, and have http/ftp downloads on a more obscure page. Only those people who are serious about contributing will make the effort to look up the samples and download them from another page.
Title: A HA.org Sample Database?
Post by: robUx4 on 2004-02-26 14:22:06
Quote
Well, we are currently searching for a volunteer for this. Jan S. seems to be only available if MPC becomes a Hydrogenaudio Project, which will not be the case. I personally believe HA.org should be a neutral place, and not show any tendencies or support for a specific format, but oh well ......

This sample database is not MPC related, it concerns all audio codec development.

I'm surprised no one mentioned the copyright problem! Most of these samples are not copyright-free, so you're not allowed to redistribute them without the approval of someone (who may even ask for money).

I think we really should take this into account before making the list of samples available to the general public.

Also, I suggest that there should be a small fee to access the downloads, something that would probably cover the copyright costs. And maybe a license stating that you "agree to download the samples for personal testing purposes and not make them available to anyone else".

For the rest, we have the space. A database plus a web interface to handle this should be quite easy. I can do it in PHP (any design proposal is welcome) in a short time, whether it is hosted on HA or CoreCodec (PHP that can send emails would be good, to inform subscribers when new samples are available). I have no knowledge of Wiki, nor do I want to learn it.

Also for the other tasks, I plan to spend time on the XMMS and WinAmp plugins starting from Sunday.
Title: A HA.org Sample Database?
Post by: Jan S. on 2004-02-26 15:00:56
We already post samples here and as long as we only upload <30sec we don't think it is a problem. At least that is the policy of this forum.

Secondly, would it be almost impossible to get the right to publish them?
I don't think you can compare it to playing the music at a public place but I don't know.
Title: A HA.org Sample Database?
Post by: rjamorim on 2004-02-26 15:48:56
Quote
We already post samples here and as long as we only upload <30sec we don't think it is a problem. At least that is the policy of this forum.

That's OK, but even a 5-second sample contains music created by someone, therefore that sample is copyrighted.

I don't know the legality of distributing samples. When I do my tests, I send out the packages and hope no one will bitch. But I have never read in any reliable place that the RIAA waives copyright on samples shorter than 30 seconds for research or promotional purposes (e.g., Amazon).
Title: A HA.org Sample Database?
Post by: robUx4 on 2004-02-26 15:58:59
I think a sort of license stating that it's educational only or for pure research would cover most normal suit. Of course there will always be the possibility that an asshole would not be happy with it.
Title: A HA.org Sample Database?
Post by: guruboolez on 2004-02-26 16:03:05
Quote
I think a sort of license stating that it's educational only or for pure research would cover most normal suit. Of course there will always be the possibility that an asshole would not be happy with it.

Especially if the database is growing too fast... too big. Gigs of samples "for educational purposes" don't look very serious. Or at least, it's suspicious.
Title: A HA.org Sample Database?
Post by: Florian on 2004-03-03 11:37:52
After there are no other ideas/attempts regarding a samples page in the HA wiki, I've extended my first attempt a little bit. It features a

  • registration system
  • downloading and submitting of samples for registered members
  • editing of own samples
  • a basic samples search function

The page uses the following fields in the database: Filename, URL, Reported By, Reported On, Metadata, Technical info, Description and MD5. It would be very nice if some of the developers (especially Frank, who came up with the idea) could give some feedback on the ideas discussed here.
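
For discussion, the field list could be modelled roughly as follows. This is only an illustrative sketch: the field types, the `SampleRecord` and `search` names, and the choice to match on filename and description are assumptions, not how the actual page is implemented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class SampleRecord:
    """One entry in the samples database; types are assumed, field names follow the list above."""
    filename: str
    url: str
    reported_by: str
    reported_on: date
    metadata: str        # e.g. artist/title of the source recording
    technical_info: str  # e.g. sample rate, length
    description: str     # what encoder problem the sample shows
    md5: str


def search(records: List[SampleRecord], term: str) -> List[SampleRecord]:
    """Basic case-insensitive search over filename and description."""
    needle = term.lower()
    return [
        r for r in records
        if needle in r.filename.lower() or needle in r.description.lower()
    ]
```

Keeping the description as its own field addresses the original complaint: a file name alone does not tell you which encoder problem a sample demonstrates.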

Please have a look at the Listening Test Samples Page (http://www.musepack.net/samples/).

Best regards,
~ Florian

--
Edited link to point to new musepack.net samples database
Title: A HA.org Sample Database?
Post by: ErikS on 2004-03-03 12:38:00
Quote
After there are no other ideas/attempts regarding a samples page in the HA wiki, I've extended my first attempt a little bit. It features a

  • registration system
  • downloading and submitting of samples for registered members
  • editing of own samples
  • a basic samples search function

Nice.

One suggestion: Try to structure the info more. Use a table for tabular data like this.
Title: A HA.org Sample Database?
Post by: skynetman on 2004-03-03 13:40:58
Boys, I don't see the problem.
If you fear legal action, keep only the sample list on the site.
Collect all the samples and release a .rar with an incrementing version number on the eMule/eDonkey network.
It will be shared in no time, and it forces everybody to have all the samples, making the test work easier.
Title: A HA.org Sample Database?
Post by: westgroveg on 2004-03-08 11:01:54
I created a small MPC sample database (http://www.geocities.com/westgroveg/index.html) page which I was planning on maintaining myself, but Ganymed seems to have set up a very nice page, making mine obsolete. As long as people bother uploading and it doesn't suddenly disappear, I think it will be very useful.