Reading Flac tags using C#

Hi all, this is my first post, so I don't even know if this is the correct forum for this. I would like to read all my FLAC files (10,000+ of them) and extract the tag information into a database. The database side of things is not a problem; understanding the FLAC documentation for the file format is :-( Does anyone know how to do this, preferably in C#? I can use C/C++ if necessary. I can't find much by Googling, so I thought I'd try here. TIA
We can't stop here this is bat country - Hunter S Thompson RIP.

Reading Flac tags using C#

Reply #1
Writing your own program is much too complicated IMO. Why not use metaflac? Depending on what operating system you are on, this is very easy. So, what format do you want exactly (CSV?), and how do you want to handle tags, etc.?
Music: sounds arranged such that they construct feelings.




Reading Flac tags using C#

Reply #5
Writing your own program is much too complicated IMO. Why not use metaflac? Depending on what operating system you are on, this is very easy. So, what format do you want exactly (CSV?), and how do you want to handle tags, etc.?


Hi there, and thanks for your reply. I'm a software man by profession, so programming is not a problem, but I can't seem to find a definitive explanation of the structure of a FLAC file. The documentation states that the first 4 bytes of the header are supposed to read "fLaC" and then goes on to define all the "possible" other information that might or might not be there afterwards. All I need to know is where the "tags" begin and end, but it doesn't seem to be written down anywhere.
We can't stop here this is bat country - Hunter S Thompson RIP.

Reading Flac tags using C#

Reply #6
Never tried it in C#, but someone once ported a FLAC encoder to C# directly (see these forums), so perhaps it included tagging. Alternatively, you can use the official tools from C# via P/Invoke:

http://stoyanov.in/2010/01/08/encoding-unc...with-flac-in-c/

Hi, and thanks, but I need this to be hand-rolled. I am completely willing to use tools, but I absolutely "need" to understand the file structure to achieve what I'm thinking of doing.
We can't stop here this is bat country - Hunter S Thompson RIP.


 

Reading Flac tags using C#

Reply #8
Never tried it in C#, but someone once ported a FLAC encoder to C# directly (see these forums), so perhaps it included tagging. Alternatively, you can use the official tools from C# via P/Invoke:

http://stoyanov.in/2010/01/08/encoding-unc...with-flac-in-c/

Hi and thanks but I need this to be hand rolled


This is about the dumbest thing you can ever do with a media format, but the specs are on the flac website, so feel free to hang yourself

Reading Flac tags using C#

Reply #9
Never tried it in C#, but someone once ported a FLAC encoder to C# directly (see these forums), so perhaps it included tagging. Alternatively, you can use the official tools from C# via P/Invoke:

http://stoyanov.in/2010/01/08/encoding-unc...with-flac-in-c/

Hi and thanks but I need this to be hand rolled


This is about the dumbest thing you can ever do with a media format, but the specs are on the flac website, so feel free to hang yourself


What is the dumbest thing ?
We can't stop here this is bat country - Hunter S Thompson RIP.


Reading Flac tags using C#

Reply #11
but it doesn't seem to be written down anywhere

https://xiph.org/flac/format.html#format_overview
https://xiph.org/flac/format.html#metadata_..._vorbis_comment


Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.
We can't stop here this is bat country - Hunter S Thompson RIP.


Reading Flac tags using C#

Reply #13
Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.

Uhhh... yes, they do, you probably aren't looking closely enough.

The streamheader starts with fLaC, then various metadata blocks follow. You only need the VORBIS_COMMENT block, which is blocktype 4, so you can skip the other blocks by looking in the metadata block header for their size. So, pseudocode would be

Code:
1. skip over the stream marker ("fLaC", 4 bytes)
2. loop
   2a. read the 4-byte metadata block header (last-block flag, block type, block length)
   2b. check whether this is a vorbis_comment block (type 4); if not, skip over the number of bytes given by the block length, then stop if the last-block flag was set, otherwise restart the loop
   2c. if it is, read the vorbis_comment data into your database
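
For anyone who prefers real code to pseudocode, here is a minimal C# sketch of that loop. The names `FlacMetadata` and `FindVorbisComment` are made up for illustration. It assumes a seekable stream (such as a `FileStream`) that really does begin with "fLaC", and it only returns the raw bytes of the block; it does not interpret them.

```csharp
using System;
using System.IO;

static class FlacMetadata
{
    const int VorbisCommentType = 4;

    // Returns the raw VORBIS_COMMENT payload, or null if the file has none.
    // The reader is deliberately not disposed so the caller keeps ownership of the stream.
    public static byte[] FindVorbisComment(Stream stream)
    {
        var reader = new BinaryReader(stream);

        // 1. Stream marker: the four ASCII bytes "fLaC".
        byte[] marker = reader.ReadBytes(4);
        if (marker.Length != 4 || marker[0] != 'f' || marker[1] != 'L' ||
            marker[2] != 'a' || marker[3] != 'C')
            throw new InvalidDataException("Not a FLAC stream.");

        // 2. Walk the metadata blocks.
        while (true)
        {
            byte[] header = reader.ReadBytes(4);
            if (header.Length != 4)
                throw new EndOfStreamException("Truncated metadata block header.");

            bool isLast = (header[0] & 0x80) != 0;   // 1-bit last-block flag
            int blockType = header[0] & 0x7F;        // 7-bit block type
            int length = (header[1] << 16) | (header[2] << 8) | header[3]; // 24-bit big-endian

            if (blockType == VorbisCommentType)
            {
                byte[] payload = reader.ReadBytes(length);
                if (payload.Length != length)
                    throw new EndOfStreamException("Truncated VORBIS_COMMENT block.");
                return payload;
            }

            if (isLast)
                return null; // audio frames follow; no tags in this file

            // Not the block we want: jump over its data to the next header.
            stream.Seek(length, SeekOrigin.Current);
        }
    }
}
```

Calling `FlacMetadata.FindVorbisComment(File.OpenRead(path))` on each file would give you the block to feed into whatever fills the database.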


But still, as some here pointed out, making use of a library might be more feature complete and future proof.
Music: sounds arranged such that they construct feelings.

Reading Flac tags using C#

Reply #14
That spec might be difficult to read unless you know how to read formal grammars.
The syntax is defined in a top-down manner.

For example, the table named "STREAM" contains 4 rows ("<32>", "METADATA_BLOCK", "METADATA_BLOCK*", "FRAME+").

This means that a STREAM (= the whole FLAC file) consists of a 32-bit magic number (fLaC), followed by a METADATA_BLOCK, followed by 0 or more METADATA_BLOCKs, followed by 1 or more FRAMEs.
You can read the other tables in the same way.
Each table defines a non-terminal symbol like "STREAM" as a sequence of other symbols like METADATA_BLOCK, in the described order.

* and + are often used to describe repetition (* means 0 or more, + means 1 or more). You should be familiar with this if you know regular expressions or something similar.
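
To make the first row of the STREAM table concrete, here is a small C# sketch that consumes the "<32>" element, i.e. the four marker bytes. `FlacMagic` and `ReadMarker` are hypothetical names chosen for this example; everything after the marker (the METADATA_BLOCK rows) is left to the caller.

```csharp
using System;
using System.IO;

static class FlacMagic
{
    // Row 1 of STREAM: a 32-bit marker, the ASCII bytes 'f' 'L' 'a' 'C'.
    static readonly byte[] Marker = { 0x66, 0x4C, 0x61, 0x43 };

    // Returns true if the stream starts with the marker. On success the stream
    // is positioned at the first METADATA_BLOCK (row 2 of the table).
    public static bool ReadMarker(Stream stream)
    {
        var buffer = new byte[4];
        int read = 0;
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0)
                return false; // fewer than 4 bytes available
            read += n;
        }

        for (int i = 0; i < Marker.Length; i++)
        {
            if (buffer[i] != Marker[i])
                return false;
        }
        return true;
    }
}
```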

Reading Flac tags using C#

Reply #15
but it doesn't seem to be written down anywhere

https://xiph.org/flac/format.html#format_overview
https://xiph.org/flac/format.html#metadata_..._vorbis_comment


Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.

Then you didn't read it carefully enough.  That page defines exactly what every single bit of data in the headers of a FLAC file is and does.  The important ones for you are the metadata blocks, but if you want to parse the file yourself then you'll need to understand the context they sit in, that is, the format of the rest of the file.

Still, there is no need for you to parse this entirely by hand.  There is a perfectly good C FLAC library which will read out metadata for you at the tag level, if existing tools are not sufficient.



Reading Flac tags using C#

Reply #18
Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.

Uhhh... yes, they do, you probably aren't looking closely enough.

The streamheader starts with fLaC, then various metadata blocks follow. You only need the VORBIS_COMMENT block, which is blocktype 4, so you can skip the other blocks by looking in the metadata block header for their size. So, pseudocode would be

Code:
1. skip over the stream marker ("fLaC", 4 bytes)
2. loop
   2a. read the 4-byte metadata block header (last-block flag, block type, block length)
   2b. check whether this is a vorbis_comment block (type 4); if not, skip over the number of bytes given by the block length, then stop if the last-block flag was set, otherwise restart the loop
   2c. if it is, read the vorbis_comment data into your database


But still, as some here pointed out, making use of a library might be more feature complete and future proof.


Hi, I've ended up using taglib_sharp, which seems very good. When I have more time I'll dig into the code and see how they do it. I know the stream header starts with "fLaC" but where is the block type info and block size ?
We can't stop here this is bat country - Hunter S Thompson RIP.

Reading Flac tags using C#

Reply #19
I know the stream header starts with "fLaC" but where is the block type info and block size ?


In METADATA_BLOCK_HEADER (and METADATA_BLOCK_HEADER is the first 32 bits of METADATA_BLOCK).
So you read the first 32 bits from the file (they are equal to "fLaC"), then the next 32 bits are a METADATA_BLOCK_HEADER.

Also note that "All numbers are big-endian coded. All numbers are unsigned unless otherwise specified."
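
The big-endian point matters in C# because `BinaryReader.ReadUInt32` reads little-endian values, so the bytes have to be combined by hand (or swapped). A minimal sketch, with `BigEndian` as a made-up helper name:

```csharp
using System;

static class BigEndian
{
    // Combines three bytes, most significant first, e.g. the 24-bit
    // block length inside a METADATA_BLOCK_HEADER.
    public static int ReadUInt24(byte[] buffer, int offset)
    {
        return (buffer[offset] << 16) | (buffer[offset + 1] << 8) | buffer[offset + 2];
    }

    // Combines four bytes, most significant first.
    public static uint ReadUInt32(byte[] buffer, int offset)
    {
        return ((uint)buffer[offset] << 24)
             | ((uint)buffer[offset + 1] << 16)
             | ((uint)buffer[offset + 2] << 8)
             | buffer[offset + 3];
    }
}
```

For the bytes 00 00 00 22 (hex), `BigEndian.ReadUInt32` gives 34, whereas a little-endian read of the same four bytes gives 570425344.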

Reading Flac tags using C#

Reply #20
I know the stream header starts with "fLaC" but where is the block type info and block size ?


In METADATA_BLOCK_HEADER (and METADATA_BLOCK_HEADER is the first 32 bits of METADATA_BLOCK).
So you read the first 32 bits from the file (they are equal to "fLaC"), then the next 32 bits are a METADATA_BLOCK_HEADER.

Also note that "All numbers are big-endian coded. All numbers are unsigned unless otherwise specified."


Hello, and thank you for your patience. I don't know if I'm reading the files correctly, as the data after the "fLaC" marker looks very odd (I'm reading 4 bytes at a time into a byte array): smiley faces and musical notes. Do I need to convert these values? I'm using C# at the moment but can use C++ if needs must. Thanks again for your help.
We can't stop here this is bat country - Hunter S Thompson RIP.

Reading Flac tags using C#

Reply #21
That spec might be difficult to read unless you know how to read formal grammars.
The syntax is defined in a top-down manner.

For example, the table named "STREAM" contains 4 rows ("<32>", "METADATA_BLOCK", "METADATA_BLOCK*", "FRAME+").

This means that a STREAM (= the whole FLAC file) consists of a 32-bit magic number (fLaC), followed by a METADATA_BLOCK, followed by 0 or more METADATA_BLOCKs, followed by 1 or more FRAMEs.
You can read the other tables in the same way.
Each table defines a non-terminal symbol like "STREAM" as a sequence of other symbols like METADATA_BLOCK, in the described order.

* and + are often used to describe repetition (* means 0 or more, + means 1 or more). You should be familiar with this if you know regular expressions or something similar.

I think you're right, I don't understand the document. Are you saying all METADATA_BLOCKs are 32 bits long?
We can't stop here this is bat country - Hunter S Thompson RIP.

Reading Flac tags using C#

Reply #22
You have to read that doc recursively, top -> bottom, just as if you were a recursive descent parser.

METADATA_BLOCK, which first appears inside the STREAM definition, is defined in its own table (just after the STREAM table) as METADATA_BLOCK_HEADER followed by METADATA_BLOCK_DATA.

Since neither METADATA_BLOCK_HEADER nor METADATA_BLOCK_DATA is defined at that point, you have to look for their definitions elsewhere. That is what "recursively" means: you continue this procedure until every unknown element has been resolved to a definition.

As for METADATA_BLOCK_HEADER, it is described just after the METADATA_BLOCK table, and its elements are all defined without reference to other undefined elements (a 1-bit flag, followed by a 7-bit BLOCK_TYPE, followed by a 24-bit length of the metadata).

Reading Flac tags using C#

Reply #23
I don't know if I'm reading the files correctly, as the data after the "fLaC" marker looks very odd (I'm reading 4 bytes at a time into a byte array): smiley faces and musical notes. Do I need to convert these values?

What did you expect to see - numbers in text format? FLAC is a binary format so you have to interpret these 32 bits correctly:
1bit flag, followed by 7bit BLOCK_TYPE, followed by 24bit length of the metadata
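
As a rough C# illustration of interpreting those 32 bits, the sketch below splits four raw bytes into the three fields. The struct name `MetadataBlockHeader` and its member names are invented for this example; the field widths are the ones quoted above.

```csharp
using System;

readonly struct MetadataBlockHeader
{
    public readonly bool IsLast;    // 1 bit: set on the last metadata block before the audio
    public readonly int BlockType;  // 7 bits: 0 = STREAMINFO, 1 = PADDING, 4 = VORBIS_COMMENT, ...
    public readonly int Length;     // 24 bits, big-endian: size of the block data that follows

    public MetadataBlockHeader(byte[] raw)
    {
        if (raw == null || raw.Length != 4)
            throw new ArgumentException("Expected exactly 4 bytes.", nameof(raw));

        IsLast = (raw[0] & 0x80) != 0;                     // top bit of the first byte
        BlockType = raw[0] & 0x7F;                         // remaining 7 bits of the first byte
        Length = (raw[1] << 16) | (raw[2] << 8) | raw[3];  // last three bytes, most significant first
    }
}
```

For example, the four bytes 0, 0, 0, 34 decode to IsLast = false, BlockType = 0 (STREAMINFO) and Length = 34, so the next header starts 34 bytes further on. The bytes 0x84, 0x00, 0x01, 0x00 would be a final VORBIS_COMMENT block with 256 bytes of data.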


Reading Flac tags using C#

Reply #24
You have to read that doc recursively, top -> bottom, just as if you were a recursive descent parser.

METADATA_BLOCK, which first appears inside the STREAM definition, is defined in its own table (just after the STREAM table) as METADATA_BLOCK_HEADER followed by METADATA_BLOCK_DATA.

Since neither METADATA_BLOCK_HEADER nor METADATA_BLOCK_DATA is defined at that point, you have to look for their definitions elsewhere. That is what "recursively" means: you continue this procedure until every unknown element has been resolved to a definition.

As for METADATA_BLOCK_HEADER, it is described just after the METADATA_BLOCK table, and its elements are all defined without reference to other undefined elements (a 1-bit flag, followed by a 7-bit BLOCK_TYPE, followed by a 24-bit length of the metadata).


Hi there, thought I'd show you what I'm doing in code

Code:
// requires: using System.IO; using System.Text;
public void ProcessTag(string filename)
{
    const int BlockSize = 4;
    int bytesRead = 0;

    var encoding = new UTF8Encoding();

    // using blocks close the file even if an exception is thrown
    using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
    using (var br = new BinaryReader(fs))
    {
        byte[] block;
        // ReadBytes never returns null; at end of file it returns a short (or empty) array
        while ((block = br.ReadBytes(BlockSize)).Length == BlockSize)
        {
            bytesRead += BlockSize;

            string s = encoding.GetString(block);
            // on first pass s = "fLaC" as expected
            // I'm only getting the string for debugging purposes.

            // on next pass where I would expect block flag and block type I get these values in the array
            // block[0] = 0, block[1] = 0, block[2] = 0, block[3] = 34
            // Don't know what to do next ?
        }
    }
}


Thanks again for your patience.
We can't stop here this is bat country - Hunter S Thompson RIP.
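
As for what comes next once the VORBIS_COMMENT block (type 4) has been located: its data is a vendor string followed by a list of "NAME=value" entries, each prefixed by a 32-bit length. Unlike the FLAC header fields, these lengths are little-endian, which happens to be what `BinaryReader` reads natively. The C# sketch below is one way to unpack it; `VorbisComment` and `Parse` are hypothetical names, and upper-casing the field names is a choice made here for convenience (they are case-insensitive), not something the format requires.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class VorbisComment
{
    // Parses the data of a VORBIS_COMMENT block (without its 4-byte FLAC header).
    // A list of pairs is used instead of a dictionary because a name may repeat.
    public static List<KeyValuePair<string, string>> Parse(byte[] payload)
    {
        var tags = new List<KeyValuePair<string, string>>();
        using (var reader = new BinaryReader(new MemoryStream(payload)))
        {
            uint vendorLength = reader.ReadUInt32();   // little-endian
            ReadExact(reader, vendorLength);           // vendor string, ignored here

            uint count = reader.ReadUInt32();          // number of comments
            for (uint i = 0; i < count; i++)
            {
                uint length = reader.ReadUInt32();
                string entry = Encoding.UTF8.GetString(ReadExact(reader, length));

                int eq = entry.IndexOf('=');
                if (eq <= 0)
                    continue; // no field name: skip the malformed entry

                tags.Add(new KeyValuePair<string, string>(
                    entry.Substring(0, eq).ToUpperInvariant(),
                    entry.Substring(eq + 1)));
            }
        }
        return tags;
    }

    static byte[] ReadExact(BinaryReader reader, uint length)
    {
        byte[] data = reader.ReadBytes(checked((int)length));
        if (data.Length != length)
            throw new EndOfStreamException("Truncated Vorbis comment block.");
        return data;
    }
}
```

Each returned pair (for instance ARTIST and its value) maps naturally onto a database row keyed by file and field name.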