HydrogenAudio

Lossless Audio Compression => FLAC => Topic started by: pkfox on 2014-05-01 17:05:19

Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-01 17:05:19
Hi all, this is my first post, so I don't know whether this is the correct forum for it. I would like to read all my FLAC files (10000+ of them) and extract the tag information into a database. The database side of things is not a problem; understanding the documentation for the FLAC file format is :-(. Does anyone know how to do this, preferably in C#, although I can use C/C++ if necessary? I couldn't find much by Googling, so I thought I'd try here. TIA
Title: Reading Flac tags using C#
Post by: ktf on 2014-05-01 17:33:51
Writing your own program is much too complicated IMO. Why not use metaflac? Depending on what operating system you are on, this is very easy. So, what format do you want exactly (CSV?), how do you want to handle tags, etc.?
Title: Reading Flac tags using C#
Post by: saratoga on 2014-05-01 17:37:21
Never tried it in C#, but someone once ported a FLAC encoder to C# directly (see these forums), so perhaps it included tagging.  Alternatively, use the official tools from C# via P/Invoke:

http://stoyanov.in/2010/01/08/encoding-unc...with-flac-in-c/ (http://stoyanov.in/2010/01/08/encoding-uncompressed-audio-with-flac-in-c/)
Title: Reading Flac tags using C#
Post by: nu774 on 2014-05-01 17:56:46
This one is not specific to FLAC:
https://github.com/mono/taglib-sharp (https://github.com/mono/taglib-sharp)
Title: Reading Flac tags using C#
Post by: ozok on 2014-05-01 21:19:14
Using MediaInfo is another option. I remember using a .NET wrapper a long time ago. I'm not sure if this is it, though: https://code.google.com/p/mediainfo-dot-net/ (https://code.google.com/p/mediainfo-dot-net/)
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-01 22:49:38
Hi there and thanks for your reply. I'm a software man by profession, so programming is not a problem, but I can't seem to find a definitive explanation of the structure of a FLAC file. The documentation states that the first 4 bytes of the header are supposed to read "fLaC" and then goes on to define all the "possible" other information that might or might not be there afterwards. All I need to know is where the "tags" begin and end, but it doesn't seem to be written down anywhere.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-01 22:53:27
Hi and thanks but I need this to be hand rolled - I am completely willing to use tools but I absolutely "need" to understand the file structure to achieve what I'm thinking of doing
Title: Reading Flac tags using C#
Post by: lvqcl on 2014-05-01 22:59:20
but it doesn't seem to be written down anywhere

https://xiph.org/flac/format.html#format_overview (https://xiph.org/flac/format.html#format_overview)
https://xiph.org/flac/format.html#metadata_..._vorbis_comment (https://xiph.org/flac/format.html#metadata_block_vorbis_comment)
Title: Reading Flac tags using C#
Post by: saratoga on 2014-05-01 23:32:29
Hi and thanks but I need this to be hand rolled


This is about the dumbest thing you can ever do with a media format, but the specs are on the flac website, so feel free to hang yourself
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-02 06:50:23
What is the dumbest thing ?
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-02 07:24:39
This one is not specific to FLAC:
https://github.com/mono/taglib-sharp (https://github.com/mono/taglib-sharp)

Looks the best so far thank you
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-02 07:27:45
but it doesn't seem to be written down anywhere

https://xiph.org/flac/format.html#format_overview (https://xiph.org/flac/format.html#format_overview)
https://xiph.org/flac/format.html#metadata_..._vorbis_comment (https://xiph.org/flac/format.html#metadata_block_vorbis_comment)


Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-02 07:29:17
Using mediainfo is an other option. I remember using a .net wrapper a long time ago. I'm not sure if this is it tho https://code.google.com/p/mediainfo-dot-net/ (https://code.google.com/p/mediainfo-dot-net/)

I will definitely check it out - thanks.
Title: Reading Flac tags using C#
Post by: ktf on 2014-05-02 08:14:14
Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.

Uhhh... yes, they do, you probably aren't looking closely enough.

The streamheader starts with fLaC, then various metadata blocks follow. You only need the VORBIS_COMMENT block, which is blocktype 4, so you can skip the other blocks by looking in the metadata block header for their size. So, pseudocode would be

Code: [Select]
1. check the 4-byte "fLaC" stream marker and skip over it
2. loop
   2a. read the 4-byte metadata block header (last-block flag, block type, block length); if it can't be read, error
   2b. check whether this is a vorbis_comment block (type 4); if not, take the block length from the header, skip over that many bytes and restart the loop (stop if the last-block flag was set: the file has no tags)
   2c. read the vorbis_comment stuff into your database


But still, as some here pointed out, making use of a library might be more feature complete and future proof.
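
For anyone who does want to hand-roll it, the pseudocode above could look roughly like the following C sketch. It assumes the whole file has already been read into a memory buffer, and the name find_vorbis_comment and its parameters are made up for illustration (they are not part of libFLAC or any other library).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: scans a FLAC file held in memory for the
 * VORBIS_COMMENT block (type 4). On success returns 1 and stores the
 * offset and length of the block data (header excluded). Returns 0 if
 * the buffer is not FLAC, is truncated, or has no such block. */
static int find_vorbis_comment(const uint8_t *buf, size_t len,
                               size_t *data_offset, uint32_t *data_length)
{
    size_t pos = 4;                       /* step 1: skip "fLaC" */

    if (len < 4 || memcmp(buf, "fLaC", 4) != 0)
        return 0;

    for (;;) {                            /* step 2: loop over blocks */
        if (len - pos < 4)
            return 0;                     /* truncated block header */

        int is_last    = (buf[pos] & 0x80) != 0;
        int block_type = buf[pos] & 0x7F;
        uint32_t block_len = (uint32_t)buf[pos + 1] << 16 |
                             (uint32_t)buf[pos + 2] << 8  |
                             (uint32_t)buf[pos + 3];
        pos += 4;

        if (block_len > len - pos)
            return 0;                     /* block runs past the buffer */

        if (block_type == 4) {            /* step 2c: found the tags */
            *data_offset = pos;
            *data_length = block_len;
            return 1;
        }
        if (is_last)
            return 0;                     /* audio frames follow: no tags */

        pos += block_len;                 /* step 2b: skip this block */
    }
}
```

The returned offset points at the raw VORBIS_COMMENT data, which still needs its own parsing. The masks and shifts are the same in C#, so a port should be mostly mechanical.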
Title: Reading Flac tags using C#
Post by: nu774 on 2014-05-02 10:37:00
That spec might be difficult to read unless you know how to read formal grammars.
The syntax is defined in top-down manner.

For example, a table named "STREAM" contains 4 rows("<32>", "METADATA_BLOCK", "METADATA_BLOCK*", "FRAME+").

This means that a STREAM (= whole flac file) consists of 32 bit magic number (fLaC), followed by a METADATA_BLOCK, followed by 0 or more METADATA_BLOCKs, followed by 1 or more FRAMEs.
You can read other tables in the same way.
Each table defines a non-terminal symbol like "STREAM" as a sequence of other symbols like METADATA_BLOCK, in the described order.

* and + are often used to describe repetitions (* means 0 or more, + means 1 or more). You should be familiar with them if you know regular expressions.
Title: Reading Flac tags using C#
Post by: lithopsian on 2014-05-02 13:02:38
but it doesn't seem to be written down anywhere

https://xiph.org/flac/format.html#format_overview (https://xiph.org/flac/format.html#format_overview)
https://xiph.org/flac/format.html#metadata_..._vorbis_comment (https://xiph.org/flac/format.html#metadata_block_vorbis_comment)


Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.

Then you didn't read it carefully enough.  That page defines exactly what the format of every single bit of data in the headers of a Flac file is and does.  The important ones for you are the metadata blocks, but if you want to parse the file yourself then you'll need to understand the context they sit in, that is the format of the rest of the file.

Still no need for you to parse this entirely by hand.  There is a perfectly good C FLAC library which will read out metadata for you at the tag level, if existing tools are not sufficient.
Title: Reading Flac tags using C#
Post by: lithopsian on 2014-05-02 14:01:16
If you want a list of tags, there isn't a definitive one, but here are some useful links that cover most cases:
http://age.hobba.nl/audio/tag_frame_reference.html (http://age.hobba.nl/audio/tag_frame_reference.html)
http://age.hobba.nl/audio/mirroredpages/ogg-tagging.html (http://age.hobba.nl/audio/mirroredpages/ogg-tagging.html)
http://xiph.org/vorbis/doc/v-comment.html (http://xiph.org/vorbis/doc/v-comment.html)
https://wiki.xiph.org/Field_names (https://wiki.xiph.org/Field_names)
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-03 09:11:10
This one is not specific to FLAC:
https://github.com/mono/taglib-sharp (https://github.com/mono/taglib-sharp)

That is what I'm using now v impressive so far.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-03 09:18:37
Hi, I've ended up using taglib_sharp which seems very good - when I have more time I'll dig into the code and see how they do it, I know the stream header starts with "fLaC" but where is the block type info and block size ?
Title: Reading Flac tags using C#
Post by: lvqcl on 2014-05-03 09:59:32
I know the stream header starts with "fLaC" but where is the block type info and block size ?


In METADATA_BLOCK_HEADER (and METADATA_BLOCK_HEADER is the first 32 bits of METADATA_BLOCK).
So you read first 32 bits from a file (they are equal to "fLaC"), then next 32 bits are METADATA_BLOCK_HEADER.

Also note that "All numbers are big-endian coded. All numbers are unsigned unless otherwise specified."
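
As a rough illustration of those first two reads, here is a small C sketch. The function name read_first_block_header is hypothetical, and it assumes the file was opened in binary mode ("rb"), which matters on Windows.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the 4-byte stream marker and then the
 * 4 bytes of the first METADATA_BLOCK_HEADER from an open file.
 * Returns 1 on success, 0 if the file is too short or not FLAC. */
static int read_first_block_header(FILE *f, uint8_t header[4])
{
    uint8_t marker[4];

    if (fread(marker, 1, sizeof marker, f) != sizeof marker)
        return 0;                       /* fewer than 4 bytes available */
    if (memcmp(marker, "fLaC", 4) != 0)
        return 0;                       /* compare raw bytes, not a string */
    return fread(header, 1, 4, f) == 4;
}
```

The header bytes come back exactly as stored in the file; turning them into numbers is a separate step, and that is where the big-endian rule applies.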
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-04 09:25:12
Hello, and thank you for your patience, I don't know if I'm reading the files correctly as the data after the "fLaC" marker is very odd looking ( I'm reading 4 bytes at a time into a byte array ) smiley faces and musical notes - do I need to convert these values ? I'm using c# at the moment but can use c++ if needs must. Thanks again for your help.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-04 09:40:33
That spec might be difficult to read unless you know how to read formal grammars.
The syntax is defined in top-down manner.

I think you're right, I don't understand the document. Are you saying all METADATA_BLOCKs are 32 bits long ?
Title: Reading Flac tags using C#
Post by: nu774 on 2014-05-04 11:52:01
You have to read that doc recursively, top -> bottom, just as if you were a recursive descent parser.

METADATA_BLOCK, which first appears inside of STREAM definition, is defined on another table for METADATA_BLOCK (just after the STREAM table), where METADATA_BLOCK is defined as METADATA_BLOCK_HEADER followed by METADATA_BLOCK_DATA.

Since both of METADATA_BLOCK_HEADER and METADATA_BLOCK_DATA are not defined at the point, you have to seek for definition of METADATA_BLOCK_HEADER and METADATA_BLOCK_DATA in other places. Recursively means this... you have to continue this procedure until you read all the unknown elements defined at somewhere.

As for METADATA_BLOCK_HEADER, it is described just after the METADATA_BLOCK table, and now its elements are all defined without using other undefined elements (1bit flag, followed by 7bit BLOCK_TYPE, followed by 24bit length of the metadata).
Title: Reading Flac tags using C#
Post by: lvqcl on 2014-05-04 12:42:59
I don't know if I'm reading the files correctly as the data after the "fLaC" marker is very odd looking ( I'm reading 4 bytes at a time into a byte array ) smiley faces and musical notes - do I need to convert these values ?

What did you expect to see - numbers in text format? FLAC is a binary format so you have to interpret these 32 bits correctly:
1bit flag, followed by 7bit BLOCK_TYPE, followed by 24bit length of the metadata

Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-04 12:54:22
Hi there, thought I'd show you what I'm doing in code

Code: [Select]
// needs: using System.IO; using System.Text;
public void ProcessTag(string filename)
{
    const int BlockSize = 4;
    int BytesRead = 0;

    UTF8Encoding encoding = new UTF8Encoding();
    byte[] block;

    // "using" closes the stream even if an exception is thrown
    using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
    using (BinaryReader br = new BinaryReader(fs))
    {
        // ReadBytes never returns null; at end of file it returns a shorter array
        while ((block = br.ReadBytes(BlockSize)).Length == BlockSize)
        {
            BytesRead += BlockSize;

            string s = encoding.GetString(block);
            // on first pass s = "fLaC" as expected
            // I'm only getting the string for debugging purposes.

            // on next pass where I would expect block flag and block type I get these values in the array
            // block[0] = 0, block[1] = 0, block[2] = 0, block[3] = 34
            // Don't know what to do next ?
        }
    }
}


Thanks again for your patience.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-05 06:21:51
I don't know if I'm reading the files correctly as the data after the "fLaC" marker is very odd looking ( I'm reading 4 bytes at a time into a byte array ) smiley faces and musical notes - do I need to convert these values ?

What did you expect to see - numbers in text format? FLAC is a binary format so you have to interpret these 32 bits correctly:
1bit flag, followed by 7bit BLOCK_TYPE, followed by 24bit length of the metadata



Hi there and thanks, how do I interpret the data ?
Title: Reading Flac tags using C#
Post by: saratoga on 2014-05-05 07:13:31
Hi there and thanks, how do I interpret the data ?


See the structure here: 

https://xiph.org/flac/format.html#metadata_block_header (https://xiph.org/flac/format.html#metadata_block_header)

But basically bit zero (the most significant bit of the first byte) tells you if it's the last block header, bits 1-7 encode the block type, and the remaining 3 bytes are the length of the block.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-05 10:09:15
I don't know if I'm reading the files correctly as the data after the "fLaC" marker is very odd looking ( I'm reading 4 bytes at a time into a byte array ) smiley faces and musical notes - do I need to convert these values ?

What did you expect to see - numbers in text format? FLAC is a binary format so you have to interpret these 32 bits correctly:
1bit flag, followed by 7bit BLOCK_TYPE, followed by 24bit length of the metadata



How do I interpret these 32 bits correctly ?
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-05 10:14:20
Pardon my ignorance but I'm reading the data into a 4 byte array which I take to mean 32 bits am I right in thinking this ?
Title: Reading Flac tags using C#
Post by: Nick.C on 2014-05-05 13:05:09
Pardon my ignorance but I'm reading the data into a 4 byte array which I take to mean 32 bits am I right in thinking this ?


Yes - but remember that the byte order is big-endian. With this in mind, you have to be sure how your code is treating the reads. If the bytes are read as a little-endian integer then you need to take that into account.
Title: Reading Flac tags using C#
Post by: saratoga on 2014-05-05 16:59:16
Pardon my ignorance but I'm reading the data into a 4 byte array which I take to mean 32 bits am I right in thinking this ?


Yes that is correct.  There are 8 bits in a byte, so if you have 4 bytes you also have 32 bits.

Anyway hydrogen audio may not be the best place for programming questions, and parsing compression formats may not be the best way to learn about programming. I would use the libraries other people linked.
Title: Reading Flac tags using C#
Post by: lithopsian on 2014-05-05 18:03:29
You have an unsigned char[4], possibly as a pointer.  Best to make it unsigned because you don't want large numbers getting treated as negative.

The first member of the array has to be treated as a bitmap.  The first bit indicates whether this header block is the last one before the audio data.  You can test it using byte_one & 0x80;  Or ignore it using byte_one & 0x7F;

The rest of the first byte is a number indicating the type of header block.  The STREAMINFO block is type 0, the VORBIS_COMMENT block (the tags) is type 4.

The other three bytes are to be interpreted as a big-endian number, meaning the first byte (byte two of the array) is the most significant (largest) part of the number.  Best not to rely on the compiler and architecture (unless you want to get into different code based on defines about which endian is in effect) and just build yourself a 24-bit number (unsigned still!):
Code: [Select]
uint32_t length = byte[1]<<16 | byte[2]<<8 | byte[3];


Note that I had to use a 32 bit integer even though the number is only 24 bits long, because there is no 24-bit numeric data type in C.

So now you know how much metadata there is.  For the tag block itself you probably don't need that, since the block contains its own internal length indicators, but you do need it to skip over the other blocks.  The overall length can also be a valuable sanity check.  You don't want to be parsing the metadata block if the length field says it is zero characters.  Your next challenge comes with parsing the metadata block itself, where the numbers are all little-endian.
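
To make that last step a bit more concrete, here is a minimal C sketch of walking the data of a VORBIS_COMMENT block. It follows the layout in the Vorbis comment spec (vendor length, vendor string, comment count, then length-prefixed "NAME=value" entries, all lengths 32-bit little-endian). The helper names read_le32 and list_vorbis_comments are invented for this example, and it just prints each entry rather than storing it anywhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Little-endian 32-bit read: the first byte is the least significant. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Hypothetical helper: walks the data of a VORBIS_COMMENT block and
 * prints each "NAME=value" entry. Returns the number of entries, or -1
 * if a length field points past the end of the block. */
static long list_vorbis_comments(const uint8_t *data, size_t len)
{
    size_t pos;
    uint32_t n, count, i;

    if (len < 4) return -1;
    n = read_le32(data);                   /* vendor string length */
    pos = 4;
    if (n > len - pos) return -1;
    pos += n;                              /* skip the vendor string */

    if (len - pos < 4) return -1;
    count = read_le32(data + pos);         /* number of comments */
    pos += 4;

    for (i = 0; i < count; i++) {
        if (len - pos < 4) return -1;
        n = read_le32(data + pos);         /* length of this comment */
        pos += 4;
        if (n > len - pos) return -1;
        /* entries are UTF-8 and not NUL-terminated, so print by length */
        printf("%.*s\n", (int)n, (const char *)(data + pos));
        pos += n;
    }
    return (long)count;
}
```

Every length is checked against the remaining block size before it is used, which is where the overall block length from the header earns its keep.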
Title: Reading Flac tags using C#
Post by: Nick.C on 2014-05-05 18:56:52
The other three bytes are to be interpreted as a big-endian number, meaning the first byte (byte two of the array) is the most significant (largest) part of the number.  Best not to rely on the compiler and architecture (unless you want to get into different code based on defines about which endian is in effect) and just build yourself a 24-bit number (unsigned still!):
Code: [Select]
uint32_t length = byte[1]<<16 | byte[2]<<8 | byte[3];


Surely big-endian also means that the bytes are in reverse-bit order compared to little-endian ints in C?
Title: Reading Flac tags using C#
Post by: lithopsian on 2014-05-05 19:15:36
C is "endian-neutral".  On some platforms it will use little endian, on others big-endian.  Unless you are exchanging data with a fixed format such as a file or internet stream then you never see these details.  This is driven by the underlying processor architecture.

While the byte order of a particular multi-byte data type varies with the endianness, C will always treat such types as a continuous sequence of bits, in this case 32 of them.  You can shift up and down those 32 bits without worrying about which order the underlying bytes are arranged in.

What you can't do is map or cast those 32 bits onto a non-endian data type such as a character array and expect the bytes to be consistent between platforms.  There are only two possibilities, and you can determine whether the high order byte is the first or last using compiler-defines, but usually it is best to avoid this sort of assumption.  While fixed byte orders in files such as Flac correspond to big- or little-endian byte arrangements in memory, you do yourself a favour if you just treat them as raw arrangements of bits and don't try to get cute about whether they will map into one particular memory arrangement or another.
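
A small C sketch of the difference, for what it's worth. Both function names are made up for the example: the first builds the number with shifts and gives the same answer on any machine, while the second copies the raw bytes into an integer and so depends on the host's byte order.

```c
#include <stdint.h>
#include <string.h>

/* Builds the value from the bytes with shifts: same result on any host. */
static uint32_t read_be32_portable(const uint8_t b[4])
{
    return (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16 |
           (uint32_t)b[2] << 8  | (uint32_t)b[3];
}

/* Copies the bytes straight into an integer: result depends on the host. */
static uint32_t read_u32_host_order(const uint8_t b[4])
{
    uint32_t v;
    memcpy(&v, b, sizeof v);
    return v;
}
```

For the bytes 00 00 01 02 the portable version always yields 0x00000102. The memcpy version yields 0x00000102 on a big-endian host and 0x02010000 on a little-endian one.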
Title: Reading Flac tags using C#
Post by: lithopsian on 2014-05-05 19:16:40
I just tried to edit my first reply to clarify a few expressions, but apparently too late, so apologies if anything is a little confusing or unclear.
Title: Reading Flac tags using C#
Post by: lvqcl on 2014-05-05 19:18:45
Surely big-endian also means that the bytes are in reverse-bit order compared to little-endian ints in C?

No, bytes are bytes. http://en.wikipedia.org/wiki/Endianness#At...ment_size_8-bit (http://en.wikipedia.org/wiki/Endianness#Atomic_element_size_8-bit)
Also, http://en.wikipedia.org/wiki/Endianness#.22Bit_endianness.22 (http://en.wikipedia.org/wiki/Endianness#.22Bit_endianness.22)
Title: Reading Flac tags using C#
Post by: Nick.C on 2014-05-05 19:29:27
Ta both.
Title: Reading Flac tags using C#
Post by: lithopsian on 2014-05-05 22:30:33
The equivalent if the bytes were specified to be little-endian (the numbers inside the metadata block are 32 bits, 4 bytes, little-endian):
Code: [Select]
/* cast before <<24 so the top byte is not shifted into the sign bit of an int */
uint32_t length = byte[0] | byte[1]<<8 | byte[2]<<16 | (uint32_t)byte[3]<<24;


One or the other of those layouts would match your machine if you just mapped the bytes onto a 32-bit integer, but the other one would not.  Which one depends on the machine you compile on/for.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-06 08:25:17
Good lord no wonder I wasn't getting anywhere, how do I check if the block is of type 4 ? And thanks for your time.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-06 11:14:01
I just tried to edit my first reply to clarify a few expressions, but apparently too late, so apologies if anything is a little confusing or unclear.


Hi there and thanks very much for your help, I'm a professional IT man but have never messed with stuff at bit level so forgive me if my questions seem a bit basic, my basic read code in c# is this

Code: [Select]
const int BlockSize = 4;
int BytesRead = 0;
byte[] block;

using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
using (BinaryReader br = new BinaryReader(fs))
{
    // Read the file in 4 byte chunks.
    // ReadBytes never returns null; at end of file it returns a shorter array.
    while ((block = br.ReadBytes(BlockSize)).Length == BlockSize)
    {
        BytesRead += BlockSize;
        // on the first pass has block == "fLaC" as expected.
        // the second has block[0] = 0 , block[1] = 0, block[2] = 0, block[3] = 34.
        // if I apply your code (the shifts produce an int in C#, hence the cast)
        uint length = (uint)(block[1] << 16 | block[2] << 8 | block[3]);
        // Unsurprisingly I get 34 which is the value I have in block[3] and the other elements are 0 - but interestingly
        // 34 hex = 52 decimal which is the ASCII code for '4' which is the block type I'm looking for.
    }
}


In my ignorance of bit manipulation I'm guessing that on my second read the values of my byte array are interpreted thus
Code: [Select]
block[0] = 0 // which means this is not the last block.
block[1] + block[2] + block[3] = the block type, which in this case is 4 if my guess that the 34 value is in hex is correct.


How am I doing ?



Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-06 15:02:13
Hi again, I checked the "Endianness" of my c# compiler with BitConverter.IsLittleEndian and it reports it is Little Endian so I guess I need to reverse the byte order ?
Title: Reading Flac tags using C#
Post by: lithopsian on 2014-05-08 11:56:22
9 out of 0x0A programmers never have to go near endianness, but it is an important concept when transferring multi-byte numbers in a completely portable way.  The whole point of the bit-shifting code is that you still shouldn't have to worry about the endianness of your machine, only about arranging 24 or 32 bits in the correct way in your file.

I see you are also struggling with the concept of whether "4" should be encoded as a binary 4 or ASCII 4, and also between bits and bytes.  The numbers in the FLAC metadata blocks are not stored as ASCII digits.  A "4" is stored as 0x04 (or \4 in octal), not 0x34, ignoring for now multi-byte mappings.

The metadata block type is only the first byte, or to be more accurate the lowest 7 bits of the first byte.  The other three bytes are the length of the metadata block.  So the "4" is stored simply as 0x04 in a single byte.  If it is the last metadata block, then set the high order bit and you get 0x84.  You could set the bit simply by adding 0x80 to your block type number, but since you are setting a bit, doing 0x04 | 0x80 would be clearer.  You could also do it with octal, for example '\4' | '\200', giving '\204'.  Similarly, for reading the first character (block[0] is the last block bit and the block type number), you can check the bit using block[0] & 0x80, and you can obtain the block type number (excluding the high order bit) using block[0] & 0x7F.  Note that endianness is not relevant for this single byte.

For the block length, now you are constructing a multi-byte number and should be doing your bit-shifting.  Remember, still numbers, not ASCII digits.  If block[3] really does contain 0x34 (ignoring the two higher order, bigger, bytes) then that represents a length of 52 characters, the length of the metadata block itself (excluding the four header bytes).  Quite a coincidence, but might be right.  Are you parsing a real FLAC file?  Lengths smaller than 256 bytes will only have block[3] set, so if you want to really test the bit-shifting code then you'll need bigger lengths.

P.S. Your compiler is neither big-endian nor little-endian.  It can do either, but it detects and reports the relevant endianness for the machine you are running it on.
Title: Reading Flac tags using C#
Post by: lvqcl on 2014-05-08 15:41:05
If block[3] really does contain 0x34 then (ignoring the two higher order, bigger, bytes) then that represents a length of 52 characters

I'm sure that it contains 34 (dec), not 0x34.
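
Read that way, the four bytes 0, 0, 0, 34 make sense: last-block flag clear, block type 0 (STREAMINFO), length 34, and STREAMINFO is always 34 bytes long. So the file is being read correctly; the first block is simply not the tag block. A short C sketch of the decoding, with a struct and function name that are invented for the example:

```c
#include <stdint.h>

struct block_header {      /* hypothetical name, not from libFLAC */
    int      is_last;      /* 1 if no more metadata blocks follow */
    int      type;         /* 0 = STREAMINFO, 4 = VORBIS_COMMENT */
    uint32_t length;       /* size of the block data in bytes */
};

/* Splits the 4 header bytes into flag, type and big-endian length. */
static struct block_header decode_block_header(const uint8_t b[4])
{
    struct block_header h;
    h.is_last = (b[0] & 0x80) != 0;
    h.type    = b[0] & 0x7F;
    h.length  = (uint32_t)b[1] << 16 | (uint32_t)b[2] << 8 | b[3];
    return h;
}
```

To check for the tag block, compare the masked value with 4, as in (block[0] & 0x7F) == 4. A bare if (block[0] & 0x7F) only tests for non-zero, so it would accept every block type except STREAMINFO.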
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-09 12:30:40
Hi there and thanks again. To be clear, are all the values I see in my byte array in hex? Obviously 0x04 is just 4. The problem (one of them!) I'm having is that I'm reading the file in 4-byte chunks, and the first byte, which you say should be 0x04 for a comment block, is never there! If I did find a value of 0x04 in the first byte, would your bit-shifting code on the remaining 3 bytes give me the length of the metadata block? Also, can you show me how I would code the test for 0x04 being in the first byte of the block array? if(block[0] & 0x7F)? Thank you very much for your help.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-09 12:42:36
If block[3] really does contain 0x34 then (ignoring the two higher order, bigger, bytes) then that represents a length of 52 characters

I'm sure that it contains 34 (dec), not 0x34.


Hi there, why do you think that?
Title: Reading Flac tags using C#
Post by: lithopsian on 2014-05-09 12:46:21
The Vorbis comment block is never the first block after fLaC.  The first block is always streaminfo and its magic byte is 0x00.  Almost never 0x80, because it is rarely the last block.  The streaminfo block has a fixed length, but still has a length indicator in the three bytes following the 0x00: always 0x00, 0x00, and 0x22, indicating 34 bytes.

Following that there are other metadata blocks, which may or may not be Vorbis comment blocks.  The seektable seems to be a common second block, with magic byte 0x03, followed by the Vorbis comment block, usually with a padding block last.  You have to parse through the blocks in order: read the type, read the length, then either parse that many bytes or skip over them to the next block.

You check the block type very easily:
Code:
if ((block[0] & 0x7F) == 4) // then vorbis comment block
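
Putting the steps above together, here is one way the block-by-block walk might look in C#. It is only a sketch: FlacBlockWalker and its methods are made-up names, it assumes the stream starts directly with the fLaC marker (no ID3 tag in front), and it reads every block body rather than seeking past the ones it does not need.

```csharp
using System;
using System.IO;

static class FlacBlockWalker
{
    // Returns the body of the first vorbis comment block (type 4), or null if there is none.
    public static byte[] FindVorbisComment(Stream s)
    {
        var marker = new byte[4];
        if (!ReadFully(s, marker) ||
            marker[0] != (byte)'f' || marker[1] != (byte)'L' ||
            marker[2] != (byte)'a' || marker[3] != (byte)'C')
            throw new InvalidDataException("missing fLaC marker");

        var header = new byte[4];
        while (ReadFully(s, header))
        {
            bool isLast = (header[0] & 0x80) != 0;   // high-order bit
            int type = header[0] & 0x7F;             // low 7 bits
            int length = (header[1] << 16) | (header[2] << 8) | header[3];

            var body = new byte[length];
            if (!ReadFully(s, body))
                throw new InvalidDataException("truncated metadata block");

            if (type == 4) return body;              // found the vorbis comment block
            if (isLast) break;                       // audio frames follow, stop here
        }
        return null;
    }

    // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
    static bool ReadFully(Stream s, byte[] buffer)
    {
        int done = 0;
        while (done < buffer.Length)
        {
            int n = s.Read(buffer, done, buffer.Length - done);
            if (n <= 0) return false;
            done += n;
        }
        return true;
    }
}
```

Something like FlacBlockWalker.FindVorbisComment(File.OpenRead(path)) would then hand back the raw bytes of the comment block for further parsing. On a seekable stream the unwanted blocks could be skipped with Seek instead of being read.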


If you're confused, look at a FLAC file with a hex editor.  If you can't read through it that way then you'll never be able to code your way through it.
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-09 13:22:28
Ok thanks - apologies for pestering you but I really want to understand all this stuff :-)
Title: Reading Flac tags using C#
Post by: lithopsian on 2014-05-09 14:01:02
The Vorbis comment metadata block is a good one to look at in a hex editor, because the field contents are readable ASCII (usually!) and so it is easy to find the length markers between them.  Looking at the length indicators inside a Vorbis comment block and those of the metadata block headers will also make very clear to you the meaning of big- and little-endianness.  Again, don't get hung up on names, but see that the bytes making up the length are in a different order in the file: the four-byte lengths inside the Vorbis comment block have the least significant byte first, while the three-byte length in a metadata block header has the most significant byte first.
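
For comparison with the metadata block header, here is a hedged C# sketch of reading the inside of a Vorbis comment block, where every length is four bytes with the least significant byte first. The layout assumed is: vendor string length, vendor string, number of comments, then a length plus a UTF-8 "NAME=value" string for each comment. VorbisCommentSketch and its helpers are illustrative names, and the input is the block body without the four header bytes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class VorbisCommentSketch
{
    // 32-bit number, least significant byte first (the opposite order to the block header).
    static int ReadInt32LE(byte[] b, ref int pos)
    {
        if (pos + 4 > b.Length) throw new FormatException("truncated block");
        int n = b[pos] | (b[pos + 1] << 8) | (b[pos + 2] << 16) | (b[pos + 3] << 24);
        pos += 4;
        if (n < 0) throw new FormatException("value too large for this sketch");
        return n;
    }

    // A length-prefixed UTF-8 string.
    static string ReadString(byte[] b, ref int pos)
    {
        int len = ReadInt32LE(b, ref pos);
        if (len > b.Length - pos) throw new FormatException("string runs past the block");
        string s = Encoding.UTF8.GetString(b, pos, len);
        pos += len;
        return s;
    }

    // Returns the tags in file order; a list is used because field names may repeat.
    public static List<KeyValuePair<string, string>> Parse(byte[] body)
    {
        int pos = 0;
        ReadString(body, ref pos);                  // vendor string, ignored here
        int count = ReadInt32LE(body, ref pos);     // number of comments that follow
        var tags = new List<KeyValuePair<string, string>>();
        for (int i = 0; i < count; i++)
        {
            string entry = ReadString(body, ref pos);
            int eq = entry.IndexOf('=');
            if (eq < 0) continue;                   // no '=' separator: skip the entry
            tags.Add(new KeyValuePair<string, string>(
                entry.Substring(0, eq), entry.Substring(eq + 1)));
        }
        return tags;
    }
}
```

In a hex editor a ten-character entry such as ARTIST=Abc would be preceded by 0A 00 00 00, whereas a ten-byte metadata block would have 00 00 0A in its header.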
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-09 17:06:10
I'll have a go thanks again
Title: Reading Flac tags using C#
Post by: MOCKBA on 2014-05-16 19:19:54
Not exactly C# and a little messy in design
https://github.com/drogatkin/JustFLAC (https://github.com/drogatkin/JustFLAC)
However it works for me like a charm
Title: Reading Flac tags using C#
Post by: pkfox on 2014-05-18 10:26:55
Thanks I'll have a look.