Command-line hacking: extracting audio metadata (tags)

This is another in my series of articles on doing off-beat and (I hope) interesting things with standard Linux command-line tools.

This is a relatively simple application compared to some I've presented. In this article I'll demonstrate how to extract metadata (tags) in a reasonably consistent way, from a variety of different audio file formats.

Why would this be useful? Well, when I rip audio CDs to disk, or buy DRM-free albums from commercial suppliers, they end up in a directory, in no particular format. I usually use a tag editor like EasyTag or Kid3 to add metadata according to my own filing conventions. What I'd like to do then is move all the individual files to directories based on (for example) the album name. Perhaps I might organize my music into album folders like this:

/home/kevin/music/Bach - Goldberg Variations
/home/kevin/music/Pink Floyd - Dark Side of the Moon

For such a scheme I just need the tag that represents the album name and, perhaps, the tag that represents the track title. If I want to be a bit more systematic, I might organize by artist and album name:

/home/kevin/music/Glenn Gould/Bach - Goldberg Variations
/home/kevin/music/Pink Floyd/Dark Side of the Moon

Or, perhaps, by composer and artist; or genre and composer; or whatever structure suits me. If I want to do this automatically using a script, operating on a bunch of miscellaneous files, a pre-requisite is to be able to get all the necessary metadata from audio files.

There are a number of utilities that can do this, but using them in a script is more complicated than it might seem at first. First, different audio formats use completely different tagging strategies. ID3 tags (usually used in MP3) use a bunch of four-letter abbreviations for the tag names. For example, album is usually TALB. FLAC/Vorbis tags use more descriptive names, like "ALBUM".

Second, utilities that can extract this information are often a bit sloppy about the output layout. After all, it usually isn't designed to be readable from a script.

Third, in file formats that allow flexible tag names, there's no common agreement on things like letter case or punctuation ("ALBUM" vs. "album"; "album_artist" vs. "album-artist").
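One way to cope with this kind of variation is to normalise a tag name before comparing it. What follows is only a minimal sketch: the function name normalise_tag_name is made up for illustration, and it assumes that letter case, underscores, and hyphens are the only differences that matter.

```shell
# Hypothetical helper (not part of any standard tool): fold a tag name
# to lower case and turn underscores and hyphens into spaces, so that
# "ALBUM_ARTIST", "Album-Artist" and "album artist" compare equal.
normalise_tag_name() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr '_-' '  '
}
```

With this in place, normalise_tag_name "ALBUM_ARTIST" and normalise_tag_name "album-artist" both print "album artist".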

Another subtlety -- one which experienced shell programmers will usually deal with instinctively -- is that we'll almost certainly be dealing with filenames that contain spaces. This requires a bit of care in scripting.
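As an illustration of the care needed, here is a sketch of a loop that visits every FLAC file in a directory and hands each filename to another command as a single argument. The function name for_each_flac and the restriction to the .flac extension are assumptions made for the example; the important part is that every expansion is wrapped in double-quotes.

```shell
# Hypothetical helper: call a command once for each FLAC file in a
# directory, passing the filename as one argument even when it
# contains spaces.
for_each_flac() {
    local dir="$1" handler="$2" file
    for file in "$dir"/*.flac; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$file" ] || continue
        "$handler" "$file"
    done
}
```

For example, for_each_flac . echo prints one filename per line, however many spaces each name contains.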

In practice, I usually use gettags -- a utility I wrote myself because it produces output in exactly the format I want. However, for the purposes of this article, that would be cheating -- I want to use widely-available utilities that are typically found in the usual Linux repositories.

The best standard utility I've found for this purpose is ffprobe, which is part of the FFmpeg suite. If it isn't installed by default, you can probably do dnf install ffmpeg or apt-get install ffmpeg, or whatever is appropriate to your distribution. ffprobe handles most common audio formats, including the high-resolution DSF format. However, its output can be a little hard to parse, and differs slightly according to the file type.

To get information about an audio file, use ffprobe -i {filename}. Here is a typical output:

Input #0, flac, from 'mediafiles/audio_music/Pink Floyd - 
The Dark Side Of The Moon/06 Money.flac':
    album           : Pink Floyd - The Dark Side Of The Moon
    artist          : Pink Floyd
    title           : Money
    album artist    : Pink Floyd
    genre           : Music - Classic rock and pop
    composer        : Pink Floyd
    date            : 1973
    track           : 06
  Duration: 00:06:36.50, start: 0.000000, bitrate: 3018 kb/s
    Stream #0:0: Audio: flac, 4100 Hz, stereo, s16 (16 bit) 
    comment         : Cover
    title           : cover

If we want to process this output using grep, awk, etc., there are a few things to watch out for -- one of which is not at all obvious. This non-obvious complication is that ffprobe produces output to stderr, not stdout. This means that special care needs to be taken when using pipes and redirection. To pipe the output of ffprobe into grep, for example, we need:

ffprobe -i "$filename" 2>&1 | grep ...

In this example, I've assumed that we've already got the audio filename into the shell variable $filename. The double-quotes are needed to ensure that spaces in the filename don't lead to its being split into multiple arguments. The formulation 2>&1 tells Bash to redirect stream 2 (stderr) to wherever stream 1 (stdout) is going, which here is the pipe. An alternative to using pipes would be to redirect the output of ffprobe to a file using the 2> syntax, and then process the file.
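The file-based alternative might look something like the sketch below. The function name probe_to_file and the PROBE_CMD variable are inventions for this example: PROBE_CMD is not an ffprobe feature, it simply defaults to ffprobe and lets the function be tried with a stand-in command.

```shell
# Hypothetical helper: save everything ffprobe reports about a file
# into a temporary file, and print that temporary file's name.
# PROBE_CMD is an illustrative override that defaults to ffprobe.
probe_to_file() {
    local report
    report=$(mktemp) || return 1
    # 2> sends stderr, where ffprobe writes its report, to the file.
    # stdout is discarded so that only the filename is printed below.
    "${PROBE_CMD:-ffprobe}" -i "$1" > /dev/null 2> "$report"
    printf '%s\n' "$report"
}

# Possible usage:
#   report=$(probe_to_file "$filename")
#   grep -i title "$report"
#   rm -f "$report"
```

The advantage of a file is that it can be searched several times, once per tag, without running ffprobe again.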

The next problem is that naive use of grep to select lines is likely to collect too much data. For example, grep -i title will match not only the title tag, but the title of the cover image. grep -i album will match the album tag, but it will also match "album artist". Without complex parsing, I don't see any simple way to deal with the first issue, that the same tag name appears several times. In practice, however, I've found that the first "title" in the ffprobe output is always the one I actually want, so we can just limit grep to one match:

ffprobe -i "$filename" 2>&1 | grep --max-count=1 ... 

The second complication, where the same string of text might appear in multiple tags, can be handled using a properly-crafted match expression. For example, to match "album : xxxx", but not "album artist : xxx" we can check that the colon character is in the right place. So this should match (only) the album tag:

ffprobe -i "$filename" 2>&1 | grep --max-count=1 -i album\\s*:

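The -i makes the match case-insensitive, which is needed in general. Because the pattern is not quoted, the shell reduces \\s*: to \s*: before grep sees it. That expression matches zero or more whitespace characters followed by a colon (\s is a GNU grep extension).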

The output of this command will be:

    album           : Pink Floyd - The Dark Side Of The Moon
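The matching step can be tried without any audio files at all, by feeding grep some text in the same layout. In the sketch below, tag_line is a made-up function name, the sample text is a stand-in for real ffprobe output with "album artist" deliberately placed first, and the pattern is a variation on the one above: it is quoted, it uses the portable [[:space:]] class instead of \s, and it is anchored to the start of the line so that the tag name must be the first thing on it.

```shell
# Hypothetical helper: read ffprobe-style text on stdin and print the
# first line whose tag name is exactly $1 (ignoring case). The name is
# pasted into a regular expression, so it should not contain regex
# metacharacters.
tag_line() {
    grep -m 1 -i "^[[:space:]]*$1[[:space:]]*:"
}

# Sample text standing in for real ffprobe output.
sample='    album artist    : Pink Floyd
    album           : Pink Floyd - The Dark Side Of The Moon
    title           : Money
    comment         : Cover
    title           : cover'

printf '%s\n' "$sample" | tag_line album   # the album line, not "album artist"
printf '%s\n' "$sample" | tag_line title   # the first title line only
```

Here -m 1 is the short form of --max-count=1.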

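There are many possible ways to separate the actual tag text from the tag name. Here is a simple method using cut. The field specification 2- selects everything after the first colon, so a value that itself contains a colon (as many classical titles do) is not truncated:

ffprobe -i "$filename" 2>&1 | grep --max-count=1 -i album\\s*: \
  | cut -f 2- -d :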

The output is:

 Pink Floyd - The Dark Side Of The Moon

This text begins with a space, because cut does not trim spaces from its output. To do that, we can use sed (among many other tools):

ffprobe -i "$filename" 2>&1 | grep --max-count=1 -i album\\s*: \
  | cut -f 2- -d : | sed 's/^ *//;s/ *$//' 

This sed invocation has two search-and-replace expressions -- one for the spaces at the beginning of the line, and one for those at the end.

This simple(ish) command will do the right thing for the "album" tag -- it will produce the album name, stripped of whitespace, to stdout, ready to be used in another command or script.
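To avoid retyping the pipeline for every tag, it can be wrapped in a function. This is a sketch rather than a finished tool: the names tag_value and get_tag are made up, and the pattern differs slightly from the one above in being quoted, anchored to the start of the line, and written with [[:space:]]. The cut step uses the field list 2- so that a value containing colons is kept whole.

```shell
# Hypothetical helper: read ffprobe-style text on stdin and print the
# value of tag $1, trimmed of surrounding whitespace.
tag_value() {
    grep -m 1 -i "^[[:space:]]*$1[[:space:]]*:" \
        | cut -d : -f 2- \
        | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
}

# Hypothetical wrapper: get_tag FILE TAG runs ffprobe on FILE and
# prints the value of TAG.
get_tag() {
    ffprobe -i "$1" 2>&1 | tag_value "$2"
}
```

A script could then say album=$(get_tag "$filename" album) and use "$album" (in double-quotes, as always) when building a directory name.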

Getting the title, genre, and composer metadata is simply a matter of substituting the appropriate text for "album" in the command. Artist/performer is a lot more complicated, because there's far less standardization in this area. In practice, you'll need to search for a number of different tags -- artist, album artist, album_artist, performer (at least) -- and make a decision about which to use. To some extent that decision will have to be guided by the player software you use. My Astell & Kern player gets very confused if different files in the same directory have different "album artist" tags, so I generally use this tag (allowing for the variations between formats) as the general "artist" name. The scripting here isn't very interesting -- it's just a bunch of IF...THEN clauses. Implementing this is left as an exercise for the reader.
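For readers who would like a starting point for that exercise, here is one possible sketch. The function names are made up, and the order in which the candidate tags are tried is an assumption reflecting the preference for "album artist" described above; adjust it to suit your own player.

```shell
# Hypothetical helper: print the trimmed value of tag $1 from
# ffprobe-style text on stdin.
tag_value() {
    grep -m 1 -i "^[[:space:]]*$1[[:space:]]*:" \
        | cut -d : -f 2- \
        | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
}

# Hypothetical helper: read ffprobe-style text on stdin and print the
# first non-empty value among several candidate tags. The order of the
# candidates is an assumption, not a standard.
pick_artist() {
    local text name value
    text=$(cat)
    for name in 'album artist' 'album_artist' 'artist' 'performer'; do
        value=$(printf '%s\n' "$text" | tag_value "$name")
        if [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    # None of the candidate tags was present.
    return 1
}
```

It could be used as artist=$(ffprobe -i "$filename" 2>&1 | pick_artist), with the non-zero exit status signalling that no suitable tag was found.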