A Day In The Life

by Johan Vromans

Document $Revision: 1.4 $.

Introduction

When I bought myself an Archos Jukebox (Recorder 20) MP3 player, I quickly found out that its 20 GB disk would not be big enough to hold all my MP3 files. So the need arose to bring more structure to my collection of MP3 files.

For convenience, all MP3 files are maintained on my workstation, running RedHat GNU/Linux. A subset of the MP3 files is synchronised with the Jukebox MP3 player. No maintenance takes place on the player itself.

The MP3 player is running Rockbox, a superior replacement for the standard Archos firmware.

The Data Model

The basic entity for maintenance is the album, that is, an ordered set of MP3 data. In terms of a file system, the album is a directory with MP3 files and, as we will see later, other information.

The albums are organised in a four-level hierarchy:

Group/category/Artist_Name/Album_Title

Group

The Group indicates the status and/or origin of the MP3 files.

Currently implemented groups are:

  • Rips - The collection of ripped CDs.
  • Archive - The collection of MP3s from other origins.
  • New - As yet unclassified MP3s.
  • Rejects - Music I didn't like, waiting to be thrown away.

There is one additional 'pseudo-group', Player, which contains all the albums that are currently on the MP3 player. It does not contain any real data, only symbolic links to the real data.

Category

The Category defines the kind of music.

Putting music into categories is a hard and thankless task. Most of all, it depends on one's personal taste and mood. I use the following categories:

  • blues
  • celtic
  • jazz
  • newage
  • other

Artist

The Artist is the name of the artist, performer, or whatever is appropriate for this album.

I use the names as they appear on the albums as much as possible. So Miles Davis is stored as "Miles Davis", although "Davis, Miles" would improve the sorting order. Prefixes like "The" and suffixes like "Trio" and "Quartet" are left out in most cases, depending on common use. This sometimes results in inconsistent naming, like "The Alan Parsons Project" and "The Beatles".

All words of the name are written in title case.

Collections

For collections (albums with tracks by different artists), the artist is set to "Various".

Album

For the name of the Album the same rules are used as with Artist.

In addition, when an album is part of a multi-disc set, the title is suffixed by ", Disc N", where N is the disc number.

Tracks

For the title of the Tracks the same rules are used as with Artist.

For multi-title tracks a slash is used to separate the titles.

For multi-track titles, a suffix ", Part N" is preferred.

Using ID3 Tags

MP3 files can contain so-called ID3 tags that contain meta-information about the tracks. Most commonly, this includes:
  • the artist or performer;
  • the album title;
  • the track title;
  • the track number.

For consistency, all MP3 tracks get the ID3 tags, both version 1 and version 2, filled in according to the above rules.

For collections the ID3 tags for artist and track title are set to the "real" artist and track title.

Making File Names

When forming file names from artist names, album names, and track titles, special care is taken to avoid problematic characters. Since the files are eventually stored on the Jukebox, which has a VFAT file system, the following characters are replaced or left out: spaces (replaced by an underscore), / (slash, replaced by a semicolon), \ (back-slash, likewise replaced by a semicolon), : (colon, replaced by an underscore), and " (quote), ` (back-quote), ? (question mark), ! (exclamation mark), | (vertical bar) and trailing periods (all left out).
Note that the ' (apostrophe) is allowed, since it occurs very often in song titles. The character set used is ISO-8859-1 (Latin-1).

The resulting file name can always be used safely on the Unix command line by putting double quotes around it.

For artist names and album titles, the file name corresponds to the name and title. "John Mayall" becomes John_Mayall.

File names for tracks have a two-digit track number prepended, separated by an underscore. In the case of a collection album, the name of the artist is added in front of the track title, separated from it by a double underscore.

Some examples of file names and the corresponding ID3 information:

John_Mayall/The_Collection/10_The_Super-Natural.mp3

  artist: John Mayall
  album:  The Collection
  track:  The Super-Natural
  number: 10

Various/Acoustic_Blues_Instrumentals/09_Etta_Baker__Carolina_Breakdown.mp3

  artist: Etta Baker
  album:  Acoustic Blues Instrumentals
  track:  Carolina Breakdown
  number: 9
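
The naming rules can be sketched in a few lines of Python. This is only an illustration, not the code used by the tools: the names make_name and track_file_name are hypothetical, and the sketch assumes that the back-slash gets a semicolon like the slash, that characters without a stated replacement are simply dropped, and that artist and title are joined by a double underscore as in the second example.

```python
# Characters with a stated replacement; the other unwanted ones are dropped.
REPLACED = {" ": "_", "/": ";", "\\": ";", ":": "_"}
DROPPED = set('"`?!|')


def make_name(text):
    """Turn an artist name, album title or track title into a file name part."""
    name = "".join(REPLACED.get(ch, ch) for ch in text if ch not in DROPPED)
    return name.rstrip(".")  # trailing periods are left out


def track_file_name(number, title, artist=None):
    """Build NN_Title.mp3, or NN_Artist__Title.mp3 for a collection album."""
    title_part = make_name(title)
    if artist is not None:
        # Assumption: double underscore between artist and title.
        title_part = make_name(artist) + "__" + title_part
    return "%02d_%s.mp3" % (number, title_part)


print(track_file_name(10, "The Super-Natural"))
print(track_file_name(9, "Carolina Breakdown", artist="Etta Baker"))
```

The two calls print the track file names of the examples: 10_The_Super-Natural.mp3 and 09_Etta_Baker__Carolina_Breakdown.mp3.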

The Internet CD Databases

There are several CD databases available on the internet. These sites provide information for a gigantic collection of CDs. The two most popular are GraceNote's CDDB.com and its free counterpart, FreeDB.org.

The canonical "unit of information" is the so-called CDDB entry. This is specially formatted data that contains information about the CD, like artist, album, tracks, comments, and some details about the layout of the CD.

The CDDB entries play an important part in managing the collections of MP3 files.

An example of CDDB entry data:

# xmcd CD database file generated by Grip 3.0.0
# 
# Track frame offsets:
#       150
#       90165
# 
# Disc length: 2822 seconds
# 
# Revision: 11
# Processed by: cddbd v1.4PL0 Copyright (c) Steve Scherf et al.
# Submitted via: Grip 3.0.0
# 
DISCID=060b0402,070b0402
DTITLE=Miles Davis / Bitches Brew, Disc 1
TTITLE0=Pharaoh's Dance
TTITLE1=Bitches Brew
EXTD=ID3G:   8  Prepared for cd by Teo Macero. Engineered by Ray Moore\nCol
EXTD=umbia Records  G2K 40577
EXTT0=
EXTT1=
PLAYORDER=

Sometimes it is not possible to get CDDB information, for example when creating a custom CD. For this purpose I developed a fallback 'pseudo' CDDB format. It is just a header line and a list of tracks.

An example of CDDB fallback data:

Miles Davis / Bitches Brew, Disc 1

   1. Pharaoh's Dance
   2. Bitches Brew

Prepared for cd by Teo Macero. Engineered by Ray Moore
Columbia Records  G2K 40577
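
Both formats are easy to read programmatically. The following Python sketch shows one possible way; it is not the code used by the tools. The function names and the returned dictionary layout are hypothetical, and the fallback parser assumes the layout seen in the example: a header line, numbered track lines, and free text as the comment.

```python
import re


def parse_cddb(text):
    """Parse CDDB entry data (KEY=VALUE lines, '#' starts a comment line)."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Long values are continued by repeating the key (see EXTD above).
        fields[key] = fields.get(key, "") + value
    artist, _, album = fields.get("DTITLE", "").partition(" / ")
    tracks = []
    while "TTITLE%d" % len(tracks) in fields:
        tracks.append(fields["TTITLE%d" % len(tracks)])
    comment = fields.get("EXTD", "").replace("\\n", "\n")
    return {"artist": artist, "album": album, "tracks": tracks,
            "comment": comment}


def parse_fallback(text):
    """Parse the fallback format: header line, numbered tracks, free text."""
    lines = text.splitlines() or [""]
    artist, _, album = lines[0].partition(" / ")
    tracks, comment = [], []
    for line in lines[1:]:
        match = re.match(r"\s*\d+\.\s+(.*)", line)
        if match:
            tracks.append(match.group(1))
        elif line.strip():
            comment.append(line.strip())
    return {"artist": artist, "album": album, "tracks": tracks,
            "comment": "\n".join(comment)}
```

Returning the same structure from both functions means the rest of a program does not need to know which of the two files was found.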

The Repository

As described earlier, the Group, category, Artist and Album form the path to the actual collection of MP3 data files for a specific album.

Besides the MP3 files, the following meta-data is stored here, in the form of hidden files:

  • .cddb - the CDDB entry data for this collection (CD).
  • .nocddb - the CDDB fallback data, if no .cddb is available.
  • .verified - a timestamp when this data was verified.
  • .playlist.m3u - a playlist for music programs, like XMMS and WinAmp.
  • .playlist.html - a pseudo-playlist to be used with a browser.
  • .onPlayer - only if this album is currently on the MP3 player.
  • .cache - cached data for maintenance programs.

The mp3ov program processes the hierarchy of files and creates an index.html with a list of all albums, including icons that show the status of each album and that can be clicked to play it. It will create the .cache and .playlist files if necessary.

Note: the playlist files refer to a local HTTP server to serve the MP3 data.

Classification Icons

Each entry in the main index page has an icon that designates its status. A hand with thumb up indicates an approved entry. Thumb down indicates a rejected entry, and no thumb indicates a new, as yet unclassified entry. If an album is not complete (one or more tracks are missing), the hand is missing a finger. If the sleeve is red, the entry is currently on the MP3 player.

Synchronisation

When synchronisation with the MP3 player is desired, a shadow hierarchy of all MP3 files is constructed by traversing the groups (New, Archive, Rips and Rejects) and symlinking all the directories that contain an .onPlayer file. This forms the pseudo-group Player, which can be synchronised using the rsync program.
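
Building the shadow hierarchy could look roughly like the Python sketch below. It is an illustration only: the function name build_player_group is hypothetical, and the layout Player/category/Artist/Album is an assumption, since the exact layout of the Player group is not described here. Stale links from earlier runs are not cleaned up.

```python
import os

GROUPS = ("New", "Archive", "Rips", "Rejects")


def build_player_group(root, player="Player"):
    """Symlink every album directory that holds an .onPlayer file."""
    linked = []
    for group in GROUPS:
        group_dir = os.path.join(root, group)
        for dirpath, dirnames, filenames in os.walk(group_dir):
            if ".onPlayer" not in filenames:
                continue
            # Assumed layout: Player/<category>/<Artist>/<Album>
            rel = os.path.relpath(dirpath, group_dir)
            link = os.path.join(root, player, rel)
            os.makedirs(os.path.dirname(link), exist_ok=True)
            if not os.path.islink(link):
                os.symlink(os.path.abspath(dirpath), link)
            linked.append(rel)
            dirnames[:] = []  # do not descend into the album itself
    return sorted(linked)
```

Because the links point to directories, a copy program has to follow them to reach the real MP3 files.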

I automount the Archos Jukebox on /misc/usd1, so the rsync command needed for synchronisation is:

    rsync -av --delete --modify-window=1 --copy-links \
	  --exclude='.??*' --exclude='*~' \
	  --exclude=\*.ajz --exclude=\*.ucl --exclude=.rockbox \
	  Player/ /misc/usd1/

Note that excluding *.ajz, *.ucl and .rockbox is mandatory. Without these exclusions, the --delete option would cause the Rockbox files to be removed from the player.

The Tools

File Management

File management is implemented by the mp3fm tool. It supports the following commands:
  • add directory

    Add this repository item to the MP3 player.

    Directory is the location of the repository item, i.e. the place where the MP3 files and meta data are stored.

  • del directory

    Remove this repository item from the MP3 player.

  • accept directory

    Move this repository item from the 'New' group to the 'Archive' group.

    The item will be removed from the MP3 player as well.

  • reject directory

    Move this repository item from the 'New' group to the 'Rejects' group.

    The item will be removed from the MP3 player as well.
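
A minimal Python sketch of these four commands is shown below. It is not the mp3fm source. It assumes that "on the player" is recorded by the .onPlayer file in the album directory and that the group is the fourth path component from the end; the function names are hypothetical.

```python
import os
import shutil

MARKER = ".onPlayer"  # assumption: this file marks an album as "on the player"


def add(directory):
    """Mark a repository item for the MP3 player."""
    open(os.path.join(directory, MARKER), "a").close()


def delete(directory):
    """Remove the mark again; a missing mark is ignored."""
    try:
        os.remove(os.path.join(directory, MARKER))
    except FileNotFoundError:
        pass


def move_to_group(directory, target_group):
    """Move .../New/<category>/<Artist>/<Album> to another group."""
    parts = os.path.normpath(directory).split(os.sep)
    if len(parts) < 4 or parts[-4] != "New":
        raise ValueError("%s is not in the 'New' group" % directory)
    parts[-4] = target_group
    destination = os.sep.join(parts)
    os.makedirs(os.path.dirname(destination), exist_ok=True)
    shutil.move(directory, destination)
    delete(destination)  # the item leaves the player as well
    return destination


def accept(directory):
    return move_to_group(directory, "Archive")


def reject(directory):
    return move_to_group(directory, "Rejects")
```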

ID3 Tag Management

ID3 tag management is implemented by the id3tag tool. It can be run in interactive mode or verification mode.

In interactive mode, id3tag requires the name of a directory that contains the MP3 files and meta data. It loads the CDDB data (or the fallback data if no .cddb file is present) and examines all the MP3 files. For each of the files it verifies whether the artist, album, track title and number correspond to the CDDB data and offers to change them. It also offers to rename the MP3 file to its correct form.

Example of an id3tag run (with comments added):

$ ls New/jazz/Miles_Davis/Bitches_Brew,_Disc_1
01_Faraoh's_Dance.mp3
02_Bitches_Brew.mp3
$ id3tag New/jazz/Miles_Davis/Bitches_Brew,_Disc_1
CD: Miles Davis / Bitches Brew, Disc 1; 2 tracks
This is the information derived from the CDDB data.
 1: Pharaoh's Dance
The first track, according to the CDDB data. Since the name of the MP3 file is 01_Faraoh's_Dance.mp3, it is not automatically found.
 2: Bitches Brew
    File: 02_Bitches_Brew.mp3
    Change track title "Bitches Brew" to Bitches Brew?
    Change track number "2" to 2? 
The second track is found, and the program offers to change the track name and number. The suggestions given are presented using the readline routines and can easily be edited on-line.
--: File: 01_Faraoh's_Dance.mp3
    Change track title "Faraoh's Dance" to Faraoh's Dance? 1
This is a file the program could not find a track title for. So it offers a replacement. By answering 1 we indicate that this is track 1, and the program corrects the question:
    Change track title "Faraoh's Dance" to Pharaoh's Dance? 
    Change track number "1" to 1?
    Updating 01_Faraoh's_Dance.mp3
The ID3 information is updated, and it now asks to rename the file:
    Rename 01_Faraoh's_Dance.mp3 to 01_Pharaoh's_Dance.mp3 ?

In verification mode, no questions are asked but all the MP3 data files, their ID3 tags and the CDDB data are verified for consistency. If the validation succeeds, id3tag adds a file .verified to the meta data.
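
How id3tag pairs files with CDDB tracks is not spelled out here. The Python sketch below shows one plausible approach that reproduces the behaviour of the example run: files are matched to track titles by name, and files without a match are reported separately. The function match_tracks and its comparison rule (letters and digits only, case ignored) are assumptions, not the actual implementation.

```python
import re


def match_tracks(titles, file_names):
    """Return ({track number: file name}, [file names without a track])."""

    def key(text):
        # Compare on ASCII letters and digits only, ignoring case.
        return re.sub(r"[^a-z0-9]", "", text.lower())

    wanted = {key(title): n for n, title in enumerate(titles, start=1)}
    matched, unmatched = {}, []
    for name in sorted(file_names):
        parts = re.match(r"\d+_(.*)\.mp3$", name)
        number = wanted.get(key(parts.group(1))) if parts else None
        if number is None:
            unmatched.append(name)
        else:
            matched[number] = name
    return matched, unmatched
```

With the titles and files of the example run, track 2 is matched and 01_Faraoh's_Dance.mp3 ends up in the unmatched list, which is the point where the program asks the user for help.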

Combining CDs

When using an MP3 player, why not combine multi-CD albums into one?

This is, indeed, very handy and easy to accomplish. In particular, if you have a system that supports links, you can link the original tracks into the new location so it doesn't cost you any additional disk space.

The utility mp3combine can be helpful in combining two or more directories into one. Files from the first directory will be prefixed with 1, the second with 2, and so on.

For example:

    $ cd Pink_Floyd
    $ ls The_Wall,_Disc_1
    01_In_The_Flesh.mp3
    02_The_Thin_Ice.mp3
    03_Another_Brick_In_The_Wall_(Part_1).mp3
    ...
    $ ls The_Wall,_Disc_2
    01_Hey_You.mp3
    02_Is_There_Anybody_Out_There.mp3
    03_Nobody_Home.mp3
    ...
    $ mkdir The_Wall
    $ cd The_Wall
    $ mp3combine --verbose ../The_Wall,_Disc_1 ../The_Wall,_Disc_2
    Disc 1: Pink Floyd / The Wall, Disc 1
    Disc 2: Pink Floyd / The Wall, Disc 2
    $ ls
    101_In_The_Flesh.mp3
    102_The_Thin_Ice.mp3
    103_Another_Brick_In_The_Wall_(Part_1).mp3
    ...
    201_Hey_You.mp3
    202_Is_There_Anybody_Out_There.mp3
    203_Nobody_Home.mp3
    ...

Additionally, the program will create fake .nocddb and .verified files that reflect the new structure.
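
The linking step itself is simple. The Python sketch below illustrates it under a few assumptions: it uses hard links (so all directories must be on the same file system), it only handles the .mp3 files, and it does not create the .nocddb and .verified files. The function name combine is hypothetical; this is not the mp3combine source.

```python
import os


def combine(target, sources):
    """Link the MP3 files of each source directory into target."""
    created = []
    for disc, source in enumerate(sources, start=1):
        for name in sorted(os.listdir(source)):
            if not name.lower().endswith(".mp3"):
                continue  # skip .cddb and other meta data
            new_name = "%d%s" % (disc, name)  # 01_... becomes 101_...
            os.link(os.path.join(source, name),
                    os.path.join(target, new_name))
            created.append(new_name)
    return created
```

Since the disc number is put in front of the two-digit track number, the combined files still sort in playing order.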

Getting The MP3 Data

Several programs exist that can extract the audio data from CDs. Most of them are capable of directly producing MP3 data. Later we'll see why it can be advantageous to extract the audio data in 'raw' WAV format, and compress it explicitly.

Sometimes the extraction program can query the CDDB internet databases and store the audio data in files with names that reflect the artist, album, and track title. If not, you'll end up with a series of files like Track01.mp3, Track02.mp3 and so on.

If the new MP3 files have reasonable ID3 tag values, the mp3rename program can be used to rename the files into an Artist/Album/NN_Track_Title.mp3 form.

The getcddb program examines a directory with MP3 files and queries the CDDB servers. It then stores all CDDB entry data in the directory, with file names .cddb00, .cddb01 and so on.

The next step is to select one of the CDDB data files, rename it to .cddb and edit it. I change the artist, album and track title information to match the conventions described above. Then the id3tag program can be used to change the ID3 data and file names.

In case no CDDB entry can be found, it is worth trying Google or any other search engine to find the track titles, or typing them in manually from the CD cover, in CDDB fallback format (.nocddb file).

The Continuous Play Problem

Tracks that are ripped as separate MP3 files cannot be played one after another without small audible interruptions. This is only a problem when ripping a CD with continuous music, for example a live performance or a concept CD.

There are three ways to produce MP3 files that can be played without interruptions.

  • Rip the CD into one big MP3 file. This is always guaranteed to work. Most extraction programs can do this.
  • Rip it into one big MP3 file, and use special tools like mp3cut to cut it into smaller MP3 files.
  • Rip it into WAV files, and compress all the files in one run of the lame compressor, using the --nogap command line option.

MP3Gain: Smoothing the volume

In real life, the volume of MP3 rips can differ significantly. Although this may not pose a real problem, it is quite annoying: every time you start playing a CD you need to adjust the volume.

There is a nifty tool called MP3Gain that can make all the MP3 files sound at a more or less equal volume. It runs on Linux, Windows and Mac. From the web site: "MP3Gain does not just do peak normalization, as many normalizers do. Instead, it does some statistical analysis to determine how loud the file actually sounds to the human ear. Also, the changes MP3Gain makes are completely lossless. There is no quality lost in the change because the program adjusts the mp3 file directly, without decoding and re-encoding."

I have made it a habit to apply MP3Gain to all my rips. I have it normalise the volume per album, not per track, to retain the relative volume of the tracks of the album.

In Real Life

These are the steps as I execute them:
  • Create the directory where the files must be placed, e.g.
    $ mkdir -p Dead_Can_Dance/The_Serpent\'s_Egg
    $ cd Dead_Can_Dance/The_Serpent\'s_Egg
  • Rip the CD into WAV files:
    $ cdda2wav -B
    100%  track  1 successfully recorded
    100%  track  2 successfully recorded
    100%  track  3 successfully recorded
    ...
  • Compress the WAV files into MP3 files:
    $ lame -h -b 192 --nogap *.wav
    Note: Disabling VBR Xing/Info tag since it interferes with --nogap
    LAME version 3.93  (http://www.mp3dev.org/)
    Using polyphase lowpass  filter, transition band: 18671 Hz - 19205 Hz
    Encoding audio_01.wav to audio_01.mp3
    Encoding as 44.1 kHz 192 kbps stereo MPEG-1 Layer III (7.3x) qval=2
        Frame          |  CPU time/estim | REAL time/estim | play/CPU |    ETA 
     14506/14509 (100%)|    0:39/    0:39|    1:20/    1:20|   9.4875x|    0:00 
    ...
  • Remove the WAV and other unnecessary files:
    $ rm *.wav *.inf
  • Get CDDB data:
    $ getcddb .
    Found 10 matches
      Created .cddb00: rock a1087b0a Dead Can Dance / The Serpent's Egg
      Created .cddb01: misc 94087d0a Dead Can Dance / The Serpent's Egg
      ...
  • Select a good CDDB entry, rename it to .cddb, and make the final edits. Remove the unneeded .cddbNN files.

  • Adjust the ID3 tags and rename the tracks:
    $ id3tag .
    CD: Dead Can Dance / The Serpent's Egg; 10 tracks
     1: The Host Of Seraphim
     2: Orbis De Ignis
     3: Severance
     4: The Writing On My Father's Hand
     5: In The Kingdom Of The Blind The One-eyed Are Kings
     6: Chant Of The Paladin
     7: Song Of Sophia
     8: Echolalia
     9: Mother Tongue
    10: Ullyses
    --: File: audio_03.mp3
        Change track title "" to Severance? 
        Change track number "" to 3? 
        Updating: audio_03.mp3
        Rename audio_03.mp3 to 03_Severance.mp3? 
    --: File: audio_04.mp3
        Change track title "" to The Writing On My Father's Hand? 
        Change track number "" to 4? 
        Updating: audio_04.mp3
        Rename audio_04.mp3 to 04_The_Writing_On_My_Father's_Hand.mp3? 
    ...
  • Run the verification:
    $ id3tag --vfy .
    CD: Dead Can Dance / The Serpent's Egg; 10 tracks
     1: The Host Of Seraphim
        File: 01_The_Host_Of_Seraphim.mp3
     2: Orbis De Ignis
        File: 02_Orbis_De_Ignis.mp3
     3: Severance
        File: 03_Severance.mp3
    ...
  • Finally, the volume normalisation:
    $ mp3gain -a -p -c *.mp3
    01_The_Host_Of_Seraphim.mp3
    02_Orbis_De_Ignis.mp3                            
    03_Severance.mp3                                 
    ...
    Applying mp3 gain change of -1 to 01_The_Host_Of_Seraphim.mp3...
    Applying mp3 gain change of -1 to 02_Orbis_De_Ignis.mp3...
    Applying mp3 gain change of -1 to 03_Severance.mp3...
    ...
    

Generating Playlists

Playlists, as used on the Archos Jukebox, are just lists of MP3 file names. For example, if you store the MP3 files in the top directory "music" and generate a playlist, the file will contain names that all start with "/music/...".

Since I maintain all the MP3 data on the hard disk of the PC, and copy selected portions to the Archos MP3 player, it is straightforward to create playlists on the PC. For example, I like to have big playlists (with over 1000 entries) that contain all kinds of tracks in random order, so I can play them in the car. With the new Rockbox Bookmarks, I can stop and resume multiple playlists simultaneously.

I wrote a small tool playlist.pl (a Perl program), that takes a list of directories that contain mp3 files, gathers the mp3 files from these directories, and creates a shuffled playlist that you can copy to your Rockbox. The program should work on Windows as well, although I have not been able to verify this.

All my Rockbox files reside in a directory /sp3/Player, and the following command:

  perl playlist.pl --strip /sp3/Player /sp3/Player/music/*/* > playlist.m3u

will write the desired playlist to the named file. The -v option will make the program respond with a small statistics line:

  Playlist contains 450 tracks from 39 directories

The function of the --strip option is to strip the path to the files on my hard disk from the names in the playlist; in other words, to make the playlist reflect the situation on the Archos instead of my hard disk.

It is also possible to manually create a list of files and directories, and feed them to the standard input:

   perl playlist.pl < filelist.dat > playlist.m3u

I use this to create preselected sets of albums that I want to generate the playlist from, or to exclude some albums from a large list.
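
For readers who prefer to see the idea in code, here is a Python sketch of what such a playlist generator does. It is not a translation of playlist.pl; the function name make_playlist and its parameters are hypothetical, with strip playing the role of the --strip option.

```python
import os
import random


def make_playlist(paths, strip="", seed=None):
    """Return a shuffled list of the MP3 files found in the given paths."""
    tracks = []
    for path in paths:
        if os.path.isdir(path):
            names = [os.path.join(path, n) for n in sorted(os.listdir(path))]
        else:
            names = [path]  # a single file was listed
        tracks.extend(n for n in names if n.lower().endswith(".mp3"))
    if strip:
        # Make the names reflect the situation on the player.
        tracks = [t[len(strip):] if t.startswith(strip) else t
                  for t in tracks]
    random.Random(seed).shuffle(tracks)
    return tracks
```

Writing the result to a file, one name per line, gives a playlist in the form described above; passing a seed makes the shuffle repeatable, which is handy for testing.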

Future

Currently, I'm working on a database application to maintain all the information. Take a look at the Domain model and the DBMS model if you're interested in these things. The pictures are generated by OptimalJ from Compuware.

The database I use is PostgreSQL, a powerful, industrial strength and Open Source database system.

Tools

All tools discussed above are either self-developed or otherwise free (as in speech) and available on the Internet. I advise against the use of proprietary software tools.

Tool                  Origin                  Remarks
mp3info               Self                    Perl; CPAN
mp3rename             Self                    Perl; CPAN
id3tag                Self                    Perl
mp3ov                 Self                    Perl
mp3fm                 Self                    Perl
mp3link               Self                    Perl
mp3combine            Self                    Perl
playlist              Self                    Perl; source
mp3cut                Self                    Perl; CPAN
mp3getcddb            Self                    Perl; CPAN
cdda2wav (cdrtools)   Heiko Eissfeldt a.o.    C; Linux; Windows; ...
lame                  The LAME Project        C; Linux; Windows; ...
mp3gain               MP3Gain                 C; Linux; Windows; ...

Contact the author for details.


Last modified: Sat Nov 18 21:27:24 CET 2006


© Copyright 2003-2018 Johan Vromans. All Rights Reserved.
articles/mp3tools/mp3tools.html last modified 20:52:55 10-Aug-2011