Bug 11289 - scanner ignores first line of cuesheet file, if it has a UTF-8 BOM
Summary: scanner ignores first line of cuesheet file, if it has a UTF-8 BOM
Status: RESOLVED FIXED
Product: Logitech Media Server
Classification: Unclassified
Component: Scanner
Version: 7.4.0
Hardware: PC Debian Linux
Importance: P3 normal
Target Milestone: 7.7.0
Assigned To: Andy Grundman
Depends on:
Blocks:
Reported: 2009-03-08 19:31 UTC by Calum Mackay
Modified: 2011-09-19 06:34 UTC

See Also:
Category: ---


Attachments
cuesheet starting with embedded UTF-8 BOM (2.52 KB, application/octet-stream)
2009-03-09 14:16 UTC, Calum Mackay

Description Calum Mackay 2009-03-08 19:31:40 UTC
I'm using 7.4~25367 on a Debian system.

When scanning a dir with a flac file (single file-per-cd) plus separate cuesheet, I've noted that SC ignores the first line of the file, if the file has a UTF-8 BOM.

Whilst that BOM is arguably useless (it's not really acting as a byte-order marker, given that UTF-8 doesn't need one), it's also not that uncommon, and perhaps SC could be a little more tolerant?

as an example, if I have a cue file that starts:

00000000  ef bb bf 54 49 54 4c 45  20 22 66 6c 61 63 2d 31  |...TITLE "flac-1|

SC does not "see" the TITLE tag used in this first line. Note the 0xefbbbf as the first three bytes, which is the UTF-8 BOM.

It would really help if SC could just ignore the UTF-8 BOM, if it's found.

thanks much.
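
One plausible reason the first line goes missing, sketched in Python: when the file is decoded as UTF-8, the BOM survives as the character U+FEFF at the start of the first line, and a matcher anchored at the start of the line no longer sees TITLE. The regular expression and the find_title helper are assumptions made for illustration (the scanner's actual parser is not shown in this report), and the album title is made up.

```python
import re

# Hypothetical line matcher; NOT taken from the scanner's source.
TITLE_RE = re.compile(r'^\s*TITLE\s+"(.*)"')


def find_title(first_line):
    """Return the quoted TITLE value, or None if the line does not match."""
    match = TITLE_RE.match(first_line)
    return match.group(1) if match else None


# Made-up first line of a cue sheet, preceded by the UTF-8 BOM bytes.
raw = b'\xef\xbb\xbfTITLE "Example Album"'

# The plain "utf-8" codec keeps the BOM as U+FEFF, which \s does not match.
line = raw.decode("utf-8")
title_with_bom = find_title(line)

# Dropping the leading U+FEFF makes the same line match again.
without_bom = line.lstrip("\ufeff")
title_without_bom = find_title(without_bom)
```

Here title_with_bom is None while title_without_bom holds the title, which matches the symptom described above.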
Comment 1 Spies Steven 2009-03-09 13:07:17 UTC
Calum, could you attach a cuesheet that has a UTF-8 BOM as an example?
Comment 2 Calum Mackay 2009-03-09 14:16:36 UTC
Created attachment 4901
cuesheet starting with embedded UTF-8 BOM

Attached a cuesheet starting with an embedded UTF-8 BOM. I've marked the attachment as a binary file, not text/plain, as it's obviously 8-bit, and I'm not sure that text/plain includes UTF-8.

This cuesheet is exactly as produced by XLD, the lossless accurate ripper application for MacOS. Note that this bug is filed against SC on Linux, however (though presumably it's generic).
Comment 3 Calum Mackay 2009-03-09 14:23:34 UTC
Note the first three bytes of the file are 0xefbbbf:

00000000  ef bb bf 54 49 54 4c 45  20 22 66 6c 61 63 2d 31  |...TITLE "flac-1|

This is the UTF-8 BOM.

The BOM is itself a little silly: UTF-16 requires a byte-order marker, but UTF-8 clearly does not, as it is already byte-oriented. The UTF-8 "BOM" exists nevertheless, and is generally used on non-8-bit-clean systems to identify a UTF-8 file. It's not much use on UNIX systems, but it is recognised as such:

   $ file my.cue
   my.cue: UTF-8 Unicode (with BOM) text

It would be nice if the SC scanner could recognise these three bytes as the UTF-8 BOM, and do the right thing (which is probably to just ignore them).
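
A minimal sketch of that "recognise and ignore" step, working on the raw bytes before any decoding. The function name strip_utf8_bom is hypothetical; codecs.BOM_UTF8 is the standard-library constant for the bytes EF BB BF.

```python
import codecs


def strip_utf8_bom(data):
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Only a BOM at the very start is removed; other input passes through unchanged.
with_bom = strip_utf8_bom(b'\xef\xbb\xbfTITLE')
no_bom = strip_utf8_bom(b'TITLE')
```

Checking only the start of the data keeps files without a BOM byte-for-byte identical, so the change would be safe for cue sheets produced by other tools.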

I believe the BOM was chosen as these 3 bytes since they cannot otherwise occur in a regular text file.
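
For reference, the three bytes are the UTF-8 encoding of the code point U+FEFF, so they can in principle also appear later in a text file wherever that character occurs. A short check, with made-up sample text:

```python
import codecs

# U+FEFF encoded as UTF-8 gives exactly the three BOM bytes.
bom_bytes = "\ufeff".encode("utf-8")

# The same byte sequence shows up mid-text if U+FEFF occurs there.
sample = "a\ufeffb".encode("utf-8")
appears_mid_text = codecs.BOM_UTF8 in sample[1:]
```

This is why a fix would normally strip the sequence only at the very start of the file rather than everywhere.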
Comment 4 Chris Owens 2009-03-30 17:32:41 UTC
Since there's now a planned 7.3.3 release, bugs which won't make the cut-off are being moved to the next target out.  If you feel that this bug needs to be addressed more (or less) urgently than the 7.4 release, please cc chris@slimdevices.com and leave a comment in the bug to that effect so we can review it.

Thanks.
Comment 5 Chris Owens 2009-03-31 08:55:12 UTC
For some reason Bugzilla did not change the target when I did this yesterday.  Or maybe it was me.  In either case, I'm trying it again.
Comment 6 Andy Grundman 2009-07-29 14:59:03 UTC
Moving 7.4 bugs to 8.0.
Comment 7 Chris Owens 2010-03-08 11:17:02 UTC
Moving P3 and lower bugs to next release target
Comment 8 SVN Bot 2011-09-19 06:34:16 UTC
 == Auto-comment from SVN commit #33477 to the slim repo by agrundman ==
 == http://svn.slimdevices.com/slim?view=revision&revision=33477 ==

Fixed bug 11289, strip BOM from first line of cue sheets
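
The commit itself is not reproduced here. As an illustration only, and not the code from that revision, the same idea can be expressed in Python with the standard "utf-8-sig" codec, which drops a leading BOM when one is present and otherwise behaves like plain UTF-8. The helper name decode_cue and the sample cue sheet contents are hypothetical.

```python
def decode_cue(data):
    """Decode cue sheet bytes to a list of lines, dropping a leading UTF-8 BOM."""
    # "utf-8-sig" strips the BOM only at the start of the input.
    return data.decode("utf-8-sig").splitlines()


# Made-up cue sheet contents, once with and once without a BOM.
body = b'TITLE "Example Album"\r\nFILE "example.flac" WAVE\r\n'
lines_with_bom = decode_cue(b'\xef\xbb\xbf' + body)
lines_without_bom = decode_cue(body)
```

Both calls yield the same list of lines, with TITLE at the start of the first one.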