Bug 3913 - Slimserver skipping over songs - readlen undef
: Slimserver skipping over songs - readlen undef
Status: RESOLVED FIXED
Product: Logitech Media Server
Classification: Unclassified
Component: Audio
: 6.5b1
: PC Ubuntu Linux
: P2 normal (vote)
: ---
Assigned To: Dan Sully
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2006-08-11 08:38 UTC by Steve Bernard
Modified: 2008-12-18 11:11 UTC (History)
1 user (show)

See Also:
Category: ---


Attachments
--d_source output (134.35 KB, text/plain)
2006-08-11 08:39 UTC, Steve Bernard
Details
Source.pm patch (897 bytes, patch)
2006-09-16 08:36 UTC, Steve Bernard
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Steve Bernard 2006-08-11 08:38:26 UTC
Running the latest updates from trunk for the last few weeks (currently on rev 8926), I have noticed that songs that I am playing on my SB1 (G) have randomly been skipped over.  For example, I'll be playing an album and it will finish playing track 3 and then start playing track 7.  If I use the remote or web interface and go back to the skipped tracks, they will play just fine.  This is happening when I transcode from FLAC to wav and from FLAC to MP3.  I have also tested it with softsqueeze, and I see the same results while transcoding to mp3 (though in my testing, it didn't skip while playing the FLACs directly).

I have run slimserver with --d_source in order to see what's going on and I'll attach a complete log to this bug report.  Every time a track is skipped, the log entry looks basically the same.  Like this:

2006-08-11 10:45:10.0295 Using command for conversion: "/opt/slimserver/Bin/i386-linux/flac" -dcs --force-raw-format --endian=little --sign=signed --skip=0 --until=-0 -- "/mnt/audio/flacs/A-M/The Dead Texan/The Dead Texan/09-The Dead Texan - La Ballade d'Alain Georges.flac" & |
2006-08-11 10:45:10.0618 openSong: Streaming with format: wav
2006-08-11 10:45:10.1712 readlen undef: (No child processes)10
2006-08-11 10:45:10.1727 end of file or error on socket, opening next song, (song pos: 0(tell says: . ), totalbytes: 0)
2006-08-11 10:45:10.1735 opening next song...
Comment 1 Steve Bernard 2006-08-11 08:39:07 UTC
Created attachment 1413 [details]
--d_source output
Comment 2 KDF 2006-08-11 08:47:43 UTC
another one that should have run through support first, to identify a bug instead of simply a support ticket-style report.
Comment 3 Chris Owens 2006-08-11 09:53:31 UTC
Hi Steve,

Have you tried running that flac commandline by itself as the user that Slimserver runs as on your system?

"/opt/slimserver/Bin/i386-linux/flac" -dcs --force-raw-format --endian=little --sign=signed --skip=0 --until=-0 -- "/mnt/audio/flacs/A-M/The Dead Texan/The Dead Texan/09-The Dead Texan - La Ballade d'Alain Georges.flac"

That last "--" with no actual option after it doesn't look good.  If that command fails, can you try it again with that removed?

I suppose that command may barf up a bunch of flac data on your screen, so you may want to do the old append "> /dev/null" on the end as well.  Errors should still show up on the screen (I think).

Once you get it going, try it a few times in a row to catch it if it's an intermittent problem.

From the log you posted, it kind of looks like Slimserver isn't able to read the files intermittently.  Are you having any filesystem problems?

You might also post to the beta forum to see if anyone else has noticed this issue, and encourage them to post their information here!
Comment 4 Steve Bernard 2006-08-11 11:15:55 UTC
Hrm.  Okay, the filesystem hasn't given me any trouble, no.  fsck checked fine.  I umounted and forced a check just to be sure.  Regarding that "--", my convert.conf entry for flac to wav looks like this:

flc wav * *
        [flac] -dcs --force-raw-format --endian=little --sign=signed --skip=$START$ --until=$END$ -- $FILE$

I haven't edited convert.conf, so that last "--" is in there before "$FILE$"

Okay, having remounted, I tried directly launching the file a dozen times from the commandline, and I didn't get any error messages.

I'm curious about the 'readlen undef: (No child processes)10', since I don't now exactly what every step in the server's playback routine is.  I'm assuming it's trying to get the length of the file in order to display the time remaining to show on the player display, although I'm not sure what exactly goes into that.  Is it trying to connect to the flac process that was started by the convert.conf command and not finding anything?  Or does it try to launch a separate process to get the file length?

Anywho, in the mean time I'll try playing some albums on the newly fsck'd drive and see if it still wants to act up.
Comment 5 Steve Bernard 2006-08-11 11:23:24 UTC
Okay, it just skipped 2 tracks in a row for a different album.  I guess I'll take this to the beta forum and see if anyone there has an insight.
Comment 6 Steve Bernard 2006-08-11 11:56:36 UTC
(In reply to comment #2)
> another one that should have run through support first, to identify a bug
> instead of simply a support ticket-style report.
> 

If this is the way bugs should be reported, you should change the sticky in the beta forum: http://forums.slimdevices.com/showthread.php?t=17483 that says,

"Please remember - if you have an issue with SlimServer, Squeezebox or SqueezeNetwork, post a bug report! That's the only way things will be fixed!

https://bugs-archive.lyrion.org/"
Comment 7 KDF 2006-08-11 12:39:21 UTC
maybe one day, someone who can change the wording will realise just how badly worded it is.
Comment 8 KDF 2006-08-11 12:47:35 UTC
readlen is actually the audio data itself.  The playback handles the audiostream in chunks when passing then onto the squeezebox.  readlen undef basically means that the attempt to grab the next bit of audio has come up with nothing.  "no child processes" would be the error reports by the attempt, and presumably the error code.

If you run slimserver manually, using slismerver.pl, you can see if there are any other dumps to the console when the problem happens.  The server logging is only able to trap errors that slimserver has access to.  With the transcoding, the applciation is run separately and if it crashes, errors are lost.  "no child processes" would seem to indicate that the flac process has gone away.  the console may show why.  It is also a good idea to check permissions, that the files are readable by the 'slimserver' user.
Comment 9 Chris Owens 2006-08-11 14:15:30 UTC
Dan changed the wording of the message.  I appreciate a user who explains he's trying to follow our process!
Comment 10 Steve Bernard 2006-08-13 13:01:17 UTC
(In reply to comment #8)
> readlen is actually the audio data itself.  The playback handles the
> audiostream in chunks when passing then onto the squeezebox.  readlen undef
> basically means that the attempt to grab the next bit of audio has come up with
> nothing.  "no child processes" would be the error reports by the attempt, and
> presumably the error code.
> 
> If you run slimserver manually, using slismerver.pl, you can see if there are
> any other dumps to the console when the problem happens.  The server logging is
> only able to trap errors that slimserver has access to.  With the transcoding,
> the applciation is run separately and if it crashes, errors are lost.  "no
> child processes" would seem to indicate that the flac process has gone away. 
> the console may show why.  It is also a good idea to check permissions, that
> the files are readable by the 'slimserver' user.
> 

Permissions seem to be fine.  As I mentioned before, the skipped files will play fine even if I just use the remote to go back to them.

The log I made earier was done by running 'slimserver.pl 2> ~/sslog.txt' to save the output.  My intention was that I could collect everything being sent to stderr.  Having just run a straight slimserver.pl, I don't have any extra info that was missing.

I just played albums off two different hard drives, and both had skipped songs.  As always, when I backed up with the remote, the songs played fine.

So, that suggests that it isn't a hard drive failure, unless both drives are failing in the same way.  Both drives are on the same IDE controller, so perhaps I'll put in another cable and swap them to another controller and see if there's any difference.
Comment 11 Chris Owens 2006-08-14 10:56:22 UTC
Just to keep things easy to find, the thread Steve started on the forums is at http://forums.slimdevices.com/showthread.php?p=128176
Comment 12 Steve Bernard 2006-09-02 11:30:08 UTC
I've been busy with a new job, and I've just gotten back into testing this today.

I put in a new OS drive and installed a fresh linux OS (a different distro than the one I have been running).  I installed SlimServer 6.2.2 and configured it.  Then, I played three different albums and didn't have a song skipped over.  I then downloaded the newest svn version in the trunk and set it up.  The first album I tried had two songs skipped over.

I was hoping it would turn out to be an issue with my OS or my drive, but it seems to be acting exactly the same once I start using the newest server version.
Comment 13 Steve Bernard 2006-09-03 08:08:23 UTC
(In reply to comment #12)
> I've been busy with a new job, and I've just gotten back into testing this
> today.
> 
> I put in a new OS drive and installed a fresh linux OS (a different distro than
> the one I have been running).  I installed SlimServer 6.2.2 and configured it. 
> Then, I played three different albums and didn't have a song skipped over.  I
> then downloaded the newest svn version in the trunk and set it up.  The first
> album I tried had two songs skipped over.
> 
> I was hoping it would turn out to be an issue with my OS or my drive, but it
> seems to be acting exactly the same once I start using the newest server
> version.
> 

Also, I should add that 6.3.1 seems to work fine, too.

Unfortunately, the nightly builds online don't go back far enough or I'd try to track down the exact day whatever is causing these skipped songs began.
Comment 14 Chris Owens 2006-09-03 14:01:03 UTC
cc'ing Dan in case he has any insight
Comment 15 Dan Sully 2006-09-08 11:42:14 UTC
Steve - can you use 'svn' to go back in time to figure out where this started occuring?

That would help tremendously in tracking this down.

Thanks
Comment 16 Steve Bernard 2006-09-11 11:15:08 UTC
Here's another Beta list discussion regarding this bug:

http://forums.slimdevices.com/showthread.php?t=27190
Comment 17 Dan Sully 2006-09-11 19:03:49 UTC
Fixed in change 9616
Comment 18 Steve Bernard 2006-09-16 08:36:18 UTC
Okay, I just had this problem again several times after playing for a bit.  It was the same error where I got "readlen undef: (No child processes)10"

So, building on av249's research for his patch, I looked up what error would be giving the message "No child processes".  He was getting an error of "Interrupted system call" and it was an issue with EINTR.  So my research led me to deduce that I was getting an ECHILD error, the number for which is 10 (or 10010).  So made a little change to Source.pm based on Dan's patch and I'm attaching it here.  You can probably guess exactly what I did without even looking.

So, Dan, if you can look at this, write a more helpful debug message, and commit it, it would fix my issue completely, I think.  I have already played one album and checked my d_source output and seen that I prevented one skip, so I'm pretty stoked.
Comment 19 Steve Bernard 2006-09-16 08:36:53 UTC
Created attachment 1543 [details]
Source.pm patch
Comment 20 Dan Sully 2006-09-17 14:40:27 UTC
Steve - thanks for the patch. I'm going to hold off on this till after the 6.5 release, as we're right around the corner from it.
Comment 21 Steve Bernard 2006-09-25 09:11:53 UTC
I see my changes got checked in.  Everything's been cool all weekend.  Thanks, Dan!
Comment 22 Steve Bernard 2007-05-18 11:50:43 UTC
Dan, you seem to be the only person other than me attached to this bug, so I hope you see this.

This change got checked into the 7.0 branch back in September, but it never appeared in the 6.5 branch.  Since I see we might be getting ready for an official 6.5.2, is there any chance this could be added?  Every time I update my 6.5 build I have to add error handling for ECHILD myself.