Old 29th September 2014, 09:58   #1
Arsimael_
Junior Member
 
Join Date: Aug 2012
Posts: 32
SC2 hangup completely, need to kill

This morning all our Shoutcast servers were hanging. I had to log in to the console as root and send a SIGKILL, because a normal "shutdown" wasn't working. The log shows this:

ERROR [YP] Request [http://yp.shoutcast.com:80/yp2] failed, code: 28 [Operation timed out after 60000 milliseconds with 0 bytes received]

What is the problem? Our servers are hosted at OVH and have always seemed to have a stable, good connection. Shoutcast 1 servers weren't affected.
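The "code: 28" in that log line looks like libcurl's timeout error (CURLE_OPERATION_TIMEDOUT), and the curl command-line tool exits with the same status 28 when a request times out. A minimal sketch for probing the YP URL by hand from the affected host, assuming curl is installed; the function names `describe_curl_status` and `probe_url` are made up for this example and are not part of the DNAS:

```shell
#!/bin/bash
# Map a curl exit status to a short description.
# Only a small subset of curl's documented exit codes is covered here.
describe_curl_status() {
    case "$1" in
        0)  echo "ok" ;;
        6)  echo "could not resolve host" ;;
        7)  echo "failed to connect" ;;
        28) echo "operation timed out" ;;
        *)  echo "other curl error ($1)" ;;
    esac
}

# Probe a URL with a hard time limit (default 60 s, matching the
# 60000 ms in the log line) and report how the request ended.
probe_url() {
    local url=$1 timeout=${2:-60} status
    curl --silent --output /dev/null --max-time "$timeout" "$url"
    status=$?
    echo "$url: $(describe_curl_status "$status")"
    return "$status"
}

# Example (not run automatically):
#   probe_url http://yp.shoutcast.com:80/yp2 60
```

If the probe also reports a timeout, the problem is likely on the network path to the YP rather than inside the DNAS itself; if it succeeds while the DNAS keeps failing, that points back at the server.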
Old 29th September 2014, 11:42   #2
DrO
 
Join Date: Sep 2003
Posts: 27,873
it's in the known issues (related to changing to use libcurl to resolve other yp connection issues) and we're still trying to determine the cause of the issue for some setups. though it locking up the process on exit isn't something seen before... how many CPU cores does the DNAS show as being able to use?
Old 29th September 2014, 11:49   #3
Arsimael_
Junior Member
 
Join Date: Aug 2012
Posts: 32
Four CPU Cores. It's a dual core with hyperthreading. Is there a workaround for this issue?
Old 29th September 2014, 12:01   #4
DrO
 
Join Date: Sep 2003
Posts: 27,873
other than restarting the DNAS when the yp connection fails, no.
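One way to notice when the yp connection has failed is to look for the error line quoted in the first post. This is only a sketch under assumptions: the function name `yp_timed_out` is made up, the log path in the example is hypothetical, and checking only the tail of the log is a choice made here so that an old error does not trigger forever.

```shell
#!/bin/bash
# Return 0 if the last N lines (default 200) of a DNAS log contain the
# YP timeout error, 1 if they do not, 2 if the log cannot be read.
yp_timed_out() {
    local logfile=$1 lines=${2:-200}
    [ -r "$logfile" ] || return 2
    # Fixed-string matches avoid having to escape the [ ] in "[YP]".
    tail -n "$lines" "$logfile" \
        | grep -F 'ERROR [YP] Request' \
        | grep -qF 'failed, code: 28'
}

# Example (log path is an assumption, adjust to your install):
#   if yp_timed_out /path/to/sc_serv.log; then
#       echo "YP timeout seen, restart the DNAS"
#   fi
```

Because the match is on recent lines only, the check keeps firing until enough new log lines push the error out of the window, so whatever restarts the DNAS should also rotate or clear the log.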
Old 29th September 2014, 16:17   #5
neralex
Major Dude
 
Join Date: Mar 2011
Posts: 576
Same here...
Old 29th September 2014, 16:25   #6
DrO
 
Join Date: Sep 2003
Posts: 27,873
until we've got a fix (as there is no rhyme or reason for the random failure), there is little that can be suggested other than putting up with it or not being listed. this is the downside of any new code: however much it is tested (and a lot of testing was done prior to the releases), something always creeps through. it's what the old 1.x DNAS went through years ago i.e. growing pains (which for 2.x have taken a lot longer due to its hindered development).

it also doesn't happen for the vast majority of 2.2.2 and 2.4 installs, and replicating it is a pain in itself, so it can take hours to know if a possible change has worked or not (which is what has been the case so far).
Old 30th September 2014, 01:40   #7
DrO
 
Join Date: Sep 2003
Posts: 27,873
i'm testing another round of tweaks as i've gotten nearer to tracking down the issue (which is primarily a race-condition, hence being a pain to resolve and is why it affects some setups and not others). i guess i'll know in the morning if it's worked or not...
Old 30th September 2014, 13:22   #8
DrO
 
Join Date: Sep 2003
Posts: 27,873
the lockup is now resolved, just not the failing of the YP connectivity *grumbles*
Old 30th September 2014, 15:15   #9
neralex
Major Dude
 
Join Date: Mar 2011
Posts: 576
good news! thanks!
Old 30th September 2014, 16:06   #10
DrO
 
Join Date: Sep 2003
Posts: 27,873
not really as the YP connectivity just failing is more of an issue than the YP thread locking up and preventing the DNAS from being able to be shut down.
Old 8th October 2014, 18:14   #11
neralex
Major Dude
 
Join Date: Mar 2011
Posts: 576
Got the issue again between 17:00 and 18:00 CEST
Old 8th October 2014, 18:17   #12
DrO
 
Join Date: Sep 2003
Posts: 27,873
sorry, we've been experiencing some externally caused networking load issues with the YP over the last few hours, which have caused some DNAS to experience the locking issue (and other issues which older DNAS experience when the DNAS has issues), and so they will need to be restarted.

the plan is to have a 2.4.1 update out by the end of the month (hopefully sooner) which i know doesn't help things but we've a number of features and other issues that need to be implemented and resolved before we can reasonably provide a new update.
Old 8th October 2014, 18:33   #13
neralex
Major Dude
 
Join Date: Mar 2011
Posts: 576
I have now changed all my streams from public to private and hope I can set them back to public with the new version of the DNAS.
Old 8th October 2014, 18:34   #14
DrO
 
Join Date: Sep 2003
Posts: 27,873
the locking issue only happens if the DNAS gets into a weird state with some error messages. if the DNAS gets a valid YP error then it's not a problem; it's only if it gets something like a networking related error that it may do the weird locking that has been seen (which due to some external factors out of our control has sadly been triggered today).

so i can only apologise again as this is turning into another 2-steps forward and 1-step back with the DNAS updates and to be frank, i'm more annoyed at myself that it's happening even with greater testing of things prior to release.
Old 8th October 2014, 19:44   #15
newbornus
Junior Member
 
Join Date: Feb 2014
Posts: 30
still no patch yet?
Old 8th October 2014, 20:21   #16
newbornus
Junior Member
 
Join Date: Feb 2014
Posts: 30
A temporary solution is to use a 'keepalive' script.

You can use a part of my init script or use it completely.
It must be customized (check correct paths) for your system.

Tested and running in CentOS system

#!/bin/bash
#
# shoutcast Startup script for the shoutcast server
#
# chkconfig: - 99 15
# description: shoutcast server
# processname: sc_serv
# config: /opt/shoutcast
#
### BEGIN INIT INFO
# Provides: shoutcast
# Short-Description: shoutcast server
# Description: blah
### END INIT INFO

# Source function library.
. /etc/rc.d/init.d/functions

prog=sc_serv
binfile=/opt/shoutcast/sc_serv
RETVAL=0
STOP_TIMEOUT=${STOP_TIMEOUT-10}
proc_pid=$(pidof sc_serv)

# Archive the old logs, then start sc_serv in a detached screen session.
start() {
    echo -n $"Starting $prog: "
    echo
    # Only clear the logs if the backup archive was written successfully.
    tar -jvcf "/var/bkp/sc_logs_$(date +D-%d-%m-%Y--T-%H-%M-%S).tar.bz2" -C /opt/shoutcast/logs/ . \
        && /bin/rm -f /opt/shoutcast/logs/*
    screen -dmS sc_serv bash -c 'cd /opt/shoutcast/ && ./sc_serv sc_serv_main_rly.conf'
    RETVAL=$?
    echo
    # [ $RETVAL = 0 ] && touch ${lockfile}
    return $RETVAL
}

# Ask sc_serv to shut down normally (SIGTERM).
stop() {
    echo -n $"Stopping $prog: "
    killall sc_serv
    RETVAL=$?
    echo
    # [ $RETVAL = 0 ] && rm -f ${lockfile} ${pidfile}
    return $RETVAL
}

# Send SIGHUP so sc_serv rotates its log files.
rotate() {
    killall -HUP sc_serv
    RETVAL=$?
    echo
    # [ $RETVAL = 0 ] && touch ${lockfile}
    return $RETVAL
}

# Fetch the DNAS index page; if nothing comes back, assume the server
# is hung, kill it hard and start it again.
keepalive() {
    wget -O /tmp/sctmp.html --quiet --tries=2 --timeout=10 http://localhost:8000/index.html
    if [ -s /tmp/sctmp.html ]; then
        rm -f /tmp/sctmp.html
        exit 0
    else
        # A hung DNAS ignores a normal shutdown, so SIGKILL is needed here.
        killall -9 sc_serv
        sleep 5
        # Assumes this script is installed as the "shoutcast" service.
        service shoutcast start
        rm -f /tmp/sctmp.html
    fi
}

# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    rotate)
        rotate
        ;;
    status)
        status $binfile
        RETVAL=$?
        ;;
    keepalive)
        keepalive
        ;;
    restart)
        stop
        sleep 5
        start
        ;;
    *)
        echo $"Usage: $prog {start|stop|rotate|status|keepalive|restart}"
        RETVAL=2
        ;;
esac

exit $RETVAL
Old 8th October 2014, 20:31   #17
DrO
 
Join Date: Sep 2003
Posts: 27,873
Quote:
Originally Posted by newbornus View Post
still no patch yet?
did you read my reply? as what was mentioned as being 'fixed' a few posts up is just part of the overall solution, as there are some other YP connectivity issues that need to be resolved before we can provide a reasonable 2.4.1 update (in the time scale i'd also mentioned).
Old 13th October 2014, 06:32   #18
newbornus
Junior Member
 
Join Date: Feb 2014
Posts: 30
ok, we'll wait for an update...
for now, my "keepalive" shell script works well, 1-2 hangups eliminated (heh, it missed the 1st hangup because I mistyped a letter in the cron rule >_< )
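The cron rule itself was not posted, so the entry below is only a guess at what one could look like: a hypothetical /etc/cron.d line that runs the keepalive action every 5 minutes as root. The schedule, the user and the "shoutcast" service name are all assumptions. The small check only counts fields, so it catches a dropped field but not a mistyped letter in the command.

```shell
#!/bin/bash
# Hypothetical /etc/cron.d entry: run the keepalive action every 5 minutes.
# Schedule, user and service name are assumptions; adjust to your system.
CRON_ENTRY='*/5 * * * * root /sbin/service shoutcast keepalive >/dev/null 2>&1'

# Return 0 if the entry has the 5 schedule fields, a user and a command
# (7 or more whitespace-separated fields), as /etc/cron.d files expect.
valid_cron_entry() {
    local fields
    # read -a splits on whitespace without glob-expanding the '*' fields.
    read -r -a fields <<< "$1"
    [ "${#fields[@]}" -ge 7 ]
}

if valid_cron_entry "$CRON_ENTRY"; then
    echo "$CRON_ENTRY"
else
    echo "malformed cron entry" >&2
fi
```

Redirecting the printed line into a file under /etc/cron.d (for example one named shoutcast-keepalive) would install it. After that, it is worth running the keepalive action once by hand to confirm the command in the entry is spelled correctly.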
Old 13th October 2014, 12:56   #19
DrO
 
Join Date: Sep 2003
Posts: 27,873
along with the previously mentioned fix, a lot more work was done over the weekend which now seems to have tracked down the cause of the issues (or at least allows the DNAS to recover far better from connectivity issues). after some further testing, and finishing off the new features needed for the new version, it should resolve the YP issues that were introduced in the changeover to using libcurl in 2.2.2. in the meanwhile, the end of the month (or possibly earlier) is the best estimate we can provide for when the fixed build will be released.