  1. #1
    Power Member
    Join Date
    Nov 2012
    Posts
    182

    Server crash: Unable to open Audit file in append mode

    What might be the cause and solution to this error:

    Code:
    05.May.15 15:30:00,151 <ERROR> Unable to open Audit file in append mode.: {}.                                                
    java.io.FileNotFoundException: /RNS/lightstreamer/conf/./../audit/audit_log_b2-d3-5d-1a-b2-04_28.txt (Too many open files)   
     at java.io.FileOutputStream.<init>(FileOutputStream.java:231) ~[na:1.7.0]                                                   
     at java.io.FileWriter.<init>(FileWriter.java:118) ~[na:1.7.0]                                                               
     at com.lightstreamer.j.d.a(d.java) [lightstreamer.jar:na]                                                                   
     at com.lightstreamer.j.d.a(d.java) [lightstreamer.jar:na]                                                                   
     at com.lightstreamer.j.d.a(d.java) [lightstreamer.jar:na]                                                                   
     at com.lightstreamer.j.a.run(a.java) [lightstreamer.jar:na]                                                                 
     at java.util.TimerThread.mainLoop(Timer.java:566) [na:1.7.0]                                                                
     at java.util.TimerThread.run(Timer.java:516) [na:1.7.0]

  2. #2
    Administrator
    Join Date
    Feb 2012
    Location
    Milano
    Posts
    716
    Hi Kevin,

    I am not sure that the "Unable to open Audit file in append mode" error was the root cause of the crash. It is more likely a consequence of the process running out of available file descriptors.

    Code:
    	java.io.FileNotFoundException: /RNS/lightstreamer/conf/./../audit/audit_log_b2-d3-5d-1a-b2-04_28.txt (Too many open files)
    Indeed, the limit on the number of open files in Unix and Linux systems also applies to the number of open sockets.

    Could you check whether the server experienced a considerable increase in the number of open connections?
    Do you know if your system imposes a particular limit on this number?
    Our Server launch script attempts to run the "ulimit" command, though it might not work in some environments.
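
    A minimal sketch of how the limit and the current usage could be checked on a Linux box, assuming the standard /proc layout. The function names are made up for illustration and are not part of Lightstreamer; the pid of the server process has to be supplied by hand.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' limits from /proc/<pid>/limits text.

    Each value is an int, or None where the kernel reports 'unlimited'.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]

            def to_int(value):
                return None if value == "unlimited" else int(value)

            return to_int(soft), to_int(hard)
    raise ValueError("no 'Max open files' line found")


def fd_usage(pid):
    """Return (open_count, soft_limit, hard_limit) for a Linux process.

    Sockets count as file descriptors, so open_count includes them.
    Reading another user's /proc/<pid>/fd usually requires root.
    """
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_max_open_files(f.read())
    open_count = len(os.listdir(f"/proc/{pid}/fd"))
    return open_count, soft, hard
```

    Calling fd_usage(pid) with the pid of the server's JVM shows how close the process is to its soft limit.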

    Regards,
    Giuseppe

  3. #3
    Power Member
    Join Date
    Nov 2012
    Posts
    182
    As it happens this server is currently holding 63k open connections on the port that LS is listening on. There are only about 150 clients though. This is the server we have had problems with before, where "something" causes the clients to open lots and lots of connections until it all dies.

    The original issue was thought to be solved: an XHR bug in the client code that Mone fixed. This must be something different.

    I have restarted the server, but the connections are all still hanging around with a status of "Close-wait" or "Last-ACK".
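
    For reference, a rough sketch of how the connections on the listening port could be tallied by TCP state. It assumes the text comes from `ss -tan` with its usual column order (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port); the function name and the port number 8080 are placeholders, not values from this server.

```python
from collections import Counter


def count_states(ss_output, local_port):
    """Count TCP socket states for one local port from `ss -tan` output."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a socket row.
        if len(fields) < 5 or fields[0] == "State":
            continue
        state, local = fields[0], fields[3]
        # The port is whatever follows the last ':' (also works for IPv6).
        if local.rsplit(":", 1)[-1] == str(local_port):
            counts[state] += 1
    return counts


# Example use (8080 is a placeholder port):
#   import subprocess
#   text = subprocess.run(["ss", "-tan"], capture_output=True, text=True).stdout
#   print(count_states(text, 8080).most_common())
```

    Note that `ss` spells the states CLOSE-WAIT and LAST-ACK, while `netstat` prints them as CLOSE_WAIT and LAST_ACK in its last column, so the column index would need adjusting for netstat output.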

  4. #4
    Power Member
    Join Date
    Nov 2012
    Posts
    182
    By the way, setting an unlimited limit failed on our server, so it gets set to 66000 instead.

    So I think this is all just the result of the connections going bananas again.

  5. #5
    Administrator
    Join Date
    Feb 2012
    Location
    Milano
    Posts
    716
    Hi Kevin,

    I have checked with Mone and, indeed, we were not sure that the fix to the client code was relevant to the problem or had resolved it.
    So we cannot rule out that what happened today is similar to the previous episode.

    We could not figure out why the sockets remain in CLOSE_WAIT after you have shut down the server. In LAST_ACK, a timeout should clean them up.
    From today's LightstreamerMonitorText logger output, can you work out the trend in the number of connections up to the point where it reached 63K?
    Are you able to determine whether every single socket opened on the Lightstreamer server, when closed, remains in CLOSE_WAIT or LAST_ACK for a long time?

    Regards,
    Giuseppe

  6. #6
    Power Member
    Join Date
    Nov 2012
    Posts
    182
    My own connection to the LS server on that box disappears completely when I terminate the session (it doesn't remain in Close-wait or Last-ACK). It all behaves normally when I open and close connections, too.

    The loitering connections have all finally given up the ghost as well.

    I will get the log from today and share it.

  7. #7
    Power Member
    Join Date
    Nov 2012
    Posts
    182
    I have transferred the log using wetransfer.com to support@lightstreamer.com

  8. #8
    Administrator
    Join Date
    Feb 2012
    Location
    Milano
    Posts
    716
    Hi Kevin,

    The log file shows that the number of connections exploded when the server was already in a severe crisis because the SERVER pool had stalled.
    Please note messages like these (with the delay constantly growing):

    Code:
    05-May-15 14:05:24,728|WARN |LightstreamerLogger.scheduler    |Timer-0                    |If the delay issue persists, taking a JVM full thread dump may be the best way to investigate the issue.
    05-May-15 14:05:26,758|WARN |LightstreamerLogger.scheduler    |Timer-0                    |Current delay for tasks scheduled on thread pool SERVER is 156 seconds.
    The SERVER pool is responsible for handling client requests, which explains why no new session could start.
    Clients continued to open new connections, but their requests went unanswered, which probably triggered a retry loop.
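
    To follow how the delay evolves over a whole log file, something along these lines could be used. It is only a sketch: the regular expression is modelled on the two log lines quoted above, and the function names are invented for the example.

```python
import re

# Matches lines such as:
# 05-May-15 14:05:26,758|WARN |...|Timer-0 |Current delay for tasks
# scheduled on thread pool SERVER is 156 seconds.
DELAY_RE = re.compile(
    r"^(?P<ts>\S+ \S+?)\|.*"
    r"Current delay for tasks scheduled on thread pool (?P<pool>\S+) "
    r"is (?P<secs>\d+) seconds"
)


def pool_delays(log_lines, pool="SERVER"):
    """Return [(timestamp, delay_seconds), ...] for one thread pool."""
    delays = []
    for line in log_lines:
        match = DELAY_RE.search(line.strip())
        if match and match.group("pool") == pool:
            delays.append((match.group("ts"), int(match.group("secs"))))
    return delays


def is_growing(delays):
    """True if there are at least two samples and each exceeds the last."""
    secs = [s for _, s in delays]
    return len(secs) >= 2 and all(b > a for a, b in zip(secs, secs[1:]))
```

    A delay that only ever grows suggests the pool is not draining at all, as opposed to a temporary backlog that recovers.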

    The most common cause of these messages is a delay or block in the replies from the Metadata/Data Adapters to the server's requests. The requests mainly involved are notifyUser, getItems, getSchema, subscribe and unsubscribe.
    However, to know exactly what happened, a thread dump of the JVM would have been necessary.

    Furthermore, we can definitely rule out problems in the client library, at least in this case.

    Regards,
    Giuseppe

  9. #9
    Power Member
    Join Date
    Nov 2012
    Posts
    182
    So you are saying that the problem may be within one of my own adapters?

  10. #10
    Administrator
    Join Date
    Feb 2012
    Location
    Milano
    Posts
    716
    Yes, it is a possibility.

    Since the accumulated delay of the SERVER pool and the "Queued tasks" count grow constantly and never decrease, what happened in your case is that all the threads of the SERVER pool were blocked in some operation, probably the same one for all of them.
    As I said in the previous post, in these cases it is very likely that some Adapter call is blocking, and a thread dump taken during the crisis could help to find the culprit.

 

 
