  1. #1
    Senior Member
    Join Date
    Dec 2019
    Posts
    66

    Huge number of wait_to_close sessions

    I'm connecting to the Lightstreamer Server from an HTML5 client, which uses JavaScript code to connect. I also have a Java client that connects to a Lightstreamer Server. Both clients use the same Data Adapter on the Lightstreamer Server.
    The HTML5 client has its own Lightstreamer Server instance, and the Java client has its own.
    However, when the number of customers increases, the behaviour is totally different between the Java and the HTML5 cases: we found a huge number of WAIT TO CLOSE sessions on the HTML5 LS server, but nothing similar happened on the Java side.
    I have read the log and found the following lines repeated many times:

    31-Aug-20 14:10:37,592|INFO |LightstreamerLogger.requests |FOR PUMPS PARKING DESTROYER|Closed session S97cc068f69302361T1037592 with internal cause code: 38 (Interrupted).
    (99% of the total repetitions)

    31-Aug-20 10:59:55,136|INFO |LightstreamerLogger.requests |FOR PUMPS PARKING DESTROYER|Closed session Sd801e961cc64051cT5925121 with internal cause code: 39 (Timeout).
    (1% of the total repetitions)
    These two records were repeated 117058 times for 2000 users, from 9:30 up to 14:15.

    Would you please help me with this serious issue?


    Note: I tried to upload the logs, but unfortunately the upload failed for some reason.

  2. #2
    Administrator
    Join Date
    Jul 2006
    Location
    Milan
    Posts
    1,090
    Hi,
    We have sent an upload url via a private message.

    Sockets in CLOSE_WAIT are unexpected, because this state means that the operating system has detected that the client closed the connection, but the Server has not yet detected it and closed its own end.
    The Server usually keeps the sockets monitored.
    One possibility is that there are delays in the initial handling of client requests. This should be visible in the logs.
    The fact that so many sessions have been established by far fewer clients is a sign that there is an underlying issue.
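
    For scale, 117058 closed sessions for 2000 users is roughly 58 sessions per user. As a rough way to break such a count down by closing cause, here is a minimal sketch in Python. It assumes the log lines have exactly the format quoted in the first post; the function name is only illustrative and is not part of any Lightstreamer tool.

```python
import re

# Matches the "Closed session ..." lines quoted in this thread.
# Assumption: other Server versions may word this message differently.
CLOSE_RE = re.compile(
    r"Closed session (?P<session>\S+) with internal cause code: "
    r"(?P<code>\d+) \((?P<label>[^)]*)\)"
)


def tally_close_causes(lines):
    """Return a dict mapping (cause code, label) to the number of closed sessions."""
    counts = {}
    for line in lines:
        match = CLOSE_RE.search(line)
        if match:
            key = (int(match.group("code")), match.group("label"))
            counts[key] = counts.get(key, 0) + 1
    return counts


# The two log lines quoted in the first post.
sample = [
    "31-Aug-20 14:10:37,592|INFO |LightstreamerLogger.requests |FOR PUMPS PARKING DESTROYER|"
    "Closed session S97cc068f69302361T1037592 with internal cause code: 38 (Interrupted).",
    "31-Aug-20 10:59:55,136|INFO |LightstreamerLogger.requests |FOR PUMPS PARKING DESTROYER|"
    "Closed session Sd801e961cc64051cT5925121 with internal cause code: 39 (Timeout).",
]

if __name__ == "__main__":
    for (code, label), count in sorted(tally_close_causes(sample).items()):
        print(code, label, count)
```

    Running it over the whole log file (for example with `tally_close_causes(open(path))`) should show how the closures split between "Interrupted" and "Timeout".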

  3. #3
    Senior Member
    Join Date
    Dec 2019
    Posts
    66
    Dear Dario,

    I have replied to your private message with the uploaded LS.zip file, which contains the Lightstreamer logs. I hope these files will help you find a solution for this issue.

  4. #4
    Administrator
    Join Date
    Jul 2006
    Location
    Milan
    Posts
    1,090
    I confirm that delays in the establishment of the sessions are visible in the log.
    In particular, you can refer to lines like the following:

    31-Aug-20 10:35:13,240|WARN |LightstreamerLogger.scheduler |Timer-1 |Current delay for tasks scheduled on thread pool SERVER is 11 seconds.
    31-Aug-20 10:35:13,240|WARN |LightstreamerLogger.scheduler |Timer-1 |If the delay issue persists, taking a JVM full thread dump may be the best way to investigate the issue.

    These delays probably originate in the invocations of notifyUser on your Metadata Adapter.
    If you keep an eye on the log and take a Server thread dump when the message appears, this should confirm the suspicion and also show the exact point at which the processing is blocked.
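
    To see when the delays start and how large they get, the "Current delay" warnings can be pulled out of the log. The following is a minimal sketch in Python that assumes the exact line format quoted above; the function name is hypothetical, and the month abbreviation is parsed with the default (English) locale.

```python
import re
from datetime import datetime

# Matches the scheduler warning quoted above. Assumption: the timestamp
# and message layout are exactly as in the two sample lines.
DELAY_RE = re.compile(
    r"^(?P<ts>\d{2}-[A-Za-z]{3}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\|WARN\s*\|"
    r".*Current delay for tasks scheduled on thread pool (?P<pool>\S+) "
    r"is (?P<secs>\d+) seconds"
)


def find_delay_warnings(lines):
    """Return a list of (timestamp, pool name, delay in seconds) tuples."""
    found = []
    for line in lines:
        match = DELAY_RE.search(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%d-%b-%y %H:%M:%S,%f")
            found.append((ts, match.group("pool"), int(match.group("secs"))))
    return found


# The two log lines quoted above; only the first one carries a delay value.
sample = [
    "31-Aug-20 10:35:13,240|WARN |LightstreamerLogger.scheduler |Timer-1 |"
    "Current delay for tasks scheduled on thread pool SERVER is 11 seconds.",
    "31-Aug-20 10:35:13,240|WARN |LightstreamerLogger.scheduler |Timer-1 |"
    "If the delay issue persists, taking a JVM full thread dump may be the best "
    "way to investigate the issue.",
]

if __name__ == "__main__":
    for ts, pool, secs in find_delay_warnings(sample):
        print(ts.isoformat(), pool, secs)
```

    The first tuple returned for a full log gives the time of the first warning, which is a reasonable moment to start taking thread dumps.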

    These messages are quite frequent after 10:32.
    These delays cause the Server to answer client requests late, so the clients discard the connection first.
    But, from the Server's point of view, the sessions are established and then immediately closed.
    After the client discards the connection and before the Server answers, the Server doesn't check the socket, so the latter remains in the CLOSE_WAIT state.
    Since the delay grows significantly, some sockets may stay in CLOSE_WAIT for a long time, and many such sockets could accumulate.
    Usually, this is not the first symptom observed, because during all that time no client can initiate a new session.
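
    To track how many sockets are in CLOSE_WAIT over time, the output of `netstat -tan` on the Server host can be summarized by state. Below is a minimal sketch in Python that assumes the usual Linux column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the addresses in the sample are made up for illustration and do not come from the logs in this thread.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP sockets per state from `netstat -tan` style text.

    Assumption: the state is the sixth whitespace-separated column,
    as printed by netstat on Linux.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


# Made-up sample output, using documentation-only IP ranges.
sample_output = "\n".join([
    "Active Internet connections (servers and established)",
    "Proto Recv-Q Send-Q Local Address           Foreign Address         State",
    "tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN",
    "tcp        1      0 192.0.2.10:8080         198.51.100.7:51234      CLOSE_WAIT",
    "tcp        1      0 192.0.2.10:8080         198.51.100.8:51911      CLOSE_WAIT",
    "tcp        0      0 192.0.2.10:8080         198.51.100.9:40022      ESTABLISHED",
])

if __name__ == "__main__":
    for state, count in count_tcp_states(sample_output).most_common():
        print(state, count)
```

    Sampling this every few seconds and comparing the CLOSE_WAIT count with the timestamps of the "Current delay" warnings should show whether the two grow together.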

    There is no evidence that the difference in client SDKs plays a role.
    Perhaps the JavaScript case is just more loaded and puts more pressure on notifyUser.
    Recent versions of the client SDKs are more robust in such situations and try to avoid a cascading effect due to reconnections.
    This may be a difference between the Java and JavaScript cases. In fact, the JavaScript SDK seems very old.
    Please specify the versions of the Server and the client SDKs in use for more details.

  5. #5
    Senior Member
    Join Date
    Dec 2019
    Posts
    66
    Dario,

    Thank you very much for your response and clarification.

    You mentioned that the messages are frequently repeated after 10:32, but unfortunately I noticed too many repetitions before that time as well.

    Further, the Metadata Adapter is the same for both the Java client and the JavaScript client, so both of them share the same notifyUser method.

    The versions that you requested are:

    Lightstreamer Server: Lightstreamer Server Allegro-Presto-Vivace version 5.1.1 build 1623.2

    Lightstreamer JavaScript: Lightstreamer JavaScript Client Library version 6.1.1 build 1640, compatible with Lightstreamer Server since version 5.1

    Lightstreamer Java: Lightstreamer Java Adapter Interface version 5.1.1 build 1623.2, compatible with Lightstreamer Server since version 5.1

  6. #6
    Administrator
    Join Date
    Jul 2006
    Location
    Milan
    Posts
    1,090
    I see the first "Current delay" message at 10:32:08,036 in the log.
    What did you notice before? The sockets in CLOSE_WAIT?
    This is possible, because the "Current delay" messages are logged only for delays longer than 10 seconds, but slightly shorter delays could still cause some sockets to remain in CLOSE_WAIT for a few seconds
    (this is more likely with some old Client SDK versions, which is your case).

    Anyway, if the Server reports delays in the internal thread pool, this is actually where you should focus.
    And, as said, it is probably the notifyUser invocation that clogs the system.
    The fact that the issue appears only in one of the two Server instances may or may not be significant.
    I provided a possible explanation, assuming a different number of concurrent invocations of notifyUser in the two cases.
    This has not been disproved.
    There are two aspects:
    • The number of JavaScript clients connecting concurrently may be higher than in the Java case.
      Only a log for the Java case could clarify this.
    • The JavaScript client may be more aggressive in retrying after a delay, causing a short delay to grow.
      This depends on the two SDK versions and it is quite likely.
      You didn't report the version of the Java client SDK in use, but since the Server is 5.1.1, the version should be 2.5.
      So, I confirm that the JavaScript clients perform automatic retries, whereas the Java clients don't (unless you enforced such retries in custom code).
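
    How retries can make a short delay grow can be illustrated with a toy queue model. This is only a sketch under simplifying assumptions, not a model of the actual Server or SDK behaviour: time advances in whole seconds, a fixed number of requests arrives and is processed per second, a client gives up after `timeout` seconds, and an abandoned request still occupies its place in the queue. All names and numbers are hypothetical.

```python
from collections import deque


def simulate(seconds, arrivals_per_sec, capacity_per_sec, timeout, retry):
    """Toy FIFO model of session requests waiting for processing.

    The queue holds the enqueue time of each pending request. When
    `retry` is True, every request that has waited exactly `timeout`
    seconds produces a new request, while the stale one stays queued.
    """
    queue = deque()
    max_delay = 0
    for now in range(seconds):
        # New clients arriving during this second.
        queue.extend([now] * arrivals_per_sec)
        if retry:
            # Clients that just timed out submit a fresh request.
            expired = sum(1 for enqueued in queue if now - enqueued == timeout)
            queue.extend([now] * expired)
        # Process as many requests as the capacity allows, oldest first.
        for _ in range(min(capacity_per_sec, len(queue))):
            delay = now - queue.popleft()
            max_delay = max(max_delay, delay)
    return {"backlog": len(queue), "max_delay": max_delay}


# Slight overload: 12 requests arrive per second, 10 are processed.
no_retry = simulate(60, 12, 10, timeout=5, retry=False)
with_retry = simulate(60, 12, 10, timeout=5, retry=True)

if __name__ == "__main__":
    print("without retries:", no_retry)
    print("with retries:   ", with_retry)
```

    In this model the retries add work without adding capacity, so the backlog with retries ends up larger than without them. Real clients usually wait before retrying, which this sketch ignores.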


    So, for now, our suggestion is to focus on notifyUser, rather than on the differences between the two installations.
    And, as said, you could gather important clues by taking a thread dump of the Server JVM after you see the first occurrences of the "Current delay" message; we can help analyze the dump.
    If you find any bottleneck in notifyUser, you can make it more efficient.
    Otherwise, we can revise the configuration of the thread pools.
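
    As a first pass over a thread dump (taken, for example, with the JDK's jstack tool), it can help to list the threads that have notifyUser somewhere in their stack. Here is a minimal sketch in Python, assuming the usual HotSpot text layout in which each thread starts with its quoted name and stack frames start with "at". The thread names and the adapter class in the sample are invented for illustration.

```python
import re

# A thread section in a HotSpot dump starts with the quoted thread name.
THREAD_HEADER_RE = re.compile(r'^"(?P<name>[^"]+)"')


def threads_in_method(dump_text, method_name):
    """Return the names of threads with `method_name` in their stack."""
    needle = "." + method_name + "("
    hits = []
    current = None
    matched = False
    for line in dump_text.splitlines():
        header = THREAD_HEADER_RE.match(line)
        if header:
            current = header.group("name")
            matched = False
        elif (current is not None and not matched
              and line.strip().startswith("at ") and needle in line):
            hits.append(current)
            matched = True
    return hits


# Invented sample; real dumps contain more detail on each header line.
sample_dump = "\n".join([
    '"HYPOTHETICAL WORKER 1" #31 prio=5 waiting on condition',
    "   java.lang.Thread.State: TIMED_WAITING (sleeping)",
    "\tat java.lang.Thread.sleep(Native Method)",
    "\tat com.example.MyMetadataAdapter.notifyUser(MyMetadataAdapter.java:42)",
    "",
    '"HYPOTHETICAL WORKER 2" #32 prio=5 runnable',
    "   java.lang.Thread.State: RUNNABLE",
    "\tat java.net.SocketInputStream.socketRead0(Native Method)",
])

if __name__ == "__main__":
    for name in threads_in_method(sample_dump, "notifyUser"):
        print(name)
```

    If most threads of the pool show up in this list, the frames just above notifyUser in each stack indicate where the adapter code is waiting.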

 

 
