JRun threadpool configuration

My team and I have been struggling to keep a clustered ColdFusion application stable for the better part of the last six months, with little result. We are turning to SF in the hope of finding some JRun experts or fresh ideas, because we can’t seem to figure it out.

The setup:

Two ColdFusion 7.0.2 instances clustered with JRun 4 (w/ the latest update) on IIS 6 under Windows Server 2003. Two quad core CPUs, 8GB RAM.

The issue:
Every now and again, usually about once a week, one of the instances stops handling requests completely. There is no activity on it whatsoever and we have to restart it.

What we know:
Every time this happens, JRun’s error log is full of java.lang.OutOfMemoryError: unable to create new native thread.

After reading the JRun documentation from Macromedia/Adobe and many confusing blog posts, we’ve more or less narrowed it down to incorrect or unoptimized JRun thread pool settings in the instance’s jrun.xml.

Relevant part of our jrun.xml:

<service class="jrun.servlet.jrpp.JRunProxyService" name="ProxyService">
    <attribute name="activeHandlerThreads">500</attribute>
    <attribute name="backlog">500</attribute>
    <attribute name="deactivated">false</attribute>
    <attribute name="interface">*</attribute>
    <attribute name="maxHandlerThreads">1000</attribute>
    <attribute name="minHandlerThreads">1</attribute>
    <attribute name="port">51003</attribute>
    <attribute name="threadWaitTimeout">300</attribute>
    <attribute name="timeout">300</attribute>
{snip}  
</service>

I enabled JRun’s metrics logging last week to collect data related to threads. This is a summary of the data after letting it log for a week.

Average values:

{jrpp.listenTh}       1
{jrpp.idleTh}         9
{jrpp.delayTh}        0
{jrpp.busyTh}         0
{jrpp.totalTh}       10
{jrpp.delayRq}        0
{jrpp.droppedRq}      0
{jrpp.handledRq}      4
{jrpp.handledMs}   6036
{jrpp.delayMs}        0
{freeMemory}      48667
{totalMemory}    403598
{sessions}          737
{sessionsInMem}     737

Maximum values:

{jrpp.listenTh}       10
{jrpp.idleTh}         94
{jrpp.delayTh}         1
{jrpp.busyTh}         39
{jrpp.totalTh}       100
{jrpp.delayRq}         0
{jrpp.droppedRq}       0
{jrpp.handledRq}      87
{jrpp.handledMs}  508845
{jrpp.delayMs}         0
{freeMemory}      169313
{totalMemory}     578432
{sessions}          2297
{sessionsInMem}     2297

Any ideas as to what we could try now?

Cheers!


EDIT #1 ->
Things I forgot to mention:
Windows Server 2003 Enterprise w/ JVM 1.4.2 (for JRun)

The max heap size limit is around 1.4GB, yeah. We used to have leaks but we fixed them; now the application uses around 400MB, rarely more. The max heap size is set to 1200MB, so we aren’t reaching it. When we did have leaks, the JVM would just blow up and the instance would restart itself. That isn’t happening now; it simply stops handling incoming requests.

We were thinking it has to do with threads, based on this blog post:
http://www.talkingtree.com/blog/index.cfm/2005/3/11/NewNativeThread

The Java exception being thrown is of type OutOfMemoryError, but it’s not actually saying that we ran out of heap space, just that the JVM couldn’t create new threads. The exception type is a bit misleading.

Basically the blog is saying that 500 for activeHandlerThreads might be too high, but my metrics seem to show that we get nowhere near that, which is confusing us.

Answer

Well, let’s look at some bigger picture issues before getting into JRun configuration details.

If you’re getting java.lang.OutOfMemoryError exceptions in the JRun error log, well, you’re out of memory. No upvote for that, please ;-). You didn’t say whether you were running 32- or 64-bit Windows, but you did say that you have 8 GB of RAM, so that will have some impact on an answer. Whether or not you’re running a 32- or 64-bit JVM (and what version) will also impact things. So those are a few answers that will help us get to the bottom of this.

Regardless, your application IS running out of memory. It’s running out of memory for one or more of these reasons:

  1. Your application is leaking memory. Some object your application uses is continually referenced and therefore never eligible for garbage collection; or worse, some object created anew on every request is referenced by another object in perpetuity and therefore never eligible for garbage collection. Correct J2EE session handling can be particularly tricky in this regard. (A small illustrative sketch follows this list.)
  2. The amount of required memory to handle each concurrent request (at the configured concurrent request level) exceeds the amount of memory available in the JVM heap. For instance, you have a heap size of 1 GB and each request can use up to 10 MB. Your app server is tuned to allow 150 concurrent requests. (Simplistic numbers, I know). In that case, you would definitely be running out of memory if you experienced 100 or more concurrent requests under load (if each request used the maximum amount of memory necessary to fulfill the request).
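
For item 1, the usual shape of such a leak (purely illustrative, and not taken from the poster’s code) looks something like this in Java: some static collection is written on every request and never pruned, so every entry stays permanently reachable and the garbage collector can never reclaim it.

    // RequestAudit.java - illustrative leak shape only (hypothetical class).
    import java.util.HashMap;
    import java.util.Map;

    public class RequestAudit {
        // Static map lives for the life of the JVM and is never cleaned up.
        private static final Map AUDIT_BY_SESSION = new HashMap();

        public static void record(String sessionId, Object requestData) {
            // One new, permanently reachable entry per request.
            AUDIT_BY_SESSION.put(sessionId + "-" + System.currentTimeMillis(), requestData);
        }
    }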

Other things to keep in mind: on 32-bit Windows, a 32-bit JVM can only allocate approximately 1.4 GB of memory. I don’t recall off the top of my head if a 32-bit JVM on 64-bit Windows has a limitation less than the theoretical 4 GB max for any 32-bit process.
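
One practical consequence: on a 32-bit JVM the heap, the permanent generation, and every thread’s native stack all have to fit inside the same roughly 2 GB process address space, so the larger the heap, the less room is left for new thread stacks. If you experiment along these lines, the memory flags live on the java.args line of JRun’s jvm.config; the values below are assumptions to test against, not recommendations:

    # jvm.config (JRun/ColdFusion) - illustrative values only; load test any change.
    # A smaller -Xmx leaves more address space for native thread stacks,
    # and -Xss caps how much stack each new thread reserves.
    java.args=-server -Xmx1024m -Xss256k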

UPDATED

I read the blog post linked via TalkingTree and the other post linked within that post as well. I haven’t run into this exact case, but I did have the following observation: the JRUN metrics logging may not record the “max values” you cited in a period of peak thread usage. I think it logs metrics at a fixed, recurring interval. That’s good for showing you smooth, average performance characteristics of your application, but it may not capture JRUN’s state right before your error condition begins to occur.
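
If you want samples closer to the failure window, the sampling interval is configurable. A rough sketch of the relevant LoggerService attributes in jrun.xml (attribute names as I recall them from the JRun documentation, so verify them against your install):

    <service class="jrunx.logger.LoggerService" name="LoggerService">
        <attribute name="metricsEnabled">true</attribute>
        <!-- seconds between samples; a smaller value gives a better chance of
             catching the state just before the instance stops responding -->
        <attribute name="metricsLogFrequency">10</attribute>
        <attribute name="metricsFormat">{jrpp.busyTh} busy, {jrpp.totalTh} total, {freeMemory} free</attribute>
    {snip}
    </service>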

Without knowing the internal workings of JRUN’s thread management, I still say that it really is out of memory. Perhaps it’s not out of memory because your app needed to allocate memory on the JVM heap and none was available; rather, it’s out of memory because JRUN tried to create another thread to handle an incoming request and the memory necessary to support another thread wasn’t available. In other words, threads aren’t free: each one needs its own native stack, which comes out of the process’s address space alongside the heap.
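
To see that in isolation (a demonstration only, and not something to run on the production box): a plain Java program that keeps starting parked threads will eventually die with the same “unable to create new native thread” error even though its heap is nearly empty, because each thread reserves native stack space outside the heap.

    // ThreadExhaustion.java - hypothetical demonstration of thread-creation failure.
    public class ThreadExhaustion {
        public static void main(String[] args) {
            int count = 0;
            try {
                while (true) {
                    Thread t = new Thread(new Runnable() {
                        public void run() {
                            try {
                                Thread.sleep(Long.MAX_VALUE); // park the thread forever
                            } catch (InterruptedException e) {
                                // ignore; the thread simply exits
                            }
                        }
                    });
                    t.setDaemon(true);
                    t.start();
                    count++;
                }
            } catch (OutOfMemoryError e) {
                // Typically "unable to create new native thread", not heap exhaustion.
                System.out.println("Created " + count + " threads before: " + e);
            }
        }
    }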

Your options seem to be as follows:

  1. Reduce the amount of memory your application uses in each request, or-
  2. Experimentally reduce the value of the thread tuning parameters in JRUN’s configuration so that more threads queue up for processing instead of becoming runnable at the same time (see the sketch after this list), or-
  3. Reduce the number of simultaneous requests in the ColdFusion administrator (Request Tuning page, field “Maximum number of simultaneous Template requests”)
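
For option 2, the experiment would be against the same ProxyService attributes shown in the question. Something along these lines, where the specific numbers are starting points to measure, not known-good values:

    <service class="jrun.servlet.jrpp.JRunProxyService" name="ProxyService">
        <!-- fewer threads runnable at once; excess requests wait in the backlog -->
        <attribute name="activeHandlerThreads">100</attribute>
        <attribute name="maxHandlerThreads">200</attribute>
        <attribute name="minHandlerThreads">1</attribute>
        <attribute name="threadWaitTimeout">300</attribute>
    {snip}
    </service>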

Regardless of the option you pursue, I think a valid fix here is going to be experimental in nature. You’re going to have to make a change and see what effect it has on the application. You have a load testing environment, right?

Attribution
Source: Link, Question Author: jfrobishow, Answer Author: Clint Miller
