Optimal usage of resources results in “overuse” of licenses

A user group we support uses an application for which they are licensed to run 20 concurrent instances. We very recently implemented the Sun Grid Engine job scheduler for them, but they soon started complaining that some jobs didn't run.

Their cluster consists of 8 nodes with 2 cores each, so they have 16 slots in the job scheduler. We set up two queues, suspendable.q and unsuspendable.q. Jobs in suspendable.q can be suspended by jobs in unsuspendable.q, so we can have 32 concurrent jobs in total (of which 16 will be suspended).

Once their cluster is really busy, some jobs will not run, and after some investigation we found out why. Suspended jobs don't release their license to the license server, so every suspended job still counts against the 20 licenses. With 16 jobs suspended in suspendable.q, only 4 licenses are left for jobs in unsuspendable.q. So we started limiting the number of unsuspendable jobs to 4, because the 21st job to start would fail for lack of a license.
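
The arithmetic behind that limit can be written as a small Python sketch. It is only an illustration of the reasoning above, not part of our Grid Engine configuration: the function name and its parameters are made up for this example, and it assumes the worst case where every slot holds a suspended job that still has its license checked out.

```python
def max_unsuspendable_jobs(license_limit: int, nodes: int, cores_per_node: int) -> int:
    """Return how many unsuspendable jobs can safely run at once.

    Hypothetical helper for illustration only. Assumes the worst case:
    every scheduler slot holds a suspended job that keeps its license.
    """
    # One scheduler slot per core.
    slots = nodes * cores_per_node

    # Suspended jobs keep their license, so up to `slots` licenses are tied up.
    licenses_held_by_suspended = slots

    # Whatever is left can be used by jobs in the unsuspendable queue.
    free_licenses = license_limit - licenses_held_by_suspended

    # Never negative, and never more than the number of slots.
    return max(0, min(slots, free_licenses))


if __name__ == "__main__":
    # Our case: 20 licenses, 8 nodes with 2 cores each.
    print(max_unsuspendable_jobs(license_limit=20, nodes=8, cores_per_node=2))  # 4
```

With 20 licenses and 16 slots this gives 4, which is the limit we set on the unsuspendable queue. If the license count were lower than the slot count, the same calculation would return 0, meaning suspension could not be used safely at all.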