Job Queue Optimization

This page describes the techniques that Holistics employs to optimize its Job Queue and what you can do to optimize your Job Queue.

Knowledge Checkpoint

To understand how the Holistics Job System works, please refer to this documentation.

Holistics's internal mechanisms to optimize Job Queues

Caching

When the cache data is available, Holistics will fetch the result directly from cache and will not execute a new Job.

Holistics has many caching mechanisms to optimize its performance. You can learn more about the main caching mechanism, which is the Reporting cache, via this link.

Job de-duplication

If 2 users (A and B) open the same report within a short amount of time, there's a high chance a duplicate query will be sent to the system while the first query is still running. This unnecessarily overloads the system, and increases the waiting time of both users.

To prevent this, Holistics has a built-in de-duplication mechanism, which works as follows:

  • Every time a query job is submitted, its query hash is used to look up concurrently running jobs from the last 10 minutes, to find the same query currently being executed on the customer database.
  • If one is found, the new job's status is set to "already existed" and the job is served the result of the earlier running job with the same query.

If the first job has already finished and its result is stored in cache, the caching mechanism kicks in instead. The second query is not sent at all, and the cached result is used to serve user B.
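The de-duplication lookup described above can be sketched roughly as follows. This is only an illustration, not Holistics's actual implementation: the `JobRegistry` class, the whitespace normalization, and the choice of SHA-256 for the query hash are all assumptions made for the example.

```python
import hashlib
import time

DEDUP_WINDOW_SECONDS = 10 * 60  # the 10-minute lookup window described above


class JobRegistry:
    """Hypothetical in-memory registry of running query jobs."""

    def __init__(self):
        # query hash -> (job id, submission time) of the job still running
        self._running = {}
        self._next_id = 1

    @staticmethod
    def query_hash(sql):
        # Assumption: collapse whitespace so trivially different strings match.
        normalized = " ".join(sql.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def submit(self, sql, now=None):
        """Return (job_id, status, id of the job whose result is used)."""
        now = time.time() if now is None else now
        key = self.query_hash(sql)
        job_id = self._next_id
        self._next_id += 1

        existing = self._running.get(key)
        if existing is not None and now - existing[1] <= DEDUP_WINDOW_SECONDS:
            # The same query is already running: reuse its result.
            return job_id, "already existed", existing[0]

        self._running[key] = (job_id, now)
        return job_id, "running", job_id

    def finish(self, sql):
        # A finished job no longer counts as a concurrently running job.
        self._running.pop(self.query_hash(sql), None)


registry = JobRegistry()
first = registry.submit("SELECT count(*) FROM orders", now=0)
second = registry.submit("SELECT  count(*)  FROM orders", now=30)
late = registry.submit("SELECT count(*) FROM orders", now=11 * 60)
print(first, second, late)
```

In this sketch, user B's submission 30 seconds later is marked "already existed" and pointed at job 1, while a submission outside the 10-minute window starts a new job.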

Note

The exact de-duplication behavior may vary according to your version of Holistics and your Report type.

What to do when encountering slow jobs

Optimize slow queries/reports

Important

This is often the most effective and sustainable way to optimize your Job Queue.

Slow queries/reports are often the main reason your Job Workers are occupied and blocked. Optimizing them will therefore help your Job Queue performance tremendously. Please refer to our documentation on optimizing reports:

  1. Optimizing database execution
  2. Tips to improve reporting performance

Cancel Running Jobs

Some types of jobs can be cancelled while running. This is especially helpful when a user accidentally starts a long-running job, which would otherwise waste an available slot in the queue. Whether a running job can be cancelled depends on the job's Source type and Action. Currently, Holistics supports cancelling these types of running jobs:

  • DataTransform: execute
  • DataImport: execute
  • EmailSchedule: execute
  • QueryReport: execute, preview
  • DashboardWidget: execute

For QueryReport jobs (execute, preview), there are also Cancel buttons in the Report view and Report editor, which have the same effect as cancelling the job from the Jobs monitoring screen.

When cancelling a running job, Holistics will also try to cancel all of the job's running queries in order to save your database server's resources.

Holistics uses a simple yet effective mechanism to cancel a job's running queries. We include the job's ID as a comment in every query sent to your database server, and use this information to identify the specific process running the query. Afterwards, Holistics sends a specific query depending on the DataSource type (e.g., pg_cancel_backend for PostgreSQL) to cancel the identified process.
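To make the mechanism more concrete, here is a minimal sketch for PostgreSQL that only builds the SQL strings involved. The comment format (`holistics_job_id`) and the helper names are hypothetical and chosen for illustration; only `pg_stat_activity`, `pg_backend_pid()`, and `pg_cancel_backend()` are actual PostgreSQL features.

```python
def tag_query(sql, job_id):
    """Prefix a query with a comment carrying the job ID (hypothetical format)."""
    return f"/* holistics_job_id: {int(job_id)} */ {sql}"


def find_backend_sql(job_id):
    """SQL that finds PostgreSQL backends still running the tagged query."""
    # The trailing space keeps job 1 from also matching job 12.
    marker = f"holistics_job_id: {int(job_id)} "
    return (
        "SELECT pid FROM pg_stat_activity "
        f"WHERE state = 'active' AND query LIKE '%{marker}%' "
        # Exclude this lookup itself, since its text also contains the marker.
        "AND pid <> pg_backend_pid()"
    )


def cancel_backend_sql(pid):
    """SQL that asks PostgreSQL to cancel the query running in backend `pid`."""
    return f"SELECT pg_cancel_backend({int(pid)})"


tagged = tag_query("SELECT * FROM orders", 42)
print(tagged)                    # what would be sent to the database
print(find_backend_sql(42))      # step 1: locate the backend process
print(cancel_backend_sql(1234))  # step 2: cancel it (1234 is a made-up pid)
```

Other database types would need their own lookup and cancel statements; the two-step shape (find the process by the job ID comment, then cancel it) is the part taken from the description above.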

Increase your default slots for specific Job Queues

You cannot make this change yourself, since it requires our support engineers to adjust the queue size.

If you think that your current default slot (2 concurrent jobs for data transforms for example) is not enough for your operation, please contact us via [email protected] and we will process your request.

Automatically cancel unused Jobs

Coming Soon

This feature is currently in development and will be released soon!

Holistics offers the Unused Job Timeout setting, which automatically cancels unused Jobs.

Explanation:

  • Unused Jobs: Jobs that are not being used by any user's browser
    • For example: A user visits a Dashboard with 10 widgets and creates 10 Jobs, then closes their browser => Those 10 Jobs are unused.

Notes:

  • This auto cancellation only applies to Jobs in these Job Queues:
    • Report
    • Embed
    • Adhoc Query

Contact Holistics Support

info

Please see the below section.

Reporting slow-running Jobs to Holistics Support

Holistics is here to help! Please send us a ticket via [email protected] should you encounter slow-running jobs while using our application. To help us serve you better, consider including the following information in your request:

info

Important: When comparing jobs’ execution time between Holistics and your database, please make sure to run those jobs using the same database user credentials.

✅ Include the ID of the job in your support request

Holistics offers a Job Monitoring dashboard, which records every job executed.

Please access this dashboard to find the ID of the slow-running job and send it to us.

✅ Share with us what you expected of the job's performance

We would love to understand your expectations of the job's execution speed. If possible, please also tell us where these expectations come from (e.g., that is the normal performance of other applications you have used).

✅ (Optional) Include a before/after log that demonstrates a performance degradation

If your normal jobs become noticeably slower, you can add a before/after log showing the differences in these jobs' duration (using the Job Monitoring dashboard as above).

✅ (Optional) Check the job logs to see which steps take a long time to finish

You can see the detailed logs of your job by selecting the Logs button in the Job Monitoring dashboard. These logs help pinpoint which steps are slow.

Other settings to control Job Queue resources

Below are the settings that allow you to have more control over your Job Queue resources.

Limit Number of Report Jobs Run Per User

Since the maximum number of concurrent jobs is capped per queue, when a user purposely or accidentally starts many jobs at once, these jobs quickly take up all the available slots in the queue, leaving other users unable to use the system. To prevent this, admins can set a limit on how many report jobs each user can run concurrently. This is done in the Admin Settings page.

Disclaimer

This Limit setting is most suitable when you have many Users using many different Reports.
Otherwise, Job De-duplication may apply and actually result in your Users waiting longer. See explanation below.

How Report-Jobs-Per-User-Limit works with Job De-duplication

When multiple Users use the same Reports, their Jobs would be de-duped and only the first Jobs are executed.

So for example, if the Report-Jobs-Per-User-Limit is 5, then this can happen:

  1. User A visits Dashboard X with 10 widgets -> creates 10 Report Jobs (ID 1 to 10).
    • Since the Limit is 5, only the first 5 Jobs (ID 1 to 5) will be Running, and the latter 5 (ID 6 to 10) will be Pending.
  2. User B visits Dashboard X -> creates 10 new Report Jobs (ID 11 to 20).
  3. The system detects that the Jobs with ID 11 to 20 are the same as the Jobs with ID 1 to 10.
    • -> It de-duplicates those Jobs, and hence User B also uses the same 10 Jobs as User A (ID 1 to 10).
  4. -> Although there are 2 concurrent Users, only 5 Report Jobs (ID 1 to 5) are running at a time.
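The interaction above can be reproduced with a small simulation. This is a simplified sketch, not the actual scheduler: the `schedule` function is hypothetical, it takes a snapshot at a single moment (no job finishes), and it identifies a Job by its widget name instead of a query hash.

```python
def schedule(requests, per_user_limit):
    """Classify job requests, given as (user, query) in submission order.

    Returns (running, pending, deduped). A duplicate request is attached to
    the user whose identical job was submitted first.
    """
    owner_of = {}       # query -> user whose job is actually executed
    running_count = {}  # user -> number of currently running jobs
    running, pending, deduped = [], [], []
    for user, query in requests:
        if query in owner_of:
            # Same query already submitted: de-duplicate, no new job runs.
            deduped.append((user, query, owner_of[query]))
            continue
        owner_of[query] = user
        if running_count.get(user, 0) < per_user_limit:
            running_count[user] = running_count.get(user, 0) + 1
            running.append((user, query))
        else:
            pending.append((user, query))
    return running, pending, deduped


widgets = [f"widget_{i}" for i in range(1, 11)]
requests = [("A", w) for w in widgets] + [("B", w) for w in widgets]
running, pending, deduped = schedule(requests, per_user_limit=5)
print(len(running), len(pending), len(deduped))
```

All 10 of User B's requests are de-duplicated onto User A's Jobs, so User B's own limit of 5 is never used and both users wait on the same 5 running Jobs.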

Limit the Query Timeout of your Database Connection

Setting a Query Timeout helps prevent slow queries from occupying the Job Queue for too long and blocking other Jobs.

Please refer to this documentation to configure the Query Timeout of your Database Connection.

Example Scenario

Let's say you have a Dashboard of 25 Reports, and each Report takes 1 second to run. By default, the Report Job Queue has 10 slots.

Therefore, when you open the Dashboard (and the cache is not available):

  1. At t = 0, 10 Reports will be executed first in parallel.
  2. At t = 1 second, the first 10 Reports will have finished, so the next 10 Reports will be executed in parallel.
  3. At t = 2 seconds, the second 10 Reports will have finished, so the last 5 Reports will be executed next in parallel.
  4. At t = 3 seconds, the last 5 Reports will have finished.

-> It takes 3 seconds in total to run your whole Dashboard.

Then, an Analyst makes some modifications to the first 10 Reports, causing them to take 5 seconds to run. Without any Query Timeout, those first 10 Reports will occupy the whole Job Queue for 5 seconds and delay the other Reports.

As a result:

  • It now takes 7 seconds in total to run your whole Dashboard
  • It takes 6 seconds to see the result of Report 11, even when Report 11 takes only 1 second to run

With a Query Timeout of 1 second, the first 10 Reports will be terminated after 1 second, freeing the Job Workers in the Job Queue for the other Reports.

Thus, the Query Timeout configuration can be handy for you to control the Job Queue.
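The arithmetic in this scenario can be checked with a short simulation. It is a simplified model written for this example (the `simulate` function is hypothetical): Reports start in submission order as soon as a slot is free, and a Report that exceeds the timeout is terminated at the timeout.

```python
import heapq


def simulate(durations, slots, timeout=None):
    """Return the finish (or termination) time of each Report, in order."""
    free_at = [0.0] * slots  # min-heap: time at which each slot becomes free
    heapq.heapify(free_at)
    finish_times = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available slot
        run_for = duration if timeout is None else min(duration, timeout)
        end = start + run_for
        finish_times.append(end)
        heapq.heappush(free_at, end)
    return finish_times


baseline = simulate([1] * 25, slots=10)
slowed = simulate([5] * 10 + [1] * 15, slots=10)
with_timeout = simulate([5] * 10 + [1] * 15, slots=10, timeout=1)

print(max(baseline))      # whole Dashboard, original Reports
print(max(slowed))        # whole Dashboard, first 10 Reports slowed down
print(slowed[10])         # when Report 11 finishes without a timeout
print(max(with_timeout))  # whole Dashboard with a 1-second Query Timeout
```

Under these assumptions the totals are 3, 7, and 3 seconds respectively, and Report 11 finishes at 6 seconds without a timeout. With the timeout, the 10 slow Reports are terminated and return no result, which is the trade-off for unblocking the queue.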

Notes

Query Timeout can also help protect your Database resources from too much load.

Notes

If a query keeps timing out, it is best to optimize the query or quarantine it (e.g., by moving it to your personal workspace or deleting it) to avoid unnecessary load whenever a Report using that query is opened.

