Describe the issue
Batch2 processing of job instances can load a significant number of workchunks to be processed in parallel. In some testing this has exceeded 100k workchunks, which means high availability and saturation of the processing nodes is paramount for efficient processing and maximum parallelization.
The current JobInstanceProcessor class appears to use a paging iterator to manage this rather than a Java stream or something similar. When testing at scale, it has been observed that nodes process their batch of tasks and then sit idle waiting for the iterator to release the next iteration of chunks.
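For illustration only, here is a minimal sketch of that idle-wait pattern (all names are hypothetical stand-ins; this is not the actual JobInstanceProcessor code). Dispatch stalls at every page boundary until the slowest chunk in the page completes:

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;

class PagedDispatchSketch {
    interface PageFetcher {
        List<String> fetchPage(int pageIndex, int pageSize);
    }

    void process(PageFetcher fetcher, ExecutorService consumers) throws InterruptedException {
        int page = 0;
        List<String> chunkIds;
        while (!(chunkIds = fetcher.fetchPage(page++, 100)).isEmpty()) {
            CountDownLatch latch = new CountDownLatch(chunkIds.size());
            for (String chunkId : chunkIds) {
                consumers.submit(() -> {
                    // processChunk(chunkId); // hypothetical per-chunk work
                    latch.countDown();
                });
            }
            // Every consumer that finishes early idles here until the slowest
            // chunk in the page completes; this page-boundary stall produces
            // the observed trickle down to a single busy consumer.
            latch.await();
        }
    }
}
```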
To Reproduce
Run a Batch2 job that processes a large quantity of workchunks (20k workchunks is a good example).
This could be a job where the first step creates the workchunks to process and the second step processes them across the available consumers.
Observe the behavior of step 2 of the job. In a single-node deployment, Batch2 processing should allow all consumers to process simultaneously. This can be observed per instance in the workchunk table by counting chunks with status 'in-progress' (see the polling sketch below).
As the job runs, you will observe on several occasions that the 'in-progress' queue initially saturates fully across all consumers and then trickles down to a single consumer, instead of remaining at full capacity across all available consumers.
This pattern repeats over and over as records are processed, which significantly impacts performance.
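One way to watch this is a small polling loop against the workchunk table. The table and column names here (BT2_WORK_CHUNK, INSTANCE_ID, STAT, 'IN_PROGRESS') are assumptions based on the default batch2 schema and should be verified against your version:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.time.Instant;

class InProgressMonitor {
    public static void main(String[] args) throws Exception {
        String url = args[0];        // JDBC URL for the HAPI FHIR database
        String instanceId = args[1]; // the Batch2 job instance under test
        // Assumed default batch2 schema names; verify against your version.
        String sql = "SELECT COUNT(*) FROM BT2_WORK_CHUNK"
                + " WHERE INSTANCE_ID = ? AND STAT = 'IN_PROGRESS'";
        try (Connection conn = DriverManager.getConnection(url)) {
            while (true) {
                try (PreparedStatement ps = conn.prepareStatement(sql)) {
                    ps.setString(1, instanceId);
                    try (ResultSet rs = ps.executeQuery()) {
                        rs.next();
                        System.out.println(Instant.now() + " in-progress=" + rs.getLong(1));
                    }
                }
                Thread.sleep(5_000); // poll every 5 seconds
            }
        }
    }
}
```

Plotting these counts over time makes the saturate-then-trickle pattern described above easy to see.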
Expected behavior
JobInstanceProcessor should allow all available consumers to process available tasks until no tasks remain. The only time 'in-progress' tasks should fall below available capacity is when the number of remaining tasks is less than the available capacity.
Suggested fix
Convert the usage of the paging iterator to a stream instead, to avoid the need for manual iteration and take a more best-practices approach to managing large batched workloads.
Another option would be to eliminate the use of PagingIterator and find an alternative approach to queueing workchunks that neither bloats memory nor underutilizes the available processors; a rough sketch of this option follows.
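As an illustration of that second option, here is a minimal producer/consumer sketch using a bounded queue. All names (PageFetcher, processChunk, etc.) are hypothetical stand-ins, not HAPI FHIR APIs; it only shows the shape of the idea, with the producer paging through the database while consumers drain chunks continuously:

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class StreamingDispatchSketch {
    interface PageFetcher {
        List<String> fetchPage(int pageIndex, int pageSize);
    }

    static final String POISON = "__no_more_chunks__";

    void process(PageFetcher fetcher, int consumerCount) throws InterruptedException {
        // A small bounded queue keeps memory flat even with 100k+ workchunks,
        // while letting the producer stay ahead of the consumers so none of
        // them stalls on a page boundary.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1_000);

        ExecutorService consumers = Executors.newFixedThreadPool(consumerCount);
        for (int i = 0; i < consumerCount; i++) {
            consumers.submit(() -> {
                try {
                    while (true) {
                        String chunkId = queue.take();
                        if (POISON.equals(chunkId)) {
                            break;
                        }
                        // processChunk(chunkId); // hypothetical per-chunk work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // The producer still pages through the database, but consumers drain
        // the queue continuously instead of waiting for each page to finish.
        int page = 0;
        List<String> chunkIds;
        while (!(chunkIds = fetcher.fetchPage(page++, 100)).isEmpty()) {
            for (String chunkId : chunkIds) {
                queue.put(chunkId); // blocks only when the queue is full
            }
        }
        for (int i = 0; i < consumerCount; i++) {
            queue.put(POISON); // one shutdown marker per consumer
        }
        consumers.shutdown();
        consumers.awaitTermination(1, TimeUnit.HOURS);
    }
}
```

With this shape, a slow chunk delays only its own worker rather than the whole page, and the bounded queue caps memory regardless of the job size.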