public class ConcurrentModelPool<T extends InferenceModel>
extends Object
implements AutoCloseable

Type Parameters:
    T - model-type

The copies can variously use GPU and CPU for execution, with GPU always being given priority.
Constructor and Description |
---|
ConcurrentModelPool(ConcurrencyPlan plan, CreateModelForPool<T> createModel, Logger logger): Creates with a particular plan and function to create models. |

Modifier and Type | Method and Description |
---|---|
void | close(): Close all models, to indicate they are no longer in use, and to perform tidy-up. |
<S> S | executeOrWait(CheckedFunction<ConcurrentModel<T>,S,ConcurrentModelException> functionToExecute): Execute on the next available model (or wait until one becomes available). |
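The blocking behaviour of executeOrWait can be sketched with a queue-based analogue. This is an illustrative simplification, not the library's implementation: the class below and its names are invented here, the real method throws Throwable rather than a runtime exception, and the GPU-to-CPU fallback described under executeOrWait is omitted.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Hypothetical, simplified analogue of ConcurrentModelPool for illustration only. */
public class ModelPoolSketch<T> implements AutoCloseable {

    /** Models not currently executing a job. */
    private final BlockingQueue<T> available = new LinkedBlockingQueue<>();

    public ModelPoolSketch(Iterable<T> models) {
        models.forEach(available::add);
    }

    /** Borrows the next free model (waiting if none is free), applies {@code function},
     *  and returns the model to the pool afterwards. */
    public <S> S executeOrWait(Function<T, S> function) {
        T model;
        try {
            model = available.take(); // blocks until a model becomes available
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a model", e);
        }
        try {
            return function.apply(model);
        } finally {
            available.add(model); // make the model available to other callers again
        }
    }

    @Override
    public void close() {
        available.clear(); // a real pool would also release each model's resources
    }
}
```

Since the pool is AutoCloseable, it would typically be created in a try-with-resources block, so that close() runs even if executeOrWait throws.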
public ConcurrentModelPool(ConcurrencyPlan plan,
                           CreateModelForPool<T> createModel,
                           Logger logger)
                    throws CreateModelFailedException

Parameters:
    plan - a plan determining how many CPUs and GPUs to use for inference.
    createModel - called to create a new model, as needed.
    logger - where feedback is written about how many GPUs or CPUs were selected.
Throws:
    CreateModelFailedException - if a model cannot be created.

public <S> S executeOrWait(CheckedFunction<ConcurrentModel<T>,S,ConcurrentModelException> functionToExecute)
                    throws Throwable
Execute on the next available model (or wait until one becomes available).

If an exception is thrown while executing on a GPU, the GPU processor is no longer used, and instead an additional CPU node is added. The failed job is tried again.

Type Parameters:
    S - return type
Parameters:
    functionToExecute - function to execute on a given model, possibly throwing an exception.
Returns:
    the value returned by functionToExecute after it is executed.
Throws:
    Throwable - if thrown from functionToExecute while executing on a CPU. It is suppressed if thrown on a GPU.

public void close()
           throws Exception
Specified by:
    close in interface AutoCloseable
Throws:
    Exception - if a model cannot be successfully closed.

Copyright © 2010–2023 Owen Feehan, ETH Zurich, University of Zurich, Hoffmann-La Roche. All rights reserved.