disco.job – Disco Jobs
This module contains the core objects for creating and interacting with Disco jobs. Often, Job is the only thing you need in order to start running distributed computations with Disco.
Jobs in Disco are used to encapsulate and schedule computation pipelines. A job specifies a worker, the worker environment, a list of inputs, and some additional information about how to run the job. For a full explanation of how the job is specified to the Disco master, see The Job Pack.
A typical pattern in Disco scripts is to run a job synchronously, that is, to block the script until the job has finished. This can be accomplished using the wait() method:

    from disco.job import Job
    results = Job(name).run(**jobargs).wait()
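The chained call works because run() returns the job itself, so wait() can be called on the result. The sketch below illustrates only that control flow. StubJob is a hypothetical stand-in written for this example, not part of Disco, and it never contacts a cluster.

```python
class StubJob:
    """Hypothetical stand-in for disco.job.Job; it runs nothing on a cluster."""

    def __init__(self, name):
        self.name = name
        self._results = None

    def run(self, **jobargs):
        # A real job would be submitted to the master here.
        # The stub only records a placeholder result.
        self._results = ["result-of-%s" % self.name]
        return self  # returning the job is what makes chaining possible

    def wait(self):
        # A real wait() blocks until the job finishes; the stub returns at once.
        return self._results


# Synchronous style: submit and block in a single expression.
results = StubJob("wordcount").run().wait()

# Deferred style: keep the job handle, do other work, then wait.
job = StubJob("wordcount").run()
results_later = job.wait()
```

The deferred style is useful when the script has other client-side work to do between submitting the job and collecting its results.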
Job(name=None, master=None, worker=None, settings=None)
Creates a Disco Job with the given name, master, worker, and settings. Use Job.run() to start the job.
- name (string) – the job name. When you create a handle for an existing job, the name is used as given. When you create a new job, the name given is used as the jobdict.prefix to construct a unique name, which is then stored in the instance.
- master (url of master or disco.core.Disco) – the Disco master to use for submitting or querying the job.
- worker (disco.worker.Worker) – the worker instance used to create and run the job. If none is specified, the job creates one by calling its Worker attribute with no arguments. Worker defaults to disco.worker.classic.worker.Worker.
Note that due to the mechanism used for submitting jobs to the Disco cluster, the submitted job class cannot belong to the __main__ module, but needs to be qualified with a module name. See examples/faq/chain.py for a simple solution for most cases.
proxy_functions = ('clean', 'events', 'kill', 'jobinfo', 'jobpack', 'oob_get', 'oob_list', 'profile_stats', 'purge', 'results', 'stageresults', 'wait')
These methods from disco.core.Disco, which take a jobname as the first argument, are also accessible through the Job object. For instance, you can use job.wait() instead of disco.wait(job.name). The job methods in disco.core.Disco come in handy if you want to manipulate a job that is identified by a jobname instead of a Job object.
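One way such forwarding could be implemented is to bind the job name as the first argument of the corresponding master method. The following is a minimal sketch of that idea, not necessarily how Disco does it. StubMaster and JobHandle are hypothetical names invented for this example, and only two of the proxied methods are stubbed.

```python
from functools import partial


class StubMaster:
    """Hypothetical stand-in for disco.core.Disco; methods take a jobname first."""

    def wait(self, jobname):
        return "waited for %s" % jobname

    def kill(self, jobname):
        return "killed %s" % jobname


class JobHandle:
    """Hypothetical job object that forwards selected calls to its master."""

    proxy_functions = ('kill', 'wait')

    def __init__(self, name, master):
        self.name = name
        self.master = master

    def __getattr__(self, attr):
        # __getattr__ is only consulted when normal lookup fails, so
        # self.name and self.master are never routed through here.
        if attr in self.proxy_functions:
            # Bind the job name as the first positional argument.
            return partial(getattr(self.master, attr), self.name)
        raise AttributeError(attr)


master = StubMaster()
job = JobHandle("wordcount@1", master)

# Both spellings reach the same master method with the same jobname.
via_job = job.wait()
via_master = master.wait(job.name)
```

Keeping the list of forwarded names explicit, as proxy_functions does, means that a misspelled attribute still raises AttributeError instead of being silently sent to the master.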
run(**jobargs)
Creates the JobPack for the worker using disco.worker.Worker.jobdict(), disco.worker.Worker.jobenvs(), disco.worker.Worker.jobhome(), and disco.task.jobdata(), then attempts to submit it. This method executes on the client submitting a job to be run. More information on how job inputs are specified is available in disco.worker.Worker.jobdict(). The default worker implementation is called classic, and is implemented by disco.worker.classic.worker.
Parameters: jobargs (dict) – runtime parameters for the job. Passed to the disco.worker.Worker methods listed above, along with the job itself. The interpretation of the jobargs is performed by the worker interface in disco.worker.Worker and the class implementing that interface (which defaults to disco.worker.classic.worker.Worker).
Raises: disco.error.JobError if the submission fails.
Returns: the Job, with a unique name assigned by the master.
JobPack(version, jobdict, jobenvs, jobhome, jobdata)
This class implements The Job Pack in Python. The attributes correspond to the fields in the job pack file. Use dumps() to serialize the JobPack for sending to the master.
jobdict
The dictionary of job parameters for the master.
See also The Job Dict.
jobenvs
The dictionary of environment variables to set before the worker is run.
See also Job Environment Variables.
jobhome
The zipped archive to use when initializing the job home. This field should contain the contents of the serialized archive.
See also The Job Home.
jobdata
Binary data that the builtin disco.worker.Worker uses for serializing itself.
See also Additional Job Data.
dumps()
Return the serialized JobPack.
Essentially encodes the jobdict and jobenvs dictionaries, and prepends a valid header.
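To make the "encode the dictionaries and prepend a header" step concrete, here is a minimal sketch of a serializer in that style. The header layout (a magic number followed by four field offsets, zero-padded to 128 bytes), the JSON encoding, and the name sketch_dumps are all assumptions made for this illustration. The Job Pack documentation remains the authoritative description of the real format.

```python
import json
import struct

HEADER_SIZE = 128                 # assumed fixed header length in bytes
MAGIC = (0xd5c0 << 16) + 0x0001   # assumed magic number plus version


def sketch_dumps(jobdict, jobenvs, jobhome=b"", jobdata=b""):
    """Serialize the four job pack fields behind a fixed-size header."""
    fields = [
        json.dumps(jobdict).encode("utf-8"),   # dictionaries are encoded...
        json.dumps(jobenvs).encode("utf-8"),
        jobhome,                               # ...binary fields pass through
        jobdata,
    ]
    # Record where each field starts, counting from the end of the header.
    offsets = []
    offset = HEADER_SIZE
    for field in fields:
        offsets.append(offset)
        offset += len(field)
    # Five big-endian unsigned 32-bit integers: magic, then the four offsets.
    toc = struct.pack("!IIIII", MAGIC, *offsets)
    header = toc + b"\0" * (HEADER_SIZE - len(toc))
    return header + b"".join(fields)


pack = sketch_dumps({"prefix": "wordcount"}, {"LANG": "C"}, jobdata=b"payload")
```

Storing offsets in the header lets a reader slice out any single field without parsing the others, which matters when jobhome or jobdata is large.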