disco.job – Disco Jobs
This module contains the core objects for creating and interacting with Disco jobs.
Often, Job is the only thing you need in order to start running distributed computations with Disco.
Jobs in Disco are used to encapsulate and schedule computation pipelines. A job specifies a worker, the worker environment, a list of inputs, and some additional information about how to run the job. For a full explanation of how the job is specified to the Disco master, see The Job Pack.
A typical pattern in Disco scripts is to run a job synchronously, that is, to block the script until the job has finished. This can be accomplished using the Job.wait() method:
from disco.job import Job
results = Job(name).run(**jobargs).wait()
class disco.job.Job(name=None, master=None, worker=None, settings=None)
Creates a Disco Job with the given name, master, worker, and settings. Use Job.run() to start the job.
Parameters:
- name (string) – the job name. When you create a handle for an existing job, the name is used as given. When you create a new job, the name given is used as the jobdict.prefix to construct a unique name, which is then stored in the instance.
- master (url of master or disco.core.Disco) – the Disco master to use for submitting or querying the job.
- worker (disco.worker.Worker) – the worker instance used to create and run the job. If none is specified, the job creates a worker using its Job.Worker attribute.
Worker
Defaults to disco.worker.classic.worker.Worker. If no worker parameter is specified, Worker is called with no arguments to construct the worker.
Note
Due to the mechanism used for submitting jobs to the Disco cluster, the submitted job class cannot belong to the __main__ module, but needs to be qualified with a module name. See examples/faq/chain.py for a simple solution for most cases.
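The restriction is easier to picture with Python's standard pickle module, which records a class by its module-qualified name instead of copying its code. The sketch below is only an analogy: it assumes that Disco's job submission refers to the job class in a similar by-name fashion, and it uses fractions.Fraction as a stand-in for a job class defined in an importable module.

```python
import pickle
from fractions import Fraction

# Fraction lives in the importable module "fractions", so its pickle only
# needs to name the module and the class; the class body is not copied.
payload = pickle.dumps(Fraction(1, 3))

module_name = Fraction.__module__    # the module the class is qualified with
class_name = Fraction.__qualname__   # the class name inside that module

# Both names are embedded in the serialized bytes as a reference.
has_module_ref = module_name.encode() in payload
has_class_ref = class_name.encode() in payload

# Unpickling imports the module by name and looks the class up there.
restored = pickle.loads(payload)
print(has_module_ref, has_class_ref, restored == Fraction(1, 3))
```

A class defined in a script that is run directly has its __module__ set to '__main__', and in another process __main__ is a different program, so a by-name lookup cannot find it. Moving the class into its own module gives it a name that can be resolved elsewhere.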
proxy_functions = ('clean', 'events', 'kill', 'jobinfo', 'jobpack', 'oob_get', 'oob_list', 'profile_stats', 'purge', 'results', 'stageresults', 'wait')
These methods from disco.core.Disco, which take a jobname as the first argument, are also accessible through the Job object. For instance, you can use job.wait() instead of disco.wait(job.name). The job methods in disco.core.Disco come in handy if you want to manipulate a job that is identified by a jobname instead of a Job object.
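The forwarding described above can be illustrated with a small stand-alone sketch. It shows one way a job object could expose master methods with its own name filled in as the first argument. ToyMaster and ToyJob are hypothetical stand-ins, not the real disco.core.Disco and disco.job.Job classes, and the use of __getattr__ with functools.partial is an assumption about how such a proxy might be built.

```python
import functools


class ToyMaster:
    """Hypothetical stand-in for disco.core.Disco; not the real class."""

    def __init__(self):
        self.calls = []

    def wait(self, jobname):
        # Record the call and return a placeholder result list.
        self.calls.append(("wait", jobname))
        return ["result-of-" + jobname]

    def kill(self, jobname):
        self.calls.append(("kill", jobname))


class ToyJob:
    """Hypothetical stand-in for disco.job.Job."""

    # Names of master methods that take a jobname as the first argument.
    proxy_functions = ("kill", "wait")

    def __init__(self, name, master):
        self.name = name
        self.master = master

    def __getattr__(self, attr):
        # Invoked only when normal attribute lookup fails.
        if attr in self.proxy_functions:
            # Bind this job's name as the first argument of the master method.
            return functools.partial(getattr(self.master, attr), self.name)
        raise AttributeError(attr)


master = ToyMaster()
job = ToyJob("wordcount@1", master)

via_job = job.wait()                # proxied call
via_master = master.wait(job.name)  # equivalent direct call
print(via_job == via_master)
```

Both calls reach the same master method with the same job name, which is why either form can be used depending on whether you hold a job object or only its name.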
run(**jobargs)
Creates the JobPack for the worker using disco.worker.Worker.jobdict(), disco.worker.Worker.jobenvs(), disco.worker.Worker.jobhome(), disco.task.jobdata(), and attempts to submit it. This method executes on the client submitting a job to be run. More information on how job inputs are specified is available in disco.worker.Worker.jobdict(). The default worker implementation is called classic, and is implemented by disco.worker.classic.worker.
Parameters: jobargs (dict) – runtime parameters for the job. Passed to the disco.worker.Worker methods listed above, along with the job itself. The interpretation of the jobargs is performed by the worker interface in disco.worker.Worker and the class implementing that interface (which defaults to disco.worker.classic.worker).
Raises: disco.error.JobError if the submission fails.
Returns: the Job, with a unique name assigned by the master.
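The flow of run() can be sketched with toy classes: each part of the pack is produced from the job and the jobargs, the pack is submitted, and the job is returned with the name the master assigned. Everything here is hypothetical, including SketchWorker, SketchMaster, SketchJob, SketchJobError, the dictionary used as a pack, and the "prefix@counter" name format. None of it is Disco's actual implementation.

```python
import itertools


class SketchJobError(Exception):
    """Hypothetical stand-in for disco.error.JobError."""


class SketchWorker:
    """Hypothetical worker; each method receives the job and the jobargs."""

    def jobdict(self, job, **jobargs):
        return {"prefix": job.name, "input": jobargs.get("input", [])}

    def jobenvs(self, job, **jobargs):
        return {"EXAMPLE_ENV": "1"}

    def jobhome(self, job, **jobargs):
        return b""


def sketch_jobdata(job, **jobargs):
    """Hypothetical stand-in for the jobdata step."""
    return b""


class SketchMaster:
    """Hypothetical master that assigns unique names or rejects the pack."""

    def __init__(self, accept=True):
        self.accept = accept
        self._counter = itertools.count(1)

    def submit(self, pack):
        if not self.accept:
            raise SketchJobError("submission rejected")
        # Assumed naming scheme, for illustration only.
        return "{}@{}".format(pack["jobdict"]["prefix"], next(self._counter))


class SketchJob:
    def __init__(self, name, master, worker):
        self.name, self.master, self.worker = name, master, worker

    def run(self, **jobargs):
        # Build every part of the pack from the job itself plus the jobargs.
        pack = {
            "jobdict": self.worker.jobdict(self, **jobargs),
            "jobenvs": self.worker.jobenvs(self, **jobargs),
            "jobhome": self.worker.jobhome(self, **jobargs),
            "jobdata": sketch_jobdata(self, **jobargs),
        }
        # Store the unique name and return the job so calls can be chained.
        self.name = self.master.submit(pack)
        return self


job = SketchJob("wordcount", SketchMaster(), SketchWorker())
returned = job.run(input=["a.txt"])
print(returned is job, job.name)
```

Returning the job itself from run() is what allows the chained form Job(name).run(**jobargs).wait() used earlier in this module's description.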
class disco.job.JobPack(version, jobdict, jobenvs, jobhome, jobdata)
This class implements The Job Pack in Python. The attributes correspond to the fields in the job pack file. Use dumps() to serialize the JobPack for sending to the master.
jobdict
The dictionary of job parameters for the master.
See also The Job Dict.

jobenvs
The dictionary of environment variables to set before the worker is run.
See also Job Environment Variables.

jobhome
The zipped archive to use when initializing the job home. This field should contain the contents of the serialized archive.
See also The Job Home.

jobdata
Binary data that the builtin disco.worker.Worker uses for serializing itself.
See also Additional Job Data.
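To make the idea of a pack with four fields concrete, here is a minimal sketch of a container that stores a version and four offsets in a header, followed by the four sections. The layout is assumed for illustration only: the header format, the JSON encoding of the two dictionaries, and the function names sketch_dumps and sketch_loads are all hypothetical. The actual format, including any magic number and header size, is defined in The Job Pack.

```python
import json
import struct

# Assumed layout: version plus four big-endian unsigned 32-bit offsets.
HEADER_FORMAT = "!5I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def sketch_dumps(version, jobdict, jobenvs, jobhome, jobdata):
    """Serialize the four fields behind a header of section offsets."""
    sections = [
        json.dumps(jobdict).encode("utf-8"),
        json.dumps(jobenvs).encode("utf-8"),
        jobhome,   # already bytes: the zipped archive contents
        jobdata,   # already bytes: opaque worker data
    ]
    offsets = []
    position = HEADER_SIZE
    for section in sections:
        offsets.append(position)
        position += len(section)
    return struct.pack(HEADER_FORMAT, version, *offsets) + b"".join(sections)


def sketch_loads(packed):
    """Inverse of sketch_dumps: slice each section out by its offset."""
    version, *offsets = struct.unpack(HEADER_FORMAT, packed[:HEADER_SIZE])
    bounds = offsets + [len(packed)]
    raw = [packed[bounds[i]:bounds[i + 1]] for i in range(4)]
    return (
        version,
        json.loads(raw[0].decode("utf-8")),
        json.loads(raw[1].decode("utf-8")),
        raw[2],
        raw[3],
    )


packed = sketch_dumps(1, {"prefix": "wordcount"}, {"EXAMPLE_ENV": "1"},
                      b"zip-bytes", b"worker-bytes")
unpacked = sketch_loads(packed)
print(unpacked)
```

Storing offsets in a fixed-size header lets a reader jump straight to any one field without parsing the others, which is a common reason for this kind of layout.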