`disco.job` – Disco Jobs

This module contains the core objects for creating and interacting with Disco jobs. Often, Job is the only thing you need in order to start running distributed computations with Disco.

Jobs in Disco are used to encapsulate and schedule computation pipelines. A job specifies a worker, the worker environment, a list of inputs, and some additional information about how to run the job. For a full explanation of how the job is specified to the Disco master, see The Job Pack.

A typical pattern in Disco scripts is to run a job synchronously, that is, to block the script until the job has finished. This can be accomplished using the Job.wait() method:

from disco.job import Job
results = Job(name).run(**jobargs).wait()
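
A fuller sketch of the same synchronous pattern, assuming the default classic worker and its map/reduce job arguments. The names word_map, word_reduce, run_wordcount, and input_urls are illustrative, not part of this module. The function signatures follow the classic worker's conventions as an assumption.

```python
def word_map(line, params):
    # Emit a (word, 1) pair for every whitespace-separated word in the line.
    for word in line.split():
        yield word, 1


def word_reduce(pairs, params):
    # Imports are kept inside the function on the assumption that the
    # function is shipped to worker nodes without its module globals.
    from itertools import groupby
    from operator import itemgetter

    # Sort so that equal words are adjacent, then sum the counts per word.
    for word, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_wordcount(input_urls):
    # Imported lazily so the two functions above can be tried without Disco.
    from disco.core import result_iterator
    from disco.job import Job

    # run() submits the job; wait() blocks until the job has finished and
    # returns the result locations, which result_iterator reads back as
    # (key, value) pairs.
    results = Job().run(input=input_urls,
                        map=word_map,
                        reduce=word_reduce).wait()
    return dict(result_iterator(results))
```

Calling run_wordcount() with a list of input URLs from a script blocks until the job completes. It needs a reachable Disco master, so only word_map and word_reduce can be exercised locally.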

class disco.job.Job(name=None, master=None, worker=None, settings=None)

Creates a Disco Job with the given name, master, worker, and settings. Use Job.run() to start the job.

Parameters:

name (string) – the job name. When you create a handle for an existing job, the name is used as given. When you create a new job, the name given is used as the jobdict.prefix to construct a unique name, which is then stored in the instance.
master (url of master or disco.core.Disco) – the Disco master to use for submitting or querying the job.
worker (disco.worker.Worker) – the worker instance used to create and run the job. If none is specified, the job creates a worker using its Job.Worker attribute.

Worker

Defaults to disco.worker.classic.worker.Worker. If no worker parameter is specified, Worker is called with no arguments to construct the worker.

Note

Because of the mechanism used for submitting jobs to the Disco cluster, the submitted job class cannot belong to the __main__ module; it must be qualified with a module name. See examples/faq/chain.py for a simple solution that covers most cases.
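
The restriction in the note can be checked before submitting. The following is a minimal sketch: is_module_qualified, ScriptJob, ModuleJob, and the module name myjobs are hypothetical, and plain classes stand in for Job subclasses so the snippet runs without Disco.

```python
def is_module_qualified(cls):
    # A class defined directly in the script being run reports "__main__"
    # as its module; a class imported from a module reports that module's name.
    return getattr(cls, "__module__", "__main__") != "__main__"


# Stand-ins for job classes, built with explicit module names for illustration.
ScriptJob = type("ScriptJob", (object,), {"__module__": "__main__"})
ModuleJob = type("ModuleJob", (object,), {"__module__": "myjobs"})

print(is_module_qualified(ScriptJob))
print(is_module_qualified(ModuleJob))
```

One common workaround, which is assumed here to be what the referenced example does, is for the script to import its own job classes by module name inside its `if __name__ == "__main__":` block, so that the classes it submits are module-qualified.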

proxy_functions = ('clean', 'events', 'kill', 'jobinfo', 'jobpack', 'oob_get', 'oob_list', 'profile_stats', 'purge', 'results', 'stageresults', 'wait')

These methods from disco.core.Disco, which take a jobname as the first argument, are also accessible through the Job object:

disco.core.Disco.clean()

disco.core.Disco.events()

disco.core.Disco.kill()

disco.core.Disco.jobinfo()

disco.core.Disco.jobpack()

disco.core.Disco.oob_get()

disco.core.Disco.oob_list()

disco.core.Disco.profile_stats()

disco.core.Disco.purge()

disco.core.Disco.results()

disco.core.Disco.stageresults()

disco.core.Disco.wait()

For instance, you can use job.wait() instead of disco.wait(job.name). The job methods in disco.core.Disco come in handy if you want to manipulate a job that is identified by a jobname instead of a Job object.
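
The proxying described above can be sketched with stand-in classes. FakeMaster and JobHandle are hypothetical and only illustrate the idea of binding the job name as the first argument. They are not the Disco implementation, and they contact no cluster.

```python
from functools import partial


class FakeMaster(object):
    """Hypothetical stand-in for disco.core.Disco; returns strings instead of
    talking to a cluster."""

    def wait(self, jobname):
        return "waited for %s" % jobname

    def kill(self, jobname):
        return "killed %s" % jobname


class JobHandle(object):
    """Hypothetical stand-in for Job that forwards selected methods."""

    proxy_functions = ("kill", "wait")

    def __init__(self, name, master):
        self.name = name
        self.master = master

    def __getattr__(self, attr):
        # __getattr__ runs only when normal attribute lookup fails, so
        # name, master, and proxy_functions never reach this point.
        if attr in self.proxy_functions:
            # Bind the job name as the first argument of the master's method.
            return partial(getattr(self.master, attr), self.name)
        raise AttributeError(attr)


master = FakeMaster()
job = JobHandle("wordcount@1", master)

# Both calls go through the same master method with the same job name.
print(job.wait() == master.wait(job.name))
```

With this shape, any method name missing from proxy_functions raises AttributeError as usual, so the handle exposes only the listed methods.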

run(**jobargs)

Creates the JobPack for the worker using disco.worker.Worker.jobdict(), disco.worker.Worker.jobenvs(), disco.worker.Worker.jobhome(), and disco.task.jobdata(), then attempts to submit it to the master. This method executes on the client that submits the job. For more information on how job inputs are specified, see disco.worker.Worker.jobdict(). The default worker implementation is called classic and is implemented by disco.worker.classic.worker.

Parameters:	jobargs (dict) – runtime parameters for the job, passed along with the job itself to the `disco.worker.Worker` methods listed above. The jobargs are interpreted by the worker interface in `disco.worker.Worker` and by the class implementing that interface (`disco.worker.classic.worker` by default).
Raises:	`disco.error.JobError` if the submission fails.
Returns:	the `Job`, with a unique name assigned by the master.

class disco.job.JobPack(version, jobdict, jobenvs, jobhome, jobdata)

This class implements The Job Pack in Python. The attributes correspond to the fields in the job pack file. Use dumps() to serialize the JobPack for sending to the master.

jobdict

The dictionary of job parameters for the master.
