servicex package

Module contents

servicex.Delivery(value)

alias of DeliveryEnum

Member Type: str

pydantic model servicex.General[source]

Bases: DocStringBaseModel

Represents a group of samples to be transformed together.

Parameters:
  • Codegen – (typing.Optional[str]) Code generator name to be applied across all of the samples, if applicable. Users generally don’t need to specify this; it is implied by the query class.

  • OutputFormat – (OutputFormatEnum) Output format for the transform request.

  • Delivery – (DeliveryEnum) Specifies the delivery method for the output files.

  • OutputDirectory – (typing.Optional[str]) Directory in which to write a YAML file describing the output files.

  • OutFilesetName – (str) Name of the YAML file that will be created in the output directory.

  • IgnoreLocalCache – (bool) Flag to ignore local cache for all samples.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

enum OutputFormatEnum(value)[source]

Bases: str, Enum

Specifies the output format for the transform request.

Member Type:

str

Valid values are as follows:

parquet = <OutputFormatEnum.parquet: 'parquet'>
root_ttree = <OutputFormatEnum.root_ttree: 'root-ttree'>

The Enum and its members also have the following methods:

to_ResultFormat() ResultFormat[source]

Converts the OutputFormatEnum member to the corresponding ResultFormat member, which is what the TransformRequest actually uses. Keeping two enum classes allows their string values to differ, which maintains backend compatibility.
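Because the enum derives from both str and Enum, its members compare equal to their string values and can be looked up by value. The following is a minimal sketch of that behaviour using a local stand-in class. OutputFormatStandIn is hypothetical: it only mirrors the two member values listed above, it is not the servicex class, and it does not implement to_ResultFormat.

```python
from enum import Enum


class OutputFormatStandIn(str, Enum):
    """Hypothetical stand-in mirroring the documented member values."""

    parquet = "parquet"
    root_ttree = "root-ttree"


# Looking a member up by its string value returns the member itself.
fmt = OutputFormatStandIn("root-ttree")

# Members of a str-based Enum compare equal to plain strings.
is_root = fmt == "root-ttree"

# The Python-side name uses an underscore, the value uses a hyphen.
member_name = fmt.name
```

Note that the member name (root_ttree) and the member value ('root-ttree') differ, so configuration files would presumably use the hyphenated value.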

enum DeliveryEnum(value)[source]

Bases: str, Enum

Member Type:

str

Valid values are as follows:

LocalCache = <DeliveryEnum.LocalCache: 'LocalCache'>
URLs = <DeliveryEnum.URLs: 'URLs'>

field Codegen: str | None = None

Code generator name to be applied across all of the samples, if applicable. Users generally don’t need to specify this; it is implied by the query class.

field OutputFormat: OutputFormatEnum = OutputFormatEnum.root_ttree

Output format for the transform request.

field Delivery: DeliveryEnum = DeliveryEnum.LocalCache

Specifies the delivery method for the output files.

field OutputDirectory: str | None = None

Directory in which to write a YAML file describing the output files.

field OutFilesetName: str = 'servicex_fileset'

Name of the YAML file that will be created in the output directory.

field IgnoreLocalCache: bool = False

Flag to ignore local cache for all samples.
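As an illustration of the fields above, the sketch below overrides three of the documented defaults. The field names and values come from this listing. The build_general helper is hypothetical, and passing the enum fields as plain strings assumes pydantic's usual coercion of strings to str-based enum members. The servicex import is deferred so the settings can be inspected without the package installed.

```python
# Field names as documented for servicex.General.
GENERAL_FIELDS = {
    "Codegen",
    "OutputFormat",
    "Delivery",
    "OutputDirectory",
    "OutFilesetName",
    "IgnoreLocalCache",
}

# Override three defaults; unspecified fields keep the documented defaults.
general_settings = {
    "OutputFormat": "parquet",  # default is 'root-ttree'
    "Delivery": "URLs",         # default is 'LocalCache'
    "IgnoreLocalCache": True,   # default is False
}

# Guard against typos in field names before handing the dict to pydantic.
unknown = set(general_settings) - GENERAL_FIELDS


def build_general(settings):
    """Hypothetical helper: validate a settings dict into a General model."""
    from servicex import General  # requires the servicex package

    return General(**settings)
```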

servicex.OutputFormat(value)

alias of OutputFormatEnum

Member Type: str

enum servicex.ProgressBarFormat(value)[source]

Bases: str, Enum

Specify the way progress bars are displayed.

Member Type:

str

Valid values are as follows:

expanded = <ProgressBarFormat.expanded: 'expanded'>
compact = <ProgressBarFormat.compact: 'compact'>
none = <ProgressBarFormat.none: 'none'>

enum servicex.ProgressBarFormat(value)[source]

Bases: str, Enum

Direct the output to an object store or a POSIX volume.

Member Type:

str

Valid values are as follows:

object_store = <ResultDestination.object_store: 'object-store'>
volume = <ResultDestination.volume: 'volume'>

pydantic model servicex.Sample[source]

Bases: DocStringBaseModel

Represents a single transform request within a larger submission.

Parameters:
  • Name – (str)

  • Dataset – (typing.Optional[servicex.dataset_identifier.DataSetIdentifier])

  • NFiles – (typing.Optional[int])

  • Query – (typing.Union[str, servicex.query_core.QueryStringGenerator, NoneType])

  • IgnoreLocalCache – (bool)

  • Codegen – (typing.Optional[str])

  • RucioDID – (typing.Optional[str])

  • XRootDFiles – (typing.Union[str, typing.List[str], NoneType])

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

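Validators:
  • validate_did_xor_file » all fields

  • validate_nfiles_is_not_zero » all fields

  • truncate_long_sample_name » Name

field Name: str [Required]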

The name of the sample. This makes it easier to identify the sample in the output.

field Dataset: DataSetIdentifier | None = None

Dataset identifier for the sample.

field NFiles: int | None = None

Limit the number of files to be used in the sample. The DID Finder guarantees that the same files are returned on each invocation. Set to None to use all files.

field Query: str | QueryStringGenerator | None = None

Query string or query generator for the sample.

field IgnoreLocalCache: bool = False

Flag to ignore local cache.

field Codegen: str | None = None

Code generator name, if applicable. Users generally don’t need to specify this; it is implied by the query class.

field RucioDID: str | None = None

Rucio Dataset Identifier, if applicable.

Deprecated: Use ‘Dataset’ instead.

field XRootDFiles: str | List[str] | None = None

XRootD file(s) associated with the sample.

Deprecated: Use ‘Dataset’ instead.

property dataset_identifier: DataSetIdentifier

Access the dataset identifier for the sample.

validator validate_did_xor_file  »  all fields[source]

Ensure that only one of Dataset, XRootDFiles, or RucioDID is specified.

validator validate_nfiles_is_not_zero  »  all fields[source]

Ensure that NFiles is not set to zero.

validator truncate_long_sample_name  »  Name[source]

Truncate the sample name to 128 characters if it exceeds that length, and print a warning message.

property hash
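
The three validators documented for Sample can be restated as a small stand-alone check. This is a hypothetical re-statement of the documented rules for illustration only: check_sample_rules and its argument names are not part of servicex, and the real validators may differ in detail (for example in the exception type or the warning text).

```python
def check_sample_rules(name, dataset=None, rucio_did=None,
                       xrootd_files=None, n_files=None):
    """Hypothetical re-statement of the documented Sample checks.

    Returns the (possibly truncated) sample name, or raises ValueError.
    """
    # Rule 1: only one of Dataset, RucioDID, or XRootDFiles may be given.
    sources = [s for s in (dataset, rucio_did, xrootd_files) if s is not None]
    if len(sources) > 1:
        raise ValueError("Specify only one of Dataset, RucioDID, XRootDFiles")

    # Rule 2: NFiles must not be zero (None means "use all files").
    if n_files == 0:
        raise ValueError("NFiles must not be zero; use None for all files")

    # Rule 3: names longer than 128 characters are truncated with a warning.
    if len(name) > 128:
        print("Warning: sample name truncated to 128 characters")
        name = name[:128]
    return name
```

A call such as check_sample_rules("ttbar", dataset="some-dataset", n_files=5) passes all three checks and returns the name unchanged.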
pydantic model servicex.ServiceXSpec[source]

Bases: DocStringBaseModel

ServiceX submission specification. Pass this into the ServiceX deliver function.

Parameters:
  • General – (General) General settings for the transform request

  • Sample – (typing.List[servicex.databinder_models.Sample]) List of samples to be transformed

  • Definition – (typing.Optional[typing.List]) Any reusable definitions that are needed for the transform request

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Validators:
  • validate_unique_sample » Sample

field General: General = General(Codegen=None, OutputFormat=<OutputFormatEnum.root_ttree: 'root-ttree'>, Delivery=<DeliveryEnum.LocalCache: 'LocalCache'>, OutputDirectory=None, OutFilesetName='servicex_fileset', IgnoreLocalCache=False)

General settings for the transform request

field Sample: List[Sample] [Required]

List of samples to be transformed

field Definition: List | None = None

Any reusable definitions that are needed for the transform request

validator validate_unique_sample  »  Sample[source]
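
Since deliver accepts a mapping as well as a ServiceXSpec, a specification can be sketched as a nested dictionary whose keys follow the field names above. The helpers make_spec and validate_spec are hypothetical. The dataset and query arguments are supplied by the caller, because this reference only states that Dataset is a DataSetIdentifier and Query is a string or a QueryStringGenerator. The servicex import is deferred so the dictionary can be built without the package installed.

```python
def make_spec(dataset, query, name="my_sample", n_files=1):
    """Hypothetical helper: build a spec mapping with one Sample.

    Keys mirror the documented General, Sample, and ServiceXSpec fields.
    """
    return {
        "General": {
            "OutputFormat": "root-ttree",  # documented default value
            "Delivery": "LocalCache",      # documented default value
        },
        "Sample": [
            {
                "Name": name,
                "Dataset": dataset,  # a DataSetIdentifier instance
                "Query": query,      # str or QueryStringGenerator
                "NFiles": n_files,   # None would mean "use all files"
            }
        ],
    }


def validate_spec(spec):
    """Hypothetical helper: validate the mapping into a ServiceXSpec."""
    from servicex import ServiceXSpec  # requires the servicex package

    return ServiceXSpec(**spec)
```

Limiting NFiles to a small number while developing a query keeps test transforms short; setting it back to None processes the full dataset.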
servicex.deliver(spec: ServiceXSpec | Mapping[str, Any] | str | Path, config_path: str | None = None, servicex_name: str | None = None, return_exceptions: bool = True, fail_if_incomplete: bool = True, ignore_local_cache: bool = False, progress_bar: ProgressBarFormat = ProgressBarFormat.expanded, concurrency: int = 10)[source]

Execute a ServiceX query.

Parameters:
  • spec – The specification of the ServiceX query, given as a ServiceXSpec object, a dictionary, or a path (str or Path) to a YAML file containing the specification.

  • config_path – The filesystem path to search for the servicex.yaml or .servicex file.

  • servicex_name – The name of the ServiceX instance, as specified in the configuration YAML file (None will give the default backend).

  • return_exceptions – If something goes wrong, bubble up the underlying exception for debugging (as opposed to just having a generic error).

  • fail_if_incomplete – If True, a transformation in which not all input files are transformed is marked as a failure and no outputs are made available. If False, a partial file list is returned.

  • ignore_local_cache – If True, ignore the local query cache and always run the query on the remote ServiceX instance.

  • progress_bar – Specify the kind of progress bar to show. ProgressBarFormat.expanded (the default) means every Sample will have its own progress bars; ProgressBarFormat.compact gives one summary progress bar for all transformations; ProgressBarFormat.none switches off progress bars completely.

  • concurrency – Specify how many downloads to run in parallel (default is 10).

Returns:

A dictionary mapping the name of each Sample to a GuardList with the file names or URLs for the outputs.
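
A call to deliver and a simple pass over its return value might look like the sketch below. The helpers count_outputs and run are hypothetical, and the keyword values are arbitrary choices among the documented options. The sketch assumes that each GuardList supports len() like an ordinary list once its transform has succeeded. Running run requires the servicex package, a valid configuration file, and network access to a ServiceX instance.

```python
def count_outputs(results):
    """Return {sample name: number of output files or URLs}.

    Assumes each value behaves like a list (see the note on GuardList).
    """
    return {name: len(files) for name, files in results.items()}


def run(spec, servicex_name=None):
    """Hypothetical helper: submit a spec and summarise the outputs."""
    from servicex import ProgressBarFormat, deliver  # requires servicex

    results = deliver(
        spec,
        servicex_name=servicex_name,             # None selects the default backend
        ignore_local_cache=False,                # reuse cached results if present
        progress_bar=ProgressBarFormat.compact,  # one summary bar for all samples
        concurrency=4,                           # parallel downloads (default 10)
    )
    return count_outputs(results)
```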