servicex package¶
Module contents¶
- servicex.Delivery(value)¶
alias of DeliveryEnum
- Member Type:
str
- pydantic model servicex.General[source]¶
Bases:
DocStringBaseModel
Represents a group of samples to be transformed together.
- Parameters:
Codegen – (typing.Optional[str]) Code generator name to be applied across all of the samples, if applicable. Generally users don’t need to specify this; it is implied by the query class.
OutputFormat – (OutputFormatEnum) Output format for the transform request.
Delivery – (DeliveryEnum) Specifies the delivery method for the output files.
OutputDirectory – (typing.Optional[str]) Directory to output a yaml file describing the output files.
OutFilesetName – (str) Name of the yaml file that will be created in the output directory.
IgnoreLocalCache – (bool) Flag to ignore local cache for all samples.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- enum OutputFormatEnum(value)[source]¶
Bases:
str, Enum
Specifies the output format for the transform request.
- Member Type:
str
Valid values are as follows:
- parquet = <OutputFormatEnum.parquet: 'parquet'>¶
- root_ttree = <OutputFormatEnum.root_ttree: 'root-ttree'>¶
The Enum and its members also have the following methods:
- to_ResultFormat() → ResultFormat[source]¶
Converts the OutputFormatEnum member to the corresponding ResultFormat member, which is what the TransformRequest actually uses. This lets the two enum classes use different string values while maintaining backend compatibility.
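The conversion described above can be sketched with two stand-in enums. This is a minimal illustration under stated assumptions, not the library's source: both classes below are local copies, and the ResultFormat string values ('parquet', 'root-file') are assumed purely to show the two enums carrying different strings.

```python
from enum import Enum


class ResultFormat(str, Enum):
    # Stand-in for the backend-facing enum; these string values are assumed.
    parquet = "parquet"
    root_ttree = "root-file"


class OutputFormatEnum(str, Enum):
    # User-facing values, as listed in this reference.
    parquet = "parquet"
    root_ttree = "root-ttree"

    def to_ResultFormat(self) -> ResultFormat:
        # Look up by member name, so the string values are free to differ.
        return ResultFormat[self.name]


converted = OutputFormatEnum.root_ttree.to_ResultFormat()
print(converted.name, converted.value)
```

Mapping by member name rather than by value is one possible design; it keeps the user-facing strings decoupled from whatever the backend expects.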
- enum DeliveryEnum(value)[source]¶
Bases:
str, Enum
- Member Type:
str
Valid values are as follows:
- LocalCache = <DeliveryEnum.LocalCache: 'LocalCache'>¶
- URLs = <DeliveryEnum.URLs: 'URLs'>¶
- field Codegen: str | None = None¶
Code generator name to be applied across all of the samples, if applicable. Generally users don’t need to specify this; it is implied by the query class.
- field OutputFormat: OutputFormatEnum = OutputFormatEnum.root_ttree¶
Output format for the transform request.
- field Delivery: DeliveryEnum = DeliveryEnum.LocalCache¶
Specifies the delivery method for the output files.
- field OutputDirectory: str | None = None¶
Directory to output a yaml file describing the output files.
- field OutFilesetName: str = 'servicex_fileset'¶
Name of the yaml file that will be created in the output directory.
- field IgnoreLocalCache: bool = False¶
Flag to ignore local cache for all samples.
- servicex.OutputFormat(value)¶
alias of OutputFormatEnum
- Member Type:
str
- enum servicex.ProgressBarFormat(value)[source]¶
Bases:
str, Enum
Specify the way progress bars are displayed.
- Member Type:
str
Valid values are as follows:
- expanded = <ProgressBarFormat.expanded: 'expanded'>¶
- compact = <ProgressBarFormat.compact: 'compact'>¶
- none = <ProgressBarFormat.none: 'none'>¶
- enum servicex.ResultDestination(value)[source]¶
Bases:
str, Enum
Direct the output to an object store or a POSIX volume.
- Member Type:
str
Valid values are as follows:
- object_store = <ResultDestination.object_store: 'object-store'>¶
- volume = <ResultDestination.volume: 'volume'>¶
- pydantic model servicex.Sample[source]¶
Bases:
DocStringBaseModel
Represents a single transform request within a larger submission.
- Parameters:
Name – (str)
Dataset – (typing.Optional[servicex.dataset_identifier.DataSetIdentifier])
NFiles – (typing.Optional[int])
Query – (typing.Union[str, servicex.query_core.QueryStringGenerator, NoneType])
IgnoreLocalCache – (bool)
Codegen – (typing.Optional[str])
RucioDID – (typing.Optional[str])
XRootDFiles – (typing.Union[str, typing.List[str], NoneType])
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Validators:
validate_did_xor_file » all fields
validate_nfiles_is_not_zero » all fields
- field Name: str [Required]¶
The name of the sample. This makes it easier to identify the sample in the output.
- field Dataset: DataSetIdentifier | None = None¶
Dataset identifier for the sample
- field NFiles: int | None = None¶
Limit the number of files to be used in the sample. The DID Finder guarantees that the same files are returned on each invocation. Set to None to use all files.
- field Query: str | QueryStringGenerator | None = None¶
Query string or query generator for the sample.
- field IgnoreLocalCache: bool = False¶
Flag to ignore local cache.
- field Codegen: str | None = None¶
Code generator name, if applicable. Generally users don’t need to specify this; it is implied by the query class.
- field RucioDID: str | None = None¶
Rucio Dataset Identifier, if applicable.
Deprecated: Use ‘Dataset’ instead.
- field XRootDFiles: str | List[str] | None = None¶
XRootD file(s) associated with the sample.
Deprecated: Use ‘Dataset’ instead.
- property dataset_identifier: DataSetIdentifier¶
Access the dataset identifier for the sample.
- validator validate_did_xor_file » all fields[source]¶
Ensure that only one of Dataset, XRootDFiles, or RucioDID is specified.
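The check performed by this validator can be sketched as a plain function. This is a hypothetical stand-in, not the library's implementation: the name check_single_source is invented for illustration, the field names are taken from the Sample field list in this reference, and rejecting a sample with no source at all is an assumption.

```python
from typing import Any, Mapping

# Field names as they appear in the Sample field list.
SOURCE_FIELDS = ("Dataset", "RucioDID", "XRootDFiles")


def check_single_source(values: Mapping[str, Any]) -> None:
    """Raise ValueError unless exactly one data source field is set."""
    given = [name for name in SOURCE_FIELDS if values.get(name) is not None]
    if len(given) > 1:
        raise ValueError(
            f"Only one of {', '.join(SOURCE_FIELDS)} may be set; got {', '.join(given)}"
        )
    if not given:
        # Assumption: a sample without any data source is also invalid.
        raise ValueError(f"One of {', '.join(SOURCE_FIELDS)} must be set")


# A single source passes silently.
check_single_source({"Name": "demo", "Dataset": object()})
```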
- validator truncate_long_sample_name » Name[source]¶
Truncate the sample name to 128 characters if it exceeds that length, and print a warning message.
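A minimal sketch of the truncation rule follows. The function name truncate_sample_name is hypothetical, keeping the first 128 characters (rather than the last) is an assumption, and the logging module stands in for however the library actually emits its warning.

```python
import logging

logger = logging.getLogger(__name__)

MAX_NAME_LENGTH = 128  # limit stated in the validator description


def truncate_sample_name(name: str) -> str:
    """Return the name unchanged, or cut to MAX_NAME_LENGTH with a warning."""
    if len(name) > MAX_NAME_LENGTH:
        logger.warning(
            "Sample name is %d characters long; truncating to %d",
            len(name),
            MAX_NAME_LENGTH,
        )
        # Assumption: the leading characters are the ones kept.
        return name[:MAX_NAME_LENGTH]
    return name


short = truncate_sample_name("ttbar_nominal")
long_name = truncate_sample_name("x" * 200)
```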
- property hash¶
- pydantic model servicex.ServiceXSpec[source]¶
Bases:
DocStringBaseModel
ServiceX Submission Specification - pass this into the ServiceX deliver function
- Parameters:
General – (General) General settings for the transform request
Sample – (typing.List[servicex.databinder_models.Sample]) List of samples to be transformed
Definition – (typing.Optional[typing.List]) Any reusable definitions that are needed for the transform request
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Validators:
- field General: General = General(Codegen=None, OutputFormat=<OutputFormatEnum.root_ttree: 'root-ttree'>, Delivery=<DeliveryEnum.LocalCache: 'LocalCache'>, OutputDirectory=None, OutFilesetName='servicex_fileset', IgnoreLocalCache=False)¶
General settings for the transform request
- field Definition: List | None = None¶
Any reusable definitions that are needed for the transform request
- servicex.deliver(spec: ServiceXSpec | Mapping[str, Any] | str | Path, config_path: str | None = None, servicex_name: str | None = None, return_exceptions: bool = True, fail_if_incomplete: bool = True, ignore_local_cache: bool = False, progress_bar: ProgressBarFormat = ProgressBarFormat.expanded, concurrency: int = 10)[source]¶
Execute a ServiceX query.
- Parameters:
spec – The specification of the ServiceX query, either in a dictionary or a ServiceXSpec object.
config_path – The filesystem path to search for the servicex.yaml or .servicex file.
servicex_name – The name of the ServiceX instance, as specified in the configuration YAML file (None will give the default backend).
return_exceptions – If something goes wrong, bubble up the underlying exception for debugging (as opposed to just having a generic error).
fail_if_incomplete – If True, the transformation is marked as a failure and no outputs are available unless all input files are transformed. If False, a partial file list is returned.
ignore_local_cache – If True, ignore the local query cache and always run the query on the remote ServiceX instance.
progress_bar – Specify the kind of progress bar to show. ProgressBarFormat.expanded (the default) means every Sample will have its own progress bars; ProgressBarFormat.compact gives one summary progress bar for all transformations; ProgressBarFormat.none switches off progress bars completely.
concurrency – Specify how many downloads to run in parallel (default is 10).
- Returns:
A dictionary mapping the name of each Sample to a GuardList with the file names or URLs for the outputs.
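A call to deliver might be put together as in the sketch below. It rests on several assumptions: build_spec and run are hypothetical helper names, the dataset argument must be a DataSetIdentifier instance created elsewhere, the query must suit the chosen code generator, and the string values "parquet" and "LocalCache" are assumed to be coerced to the corresponding enum members during validation. The servicex import sits inside run so that the spec builder can be exercised without the package installed.

```python
from typing import Any, Dict, Optional


def build_spec(
    sample_name: str,
    dataset: Any,
    query: Any,
    n_files: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a spec mapping using the field names documented above."""
    return {
        "General": {"OutputFormat": "parquet", "Delivery": "LocalCache"},
        "Sample": [
            {
                "Name": sample_name,
                "Dataset": dataset,
                "Query": query,
                "NFiles": n_files,  # None means use all files
            }
        ],
    }


def run(spec: Dict[str, Any]) -> None:
    """Submit the spec and print each output, grouped by sample name."""
    # Imported here so build_spec stays usable without servicex installed.
    from servicex import ProgressBarFormat, deliver

    results = deliver(spec, progress_bar=ProgressBarFormat.compact)
    for name, files in results.items():
        for path in files:
            print(name, path)
```

With Delivery set to LocalCache the printed entries would be local file names; with URLs they would be links to the outputs.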