Files Controller¶
Data controller. Practically unchanged from the original code in the biomed-client library. Methods were renamed from data to files, and the code received minor changes to adapt it to a single file-service. The original separate tools.py file from the same directory has been merged into this file.
- class genomcore.controllers.files.FilesController(*args, **kwargs)¶
Bases: BaseController
Data controller to interact with the data section of the API
- _idproject¶
If set, applies idproject filter to any call that accepts filters
- Type:
int
- _idactor¶
If set, applies idactor filter to any call that accepts filters
- Type:
int
- static _format_extensions(allowed_extensions: List[str]) List[str]¶
Transform all extensions to lowercase and add a dot prefix if they don’t start with a dot.
It checks if the string does not start with a dot: in that case, it adds the dot and appends the extension to the returned list.
In case it starts with a dot, it checks with a regex that it does not have more than one dot: if it does, it raises an error; if it doesn’t, it adds the extension to the list that will be returned.
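The normalization described above can be sketched as a standalone function. This is only an illustration of the documented behaviour, not the library's code: the function name, the regular expression, and the use of ValueError as the raised error are all assumptions.

```python
import re
from typing import List

# Assumed pattern: one leading dot followed by characters that are not dots.
_SINGLE_DOT = re.compile(r"^\.[^.]+$")


def format_extensions(allowed_extensions: List[str]) -> List[str]:
    """Lowercase each extension and make sure it carries a dot prefix."""
    formatted = []
    for ext in allowed_extensions:
        ext = ext.lower()
        if not ext.startswith("."):
            # No leading dot: add it and keep the extension.
            formatted.append("." + ext)
        elif _SINGLE_DOT.match(ext):
            # Leading dot and no further dots: keep as is.
            formatted.append(ext)
        else:
            # Leading dot plus at least one more dot: rejected.
            # ValueError is a stand-in; the real error type is not documented here.
            raise ValueError(f"Invalid extension: {ext!r}")
    return formatted
```

Under these assumptions, `format_extensions(["FASTQ", ".Bam"])` yields `[".fastq", ".bam"]`, while `".tar.gz"` is rejected.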
- is_alive() str¶
- _is_valid_extension(local_path: str, allowed_extensions: List[str], allowed_secondary_extensions: List[str])¶
Returns True/False depending on whether file extension is valid or not.
It takes the 1st primary extension with the 1st secondary extension, the 2nd primary extension with the 2nd secondary extension, and so on, then it checks for each combination if a valid extension has been found.
Finally, it stores those True/False checks in a list called valid_files and at the end it returns True if any of the elements in that list is True, meaning the file is valid for at least one of the conditions.
- _check_extensions(local_datas: List[LocalData], allowed_extensions: List[str], allowed_secondary_extensions: List[str])¶
Get a list of LocalData objects and check for each of them if its local_path extension is valid.
Raise an InvalidFileExtension at the first check that fails.
When entering this function, we already checked that allowed_extensions is not an empty list.
Warning
If any value in allowed_secondary_extensions is “null” or “.null”, it means that the file should end with the value given in its respective allowed_extensions element.
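A minimal sketch of the pairing logic and the “null” convention follows. It works on plain path strings instead of LocalData objects, assumes the extension lists are already lowercase and dot-prefixed, and treats an empty allowed_secondary_extensions list as "check the primary extension only". The function names and the locally defined exception are stand-ins, not the library's implementation.

```python
import os
from typing import List


class InvalidFileExtension(Exception):
    """Local stand-in for the library exception of the same name."""


def is_valid_extension(
    local_path: str,
    allowed_extensions: List[str],
    allowed_secondary_extensions: List[str],
) -> bool:
    name = os.path.basename(local_path).lower()
    if not allowed_secondary_extensions:
        # Assumption: with no secondary extensions, only the primary one is checked.
        return any(name.endswith(ext) for ext in allowed_extensions)
    valid_files = []
    # Pair the 1st primary with the 1st secondary, the 2nd with the 2nd, and so on.
    for primary, secondary in zip(allowed_extensions, allowed_secondary_extensions):
        if secondary in ("null", ".null"):
            # "null" means the file must end with the primary extension itself.
            suffix = primary
        else:
            suffix = primary + secondary
        valid_files.append(name.endswith(suffix))
    return any(valid_files)


def check_extensions(
    local_paths: List[str],
    allowed_extensions: List[str],
    allowed_secondary_extensions: List[str],
) -> None:
    for path in local_paths:
        if not is_valid_extension(path, allowed_extensions, allowed_secondary_extensions):
            # Stop at the first file that fails the check.
            raise InvalidFileExtension(path)
```

With `[".fastq", ".bam", ".bam"]` and `[".gz", ".bai", ".null"]`, this sketch accepts `reads.fastq.gz`, `sample.bam.bai` and `sample.bam`, and rejects `reads.fastq`.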
- unfold(iddata: str) List[UnfoldedData]¶
Return the unfolded data entries for a given iddata.
If the data ID is a Biomed directory, it will be the list of Data corresponding to said directory and its contents (files and subdirectories).
If it’s a Biomed directory from Public and it’s mounted, the returned list contains only the Biomed mounted folder.
If it’s a Biomed file, it will be a list of Data with just an entry with the file.
After the request, a relative_path key has to be added to the data dictionaries (creating a datas_with_relative_path list of dictionaries) to be able to convert it to UnfoldedData Biomed objects. The relative_path is computed subtracting the initial parent directory.
- Parameters:
iddata (str) – ID of the data to unfold
- Returns:
List of the corresponding data
- Return type:
List[UnfoldedData]
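The relative_path computation mentioned for unfold can be illustrated as below. The dictionary key "path", the helper name, and the choice to keep the unfolded directory's own name at the start of each relative path are assumptions made for the sketch; the actual response fields are not documented in this section.

```python
import posixpath
from typing import Dict, List


def add_relative_paths(datas: List[Dict[str, str]], root_path: str) -> List[Dict[str, str]]:
    """Add a relative_path key by removing the parent directory of root_path."""
    # Parent of the unfolded directory, e.g. "/A" for "/A/B".
    parent = posixpath.dirname(root_path.rstrip("/")) or "/"
    datas_with_relative_path = []
    for data in datas:
        # "path" is a hypothetical key holding the remote absolute path.
        relative_path = posixpath.relpath(data["path"], parent)
        datas_with_relative_path.append({**data, "relative_path": relative_path})
    return datas_with_relative_path
```

For a directory `/A/B` containing `file.txt`, the sketch produces the relative paths `B` and `B/file.txt`.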
- create_folder(remote_dir)¶
Create a directory in Biomed, creating intermediate directories as needed.
- Parameters:
remote_dir (str) – path of the directory to create
- Returns:
LocalData with new directory created
- Return type:
LocalData
- download(iddata: str, local_dir: str, allowed_extensions: list = None, allowed_secondary_extensions: list = None) List[LocalData]¶
Download single data.
Given an iddata, download datas corresponding to that iddata. If a file, just the file. If a directory, the results will include both the directory and the files inside it.
If allowed_extensions is given, it checks that all downloaded files match one of the extensions of the list.
If allowed_secondary_extensions is not an empty list, its length has to match the allowed_extensions list length, because it will match for every allowed_extensions element a secondary extension, e.g.:
>>> allowed_extensions = ["fastq", "bam", "bam"]
>>> allowed_secondary_extensions = ["gz", "bai", "null"]
This will pass all checks if it ONLY downloads *.fastq.gz, *.bam.bai and *.bam files (*.fastq files are not allowed).
- Parameters:
iddata (str) – ID of the data to download
local_dir (str) – Directory to which datas will be downloaded
allowed_extensions (list) – List of extensions allowed for the downloaded files
allowed_secondary_extensions (list) – List of secondary extensions allowed for the downloaded files
- Returns:
List of the LocalData of the downloaded files/folders
- Return type:
List[LocalData]
- upload_multiple(file_paths: List[str], dest_dir: str = None, action: str = 'default') List[LocalData]¶
Upload multiple files to the Biomed project found in the auth token.
It first checks that the files to be uploaded exist locally, and then performs the request to upload them through the specified File Manager.
- Parameters:
file_paths (List[str]) – List of file paths to upload
dest_dir (str) – Directory path to which files will be uploaded
action (str) – Choose between ‘overwrite’, ‘default’, and ‘non-action’. Default value is ‘default’
- Returns:
list of uploaded files
- Return type:
List[LocalData]
- download_multiple(iddatas: List[str], local_dir: str, allowed_extensions: List[str] = None, allowed_secondary_extensions: List[str] = None) List[LocalData]¶
Download multiple files or directories.
Given a list of data IDs, download them to the provided local_dir. If any of the IDs is a Biomed folder, the result will include both the folder and the files inside it.
For information about the allowed_extensions and allowed_secondary_extensions parameters, see the documentation of the FilesController.download() method.
- Parameters:
iddatas (List[str]) – list of files or directories to download
local_dir (str) – Directory to which datas will be downloaded
allowed_extensions (List[str]) – List of extensions allowed for the downloaded files
allowed_secondary_extensions (List[str]) – List of secondary extensions allowed for the downloaded files
- Returns:
List of LocalDatas of the downloaded files/folders.
- Return type:
List[LocalData]
- _check_if_exists(path: str)¶
Check if local path exists.
- _check_filetypes(data: LocalData)¶
Check if local file type and remote file type match.
- upload(data: LocalData, action: str = 'default', v2=False) LocalData¶
Uploads a local file.
Given a LocalData, check if local file exists and if local and remote filetypes match, and then uploads local file to Biomed.
- Parameters:
data (LocalData) – Data to upload, with its local_path attribute set to the path of the actual file.
action (str) – Choose between ‘overwrite’, ‘default’, and ‘non-action’. Default value is ‘default’.
- Returns:
LocalData with updated information
- Return type:
LocalData
- upload_dir(local_dir: str, remote_dir: str, do_not_upload: str = None) List[LocalData]¶
Upload all the files of a directory to Biomed.
First, check that the input local_dir is really a directory. Then, use a static method to build the local_paths and remote_paths that are needed to create the LocalData objects that will be passed to self.upload() method.
Examples
If local_dir is ‘/tmp/tmp2/dir_to_upload’ and remote_dir is ‘A/B/C’, this method will upload to biomed the folder ‘/A/B/C/dir_to_upload’ with its contents.
- Parameters:
local_dir – Local directory to upload.
remote_dir – Remote biomed path where the local directory will be uploaded.
do_not_upload – empty files starting with the string defined here will not be uploaded.
- Returns:
A list of files uploaded to biomed (as LocalData objects).
- static _get_local_and_remote_paths(local_dir: str, remote_dir: str, do_not_upload: str) Tuple[List[str], List[str]]¶
Return a tuple of the paths of all files from local_dir and the new paths given a remote_dir.
Given a local directory (‘./A/B/C’) and a remote directory (‘D/E’) where you want to upload the local one, build the remote paths stripping the parent directories of local_dir from them, i.e. a file will have a remote_path like ‘D/E/C/file.txt’.
Note
Files starting with the string defined in do_not_upload will be removed from both lists.
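The path mapping can be sketched with os.walk. This is an illustration under stated assumptions rather than the library's code: the function name is hypothetical, the do_not_upload prefix is matched against the file name only, and remote paths are always joined with ‘/’.

```python
import os
from typing import List, Optional, Tuple


def get_local_and_remote_paths(
    local_dir: str, remote_dir: str, do_not_upload: Optional[str] = None
) -> Tuple[List[str], List[str]]:
    local_dir = os.path.abspath(local_dir)
    # Parent directories of local_dir are stripped from the remote paths.
    parent = os.path.dirname(local_dir)
    local_paths: List[str] = []
    remote_paths: List[str] = []
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            if do_not_upload and name.startswith(do_not_upload):
                # Skipped files are left out of both lists.
                continue
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, parent)
            remote_path = "/".join([remote_dir.rstrip("/")] + relative.split(os.sep))
            local_paths.append(local_path)
            remote_paths.append(remote_path)
    return local_paths, remote_paths
```

For a local directory `./A/B/C` containing `file.txt` and a remote directory `D/E`, the sketch maps the file to `D/E/C/file.txt`.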
- is_dir(iddata=None, path=None)¶
- Check whether iddata or path is a directory.
Provide only one of the two arguments: either iddata or path.
- Parameters:
iddata (str) – ID of the data to check
path (str) – Path of the data to check
- Returns:
metadata from biomed as a Data object
- Return type:
Data
- get(iddata: str, unfolded: bool = False) Data | UnfoldedData¶
Get metadata of a data ID.
- Parameters:
iddata (str) – Id of the data to get the metadata from
unfolded (bool) – a boolean to return an UnfoldedData if this method is called from the ‘unfold’ method
- Returns:
metadata from biomed as a Data object
- Return type:
Data
- all() List[Data]¶
Get all data
Returns all datas from a project.
- Returns:
List[Data]
- filter(filters: Dict[str, Any]) List[Data]¶
Get filtered data with input filters
The input must contain the key search. Its value is split into a parent directory (a leading ‘/’ is added if missing) and the basename of the last folder or file, which avoids breaking backwards compatibility.
- Parameters:
filters (Dict[str,Any]) – dictionary with the filters to apply to FileService
- Returns:
List[Data]
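The split applied to the search value can be sketched as follows. The helper name is hypothetical, and how the two parts are then passed on to the file service is not described in this section, so the sketch stops at the split itself.

```python
import posixpath
from typing import Tuple


def split_search(search: str) -> Tuple[str, str]:
    """Split a search value into (parent directory, basename)."""
    parent, basename = posixpath.split(search.rstrip("/"))
    if not parent.startswith("/"):
        # Add the leading '/' when it is missing.
        parent = "/" + parent
    return parent, basename
```

Under these assumptions, `split_search("A/B/file.txt")` gives `("/A/B", "file.txt")`, and a bare `"file.txt"` is placed under the root `"/"`.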
- filter_kube(filters) List[Data]¶
Get filtered data
Returns data that matches the filter criteria (filtered by project or actor if those filters are set at controller level)
- Parameters:
filters (Dict[str,Any]) – dictionary with the filters to apply to biomed backend
Examples
>>> filters = {
...     "parent": "aParentFolder",
...     "search": "test_regex_download_extensions/compression-no/biomed-1.0.0.tar",
...     "order_direction": "DESC",
...     "order_by": "updated",
... }
- Returns:
List[Data]
- get_metadata(filters: Dict[str, Any]) Data¶
Get metadata of a file given a filter criteria
- Parameters:
filters (Dict[str,Any]) – dictionary with the filters to apply to biomed backend
Examples
>>> filters = {
...     "file": "/example_path/example_file.txt"  # NOTE: Root path must have `/`
... }
- Returns:
Biomed file or folder
- Return type:
Data
- get_public_url(iddata: str, ttl_in_seconds: int = 7200) str¶
Get public URL from an ID data (it accepts an ID from a file or a folder).
- Parameters:
iddata (str) – ID data to get public URL from.
ttl_in_seconds (int) – lifespan/expiration of generated URL in seconds (by default: 7200).
- delete(iddata: str) Data¶
Delete a file or folder given its data ID
- Parameters:
iddata (str) – ID data to delete.
- move(src: str, dst: str) Data¶
Move a file or directory to another directory
- Parameters:
src (str) – ID data of file or directory.
dst (str) – ID data of directory.
- unlock_file(iddata: str, days_unlock: int = 10) str¶
Unlock a single file for a given number of days.
- Parameters:
iddata (str) – ID data of the file to unlock.
days_unlock (int) – Number of days the file will remain unlocked. Default is 10 days.