Variants

The Variants component of the Genomcore API allows you to store and query genetic variant observations.

Get more information about the Variant API here

Create

Create variant observations. The following code initializes the SDK and the logging module and loads a .env file. It assumes that the user already has a list of observations (a list of dictionaries) and sends them to the Variant API in chunks of 300 observations.

In practice, such a list could be built by reading a CSV file with the necessary columns and constructing the required list of dictionaries from its rows.

import logging
import os  # needed for os.getenv below
from dotenv import load_dotenv
from genomcore.client import GenomcoreApiClient

load_dotenv(override=True)

logging.basicConfig(
    level="INFO",
    format="[%(asctime)s][%(levelname)s][%(name)s] -- %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)

api = GenomcoreApiClient(
    token=os.getenv("TOKEN"),
    refresh_token=os.getenv("REFRESH_TOKEN")
)

observation_list = [
    {
        "uri": "SOME::URI::00000000001",
        "origin": "GERMLINE",
        "type": "SNV_INDEL",
        "collection": "SOME_COLLECTION",
        "Position": {
            "VCF_Genome": "hg38",
            "VCF_Chr": "15",
            "VCF_Position": "2000000",
            "VCF_ID": "some_id",
        },
        "Genotype": {
            "VCF_Allele_REF": "A",
            "VCF_Allele_ALT": "T",
            "INFO_CALC_Genotype": "0/1",
        },
        "Calling_statistics": {
            "VCF_Filter": "PASS",
            "VCF_Quality": 99,
        },
        "Gene": {
            "INFO_CSQ_SYMBOL": "some_SYMBOL",
        },
        "Feature": {
            "INFO_CSQ_Consequence": "synonymous",
            "INFO_CSQ_Feature": "NR_99999999",
            "INFO_CSQ_STRAND": "forward",
        }
    },
    {
        "uri": "SOME::URI::00000000002",
        "origin": "GERMLINE",
        "type": "SNV_INDEL",
        "collection": "SOME_COLLECTION",
        "Position": {
            "VCF_Genome": "hg38",
            "VCF_Chr": "15",
            "VCF_Position": "3000000",
            "VCF_ID": "some_id",
        },
        "Genotype": {
            "VCF_Allele_REF": "C",
            "VCF_Allele_ALT": "G",
            "INFO_CALC_Genotype": "1/1",
        },
        "Calling_statistics": {
            "VCF_Filter": "PASS",
            "VCF_Quality": 99,
        },
        "Gene": {
            "INFO_CSQ_SYMBOL": "some_SYMBOL",
        },
        "Feature": {
            "INFO_CSQ_Consequence": "non_synonymous",
            "INFO_CSQ_Feature": "NR_99999999",
            "INFO_CSQ_STRAND": "forward",
        }
    },
    ...
]

created_count = api.variants.create_observations(
    observations=observation_list,
    chunksize=300
)
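The CSV step mentioned in the introduction is not shown in the example above. A minimal sketch of it follows, assuming a flat CSV file. The column names (chrom, pos, ref, alt, genotype) and the helper functions are hypothetical and chosen only for illustration; the nested field names are taken from the example above.

```python
import csv
import io

# Hypothetical CSV layout: these column names are illustrative assumptions,
# not a format defined by the Genomcore API.
SAMPLE_CSV = """uri,origin,type,collection,chrom,pos,ref,alt,genotype
SOME::URI::00000000001,GERMLINE,SNV_INDEL,SOME_COLLECTION,15,2000000,A,T,0/1
SOME::URI::00000000002,GERMLINE,SNV_INDEL,SOME_COLLECTION,15,3000000,C,G,1/1
"""


def row_to_observation(row, genome="hg38"):
    """Map one flat CSV row onto the nested observation layout used above."""
    return {
        "uri": row["uri"],
        "origin": row["origin"],
        "type": row["type"],
        "collection": row["collection"],
        "Position": {
            "VCF_Genome": genome,
            "VCF_Chr": row["chrom"],
            "VCF_Position": row["pos"],
        },
        "Genotype": {
            "VCF_Allele_REF": row["ref"],
            "VCF_Allele_ALT": row["alt"],
            "INFO_CALC_Genotype": row["genotype"],
        },
    }


def read_observations(handle):
    """Read CSV rows from an open text handle and build the observation list."""
    return [row_to_observation(row) for row in csv.DictReader(handle)]


# io.StringIO stands in for a real file; use open("file.csv", newline="") instead.
observation_list = read_observations(io.StringIO(SAMPLE_CSV))
print(len(observation_list))
```

The resulting observation_list can then be passed to create_observations as in the example above.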

Warning

The required fields for a variant observation are uri, origin, type, and collection.

Variants of multiple types can be uploaded in the same request. Available types are: “SNV/INDEL”, “CNV”, “SV”

The values for collection and uri are not checked by the API. The user must provide these values at their own discretion.
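Because collection and uri are not checked by the API, it may help to check observations locally before uploading. The sketch below checks only for the four required fields. The helper names are hypothetical, and treating an empty string as missing is an assumption rather than documented API behavior.

```python
REQUIRED_FIELDS = ("uri", "origin", "type", "collection")


def missing_required_fields(observation):
    """Return the required fields that are absent or empty in one observation."""
    # Assumption: None and "" both count as missing.
    return [name for name in REQUIRED_FIELDS if observation.get(name) in (None, "")]


def split_valid(observations):
    """Separate observations into uploadable ones and (index, missing) reports."""
    valid, invalid = [], []
    for index, observation in enumerate(observations):
        missing = missing_required_fields(observation)
        if missing:
            invalid.append((index, missing))
        else:
            valid.append(observation)
    return valid, invalid


candidates = [
    {"uri": "SOME::URI::00000000001", "origin": "GERMLINE",
     "type": "SNV_INDEL", "collection": "SOME_COLLECTION"},
    {"uri": "SOME::URI::00000000002", "origin": "GERMLINE", "type": ""},
]
valid, invalid = split_valid(candidates)
print(len(valid), invalid)
```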

Warning

The variant observation template is flexible and defined on a per-project basis. The complete list of fields can be obtained using the method get_template().

Tip

The method automatically detects whether the total number of observations is greater than the chunk size and, if so, uploads them in chunks of the specified size. Otherwise, the observations are uploaded without chunking.

In a chunked upload, if the parameter chunk_details is set to True, the method logs some information about the first and last observation of each chunk.

Retries are disabled by default (the parameter max_retries defaults to 0). The user can specify the number of retries to attempt if the connection fails while uploading a chunk of observations.
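The chunking and retry behavior described above can be pictured with the following sketch. It is not the SDK's implementation. The upload callable and both helper functions are hypothetical stand-ins, and retrying only on ConnectionError is an assumption.

```python
def iter_chunks(items, chunksize):
    """Yield consecutive slices of at most chunksize items."""
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    for start in range(0, len(items), chunksize):
        yield items[start:start + chunksize]


def upload_in_chunks(items, upload, chunksize=300, max_retries=0):
    """Send items chunk by chunk and return the total reported by upload.

    upload is a hypothetical callable that sends one chunk and returns
    the number of observations it created.
    """
    created = 0
    for chunk in iter_chunks(items, chunksize):
        attempts = 0
        while True:
            try:
                created += upload(chunk)
                break
            except ConnectionError:
                # Give up once the retry budget for this chunk is spent.
                if attempts >= max_retries:
                    raise
                attempts += 1
    return created


# A list shorter than the chunk size yields a single chunk.
print([len(chunk) for chunk in iter_chunks(list(range(5)), 300)])
```

With max_retries=0, the first connection failure propagates immediately, which matches the default described above.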

Query

This example queries the variant observations of the collection “SomeCollection” using MongoDB filters.

More information about MongoDB filters can be found here.

from genomcore.client import GenomcoreApiClient

api = GenomcoreApiClient(token="A_VALID_TOKEN", refresh_token="A_VALID_REFRESH_TOKEN")

body = {
    "filter": {
        "collection": {"$in": ["SomeCollection"]}
    }
}

response = api.variants.query_observations(page=0, pageSize=500, body=body)

Note

The response is paginated with a page size of 500, so it contains only the first 500 observations.
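To retrieve more than one page, the page number can be incremented until a page comes back with fewer items than the page size. The sketch below shows that loop in generic form. The fetch_page adapter is hypothetical, because the layout of the query response is not described here; in practice it would wrap query_observations and extract the list of observations from the response.

```python
def iter_all_pages(fetch_page, page_size=500, max_pages=1000):
    """Yield every item across pages, stopping at the first short page.

    fetch_page(page, page_size) is a hypothetical adapter that returns the
    list of observations for one page.
    """
    for page in range(max_pages):
        items = fetch_page(page, page_size)
        yield from items
        if len(items) < page_size:
            break


# Stand-in data source; a real adapter might instead call
# api.variants.query_observations(page=page, pageSize=page_size, body=body)
# and pull the observation list out of the response.
fake_store = [{"uri": f"SOME::URI::{i:011d}"} for i in range(1200)]


def fake_fetch(page, page_size):
    return fake_store[page * page_size:(page + 1) * page_size]


all_observations = list(iter_all_pages(fake_fetch, page_size=500))
print(len(all_observations))
```

The max_pages cap is a safety limit so that the loop stops even if the server keeps returning full pages.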

Delete

This example deletes the variant observations of the collection “SomeCollection”.

from genomcore.client import GenomcoreApiClient

api = GenomcoreApiClient(token="A_VALID_TOKEN", refresh_token="A_VALID_REFRESH_TOKEN")

deleted_count = api.variants.delete_observations(collection="SomeCollection")

Note

The method will delete all the observations of the specified collection or uri. At least one of these parameters must be provided.

Warning

If the connection is closed before the response is received, the cause may be a large deletion that takes longer than usual. The server might close the connection but still process the request successfully. Please verify that the observations were deleted correctly.
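One way to perform that verification is to poll until no matching observations remain. The sketch below assumes a hypothetical count_remaining callable, which in practice could wrap a query on the same collection or uri and count the results. The attempt count and delay are arbitrary illustrative values.

```python
import time


def wait_until_deleted(count_remaining, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Poll until count_remaining() reports zero or the attempts run out.

    count_remaining is a hypothetical callable returning how many
    observations still match the deleted collection or uri.
    """
    for attempt in range(attempts):
        if count_remaining() == 0:
            return True
        # Skip the pause after the final check.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Simulated server that finishes the deletion on the third check.
remaining = iter([3, 1, 0])
finished = wait_until_deleted(lambda: next(remaining), delay_seconds=0)
print(finished)
```

A False result means observations were still present after the last check, in which case the deletion may need to be repeated.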

Template

The global template for the variant observations in the project can be obtained with the following code:

from genomcore.client import GenomcoreApiClient

api = GenomcoreApiClient(token="A_VALID_TOKEN", refresh_token="A_VALID_REFRESH_TOKEN")

template_dict = api.variants.get_template()