popv.hub.HubModel
- class popv.hub.HubModel(local_dir, metadata=None, repo_name=None, model_card=None, ontology_dir=None)
Wrapper for BaseModelClass backed by HuggingFace Hub.
- Parameters:
  - repo_name (str | None, default: None) – ID of the HuggingFace repo where this model is uploaded.
  - local_dir (str) – Local directory where the data and pre-trained model reside.
  - metadata (dict | str | None, default: None) – Dict, or a path to a file on disk from which the metadata can be read.
  - model_card (HubModelCardHelper | ModelCard | str | None, default: None) – The model card for this pre-trained model. The model card is a markdown file that describes the pre-trained model/data and is displayed on HuggingFace. This can be an instance of ModelCard, an instance of HubModelCardHelper that wraps the model card, or a path to a file on disk from which the model card can be read.
Attributes table
- Returns the full training data for this model.
- The local directory where the data and pre-trained model reside.
- The metadata for this model.
- Returns the minified data for this model.
- The model card for this model.
- The local directory where the models are downloaded.
- The local directory where the data and pre-trained model reside.
Methods table
- Annotate the query data with the trained model.
- Map genes to CELLxGENE census gene IDs.
- Download the given model repo from HuggingFace.
- Push this model to HuggingFace.
- Save the model card and metadata to the model directory.
Attributes
- HubModel.adata
Returns the full training data for this model.
If the data has not been loaded yet, this will call cellxgene_census.download_source_h5ad(). Otherwise, it will simply return the loaded data.
- HubModel.minified_adata
Returns the minified data for this model.
If the data has not been loaded yet, this will call scanpy.read_h5ad(). Otherwise, it will simply return the loaded data.
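The load-on-first-access behaviour described for adata and minified_adata can be sketched with a small stand-in class. This is an illustration of the pattern only, not PopV's implementation: `LazyData`, `fake_loader`, and the returned dictionary are hypothetical names chosen for the example, and the loader stands in for a call such as scanpy.read_h5ad().

```python
class LazyData:
    """Stand-in illustrating load-on-first-access; not PopV's actual code."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable that fetches the data
        self._data = None      # nothing has been loaded yet

    @property
    def data(self):
        # Load only on the first access; later accesses reuse the cached object.
        if self._data is None:
            self._data = self._loader()
        return self._data


calls = []


def fake_loader():
    # Hypothetical loader that records each invocation.
    calls.append("load")
    return {"n_obs": 3}


holder = LazyData(fake_loader)
first = holder.data   # triggers the loader
second = holder.data  # returns the cached object without reloading
```

Because the loader runs once, repeated accesses to the property are cheap, while the first access may be slow if it has to download or read a large file.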
Methods
- HubModel.annotate_data(query_adata, query_batch_key=None, save_path='tmp', prediction_mode='fast', methods=None, gene_symbols=None)
Annotate the query data with the trained model.
- Parameters:
  - query_adata (AnnData) – The query data to annotate.
  - query_batch_key (str | None, default: None) – The batch key in the query data.
  - save_path (str, default: 'tmp') – Path to save the query models.
  - prediction_mode (str, default: 'fast') – The prediction mode to use. Either "fast" or "inference". "fast" will only predict on the query data, while "inference" will integrate query and reference data.
  - methods (list | None, default: None) – List of methods to use for annotation. If None, all methods in the model will be used.
  - gene_symbols (str | None, default: None) – Gene symbols given as query_adata.var_names.
- Return type: AnnData
- Returns: The annotated data.
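A call to annotate_data might be wrapped as in the sketch below. The wrapper `annotate_query` and the constant `VALID_PREDICTION_MODES` are hypothetical helpers introduced here; the up-front check of prediction_mode simply mirrors the two values named in the parameter description and is an assumption, not a statement about what PopV validates internally.

```python
VALID_PREDICTION_MODES = ("fast", "inference")  # the two values named in the docs


def annotate_query(hub_model, query_adata, query_batch_key=None, save_path="tmp",
                   prediction_mode="fast", methods=None, gene_symbols=None):
    """Hypothetical wrapper: check the mode, then delegate to annotate_data."""
    if prediction_mode not in VALID_PREDICTION_MODES:
        raise ValueError(
            f"prediction_mode must be one of {VALID_PREDICTION_MODES}, "
            f"got {prediction_mode!r}"
        )
    # hub_model is expected to be a HubModel instance (or anything exposing
    # an annotate_data method with the documented signature).
    return hub_model.annotate_data(
        query_adata,
        query_batch_key=query_batch_key,
        save_path=save_path,
        prediction_mode=prediction_mode,
        methods=methods,
        gene_symbols=gene_symbols,
    )
```

Passing methods=None keeps the documented default of using every method stored in the model.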
- classmethod HubModel.pull_from_huggingface_hub(repo_name, cache_dir=None, revision=None, **kwargs)
Download the given model repo from HuggingFace.
The model, its card, data, and metadata are downloaded to a cached location on disk selected by HuggingFace, and an instance of this class is created from them and returned.
- Parameters:
  - repo_name (str) – ID of the HuggingFace repo from which this model is downloaded.
  - cache_dir (str | None, default: None) – The directory where the downloaded model artifacts will be cached.
  - revision (str | None, default: None) – The revision to pull from the repo. This can be a branch name, a tag, or a full-length commit hash. If None, the default (latest) revision is pulled.
  - **kwargs – Additional keyword arguments to pass to huggingface_hub.snapshot_download().
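A minimal sketch of pulling a model, assuming PopV is installed. The helper `pull_model` and its `hub_model_cls` argument are hypothetical conveniences added for the example (the latter lets the function be exercised without PopV); only pull_from_huggingface_hub and its documented parameters come from the reference above.

```python
def pull_model(repo_name, cache_dir=None, revision=None, hub_model_cls=None):
    """Download a model repo and return the resulting HubModel instance."""
    if hub_model_cls is None:
        # Imported lazily so the sketch can be read without PopV installed.
        from popv.hub import HubModel
        hub_model_cls = HubModel
    # revision may be a branch name, a tag, or a full-length commit hash;
    # None pulls the default (latest) revision.
    return hub_model_cls.pull_from_huggingface_hub(
        repo_name, cache_dir=cache_dir, revision=revision
    )
```

For instance, `pull_model("some-org/some-popv-model", revision="main")` would fetch the head of the main branch; the repo ID here is a placeholder, not a real repository. Pinning revision to a commit hash is one way to keep results reproducible if the repo is updated later.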
- HubModel.push_to_huggingface_hub(repo_name, repo_token=None, repo_create=False, repo_create_kwargs=None, collection_slug=None, delete_existing_files=False, **kwargs)
Push this model to HuggingFace.
If the dataset is too large to upload to HuggingFace, this will raise an exception prompting the user to upload the data elsewhere. Otherwise, the data, model card, and metadata are all uploaded to the given model repo.
- Parameters:
  - repo_name (str) – ID of the HuggingFace repo where this model needs to be uploaded.
  - repo_token (str | None, default: None) – HuggingFace API token with write permissions. If None, the token returned by HfFolder.get_token() is used.
  - repo_create (bool, default: False) – Whether to create the repo.
  - repo_create_kwargs (dict | None, default: None) – Keyword arguments passed into huggingface_hub.HfApi.create_repo() if repo_create=True.
  - collection_slug (str | None, default: None) – The internal name in HuggingFace for a dataset collection.
  - delete_existing_files (bool, default: False) – Whether to delete existing files in the repo before uploading new ones.
  - **kwargs – Additional keyword arguments passed into huggingface_hub.HfApi.upload_file().
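Pushing a model could look like the sketch below. The helper `push_model` is hypothetical, and reading the token from the HF_TOKEN environment variable is an assumption made for the example; `private` is a keyword accepted by huggingface_hub.HfApi.create_repo() and is forwarded only when the repo is being created.

```python
import os


def push_model(hub_model, repo_name, create_repo=False, private=True):
    """Hypothetical helper around HubModel.push_to_huggingface_hub."""
    # Assumption: the write token is exported as HF_TOKEN. If it is unset,
    # None is passed and the documented fallback token lookup applies.
    token = os.environ.get("HF_TOKEN")
    # Only supply create_repo() arguments when a repo is actually created.
    repo_create_kwargs = {"private": private} if create_repo else None
    return hub_model.push_to_huggingface_hub(
        repo_name,
        repo_token=token,
        repo_create=create_repo,
        repo_create_kwargs=repo_create_kwargs,
    )
```

Since the method raises when the dataset is too large to upload, callers may want to catch that exception and host the data elsewhere, as the description above suggests.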