"Fossies" - the Fresh Open Source Software Archive

Member "pytorch-1.8.2/docs/source/hub.rst" (23 Jul 2021, 6053 Bytes) of package /linux/misc/pytorch-1.8.2.tar.gz:


As a special service "Fossies" has tried to format the requested source page into HTML format (assuming markdown format). Alternatively you can here view or download the uninterpreted source code file. A member file download can also be achieved by clicking within a package contents listing on the according byte size field.

torch.hub

Pytorch Hub is a pre-trained model repository designed to facilitate research reproducibility.

Publishing models

Pytorch Hub supports publishing pre-trained models(model definitions and pre-trained weights) to a github repository by adding a simple hubconf.py file;

hubconf.py can have multiple entrypoints. Each entrypoint is defined as a python function (example: a pre-trained model you want to publish).

def entrypoint_name(*args, **kwargs):
    # args & kwargs are optional, for models which take positional/keyword arguments.
    ...

How to implement an entrypoint?

Here is a code snippet specifies an entrypoint for resnet18 model if we expand the implementation in pytorch/vision/hubconf.py. In most case importing the right function in hubconf.py is sufficient. Here we just want to use the expanded version as an example to show how it works. You can see the full script in pytorch/vision repo

dependencies = ['torch']
from torchvision.models.resnet import resnet18 as _resnet18

# resnet18 is the name of entrypoint
def resnet18(pretrained=False, **kwargs):
    """ # This docstring shows up in hub.help()
    Resnet18 model
    pretrained (bool): kwargs, load pretrained weights into the model
    """
    # Call the model, load pretrained weights
    model = _resnet18(pretrained=pretrained, **kwargs)
    return model
if pretrained:
    # For checkpoint saved in local github repo, e.g. <RELATIVE_PATH_TO_CHECKPOINT>=weights/save.pth
    dirname = os.path.dirname(__file__)
    checkpoint = os.path.join(dirname, <RELATIVE_PATH_TO_CHECKPOINT>)
    state_dict = torch.load(checkpoint)
    model.load_state_dict(state_dict)

    # For checkpoint saved elsewhere
    checkpoint = 'https://download.pytorch.org/models/resnet18-5c106cde.pth'
    model.load_state_dict(torch.hub.load_state_dict_from_url(checkpoint, progress=False))

Important Notice

Loading models from Hub

Pytorch Hub provides convenient APIs to explore all available models in hub through torch.hub.list(), show docstring and examples through torch.hub.help() and load the pre-trained models using torch.hub.load().

torch.hub

list

help

load

download_url_to_file

load_state_dict_from_url

Running a loaded model:

Note that *args and **kwargs in torch.hub.load() are used to instantiate a model. After you have loaded a model, how can you find out what you can do with the model? A suggested workflow is

To help users explore without referring to documentation back and forth, we strongly recommend repo owners make function help messages clear and succinct. It's also helpful to include a minimal working example.

Where are my downloaded models saved?

The locations are used in the order of

get_dir

set_dir

Caching logic

By default, we don't clean up files after loading it. Hub uses the cache by default if it already exists in the directory returned by ~torch.hub.get_dir().

Users can force a reload by calling hub.load(..., force_reload=True). This will delete the existing github folder and downloaded weights, reinitialize a fresh download. This is useful when updates are published to the same branch, users can keep up with the latest release.

Known limitations:

Torch hub works by importing the package as if it was installed. There're some side effects introduced by importing in Python. For example, you can see new items in Python caches sys.modules and sys.path_importer_cache which is normal Python behavior.

A known limitation that worth mentioning here is user CANNOT load two different branches of the same repo in the same python process. It's just like installing two packages with the same name in Python, which is not good. Cache might join the party and give you surprises if you actually try that. Of course it's totally fine to load them in separate processes.