Member "cloudkitty-9.0.0/doc/source/developer/collector.rst" (10 Apr 2019, 5285 Bytes)

Data format

Internally, CloudKitty's data format is a bit more detailed than what is described in the architecture documentation.

The internal data format is as follows:

    {
        "bananas": [
            {
                "vol": {
                    "unit": "banana",
                    "qty": 1
                },
                "rating": {
                    "price": 1
                },
                "groupby": {
                    "xxx_id": "hello",
                    "yyy_id": "bye",
                },
                "metadata": {
                    "flavor": "chocolate",
                    "eaten_by": "gorilla",
                },
            }
        ],
    }

However, developers implementing a collector don't need to format the data themselves, as there are helper functions for these matters.
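
To make the layout above concrete, the following sketch builds one entry by hand. The build_data_point function is a hypothetical illustration, not one of CloudKitty's helpers, and it leaves out the rating section on the assumption that a collector only gathers quantities, groupby attributes, and metadata.

```python
def build_data_point(unit, qty, groupby, metadata):
    """Build one entry of the internal format (hypothetical helper)."""
    return {
        "vol": {"unit": unit, "qty": qty},
        "groupby": dict(groupby),    # copy so callers can reuse their dicts
        "metadata": dict(metadata),
    }


point = build_data_point(
    unit="banana",
    qty=1,
    groupby={"xxx_id": "hello", "yyy_id": "bye"},
    metadata={"flavor": "chocolate", "eaten_by": "gorilla"},
)

# Entries are grouped in a list under the metric name.
data = {"bananas": [point]}
```

In a real collector this work is done by the helper functions mentioned above, so the sketch is only meant to show which field ends up where.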


Each collector must implement the following class:


The retrieve method of the BaseCollector class is called by the orchestrator. This method calls the fetch_all method of the child class.

To create a collector, you need to implement at least the fetch_all method.
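
The division of labour between the two methods can be sketched with a stand-in base class. SketchBaseCollector and StaticCollector are hypothetical names used only for illustration; CloudKitty's real BaseCollector is not reproduced here and may do more work in retrieve than this simplified version.

```python
class SketchBaseCollector:
    """Stand-in for BaseCollector; NOT the real CloudKitty class."""

    def retrieve(self, metric_name, start, end,
                 project_id=None, q_filter=None):
        # Entry point used by the orchestrator: delegate to the child class.
        return self.fetch_all(metric_name, start, end,
                              project_id=project_id, q_filter=q_filter)

    def fetch_all(self, metric_name, start, end,
                  project_id=None, q_filter=None):
        raise NotImplementedError("child collectors must implement fetch_all")


class StaticCollector(SketchBaseCollector):
    """Toy child class that returns one hard-coded data point."""

    def fetch_all(self, metric_name, start, end,
                  project_id=None, q_filter=None):
        return [{"metric": metric_name, "scope": project_id, "qty": 1}]


result = StaticCollector().retrieve("bananas", 0, 3600, project_id="scope-a")
```

The caller only ever uses retrieve, which is why a new collector needs to supply nothing more than fetch_all.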

Data collection

Collectors must implement a fetch_all method. This method is called for each metric type, for each scope, for each collect period. It has the following prototype:


This method must return a list of items formatted by CloudKittyFormatTransformer.
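
The call pattern described above (one call per metric type, per scope, per collect period) can be pictured as three nested loops. This is only a sketch of the idea: RecordingCollector, the scope IDs, and the period boundaries are made up, and the loop does not claim to mirror how the orchestrator is actually written.

```python
class RecordingCollector:
    """Hypothetical collector that only records how it was called."""

    def __init__(self):
        self.calls = []

    def fetch_all(self, metric_name, start, end,
                  project_id=None, q_filter=None):
        self.calls.append((metric_name, start, end, project_id))
        return []


collector = RecordingCollector()
periods = [(0, 3600), (3600, 7200)]      # made-up (start, end) pairs
scopes = ["scope-a", "scope-b"]          # made-up scope IDs
metrics = ["bananas", "apples"]          # made-up metric types

for start, end in periods:
    for scope_id in scopes:
        for metric_name in metrics:
            collector.fetch_all(metric_name, start, end, project_id=scope_id)

print(len(collector.calls))  # 2 periods * 2 scopes * 2 metrics = 8
```

Because the number of calls grows with every metric, scope, and period, each fetch_all call should query only the data for its own arguments.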

Example code of a basic collector:

from cloudkitty.collector import BaseCollector


class MyCollector(BaseCollector):
    def __init__(self, **kwargs):
        super(MyCollector, self).__init__(**kwargs)

    def fetch_all(self, metric_name, start, end,
                  project_id=None, q_filter=None):
        data = []
        for CONDITION:  # placeholder: iterate over your backend's results
            # do stuff
            data.append(self.t_cloudkitty.format_item(
                groupby,  # dict
                metadata,  # dict
                unit,  # str
                qty=qty,  # int / float
            ))

        return data

project_id can be misleading, as it is a legacy name. It contains the ID of the current scope. The attribute corresponding to the scope is specified in the configuration, under [collect]/scope_key. Thus, all queries should filter based on this attribute. Example:

from oslo_config import cfg

from cloudkitty.collector import BaseCollector

CONF = cfg.CONF


class MyCollector(BaseCollector):
    def __init__(self, **kwargs):
        super(MyCollector, self).__init__(**kwargs)

    def fetch_all(self, metric_name, start, end,
                  project_id=None, q_filter=None):
        # The attribute identifying the scope is configurable.
        scope_key = CONF.collect.scope_key
        filters = {'start': start, 'stop': end, scope_key: project_id}

        # self.client stands for the backend client of your collector.
        data = self.client.query(filters=filters)
        # Format data into output, etc
        return output
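
The point of the example above is that the filter key is not hard-coded. A minimal sketch of that idea, independent of oslo.config, follows. The build_filters function is a hypothetical helper, and its scope_key argument stands in for the value read from [collect]/scope_key; the attribute names used below are examples only.

```python
def build_filters(start, end, scope_key, scope_id):
    """Return query filters restricted to one scope (hypothetical helper)."""
    return {"start": start, "stop": end, scope_key: scope_id}


# With scope_key set to "project_id", the scope ID filters on project_id.
by_project = build_filters(0, 3600, "project_id", "abc")

# With a different scope_key, the same scope ID filters on another attribute.
by_namespace = build_filters(0, 3600, "namespace", "abc")
```

Hard-coding 'project_id' as the filter key would work only for deployments whose scope key happens to be project_id.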

Additional configuration

If you need to extend the metric configuration (that is, add parameters to the extra_args section of metrics.yml), you can override the check_configuration method of the base collector:


This method uses voluptuous for data validation. The base schema for each metric can be found in cloudkitty.collector.METRIC_BASE_SCHEMA. This schema is meant to be extended by other collectors. Example taken from the gnocchi collector code:

from voluptuous import All, In, Length, Required, Schema

from cloudkitty import collector

GNOCCHI_EXTRA_SCHEMA = {
    Required('extra_args'): {
        Required('resource_type'): All(str, Length(min=1)),
        # Due to the Gnocchi model, metrics are grouped by resource.
        # This parameter allows adapting the key of the resource identifier.
        Required('resource_key', default='id'): All(str, Length(min=1)),
        Required('aggregation_method', default='max'):
            In(['max', 'mean', 'min']),
    },
}


class GnocchiCollector(collector.BaseCollector):

    collector_name = 'gnocchi'

    @staticmethod
    def check_configuration(conf):
        conf = collector.BaseCollector.check_configuration(conf)
        metric_schema = Schema(collector.METRIC_BASE_SCHEMA).extend(
            GNOCCHI_EXTRA_SCHEMA)

        output = {}
        for metric_name, metric in conf.items():
            met = output[metric_name] = metric_schema(metric)

            # Make sure the resource key is always part of the groupby list.
            if met['extra_args']['resource_key'] not in met['groupby']:
                met['groupby'].append(met['extra_args']['resource_key'])

        return output

If your collector does not need any extra_args, you do not need to override the check_configuration method.