Supported Compression Algorithms

For a high-level introduction to all NNI pruners and quantizers, and for the full list of parameters that each compression algorithm requires in config_list, please refer to compressors. Each compression algorithm takes the same parameters as in the original NNI compression module.

For each supported compression algorithm, this section provides:

  • An example configuration (for one-time compression) to present the required “framework”, “type” and “compressor” parameters.

  • An example of an aup.create_compressor call. If the compressor supports “dependency-aware” mode, the corresponding arguments are included in the call.
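As a rough illustration of the first item, the sketch below writes a one-time compression configuration as a Python dictionary and checks that the three required parameters are present. The helper check_compression_config is hypothetical and not part of the aup API, and the accepted values are inferred only from the examples in this section.

```python
# Hypothetical helper (not part of aup): basic sanity checks on a
# one-time compression configuration.
REQUIRED_KEYS = ("framework", "type", "compressor")


def check_compression_config(config):
    """Return the "compression" sub-dictionary after checking required keys."""
    compression = config["compression"]
    missing = [key for key in REQUIRED_KEYS if key not in compression]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    # Value sets below are assumptions taken from the examples in this section.
    if compression["framework"] not in ("tensorflow", "torch"):
        raise ValueError(f"unexpected framework: {compression['framework']}")
    if compression["type"] not in ("pruning", "quantization"):
        raise ValueError(f"unexpected type: {compression['type']}")
    return compression


config = {
    "compression": {
        "framework": "torch",
        "type": "pruning",
        "compressor": "level",
        "config_list": [{"sparsity": 0.8, "op_types": ["default"]}],
    }
}

compression = check_compression_config(config)
print(compression["compressor"])
```

A check of this kind can catch a missing key before the configuration reaches the compressor.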

Pruners

Level Pruner

Supports both TensorFlow and PyTorch.

Configuration:

"compression": {
    "framework": "tensorflow" | "torch",
    "type": "pruning",
    "compressor": "level",
    "config_list": [{
            "sparsity": 0.8,
            "op_types": ["default"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer=None)
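To make the "sparsity" parameter concrete, here is a minimal sketch of fine-grained magnitude pruning, which is how level pruning is commonly described: the smallest-magnitude weights are masked until the requested fraction is zero. The function level_mask is illustrative only and works on plain Python lists; it is not the library's implementation, which operates on framework tensors.

```python
def level_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude weights.

    Illustrative sketch only; not the NNI or aup implementation.
    """
    num_prune = int(len(weights) * sparsity)
    if num_prune == 0:
        return [1] * len(weights)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:num_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


weights = [0.05, -0.9, 0.3, -0.01, 0.6, 0.2, -0.4, 0.1, 0.7, -0.02]
mask = level_mask(weights, sparsity=0.8)
print(mask)  # only the two largest-magnitude weights are kept
```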

Slim Pruner

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "slim",
    "config_list": [{
            "sparsity": 0.8,
            "op_types": ["BatchNorm2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer=None)

FPGM Pruner

This pruner supports the dependency-aware mode, which can yield a better speed-up from pruning.

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "fpgm",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer=None, dependency_aware=False, dummy_input=None)

L1Filter Pruner

This pruner supports the dependency-aware mode.

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "l1_filter",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer=None, dependency_aware=False, dummy_input=None)

L2Filter Pruner

This pruner supports the dependency-aware mode.

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "l2_filter",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer=None, dependency_aware=False, dummy_input=None)

ActivationAPoZRankFilter Pruner

This pruner supports the dependency-aware mode.

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "activation_apoz_rank_filter",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer=None, activation='relu', statistics_batch_num=1, dependency_aware=False, dummy_input=None)

ActivationMeanRankFilter Pruner

This pruner supports the dependency-aware mode.

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "activation_mean_rank_filter",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer=None, activation='relu', statistics_batch_num=1, dependency_aware=False, dummy_input=None)

TaylorFOWeightFilter Pruner

This pruner supports the dependency-aware mode.

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "taylor_fo_weight_filter",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer=None, statistics_batch_num=1, dependency_aware=False, dummy_input=None)

AGP Pruner

Special requirements for usage (example):

compressor = aup.compression.create_compressor(model, config, optimizer=optimizer)
model = compressor.compress()

for epoch in range(1, args.epochs + 1):
    # ... train the model here for one epoch
    compressor.update_epoch(epoch)

Call compressor.update_epoch(epoch) at the end of each training epoch so that the pruner knows the current epoch number.

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "agp",
    "config_list": [{
            "initial_sparsity": 0.0,
            "final_sparsity": 0.8,
            "start_epoch": 0,
            "end_epoch": 10,
            "frequency": 1,
            "op_types": ["default"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer, pruning_algorithm='level')
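The reason the epoch number matters is that AGP raises the sparsity gradually between "start_epoch" and "end_epoch". The sketch below uses the cubic schedule from the original AGP paper (Zhu and Gupta). It is an approximation under stated assumptions: the function agp_target_sparsity is hypothetical, it ignores the "frequency" parameter, and the library's exact epoch arithmetic may differ.

```python
def agp_target_sparsity(epoch, initial_sparsity, final_sparsity,
                        start_epoch, end_epoch):
    """Cubic sparsity schedule; a sketch, not the library's exact code."""
    if epoch <= start_epoch:
        return initial_sparsity
    if epoch >= end_epoch:
        return final_sparsity
    # Fraction of the pruning window that has elapsed, in (0, 1).
    progress = (epoch - start_epoch) / (end_epoch - start_epoch)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3


# Values taken from the configuration above.
schedule = [agp_target_sparsity(e, 0.0, 0.8, 0, 10) for e in range(0, 11)]
for epoch, sparsity in enumerate(schedule):
    print(epoch, round(sparsity, 3))
```

Under this schedule the sparsity rises quickly in the early epochs and flattens as it approaches "final_sparsity".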

NetAdapt Pruner

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "net_adapt",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, short_term_fine_tuner, evaluator, optimize_mode='maximize', base_algo='l1', sparsity_per_iteration=0.05, experiment_data_dir='./')

SimulatedAnnealing Pruner

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "simulated_annealing",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, evaluator, optimize_mode='maximize', base_algo='l1', start_temperature=100, stop_temperature=20, cool_down_rate=0.9, perturbation_magnitude=0.35, experiment_data_dir='./')
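The three temperature arguments determine how long the search runs. The sketch below assumes geometric cooling, where the temperature is multiplied by cool_down_rate after each iteration until it is no longer above stop_temperature. This is the usual reading of these parameters, but the function temperature_schedule is hypothetical and the library's loop may differ in detail.

```python
def temperature_schedule(start_temperature, stop_temperature, cool_down_rate):
    """List the temperatures visited under geometric cooling (a sketch)."""
    temperatures = []
    temperature = start_temperature
    while temperature > stop_temperature:
        temperatures.append(temperature)
        temperature *= cool_down_rate
    return temperatures


# Default values from the call above.
schedule = temperature_schedule(100, 20, 0.9)
print(len(schedule), "annealing iterations under this assumption")
```

A cool_down_rate closer to 1 or a lower stop_temperature lengthens the search.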

AutoCompress Pruner

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "auto_compress",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, trainer, evaluator, dummy_input, num_iterations=3, optimize_mode='maximize', base_algo='l1', start_temperature=100, stop_temperature=20, cool_down_rate=0.9, perturbation_magnitude=0.35, admm_num_iterations=30, admm_training_epochs=5, row=0.0001, experiment_data_dir='./')

AMC Pruner

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "amc",
    "config_list": [{
            "op_types": ["Conv2d", "Linear"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, evaluator, val_loader, suffix=None, model_type='mobilenet', dataset='cifar10', flops_ratio=0.5, lbound=0.2, rbound=1.0, reward='acc_reward', n_calibration_batches=60, n_points_per_layer=10, channel_round=8, hidden1=300, hidden2=300, lr_c=0.001, lr_a=0.0001, warmup=100, discount=1.0, bsize=64, rmsize=100, window_length=1, tau=0.01, init_delta=0.5, delta_decay=0.99, max_episode_length=1000000000.0, output_dir='./logs', debug=False, train_episode=800, epsilon=50000, seed=None)

ADMM Pruner

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "admm",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"],
            "op_names": ["conv1"]
        }, {
            "sparsity": 0.5,
            "op_types": ["Conv2d"],
            "op_names": ["conv2"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, trainer, num_iterations=30, training_epochs=5, row=0.0001, base_algo='l1')
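When many layers need their own entry, the per-layer "config_list" shown above can be generated rather than written by hand. The helper per_layer_config below is hypothetical and only builds the list; the layer names are the ones used in the example configuration.

```python
def per_layer_config(op_names, sparsity, op_type="Conv2d"):
    """Build one config_list entry per named layer (illustrative helper)."""
    return [
        {"sparsity": sparsity, "op_types": [op_type], "op_names": [name]}
        for name in op_names
    ]


config_list = per_layer_config(["conv1", "conv2"], sparsity=0.5)
print(config_list)
```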

Lottery Ticket Hypothesis Pruner

Special requirements for usage (example):

compressor = aup.compression.create_compressor(model, config, optimizer=optimizer, lr_scheduler=scheduler)
model = compressor.compress()

for _ in compressor.get_prune_iterations():
    compressor.prune_iteration_start()
    for epoch in range(1, args.epochs + 1):
        # ... train model here for one epoch
        pass  # placeholder so the loop body is valid Python

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "lottery_ticket",
    "config_list": [{
            "prune_iterations": 5,
            "sparsity": 0.8,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, optimizer=None, lr_scheduler=None, reset_weights=True)
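The "prune_iterations" and "sparsity" parameters together determine how much is pruned in each round. One common choice, assumed here, is to keep the same fraction of the remaining weights in every round so that the final round reaches the target sparsity. The function lottery_ticket_sparsities is hypothetical, and the library's exact schedule and iteration indexing may differ.

```python
def lottery_ticket_sparsities(final_sparsity, prune_iterations):
    """Cumulative sparsity after each pruning round (a sketch).

    Assumes an equal keep ratio per round, so that the keep ratio after
    all rounds equals 1 - final_sparsity.
    """
    keep_once = (1.0 - final_sparsity) ** (1.0 / prune_iterations)
    return [1.0 - keep_once ** i for i in range(1, prune_iterations + 1)]


# Values from the configuration above.
sparsities = lottery_ticket_sparsities(final_sparsity=0.8, prune_iterations=5)
for round_index, sparsity in enumerate(sparsities, start=1):
    print(round_index, round(sparsity, 3))
```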

Sensitivity Pruner

Special requirements for usage (example):

compressor = aup.compression.create_compressor(model, config, finetuner=short_term_fine_tuner, evaluator=evaluator)
model = compressor.compress(eval_args=[model], finetune_args=[model])

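Note the arguments passed to compressor.compress: eval_args and finetune_args hold the arguments that are forwarded to the evaluator and the fine-tuner, respectively.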

Configuration:

"compression": {
    "framework": "torch",
    "type": "pruning",
    "compressor": "sensitivity",
    "config_list": [{
            "sparsity": 0.5,
            "op_types": ["Conv2d"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config, evaluator, finetuner=None, base_algo='l1', sparsity_proportion_calc=None, sparsity_per_iter=0.1, acc_drop_threshold=0.05, checkpoint_dir=None)

Quantizers

Naive Quantizer

Configuration:

"compression": {
    "framework": "torch",
    "type": "quantization",
    "compressor": "naive",
    "config_list": []
}

Example creation:

aup.create_compressor(model, config)

QAT Quantizer

Configuration:

"compression": {
    "framework": "torch",
    "type": "quantization",
    "compressor": "qat",
    "config_list": [{
        "quant_types": ["weight"],
        "quant_bits": {
            "weight": 8
        },
        "op_types": ["Conv2d", "Linear"]
    }, {
        "quant_types": ["output"],
        "quant_bits": 8,
        "quant_start_step": 7000,
        "op_types": ["ReLU6"]
    }]
}

Example creation:

aup.create_compressor(model, config)
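In the configuration above, "quant_bits" appears in two forms: a dictionary keyed by quantization type, and a single integer. The sketch below shows one way to read both forms uniformly, assuming that an integer applies to every entry in "quant_types". The helper resolve_quant_bits is hypothetical and not part of the library.

```python
def resolve_quant_bits(layer_config):
    """Return a {quant_type: bits} mapping for one config_list entry.

    Assumption: an integer "quant_bits" applies to every listed quant type.
    """
    bits = layer_config["quant_bits"]
    quant_types = layer_config["quant_types"]
    if isinstance(bits, int):
        return {quant_type: bits for quant_type in quant_types}
    return {quant_type: bits[quant_type] for quant_type in quant_types}


weight_entry = {
    "quant_types": ["weight"],
    "quant_bits": {"weight": 8},
    "op_types": ["Conv2d", "Linear"],
}
output_entry = {
    "quant_types": ["output"],
    "quant_bits": 8,
    "quant_start_step": 7000,
    "op_types": ["ReLU6"],
}

weight_bits = resolve_quant_bits(weight_entry)
output_bits = resolve_quant_bits(output_entry)
print(weight_bits, output_bits)
```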

DoReFa Quantizer

Configuration:

"compression": {
    "framework": "torch",
    "type": "quantization",
    "compressor": "dorefa",
    "config_list": [{
        "quant_types": ["weight"],
        "quant_bits": 8,
        "op_types": ["default"]
    }]
}

Example creation:

aup.create_compressor(model, config)

BNN Quantizer

Configuration:

"compression": {
    "framework": "torch",
    "type": "quantization",
    "compressor": "bnn",
    "config_list": [{
            "quant_bits": 1,
            "quant_types": ["weight"],
            "op_types": ["Conv2d", "Linear"],
            "op_names": ["conv1", "conv2", "fc1", "fc2"]
        }, {
            "quant_bits": 1,
            "quant_types": ["output"],
            "op_types": ["relu"],
            "op_names": ["relu1", "relu2", "relu3"]
        }
    ]
}

Example creation:

aup.create_compressor(model, config)
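With "quant_bits" set to 1, each value is reduced to one of two levels. A minimal sketch of sign-based binarization on a plain Python list follows. Mapping zero to +1 is an assumption, and the function binarize is illustrative rather than the library's implementation.

```python
def binarize(weights):
    """Map each weight to +1.0 or -1.0 by sign (zero maps to +1.0 here)."""
    return [1.0 if w >= 0 else -1.0 for w in weights]


binary = binarize([0.3, -0.7, 0.0, -0.01])
print(binary)
```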