FlexMF WARP

This page analyzes the hyperparameter tuning results for the FlexMF scorer in implicit-feedback mode with WARP loss.

Parameter Search Space

Parameter	Type	Distribution	Values	Selected
embedding_size	Integer	LogUniform	4 ≤ \(x\) ≤ 512	28
regularization	Float	LogUniform	0.0001 ≤ \(x\) ≤ 10	0.0286
learning_rate	Float	LogUniform	0.001 ≤ \(x\) ≤ 0.1	0.00408
reg_method	Categorical	Uniform	L2, AdamW	AdamW
item_bias	Categorical	Uniform	True, False	False
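
To make the distributions in the table concrete, here is a minimal standard-library sketch of drawing one configuration from this search space. It is not the Ray Tune sampler used for the actual search. The helper names (`log_uniform`, `log_uniform_int`, `sample_config`) are hypothetical, and the round-and-clamp treatment of the integer parameter is an assumption that may differ from how the tuner discretizes `embedding_size`.

```python
import math
import random


def log_uniform(rng, low, high):
    """Draw a float whose logarithm is uniform on [log(low), log(high)]."""
    value = math.exp(rng.uniform(math.log(low), math.log(high)))
    # Clamp to guard against floating-point overshoot at the bounds.
    return min(high, max(low, value))


def log_uniform_int(rng, low, high):
    """Integer variant (assumption): sample log-uniformly, round, then clamp."""
    return min(high, max(low, round(log_uniform(rng, low, high))))


def sample_config(rng):
    """Draw one configuration using the bounds listed in the table above."""
    return {
        "embedding_size": log_uniform_int(rng, 4, 512),
        "regularization": log_uniform(rng, 1e-4, 10.0),
        "learning_rate": log_uniform(rng, 1e-3, 1e-1),
        "reg_method": rng.choice(["L2", "AdamW"]),
        "item_bias": rng.choice([True, False]),
    }


rng = random.Random(42)  # fixed seed so the sketch is repeatable
for _ in range(3):
    print(sample_config(rng))
```

A log-uniform draw spends equal probability on each order of magnitude, so regularization values between 0.0001 and 0.001 are as likely as values between 1 and 10.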

Final Result

The search selected the following configuration:

{
    'embedding_size': 28,
    'regularization': 0.02855101489089173,
    'learning_rate': 0.004078197462005269,
    'reg_method': 'AdamW',
    'item_bias': False,
    'epochs': 14
}

With these metrics:

{
    'RBP': 0.17094542879019017,
    'DCG': 11.334649470685775,
    'NDCG': 0.41540282134128215,
    'RecipRank': 0.33539222916924855,
    'Hit10': 0.5556309362279511,
    'max_epochs': 50,
    'epoch_train_s': 70.86528935699607,
    'epoch_measure_s': 17.860809533995052,
    'done': True,
    'training_iteration': 14,
    'trial_id': 'db5a5260',
    'date': '2025-07-29_00-10-20',
    'timestamp': 1753762220,
    'time_this_iter_s': 88.72992277145386,
    'time_total_s': 1232.675313949585,
    'pid': 179487,
    'hostname': 'CCI-ws21',
    'node_ip': '10.248.127.152',
    'config': {
        'embedding_size': 28,
        'regularization': 0.02855101489089173,
        'learning_rate': 0.004078197462005269,
        'reg_method': 'AdamW',
        'item_bias': False,
        'epochs': 14
    },
    'time_since_restore': 1232.675313949585,
    'iterations_since_restore': 14
}
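
The timing fields in this result can be cross-checked with a little arithmetic. The sketch below assumes that `epoch_train_s` and `epoch_measure_s` describe the final epoch and that `time_this_iter_s` is the wall-clock time of that same iteration; the report does not state this, so treat the interpretation as an assumption.

```python
# Timing values copied from the result above.
result = {
    "epoch_train_s": 70.86528935699607,
    "epoch_measure_s": 17.860809533995052,
    "time_this_iter_s": 88.72992277145386,
    "time_total_s": 1232.675313949585,
    "training_iteration": 14,
}

# Time accounted for by training plus measurement in the last epoch.
accounted = result["epoch_train_s"] + result["epoch_measure_s"]

# Remainder of the iteration (assumed to be reporting overhead).
overhead = result["time_this_iter_s"] - accounted

# Average wall-clock time per iteration over the whole trial.
mean_iter = result["time_total_s"] / result["training_iteration"]

print(f"train + measure:    {accounted:.3f} s")
print(f"unaccounted:        {overhead:.4f} s")
print(f"mean per iteration: {mean_iter:.3f} s")
```

Under that assumption, training and measurement account for nearly all of the final iteration, and the trial averaged roughly 88 seconds per iteration over its 14 iterations.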

Parameter Analysis

Embedding Size

The embedding size is the hyperparameter that most affects the model’s fundamental logic, so let’s look at performance as a function of it:

Iteration Completion

How many iterations, on average, did we complete?

How did the metric progress in the best result?

How did the metric progress in the longest results?