FlexMF BPR on ML10M

This page analyzes the hyperparameter tuning results for the FlexMF scorer in implicit-feedback mode with pairwise loss (Bayesian Personalized Ranking).

Parameter Search Space

| Parameter       | Type        | Distribution | Values               | Selected |
|-----------------|-------------|--------------|----------------------|----------|
| embedding_size  | Integer     | LogUniform   | 4 ≤ \(x\) ≤ 512      | 125      |
| regularization  | Float       | LogUniform   | 0.0001 ≤ \(x\) ≤ 10  | 0.187    |
| learning_rate   | Float       | LogUniform   | 0.001 ≤ \(x\) ≤ 0.1  | 0.00238  |
| reg_method      | Categorical | Uniform      | L2, AdamW            | AdamW    |
| negative_count  | Integer     | Uniform      | 1 ≤ \(x\) ≤ 5        | 4        |
| item_bias       | Categorical | Uniform      | True, False          | True     |
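To make the distributions concrete, here is a minimal sketch of sampling one configuration from this search space using only the standard library; an actual tuning run would use a search library's equivalent samplers, and the function name `sample_config` is hypothetical.

```python
import math
import random


def sample_config(rng: random.Random) -> dict:
    """Draw one configuration from the search space in the table above."""

    def log_uniform(lo: float, hi: float) -> float:
        # Sample uniformly in log space, then exponentiate.
        return math.exp(rng.uniform(math.log(lo), math.log(hi)))

    return {
        "embedding_size": round(log_uniform(4, 512)),   # log-uniform integer
        "regularization": log_uniform(1e-4, 10.0),      # log-uniform float
        "learning_rate": log_uniform(1e-3, 0.1),
        "reg_method": rng.choice(["L2", "AdamW"]),      # uniform categorical
        "negative_count": rng.randint(1, 5),            # uniform integer
        "item_bias": rng.choice([True, False]),
    }


cfg = sample_config(random.Random(42))
```

Log-uniform sampling matters here because plausible values of `regularization` and `learning_rate` span several orders of magnitude; a plain uniform draw would almost never try the small end of the range.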

Final Result

The search selected the following configuration:

{
    'embedding_size': 125,
    'regularization': 0.18690470640031773,
    'learning_rate': 0.002384935477030483,
    'reg_method': 'AdamW',
    'negative_count': 4,
    'item_bias': True,
    'epochs': 12
}

With these metrics:

{
    'RBP': 0.2523367609430753,
    'NDCG': 0.48489473623806856,
    'RecipRank': 0.44162699373801495,
    'TrainTask': '99fe91b6-061b-45a1-a530-075dfa74a2dd',
    'TrainTime': None,
    'TrainCPU': None,
    'max_epochs': 50,
    'done': True,
    'training_iteration': 12,
    'trial_id': '6fcf9_00005',
    'date': '2025-04-04_22-34-50',
    'timestamp': 1743820490,
    'time_this_iter_s': 7.355039834976196,
    'time_total_s': 91.3506453037262,
    'pid': 502774,
    'hostname': 'CCI-ws21',
    'node_ip': '10.248.127.152',
    'config': {
        'embedding_size': 125,
        'regularization': 0.18690470640031773,
        'learning_rate': 0.002384935477030483,
        'reg_method': 'AdamW',
        'negative_count': 4,
        'item_bias': True,
        'epochs': 12
    },
    'time_since_restore': 15.703184127807617,
    'iterations_since_restore': 2
}

Parameter Analysis

Embedding Size

The embedding size is the hyperparameter that most affects the model’s fundamental logic, so let’s look at performance as a function of it:
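One way to summarize this relationship is to bucket trials by (roughly) power-of-two embedding size and keep the best NDCG in each bucket, which matches the log-uniform search axis. The trial values below are synthetic placeholders for illustration, not the actual run's results.

```python
import math

# Illustrative (embedding_size, NDCG) pairs, as a tuning search might
# produce them. These numbers are synthetic, not the real trial data.
trials = [(8, 0.41), (16, 0.44), (32, 0.46), (64, 0.47),
          (125, 0.485), (256, 0.47), (500, 0.45)]

# Bucket each size to the nearest power of two so the log-scaled axis
# reads evenly, then keep the best NDCG observed in each bucket.
best_by_bucket: dict[int, float] = {}
for size, ndcg in trials:
    bucket = 2 ** round(math.log2(size))
    best_by_bucket[bucket] = max(best_by_bucket.get(bucket, 0.0), ndcg)

for bucket in sorted(best_by_bucket):
    print(bucket, best_by_bucket[bucket])
```

Taking the best (rather than the mean) per bucket reduces the influence of trials whose other hyperparameters happened to be poor.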

Data Handling

Learning Parameters

Iteration Completion

How many iterations, on average, did we complete?

How did the metric progress in the best result?

How did the metric progress in the longest-running trials?
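A simple way to answer these questions is to look at each trial's per-iteration metric history and find where it plateaus. The sketch below uses an illustrative history (synthetic numbers, not the selected trial's actual per-epoch values) and reports the first epoch within 1% of the final metric.

```python
# Illustrative per-epoch NDCG history for a 12-iteration trial
# (synthetic values for demonstration only).
history = [0.31, 0.39, 0.43, 0.455, 0.466, 0.473, 0.478, 0.481,
           0.483, 0.484, 0.4845, 0.4849]

# Find where training effectively plateaued: the first epoch whose
# metric is within 1% of the final value.
final = history[-1]
plateau = next(i for i, v in enumerate(history, start=1)
               if v >= 0.99 * final)
print(f"final NDCG={final:.4f}, within 1% of final by epoch {plateau}")
```

Applying the same summary across all trials shows whether the iteration budget (`max_epochs: 50` here, with early stopping at 12) was generous or binding.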