This page analyzes the hyperparameter tuning results for the implicit-feedback ALS matrix factorization model.
## Parameter Search Space
| Parameter | Type | Distribution | Range | Selected Value |
|---|---|---|---|---|
| embedding_size | Integer | LogUniform | 4 ≤ \(x\) ≤ 512 | 29 |
| regularization.user | Float | LogUniform | 1e-05 ≤ \(x\) ≤ 1 | 0.000426 |
| regularization.item | Float | LogUniform | 1e-05 ≤ \(x\) ≤ 1 | 0.000669 |
| damping.user | Float | LogUniform | 1e-12 ≤ \(x\) ≤ 100 | 0.00059 |
| damping.item | Float | LogUniform | 1e-12 ≤ \(x\) ≤ 100 | 4.75e-11 |
| weight | Float | Uniform | 5 ≤ \(x\) ≤ 100 | 5.1 |
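The table above can be re-created as a sampling routine. The actual study defines this space with Ray Tune (the `tune.loguniform` / `tune.uniform` family); the following is a minimal standard-library sketch of the same distributions, with the parameter names taken from the table and everything else assumed:

```python
import math
import random


def loguniform(low, high, rng=random):
    # Log-uniform sampling: draw uniformly in log space, then exponentiate,
    # so each order of magnitude in [low, high] is equally likely.
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_config(rng=random):
    # Hypothetical reconstruction of the search space in the table above;
    # the real definition lives in the tuning harness, not this page.
    return {
        "embedding_size": int(round(loguniform(4, 512, rng))),
        "regularization": {
            "user": loguniform(1e-5, 1.0, rng),
            "item": loguniform(1e-5, 1.0, rng),
        },
        "damping": {
            "user": loguniform(1e-12, 100.0, rng),
            "item": loguniform(1e-12, 100.0, rng),
        },
        "weight": rng.uniform(5.0, 100.0),
    }
```

Sampling in log space is what makes a range like 1e-12 to 100 searchable at all; a plain uniform draw would almost never land below 1.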
## Final Result
The search selected the following configuration:
```python
{
    'embedding_size': 29,
    'regularization': {'user': 0.0004256186789587035, 'item': 0.0006685413608869978},
    'damping': {'user': 0.0005900842648067328, 'item': 4.748990451426033e-11},
    'weight': 5.099171622868312,
    'epochs': 4
}
```
With these metrics:
```python
{
    'RBP': 0.19025128426592744,
    'LogRBP': 2.1348304379208702,
    'NDCG': 0.441576760655177,
    'RecipRank': 0.36493155503425506,
    'TrainTask': '0df97675-64d0-4dd1-ac5c-16311ab09df0',
    'TrainTime': None,
    'TrainCPU': None,
    'max_epochs': 30,
    'done': False,
    'training_iteration': 4,
    'trial_id': '269b1cd6',
    'date': '2025-05-05_22-57-44',
    'timestamp': 1746500264,
    'time_this_iter_s': 32.28228139877319,
    'time_total_s': 145.5365447998047,
    'pid': 407079,
    'hostname': 'CCI-ws21',
    'node_ip': '10.248.127.152',
    'config': {
        'embedding_size': 29,
        'regularization': {'user': 0.0004256186789587035, 'item': 0.0006685413608869978},
        'damping': {'user': 0.0005900842648067328, 'item': 4.748990451426033e-11},
        'weight': 5.099171622868312,
        'epochs': 4
    },
    'time_since_restore': 145.5365447998047,
    'iterations_since_restore': 4
}
```
## Parameter Analysis
### Embedding Size
The embedding size is the hyperparameter that most affects the model’s fundamental structure, so let’s look at performance as a function of it:
### Iteration Completion
How many iterations, on average, did we complete?
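This can be read off the trial results, since Ray Tune records a `training_iteration` counter in each result. A minimal sketch, assuming `trials` is the list of final result dicts:

```python
from statistics import mean


def mean_iterations(trials):
    # Average number of training iterations completed per trial;
    # 'training_iteration' is the counter Ray Tune reports per result.
    return mean(t["training_iteration"] for t in trials)
```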
How did the metric progress in the best result?
How did the metric progress in the longest-running trials?
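Both progression questions can be answered from a trial's per-iteration result history. A minimal sketch, assuming `history` is the list of intermediate result dicts Ray Tune records for one trial (one per completed epoch, each shaped like the Final Result dict above):

```python
def metric_progress(history, metric="RBP"):
    # Return (iteration, metric value) pairs in iteration order,
    # suitable for plotting a learning curve for one trial.
    pairs = [(r["training_iteration"], r[metric]) for r in history]
    return sorted(pairs)
```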