ML25M Data Description

Rating Statistics

Statistic     Value
Ratings       25,000,095
Users         162,541
Items         59,047
Density       0.260%
Item Gini     0.942
Start Date    1995-01-09 11:46:49
End Date      2019-11-21 09:15:03
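As a rough illustration of how figures like these can be derived, the sketch below computes density and an item Gini coefficient from raw ratings. It assumes each rating is a `(user, item, rating, timestamp)` tuple, mirroring the column order of the MovieLens `ratings.csv` file. The function names are hypothetical and are not taken from the code that produced the table above. Items that never appear in the ratings are not counted here, and the source does not state whether the reported Gini value includes them.

```python
from collections import Counter


def density(n_ratings, n_users, n_items):
    """Fraction of the user-item matrix that holds a rating."""
    return n_ratings / (n_users * n_items)


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending values, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def summarize(ratings):
    """Summarize (user, item, rating, timestamp) tuples (assumed layout)."""
    rows = list(ratings)
    users = {row[0] for row in rows}
    item_counts = Counter(row[1] for row in rows)
    return {
        "ratings": len(rows),
        "users": len(users),
        "items": len(item_counts),
        "density": density(len(rows), len(users), len(item_counts)),
        "item_gini": gini(item_counts.values()),
    }
```

Applied to the counts in the table, `density(25_000_095, 162_541, 59_047)` comes to roughly 0.0026, which matches the reported 0.260%.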

Item Statistics

Now let’s look at the distributions of item statistics. What is the distribution of item popularity (the number of ratings each item has received)?
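One way to tabulate this distribution is sketched below. It assumes ratings are `(user, item, rating, timestamp)` tuples, and the helper names are illustrative only.

```python
from collections import Counter


def item_popularity(ratings):
    """Count ratings per item; rows are assumed (user, item, rating, ts)."""
    return Counter(item for _user, item, _rating, _ts in ratings)


def popularity_histogram(popularity):
    """Map k -> number of items that have exactly k ratings, sorted by k."""
    return dict(sorted(Counter(popularity.values()).items()))
```

Because popularity is typically heavily skewed, such a histogram is usually easier to read on logarithmic axes.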

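The same distribution can also be viewed as a Lorenz curve, which plots the cumulative share of ratings against the cumulative share of items, ordered from least to most popular.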
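A minimal sketch of the points on such a curve follows, assuming the input is a collection of per-item rating counts. The function name `lorenz_points` is hypothetical.

```python
def lorenz_points(counts):
    """Return (share of items, share of ratings) pairs, starting at (0, 0)."""
    xs = sorted(counts)  # least popular items first
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one item with a positive count")
    points = [(0.0, 0.0)]
    running = 0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points
```

The further the curve sags below the diagonal, the more the ratings are concentrated on a few items. The Gini coefficient is twice the area between the diagonal and the curve.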

What is the distribution of average ratings?
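Per-item averages can be computed with a single pass over the ratings, as in the sketch below. The tuple layout `(user, item, rating, timestamp)` and the name `mean_rating_by` are assumptions made for illustration.

```python
from collections import defaultdict


def mean_rating_by(ratings, key_index):
    """Mean rating grouped by one column of (user, item, rating, ts) rows.

    key_index=1 groups by item and key_index=0 groups by user.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in ratings:
        key = row[key_index]
        sums[key] += row[2]  # the rating value is assumed to be at index 2
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

Averages for items with very few ratings are noisy, so it can help to read this distribution alongside the popularity distribution.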

User Statistics

How are user averages distributed?

And what is the distribution of user activity levels (number of ratings per user)?

Ratings over Time

The MovieLens ratings have timestamps, so we’ll also look at a temporal view of the data.

Data Volume

How did the data grow over time?

How many ratings arrive each month over the life of the data set?
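Both views, monthly volume and cumulative growth, can be derived from the timestamps. The sketch below assumes timestamps are Unix seconds interpreted in UTC, as in the MovieLens `ratings.csv` file, and that rows are `(user, item, rating, timestamp)` tuples. The function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from itertools import accumulate


def month_of(ts):
    """Format a Unix timestamp (seconds, UTC assumed) as 'YYYY-MM'."""
    d = datetime.fromtimestamp(ts, tz=timezone.utc)
    return f"{d.year:04d}-{d.month:02d}"


def monthly_counts(ratings):
    """Number of ratings per calendar month, in chronological order."""
    counts = Counter(month_of(ts) for _user, _item, _rating, ts in ratings)
    return dict(sorted(counts.items()))


def cumulative(counts):
    """Running total over an ordered month -> count mapping."""
    return dict(zip(counts, accumulate(counts.values())))
```

Months with no ratings do not appear in the result, so a plot over a continuous time axis would need them filled in with zeros first.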

User Activity

The number of unique users per month is a good measure of user activity.
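Counting distinct users per month could look like the following sketch, again assuming `(user, item, rating, timestamp)` tuples with Unix-second timestamps in UTC. The name `monthly_unique_users` is illustrative.

```python
from collections import defaultdict
from datetime import datetime, timezone


def monthly_unique_users(ratings):
    """Map (year, month) -> number of distinct users who rated that month."""
    users_by_month = defaultdict(set)
    for user, _item, _rating, ts in ratings:
        d = datetime.fromtimestamp(ts, tz=timezone.utc)
        users_by_month[(d.year, d.month)].add(user)
    return {month: len(users) for month, users in sorted(users_by_month.items())}
```

A user who rates many items in one month is counted once for that month, which is what separates this measure from raw rating volume.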

How long do users usually stick around?
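One possible way to quantify this, offered as an assumption and not as the definition used in the original analysis, is the time between a user's first and last rating. The sketch assumes `(user, item, rating, timestamp)` tuples with Unix-second timestamps.

```python
def user_tenure_days(ratings):
    """Days between each user's first and last rating (assumed definition)."""
    first, last = {}, {}
    for user, _item, _rating, ts in ratings:
        first[user] = min(ts, first.get(user, ts))
        last[user] = max(ts, last.get(user, ts))
    return {user: (last[user] - first[user]) / 86400 for user in first}
```

Under this definition a user who rated everything in a single session has a tenure close to zero, even if that session contained many ratings.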