Documentation and tests for AsymKNN

mkhe93 committed Dec 1, 2024
1 parent d72d4cb commit deb0972
Showing 8 changed files with 128 additions and 0 deletions.
14 changes: 14 additions & 0 deletions asset/model_list.json
@@ -154,6 +154,20 @@
"repository": "RecBole",
"repo_link": "https://github.com/RUCAIBox/RecBole"
},
{
"category": "General Recommendation",
"cate_link": "/docs/user_guide/model_intro.html#general-recommendation",
"year": "2013",
"pub": "RecSys'13",
"model": "AsymKNN",
"model_link": "/docs/user_guide/model/general/asymknn.html",
"paper": "Efficient Top-N Recommendation for Very Large Scale Binary Rated Datasets",
"paper_link": "https://doi.org/10.1145/2507157.2507189",
"authors": "Fabio Aiolli",
"ref_code": "",
"repository": "RecBole",
"repo_link": "https://github.com/RUCAIBox/RecBole"
},
{
"category": "General Recommendation",
"cate_link": "/docs/user_guide/model_intro.html#general-recommendation",
@@ -0,0 +1,4 @@
.. automodule:: recbole.model.general_recommender.asymknn
:members:
:undoc-members:
:show-inheritance:
1 change: 1 addition & 0 deletions docs/source/recbole/recbole.model.general_recommender.rst
@@ -4,6 +4,7 @@ recbole.model.general\_recommender
.. toctree::
:maxdepth: 4

recbole.model.general_recommender.asymknn
recbole.model.general_recommender.admmslim
recbole.model.general_recommender.bpr
recbole.model.general_recommender.cdae
88 changes: 88 additions & 0 deletions docs/source/user_guide/model/general/asymknn.rst
@@ -0,0 +1,88 @@
AsymKNN
===========

Introduction
---------------------

`[paper] <https://dl.acm.org/doi/pdf/10.1145/2507157.2507189>`_

**Title:** Efficient Top-N Recommendation for Very Large Scale Binary Rated Datasets

**Authors:** Fabio Aiolli

**Abstract:** We present a simple and scalable algorithm for top-N recommendation able to deal with very large datasets and (binary rated) implicit feedback. We focus on memory-based collaborative filtering
algorithms similar to the well-known neighbor-based technique for explicit feedback. The major difference, which makes the algorithm particularly scalable, is that it uses positive feedback only
and no explicit computation of the complete (user-by-user or item-by-item) similarity matrix needs to be performed.
The study of the proposed algorithm has been conducted on data from the Million Songs Dataset (MSD) challenge, whose task was to suggest a set of songs (out of more than 380k available songs) to more than 100k users, given half of the user listening history and
the complete listening history of another 1 million people.
In particular, we investigate the entire recommendation pipeline, starting from the definition of suitable similarity and scoring functions, and give suggestions on how to aggregate multiple ranking strategies to define the overall recommendation. The technique we
propose extends and improves the one that already won the MSD challenge last year.

In this article, we introduce a versatile class of recommendation algorithms that calculate either user-to-user or item-to-item similarities as the foundation for generating recommendations. This approach enables the flexibility to switch between UserKNN and ItemKNN models depending on the desired application.

A distinguishing feature of this class of algorithms, exemplified by AsymKNN, is its use of asymmetric cosine similarity, which generalizes the traditional cosine similarity. Specifically, when the asymmetry parameter
``alpha = 0.5``, the method reduces to the standard cosine similarity, while other values of ``alpha`` allow for tailored emphasis on specific aspects of the interaction data. Furthermore, setting the parameter
``beta = 1.0`` ensures a traditional UserKNN or ItemKNN, as the final scores are only divided by a fixed positive constant, preserving the same order of recommendations.
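
As a toy illustration (not RecBole's internal implementation), the asymmetric cosine similarity described above can be sketched for two binary interaction vectors; the function name and the example vectors here are made up:

```python
import numpy as np

def asym_cosine(x, y, alpha=0.5):
    """Asymmetric cosine similarity between two binary interaction vectors.

    With alpha = 0.5 this reduces to the standard cosine similarity for
    binary data: |x & y| / sqrt(|x| * |y|).
    """
    overlap = np.sum(x * y)  # size of the intersection for 0/1 vectors
    if overlap == 0:
        return 0.0
    return overlap / (np.sum(x) ** alpha * np.sum(y) ** (1 - alpha))

x = np.array([1, 1, 0, 1, 0])
y = np.array([1, 0, 0, 1, 0])
print(asym_cosine(x, y))             # alpha = 0.5: plain cosine
print(asym_cosine(x, y, alpha=1.0))  # normalize by the size of x only
```

Moving ``alpha`` away from ``0.5`` shifts the normalization toward one of the two vectors, which is the "tailored emphasis" on the interaction data mentioned above.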

Running with RecBole
-------------------------

**Model Hyper-Parameters:**

- ``k (int)`` : The neighborhood size. Defaults to ``100``.

- ``alpha (float)`` : Weight parameter for asymmetric cosine similarity. Defaults to ``0.5``.

- ``beta (float)`` : Parameter for controlling the balance between factors in the final score normalization. Defaults to ``1.0``.

- ``q (int)`` : The 'locality of scoring function' parameter. Defaults to ``1``.

**Additional Parameters:**

- ``knn_method (str)`` : Calculates the similarity of users if set to ``'user'``; otherwise, calculates the similarity of items. Defaults to ``'item'``.
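
To make the roles of ``k``, ``q`` and ``beta`` concrete, here is a simplified, hypothetical NumPy sketch of item-based neighborhood scoring for a single user. The function name, the per-item normalizer, and the matrices are illustrative assumptions, not RecBole's actual implementation:

```python
import numpy as np

def knn_scores(sim, user_vec, k=100, q=1, beta=1.0):
    """Score all items for one user from an item-item similarity matrix.

    sim      -- (n_items, n_items) similarity matrix
    user_vec -- (n_items,) binary interaction vector of the user
    """
    s = np.array(sim, dtype=float)
    # keep only the k largest similarities per item (the neighborhood)
    if k < s.shape[1]:
        cutoff = np.sort(s, axis=1)[:, -k][:, None]
        s = np.where(s >= cutoff, s, 0.0)

    # q sharpens the "locality" of the scoring function: q > 1 boosts
    # strong neighbors relative to weak ones, q = 1 leaves them untouched
    scores = (s ** q) @ user_vec

    # one plausible normalization: raise a per-item normalizer to beta
    # (the exact normalizer used by AsymKNN may differ)
    norm = np.maximum(s.sum(axis=1), 1e-12) ** beta
    return scores / norm
```

With ``beta = 0.0`` the normalizer becomes ``1`` and the raw neighborhood scores are returned unchanged.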


**A Running Example:**

Write the following code into a Python file, such as ``run.py``:

.. code:: python

   from recbole.quick_start import run_recbole

   run_recbole(model='AsymKNN', dataset='ml-100k')

And then:

.. code:: bash

   python run.py

Tuning Hyper Parameters
-------------------------

If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``.

.. code:: bash

   k choice [10,50,100,200,250,300,400,500,1000,1500,2000,2500]
   alpha choice [0.0,0.2,0.5,0.8,1.0]
   beta choice [0.0,0.2,0.5,0.8,1.0]
   q choice [1,2,3,4,5,6]

Note that we provide these hyper-parameter ranges for reference only, and we cannot guarantee that they are the optimal range for this model.

Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` for tuning:

.. code:: bash

   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test

For more details about parameter tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.

If you want to change parameters, dataset or evaluation settings, take a look at

- :doc:`../../../user_guide/config_settings`
- :doc:`../../../user_guide/data_intro`
- :doc:`../../../user_guide/train_eval_intro`
- :doc:`../../../user_guide/usage`
1 change: 1 addition & 0 deletions docs/source/user_guide/model_intro.rst
@@ -13,6 +13,7 @@ task of top-n recommendation. All the collaborative filter(CF) based models are
.. toctree::
:maxdepth: 1

model/general/asymknn
model/general/pop
model/general/itemknn
model/general/bpr
1 change: 1 addition & 0 deletions recbole/model/general_recommender/__init__.py
@@ -1,3 +1,4 @@
from recbole.model.general_recommender.asymknn import AsymKNN
from recbole.model.general_recommender.bpr import BPR
from recbole.model.general_recommender.cdae import CDAE
from recbole.model.general_recommender.convncf import ConvNCF
5 changes: 5 additions & 0 deletions recbole/properties/model/AsymKNN.yaml
@@ -0,0 +1,5 @@
k: 100 # Number of neighbors to consider in the similarity calculation.
q: 1 # Exponent for adjusting the 'locality of scoring function' after similarity computation.
beta: 1.0 # Parameter for controlling the balance between factors in the final score normalization.
alpha: 0.5           # Weight parameter for asymmetric cosine similarity.
knn_method: 'item' # Calculate the similarity of users if method is 'user', otherwise, calculate the similarity of items.
14 changes: 14 additions & 0 deletions tests/model/test_model_auto.py
@@ -50,6 +50,20 @@ def test_userknn(self):
}
quick_test(config_dict)

def test_asymitemknn(self):
    config_dict = {
        "model": "AsymKNN",
        "knn_method": "item"
    }
    quick_test(config_dict)

def test_asymuserknn(self):
    config_dict = {
        "model": "AsymKNN",
        "knn_method": "user"
    }
    quick_test(config_dict)

def test_bpr(self):
config_dict = {
"model": "BPR",
