Quick Start
===============

Here is a quick-start example for using RecBole. RecBole supports general recommendation, sequential recommendation, context-aware recommendation and knowledge-based recommendation. We will select a representative model from each type of recommendation to show you how to train and test on the **ml-100k** dataset from both **API** and **source code**.

.. toctree::
   :maxdepth: 1

   started/general
   started/sequential
   started/context-aware
   started/knowledge-based

Quick-start From API
---------------------------
In the following, we take the general recommendation model **BPR** as the running example; quick-starts for the other types of models are collected in the pages linked above.

1. Prepare your data:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Before running a model, you first need to prepare and load data. To help users get started quickly,
RecBole has a built-in dataset **ml-100k** which you can use directly. However, if you want to use other datasets, you can read
:doc:`../user_guide/usage/running_new_dataset` for more information.

Then, you need to set the data config for data loading. You can create a `yaml` file called `test.yaml` and write the following settings:

.. code:: yaml

    # dataset config
    USER_ID_FIELD: user_id
    ITEM_ID_FIELD: item_id
    load_col:
        inter: [user_id, item_id]

For more details of data config, please refer to :doc:`../user_guide/config/data_settings`.
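
If you want to double-check the data config before training, the following minimal sketch (an optional, illustrative step rather than part of the required workflow) loads **ml-100k** with RecBole's ``Config`` and ``create_dataset`` utilities and prints the dataset statistics; it assumes the `test.yaml` created above sits in the working directory:

.. code:: python

    from recbole.config import Config
    from recbole.data import create_dataset

    # build the configuration from the model/dataset names plus our yaml file
    config = Config(model='BPR', dataset='ml-100k', config_file_list=['test.yaml'])

    # build the Dataset object described by the data config
    dataset = create_dataset(config)

    # printing the dataset reports user/item counts, interaction number and sparsity
    print(dataset)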

2. Choose a model:
>>>>>>>>>>>>>>>>>>>>>>>>>
In RecBole, we implement 73 recommendation models covering general recommendation, sequential recommendation,
context-aware recommendation and knowledge-based recommendation. You can choose a model from our :doc:`../user_guide/model_intro`.
Here we choose the BPR model to train and test.

Then, you need to set the parameters for the BPR model. You can check :doc:`../user_guide/model/general/bpr` and add the model settings into `test.yaml`, like:

.. code:: yaml

    # model config
    embedding_size: 64

If you want to run different models, you can read :doc:`../user_guide/usage/running_different_models` for more information.
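
As a further optional sanity check, the model settings written into `test.yaml` can be read back from the same ``Config`` object used above; this is only an illustrative sketch:

.. code:: python

    from recbole.config import Config

    config = Config(model='BPR', dataset='ml-100k', config_file_list=['test.yaml'])

    # values from test.yaml (and RecBole's defaults) are exposed dict-style
    print(config['embedding_size'])  # expected to print 64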

3. Set training and evaluation config:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
In RecBole, we support multiple training and evaluation methods. You can choose how to train and test a model simply by setting the config.

Here we want to train and test the BPR model with the training-validation-test method (optimize the model parameters on the training set, select parameters according to the results on the validation set,
and finally report the results on the test set) and evaluate the model performance by full ranking with all item candidates,
so we add the following settings into `test.yaml`:

.. code:: yaml

    # Training and evaluation config
    epochs: 500
    train_batch_size: 4096
    eval_batch_size: 4096
    neg_sampling:
        uniform: 1
    eval_args:
        group_by: user
        order: RO
        split: {'RS': [0.8,0.1,0.1]}
        mode: full
    metrics: ['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']
    topk: 10
    valid_metric: MRR@10
    metric_decimal_place: 4

For more details of training and evaluation config, please refer to :doc:`../user_guide/config/training_settings` and :doc:`../user_guide/config/evaluation_settings`.
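
If you would like to see how these settings split the data before launching a full run, a small sketch with RecBole's ``data_preparation`` helper (reusing the ``Config`` and ``Dataset`` objects from the earlier sketches) builds the train/validation/test dataloaders according to ``eval_args``; again, this is optional and only for illustration:

.. code:: python

    from recbole.config import Config
    from recbole.data import create_dataset, data_preparation

    config = Config(model='BPR', dataset='ml-100k', config_file_list=['test.yaml'])
    dataset = create_dataset(config)

    # splits the interactions 8:1:1 per user in random order, as set in eval_args
    train_data, valid_data, test_data = data_preparation(config, dataset)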

4. Run the model and collect the result
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Now that you have finished all the preparations, it's time to run the model!

You can create a new Python file (e.g., `run.py`) and write the following code:

.. code:: python

    from recbole.quick_start import run_recbole

    run_recbole(model='BPR', dataset='ml-100k', config_file_list=['test.yaml'])

Then run the following command:

.. code:: bash

    python run.py
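
The log below comes from a GPU machine. If you want to force CPU training or pick a particular GPU, RecBole's environment settings ``use_gpu`` and ``gpu_id`` can be added to `test.yaml`; the values shown here are an optional, illustrative addition rather than part of the original example:

.. code:: yaml

    # optional environment settings
    use_gpu: True   # set to False to train on CPU
    gpu_id: 0       # which GPU to use when use_gpu is True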

And you will obtain output like this:

.. code:: none

    24 Aug 01:46 INFO ml-100k
    The number of users: 944
    Average actions of users: 106.04453870625663
    The number of items: 1683
    Average actions of items: 59.45303210463734
    The number of inters: 100000
    The sparsity of the dataset: 93.70575143257098%
    Remain Fields: ['user_id', 'item_id']
    24 Aug 01:46 INFO [Training]: train_batch_size = [4096] negative sampling: [{'uniform': 1}]
    24 Aug 01:46 INFO [Evaluation]: eval_batch_size = [4096] eval_args: [{'split': {'RS': [0.8, 0.1, 0.1]}, 'group_by': 'user', 'order': 'RO', 'mode': 'full'}]
    24 Aug 01:46 INFO BPR(
        (user_embedding): Embedding(944, 64)
        (item_embedding): Embedding(1683, 64)
        (loss): BPRLoss()
    )
    Trainable parameters: 168128
    Train 0: 100%|████████████████████████| 40/40 [00:00<00:00, 200.47it/s, GPU RAM: 0.01 G/11.91 G]
    24 Aug 01:46 INFO epoch 0 training [time: 0.21s, train loss: 27.7228]
    Evaluate : 100%|██████████████████████| 472/472 [00:00<00:00, 518.65it/s, GPU RAM: 0.01 G/11.91 G]
    24 Aug 01:46 INFO epoch 0 evaluating [time: 0.92s, valid_score: 0.020500]
    ......
    Train 96: 100%|████████████████████████| 40/40 [00:00<00:00, 229.26it/s, GPU RAM: 0.01 G/11.91 G]
    24 Aug 01:47 INFO epoch 96 training [time: 0.18s, train loss: 3.7170]
    Evaluate : 100%|██████████████████████| 472/472 [00:00<00:00, 857.00it/s, GPU RAM: 0.01 G/11.91 G]
    24 Aug 01:47 INFO epoch 96 evaluating [time: 0.56s, valid_score: 0.375200]
    24 Aug 01:47 INFO valid result:
    recall@10 : 0.2162 mrr@10 : 0.3752 ndcg@10 : 0.2284 hit@10 : 0.7508 precision@10 : 0.1602
    24 Aug 01:47 INFO Finished training, best eval result in epoch 85
    24 Aug 01:47 INFO Loading model structure and parameters from saved/BPR-Aug-24-2021_01-46-43.pth
    Evaluate : 100%|██████████████████████| 472/472 [00:00<00:00, 866.53it/s, GPU RAM: 0.01 G/11.91 G]
    24 Aug 01:47 INFO best valid : {'recall@10': 0.2195, 'mrr@10': 0.3871, 'ndcg@10': 0.2344, 'hit@10': 0.7582, 'precision@10': 0.1627}
    24 Aug 01:47 INFO test result: {'recall@10': 0.2523, 'mrr@10': 0.4855, 'ndcg@10': 0.292, 'hit@10': 0.7953, 'precision@10': 0.1962}

Finally you will get the model's performance on the test set, and the model file will be saved under the `saved/` folder. Besides,
RecBole allows tracking and visualizing the train loss and valid score with TensorBoard; please read :doc:`../user_guide/usage/use_tensorboard` for more details.
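
If you later want to restore the saved model, for example to evaluate it again, recent RecBole versions provide a ``load_data_and_model`` helper in ``recbole.quick_start``; the sketch below assumes this helper is available in your installed version, and the `.pth` file name should be replaced by the one printed in your own log:

.. code:: python

    from recbole.quick_start import load_data_and_model

    # the file name comes from your own training log under saved/
    config, model, dataset, train_data, valid_data, test_data = load_data_and_model(
        model_file='saved/BPR-Aug-24-2021_01-46-43.pth',
    )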

The above is the whole process of running a model in RecBole, and you can read the other docs for in-depth usage.
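
``run_recbole`` wraps configuration, data preparation, model construction, training and evaluation in a single call. If you prefer to drive these stages yourself, for example to inspect the intermediate objects, the following is a minimal sketch of the same pipeline assuming the default ``Trainer``; it is an illustration rather than a drop-in replacement for ``run_recbole``, which also takes care of logging and random seeding:

.. code:: python

    from recbole.config import Config
    from recbole.data import create_dataset, data_preparation
    from recbole.model.general_recommender.bpr import BPR
    from recbole.trainer import Trainer

    config = Config(model='BPR', dataset='ml-100k', config_file_list=['test.yaml'])
    dataset = create_dataset(config)
    train_data, valid_data, test_data = data_preparation(config, dataset)

    # build the model on the device chosen in the config (CPU or GPU)
    model = BPR(config, train_data.dataset).to(config['device'])

    # fit with validation-based model selection, then evaluate the best checkpoint
    trainer = Trainer(config, model)
    best_valid_score, best_valid_result = trainer.fit(train_data, valid_data)
    test_result = trainer.evaluate(test_data)
    print(test_result)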

Quick-start From Source
---------------------------
Besides using the API, you can also directly run the source code of `RecBole <https://github.com/RUCAIBox/RecBole>`_.
The whole process is similar to Quick-start From API.
You can create a `yaml` file called `test.yaml` and set all the config as follows:

.. code:: yaml

    # dataset config
    USER_ID_FIELD: user_id
    ITEM_ID_FIELD: item_id
    load_col:
        inter: [user_id, item_id]

    # model config
    embedding_size: 64

    # Training and evaluation config
    epochs: 500
    train_batch_size: 4096
    eval_batch_size: 4096
    neg_sampling:
        uniform: 1
    eval_args:
        group_by: user
        order: RO
        split: {'RS': [0.8,0.1,0.1]}
        mode: full
    metrics: ['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']
    topk: 10
    valid_metric: MRR@10
    metric_decimal_place: 4

Then run the following command:

.. code:: bash

    python run_recbole.py --model=BPR --dataset=ml-100k --config_files=test.yaml

And you will get the output of running the BPR model on the ml-100k dataset.

If you want to change the parameters, such as ``learning_rate`` or ``embedding_size``,
just set the additional command parameters as you need:

.. code:: bash

    python run_recbole.py --model=BPR --dataset=ml-100k --config_files=test.yaml --learning_rate=0.0001 --embedding_size=128
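
Other top-level config values can be overridden from the command line in the same ``--key=value`` form; for example, a shorter illustrative run (these particular values are only an example, not tuned settings) could be:

.. code:: bash

    python run_recbole.py --model=BPR --dataset=ml-100k --config_files=test.yaml --epochs=100 --train_batch_size=2048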

In-depth Usage
-------------------