Commit bb333a5

Merge pull request #1351 from leoleojie/master
FEA: add sequential, context, knowledge quick start
2 parents 26a926b + d24335e commit bb333a5

6 files changed

+1095
-176
lines changed

docs/source/developer_guide/customize_trainers.rst

+106
@@ -19,6 +19,8 @@ and revise :meth:`~recbole.trainer.trainer.Trainer.evaluate` or :meth:`~recbole.
Example
----------------
1. Alternative Optimization
>>>>>>>>>>>>>>>>>>>>>>>>>>>
Here we present a simple Trainer example, which is used for alternative optimization.
We revise the :meth:`~recbole.trainer.trainer.Trainer._train_epoch` method.
To begin with, we need to create a new class for
@@ -102,3 +104,107 @@ Complete Code
                total_loss += loss.item()
            return total_loss

2. Mixed precision training
>>>>>>>>>>>>>>>>>>>>>>>>>>>
Here we present a simple Trainer example, which is used for mixed
precision training. Mixed precision training offers a significant
computational speedup by performing operations in half-precision
format, while keeping a minimal amount of information in single
precision to retain as much information as possible in critical parts
of the network. The example below is based on PyTorch's
``torch.cuda.amp``. To begin with, we need to create a new class
``NewTrainer`` based on ``Trainer``.

.. code:: python

    from recbole.trainer import Trainer
    import torch.cuda.amp as amp

    class NewTrainer(Trainer):
        def __init__(self, config, model):
            super(NewTrainer, self).__init__(config, model)

Then we revise ``_train_epoch()``.

.. code:: python

    def _train_epoch(self, train_data, epoch_idx):
        self.model.train()
        loss_func = self.model.calculate_loss
        total_loss = None
        # self.enable_scaler and self.enable_amp are assumed to be set on the trainer.
        scaler = amp.GradScaler(enabled=self.enable_scaler)
        for batch_idx, interaction in enumerate(train_data):
            interaction = interaction.to(self.device)
            self.optimizer.zero_grad()
            # Run the forward pass under autocast so eligible ops use half precision.
            with amp.autocast(enabled=self.enable_amp):
                loss = loss_func(interaction)
            total_loss = loss.item() if total_loss is None else total_loss + loss.item()
            # Scale the loss before backward, then step and update the scale factor.
            scaler.scale(loss).backward()
            scaler.step(self.optimizer)
            scaler.update()
        return total_loss

Complete Code
^^^^^^^^^^^^^^^^

.. code:: python

    from recbole.trainer import Trainer
    import torch.cuda.amp as amp

    class NewTrainer(Trainer):
        def __init__(self, config, model):
            super(NewTrainer, self).__init__(config, model)

        def _train_epoch(self, train_data, epoch_idx):
            self.model.train()
            loss_func = self.model.calculate_loss
            total_loss = None
            # self.enable_scaler and self.enable_amp are assumed to be set on the trainer.
            scaler = amp.GradScaler(enabled=self.enable_scaler)
            for batch_idx, interaction in enumerate(train_data):
                interaction = interaction.to(self.device)
                self.optimizer.zero_grad()
                # Run the forward pass under autocast so eligible ops use half precision.
                with amp.autocast(enabled=self.enable_amp):
                    loss = loss_func(interaction)
                total_loss = loss.item() if total_loss is None else total_loss + loss.item()
                # Scale the loss before backward, then step and update the scale factor.
                scaler.scale(loss).backward()
                scaler.step(self.optimizer)
                scaler.update()
            return total_loss

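The reason a gradient scaler accompanies half-precision training can be illustrated without PyTorch. The following is a minimal sketch using only the standard library: it rounds a number to IEEE 754 half precision with ``struct`` and shows that a very small gradient underflows unless it is scaled first. The gradient value and the scale factor are hypothetical values picked for illustration, and ``to_half`` is a helper introduced here, not part of RecBole or PyTorch.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to IEEE 754 half precision (binary16) and back."""
    return struct.unpack("e", struct.pack("e", x))[0]


# Hypothetical values chosen for illustration only.
grad = 1e-8             # a very small gradient value
loss_scale = 2.0 ** 16  # a scale factor applied before the backward pass

# Stored directly in half precision, the gradient underflows to zero.
unscaled = to_half(grad)

# Scaling first keeps the value representable; dividing by the same factor
# afterwards (in higher precision) recovers the original magnitude.
scaled = to_half(grad * loss_scale)
recovered = scaled / loss_scale

print("unscaled in half precision:", unscaled)
print("recovered after scaling:   ", recovered)
```

This is the idea behind ``scaler.scale(loss).backward()`` followed by ``scaler.step(...)``: the loss is multiplied by a factor before backpropagation, and the gradients are unscaled again before the optimizer uses them.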
3. Layer-specific learning rate
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Here we present a simple Trainer example, which is used for setting
layer-specific learning rates. In a pretrained model, layers closer to
the input layer are more likely to have learned general features,
while later layers of the model learn more detailed features. In this
case, we can set different learning rates for different layers by
modifying the optimizer.

.. code:: python

    import torch.optim as optim

    def _build_optimizer(self, learner, learning_rate, weight_decay):
        # Collect the ids of the pretrained parameters so they can be excluded from the base group.
        pretrained_ids = set(map(id, self.model.pretrained_part.parameters()))
        base_params = filter(lambda p: id(p) not in pretrained_ids, self.model.parameters())
        if learner.lower() == 'adam':
            # The pretrained part gets its own, smaller learning rate.
            optimizer = optim.Adam([
                {"params": base_params},
                {"params": self.model.pretrained_part.parameters(), "lr": 1e-5}],
                lr=learning_rate, weight_decay=weight_decay)
            return optimizer
        raise NotImplementedError("Only the 'adam' learner is handled in this example.")

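Complete Code
^^^^^^^^^^^^^^^^

.. code:: python

    from recbole.trainer import Trainer
    import torch.optim as optim

    class NewTrainer(Trainer):
        def __init__(self, config, model):
            super(NewTrainer, self).__init__(config, model)
            self.optimizer = self._build_optimizer(
                config['learner'], config['learning_rate'], config['weight_decay'])

        def _build_optimizer(self, learner, learning_rate, weight_decay):
            pretrained_ids = set(map(id, self.model.pretrained_part.parameters()))
            base_params = filter(lambda p: id(p) not in pretrained_ids, self.model.parameters())
            if learner.lower() == 'adam':
                return optim.Adam([
                    {"params": base_params},
                    {"params": self.model.pretrained_part.parameters(), "lr": 1e-5}],
                    lr=learning_rate, weight_decay=weight_decay)
            raise NotImplementedError("Only the 'adam' learner is handled in this example.")

        def _train_epoch(self, train_data, epoch_idx):
            self.model.train()
            total_loss = 0.
            for batch_idx, interaction in enumerate(train_data):
                interaction = interaction.to(self.device)
                self.optimizer.zero_grad()
                loss = self.model.calculate_loss(interaction)
                self._check_nan(loss)
                loss.backward()
                self.optimizer.step()
                total_loss += loss.item()
            return total_loss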
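
The grouping step in the layer-specific example can be sketched in plain Python, independent of PyTorch. The snippet below is an illustration under stated assumptions: ``Param``, ``build_param_groups`` and ``effective_lr`` are hypothetical names introduced here, and the dictionaries only mimic the per-group option format accepted by ``torch.optim`` optimizers, where a group without its own ``"lr"`` falls back to the optimizer-level default.

```python
class Param:
    """Hypothetical stand-in for a framework parameter object."""

    def __init__(self, name):
        self.name = name


def build_param_groups(all_params, pretrained_params, pretrained_lr):
    """Split parameters into a base group and a pretrained group by identity."""
    pretrained = list(pretrained_params)
    pretrained_ids = {id(p) for p in pretrained}
    # Compare by id() so that membership is decided by object identity.
    base = [p for p in all_params if id(p) not in pretrained_ids]
    return [
        {"params": base},                            # uses the default learning rate
        {"params": pretrained, "lr": pretrained_lr}, # overrides the learning rate
    ]


def effective_lr(group, default_lr):
    """Learning rate a group would use: its own value, else the default."""
    return group.get("lr", default_lr)


encoder = [Param("encoder.weight"), Param("encoder.bias")]  # "pretrained" part
head = [Param("head.weight")]                               # newly added part

groups = build_param_groups(encoder + head, encoder, pretrained_lr=1e-5)
lrs = [effective_lr(g, default_lr=1e-3) for g in groups]

for group, lr in zip(groups, lrs):
    print([p.name for p in group["params"]], lr)
```

Each parameter must appear in exactly one group, which is why the base group is built by excluding the pretrained parameters rather than passing all parameters again.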

docs/source/get_started/quick_start.rst

+7-176
@@ -1,184 +1,15 @@
11
Quick Start
22
===============
3-
Here is a quick-start example for using RecBole. We will show you how to train and test **BPR** model on the **ml-100k** dataset from both **API**
3+
Here is a quick-start example for using RecBole. RecBole supports general recommendation, sequential recommendation, context-aware recommendation and knowledge-based recommendation. We will select a representative model from each type of recommendation to show you how to train and test on the **ml-100k** dataset from both **API**
44
and **source code**.
55

6+
.. toctree::
7+
:maxdepth: 1
68

7-
Quick-start From API
8-
--------------------------
9-
10-
1. Prepare your data:
11-
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
12-
Before running a model, you first need to prepare and load data. To help users get started quickly,
13-
RecBole has a built-in dataset **ml-100k** that you can use directly. However, if you want to use other datasets, you can read
14-
:doc:`../user_guide/usage/running_new_dataset` for more information.
15-
16-
Then, you need to set data config for data loading. You can create a `yaml` file called `test.yaml` and write the following settings:
17-
18-
.. code:: yaml
19-
20-
# dataset config
21-
USER_ID_FIELD: user_id
22-
ITEM_ID_FIELD: item_id
23-
load_col:
24-
inter: [user_id, item_id]
25-
26-
For more details of data config, please refer to :doc:`../user_guide/config/data_settings`.
27-
28-
2. Choose a model:
29-
>>>>>>>>>>>>>>>>>>>>>>>>>
30-
In RecBole, we implement 73 recommendation models covering general recommendation, sequential recommendation,
31-
context-aware recommendation and knowledge-based recommendation. You can choose a model from our :doc:`../user_guide/model_intro`.
32-
Here we choose BPR model to train and test.
33-
34-
Then, you need to set the parameter for BPR model. You can check the :doc:`../user_guide/model/general/bpr` and add the model settings into the `test.yaml`, like:
35-
36-
.. code:: yaml
37-
38-
# model config
39-
embedding_size: 64
40-
41-
If you want to run different models, you can read :doc:`../user_guide/usage/running_different_models` for more information.
42-
43-
3. Set training and evaluation config:
44-
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
45-
In RecBole, we support multiple training and evaluation methods. You can choose how to train and test model by simply setting the config.
46-
47-
Here we want to train and test the BPR model in training-validation-test method (optimize model parameters on the training set, do parameter selection according to the results on the validation set,
48-
and finally report the results on the test set) and evaluate the model performance by full ranking with all item candidates,
49-
so we can add the following settings into the `test.yaml`.
50-
51-
.. code:: yaml
52-
53-
# Training and evaluation config
54-
epochs: 500
55-
train_batch_size: 4096
56-
eval_batch_size: 4096
57-
neg_sampling:
58-
uniform: 1
59-
eval_args:
60-
group_by: user
61-
order: RO
62-
split: {'RS': [0.8,0.1,0.1]}
63-
mode: full
64-
metrics: ['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']
65-
topk: 10
66-
valid_metric: MRR@10
67-
metric_decimal_place: 4
68-
69-
For more details of training and evaluation config, please refer to :doc:`../user_guide/config/training_settings` and :doc:`../user_guide/config/evaluation_settings`.
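
To make the ``metrics`` and ``topk`` settings above more concrete, here is a minimal sketch of how the five listed metrics are commonly defined for a single user's ranked list. It uses the standard textbook definitions with binary relevance; RecBole's own implementation may differ in details, and ``ranking_metrics`` as well as the example item IDs are hypothetical. The function assumes ``relevant_items`` is non-empty.

```python
import math


def ranking_metrics(ranked_items, relevant_items, k):
    """Compute top-k ranking metrics for one user (binary relevance)."""
    top_k = ranked_items[:k]
    hits = [1 if item in relevant_items else 0 for item in top_k]
    num_hits = sum(hits)

    # Rank (1-based) of the first relevant item, or None if there is none.
    first_hit_rank = next((rank for rank, h in enumerate(hits, start=1) if h), None)

    # DCG discounts each hit by log2(rank + 1); IDCG is the best achievable DCG.
    dcg = sum(h / math.log2(rank + 1) for rank, h in enumerate(hits, start=1))
    ideal_hits = min(k, len(relevant_items))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    return {
        "recall": num_hits / len(relevant_items),
        "precision": num_hits / k,
        "hit": 1.0 if num_hits > 0 else 0.0,
        "mrr": 1.0 / first_hit_rank if first_hit_rank else 0.0,
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
    }


# Hypothetical ranking produced by a model, and the user's held-out items.
ranked = [5, 3, 9, 1, 7]
relevant = {3, 7, 8}
result = ranking_metrics(ranked, relevant, k=5)
for name, value in result.items():
    print(f"{name}@5: {value:.4f}")
```

In full-ranking evaluation (``mode: full``), the ranked list is built from all candidate items rather than from a sampled subset, and the per-user values are then averaged over users.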
70-
71-
4. Run the model and collect the result
72-
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
73-
Now that you have finished all the preparations, it's time to run the model!
74-
75-
You can create a new python file (e.g., `run.py`), and write the following code:
76-
77-
.. code:: python
78-
79-
from recbole.quick_start import run_recbole
80-
81-
run_recbole(model='BPR', dataset='ml-100k', config_file_list=['test.yaml'])
82-
83-
84-
Then run the following command:
85-
86-
.. code:: bash
87-
88-
python run.py
89-
90-
You will obtain output like:
91-
92-
.. code:: none
93-
94-
24 Aug 01:46 INFO ml-100k
95-
The number of users: 944
96-
Average actions of users: 106.04453870625663
97-
The number of items: 1683
98-
Average actions of items: 59.45303210463734
99-
The number of inters: 100000
100-
The sparsity of the dataset: 93.70575143257098%
101-
Remain Fields: ['user_id', 'item_id']
102-
24 Aug 01:46 INFO [Training]: train_batch_size = [4096] negative sampling: [{'uniform': 1}]
103-
24 Aug 01:46 INFO [Evaluation]: eval_batch_size = [4096] eval_args: [{'split': {'RS': [0.8, 0.1, 0.1]}, 'group_by': 'user', 'order': 'RO', 'mode': 'full'}]
104-
24 Aug 01:46 INFO BPR(
105-
(user_embedding): Embedding(944, 64)
106-
(item_embedding): Embedding(1683, 64)
107-
(loss): BPRLoss()
108-
)
109-
Trainable parameters: 168128
110-
Train 0: 100%|████████████████████████| 40/40 [00:00<00:00, 200.47it/s, GPU RAM: 0.01 G/11.91 G]
111-
24 Aug 01:46 INFO epoch 0 training [time: 0.21s, train loss: 27.7228]
112-
Evaluate : 100%|██████████████████████| 472/472 [00:00<00:00, 518.65it/s, GPU RAM: 0.01 G/11.91 G]
113-
24 Aug 01:46 INFO epoch 0 evaluating [time: 0.92s, valid_score: 0.020500]
114-
......
115-
Train 96: 100%|████████████████████████| 40/40 [00:00<00:00, 229.26it/s, GPU RAM: 0.01 G/11.91 G]
116-
24 Aug 01:47 INFO epoch 96 training [time: 0.18s, train loss: 3.7170]
117-
Evaluate : 100%|██████████████████████| 472/472 [00:00<00:00, 857.00it/s, GPU RAM: 0.01 G/11.91 G]
118-
24 Aug 01:47 INFO epoch 96 evaluating [time: 0.56s, valid_score: 0.375200]
119-
24 Aug 01:47 INFO valid result:
120-
recall@10 : 0.2162 mrr@10 : 0.3752 ndcg@10 : 0.2284 hit@10 : 0.7508 precision@10 : 0.1602
121-
24 Aug 01:47 INFO Finished training, best eval result in epoch 85
122-
24 Aug 01:47 INFO Loading model structure and parameters from saved/BPR-Aug-24-2021_01-46-43.pth
123-
Evaluate : 100%|██████████████████████| 472/472 [00:00<00:00, 866.53it/s, GPU RAM: 0.01 G/11.91 G]
124-
24 Aug 01:47 INFO best valid : {'recall@10': 0.2195, 'mrr@10': 0.3871, 'ndcg@10': 0.2344, 'hit@10': 0.7582, 'precision@10': 0.1627}
125-
24 Aug 01:47 INFO test result: {'recall@10': 0.2523, 'mrr@10': 0.4855, 'ndcg@10': 0.292, 'hit@10': 0.7953, 'precision@10': 0.1962}
126-
127-
Finally, you will get the model's performance on the test set, and the model file will be saved under `/saved`. Besides,
128-
RecBole allows tracking and visualizing train loss and valid score with TensorBoard, please read the :doc:`../user_guide/usage/use_tensorboard` for more details.
129-
130-
The above is the whole process of running a model in RecBole, and you can read other docs for in-depth usage.
131-
132-
133-
Quick-start From Source
134-
--------------------------
135-
Besides using API, you can also directly run the source code of `RecBole <https://github.com/RUCAIBox/RecBole>`_.
136-
The whole process is similar to Quick-start From API.
137-
You can create a `yaml` file called `test.yaml` and set all the config as follows:
138-
139-
.. code:: yaml
140-
141-
# dataset config
142-
USER_ID_FIELD: user_id
143-
ITEM_ID_FIELD: item_id
144-
load_col:
145-
inter: [user_id, item_id]
146-
147-
# model config
148-
embedding_size: 64
149-
150-
# Training and evaluation config
151-
epochs: 500
152-
train_batch_size: 4096
153-
eval_batch_size: 4096
154-
neg_sampling:
155-
uniform: 1
156-
eval_args:
157-
group_by: user
158-
order: RO
159-
split: {'RS': [0.8,0.1,0.1]}
160-
mode: full
161-
metrics: ['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']
162-
topk: 10
163-
valid_metric: MRR@10
164-
metric_decimal_place: 4
165-
166-
Then run the following command:
167-
168-
.. code:: bash
169-
170-
python run_recbole.py --model=BPR --dataset=ml-100k --config_files=test.yaml
171-
172-
And you will get the output of running the BPR model on the ml-100k dataset.
173-
174-
If you want to change the parameters, such as ``embedding_size``,
175-
just set the additional command parameters as you need:
176-
177-
.. code:: bash
178-
179-
python run_recbole.py --model=BPR --dataset=ml-100k --config_files=test.yaml --embedding_size=128
180-
181-
9+
started/general
10+
started/sequential
11+
started/context-aware
12+
started/knowledge-based
18213

18314
In-depth Usage
18415
-------------------
