Contribution

Contributing to the codebase

We welcome:

Bug reports
Pull requests for bug fixes
Logs and documentation improvements
New algorithms and datasets
Better hyperparameters (with supporting evidence)

Setup

Code contributions follow the standard GitHub workflow:

Fork this repo
Make a change and commit your code
Submit a pull request. Maintainers will review it and give feedback or request changes as needed

git clone git@github.com:tinkoff-ai/CORL.git
cd CORL
pip install -r requirements/requirements_dev.txt

For dependency installation, see the Get Started section.

Code style

The CI will run several checks on the new code pushed to the CORL repository. These checks can also be run locally without waiting for the CI by following the steps below:

install pre-commit,
install the Git hooks by running pre-commit install.

Once those two steps are done, the Git hooks will be run automatically at every new commit. The Git hooks can also be run manually with pre-commit run --all-files, and if needed they can be skipped (not recommended) with git commit --no-verify.

We use Ruff as our main linter. To preview possible problems before pre-commit runs, use ruff check --diff . to see the linter's suggestions and the fixes it would apply.

Adding new algorithms

Warning

While we welcome any algorithm, it is better to open an issue with the proposal first
so we can discuss the details. Unfortunately, not all algorithms are equally
easy to understand and reproduce. We may be able to give you some advice,
or warn you that a particular algorithm requires too many
computational resources to fully reproduce its results, in which case it is better to pick something else.

All new algorithms should go in algorithms/contrib/offline if they are purely offline, or in algorithms/contrib/finetune if they are offline-to-online.

We as a team try to keep the core as reliable and reproducible as possible, but we may not have the resources to support all future algorithms. Therefore, this separation is necessary, as we cannot guarantee that all algorithms from algorithms/contrib exactly reproduce the results of their original publications.

Make sure your new code is properly documented and that all references to the original implementations and papers are present (for example, as in Decision Transformer). Follow the existing naming conventions for config arguments, functions, and classes. Try to match the style of the existing implementations.

Please explain all tricks and any differences from the original implementation in as much detail as possible. Keep in mind that other researchers may use this code. Make their lives easier!

Running benchmarks

Although you will have to run a hyperparameter search while reproducing the algorithm, in the end we expect to see final configs in configs/contrib/<algo_type>/<algo_name>/<dataset_name>.yaml with the best hyperparameters for every dataset considered. The configs should be in YAML format and list all hyperparameters in alphabetical order (see the existing configs for inspiration).
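The alphabetical ordering can be checked mechanically before opening a PR. The following is a minimal sketch, not part of the CORL tooling: the helper names and the sample config contents are hypothetical, and it assumes a flat config whose top-level keys are all lowercase (plain sorted() is case-sensitive).

```python
from typing import List


def top_level_keys(yaml_text: str) -> List[str]:
    """Collect top-level keys from a flat YAML config, in file order."""
    keys = []
    for line in yaml_text.splitlines():
        # Skip blank lines, comments, and indented or list-item lines.
        if not line.strip() or line.lstrip().startswith("#") or line[0] in " \t-":
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.append(key.strip())
    return keys


def is_alphabetical(yaml_text: str) -> bool:
    """Return True if the top-level keys appear in alphabetical order."""
    keys = top_level_keys(yaml_text)
    return keys == sorted(keys)


# Hypothetical config contents, for illustration only.
sorted_config = "batch_size: 256\ndiscount: 0.99\nenv: <dataset_name>\nname: <algo_name>\n"
unsorted_config = "name: <algo_name>\nbatch_size: 256\n"
print(is_alphabetical(sorted_config), is_alphabetical(unsorted_config))
```

To check a real file, pass the result of reading it as text (for example with pathlib.Path(...).read_text()) to is_alphabetical.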

Use these conventions to name your runs in the configs:

1. name: <algo_name>
2. group: <algo_name>-<dataset_name>-multiseed-v0 (increment the version if needed)
3. use our __post_init__ implementation in your config dataclass
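As a rough illustration of these naming conventions, here is a sketch of a config dataclass. The field names and the body of __post_init__ are assumptions made for this example (it assumes the hook appends the dataset name and a short random suffix so that every run name is unique); copy the actual __post_init__ from an existing implementation in the repository rather than from here.

```python
import uuid
from dataclasses import dataclass


@dataclass
class TrainConfig:
    # Hypothetical fields; mirror the fields used by the existing configs.
    project: str = "CORL"
    group: str = "<algo_name>-<dataset_name>-multiseed-v0"
    name: str = "<algo_name>"
    env: str = "<dataset_name>"
    train_seed: int = 0

    def __post_init__(self):
        # Assumed behavior: extend the run name with the dataset name and
        # a short random suffix. The group is left untouched so that all
        # seeds of one experiment stay grouped together.
        self.name = f"{self.name}-{self.env}-{str(uuid.uuid4())[:8]}"


# "my_algo" and "my_dataset" are placeholders for illustration only.
config = TrainConfig(
    name="my_algo",
    group="my_algo-my_dataset-multiseed-v0",
    env="my_dataset",
)
print(config.group)
print(config.name)
```

The printed name differs on every run because of the random suffix, while the group stays fixed.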

Since we release wandb logs for all algorithms, you will need to submit multiseed (~4 seeds) training runs to the CORL project in the wandb corl-team organization. We'll invite you there when the time comes.
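Multiseed runs make it possible to report a mean and a spread instead of a single number. A minimal sketch of such a summary is shown below. It assumes you have one final score per seed; the summarize helper is hypothetical and the numbers are placeholders, not real results.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def summarize(scores_by_seed: Dict[int, float]) -> Tuple[float, float]:
    """Return the mean and sample standard deviation over seeds."""
    values: List[float] = list(scores_by_seed.values())
    return mean(values), stdev(values)


# Placeholder numbers for illustration only, not real results.
scores = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
avg, std = summarize(scores)
print(f"{avg:.2f} ± {std:.2f} over {len(scores)} seeds")
```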

We usually use wandb sweeps for this. You can start from this example config (it works with pyrallis, which expects a config_path CLI argument):

sweep_config.yaml

entity: corl-team
project: CORL
program: algorithms/contrib/<algo_type>/<algo_name>.py
method: grid
parameters:
  config_path:
    # algo_type is offline or finetune (see sections above)
    values: [
        "configs/contrib/<algo_type>/<algo_name>/<dataset_name_1>.yaml",
        "configs/contrib/<algo_type>/<algo_name>/<dataset_name_2>.yaml",
        "configs/contrib/<algo_type>/<algo_name>/<dataset_name_3>.yaml",
    ]
  train_seed:
    values: [0, 1, 2, 3]

Then proceed as usual: create the sweep with wandb sweep sweep_config.yaml, then start agents with wandb agent <sweep_id>, using the sweep ID printed by the first command.
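With method: grid, the sweep runs every combination of the listed parameter values, so three configs and four seeds yield twelve runs. The sketch below only illustrates that expansion in plain Python. It does not call wandb, the expand_grid helper is hypothetical, and the dictionary mirrors the parameters section of the example config with its placeholders left unfilled.

```python
from itertools import product
from typing import Any, Dict, List


def expand_grid(parameters: Dict[str, Dict[str, List[Any]]]) -> List[Dict[str, Any]]:
    """Enumerate every combination of the parameter values of a grid sweep."""
    names = list(parameters)
    value_lists = [parameters[name]["values"] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Mirrors the "parameters" section of the example sweep config.
parameters = {
    "config_path": {
        "values": [
            "configs/contrib/<algo_type>/<algo_name>/<dataset_name_1>.yaml",
            "configs/contrib/<algo_type>/<algo_name>/<dataset_name_2>.yaml",
            "configs/contrib/<algo_type>/<algo_name>/<dataset_name_3>.yaml",
        ]
    },
    "train_seed": {"values": [0, 1, 2, 3]},
}

runs = expand_grid(parameters)
print(len(runs))  # 3 configs x 4 seeds = 12 runs
```

Counting the runs this way before launching helps estimate how much compute a sweep will need.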

Once the runs finish, create wandb reports that summarize the results so other users can understand them easily. You can use any of the existing reports as an example (see README.md).

Checklist

Ideally, all checks should be completed!

Issue about new algorithm is open
Single-file implementation is added to the algorithms/contrib
PR has passed all the tests
Evidence that implementation reproduces original results is provided
Configs with the best hyperparameters for all datasets are added to the configs/contrib
Logs and reports for best hyperparameters are submitted to our wandb organization