This is the supplementary code repo for the paper "Learning Syntax from Naturally-Occurring Bracketings" (NAACL 2021).
The code has been tested with the following dependencies and versions:
python==3.6.7
torch==1.7.1
transformers==4.4.0
numpy==1.19.4
fire==0.1.3
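To check a local environment against the versions listed above, something like the following sketch may help. It is not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and it assumes each package exposes a `__version__` attribute (a package that does not is reported as not found).

```python
import importlib

# Pinned versions copied from the dependency list above.
PINNED = {
    "torch": "1.7.1",
    "transformers": "4.4.0",
    "numpy": "1.19.4",
    "fire": "0.1.3",
}


def installed_version(name):
    """Return the module's __version__ string, or None if it is unavailable."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(pinned, lookup=installed_version):
    """Map each package that differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        # Some builds append a local suffix such as "+cu110"; ignore it.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches(PINNED)` returns an empty dict when every package matches its pinned version; otherwise each entry shows the expected and the found version. Other versions may still work, since the list above only records what the code was tested with.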
Our pre-processed data are included in the data directory. You can use the command tar xzvf data.tar.gz to decompress.
Simply run ./train.sh. You can change the training data source and the loss function through the DATASOURCE and CHART_MODE variables.
Run python evaluate.py. Change the model path and test file path as needed.
Our code is based on this project and is licensed under the MIT license.
The file attention.py is based on an implementation in AllenNLP (Apache-2.0).
You can cite our paper if our project is useful to your research:
Tianze Shi, Ozan İrsoy, Igor Malioutov, and Lillian Lee. 2021. Learning Syntax from Naturally-Occurring Bracketings. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics.