A Universal Dependencies parser built on top of a Transformer language model.
Scores on pre-tokenized test data:

| Metric    | Precision | Recall | F1 Score | AlignedAcc |
|-----------|-----------|--------|----------|------------|
| Tokens    | 99.70     | 99.77  | 99.73    |            |
| Sentences | 100.00    | 100.00 | 100.00   |            |
| Words     | 99.62     | 99.61  | 99.61    |            |
| UPOS      | 96.99     | 96.97  | 96.98    | 97.36      |
| XPOS      | 93.65     | 93.64  | 93.65    | 94.01      |
| UFeats    | 91.31     | 91.29  | 91.30    | 91.65      |
| AllTags   | 86.86     | 86.85  | 86.86    | 87.19      |
| Lemmas    | 95.83     | 95.81  | 95.82    | 96.19      |
| UAS       | 89.01     | 89.00  | 89.00    | 89.35      |
| LAS       | 85.72     | 85.70  | 85.71    | 86.04      |
| CLAS      | 81.39     | 80.91  | 81.15    | 81.34      |
| MLAS      | 69.21     | 68.81  | 69.01    | 69.17      |
| BLEX      | 77.44     | 76.99  | 77.22    | 77.40      |
Scores on untokenized test data:

| Metric    | Precision | Recall | F1 Score | AlignedAcc |
|-----------|-----------|--------|----------|------------|
| Tokens    | 99.50     | 99.66  | 99.58    |            |
| Sentences | 100.00    | 100.00 | 100.00   |            |
| Words     | 99.42     | 99.50  | 99.46    |            |
| UPOS      | 96.80     | 96.88  | 96.84    | 97.37      |
| XPOS      | 93.48     | 93.56  | 93.52    | 94.03      |
| UFeats    | 91.13     | 91.20  | 91.16    | 91.66      |
| AllTags   | 86.71     | 86.78  | 86.75    | 87.22      |
| Lemmas    | 95.66     | 95.74  | 95.70    | 96.22      |
| UAS       | 88.76     | 88.83  | 88.80    | 89.28      |
| LAS       | 85.49     | 85.55  | 85.52    | 85.99      |
| CLAS      | 81.19     | 80.73  | 80.96    | 81.31      |
| MLAS      | 69.06     | 68.67  | 68.87    | 69.16      |
| BLEX      | 77.28     | 76.84  | 77.06    | 77.39      |
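Both tables are in the format produced by the official CoNLL 2018 shared task evaluation script (conll18_ud_eval.py). Assuming a gold and a system CoNLL-U file (the file names below are placeholders), a comparable table can be produced with:

```bash
# --verbose / -v prints the full per-metric table shown above,
# including the AlignedAcc column.
python conll18_ud_eval.py -v gold.conllu predicted.conllu
```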
To use the model, you need to set up COMBO, which makes it possible to use word embeddings from a pre-trained transformer model (electra-base-igc-is):
```bash
git submodule update --init --recursive
pip install -U pip setuptools wheel
pip install --index-url https://pypi.clarin-pl.eu/simple combo==1.0.5
```
- For Python 3.9, you might need to install cython first:

  ```bash
  pip install -U pip cython
  ```
- Then you can run the model as it is done in parse_file.py; a minimal sketch is shown below.

For more instructions, see the COMBO repository: https://gitlab.clarin-pl.eu/syntactic-tools/combo
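As a rough sketch of what parse_file.py does, the snippet below loads a trained model and parses one sentence using the COMBO 1.x prediction API. The model archive name, the example sentence, and the exact token attribute names are assumptions; adjust them to the files in this repository.

```python
from combo.predict import COMBO

# Placeholder path: point this at the trained model archive in this repository.
nlp = COMBO.from_pretrained("model.tar.gz")

# Parse one raw (untokenized) sentence; the text is just an example.
sentence = nlp("Hún las bókina í gær.")

# Each token carries predicted tags, lemma, head index and dependency relation.
for token in sentence.tokens:
    print(token.id, token.token, token.upostag, token.lemma, token.head, token.deprel)
```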
The Tokenizer directory is a clone of Miðeind's tokenizer.
The directory transformer_models/ contains the pretrained model electra-base-igc-is, trained by Jón Friðrik Daðason, which supplies the parser with contextual embeddings.
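Independently of the parser, that checkpoint should be loadable as a standard Hugging Face transformers model. A minimal sketch, assuming the directory follows the usual checkpoint layout (the path and example text are placeholders):

```python
from transformers import AutoModel, AutoTokenizer

# Assumed local path, based on the directory name given above.
path = "transformer_models/electra-base-igc-is"

tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModel.from_pretrained(path)

# Contextual embeddings for one sentence (example text only).
inputs = tokenizer("Hún las bókina í gær.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, tokens, hidden_size)
```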