ECER and EWER metrics

The `ie-eval ecer-ewer` command computes the reading-order-independent versions of the Entity Character Error Rate (ECER) and Entity Word Error Rate (EWER) metrics, globally.

Metric description

To compute the distance between two sequences of named entities, an edit distance is calculated in which the substitution cost depends first on whether the named entity was tagged correctly and then, if it was, on the quality of its transcription.

In the computation of the ECER, the quality of the transcription is assessed by computing the Character Error Rate (CER).
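In contrast, the EWER evaluates the quality of the transcription with the Word Error Rate (WER). More specifically, the substitution cost between the hypothesized named entity \(y_k\) and the Ground Truth entity \(x_j\) is computed as follows, where \(c(\cdot)\) denotes the category of an entity, \(t(\cdot)\) its transcription, and \(\lambda\) the empty entity (an insertion or a deletion):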

\[ \delta_{\mathrm{ECER}}(x_j, y_k) = \begin{cases} 1 \, & \text{if} \, x_j = \lambda \vee y_k = \lambda \\ 1 \, & \text{if} \, c(x_j) \neq c(y_k) \\ \mathrm{CER}(t(x_j), t(y_k)) \, & \text{otherwise} \end{cases} \]

\[ \delta_{\mathrm{EWER}}(x_j, y_k) = \begin{cases} 1 \, & \text{if} \, x_j = \lambda \vee y_k = \lambda \\ 1 \, & \text{if} \, c(x_j) \neq c(y_k) \\ \mathrm{WER}(t(x_j), t(y_k)) \, & \text{otherwise} \end{cases} \]
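The two substitution costs can be illustrated with a minimal Python sketch. It is not the code used by `ie-eval`: the entity representation (a `(category, transcription)` tuple, with `None` standing for \(\lambda\)) and the function names are hypothetical, and the error rate is assumed to be the Levenshtein distance divided by the length of the Ground Truth transcription, without any cap.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    # Assumption: normalise by the reference length (at least 1 to avoid
    # dividing by zero); the real tool may normalise differently.
    return edit_distance(ref, hyp) / max(len(ref), 1)


def substitution_cost(x, y, level="char"):
    """Cost between a Ground Truth entity x and a hypothesized entity y.

    Entities are hypothetical (category, transcription) tuples;
    None plays the role of the empty entity (lambda).
    """
    if x is None or y is None:
        return 1.0                      # insertion or deletion
    if x[0] != y[0]:
        return 1.0                      # wrong category
    if level == "char":
        return error_rate(x[1], y[1])   # ECER case: CER of the transcriptions
    return error_rate(x[1].split(), y[1].split())  # EWER case: WER


truth = ("name", "Jean Dupont")
print(substitution_cost(truth, ("name", "Jean Dupond"), "char"))  # 1 error over 11 characters
print(substitution_cost(truth, ("name", "Jean Dupond"), "word"))  # 1 error over 2 words
print(substitution_cost(truth, ("date", "Jean Dupont"), "char"))  # category mismatch
```

With a single wrong character, the character-level cost stays small while the word-level cost is 0.5, which is why the EWER is usually the higher of the two scores.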

This computation is extended to complete documents by computing the best assignment (non-sequential alignment) between the sequence of Ground Truth entities \(x\) and the sequence of hypothesized named entities \(y\). This is done using the Hungarian algorithm, which runs in \(O(n^3)\) time, and is formally expressed as:

\[ \mathrm{OIECER}(x,y) = \min_{A(x,y)} \sum_{(j,k) \in A(x,y)} \delta_{\mathrm{ECER}}(x_j, y_k) \]

\[ \mathrm{OIEWER}(x,y) = \min_{A(x,y)} \sum_{(j,k) \in A(x,y)} \delta_{\mathrm{EWER}}(x_j, y_k) \]
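The minimisation over assignments can be sketched as follows. To stay dependency-free, this illustration enumerates all permutations instead of running the Hungarian algorithm, so it returns the same minimum but only scales to a handful of entities; `scipy.optimize.linear_sum_assignment` solves the same problem in polynomial time. The function name and the padding strategy (unmatched entities are paired with \(\lambda\) at cost 1) are assumptions made for the example, not the tool's implementation.

```python
from itertools import permutations


def best_assignment_cost(costs, n_ref, n_hyp):
    """Minimum total cost over all one-to-one assignments.

    costs[j][k] is the substitution cost between Ground Truth entity j
    and hypothesized entity k (n_ref rows, n_hyp columns).
    """
    size = max(n_ref, n_hyp)
    # Pad to a square matrix: a padded row or column stands for lambda,
    # so pairing a real entity with it costs 1 (insertion or deletion).
    # Assumption: substitution costs do not exceed 2, so leaving two
    # entities unmatched is never cheaper than pairing them.
    padded = [
        [costs[j][k] if j < n_ref and k < n_hyp else 1.0 for k in range(size)]
        for j in range(size)
    ]
    # Brute force: try every permutation of the columns.
    return min(
        sum(padded[j][perm[j]] for j in range(size))
        for perm in permutations(range(size))
    )


# Two Ground Truth entities, three hypothesized entities.
costs = [
    [1.0, 0.25, 1.0],  # entity 0 matches hypothesis 1 with a small error
    [0.0, 1.0, 1.0],   # entity 1 matches hypothesis 0 exactly
]
print(best_assignment_cost(costs, n_ref=2, n_hyp=3))  # 0.25 + 0.0 + 1.0 for the extra hypothesis
```

Because the assignment ignores positions, the result is unchanged when the hypothesized entities are listed in a different order, which is what makes the metric reading-order independent.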

The score for a whole corpus \(C\) is obtained by summing the document-level scores over every document \(s \in C\) and normalizing by the total number of Ground Truth entities, where \(X_s\) and \(Y_s\) are the Ground Truth and hypothesized named entity sequences of document \(s\):

\[ \mathrm{OIECER}(C) = \frac{\sum_{s \in C}{\mathrm{OIECER}(X_s, Y_s)}}{\sum_{s \in C}{|X_s|}} \]

\[ \mathrm{OIEWER}(C) = \frac{\sum_{s \in C}{\mathrm{OIEWER}(X_s, Y_s)}}{\sum_{s \in C}{|X_s|}} \]
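As a small worked example of the corpus-level formula, the sketch below aggregates per-document scores. The function name and the input values are made up for illustration; only the formula itself comes from the definition above.

```python
def corpus_score(doc_costs, n_ref_entities):
    """Sum of document-level costs divided by the total number of
    Ground Truth entities (one value per document in each list)."""
    total_entities = sum(n_ref_entities)
    if total_entities == 0:
        raise ValueError("The corpus contains no Ground Truth entity.")
    return sum(doc_costs) / total_entities


# Hypothetical corpus of two documents:
# document 1 has 2 Ground Truth entities and a best-assignment cost of 0.5,
# document 2 has 3 Ground Truth entities and a best-assignment cost of 1.25.
score = corpus_score([0.5, 1.25], [2, 3])
print(f"{100 * score:.2f} %")  # (0.5 + 1.25) / (2 + 3) = 0.35, i.e. 35.00 %
```

Normalizing once over the whole corpus, rather than averaging per-document rates, means documents with more entities weigh more in the final score.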

Parameters

Here are the available parameters for these metrics:

Parameter	Description	Type	Default
`--label-dir`	Path to the directory containing BIO label files.	`pathlib.Path`
`--prediction-dir`	Path to the directory containing BIO prediction files.	`pathlib.Path`

The parameters are also described when running `ie-eval ecer-ewer --help`.

Examples

Global evaluation

Use the following command to compute the overall ECER and EWER metrics:

ie-eval ecer-ewer \
    --label-dir tests/data/labels/ \
    --prediction-dir tests/data/predictions/

It will output the results in Markdown format:

2024-01-24 12:20:26,973 INFO/bio_parser.utils: Loading labels...
2024-01-24 12:20:27,104 INFO/bio_parser.utils: Loading prediction...
2024-01-24 12:20:27,187 INFO/bio_parser.utils: The dataset is complete and valid.
| Category | ECER (%) | EWER (%) | N entities | N documents |
|:---------|:--------:|:--------:|-----------:|------------:|
| total    |  21.33   |  28.27   |         28 |           5 |