Alignment Data
What is alignment data?
Alignment data describes the relationship between two bodies of text in terms of words, groups of words, or parts of words. ClearAligner defines a JSON format for alignment data. Alignment JSON files can be imported and exported.
The JSON alignment format allows for many tokens on either the `source` or `target` side of a record. Tokens do not need to be contiguous and can cross BCV (book, chapter, verse) boundaries.
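As a rough illustration of what "non-contiguous" and "crossing BCV boundaries" mean for one side of a record, the sketch below inspects a list of token IDs. It assumes that token IDs are digit strings whose first eight digits encode book, chapter, and verse (BBCCCVVV) and whose remaining digits number the word. That layout is inferred from IDs such as 40001001001 and is not specified in this document. The helper names `bcv_prefix`, `crosses_bcv`, and `is_contiguous` are hypothetical and are not part of ClearAligner.

```python
from __future__ import annotations


def bcv_prefix(token_id: str) -> str:
    """Return the assumed book-chapter-verse prefix of a token ID."""
    # Assumption: the first 8 digits are BBCCCVVV; the rest identify the word.
    return token_id[:8]


def crosses_bcv(token_ids: list[str]) -> bool:
    """True if the tokens on one side come from more than one verse."""
    return len({bcv_prefix(t) for t in token_ids}) > 1


def is_contiguous(token_ids: list[str]) -> bool:
    """True if the IDs form an unbroken numeric run.

    Only meaningful within a single verse, because word numbering
    is assumed to restart at each verse.
    """
    numbers = sorted(int(t) for t in token_ids)
    return all(b - a == 1 for a, b in zip(numbers, numbers[1:]))
```

Both situations are allowed by the format, so checks like these are useful for reporting or filtering rather than for rejecting records.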
Example
{
  "type": "translation",
  "meta": {
    "creator": "ClearAligner"
  },
  "records": [
    {
      "meta": {
        "id": "d6820b2b-1a30-4247-a16f-18ed0c89b469",
        "origin": "manual",
        "status": "needsReview",
        "note": [
          {
            "note": "This is how a note is stored.",
            "id": "687280fd-386e-4263-bbab-de0373331086",
            "authorEmail": "name@domain.com"
          }
        ]
      },
      "source": [
        "40001001001"
      ],
      "target": [
        "40001001001",
        "40001001002",
        "40001001003",
        "40001001004"
      ]
    },
    {
      "meta": {
        "id": "6eb56565-e097-4108-9368-f9e2e715b5f0",
        "origin": "manual",
        "status": "created",
        "note": []
      },
      "source": [
        "40001001002"
      ],
      "target": [
        "40001001005",
        "40001001006",
        "40001001007"
      ]
    },
    {
      "meta": {
        "id": "ad64567f-b3c0-4e0e-9e8a-3a96ad1c621f",
        "origin": "manual",
        "status": "approved",
        "note": []
      },
      "source": [
        "40001001003"
      ],
      "target": [
        "40001001008",
        "40001001009"
      ]
    }
  ]
}
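For readers who want to work with an exported file outside ClearAligner, here is a minimal sketch of reading the format with Python's standard `json` module and building a lookup from each source token to its aligned target tokens. The `SAMPLE` string is an abbreviated copy of the example above (notes omitted), and the function name `targets_by_source` is hypothetical. This is one possible way to consume the data, not ClearAligner's own import logic.

```python
import json
from collections import defaultdict

# Abbreviated copy of the example above, with notes left empty.
SAMPLE = """
{
  "type": "translation",
  "meta": {"creator": "ClearAligner"},
  "records": [
    {
      "meta": {"id": "d6820b2b-1a30-4247-a16f-18ed0c89b469",
               "origin": "manual", "status": "needsReview", "note": []},
      "source": ["40001001001"],
      "target": ["40001001001", "40001001002", "40001001003", "40001001004"]
    },
    {
      "meta": {"id": "6eb56565-e097-4108-9368-f9e2e715b5f0",
               "origin": "manual", "status": "created", "note": []},
      "source": ["40001001002"],
      "target": ["40001001005", "40001001006", "40001001007"]
    }
  ]
}
"""


def targets_by_source(alignment: dict) -> dict:
    """Map each source token ID to the target token IDs it is aligned with."""
    index = defaultdict(list)
    for record in alignment["records"]:
        for source_id in record["source"]:
            index[source_id].extend(record["target"])
    return dict(index)


alignment = json.loads(SAMPLE)
index = targets_by_source(alignment)
print(index["40001001002"])  # ['40001001005', '40001001006', '40001001007']
```

Note that `json.loads` follows strict JSON, so a trailing comma after the last record would raise a `json.JSONDecodeError`.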
Details
- The `id` field on alignment records is a GUID used for internal change tracking.
- Values in `source` arrays match the IDs in the canonical source text TSVs used by ClearAligner.
- Values in `target` arrays match the IDs provided in target text files.
- The `origin` field describes the type of process the alignment record originated from. `manual` is used for human-created records. A variety of other strings can describe automated processes. Values other than `manual` are displayed with a ✨ icon.
- The `status` field describes the status of an alignment record. Supported statuses are `created`, `approved`, `rejected`, and `needsReview`.
- While many entries can be stored in the `note` array, ClearAligner currently supports only a single note per alignment record.
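The field rules above can be turned into a small pre-import check. The sketch below is a hedged illustration only: `check_record` and `is_automated` are hypothetical helpers, not ClearAligner functions, and they encode just the constraints stated in this section (the four supported statuses, the single-note limit, and `manual` versus any other origin).

```python
SUPPORTED_STATUSES = {"created", "approved", "rejected", "needsReview"}


def check_record(record: dict) -> list:
    """Return a list of human-readable problems found in one alignment record."""
    problems = []
    meta = record.get("meta", {})
    status = meta.get("status")
    if status not in SUPPORTED_STATUSES:
        problems.append(f"unsupported status: {status!r}")
    if len(meta.get("note", [])) > 1:
        problems.append("more than one note; only a single note is supported")
    return problems


def is_automated(record: dict) -> bool:
    """True when an origin is present and is anything other than "manual"."""
    origin = record.get("meta", {}).get("origin")
    return origin is not None and origin != "manual"


good = {
    "meta": {"id": "x", "origin": "manual", "status": "approved", "note": []},
    "source": ["40001001003"],
    "target": ["40001001008"],
}
# "some-tool" and "done" are made-up values used only to trigger the checks.
bad = {
    "meta": {"id": "y", "origin": "some-tool", "status": "done",
             "note": [{"note": "a"}, {"note": "b"}]},
    "source": ["40001001003"],
    "target": ["40001001008"],
}

print(check_record(good))  # []
print(len(check_record(bad)))  # 2
```

Records for which `is_automated` returns true correspond to those described above as being displayed with a ✨ icon.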