Alignment Data
What is alignment data?
Alignment data describes the relationship between two bodies of text in terms of words, groups of words, or parts of words. ClearAligner defines a JSON format for alignment data. Alignment JSON files can be imported and exported.
The JSON alignment format allows many tokens on either the source or the target side of a record. Tokens do not need to be contiguous and can cross BCV (book, chapter, verse) boundaries.
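To make the point about BCV boundaries concrete, here is a minimal Python sketch that checks whether the tokens on one side of a record come from more than one verse. It assumes that the 11-digit IDs seen in ClearAligner files are laid out as a 2-digit book, 3-digit chapter, 3-digit verse, and 3-digit word; that layout is inferred from the sample IDs and is not stated by this page. The names TokenRef, parse_token_id, and crosses_bcv_boundary are hypothetical helpers, not part of ClearAligner.

```python
from typing import NamedTuple


class TokenRef(NamedTuple):
    book: int
    chapter: int
    verse: int
    word: int


def parse_token_id(token_id: str) -> TokenRef:
    """Split an 11-digit ID into its parts.

    ASSUMPTION: the layout is BBCCCVVVWWW (book, chapter, verse, word).
    """
    if len(token_id) != 11 or not token_id.isdigit():
        raise ValueError(f"unexpected token ID: {token_id!r}")
    return TokenRef(
        book=int(token_id[0:2]),
        chapter=int(token_id[2:5]),
        verse=int(token_id[5:8]),
        word=int(token_id[8:11]),
    )


def crosses_bcv_boundary(token_ids: list[str]) -> bool:
    """Return True when the tokens belong to more than one verse."""
    verses = {parse_token_id(t)[:3] for t in token_ids}
    return len(verses) > 1


# Illustrative record: the target tokens are non-contiguous (word 3 is
# skipped) and the last one belongs to the following verse.
record = {
    "source": ["40001001001"],
    "target": ["40001001002", "40001001004", "40001002001"],
}
print(crosses_bcv_boundary(record["source"]))  # False
print(crosses_bcv_boundary(record["target"]))  # True
```

A record like this one is still a single alignment; the check only reports that its target side spans two verses.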
Example
{
  "type": "translation",
  "meta": {
    "creator": "ClearAligner"
  },
  "records": [
    {
      "meta": {
        "id": "d6820b2b-1a30-4247-a16f-18ed0c89b469",
        "origin": "manual",
        "status": "needsReview",
        "note": [
          {
            "note": "This is how a note is stored.",
            "id": "687280fd-386e-4263-bbab-de0373331086",
            "authorEmail": "name@domain.com"
          }
        ]
      },
      "source": [
        "40001001001"
      ],
      "target": [
        "40001001001",
        "40001001002",
        "40001001003",
        "40001001004"
      ]
    },
    {
      "meta": {
        "id": "6eb56565-e097-4108-9368-f9e2e715b5f0",
        "origin": "manual",
        "status": "created",
        "note": []
      },
      "source": [
        "40001001002"
      ],
      "target": [
        "40001001005",
        "40001001006",
        "40001001007"
      ]
    },
    {
      "meta": {
        "id": "ad64567f-b3c0-4e0e-9e8a-3a96ad1c621f",
        "origin": "manual",
        "status": "approved",
        "note": []
      },
      "source": [
        "40001001003"
      ],
      "target": [
        "40001001008",
        "40001001009"
      ]
    }
  ]
}
Details
The "id" field on alignment records is a GUID used for internal change tracking.
Values in "source" arrays match the IDs in the canonical source text TSVs used by ClearAligner.
Values in "target" arrays match the IDs provided in target text files.
The "origin" field describes the type of process the alignment record originated from. "manual" is used for human-created records. A variety of other strings can describe automated processes. Values other than "manual" are displayed with a ✨ icon.
The "status" field describes the status of an alignment record. Supported statuses are "created", "approved", "rejected", and "needsReview".
While many notes can be stored in the "note" array, ClearAligner currently supports only a single note per alignment record.
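The field rules above can be sketched as a small pre-import check in Python. This is an illustrative helper written from the descriptions on this page, not ClearAligner's own validation; the function names check_record and is_automated are hypothetical, and the wording of the reported problems is made up for the example.

```python
import json
import uuid

# Statuses listed as supported on this page.
SUPPORTED_STATUSES = {"created", "approved", "rejected", "needsReview"}


def check_record(record: dict) -> list[str]:
    """Return a list of human-readable problems found in one record."""
    problems = []
    meta = record.get("meta", {})

    # The id is described as a GUID, so it should parse as a UUID.
    try:
        uuid.UUID(str(meta.get("id")))
    except ValueError:
        problems.append(f"id is not a GUID: {meta.get('id')!r}")

    if meta.get("status") not in SUPPORTED_STATUSES:
        problems.append(f"unsupported status: {meta.get('status')!r}")

    # Several notes can be stored, but only one is currently supported.
    if len(meta.get("note") or []) > 1:
        problems.append("more than one note on a single record")

    return problems


def is_automated(record: dict) -> bool:
    """True when origin is anything other than "manual"."""
    return record.get("meta", {}).get("origin") != "manual"


sample = json.loads("""
{
  "meta": {
    "id": "ad64567f-b3c0-4e0e-9e8a-3a96ad1c621f",
    "origin": "manual",
    "status": "approved",
    "note": []
  },
  "source": ["40001001003"],
  "target": ["40001001008", "40001001009"]
}
""")
print(check_record(sample))  # []
print(is_automated(sample))  # False
```

Running a check like this over every entry in "records" before import gives an early warning about records whose status or notes ClearAligner may not handle as expected.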