
# Target Text

ClearAligner uses a simple `TSV` file format to describe tokenized texts. Starting a project is as easy as importing a TSV file representing the text. The required columns are `id`, `text`, and `source_verse`. UTF-8 encoding is expected.
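As a rough illustration of the requirements above, the following sketch reads a tab-separated file with Python's standard `csv` module and checks that the three required columns are present. The function name `read_target_tsv` and the inline sample data are hypothetical; this is not ClearAligner's own import code.

```python
import csv
import io

# Columns this page lists as required.
REQUIRED_COLUMNS = {"id", "text", "source_verse"}


def read_target_tsv(stream):
    """Parse a tab-separated token file and return a list of row dicts.

    Hypothetical helper for illustration only.
    """
    # QUOTE_NONE keeps quotation-mark tokens from being treated as CSV quoting.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    return list(reader)


sample = "id\tsource_verse\ttext\n01001001001\t01001001\tHapo\n"
rows = read_target_tsv(io.StringIO(sample))
print(rows[0]["text"])  # prints "Hapo"
```

When reading from disk, opening the file with `open(path, encoding="utf-8", newline="")` matches the expected encoding.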

## How do I generate a TSV for my project?

We have created [a toolkit called Kathairo for generating TSVs from USFM](https://github.com/Clear-Bible/kathairo.py). We also have a [public repository that contains checked TSVs for openly-licensed texts](https://github.com/Clear-Bible/Open-Bible-TSVs). If you find a problem with an existing TSV or need help generating a new one, [create an issue here](https://github.com/Clear-Bible/Open-Bible-TSVs/issues).

## What does ClearAligner's TSV format look like?

## Example

<table><thead><tr><th>id</th><th width="132">source_verse</th><th width="121">text</th><th width="194">skip_space_after</th><th>exclude</th></tr></thead><tbody><tr><td>01001001001</td><td>01001001</td><td>Hapo</td><td></td><td></td></tr><tr><td>01001001002</td><td>01001001</td><td>mwanzo</td><td></td><td></td></tr><tr><td>01001001003</td><td>01001001</td><td>Mungu</td><td></td><td></td></tr><tr><td>01001001004</td><td>01001001</td><td>aliumba</td><td></td><td></td></tr><tr><td>01001001005</td><td>01001001</td><td>mbingu</td><td></td><td></td></tr><tr><td>01001001006</td><td>01001001</td><td>na</td><td></td><td></td></tr><tr><td>01001001007</td><td>01001001</td><td>dunia</td><td>y</td><td></td></tr><tr><td>01001001008</td><td>01001001</td><td>.</td><td></td><td>y</td></tr></tbody></table>

## Details

### Column: `id`

The `id` column should contain BCVW values correlating to the target text's native versification:

* **Book**: 2 characters
* **Chapter**: 3 characters
* **Verse**: 3 characters
* **Word**: 3 characters
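The fixed widths above add up to 11 characters, so an `id` can be split by position. The sketch below assumes every character is an ASCII digit, as in the example table; the `TokenId` type and `parse_bcvw` function are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class TokenId(NamedTuple):
    book: int
    chapter: int
    verse: int
    word: int


def parse_bcvw(token_id: str) -> TokenId:
    """Split an 11-character BCVW id into its four numeric parts.

    Assumes digits only (an assumption based on the example table).
    """
    if len(token_id) != 11 or not (token_id.isascii() and token_id.isdigit()):
        raise ValueError(f"expected 11 digits, got {token_id!r}")
    return TokenId(
        book=int(token_id[0:2]),      # 2 characters
        chapter=int(token_id[2:5]),   # 3 characters
        verse=int(token_id[5:8]),     # 3 characters
        word=int(token_id[8:11]),     # 3 characters
    )


print(parse_bcvw("01001001008"))
# TokenId(book=1, chapter=1, verse=1, word=8)
```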

### Column: `source_verse`

The `source_verse` column should contain BCV correlating to the source text versification (often referred to as `org`).

#### What is this column for?

Bible editions may use different approaches to identify the same verse content. For example, most English Bibles identify Malachi’s statement, “Behold, I will send you Elijah the prophet before the great and awesome day of the LORD comes.” as Malachi chapter 4, verse 5. However, Hebrew Bibles designate this content as Mal 3:23. These schemes are called the *versification* of the Bible, and there are several common systems and ways to map between them.

ClearAligner allows users to navigate scripture via a versification scheme that is native to the target text. The `source_verse` column maps target tokens into the source (aka `org`) versification scheme.
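To make the relationship between the two columns concrete, the sketch below compares the book, chapter, and verse portion of `id` (its first 8 characters) with `source_verse`. The rows are hypothetical: the Malachi row assumes book number `39` for Malachi and is invented for illustration, not taken from a real TSV. The helper name `is_remapped` is likewise an assumption.

```python
def is_remapped(row):
    """Return True when the native verse reference differs from source_verse.

    The first 8 characters of `id` are the native book, chapter, and verse.
    """
    return row["id"][:8] != row["source_verse"]


# Hypothetical rows for illustration only.
rows = [
    # Same reference in both versification schemes.
    {"id": "01001001001", "source_verse": "01001001", "text": "Hapo"},
    # English-style Mal 4:5 mapped to Mal 3:23 in the source scheme
    # (book number 39 for Malachi is an assumption).
    {"id": "39004005001", "source_verse": "39003023", "text": "Behold"},
]

for row in rows:
    print(row["id"], is_remapped(row))
# 01001001001 False
# 39004005001 True
```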

### Column: `text`

The `text` column should contain UTF-8 tokens representing the text to be aligned.

### Column: `skip_space_after`

The `skip_space_after` column is used to correctly render spaces between tokens when displaying the text. If empty, ClearAligner assumes a falsy value. If the value `y` is present, the result is truthy.
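One way to read this rule is that a space follows every token unless its `skip_space_after` value is `y`. The sketch below applies that interpretation to the rows from the example table; the `render` function is a hypothetical helper and may differ from how ClearAligner renders text internally.

```python
def render(tokens):
    """Join token texts, omitting the space after tokens flagged with 'y'."""
    parts = []
    for tok in tokens:
        parts.append(tok["text"])
        if tok.get("skip_space_after", "") != "y":
            parts.append(" ")
    # Drop the space that would otherwise trail the final token.
    return "".join(parts).rstrip()


# (text, skip_space_after) pairs from the example table.
tokens = [
    {"text": t, "skip_space_after": s}
    for t, s in [
        ("Hapo", ""), ("mwanzo", ""), ("Mungu", ""), ("aliumba", ""),
        ("mbingu", ""), ("na", ""), ("dunia", "y"), (".", ""),
    ]
]

print(render(tokens))  # prints "Hapo mwanzo Mungu aliumba mbingu na dunia."
```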

### Column: `exclude`

The `exclude` column is used to mark tokens that should be displayed (for correct rendering of the text) but cannot be aligned. If empty, ClearAligner assumes a falsy value. If the value `y` is present, the token will be displayed but cannot be aligned. It is common to exclude punctuation tokens from alignment.
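As a small sketch of this rule, the snippet below keeps every token for display but selects only those without `exclude` set to `y` as alignment candidates. The `alignable` function is a hypothetical name, and the rows are a shortened version of the example table.

```python
def alignable(tokens):
    """Return the tokens that may be aligned (exclude is not 'y')."""
    return [tok for tok in tokens if tok.get("exclude", "") != "y"]


tokens = [
    {"id": "01001001006", "text": "na", "exclude": ""},
    {"id": "01001001007", "text": "dunia", "exclude": ""},
    {"id": "01001001008", "text": ".", "exclude": "y"},
]

print([tok["text"] for tok in tokens])             # ['na', 'dunia', '.']
print([tok["text"] for tok in alignable(tokens)])  # ['na', 'dunia']
```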

