The index of tokens identifies every occurrence of a whitespace-delimited token in a text with a canonical citation expressed as a CTS URN, and maps each occurrence to one or more entries in the inventory of linguistic entities.
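As a rough illustration of the mapping described above, the following Python sketch splits a passage on whitespace, cites each token occurrence with a CTS URN subreference (the `@token[n]` form, where `n` counts repeated tokens within the passage), and looks the token up in a lexicon. The function name `index_tokens`, the `lexicon` dictionary, and the entity identifiers such as `entity:menis` are hypothetical and are not taken from the project's actual data.

```python
from collections import defaultdict


def index_tokens(passage_urn, text, lexicon):
    """Return (occurrence URN, entity ids) pairs for each token in text.

    passage_urn: CTS URN of the passage, e.g. a line of verse.
    lexicon: hypothetical mapping from a token string to a list of
             entity identifiers (an assumption for this sketch).
    """
    seen = defaultdict(int)  # how many times each token has occurred so far
    rows = []
    for token in text.split():  # split() with no argument splits on any whitespace
        seen[token] += 1
        occurrence = f"{passage_urn}@{token}[{seen[token]}]"
        rows.append((occurrence, lexicon.get(token, [])))
    return rows


# Hypothetical lexicon and entity identifiers, for illustration only.
lexicon = {"μῆνιν": ["entity:menis"], "θεὰ": ["entity:thea"]}
rows = index_tokens("urn:cts:greekLit:tlg0012.tlg001:1.1", "μῆνιν ἄειδε θεὰ", lexicon)
for occurrence, entities in rows:
    print(occurrence, entities)
```

A token with no lexicon entry is kept with an empty list in this sketch, so that every occurrence remains citable even before it has been analyzed.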
The index of tokens is stored in a simple tabular format, with structured metadata expressing the coverage of the index as a series of CTS URNs. Because the format is tabular, it can be read by a wide range of software. A full description of the tabular format will be available from this page.
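Until the format description is published, the following Python sketch only suggests how such an index might be read. It assumes a purely hypothetical layout: metadata lines beginning with `# coverage:` that each carry one CTS URN, followed by tab-delimited rows pairing an occurrence URN with one entity identifier (so an occurrence mapped to several entities takes several rows). The real format may differ on every one of these points.

```python
import io

# Hypothetical sample; the layout and entity identifiers are assumptions.
SAMPLE = (
    "# coverage: urn:cts:greekLit:tlg0012.tlg001:\n"
    "urn:cts:greekLit:tlg0012.tlg001:1.1@μῆνιν[1]\tentity:menis\n"
    "urn:cts:greekLit:tlg0012.tlg001:1.1@θεὰ[1]\tentity:thea\n"
)


def read_index(stream):
    """Split an index into coverage URNs and (occurrence, entity) rows."""
    coverage, rows = [], []
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith("# coverage:"):
            # Everything after the first colon is the coverage URN.
            coverage.append(line.split(":", 1)[1].strip())
        else:
            occurrence, entity = line.split("\t")
            rows.append((occurrence, entity))
    return coverage, rows


coverage, rows = read_index(io.StringIO(SAMPLE))
print(coverage)
print(len(rows), "rows")
```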
We have begun analyzing Greek texts in verse down to about 300 C.E. When the first inventory of extant texts has been released on this site, we will simultaneously begin releasing current versions of the inventory of linguistic entities, and of the associated index of tokens.