TLex / tlTerm / tlDatabase Documentation
See also the TLex User Guide for primary documentation.
FAQ: TLex/tlTerm/tlDatabase Frequently Asked Questions; Tips & Tricks; Undocumented
Listed here are answers to common queries, some potentially useful features that have not yet reached "full-fledged feature" status or have not been documented elsewhere, and other useful "tips & tricks" for TLex, tlTerm and tlDatabase.
[TshwaneLex 2] Pressing F12 while in an "Attributes (F1)" window pops up a larger overlay window for editing the current attribute.
(Note one can also widen the "Attributes (F1)" window with the "View/Wide Tools window layout (Ctrl+Alt+L)" option.)
Delete: Delete current element subtree
Ctrl+Up: Move current element up relative to sibling [change order of elements]
Ctrl+Down: Move current element down relative to sibling [change order of elements]
Ctrl+X: Cut current element subtree to clipboard
Ctrl+C: Copy current element subtree to clipboard
Ctrl+V: Paste current element subtree from clipboard
Undocumented:
Ctrl+Shift+Left: Collapse sub-elements
Ctrl+Shift+Right: Expand sub-elements
[Version 5] By specifying a "$LUA$:" prefix in the 'text to insert' field under 'Tools/Options/Keyboard shortcuts', one can enter Lua script to be executed when that key is pressed. If an entry or node is currently selected in the Entry List and/or Tree View, gCurrentEntry and gCurrentNode will point to these. The following example toggles a 'has been proofread' flag on the entry (assuming that attribute has been configured).
$LUA$:
if gCurrentEntry == nil then return; end
local A = gCurrentEntry:GetElement():FindAttributeByName("_CheckFlags");
if A ~= nil then
    gCurrentEntry:ToggleAttributeListItem(A, "ProofRead");
    tEntryHasChanged(gCurrentEntry);
    gCurrentEntry:SetChanged(true, true);
end
Yes, all of our software products do.
Yes; as of early 2010, there are native Mac versions of these applications.
Prior to this, virtualisation was also a solution: TLex and tlTerm will also work on the Mac if using Parallels. TLex and tlTerm have also been reported to more or less work in the CrossOver environment, although some configuration is needed - one of our users has kindly posted instructions online: Getting TshwaneLex working in CrossOver.
In most cases, this simply means that the font you have chosen to display some text does not support that particular character. Changing to (or installing) a font that supports that character usually resolves the problem.
This is usually a "cosmetic" issue, in that the underlying data itself is (in most cases) correctly encoded, but is just not being displayed correctly.
Note that you may have to specify a font in more than one place in TshwaneLex or tlTerm; see below.
See also 'Useful Fonts' below for fonts that may help resolve such problems.
The Preview is generated using the Styles system for the document, while the fonts used to display text in the Tree View or under F1 are configured separately, under "Tools/Options/Fonts".
If the chosen font in the Styles system doesn't support a given character, the Preview area may also 'intelligently' select a different font that does, just to display the character; the Tree View and the "Attributes (F1)" window do not do this.
This is a "cosmetic" issue, in that the underlying data itself is correctly encoded, but is just not being displayed correctly in certain parts of the software.
DejaVu - A Unicode font family based on the Bitstream Vera Fonts. Free / open source. Wide range of characters. Includes serif, sans serif and monospaced varieties. Cross-platform (Mac/Windows/Linux etc.).
Doulos SIL - "A Unicode font with a comprehensive set of characters needed for almost any Roman- or Cyrillic-based writing system (including IPA), whether for phonetic or orthographic needs. In addition, there is provision for other characters and symbols useful to linguists. This font makes use of state-of-the-art font technologies to support complex typographic issues, such as the need to position arbitrary combinations of base glyphs and diacritics optimally."
Arial Unicode MS (Not downloadable) - A (large) font from Microsoft that covers a very broad range of characters in Unicode 2.1. A very useful 'cover-all(/most)' font, particularly for editing purposes.
The font settings under "Tools/Options" allow you to configure the font that will be used for all 'Attributes (F1)' boxes by default. You may, however, want or need to use different fonts for different fields (that e.g. use very different writing systems), or for different sides (sections) of the dictionary. You can configure specific fonts to be used for individual attribute types under 'Attributes (F1)' by creating a text file "Fonts.txt" in the TshwaneLex/tlTerm application folder. Each line of the file must have three comma-separated values: The name of the 'language' or 'section', followed by an "Element|Attribute" item, followed by the font name. The following example, for a bilingual Tshivenda/English dictionary, would use "DejaVu Sans" for the headword box on the Tshivenda side of the dictionary and for the Tshivenda 'Translation Equivalent' box on the English side:
Tshivenda,Lemma|LemmaSign,DejaVu Sans
English,TE|TE,DejaVu Sans
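The three-value line format described above can be generated or checked with a small script. The following is only a sketch in Python: the helper names (FontRule, format_font_rule, parse_font_rule) are made up for this illustration and are not part of TshwaneLex/tlTerm, and how the application itself treats extra spaces or malformed lines is not covered here.

```python
from typing import NamedTuple


class FontRule(NamedTuple):
    section: str  # name of the 'language' or 'section', e.g. "Tshivenda"
    field: str    # "Element|Attribute" item, e.g. "Lemma|LemmaSign"
    font: str     # font name, e.g. "DejaVu Sans"


def format_font_rule(rule):
    """Render one rule as a line: section,Element|Attribute,font."""
    return f"{rule.section},{rule.field},{rule.font}"


def parse_font_rule(line):
    """Split a line into its three comma-separated values."""
    parts = line.split(",", 2)
    if len(parts) != 3 or "|" not in parts[1]:
        raise ValueError(f"expected 'section,Element|Attribute,font': {line!r}")
    return FontRule(*parts)


rules = [
    FontRule("Tshivenda", "Lemma|LemmaSign", "DejaVu Sans"),
    FontRule("English", "TE|TE", "DejaVu Sans"),
]

# Text that could be saved as "Fonts.txt" in the application folder.
fonts_txt = "\n".join(format_font_rule(rule) for rule in rules) + "\n"
print(fonts_txt, end="")
```

Running this prints the same two lines as the example above; the parser can be used to sanity-check a hand-edited file before copying it into place.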
If one regularly imports CSV documents with given columns, defaults for the corresponding attributes can be configured by creating a text file in the TshwaneLex/tlTerm application folder (typically something like "c:\Program Files\TshwaneLex") called "ImportCSVDefaults.txt". The contents of this text file are simply what you would see in the right-side column of the 'Import CSV' dialog box, e.g.:
Lemma::LemmaSign
Lemma::PartOfSpeech
Lemma::Sense::TE::TE
A blank row corresponds to a "Nothing". If this text file is present, then when the 'Import CSV' dialog is opened, it will automatically attempt to fill in the right-side column with the fields read from the text file (provided those fields exist in the DTD; if not a "Nothing" will be added instead).
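If the column layout changes from time to time, the defaults file can be produced by a script. This is a minimal sketch in Python; build_import_csv_defaults is a made-up helper name, and the only format rules it relies on are the ones stated above (one field path per CSV column, blank row for "Nothing").

```python
def build_import_csv_defaults(columns):
    """Return file contents: one line per CSV column, None -> blank row ("Nothing")."""
    return "\n".join(path or "" for path in columns) + "\n"


defaults = build_import_csv_defaults([
    "Lemma::LemmaSign",
    "Lemma::PartOfSpeech",
    None,                    # third CSV column is not imported
    "Lemma::Sense::TE::TE",
])

# Text that could be saved as "ImportCSVDefaults.txt" in the application folder.
print(defaults, end="")
```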
[New from 2006-11-15] For some projects the lemma list width may seem a bit on the small side. The width can be configured by creating a DWORD registry key called "EntryListWidth" under "HKEY_CURRENT_USER/Software/TshwaneDJe/(ApplicationName)/UserInterface". This specifies the width as a percentage of the width of the application window. The default is around 11%.
[TshwaneLex 4] As of TshwaneLex Suite 4 (future release), one can change the width of the Lemma List or Term List dynamically in TshwaneLex or tlTerm using the keyboard shortcuts Ctrl+Alt+Shift+Right and Ctrl+Alt+Shift+Left.
Under "HKEY_CURRENT_USER/Software/TshwaneDJe/(ApplicationName)/App", create a DWORD registry key called "NoSplash" and set the value to 1.
Under "HKEY_CURRENT_USER/Software/TshwaneDJe/(ApplicationName)/Settings" (create if necessary), create a DWORD registry key called "MaxSearchResults" and set the value to the desired maximum number of search results.
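Rather than creating such settings by hand in the Registry Editor, one could generate a ".reg" file and import it. The sketch below (Python) only builds the text of such a file; reg_dword_entry is a made-up helper, "(ApplicationName)" is left as the same placeholder used above and must be replaced with the real name, and the result should be checked against your own installation before importing. Note that in ".reg" terms the "DWORD registry key" described above is a DWORD value stored under a key.

```python
def reg_dword_entry(app_name, subkey, name, value):
    """Build .reg file text that sets one DWORD value under HKEY_CURRENT_USER."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 unsigned bits")
    key = "\\".join(["HKEY_CURRENT_USER", "Software", "TshwaneDJe", app_name, subkey])
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key}]\n"
        f'"{name}"=dword:{value:08x}\n'   # DWORDs are written as 8 hex digits
    )


# Example: raise the maximum number of search results to 2000.
reg_text = reg_dword_entry("(ApplicationName)", "Settings", "MaxSearchResults", 2000)
print(reg_text, end="")
```

The same helper would cover "NoSplash" (subkey "App", value 1) and "EntryListWidth" (subkey "UserInterface", value as a percentage).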
[New from 2006-11-16] Create "HKEY_CURRENT_USER/Software/TshwaneDJe/(ApplicationName)/Settings". Underneath this, there are two possible settings:
Under "HKEY_CURRENT_USER/Software/TshwaneDJe/(ApplicationName)/PerLexicon/(ProjectKey)/SectionID(Num)/", the following settings apply:
Once in InDesign via RTF, each field (information type, i.e. element/attribute) has an associated 'style' which may be manipulated centrally in InDesign.
Our regular expressions use the Perl syntax: Regular Expression Syntax Reference.
There are at least two methods:
(1) In the DTD editor, configure the attribute value to be 'required'. Then do an "Error check" (or use the "Required attribute" F5 Filter).
(2) Do a field-specific F3 search for the regular expression "^$".
Then when exporting, select the 'Use filters' option.
[TshwaneLex 2] To find or export all entries beginning with a certain letter or letters, one can use a field-specific regular expression search, combined with a search filter, as explained in the following steps:
Now export as usual (e.g. to RTF), making sure to select the "Use filters" option.
(For languages with prefixes, e.g. if a "-" precedes roots/stems in the headword, the regular expressions "^-?a" and "^-?[q-z]" can be used respectively.)
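To get a feel for what the two prefix-aware expressions select, they can be tried out on a small word list. The sketch below uses Python's re module, which is not the engine used inside TshwaneLex but treats these particular patterns the same way as Perl syntax does; the sample headwords are invented.

```python
import re

headwords = ["apple", "-amba", "banana", "quince", "-zulu", "Zebra"]

# "^-?a": optional leading "-", then "a" at the start of the headword.
starts_a = [w for w in headwords if re.search(r"^-?a", w)]

# "^-?[q-z]": optional leading "-", then any letter from q to z.
starts_q_to_z = [w for w in headwords if re.search(r"^-?[q-z]", w)]

print(starts_a)        # ['apple', '-amba']
print(starts_q_to_z)   # ['quince', '-zulu']
```

"Zebra" is not matched here because Python's matching is case-sensitive by default; whether a search in the application is case-sensitive depends on the search options chosen there.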
Apply the desired filter(s) under "Filter (F5)", and select "File/Save a copy/TshwaneLex file". Select "Use filters", and if desired, "Include incomplete articles".
This can also be used to clear a side of the dictionary, by applying a filter on that side that leaves no entries remaining (e.g. 'subtract all entries with a lemma sign').
NB: When exporting a filtered subset in this way, cross-references to articles that are not also included by the filter will be removed from the exported data.
See previous question.
If one wants to retain certain settings from the original database, such as the user logons, see previous question.
If one only wants to use the core TshwaneLex DTD and styles, but lose e.g. the user logons, one can use "DTD Templates", explained in the User Guide.
Open "Edit/Search and replace". Under "Fields", do 'Clear all' and tick only the 'Incomplete' attribute. Do a find of "0" and replace with "1" (enter these without quotes).
Open "Edit/Search and replace". Under "Fields", do 'Clear all' and tick only the 'Incomplete' attribute. Do a find of "1" and replace with "0" (enter these without quotes).
This can be achieved using the CSV import and using the values '0' or '1' (to clear or set the incomplete status, respectively) for the incomplete attribute. Be sure to uncheck 'Mark imported entries as incomplete'.
The quickest method [TshwaneLex 3] is to go to "Format (F4)" and select the 'modified date' (typically named "Modified") attribute under "Sort by". The most recent work will now be at the bottom of the Lemma List or Term List. Once done, you can select the "-" option from the "Sort by" list to go back to normal.
Note that this can be combined with "Filter (F5)"; for example, you could filter on a particular username to see all the latest work of that user only.
Go to 'Search (F3)', tick 'Regular expression' and enter '\w+'. The number of words will be the first number after "Matches found".
Clicking on the number of results will display the word count on a per-field basis.
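The idea behind this word count can be illustrated outside the application: each match of '\w+' is one run of word characters, so counting matches counts words. The following is a rough sketch in Python with invented field names and text; the exact figures reported by TshwaneLex/tlTerm for real data may differ in how edge cases (hyphens, apostrophes, markup) are treated.

```python
import re

# Invented sample data: field name -> text of that field.
fields = {
    "LemmaSign": "ice cream",
    "Definition": "A frozen dessert made from cream",
}

# Count '\w+' matches per field, then sum for the overall word count.
per_field = {name: len(re.findall(r"\w+", text)) for name, text in fields.items()}
total = sum(per_field.values())

print(per_field)   # {'LemmaSign': 2, 'Definition': 6}
print(total)       # 8
```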
Go to "Search (F3)". Click "Fields <" to display the fields list, click 'Clear all' and tick only the 'Modified' attribute on the lemma/entry element (this means "restrict the search to the 'last modified date' field"). Enter a date into the search box in YYYY-MM-DD format (e.g. "2008-03-20", without quotes) and click 'Search'.
You can also easily search for work from an entire month by searching for that month specified in YYYY-MM format, e.g. "2008-03".
To find work from multiple days, you can tick 'Regular expression' and then separate them with the "|" (vertical line, meaning 'or' in regular expression syntax) character, e.g. you could search for "2008-03-20|2008-03-21|2008-03-22". More advanced regular expressions could be used to construct fancier queries.
Note that this can be combined with "Filter (F5)"; for example, you could first filter on a particular username to see the work of that user only.
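The date searches described above can be tried out on sample values first. This Python sketch assumes the 'Modified' values begin with a YYYY-MM-DD date (the entries and dates are invented) and uses Python's re module as a stand-in for the application's own regular expression engine.

```python
import re

# Invented sample: headword -> last modified date.
modified = {
    "dog": "2008-03-20",
    "cat": "2008-03-22",
    "cow": "2008-04-01",
    "hen": "2007-12-31",
}


def find(pattern):
    """Return the headwords whose modified date matches the pattern."""
    return sorted(word for word, date in modified.items() if re.search(pattern, date))


print(find(r"2008-03"))                             # whole month
print(find(r"2008-03-20|2008-03-21|2008-03-22"))    # several days, using "|"
print(find(r"2008-03-2[0-2]"))                      # same days, written as a character class
```

All three searches select "cat" and "dog" in this sample; the last form is one example of the 'fancier' queries mentioned above.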
'Inline elements' refers to when an element is used inside the text of a field, i.e. when another field type occurs somewhere within the text of a single "Attributes (F1)" box. Take the following simple example Dutch - French dictionary article:
aanbouwmeubel meuble [m.] assemblé par éléments
The gender "m." of "meuble" is indicated in italics with square brackets within the actual translation "meuble assemblé par éléments". Using an inline element, the gender information may be specially tagged as such, e.g. with its own "gender" element, using XML syntax inside the translation equivalent field, like so:
meuble <gender>m.</gender> assemblé par éléments
Here "gender" is an element defined in the DTD. It thus has its own style in TshwaneLex, allowing it to automatically always appear with its own particular distinctive font/colour/formatting as well as automatic punctuation such as the square brackets around it.
The sample "English - French Inline Element Sample.tldict" that comes with TshwaneLex demonstrates this usage of inline elements.
Inline Elements Background:
In XML, only "PCDATA" can contain inline elements - regular attributes may not. This is because for an inline tag to be understood, the text that it occurs in must be 'parsed' (i.e. basically 'processed'), and ordinary attribute values are not parsed - only the PCDATA section of an element is parsed (PCDATA actually stands for "Parsed Character Data"). The PCDATA of an element is basically everything that falls between the opening tag of an element and its corresponding closing tag:
<Element ...>THIS IS PCDATA</Element>
Ordinarily, by default, a translation equivalent in TshwaneLex is stored as a regular attribute of an element, which will be exported as XML that looks like the following:
<TE TE="meuble assemblé par éléments"></TE>
If the "TE" attribute above is instead marked as being used for the PCDATA of the TE element in the DTD Editor, then it will be exported as XML that looks like the following (note the translation equivalent is now in the PCDATA section of the element):
<TE>meuble assemblé par éléments</TE>
This then allows inline elements to be used, e.g. the following is valid XML:
<TE>meuble <gender>m.</gender> assemblé par éléments</TE>
Using inline elements within a regular attribute, however, is invalid XML:
<TE TE="meuble <gender>m.</gender> assemblé par éléments"></TE>
Inline elements may also be used for basic formatting, such as bold and italics, instead of the TshwaneLex "%" markup characters:
<TE>here is some <b>bold</b> text</TE>
Advantages of Inline Elements:
In the above example of the "gender" element, using an inline element effectively allows "knowledge" of what the "m." means to be encoded into the data, rather than just "dumb" text. This has advantages, e.g. in the TshwaneLex Electronic Dictionary module, if the end-user clicks on a gender label, a grammar window explaining the gender system could be displayed. A smarter search index could also be generated, allowing the end-user to search for "meuble assemblé", since the search system could 'know' that the gender label is not strictly part of the translation.
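Both points (inline tags are only well-formed in PCDATA, and tagged data can be processed 'intelligently') can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module on exported-style XML; it is independent of TshwaneLex, and the helper names are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def text_without(element, tag):
    """Text of an element with the content of inline <tag> children left out."""
    pieces = [element.text or ""]
    for child in element:
        if child.tag != tag:
            pieces.append("".join(child.itertext()))
        pieces.append(child.tail or "")
    return " ".join("".join(pieces).split())   # normalise whitespace


pcdata = "<TE>meuble <gender>m.</gender> assemblé par éléments</TE>"
attribute = '<TE TE="meuble <gender>m.</gender> assemblé par éléments"></TE>'

print(is_well_formed(pcdata))      # True
print(is_well_formed(attribute))   # False: "<" is not allowed inside an attribute value

te = ET.fromstring(pcdata)
print(te.find("gender").text)      # m.
print(text_without(te, "gender"))  # meuble assemblé par éléments
```

The last line shows the kind of 'searchable' text a smarter index could build once the gender label is tagged rather than typed as plain text.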
Inline Cross-references [TshwaneLex 3]
Another advantage of inline elements in TshwaneLex is that they allow for the creation of inline cross-references - i.e. cross-references somewhere within the text of a field. This can be configured in the DTD Editor, in the "Attributes" section (using the 'XRef target' and 'XRef display' checkboxes). The 'XRef target' attribute is used to specify the actual cross-reference, while the optional 'XRef display' attribute, if filled in, indicates some substitution text to appear in place of the cross-referenced headword (if you need to display something different). (If both of these two settings are used, they must be ticked for different attributes (or PCDATA) within the same element.)
The 'XRef target' attribute type may be used on either PCDATA (for inline cross-references - this will likely usually be the case) or normal attributes. In the following example it is used in PCDATA:
<reftype>See</reftype> <ref>bunny</ref>
If "ref" has a PCDATA with 'XRef target' selected, then TshwaneLex will try to resolve "bunny" as a cross-reference and make a hyperlink. It doesn't have to be PCDATA; it could be a regular attribute. You could thus also do something like:
<reftype>See</reftype> <ref target="bunny" />
The 'XRef display' is for cases where the text that is displayed in the Preview should be different from the actual cross-reference target. For example, if we create a "ref::display" attribute and check the 'XRef display' option for it, we could do the following:
Check <ref display="Google">http://google.com/</ref> for more info.
or non-PCDATA equivalent:
Check <ref display="Google" target="http://google.com/" /> for more info.
The output will then display "Check Google for more info", but if you click on "Google" the actual link will be "http://google.com/". Apart from hyperlinks to websites, this also has lexicographic applications (the inline cross-references sample in TshwaneLex demonstrates this).
'XRef target' is required to be ticked if you want to use inline cross-references, but 'XRef display' is optional. If you use it though, it must be on the same element as the 'XRef target' it corresponds to. This can also be used on either PCDATA or normal attributes.
Note that inline cross-references are not "smart cross-references", although they will display as a hyperlink if the cross-reference target is found. It is suggested to use normal smart cross-references instead of inline cross-references unless you have a definite need to have, e.g., complex sentences with cross-references appearing anywhere inside a sentence.
By default the cross-reference type is displayed with the same style as the cross-reference headword, e.g.:
hound SEE dog
It is sometimes desirable to use a different style for the cross-reference type, e.g.:
hound SEE dog
This can be achieved to some extent by using markup tags in the 'display labels' under "Dictionary/Edit cross-reference types", e.g.:
%bSEE%b
In this example, the bold tags 'cancel out' the surrounding bold from the overall style of the entire cross-reference. In future, we intend to add more advanced controls for the appearance of cross-references.
Releases of TshwaneLex later than July 2007 include the ability to do closed list item validation for inline elements. A PCDATA section can be specified to be of 'closed list' type in the DTD editor, and if its value isn't in the selected list, it will be highlighted in red in the Preview (likewise for attributes written inline). For example, if you have in an etymology field:
From <lang>Old French</lang> <originword>entreprendre</originword>
then the part inside the "lang" tags may be checked against values in a closed list, such as a list of valid origin languages; thus if you make a typo:
From <lang>Old Frenhc</lang> <originword>entreprendre</originword>
the language name will be highlighted in red, immediately tipping the user off that there is a mistake.
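The same kind of check can be run over exported data outside the application. The following Python sketch is not how TshwaneLex implements the validation; the closed list contents and the invalid_values helper are invented for the example. It wraps the field text in a temporary root element so that it can be parsed as XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical closed list of valid origin languages.
VALID_LANGS = {"Old French", "Latin", "Middle English"}


def invalid_values(field_text, tag, allowed):
    """Return the contents of inline <tag> elements that are not in the closed list."""
    root = ET.fromstring(f"<field>{field_text}</field>")
    return [(el.text or "") for el in root.iter(tag) if (el.text or "") not in allowed]


good = "From <lang>Old French</lang> <originword>entreprendre</originword>"
typo = "From <lang>Old Frenhc</lang> <originword>entreprendre</originword>"

print(invalid_values(good, "lang", VALID_LANGS))   # []
print(invalid_values(typo, "lang", VALID_LANGS))   # ['Old Frenhc']
```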
[New from 2007-03-22; TshwaneLex 2] Special "tag" shortcut keys can be created under "Tools/Options/Keyboard shortcuts (macros)" that make tagging data with inline elements under "Attributes (F1)" far more convenient and user-friendly than typing out tags manually. This involves creating a shortcut key with the following format for "Text to insert when shortcut is pressed":
$TAG$:tagname
For example:
$TAG$:gender
Pressing this shortcut key in an "Attributes (F1)" box will then automatically 'intelligently' output either an opening "<gender>" or closing "</gender>" tag as appropriate, or if some text is selected, surround the selected text with a pair of opening and closing tags.
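The three behaviours described above (wrap a selection, open a tag, close a tag) can be sketched as follows. This Python function is only one plausible rule, written for illustration: it assumes a tag should be closed whenever more opening than closing tags precede the cursor, which may not be exactly how TshwaneLex decides. The function name and parameters are made up.

```python
def apply_tag(text, tag, sel_start, sel_end):
    """Return the text after a tag shortcut is pressed with the given selection.

    sel_start == sel_end means there is no selection, only a cursor position.
    """
    open_tag, close_tag = f"<{tag}>", f"</{tag}>"
    if sel_start != sel_end:
        # Some text is selected: surround it with a pair of tags.
        return (text[:sel_start] + open_tag + text[sel_start:sel_end]
                + close_tag + text[sel_end:])
    before = text[:sel_start]
    # Assumed rule: close the tag if one is still open before the cursor.
    unclosed = before.count(open_tag) > before.count(close_tag)
    return before + (close_tag if unclosed else open_tag) + text[sel_start:]


print(apply_tag("meuble m. assemblé", "gender", 7, 9))   # wraps the selected "m."
print(apply_tag("meuble ", "gender", 7, 7))              # inserts an opening tag
print(apply_tag("meuble <gender>m.", "gender", 17, 17))  # inserts the closing tag
```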
Yes; for now, this can be done by creating an attribute called "_SortKeyOverride" on the main entry element. This will be applied for formatted exporters, e.g. RTF, HTML and XML (Formatted), and is applied at export time, not in TshwaneLex itself. [From 3 May 2009 onwards, and TshwaneLex-only]
Under normal circumstances these should not go wrong, but should circumstances outside the norm result in the ordering of entries or of homonym numbers being incorrect, you can use "Re-sort lemmas" under "Dictionary/Configure sorting" (use for each section of the dictionary).
The database analyse/repair options under "Tools/Database administration" will also repair homonym numbers.
It may be useful sometimes to refer to a certain entry while working on another, so that both are on the screen simultaneously.
This can be done by selecting the entry you want to "pin", selecting "Edit/Tag entry", then enabling "Edit/Show tagged always", which will effectively "pin" tagged entries in the Preview.
Another trick here, or if you're using an older version of TshwaneLex without the 'tags' functionality, is to use "Search (F3)" to get the first entry into the search results window, then select and work on the other entry.
Make sure to use the "cached" ODBC interface, i.e. the one labelled 'ODBC (cached)' - this is like the ordinary ODBC interface, but transparently keeps a local cached copy of entries that haven't changed on the hard disk, allowing TshwaneLex/tlTerm to dramatically speed up general work and activities like searching and filtering.
Also, when working on an ODBC database, enable "Format/Preview selection only" - this can make a very big difference.
For PostgreSQL, try running an "analyze" on the database - this can make a huge difference.
See previous question.
The cache folder lives in your 'temp' folder, typically something like 'c:\Documents and Settings\YourUsername\Local Settings\Temp\TshwaneDJe_Cache' - just close TshwaneLex/tlTerm and delete the TshwaneDJe_Cache folder.
As of 2007-06, TshwaneLex/tlTerm can be launched with an ODBC database specified as a command-line parameter (e.g. "odbc|datasourcename|tl_"), allowing e.g. Desktop or Quicklaunch shortcuts to be created that directly open a particular ODBC database.
To do the same but using the "cached ODBC" interface, use "cached" instead of "odbc", e.g. "cached|datasourcename|tl_".
By default only the first 1000 results for a search are returned; this can be changed. Under "HKEY_CURRENT_USER/Software/TshwaneDJe/(ApplicationName)/Settings" (create if necessary), create a DWORD registry key called "MaxCorpusResults" and set the value to the desired maximum number of corpus lines.
The most recent sets of corpus search results are cached, so that if you immediately return to a recent search, the results can be displayed immediately. The number of results to cache can be changed. Under "HKEY_CURRENT_USER/Software/TshwaneDJe/(ApplicationName)/Settings" (create if necessary), create a DWORD registry key called "MaxCorpusCachedResults" and set the value to the desired maximum number of search results. The default is 20. A note of caution: setting this to a very high number may impact performance.
1. Configuring the Corpus
Step-by-step:
- Prepare your corpus files as text files. (It may be a good idea to save them all to a specific dedicated folder, but that is not necessary.)
- Under "Corpus (F6)", click on "Configure" and select "Texts/Add multiple".
- Click "Browse" to select the folder containing the text files (the "recurse" option will specify whether or not TshwaneLex/tlTerm will also auto-add text files from subfolders within the selected folder).
- Click "OK".
- The desired corpus files should now appear in the list.
The configuration will be saved along with the particular database.
More files can be added at any time later, or files may be removed from the list.
2. Doing a Corpus Search
Once the corpus files are configured, you can perform search queries on the corpus. Either a query can be entered manually under "Corpus (F6)", or you can tick the "Auto-search" option, and TshwaneLex/tlTerm will then automatically launch a corpus search for the current headword/term each time you select an entry.
The most recent results are kept in memory, thus if you select another entry and then go back again to the first entry, the search results should re-appear immediately. If the search had not yet completed, it will automatically continue on its way again.
3. Sorting the Results
The ordering of corpus search results can be configured by clicking on "Configure" under "Corpus (F6)", and under "Sort", using "Move up" and "Move down" to change the order of sort items. For example, by moving "Word Before Search Term" to the first position, the entries will first be sorted on the word to the left of the search term within a results line. (If the word to the left is the same for two lines, the next item in the list will decide how they are further sorted, and so on.)
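The 'first item decides, next item breaks ties' behaviour is an ordinary multi-key sort. The Python sketch below illustrates the principle only: the CorpusLine structure, the sample lines and the "word after" sort item are invented here, and only "Word Before Search Term" is a name taken from the description above.

```python
from typing import NamedTuple


class CorpusLine(NamedTuple):
    before: str   # text to the left of the search term
    term: str     # the search term itself
    after: str    # text to the right of the search term


def word_before(line):
    """Sort item: the word immediately to the left of the search term."""
    words = line.before.split()
    return words[-1].lower() if words else ""


def word_after(line):
    """Sort item (hypothetical): the word immediately to the right."""
    words = line.after.split()
    return words[0].lower() if words else ""


lines = [
    CorpusLine("the big", "dog", "barked loudly"),
    CorpusLine("a small", "dog", "slept"),
    CorpusLine("the big", "dog", "ate"),
]

# The order of this list plays the role of "Move up" / "Move down".
sort_items = [word_before, word_after]
ordered = sorted(lines, key=lambda line: tuple(item(line) for item in sort_items))

for line in ordered:
    print(line.before, line.term.upper(), line.after)
```

The two "big dog" lines come first (same word before the term), ordered between themselves by the next sort item, followed by the "small dog" line.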
4. Auto-grabbing Usage Examples
One of the powerful time-saving features of TshwaneLex/tlTerm 3 is the ability to automatically 'grab' a sentence from a line in the corpus results and attach it as a usage example in the current entry. To do this, use the following procedure:
- Select the desired 'Sense' element in the Tree View to which you wish to attach the example
- Select the desired line in the corpus results by clicking on its number in the left column
- Press the shortcut key Ctrl+F7
Note: This relies on the default "Sense" and "Example" elements from the default TshwaneLex DTD being present.
It is also possible to grab multiple sentences at a time. Just select the desired corpus lines (e.g. holding down "Ctrl" while clicking on them with the mouse to create a multi-selection) and press Ctrl+F7; each one will be added as an 'Example' to the currently selected 'Sense'.
5. Copying Selected Examples (Corpus Lines) to Clipboard
You can use the shortcut "Ctrl+C" to copy the currently selected corpus line to the clipboard.
6. Corpus Encryption
The TshwaneLex/tlTerm corpus tool includes a facility to 'encrypt' corpus files and protect them with a password. The resultant encrypted files can be used within the "Corpus (F6)" tool provided one has the password, but outside of TshwaneLex/tlTerm the files will be unreadable. This allows you to protect your corpus from possible theft by members of the team or anyone else with access to the computers.
To apply encryption to all or part of your corpus, click on "Configure" under "Corpus (F6)", click on the "Texts" tab, then select one or more files that you would like to encrypt from the list. Multiple files can be selected by holding down "Ctrl" on the keyboard while clicking with the mouse. Alternatively, if you wish to encrypt all files, click on "Encrypt all". You will be prompted for the password that will be used to protect the files. Enter the password carefully and click "OK". New copies of each chosen file will be saved (to the same folder) with an extension ".tecrypt". These files may then be distributed to the compilers instead of the original text files.
IMPORTANT:
- The encryption password is case-sensitive, meaning "a" is considered different from "A".
- Make sure to keep a backup copy of the original corpus files in a safe place. Do not lose the originals. If you forget the password, the original text cannot be recovered from the encrypted files.
The most common cause of this is having the same dictionary file open more than once simultaneously (e.g. in two separate TshwaneLex windows). Failing that, it may be that the file is marked as 'read-only'.
=> The 'Getting Started with TshwaneLua' guide is being moved to the TshwaneLua page.
=> Sample Lua scripts are available here.
Try declaring variables and/or functions "local" (i.e. with the local keyword in front).
If you have (for example) a variable of type tcNode called NODE which (NB) you know is of type tcReference, you can typecast it using tolua.cast as follows:
local Reference = tolua.cast(NODE, "tcReference");
This was caused by a bug in Windows Vista. A workaround has been implemented in newer versions of TLex - if you are experiencing this, use "Help/Check for updates" to update your software.
The idea behind 'templates' is to pre-create entire element tree sub-structures and then enter these 'in one go' in the Tree View. Templates are not yet directly supported in TshwaneLex/tlTerm, but are planned for a future release. In the meantime, for many applications, there are 'work-arounds' - tips that allow one to achieve a similar effect in certain cases. One, for 'lemma templates' as a whole, is to create a few 'dummy' entries, perhaps always sorted to the top of the dictionary, and use the "Lemma/Duplicate article" (Ctrl+Shift+U) menu command in TshwaneLex (or "Entry/Duplicate entry" in tlTerm) when you wish to create an instance of the 'template'. A similar technique can be used for sub-element trees within an entry, by using the Tree View 'copy' and 'paste' commands (Ctrl+C and Ctrl+V respectively) to duplicate Tree View structures from a 'template' source. Another possibility in some cases is to use the DTD child relation constraints, e.g. specifying a "one or more" relation will cause the child element to automatically be created when one of its parent elements is created. Finally, for advanced users, the built-in Lua scripting could be used.
Data in XML form can be imported into TshwaneLex or tlTerm via the "File/Import/XML" menu option.
It is best to import data into a 'clean/empty' document, i.e. to select the XML import command when no database is open in TshwaneLex/tlTerm.
Note that after importing XML, there would usually be no TshwaneLex/tlTerm styles, thus all imported entries will usually be displayed in a default text style in black on a white background. You can use the "Format/Styles" menu option as usual to add styles once you are satisfied with the import.
TshwaneLex has one or two basic 'expectations' of how the data should be structured in order to import the data in a meaningful way (i.e. in a way that allows TshwaneLex to 'understand' what some of the key fields are, such as the headword). The following is an example of roughly the simplest XML document that can be thrown at the importer:
<Dictionary>
  <Language>
    <Entry LemmaSign="cow">
    </Entry>
  </Language>
</Dictionary>
Note that the element for a 'dictionary entry' appears at the third depth level in the document, and should contain an attribute called "LemmaSign" that contains the headword; this allows TshwaneLex to recognise which attribute it should use as the headword for purposes of sorting, indexing in the Lemma List, and so on. (If the headword is in a different element or attribute, it will still be imported - TshwaneLex will just not 'know' to use that field for the Lemma List and so on.)
The names of the elements above ("Dictionary", "Language" and "Entry") can be anything, although their structure is important (i.e. the second-level element represents each 'section' or 'side' of a dictionary within TshwaneLex, and the third level represents the list of entries within that section).
Note that you do not necessarily need a DTD attached to the data - if importing XML data with no DTD, TshwaneLex will attempt to construct a DTD based on the elements/attributes it encounters. For well-structured data, this can work well.
Here is a slightly more complex example (note "TE" stands for "Translation Equivalent" for, in this case, bilingual English - Afrikaans data):
<Dictionary>
  <Language>
    <Entry LemmaSign="dog">
      <Plural>dogs</Plural>
      <Sense>
        <TE TE="hond" />
        <Definition Definition="A domestic mammal that barks" />
      </Sense>
    </Entry>
    <Entry LemmaSign="cat">
      <Sense>
        <TE TE="kat" />
      </Sense>
    </Entry>
  </Language>
</Dictionary>
The entries do not need to be correctly sorted within the XML (e.g. "dog" then "cat" above) - TshwaneLex will automatically re-sort them according to the default configured 'sort method' (which can also be changed at any time later on).
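If the source data lives in a spreadsheet or another database, XML in the shape described above can be generated with a short script. The following Python sketch uses the standard xml.etree.ElementTree module; the input list and the build_dictionary helper are invented for the example, and the element names simply follow the samples above (they could be anything, as long as entries sit at the third level and carry a "LemmaSign" attribute).

```python
import xml.etree.ElementTree as ET

# Invented source data: headword plus translation equivalents.
entries = [
    {"lemma": "dog", "te": ["hond"]},
    {"lemma": "cat", "te": ["kat"]},
]


def build_dictionary(items):
    """Build Dictionary > Language > Entry, with the headword in LemmaSign."""
    root = ET.Element("Dictionary")
    language = ET.SubElement(root, "Language")
    for item in items:
        entry = ET.SubElement(language, "Entry", LemmaSign=item["lemma"])
        sense = ET.SubElement(entry, "Sense")
        for te in item["te"]:
            ET.SubElement(sense, "TE", TE=te)
    return root


root = build_dictionary(entries)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

ElementTree takes care of escaping special characters such as "&" and "<" in attribute values, which is a common source of import errors in hand-written XML.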
'Merge' XML Import
If you want to import entries into an existing database, the most important thing is to 'tell' the importer which 'side' (section/language) of a dictionary to import sets of entries into. This is done by filling in the language "Name" attribute with the exact same name configured for a language side/section under "Dictionary/Properties". This is shown in the following example:
<Dictionary>
  <Language Name="English">
    <Entry LemmaSign="dog">
      <Sense><TE TE="inja" /></Sense>
    </Entry>
  </Language>
  <Language Name="Zulu">
    <Entry LemmaSign="inja">
      <Sense><TE TE="dog" /></Sense>
    </Entry>
  </Language>
</Dictionary>
To do the 'merge' import, one then just selects "File/Import/XML" while the desired database is open.
NB: It is a good idea to always do a 'File/Create a backup' before doing a 'merge import'.
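Since the merge relies on the "Name" values matching exactly, it can be worth checking the XML before importing. This Python sketch is an optional pre-check outside the application; the unknown_sections helper is made up, and the set of configured section names has to be typed in by hand from "Dictionary/Properties".

```python
import xml.etree.ElementTree as ET


def unknown_sections(xml_text, configured_sections):
    """Return second-level "Name" values that match no configured section."""
    root = ET.fromstring(xml_text)
    return [
        section.get("Name")
        for section in root
        if section.get("Name") not in configured_sections
    ]


merge_xml = """
<Dictionary>
  <Language Name="English">
    <Entry LemmaSign="dog"><Sense><TE TE="inja" /></Sense></Entry>
  </Language>
  <Language Name="Zulu">
    <Entry LemmaSign="inja"><Sense><TE TE="dog" /></Sense></Entry>
  </Language>
</Dictionary>
"""

print(unknown_sections(merge_xml, {"English", "Zulu"}))      # []
print(unknown_sections(merge_xml, {"English", "isiZulu"}))   # ['Zulu']
```

A non-empty result means at least one set of entries would not be matched to an existing side of the dictionary by name.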
Click on the "Pronunciation" link near the bottom of the TshwaneLex page to hear an audio recording. (Note how the "sh" is not pronounced as in English, but more resembling an ordinary English "s".)
'Tshwane' is the African name for the city in which the company and its software were originally created, namely Pretoria, while 'Lex' refers to 'lexicon' (or alternatively 'lexicography'). For further background, see the Wikipedia entries on Tshwane and Pretoria.
TLex/tlTerm/tlDatabase Best Practices
• A common mistake is to export your data to an 'output' format such as Microsoft Word, and then start editing the data externally, e.g. within Microsoft Word. Do not do work on your data in Microsoft Word. If you are proofreading data that has been exported to MS Word (or any other printable / formatted output, such as InDesign), only mark the required changes, and implement those changes in TLex/tlTerm/tlDatabase directly. You can then re-export to MS Word easily. The latest version of your data should always be in TLex/tlTerm/tlDatabase. In general, it is easy to go from TLex/tlTerm/tlDatabase to MS Word / InDesign etc., but is very difficult to take any data or changes from MS Word back into TLex/tlTerm/tlDatabase. Because Word is an unstructured editing format and the tlDatabase product family are structured editing formats, there is no easy, generic way to get work done in Word into TLex/tlTerm/tlDatabase. (If you have made a mistake like this, contact us for possible assistance with data conversion to get the data back into TLex and/or XML - we are very experienced in dealing with these kinds of problems.)
• 'Ordinary' users / lexicographers / terminology practitioners / data enterers should be restricted from being allowed to edit the DTD using the User Management system; changes to the DTD should be coordinated and centralised through managers and/or IT administrators who are technically clued up enough to solve problems using the DTD in the correct way. Users who don't fully understand the DTD often create bad 'ad hoc' solutions to specific problems; these can cause many other problems down the line.
• For projects involving more than one person, enable User Management and give each user their own logon to the database - there are numerous significant advantages to using this.
• Keep your software up to date! Use the "Help/Check for updates" menu option to check if there are updates available.
• Always follow sensible backup procedures; see the User Guide for more guidance on this topic.