patgen

NAME

patgen - generate patterns for TeX hyphenation

SYNOPSIS

patgen dictionary_file pattern_file patout_file translate_file

DESCRIPTION

This manual page is not meant to be exhaustive. The complete documentation for this version of TeX can be found in the info file or manual Web2C: A TeX implementation.

The patgen program reads the dictionary_file, which contains a list of hyphenated words, and the pattern_file, which contains previously generated patterns (if any) for a particular language, and produces the patout_file with the previously and newly generated hyphenation patterns for that language. The translate_file defines language-specific values for the parameters left_hyphen_min and right_hyphen_min used by TeX's hyphenation algorithm, and the external representation of the lower- and upper-case version(s) of all `letters' of that language. Further details of the pattern generation process, such as hyphenation levels and pattern lengths, are requested interactively from the user's terminal. Optionally, patgen creates a new dictionary file pattmp.n showing the good and bad hyphens found by the generated patterns, where n is the highest hyphenation level.

The patterns generated by patgen can be read by initex for use in hyphenating words. For a (very) long example of patgen's output, see $TEXMFMAIN/tex/generic/hyphen/hyphen.tex, which contains the patterns TeX uses for English. At some sites, patterns for several other languages may be available, and the local tex programs may have them preloaded; consult your Local Guide or your system administrator for details.

All filenames must be complete; no adding of default extensions or path searching is done.

FILE FORMATS

Letters

When initex digests hyphenation patterns, TeX first expands macros, and the result must consist entirely of digits (hyphenation levels), dots (`.', edge of a word), and letters. In pattern files for non-English languages, letters are often represented by macros or other expandable constructs. For the purposes of patgen these are just character sequences, subject to the condition that no such sequence is a prefix of another one.
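The prefix condition can be checked mechanically before running patgen. The following is a minimal Python sketch, not part of patgen itself; the function name find_prefix_conflicts and the sample representations are hypothetical.

```python
def find_prefix_conflicts(representations):
    """Return (shorter, longer) pairs where one sequence is a prefix of another."""
    ordered = sorted(set(representations))
    conflicts = []
    for i, shorter in enumerate(ordered):
        # In sorted order, every string that starts with `shorter`
        # follows it directly, so the scan can stop at the first miss.
        for longer in ordered[i + 1:]:
            if longer.startswith(shorter):
                conflicts.append((shorter, longer))
            else:
                break
    return conflicts
```

For example, find_prefix_conflicts(["a", "\\s", "\\ss"]) reports the pair ("\\s", "\\ss"), while a set such as ["a", "b", "\\ss"] yields an empty list.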

Dictionary file

A dictionary file contains a weighted list of hyphenated words, one word per line starting in column 1. A digit in column 1 indicates a global word weight (initially =1) applicable to all following words up to the next global word weight. A digit at some intercharacter position indicates a weight for that position only. The hyphens in a word are indicated by `-', `*', or `.' (or their replacements as defined in the translate file) for hyphens yet to be found, `good' hyphens (correctly found by the patterns), and `bad' hyphens (erroneously found by the patterns) respectively; when reading a dictionary file `*' is treated like `-' and `.' is ignored.
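As an illustration of these reading rules, here is a rough Python sketch that parses a single dictionary line. It is not patgen's own code. It assumes the default hyphen characters, single-character letters, and that the word continues on the same line after a leading weight digit; the name parse_dictionary_line is hypothetical.

```python
def parse_dictionary_line(line, global_weight=1):
    """Parse one dictionary line under the assumptions stated above.

    Returns (word, hyphen_positions, position_weights, global_weight).
    A position is the number of letters to the left of the gap.
    """
    letters = []
    hyphens = set()
    weights = {}
    for col, ch in enumerate(line.rstrip("\n")):
        pos = len(letters)
        if ch in "0123456789":
            if col == 0:
                global_weight = int(ch)   # digit in column 1: new global weight
            else:
                weights[pos] = int(ch)    # weight for this position only
        elif ch in "-*":
            hyphens.add(pos)              # `*' is treated like `-'
        elif ch == ".":
            continue                      # `.' (bad hyphen) is ignored
        else:
            letters.append(ch)
    return "".join(letters), sorted(hyphens), weights, global_weight
```

The returned global weight would be passed back in when parsing the next line, so that it stays in force until another digit appears in column 1.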

Translate file

A translate file starts with a line containing the values of left_hyphen_min in columns 1-2, right_hyphen_min in columns 3-4, and either a blank or the replacement for one of the "hyphen" characters `-', `*', and `.' in columns 5, 6, and 7. (Input lines are padded with blanks as for many related programs.) Each following line defines one `letter': an arbitrary delimiter character in column 1, followed by one or more external representations of that character (first the `lower' case one used for output), each one terminated by the delimiter and the whole sequence terminated by another delimiter. If the translate file is empty, the values left_hyphen_min=2, right_hyphen_min=3, and the 26 lower case letters a ... z with their upper case representations A ... Z are assumed.
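The layout described above can be sketched in Python as follows. This is an illustrative reading of the format, not patgen's implementation: it assumes both numeric fields on the first line are filled in, and the function name parse_translate and the returned dictionary keys are hypothetical.

```python
import string


def parse_translate(text):
    """Parse the contents of a translate file given as one string."""
    lines = text.splitlines()
    if not lines:
        # Empty file: the documented defaults apply.
        return {
            "left_hyphen_min": 2,
            "right_hyphen_min": 3,
            "hyphens": "-*.",
            "letters": [[c, c.upper()] for c in string.ascii_lowercase],
        }
    head = lines[0].ljust(7)              # pad with blanks to 7 columns
    hyphens = "".join(
        default if repl == " " else repl  # blank keeps the default character
        for repl, default in zip(head[4:7], "-*.")
    )
    letters = []
    for line in lines[1:]:
        if not line:
            continue
        delim = line[0]                   # arbitrary delimiter in column 1
        reps = []
        for rep in line[1:].split(delim):
            if rep == "":
                break                     # doubled delimiter ends the sequence
            reps.append(rep)
        if reps:
            letters.append(reps)          # first entry is the `lower' case form
    return {
        "left_hyphen_min": int(head[0:2]),
        "right_hyphen_min": int(head[2:4]),
        "hyphens": hyphens,
        "letters": letters,
    }
```

Because a line that ends early behaves as if padded with blanks, a letter line using a blank as its delimiter, such as " a A", needs no explicit terminator in this sketch.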

Terminal input

After reading the translate_file and any previously generated patterns from pattern_file, patgen requests input from the user's terminal. First, it asks for the integer values of hyph_start and hyph_finish, the lowest and highest hyphenation levels for which patterns are to be generated. The value of hyph_start should be larger than any hyphenation level already present in pattern_file. Then, for each hyphenation level, it asks for the integer values of pat_start and pat_finish, the smallest and largest pattern lengths to be analyzed, as well as good weight, bad weight, and threshold, the weights for good and bad hyphens and a weight threshold for useful patterns. Finally, it asks for the decision (`y' or `Y' vs. anything else) whether or not to produce a hyphenated word list.
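For unattended runs, the answers can be prepared in advance in the order given above. The Python sketch below only builds the text; it assumes that values are separated by blanks and that each request is answered on its own line, which this page does not state. The function name build_patgen_answers is hypothetical.

```python
def build_patgen_answers(hyph_start, hyph_finish, levels, make_word_list):
    """Build terminal answers for one patgen run.

    levels holds one (pat_start, pat_finish, good_weight, bad_weight,
    threshold) tuple per hyphenation level, lowest level first.
    """
    expected = hyph_finish - hyph_start + 1
    if len(levels) != expected:
        raise ValueError(f"need {expected} level entries, got {len(levels)}")
    lines = [f"{hyph_start} {hyph_finish}"]
    for pat_start, pat_finish, good, bad, threshold in levels:
        lines.append(f"{pat_start} {pat_finish}")
        lines.append(f"{good} {bad} {threshold}")
    # `y' requests the hyphenated word list; anything else declines it.
    lines.append("y" if make_word_list else "n")
    return "\n".join(lines) + "\n"
```

The resulting string could then be supplied on standard input, for instance through subprocess.run with the four file names from the SYNOPSIS and the input= argument, provided the local patgen reads its answers that way.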

FILES

$TEXMFMAIN/tex/generic/hyphen/hyphen.tex

Patterns for English.

AUTHORS

Frank Liang wrote the first version of this program. Peter Breitenlohner made a substantial revision in 1991 for TeX 3. The first version was published as the appendix to the TeXware technical report, available from the TeX Users Group. Howard Trickey originally ported it to Unix.
