Modul:mlawc
Self-tests are on the documentation subpage.
--[===[
MODULE "MLAWC" (language and word class)
"eo.wiktionary.org/wiki/Modulo:mlawc" <!--2024-Oct-09-->
"id.wiktionary.org/wiki/Modul:mlawc"
Purpose: shows the lemma in bold text and brews 2 or 3
tooltip texts and 3 or 5 locally invisible category includes
from langcode and 1 or 2 word class codes,
creates 2...5 invisible "anchors" for linking to section,
optionally splits a multiword lemma into links to the parts,
optionally categorizes the parts
Utilo: montras kapvorton per grasa tiparfasono kaj generas 2 aux 3
musumkonsilajn tekstojn kaj 3 aux 5 loke nevideblajn kategorienmetojn
el lingva kodo kaj 1 aux 2 vortospecaj kodoj,
kreas 2...5 nevideblajn "ankerojn" por ligado al sekcio,
opcie disigas plurvortan kapvorton al ligiloj al la partoj,
opcie kategoriigas la partojn
Manfaat: memperlihatkan lema dengan teks tebal dan membuat 2 atau 3
teks tooltip dan 3 atau 5 masukan kategori tak terlihat secara
setempat dari kode bahasa dan 1 atau 2 kode kelas kata,
membuat 2...5 jangkar yang tidak terlihat untuk pranala ke bagian,
juga bisa memotong lema beberapa kata menjadi pranala ke bagiannya,
juga memungkinkan mengategorikan semua bagian ini
Syfte: visar uppslagsordet med fet stil och skapar 2 eller 3
tooltiptexter och 3 eller 5 lokalt osynliga kategoriinlaeggningar
fraan spraakkoden och 1 eller 2 ...
Used by templates / Uzata far sxablonoj /
Digunakan oleh templat / Anvaent av mallar:
* livs (EO) , bakk (ID)
Required submodules / Bezonataj submoduloj /
Submodul yang diperlukan / Behoevda submoduler:
* "loaddata-tbllingvoj" T78 in turn requiring template "tbllingvoj" (EO)
* "loaddata-tblbahasa" T78 in turn requiring template "tblbahasa" (ID)
* "msplitter"
This module accepts parameters sent either to itself (own frame) or
to the caller (caller's frame). If the own frame carries the parameter
"caller=true" then the own frame is discarded in favor of the
caller's one. Empty parameters and parameters longer than 120
octets are inherently invalid (#E09); further checks follow.
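The frame selection and the early #E09 check described above can be sketched as follows. This is a minimal sketch, not the code of this module: the frame is a hypothetical plain table with "args" and an optional "parent" (in Scribunto the parent would come from frame:getParent()), and the names "pick_args" and "params_look_sane" are invented for illustration.

```lua
local MAX_OCTETS = 120

-- Pick the argument table: the caller's one if "caller=true" sits on the own frame.
-- "frame" is a hypothetical stand-in table, not a real Scribunto frame object.
local function pick_args(frame)
  if frame.args["caller"] == "true" and frame.parent then
    return frame.parent.args
  end
  return frame.args
end

-- Early check: every parameter must be a non-empty string of at most 120 octets.
-- In Lua the "#" operator counts octets, not characters.
local function params_look_sane(args)
  for _, value in pairs(args) do
    if type(value) ~= "string" or value == "" or #value > MAX_OCTETS then
      return false
    end
  end
  return true
end
```

The sketch deliberately leaves out the count of anonymous parameters and the "holes" check.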
Incoming: - 2 anonymous obligatory parameters (one of them
can be "??" but NOT both)
- langcode (2 or 3 lowercase letters, use "??" if unknown)
- word class code (2 UPPERCASE letters, use "??" if unknown)
or 2 word class codes (4 UPPERCASE letters, no "??" then)
- 1 or 4 named optional parameters (depends on semi-hardcoded
configuration)
- "dst=" (2...40 octets) distinction hint for word class, for
example "koleg+o", "kol+eg+o", "fleksia bana+n", "baza bana",
"en", "ett" (all brackets prohibited, apostrophe "'" prohibited, OTOH
plus "+" permitted and recommended), not shown but built
into the "anchor" (related to word class, not to compound split)
- "fra=" (1...120 octet:s) split control string, pagename
is always used as lemma, if it is multiword then it is split
automatically, this parameter can request assisted automatic
split or manual split or no split, can generate #E23 #E14 #E16
if faulty, this parameter is NOT supported (thus fully ignored)
if splitting or even showing the lemma is deactivated in the
source code, see below and "spec-splitter-en.txt" for details
- "ext=" extra parameter for additional compound categories,
can contain 1...4 fragments of type F210 only (":" or "!", no
"L"), see "spec-splitter-en.txt" for details, or "&"-syntax
- "scr=" script code, one uppercase letter, copied to cat
name as-is, bypasses the splitter and is added to its output
- 3 hidden parameters
- "pagenameoverridetestonly=" can cause #E01
- "nocat=" no error possible !!!FIXME!!! remove "nocat" in favor of "pate"
- "detrc=" no error possible
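The shape rules for the 2 anonymous parameters listed above can be sketched as below. This covers only the shape (letter count and case) and the "not both unknown" rule; the module evidently checks more than that (the self-tests reject some well-shaped codes). All function names are hypothetical.

```lua
-- Langcode shape: 2 or 3 lowercase ASCII letters, or "??" if unknown.
local function is_langcode_shape(code)
  return code == "??" or code:match("^[a-z][a-z][a-z]?$") ~= nil
end

-- Word class shape: "??", or 2 uppercase letters, or 4 uppercase letters
-- (two classes, no "??" then). Returns a list of codes, or nil if the shape is wrong.
local function split_wordclass(code)
  if code == "??" then return { "??" } end
  if code:match("^[A-Z][A-Z]$") then return { code } end
  if code:match("^[A-Z][A-Z][A-Z][A-Z]$") then
    return { code:sub(1, 2), code:sub(3, 4) }
  end
  return nil
end

-- One of the two may be "??" but NOT both.
local function first_two_params_ok(lang, wc)
  if lang == "??" and wc == "??" then return false end
  return is_langcode_shape(lang) and split_wordclass(wc) ~= nil
end
```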
Returned: * one string intended to be shown alone on a line below the
h3-heading, consisting of:
* the word in bold and enclosed in <bdi>...</bdi>
* space
* short summary with word classes
* 1 word class (example: "( sv , VE )") with 2 tooltips (example:
"Bahasa: Swedia (svenska)" and "Kelas kata: verba (kata
kerja)") and 2 invisible anchors and 3 base categories
or
* 2 word classes (example: "( sv , VE , GR )") with 3 tooltips
and 3 invisible anchors and 5 categories
* up to 18 optional compound categories
This module is unbreakable (when called with correct module name
and function name). Every imaginable input from the caller and
from the imported modules will output either a useful result or
at least a helpful error string.
Cxi tiu modulo estas nerompebla (kiam vokita kun gxustaj nomo de modulo
kaj nomo de funkcio). Cxiu imagebla enigo de la vokanto kaj
de la importataj moduloj eldonos aux utilan rezulton aux
almenaux helpeman eraranoncan signocxenon.
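One common way to keep a module "unbreakable" against failing imports is to wrap every load in "pcall". The following is a minimal sketch under that assumption and does not claim to be how this module does it; "safe_load" is a hypothetical name, and mapping every failure to "#E02" is a simplification.

```lua
-- Load a submodule without letting a failure propagate to the caller.
-- "loader" is any function taking a name that may raise an error,
-- for example "require" (on a wiki, "mw.loadData" would be a typical choice).
local function safe_load(loader, name)
  local ok, result = pcall(loader, name)
  if ok and type(result) == "table" then
    return result, nil
  end
  return nil, "#E02"  -- assumed mapping: any failure becomes one error code
end
```

A caller would then turn the returned error code into the error string instead of crashing.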
Following errors are possible:
* <<#E01 Internal error in module "mlawc">>
Possible causes:
* strings not uncommented
* function "mw.title.getCurrentTitle().text" AKA "{{PAGENAME}}" failed
* pagename is invalid such as empty or too long or contains
invalid brackets []{} or more than one consecutive apo, even if
coming from "pagenameoverridetestonly="
* <<#E09 Erara uzo de sxablono "livs", legu gxian dokumentajxon>>
Possible causes (early detected obvious problems with parameters):
* less than 2 or more than 3 parameters, or holes
* empty parameters or parameters longer than 120 octet:s
* <<#E02 Malica eraro en subprogramaro uzata far sxablono "livs">> !!!FIXME!!!
Possible causes:
* submodule not found
* submodule caused unspecified failure
* the 2 required columns /c0/ and /c2/ are missing or invalid
("-" is tolerable in /c2/ but NOT in /c0/)
* <<#E03 Nombrigita eraro en subprogramaro uzata far sxablono "livs">> !!!FIXME!!!
Possible causes:
* submodule failed and returned valid error code
* <<#E04>>
* <<#E05>>
* <<#E11 Evidente nevalida lingvokodo en sxablono "livs">>
* <<#E12 Nekonata lingvokodo en sxablono "livs">>
* <<#E13 Erara uzo de sxablono "livs" pro vortospeco>>
Possible causes (later detected more clandestine problems with parameters):
* invalid word class code
* "??" used inside 4-char string
* both langcode and word class given as "??"
- <<#E14 Erara uzo de sxablono "livs" pro pagxonomo por "$S" "$H">>
- "$S" used with wrong pagename (must end with "a"..."z")
- "$H" used with wrong pagename (must not contain spaces etc.)
- <<#E16 Erara uzo de sxablono "livs" pro "sumkontrolo">>
Possible causes:
- "sum check" failure with manual split
- <<#E19 Erara uzo de sxablono "livs" pro "dst=" distingo>>
Possible causes (later detected more clandestine problems with parameters):
- distinction hint parameter is faulty
- <<#E20 Erara uzo de sxablono "livs" pro "ext=" kroma parametro>>
Possible causes (later detected more clandestine problems with parameters):
- extra parameter is faulty
- <<#E21 Erara uzo de sxablono "livs" pro "scr=" skriba parametro>>
Possible causes (later detected more clandestine problems with parameters):
- script parameter is faulty (not one uppercase letter)
* <<#E23 Erara uzo de sxablono "livs" pro "fra=" disiga parametro>>
Possible causes (later detected more clandestine problems with parameters):
- split control parameter is faulty (assisted or manual, excluding
the "sum check", see below and spec)
The 26 word classes are:
Main big classes (3):
- SB noun - substantivo (O-vorto) - nomina (kata benda)
- VE verb - verbo (I-vorto) - verba (kata kerja)
- AJ adjective - adjektivo (A-vorto) - adjektiva (kata sifat)
Further smaller classes (12):
- PN pronoun - pronomo - pronomina (kata pengganti)
- NV numeral - numeralo (nombrovorto) - numeralia (kata bilangan)
- AV adverb - adverbo (E-vorto) - adverbia (kata keterangan)
- PV verb particle (EN,SV) - verbpartiklo - partikel verba
- QV question word - demandvorto - kata tanya
- KJ coordinator - konjunkcio - konjungsi
- SJ subordinator - subjunkcio (subfrazenkondukilo) - subjungsi (pengaju klausa terikat)
- PP preposition - prepozicio (antauxlokigita rolvorteto) - preposisi (kata depan)
- PO postposition (EN,SV) - postpozicio - postposisi (kata belakang)
- PC circumposition (SV) - cirkumpozicio - sirkumposisi
- AR article (EN,EO,SV) - artikolo - artikel (kata sandang)
- IN interjection - interjekcio - interjeksi
Nonstandalone elements (5):
- PF prefix - prefikso - prefiks (awalan)
- UF suffix - sufikso (postfikso, finajxo) - sufiks (akhiran)
- KF circumfix - cirkumfikso (konfikso) - sirkumfiks (konfiks)
- IF infix - infikso - infiks (sisipan)
- NR nonstandalone root - nememstara radiko - akar kata terikat (prakategorial)
Misc (2):
- KA sentence - frazo - kalimat
- KK character - signo - karakter
Additional classes (4) :
- KU abbreviation - mallongigo (kurtigo) - singkatan (abreviasi)
- GR group of words - vortgrupo - kumpulan kata
- PA participle - participo - partisip
- TV table word - tabelvorto - kata tabel
Class "NR" is exclusive and may NOT be combined with anything else (violation
gives #E13). It affects the "$S" simple bare root split.
Class "KA" is almost exclusive and may NOT be combined with anything other
than "KU" (violation gives #E13). It is also special in that it affects
morpheme cat:s (changes them from "vortgrupo" to "frazo") if they are enabled.
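The two exclusivity rules above can be sketched as a small check. This is only an illustration: "combination_ok" is a hypothetical name, and accepting "KU" on either side of "KA" is an assumption (the text does not say whether the order matters).

```lua
-- classes: list of 1 or 2 two-letter word class codes, e.g. { "KA", "KU" }.
-- Returns false where the rules above would give #E13.
local function combination_ok(classes)
  if #classes < 2 then return true end
  local a, b = classes[1], classes[2]
  if a == "NR" or b == "NR" then return false end  -- "NR" is exclusive
  if a == "KA" then return b == "KU" end           -- "KA" only together with "KU"
  if b == "KA" then return a == "KU" end
  return true
end
```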
Here we do NOT care about the "base word" property, it is categorized by
module "tagg" / "k" instead. Similarly we do not care about "kofrovorto",
"blandajxo", "derivajxo de tabelvorto" here. And we do NOT care about
"Proverbo" (subclass of KA) and "Esprimo" (subclass of GR) either.
We theoretically could autodetect the word classes KA and GR but do not. The
chief trouble with autodetecting KA is multiword abbreviations
beginning with an uppercase letter and ending with a dot; GR is probably
less problematic. Still, both would raise several questions:
* how to override or suppress autodetection
* how many word classes are permitted at the same time given that an
additional one can be autodetected
List of 6+1+1+1 selectable morpheme types:
C circumfix cirkumfikso
I infix infikso (EO: -o- -et- -il- ...)
M standalone root memstara radiko (EO: tri dek post ...)
N nonstandalone root nememstara radiko (EO: fer voj ...)
P prefix prefikso
U suffix sufikso (postfikso, finajxo, EO: -a -j -n)
-------
W word vorto
-------
L same as "N" but changes linking behavior (only in F210)
-------
X only after "&" in the extra parameter (convert it to 1 or 2 fragments)
Note that 5 of those 9 are also word classes, but "M" and "W" aren't
and reasonably shouldn't be.
These mortyp:s can be used in two places: in the split control parameter
with manual split (before the colon ":"), and in the extra parameter, where
"L" is prohibited (thus C I M N P U W are left, plus maybe X), either after "&"
or in fragments before ":" or "!" (see "spec-splitter-en.txt" for syntax details).
We put only the letter symbol into the category name (except for the type
word) as the name would otherwise become unreasonably long. The category
name must contain 3 pieces of information:
- language (consider "-an" in SV and ID)
- "mortyp" (consider "-an" and "an-" and "an" in SV)
- the morpheme / affix / word itself
Categories:
There are obligatory base categories constructed from language and word class
(3 or 5), and optional compound (morpheme) categories (1...18) that can arise
from the fragments generated by the splitter if requested so. Structure of
categories from both those groups is defined by "contabkatoj" (see submodule).
EO:
Kategorio:Kapvorto (angla) Kategorio:Kapvorto (Esperanto)
Kategorio:Verbo Kategorio:Verbo
Kategorio:Verbo (angla) Kategorio:Verbo (Esperanto)
ID:
Kategori:Kata bahasa Indonesia
Kategori:Nomina
Kategori:id:Nomina
Notes: - we auto-remove the part of word class in brackets and auto-adjust
the letter case, thus "adverbo (E-vorto)" becomes "Adverbo"
or "nomina (kata benda)" becomes "Nomina"
- "angla" is lowercase when in brackets, but begins uppercase when
separate (pagename in category namespace), we can auto-adjust
the letter case as needed
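The auto-adjustment from the notes above can be sketched as below. Assumptions: the function names are invented, the case change is ASCII-only (a wiki module would more likely use "mw.ustring" for non-ASCII letters), and the language name is passed in its in-bracket form ("angla", "Esperanto").

```lua
-- Drop a trailing bracketed part: "adverbo (E-vorto)" -> "adverbo".
local function strip_brackets(name)
  return (name:gsub("%s*%b()%s*$", ""))
end

-- Uppercase the first letter (ASCII only).
local function ucfirst_ascii(s)
  return s:sub(1, 1):upper() .. s:sub(2)
end

-- "nomina (kata benda)" -> "Nomina"
local function wordclass_cat(class_name)
  return ucfirst_ascii(strip_brackets(class_name))
end

-- EO-style combined name: "Verbo (angla)"
local function wordclass_lang_cat(class_name, lang_name)
  return wordclass_cat(class_name) .. " (" .. lang_name .. ")"
end
```

For a standalone language page in the category namespace, "ucfirst_ascii" would be applied to the language name itself ("angla" becomes "Angla").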
Anchors:
* Qsekt-en (lang only)
* Qsekt-en-SB (lang and word class) (2 such created if 2 word classes)
* Qsekt-sv-SB-ett (lang and word class and hint) (2 such created
if 2 word classes)
With 1 word class we brew 2 or 3 anchors.
With 2 word classes we brew 3 or 5 anchors.
With the hint provided we brew each word class anchor both without and with it built in.
There are 2 ways to brew "anchors" in HTML:
* <span id="tujuh"></span> HTML5 and works from wikitext, used here
* <a name="tujuh"></a> HTML2 but does NOT work from wikitext, shown
as plain text
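The anchor counts above (2 or 3 with 1 word class, 3 or 5 with 2) follow from a construction like the one below. This is a sketch only: the function names and the output order are assumptions, and the hint is inserted verbatim, which the module may or may not do.

```lua
-- lang: langcode, classes: list of 1 or 2 word class codes, hint: string or nil.
local function build_anchor_ids(lang, classes, hint)
  local ids = { "Qsekt-" .. lang }                      -- lang only
  for _, wc in ipairs(classes) do
    ids[#ids + 1] = "Qsekt-" .. lang .. "-" .. wc       -- lang and word class
  end
  if hint then
    for _, wc in ipairs(classes) do
      ids[#ids + 1] = "Qsekt-" .. lang .. "-" .. wc .. "-" .. hint
    end
  end
  return ids
end

-- Emit the HTML5 form that works from wikitext.
local function anchors_html(ids)
  local parts = {}
  for i, id in ipairs(ids) do
    parts[i] = '<span id="' .. id .. '"></span>'
  end
  return table.concat(parts)
end
```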
Semi-hardcoded configuration in the source:
* "constrmainctl" type string 2 digits :
* show image (0 or 1) the image is in "contabscrmisc[1]"
* show lemma (0 none -- 1 raw -- 2 maybe split --
3 maybe split, morpheme cat insertions)
* "conbookodlng"
* "conboomiddig"
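Reading the 2-digit control string can be sketched as below. Assumptions: the digits come in the order listed above (image first, lemma second), and the function and field names are invented for illustration.

```lua
-- Parse a string like "13" into separate settings, nil if malformed.
local function parse_mainctl(ctl)
  if type(ctl) ~= "string" then return nil end
  local img, lemma = ctl:match("^([01])([0-3])$")
  if not img then return nil end
  local mode = tonumber(lemma)
  return {
    show_image = (img == "1"),
    lemma_mode = mode,               -- 0 none, 1 raw, 2 maybe split, 3 split and cats
    splitter_active = (mode >= 2),   -- modes 0 and 1 never call the splitter
    morpheme_cats = (mode == 3),
  }
end
```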
The splitter (see "spec-splitter-en.txt" for syntax details):
The base split strategies available (selectable with
the "fra=" split control parameter, var "numsplit") are:
- #S0 automatic multiword split (default if splitter active)
- #S1 assisted split
- #S2 manual split
- #S3 simple root split
- #S4 simple bare root
- #S5 large letter split
- #S6 reserved
- #S7 no split (only choice if splitter inactive)
It is possible (semi-hardcoded configuration in the source code of this
"mlawc") to deactivate only the compound categories, or to deactivate the
splitter so that the raw lemma is shown without linking, or to deactivate
showing the lemma altogether. In both latter cases the splitter is inactive
and the submodule "msplitter" is not called at all. Some or all of the
parameters "fra=" "ext=" "scr=" are NOT supported then (thus fully ignored,
no error can arise from them).
The "fra=" split control parameter is subject to strict prevalidation (unless
the splitter is inactive) and can generate #E23. For manual split the
prevalidation includes the "sum check" against the pagename that can give
#E16. Later when the split is carried out no error can occur anymore, possible
problems (with assisted split) are safely ignored instead.
The automatic multiword splitter ("numsplit" = 0 and "lfsplitaa" "msplitter"
"qsplitter") is fully automatic and the 2 tables "tabblock" and "tablinx" must
be empty then. No error can occur here, but there is a risk that no split
boundaries can be applied, in which case the output is identical to the input.
The assisted splitter ("numsplit" = 1 and "lfsplitaa" "qsplitter")
is controlled by 2 prevalidated tables generated from the "fra=" parameter.
* Table "tabblock" contains up to 16 values indexed by integers 0 to 15,
value type string "1" means do block, type "nil" means do not
block (the default). Other values should not occur and evaluate to
do not block like "nil" does.
* Table "tablinx" contains up to 16 values indexed by integers 0 to 15, value:
* type string:
* "N" or "I" or "A" (as described in "spec-splitter-en.txt")
* colon ":" followed by the link target (length 1...40 octet:s NOT
checked anymore here)
Beginning char other than "N" or "I" or "A" or ":" should not
occur and evaluates to do nothing unusual like "nil" does.
* type "nil" means do nothing unusual (the default)
No error can occur in the assisted splitter, but there is a risk
that no split boundaries can be applied, in which case the output is
identical to the input.
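How the 2 tables could be filled from an assisted "fra=" string is sketched below. The syntax here is guessed from the self-test calls only ("%" plus 1...8 strictly ascending uppercase hex digits for blocking, "#" plus one hex digit plus "N" / "I" / "A" or ":" and a target, items separated by spaces); the binding rules are in "spec-splitter-en.txt" and may differ. "parse_assisted" is a hypothetical name.

```lua
-- Returns tabblock, tablinx (both indexed 0...15), or nil if the string is faulty.
local function parse_assisted(fra)
  local tabblock, tablinx = {}, {}
  for item in fra:gmatch("%S+") do
    local kind, rest = item:sub(1, 1), item:sub(2)
    if kind == "%" then
      -- boundaries to block: hex digits, strictly ascending, at most 8 (guessed limit)
      if not rest:match("^[0-9A-F]+$") or #rest > 8 then return nil end
      local prev = -1
      for digit in rest:gmatch(".") do
        local idx = tonumber(digit, 16)
        if idx <= prev then return nil end
        tabblock[idx] = "1"
        prev = idx
      end
    elseif kind == "#" then
      -- fragment index followed by an action letter or ":" and a link target
      local digit, action = rest:match("^([0-9A-F])(.+)$")
      if not digit then return nil end
      local is_letter = (action == "N" or action == "I" or action == "A")
      local is_target = (action:sub(1, 1) == ":" and #action >= 2 and #action - 1 <= 40)
      if not (is_letter or is_target) then return nil end
      tablinx[tonumber(digit, 16)] = action
    else
      return nil
    end
  end
  return tabblock, tablinx
end
```

Indices that have no boundary or fragment to act on are simply stored here; ignoring them is left to the split itself, as described above.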
The manual splitter ("numsplit" = 2 and "lfsplitmn" "qsplitter") is controlled
by one prevalidated table generated from the "fra=" parameter, the pagename
does not even enter the split process.
* Table "tabmnfragments" contains 1 to 16 strings indexed by integers 0 to 15,
one string for every fragment. The 5 legal types are:
* F000 : no brackets, no colon, no slash (visible text no link)
* F200 : 2 brackets, no colon, no slash (combo target visible text)
* F201 : 2 brackets, no colon, 1 slash (target / visible text)
* F210 : 2 brackets, 1 colon, no slash (mortyp : combo target visible text)
* F211 : 2 brackets, 1 colon, 1 slash (mortyp : target / visible text)
No error can occur in the manual splitter and no failure due to
lack of boundaries either, the "sum check" is part of the prevalidation.
Note that we use slashes and single square brackets "+[I:bug/BUG]"
instead of wikisyntax "[[bug|BUG]]", beware that "[bug|BUG]" would NOT work.
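The 5 legal fragment types differ only in the number of brackets, colons and slashes, so a fragment can be classified by counting them. A minimal sketch, assuming the fragment has already been cut out of the "fra=" string; "fragment_type" is a hypothetical name and the real prevalidation is stricter.

```lua
-- Returns "F000", "F200", "F201", "F210" or "F211", or nil for anything else.
local function fragment_type(frag)
  local _, opens   = frag:gsub("%[", "")
  local _, closes  = frag:gsub("%]", "")
  local _, colons  = frag:gsub(":", "")
  local _, slashes = frag:gsub("/", "")
  if opens == 0 and closes == 0 then
    if colons == 0 and slashes == 0 then return "F000" end
    return nil
  end
  -- exactly one bracket pair that encloses the whole fragment
  if opens ~= 1 or closes ~= 1 or not frag:match("^%[.+%]$") then return nil end
  if colons > 1 or slashes > 1 then return nil end
  return "F2" .. colons .. slashes
end
```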
The tooltips:
The tooltip is displayed via the "title=" attribute, which brings some
difficulties. An attribute value is plain text and cannot contain HTML tags,
thus neither <br> nor <bdi>...</bdi> can be used. We have no solution for
<br> (apart from splitting the tooltip into 2 fragments shown separately
from different positions), and instead of <bdi>...</bdi> we use the Unicode
explicit isolator "FIRST STRONG ISOLATE (FSI)" (U+2068), which does have the
expected effect but may as a side effect show as a rectangle in some
browsers. Alternatively, an advanced tooltip can be achieved using CSS and
the "hover" selector, but this is not accessible from inside wikitext. Even
an extension for such advanced tooltips exists but is not enabled on most
public wikis.
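Putting an isolated string into a "title=" tooltip can be sketched as below. Assumptions: the element is a "span", the isolate is closed with "POP DIRECTIONAL ISOLATE (PDI)" (U+2069), and all function names are invented; the text above only confirms the use of FSI and of the "title=" attribute.

```lua
local FSI = "\226\129\168"  -- U+2068 FIRST STRONG ISOLATE, UTF-8 encoded
local PDI = "\226\129\169"  -- U+2069 POP DIRECTIONAL ISOLATE, UTF-8 encoded

-- Plain-text replacement for <bdi>...</bdi> inside an attribute value.
local function isolate(text)
  return FSI .. text .. PDI
end

-- Escape the characters that would break or confuse an attribute value.
local function escape_attr(text)
  return (text:gsub("&", "&amp;"):gsub('"', "&quot;"):gsub("<", "&lt;"))
end

-- Wrap visible text in an element carrying the tooltip.
local function with_tooltip(visible, tip)
  return '<span title="' .. escape_attr(tip) .. '">' .. visible .. '</span>'
end
```

For example, with_tooltip("sv", "Bahasa: Swedia (" .. isolate("svenska") .. ")") gives the language tooltip with the native name isolated.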
{{hr3}} <!-------------------------------->
* #T00 (no params, evil)
* expected result: #E09
* actual result: "{{#invoke:mlawc|ek}}"
::* #T01 ("eo", one param, evil)
::* expected result: #E09
::* actual result: "{{#invoke:mlawc|ek|eo}}"
* #T02 ("en|SB", page "hole", simplest example)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|en|SB|pagenameoverridetestonly=hole|nocat=true}}"
::* #T03 ("en|??", page "hole")
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|en|??|pagenameoverridetestonly=hole|nocat=true}}"
* #T04 ("??|SB", page "hole")
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|??|SB|pagenameoverridetestonly=hole|nocat=true}}"
::* #T05 ("??|??", page "mojosa")
::* expected result: #E13
::* actual result: "{{#invoke:mlawc|ek|??|??|pagenameoverridetestonly=mojosa|nocat=true}}"
* #T06 ("id|SBGR", page "pembangkit listrik", default split)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|id|SBGR|pagenameoverridetestonly=pembangkit listrik|nocat=true}}"
::* #T07 ("en|SB|tria", page "hole", too many params)
::* expected result: #E09
::* actual result: "{{#invoke:mlawc|ek|en|SB|tria|pagenameoverridetestonly=hole|nocat=true}}"
* #T08 ("en|SB|tria|kvara", page "hole", too many params)
* expected result: #E09
* actual result: "{{#invoke:mlawc|ek|en|SB|tria|kvara|pagenameoverridetestonly=hole|nocat=true}}"
{{hr3}} <!-------------------------------->
* #T10 ("id|SBGR|fra=-", page "pembangkit listrik", no split)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=-|pagenameoverridetestonly=pembangkit listrik|nocat=true}}"
::* #T11 ("id|SBGR", page "pembangkit listrik tenaga surya", default split)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|id|SBGR|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
* #T12 ("id|SBGR|fra=-", page "pembangkit listrik tenaga surya", no split)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=-|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
::* #T13 ("id|SBGR|fra=%0", page "pembangkit listrik tenaga surya", auto split except ZERO)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%0|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
* #T14 ("id|SBGR|fra=%1", page "pembangkit listrik tenaga surya", auto split except ONE)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%1|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
::* #T15 ("id|SBGR|fra=%2", page "pembangkit listrik tenaga surya", auto split except 2)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%2|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
{{hr3}} <!-------------------------------->
* #T20 ("id|SBGR|fra=%3", page "pembangkit listrik tenaga surya", auto split except 3, ignored)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%3|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
::* #T21 ("id|SBGR|fra=%F", page "pembangkit listrik tenaga surya", auto split except "F" AKA 15, ignored)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%F|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
* #T22 ("id|SBGR|fra=%G", page "pembangkit listrik tenaga surya", invalid split control string, bad char)
* expected result: #E23
* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%G|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
::* #T23 ("id|SBGR|fra=%12", page "pembangkit listrik tenaga surya", auto split except 1 and 2)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%12|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
* #T24 ("id|SBGR|fra=%23456789", page "pembangkit listrik tenaga surya", auto split except 2...9, junk ignored)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%23456789|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
::* #T25 ("id|SBGR|fra=%123456789", page "pembangkit listrik tenaga surya", auto split except 1...9, too long)
::* expected result: #E23
::* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%123456789|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
* #T26 ("id|SBGR|fra=%23456781", page "pembangkit listrik tenaga surya", auto split except nonsense, not ascending)
* expected result: #E23
* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%23456781|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
{{hr3}} <!-------------------------------->
* #T30 ("en|KA", page "When in a hole, stop digging.", default auto split but suboptimal result)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|en|KA|pagenameoverridetestonly=When in a hole, stop digging.|nocat=true}}"
::* #T31 ("en|KA|fra=-", page "When in a hole, stop digging.", no split, no link)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|en|KA|fra=-|pagenameoverridetestonly=When in a hole, stop digging.|nocat=true}}"
* #T32 ("en|KA|fra=#0I", page "When in a hole, stop digging.", assisted split, lowercase frag index 0)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|en|KA|fra=#0I|pagenameoverridetestonly=When in a hole, stop digging.|nocat=true}}"
::* #T33 ("id|SBGR|fra=%1 #2A", page "pembangkit listrik tenaga surya", assisted split, block boundary ONE and uppercase frag index 2)
::* expected result: OK (silly with "listrik tenaga" together and "surya" linking to "Surya")
::* actual result: "{{#invoke:mlawc|ek|id|SBGR|fra=%1 #2A|pagenameoverridetestonly=pembangkit listrik tenaga surya|nocat=true}}"
* #T34 ("en|KA|fra=#0I", page "When In A Hole, Stop Digging.", assisted split, German style, lowercase frag index 0)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|en|KA|fra=#0I|pagenameoverridetestonly=When In A Hole, Stop Digging.|nocat=true}}"
::* #T35 ("en|KA|fra=#0I #3I #4I #5I", page "When In A Hole, Stop Digging.", assisted split, German style, lowercase frag index 0 3 4 5)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|en|KA|fra=#0I #3I #4I #5I|pagenameoverridetestonly=When In A Hole, Stop Digging.|nocat=true}}"
{{hr3}} <!-------------------------------->
* #T40 ("en|KA|fra=#0I", page "Digging", assisted split and fix case requested index 0 but no split boundaries available)
* expected result: OK (raw text "Digging" and no link to "digging" nor "Digging")
* actual result: "{{#invoke:mlawc|ek|en|KA|fra=#0I|pagenameoverridetestonly=Digging|nocat=true}}"
::* #T41 ("sv|KA", page "?va?", default split)
::* expected result: OK (link to "va")
::* actual result: "{{#invoke:mlawc|ek|sv|KA|pagenameoverridetestonly=?va?|nocat=true}}"
* #T42 ("sv|KA", page "?va", default split)
* expected result: OK (link to "va")
* actual result: "{{#invoke:mlawc|ek|sv|KA|pagenameoverridetestonly=?va|nocat=true}}"
::* #T43 ("sv|KA", page "va?", default split)
::* expected result: OK (link to "va")
::* actual result: "{{#invoke:mlawc|ek|sv|KA|pagenameoverridetestonly=va?|nocat=true}}"
* #T44 ("sv|KA", page "va", default auto split but no split boundaries available)
* expected result: OK (no link)
* actual result: "{{#invoke:mlawc|ek|sv|KA|pagenameoverridetestonly=va|nocat=true}}"
::* #T45 ("sv|KA|fra=%01", page "?va?", assisted split, 2 boundaries available but both are blocked)
::* expected result: OK (raw text "?va?" and no link)
::* actual result: "{{#invoke:mlawc|ek|sv|KA|fra=%01|pagenameoverridetestonly=?va?|nocat=true}}"
{{hr3}} <!-------------------------------->
* #T50 ("en|KA|fra=#0I", page "When in Rome, do as the Romans do.", assisted split and fix case frag 0, suboptimal result due to word "Romans")
* expected result: OK (links to "when" and "Romans")
* actual result: "{{#invoke:mlawc|ek|en|KA|fra=#0I|pagenameoverridetestonly=When in Rome, do as the Romans do.|nocat=true}}"
::* #T51 ("en|KA|fra=#0I #6:Roman", page "When in Rome, do as the Romans do.", assisted split and fix case frag 0, good result, fixed word "Romans" index 6)
::* expected result: OK (links to "when" and "Roman")
::* actual result: "{{#invoke:mlawc|ek|en|KA|fra=#0I #6:Roman|pagenameoverridetestonly=When in Rome, do as the Romans do.|nocat=true}}"
* #T52 ("en|KA|fra=#0I #6:Roman", page "When in,, , Rome, do as the Romans do.", assisted split, fix case frag 0, fix word "Romans" idx 6)
* expected result: silly OK (links to "when" and "Roman")
* actual result: "{{#invoke:mlawc|ek|en|KA|fra=#0I #6:Roman|pagenameoverridetestonly=When in,, , Rome, do as the Romans do.|nocat=true}}"
::* #T53 ("en|KA|fra=%01 #0I #4:Romania", page "When in,, , Rome, do as the Romans do.", assisted split, block 0&1, fix 0, fix word "Romans" idx 4 now)
::* expected result: very silly OK (links to "when in,, , Rome" and "Romania")
::* actual result: "{{#invoke:mlawc|ek|en|KA|fra=%01 #0I #4:Romania|pagenameoverridetestonly=When in,, , Rome, do as the Romans do.|nocat=true}}"
* #T54 ("eo|KA", page "!!!Mi jam,? estas fin-venkisto!!!", default auto split)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|eo|KA|pagenameoverridetestonly=!!!Mi jam,? estas fin-venkisto!!!|nocat=true}}"
::* #T55 ("eo|KA|fra=-", page "!!!Mi jam,? estas fin-venkisto!!!", no split)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|eo|KA|fra=-|pagenameoverridetestonly=!!!Mi jam,? estas fin-venkisto!!!|nocat=true}}"
* #T56 ("eo|KA|fra=#3:fino", page "!!!Mi jam,? estas fin-venkisto!!!", assisted split, link "fin-venkisto" to "fino")
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|eo|KA|fra=#3:fino|pagenameoverridetestonly=!!!Mi jam,? estas fin-venkisto!!!|nocat=true}}"
{{hr3}} <!-------------------------------->
* #T60 ("deu|SB", page "hole", invalid lng)
* expected result: #E11
* actual result: "{{#invoke:mlawc|ek|deu|SB|pagenameoverridetestonly=hole|nocat=true}}"
::* #T61 ("xxx|SB", page "hole", unknown lng)
::* expected result: #E12
::* actual result: "{{#invoke:mlawc|ek|xxx|SB|pagenameoverridetestonly=hole|nocat=true}}"
* #T62 ("en|SS", page "hole", invalid word class)
* expected result: #E13
* actual result: "{{#invoke:mlawc|ek|en|SS|pagenameoverridetestonly=hole|nocat=true}}"
::* #T63 ("en|SB??", page "move", invalid use of "??")
::* expected result: #E13
::* actual result: "{{#invoke:mlawc|ek|en|SB??|pagenameoverridetestonly=move|nocat=true}}"
* #T64 ("en|??SB", page "move", invalid use of "??")
* expected result: #E13
* actual result: "{{#invoke:mlawc|ek|en|??SB|pagenameoverridetestonly=move|nocat=true}}"
::* #T65 ("en|????", page "move", invalid use of "??")
::* expected result: #E13
::* actual result: "{{#invoke:mlawc|ek|en|????|pagenameoverridetestonly=move|nocat=true}}"
* #T66 ("en|KAKU", page "PEBKAC")
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|en|KAKU|pagenameoverridetestonly=PEBKAC|nocat=true}}"
::* #T67 ("en|KAAV", page "ASAP", "KA" is almost exclusive and "ASAP" is NOT a sentence)
::* expected result: #E13
::* actual result: "{{#invoke:mlawc|ek|en|KAAV|pagenameoverridetestonly=ASAP|nocat=true}}"
{{hr3}} <!-------------------------------->
* #T70 ("eo|KA", page "Mi estas fin-venkisto!!!", default auto split)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|eo|KA|pagenameoverridetestonly=Mi estas fin-venkisto!!!|nocat=true}}"
::* #T71 ("eo|KA|fra=-", page "Mi estas fin-venkisto!!!", no split)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|eo|KA|fra=-|pagenameoverridetestonly=Mi estas fin-venkisto!!!|nocat=true}}"
* #T72 ("eo|KA|fra=[ri/Mi] [estas fin-v]enk[-ist/isto]!!!", page "Mi estas fin-venkisto!!!", manual split)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|eo|KA|fra=[ri/Mi] [estas fin-v]enk[-ist/isto]!!!|pagenameoverridetestonly=Mi estas fin-venkisto!!!|nocat=true}}"
::* #T73 ("eo|KA|fra=[ri/Mi] [estas fin-v]]enk[-ist/isto]!!!", page "Mi estas fin-venkisto!!!", broken manual split, double bracket)
::* expected result: #E23
::* actual result: "{{#invoke:mlawc|ek|eo|KA|fra=[ri/Mi] [estas fin-v]]enk[-ist/isto]!!!|pagenameoverridetestonly=Mi estas fin-venkisto!!!|nocat=true}}"
* #T74 <nowiki>("eo|KA|fra=[mi/Mi] [estas fin-v]e''nki''sto!!!", page "Mi estas fin-venkisto!!!", broken manual split, apo:s)</nowiki>
* expected result: #E23
* actual result: "{{#invoke:mlawc|ek|eo|KA|fra=[mi/Mi] [estas fin-v]e''nki''sto!!!|pagenameoverridetestonly=Mi estas fin-venkisto!!!|nocat=true}}"
::* #T75 ("eo|KA|fra=[ri/Mi] [estas fin-v]enk[-ist/i[s]to]!!!", page "Mi estas fin-venkisto!!!", broken manual split, nested brackets)
::* expected result: #E23
::* actual result: "{{#invoke:mlawc|ek|eo|KA|fra=[ri/Mi] [estas fin-v]enk[-ist/i[s]to]!!!|pagenameoverridetestonly=Mi estas fin-venkisto!!!|nocat=true}}"
* #T76 ("eo|KA|fra=[ri/Mi] [estas fin-v]enk[-ist /isto]!!!", page "Mi estas fin-venkisto!!!", broken manual split, illegal space)
* expected result: #E23
* actual result: "{{#invoke:mlawc|ek|eo|KA|fra=[ri/Mi] [estas fin-v]enk[-ist /isto]!!!|pagenameoverridetestonly=Mi estas fin-venkisto!!!|nocat=true}}"
{{hr3}} <!-------------------------------->
* #T80 ("sv|AJ", page "icke-binaer", default auto split does nothing due to no boundary)
* expected result: OK (suboptimal)
* actual result: "{{#invoke:mlawc|ek|sv|AJ|pagenameoverridetestonly=icke-binaer|nocat=true}}"
::* #T81 ("sv|AJ|fra=[P:icke-][M:binaer]", page "icke-binaer", manual split)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|sv|AJ|fra=[P:icke-][M:binaer]|pagenameoverridetestonly=icke-binaer|nocat=true}}"
* #T82 ("sv|AJ|fra=[P:icke][M:binaer]", page "icke-binaer", broken manual split)
* expected result: #E16
* actual result: "{{#invoke:mlawc|ek|sv|AJ|fra=[P:icke][M:binaer]|pagenameoverridetestonly=icke-binaer|nocat=true}}"
::* #T83 ("id|SB|fra=[C:per-...-an/per][M:tidak][M:sama][C:per-...-an/an]", page "pertidaksamaan", manual split)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|id|SB|fra=[C:per-...-an/per][M:tidak][M:sama][C:per-...-an/an]|pagenameoverridetestonly=pertidaksamaan|nocat=true}}"
* #T84 ("id|SB|fra=[C:per-...-an/per]+[M:tidak]+[M:sama]+[C:per-...-an/an]", page "pertidaksamaan", manual split, plussed)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|id|SB|fra=[C:per-...-an/per]+[M:tidak]+[M:sama]+[C:per-...-an/an]|pagenameoverridetestonly=pertidaksamaan|nocat=true}}"
::* #T85 ("id|SB|fra=[C:per-...-an/per]+[M:kereta( )api]+[C:per-...-an/an]", page "perkeretaapian", manual split, plussed, deleted space)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|id|SB|fra=[C:per-...-an/per]+[M:kereta( )api]+[C:per-...-an/an]|pagenameoverridetestonly=perkeretaapian|nocat=true}}"
* #T86 ("eo|SB|fra=[L:polv(o)]+[I:o]+[L:sucx(i)]+[I:il]+[U:o]", page "polvosucxilo", manual split, deleted letter, "L"-trick, plussed)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|eo|SB|fra=[L:polv(o)]+[I:o]+[L:sucx(i)]+[I:il]+[U:o]|pagenameoverridetestonly=polvosucxilo|nocat=true}}"
::* #T87 ("sv|SB|fra=[M:vara/var(a)u][M:maerke]", page "varumaerke", manual split, deleted and replaced letter)
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|sv|SB|fra=[M:vara/var(a)u][M:maerke]|pagenameoverridetestonly=varumaerke|nocat=true}}"
* #T88 ("id|VE|fra=[P:meN-/meng][M:(k)irim]", page "mengirim", manual split, deleted letter)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|id|VE|fra=[P:meN-/meng][M:(k)irim]|pagenameoverridetestonly=mengirim|nocat=true}}"
::* #T89 ("sv|SB|fra=[M:kung]+a+[M:doeme]", page "kungadoeme", manual split, plusses around "F000" fragment)
::* expected result: OK (see categories)
::* actual result nocat: "{{#invoke:mlawc|ek|sv|SB|fra=[M:kung]+a+[M:doeme]|pagenameoverridetestonly=kungadoeme|nocat=true}}"
::* actual result via debu: "{{debu|{{#invoke:mlawc|ek|sv|SB|fra=[M:kung]+a+[M:doeme]|pagenameoverridetestonly=kungadoeme}}|outctl=nw}}"
{{hr3}} <!-------------------------------->
* #T90 ("en|SB", page "sun", default auto split does nothing due to no boundary)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|en|SB|pagenameoverridetestonly=sun|nocat=true}}"
::* #T91 ("en|SB|fra=$B", page "sun", simple bare root strategy)
::* expected result: OK (no link, see categories, cat it as "M" under "sun" and main "-")
::* actual result nocat: "{{#invoke:mlawc|ek|en|SB|fra=$B|pagenameoverridetestonly=sun|nocat=true}}"
::* actual result via debu: "{{debu|{{#invoke:mlawc|ek|en|SB|fra=$B|pagenameoverridetestonly=sun}}|outctl=nw}}"
* #T92 ("en|SB|fra=$B", page "Sun", simple bare root strategy)
* expected result: OK (link to "sun" and see categories, cat it as "M" under "sun" and main "-")
* actual result nocat: "{{#invoke:mlawc|ek|en|SB|fra=$B|pagenameoverridetestonly=Sun|nocat=true}}"
* actual result via debu: "{{debu|{{#invoke:mlawc|ek|en|SB|fra=$B|pagenameoverridetestonly=Sun}}|outctl=nw}}"
::* #T93 ("en|SB|ext=&M", page "Inverness", extra parameter)
::* expected result: OK (no link, see categories, cat it as "M" under "Inverness" and main "-")
::* actual result nocat: "{{#invoke:mlawc|ek|en|SB|ext=&M|pagenameoverridetestonly=Inverness|nocat=true}}"
::* actual result via debu: "{{debu|{{#invoke:mlawc|ek|en|SB|ext=&M|pagenameoverridetestonly=Inverness}}|outctl=nw}}"
* #T94 ("eo|SB", page "suno", default auto split does nothing due to no boundary)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|eo|SB|pagenameoverridetestonly=suno|nocat=true}}"
::* #T95 ("eo|SB|fra=$S", page "suno", simple root split)
::* expected result: OK (no link, see categories, cat it as "N" under "sun" and main "-")
::* actual result nocat: "{{#invoke:mlawc|ek|eo|SB|fra=$S|pagenameoverridetestonly=suno|nocat=true}}"
::* actual result via debu: "{{debu|{{#invoke:mlawc|ek|eo|SB|fra=$S|pagenameoverridetestonly=suno}}|outctl=nw}}"
* #T96 ("eo|SB|fra=$S", page "Suno", simple root split)
* expected result: OK (link to "suno" and see categories, cat it as "N" under "sun" and main "-")
* actual result nocat: "{{#invoke:mlawc|ek|eo|SB|fra=$S|pagenameoverridetestonly=Suno|nocat=true}}"
* actual result via debu: "{{debu|{{#invoke:mlawc|ek|eo|SB|fra=$S|pagenameoverridetestonly=Suno}}|outctl=nw}}"
::* #T97 ("eo|SB|fra=GXakart+[U:o]|ext=[N!gxakart]", page "GXakarto", extra parameter)
::* expected result: OK (no link, see categories, cat it as "N" under "gxakart" and main "-")
::* actual result nocat: "{{#invoke:mlawc|ek|eo|SB|fra=GXakart+[U:o]|ext=[N!gxakart]|pagenameoverridetestonly=GXakarto|nocat=true}}"
::* actual result via debu: "{{debu|{{#invoke:mlawc|ek|eo|SB|fra=GXakart+[U:o]|ext=[N!gxakart]|pagenameoverridetestonly=GXakarto}}|outctl=nw}}"
* #T98 ("eo|SB|fra=GXakart+[U:o]|ext=[N:gxakart/Jakarta]", page "GXakarto", faulty extra parameter)
* expected result: #E20
* actual result: "{{#invoke:mlawc|ek|eo|SB|fra=GXakart+[U:o]|ext=[N:gxakart/Jakarta]|pagenameoverridetestonly=GXakarto}}"
::* #T99 ("sv|SB|fra=[M:loep(a)]+[U:-are/ar(e)]+[M:sko]", page "loeparsko", 2 stolen letters)
::* expected result: OK (see categories)
::* actual result nocat: "{{#invoke:mlawc|ek|sv|SB|fra=[M:loep(a)]+[U:-are/ar(e)]+[M:sko]|pagenameoverridetestonly=loeparsko|nocat=true}}"
::* actual result via debu: "{{debu|{{#invoke:mlawc|ek|sv|SB|fra=[M:loep(a)]+[U:-are/ar(e)]+[M:sko]|pagenameoverridetestonly=loeparsko}}|outctl=nw}}"
{{hr3}} <!-------------------------------->
* #TA0 ("en|AVKU", page "ASAP")
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|en|AVKU|pagenameoverridetestonly=ASAP|nocat=true}}"
::* #TA1 ("en|SJ", page "when")
::* expected result: OK
::* actual result: "{{#invoke:mlawc|ek|en|SJ|pagenameoverridetestonly=when|nocat=true}}"
* #TA2 ("sv|SB|dst=baza banan", page "banan", try to link to this one)
* expected result: OK
* actual result: "{{#invoke:mlawc|ek|sv|SB|dst=baza banan|pagenameoverridetestonly=banan|nocat=true}}"
::* #TA3 ("sv|SB|dst=fleksia", page "banan", try to link to this one)
::* expected result: OK (see categories)
::* actual result nocat: "{{#invoke:mlawc|ek|sv|SB|dst=fleksia|pagenameoverridetestonly=banan|nocat=true}}"
::* actual result via debu: "{{debu|{{#invoke:mlawc|ek|sv|SB|dst=fleksia|pagenameoverridetestonly=banan}}|outctl=nw}}"
* #TA4 ("sv|SB|dst=baza [ba]nan", page "banan", illegal brackets)
* expected result: #E19
* actual result: "{{#invoke:mlawc|ek|sv|SB|dst=baza [ba]nan|pagenameoverridetestonly=banan|nocat=true}}"
::* #TA5 ("sv|SB|dst=banan'", page "banan", illegal apo)
::* expected result: #E19
::* actual result: "{{#invoke:mlawc|ek|sv|SB|dst=banan'|pagenameoverridetestonly=banan|nocat=true}}"
* #TA6 ("en|AVKU", page "ASAP", see categories)
* expected result: OK
* actual result: "{ {#invoke:mlawc|ek|en|AVKU|pagenameoverridetestonly=ASAP} }" (blocked)
* actual result via debu: "{{debu|{{#invoke:mlawc|ek|en|AVKU|pagenameoverridetestonly=ASAP}}|outctl=nw}}"
::* #TA7 ("en|SJ", page "when", see categories)
::* expected result: OK
::* actual result: "{ {#invoke:mlawc|ek|en|SJ|pagenameoverridetestonly=when} }" (blocked)
::* actual result via debu: "{{debu|{{#invoke:mlawc|ek|en|SJ|pagenameoverridetestonly=when}}|outctl=nw}}"
* #TA8 ("en|AVKU|dst=test", page "ASAP", silly maximal test for anchors and categories)
* expected result: OK
* actual result: "{ {#invoke:mlawc|ek|en|AVKU|dst=test|pagenameoverridetestonly=ASAP} }" (blocked)
* actual result via debu: "{{debu|{{#invoke:mlawc|ek|en|AVKU|dst=test|pagenameoverridetestonly=ASAP}}|outctl=nw}}"
{{hr3}} <!-------------------------------->
* note that tests #T89 #T91...#T93 #T95...#T97 #T99 #TA3 #TA6 #TA7 and #TA8 depend on "debu"
* note that tests #TA6 #TA7 and #TA8 cannot reasonably be executed on the docs subpage without the help of "pate" or "debu"
{{hr3}} <!-------------------------------->
]===]
local exporttable = {}
------------------------------------------------------------------------
---- CONSTANTS [O] ----
------------------------------------------------------------------------
-- uncommentable EO vs ID constant strings (core site-related features)
-- local constrpriv = "eo" -- EO (privileged site language)
local constrpriv = "id" -- ID (privileged site language)
-- local constringvoj = "Modulo:loaddata-tbllingvoj" -- EO
local constringvoj = "Modul:loaddata-tblbahasa" -- ID
-- local constrsplit = "Modulo:msplitter" -- EO
local constrsplit = "Modul:msplitter" -- ID
-- local constrkatq = "Kategorio" -- EO !!!FIXME!!!
local constrkatq = "Kategori" -- ID
-- constant table -- ban list -- add obviously invalid access codes (2-letter or 3-letter) only
-- length of the list is NOT stored anywhere, the processing stops
-- when type "nil" is encountered, used by "lfivalidatelnkoadv" only
-- controversial codes (sh sr hr), (zh cmn)
-- "en.wiktionary.org/wiki/Wiktionary:Language_treatment" excluded languages
-- "en.wikipedia.org/wiki/Spurious_languages"
-- "iso639-3.sil.org/code/art" only valid in ISO 639-2
-- "iso639-3.sil.org/code/gem" only valid in ISO 639-2 and 639-5, "collective"
-- "iso639-3.sil.org/code/zxx" "No linguistic content"
local contabisbanned = {}
contabisbanned = {'by','dc','ll','jp','art','deu','eng','epo','fra','gem','ger','ido','lat','por','rus','spa','swe','tup','zxx'} -- 1...19
-- surrogate transcoding table (only needed for EO)
local contabtransluteo = {}
contabtransluteo[ 67] = 0xC488 -- CX
contabtransluteo[ 99] = 0xC489 -- cx
contabtransluteo[ 71] = 0xC49C -- GX
contabtransluteo[103] = 0xC49D -- gx
contabtransluteo[ 74] = 0xC4B4 -- JX
contabtransluteo[106] = 0xC4B5 -- jx
contabtransluteo[ 83] = 0xC59C -- SX
contabtransluteo[115] = 0xC59D -- sx
contabtransluteo[ 85] = 0xC5AC -- UX breve
contabtransluteo[117] = 0xC5AD -- ux breve
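-- Illustration only (NOT part of the original module): each value in the
-- table above packs the 2 UTF-8 octets of the accented letter into one
-- 16-bit number, for example 0xC488 holds the octets 0xC4 and 0x88, ie
-- the letter "C" with circumflex (U+0108). How the module consumes the
-- table is defined elsewhere; assuming plain arithmetic, the octets could
-- be recovered like this:
--   local numhi = math.floor (0xC488 / 256)   -- 196 ie 0xC4
--   local numlo = 0xC488 % 256                -- 136 ie 0x88
--   local strcx = string.char (numhi, numlo)  -- 2 octets, UTF-8 for U+0108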
-- surrogate transcoding table (only needed for SV)
local contabtranslutsv = {}
contabtranslutsv['AA'] = 0xC385 -- Aring
contabtranslutsv['Aa'] = 0xC385 -- Aring
contabtranslutsv['aa'] = 0xC3A5 -- aring
contabtranslutsv['AE'] = 0xC384 -- Auml
contabtranslutsv['Ae'] = 0xC384 -- Auml
contabtranslutsv['ae'] = 0xC3A4 -- auml
contabtranslutsv['EE'] = 0xC389 -- Eacute, rarely used
contabtranslutsv['Ee'] = 0xC389 -- Eacute, rarely used
contabtranslutsv['ee'] = 0xC3A9 -- eacute, rarely used
contabtranslutsv['OE'] = 0xC396 -- Ouml
contabtranslutsv['Oe'] = 0xC396 -- Ouml
contabtranslutsv['oe'] = 0xC3B6 -- ouml
-- constant strings (anchor HTML code and prefix)
local constrankkom = '<span id="Qsekt' -- do NOT add the dash "-" here
local constaankend = '"></span>'
-- constant strings (error circumfixes)
local constrelabg = '<span class="error"><b>' -- lagom whining begin
local constrelaen = '</b></span>' -- lagom whining end
local constrlaxhu = ' [] ' -- lagom -> huge circumfix " [] "
-- uncommentable EO vs ID string (caller name for error messages)
-- local constrkoll = 'sxablono "livs"' -- EO augmented name of the caller (semi-hardcoded, we do NOT peek it)
local constrkoll = 'templat "bakk"' -- ID augmented name of the caller (semi-hardcoded, we do NOT peek it)
-- uncommentable EO vs ID constant table (error messages)
-- #E02...#E99
-- note that #E00 and #E01 are NOT supposed to be included here
-- separate "strpikparent" needed for "\\@"
local contaberaroj = {}
-- contaberaroj[02] = 'Malica eraro en subprogramaro uzata far \\@' -- EO #E02 !!!FIXME!!!
contaberaroj[02] = 'Kesalahan jahat dalam subprogram digunakan oleh \\@' -- ID #E02
-- contaberaroj[03] = 'Nombrigita eraro en subprogramaro uzata far \\@' -- EO #E03 !!!FIXME!!!
contaberaroj[03] = 'Kesalahan ternomor dalam subprogram digunakan oleh \\@' -- ID #E03
-- contaberaroj[09] = 'Erara uzo de \\@, legu gxian dokumentajxon' -- EO #E09
contaberaroj[09] = 'Penggunaan salah \\@, bacalah dokumentasinya' -- ID #E09
-- contaberaroj[11] = 'Evidente nevalida lingvokodo en \\@' -- EO #E11
contaberaroj[11] = 'Kode bahasa jelas-jelas salah dalam \\@' -- ID #E11
-- contaberaroj[12] = 'Nekonata lingvokodo en \\@' -- EO #E12
contaberaroj[12] = 'Kode bahasa tidak dikenal dalam \\@' -- ID #E12
-- contaberaroj[13] = 'Erara uzo de \\@ pro vortospeco' -- EO #E13
contaberaroj[13] = 'Penggunaan salah \\@ oleh karena kelas kata' -- ID #E13
-- contaberaroj[14] = 'Erara uzo de \\@ pro pagxonomo por "$S" "$H"' -- EO #E14
contaberaroj[14] = 'Penggunaan salah \\@ oleh karena nama halaman untuk "$S" "$H"' -- ID #E14
-- contaberaroj[16] = 'Erara uzo de \\@ pro "sumkontrolo"' -- EO #E16
contaberaroj[16] = 'Penggunaan salah \\@ oleh karena "pemeriksaan jumlah"' -- ID #E16
-- contaberaroj[19] = 'Erara uzo de \\@ pro "dst=" distingo' -- EO #E19
contaberaroj[19] = 'Penggunaan salah \\@ oleh karena "dst=" pembedaan' -- ID #E19
-- contaberaroj[20] = 'Erara uzo de \\@ pro "ext=" kroma parametro' -- EO #E20
contaberaroj[20] = 'Penggunaan salah \\@ oleh karena "ext=" parameter ekstra' -- ID #E20
-- contaberaroj[21] = 'Erara uzo de \\@ pro "scr=" skribsistema parametro' -- EO #E21
contaberaroj[21] = 'Penggunaan salah \\@ oleh karena "scr=" parameter aksara' -- ID #E21
-- contaberaroj[23] = 'Erara uzo de \\@ pro "fra=" disiga parametro' -- EO #E23
contaberaroj[23] = 'Penggunaan salah \\@ oleh karena "fra=" pemotongan' -- ID #E23
-- constant strings and table (tooltip and misc to be sent to the screen)
local constrtoolt = 'style="border-bottom:1px dotted; cursor:help;"' -- lousy tooltip
local constrisobg = '(⁨ ' -- isolator for "strange" (RTL, submicroscopic) text begin
local constrisoen = ' ⁨)' -- isolator for "strange" (RTL, submicroscopic) text end
local contabscrmisc = {}
contabscrmisc[0] = '<div style="margin:0.2em;"></div>' -- must be empty, tiny EOL
contabscrmisc[1] = '[[File:Speech balloon orange 24 24 px trans.png|24px|link=]]'
-- uncommentable EO vs ID constant table (lng and wc stuff)
local contablaxwc = {}
-- contablaxwc [0] = "Lingvo: " -- EO tooltip only 1
contablaxwc [0] = "Bahasa: " -- ID tooltip only 1
-- contablaxwc [1] = "Vortospeco: " -- EO tooltip can be 2
contablaxwc [1] = "Kelas kata: " -- ID tooltip can be 2
-- contablaxwc [2] = "nekonata lingvo" -- EO placeholder
contablaxwc [2] = "bahasa yang tidak dikenal" -- ID placeholder
-- contablaxwc [3] = "nekonata vortospeco" -- EO placeholder
contablaxwc [3] = "kelas kata yang tidak dikenal" -- ID placeholder
-- uncommentable EO vs ID constant table (categories)
-- syntax of insertion and discarding magic string:
-- "@" followed by 2 uppercase letters and 2 hex numbers
-- otherwise the hit is not processed, but copied as-is instead
-- 2 letters select the insertable item from table supplied by the caller
-- 2 hex numbers control discarding left and right (0...15 char:s)
-- empty item is legal and results in discarding if some number is non-ZERO
-- if uppercasing or other adjustment is needed then the caller must take
-- care of it in the form of 2 or more separate items provided in the table
-- insertable items defined:
-- constant:
-- * LK langcode (unknown "??" legal but take care elsewhere)
-- * LN langname (unknown legal, for example "dana" or "Ido")
-- * LU langname uppercased (unknown legal, for example "Dana" or "Ido")
-- * LO langname not own (empty or nil if own)
-- * LV langname uppercased not own (empty or nil if own)
-- * LY langname long (for example "bahasa Swedia")
-- * LZ langname long not own (empty or nil if own)
-- * SC script code (for example "T", "S", "P" for ZH, "C" "L" for SH)
-- variable (we can have 2 word classes):
-- * WC word class name (for example "substantivo")
-- * WU word class name uppercased (for example "Substantivo")
-- * MT mortyp code (for example "C")
-- * FR fragment (for example "peN-...-an" or "abelujo")
-- see "lfiultiminsert" and "tabstuff" use space here and avoid "_"
-- note the malicious false friendship between EO:frazo and ID:frasa
local contabkatoj = {}
-- contabkatoj[0] = "Kapvorto (@LN00)" -- EO always (except "nocat=true") only 1 piece
contabkatoj[0] = "Kata @LY00" -- ID always (except "nocat=true") only 1 piece
-- contabkatoj[1] = "@WU00 (@LN00)" -- EO always (except "nocat=true") 1 or 2 pieces
contabkatoj[1] = "@LK00:@WU00" -- ID always (except "nocat=true") 1 or 2 pieces
-- contabkatoj[2] = "@WU00" -- EO always (except "nocat=true") 1 or 2 pieces
contabkatoj[2] = "@WU00" -- ID always (except "nocat=true") 1 or 2 pieces
-- uncommentable EO vs ID constant table (26 word classes)
local contabwc = {}
-- contabwc["SB"] = "substantivo (O-vorto)" -- EO |
contabwc["SB"] = "nomina (kata benda)" -- ID |
-- contabwc["VE"] = "verbo (I-vorto)" -- EO | main big (3)
contabwc["VE"] = "verba (kata kerja)" -- ID |
-- contabwc["AJ"] = "adjektivo (A-vorto)" -- EO |
contabwc["AJ"] = "adjektiva (kata sifat)" -- ID |
-- contabwc["PN"] = "pronomo" -- EO %
contabwc["PN"] = "pronomina (kata pengganti)" -- ID %
-- contabwc["NV"] = "numeralo (nombrovorto)" -- EO %
contabwc["NV"] = "numeralia (kata bilangan)" -- ID %
-- contabwc["AV"] = "adverbo (E-vorto)" -- EO %
contabwc["AV"] = "adverbia (kata keterangan)" -- ID %
-- contabwc["PV"] = "verbpartiklo" -- EO %
contabwc["PV"] = "partikel verba" -- ID %
-- contabwc["QV"] = "demandvorto" -- EO %
contabwc["QV"] = "kata tanya" -- ID %
-- contabwc["KJ"] = "konjunkcio" -- EO %
contabwc["KJ"] = "konjungsi" -- ID %
-- contabwc["SJ"] = "subjunkcio (subfrazenkondukilo)" -- EO %
contabwc["SJ"] = "subjungsi (pengaju klausa terikat)" -- ID % further smaller (12)
-- contabwc["PP"] = "prepozicio (antauxlokigita rolvorteto)" -- EO %
contabwc["PP"] = "preposisi (kata depan)" -- ID %
-- contabwc["PO"] = "postpozicio" -- EO %
contabwc["PO"] = "postposisi (kata belakang)" -- ID %
-- contabwc["PC"] = "cirkumpozicio" -- EO %
contabwc["PC"] = "sirkumposisi" -- ID %
-- contabwc["AR"] = "artikolo" -- EO %
contabwc["AR"] = "artikel (kata sandang)" -- ID %
-- contabwc["IN"] = "interjekcio" -- EO %
contabwc["IN"] = "interjeksi" -- ID %
-- contabwc["PF"] = "prefikso" -- EO #
contabwc["PF"] = "prefiks (awalan)" -- ID #
-- contabwc["UF"] = "sufikso (postfikso, finajxo)" -- EO #
contabwc["UF"] = "sufiks (akhiran)" -- ID # nonstandalone (5)
-- contabwc["KF"] = "cirkumfikso (konfikso)" -- EO #
contabwc["KF"] = "sirkumfiks (konfiks)" -- ID #
-- contabwc["IF"] = "infikso" -- EO #
contabwc["IF"] = "infiks (sisipan)" -- ID #
-- contabwc["NR"] = "nememstara radiko" -- EO #
contabwc["NR"] = "akar kata terikat (prakategorial)" -- ID #
-- contabwc["KA"] = "frazo" -- EO $
contabwc["KA"] = "kalimat" -- ID $
-- contabwc["KK"] = "signo" -- EO $ misc (2)
contabwc["KK"] = "karakter" -- ID $
-- contabwc["KU"] = "mallongigo (kurtigo)" -- EO &
contabwc["KU"] = "singkatan (abreviasi)" -- ID &
-- contabwc["GR"] = "vortgrupo" -- EO & additional (4)
contabwc["GR"] = "kumpulan kata" -- ID &
-- contabwc["PA"] = "participo" -- EO &
contabwc["PA"] = "partisip" -- ID &
-- contabwc["TV"] = "tabelvorto" -- EO &
contabwc["TV"] = "kata tabel" -- ID &
-- constant table (3 integers for preliminary parameter check)
local contabparam = {}
contabparam[0] = 2 -- minimal number of anon parameters
contabparam[1] = 2 -- maximal number of anon parameters
contabparam[2] = 160 -- maximal length of single para (min is hardcoded ONE)
-- constants related to submodules
local connumtblc0 = 0 -- in site language
local connumtblc2 = 2 -- in the language's own language (EO: "propralingve")
-- constants to control behaviour from source AKA semi-hardcoded parameters
local constrmainctl = "13" -- image (0 or 1) lemma (0 none 1 raw 2 maybe split 3 maybe split plus ...)
local conbookodlng = false -- "true" to allow long codes like "zh-min-nan"
local conboomiddig = false -- "true" to allow middle digit "s7a"
------------------------------------------------------------------------
---- SPECIAL STUFF OUTSIDE MAIN [B] ----
------------------------------------------------------------------------
---- SPECIAL VAR:S ----
local qldingvoj = {} -- type "table" and nested
local qsplitter = {} -- type "table" with type "function" inside
local qbooguard = false -- only for the guard test, pass to other var ASAP
local qboodetrc = true -- from "detrc=true" but default is "true" !!!
local qstrtrace = '<br>' -- for main & sub:s, debug report request by "detrc="
local qtabkatoj = {} -- global for compound categories [0]...[41]
---- GUARD AGAINST INTERNAL ERROR & TWO IMPORTS ----
qbooguard = (type(constrpriv)~='string') or (type(constringvoj)~='string') or (type(constrsplit)~='string') or (type(constrkatq)~='string')
if (not qbooguard) then
qldingvoj = mw.loadData(constringvoj) -- can crash here
qsplitter = require(constrsplit) -- can crash here
qbooguard = (type(qldingvoj)~='table') or (type(qsplitter)~='table')
end--if
------------------------------------------------------------------------
---- DEBUG FUNCTIONS [D] ----
------------------------------------------------------------------------
-- Local function LFDTRACEMSG
-- Enhance upvalue "qstrtrace" with fixed text.
-- for variables the other sub "lfdshowvar" is preferable but in exceptional
-- cases it can be justified to send text with values of variables to this sub
-- no size limit
-- upvalue "qstrtrace" must NOT be type "nil" on entry (is inited to "<br>")
-- uses upvalue "qboodetrc"
local function lfdtracemsg (strshortline)
if (qboodetrc and (type(strshortline)=='string')) then
qstrtrace = qstrtrace .. strshortline .. '.<br>' -- dot added !!!
end--if
end--function lfdtracemsg
------------------------------------------------------------------------
-- Local function LFDIMPORTREPORT
-- Enhance upvalue "qstrtrace" with imported report.
-- use this one to import detrc text from submodule
-- upvalue "qstrtrace" must NOT be type "nil" on entry (is inited to "<br>")
-- uses upvalue "qboodetrc"
local function lfdimportreport (strshortlineorbigtext)
local strseparator = ''
if (qboodetrc and (type(strshortlineorbigtext)=='string')) then
strseparator = '<br>########################<br>'
qstrtrace = qstrtrace .. strseparator .. strshortlineorbigtext .. strseparator .. '<br>'
end--if
end--function lfdimportreport
------------------------------------------------------------------------
-- Local function LFDMINISANI
-- Input : * strdangerous -- must be type "string", empty legal
-- * numlimitdivthree
-- Output : * strsanitized -- can happen to be quasi-empty with <<"">>
-- To be called from "lfdshowvcore" <- "lfdshowvar" only.
-- * we absolutely must disallow: cross "#" 35 | apo "'" 39 |
-- star "*" 42 | dash 45 | colon 58 | "<" 60 | ">" 62 | "[" 91 | "]" 93
-- * spaces are shown as "{32}" if repeated or at the beginning or at the end
local function lfdminisani (strdangerous, numlimitdivthree)
local strsanitized = '"' -- begin quot
local num38len = 0
local num38index = 1 -- ONE-based
local num38signo = 0
local num38prev = 0
local boohtmlenc = false
local boovisienc = false
num38len = string.len (strdangerous)
while true do
boohtmlenc = false -- % reset on
boovisienc = false -- % every iteration
if (num38index>num38len) then -- ONE-based
break -- done string char after char
end--if
num38signo = string.byte (strdangerous,num38index,num38index)
if ((num38signo<43) or (num38signo==45) or (num38signo==58) or (num38signo==60) or (num38signo==62) or (num38signo==91) or (num38signo==93) or (num38signo>122)) then
boohtmlenc = true
end--if
if ((num38signo<32) or (num38signo>126)) then
boovisienc = true -- overrides "boohtmlenc"
end--if
if ((num38signo==32) and ((num38prev==32) or (num38index==1) or (num38index==num38len))) then
boovisienc = true -- overrides "boohtmlenc"
end--if
if (boovisienc) then
strsanitized = strsanitized .. '{' .. tostring (num38signo) .. '}'
else
if (boohtmlenc) then
strsanitized = strsanitized .. '&#' .. tostring (num38signo) .. ';'
else
strsanitized = strsanitized .. string.char (num38signo)
end--if
end--if
if ((num38len>(numlimitdivthree*3)) and (num38index==numlimitdivthree)) then
num38index = num38len - numlimitdivthree -- jump forwards
strsanitized = strsanitized .. '" ... "'
else
num38index = num38index + 1 -- ONE-based
end--if
num38prev = num38signo
end--while
strsanitized = strsanitized .. '"' -- don't forget final quot
return strsanitized
end--function lfdminisani
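-- Usage sketch (illustration only, NOT part of the original module, the
-- results are derived by hand from the code above):
--   lfdminisani ('a-b',30) -- gives <<"a&#45;b">> (dash gets HTML-encoded)
--   lfdminisani (' x',30)  -- gives <<"{32}x">> (leading space made visible)
--   lfdminisani ('',30)    -- gives <<"">> (quasi-empty)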
------------------------------------------------------------------------
-- Local function LFDSHOWVCORE
-- Prebrew report about content of a variable including optional full
-- listing of a table with numeric and string keys. !!!FIXME!!!
-- Input : * vardubious -- content (any type including "nil" is acceptable)
-- * str77name -- name of the variable (string)
-- * vardescri -- optional comment, default empty, begin with "@" to
-- place it before name of the variable, else after
-- * varlim77tab -- optional limit, limits both string keys and
-- numeric keys, default ZERO no listing
-- Depends on functions :
-- [D] lfdminisani
local function lfdshowvcore (vardubious, str77name, vardescri, varlim77tab)
local taballkeystring = {}
local strtype = ''
local strreport = ''
local numindax = 0
local numlencx = 0
local numkeynumber = 0
local numkeystring = 0
local numkeycetera = 0
local numkey77min = 999999
local numkey77max = -999999
local boobe77fore = false
if (type(str77name)~='string') then
str77name = '??' -- bite the bullet
else
str77name = '"' .. str77name .. '"'
end--if
if (type(vardescri)~='string') then
vardescri = '' -- omit comment
end--if
if (string.len(vardescri)>=2) then
boobe77fore = (string.byte(vardescri,1,1)==64) -- prefix "@"
if (boobe77fore) then
vardescri = string.sub(vardescri,2,-1) -- CANNOT become empty
end--if
end--if
if (type(varlim77tab)~='number') then
varlim77tab = 0 -- deactivate listing of a table
end--if
if ((vardescri~='') and (not boobe77fore)) then
str77name = str77name .. ' (' .. vardescri .. ')' -- now a combo
end--if
strtype = type(vardubious)
if (strtype=='table') then
for k,v in pairs(vardubious) do
if (type(k)=='number') then
numkey77min = math.min (numkey77min,k)
numkey77max = math.max (numkey77max,k)
numkeynumber = numkeynumber + 1
else
if (type(k)=='string') then
taballkeystring [numkeystring] = k
numkeystring = numkeystring + 1
else
numkeycetera = numkeycetera + 1
end--if
end--if
end--for
strreport = 'Table ' .. str77name
if ((numkeynumber==0) and (numkeystring==0) and (numkeycetera==0)) then
strreport = strreport .. ' is empty'
else
strreport = strreport .. ' contains '
if (numkeynumber==0) then
strreport = strreport .. 'NO numeric keys'
end--if
if (numkeynumber==1) then
strreport = strreport .. 'a single numeric key equal ' .. tostring (numkey77min)
end--if
if (numkeynumber>=2) then
strreport = strreport .. tostring (numkeynumber) .. ' numeric keys ranging from ' .. tostring (numkey77min) .. ' to ' .. tostring (numkey77max)
end--if
strreport = strreport .. ' and ' .. tostring (numkeystring) .. ' string keys and ' .. tostring (numkeycetera) .. ' other keys'
end--if
if ((numkeynumber~=0) and (varlim77tab~=0)) then -- !!!FIXME!!!
strreport = strreport .. ' ### content num keys :'
numindax = numkey77min
while true do
if ((numindax>varlim77tab) or (numindax>numkey77max)) then
break -- done table
end--if
strreport = strreport .. ' ' .. tostring(numindax) .. ' -> ' .. lfdminisani(tostring(vardubious[numindax]),30)
numindax = numindax + 1
end--while
end--if
if ((numkeystring~=0) and (varlim77tab~=0)) then -- !!!FIXME!!!
strreport = strreport .. ' ### content string keys :'
end--if
else
strreport = 'Variable ' .. str77name .. ' has type "' .. strtype .. '"'
if (strtype=='string') then
numlencx = string.len (vardubious)
strreport = strreport .. ' and length ' .. tostring (numlencx)
if (numlencx~=0) then
strreport = strreport .. ' and content ' .. lfdminisani (vardubious,30)
end--if
else
if (strtype~='nil') then
strreport = strreport .. ' and content "' .. tostring (vardubious) .. '"'
end--if
end--if (strtype=='string') else
end--if (strtype=='table') else
if ((vardescri~='') and boobe77fore) then
strreport = vardescri .. ' : ' .. strreport -- very last step
end--if
return strreport
end--function lfdshowvcore
------------------------------------------------------------------------
-- Local function LFDSHOWVAR
-- Enhance upvalue "qstrtrace" with report about content of a variable
-- including optional full listing of a table with numeric and string keys. !!!FIXME!!!
-- Depends on functions :
-- [D] lfdminisani lfdshowvcore
-- upvalue "qstrtrace" must NOT be type "nil" on entry (is inited to "<br>")
-- uses upvalue "qboodetrc"
local function lfdshowvar (varduubious, strnaame, vardeskkri, vartabljjm)
if (qboodetrc) then
qstrtrace = qstrtrace .. lfdshowvcore (varduubious, strnaame, vardeskkri, vartabljjm) .. '.<br>' -- dot added !!!
end--if
end--function lfdshowvar
------------------------------------------------------------------------
---- MATH FUNCTIONS [E] ----
------------------------------------------------------------------------
-- test whether value is an integer in range min...max (inclusive), return boolean
local function mathisintrange (numzjinput, numzjmin, numzjmax)
local booisclean = false -- preASSume guilt
if (type(numzjinput)=='number') then -- no non-numbers, thanks
if (numzjinput==math.floor(numzjinput)) then -- no fractional values
booisclean = ((numzjinput>=numzjmin) and (numzjinput<=numzjmax)) -- range
end--if
end--if
return booisclean
end--function mathisintrange
local function mathdiv (xdividens, xdivisero)
local resultdiv = 0 -- Lua lacks a DIV operator :-(
resultdiv = math.floor (xdividens / xdivisero)
return resultdiv
end--function mathdiv
local function mathmod (xdividendo, xdivisoro)
local resultmod = 0 -- MOD operator is "%", a bitwise AND operator is lacking too
resultmod = xdividendo % xdivisoro
return resultmod
end--function mathmod
------------------------------------------------------------------------
-- Local function MATHBITTEST
-- Find out whether single bit selected by ZERO-based index is "1" / "true".
-- Result has type "boolean".
-- Depends on functions :
-- [E] mathdiv mathmod
local function mathbittest (numincoming, numbitindex)
local boores = false
while true do
if ((numbitindex==0) or (numincoming==0)) then
break -- we have either reached our bit or run out of bits
end--if
numincoming = mathdiv(numincoming,2) -- shift right
numbitindex = numbitindex - 1 -- count down to ZERO
end--while
boores = (mathmod(numincoming,2)==1) -- pick bit
return boores
end--function mathbittest
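-- Usage sketch (illustration only, NOT part of the original module, the
-- results are derived by hand from the code above):
--   mathbittest (5,0) -- "true"  (5 is binary 101, bit 0 is set)
--   mathbittest (5,1) -- "false" (bit 1 is clear)
--   mathbittest (5,2) -- "true"  (bit 2 is set)
--   mathbittest (5,9) -- "false" (ran out of bits before reaching index 9)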
------------------------------------------------------------------------
---- NUMBER CONVERSION FUNCTIONS [N] ----
------------------------------------------------------------------------
-- Local function LFDEC1DIGCL
-- Convert 1 decimal ASCII digit to UINT8, a non-digit or a digit above the limit gives ZERO.
local function lfdec1digcl (num1dugyt, num1clim)
num1dugyt = num1dugyt - 48 -- may become invalid ie negative
if ((num1dugyt<0) or (num1dugyt>num1clim)) then
num1dugyt = 0 -- valid ZERO output on invalid input digit
end--if
return num1dugyt
end--function lfdec1digcl
------------------------------------------------------------------------
-- Local function LFNONEHEXTOINT
-- Convert 1 ASCII code of a hex digit ("0"..."9" or uppercase "A"..."F") to an UINT4 ie 0...15 (255 invalid).
local function lfnonehextoint (numdigit)
local numresult = 255
if ((numdigit>47) and (numdigit<58)) then
numresult = numdigit-48
end--if
if ((numdigit>64) and (numdigit<71)) then
numresult = numdigit-55
end--if
return numresult
end--function lfnonehextoint
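-- Usage sketch (illustration only, NOT part of the original module, the
-- results are derived by hand from the code above):
--   lfnonehextoint (57)  -- 9 from "9"
--   lfnonehextoint (70)  -- 15 from "F"
--   lfnonehextoint (102) -- 255 from "f" since lowercase falls outside both ranges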
------------------------------------------------------------------------
-- Local function LFNUMTO2DIGIT
-- Convert integer 0...99 to decimal ASCII string always 2 digits "00"..."99".
-- Depends on functions :
-- [E] mathisintrange mathdiv mathmod
local function lfnumto2digit (numzerotoninetynine)
local strtwodig = '??' -- always 2 digits
if (mathisintrange(numzerotoninetynine,0,99)) then
strtwodig = tostring(mathdiv(numzerotoninetynine,10)) .. tostring(mathmod(numzerotoninetynine,10))
end--if
return strtwodig
end--function lfnumto2digit
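-- Usage sketch (illustration only, NOT part of the original module, the
-- results are derived by hand from the code above):
--   lfnumto2digit (7)   -- "07"
--   lfnumto2digit (42)  -- "42"
--   lfnumto2digit (100) -- "??" since out of range
--   lfnumto2digit (1.5) -- "??" since not an integer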
------------------------------------------------------------------------
---- STRING FUNCTIONS [G] ---- !!!FIXME!!! move [I] functions out from here
------------------------------------------------------------------------
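-- Local function LFGSTRINGRANGE
-- Test whether incoming value is a string with length in range
-- min...max (inclusive), return boolean, non-string gives "false".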
local function lfgstringrange (varvictim, nummini, nummaxi)
local nummylengthofstr = 0
local booveryvalid = false -- preASSume guilt
if (type(varvictim)=='string') then
nummylengthofstr = string.len(varvictim)
booveryvalid = ((nummylengthofstr>=nummini) and (nummylengthofstr<=nummaxi))
end--if
return booveryvalid
end--function lfgstringrange
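-- Usage sketch (illustration only, NOT part of the original module, the
-- results are derived by hand from the code above):
--   lfgstringrange ('sv',2,3)  -- "true"
--   lfgstringrange ('',2,3)    -- "false" since too short
--   lfgstringrange (nil,0,99)  -- "false" since not a string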
------------------------------------------------------------------------
-- Local function LFGPOKESTRING
-- Replace single octet in a string.
-- Input : * strinpokeout -- empty legal
-- * numpokepoz -- ZERO-based, out of range legal
-- * numpokeval -- new value
-- This is inefficient by design of Lua. The caller is responsible for
-- minimizing the number of invocations of this, in particular, not
-- calling it if the new value is equal to the existing one.
local function lfgpokestring (strinpokeout, numpokepoz, numpokeval)
local numpokelen = 0
numpokelen = string.len(strinpokeout)
if ((numpokelen==1) and (numpokepoz==0)) then
strinpokeout = string.char(numpokeval) -- totally replace
end--if
if (numpokelen>=2) then
if (numpokepoz==0) then
strinpokeout = string.char(numpokeval) .. string.sub (strinpokeout,2,numpokelen)
end--if
if ((numpokepoz>0) and (numpokepoz<(numpokelen-1))) then
strinpokeout = string.sub (strinpokeout,1,numpokepoz) .. string.char(numpokeval) .. string.sub (strinpokeout,(numpokepoz+2),numpokelen)
end--if
if (numpokepoz==(numpokelen-1)) then
strinpokeout = string.sub (strinpokeout,1,(numpokelen-1)) .. string.char(numpokeval)
end--if
end--if (numpokelen>=2) then
return strinpokeout
end--function lfgpokestring
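-- Usage sketch (illustration only, NOT part of the original module, the
-- results are derived by hand from the code above, 88 is ASCII "X"):
--   lfgpokestring ('abc',1,88) -- "aXc" (position is ZERO-based)
--   lfgpokestring ('abc',2,88) -- "abX"
--   lfgpokestring ('abc',5,88) -- "abc" unchanged since out of range
--   lfgpokestring ('',0,88)    -- "" unchanged since empty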
------------------------------------------------------------------------
-- test whether char is an ASCII digit "0"..."9", return boolean
local function lfgtestnum (numkaad)
local boodigit = false
boodigit = ((numkaad>=48) and (numkaad<=57))
return boodigit
end--function lfgtestnum
------------------------------------------------------------------------
-- test whether char is an ASCII uppercase letter, return boolean
local function lfgtestuc (numkode)
local booupperc = false
booupperc = ((numkode>=65) and (numkode<=90))
return booupperc
end--function lfgtestuc
------------------------------------------------------------------------
-- test whether char is an ASCII lowercase letter, return boolean
local function lfgtestlc (numcode)
local boolowerc = false
boolowerc = ((numcode>=97) and (numcode<=122))
return boolowerc
end--function lfgtestlc
------------------------------------------------------------------------
-- Local function LFIMULTESTUC
-- Test whether incoming string consists of given number
-- of ASCII uppercase letters, return boolean.
-- return "true" on success
-- Depends on functions :
-- [G] lfgtestuc
local function lfimultestuc (strinputi, numlenc)
local booallupper = false
local numtestindexx = 1 -- ONE-based
local numtestedchar = 0
booallupper = (string.len(strinputi)==numlenc)
if (booallupper) then
while true do
if (numtestindexx>numlenc) then
break
end--if
numtestedchar = string.byte (strinputi,numtestindexx,numtestindexx)
booallupper = booallupper and (lfgtestuc(numtestedchar))
numtestindexx = numtestindexx + 1
end--while
end--if
return booallupper
end--function lfimultestuc
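-- Usage sketch for "lfimultestuc" (illustrative only, sample values are
-- made up):
-- lfimultestuc ("EO",2) -- true
-- lfimultestuc ("Eo",2) -- false, "o" is lowercase
-- lfimultestuc ("EOX",2) -- false, length 3 is not 2
-- lfimultestuc ("",0) -- true, ZERO letters requested and found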
------------------------------------------------------------------------
-- Local function LFIBANMULTI
-- Test string for validity by banning listed single char:s by multiplicity.
-- Input : * "strkoneven" -- even and 2...24, wrong length gives
-- "true", tolerated multiplicity "0"..."9"
-- * "strsample" -- 0...1'024, empty gives "false",
-- too long gives "true"
-- Output : * "booisevil" -- "true" if evil
-- Depends on functions :
-- [G] lfgtestnum
-- [E] mathmod
-- Incoming control string "strkoneven" with pairs of char:s, for
-- example "'2&0" will tolerate 2 consecutive apo:s but
-- not 3, and completely ban the and-sign "&".
local function lfibanmulti (strkoneven, strsample)
local booisevil = false
local numkonlen = 0 -- length of control string
local numsamlen = 0 -- length of sample string
local numinndex = 0 -- ZERO-based outer index
local numinneri = 0 -- ZERO-based inner index
local numchear = 0
local numnexxt = 0
local nummultiq = 1 -- counted multiplicity
local numcrapp = 0 -- from "strkoneven" char to test
local numvrapp = 0 -- from "strkoneven" multiplicity limit
numsamlen = string.len (strsample)
if (numsamlen~=0) then
numkonlen = string.len (strkoneven)
booisevil = (numkonlen<2) or (numkonlen>24) or (mathmod(numkonlen,2)~=0) or (numsamlen>1024)
while true do -- outer loop
if (booisevil or (numinndex>=numsamlen)) then
break
end--if
numchear = string.byte (strsample,(numinndex+1),(numinndex+1))
if (numchear==0) then
booisevil = true -- ZERO is unconditionally prohibited
break
end--if
numinndex = numinndex + 1
numnexxt = 0
if (numinndex~=numsamlen) then
numnexxt = string.byte (strsample,(numinndex+1),(numinndex+1))
end--if
if (numchear==numnexxt) then
nummultiq = nummultiq + 1
end--if
if ((numchear~=numnexxt) or (numinndex==numsamlen)) then
numinneri = 0
while true do -- inner loop
if (numinneri==numkonlen) then
break
end--if
numcrapp = string.byte (strkoneven,(numinneri+1),(numinneri+1))
numvrapp = string.byte (strkoneven,(numinneri+2),(numinneri+2))
if (not lfgtestnum(numvrapp)) then
booisevil = true -- crime in control string detected
break
end--if
if ((numchear==numcrapp) and (nummultiq>(numvrapp-48))) then
booisevil = true -- multiplicity crime in sample string detected
break
end--if
numinneri = numinneri + 2 -- ZERO-based inner index and STEP 2
end--while -- inner loop
if (booisevil) then
break
end--if
nummultiq = 1 -- restart from ONE !!!
end--if ((numchear~=numnexxt) or (numinndex==numsamlen)) then
end--while -- outer loop
end--if (numsamlen~=0) then
return booisevil
end--function lfibanmulti
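-- Usage sketch for "lfibanmulti" (illustrative only, results traced by
-- hand, assuming "mathmod" is the usual modulo). The control string '2&0
-- (apo, "2", and-sign, "0") is written in LUA source as "'2&0":
-- lfibanmulti ("'2&0","it''s") -- false, 2 consecutive apo:s tolerated
-- lfibanmulti ("'2&0","it'''s") -- true, 3 consecutive apo:s
-- lfibanmulti ("'2&0","a&b") -- true, and-sign completely banned
-- lfibanmulti ("'2&0","") -- false, empty sample is never evil
-- lfibanmulti ("'2&","abc") -- true, control string has odd length
-- lfibanmulti ("'x","abc") -- true, "x" is not a digit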
------------------------------------------------------------------------
-- Local function LFTESTSPACE
-- Detect leading or trailing space in a string.
local function lftestspace (strsuspectofspacing)
local boospacedetected = false
local numspacelength = 0
numspacelength = string.len(strsuspectofspacing)
if (numspacelength~=0) then
boospacedetected = (string.byte(strsuspectofspacing,1,1)==32) or (string.byte(strsuspectofspacing,numspacelength,numspacelength)==32)
end--if
return boospacedetected
end--function lftestspace
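-- Usage sketch for "lftestspace" (illustrative only, only the ASCII
-- space 32 is detected, and only at the 2 ends):
-- lftestspace (" abc") -- true, leading space
-- lftestspace ("abc ") -- true, trailing space
-- lftestspace ("a b") -- false, inner space is ignored
-- lftestspace ("") -- false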
------------------------------------------------------------------------
-- Local function LFIDEBRACKET
-- Separate bracketed part of a string and return the inner or outer
-- part. On failure the string is returned complete and unchanged.
-- There must be exactly ONE "(" and exactly ONE ")" in correct order.
-- Input : * numxminlencz -- minimal length of inner part, must be >= 1 !!!
-- Note that for length of hit ZERO ie "()" we have "begg" + 1 = "endd"
-- and for length of hit ONE ie "(x)" we have "begg" + 2 = "endd".
-- Example: "crap (NO)" -> len = 9
-- 123456789
-- "begg" = 6 and "endd" = 9
-- Expected result: "NO" or "crap " (note the trailing space)
-- Example: "(XX) YES" -> len = 8
-- 12345678
-- "begg" = 1 and "endd" = 4
-- Expected result: "XX" or " YES" (note the leading space)
local function lfidebracket (strdeath, boooutside, numxminlencz)
local numindoux = 1 -- ONE-based
local numdlong = 0
local numwesel = 0
local numbegg = 0 -- ONE-based, ZERO invalid
local numendd = 0 -- ONE-based, ZERO invalid
numdlong = string.len (strdeath)
while true do
if (numindoux>numdlong) then
break -- ONE-based -- end of string, hit only if both "numbegg" and "numendd" are non-ZERO
end--if
numwesel = string.byte(strdeath,numindoux,numindoux)
if (numwesel==40) then -- "("
if (numbegg==0) then
numbegg = numindoux -- pos of "("
else
numbegg = 0
break -- damn: more than 1 "(" present
end--if
end--if
if (numwesel==41) then -- ")"
if ((numendd==0) and (numbegg~=0) and ((numbegg+numxminlencz)<numindoux)) then
numendd = numindoux -- pos of ")"
else
numendd = 0
break -- damn: more than 1 ")" present or ")" precedes "("
end--if
end--if
numindoux = numindoux + 1
end--while
if ((numbegg~=0) and (numendd~=0)) then
if (boooutside) then
strdeath = string.sub(strdeath,1,(numbegg-1)) .. string.sub(strdeath,(numendd+1),numdlong)
else
strdeath = string.sub(strdeath,(numbegg+1),(numendd-1)) -- separate substring
end--if
end--if
return strdeath -- same string variable
end--function lfidebracket
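-- Usage sketch for "lfidebracket" (illustrative only, the first 4 calls
-- repeat the examples from the description, the remaining ones were
-- traced by hand):
-- lfidebracket ("crap (NO)",false,1) -- "NO" (inner part)
-- lfidebracket ("crap (NO)",true,1) -- "crap " (outer part)
-- lfidebracket ("(XX) YES",false,1) -- "XX"
-- lfidebracket ("(XX) YES",true,1) -- " YES"
-- lfidebracket ("a (b) (c)",false,1) -- "a (b) (c)" (2 "(" thus unchanged)
-- lfidebracket ("x ()",false,1) -- "x ()" (inner part too short)
-- lfidebracket ("x (ab)",false,3) -- "x (ab)" (inner part shorter than 3)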
------------------------------------------------------------------------
---- UTF8 FUNCTIONS [U] ----
------------------------------------------------------------------------
-- Local function LFULNUTF8CHAR
-- Evaluate length of a single UTF8 char in octet:s.
-- Input : * numbgoctet -- beginning octet of a UTF8 char
-- Output : * numlen1234x -- number 1...4 or ZERO if invalid
-- Does NOT thoroughly check the validity, looks at 1 octet only.
local function lfulnutf8char (numbgoctet)
local numlen1234x = 0
if (numbgoctet<128) then
numlen1234x = 1 -- $00...$7F -- ANSI/ASCII
end--if
if ((numbgoctet>=194) and (numbgoctet<=223)) then
numlen1234x = 2 -- $C2 to $DF
end--if
if ((numbgoctet>=224) and (numbgoctet<=239)) then
numlen1234x = 3 -- $E0 to $EF
end--if
if ((numbgoctet>=240) and (numbgoctet<=244)) then
numlen1234x = 4 -- $F0 to $F4
end--if
return numlen1234x
end--function lfulnutf8char
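-- Usage sketch for "lfulnutf8char" (illustrative only, argument is the
-- decimal value of the beginning octet):
-- lfulnutf8char (65) -- 1 ("A", ASCII)
-- lfulnutf8char (195) -- 2 ($C3)
-- lfulnutf8char (226) -- 3 ($E2)
-- lfulnutf8char (240) -- 4 ($F0)
-- lfulnutf8char (128) -- 0 ($80 is a continuation octet)
-- lfulnutf8char (192) -- 0 ($C0 is never valid in UTF8)
-- lfulnutf8char (245) -- 0 ($F5 is above the permitted range)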
------------------------------------------------------------------------
-- Local function LFUCASEGENE
-- Adjust (generous) case of a single letter (from ASCII + limited extra
-- set from UTF8 with some common ranges) or longer string. (this is GENE)
-- Input : * strinco7cs : single unicode letter (1 or 2 octet:s) or
-- longer string
-- * booup7cas : for desired output uppercase "true" and for
-- lowercase "false"
-- * boodo7all : "true" to adjust all letters, "false"
-- only beginning letter
-- Output : * strinco7cs
-- Depends on functions : (this is GENE)
-- [U] lfulnutf8char
-- [G] lfgpokestring lfgtestuc lfgtestlc
-- [E] mathdiv mathmod mathbittest
-- This process never changes the length of a string in octet:s. Empty string
-- on input is legal and results in an empty string returned. When case is
-- adjusted, a 1-octet or 2-octet letter is replaced by another letter of same
-- length. Unknown valid char:s (1-octet ... 4-octet) are copied. Broken UTF8
-- stream results in remaining part of the output string (from 1 char to
-- complete length of the incoming string) filled by "Z".
-- * lowercase is usually above uppercase, but not always, letters can be
-- only misaligned (UC even vs UC odd), and rarely even swapped (French "Y")
-- * case delta can be 1 or $20 or $50 or other
-- * case pair distance can span $40-boundary or even $0100-boundary
-- * in the ASCII range lowercase is $20 above uppercase, b5 reveals
-- the case (1 is lower)
-- * the same is valid in $C3-block
-- * this is NOT valid in $C4-$C5-block, lowercase is usually 1 above
-- uppercase, but nothing reveals the case reliably
-- ## $C2-block $0080 $C2,$80 ... $00BF $C2,$BF no letters (OTOH NBSP mm)
-- ## $C3-block $00C0 $C3,$80 ... $00FF $C3,$BF (SV mm) delta $20 UC-LC-UC-LC
-- upper $00C0 $C3,$80 ... $00DF $C3,$9F
-- lower $00E0 $C3,$A0 ... $00FF $C3,$BF
-- AA AE EE NN OE UE mm
-- $D7 $DF $F7 excluded (not letters)
-- $FF excluded (here LC, UC is $0178)
-- ## $C4-$C5-block $0100 $C4,$80 ... $017F $C5,$BF (EO mm)
-- delta 1 and UC even, but messy with many exceptions
-- EO $0108 ... $016D case delta 1
-- for example SX upper $015C $C5,$9C -- lower $015D $C5,$9D
-- $0138 $0149 $017F excluded (not letters)
-- $0178 excluded (here UC, LC is $FF)
-- $0100 ... $0137 UC even
-- $0139 ... $0148 misaligned (UC odd) note that case delta is NOT reversed
-- $014A ... $0177 UC even again
-- $0179 ... $017E misaligned (UC odd) note that case delta is NOT reversed
-- ## $CC-$CF-block $0300 $CC,$80 ... $03FF $CF,$BF (EL mm) delta $20
-- EL $0370 ... $03FF (officially)
-- strict EL base range $0391 ... $03C9 case delta $20
-- $0391 $CE,$91 ... $03AB $CE,$AB upper
-- $03B1 $CE,$B1 ... $03CB $CF,$8B lower
-- for example "omega" upper $03A9 $CE,$A9 -- lower $03C9 $CF,$89
-- ## $D0-$D3-block $0400 $D0,$80 ... $04FF $D3,$BF (RU mm)
-- * delta $20 $50 1
-- * strict RU base range $0410 ... $044F case delta $20 but there
-- is 1 extra char outside !!!
-- * $0410 $D0,$90 ... $042F $D0,$AF upper
-- * $0430 $D0,$B0 ... $044F $D1,$8F lower
-- * for example "CCCP-gamma" upper $0413 $D0,$93 -- lower $0433 $D0,$B3
-- * extra base char and exception is special "E" with horizontal doubledot
-- case delta $50 (upper $0401 $D0,$81 -- lower $0451 $D1,$91)
-- * same applies for ranges $0400 $D0,$80 ... $040F $D0,$8F upper
-- and $0450 $D1,$90 ... $045F $D1,$9F lower
-- * range $0460 $D1,$A0 ... $04FF $D3,$BF (ancient RU, UK, RUE, ...) case
-- delta 1 and UC usually even, but messy with many exceptions $048x
-- $04Cx (combining decorations and misaligned)
-- Variables "numdel7abs" and "numdel7ta" must be at least 16-bit to avoid
-- misevaluation or wrong wrapping when fitting into the range 128...191,
-- even if no deltas exceeding +-127 are supported (there are very few pairs
-- of char:s exceeding this). Also both can be declared unsigned since only
-- addition and subtraction are performed on them.
-- We peek max 2 values per iteration, and change the string in-place, doing
-- so strictly only if there indeed is a change. This is important for LUA
-- where the in-place write access must be emulated by means of a less
-- efficient function.
local function lfucasegene (strinco7cs, booup7cas, boodo7all)
local numlong7den = 0 -- actual length of input string
local numokt7index = 0
local numlong7bor = 0 -- expected length of single char
local numdel7abs = 0 -- at least 16-bit, absolute posi delta
local numdel7ta = 0 -- quasi-signed at least 16-bit, can be negative
local numdel7car = 0 -- quasi-signed 8-bit, can be negative
local numcha7r = 0 -- UINT8 beginning char
local numcha7s = 0 -- UINT8 later char (BIG ENDIAN, lower value here above)
local numcxa7rel = 0 -- UINT8 code relative to beginning of block $00...$FF
local boowan7tlowr = false
local boois7uppr = false
local boois7lowr = false
local boomy7bit0x = false -- single relevant bits picked -- b0
local boomy7bit5x = false -- single relevant bits picked -- b5
local boopen7din = false -- only fake loop
local boodo7adj = true -- preASSume innocence -- continue changing
local boobotch7d = false -- preASSume innocence -- NOT yet botched
local booc3block = false -- $C3 only $00C0...$00FF SV mm delta 32
local booc4c5blk = false -- $C4 $C5 $0100...$017F EO mm delta 1
local boocccfblk = false -- $CC $CF $0300...$03FF EL mm delta 32
local bood0d3blk = false -- $D0 $D3 $0400...$04FF RU mm delta 32 80
booup7cas = not (not booup7cas)
boowan7tlowr = (not booup7cas)
numlong7den = string.len (strinco7cs)
while true do -- genuine loop over incoming string (this is GENE)
if (numokt7index>=numlong7den) then
break -- done complete string
end--if
if ((not boodo7all) and (numokt7index~=0)) then -- test "~=0" since index can skip ONE
boodo7adj = false
end--if
boois7uppr = false -- preASSume on every iteration
boois7lowr = false -- preASSume on every iteration
numdel7ta = 0 -- preASSume on every iteration
numlong7bor = 1 -- preASSume on every iteration
while true do -- fake loop (this is GENE)
numcha7r = string.byte (strinco7cs,(numokt7index+1),(numokt7index+1))
if (boobotch7d) then
numdel7ta = 90 - numcha7r -- "Z" -- delta must be non-ZERO to write
break -- fill with "Z" char:s
end--if
if (not boodo7adj) then
break -- copy octet after octet
end--if
numlong7bor = lfulnutf8char(numcha7r)
if ((numlong7bor==0) or ((numokt7index+numlong7bor)>numlong7den)) then
numlong7bor = 1 -- reassign to ONE !!!
numdel7ta = 90 - numcha7r -- "Z" -- delta must be non-ZERO to write
boobotch7d = true
break -- truncated char or broken stream
end--if
if (numlong7bor>=3) then
break -- copy UTF8 char, no chance for adjustment
end--if
if (numlong7bor==1) then
boois7uppr = lfgtestuc(numcha7r)
boois7lowr = lfgtestlc(numcha7r)
if (boois7uppr and boowan7tlowr) then
numdel7ta = 32 -- ASCII UPPER->lower
end--if
if (boois7lowr and booup7cas) then
numdel7ta = -32 -- ASCII lower->UPPER
end--if
break -- success with ASCII and one char almost done
end--if
booc3block = (numcha7r==195) -- case delta is 32
booc4c5blk = ((numcha7r==196) or (numcha7r==197)) -- case delta is 1
boocccfblk = ((numcha7r>=204) and (numcha7r<=207)) -- case delta is 32
bood0d3blk = ((numcha7r>=208) and (numcha7r<=211)) -- case delta is 32 80 1
numcha7s = string.byte (strinco7cs,(numokt7index+2),(numokt7index+2)) -- only $80 to $BF
numcxa7rel = (mathmod(numcha7r,4)*64) + (numcha7s-128) -- 4 times 64
boomy7bit0x = ((mathmod(numcxa7rel,2))==1)
boomy7bit5x = mathbittest(numcxa7rel,5)
if (booc3block) then
boopen7din = true -- pending flag
if ((numcxa7rel==215) or (numcxa7rel==223) or (numcxa7rel==247)) then
boopen7din = false -- not a letter, we are done
end--if
if (numcxa7rel==255) then
boopen7din = false -- special LC silly "Y" with horizontal doubledot
if (booup7cas) then
numdel7ta = 121 -- lower->UPPER (distant and reversed order)
end--if
end--if
if (boopen7din) then
boois7lowr = boomy7bit5x -- mostly regular block, look at b5
boois7uppr = not boois7lowr
if (boois7uppr and boowan7tlowr) then
numdel7ta = 32 -- UPPER->lower
end--if
if (boois7lowr and booup7cas) then
numdel7ta = -32 -- lower->UPPER
end--if
end--if (boopen7din) then
break -- to join mark
end--if (booc3block) then
if (booc4c5blk) then
boopen7din = true -- pending flag
if ((numcxa7rel==56) or (numcxa7rel==73) or (numcxa7rel==127)) then
boopen7din = false -- not a letter, we are done
end--if
if (numcxa7rel==120) then
boopen7din = false -- special UC silly "Y" with horizontal doubledot
if (boowan7tlowr) then
numdel7ta = -121 -- UPPER->lower (distant and reversed order)
end--if
end--if
if (boopen7din) then
if (((numcxa7rel>=57) and (numcxa7rel<=73)) or (numcxa7rel>=121)) then
boois7lowr = not boomy7bit0x -- UC odd (misaligned)
else
boois7lowr = boomy7bit0x -- UC even (ordinary align)
end--if
boois7uppr = not boois7lowr
if (boois7uppr and boowan7tlowr) then
numdel7ta = 1 -- UPPER->lower
end--if
if (boois7lowr and booup7cas) then
numdel7ta = -1 -- lower->UPPER
end--if
end--if (boopen7din) then
break -- to join mark
end--if (booc4c5blk) then
if (boocccfblk) then
boois7uppr = ((numcxa7rel>=145) and (numcxa7rel<=171))
boois7lowr = ((numcxa7rel>=177) and (numcxa7rel<=203))
if (boois7uppr and boowan7tlowr) then
numdel7ta = 32 -- UPPER->lower
end--if
if (boois7lowr and booup7cas) then
numdel7ta = -32 -- lower->UPPER
end--if
break -- to join mark
end--if (boocccfblk) then
if (bood0d3blk) then
if (numcxa7rel<=95) then -- messy layout but no exceptions
boois7lowr = (numcxa7rel>=48) -- delta $20 or $50
boois7uppr = not boois7lowr
numdel7abs = 32 -- $20
if ((numcxa7rel<=15) or (numcxa7rel>=80)) then
numdel7abs = 80 -- $50
end--if
end--if
if ((numcxa7rel>=96) and (numcxa7rel<=129)) then -- no exceptions here
boois7lowr = boomy7bit0x -- UC even (ordinary align)
boois7uppr = not boois7lowr
numdel7abs = 1
end--if
if (numcxa7rel>=138) then -- some misaligns here !!!FIXME!!!
boois7lowr = boomy7bit0x -- UC even (ordinary align)
boois7uppr = not boois7lowr
numdel7abs = 1
end--if
if (boois7uppr and boowan7tlowr) then
numdel7ta = numdel7abs -- UPPER->lower
end--if
if (boois7lowr and booup7cas) then
numdel7ta = -numdel7abs -- lower->UPPER
end--if
break -- to join mark
end--if (bood0d3blk) then
break -- finally to join mark -- unknown non-ASCII char is a fact :-(
end--while -- fake loop -- join mark (this is GENE)
if ((numlong7bor==1) and (numdel7ta~=0)) then -- no risk of carry here
strinco7cs = lfgpokestring (strinco7cs,numokt7index,(numcha7r+numdel7ta))
end--if
if ((numlong7bor==2) and (numdel7ta~=0)) then -- carry into beginning octet possible here
numdel7car = 0
while true do -- inner genuine loop
if ((numcha7s+numdel7ta)<192) then
break
end--if
numdel7ta = numdel7ta - 64 -- get it down into range 128...191
numdel7car = numdel7car + 1 -- BIG ENDIAN 6 bits with carry
end--while
while true do -- inner genuine loop
if ((numcha7s+numdel7ta)>127) then
break
end--if
numdel7ta = numdel7ta + 64 -- get it up into range 128...191
numdel7car = numdel7car - 1 -- BIG ENDIAN 6 bits with carry
end--while
if (numdel7car~=0) then -- in-place change only if needed
strinco7cs = lfgpokestring (strinco7cs,numokt7index,(numcha7r+numdel7car))
end--if
if (numdel7ta~=0) then -- in-place change only if needed
strinco7cs = lfgpokestring (strinco7cs,(numokt7index+1),(numcha7s+numdel7ta))
end--if
end--if
numokt7index = numokt7index + numlong7bor -- advance in incoming string
end--while -- genuine loop over incoming string (this is GENE)
return strinco7cs
end--function lfucasegene
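-- Usage sketch for "lfucasegene" (illustrative only, results traced by
-- hand through the code above, assuming that "mathmod" is the usual modulo
-- and that "mathbittest(n,5)" tests the bit with value 32, strings are
-- given with decimal octet escapes):
-- lfucasegene ("abc",true,false) -- "Abc"
-- lfucasegene ("abc",true,true) -- "ABC"
-- lfucasegene ("\196\137u",true,false) -- "\196\136u" ie $0109 -> $0108
-- lfucasegene ("\195\132",false,true) -- "\195\164" ie $00C4 -> $00E4
-- lfucasegene ("a\255b",true,true) -- "AZZ" (octet 255 breaks the stream)
-- Worked carry for $00FF -> $0178 (uppercase wanted): octets $C3,$BF and
-- delta 121, the sum 191+121 = 312 is reduced twice by 64 to 184 = $B8
-- giving carry 2, thus beginning octet $C3+2 = $C5 and result $C5,$B8.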
------------------------------------------------------------------------
---- HIGH LEVEL STRING FUNCTIONS [I] ---- !!!FIXME!!! move to here
------------------------------------------------------------------------
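-- Local function LFILONGNAME
-- Prefix a language name depending on the site language "strctlcode":
-- "eo" adds "la " if the name ends with "a", "id" adds "bahasa ",
-- other codes and empty name return the string unchanged.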
local function lfilongname (strlingvonomo, strctlcode)
local numsepanjang = 0
local numsejuta = 0
numsepanjang = string.len(strlingvonomo)
if ((numsepanjang>=1) and (strctlcode=="eo")) then
numsejuta = string.byte (strlingvonomo,numsepanjang,numsepanjang)
if (numsejuta==97) then
strlingvonomo = "la " .. strlingvonomo
end--if
end--if
if ((numsepanjang>=1) and (strctlcode=="id")) then
strlingvonomo = "bahasa " .. strlingvonomo
end--if
return strlingvonomo
end--function lfilongname
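-- Usage sketch for "lfilongname" (illustrative only, sample names are
-- made up):
-- lfilongname ("angla","eo") -- "la angla" (ends with "a")
-- lfilongname ("Esperanto","eo") -- "Esperanto" (does not end with "a")
-- lfilongname ("Inggris","id") -- "bahasa Inggris"
-- lfilongname ("","id") -- "" (empty name stays empty)
-- lfilongname ("engelska","sv") -- "engelska" (no rule for "sv")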
------------------------------------------------------------------------
-- Local function LFSTRIPPARENT
-- Strip part of string hidden in parentheses.
-- Copy from "strwithparent" to "strystripped" until string " (" found.
-- Only the "(" is checked, the char preceding it is dropped whatever it is.
local function lfstripparent (strwithparent)
local strystripped = ''
local numloongwy = 0
local numiindexx = 0 -- ZERO-based
local numocct = 0
local numoddt = 0
numloongwy = string.len(strwithparent)
while true do
if (numiindexx==numloongwy) then
break -- copied whole string
end--if
numocct = string.byte(strwithparent,(numiindexx+1),(numiindexx+1))
numoddt = 0
if ((numiindexx+1)<numloongwy) then
numoddt = string.byte(strwithparent,(numiindexx+2),(numiindexx+2))
end--if
if (numoddt==40) then
break -- stop copying at " (" (2 char:s but only 1 checked)
end--if
strystripped = strystripped .. string.char(numocct)
numiindexx = numiindexx + 1
end--while
return strystripped
end--function lfstripparent
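-- Usage sketch for "lfstripparent" (illustrative only, results traced by
-- hand, note the last 2 cases caused by the one-octet lookahead):
-- lfstripparent ("kato (besto)") -- "kato"
-- lfstripparent ("kato") -- "kato"
-- lfstripparent ("kato(besto)") -- "kat" (the "o" preceding "(" is lost)
-- lfstripparent ("(besto)") -- "(besto)" (a "(" at the very beginning is NOT seen)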
------------------------------------------------------------------------
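-- Local function LFCHK789UCASE
-- Test whether an ASCII code is one of the uppercase letters
-- C I M N P U W (always accepted), "L" (accepted only if
-- "boollaccepted") or "X" (accepted only if "booxxaccepted").
local function lfchk789ucase (numasciicode, boollaccepted, booxxaccepted)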
local boopositiveverdict = false
if (numasciicode==88) then -- X
boopositiveverdict = booxxaccepted
else
if (numasciicode==76) then -- L
boopositiveverdict = boollaccepted
else
boopositiveverdict = ((numasciicode==67) or (numasciicode==73) or (numasciicode==77) or (numasciicode==78) or (numasciicode==80) or (numasciicode==85) or (numasciicode==87)) -- C I M N P U W
end--if
end--if
return boopositiveverdict
end--function lfchk789ucase
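-- Usage sketch for "lfchk789ucase" (illustrative only):
-- lfchk789ucase (67,false,false) -- true ("C" is always accepted)
-- lfchk789ucase (76,true,false) -- true ("L" accepted on request)
-- lfchk789ucase (76,false,true) -- false ("L" not requested)
-- lfchk789ucase (88,false,true) -- true ("X" accepted on request)
-- lfchk789ucase (65,true,true) -- false ("A" is never accepted)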
------------------------------------------------------------------------
-- Local function LFIVALIDATELNKOADV
-- Advanced test whether a string (intended to be a langcode) is valid
-- containing only 2 or 3 lowercase letters, or 2...10 char:s and with some
-- dashes, or maybe a digit in middle position or maybe instead equals to "-"
-- or "??" and maybe additionally is not included on the ban list.
-- Input : * strqooq -- string (empty is useless and returns
-- "false" ie "bad" but cannot cause any major harm)
-- * booyesdsh -- "true" to allow special code dash "-"
-- * booyesqst -- "true" to allow special code doublequest "??"
-- * booloonkg -- "true" to allow long codes such as "zh-min-nan"
-- * boodigit -- "true" to allow digit in middle position
-- * boonoban -- (inverted) "true" to skip test against ban table
-- Output : * booisvaladv -- true if string is valid
-- Depends on functions :
-- [G] lfgtestnum lfgtestlc
-- Depends on constants :
-- * table "contabisbanned"
-- Incoming empty string is safe but type "nil" is NOT.
-- Digit is tolerable only ("and" applies):
-- * if boodigit is "true"
-- * if length is 3 char:s
-- * in middle position
-- Dashes are tolerable (except in special code "-") only ("and" applies):
-- * if length is at least 4 char:s (if this is permitted at all)
-- * in inner positions
-- * NOT adjacent
-- * maximally TWO totally
-- There may be maximally 3 adjacent letters, this makes at least ONE dash
-- obligatory for length 4...7, and TWO dashes for length 8...10.
local function lfivalidatelnkoadv (strqooq, booyesdsh, booyesqst, booloonkg, boodigit, boonoban)
local varomongkosong = 0 -- for check against the ban list
local numchiiar = 0
local numukurran = 0
local numindeex = 0 -- ZERO-based -- two loops
local numadjlet = 0 -- number of adjacent letters (max 3)
local numadjdsh = 0 -- number of adjacent dashes (max 1)
local numtotdsh = 0 -- total number of dashes (max 2)
local booislclc = false
local booisdigi = false
local booisdash = false
local booisvaladv = true -- preASSume innocence -- later final verdict here
while true do -- fake (outer) loop
if (strqooq=='-') then
booisvaladv = booyesdsh
break -- to join mark -- good or bad
end--if
if (strqooq=='??') then
booisvaladv = booyesqst
break -- to join mark -- good or bad
end--if
numukurran = string.len (strqooq)
if ((numukurran<2) or (numukurran>10)) then
booisvaladv = false
break -- to join mark -- evil
end--if
if (not booloonkg and (numukurran>3)) then
booisvaladv = false
break -- to join mark -- evil
end--if
numindeex = 0
while true do -- inner genuine loop over char:s
if (numindeex>=numukurran) then
break -- done -- good
end--if
numchiiar = string.byte (strqooq,(numindeex+1),(numindeex+1))
booisdash = (numchiiar==45)
booisdigi = lfgtestnum(numchiiar)
booislclc = lfgtestlc(numchiiar)
if (not (booislclc or booisdigi or booisdash)) then
booisvaladv = false
break -- to join mark -- inherently bad char
end--if
if (booislclc) then
numadjlet = numadjlet + 1
else
numadjlet = 0
end--if
if (booisdigi and ((numukurran~=3) or (numindeex~=1) or (not boodigit))) then
booisvaladv = false
break -- to join mark -- illegal digit
end--if
if (booisdash) then
if ((numukurran<4) or (numindeex==0) or ((numindeex+1)==numukurran)) then
booisvaladv = false
break -- to join mark -- illegal dash
end--if
numadjdsh = numadjdsh + 1
numtotdsh = numtotdsh + 1 -- total
else
numadjdsh = 0 -- do NOT zeroize the total !!!
end--if
if ((numadjlet>3) or (numadjdsh>1) or (numtotdsh>2)) then
booisvaladv = false
break -- to join mark -- evil
end--if
numindeex = numindeex + 1 -- ZERO-based
end--while -- inner genuine loop over char:s
if (not boonoban) then -- if "yesban" then
numindeex = 0
while true do -- lower inner genuine loop
varomongkosong = contabisbanned[numindeex+1] -- number of elem unknown
if (type(varomongkosong)~='string') then
break -- abort inner loop (then outer fake loop) due to end of table
end--if
numukurran = string.len (varomongkosong)
if ((numukurran<2) or (numukurran>3)) then
break -- abort inner loop (then outer fake loop) due to faulty table
end--if
if (strqooq==varomongkosong) then
booisvaladv = false
break -- abort inner loop (then outer fake loop) due to violation
end--if
numindeex = numindeex + 1 -- ZERO-based
end--while -- lower inner genuine loop
end--if (not boonoban) then
break -- finally to join mark
end--while -- fake loop -- join mark
return booisvaladv
end--function lfivalidatelnkoadv
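-- Usage sketch for "lfivalidatelnkoadv" (illustrative only, results traced
-- by hand, the last parameter "boonoban" is "true" everywhere in order to
-- NOT depend on the content of "contabisbanned" which is not shown here):
-- lfivalidatelnkoadv ("eo",false,false,false,false,true) -- true
-- lfivalidatelnkoadv ("EO",false,false,false,false,true) -- false (uppercase)
-- lfivalidatelnkoadv ("",false,false,false,false,true) -- false (too short)
-- lfivalidatelnkoadv ("-",true,false,false,false,true) -- true
-- lfivalidatelnkoadv ("-",false,false,false,false,true) -- false
-- lfivalidatelnkoadv ("??",false,true,false,false,true) -- true
-- lfivalidatelnkoadv ("zh-min-nan",false,false,true,false,true) -- true
-- lfivalidatelnkoadv ("zh-min-nan",false,false,false,false,true) -- false (long not allowed)
-- lfivalidatelnkoadv ("abcd",false,false,true,false,true) -- false (4 adjacent letters)
-- lfivalidatelnkoadv ("a1b",false,false,false,true,true) -- true (digit in middle)
-- lfivalidatelnkoadv ("a1b",false,false,false,false,true) -- false (digit not allowed)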
------------------------------------------------------------------------
-- Local function LFIFILLINX
-- Replace placeholders "\@" "\\@" or "\~" "\\~" by given substitute string.
-- Input : * strbeforfill -- request string with placeholders to be filled
-- in, no placeholders or empty input is useless
-- but cannot cause major harm
-- * numaskikodo -- ASCII code of placeholder, 64 for "@" or
-- 126 for "~"
-- * varsupstitu -- substitute, either string (same content reused
-- if multiple placeholders), or ZERO-based table
-- (with one element per placeholder such as
-- {[0]="none","neniu"}), length 1...60
-- Output : * strafterfill
-- Depends on functions :
-- [G] lfgstringrange
local function lfifillinx (strbeforfill, numaskikodo, varsupstitu)
local varpfiller = 0 -- risky picking
local strufiller = '' -- final validated filler
local strafterfill = ''
local numlenbigtext = 0 -- len of strbeforfill
local numsfrcindex = 0 -- char index ZERO-based
local numinsrtinde = 0 -- index in table ZERO-based
local numtecken0d = 0
local numtecken1d = 0
numlenbigtext = string.len (strbeforfill)
while true do
if (numsfrcindex>=numlenbigtext) then
break -- empty input is useless but cannot cause major harm
end--if
numtecken0d = string.byte(strbeforfill,(numsfrcindex+1),(numsfrcindex+1))
numsfrcindex = numsfrcindex + 1 -- INC here
numtecken1d = 0 -- preASSume none
if (numsfrcindex<numlenbigtext) then -- pick but do NOT INC
numtecken1d = string.byte(strbeforfill,(numsfrcindex+1),(numsfrcindex+1))
end--if
if ((numtecken0d==92) and (numtecken1d==numaskikodo)) then -- "\@" "\~"
numsfrcindex = numsfrcindex + 1 -- INC more, now totally + 2
varpfiller = 0 -- preASSume nothing available
strufiller = '??' -- preASSume nothing available
if (type(varsupstitu)=='string') then
varpfiller = varsupstitu -- take it as-is (length check below)
end--if
if (type(varsupstitu)=='table') then
varpfiller = varsupstitu [numinsrtinde] -- risk of type "nil"
numinsrtinde = numinsrtinde + 1 -- INC tab index on every placeholder
end--if
if (lfgstringrange(varpfiller,1,60)) then -- !!!FIXME!!! nowiki and other sanitization
strufiller = varpfiller -- now the substitute is finally accepted
end--if
else
strufiller = string.char (numtecken0d) -- no placeholder -> copy octet
end--if
strafterfill = strafterfill .. strufiller -- add one of 4 possible cases
end--while
return strafterfill
end--function lfifillinx
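-- Usage sketch for "lfifillinx" (illustrative only, results traced by
-- hand, note that the backslash must be doubled in LUA source, thus
-- "\\@" below is the 2-octet placeholder "\" + "@"):
-- lfifillinx ("Hello \\@!",64,"world") -- "Hello world!"
-- lfifillinx ("\\@ and \\@",64,"x") -- "x and x" (string reused)
-- lfifillinx ("\\@ and \\@",64,{[0]="none","neniu"}) -- "none and neniu"
-- lfifillinx ("\\@ \\@",64,{[0]="x"}) -- "x ??" (table too short)
-- lfifillinx ("\\~",126,"") -- "??" (empty substitute rejected)
-- lfifillinx ("\\@",126,"x") -- "\\@" (other placeholder, copied)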
------------------------------------------------------------------------
-- Local function LFIKODEOSG
-- Transcode eo X-surrogates to cxapeloj in a single string (eo only).
-- Input : * streosurr -- ANSI string (empty is useless but cannot
-- cause major harm)
-- Output : * strutf8eo -- UTF8 string
-- Depends on functions :
-- [E] mathdiv mathmod
-- Depends on constants :
-- * table "contabtransluteo" inherently holy
-- To be called ONLY from "lfhfillsurrstrtab".
-- * the "x" in a surr pair is case insensitive,
-- for example both "kacxo" and "kacXo" give same result
-- * avoid "\", thus for example "ka\cxo" would get converted but the "\" kept
-- * double "x" (both case insensitive) prevents conversion and becomes
-- reduced to single "x", for example "kacxxo" becomes "kacxo"
local function lfikodeosg (streosurr)
local vareopeek = 0
local strutf8eo = ''
local numeoinplen = 0
local numinpinx = 0 -- ZERO-based source index
local numknar0k = 0 -- current char
local numknaf1x = 0 -- next char (ZERO is NOT valid)
local numknaf2x = 0 -- post next char (ZERO is NOT valid)
local boonext1x = false
local boonext2x = false
local boosudahdone = false
numeoinplen = string.len(streosurr)
while true do
if (numinpinx>=numeoinplen) then
break
end--if
numknar0k = string.byte(streosurr,(numinpinx+1),(numinpinx+1))
numknaf1x = 0 -- preASSume no char
numknaf2x = 0 -- preASSume no char
if ((numinpinx+1)<numeoinplen) then
numknaf1x = string.byte(streosurr,(numinpinx+2),(numinpinx+2))
end--if
if ((numinpinx+2)<numeoinplen) then
numknaf2x = string.byte(streosurr,(numinpinx+3),(numinpinx+3))
end--if
boonext1x = ((numknaf1x==88) or (numknaf1x==120)) -- case insensitive
boonext2x = ((numknaf2x==88) or (numknaf2x==120)) -- case insensitive
boosudahdone = false
if (boonext1x and boonext2x) then -- got "xx"
strutf8eo = strutf8eo .. string.char(numknar0k,numknaf1x) -- keep one "x" only
numinpinx = numinpinx + 3 -- eaten 3 written 2
boosudahdone = true
end--if
if (boonext1x and (not boonext2x)) then -- got yes-"x" and no-"x"
vareopeek = contabtransluteo[numknar0k] -- UINT16 or type "nil"
if (type(vareopeek)=='number') then
strutf8eo = strutf8eo .. string.char(mathdiv(vareopeek,256),mathmod(vareopeek,256)) -- add UTF8 char
numinpinx = numinpinx + 2 -- eaten 2 written 2
boosudahdone = true
end--if
end--if
if (not boosudahdone) then
strutf8eo = strutf8eo .. string.char(numknar0k) -- copy char
numinpinx = numinpinx + 1 -- eaten 1 written 1
end--if
end--while
return strutf8eo
end--function lfikodeosg
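-- Usage sketch for "lfikodeosg" (illustrative only, the first 2 results
-- depend on table "contabtransluteo" which is not shown here, they assume
-- that it holds an entry for "c" giving the UTF8 char $C4,$89, the
-- remaining results were traced by hand and do not depend on the table):
-- lfikodeosg ("kacxo") -- "ka\196\137o" (assumed table entry)
-- lfikodeosg ("kacXo") -- same result, the "x" is case insensitive
-- lfikodeosg ("kacxxo") -- "kacxo" (double "x" prevents conversion)
-- lfikodeosg ("kato") -- "kato" (no "x" thus plain copy)
-- lfikodeosg ("") -- ""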
------------------------------------------------------------------------
-- Local function LFIKODSVSG
-- Transcode sv backslash-surrogates in a single string (sv only).
-- Input : * strsvsurr -- ANSI string (empty is useless but cannot
-- cause major harm)
-- Output : * strutf8sv -- UTF8 string
-- Depends on functions :
-- [E] mathdiv mathmod
-- Depends on constants :
-- * table "contabtranslutsv" inherently holy
-- To be called ONLY from "lfhfillsurrstrtab".
-- * the latter letter in a surr triple is case insensitive,
-- for example both "\AEgare" and "\Aegare" give same result
local function lfikodsvsg (strsvsurr)
local varsvpeek = 0
local strutf8sv = ''
local strsvdouble = ''
local numsvinplen = 0
local numinpinx = 0 -- ZERO-based source index
local numsvonechar = 0 -- current char
numsvinplen = string.len(strsvsurr)
while true do
if (numinpinx>=numsvinplen) then
break
end--if
numsvonechar = string.byte(strsvsurr,(numinpinx+1),(numinpinx+1))
strsvdouble = '' -- preASSume no dblchar
if ((numsvonechar==92) and ((numinpinx+2)<numsvinplen)) then
strsvdouble = string.sub(strsvsurr,(numinpinx+2),(numinpinx+3))
end--if
varsvpeek = contabtranslutsv[strsvdouble] -- UINT16 or type "nil"
if (type(varsvpeek)=='number') then
strutf8sv = strutf8sv .. string.char(mathdiv(varsvpeek,256),mathmod(varsvpeek,256)) -- add UTF8 char
numinpinx = numinpinx + 3 -- eaten 3 written 2
else
strutf8sv = strutf8sv .. string.char(numsvonechar) -- copy char
numinpinx = numinpinx + 1 -- eaten 1 written 1
end--if
end--while
return strutf8sv
end--function lfikodsvsg
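-- Usage sketch for "lfikodsvsg" (illustrative only, the results depend on
-- table "contabtranslutsv" which is not shown here, it is assumed to hold
-- entries for "AE" and "Ae" but none for "qq" nor for the empty string,
-- note that the backslash must be doubled in LUA source):
-- lfikodsvsg ("\\AEgare") -- 2-octet UTF8 char from table followed by "gare"
-- lfikodsvsg ("\\Aegare") -- same result (per description above)
-- lfikodsvsg ("a\\qq") -- "a\\qq" (unknown pair, copied octet after octet)
-- lfikodsvsg ("hej") -- "hej" (no backslash thus plain copy)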
------------------------------------------------------------------------
---- HIGH LEVEL FUNCTIONS [H] ----
------------------------------------------------------------------------
-- Local function LFHVALI1STATUS99CODE
-- Depends on functions :
-- [E] mathisintrange
local function lfhvali1status99code (varvalue)
local boovalid = false -- preASSume guilt
while true do -- fake loop
if (varvalue==0) then
break -- success thus keep false since no valid error code ;-)
end--if
if (mathisintrange(varvalue,1,99)) then
boovalid = true -- got an error and valid error code returned
else
varvalue = 255 -- failed to return valid status code
end--if
break -- finally to join mark
end--while -- fake loop -- join mark
return varvalue, boovalid
end--function lfhvali1status99code
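-- Worked examples for "lfhvali1status99code" (an illustrative sketch, NOT
-- part of the original documentation, assuming that "mathisintrange"
-- returns "true" only for an integer inside the given closed range):
-- * lfhvali1status99code (0) returns 0, false -- no error at all
-- * lfhvali1status99code (17) returns 17, true -- valid error code
-- * lfhvali1status99code (100) returns 255, false -- out of range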
------------------------------------------------------------------------
-- Local function LFHCONSTRUCTERAR
-- Input : * numerar6code -- 1 ... 99 or 2 ... 99 (invalid data type ignored)
-- * boopeek6it -- do peek description #E02...#E99 from table
-- Depends on functions :
-- [N] lfnumto2digit
-- [E] mathisintrange mathdiv mathmod
-- Depends on constants :
-- * maybe table contaberaroj TWO-based (holes permitted)
-- To be called ONLY from lfhbrewerror, lfhbrewerrsm,
-- lfhbrewerrsvr, lfhbrewerrinsi.
local function lfhconstructerar (numerar6code, boopeek6it)
local vardes6krip = 0
local numbottom6limit = 1
local stryt6sux = '#E'
if (type(numerar6code)~='number') then
numerar6code = 0 -- invalid
end--if
if (boopeek6it) then
numbottom6limit = 2 -- #E01 is a valid code for submodule only
end--if
if (mathisintrange(numerar6code,numbottom6limit,99)) then
stryt6sux = stryt6sux .. lfnumto2digit(numerar6code)
if (boopeek6it) then
vardes6krip = contaberaroj[numerar6code] -- risk of type "nil"
if (type(vardes6krip)=='string') then
stryt6sux = stryt6sux .. ' ' .. vardes6krip
else
stryt6sux = stryt6sux .. ' ??' -- no text found
end--if
end--if (boopeek6it) then
else
stryt6sux = stryt6sux .. '??' -- no valid error code
end--if
return stryt6sux
end--function lfhconstructerar
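-- Worked examples for "lfhconstructerar" (an illustrative sketch, NOT part
-- of the original documentation, assuming that "lfnumto2digit" pads to
-- 2 decimal digits, ie 7 gives "07"):
-- * lfhconstructerar (7,false) gives '#E07'
-- * lfhconstructerar (7,true) gives '#E07 ' followed by contaberaroj[7],
--   or '#E07 ??' if that item is not a string
-- * lfhconstructerar (1,true) gives '#E??' (bottom limit is 2 here)
-- * lfhconstructerar ('7',false) gives '#E??' (invalid type becomes 0)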
------------------------------------------------------------------------
-- Local function LFHBREWERRSM
-- Input : * numerar8code -- 2 ... 99
-- * strsubnama -- can be omitted if no submodule
-- * numsubkodo -- 1 ... 99 invalid type ignored
-- Depends on functions :
-- [H] lfhconstructerar
-- [N] lfnumto2digit
-- [E] mathisintrange mathdiv mathmod
-- Depends on constants :
-- * 3 strings constrelabg constrelaen constrlaxhu
-- * table contaberaroj TWO-based (holes permitted)
local function lfhbrewerrsm (numerar8code, strsubnama, numsubkodo)
local stryt8sux = ''
local strfromsubo = ''
stryt8sux = constrlaxhu .. constrelabg .. lfhconstructerar (numerar8code,true) .. constrelaen .. constrlaxhu
if (type(strsubnama)=='string') then
strfromsubo = 'Submodule "' .. strsubnama .. '" reports ' .. lfhconstructerar (numsubkodo,false)
stryt8sux = stryt8sux .. '<br>' .. constrlaxhu .. constrelabg .. strfromsubo .. constrelaen .. constrlaxhu
end--if
return stryt8sux
end--function lfhbrewerrsm
------------------------------------------------------------------------
-- Local function LFIULTIMINSERT
-- Insert selected substitute strings into request string at given positions
-- with optional discarding if the substitute string is empty. Discarding
-- is protected from access out of range by clamping the distances.
-- Input : * strrekvest -- request string containing placeholders
-- (syntax see below)
-- * tabsubstut -- list with substitute strings using two-letter
-- codes as keys, non-string in the table is safe and
-- has same effect as empty string, still type "nil"
-- or empty string "" are preferred
-- Output : * strhazil
-- Syntax of the placeholder:
-- * "@" followed by 2 uppercase letters and 2 hex digits, otherwise
-- the hit is not processed, but copied as-is instead
-- * 2 letters select the substitute from table supplied by the caller
-- * 2 hex digits control discarding left and right (0...15 char:s)
-- Empty item in "tabsubstut" is legal and results in discarding if any of
-- the control numbers is non-ZERO. Left discarding is practically performed
-- on "strhazil" whereas right discarding on "strrekvest" and "numdatainx".
-- If uppercasing or other adjustment is needed, then the caller must
-- take care of it by providing several separate substitute strings with
-- separate names in the table.
-- Depends on functions :
-- [G] lfgtestnum lfgtestuc
-- [N] lfnonehextoint
local function lfiultiminsert (strrekvest,tabsubstut)
local varduahuruf = 0
local strhazil = ''
local numdatalen = 0
local numdatainx = 0 -- src index
local numdataoct = 0 -- maybe @
local numdataodt = 0 -- UC
local numdataoet = 0 -- UC
local numammlef = 0 -- hex and discard left
local numammrig = 0 -- hex and discard right
local boogotplejs = false
numdatalen = string.len(strrekvest)
numdatainx = 1 -- ONE-based
while true do -- genuine loop, "numdatainx" is the counter
if (numdatainx>numdatalen) then -- beware of risk of overflow below
break -- done (ZERO iterations possible)
end--if
boogotplejs = false
numdataoct = string.byte(strrekvest,numdatainx,numdatainx)
numdatainx = numdatainx + 1
while true do -- fake loop
if ((numdataoct~=64) or ((numdatainx+3)>numdatalen)) then
break -- no hit here
end--if
numdataodt = string.byte(strrekvest, numdatainx , numdatainx )
numdataoet = string.byte(strrekvest,(numdatainx+1),(numdatainx+1))
if ((not lfgtestuc(numdataodt)) or (not lfgtestuc(numdataoet))) then
break -- no hit here
end--if
numammlef = string.byte(strrekvest,(numdatainx+2),(numdatainx+2))
numammrig = string.byte(strrekvest,(numdatainx+3),(numdatainx+3))
numammlef = lfnonehextoint (numammlef)
numammrig = lfnonehextoint (numammrig)
boogotplejs = ((numammlef~=255) and (numammrig~=255))
break
end--while -- fake loop -- join mark
if (boogotplejs) then
numdatainx = numdatainx + 4 -- consumed 5 char:s, cannot overflow here
varduahuruf = string.char (numdataodt,numdataoet)
varduahuruf = tabsubstut[varduahuruf] -- risk of type "nil"
if (type(varduahuruf)~='string') then
varduahuruf = '' -- type "nil" or invalid type gives empty string
end--if
if (varduahuruf=='') then
numdataoct = string.len(strhazil) - numammlef -- this can underflow
if (numdataoct<=0) then
strhazil = ''
else
strhazil = string.sub(strhazil,1,numdataoct) -- discard left
end--if
numdatainx = numdatainx + numammrig -- discard right this can overflow
else
strhazil = strhazil .. varduahuruf -- insert / expand
end--if
else
strhazil = strhazil .. string.char(numdataoct) -- copy char as-is
end--if (boogotplejs) else
end--while
return strhazil
end--function lfiultiminsert
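-- Worked examples for "lfiultiminsert" (an illustrative sketch, NOT part
-- of the original documentation, assuming that "lfgtestuc" accepts ASCII
-- uppercase letters only and that "lfnonehextoint" maps the digits
-- "0"..."9" to 0...9 and returns 255 for a non-hex char):
-- * lfiultiminsert ('Kapvorto (@LK00)', {LK='eo'}) gives 'Kapvorto (eo)'
--   (plain insertion, control numbers irrelevant for non-empty substitute)
-- * lfiultiminsert ('Kapvorto (@LK21)', {}) gives 'Kapvorto'
--   (empty substitute, 2 char:s " (" discarded left, 1 char ")" right)
-- * lfiultiminsert ('Kapvorto (@LK00)', {}) gives 'Kapvorto ()'
--   (empty substitute, nothing discarded)
-- * lfiultiminsert ('x@LK0', {LK='eo'}) gives 'x@LK0'
--   (placeholder too short, thus copied as-is)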
------------------------------------------------------------------------
-- Local function LFIFINDITEMS
-- Search in string primarily intended for LFIULTIMINSERT.
-- Input : * long string where to search (for example "Kapvorto (@LK00)")
-- * even number of char:s what to search (for example "WCWU")
-- Output : * boolean ("true" if any found, "false" for our example)
local function lfifinditems (strwhere, strandevenwhat)
local strcxztvaa = ''
local numcxzlen = 0
local numcxzind = 1 -- ONE-based step TWO
local boofoundthecrap = false
numcxzlen = string.len(strandevenwhat)
while true do
if (numcxzind>=numcxzlen) then
break -- not found
end--if
strcxztvaa = '@' .. string.sub(strandevenwhat,numcxzind,(numcxzind+1))
boofoundthecrap = (string.find(strwhere,strcxztvaa,1,true)~=nil)
if (boofoundthecrap) then
break -- found any of them, done
end--if
numcxzind = numcxzind + 2
end--while
return boofoundthecrap
end--function lfifinditems
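-- Worked examples for "lfifinditems" (an illustrative sketch, NOT part of
-- the original documentation, the codes "WC" "WU" "LK" are taken from
-- the examples above and need not be valid in real use):
-- * lfifinditems ('Kapvorto (@LK00)', 'WCWU') gives "false"
--   (neither "@WC" nor "@WU" occurs)
-- * lfifinditems ('Kapvorto (@LK00)', 'WCLK') gives "true"
--   ("@WC" is missing but "@LK" is found in the second step)
-- * lfifinditems ('Kapvorto (@LK00)', '') gives "false"
--   (ZERO iterations)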
------------------------------------------------------------------------
-- Local function LFHFILLSURRSTRTAB
-- Process (fill in, transcode surr) either a single string, or all string
-- items in a table (even nested) using any type of keys/indexes (such as
-- a holy number sequence and non-numeric ones). Items with a non-string
-- value are kept as-is. For filling in own name, and converting eo and
-- sv surrogates (via 3 separate sub:s).
-- Input : * varinkommen -- type "string" or "table"
-- * varfyllo -- string, or type "nil" if no filling-in desired
-- * strlingkod -- "eo" or "sv" to convert surrogates, anything
-- else (preferably type "nil") to skip this
-- Depends on functions :
-- [I] lfifillinx (only if filling-in desired)
-- [I] lfikodeosg (only if converting of eo X-surrogates desired)
-- [I] lfikodsvsg (only if converting of sv backslash-surrogates desired)
-- [E] mathdiv mathmod (via "lfikodeosg" and "lfikodsvsg")
-- Depends on constants :
-- * table "contabtransluteo" inherently holy (via "lfikodeosg")
-- * table "contabtranslutsv" inherently holy (via "lfikodsvsg")
local function lfhfillsurrstrtab (varinkommen, varfyllo, strlingkod)
local varkey = 0 -- variable without type
local varele = 0 -- variable without type
local varutmatning = 0
local boodone = false
if (type(varinkommen)=='string') then
if (type(varfyllo)=='string') then
varinkommen = lfifillinx (varinkommen,64,varfyllo) -- fill-in
end--if
if (strlingkod=='eo') then
varinkommen = lfikodeosg (varinkommen) -- surr
end--if
if (strlingkod=='sv') then
varinkommen = lfikodsvsg (varinkommen) -- surr
end--if
varutmatning = varinkommen -- copy, risk for no change
boodone = true
end--if
if (type(varinkommen)=='table') then
varutmatning = {} -- brew new table
varkey = next (varinkommen) -- try to pick 0:th (in no order) key/index
while true do
if (varkey==nil) then
break -- empty table or end reached
end--if
varele = varinkommen[varkey] -- pick element of unknown type
if ((type(varele)=='string') or (type(varele)=='table')) then
varele = lfhfillsurrstrtab (varele, varfyllo, strlingkod) -- RECURSION
end--if
varutmatning[varkey] = varele -- write at same place in dest table
varkey = next (varinkommen, varkey) -- try to pick next key/index
end--while
boodone = true
end--if
if (not boodone) then
varutmatning = varinkommen -- copy as-is whatever it is
end--if
return varutmatning
end--function lfhfillsurrstrtab
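-- Worked examples for "lfhfillsurrstrtab" (an illustrative sketch, NOT
-- part of the original documentation, with neither filling-in nor
-- surr-transcoding requested, thus independent of the 3 sub:s):
-- * lfhfillsurrstrtab ('abc',nil,nil) gives 'abc' (unchanged string)
-- * lfhfillsurrstrtab (42,nil,nil) gives 42 (non-string and non-table
--   are returned as-is)
-- * lfhfillsurrstrtab ({'a',{x='b'},true},nil,nil) gives a NEW table with
--   equal content, the nested table is a fresh copy as well, whereas
--   the boolean item is kept as-is
-- With "strlingkod" equal 'eo' or 'sv' every string item at any nesting
-- depth additionally passes through "lfikodeosg" or "lfikodsvsg".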
------------------------------------------------------------------------
---- VARIABLES [R] ----
------------------------------------------------------------------------
function exporttable.ek (arxframent)
-- general unknown type
local varkantctl = 0 -- picked from "contabkatoj"
local vartmp = 0 -- variable without type multipurpose !!!FIXME!!!
-- special type "args" AKA "arx"
local arxsomons = 0 -- metaized "args" from our own or caller's "frame"
-- general tab ("qtabkatoj" is elsewhere)
local tablg78ysubt = {}
local tabonelang = {} -- subtable for one language
local tabblock = {} -- from "%"-syntax assi
local tablinx = {} -- from "#"-syntax assi filled by "strkrositem"
local tabmnfragments = {} -- for manual split
local tabextfrog = {} -- from "ext="
local tabstuff = {} -- double-letter indexes
-- peeked stuff
local strpiklangcode = '' -- "en" privileged site language
local strpikkatns = '' -- "Category"
local strpikapxns = '' -- "Appendix"
local strpikparent = '' -- "Template:nope" FULLPAGENAME
local strpikpareuf = '' -- "nope" PAGENAME unfull
-- general str ("qstrtrace" is elsewhere)
local strtomp = "" -- temp (fix "contaberaroj", fill insane table, ...)
local strviserr = "" -- visible error
local strvisgud = "" -- visible good output
local strinvank = "" -- invisible "anchor" part
local strinvkat = "" -- invisible category part
local strret = "" -- final result string
-- str for prevalidation of split control string
local strkrositem = "" -- assi: prevalidated item from the cross "#"-syntax
local strreconl = "" -- manu: reconstructed complete lemma for "sum check"
local strfragtbl = "" -- manu: prevalidated fragment to be stored in table
local strinnertst = "" -- manu: inner content of brackets to be checked
local str2field = "" -- manu
-- str specific to language processing
local strfrafra = '' -- split control string from "fra=" before conversion
local strextext = '' -- extra param
local strdstdst = '' -- distinction hint from "dst="
local strscrscr = '' -- script code from "scr="
local strpagenam = '' -- from "{{PAGENAME}}" or "pagenameoverridetestonly"
local strlemma = '' -- bold lemma (maybe split) from pagename
local strkodbah = '' -- langcode (2 or 3 lowercase) from arxsomons[1]
local strkodkek6 = '' -- word class code (2 uppercase) from arxsomons[2]
local strkodkek7 = '' -- further word class
local strnambah = '' -- language name (without prefix "Bahasa")
local strnambalo = '' -- long language name (with prefix "la" or "bahasa")
local strnamasin = '' -- language name in the language (propralingve)
local strnamke6 = '' -- word class full
local strnamco6 = '' -- word class stripped
local strnamke7 = '' -- word class full
local strnamco7 = '' -- word class stripped
-- general num
local numerr = 0 -- 1 in 2 pa 3 sub 4 neva 5 neko 6 wc 7 fra 8 $S$H 9 chk
local num2statcode = 0 -- status code from submodule
local numpindex = 0 -- number of anon params
local numsplit = 0 -- split strategy (0 auto 1 assisted auto 2 manu 7 none)
local numlong = 0 -- for parameter processing
local numtamp = 0
local numoct = 0
local numodt = 0
local numoet = 0
local numkindex = 0
-- num for prevalidation of split control string
local numlaong = 0
local numogt = 0 -- assi and manu
local numoht = 0
local numtbindx = 0 -- current index
local numprevdx = 0 -- previous index
local numhelpcn = 0 -- help counter (assi) and fragment counter (manu)
local numnestin = 0 -- number of source opened '[' (manu)
local numofslhs = 0 -- number of source slashes (manu)
-- quasi-constant num from "constrmainctl"
local numshowlemma = 0 -- four-state 0...3
-- general boo (cool "qboodetrc" declared separately outside main)
local boonocat = false -- from "nocat=true" !!!FIXME!!!
local bookonata = false -- true if we get valid lang name
local boohavasi = false -- true if we have valid name in "strnamasin" too
local boohavdua = false -- true if we have 2 word classes
local boohavdst = false
local boohavext = false -- true if we have "ext="
local boomo3kat = false -- true if "numshowlemma"=3
local boohavnyr = false -- true if we got "NR" (ultimately exclusive)
local boohavkal = false -- true if we got "KA" (almost exclusive)
local boohavkur = false -- true if we got "KU"
local bootimp = false
-- boo for prevalidation of split control string
local boocaught = false -- temp
local boo210kl = false -- got "L:" thus slash is prohibited
local booslshxx = false -- at least one slash "/" in complete manu "fra="
local boohavepl = false -- fragment is preceded by plus "+"
local boohvtext = false -- have ordinary text char:s
local boohvpref = false -- have prefix "M:" or similar inside []
-- quasi-constant boo from "constrmainctl"
local booshowimage = false
------------------------------------------------------------------------
---- MAIN [Z] ----
------------------------------------------------------------------------
---- GUARD AGAINST INTERNAL ERROR AGAIN ----
-- later reporting of #E01 may NOT depend on uncommentable strings
if (qbooguard) then
numerr = 1 -- #E01 internal
end--if
lfdtracemsg ('This is "mlawc", requested "detrc" report')
lfdshowvar (constrmainctl,'constrmainctl')
lfdshowvar (conbookodlng,'conbookodlng')
lfdshowvar (conboomiddig,'conboomiddig')
lfdtracemsg ('Var "numerr" is "' .. tostring(numerr) .. '" so far')
---- PEEK STUFF THAT IS NOT OVERRIDDEN GENEROUS ---- !!!FIXME!!!
if (numerr==0) then
strpiklangcode = constrpriv -- privileged site language
strpikparent = constrkoll -- called
end--if
---- PROCESS MESSAGES, FILL IN ALWAYS, SURR ONLY IF NEEDED ----
-- placeholder "\@" "\\@" is replaced by augmented name of the caller
-- from "strpikparent" in any case, for example 'sxablono "test"' or
-- 'templat "test"'
-- only for some languages the surr-transcoding is subsequently performed
if (numerr==0) then
contaberaroj = lfhfillsurrstrtab (contaberaroj, strpikparent, strpiklangcode)
contabwc = lfhfillsurrstrtab (contabwc, nil, strpiklangcode) -- no filling
end--if
---- FILL IN 2 SEMI-HARDCODED PARAMETERS TO 3 VAR:S ----
numoct = string.byte (constrmainctl,1,1) -- "0" or "1"
booshowimage = (numoct==49)
numoct = string.byte (constrmainctl,2,2) -- "0" or "1" or "2" or "3"
numshowlemma = lfdec1digcl (numoct,3)
boomo3kat = (numshowlemma==3) -- needed for 2 sub:s and final categoriz
---- PICK ONE SUBTABLE ----
-- on error we assign "numerr" and "num2statcode" both used far below
while true do -- fake loop
if (numerr~=0) then -- #E01 possible
break -- to join mark
end--if
num2statcode, bootimp = lfhvali1status99code (qldingvoj[2]) -- from "loaddata-tbllingvoj"
if (num2statcode~=0) then
if (bootimp) then
numerr = 3 -- #E03 numbered (submodule returned a valid error code)
else
numerr = 2 -- #E02 malicious (no valid status code from submodule)
end--if
break -- to join mark
end--if
tablg78ysubt = qldingvoj['T78']
if (type(tablg78ysubt)~='table') then -- important check
numerr = 2 -- #E02 malicious (subtable "T78" is not a table)
break -- to join mark
end--if
break -- finally to join mark
end--while -- fake loop -- join mark
lfdshowvar (numerr,'numerr','picked subtable T78')
lfdshowvar (num2statcode,'num2statcode')
---- GET THE ARX (ONE OF TWO) ----
-- must be seized independently of "numerr" even if we already suck
arxsomons = arxframent.args -- "args" from our own "frame"
if (type(arxsomons)~="table") then
arxsomons = {} -- guard against indexing error
numerr = 1 -- #E01 internal
end--if
if (arxsomons['caller']=="true") then
arxsomons = arxframent:getParent().args -- "args" from caller's "frame"
end--if
if (type(arxsomons)~="table") then
arxsomons = {} -- guard against indexing error again
numerr = 1 -- #E01 internal
end--if
---- PROCESS 3 HIDDEN NAMED PARAMS INTO 1 STRING AND 2 BOOLEAN:S ----
-- this may override "mw.title.getCurrentTitle().text" and
-- stipulate content in "strpagenam", empty is NOT valid
-- bad "pagenameoverridetestonly=" can give #E01
-- no error is possible from other hidden parameters
-- "detrc=" and "nocat=" must be seized independently of "numerr" !!!FIXME!!! remove "nocat" in favor of "pate"
-- even if we already suck, but type "table" must be ensured !!!
strpagenam = ''
if (numerr==0) then
vartmp = arxsomons['pagenameoverridetestonly']
if (type(vartmp)=='string') then
numtamp = string.len(vartmp)
if ((numtamp>=1) and (numtamp<=120)) then
strpagenam = vartmp -- empty is not legal
else
numerr = 1 -- #E01 internal
end--if
end--if
end--if
if (arxsomons['nocat']=='true') then
boonocat = true
end--if
if (arxsomons['detrc']=='true') then
lfdtracemsg ('Param "detrc=true" seized')
else
qboodetrc = false -- was preassigned to "true"
qstrtrace = '' -- shut up now
end--if
lfdshowvar (numerr,'numerr','done with hidden parameters')
---- SEIZE THE PAGENAME FROM MW ----
-- later reporting of #E01 may NOT depend on uncommentable strings
-- must be 1...120 octet:s keep consistent with "pagenameoverridetestonly="
if ((numerr==0) and (strpagenam=='')) then
vartmp = mw.title.getCurrentTitle().text -- without namespace prefix
if (type(vartmp)=='string') then
numtamp = string.len(vartmp)
if ((numtamp>=1) and (numtamp<=120)) then
strpagenam = vartmp -- pagename here (empty is NOT legal)
else
numerr = 1 -- #E01 internal
end--if
end--if
end--if
---- STRICTLY CHECK THE PAGENAME ----
-- for example "o'clock" is legal "o'clock's" is legal
-- but "o''clock" is a crime
if (numerr==0) then
if (strpagenam=='') then
numerr = 1 -- #E01 internal
else
if (lfibanmulti("'1[0]0{0}0",strpagenam)) then
numerr = 1 -- #E01 internal
end--if
end--if
end--if
---- WHINE IF YOU MUST #E01 ----
-- reporting of this error #E01 may NOT depend on
-- uncommentable strings as "strpikparent" and "contaberaroj"
-- do NOT use sub "lfhbrewerrsm", report our name (NOT of template) and in EN
if (numerr==1) then
strtomp = '#E01 Internal error in module "mlawc".'
strviserr = constrlaxhu .. constrelabg .. strtomp .. constrelaen .. constrlaxhu
end--if
---- PRELIMINARILY ANALYZE ANONYMOUS PARAMETERS ----
-- this will catch holes, empty parameters, too long parameters,
-- and wrong number of parameters
-- below on exit var "numpindex" will contain number of
-- prevalidated anonymous params
-- this depends on 3 constants:
-- * contabparam[0] minimal number
-- * contabparam[1] maximal number
-- * contabparam[2] maximal length (default 160)
if (numerr==0) then
numpindex = 0 -- ZERO-based
numtamp = contabparam[1] -- maximal number of params
while true do
vartmp = arxsomons [numpindex+1] -- can be "nil"
if ((type(vartmp)~="string") or (numpindex>numtamp)) then
break -- good or bad
end--if
numlong = string.len (vartmp)
if ((numlong==0) or (numlong>contabparam[2])) then
numerr = 9 -- #E09 param/RTFD
break -- only bad here
end--if
numpindex = numpindex + 1 -- on exit has number of valid parameters
end--while
if ((numpindex<contabparam[0]) or (numpindex>numtamp)) then
numerr = 9 -- #E09 param/RTFD
end--if
end--if
---- PROCESS 2 OBLIGATORY ANONYMOUS PARAMS INTO 3 STRINGS ----
-- now var "numpindex" already contains the number of prevalidated params,
-- which is always 2, so it is useless from here on
-- here we validate and assign "strkodbah", "strkodkek6",
-- "boohavdua", "strkodkek7" (can be empty), "boohavkal", "boohavnyr"
-- note that "lfivalidatelnkoadv" returns "true" if the string is valid and
-- natively supports "??" whereas "lfimultestuc" returns "true" on success
-- too but does NOT natively support "??"
-- this depends directly on "conbookodlng" "conboomiddig"
-- this depends indirectly on "contabisbanned" via "lfivalidatelnkoadv"
if (numerr==0) then
while true do -- fake loop
strkodbah = arxsomons[1] -- langcode (obligatory)
if (not lfivalidatelnkoadv(strkodbah,false,true,conbookodlng,conboomiddig,false)) then
numerr = 11 -- #E11 -- "-" banned and "??" tolerable in "lfivalidatelnkoadv"
break -- to join mark
end--if
boohavdua = false
strkodkek6 = arxsomons[2] -- 2 UC or 4 UC (obligatory)
numlong = string.len (strkodkek6)
strkodkek7 = ""
if (numlong==4) then -- maybe 2 word classes
strkodkek7 = string.sub (strkodkek6,3,4)
strkodkek6 = string.sub (strkodkek6,1,2)
if ((strkodkek6=='??') or (strkodkek7=='??')) then
numerr = 13 -- #E13 -- if both are specified then no "??" tolerable
break -- to join mark
end--if
boohavdua = true
end--if
if (strkodkek6~='??') then -- "??" is unknown but not faulty
if (lfimultestuc(strkodkek6,2)==false) then
numerr = 13 -- #E13
break -- to join mark
end--if
end--if
if (boohavdua) then -- here "??" for unknown is NOT permitted
if (lfimultestuc(strkodkek7,2)==false) then
numerr = 13 -- #E13
break -- to join mark
end--if
end--if
if ((strkodkek6=='NR') or (strkodkek7=='NR')) then
boohavnyr = true -- needed far below
end--if
if ((strkodkek6=='KA') or (strkodkek7=='KA')) then
boohavkal = true -- needed far below
end--if
if ((strkodkek6=='KU') or (strkodkek7=='KU')) then
boohavkur = true -- only for exclusivity test
end--if
if (boohavdua and boohavnyr) then
numerr = 13 -- #E13 -- "NR" is ultimately exclusive
break -- to join mark
end--if
if (boohavdua and boohavkal and (boohavkur==false)) then
numerr = 13 -- #E13 -- "KA" is almost exclusive
break -- to join mark
end--if
if ((strkodbah=='??') and (strkodkek6=='??')) then
numerr = 13 -- #E13 -- both unknown is illegal
break -- to join mark
end--if
break -- finally to join mark
end--while -- fake loop -- join mark
end--if
---- PROCESS 1 OPTIONAL NAMED PARAM INTO 1 STRING ----
-- here we validate and assign "boohavdst" and "strdstdst"
-- (2...40 or empty) from "dst=" regardless "numshowlemma"
boohavdst = false
strdstdst = ''
if (numerr==0) then
while true do -- fake loop -- abort on both success or failure -- "dst"
vartmp = arxsomons['dst'] -- optional, NOT yet prevalidated
if (type(vartmp)~="string") then
break -- parameter not specified
end--if
numtamp = string.len(vartmp)
if ((numtamp<2) or (numtamp>40)) then
numerr = 19 -- #E19 -- "dst=" is bad
break
end--if
boohavdst = true
strdstdst = vartmp
if (lfibanmulti("'0[0]0{0}0(0)0",strdstdst)) then
numerr = 19 -- #E19 -- "dst=" is bad -- all brackets prohibited
lfdtracemsg ('Illegal bracket in parameter "dst=" found')
end--if
break -- finally to join mark
end--while -- fake loop -- join mark
end--if
lfdshowvar (strdstdst,'strdstdst','"dst=" maybe seized')
lfdshowvar (numerr,'numerr')
---- PROCESS 3 OPTIONAL NAMED PARAMS INTO 3 STRINGS ----
-- here we (only if "numshowlemma" >=2 or is 3) prevalidate and store
-- 3 parameters "fra=" "ext=" "scr="
-- from "fra=" to string "strfrafra" (1...120 or empty) and to
-- "numsplit" (0...5 or 7) #S5 #S7
-- min length is: 1 for "-" no split | 2 for assi split | 4 for manual split
-- "numsplit" must be 7 if "numshowlemma" is 0 or 1 !!!
-- tables "tabblock" and "tablinx" must be empty for "numsplit" other than 1
-- "strfrafra" is needed after end of this block only for "numsplit" 1 or 2
-- here we validate and assign "strextext" 2 char:s (8 possible
-- values) or 5...120 char:s and assign "boohavext"
-- here we validate and assign "strscrscr" 1 uppercase
strfrafra = ''
numsplit = 7 -- preliminary default strategy is no split #S7
if ((numerr==0) and (numshowlemma>=2)) then
while true do -- fake loop -- abort on both success or failure -- "fra"
numsplit = 0 -- default strategy is auto #S0
vartmp = arxsomons['fra'] -- optional, NOT yet prevalidated
if (type(vartmp)~="string") then
break -- parameter not specified, stick with default strategy 0 or 7
end--if
numtamp = string.len(vartmp)
if ((numtamp<1) or (numtamp>120)) then
numerr = 23 -- #E23 "fra=" is bad (illegal length)
break
end--if
strfrafra = vartmp
if (lfibanmulti("/1(1)1+1'1[1]1{0}0|0",strfrafra)) then
numerr = 23 -- #E23 "fra=" is bad (illegal char:s detected)
break
end--if
vartmp = string.find (strfrafra, "[]", 1, true) -- plain text search
if (vartmp) then
numerr = 23 -- #E23 "fra=" is bad
break
end--if
vartmp = string.find (strfrafra, "[http://", 1, true) -- plain text search
if (vartmp) then
numerr = 23 -- #E23 "fra=" is bad
break
end--if
vartmp = string.find (strfrafra, "[https://", 1, true) -- plain text search
if (vartmp) then
numerr = 23 -- #E23 "fra=" is bad
break
end--if
if (string.len(strfrafra)==2) then
numoct = string.byte (strfrafra,1,1) -- maybe "$" ("&" belongs to "ext=")
numodt = string.byte (strfrafra,2,2) -- only 3 letters tolerable S B H
if ((numoct==36) and ((numodt==83) or (numodt==66) or (numodt==72))) then
numsplit = 3 -- 83 "$S" : simple root split #S3 -> frag type N+U
if (numodt==66) then
numsplit = 4 -- 66 "$B" : simple bare root #S4 -> frag type M or N
end--if
if (numodt==72) then
numsplit = 5 -- 72 "$H" : large letter #S5 -> frag type M
end--if
if (numsplit==3) then
numtamp = string.len (strpagenam) -- at least 1 but 1 is too low
numoet = string.byte (strpagenam,numtamp,numtamp)
if ((numtamp==1) or (lfgtestlc(numoet)==false)) then
numerr = 14 -- #E14 illegal pagename for "$S" #S3
lfdtracemsg ('Illegal pagename for "$S" (simple root split) in parameter "fra="')
end--if
end--if (numsplit==3) then
if (numsplit==5) then
if (lfibanmulti("!0,0.0;0?0 0-0'0",strpagenam)) then -- check the pagename, "strfrafra" is just "$H" here
numerr = 14 -- #E14 illegal pagename for "$H" #S5
lfdtracemsg ('Illegal pagename for "$H" (large letter split) in parameter "fra="')
end--if
end--if (numsplit==5) then
break -- done success 345 and "strfrafra" not needed anymore or #E14
end--if ((numoct==36) and ...
end--if (string.len(strfrafra)==2) then
if (strfrafra=="-") then
numsplit = 7 -- no split #S7
break -- done success 7 and "strfrafra" not needed anymore
end--if
numoct = string.byte (strfrafra,1,1)
if ((numoct==35) or (numoct==37)) then -- "#" or "%"
numsplit = 1 -- assisted #S1
else
numsplit = 2 -- manual #S2
end--if
numtamp = string.len(strfrafra)
if ((numtamp<2) or ((numsplit==2) and (numtamp<4))) then
numerr = 23 -- #E23 "fra=" is bad (too short)
end--if
break -- finally to join mark
end--while -- fake loop -- join mark
end--if
lfdshowvar (strfrafra,'strfrafra','"fra=" maybe seized')
lfdshowvar (numsplit,'numsplit')
lfdshowvar (numerr,'numerr')
strextext = ''
boohavext = false
if ((numerr==0) and (numshowlemma==3)) then
while true do -- fake loop -- abort on both success or failure -- "ext"
vartmp = arxsomons['ext'] -- optional, NOT yet prevalidated
if (type(vartmp)~="string") then
break -- parameter not specified
end--if
numtamp = string.len(vartmp)
if ((numtamp<2) or (numtamp>120)) then
numerr = 20 -- #E20 -- "ext=" is bad
break
end--if
strextext = vartmp -- pick it (further validation pending)
boohavext = true
if (lfibanmulti("/0(0)0+0'1[1]1{0}0|0",strextext)) then
numerr = 20 -- #E20 -- "ext=" is bad
break
end--if
if (string.len(strextext)==2) then
numoct = string.byte (strextext,1,1) -- maybe "&"
numodt = string.byte (strextext,2,2) -- only 8 letters tolerable
if ((numoct==38) and lfchk789ucase(numodt,false,true)) then
break -- success with "&"-syntax C I M N P U W X
end--if
end--if
if (numtamp<5) then
numerr = 20 -- #E20 -- "ext=" is bad
lfdtracemsg ('Parameter "ext=" has 2...4 char:s but not valid "&"-syntax')
end--if
break -- finally to join mark
end--while -- fake loop -- join mark
end--if
lfdshowvar (strextext,'strextext','"ext=" maybe seized')
lfdshowvar (boohavext,'boohavext','from "ext=" too')
lfdshowvar (numerr,'numerr')
strscrscr = ''
if ((numerr==0) and (numshowlemma==3)) then
while true do -- fake loop -- abort on both success or failure -- "scr"
vartmp = arxsomons['scr'] -- optional, NOT yet prevalidated
if (type(vartmp)~="string") then
break -- parameter not specified
end--if
numtamp = string.len(vartmp)
if (numtamp~=1) then
numerr = 21 -- #E21 -- "scr=" is bad
break
end--if
strscrscr = vartmp -- pick it (further validation pending)
numtamp = string.byte(strscrscr,1,1)
if (lfgtestuc(numtamp)==false) then
numerr = 21 -- #E21 -- "scr=" is bad
break
end--if
break -- finally to join mark
end--while -- fake loop -- join mark
end--if
lfdshowvar (strscrscr,'strscrscr','"scr=" maybe seized')
lfdshowvar (numerr,'numerr')
---- STRATE 1 -- PROCESS VALIDATE SPLIT CONTROL STRING TO 2 TABLES ----
-- process from "strfrafra" to "tabblock" (from "%") and
-- to "tablinx" (from "#") both later processed in "lfsplitaa" "qsplitter"
-- "numsplit" equal 1 means only that "strfrafra" is
-- 2...120 octet:s and begins with "#" or "%" and is free from some
-- evil stuff such as "++" "''" "[[" "]]" "[]" "[http" "[https" but not more
-- example of valid syntax "%3A #2N #5A #7N #8:test"
-- note that "%" may not be alone ie empty nor followed by SPACE ie "% "
-- any SPACE must be followed by "#" by syntax rules
-- this can brew #E23
if ((numerr==0) and (numsplit==1)) then
while true do -- outer fake loop
numlaong = string.len (strfrafra)
numtamp = 1 -- ONE-based index
numprevdx = - 1 -- must be ascending, index ZERO valid
numogt = string.byte (strfrafra,1,1) -- got "%" or NOT ??
if (numogt==37) then
if (numlaong==1) then
numerr = 23 -- #E23 "fra=" is bad ("%" must not be empty)
break -- outer fake loop
end--if
numodt = string.byte (strfrafra,2,2) -- "% " is illegal
if (numodt==32) then
numerr = 23 -- #E23 "fra=" is bad ("%" must not be empty)
break -- outer fake loop
end--if
numtamp = 2 -- ONE-based index -- check after "%"
numhelpcn = 0 -- counts blocked boundaries (max 8)
while true do -- inner genuine loop
if ((numtamp>numlaong) or (numhelpcn>8)) then
break -- inner loop only -- good or bad
end--if
numogt = string.byte (strfrafra,numtamp,numtamp) -- SPACE or HEX req
numtamp = numtamp + 1
if (numogt==32) then
numoet = 0
if (numtamp<=numlaong) then
numoet = string.byte (strfrafra,numtamp,numtamp) -- "#" required
end--if
if (numoet~=35) then
numerr = 23 -- #E23 "fra=" is bad
end--if
break -- inner loop only -- good or bad
end--if
numtbindx = lfnonehextoint (numogt)
if ((numtbindx==255) or (numtbindx<=numprevdx)) then
numerr = 23 -- #E23 "fra=" is bad (not ascending)
break -- inner loop only
end--if
tabblock [numtbindx] = '1' -- type "string"
numhelpcn = numhelpcn + 1
numprevdx = numtbindx
end--while
end--if
if (numhelpcn>8) then
numerr = 23 -- #E23 "fra=" is bad
end--if
if (numerr~=0) then
break -- outer loop with #E23
end--if
if (numtamp>numlaong) then
break -- outer fake loop -- OK
end--if
numprevdx = - 1 -- must be ascending, index ZERO valid, restart from it
while true do -- inner genuine loop
if (numtamp>numlaong) then
break -- inner loop only -- good end of string
end--if
numogt = string.byte (strfrafra,numtamp,numtamp) -- "#" required
numtamp = numtamp + 1
if (numogt~=35) then
numerr = 23 -- #E23 "fra=" is bad
break -- inner loop only
end--if
if (numtamp>numlaong) then
numerr = 23 -- #E23 "fra=" is bad
break -- inner loop only
end--if
numogt = string.byte (strfrafra,numtamp,numtamp) -- HEX required
numtamp = numtamp + 1
numtbindx = lfnonehextoint (numogt)
if ((numtbindx==255) or (numtbindx<=numprevdx)) then
numerr = 23 -- #E23 "fra=" is bad (bad hex digit or not ascending)
break -- inner loop only
end--if
strkrositem = "" -- no valid hit yet -- prevalidated from "#"-syntax
if (numtamp>numlaong) then
numerr = 23 -- #E23 "fra=" is bad
break -- inner loop only
end--if
numodt = string.byte (strfrafra,numtamp,numtamp) -- one of 4 required
numtamp = numtamp + 1
if ((numodt==78) or (numodt==73) or (numodt==65)) then
strkrositem = string.char (numodt) -- "string" of "N" or "I" or "A"
if (numtamp<=numlaong) then
numoet = string.byte (strfrafra,numtamp,numtamp) -- SPACE required
numtamp = numtamp + 1 -- SPACE must be eaten away here !!!
if (numoet~=32) then
numerr = 23 -- #E23 "fra=" is bad
end--if
if (numtamp<=numlaong) then
numoet = string.byte (strfrafra,numtamp,numtamp) -- "#" required
end--if
if (numoet~=35) then
numerr = 23 -- #E23 "fra=" is bad
end--if
end--if
end--if ((numodt==78) or (numodt==73) or (numodt==65)) then
if (numodt==58) then -- ":"
numhelpcn = 0 -- counts char:s in the link target
while true do -- deep genuine loop
if ((numtamp>numlaong) or (numhelpcn==41)) then
break -- deep loop only -- good or bad
end--if
numodt = string.byte (strfrafra,numtamp,numtamp) -- trash "numodt"
numtamp = numtamp + 1
if (numodt==32) then
numoet = 0 -- SPACE must be eaten away here !!! INC is above
if (numtamp<=numlaong) then
numoet = string.byte (strfrafra,numtamp,numtamp) -- "#" required
end--if
if (numoet~=35) then
numerr = 23 -- #E23 "fra=" is bad
end--if
break -- deep loop only -- good or bad
end--if
strkrositem = strkrositem .. string.char(numodt) -- no ":" prf yet
numhelpcn = numhelpcn + 1
end--while
if ((numhelpcn==0) or (numhelpcn>40)) then
numerr = 23 -- #E23 "fra=" is bad
end--if
if (numerr~=0) then
break -- inner loop with #E23
end--if
strkrositem = ":" .. strkrositem -- add the prefix
end--if (numodt==58) then
if (strkrositem=='') then
numerr = 23 -- #E23 "fra=" is bad
end--if
if (numerr~=0) then
break -- inner loop with #E23
end--if
tablinx [numtbindx] = strkrositem
numprevdx = numtbindx
end--while
break -- finally to join mark
end--while -- fake loop -- join mark
end--if ((numerr==0) and (numsplit==1)) then
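-- EXAMPLE ONLY: the long comment below is never executed by this module.
--[==[
-- A minimal sketch of the assisted split syntax validated above, as it can
-- be read off the code: optional "%" plus ascending hex digits (blocked
-- boundaries), then "#"-items of the form "#" + hex digit + ("N" or "I" or
-- "A" or ":" + link target), separated by single spaces.
-- Assumptions: the name "parseassisted" is hypothetical, "%x" accepts both
-- letter cases (the real "lfnonehextoint" may be stricter), and the limits
-- of 8 blocked boundaries and 40 char:s per link target are NOT checked.
local function parseassisted (strctl)
  local tabblock, tablinx = {}, {}
  local strhex, strrest = string.match (strctl, '^%%(%x+)(.*)$')
  if strhex then
    local numprev = -1
    for strdigit in string.gmatch (strhex, '%x') do
      local numidx = tonumber (strdigit, 16)
      if numidx <= numprev then return nil end -- not ascending
      tabblock[numidx] = '1'
      numprev = numidx
    end
    if strrest ~= '' then
      -- after the digits only SPACE followed by "#" is legal
      if string.sub (strrest, 1, 2) ~= ' #' then return nil end
      strrest = string.sub (strrest, 2) -- eat the SPACE
    end
  elseif string.sub (strctl, 1, 1) == '%' then
    return nil -- "%" must not be empty
  else
    strrest = strctl
  end
  -- reject leading, trailing and double SPACE before splitting into items
  if string.find (strrest, '^ ') or string.find (strrest, ' $')
     or string.find (strrest, '  ', 1, true) then
    return nil
  end
  local numprev = -1
  for stritem in string.gmatch (strrest, '[^ ]+') do
    local strdigit, strwhat = string.match (stritem, '^#(%x)(.+)$')
    if not strdigit then return nil end
    local numidx = tonumber (strdigit, 16)
    if numidx <= numprev then return nil end -- not ascending
    local boogood = (strwhat == 'N') or (strwhat == 'I') or (strwhat == 'A')
    if not boogood then
      boogood = (string.find (strwhat, '^:.') ~= nil) -- ":" needs a target
    end
    if not boogood then return nil end
    tablinx[numidx] = strwhat
    numprev = numidx
  end
  return tabblock, tablinx
end
-- usage (made-up control string):
-- local tabb, tabl = parseassisted ('%02 #1N #3:foo')
-- tabb[0] and tabb[2] hold '1', tabl[1] holds 'N', tabl[3] holds ':foo'
]==]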
lfdshowvar (tabblock,'tabblock','from "%" assi',17)
lfdshowvar (tablinx,'tablinx','from "#" assi',17)
lfdshowvar (numerr,'numerr','done with 2 tables')
---- STRATE 2 -- PROCESS VALIDATE SPLIT CONTROL STRING TO 1 TABLE ----
-- process from "strfrafra" to "tabmnfragments" later processed
-- in "lfsplitmn" "qsplitter" and we need "strpagenam" too
-- so far "numsplit" equal 2 means only that "strfrafra" is
-- 4...120 octets and does NOT begin with "#" or "%" and is free from some
-- evil stuff such as "++" "''" "[[" "]]" "[]" "[http" "[https" but not more
-- examples of valid syntax:
-- "[C:per-...-an/per][M:tidak][M:sama][C:per-...-an/an]"
-- "[C:per-...-an/per]+[M:kereta( )api]+[C:per-...-an/an]"
-- "[M:loep(a)]+[U:-are/ar(e)]+[M:sko]"
-- "[M:kung]+a+[M:doeme]"
-- "[I:et]+[L:fingr(o)]+[U:o]"
-- spaces are restricted:
-- * a field must neither begin nor end with a space ("[U:-are /ar(e)]" is bad)
-- * a deleted substring must neither begin nor end with
-- a space ("[M:loep( a)]" is bad)
-- * deleted single spaces are prohibited after "L:" but otherwise
-- permitted ("[L:fingr( )]" is bad but "[M:kereta( )api]" is good)
-- we have to count slashes to make sure not to get more
-- than 1 in a single fragment
-- we do NOT have to count colons because they are ignored if
-- not in the beginning, thus we cannot get more than 1 in a fragment ;-)
-- colon is only regarded and can cause an error if:
-- * preceded by an uppercase letter
-- * those 2 char:s are located in the beginning of fragment and inside [...]
-- otherwise it is considered to be an ordinary letter
-- * for example "+[M:crap]" is regarded and valid (although maybe useless)
-- * for example "+[A:crap]" is regarded and an error
-- * for example "+[m:A:crap]" and "A:crap" is maybe nonsense but ignored
-- and not an error against the spec
-- here we do NOT YET introduce wikilinks with double brackets and walls
-- here we do NOT YET expand "+" to " + "
-- here we do NOT YET add dashes to some affixes
-- here we DO CARRY OUT the "sum check"
-- all 8 letters C I M N P U W L permitted here (but L restricted)
-- "strfragtbl" bunches the fragment EXCLUDING possible "+" and "[" and "]"
-- but they are RE-ADDED before it is stored in "tabmnfragments" !!!
-- this can brew #E23 except for "sum check" carried out here giving #E16
-- "STRING FUNCTIONS"\"lftestspace" and "STRING FUNCTIONS"\"lfidebracket"
if ((numerr==0) and (numsplit==2)) then
numlaong = string.len (strfrafra)
numtamp = 1 -- ONE-based source char index
numhelpcn = 0 -- number valid fragments defined (less than 1 or 2 illegal)
numnestin = 0 -- number of source opened '[' (only ZERO or ONE is legal)
numofslhs = 0 -- number of source slashes '/' (only ZERO or ONE is legal)
strfragtbl = '' -- fragment incl prefix ("M:") and "/"
str2field = '' -- visible part of fragment after slash for "sum check"
strreconl = '' -- reconstructed complete lemma for "sum check"
boohvtext = false -- have ordinary text char:s in field incl () excl +[/]
boohvpref = false -- have prefix "M:" or similar inside []
boo210kl = false -- separate verdict for every fragment "L:" used
booslshxx = false -- true if we got slash inside complete control string
boohavepl = false -- bracketed fragment is preceded by plus "+"
while true do -- genuine loop, "numtamp" is the counter
if (numtamp>numlaong) then
if (numnestin==1) then
numerr = 23 -- #E23 "fra=" is bad (unclosed '[')
break -- damn
end--if
if (boohvtext) then -- flush: do add but no need to erase anymore
strreconl = strreconl .. str2field -- same thing if type "000"
if (boohavepl) then
str2field = "+" .. str2field -- no [] and no spaces yet here
end--if
tabmnfragments[numhelpcn] = str2field -- same thing if type "000"
numhelpcn = numhelpcn + 1
end--if
break -- done (some checks pending)
end--if
if (numhelpcn==16) then
numerr = 23 -- #E23 "fra=" is bad due to more than 16 fragments
break -- damn
end--if
numoht = 0 -- previous char
if (numtamp~=1) then
numoht = string.byte (strfrafra,(numtamp-1),(numtamp-1)) -- can get it
end--if
numoct = string.byte (strfrafra,numtamp,numtamp)
numtamp = numtamp + 1
numogt = 0 -- pre-peeked following char
if (numtamp<=numlaong) then
numogt = string.byte (strfrafra,numtamp,numtamp) -- we can pre-peek
end--if
boocaught = false -- becomes true if char already caught (by the traps below)
if (numoct==32) then -- space -- keep "boocaught" false
if ((not boohvtext) and (numnestin==1)) then -- chk only inside []
numerr = 23 -- #E23 "fra=" is bad due to field beginning with space
break -- damn
end--if
end--if
if (numoct==43) then -- plus "+" is fragment separator
boocaught = true
if ((numoht==32) or (numogt==32) or (numoht==43) or (numogt==43)) then
numerr = 23 -- #E23 "fra=" is bad due to space or double plus
break -- damn
end--if
if ((numoht~=93) and (numogt~=91)) then
numerr = 23 -- #E23 "fra=" is bad due to bad use of '+' no "[","]"
break -- damn
end--if
if (numnestin==1) then
numerr = 23 -- #E23 "fra=" is bad due to bad use of '+' inside []
break -- damn
end--if
if (boohvtext) then -- flush: do add and do erase then
strreconl = strreconl .. str2field -- same thing if type "F000"
if (boohavepl) then -- possible previous plus, not this one !!!
str2field = "+" .. str2field -- no [] and no spaces yet here
end--if
tabmnfragments[numhelpcn] = str2field -- same thing if type "F000"
numhelpcn = numhelpcn + 1
end--if
strfragtbl = '' -- for the table (not yet including rectangular bra)
str2field = '' -- visible for "sum check"
boohvtext = false -- empty field ready to be filled with garbage
boohvpref = false -- empty field ready to be filled with garbage
boo210kl = false -- separate verdict for every fragment
boohavepl = true -- need this later when adding or flushing
end--if
if (numoct==91) then
boocaught = true -- do NOT touch "boohavepl" !!! needed later
if (numnestin==1) then
numerr = 23 -- #E23 "fra=" is bad due to nesting of '['
break -- damn
end--if
numnestin = 1 -- after opening '[' and keep "boohavepl" untouched
if (boohvtext) then -- flush: do add and do erase then
strreconl = strreconl .. str2field -- same thing if type "F000"
if (boohavepl) then
str2field = "+" .. str2field -- no [] and no spaces yet here
end--if
tabmnfragments[numhelpcn] = str2field -- same thing if type "F000"
numhelpcn = numhelpcn + 1
end--if
strfragtbl = '' -- for the table (not yet including rectangular bra)
str2field = '' -- visible for "sum check"
boohvtext = false -- empty field ready to be filled with garbage
boohvpref = false -- empty field ready to be filled with garbage
boo210kl = false -- separate verdict for every fragment
end--if
if (numoct==93) then
boocaught = true
if ((numnestin==0) or (not boohvtext)) then
numerr = 23 -- #E23 "fra=" is bad (unmatched ']' or empty '[]')
break -- damn
end--if
if (lftestspace(str2field)) then -- test visible part only here
numerr = 23 -- #E23 "fra=" is bad (criminal spaces)
break -- damn
end--if
if (boo210kl) then
strinnertst = lfidebracket (strfragtbl,false,1) -- inner part, no "/"
if (lftestspace(strinnertst)) then
numerr = 23 -- #E23 "fra=" bad criminal spaces inside ( ) after "L:"
break -- damn
end--if
end--if
strreconl = strreconl .. lfidebracket (str2field,true,1) -- visible part
strfragtbl = '[' .. strfragtbl .. ']' -- plus "+" outside of [] !!!
if (boohavepl) then
strfragtbl = "+" .. strfragtbl -- outside and no spaces yet here
end--if
tabmnfragments[numhelpcn] = strfragtbl -- complete fragment
numhelpcn = numhelpcn + 1
strfragtbl = '' -- for the table (not yet including rectangular bra)
str2field = '' -- visible for "sum check"
boohvtext = false -- empty field ready to be filled with garbage
boohvpref = false -- empty field ready to be filled with garbage
boohavepl = false -- the "+" belonged to the fragment just stored
boo210kl = false -- separate verdict for every fragment
numnestin = 0 -- again ZERO after closing ']'
numofslhs = 0 -- reset number of slashes to ZERO
end--if
if ((numogt==58) and lfgtestuc(numoct) and (numnestin==1) and (numofslhs==0) and (not boohvtext) and (boohvpref==false)) then
boocaught = true
if (lfchk789ucase(numoct,true,false)) then
numtamp = numtamp + 1 -- OK, eat it away for now C I M N P U W L
strfragtbl = string.char (numoct) .. ':' -- begin fragment for table
boo210kl = (numoct==76) -- "L"
boohvpref = true
else
numerr = 23 -- #E23 "fra=" is bad (wrong uppercase before ":")
break -- damn
end--if
end--if
if (numoct==47) then
boocaught = true -- slash "/"
if ((numofslhs==1) or (lftestspace(str2field)) or boo210kl) then
numerr = 23 -- #E23 "fra=" bad due to space or excess slash or "L"
break -- damn
end--if
booslshxx = true -- YES -- exists in complete split control string
numofslhs = 1 -- number of "/" in this fragment
strfragtbl = strfragtbl .. '/' -- add "/" to fragment no wall yet
str2field = '' -- OTOH clear the visible part
boohvtext = false -- in 1 field, note that empty after slash is NOT LEGAL
end--if
if (boocaught==false) then
strfragtbl = strfragtbl .. string.char (numoct) -- add to frag for tbl
str2field = str2field .. string.char (numoct) -- add for "sum check"
boohvtext = true
end--if
end--while -- genuine loop, "numtamp" is the counter
if ((numhelpcn==0) or ((numhelpcn==1) and (booslshxx==false))) then
numerr = 23 -- #E23 "fra=" is bad (at least 1 or 2 fragments required)
end--if
if ((numerr==0) and (strpagenam~=strreconl)) then
numerr = 16 -- #E16 -- "fra=" is bad -- "sum check"
lfdtracemsg ('Failed "sum check" in manual split : "' .. strpagenam .. '" <> "' .. strreconl .. '"')
end--if
end--if ((numerr==0) and (numsplit==2)) then
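-- EXAMPLE ONLY: the long comment below is never executed by this module.
--[==[
-- A minimal sketch of the "sum check" idea used above: the visible parts of
-- all fragments, glued together, must equal the pagename. Assumptions: the
-- name "reconstructlemma" is hypothetical, the control string is taken to be
-- syntactically valid already, and "(...)" is treated as the deleted
-- substring (this mimics what "lfidebracket" appears to do, the real
-- function may differ in details).
local function reconstructlemma (strctl)
  local strsum = ''
  local numpos = 1
  strctl = string.gsub (strctl, '%+', '') -- "+" never contributes to the sum
  while numpos <= string.len (strctl) do
    local numbeg, numend, strinner = string.find (strctl, '^%[(.-)%]', numpos)
    if numbeg then
      strinner = string.gsub (strinner, '^%u:', '') -- drop prefix like "M:"
      strinner = string.gsub (strinner, '^.-/', '') -- keep part after "/"
      strinner = string.gsub (strinner, '%b()', '') -- drop deleted substring
    else
      -- bare text outside "[...]" is taken over unchanged
      numbeg, numend, strinner = string.find (strctl, '^([^%[]+)', numpos)
      if not numbeg then return nil end -- unclosed "["
    end
    strsum = strsum .. strinner
    numpos = numend + 1
  end
  return strsum
end
-- usage with examples from the comment block above:
-- reconstructlemma ('[M:loep(a)]+[U:-are/ar(e)]+[M:sko]') gives "loeparsko"
-- reconstructlemma ('[M:kung]+a+[M:doeme]') gives "kungadoeme"
]==]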
lfdshowvar (tabmnfragments,'tabmnfragments','from manu done with one table',17)
lfdshowvar (numerr,'numerr')
---- PROCESS AND VALIDATE EXTRA PARAMETER ----
-- process fragments from "strextext" to "tabextfrog" removing rectangular
-- brackets and carrying out full validation (as opposed to above where
-- rectangular brackets are preserved in "tabmnfragments")
-- only type F210 is permitted and only C I M N P U W available
-- and ":" or "!" is required
-- no arc brackets "(" ")" no plus "+" no slash "/" (this is already checked)
-- alternatively expand "&"-syntax from "strextext" to "tabextfrog"
-- getting 1 or 2 "!"-fragments, even "X" permitted
numlaong = 0 -- pre-assume empty parameter
if ((numerr==0) and boohavext) then
numlaong = string.len (strextext)
end--if
if (numlaong==2) then
numoct = string.byte (strextext,2,2) -- only 8 possible letters tolerable
if (numoct==88) then
tabextfrog[0] = 'M!' .. strpagenam
tabextfrog[1] = 'W!' .. strpagenam
else
tabextfrog[0] = string.char(numoct,33) .. strpagenam -- "!" too
end--if
end--if
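-- EXAMPLE ONLY: the long comment below is never executed by this module.
--[==[
-- A minimal sketch of the "&"-syntax expansion carried out just above.
-- Assumptions: the name "expandamp" is hypothetical and the parameter is
-- taken to be prevalidated, ie exactly 2 char:s long with a tolerable
-- letter in the second position.
local function expandamp (strext, strpage)
  local strletter = string.sub (strext, 2, 2)
  if (strletter == 'X') then
    -- "X" is shorthand for both "M" and "W"
    return { [0] = 'M!' .. strpage, [1] = 'W!' .. strpage }
  end
  return { [0] = strletter .. '!' .. strpage } -- single "!"-fragment
end
-- usage (made-up pagename):
-- expandamp ('&X','kato') gives { [0] = 'M!kato', [1] = 'W!kato' }
-- expandamp ('&U','kato') gives { [0] = 'U!kato' }
]==]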
if (numlaong>=5) then -- skip this for "&"-syntax or empty
numtamp = 1 -- ONE-based source char index
numhelpcn = 0 -- number of valid fragments defined
numnestin = 0 -- number of source opened '[' (only ZERO or ONE is legal)
strfragtbl = '' -- fragment incl prefix (fe "M:" "U!") excl "[" and "]"
boohvtext = false -- have ordinary text char:s in field excl [] and prefix
while true do -- genuine loop, "numtamp" is the counter
if (numtamp>numlaong) then
if (numnestin==1) then
numerr = 20 -- #E20 -- "ext=" is bad due to unclosed '['
end--if
break -- done good or evil
end--if
if (numhelpcn==4) then
numerr = 20 -- #E20 -- "ext=" is bad due to more than 4 fragments
break -- damn
end--if
numoct = string.byte (strextext,numtamp,numtamp)
numtamp = numtamp + 1
boocaught = false -- becomes true if char already caught (by the traps below)
if (numoct==91) then
boocaught = true
if (numnestin==1) then
numerr = 20 -- #E20 -- "ext=" is bad due to nesting of '['
break -- damn
end--if
numoct = string.byte (strextext,numtamp,numtamp) or 0 -- mortyp prefix, ZERO if beyond end
numtamp = numtamp + 1
if (lfchk789ucase(numoct,false,false)==false) then
numerr = 20 -- #E20 -- "ext=" is bad due to lack of valid prefix
break -- damn need C I M N P U W even in "ext=" no "X" here
end--if
numodt = string.byte (strextext,numtamp,numtamp) or 0 -- preserve "numoct", ZERO if beyond end
numtamp = numtamp + 1
if ((numodt~=58) and (numodt~=33)) then -- ":" or "!"
numerr = 20 -- #E20 -- "ext=" is bad due to lack of valid prefix
break -- damn
end--if
strfragtbl = string.char (numoct,numodt) -- begin fragment for table
numnestin = 1 -- after opening '['
boohvtext = false -- empty field ready to be filled with garbage
end--if
if (numoct==93) then
boocaught = true
if ((numnestin==0) or (not boohvtext)) then
numerr = 20 -- #E20 -- "ext=" bad due to unmatched ']' or empty '[]'
break -- damn
end--if
tabextfrog[numhelpcn] = strfragtbl -- store complete fragment
numhelpcn = numhelpcn + 1
strfragtbl = '' -- for the table (not including rectangular bra)
boohvtext = false -- empty field ready to be filled with garbage
numnestin = 0 -- again ZERO after closing ']'
end--if
if (boocaught==false) then
if (numnestin==0) then
numerr = 20 -- #E20 -- "ext=" bad due to text outside [] ie type F000
break -- damn
end--if
strfragtbl = strfragtbl .. string.char (numoct) -- add to frag for tbl
boohvtext = true
end--if
end--while -- genuine loop, "numtamp" is the counter
end--if (numlaong>=5) then
lfdshowvar (tabextfrog,'tabextfrog','from "ext=" done with one table',5)
lfdshowvar (numerr,'numerr')
---- PEEK THE LANGUAGE NAMES ----
-- lang name in site language (/c0/ -- "connumtblc0")
-- (bookonata = false) is only possible for "??", less evil than (numerr>0)
-- lang name in the language itself, "propralingve" (/c2/ -- "connumtblc2")
-- (boohavasi = false) is barely bad, needed far below
-- silly isolators "constrisobg" and "constrisoen" are needed for
-- "strnamasin" (valid only if "boohavasi" is true) but not for "strnambah"
bookonata = false
boohavasi = false
if (numerr==0) then
while true do -- fake loop
if (strkodbah=='??') then -- "??" is unknown but not faulty
strnambah = contablaxwc [2] -- placeholder for unknown lang
break
end--if
tabonelang = tablg78ysubt[strkodbah] -- get subtable from T78 incoming code
if (type(tabonelang)~='table') then
tabonelang = tablg78ysubt[strpiklangcode] -- get subtable from T78 site code
if (type(tabonelang)=='table') then
numerr = 12 -- #E12 unknown code (since site code works)
else
numerr = 2 -- #E02 "malica" ie malicious (site code does NOT work either)
end--if
break
end--if
strnambah = tabonelang[connumtblc0] -- in site language /c0/
if (type(strnambah)~='string') then -- absolutely required
numerr = 2 -- #E02 "malica" ie malicious
break
end--if
if (strnambah=="-") then -- better content in /c0/ absolutely required
numerr = 2 -- #E02 "malica" ie malicious
break -- no point in continuing with a useless name
end--if
bookonata = true
strnamasin = tabonelang[connumtblc2] -- name in the language itself /c2/
if (type(strnamasin)~='string') then
break -- NOT an error
end--if
if (strnamasin~="-") then
boohavasi = true -- have valid name better than "-" to display
strnamasin = constrisobg .. strnamasin .. constrisoen -- add the isola
end--if
break -- finally to join mark
end--while -- fake loop -- join mark
end--if (numerr==0) then
---- TRANSLATE WORD CLASS CODE VIA LUA TABLE ----
-- "strnamke6" and "strnamke7" are the long word classes with possible (...)
if (numerr==0) then
if (strkodkek6=='??') then -- "??" is unknown but not faulty
strnamke6 = contablaxwc [3] -- word class full -- unknown word class
else
vartmp = contabwc [strkodkek6]
if (vartmp==nil) then
numerr = 13 -- #E13 -- unknown word class
else
strnamke6 = vartmp -- word class full -- found it in the table
end--if
end--if (strkodkek6=='??') else
end--if
if ((numerr==0) and boohavdua) then
vartmp = contabwc[strkodkek7] -- no "??" possible here
if (vartmp==nil) then
numerr = 13 -- #E13 -- unknown word class
else
strnamke7 = vartmp -- word class full -- found it in the table
end--if
end--if
---- PARTIALLY FILL INSANE TABLE ----
-- base categories are created even for unknown lang or wc
-- compound categories are available only if lang is known
-- insertable items defined:
-- constant:
-- * LK langcode (unknown "??" legal but take care elsewhere)
-- * LN langname (unknown legal, for example "dana" or "Ido")
-- * LU langname uppercased (unknown legal, for example "Dana" or "Ido")
-- * LO langname not own (empty or nil if own)
-- * LV langname uppercased not own (empty or nil if own)
-- * LY langname long (for example "bahasa Swedia")
-- * LZ langname long not own (empty or nil if own)
-- * SC script code (for example "T", "S", "P" for ZH, "C" "L" for SH)
-- variable (we can have 2 word classes):
-- * WC word class name (for example "substantivo")
-- * WU word class name uppercased (for example "Substantivo")
-- * MT mortyp code (for example "C")
-- * FR fragment (for example "peN-...-an" or "abelujo")
if (numerr==0) then
if (bookonata) then
strnambalo = lfilongname (strnambah,strpiklangcode) -- brew long, maybe needed
else
strnambalo = strnambah -- not longer than that
end--if
strtomp = lfucasegene(strnambah,true,false) -- short uppercased
tabstuff = {} -- bugger all inside
tabstuff["LK"] = strkodbah
tabstuff["LN"] = strnambah -- short name
tabstuff["LU"] = strtomp -- uppercased name
tabstuff["LY"] = strnambalo -- long name
if (strkodbah~=strpiklangcode) then -- "not own" items only for foreign lang
tabstuff["LO"] = strnambah -- short name
tabstuff["LV"] = strtomp -- uppercased name
tabstuff["LZ"] = strnambalo -- long name
end--if
tabstuff["SC"] = strscrscr -- script (may be empty)
end--if
---- SPLIT THE LEMMA + EXTRA IF NEEDED VIA SUBMODULE ----
-- process from "strpagenam" (already guaranteed to be
-- non-empty) to "strlemma" (actually NOT for manual split)
-- "numshowlemma" : 0 none 1 raw 2 maybe spl 3 maybe spl and morpheme cat:s
-- "numsplit" : 0 auto 1 assisted 2 manu 3 srs 4 sbr 5 large 7 none
-- "numsplit" must be 7 #S7 if "numshowlemma" is 0 or 1 !!!
-- we do exactly nothing (and leave "strlemma" empty) if:
-- * we already suck ie "numerr"<>0
-- * "numshowlemma" is ZERO
-- we skip the split and copy only if:
-- * "numsplit" is 7 (#S7 no split, can be due to "numshowlemma" equal 1)
-- punctuation (5 char:s: ! , . ; ?) 21 33 | 2C 44 | 2E 46 | 3B 59 | 3F 63
-- dash "-" and apo "'" do NOT count as punctuation (for auto and assisted)
-- we depend on "boomo3kat" and "bookonata" (they can switch off some cat:s)
-- we depend on "boohavkal" (switches between "vortgrupo" and "frazo")
-- "qtabkatoj" is very global
-- 0...17 cat names without "Category:" prefix, unused "nil"
-- 20...37 "true" if main page, otherwise "nil"
-- "qsplitter" will fill it from split and from "ext=" too
if ((numerr==0) and (numshowlemma~=0)) then
qtabkatoj [0] = (boomo3kat and bookonata) -- do we want compound cat:s ???
qtabkatoj [1] = strpagenam
qtabkatoj [2] = numsplit
qtabkatoj [3] = tabblock
qtabkatoj [4] = tablinx
qtabkatoj [5] = tabmnfragments
qtabkatoj [6] = tabextfrog -- from "ext=" fragments
qtabkatoj [7] = boohavext -- from "ext=" is true if "tabextfrog" is valid
qtabkatoj [8] = tabstuff
qtabkatoj [9] = boohavnyr
qtabkatoj[10] = boohavkal
qtabkatoj[15] = boodetrc
lfdshowvar (qtabkatoj,'qtabkatoj','before firing submodule',40)
qtabkatoj = qsplitter.ek { args = qtabkatoj } -- !!! FIRED HERE !!!
lfdshowvar (qtabkatoj,'qtabkatoj','after return from submodule',50) -- skip debug stuff
lfdimportreport ('Report from submodule:<br><br>"' .. tostring (qtabkatoj[51]) .. '<br>"')
while true do -- fake loop
vartmp = qtabkatoj[50]
if (type(vartmp)~='string') then
numerr = 4 -- #E04 "malica" ie malicious
break
end--if
strlemma = vartmp
num2statcode, bootimp = lfhvali1status99code (qtabkatoj[52]) -- from "msplitter"
if (num2statcode~=0) then
if (bootimp) then
numerr = 5 -- #E05 "nombrigita" ie numbered (status code from submodule)
else
numerr = 4 -- #E04 "malica" ie malicious
end--if
break -- to join mark
end--if
break -- finally to join mark
end--while -- fake loop -- join mark
end--if
---- WHINE IF YOU MUST #E02...#E99 ----
-- reporting of errors #E02...#E99 depends on uncommentable strings
-- and name of the caller filled in from "strpikparent"
while true do -- fake loop
if (numerr<=1) then
break -- success or #E01
end--if
if (numerr==3) then -- #E03
strviserr = lfhbrewerrsm(numerr,'loaddata-tbllingvoj',num2statcode)
break
end--if
if (numerr==5) then -- #E05
strviserr = lfhbrewerrsm(numerr,'msplitter',num2statcode)
break
end--if
strviserr = lfhbrewerrsm(numerr,nil,nil)
break -- finally to join mark
end--while -- fake loop -- join mark
---- BREW 1 OR 2 EXTRA STRING:S ONLY FOR CATEGORIES ----
-- content in "strnamco6" and "strnamco7" is word class stripped
if (numerr==0) then
strnamco6 = lfstripparent(strnamke6)
if (boohavdua) then
strnamco7 = lfstripparent(strnamke7)
end--if
end--if
---- BREW THE INVISIBLE ANCHOR PART ----
-- uses "constrankkom" (does NOT end with a dash) and "constaankend"
-- '<span id="' .. anchor name .. '"></span>'
-- we can brew 2 or 3 or 5 anchors
strinvank = ''
if (numerr==0) then
strinvank = constrankkom .. "-" .. strkodbah .. constaankend
strinvank = strinvank .. constrankkom .. "-" .. strkodbah .. "-" .. strkodkek6 .. constaankend
if (boohavdst) then
strinvank = strinvank .. constrankkom .. "-" .. strkodbah .. "-" .. strkodkek6 .. '-' .. strdstdst .. constaankend
end--if
if (boohavdua) then
strinvank = strinvank .. constrankkom .. "-" .. strkodbah .. "-" .. strkodkek7 .. constaankend
if (boohavdst) then
strinvank = strinvank .. constrankkom .. "-" .. strkodbah .. "-" .. strkodkek7 .. '-' .. strdstdst .. constaankend
end--if
end--if (boohavdua) then
end--if
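-- EXAMPLE ONLY: the long comment below is never executed by this module.
--[==[
-- A minimal sketch of the anchor brewing above, showing why 2 or 3 or 5
-- anchors come out. Assumptions: the name "brewanchors" and all values in
-- the usage lines are hypothetical, and the wrapper '<span id="..."></span>'
-- is written out literally instead of coming from "constrankkom" and
-- "constaankend".
local function brewanchors (strprefix, strlang, tabwc, strdst)
  local strout = ''
  local numcount = 0
  local function addone (strid)
    strout = strout .. '<span id="' .. strid .. '"></span>'
    numcount = numcount + 1
  end
  addone (strprefix .. '-' .. strlang) -- language only
  for _, strwc in ipairs (tabwc) do -- 1 or 2 word class codes
    addone (strprefix .. '-' .. strlang .. '-' .. strwc)
    if strdst then -- optional distinction hint
      addone (strprefix .. '-' .. strlang .. '-' .. strwc .. '-' .. strdst)
    end
  end
  return strout, numcount
end
-- usage:
-- brewanchors ('x','sv',{'SB'}) brews 2 anchors
-- brewanchors ('x','sv',{'SB'},'2') brews 3 anchors
-- brewanchors ('x','sv',{'SB','VE'},'2') brews 5 anchors
]==]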
---- BREW THE VISIBLE PART ----
-- "strlemma" is the lemma with or without separation links
-- "numshowlemma" is four-state but here we bother only about ZERO vs non-ZERO
strvisgud = ''
if (numerr==0) then
strvisgud = contabscrmisc[0] -- <div...></div> must be empty -- tiny EOL
if (booshowimage) then
strvisgud = strvisgud .. contabscrmisc[1] -- "File:Garto" ... with [[]]
end--if
strvisgud = strvisgud .. ' '
if (numshowlemma~=0) then
strvisgud = strvisgud .. '<b><bdi>' .. strlemma .. '</bdi></b> ' -- lemma and space
end--if
strvisgud = strvisgud .. '( <span ' .. constrtoolt .. ' title="' .. contablaxwc [0] .. strnambah
if (boohavasi) then
strvisgud = strvisgud .. ' ' .. strnamasin -- lang name in the lang with isola
end--if
strvisgud = strvisgud .. '"> ' .. strkodbah .. ' </span>'
strvisgud = strvisgud .. ' , <span ' .. constrtoolt .. ' title="' .. contablaxwc [1] .. strnamke6
strvisgud = strvisgud .. '"> ' .. strkodkek6 .. ' </span>'
if (boohavdua) then
strvisgud = strvisgud .. ' , <span ' .. constrtoolt .. ' title="' .. contablaxwc [1] .. strnamke7
strvisgud = strvisgud .. '"> ' .. strkodkek7 .. ' </span>'
end--if
strvisgud = strvisgud .. ' )'
end--if
---- BREW THE INVISIBLE CATEGORY LIST BASE PART ----
-- Need string "constrkatq" cat prefix NOT including colon ":".
-- We need sub "lfiultiminsert" (2 para) and table "contabkatoj" controlling
-- the structure of the cat name.
-- Note that these categories are unique as they:
-- do NOT pass through "qtabkatoj"
-- contain a word class for 2 of 3
-- are created even for unknown lang or wc
strinvkat = ''
if ((numerr==0) and (not boonocat)) then -- !!!FIXME!!! remove "nocat" in favor of "pate"
tabstuff["MT"] = nil -- no stupid morpheme here
tabstuff["FR"] = nil -- no stupid fragment here
numkindex = 0 -- index 0...2
while true do
if (numkindex==3) then
break
end--if
varkantctl = contabkatoj[numkindex] -- 0...2 pick main data string no "nil"
if (type(varkantctl)=='string') then
numtamp = string.len(varkantctl)
if (numtamp>=2) then
bootimp = lfifinditems(varkantctl,"WCWU") -- word class
if (bootimp) then
tabstuff["WC"] = strnamco6
tabstuff["WU"] = lfucasegene(strnamco6,true,false)
strinvkat = strinvkat .. '[[' .. constrkatq .. ':' .. lfiultiminsert (varkantctl,tabstuff) .. ']]'
if (boohavdua) then
tabstuff["WC"] = strnamco7
tabstuff["WU"] = lfucasegene(strnamco7,true,false)
strinvkat = strinvkat .. '[[' .. constrkatq .. ':' .. lfiultiminsert (varkantctl,tabstuff) .. ']]'
end--if
else
tabstuff["WC"] = nil -- no word class, only lng
tabstuff["WU"] = nil -- no word class, only lng
strinvkat = strinvkat .. '[[' .. constrkatq .. ':' .. lfiultiminsert (varkantctl,tabstuff) .. ']]'
end--if
end--if (numtamp>=2) then
end--if (type(varkantctl)=='string') then
numkindex = numkindex + 1
end--while
end--if
---- ENHANCE THE INVISIBLE CATEGORY LIST WITH COMPOUND PART ----
-- Need string "constrkatq" cat prefix NOT including colon ":".
-- List of cat names without NS prefix was maybe brewed by submodule "qsplitter"
-- and is stored in global "qtabkatoj" ([0]...[17]). At +20 we have requests
-- for main page of the category using "|-" as "key".
if ((numerr==0) and (not boonocat) and boomo3kat) then
do
local vartaamp = 0
local vartuump = 0
local numkiindex = 0 -- 0...17 max but stop at type "nil" guaranteed to occur
while true do
vartaamp = qtabkatoj[numkiindex] -- risk of type "nil"
vartuump = qtabkatoj[numkiindex+20] -- risk of type "nil"
bootimp = (vartuump==true) -- main flag -- must be boolean type
if (type(vartaamp)=='string') then
strinvkat = strinvkat .. '[[' .. constrkatq .. ':' .. vartaamp
if (bootimp) then
strinvkat = strinvkat .. '|-' -- main page of category
end--if
strinvkat = strinvkat .. ']]'
else
break -- abort at "nil"
end--if
numkiindex = numkiindex + 1
end--while
end--do
end--if
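-- EXAMPLE ONLY: the long comment below is never executed by this module.
--[==[
-- A minimal sketch of the table layout consumed above: cat names sit at
-- ZERO-based indexes from 0 on, the "main page" request for index N sits at
-- N+20, and the first non-string entry ends the list. Assumptions: the name
-- "brewcats" and the sample values are hypothetical.
local function brewcats (strprefix, tabkat)
  local strout = ''
  local numidx = 0
  while (type (tabkat[numidx]) == 'string') do
    strout = strout .. '[[' .. strprefix .. ':' .. tabkat[numidx]
    if (tabkat[numidx+20] == true) then
      strout = strout .. '|-' -- main page of category
    end
    strout = strout .. ']]'
    numidx = numidx + 1
  end
  return strout
end
-- usage:
-- brewcats ('Category', { [0] = 'A', [1] = 'B', [21] = true })
-- gives "[[Category:A]][[Category:B|-]]"
]==]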
---- RETURN THE JUNK STRING ----
strret = strviserr .. strinvank .. strvisgud .. strinvkat
if (qboodetrc) then
strret = "<br>" .. qstrtrace .. "<br><br>" .. strret
end--if
return strret
end--function
---- RETURN THE JUNK LUA TABLE ----
return exporttable