Module:etymology languages

From Wiktionary, the free dictionary
Jump to navigation Jump to search

This module provides etymology-only languages. It has a separate data module, Module:etymology languages/data, which stores information for etymology-only language codes.

It exports only three functions that behave similarly to the ones in Module:languages: getByCode, getByCanonicalName, and makeObject().

However, do not use these functions directly under normal circumstances. Instead, use getByCode(code, ..., true), getByCanonicalName(name, ..., true) in Module:languages, where the third parameter indicates that etymology-only languages can be returned, and if the second parameter is non-nil, an error is thrown if the specified code or name isn't recognized.

Etymology-only language objects returned by getByCode(), getByCanonicalName(), or makeObject() or by the corresponding methods in Module:languages function, for all intents and purposes, exactly like full Language objects. In particular, they support all the same methods, meaning they can for the most part be used as drop-in replacements for full Language objects. The only significant differences are:

  1. Etymology-only languages are nested underneath a full language (or, in rare cases, a full family), i.e. their parent (or parent's parent, etc.) is a full language (or family). This can be returned using getFull(). If you need the code or canonical name of the full language or family under which a given etymology-only language is nested, it is more efficient to use getFullCode() or getFullName() rather than calling getFull():getCode() or similar. (All three methods getFull(), getFullCode() and getFullName() exist on full languages, but the first one just returns back the same object and the others are equivalent to calling getCode() and getName(), respectively.)
  2. The getParent() method on an etymology-only language returns the direct containment parent (a full language, etymology-only language or family) of the etymology-only language. When called on full languages, it returns {nil}.
  3. Some modules that accept language objects are not prepared to accept etymology-only language objects. This particularly relates to categories, because currently only full languages can form poscatboiler or topic categories beginning with a language name or code. Modules that accept etymology-only language objects must be prepared to call getFull() or similar if they construct their own category names.

local export = {}

function export.makeObject(code)
	local data = mw.loadData("Module:etymology languages/data")[code]
	code = data and data.main_code or code

	if not data then
		return nil
	end

	local EtymologyLanguage = require("Module:languages").getByCode(data[5], nil, true, true)

	local familyCode
	if EtymologyLanguage:hasType("family") then
		-- Substrates are treated as child languages of "undetermined".
		if EtymologyLanguage:getCode() == "qfa-sub" then
			EtymologyLanguage = require("Module:languages").getByCode("und")
		end
		-- True etymology-only families (e.g. "ira-old") still need to grab the family code.
		familyCode = data[5]
	end
	-- Delete cached _type table to prevent the new object's hasType method from finding it via the metatable, as it only includes the parent's types.
	EtymologyLanguage._type = nil

	if not EtymologyLanguage then
		return nil
	end

	EtymologyLanguage.__index = EtymologyLanguage

	local lang = {_code = code}

	-- Parent is full language.
	if not EtymologyLanguage._stack then
		-- Create stack, accessed with rawData metamethod.
		lang._stack = {EtymologyLanguage._rawData, data}
		lang._rawData = setmetatable({}, {
			__index = function(t, k)
				-- Data that isn't inherited from the parent.
				local noInherit = {aliases = true, varieties = true, otherNames = true, main_code = true}
				if noInherit[k] then
					return lang._stack[#lang._stack][k]
				end
				-- Data that is appended by each generation.
				local append = {type = true}
				if append[k] then
					local parts = {}
					for i = 1, #lang._stack do
						table.insert(parts, lang._stack[i][k])
					end
					if type(parts[1]) == "string" then
						return table.concat(parts, ", ")
					end
				-- Otherwise, iterate down the stack, looking for a match.
				else
					local i = #lang._stack
					while not lang._stack[i][k] and i > 1 do
						i = i - 1
					end
					return lang._stack[i][k]
				end
			end,
			-- Retain immutability (as writing to rawData will break functionality).
			__newindex = function()
				error("table from mw.loadData is read-only")
			end
		})
		-- Non-etymological code is the parent code.
		lang._fullCode = EtymologyLanguage._code
	-- Parent is etymology language.
	else
		-- Copy over rawData and stack to the new object, and add new layer to stack.
		lang._rawData = EtymologyLanguage._rawData
		lang._stack = EtymologyLanguage._stack
		table.insert(lang._stack, data)
		-- Copy non-etymological code.
		lang._fullCode = EtymologyLanguage._fullCode
	end

	lang._familyCode = familyCode

	return setmetatable(lang, EtymologyLanguage)
end

function export.getByCode(code)
	return export.makeObject(code)
end

function export.getByCanonicalName(name)
	local byName = mw.loadData("Module:etymology languages/canonical names")
	local code = byName and byName[name] or
		byName[name:gsub(" [Ss]ubstrate$", "")] or
		byName[name:gsub("^a ", "")] or
		byName[name:gsub("^a ", ""):gsub(" [Ss]ubstrate$", "")]

	if not code then
		return nil
	end

	return export.makeObject(code)
end

return export