Jiří Milička

Charles University, Prague, Institute of Comparative Linguistics, Faculty Member

Charles University, Prague, Institute of the Czech National Corpus, Faculty Member

This page is not maintained. For latest publications please see my ResearchGate profile.
Address: www.milicka.cz/en

Quantitative Linguistics by Jiří Milička

The Effect of Iconicity Flash Blindness — An Empirical Study

by Jiří Milička and Vojta Diatka

In our experiment, the Saussurean postulate of arbitrariness has been empirically tested in order to see whether this postulate can be applied to all words to the same extent. Three hundred participants were asked to match Czech words with their Hindi translations. One set of words was randomly chosen from a Hindi corpus (set A); the second set consisted of both randomly chosen words and words categorized as ideophones (set B). The participants were successful in matching both sets (the lower bound of the confidence interval is about 7% above random guessing), and their performance showed unexpected patterns: For one, not only iconic properties (the sound qualities) but also iconicity itself is an important distinctive feature and recipients are able to exploit this. Moreover, even words considered to be non-iconic (set A) apparently contain a degree of iconicity, which participants are able to draw upon. However, participants appear to lose this ability when non-iconic words are presented in the context of words with evident and abundant iconicity (set B). The effect resembles the accommodation process which is known for other senses; therefore, we call the effect “Iconicity flash blindness”.

Menzerath's Law: The whole is greater than the sum of its parts

Reinhard Köhler (1984) proposed that the linguistic constructs which have to be processed by the human parser consist of plain information (which needs to be communicated) and structure information, and that this can explain Menzerath's law. Our paper assumes that the amount of plain information and the amount of structure information are mutually independent. A new model of the nested structure of text and of Menzerath's law can be based on this assumption. A formula derived from the model is successfully tested and the results are compared to the classical Menzerath-Altmann law.

Average Word Length from the Diachronic Perspective: The Case of Arabic

Linguistic Frontiers, 2018

Previous studies based on English, Russian and Chinese corpora show that the average word length in texts grows steadily across centuries. These findings are in accordance with our results: the average word length in Arabic texts also grows during the analysed time span (8th century to the first half of the 20th century). Our paper shows the detailed statistics of the word length distribution century by century. The dynamics of the average word length correlates with the dynamics of the average word distribution entropy, which encourages an explanation of the phenomenon based on the Shannonian theory of communication.

Distribution of the Menzerath’s Law on the Syllable Level in Greek texts

by Jiří Milička and George Mikros

Empirical Approaches to Text and Language Analysis, 2014

Examining a large corpus of Greek texts, we found that the average length of syllables in disyllabic words is lower than the average length of the syllable in monosyllabic words and lower than the average length of syllables in trisyllabic words. This peculiar phenomenon can be interpreted as a counterexample to Menzerath's Law.

Rank-frequency Relation and Type-token Relation: Two Sides of the Same Coin

Methods and Applications of Quantitative Linguistics - Selected papers of the 8th International Conference on Quantitative Linguistics (QUALICO), 2013

This paper shows that the type-token relation, the hapax-token relation and, generally, the relation between types of a certain frequency and tokens can be computed from the rank-frequency relation or from any type of frequency distribution, and that the type-token relation can be computed from the hapax-token relation. This paper shows that there is no need for any approximation or assumptions and that the formulae can be derived purely algebraically. The second part of the paper observes that, for very large corpora, the ratio between the number of hapax legomena and types converges to a constant Z; Z>0. Under this assumption an approximation is built that enables us to predict the type-token relation and the other aforementioned relations from the single parameter Z. This approximation is only valid for very large corpora. As the last chapter shows, this assumption implies that for an infinitely increasing number of tokens, the number of types increases beyond any limit.
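As an illustration of how a type-token curve can be obtained from a frequency distribution without approximation, the sketch below uses the standard hypergeometric argument: if n tokens are drawn at random without replacement from a text of N tokens, a type with frequency f is missing from the sample with probability C(N - f, n) / C(N, n). This is a textbook combinatorial result and is not necessarily the formula derived in the paper; the function name `expected_types` and the toy text are assumptions made for the example.

```python
from collections import Counter
from math import comb


def expected_types(frequencies, n):
    """Expected number of distinct types among n tokens drawn at random,
    without replacement, from a text with the given type frequencies.

    Hypothetical helper for illustration; not taken from the paper.
    """
    total = sum(frequencies)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the number of tokens")
    denom = comb(total, n)
    # A type with frequency f is absent from the sample with probability
    # C(total - f, n) / C(total, n); math.comb returns 0 when n > total - f.
    return sum(1 - comb(total - f, n) / denom for f in frequencies)


if __name__ == "__main__":
    # Toy text (an assumption for the example).
    tokens = "a rose is a rose is a rose".split()
    freqs = list(Counter(tokens).values())
    for n in range(len(tokens) + 1):
        print(n, round(expected_types(freqs, n), 3))
```

The same per-type probabilities could be extended to count only types observed exactly once, which would give an expected hapax-token curve under the same random-sampling assumption.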

Type-token & Hapax-token Relation: A Combinatorial Model

Contains an exact formula for computing Type-token relation curve from a frequency distribution o... more

Valency and Information Structure: A quantitative approach to from – to juxtaposition in Arabic

In Arabic, the mutual order of prepositional phrases syntactically dependent on one head is neither fixed nor random. This paper explores the factors affecting the order of the prepositions from and to. Many factors related to syntax, morphology and phonology are taken into account and analysed with a corpus-driven approach.

Vocabulary Richness Measure in Genres

by Jiří Milička and Miroslav Kubát

Journal of Quantitative Linguistics, 2013

This article deals with one of the oldest and most traditional fields in quantitative linguistics, the concept of vocabulary richness. Although there are several methods for vocabulary richness measurement, all of them are influenced by text size. Therefore, the authors propose a new way of measuring vocabulary richness without any text length dependence. In the second part of the article, the new method is used for a genre analysis of texts written by the Czech writer Karel Čapek. Furthermore, differences between authors and between languages are studied with this method.

Key Length Motifs in Czech and Arabic Texts

Issues in Quantitative Linguistics 4, 2016

Length motifs (L-motifs) are defined as sequences of words whose lengths are monotonically increasing. In recent years, L-motifs have attracted well-deserved attention as they provide a new view of texts and their syntagmatic properties and nested structures. This study examines the key L-motifs, i.e. motifs that are overrepresented in texts, and negative key L-motifs, i.e. motifs that are underrepresented in texts. The data reveal motifs that are typical for Czech texts, motifs that are typical for Arabic texts, and motifs that are typical for both Czech and Arabic texts – their existence suggests that there are new general language-independent patterns waiting to be explored.
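The definition above can be made concrete with a short segmentation sketch. It assumes that a text has already been reduced to a list of word lengths and that an L-motif is a maximal run of non-decreasing lengths, which is one common reading of "monotonically increasing" in the motif literature; the `strict` switch, the function name `l_motifs` and the sample lengths are assumptions made for the example, not details from the study.

```python
def l_motifs(lengths, strict=False):
    """Split a sequence of word lengths into maximal monotone runs.

    With strict=False a run continues while lengths do not decrease;
    with strict=True a repeated length also starts a new run.
    Hypothetical helper for illustration.
    """
    motifs = []
    current = []
    for value in lengths:
        breaks_run = current and (
            value < current[-1] or (strict and value == current[-1])
        )
        if breaks_run:
            motifs.append(tuple(current))
            current = []
        current.append(value)
    if current:
        motifs.append(tuple(current))
    return motifs


if __name__ == "__main__":
    # Word lengths of an invented six-word sequence.
    sample = [2, 3, 3, 1, 4, 2]
    print(l_motifs(sample))
    print(l_motifs(sample, strict=True))
```

Counting the resulting tuples with `collections.Counter` would give a motif frequency list that could then be compared across texts or languages.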

Is the Distribution of L-Motifs Inherited from the Word Lengths Distribution?

Sequences in Language and Text, Apr 2015

The distribution of L-motifs (measured on a text T) is similar to the L-motifs distribution measured on the pseudotext T’ constructed by random transposition of all tokens within the text T. This inspires the suggestion that the distribution of L-motifs is inherited from the word length distribution (or, in other words, that the word length distribution of a text implies the distribution of L-motifs). The paper clearly shows that, despite the similarity, an L-motifs structure independent of the word length distribution can be detected.

Menzerath-Altmann Law in Syntactic Dependency Structure

by Jan Macutek and Jiří Milička

Simonetta Montemagni, Joakim Nivre (Eds.): Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017), 2017

According to the Menzerath-Altmann law, there is a relation between the size of the whole and the mean size of its parts. The validity of the law was demonstrated on relations between several language units, e.g., the longer a word, the shorter the syllables the word consists of. In this paper it is shown that the law is valid also in syntactic dependency structure in Czech. In particular, longer clauses tend to be composed of shorter phrases (the size of a phrase is measured by the number of words it consists of).

Introducing LingTest: A Field-Friendly Application for the Functional Testing of Mutual Intelligibility of Related Varieties

by Slavomír Čéplö, Christophe Pereira, Jiří Milička, and Petr Zemánek

This paper describes an application designed for the functional testing of mutual intelligibilit... more

Mutual Intelligibility of Spoken Maltese, Libyan Arabic and Tunisian Arabic Functionally Tested: A Pilot Study

by Slavomír Čéplö, Jiří Milička, Petr Zemánek, and Christophe Pereira

This paper presents the results of a pilot project designed to functionally test the mutual intelligibility of spoken Maltese, Tunisian Arabic and Benghazi Libyan Arabic. We compiled an audio-based intelligibility test consisting of three components: a word test where the respondents were asked to perform a semantic classification task with 11 semantic categories; a sentence test where the task was to provide a translation of a sentence into the respondent’s native language; and a text test where a short text was listened to twice and the respondents were asked to answer 8 multiple-choice questions. We collected data from 24 respondents in Malta, Tunis and Benghazi, which we analyzed to determine that there exists asymmetric mutual intelligibility between the two mainstream Arabic varieties and Maltese: speakers of Tunisian and Benghazi Arabic are able to understand about 40% of what is being said to them in Maltese, whereas that ratio is about 30% for speakers of Maltese exposed to either variety of Arabic. Additionally, we found that Tunisian Arabic has the highest level of mutual intelligibility with either of the other two varieties. Combining the intelligibility scores with edit distance data, we were able to sketch out the variables involved in enabling and inhibiting mutual intelligibility for all three varieties of Arabic and provide a rough analysis of the linguistic distance between them as branches of North African Arabic.

Natural Language Processing by Jiří Milička

Ranking Search Results for Arabic Diachronic Corpora. Google-like search engine for (non)linguists

by Jiří Milička and Petr Zemánek

Proceedings of CITALA 2014 (5th International Conference on Arabic Language Processing ), 2014

The contribution introduces a corpus linguistic search engine that ranks its results according to the keyness measure and the importance of the document within the corpus. For this purpose, the minimal ratio is measured for each word and the corpus is hypertextualized. Differences between genres are taken into account.

A Combinatorial Method for a Context Comparison

When comparing the use of two word types within one text, we can do it by comparing the contexts in which they occur. We pick all the tokens that occur e.g. immediately to the right of the word A and immediately to the right of the word B, thus getting two multiple subsets of text. This paper offers a method for comparing such subsets (and its use is not limited only to the field of linguistics). The method is based on comparing the cardinality of the intersection of the two multiple subsets and a model which characterizes the average cardinality of all possible subsets of a given length from the given text. The model is derived algebraically.
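The observed side of this comparison can be sketched in a few lines. The example collects the tokens immediately to the right of two words as multisets and measures the cardinality of their intersection; it does not reproduce the algebraic model of the expected cardinality described in the abstract. The function names `right_contexts` and `intersection_size` and the toy sentence are assumptions made for the example.

```python
from collections import Counter


def right_contexts(tokens, word):
    """Multiset of tokens that occur immediately to the right of `word`."""
    return Counter(
        tokens[i + 1] for i in range(len(tokens) - 1) if tokens[i] == word
    )


def intersection_size(a, b):
    """Cardinality of the multiset intersection of two Counters."""
    # Counter's & operator keeps the minimum count of each shared element.
    return sum((a & b).values())


if __name__ == "__main__":
    # Invented toy text.
    text = "the cat sat the dog sat the cat ran a dog sat".split()
    cat = right_contexts(text, "cat")
    dog = right_contexts(text, "dog")
    print(cat, dog, intersection_size(cat, dog))
```

In the method described above, this observed value would then be set against the model's average intersection cardinality for subsets of the same sizes drawn from the same text.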

Minimal Ratio: An Exact Metric for Keywords, Collocations etc.

Czech and Slovak Linguistic Review 1/2012, 2012

The paper defines and shows how to use the Minimal Ratio – an exact metric that expresses the ratio between the measured value and the limits of the confidence interval calculated according to the formula on which Fisher’s exact test is based. The metric is meant to assist with the extraction of keywords and collocations and with comparing texts or corpora according to the distribution of word types or other similar criteria.

(with Jiří Milička) Quotations, Relevance and Time Depth: Medieval Arabic Literature in Grids and Networks

by Petr Zemánek and Jiří Milička

This contribution deals with the use of quotations (repeated n-grams) in the works of medieval Arabic literature. The analysis is based on a historical corpus of Arabic containing 420 million words. Based on quotations repeated from work to work, a network is constructed and used for the interpretation of various aspects of Arabic literature. Two short case studies are presented, concentrating on the centrality and relevance of individual works, and on the analysis of the time depth and resulting impact of a given work in various periods.

Restricted Collocability and its Use in Arabic Corpus Linguistics

by Petr Zemánek and Jiří Milička

Computerised and Corpus-based Approaches to Phraseology: Monolingual and Multilingual Perspectives

Restricted collocability has received some attention, but not as a formalized method. We suggest that it should be used as a metric for collocations, as well as for other types of usage, both in linguistics and outside it, as it has great potential for a plethora of applications. Using examples from a diachronic corpus of Arabic, we show the possibilities of its employment in studying prepositional valency and lexical profiling.

Software by Jiří Milička

Software for measuring type-token relation, hapax-token relation and other similar types of relat... more

Tinfi (Text Inhomogeneities Finder)

The software is designed to help discover text inhomogeneities by comparing the type-token relation of the text with its combinatorial model. Parts of a text in which the number of types rises disproportionately are marked. A quick increase (i.e. a new topic is introduced, or the style or language is changed) is marked in green, while a slow increase of types (i.e. repetition of old topics or even autoquotations) is marked in red. The software is also suitable for literary studies.

The application allows its user to change the direction in which the text is processed: forwards or backwards. When checking both forwards and backwards, unique parts of the text (compared with the rest of the text) are marked in green, while typical parts are marked in red. The freeware application provides a graphical user interface.

Universal software for simple linguistic experiments. The BlackSquare is designed to test subject... more

Knihtisk v dějinách islámské kultury (Typography and the Islamic Culture)

Nový Orient 64/2 , 2009

The article examines the phenomenon of typography in the course of Islamic history. In the Islamic world, printing with movable type and printing blocks was unacceptable. Using such technology to copy a text written in Arabic script was illegal. The article asks how the society could resist the temptation of this innovation and describes the distressing influence of typography on the life of Muslims.

Kontroverzní hranice jazykovědy aneb O syntagmatických očích Hany Karadžičové

Naše řeč 4-5/97, 2014

Review of Václav Cvrček: Kvantitativní analýza kontextu. Praha: Nakladatelství Lidové noviny, 20... more

Konfidenční intervaly v empirické lingvistice (Confidence Intervals and the Empirical Linguistics)

Lingvistika Praha 2014, 2014

The paper attempts to introduce confidence intervals to empirical linguistics. First, classical inference tests are discussed, with the claim that they are unable to determine real-life significance. Then confidence intervals are defined and the basic idea underlying the method for computing confidence intervals for binary data is described. It is shown how the intervals can be useful when exploring binary quaternities and relations between two variables. The last section deals with the relevance of the method for the Czech linguistic discourse.

Listina XVIII G 199 Národní knihovny v Praze: Soubor islámských magických čtverců (Transcription of a Manuscript Containing Magic Squares from the Islamic Cultural Area)

An edition of the one-page document archived as XVIII G 199 in the National Library in Prague. The document syncretizes magic squares, Sufi texts and symbols, and Islamic elements, and it was probably created in the Ottoman cultural area. The text and symbols are transcribed, reconstructed and interpreted.