How long can my text file be?

There is no strict length limit, but longer texts take more time to process.

How many keys and attributes can my table contain?

There is no strict limit, but longer lists of keys take more time to process. By contrast, the number of attributes (i.e. columns) has little impact on processing time - but make sure to use a finite number of columns.

Can I also upload a table consisting only of a single column with my keys?

No, WordValue currently requires at least two columns. If you do not have any values to map onto your keys, simply use a dummy value like "1" in all cells of the second column.
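If your keys are stored one per line in a plain text file, the dummy column can also be added with a short script. The following is only a minimal sketch: the function name add_dummy_column is made up for this illustration, and it does not add a header row, so add one yourself if your table needs it.

```python
import csv
import io


def add_dummy_column(keys_text: str, dummy: str = "1") -> str:
    """Turn a one-key-per-line list into two-column CSV text.

    Hypothetical helper for illustration; not part of WordValue.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in keys_text.splitlines():
        key = line.strip()
        if key:  # skip blank lines
            writer.writerow([key, dummy])
    return out.getvalue()


if __name__ == "__main__":
    print(add_dummy_column("apple\nbanana\n"), end="")
```

Save the returned text as a .csv file with UTF-8 encoding before uploading it.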

Can I use a Word file with WordValue?

No, you first need to edit your text in such a way that WordValue can process it, since the text needs to be in .txt format with UTF-8 encoding. The simplest option is to copy-paste your text into the text box in the TEXTS tab and to save it with a file name in WordValue.

Alternatively, you can easily convert a Word document into a .txt file in Microsoft Word by saving a copy of it and changing the file type option to .txt. Then click Tools in the same window. Select “Web options”, go to the tab Encoding and select “Unicode (UTF-8)” in the dropdown menu “Save this document as”.

Can I use an Excel spreadsheet with WordValue?


How can I delete duplicate lines in my .csv file?

Open the file in Excel and select the Data tab. Then click on the Remove Duplicates button and check only the keys column.
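If you prefer not to use Excel, the same clean-up can be sketched in a few lines of Python. This is an illustration under assumptions, not a WordValue feature: it assumes that the keys are in the first column and, like Excel, keeps the first row for each key. The function name drop_duplicate_keys is hypothetical.

```python
import csv
import io


def drop_duplicate_keys(csv_text: str) -> str:
    """Keep only the first row for each value in the first column."""
    seen = set()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:  # ignore empty lines
            continue
        key = row[0]
        if key in seen:  # a row with this key was already written
            continue
        seen.add(key)
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "keys,value\nshe,female\nhe,male\nshe,female\n"
    print(drop_duplicate_keys(sample), end="")
```

Note that the comparison is case-sensitive, so "She" and "she" count as different keys in this sketch.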

Excel uses semicolons instead of commas in my .csv file. How can I change that?

This only applies to versions of Excel that use semicolons instead of commas as separators (e.g. the German version). Open your .csv file in a text editor such as Notepad++ or the Editor that comes with Microsoft Windows. (To do so, right-click on the file name in the file explorer, go to “Öffnen mit/Open with” and select the text editor.) If floating point numbers in your file are written with commas instead of dots (e.g. 3,47), first replace these commas with dots (e.g. 3.47). Then replace all semicolons with commas and save your document. Also make sure that commas are used only as separators.
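For larger files, the two replacement steps can be scripted. The sketch below is a hedged example rather than an official tool: the function name semicolon_to_comma_csv is made up, and it only treats a cell as a decimal number if the whole cell looks like 3,47. Any other cell that contains a comma is written in quotation marks by Python's csv module; whether WordValue accepts quoted cells is not stated here, so it is safest to remove such commas beforehand.

```python
import csv
import io
import re

# A cell that consists only of digits, one comma and more digits (e.g. 3,47)
DECIMAL_COMMA = re.compile(r"^-?\d+,\d+$")


def semicolon_to_comma_csv(text: str) -> str:
    """Convert semicolon-separated text with decimal commas to comma-separated text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        # Step 1: turn decimal commas into dots, cell by cell
        cleaned = [
            cell.replace(",", ".") if DECIMAL_COMMA.match(cell) else cell
            for cell in row
        ]
        # Step 2: write the row with commas as separators
        writer.writerow(cleaned)
    return out.getvalue()


if __name__ == "__main__":
    print(semicolon_to_comma_csv("word;frequency\nthe;3,47\n"), end="")
```

The decimal commas are converted before the row is written with comma separators, which mirrors the order of the manual steps.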

What do the preset files contain?

The preset text files contain royalty-free short texts (or extracts) by famous authors from previous centuries. WordValue currently contains the following preset text files:

  • Alice's Adventures in Wonderland by Lewis Carroll - Chapter 1
  • Daffodils by William Wordsworth
  • The Happy Prince by Oscar Wilde

The preset table English personal pronouns with genders contains a list of the English personal pronouns, together with information on their gender (female, male, unmarked).

The preset table The 200 most frequent English words contains the 200 most frequent words from the British National Corpus with information on their part of speech, frequency, rank, word length (in letters), language family of origin (Germanic vs. Romance) and period of first attestation (Old English – Middle English – Early Modern English – Modern English). The dataset is based on the beginning of Adam Kilgarriff’s BNC frequency list lemma.num and was extended following the method outlined in Christina Sanchez-Stockhammer (2018), Consociation and Dissociation: An empirical study of word-family integration in English and German, with a more fine-grained recoding of the most recently attested words based on the information in the Oxford English Dictionary.

What other word lists can I use for my searches?

You can use any list of words or linguistic items that is relevant for you, provided that you convert it into an appropriately formatted table with the words of the list in the first column and other information (e.g. the dummy item “placeholder”) in the next column(s). Examples include:

  • corpus-based frequency lists, such as Adam Kilgarriff’s BNC frequency lists
  • pedagogical word lists, such as the Oxford 3000 or the Oxford 5000 (which comprise the 3000/5000 most important English words and can be downloaded here)
  • specialised word lists, such as the Academic Word List (which comprises 570 English words that occur particularly frequently in academic texts and is available here)

What lemmatiser does WordValue use?

We use spaCy, except for pronouns, because spaCy combines all pronouns (e.g. me and themselves) into a single category, -PRON-. To enable gender-related searches, we subdivided the pronoun spectrum into the more specific lemmas I, you, he, she, it, we and they. Our pronoun lemmas comprise the following members:








Can WordValue search across line breaks?

Yes, our search algorithm ignores line breaks. This allows WordValue to find spaced and hyphenated compounds whose constituents are separated by a line break. By contrast, rule-governed end-of-line hyphenation that splits a word between syllables cannot be ignored.
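The behaviour described above can be illustrated with a small sketch. This is not WordValue's actual search algorithm, only one possible way to get the same effect: a line break is treated like a space, except directly after a hyphen, where it is simply dropped. The function name count_across_line_breaks is hypothetical.

```python
import re


def count_across_line_breaks(text: str, key: str) -> int:
    """Count occurrences of key, ignoring line breaks in the text."""
    # A line break directly after a hyphen is dropped: "well-\nknown" -> "well-known"
    joined = re.sub(r"-\s*\n\s*", "-", text)
    # Any other line break is treated like a single space: "ice\ncream" -> "ice cream"
    joined = re.sub(r"\s*\n\s*", " ", joined)
    pattern = r"\b" + re.escape(key) + r"\b"
    return len(re.findall(pattern, joined))


if __name__ == "__main__":
    sample = "a well-\nknown shop for ice\ncream"
    print(count_across_line_breaks(sample, "well-known"))
    print(count_across_line_breaks(sample, "ice cream"))
```

The sketch also shows why end-of-line hyphenation remains a problem: a word split as "syl-" and "lable" becomes "syl-lable" after joining, which does not match the key "syllable".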

Can WordValue also find geographical or historical spelling variants of my key items (e.g. color for colour, or faerie for fairy)?

No, that is not possible. The version of the spaCy lemmatiser that we use does not allow for spelling variation.

How can I compare the frequency of my search items across different texts?

To compare texts, you need to carry out separate searches for each target text with the same search table. This can be done very quickly and easily by recombining your search table with different texts in the RESULTS tab. Afterwards, you can manually combine your results in a single table.

The difference between the colours in my gradient is not huge. What can I do?

Try using other colours and make use of differences in hue and saturation to achieve a clearer gradient.

How can I export my rainbow-coloured texts?

The rainbow-coloured texts are displayed in HTML format. This permits easy copy-pasting into Microsoft Word (and other appropriate software), where they can be saved and further processed.

WordValue is taking very long to load my results.

This is not uncommon if your results are based on a long text and/or list of keys. Please be patient. We are currently working on a solution to this problem. Since your results are stored as soon as they have been computed, all the following processes will be much faster, and you will be able to access the results very quickly in different formats (e.g. case-sensitive results table or lemmatised colour-coding).

I can only see very few lines of the results table.

Scroll down using the scrollbar on the right-hand side.

When I want to select a specific attribute for colour-coding my text, I get a warning message.

Make sure that the correct "data type of chosen attribute" (= categorical, ordinal or numerical) for your specific attribute is selected in the box below. For example, if you want to colour-code a category like gender (which has no inherent ordering), you need to select "categorical" instead of the default mode "numerical". The following overview indicates the data type of the attributes in the preset tables:

attribute                      data type
gender                         categorical
part of speech                 categorical
frequency                      numerical
rank                           numerical
word length                    numerical
language family of origin      categorical
period of first attestation    ordinal

How can I delete my account?

Go to the ACCOUNT page and click on the Delete button at the bottom of the page.
Deleting your account also deletes your account information (username, e-mail address, password), all your uploaded files and all your results. Neither you nor we will be able to retrieve any of this information or any of these files afterwards.

Could you not find an answer to your question here, or would you like to give us feedback?

Then please write to christina[dot]sanchez[at]phil[dot]tu-chemnitz[dot]de or johannes[dot]tochtermann[at]campus[dot]lmu[dot]de.