Posts Tagged ‘nlp’

POS-Tagsets. A list.

Paper @ CALICO 2014: Using NLP Platforms for Language Learning Material Production

…has been accepted for inclusion in the program of CALICO 2014, May 9-10 at Ohio University (Athens, OH), and was presented on May 9. Here are the abstract and slide deck:

Corpora, Treebanks, Word-Lists. A List.

ELRA language corpora available in the LRC for research

The LRC has availed itself of a free research distribution of a 55GB collection of language corpora from the European Language Resources Association. This “big data” should be of interest to the translation program as well as the language learning programs, since it enables corpus-linguistic approaches to language learning and automated production of learning material based on natural language processing.

Here is an overview of the materials included:


A list of files included can be found here:

How NLP and IR and LG are used by spammers now



You need to know a little bit about “yours truly” to find this spam comment interesting (which, if you follow the links, eventually leads to a discount wrist watches site).

Multilingual WordNet search interface


Covers a subset of the languages supported by the LRC. Based on WordNet, which is a machine-readable semantic network rather than a dictionary for human consumption; this search interface, however, is one of its machine-generated applications.

Looking forward to the Digital Humanities Unconference at UNC Charlotte

  1. Why I come to THATCamp Piedmont:
    1. I am looking for practitioners of NLP in a language and literature teaching context, since I am working on “Using NLP tools to automate production and correction of interactive learning material” (presented at CALICO 2012)
    2. for the Learning Exercise Creation Engines (presented at EUROCALL 2007) I developed.
  2. A little about myself:
    1. My Ph.D. thesis expanded two practices, the close reading of textual variants in the German editorial schools of Hans Zäch and the use of computer-generated textual concordances in the interpretation and selection of textual variants, into a corpus-linguistics-inspired approach. It traced leitmotifs in the work of the foremost Swiss-German classic (partially first digitized by myself), treated as a digital corpus, using regular-expression programming.
    2. I have since applied my corpus linguistic approach to
      1. the use of machine translation software
      2. the automation of learning material creation (glossing, question generation, differentiation) on the basis of natural language processing of textual (film subtitles, news) corpora.
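
To illustrate the kind of regular-expression concordancing mentioned in the list above, here is a minimal sketch in Python. It is not the code used in the thesis: the function name `kwic`, the sample text, and the search pattern are all hypothetical, chosen only to show how a motif word and its derived forms can be listed in context.

```python
import re


def kwic(text, pattern, width=30):
    """Return (left context, match, right context) for every regex match."""
    hits = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Take up to `width` characters on either side of the match.
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left.replace("\n", " "), m.group(0), right.replace("\n", " ")))
    return hits


# Hypothetical sample text, not taken from any real corpus.
sample = (
    "The green valley lay quiet. Over the valley the light faded, "
    "and the greening meadows darkened."
)

# \bgreen\w* matches "green" as well as derived forms such as "greening".
for left, hit, right in kwic(sample, r"\bgreen\w*", width=20):
    print(f"{left:>20} [{hit}] {right}")
```

Aligning the matches in a column, as a printed concordance does, makes recurring contexts of a motif easy to scan. A real corpus would also need decisions about tokenization, spelling variants, and line references that this sketch leaves out.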