Archive for the ‘Genre-is-any’ Category

Using NLP tools to automate production and correction of interactive learning materials for blended learning templates in the Language Resource Center. Presentation Calico 2012, Notre Dame University

Comparison of NLP Platforms

Not really a comparison, only a notebook compiled from online sources. Not really fit for publication either, unless “sharing is caring”. You can view a larger version here.

When installing apache-maven-3.0.4 on Windows 7 (64-bit), environment variables do not expand in %Path%

2012/05/15
  1. Do not follow the instructions for installing apache-maven-3.0.4 on “Windows 2000/XP”, or your final test, running mvn --version to verify that it is correctly installed, will fail.
  2. When adapting the Path environment variable, do not use %M2_HOME%\bin, but rather spell out the explicit path, e.g. “G:\conf\lang\java\apache\maven\apache-maven-3.0.4”.
  3. Why is that?
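
To check whether a Path value still carries a literal, unexpanded reference such as %M2_HOME%, a small script can help. The following is only a minimal sketch: the function names `unexpanded_entries` and `expand_entry` are made up for illustration, and it does not explain why Windows 7 leaves the variable unexpanded. It only detects the symptom and shows what the explicit path would look like.

```python
import re

# Matches Windows-style %NAME% references that were left unexpanded.
_VAR_REF = re.compile(r"%([^%;]+)%")


def unexpanded_entries(path_value):
    """Return the Path entries that still contain a literal %NAME% reference."""
    return [entry for entry in path_value.split(";") if _VAR_REF.search(entry)]


def expand_entry(entry, variables):
    """Replace %NAME% with its value from `variables` (names are case-insensitive).

    Unknown names are left untouched, so nothing is silently dropped.
    """
    lookup = {name.upper(): value for name, value in variables.items()}
    return _VAR_REF.sub(lambda m: lookup.get(m.group(1).upper(), m.group(0)), entry)


if __name__ == "__main__":
    # Hypothetical Path value; on a real machine, pass os.environ["PATH"] instead.
    path_value = r"C:\Windows\system32;%M2_HOME%\bin"
    for entry in unexpanded_entries(path_value):
        print("not expanded:", entry)
```

If an entry shows up as not expanded, `expand_entry(entry, {"M2_HOME": ...})` yields the explicit string to paste into the Path dialogue instead.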

How to easily upload, distribute, share and play large multimedia files with Google Apps

  1. Tired of burning, lugging around, inserting, ejecting, forgetting, losing, scratching and replacing CD and DVD media, or hard drives and thumb drives, to handle your large multimedia files? Do you have internet access and a web browser wherever you need to access and play your files? Then you can use UNCC Google Apps instead.
  2. Go to your UNCC Google Apps (you have to log into UNC)
  3. Click on the hard-drive icon in the upper left and on “files”, then select a video file:
      1. Many formats and underlying codecs are supported, including Flash, MOV, AVI, WMV, MPG. File size limits: currently “every user is given 1GB of free storage space for files” (not enough for much HD footage, but that is difficult to upload anyway; and didn’t Google Drive just increase this limit to 5GB, or does this not carry over to Google Apps? Stay tuned for updates). Click “Start upload”.
  4. Wait until the upload is finished, and then, depending:
    1. if you want to share the file with colleagues, click on “share” (appears after “cancel”) and fill out the dialogue. You can share your file both
      1. inside the university community and
      2. outside of the university community;
    2. if you want to share the files with students in your course, there is a better way using Moodle Kaltura video upload;
    3. if you just want to play the file yourself (including to your students in the classroom), you are already done.
    4. Go to your files and click on the file in the list to play the video.
    5. Also, you can always “get your file back” via “Download” (and note that you can also prevent users from downloading the files. This is useful if you only want to share it temporarily and revoke permissions later).
  5. More help is available from the source:
    1. How to save files to your Google Docs
    2. How to play back video files in Google Docs

How to easily merge MP3 files

  1. There are many ways, including many that are easier than doing it manually in Audacity.
  2. MergeMP3 is a free and easy one that worked here.

Setting up European Union translation memories and document corpora for SDL-Trados

  1. Our SDL-Trados installation allows the translation program to teach this industry-standard computer-aided translation application. So far, however, we have had no actual translation memory loaded into this translation software.
  2. The European Union is a powerhouse for translation and interpreting, at least for the wide range of its member languages, many of which are world languages. It makes some of its resources, which have been set up for translation and interpreting study use here before, available to the community free of charge, as reported at several LREC conferences.
    1. This spring, the Language Technology Group at the Joint Research Centre of the European Union updated their translation memory offering. The DGT-TM can fill that void, at least for the European languages that have a translation component at UNC-Charlotte.
      1. We download on demand (too big to store: http://langtech.jrc.ec.europa.eu/DGT-TM.html#Download)
        1. Is the DGT-TM 2011 truly a superset of the 2007 release, or should both be merged? Probably too much work?
      2. and extract only the pairs of English with the languages marked “1” here: “G:\myfiles\doc\education\humanities\computer_linguistics\corpus\texts\multi\DGT-tm\DGT-tm_statistics.xlsx” (using “G:\myfiles\doc\education\humanities\computer_linguistics\corpus\texts\multi\DGT-tm\TMXtract.exe”)
      3. and convert:
        1. English is the source language by default, but should be the target language in our programs.
        2. The TMX format in which this translation memory is distributed should be “upgradeable to the SDL Trados Studio 2011/2011 SP1 format in the Upgrade Translation Memories wizard”.
          1. TBA: where is this component?
      4. configure Trados to load the translation memory:
        1. How many computing resources does this use up?
        2. How do you load a TM?
        3. Can you load on demand instead of preloading everything?
      5. Here are the statistics for the translation memories for “our” languages:
        uncc   Language    Language code  Units in DGT release 2007  Units in DGT release 2011
        1      English     EN             2187504                    2286514
        1      German      DE             532668                     1922568
        1      Greek       EL             371039                     1901490
        1      Spanish     ES             509054                     1907649
        1      French      FR             1106442                    1853773
        1      Italian     IT             542873                     1926532
        1      Polish      PL             1052136                    1879469
        1      Portuguese  PT             945203                     1922585
        Total  8           8              7246919                    15600580
    2. Would it be of interest to have the document-focused JRC-Acquis distribution of the materials underlying the translation materials available on student/teacher Trados computers? Sample texts could then be loaded for which reliable translation suggestions will be available (this is not certain for texts from all domains), so that the use of a translation memory can be practiced under realistic conditions.
      1. “The DGT Translation Memory is a collection of translation units, from which the full text cannot be reproduced. The JRC-Acquis is mostly a collection of full texts with additional information on which sentences are aligned with each other.”
      2. It remains to be seen how easily one can transfer documents from this distribution into Trados to work with the translation memory.
      3. Here is where to download:
        uncc  lang  inc
        1     de    jrc-de.tgz
        1     en    jrc-en.tgz
        1     es    jrc-es.tgz
        1     fr    jrc-fr.tgz
        1     it    jrc-it.tgz
        1     pl    jrc-pl.tgz
        1     pt    jrc-pt.tgz
      4. The JRC-Acquis comes with these statistics:
        uncc   Language ISO code  Number of texts  Total no. of words  Total no. of characters  Average no. of words
        1      de                 23541            32059892            232748675                1361.87
        1      en                 23545            34588383            210692059                1469.03
        1      es                 23573            38926161            238016756                1651.3
        1      fr                 23627            39100499            234758290                1654.91
        1      it                 23472            35764670            230677013                1523.72
        1      pl                 23478            29713003            214464026                1265.57
        1      pt                 23505            37221668            227499418                1583.56
        Total  7                  164741           247374276           1588856237               10509.96
        (The 10509.96 in the Total row is the sum of the per-language averages; total words divided by total texts gives about 1501.6 words per text.)

  3. What other multilingual corpora are there (for other domains and for non-European languages)?
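
The extraction and conversion steps noted above for the DGT-TM (keep only the pairs that involve English, and make English the target rather than the source language) could be sketched roughly as follows. This is not how TMXtract.exe or the Trados wizard work internally. It is a minimal illustration that assumes the standard TMX layout (`tu` elements holding `tuv` elements with an `xml:lang` attribute and a `seg` child); the function name `extract_pairs` and the sample data are made up.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_pairs(tmx_text, source_lang, target_lang="EN"):
    """Return (source, target) segment pairs from a TMX document.

    Language codes are compared by their two-letter prefix, ignoring case,
    so "EN" matches "EN-GB" and "de" matches "DE-DE".
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.iter("tuv"):
            lang = tuv.get(XML_LANG, "").upper()[:2]
            seg = tuv.find("seg")
            if lang and seg is not None:
                # itertext() also collects text nested inside inline markup.
                segments[lang] = "".join(seg.itertext()).strip()
        src = segments.get(source_lang.upper()[:2])
        tgt = segments.get(target_lang.upper()[:2])
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


# Tiny hand-written sample, not taken from the DGT-TM itself.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="EN-GB"/>
  <body>
    <tu>
      <tuv xml:lang="EN-GB"><seg>Article 1</seg></tuv>
      <tuv xml:lang="DE-DE"><seg>Artikel 1</seg></tuv>
      <tuv xml:lang="FR-FR"><seg>Article premier</seg></tuv>
    </tu>
  </body>
</tmx>"""

if __name__ == "__main__":
    # German as source, English as target.
    print(extract_pairs(SAMPLE_TMX, "DE"))  # prints [('Artikel 1', 'Article 1')]
```

The sketch parses the whole document in memory, which is fine for a sample but probably not for the full download; a streaming parser such as `ET.iterparse` would be the likely choice for the real files.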

Does Respondus LockDown Browser block when a user attempts to load a Moodle quiz on 2 different computers?

  1. We experienced Moodle slowness during an exam in which about 12 students tried to
    1. load a Moodle quiz into Respondus LockDown Browser (the browser hangs with the message "page loading"),
    2. but also earlier, when logging into Moodle with a regular browser (hangs on the login page).
  2. It turns out that large classes elsewhere on campus were using the Moodle quiz function, which put a heavy load on the Moodle servers.
  3. What can we do on our end to work around this as smoothly as possible?
    1. First, be patient while Respondus LockDown Browser displays “Page loading”.
    2. “Refresh” or “Back/forward” are the next resort once the “Page loading” attempt has stopped and the page
      1. states it cannot be loaded,
      2. displays an error about a missing CSS component (likely due to an incomplete load before the timeout), or
      3. says it “can be loaded only in Respondus LockDown Browser” while you are in Respondus LockDown Browser (Huh?).
    3. Keep calm and carry on, i.e. on your current computer.
      1. In general, trying additional “fallback” computers is likely to only make matters worse, since even more load is put on the Moodle server system.
      2. Specifically, however, does Respondus LockDown Browser block when a user attempts to load a Moodle quiz in Respondus LockDown Browser on 2 different computers simultaneously? One student kept getting “can be loaded only in Respondus LockDown Browser” consistently, until she closed Respondus LockDown Browser on this computer. Then the quiz would finally load in Respondus LockDown Browser on another computer where she was logged in. (Can this be tracked by the Respondus LockDown Browser security layer that checks whether a page is loaded within Respondus LockDown Browser? Why then is there no more helpful error message, or is this “security by obscurity”? The data seems inconclusive.)
  4. Additional tips for takers (and authors) of Moodle exams are available.

Bootsect (the command formerly known as fixmbr) may enable your Windows 7 installation to start up from a corrupt volume

  1. I am trying to back up my system partition to external media. The 3rd party utility I boot into refuses to get to work since it sees inconsistencies on the partition.
  2. I schedule a chkdsk /f /r from Windows 7, which runs without errors on restart.
  3. Upon completion, however, Windows 7 fails to boot.
  4. I go into Startup Repair, but it tells me it cannot fix my problem and offers System Restore, which I try twice, to no avail.
  5. I get cryptic error IDs (21200664), googling of which leads to nothing but MS system engineers advising me to reimage the system partition from the factory copy, which would lose my 16 months’ worth of system customizations.
  6. I feel I have other things to do than to research the innards of Windows 7. Faintly I remember that fixmbr used to get me out of fixes with non-starting XP installations. It has been replaced by bootsect /nt60 <driveletter>, and on top of that, it responds with an apparent failure “since the volume could not be locked during the update” (actually deemed likely harmless, or use /force).
  7. Windows 7 starts up. (WTH?! Check disk versus the master boot record?!)
  8. Now, am I supposed to use a 3rd party or a Microsoft tool for my system partition backup?