Scraping RSS of online actualités for language learning materials production

  1. Integrating RSS feeds of foreign-language news may be standard in most LMSs now, but it was not in 2002. Even having an LMS was not standard then: I had to build my own, and it took the university a few more years to adopt Blackboard, as I had recommended in 2000: cc-calico-news-glossing.2
  2. But RSS-feed display is skin-deep and, even in extensive-reading pedagogies, not sufficient for integration into teaching and learning, which requires more post-processing.
  3. At a recent Digital Humanities Unconference, I was asked how I had “scraped” RSS feeds into a SQL Server database. (I chose RSS scraping because it is easier than screen scraping: RSS is devoid of most markup, as long as it validates.) Here are some code snippets to get you
    1. from the web
    2. into the database: sql-portal-csvs-code, cc-ms-sql-server2, cc-ms-sql-server3
    3. The scraped plain text in the database can form the foundation for post-processing for SLA purposes; see e.g. glossing for reading comprehension facilitation or question generation with the trpQuizConverter for 2008 install experience
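
The two steps listed above (from the web, into the database) can be sketched roughly as follows. This is a minimal illustration in Python, not the code used at the time: it assumes an RSS 2.0 feed, uses the standard-library sqlite3 module as a stand-in for SQL Server so that it runs without a server, and the sample feed, the table name news_items and all function names are hypothetical.

```python
import html
import re
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample feed. A live feed could be fetched with
# urllib.request.urlopen(url).read() and passed in the same way.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Actualités</title>
    <item>
      <title>Premier titre</title>
      <link>http://example.com/1</link>
      <description>Un &lt;b&gt;court&lt;/b&gt; résumé.</description>
    </item>
    <item>
      <title>Deuxième titre</title>
      <link>http://example.com/2</link>
      <description>Un autre résumé.</description>
    </item>
  </channel>
</rss>"""


def plain_text(fragment):
    """Drop embedded HTML tags, decode entities and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", fragment or "")
    return " ".join(html.unescape(no_tags).split())


def parse_items(rss_text):
    """Step 1 (from the web): return (title, link, body) per <item>."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append((
            plain_text(item.findtext("title")),
            (item.findtext("link") or "").strip(),
            plain_text(item.findtext("description")),
        ))
    return items


def store_items(conn, items):
    """Step 2 (into the database): insert items, skipping links already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news_items "
        "(link TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO news_items (link, title, body) VALUES (?, ?, ?)",
        [(link, title, body) for title, link, body in items],
    )
    conn.commit()


# In-memory database for the sketch (assumption: SQL Server in the original setup).
conn = sqlite3.connect(":memory:")
store_items(conn, parse_items(SAMPLE_RSS))
for title, body in conn.execute("SELECT title, body FROM news_items ORDER BY link"):
    print(title, "|", body)
```

Using the item link as the primary key means the same feed can be polled repeatedly without storing duplicates, which matters when the stored text is later post-processed for glossing or quiz generation.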

Symptom: Error 32003. Vsvars.bat could not be opened for write.

Context: After the 2008 Professional install on Windows 7 Home Premium 64-bit and a reboot, on opening Outlook 2010 32-bit.

Resolution: In the file's properties, on the Security tab, give “Full control” permissions to “Everyone”.

Comments: I changed the permissions back after the installer finished and have yet to see whether the error occurs again. The error did not occur on Vista Home Premium 64-bit.

