Corpus del Español Actual (CEA)
- Link:

- Example of KWIC view result:

- Based on Europarl, Wikicorpus (2006!), MultiUN. From their metadata page:
Metadata for Corpus del Español Actual
  Corpus name: Corpus del Español Actual
  CQPweb’s short handles for this corpus: cea / CEA
  Total number of corpus texts: 73,010
  Total words in all corpus texts: 539,367,886
  Word types in the corpus: 1,680,309
  Type:token ratio: 0 types per token
Text metadata and word-level annotation
  The database stores the following information for each text in the corpus: There is no text-level metadata for this corpus.
  The primary classification of texts is based on: A primary classification scheme for texts has not been set.
  Words in this corpus are annotated with: Lemma (Lemma), Part-Of-Speech (POS), WStart (WStart)
  The primary tagging scheme is: Part-Of-Speech
  Further information about this corpus is available on the web at:
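
A note on the "0 types per token" figure in the metadata above: it is presumably a rounded display value, since the type and token counts given on the same page imply a small but non-zero ratio. A minimal sketch of the arithmetic, using only the two counts quoted above (the variable names are mine, not CQPweb's):

```python
# Counts copied from the CEA metadata quoted above.
word_types = 1_680_309
word_tokens = 539_367_886

# Type:token ratio = distinct word types divided by running words.
ttr = word_types / word_tokens

print(f"type:token ratio: {ttr:.4f}")
print(f"types per 1,000 tokens: {ttr * 1000:.1f}")
```

So the ratio is roughly 0.0031, or about three distinct word types per thousand running words, which is unsurprising for a corpus of more than half a billion words.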
- To use, “consult the IMS’s brief description of the regular-expression syntax used by the CQP and their list of sample queries. If you wish to define your query in terms of grammatical and inflectional categories, you can use the part-of-speech tags listed on the CEA’s Corpus Tags page.”
- Also provides frequency lists (based on word forms, lemmas, and others; up to 1,000 entries):

- Example of a frequency query result (a lemmatized list was requested here, which groups all inflected forms under their lemma; clicking a lemma displays a KWIC view containing all forms subsumed under it, as in the KWIC example above):
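
To give a rough idea of what a CQP-style query over the word, lemma and part-of-speech annotation might look like, here is a small sketch that assembles query strings in Python. The attribute names `word`, `lemma` and `pos` and the tag pattern `V.*` are assumptions for illustration only; the actual attribute handles and tags are the ones listed on the CEA's own Corpus Tags page. The helper functions `token` and `sequence` are hypothetical and not part of CQP or CQPweb.

```python
def token(**constraints: str) -> str:
    """Build one CQP token expression, e.g. [lemma="ir" & pos="V.*"].

    Each keyword is an attribute name and each value a regular expression.
    Values are inserted as-is (no escaping of embedded quotes).
    """
    if not constraints:
        return "[]"  # in CQP, an empty pair of brackets matches any token
    parts = [f'{attr}="{regex}"' for attr, regex in constraints.items()]
    return "[" + " & ".join(parts) + "]"


def sequence(*tokens: str) -> str:
    """Join token expressions into a query over consecutive tokens."""
    return " ".join(tokens)


# Assumed attribute names and tag pattern: any form of the lemma "ir",
# followed by the word "a", followed by a token whose tag starts with "V".
query = sequence(token(lemma="ir"), token(word="a"), token(pos="V.*"))
print(query)
```

The resulting string can then be pasted into the query box; whether it returns anything depends on the real attribute names and tagset.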

Categories: Corpus-linguistics, Spanish, websites

