Main Page
Revision as of 10:20, 25 June 2010
This is the Icelandic Treebank Wiki. It is mainly used to document the annotation standard for those constructing and using the corpus. The annotation scheme is meant to be largely compatible with the Penn historical corpora, and the guidelines here are written as a supplement to the Penn guidelines; see Beatrice Santorini's guidelines for further information.
How to get a copy?
The treebank is under construction, but preview versions will be released regularly during construction. The first preview, version 0.1, will be released on July 1, 2010, for download from this site (under a free and open source license). Until then, you can follow the development on GitHub.
Citation for the version 0.1 preview release
Wallenberg, Joel, Anton Karl Ingason, Einar Freyr Sigurðsson and Eiríkur Rögnvaldsson. 2010. Diachronic Icelandic Corpus. Version 0.1, http://www.linguist.is/wiki
Annotation guidelines:
- Phrase Types
- Head Types
- Conjunction
- Tagset
- Treatment of individual words
- Empty categories
- Lemmatization
- Index
Search PPCME/PPCEME documentation (ling.upenn.edu/~beatrice/annotation)
General information
- Icelandic Syntax Phenomena
- Syntactic definitions
- English-Icelandic Translations of Linguistic Terminology
- Texts
Annotation team stuff:
- Annotation Issues ...
- Annotation Process
- Checklist
- MediaWiki Formatting Guide
Resources
- Icelandic Resources for doing Computational Linguistics and Natural Language Processing
- Treebank Resources (language independent)
- Penn Parsed Corpora of Historical English
- Parsed Corpora for other languages
Treebank team:
- Eiríkur Rögnvaldsson (PI)
- Joel Wallenberg (PI)
- Anton Karl Ingason
- Einar Freyr Sigurðsson
- Brynhildur Stefánsdóttir (BA research assistant)
- Hulda Óladóttir (BA research assistant)