Difference between revisions of "Download"

From Icelandic Parsed Historical Corpus (IcePaHC)
Line 1: Line 1:
 +
==Introduction==
 +
The Icelandic Parsed Historical Corpus (IcePaHC) is a project that aims to construct a diachronic corpus with samples of written Icelandic from all periods from the 12th century to modern times. The corpus is mostly compatible with the [http://www.ling.upenn.edu/~beatrice/annotation/ corpora of historical English] developed at UPenn.
 +
 
==Download Version 0.1 ([http://en.wikipedia.org/wiki/GNU_Lesser_General_Public_License LGPL])==
 
 
To experiment with the 0.1 preview version of the Icelandic Parsed Historical Corpus (IcePaHC), you can download the following zip file, which contains the raw data of the corpus in labeled bracketing format.
 
Line 5: Line 8:
  
 
The corpus, as well as software developed as part of the IcePaHC project, is released under the [http://en.wikipedia.org/wiki/GNU_Lesser_General_Public_License LGPL] license to ensure compatibility with other LGPL-licensed NLP tools, notably the [http://sourceforge.net/projects/icenlp/ IceNLP] toolkit, which is used extensively in the development of the corpus.
 
 +
 +
The corpus is free as in beer and as in speech. We recommend citing the latest released version when using the corpus for research, so that results can be replicated. The most up-to-date version, along with information on the current state of development, is available in our version control repository on [http://github.com/antonkarl/icecorpus GitHub].
  
 
==Citation for the version 0.1 preview release (of July 1st 2010)==
 

Revision as of 11:22, 30 June 2010

Introduction

The Icelandic Parsed Historical Corpus (IcePaHC) is a project that aims to construct a diachronic corpus with samples of written Icelandic from all periods from the 12th century to modern times. The corpus is mostly compatible with the corpora of historical English developed at UPenn.

Download Version 0.1 (LGPL)

To experiment with the 0.1 preview version of the Icelandic Parsed Historical Corpus (IcePaHC), you can download the following zip file, which contains the raw data of the corpus in labeled bracketing format.

  • Icelandic Parsed Historical Corpus (IcePaHC). Version 0.1. LGPL. (zip, X KB)
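
For readers unfamiliar with labeled bracketing, the following Python sketch shows one way such data might be read into a nested tree. It is a minimal illustration, not the project's own tooling. The example sentence and its labels are made up for this sketch in the general style of the Penn historical corpora and are not taken from the corpus files, and the function names (tokenize, parse, leaves) are hypothetical. The sketch assumes that words contain no parentheses or whitespace.

```python
def tokenize(text):
    """Split a labeled-bracketing string into '(', ')' and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one bracketed constituent into a nested (label, children) tuple."""
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        # A wrapper bracket may have no label of its own, e.g. "( (IP-MAT ...))".
        if tokens[pos] in ("(", ")"):
            label = ""
        else:
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())      # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, children)

    return node()


def leaves(tree):
    """Return the words of a parsed tree in left-to-right order."""
    _label, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


# Hypothetical example in labeled bracketing; not copied from the corpus.
example = "(IP-MAT (NP-SBJ (PRO hann)) (VBDI kom) (ADVP (ADV heim)))"
tree = parse(tokenize(example))
print(tree[0])       # label of the root constituent
print(leaves(tree))  # the words of the sentence
```

Each pair of brackets encloses a constituent whose first element is its label, so the nesting of brackets mirrors the syntactic tree directly. The actual corpus files may contain additional material (for example sentence identifiers or empty categories) that this sketch does not treat specially.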

The corpus, as well as software developed as part of the IcePaHC project, is released under the LGPL license to ensure compatibility with other LGPL-licensed NLP tools, notably the IceNLP toolkit, which is used extensively in the development of the corpus.

The corpus is free as in beer and as in speech. We recommend citing the latest released version when using the corpus for research, so that results can be replicated. The most up-to-date version, along with information on the current state of development, is available in our version control repository on GitHub.

Citation for the version 0.1 preview release (of July 1st 2010)

Wallenberg, Joel, Anton Karl Ingason, Einar Freyr Sigurðsson and Eiríkur Rögnvaldsson. 2010. 
Icelandic Parsed Historical Corpus (IcePaHC). 
Version 0.1. http://www.linguist.is/wiki