Anton Karl Ingason


2011

Some highlights of 2011 (roughly in the order they happened):
  • In the beginning of the year, getting an offer to do graduate school work at UPenn
  • Visiting Philadelphia twice before moving there, and on one of those occasions also spending some time in New York
  • Being healthier than ever before and having the organizational skills to run, eat, and sleep well. There was a big positive change in the first part of the year and throughout the summer. (All of those things suffered a bit towards the end of the year because of graduate school stress. Will work on that in 2012!)
  • Visiting Manchester in May, meeting a number of wonderful people, seeing an interesting city, and getting stuck there for a few extra days because of a volcano making trouble back home in Iceland
  • Finishing a huge project at work in August. The project took two years and our group finished it exactly on time, even though there were times when the deadline didn't seem realistic. Things go well when you're working with a great group of people.
  • Moving to Philadelphia in the fall
  • Switching to an actual Linux computer for my work after having been in the business of installing Linux on Windows computers before. Free at last and will hopefully never do business with the Microsoft/Apple evil empires again!
  • Getting married in Philadelphia shortly after moving there. It was just the two of us and the people working in the chapel and we had a wonderful day that was exciting in a very relaxed way. We had some really good Indian food and I got a bunch of cheese. For our honeymoon we went running up the Schuylkill River together.
  • Enjoying cheese in a place where buying and selling cheese is a free market experience
  • Getting breakfast shipped from Amazon by UPS
  • Catching two mice
  • Getting two mouse traps for Christmas to replace the ones I already used
  • Getting to know wonderful people from all over the world, especially the other newly arrived linguists at UPenn. Also, Americans are always incredibly nice and I am very fortunate to get a chance to find out what things are like over there. Americans are sometimes very critical of their own Americanness, but the basis for those nationality worries is most often at the large-scale level of a broken and self-destructive system of corporations and institutions. That stuff has nothing to do with the more personal level where Americanness is a super positive quality. I will stop now before this turns into a blog post about Americanness, I'll do that later :-)
  • Having a chance to meet family and friends in Iceland in December (though time went by far too fast so that I wasn't able to meet everyone, but that'll happen next time!)
The main regret is that I would have liked to meet more people more often outside of work, but the entire year was full of intense work, and in the latter half of the year the transition to a new environment, with all the adjustment it required, took up even more time. Will fix this in 2012.
Last Updated on Monday, 02 January 2012 12:50
 

IcePaHC 0.9. 1 million words of syntactically parsed (hand-corrected) Icelandic

We are very pleased to announce that version 0.9 of the Icelandic Parsed Historical Corpus (IcePaHC) is now available for free download.

The corpus can be downloaded from:
www.linguist.is/icelandic_treebank/Download

The corpus is a treebank of over 1 million words in size, annotated for full phrase structure parse, and hand-corrected, using an adaptation of the annotation scheme used by the Penn Treebank and the Penn parsed corpora of historical English (http://www.ling.upenn.edu/hist-corpora/). Note that this release contains all of the text for version 1.0, but some minor corrections remain to be finished.

The corpus contains:

- 1 002 361 words total, consisting of ~100 000-word samples from each century from the 12th to the beginning of the 21st century.
- Annotated with a phrase structure parse, part-of-speech-tagged, and lemmatized.
- The entire parse, pos-tagging, and lemmata for every sentence have been *hand-corrected*.
- Text samples are balanced for genre within each century.
- LGPL license: You are free to copy, modify and redistribute the corpus for research and/or profit with appropriate citation.

The corpus is distributed as raw UTF-8 data in labeled bracketing format and it is therefore compatible with various existing programs, including CorpusSearch (http://corpussearch.sourceforge.net/).
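
As a rough illustration of what labeled bracketing looks like to a program, here is a minimal Python sketch that reads a bracketed tree into nested lists and pulls out the (tag, word) pairs. The sample sentence and its tags are invented for this example and are not taken from IcePaHC, the function names are hypothetical, and the parser assumes that words never contain literal parentheses.

```python
def parse_bracketed(text):
    """Parse labeled-bracketing text into nested lists of strings."""
    # Pad the parentheses with spaces so a plain split() tokenizes the input.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]  # stack[0] collects the top-level trees
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            node = stack.pop()
            stack[-1].append(node)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def leaves(node):
    """Yield (tag, word) pairs; a terminal is a node of the form [tag, word]."""
    if len(node) == 2 and all(isinstance(x, str) for x in node):
        yield (node[0], node[1])
        return
    for child in node:
        if isinstance(child, list):
            yield from leaves(child)


# Invented sample tree (assumption: simplified tags, not real corpus data).
SAMPLE = "( (IP-MAT (NP-SBJ (PRO hann)) (VBDI kom) (ADVP (ADV heim)) (. .)))"

trees = parse_bracketed(SAMPLE)
pairs = [pair for tree in trees for pair in leaves(tree)]
print(pairs)
```

For a real corpus file, the text would be read with open(path, encoding="utf-8") before being passed to parse_bracketed. For serious queries, a dedicated tool such as CorpusSearch is the better choice.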

A plain text version without markup and a set of info files containing philological information accompany the corpus download.

The entire corpus may be downloaded as a plain text version, bundled with a platform-independent GUI, or bundled with a Windows-compatible GUI for ease of searching.

Further information on the annotation guidelines and project organization can be found on the project wiki:
www.linguist.is/icelandic_treebank/


Joel C. Wallenberg
Anton Karl Ingason
Einar Freyr Sigurðsson
Eiríkur Rögnvaldsson
University of Iceland

We were grateful to receive support for this project through the following grants:

Icelandic Research Fund (RANNÍS), grant nr. 090662011, "Viable Language Technology beyond English – Icelandic as a test case".

U.S. National Science Foundation (NSF) International Research Fellowship Program (IRFP), grant #OISE-0853114, "Evolution of Language Systems: a comparative study of grammatical change in Icelandic and English".

University of Iceland Research Fund (Rannsóknasjóður Háskóla Íslands), grant "Icelandic Diachronic Treebank" (Sögulegur íslenskur trjábanki).

Last Updated on Monday, 29 August 2011 14:04
 

Available: IcePaHC 0.4 (now includes a visual Windows version)

IcePaHC 0.4, the latest version of the Icelandic Parsed Historical Corpus, is now available for download:

http://linguist.is/icelandic_treebank/Download

- 440,000 words total, from every century from the 12th to the 19th inclusive, annotated for phrase structure, part-of-speech-tagged and lemmatized
- An optional easy-to-install visual user interface for Windows
- LGPL license: You are free to copy, modify and redistribute the corpus for research and/or profit

Joel C. Wallenberg
Anton Karl Ingason
Einar Freyr Sigurðsson
Eiríkur Rögnvaldsson
University of Iceland

The project is funded by the following grants:

Icelandic Research Fund (RANNÍS), grant nr. 090662011, "Viable Language Technology beyond English – Icelandic as a test case".

U.S. National Science Foundation (NSF) International Research Fellowship Program (IRFP), grant #OISE-0853114, "Evolution of Language Systems: a comparative study of grammatical change in Icelandic and English".

--------------------------------

IcePaHC 0.4, the Icelandic treebank (now with a Windows version)


IcePaHC 0.4, the latest version of the Icelandic treebank, has been released:

http://linguist.is/icelandic_treebank/Download

- 440,000 words in total, from every century from the 12th through the 19th, syntactically parsed, part-of-speech-tagged and lemmatized
- A simple Windows installer for the graphical user interface
- LGPL license: Users may copy the corpus, modify it and redistribute it for research and/or for profit

Joel C. Wallenberg
Anton Karl Ingason
Einar Freyr Sigurðsson
Eiríkur Rögnvaldsson

The project is funded by:

RANNÍS, grant nr. 090662011, "Viable Language Technology beyond English – Icelandic as a test case".

U.S. National Science Foundation (NSF) International Research Fellowship Program (IRFP), grant #OISE-0853114, "Evolution of Language Systems: a comparative study of grammatical change in Icelandic and English".
Last Updated on Tuesday, 12 April 2011 15:45
 

Fix accent problem in TexMaker on Ubuntu

In TexMaker on Ubuntu, accents sometimes stop going over the character and are instead written before it, so you get 'a instead of á (Icelandic, Spanish etc.). To fix this, install the ibus-qt4 package:

sudo apt-get install ibus-qt4

Fedora equivalent:

yum install ibus-qt

(Source thread)

An easy fix for a very annoying and unpredictable problem. I have no idea why this occasionally happens without that package, but according to online sources the bug affects several other Linux distributions as well, even with the latest version of TexMaker.
Last Updated on Wednesday, 24 August 2011 09:36
 

Available: Icelandic Parsed Historical Corpus (IcePaHC), V0.2

We are pleased to announce that version 0.2 of the Icelandic Parsed Historical Corpus (IcePaHC) is now available for free download.

The corpus is syntactically parsed and annotated for full phrase structure, using an adaptation of the annotation scheme used by the Penn parsed corpora of historical English and other corpora in that tradition (see links from website). The corpus contains ca. 120,000 words from 6 different centuries (12th, 13th, 16th, 17th, 18th and 19th). Please note that this is a small portion of the ultimate goal for the completed corpus, ca. 1 million words from the 12th-19th centuries.

The corpus is distributed as raw UTF-8 data in labeled bracketing format and it is therefore compatible with various existing programs, including CorpusSearch.

The corpus can be downloaded from:
www.linguist.is/icelandic_treebank/Download

Further information on the annotation guidelines and project organization can be found on the project wiki:
www.linguist.is/icelandic_treebank/

We hope that this release will result in feedback that allows us to improve the resource for upcoming versions. Updates are released every three months; the upcoming 0.3 version will be released on January 1st 2011. Between releases, development can be tracked in our open repository on GitHub (http://github.com/antonkarl/icecorpus), but use of released versions is encouraged to ensure that results can be replicated.

Texts included in Version 0.2:
4585 words from The First Grammatical Treatise (entire text) (12th century)
8179 words from Íslensk hómilíubók (Icelandic book of homilies) (12th century)
3459 words from Egils saga (theta fragment) (13th century)
22719 words from Sturlunga saga (13th century)
20683 words from the New Testament's Gospel of John (1540)
16421 words from the New Testament's Acts (1540)
4521 words from Jón Indíafari's travelogue (1661)
22097 words from Jón Steingrímsson's biography (1791)
17837 words from Piltur og stúlka (novel by Jón Thoroddsen) (1850)
Total number of words: 120355

Joel Wallenberg
Anton Karl Ingason
Einar Freyr Sigurðsson
Eiríkur Rögnvaldsson
University of Iceland

The project is funded by the following grants:

Icelandic Research Fund (RANNÍS), grant nr. 090662011, "Viable Language Technology beyond English – Icelandic as a test case".

U.S. National Science Foundation (NSF) International Research Fellowship Program (IRFP), grant #OISE-0853114, "Evolution of Language Systems: a comparative study of grammatical change in Icelandic and English".
Last Updated on Friday, 01 October 2010 17:53
 