Difference between revisions of "Tagset"

From Icelandic Parsed Historical Corpus (IcePaHC)
(Simple Tags)
 
(44 intermediate revisions by 2 users not shown)
Line 1: Line 1:
This is the head level tagset used in the Icelandic Treebank. The tagset is based on the [[IFD Tagset]]. Each head is assigned a tag (N-NSDIC in the example below):
+
This is the head level tagset used in the IcePaHC.
 
+
<synttree>[NP[N-NSDIC[barni]]]</synttree>
+
 
+
In the example ''barni'' is a dative form of the Icelandic word for 'child'. The first part, before the dash, always represents the word class, e.g. N for '''N'''oun. The extension of the above tag, NSDIC, means: '''N'''euter, '''S'''ingular, '''D'''ative, '''I'''ndefinite (no suffixed article), '''C'''ommon Noun. The predashial word class can be one or more characters while the postdashial subfeatures are always one character per feature.
+
  
 
==Tags with postdashial subfeatures==
 
==Tags with postdashial subfeatures==
Line 10: Line 6:
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1  ||  Word class  ||    '''N'''–noun
+
|1  ||  Word class  ||    '''N''' noun, '''NPR''' proper noun,  
|-
+
|2  ||  Gender      ||    '''M'''–masculine, '''F'''–feminine, '''N'''–neuter, '''X'''–unspecified
+
|-
+
|3  ||  Number ||          '''S'''–singular, '''P'''–plural
+
 
|-
 
|-
|4 ||  Case    ||   '''N'''–nominative, '''A'''–accusative, '''D'''–dative, '''G'''–genitive 
+
|2 ||  Number ||         '''S''' plural
 
|-
 
|-
|5 ||  Article      ||    '''I'''-without suffixed article (indefinite), '''D'''–with suffixed definite article (definite)
+
|3 ||  Case    ||    '''N''' nominative, '''A''' accusative, '''D''' dative, '''G''' genitive 
 
|-
 
|-
|6 ||  Proper/Common     ||   '''C'''-common noun, '''P'''–person name, '''L'''-place name, '''O'''–other proper name
+
|  ||  Examples     ||     '''N-G''' noun, singular, genitive; '''NS-N''' noun, plural, nominative; '''NPR-A''' proper noun, singular, accusative; '''NPRS-D''' proper noun, plural, dative
 
|}
 
|}
  
Line 27: Line 19:
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1  ||  Word class  ||    '''ADJ'''–adjective
+
|1  ||  Word class  ||    '''ADJ''' adjective, '''ADJR''' adjective, comparative, '''ADJS''' adjective, superlative
 
|-
 
|-
|2  ||  Gender      ||    '''M'''–masculine, '''F'''–feminine, '''N'''–neuter, '''X'''-unspecified
 
 
|-
 
|-
|3  ||  Number      ||    '''S'''–singular, '''P'''–plural
 
 
|-
 
|-
|4 ||  Case  ||    '''N'''–nominative, '''A'''–accusative, '''D'''–dative, '''G'''–genitive
+
|2 ||  Case  ||    '''N''' nominative, '''A''' accusative, '''D''' dative, '''G''' genitive
 
|-
 
|-
|5 ||  Declension  ||    '''S'''–strong declension, '''W'''–weak declension, '''X'''–indeclinable
+
|  ||  Examples      ||    '''ADJ-N''' adjective, nominative; '''ADJR-D''' adjective, comparative, dative; '''ADJS-G''' adjective, superlative, genitive
|-
+
|6  ||  Degree      ||    '''P'''–positive, '''C'''–comparative, '''S'''–superlative
+
 
|}
 
|}
  
Line 44: Line 32:
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1  ||  Word class ||    '''PRO'''–pronoun
+
|1  ||  Word class ||    '''PRO''' pronoun
 
|-
 
|-
|2 ||    Subcategory    || '''D'''–demonstrative, '''B'''–indefinite demonstrative (Icel. 'óákveðið ábendingarfornafn'), '''Q'''–possessive, '''X'''–indefinite (Icel. 'óákveðið'), '''P'''–personal, '''W'''–interrogative, '''R'''–relative
+
|2 ||   Case     ||   '''N''' nominative, '''A''' accusative, '''D''' dative, '''G''' genitive 
 
|-
 
|-
|3  ||  Gender/Person || '''M'''–masculine, '''F'''–feminine, '''N'''–neuter/'''1'''–1st person, '''2'''–2nd person
+
|  ||  Examples    ||     '''PRO-A''' pronoun, accusative; '''PRO-D''' pronoun, dative
|-
+
|4 ||  Number  ||         '''S'''–singular, '''P'''–plural
+
|-
+
|5  ||  Case  ||      '''N'''–nominative, '''A'''–accusative, '''D'''–dative, '''G'''–genitive
+
 
|}
 
|}
  
Line 59: Line 43:
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1 ||    Word class  ||    '''D'''–article (determiner)
+
|1 ||    Word class  ||    '''D''' determiner
 
|-
 
|-
|2  ||  Gender      ||    '''M'''-masculine, '''F'''–feminine, '''N'''–neuter
+
|2  ||  Case    ||    '''N''' nominative, '''A''' accusative, '''D''' dative, '''G''' genitive 
 
|-
 
|-
|3 ||  Number      ||   '''S'''–singular, '''P'''–plural
+
|  ||  Examples    ||     '''D-D''' determiner, dative; '''D-G''' determiner, genitive
|-
+
|4  ||  Case ||  '''N'''–nominative, '''A'''–accusative, '''D'''–dative, '''G'''–genitive
+
 
|}
 
|}
  
Line 72: Line 54:
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1  ||  Word class ||      '''NUM'''–numeral
+
|1  ||  Word class ||      '''NUM''' numeral
 
|-
 
|-
|2   || Category    ||    '''P'''-(gram. cardinal number, i.e. not an ordinal???), '''F'''-percentage (fraction), '''O'''-other
+
|2  ||   Case     ||    '''N''' nominative, '''A''' accusative, '''D''' dative, '''G''' genitive 
 
|-
 
|-
|3  ||  Gender      ||    '''M'''–masculine, '''F'''–feminine, '''N'''–neuter
+
|  ||   Examples     ||      '''NUM-N''' numeral, nominative; '''NUM-G''' numeral, genitive
|-
+
|4  ||  Number     ||    '''S'''–singular, '''P'''–plural
+
|-
+
|5  ||  Case        ||  '''N'''–nominative, '''A'''–accusative, '''D'''-dative, '''G'''–genitive
+
 
|}
 
|}
  
 
====Verbs====
 
====Verbs====
 +
(see table below for passive participle (VAN) and past participle (VBN))
 
{|  
 
{|  
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1  ||  Word class  ||    '''V'''–verb (except for past participle)
+
|1  ||  Word class  ||    '''VB''' verb; '''BE''' VERA (BE verb); '''DO''' GERA (DO verb); '''HV''' HAFA (HAVE verb); '''MD''' modal verb; '''RD''' VERÐA (WILL BE/BECOME)
|-
+
|2  ||  Mood      ||      '''T'''–infinitive, '''M'''–imperative, '''I'''–indicative, '''S'''–subjunctive, '''U'''–supine, '''P'''–present participle, '''D'''-past participle
+
|-
+
|3  ||  Voice  ||    '''A'''–active, '''M'''–middle
+
 
|-
 
|-
|4 ||   Person ||   '''1'''–1st person, '''2'''–2nd person, '''3'''–3rd person,
+
|2 ||   Tense ||   '''P''' present, '''D''' past
 
|-
 
|-
|5 ||   Number  ||     '''S'''–singular, '''P'''–plural
+
|3 || Mood      ||       '''I''' imperative (no tense), '''I''' indicative, '''S''' subjunctive
 
|-
 
|-
|6 ||  Tense  ||   '''P'''–present, '''D'''–past
+
|  ||  Examples    ||     '''VBI''' verb, imperative; '''VB''' verb, infinitive; '''MD''' modal, infinitive; '''VBPI''' verb, present, indicative; '''BEDS''' VERA, past, subjunctive
 
|}
 
|}
  
Line 104: Line 79:
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1  ||  Word class  ||     '''V'''–verb (past participle)
+
|1  ||  Word class  ||   passive participle: '''[[VAN]]''', '''BAN''', '''DAN''', '''HAN'''
 
|-
 
|-
|2   ||  Tense      ||   '''D'''–past
+
|2   ||  Case  ||         '''N''' nominative, '''A''' accusative, '''D''' dative, '''G''' genitive
 
|-
 
|-
|3  ||  Voice        ||   '''A'''–active, '''M'''–middle
+
|  ||  Examples     ||      '''VAN-A''' verb, passive participle, accusative; '''DAN-D''' GERA, passive participle, dative; '''HAN-G''' HAFA, passive participle, genitive
|-
+
|4   ||  Gender     ||      '''M'''–masculine, '''F'''–feminine, '''N'''–neuter
+
|-
+
|5  ||  Number  ||        '''S'''–singular, '''P'''–plural
+
|-
+
|6  ||  Case  ||          '''N'''–nominative, '''A'''–accusative, '''D'''–dative, '''G'''–genitive
+
 
|}
 
|}
  
====Prepositions====
 
 
{|  
 
{|  
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1 ||   Word class   ||    '''P'''–preposition
+
|1 ||   Word class ||    past participle: '''VBN''', '''BEN''', '''DON''', '''HVN''', '''RDN'''
 
|-
 
|-
|2  ||  Case governed   || '''A'''–governs accusative, '''D'''–governs dative, '''G'''–governs genitive
+
|2  ||  Case  ||         '''N''' nominative, '''A''' accusative, '''D''' dative, '''G''' genitive
|}
+
 
+
====Adverbs====
+
{|
+
!# || Category/Feature || Symbol – semantics
+
 
|-
 
|-
|1 ||   Word class   ||   '''ADV'''–adverb
+
| ||  Examples    ||     '''VBN-A''' verb, past participle, accusative; '''DON-D''' GERA, past participle, dative; '''HVN-G''' HAFA, past participle, genitive
|-
+
|2  ||  Category  ||      '''N'''–normal, '''I'''–exclamation
+
|-
+
|3  ||  Degree  ||        '''C'''–comparative, '''S'''–superlative
+
 
|}
 
|}
  
==Simple Tags==
+
====Prepositions====
 
+
 
{|  
 
{|  
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1 ||   Word class ||     '''CONJ'''–conjunction
+
|1 ||   Word class   ||    '''P''' preposition
|-
+
|2  ||  Category   ||    '''I'''–sign of infinitive, '''R'''–relative conjunction,
+
 
|}
 
|}
  
 
+
====Adverbs====
 
{|  
 
{|  
 
!# || Category/Feature || Symbol – semantics
 
!# || Category/Feature || Symbol – semantics
 
|-
 
|-
|1 ||   Word class ||     '''FOREIGN'''–foreign word
+
|1 ||   Word class   ||   '''ADV''' adverb, '''ADVR''' comparative, '''ADVS''' superlative
 
|}
 
|}
  
 +
==Simple Tags==
  
{|
+
*[[C]] - complementizer
!# || Category/Feature || Symbol – semantics
+
*[[CONJ]] - conjunction
|-
+
*[[FW]] - foreign word
|1  ||  Word class    ||  '''X'''–unanalyzed word
+
*[[NEG]] - negation, ''ekki'', ''eigi'', ''ei''
|}
+
*[[TO]] - infinitival marker ''að'', 'to'.
 +
*X - unanalyzed word

Latest revision as of 20:57, 9 August 2010

This is the head level tagset used in the IcePaHC.

Tags with postdashial subfeatures

Nouns

# Category/Feature Symbol – semantics
1 Word class N noun, NPR proper noun
2 Number S plural
3 Case N nominative, A accusative, D dative, G genitive
Examples N-G noun, singular, genitive; NS-N noun, plural, nominative; NPR-A proper noun, singular, accusative; NPRS-D proper noun, plural, dative
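
The noun tags above split at the dash into a word class and a case letter, with a trailing S on the word class marking the plural. As a minimal sketch (the function name `decode_noun_tag` and the dictionary it returns are illustrative assumptions, not part of IcePaHC), the example tags in the table could be decoded like this:

```python
# Case letters as listed in the table above.
CASES = {"N": "nominative", "A": "accusative", "D": "dative", "G": "genitive"}

# Word classes from the noun table; a trailing "S" marks the plural.
NOUN_CLASSES = {"N": "noun", "NS": "noun", "NPR": "proper noun", "NPRS": "proper noun"}


def decode_noun_tag(tag):
    """Decode a noun tag such as 'NS-N' or 'NPR-A' (hypothetical helper)."""
    word_class, sep, case = tag.partition("-")
    if not sep or case not in CASES:
        raise ValueError(f"missing or unknown case in tag: {tag!r}")
    if word_class not in NOUN_CLASSES:
        raise ValueError(f"not a noun tag: {tag!r}")
    number = "plural" if word_class.endswith("S") else "singular"
    return {"class": NOUN_CLASSES[word_class], "number": number, "case": CASES[case]}


if __name__ == "__main__":
    for example in ("N-G", "NS-N", "NPR-A", "NPRS-D"):
        print(example, decode_noun_tag(example))
```

Looking the word class up as a whole, rather than reading it letter by letter, keeps NPR from being misread as N followed by extra features.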

Adjectives

# Category/Feature Symbol – semantics
1 Word class ADJ adjective, ADJR adjective, comparative, ADJS adjective, superlative
2 Case N nominative, A accusative, D dative, G genitive
Examples ADJ-N adjective, nominative; ADJR-D adjective, comparative, dative; ADJS-G adjective, superlative, genitive

Pronouns

# Category/Feature Symbol – semantics
1 Word class PRO pronoun
2 Case N nominative, A accusative, D dative, G genitive
Examples PRO-A pronoun, accusative; PRO-D pronoun, dative

Article (determiner)

# Category/Feature Symbol – semantics
1 Word class D determiner
2 Case N nominative, A accusative, D dative, G genitive
Examples D-D determiner, dative; D-G determiner, genitive

Numbers

# Category/Feature Symbol – semantics
1 Word class NUM numeral
2 Case N nominative, A accusative, D dative, G genitive
Examples NUM-N numeral, nominative; NUM-G numeral, genitive

Verbs

(see table below for passive participle (VAN) and past participle (VBN))

# Category/Feature Symbol – semantics
1 Word class VB verb; BE VERA (BE verb); DO GERA (DO verb); HV HAFA (HAVE verb); MD modal verb; RD VERÐA (WILL BE/BECOME)
2 Tense P present, D past
3 Mood I imperative (no tense), I indicative, S subjunctive
Examples VBI verb, imperative; VB verb, infinitive; MD modal, infinitive; VBPI verb, present, indicative; BEDS VERA, past, subjunctive
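
Finite verb tags carry no dash: the two-letter word class is followed directly by the tense and mood letters. The letter I is used for both imperative and indicative, and the examples suggest the two are told apart by whether a tense letter precedes it. The sketch below follows that reading of the table; the name `decode_verb_tag` and the output layout are assumptions made for illustration only.

```python
# Two-letter verb word classes from the table above.
VERB_CLASSES = {
    "VB": "verb",
    "BE": "VERA",
    "DO": "GERA",
    "HV": "HAFA",
    "MD": "modal",
    "RD": "VERÐA",
}
TENSES = {"P": "present", "D": "past"}
MOODS = {"I": "indicative", "S": "subjunctive"}


def decode_verb_tag(tag):
    """Decode a non-participle verb tag such as 'VBPI' (hypothetical helper)."""
    word_class, suffix = tag[:2], tag[2:]
    if word_class not in VERB_CLASSES:
        raise ValueError(f"not a verb tag: {tag!r}")
    features = {"class": VERB_CLASSES[word_class]}
    if suffix == "":
        # A bare word class is glossed as the infinitive in the examples.
        features["form"] = "infinitive"
    elif suffix == "I":
        # "I" without a tense letter is the imperative (no tense).
        features["form"] = "imperative"
    elif len(suffix) == 2 and suffix[0] in TENSES and suffix[1] in MOODS:
        features["tense"] = TENSES[suffix[0]]
        features["mood"] = MOODS[suffix[1]]
    else:
        # Participle tags such as VBN or VAN are covered by separate tables.
        raise ValueError(f"unsupported verb tag: {tag!r}")
    return features


if __name__ == "__main__":
    for example in ("VBI", "VB", "MD", "VBPI", "BEDS"):
        print(example, decode_verb_tag(example))
```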


# Category/Feature Symbol – semantics
1 Word class passive participle: VAN, BAN, DAN, HAN
2 Case N nominative, A accusative, D dative, G genitive
Examples VAN-A verb, passive participle, accusative; DAN-D GERA, passive participle, dative; HAN-G HAFA, passive participle, genitive

# Category/Feature Symbol – semantics
1 Word class past participle: VBN, BEN, DON, HVN, RDN
2 Case N nominative, A accusative, D dative, G genitive
Examples VBN-A verb, past participle, accusative; DON-D GERA, past participle, dative; HVN-G HAFA, past participle, genitive
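
The two participle tables can be read as two fixed sets of word classes, each optionally followed by a dash and a case letter. The following sketch shows one way to tell them apart. The helper name `decode_participle_tag` is hypothetical, and treating a tag without a dash as carrying no case is an assumption that the tables neither confirm nor rule out.

```python
CASES = {"N": "nominative", "A": "accusative", "D": "dative", "G": "genitive"}

# Word classes exactly as listed in the two participle tables.
PASSIVE_PARTICIPLES = {"VAN", "BAN", "DAN", "HAN"}
PAST_PARTICIPLES = {"VBN", "BEN", "DON", "HVN", "RDN"}


def decode_participle_tag(tag):
    """Classify a participle tag such as 'VAN-A' or 'HVN-G' (hypothetical helper)."""
    word_class, sep, case = tag.partition("-")
    if word_class in PASSIVE_PARTICIPLES:
        kind = "passive participle"
    elif word_class in PAST_PARTICIPLES:
        kind = "past participle"
    else:
        raise ValueError(f"not a participle tag: {tag!r}")
    if sep and case not in CASES:
        raise ValueError(f"unknown case in tag: {tag!r}")
    # Assumption: a tag without a dash carries no case information.
    return {"class": word_class, "type": kind, "case": CASES.get(case)}


if __name__ == "__main__":
    for example in ("VAN-A", "DAN-D", "HVN-G"):
        print(example, decode_participle_tag(example))
```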

Prepositions

# Category/Feature Symbol – semantics
1 Word class P preposition

Adverbs

# Category/Feature Symbol – semantics
1 Word class ADV adverb, ADVR comparative, ADVS superlative

Simple Tags

  • C - complementizer
  • CONJ - conjunction
  • FW - foreign word
  • NEG - negation, ekki, eigi, ei
  • TO - infinitival marker að, 'to'.
  • X - unanalyzed word