CP-CMP

From Icelandic Parsed Historical Corpus (IcePaHC)
Revision as of 09:20, 21 May 2010 by Anton (Talk | contribs) (SEM and SVO SEM)

These guidelines are being revised; see CP-CMP-OLD for the old guidelines, which will be removed once these are finished.

EN (BETUR EN (CP-CMP) ...)

Overt clausal structure

(ADVP (ADVR betur-vel)
    (PP *ICH*-6))
(VAN BÚNIR-BÚA)
(PP (CONJ *ICH*-5)
  (PP (P að-að)
      (NP (N-D viti-vit)))
  (CONJP (CONJ og-og)
	 (PP (P að-að)
	     (NP (N-D máli-mál)))))
(PP-6 (P en-en)
    (CP-CMP (WADJP-7 0)
	    (C 0)
	    (IP-SUB (ADJP *T*-7)
		    (NP-SBJ (PRO-N vér-ég))
		    (BEPS SÉIM-VERA))))))))
  (ADJP (ADJR-N helgara-helgari)
	(PP (P en-en)
	    (CP-CMP (WNP-1 0)
		    (C 0)
		    (IP-SUB (IP-SUB-2 (NP-SBJ (NS-N menn-maður))
				      (MDPS megi-mega)
				      (PP (P eftir-eftir)
					  (NP *T*-1))
				      (VB GLÍKJA-GLÍKJA))
			    (CONJP (CONJ eða-eða)
				   (IP-SUB=2 (RP frá-frá) (VB segja-segja)))))))
(IP-SUB (NP-SBJ (PRO-N hún-hún))
      (VBDI tókst-taka)
      (NP-MSR (ADJR-A MEIRA-MIKILL)
	      (PP *ICH*-1))
      (PP (P á-á)
	  (NP (NS-A hendur-hönd)))
      (IP-INF (NP-OB1 (NPR-D Guði-guð))
	      (TO að-að)
	      (VB þjóna-þjóna))
      (PP-1 (P en-en)
	    (CP-CMP (WNP-2 0)
		    (C 0)
		    (IP-SUB (IP-SUB-3 (NP-MSR *T*-2)
				      (NP-SBJ (N-N boðorð-boðorð))
				      (BEDS væri-vera)
				      (RP til-til))
			    (CONJP (CONJ eða-eða)
				   (IP-SUB=3 (NP-SBJ (N-N dæmi-dæmi)))))))))))

(Partly) reconstructed clausal structure

When the comparative clause contains only a nominative subject, we treat it as if the rest of the comparative has been elided. In other words, the comparative is parsed as if it contained a full subordinate clause, even though only the subject of that clause is overt.

Note that this is unlike the parse of sentences like "John is bigger than Bill" in the English corpora (PPCME2, PPCEME), where "than Bill" is parsed as a PP without internal clausal structure. Since there is no prepositional comparative in Icelandic, except possibly the "No clausal structure" case described under that heading below, comparatives almost always contain some clausal structure in the Icelandic corpus.

  (ADJP (ADJR-N HELGARI-HELGUR)
	(PP (P en-en)
	    (CP-CMP (WADJP-5 0)
                    (C 0)
		    (IP-SUB (ADJP *T*-5)
			    (NP-SBJ (OTHERS-N aðrir-annar))))))))

Object comparatives

Object comparatives are treated exactly like the previous case, in which everything but the subject is elided. The only difference is that here the IP-SUB contains only an object rather than only a subject.

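Hann vill banana frekar en (hann vill) appelsínur. 'He wants bananas rather than (he wants) oranges.'

Honum tókst setningafræðin betur en (honum tókst) hljóðkerfisfræðin. 'He did better at syntax than (he did at) phonology.'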

(ADVP (ADVR frekar)
       (PP (P en-en)
           (CP-CMP (WADVP-1 0)
                   (C 0)
                   (IP-SUB (ADVP *T*-1)
                           (NP-OB1 (NS-A appelsínur-appelsína))))))))))
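
The subject-only and object-only cases differ only in which argument survives inside the IP-SUB. The following is a minimal Python sketch of how that could be checked mechanically. It is not part of the IcePaHC tooling: the helper names (parse, subtrees, is_trace, overt_remnants) are hypothetical, and the two trees are the examples above with the surplus closing brackets trimmed so that they parse on their own.

```python
import re

def parse(text):
    """Parse one bracketed tree into nested lists: [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return [label] + children

    return node()

def subtrees(tree):
    """Yield the tree itself and every phrase or word node below it."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, list):
            yield from subtrees(child)

def is_trace(node):
    """True for nodes whose only content is an empty category such as *T*-5."""
    return len(node) == 2 and isinstance(node[1], str) and node[1].startswith("*")

def overt_remnants(tree):
    """For each CP-CMP, list the labels of the overt daughters of its IP-SUB."""
    result = []
    for cp in subtrees(tree):
        if not cp[0].startswith("CP-CMP"):
            continue
        for ip in cp[1:]:
            if isinstance(ip, list) and ip[0].startswith("IP-SUB"):
                result.append([d[0] for d in ip[1:]
                               if isinstance(d, list) and not is_trace(d)])
    return result

# Example trees from this page, trimmed to balanced brackets.
SUBJECT_ONLY = """
(ADJP (ADJR-N HELGARI-HELGUR)
      (PP (P en-en)
          (CP-CMP (WADJP-5 0)
                  (C 0)
                  (IP-SUB (ADJP *T*-5)
                          (NP-SBJ (OTHERS-N aðrir-annar))))))
"""
OBJECT_ONLY = """
(ADVP (ADVR frekar)
      (PP (P en-en)
          (CP-CMP (WADVP-1 0)
                  (C 0)
                  (IP-SUB (ADVP *T*-1)
                          (NP-OB1 (NS-A appelsínur-appelsína))))))
"""

print(overt_remnants(parse(SUBJECT_ONLY)))  # [['NP-SBJ']]
print(overt_remnants(parse(OBJECT_ONLY)))   # [['NP-OB1']]
```

In both cases the trace coindexed with the empty operator is skipped, so only the surviving argument is reported.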

No clausal structure

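This case applies to ÖÐRUVÍSI 'different', as in: Jón er öðruvísi en Joel (*er) 'Jón is different from Joel'. Here the copula cannot be added after Joel, so no clause is reconstructed: EN simply heads a PP with an NP complement.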
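
The contrast between the clausal and the non-clausal treatment of EN comes down to what the PP dominates. The sketch below, in Python, classifies each PP headed by en accordingly. The helper names are hypothetical, and the bracketing for "en Joel" is an illustration built from the description above (including the lemma), not a tree taken from the corpus.

```python
import re

def parse(text):
    """Parse one bracketed tree into nested lists: [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return [label] + children

    return node()

def subtrees(tree):
    """Yield the tree itself and every phrase or word node below it."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, list):
            yield from subtrees(child)

def is_en_head(node):
    """True for a P node whose word-lemma pair starts with 'en-'."""
    return (isinstance(node, list) and node[0] == "P" and len(node) == 2
            and isinstance(node[1], str) and node[1].lower().startswith("en-"))

def classify_en_pps(tree):
    """Return 'clausal', 'nominal' or 'other' for every PP headed by en."""
    kinds = []
    for pp in subtrees(tree):
        if not pp[0].startswith("PP") or not any(is_en_head(c) for c in pp[1:]):
            continue
        labels = [c[0] for c in pp[1:] if isinstance(c, list)]
        if any(label.startswith("CP-CMP") for label in labels):
            kinds.append("clausal")       # reconstructed clausal structure
        elif any(label.startswith("NP") for label in labels):
            kinds.append("nominal")       # no clausal structure
        else:
            kinds.append("other")
    return kinds

# Clausal example from this page (brackets trimmed to balance).
CLAUSAL = """
(ADJP (ADJR-N HELGARI-HELGUR)
      (PP (P en-en)
          (CP-CMP (WADJP-5 0)
                  (C 0)
                  (IP-SUB (ADJP *T*-5)
                          (NP-SBJ (OTHERS-N aðrir-annar))))))
"""
# Illustrative bracketing only; tags and lemma are assumptions.
NOMINAL = "(PP (P en-en) (NP (NPR-N Joel-joel)))"

print(classify_en_pps(parse(CLAUSAL)))  # ['clausal']
print(classify_en_pps(parse(NOMINAL)))  # ['nominal']
```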

SEM and SVO SEM

This section is currently under revision; see SVO SEM for the older guidelines.

SEM is always tagged C.

Every clause containing a SEM complementizer also contains a gap.

SEM comparatives do not have an introducing P head: they are bare CP-CMP nodes, with SEM in C and a gap of the appropriate kind in Spec(CP). Note that this differs from the treatment of "as"-comparatives in the PPCME2 and PPCEME, even though SEM is sometimes translated with English "as".
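
The three statements about SEM above can be read as checks on a parsed tree. The following Python sketch is one possible way to express them, with hypothetical helper names. It assumes that the gap is realised as an indexed W-phrase daughter of the CP with a matching *T* trace somewhere below it. The first tree is an example from this page; the second is deliberately malformed for illustration.

```python
import re

def parse(text):
    """Parse one bracketed tree into nested lists: [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return [label] + children

    return node()

def leaves(tree):
    """Yield every terminal string below the given node."""
    for child in tree[1:]:
        if isinstance(child, list):
            yield from leaves(child)
        else:
            yield child

def is_sem(node):
    """True for a word node whose word-lemma pair is sem-sem (any case)."""
    return (len(node) == 2 and isinstance(node[1], str)
            and node[1].lower() == "sem-sem")

def check_sem(tree, parent=None):
    """Return a list of violations of the SEM conventions below this node."""
    problems = []
    kids = [c for c in tree[1:] if isinstance(c, list)]
    for kid in kids:
        if is_sem(kid) and kid[0] != "C":
            problems.append(f"sem tagged {kid[0]}, expected C")
    if any(is_sem(kid) and kid[0] == "C" for kid in kids):
        # Indices of W-phrases in Spec(CP) and of *T* traces below the CP.
        wh = [m.group(1) for kid in kids
              if (m := re.match(r"W[A-Z]+-(\d+)$", kid[0]))]
        traces = {m.group(1) for leaf in leaves(tree)
                  if (m := re.match(r"\*T\*-(\d+)$", leaf))}
        if not any(index in traces for index in wh):
            problems.append("no coindexed gap")
        if (tree[0].startswith("CP-CMP") and parent is not None
                and parent[0].startswith("PP")):
            problems.append("SEM comparative introduced by P")
    for kid in kids:
        problems += check_sem(kid, tree)
    return problems

GOOD = """
(ADJP (ADJ-N jafnbjört-jafnbjartur)
      (CP-CMP (WADJP-2 0)
              (C sem-sem)
              (IP-SUB (ADJP *T*-2)
                      (ADVP-TMP (ADV áður-áður)))))
"""
# Deliberately wrong: P head above the CP-CMP and no trace for WADJP-2.
BAD = """
(PP (P en-en)
    (CP-CMP (WADJP-2 0)
            (C sem-sem)
            (IP-SUB (NP-SBJ (PRO-N hún-hún)))))
"""

print(check_sem(parse(GOOD)))  # []
print(check_sem(parse(BAD)))
```

The gap check is applied to every clause with SEM in C, including relative clauses, while the check for an introducing P applies only to CP-CMP.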

SEM comparative with partly reconstructed clausal structure:

   (NP (D-A þá-sá)
       (N-A mállýsku-mállýska)
       (, ,-,)
       (CP-REL (WNP-5 0)
	       (C ER-ER)
	       (IP-SUB-6 (NP-OB1 *T*-5)
			 (NP-SBJ (PRO-N ÉR-ÞÚ))
			 (MDDI kunnuð-kunna)
			 (ADVP (ADV JAMT-JAFNT)
			       (CP-CMP *ICH*-9))
			 (VB skilja-skilja)
			 (IP-SUB-PRN=6 (CONJ og-og)
				       (PP (P um-um)
					   (IP-INF (TO að-að) (VB mæla-mæla))))
			 (CP-CMP-9 (WADVP-8 0)
				   (C sem-sem)
				   (IP-SUB (ADVP *T*-8)
					   (NP-SBJ (PRO-N vér-ég))))))))))))))))
  (ADVP-TMP-RSP (ADV þá-þá))
  (BEPI er-vera)
  (NP-SBJ (PRO-N hún-hún))
  (ADJP (ADJ-N jafnbjört-jafnbjartur)
	(CP-CMP (WADJP-2 0)
		(C sem-sem)
		(IP-SUB (ADJP *T*-2)
			(ADVP-TMP (ADV áður-áður)))))

Left-dislocated SVO SEM clause:

( (IP-MAT (ADVP (ADV Nú-nú))
	  (ADVP-LFD (ADVR svo-svo)
		    (CP-CMP (WADVP-1 0)
			    (C sem-sem)
			    (IP-SUB (ADVP *T*-1)
				    (NP-SBJ (Q-N allir-allur)
					    (NS-N hlutir-hlutur)
					    (NP-PRN *ICH*-2))
				    (RDDI urðu-verða)
				    (PP (P að-að)
					(NP (ADJ-D helgum-helgur) (NS-D dómum-dómur)))
				    (, ,-,)
				    (NP-PRN-2 (PRO-N þeir-hann)
					      (CP-REL (WNP-3 0)
						      (C ER-ER)
						      (IP-SUB (NP-SBJ *T*-3)
							      (NP-11 (NPR-D Drottni-drottinn)
								     (NP-POS (PRO-D ÓRUM-VOR)))
							      (BEDI VÓRU-VERA)
							      (ADJP (ADJS-N NÁLÆGSTIR-NÁLÆGUR)
								    (NP *ICH*-11)))))
				    (, ,-,)
				    (CP-ADV (WADVP-12 0)
					    (C sem-sem)
					    (IP-SUB (ADVP *T*-12)
						    (NP-SBJ (NP (N-N ETAN-JATA)
								(, ,-,)
								(CP-REL (WNP-5 0)
									(C ER-ER)
									(IP-SUB (NP-SBJ (PRO-N hann-hann))
										(BEDI var-vera)
										(PP (P í-í)
										    (NP *T*-5))
										(VAN lagður-leggja)
										(, ,-,)
										(ADVP-TMP (ADV þá-þá)
											  (CP-ADV (WADVP-6 0)
												  (C ER-ER)
												  (IP-SUB (ADVP-TMP *T*-6)
													  (NP-SBJ (PRO-N hann-hann))
													  (BEDI var-vera)
													  (NP-PRD (N-N barn-barn))))))))
							    (, ,-,)
							    (CONJP (CONJ eða-eða)
								   (NP (N-N klæði-klæði)))
							    (CONJP (CONJ eða-eða)
								   (NP (Q-N margir-margur) (NS-N hlutir-hlutur) (OTHERS-N aðrir-annar)))))))))
	  (, ,-,)
	  (NP-SBJ *exp*)
	  (ADVP-RSP (ADV þá-þá))
	  (MDPI má-mega)
          [...]

SVO SEM with stylistically fronted SVO:

  (CP-REL (WNP-1 0)
	  (C ER-ER)
	  (IP-SUB (NP-SBJ *T*-1)
		  (ADVP-2 (ADVR svo-svo))
		  (BEDS væri-vera)
		  (ADJP (ADVP *ICH*-2)
			(ADJ-N hreinlíf-hreinlíf)
			(CP-CMP (WADJP-4 0)
				(C SEM-SEM)
				(IP-SUB (ADJP *T*-4)
					(NP-SBJ (N-N líkneski-líkneski))
					(RDPI verður-verða)
					(PP (P með-með)
					    (NP (D-D þeim-sá)
						(N-D lit-litur)
						(CONJP (CONJ og-og)
						       (NX (N-D geisla-geisli)))
						(, ,-,)
						(CP-REL (WNP-3 0)
							(C sem-sem)
							(IP-SUB (NP-SBJ *T*-3)
								(PP (P í-í)
								    (ADVP (ADV gegnum-gegnum)))
								(VBPI skín-skína)))))))))))))