Date: Mon, 24 May 2004 12:02:18 -0400
Subject: some query language cases
To: Steven Bird
From: Beatrice Santorini

here are some examples of query language requirements that have come up
in my work on the emodeng corpus so far. i'll let you know of more as
soon as i can; it may be the end of the week before our workstations are
set up.

A. add/delete brackets; i.e. go from the (a) cases to the (b) cases, or
vice versa. this is presumably easy to do (at least adding the np part,
perhaps not the -sbj), but it would be quite useful because it is needed
a lot.

(1) a. (cp-frl what i want) is not so easy to get.
    b. (np-sbj (cp-frl what i want)) is not so easy to get.

B. adding/deleting elements in easily identifiable contexts.
in the diachronic english corpora, we analyze indirect and direct
questions as in (2) and (3). (2b) contains a null complementizer by
analogy to (2a), which is grammatical in earlier forms of english.

(2) a. i don't know [cp-que [wadvp why] [c that] [ip-sub he did that ] ]    overt c(omplementizer)
    b. i don't know [cp-que [wadvp why] [c 0] [ip-sub he did that ] ]       null c(omplementizer)

(3) [cp-que [wadvp why] [ip-sub did he do that ] ]    never with overt c(omplementizer)

mistakes occur as in (4).

(4) a. i don't know [cp-que [wadvp why] [ip-sub he did that ] ]
                                        ^^^^^ add null c(omplementizer) as in (2b)
    b. [cp-que [wadvp why] [c 0] [ip-sub did he do that ] ]
                           ^^^^^ delete null c(omplementizer) as in (3)

C. small clauses - vanilla.
some frameworks/corpora analyze a sentence like (5) as in (6a); others
as in (6b).

(5) we saw smarty jones win.

(6) a. we saw [np-obj smarty jones] [vb win] .         monoclausal analysis
    b. we saw [s [np-sbj smarty jones] [vb win] ] .    small clause analysis

it would be useful to go from (6a) to (6b), as well as the other way
around. of course, from (6b) to (6a) is more difficult.

D. small clauses - a bit more complex.
in the vanilla case, the small clause predicate is a verb.
in other cases, the predicate is a whole phrase - say, a prepositional
phrase. one might want to rebracket (7a) as (7b) (or vice versa).

(7) a. we consider [np-obj smarty jones] [pp in the running].         monoclausal analysis
    b. we consider [s [np-sbj smarty jones] [pp in the running] ].    small clause analysis

however, one would not usually want to rebracket (8a) as (8b).

(8) a. we put [np-obj the cats] [pp in the carrier].         monoclausal analysis
    b. we put [s [np-sbj the cats] [pp in the carrier] ].    small clause analysis - no!

so it would be nice to be able to distinguish cases like (7) and (8) on
the basis of the matrix predicate. in complicated cases like this one,
it might be best not to replace one structure by the other, but merely
to flag the cases for inspection.

E. conjunction.
in the current emodeng corpus, np conjunction is generally analyzed as
in (9a). to make the emodeng corpus compatible with the mideng corpus,
such structures will have to be rebracketed as (9b). of course, one
might want to go from (b) to (a) as well.

(9) a. (np (np the first conjunct)
           (conjp (np second conjunct))
           (conjp (conj and)
                  (np third conjunct)))
    b. (np (np the first conjunct)
           (conjp (nx second conjunct))    determiner-less np replaced by nx
           (conjp (conj and)               (nx = n' = intermediate between np and n)
                  (nx third conjunct)))

note that (9) must be distinguished from (10), where the non-first
conjuncts are unequivocally full np's with determiners or possessives
of their own.

(10) (np (np my first conjunct)
         (conjp (np your second conjunct))
                    ^^^^
         (conjp (conj and)
                (np the third conjunct)))
                    ^^^

(11) shows a mixed case. again, one might want to go from (a) to (b) or
vice versa.

(11) a. (np (np my first conjunct)
            (conjp (conj and)
                   (np (np your second conjunct)
                   ^^^^^^^
                       (conjp (conj and)
                              (np third conjunct)))))
                              ^^^^
     b.
        (np (np my first conjunct)
            (conjp (conj and)
                   (np (np your second conjunct)
                   ^^^^^^ np stays because of possessive
                       (conjp (conj and)
                              (nx third conjunct)))))
                              ^^^^ np replaced by nx because no det/poss