DASL Project: Plan and Progress
1) Establish a process team to investigate the sociolinguistic annotation project and develop the team charge
2) Identify an appropriate sociolinguistic variable for annotation and analysis
3) Develop a coding scheme (annotation specification)
- Coding scheme developed (following Guy, 1980, with some modifications)
4) Modify LDC-Online to
allow for easy searching of the corpus/corpora, easy audio playback of
examples, and coding/annotation of relevant tokens. Additional modifications
must allow for easy exporting of the coding string/annotations to an external
program, and the inclusion of speaker demographics within the coding string.
5) Coordinate with part-time
annotation staff to complete annotation. This involves training,
annotation and quality assurance.
- Annotation is currently in progress
- TIMIT corpus - annotation complete as of 9/22/2000
  Results from TIMIT annotation; includes VARBRUL analysis
  Dual annotation of ~5% of TIMIT corpus currently underway (11/2000)
- Annotation of Switchboard corpus in progress as of 1/2001
- Annotation of additional corpora (CallHome, Broadcast News) to follow
6) Produce documentation of
the corpus creation effort: annotation guide, tools & resources, QA
efforts, results, comparison with other studies of the variable; create
a website containing corpus documentation and results.
- Website with project documentation established
7) Publicize the project within the sociolinguistics community and solicit feedback from sociolinguists.
- Members of the DASL Project will attend NWAVE 2000