I hope the participants in my workshop at JALT will benefit from insights into, and hands-on experience of, dictionary compilation. I shall be talking very much as a practitioner rather than as a theorist. The Advanced Learner's is only a medium-sized dictionary, but in preparing it I have had, nevertheless, to wrestle with the individual complexities of a large subset of the lexicon. And I have read the dictionary from cover to cover (a good read, if rather disjointed!).
I think of the process of creating dictionary entries as being in distinct stages. Perhaps these stages can be seen as similar to those needed for the preparation of a case in a court of law. First the evidence is marshalled, then it has to be sifted and interpreted, then ordered and presented. And the strongest case may fail to convince if it is poorly presented.
Evidence in a court of law may be patchy and unreliable. Dictionary writers, in contrast, have benefited over the last decade or so from the availability of the large language corpora that can supply them with hard evidence of the most convincing kind. They now have objective information to help them make authoritative statements on frequency and collocation, and on meaning as it is revealed through context. Having worked on dictionaries both pre- and post-corpus, I know the value of corpus evidence cannot be disputed. How to use this evidence is also far less problematic than, say, the question of how evidence from corpora, and especially spoken corpora, should be integrated into coursebook materials and classroom teaching. The corpus reveals facts about the language that were not accessible before. This point has to be stressed: previously, these facts were simply not accessible. Thinking harder or thinking better did not help. No native speaker of English, for example, can tell you 'off the top of their head' whether 'someone' or 'somebody' is more frequent in written English. (In fact, 'someone' is about five times more frequent in the British National Corpus.)
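As a rough illustration of the kind of frequency comparison described above, here is a minimal sketch in Python. It assumes the corpus is available as plain text; the sample sentence is invented for the illustration and is not corpus data, and the function names word_counts and frequency_ratio are hypothetical. A real corpus query would also need proper tokenization and a far larger body of text.

```python
import re
from collections import Counter


def word_counts(text):
    """Lower-case the text and count each word form."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def frequency_ratio(counts, word_a, word_b):
    """Return how many times more frequent word_a is than word_b.

    Returns None when word_b does not occur, since the ratio is undefined.
    """
    if counts[word_b] == 0:
        return None
    return counts[word_a] / counts[word_b]


# Invented toy text standing in for a corpus (an assumption, not real data).
sample = (
    "Someone left a message. Did someone call? "
    "Somebody must know. I think someone saw it."
)

counts = word_counts(sample)
print(counts["someone"], counts["somebody"])
print(frequency_ratio(counts, "someone", "somebody"))
```

The point of the sketch is only that the answer comes from counting, not from intuition: the same two functions applied to a large written corpus would give the kind of figure quoted above.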
A lawyer working on a case must construct an interpretation of the evidence that is to the advantage of his or her client. Similarly, a good lexicographer is working to produce a version of the facts that is appropriate for a particular identified audience. Several factors will influence the selection of material -- is the dictionary aimed at learners or native speakers of a language, at beginners or advanced students, at specialists or non-specialists? There will be different 'truths' for each. And the corpus evidence may be adapted in order to increase its usefulness to the intended audience. For example, I would defend, and indeed encourage, the use of 'pedagogical' examples, thought up by the lexicographer, where these best illustrate a grammatical point.
The same case presented by different lawyers may not be equally convincing. Not all dictionary entries are equally useful, even if they are based on the same corpus evidence and interpretation. For example, the defining language or style may be inappropriate, or the grammatical information may be presented in a way which baffles rather than illuminates. The organization on the page (or computer screen), even the typographical specification, may facilitate or hinder the users' reception of the content.
There are many questions which preoccupy me as I think ahead to new projects. Electronic dictionaries will finally free lexicographers from the obsession with space and the need to conserve it. But will this necessarily mean better dictionaries? Is there not a case for 'less being more'? For example, with corpus evidence we can say a great deal about -ed adjectives and -ing adjectives and nouns. Do we want to? Or rather, are the interests of the learner served by our doing so? Are there things we should be leaving out of our dictionaries, rather than aiming to put more in? And, most importantly, do we know enough about our users and their reference skills and needs? Have we thought enough about what experience of the world they bring to their use of the dictionary, and do we know how to construct our entries accordingly? I hope the workshop at JALT will be a forum for raising these and many other questions, including the consideration of what role dictionaries have in classroom teaching.
Corpus evidence of language in use needs sorting and interpretation before it can form part of a dictionary entry. The presentation of information is all-important. There are still many questions about what it is appropriate to include in learners' dictionaries, and these will be raised at the workshop.