How the MED was Written
Creating a completely new dictionary is a great opportunity. It's a chance to ask fundamental questions.
Writing a new dictionary is a great challenge, too: there are plenty of good learners' dictionaries around, so we knew that there was no point in simply producing another "me-too" book. There are three things you need in order to create a really good dictionary: a good corpus, well-thought-through principles, and skilled lexicographers.
Each component is vital: you may have the best corpus and the best lexicographers in the world, but you won't produce a good dictionary unless you apply well-thought-through principles to the complex process of analyzing corpus data and converting it into useful, relevant, and easy-to-use dictionary text. Let's look at these three aspects in turn.

Like all good dictionaries, the MED is based on a large corpus. Our editors had access to around 200 million words of authentic English, representing a wide range of text-types (including novels, newspapers, academic writing, and recorded conversation) from all the main varieties of English. Analyzing a corpus is normally done with "concordancing" software, and this approach was used by our lexicographers too. Concordances help us to discover the most typical ways in which words behave and combine with other words.

But lexicographers are facing "information overload": nowadays, when we search our corpus for a common English word (such as agree, bright, or consequence) we have access to thousands (sometimes tens of thousands) of instances of the word. Since it is not really possible to analyze so much data reliably for a single word, we need new programs that will do some of the work for us. The MED's editorial team were fortunate in having access to the most advanced language-analysis software currently available: a "lexical profiling" program that searches the corpus and compiles a summary (or "Word Sketch") of the key features of a word's behaviour. The resulting profiles were supplied to our team through a collaboration with the University of Brighton's Information Technology Research Institute, an internationally known centre for computational linguistics.

This combination of corpus data and state-of-the-art software has given us the basis for an unrivalled description of the way words behave and combine with one another.
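To make the idea of concordancing and profiling a little more concrete, here is a minimal sketch in Python. It is not the software used on the MED project: the toy corpus, the function names (`tokenize`, `concordance`, `collocates`), and the window sizes are all assumptions made for illustration. It shows a simple keyword-in-context listing and a raw count of neighbouring words, which is only a crude stand-in for a real lexical profile.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, keyword, width=3):
    """Return keyword-in-context lines: up to `width` tokens either side of each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def collocates(tokens, keyword, window=2):
    """Count the words that appear within `window` tokens of the keyword."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


# Toy corpus invented for illustration (an assumption, not MED data).
corpus = (
    "They agree with the proposal. We agree with you entirely. "
    "The two sides could not agree on a price. I agree with that."
)
tokens = tokenize(corpus)

for line in concordance(tokens, "agree"):
    print(line)

# The most frequent neighbours hint at typical patterns such as "agree with".
print(collocates(tokens, "agree").most_common(3))
```

With four instances the concordance is easy to read by eye; with tens of thousands it is not, which is why a summary of the most frequent neighbours becomes useful. A real profiling program would go much further than this raw count, for example by grouping collocates according to their grammatical relation to the word.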
Reliable word frequency data underpins decisions about which words, meanings, and phrases to include in the dictionary and about the order in which information is supplied in the more complex entries. And the Word Sketches have enabled us to provide a uniquely rich account of collocations in English: essential collocates are shown (as in all good learners' dictionaries) in the body of the dictionary entry, but the MED also lists thousands of strong collocates in its 450 special "Collocation Boxes".

We knew that we had two major tasks if we were going to produce an even better dictionary. First, we would have to do all the "ordinary" things extraordinarily well: that is, provide really good, easy-to-use definitions, example sentences that are both natural and pedagogically useful, and an account of syntactic and collocational behaviour appropriate to learners at the advanced level. But secondly, we knew we would have to give users much more than this, by going into areas where dictionaries had not yet ventured.

The starting point for the MED team was the belief that "the customer is always right", or to put it another way: if students go to their dictionary and they can't find and use the information they need, then it is not the students' fault but the dictionary's fault. Arising from this fundamental belief come a number of guiding principles:
What makes the MED really different and, we hope, uniquely useful for learners is this clear distinction between productive and receptive information types. The words in the dictionary are clearly divided into two main classes: core vocabulary and more peripheral items. To elaborate:

Core (or productive) vocabulary
These words are shown in red in the dictionary, and each headword gets a frequency rating: three stars indicate a word of very high frequency, in the top 2500 most common words in English (e.g. big, place, or change); two stars mean the word is still very common but slightly less central than a three-star word (e.g. confident, favourite, or grasp); and one star indicates a word of medium frequency that is nevertheless worth learning (e.g. confidential, feasible, or gamble).

Non-core (or receptive) vocabulary
The core vocabulary items (the "red" words) are given detailed treatment, with these main characteristics:
The entries for non-core (or "black") words aim to give learners the information they need and no more. Typical black-word entries provide help with:
These black entries are usually very short, and this has two great benefits for the learner: first, we can include a lot more vocabulary (so students' chances of finding the words they need are much higher), and secondly, the entries are easy and quick to use, allowing users to spend as little time as possible in the dictionary.

In the days of Samuel Johnson, dictionaries were often compiled by a single writer. Nowadays, producing a new dictionary is a highly complex operation that can involve literally hundreds of people. The MED is the work of a skilled and creative editorial team, backed up by proven expertise in project management and IT resourcing, and benefiting from the advice of academics, teachers, and language learners from all over the world.
Finally, a few words about how the dictionary was actually compiled. Writing the MED took a little over three years. The process can be divided into four main parts:
Research and development: during the first three months of the project, a small group of experienced editors developed and tested ideas, with the objective of finding better ways of meeting students' reference needs.

Piloting: next, we started producing samples of text to test the feasibility of the ideas we had developed, and shared these with our advisors and with selected groups of teachers and learners in several different countries.

Writing and editing: this is the main stage in any dictionary project, and usually involves dozens of people in several different roles. On the MED project there were several innovations in the way we worked:
(1) We used a separate team to write entries for the 7500 core vocabulary items. These are usually the most complex words to describe, and require high levels of lexicographic skill.
(2) We also had a small team (just three people) working separately on the "function words": the 200 or so basic grammatical words in English (words like out, of, but, and would).
(3) We used a set of about 75 "Template" entries to help our geographically dispersed team of editors achieve a high degree of consistency when writing entries for words in specific categories (such as Animals, Trees, Body Parts, and Illnesses).
(4) Most importantly, for every stage and every component of the project, there was a British English phase and an American English phase: dictionary text went back and forth across the Atlantic repeatedly, so that a "dual-track" database was gradually built up as the project progressed.

Finalization: this is the stage where all the components of the dictionary are brought together: not just the A-Z text, but also the various usage notes, the Collocation Boxes, the unique boxes on metaphor and academic writing skills, the Language Awareness pages written by academic experts, and the hundreds of illustrations and cartoons. All of these elements had to be moulded together by Bloomsbury's IT team, and the dictionary was at last ready for the printers.