Corpus Design & Area Coverage

 

For more information, see the User's Guide to FRED.


Dialect areas

FRED consists of 370 texts, which total c. 2.5 million words of text or c. 300 hours of speech. This excludes interviewer utterances. FRED covers nine major dialect areas and a multitude of locations:

 

dialect area              size (in words)   % of total   # of informants
Southwest (SW)                    588,864        23.6%               112
Southeast (SE)                    624,431        25.0%                56
Midlands (Mid)                    351,284        14.1%                58
North (N)                         487,477        19.5%                76
Scottish Lowlands (ScL)           175,819         7.1%                55
Scottish Highlands (ScH)           23,872         1.0%                 7
Hebrides (Heb)                    142,682         5.7%                53
Isle of Man (Man)                  10,461         0.4%                 2
Wales (Wal)                        88,755         3.6%                13
TOTAL                           2,493,645                            432
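The percentage column can be recomputed from the word counts. The following is a minimal Python sketch, not part of the FRED distribution: the dictionary `AREA_WORDS` and the helper `area_shares` are hypothetical names introduced here, and the figures are copied from the dialect-area table.

```python
# Word counts per dialect area, copied from the dialect-area table.
# AREA_WORDS and area_shares are illustrative names, not part of FRED.
AREA_WORDS = {
    "SW": 588_864,
    "SE": 624_431,
    "Mid": 351_284,
    "N": 487_477,
    "ScL": 175_819,
    "ScH": 23_872,
    "Heb": 142_682,
    "Man": 10_461,
    "Wal": 88_755,
}


def area_shares(words):
    """Return each area's share of the corpus in percent, rounded to one decimal."""
    total = sum(words.values())
    return {area: round(100 * count / total, 1) for area, count in words.items()}


total_words = sum(AREA_WORDS.values())
shares = area_shares(AREA_WORDS)

# Print one line per area: code, word count, share of the corpus.
for area, pct in shares.items():
    print(f"{area:<4}{AREA_WORDS[area]:>10,}{pct:>7.1f}%")
print(f"{'all':<4}{total_words:>10,}")
```

The same approach applies to any of the word-count tables on this page, provided the corpus total is taken from the TOTAL row.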

 

 


Counties

Each dialect area is subdivided into counties. A detailed breakdown by county is given below:

county                    size (in words)   % of corpus total
Devon (DEV)                        92,107                3.7%
Wiltshire (WIL)                   176,028                7.1%
Somerset (SOM)                    204,239                8.2%
Oxfordshire (OXF)                  14,043                0.6%
Cornwall (CON)                    102,447                4.1%
Suffolk (SFK)                     312,414               12.5%
Kent (KEN)                        174,420                7.0%
Middlesex (MDX)                    31,048                1.2%
London (LND)                      106,549                4.3%
Leicestershire (LEI)                8,311                0.3%
Nottingham (NTT)                  160,617                6.4%
Shropshire (SAL)                  174,180                7.0%
Warwickshire (WAR)                  8,176                0.3%
Ambleside (WES)                   149,747                6.0%
Newcastle (NBL)                    28,432                1.1%
Middlesbrough (DUR)                26,602                1.1%
York (YKS)                         87,585                3.5%
Lancashire (LAN)                  195,111                7.8%
Angus (ANS)                        17,759                0.7%
Banffshire (BAN)                    5,468                0.2%
Dumfriesshire (DFS)                 9,282                0.4%
East Lothian (ELN)                 36,032                1.4%
Fife (FIF)                          3,465                0.1%
Kincardineshire (KCD)               7,092                0.3%
Kinross-shire (KRS)                 2,218                0.1%
Lanarkshire (LKS)                   3,599                0.1%
Midlothian (MLN)                   30,123                1.2%
Peeblesshire (PEE)                 14,584                0.6%
Perthshire (PER)                   20,545                0.8%
Selkirk (SEL)                       9,032                0.4%
West Lothian (WLN)                 16,620                0.7%
Ross and Cromarty (ROC)            10,131                0.4%
Sutherland (SUT)                   10,615                0.4%
Inverness-shire (INV)               3,126                0.1%
Hebrides (HEB)                    142,682                5.7%
Denbighshire (DEN)                 37,284                1.5%
Glamorganshire (GLA)               51,471                2.1%
Isle of Man (IOM)                  10,461                0.4%
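The county figures can be checked against the dialect-area totals by summing the counties of each area. The sketch below does this in Python for the Southwest and the Southeast. The county-to-area grouping in `ASSUMED_AREA` is an assumption: it is inferred from the order of the county table and from the matching totals, and is not stated in the text. All names in the block are hypothetical.

```python
# County word counts copied from the county table (codes as given there).
COUNTY_WORDS = {
    "DEV": 92_107, "WIL": 176_028, "SOM": 204_239, "OXF": 14_043, "CON": 102_447,
    "SFK": 312_414, "KEN": 174_420, "MDX": 31_048, "LND": 106_549,
}

# ASSUMPTION: this grouping is inferred from the table order and the matching
# totals; the page does not state which counties belong to which area.
ASSUMED_AREA = {
    "SW": ["DEV", "WIL", "SOM", "OXF", "CON"],
    "SE": ["SFK", "KEN", "MDX", "LND"],
}

# Area totals copied from the dialect-area table.
AREA_TOTALS = {"SW": 588_864, "SE": 624_431}


def area_sum(area):
    """Sum the word counts of all counties assumed to belong to `area`."""
    return sum(COUNTY_WORDS[code] for code in ASSUMED_AREA[area])


for area, expected in AREA_TOTALS.items():
    status = "matches" if area_sum(area) == expected else "differs from"
    print(f"{area}: county sum {area_sum(area):,} {status} area total {expected:,}")
```

Under this grouping both county sums reproduce the corresponding area totals exactly.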

 


Informants

FRED contains data from 432 different informants (excluding interviewers). The majority of these speakers are NORMs (non-mobile old rural males) who typically grew up before World War I. Most of the material included in FRED stems from oral history projects in which informants were interviewed to record their life memories. To remedy the linguistic shortcomings of the original oral history material, the transcripts were carefully edited, or re-transcribed where necessary, before being included in the corpus.

Of FRED's 432 informants, 275 (63.7%) are male and 132 (30.6%) are female; sex is unknown for the remaining 25. In all, 76.8% of the textual material in FRED is produced by male speakers and 21.7% by female speakers.

The age of the informants included in FRED ranges from 6 to 102 years. The mean age is 75 years. The table below breaks the informants down into age groups and shows the amount of text produced by each group. About three quarters of the textual material in FRED is produced by speakers aged 60 or older.

age group      # of speakers   # of words produced   % of textual material in corpus produced
0 - 14 years               9                 7,501                                      0.3%
15 - 24 years             14                30,336                                      1.2%
25 - 34 years              2                 5,312                                      0.2%
35 - 44 years              4                10,633                                      0.4%
45 - 59 years             17               108,472                                      4.3%
60+ years                237             1,862,618                                     74.6%
unknown                  149               470,968                                     18.9%

 

The oldest of FRED's informants was born in 1877. Overall, 13 informants (3.0%) were born between 1880 and 1889, 60 informants (13.9%) between 1890 and 1899, 100 informants (23.1%) between 1900 and 1909, and 64 informants (14.8%) between 1910 and 1919. This means that 238 of the 272 informants whose date of birth is known (87.5%) were born before 1920. The table below gives a breakdown by date of birth, together with the share of corpus material produced by each group.

 

date of birth  # of speakers   % of speakers   % of corpus material produced
1870-1879                  1            0.2%                            0.3%
1880-1889                 13            3.0%                            5.3%
1890-1899                 60           13.9%                           18.7%
1900-1909                100           23.1%                           32.9%
1910-1919                 64           14.8%                           18.6%
1920-1929                 24            5.6%                            7.0%
1930-1939                  5            1.2%                            1.3%
1940-1949                  3            0.7%                            0.4%
1950-1959                  2            0.5%                            0.2%
total                    272           63.0%                           84.5%
unknown                  160           37.0%                           15.5%
GRAND TOTAL              432          100.0%                          100.0%

 

A separate table provides information on each speaker in the corpus.

 


Recordings

The material included in FRED was recorded between 1968 and 2000. A detailed breakdown of recording dates is given in the table below.

 

recording date   # of speakers   % of all speakers
1961 - 1969                  2                0.5%
1970 - 1979                123               28.5%
1980 - 1989                175               40.5%
1990 - 1999                 59               13.7%
2000                         2                0.5%
unknown                     71               16.4%

 

Last updated: 07/23/05