|A female Chaetorellia jaceae, a tephritid fly whose larvae feed on Knapweed.|
Photo: (c) mausboam, Flickr.
The usual hope expressed by people doing this type of analysis with OSM data is that, by better understanding these contributions, we can increase the number of people who continue to contribute after the initial sign-up and first edit.
My perspective is slightly different, because it is coloured by knowledge of the much longer history of biological recording.
In Britain this can roughly be dated to the 17th century, and John Ray's publication of a flora of the Cambridge area. In the early days compilation of records relied on exchanges of letters, but by the 18th century the collation of data had grown to the extent that for many counties it was possible to produce a flora for the local area. For instance, in Nottinghamshire the first flora was produced by Deering in 1738, followed by a second by Ordoyno in 1807. (In fact there was another written in the 1830s, followed by a gap of over a hundred years until the last county flora was written by the Howitts in 1963.) The sheer scope of this literature can be seen in Tim Rich's digitised copy of Simpson's 900-odd-page A Bibliographical Index of the British Flora (see the BSBI website). Here's a small extract of works which mention aspects of the Nottinghamshire flora from around 1750-1825:
(Nottinghamshire) A catalogue of plants ... about Loughborough; R. Pulteney 1747; Manuscript Leicester Museum 1749, and library Linnean Society
Nottingham; Historical account of the town of, C. Deering 1751, 90.
(Nottinghamshire) An account of some of the more rare English plants observed in Leicestershire; R. Pulteney Philosophical Transactions XLIX 2 (1757) 803.
(Nottingham) A catalogue of some of the more rare plants found in the neighbourhood of Leicester, Loughborough and in Charley Forest; R. Pulteney Philosophical Transactions XLIX (1757) 803, 866; and in 'The history ... of Leicester', I. Nichols I (1795) clxxvii.
Nottinghamshire. Plantae Cantabrigienses; T. Martyn 1763, 83, [from Deering's Catalogue].
Nottinghamshire The present state of all nations; T. Smollett II (1768) 408.
Nottingham. Description of England and Wales; [Society of Gentlemen] VII (1769) 135.
Nottingham. The complete English traveller; N. Spencer 1771, 495; 1773, 495.
(Nottinghamshire) Manuscript notes in Ray's Synopsis iii; J. Lightfoot [died 1788] & J. Hill [died 1775] library Botany Department Oxford.
Nottingham; Topographical and statistical description of the county of, G. A. Cooke [c. 1802-10] 121.
Nottinghamshire. Botanist's guide; D. Turner & L. W. Dillwyn II (1805) 482.
(Nottinghamshire) Midland flora; T. Purton 1817 2 volumes; appendix, parts 1, 2 & 3 (in 2 parts) 1821.
Nottinghamshire. The scientific tourist in England ...; T. Walford II (1818).
Nottinghamshire. The new British traveller; J. Dugdale IV (1819) 6.
Nottinghamshire; Botanical calendar for, T. Jowett 1826. By "Il Rosajo" in local paper*.
Perhaps my favourite quote from this period is about the Maiden Pink, Dianthus deltoides, for instance in an English-language edition of Camden's Britannia around 1722:
John Ray says, "I find this to be the same pink which groweth so plentifully by the road side on the sandy hill you ascend going from Lenton to Nottingham." Catalogus Plantarum 2 ed., 1677. p. 57.
|Maiden Pink in Nottingham,|
sadly not native but an escape from a green roof.
Photo: copyright the author
Ray's correspondence has more (gruesome) detail:
These details just emphasise how much has changed: the sandy hill between Lenton & Nottingham is a four-lane road, Derby Road. This Mapillary sequence is roughly in the relevant location.
Indeed the area was long known as Lenton Sands, although this name is, I think, falling into disuse.
I wasn't sure where the gallows were located, but a quick web search shows that they were close to what is now the junction of Forest and Mansfield Roads.
Another evocative historic plant location is Nottingham Castle Rock. This is the type locality for the Nottingham Catchfly, Silene nutans, but it is long since extinct in the area. However, at the foot of the rock is another plant location known since John Ray's time: here Alexanders, Smyrnium olusatrum, still grows in a small patch along Peveril Drive.
|Foot of Castle Rock, Nottingham.|
The gates are the old entrance gates to The Park Estate.
The green plants behind the left-hand gate are Alexanders.
It's not just plants either: a friend, David Brown, runs a regular field course in Scotland called Special Spring Moths. He takes the participants to locations which have been known for at least 170 years to see such exotica as the Rannoch Sprawler and the Rannoch Brindled Beauty.
|A Rannoch Sprawler in Poland, a rather better photo than mine of a Scottish moth.|
By Adam Furlepa CC BY-SA 4.0, via Wikimedia Commons
Until the mid-20th century the principal means of record keeping were personal card indexes (David Brown still does things this way). Collation of records over a particular area would be entrusted to a particularly knowledgeable and enthusiastic individual. The ideal was that these records would be periodically consolidated and published, either as a journal article, or, for larger groups, as a book.
In 1964 this changed when the Biological Records Centre was set up and records started to be computerised. Slightly earlier, in 1962, the first Plant Atlas of Britain & Ireland was published. I think this was the first major publication to organise records on the basis of Ordnance Survey grid squares, which had only appeared relatively recently on consumer, rather than military, map products (the New Popular Edition). Since then major atlases have appeared for a number of groups, with birds and plants each having several editions.
For these better-known groups, atlas data pertains to surveys carried out in a defined period. For instance, the last Bird Atlas surveys ran from 2007 to 2011, and the next plant atlas recording period runs until 2020. For most insects (the exceptions being Butterflies and Dragonflies) there are just not enough people interested, or with the specialist knowledge, for such an effort. For these groups, all known records, however old, need to be used. There are currently around 100 beetles known from Nottinghamshire for which the last known sighting was pre-1916. But they can be re-found, as was the nationally rare Hazel Pot Beetle, Cryptocephalus coryli, by Trevor & Dilys Pendleton in 2008. So these older records can still be totally relevant today.
Now to return to the long tail phenomenon.
The point about the various BRC Atlas initiatives is that, in addition to all the information about plants and animals, they also form a range of large and valuable datasets for understanding aspects of how people contribute data in this type of undertaking. (Of course they don't answer the WHY?, but the National Biodiversity Network has recently surveyed contributors, and answers there may be of interest to people concerned with similar issues with OpenStreetMap).
I, of course, don't have access to any of these large datasets at a granularity at which one can ask questions about relative frequency of contributions. However, I have contributed to a small niche dataset, that for Tephritid flies (often called Picture-winged Flies, or Gall Flies, but more widely called Fruit Flies; unfortunately in Britain fruit fly usually means Drosophila melanogaster, which belongs to a different family). These are small but distinctive flies, most of which spend their larval phase feeding on fruits of various plants, some galling their hosts. They are common, but not well recorded.
|Tephritis bardanae on Arctium tomentosum, Puchberg-am-Schneeberg, Austria, 2011|
Taken the day before SotM-EU 2011.
|A male of Terellia tussilaginis on Arctium minus, Nottingham.|
Despite the name, these insects only feed on Burdocks.
|A page from Carr (1916) showing early records of Tephritidae in Nottinghamshire|
Not all of these records have found their way to the national scheme.
|Records / recorder GB & Ireland Tephritid Recording Scheme 2008|
Of course, and entirely as expected, the histogram shows an exponential decay of numbers of recorders against records. This is exactly like the graph which Marc created which set me off looking at these numbers. The extreme outlier, with nearly 2000 records, is, again as might be expected, Laurence Clemons. No-one devotes their leisure activity to running a recording scheme unless it is a passion. Also note that 2000 records implies considerably fewer than 100 a year: each record probably represents significant effort.
If I change the size of bins and exclude the more extreme outlier values, the graph looks remarkably the same! Below I show the graph with 50 bins for the visible range with an upper limit of 200 and then 50.
|The same graph as above, but for recorders with under 200 records|
|The same graph as above, but for recorders with under 50 records|
I'm in the 16 records bin.
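The re-binning described above (50 bins, with the visible range capped first at 200 and then at 50 records) can be sketched roughly as follows. This is only a minimal illustration in Python, not the tool actually used to draw the graphs: the function name `histogram` and the list `example_totals` are hypothetical, and the numbers in `example_totals` are made up for demonstration, not taken from the Tephritid Recording Scheme.

```python
def histogram(totals, n_bins, upper_limit):
    """Bin per-recorder record totals into n_bins equal-width bins.

    Only totals in the range 0 <= total < upper_limit are counted, so
    recorders at or above the limit (the extreme outliers) are dropped.
    Returns a list of (bin_start, bin_end, number_of_recorders) tuples.
    """
    bins = [0] * n_bins
    for total in totals:
        if 0 <= total < upper_limit:
            # Integer arithmetic avoids floating-point edge cases.
            bins[total * n_bins // upper_limit] += 1
    width = upper_limit / n_bins
    return [(i * width, (i + 1) * width, n) for i, n in enumerate(bins)]


# Made-up per-recorder totals (NOT real scheme data): many recorders with
# a handful of records, a few with more, and one extreme outlier.
example_totals = [1, 1, 1, 2, 2, 3, 5, 8, 16, 40, 120, 1900]

under_200 = histogram(example_totals, n_bins=50, upper_limit=200)
under_50 = histogram(example_totals, n_bins=50, upper_limit=50)
```

With an upper limit of 200 each bin is 4 records wide; with an upper limit of 50 each bin is a single record wide, so a recorder with 16 records lands in a bin of their own. Changing the limit only rescales the bins, which is why the overall shape barely changes.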
This is only one dataset. I'm pretty confident that the same patterns will be seen in other atlas datasets, and, as I stated at the outset, in other citizen science datasets. It's pretty much the same pattern we see when we look at OSM contributions.
For me the main point about this is to emphasise that there is much to learn from the long experience of Biological Recording. I think the idea that somehow we can do magical things to change the shape of the contribution curve is belied by these other data collection experiences. Indeed, the co-ordinator of the fantastically successful National Earthworm Recording scheme wrote recently that "I speak from experience when I say that teaching someone to identify a group does not make them record (and build up the necessary experience to become an expert)". Therefore we should accept that the shape of the curve is most likely a reality which we need to recognise.
This does not of course mean that we should cease in efforts to make OpenStreetMap look:
- important (for instance as Missing Maps & HOT have done most successfully);
- useful (Richard Fairhurst's 'driven by cyclist' idea);
- fun (it is, really!);
- interesting (that too);
- or just providing an excuse to get outside.
OSM is all of these things; it's also educative, a way to meet like-minded folk; a way of learning new skills (mainly in informatics, but also observational). Nor is this pattern a reason not to strive for a more diverse community of contributors. In fact, the very likelihood that mappers "are made not created" tells us that new contributors are as important as ever.
OSM is different in one other important way from biological records: there is a founder effect.
If I see a bird in one place today and the same species tomorrow, both are useful records. In OSM, if something is already mapped one can't contribute it again (although one can alter what already exists). Over time the really easy things get mapped, leaving those that are harder, more tedious or just less useful. This probably doesn't matter too much for the already engaged, but it does possibly put a limit on what a newcomer might feel able to do. However, to date, there is no sign of this happening: see Simon Poole's recent diary post on the subject.
On a more general level, as I said at the outset of this piece, I hope to see some much more detailed and meticulous academic research in the near future. There are now hundreds of different citizen science datasets which can be examined across a range of different domains and types of data acquisition.