Can crowd sourcing ever take the place of traditional datasets and collection types?
Ask this question of a variety of geospatially inclined people and I am certain you will get a wide range of passionate and differing responses. For those who may not be familiar with the term, crowd sourcing is the practice of using amateurs or the public at large to help provide information or data, something that advances in technology have made practical. Two examples you may have seen are the OpenStreetMap project and, to some extent, Wikipedia. In both, the general public is able to add content in an attempt to refine or expand upon what already exists.
Even Google Maps is beginning to use crowd sourcing; it recently announced that Google Map Maker will open its editing tools to everyone. One reason for this shift is to allow locals to add content that Google’s teams would have difficulty obtaining. A review process is in place to verify the added content, but essentially local individuals add content directly to the live maps.
Another good example of crowd sourcing is the Urban Forest Map project, which has been used in San Francisco to map the locations of trees all around the city and gather information about them. The program has met with great success, as the community has participated heavily and provided a large amount of information. This has saved the city a large amount of money, because the information is being collected by empowered citizens and traditional tree surveys have not needed to be conducted.
While these examples are great cases of getting the public involved and providing content, several concerns immediately rise to the surface and make the practice controversial. The three most common are accuracy, updates or maintenance, and documentation or metadata.
As geospatial professionals we understand the critical importance of data accuracy. Hours upon hours can be spent discussing differences in accuracy and precision as they relate to different geographic datasets. Many wonder how accurate a data set can be when it is created and edited by untrained members of the public. This can be argued from both points of view. Some argue that it is impossible for crowd sourced maps to be as accurate as data created by professionals. Others note that accuracy greatly increases because local knowledge is able to correct mistakes and add information that cannot be known through traditional methods. This points to one of the advantages of crowd sourcing: you have access to a large supply of peer reviewers with local knowledge who together are able to refine the information and its accuracy, while in traditional data collection constant revisions can be costly and impractical.
As in many discussions, the reality is probably somewhere in the middle and depends on what layer is being created and for what purpose. For some data, such as utilities, high accuracy is required because the data feeds into other projects and inaccuracies can cause many problems; here traditional data collection methods should be used over crowd sourcing. Other data sets are intended for basic navigation or public knowledge and don’t require a high level of accuracy, such as business or tree locations. For these you might expect crowd sourcing to deliver better accuracy, as local knowledge can offer information that would be difficult for a data collector to know, and increased peer review can refine it.
The timeliness of crowd sourced data can be difficult to determine unless the date is recorded when the data is originally collected. However, one important point to remember is that if the community is involved, changes at a local level are likely to be reported quickly by locals, whereas in the traditional update process it would take time for changes to be reflected.
Metadata and other documentation can be among the most difficult parts to complete when using crowd sourcing, as your public is made up of people from a variety of professions and technical levels. While you are likely to have some experts editing data, you are just as likely to have citizens contributing who do not have the same background or technical experience. With such a wide variety of potential editors, understanding who exactly is maintaining the data can be hard, whereas in traditional data collection you know exactly who created the data, how it was created, and any other details that are relevant to it.
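One way to picture the timeliness and metadata points above is a small record stored alongside every public edit. What follows is only a minimal sketch in Python, not the approach of any project mentioned here: the `EditRecord` class, its field names, and the `latest_edit` helper are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EditRecord:
    """Hypothetical per-edit metadata; every field name is an assumption."""
    feature_id: str    # which map feature was changed
    contributor: str   # who made the edit
    method: str        # how the data was obtained, e.g. "local knowledge"
    # When the edit was made; defaults to the current time in UTC.
    edited_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def latest_edit(records, feature_id):
    """Return the most recent edit for a feature, or None if there is none."""
    matching = [r for r in records if r.feature_id == feature_id]
    return max(matching, key=lambda r: r.edited_at, default=None)
```

With even this much captured at edit time, a data manager could answer who last touched a feature, how the information was gathered, and how current it is, which are the questions traditional collection answers by default.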
All of these factors need to be considered when determining whether crowd sourcing is the best option or whether traditional data collection remains superior. At present, both approaches appear to have many advantages and disadvantages. Deciding which to use really depends on the type of project and on the data that will be collected, edited, and maintained.