Crowdsourcing in National Mapping 2017

An International Workshop

Leuven, Belgium April 3rd and 4th 2017


Breakout Session #1 - Summaries of Day 1 Group Discussions

We are very grateful to those delegates who volunteered to act as rapporteurs and take notes during the Group Discussions. The summaries of the discussions were written and compiled purely out of community goodwill and are intended to represent the views and opinions expressed within the discussion groups. They are of high value for research discussions. The difficult task of rapporteuring in fast-paced discussions is acknowledged. Where possible we have left out specific names of people and companies/organisations.

Breakout Session #1 Group 1

Question: What are the needs of National Mapping and Cadaster Agencies (NMCA) from Volunteered Geographic Information (VGI)? What are the needs of industry from VGI?

Rapporteur: Anthony Simonofski

We chose to expand the initial question, which focuses only on NMCAs, to also cover industry and academia, because the needs might differ according to the perspective. The different needs can be found in bold in this report.

NMCA Perspective

Data quality is a key need. However, conceptions of and demands for quality may differ. There are clusters of quality (in some cases quality is more relevant than in others). NMCAs clearly need data that is modelled. That is why they offer “fields” (non-mandatory, but serving as checks) to build formal maps.

INSPIRE focuses more on compliance than on use. The NMCAs must also focus on potential use instead of compliance alone. That is an area of collaboration with users (through Open Data, so that the interaction is not only C2G). There is a double-feeding need in organizations. Thus, VGI is an opportunity to re-connect with the end-users (to determine their needs). They can also identify the users by e-mail and can thus blacklist persons who are not constructive.

An additional need would be to complete the data (e.g. tracks, speed limits, …). Non-terrestrial VGI should also be considered for the future. They try to engage fishermen to report their data.

The NMCAs need other sources of data collection because they won’t be able to offer the data in any other manner. They were able to provide data, but not really at a good quality level (e.g. forest data).

When you talk about collaboration, you can also talk about collaboration among levels and among organizations. We are one step removed from the users (sometimes our data is taken and we lose the connection). The question of the optimal level (federal? regional? local?) at which to enable VGI is also key.

An additional need would be signalling, so that organizations know when to update the map. There must be a reflection about the taxonomy of crowdsourcers. Sometimes, it is more helpful to use “specialists” (forest keepers to map forests instead of random citizens). A repository of citizen profiles with their interest in VGI would be helpful for the future. Furthermore, it is also necessary to identify good practices for collecting data. The advantage of VGI as a data collector is that it comes at a low cost. Furthermore, there must be a reflection about a sustainable business model that integrates open data and VGI. When there is less political and financial support, the current business model is not sustainable if we open the data.

A legal view on VGI regarding intellectual property rights (who is the owner?) is missing from research. Furthermore, a clear role structure (to find out who is responsible for what) is also necessary.

Industry Perspective

They want reports to state when a map is wrong, on all features, in order to have the best quality of data (sometimes they have a timestamp). Furthermore, there is also the question of how to stimulate citizen participation (for TomTom, reports are made manually, but there is a lead to do it by voice). They don’t know why people contribute because they don’t stimulate them.

They also need the reports to be as complete (in content) as possible (e.g. if users want to block a street, they can fill in the attributes). In that sense, a large volume of data from all over the world is essential. Furthermore, the velocity of data matters. Regarding their dependence on NMCA data, they have many sources of data and need to fuse them. These sources include super-users and normal users. They track people (anonymously). The provision of data as a product is not part of the business model of TomTom. API offering from NMCAs to industry.

Regarding the motivation of users: they have a map quality phone through which they can reach super-users. People are motivated by having good information and good representation. They don’t reward the users, but users can see the status of every report. Every year, they invite super-users from all over the world to the TomTom week-end. Are the motivations similar in NMCAs and in the private sector?

There is a need for more advanced crowdsourcing (e.g. mapping a polygon rather than a pointer).

NMCAs from Southern and Eastern Countries

In northern countries, people are free to go into more areas (e.g. forests). In the opinion of the NMCAs present, the situation is similar for southern countries. For eastern countries, the situation varies.

In Finland, when they introduced the platform, they built it to be so user-friendly that it was easy for users to engage. Sometimes, applications are difficult to use and less user-friendly and intuitive.

Breakout Session #1 Group 2

Question : In terms of collaboration - what are the needs of the VGI community from NMCAs?

Rapporteur: Bert Van Mele

In our discussion, VGI was to a large extent equated with OSM.
VGI needs are:

  • To work together, constructing a win-win situation.
  • Both have the same attitude and need for completeness and quality.
  • Transparency
    • Have a clear Feedback mechanism.
    • Depends on the project. Small projects don’t need complicated feedback systems.
  • Open data
    • Some NMCAs aren’t sure about the return on investment of open data, but experience suggests there is no need to worry. OSM uses data from authorities for quality assurance. Easily accessible data:
      • WFS to work with the data.
      • Properly documented and one URL
      • No need for account creation (no password, no registration)
      • Clear data license (Creative Commons license)
      • Have a good agreement on the license, more important for the NMCAs.
      • VGI wants guarantees that the created data will stay free.
      • Large difference in licenses between countries.
  • Logistical and financial support to get started. (It can be hard for individuals that want to start mapping.)
  • Data quality
    • Information about the quality, like labels.
    • What kind of quality is needed and what is the quality of the authoritative data?
    • Not easy for NMCAs: 50 percent of the workforce is needed for quality checks.
  • No need for a unique identifier or unique model in OSM
    • In OSM you know the past and the present but not what happened to construct the new situation.
  • Realization that the updates are important. Ex. to make sure an ambulance reaches its destination on time.
    • This was enough for kids to be motivated.
    • No need for gamification in a first phase.
  • VGI tool chain should first look at what already exists and seek cooperation. But support from NMCAs is needed.
    • Collaboration between OSM and government.
      • Think about the end-product together.
      • Governmental data can be integrated in OSM relatively fast, but there is a need for support from the NMCAs.
  • Research topics
    • Licensing
    • Support from NMCAs in VGI tool chain
    • Feedback mechanisms
    • Quality labels of NMCA data
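
The "easily accessible data" needs listed above (WFS access, one documented URL, no account) can be illustrated with a minimal sketch. It builds a standard WFS 2.0.0 GetFeature request in key-value-pair form using only the Python standard library. The endpoint `https://example.org/wfs` and the layer name `roads` are hypothetical placeholders, not services mentioned in the discussion, and the axis order of the bounding box depends on the coordinate reference system of the real service.

```python
from urllib.parse import urlencode


def build_getfeature_url(base_url, type_name, bbox=None, count=100):
    """Build a WFS 2.0.0 GetFeature URL; no credentials are involved."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,   # WFS 2.0.0 parameter name for the layer
        "count": str(count),      # maximum number of features to return
    }
    if bbox is not None:
        # Four corner coordinates; axis order depends on the service's CRS.
        params["bbox"] = ",".join(str(v) for v in bbox)
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = build_getfeature_url(
    "https://example.org/wfs", "roads", bbox=(4.6, 50.8, 4.8, 50.9), count=10
)
print(url)
```

A single documented URL of this kind, usable without registration, is roughly what the group described as sufficient for a volunteer to start working with authoritative data.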

Breakout Session #1 Group 3

Question: Practically, how do NMCAs and VGI communities work together? Tools, processes, procedures

Rapporteur: Frank Ostermann
  • what is crowdsourcing? active, passive? definition difficult but differences are important
  • keep it open, do not artificially limit it
  • In VGI and crowdsourcing there have been lots of discussion on terminology for the past years, not always very helpful
  • In VGI sometimes community members cannot download their own data, so you cannot demand an upload
  • working together is a two-way lane but who takes initiative?
  • NMAs need to adapt
  • budget cuts need crowdsourcing, but this requires change of thinking
  • more give from NMAs required
  • threat to legitimacy because of OSM and Gmaps, and not enough visibility of NMAs
  • open data seen as threat, completely different market
  • different user groups, different needs
  • the map is a tool, ranges from basic to advanced, i.e. a whole product range; some information is requested to be hidden or degraded
  • different scales or levels of information, where do public and citizens and NMAs meet?
  • business models vary greatly across countries and NMAs
  • citizens will always prefer seeing the result directly on the map
  • fundamental question on the role of NMAs: Public good or commercial interests; what can the crowd do to enhance the publicity and image of the NMAs? promoting themselves is important; some shared sense is needed, working together
  • Knowing that my contribution is for the good of society, is that good enough?
  • providing common data model or base map
  • competition with OSM not good for NMAs
  • Could the role of NMAs be that of facilitator, mediator, expert, delivery mechanism, working with others?
  • We must respect feeling of ownership in VGI, Crowdsourcing and NMCAs.
  • no new money for NMAs, do more with less
  • sustainable perspective needed
  • be careful not to get too hung up about OSM, NMAs should think about business models that open up data, and then see what OSM does with it
  • two parallel processes: Long process with very accurate data, plus a more short term process working with crowdsourced data; emergency services provide trustworthy information
  • establish a standing body of communication and coordination to find out what OSM wants!
  • establish common body of cooperation (organizational task force), offer data and help to use it (technical task force)
  • need for a business model, difficult to keep everybody happy
  • the question misses the element of data policies
  • Need to work together to tackle the big problems (e.g. resilient and energy-efficient cities).
  • what is the intention of NMAs: use crowdsourced data for validation? NMAs cannot do the reseller’s job
  • NMAs use crowdsourced data as cheap alternative to e.g. LIDAR data
  • Overall NMCAs need to change the way we think
  • NMAs are in a difficult position, because they have to do more with less. Both GoogleMaps and OSM are competitors for public attention and potential users, which somewhat calls into question the legitimacy of NMAs requiring constant funding.
  • NMAs also have very different approaches between countries.
  • Common problem of all NMAs: They need to change thinking and approach, but in what direction?
  • This is the one big question: What is the role of the NMAs in the future? This determines the relationship with VGI. If the NMAs need to have a business model that generates profit, then the relationship will be more “exploitative” (FO). If the NMAs can act as mediators and facilitators, then the NMAs can be much more open, give away data and expertise to VGI communities.
  • Currently, there still seems to be a lot of skepticism on both sides of the aisle - who is going to make the first step?

Breakout Session #1 Group 4

Question: How can VGI and crowdsourced spatial data assist in the work of NMCAs?

Rapporteur: Jaap-Willem Sjoukema

Why are NMCAs interested in VGI?

Not because it is about spending less money, but to make the data more accurate, to engage users more and to gain more exposure.

Open data is an important precondition for VGI in NMCAs. Only other companies with high usage, like Google and navigation companies, can afford to make it proprietary, but this does not seem viable for NMCAs. There are companies who resell open data; this is allowed in most cases, but does not seem a sustainable business model.

Should contributors have to approve modifications to their own contributed data? Putting your data in OSM can be done instantly, because you can overwrite other things. You have to be careful with this, because the community could be sensitive.

OpenStreetMap
In Britain, mid-scale data is used by OSM; large-scale data is still licensed. There is no agreement between OSM and Ordnance Survey, but OSM is using the data of OS. For Ordnance Survey, the data platform is also very interesting, alongside the data itself. But it is important to realise that using VGI does not equal using OSM: there are a lot of other sources, such as mining Twitter, and also putting your own staff to work with VGI.

Twitter
PhD researchers are looking at Twitter etc.: geo-tagged tweets, but also tweets with implicit location, using natural language tools. The British Geological Survey also examined Twitter for landslides, but they found only one tweet which was about a landslide.

Using your own staff
OS did an internal pilot in which their own staff compared stereo pairs of roadside imagery and then put traffic signs on the map by tagging them. They also identified shop signs by using their own staff.

Keeping up-to-date
Interested in national coverage, not in boundaries. Currently, 98% of all changes are captured within 6 months; this sounds fast, but is slow by today’s standards. Thinking about an extra layer with unchecked changes. Probably this will be put in the attributes, but it is not advanced yet.

Communication data
By using telecommunication data, you can get interesting patterns. For example, when it’s raining, there are more phone calls. But this data is not distributed accurately enough for mapping purposes, due to privacy legislation.

Using names of places
It is interesting to use the crowd for names of places. In the UK, the coast guard is used to get the names. You will not get the data about informal places directly from the public, but via the coast guard you can get the informal names and make them formal. It is not yet merged into the core systems.

Using crowd photographs
Spain is interested in combining land cover, land use and landscape comparison. A possible contribution to this is photographs of the environment which are accessible to everyone (from services such as Flickr, Panoramio etc.). Using crowd photographs to assess land cover mapping is not yet done in a mature way. You will also get a bias, because people take photos of things that interest them.

Geological Apps
Landslides are not exciting in the UK, but the geological survey has to know where they are, because greater danger can follow. They have an app, iVolcano, through which they get information about eruptions, which excite more people. People get feedback through it, but it is not instant. MySoil app: everyone can send in their pH values and the like, but they can also send strange things such as their pets. But it is still not an instant pin on the map, and that is what people want to see.

(Blocked) walking paths
OS Maps is used by walkers, who can upload their own walks. This gives info about rural roads. It is also potentially interesting to land registries, to get information about roads which are blocked. For example, paths can be privately owned while the owners are not allowed to close them, the so-called ‘common lands’. When they are blocked, people with mountain bikes, runners etc. cannot use them anymore. This information is interesting for land registries. In Italy the same problem occurs with the coast, which is not allowed to be closed off. Overgrown paths are also interesting. The users can be easily engaged, because they have a direct win (unblocking of their land).

It is also interesting to let the crowd create preferred walking paths by letting them say where the paths should be. This is a nice way to engage them in the system and could be a useful tool. There are also a lot of forums through which to communicate such a platform. There are also people who try to fight it locally, but through a broader public system more power and information can be involved and information is less dispersed. There is only a problem when you are aware of the problem but cannot provide a solution. Ignorance is bliss. It also depends on how you interact with your people. It can be useful to work with other applications which already track runners and cyclists, such as Strava.

Crofting in Scotland: land administration is not very good for historical reasons, because farmers are renting land. Perhaps tracking sheep would be interesting. The common lands could be an interesting case study, which will interest the mapping agency and the cadastre. It could be sensitive due to local disputes, but practical. A nice use case in which VGI can contribute. An interesting idea for the hackathon, for getting to know these permissive footpaths.

Breakout Session #1 Group 5

Question: How can sources of VGI be successfully combined with NMCA/government/other data?

Rapporteur: Peter Mooney

Risks

  • The risks associated with crowdsourcing data must be calibrated. Combining the data is one of the biggest problems in the future. There are unintended consequences of the combination of data which are not easy to predict or understand.
  • Ethics are not clear. Do we need ethical or legal frameworks? Do they need to be introduced? It is still hard for non-legal people to make the link between the consequence and the linkage of datasets.
  • NMCAs need to be very careful about which data they combine with their own.
  • Would this be going too far? Do we censor what happens to data? We should not restrict the information gatherer. There are restrictions on crowdsourced data - national security, public security, etc. There are deep issues concerned with the restriction of access to data or the recreation of data. Who is making the rules then? If there are serious consequences for data usage - then who would be part of the crowd?
  • Should the goals change for NMCAs? Should they become a gatherer of all crowdsourced data or VGI? Now that NMCAs are not necessarily the sole producers of data - should their role change? Should they take a role of quality assessment of this crowdsourced data or VGI?
  • Completeness, coherence, actuality, etc - these are all needed by NMCAs. Why is the crowd producing data which NMCAs are actually producing? Is this not replication of effort? Wasted effort? NMCAs have been caught in between what they can do and what the crowd or public are requiring. Perhaps in the terms of open data - NMCAs may not be able to provide certain data as open data.

How do NMCAs see their changing role?

  • Do NMCAs become data brokers in the future? Become less traditional? The NMCA still has to produce something that is authoritative - it must be trusted.
  • The crowd is local and international - the crowd is losing trust in experts - there is a general global distrust atmosphere. So who do they trust? “The people” - who are these people? Who are these as customers?
  • NMCAs can give away data as open data - and then other customers can map on it what they want? They can add on crowdsourced layers. If this is something that the public is looking for - then the NMCAs can consider actually making resources available to try to provide these data.
  • The easiest thing is gathering data for the first time. There is no glamour in maintaining data after and into the future. Who will do all of the updates? Passive groups and passive crowdsourced data could be used - but how? OSM and NMCAs agree that updating a map and keeping it current is resource intensive. It is difficult to do. A future potential role is looking at other sources of data as a means of updating the current datasets.
  • NMCAs - long-lasting things are the focus... For short-term things such as fires, innovations, floods, etc., the best group is the crowd. Can the crowd be available? The crowd does not have the same view of the map as an NMCA. The crowd are looking for something more instantaneous. Something that involves as they move through.
  • Cross-national and global datasets - are usually basic data - not very advanced data or parameters. Some countries do not have these fundamental datasets - this is the case for most of the African countries. In some cases authorities do not want to map locations because that would give them some type of “formality”. What then if a company comes along and then takes the crowdsourced data and then creates a product?

Is there a way forward?

  • The crowd must have the possibility to take part - if you build it they will come. So if this opportunity comes from a commercial company then the crowd might go and join up and partner with these companies? There is a Google example for Google Maps.
  • It is certainly good to combine NMCA and VGI - but what about the usefulness of the data combinations?
  • Generate generic approaches to data combination.
  • Can the crowd go where the NMCA cannot go? Building heights, urban environments, rural environments? Can the NMCAs generate approaches, rather than their formal ways (LIDAR, Radar, etc)?
  • Protocols are leading to a waste of time and effort…. People get frustrated…. People go out and do this for themselves.
  • Crowdsourcing could do 80% of what NMCAs would want. The 20% is the most expensive bit (completeness, temporal, etc) - the crowd map where they live or where they are - the NMCAs must produce their product consistently across their entire territory. So then alerting and change detection is used in a passively harvested way - they are used as a “hint”.
  • How does this work for SDIs? National Geoportals? INSPIRE is there - but it doesn’t link to other countries. There are really only a few products which are Europe-wide and do not stop at the border.
  • If you start a new crowdsourcing activity - perhaps you could avoid problems before you start? Set up a structural system for the collection of crowdsourced data. For example - if staff or university people went out and crowdsourced data - who would be responsible for the data? Is it still crowdsourcing if there are serious restrictions? Is it still crowdsourcing if it is an expert crowd? We have a lot of expert crowds.

Breakout Session #1 Group 6

Question: How do we improve (or measure) the quality of VGI data for NMCA needs?

Rapporteur: Barend Köbben

NOTE: this Q assumes that the VGI quality is NOT good (enough)...?

First opinions round summarised as follows:

  • IGN-B have just started a test with feedback: not for all users, but for emergency services; they check (using imagery) the validity of the remarks (a remark = point with text info); no measure of quality, they just edit the map based on the remarks; no editor/geometry; own tool;
  • Estonia NMA have the same system: 800 messages in the last 5 years (by everyone) - more coming by email; some issues: “delete this road by my door because I think it’s too noisy”; own tool;
  • Lower Saxony: feedback is only for “our users” (not by app or similar)
  • Low numbers of feedback, because low number of users: “You cannot have crowdsourcing without a crowd”
  • IGN-F has standard user feedback integration (also for users) -> they want to have more “community sourcing” than “crowd sourcing”; e.g. municipalities and fire depts (registered users).
  • Also cooperation with the postal office and OSM for the address database.
  • Médecins Sans Frontières (MSF) use the Missing Maps and MapSwipe (pre-selection) systems; explains the set-up with crowd-sourcing. Quality assurance in validation steps (of remote mapping, by expert mappers), and then validation on the ground (by volunteers as well as professionals); there has been research into the quality (e.g. comparing with machine learning => human crowdsourcing is better)
  • There is no clear-cut line between volunteers and professionals;
  • Lower Saxony: there seems to be no interest in using OSM at all, not based on real arguments; in student projects the OSM data actually was equal to (and sometimes better than) the official data.
  • Question: do NMAs measure the quality of their data? Yes, IGN-F and IGN-B do (also e.g. for subcontractors), Lower Saxony does not really measure, MSF does, based on sampling of test cases;
  • Do OSM (and other VGI data sets) suffer in quality because their data model is not fixed? E.g. difference between path and road? Certainly on the higher levels (“relations”).
  • Is this only a theoretical problem? Is this different for NMA data (definition of “path” is different between different German State Mapping agencies)?

Problem is that we do not really know:

  • What is quality? vs “fit for purpose”?
  • Positional accuracy is easy / just a matter of time (better GPS etc)
  • All the other measurements (consistency, etc) can be made (but are they?)
  • Mostly it’s done “by method”: the method is of high quality, therefore the results should be.

ANSWER (sort of):
=> first identify what quality is needed, then measure, then decide what has to be improved => this is done on a “use by use” basis, depending on the goal you use the data for (different quality for different uses, for different parts of the data)
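
The "identify, measure, decide" sequence can be sketched for the one measure the group called easy, positional accuracy. This is a minimal illustration, not a method proposed in the discussion. It assumes VGI points have already been matched to reference points in a projected coordinate system with units of metres, and the per-use accuracy thresholds are invented for the example.

```python
import math


def positional_rmse(vgi_points, reference_points):
    """Root-mean-square horizontal offset between matched point pairs (metres)."""
    if not vgi_points or len(vgi_points) != len(reference_points):
        raise ValueError("need two non-empty lists of equal length")
    squared = [
        (vx - rx) ** 2 + (vy - ry) ** 2
        for (vx, vy), (rx, ry) in zip(vgi_points, reference_points)
    ]
    return math.sqrt(sum(squared) / len(squared))


def fit_for_purpose(rmse_m, required_accuracy_m):
    """Compare one measured value against the need of each use case."""
    return {use: rmse_m <= limit for use, limit in required_accuracy_m.items()}


# Step 1: identify the quality needed per use (hypothetical thresholds, metres).
needs = {"routing": 10.0, "cadastre": 0.5}
# Step 2: measure against matched reference points (made-up coordinates).
rmse = positional_rmse([(0.0, 0.0), (10.0, 0.0)], [(3.0, 4.0), (10.0, 0.0)])
# Step 3: decide, per use, whether improvement is needed.
decision = fit_for_purpose(rmse, needs)
print(round(rmse, 2), decision)
```

The same measured value can pass for one use and fail for another, which is the point of assessing quality use by use rather than as a single label.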

This leaves unanswered the question HOW to improve...