Geo: Using the Mapbox SDK for Android

Hello all!

We're considering a big change, and as usual we'd like to air it here and collect some feedback from you, our community, before moving forward.

Today, ODK Collect offers a choice of two Mapping SDKs (in the General Settings under "User interface"):

  • Google Maps for Android: This is the built-in map SDK on Android, provided by Google. It's designed primarily to show the Google basemap, and it has a user interface that feels like the Google Maps app.

  • OSMdroid: This is an open-source library for drawing maps. Although it's called OSMdroid, it's not specific to OSM; it's really just a library for displaying any raster map tiles. By default, we configure it to use map tiles from OSM.

It's useful to make a distinction between the map SDK and the map data. Currently, the "Mapping SDK" setting is effectively a choice between Google map data and OSM map data, because the two SDKs are configured that way. But any SDK could be used to display any map data, if it's in a format that the SDK supports.

We're considering adding the Mapbox SDK for Android. If we go this route, it will probably replace the OSMdroid library, because it looks like it offers all the functionality of OSMdroid and more. There are several motivations:

  • The Mapbox SDK can draw maps using raster tiles (like OSMdroid today) or vector tiles (which are faster and more responsive).

  • If used with the online map data provided by Mapbox, we'd have a beautiful, fast, vector-based map based on OSM data.

  • Support for vectors would allow us to support offline vector map tiles, which are far more storage-efficient than offline raster map tiles. The difference is significant enough that this may enable new use cases that previously required prohibitively large files.

  • The Mapbox SDK has built-in support for displaying GeoJSON. This is useful in itself, and also opens the door toward the possible future goal of interacting directly with preloaded geometry (e.g. collecting data about existing roads or buildings).

  • The Mapbox SDK is being actively developed. We currently use OSMdroid 5.6.4, which is 2+ years old; the current version is 6.1.0. If we want to benefit from OSMdroid's ongoing development, we'll need to migrate to 6.1.0, which offers lots of new features, but also has some major and incompatible changes.

Notably, however, the Mapbox SDK documentation says that you must sign up for a Mapbox access token in order to use the SDK. We don't want users to be required to register with a commercial entity in order to use ODK Collect—so either we would need to package our app with a built-in access token (i.e., get permission from Mapbox to distribute a special ODK access token), or we would need to find a way to use the Mapbox SDK without an access token.

It would make sense that the access token exists to control access to the online map data provided by Mapbox, not to prevent use of the SDK code itself. The Mapbox SDK for Android is open source, so it seems like it should be possible to use it to display your own map data without an access token. And I've confirmed this—I can initialize the SDK with a blank token and tell it to display a GeoJSON file, for example, and it works just fine. It would be nice to know how Mapbox feels about using their SDK this way—is it considered impolite, or could we reasonably expect the SDK to continue to work on non-Mapbox data with a blank access token? @Marena, can you shed some light on this?

Here's what we're currently imagining. If the Mapbox SDK is selected, it operates in one of two modes:

  1. If there is an access token (either we distribute one or the user enters one), we display the beautiful Mapbox vector basemaps.

  2. If there's no access token, we fall back to displaying the same OpenStreetMap raster tiles that OSMdroid uses now.
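The two modes above can be sketched as a single decision. This is only an illustration of the proposal, not code from ODK Collect: the function name, the constants, and the rule that a user-entered token takes precedence over a bundled one are all assumptions.

```python
from typing import Optional

# Hypothetical labels for the two modes; not real ODK Collect or Mapbox identifiers.
MAPBOX_VECTOR = "mapbox-vector-basemap"
OSM_RASTER = "osm-raster-tiles"


def choose_basemap(bundled_token: Optional[str], user_token: Optional[str]) -> str:
    """Pick the basemap source when the Mapbox SDK option is selected."""
    # Assumption: a token typed in by the user wins over one shipped with the app.
    token = (user_token or "").strip() or (bundled_token or "").strip()
    # Mode 1: any non-blank token -> Mapbox vector basemaps.
    # Mode 2: blank or missing token -> fall back to OSM raster tiles.
    return MAPBOX_VECTOR if token else OSM_RASTER
```

A fork built without any token would call `choose_basemap(None, None)` and land in the raster fallback, which is the behavior the second mode is meant to guarantee.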

I'm going to be exploring this possibility over the next couple of weeks to determine its feasibility. I'd love to hear what you all think about this plan!

Thanks for your input!

I'm tagging a few folks who've previously engaged in conversations about our Geo features: @mathieubossaert @ggalmazor @Ivangayton @paul_macharia @paul.uithol @smit1678 @Neil_Penman @danbjoseph

I'd love to hear what you think!

And on the developer side: @yanokwa @LN @cooperka @tomsmyth @langstonsmith

A related previous discussion is here: Adding Mapbox vector tile basemaps - #31 by zestyping

Wonderful, thank you for opening this @zestyping

On the question of tokens, what we had proposed was:

  • For the official, app-store app, ODK uses an access token linked to a single ODK-managed account and we'd work with ODK to keep the account traffic covered for free (like how Google does currently for ODK Collect's Google API services, right?). Would it be possible to have a built-in access token in the app when downloaded? Why would you need to distribute the access token?

  • With a built-in token, users of the official app wouldn't ever need to think about tokens if they didn't want to. They would be able to use the Mapbox styles that are offered from ODK's account, without ever needing to set up a Mapbox account themselves. (If they wanted to use styles from their own Mapbox account, they'd have to either a) make the style public and use it with the built-in ODK access token via some way to enter a custom map style URL or b) have a way to enter their own access token.)

  • However, since ODK Collect is open source and can be forked to be used in any number of ways, we'd need to keep this token hidden to prevent token abuse (i.e., token not publicly published to Github and won't carry over to any clone/fork of the ODK Collect repo). This means that if developers want to create a fork of ODK Collect and use the Mapbox SDK, they would need to have their own Mapbox account and use their own token. That seemed acceptable in the discussion over in Adding Mapbox vector tile basemaps - #28 by Marena

  • I think your two mode option (access token or no access token) could work - but why not simply build it so there is a built-in access token, linked to an ODK-managed (and Mapbox-couponed) account?

For the other part of your question, @zestyping: yes, the code for the Mapbox SDK is open source, so there may be ways to use the code without linking it to Mapbox services (e.g. with your own raw GeoJSON files). (@langstonsmith please correct me if I'm wrong here.) There's no guarantee how well this would work or how well it would continue to work across versions, and Mapbox volunteers would be limited in the amount of support we'd be able to provide (because it would be departing from the use cases that we're building for).

One of the big benefits of using the Mapbox SDK is access to the Mapbox basemap data. And we are willing to provide the basemap data for free to ODK Collect users via a central couponed Mapbox account. So if there is a way to address the concerns about access tokens while still providing Mapbox map data, that seems like the better option.

Please let me know if there are complications that I'm overlooking (for example, is it difficult to have a built-in, but hidden, access token within the ODK Collect app?)

Thank you so much for explaining everything, @Marena!

A quick clarification on this:

> Why would you need to distribute the access token?

All I meant by "distribute" was distributing the token inside the app. The access token would be built into the app, so distributing the app would in a sense also be distributing the token.

> I think your two mode option (access token or no access token) could work - but why not simply build it so there is a built-in access token, linked to an ODK-managed (and Mapbox-couponed) account?

I can think of two reasons to continue supporting both modes:

  • If ODK Collect only works with a Mapbox access token, and the token is not in ODK Collect's GitHub repo, then any developer that tries to fork or clone ODK Collect and build it will find that they have a non-working app.

  • If ODK Collect only works with a Mapbox access token, and in the future Mapbox ever finds itself in circumstances that make it no longer comfortable or able to provide free service for the ODK token, that would suddenly cause every copy of ODK Collect everywhere to break. Of course, we all hope that never happens, but it seems impossible for anyone (even Mapbox) to guarantee that it won't, and the consequences would be so severe that it doesn't seem responsible of us to not provide a fallback. You're correct that we are in a similar position with our Google Maps API key—Google could also disable their key and break ODK Collect. Today, if that happens, users can choose OSMdroid as a fallback. But if we replace OSMdroid with the Mapbox SDK and it also only works with a key, then there would be no fallbacks left.

So, it seems to me that this is worth doing, but I thought it best to check with you before going ahead. I know that using a blank access token is non-standard, and I wouldn't want to use your SDK in a way that you consider rude or abusive. Does that seem okay to you?

> One of the big benefits of using the Mapbox SDK is access to the Mapbox basemap data.

Absolutely!

> Please let me know if there are complications that I'm overlooking (for example, is it difficult to have a built-in, but hidden, access token within the ODK Collect app?)

I think it's possible, but it does complicate things a little. If we can find a way to do it without too much difficulty, I'd love to see Mapbox's beautiful basemaps as the default in ODK Collect going forward.

@zestyping has front-loaded the technology decision, but taking a step back, I think the feature here is "Enable custom, offline vector layers for geo questions." This is why this thread is separate from https://forum.getodk.org/t/adding-mapbox-vector-tile-basemaps/ which is focused on adding support for online Mapbox basemaps.

As was alluded to in some of the points above, vector layers are useful because:

  • users have custom datasets from prior data collection efforts or sources like government ministries that they'd like to load onto data collectors' devices. Vector tiles are much smaller than raster tiles and support arbitrary zooming.
  • users would like to be able to select features from their custom map layers rather than selecting one or more points and then having to figure out which feature they relate to. For example, let's say that a data collection effort needs to collect information about hospitals. Perhaps the location of the hospitals is already known and it would be much better for the data collector to tap on a hospital from a custom map layer than to collect a point that gets connected to the hospital in later analysis. See more about this requirement in @paul.uithol's comment here.

As @zestyping points out, Collect currently uses both Google Maps and OSMdroid as mapping engines. In terms of vector layers, both support GeoJSON. Google Maps additionally supports KML. Both formats are very powerful, but I believe they don't address the first requirement because the files can be quite large. As I write that, I wonder how much zipping those assets would help. Also, if they will be transferred offline, does it matter?

Questions about GeoJSON and KML aside, Mapbox Vector Tiles is a broadly used standard, so it seems like a good candidate to support. The Mapbox SDK for Android is (I think) the only library that supports the format on Android.

Hopefully this provides a little more context as to why there are now two separate conversations about Mapbox and how they differ.

Awesome, thanks @zestyping - that all makes sense. I agree having the fall-back to OSMdroid makes sense for those reasons you laid out.

And I agree with @LN :pray: - it seems like there are multiple, related pieces to unpack. One is the somewhat separate question about whether and how to let users use their own custom vector tiles (either via a custom Mapbox style or via other tools) and how to handle the access token in that case. (Whether and how users could interact with features on their vector tiles seems like a further extension of that question, but I'm not sure if we want to get into that on this ticket.)

Hi all!
Hélène's examples help a lot :wink:
The ability to edit or fill in a form tied to a vector feature just tapped on the map would be another big step forward for ODK Collect!
In our use case, it would be a great tool for validating and describing natural habitat maps, instead of large GIS tools that require computers, which are not easy to use in the field.
For the moment, computers segment an aerial photo into a vector layer, we open it in a big, heavy GIS tool or print it on a big, heavy paper board, and we go into the field to validate and describe each feature.
A simple Collect form would be a great alternative for such a case.

@mathieubossaert Thanks as always for your feedback! I know you've been involved in conversations about vector layers in the past but I'm not quickly able to find what your favorite format is. It would be really helpful to know -- geojson? pbf mbtiles? GeoPackage? Something else?

As we're getting closer to an implementation here, it would be amazing if you could share a sample layer to test with. A file with just a couple of features and a brief description of how you'd like to use that layer would be perfect.

If anyone else is interested in vector tile support and can answer the same questions, that would also be very useful. @seewhy comes to mind.

@LN Thanks so much for filling in the context! That helps a lot.

Thank you!!!
I am a bit "underwater" this week, as we say in French. I will take some time in the coming days to document two or three use cases with vector data that we would be glad to see on the map :slight_smile:

  • locate on a map the place we want to reach, for example a land parcel we own or the location of a species
  • describe a habitat polygon shown on the map

But if we talk about the data format, I think the best one is the one that is easiest to produce with GIS tools. I am not an expert, but I think that GeoJSON and GeoPackage are now two standard GIS formats in QGIS, and GeoJSON may be the most universal modern format across the major GIS tools (free and non-free). But file size will matter too... So far I have never had to save vector data as MBTiles, and I don't even know how to do that.

Hi all,

I'm pasting in some ideas from an email discussion on this very topic. I hope this will provide some insights into the technical discussion about file formats!


I'd emphasize MVT as the first priority to allow broad basemaps, with GeoJSON as the second priority to allow editable vectors.

Why?

MVT allows a very small file to provide a very broad, though static, map that is quite efficient and quick to render. Raster MBTiles are fine for viewing a static map, but the file size is many orders of magnitude greater for a given area; for example a detailed MVT map of all of Tanzania—containing all of the OSM data, therefore millions of buildings and most of the visible roads—is about 300MB, while a raster MBTile of only the capital city of Dar es Salaam could easily be 120GB. If deploying enumerators to visit rural areas, raster MBTiles are simply too heavy to contemplate in most cases, but MVT is small enough to fit on the vast majority of phones' storage. So implementing MVT basemaps immediately gets most enumerators an offline basemap of the area they are working in.
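To make the raster side of that comparison concrete, here is a rough sketch of how the number of tiles in a pyramid grows with zoom, using the standard Web Mercator (XYZ) tile formulas. The bounding box is an arbitrary illustrative value, not a measured extent of any city, and the sketch says nothing about bytes per tile; the file sizes quoted above are the poster's own figures.

```python
import math


def tile_xy(lon: float, lat: float, zoom: int) -> tuple:
    """Return the XYZ tile column/row containing a lon/lat at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that points on the map edge stay inside the valid tile range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_count(west: float, south: float, east: float, north: float, zoom: int) -> int:
    """Count the tiles needed to cover a bounding box at one zoom level."""
    x_min, y_min = tile_xy(west, north, zoom)  # the top-left tile
    x_max, y_max = tile_xy(east, south, zoom)  # the bottom-right tile
    return (x_max - x_min + 1) * (y_max - y_min + 1)


if __name__ == "__main__":
    # Hypothetical box about half a degree on each side, for illustration only.
    bbox = (39.0, -7.1, 39.5, -6.6)
    for zoom in range(10, 20):
        print(zoom, tile_count(*bbox, zoom))
```

Each extra zoom level roughly quadruples the tile count for a fixed area, so a raster pyramid that goes deep enough to show individual buildings needs a very large number of tiles. Vector tile sets are commonly generated only up to a moderate maximum zoom and drawn at higher zooms from the same data, which is one reason they stay small.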

The drawback of MVT is that it's not a good format for on-device editing. A few reasons for this:

  • Each tile contains only the geometry within the tile itself. At a high zoom level this may be a tiny fragment of a road, or even a portion of a single building. Editing this means trying to keep track of the geometry of the whole feature, which is contained in the protobuf of multiple tiles (adjacent tiles and overlapping tiles of different zoom levels). That's a hassle.
  • The geometry does not have much notion of the attributes of the feature it's part of, so you need another file/database to keep track of object attributes. While you have the feature ID, you have to go back to another file to link that ID to the attributes that you may wish to deal with when editing.
  • The geometry within the protobuffers encapsulated in the MVT tiles is simplified; it may not (and in most cases probably doesn't) contain all of the nodes of the original feature! During creation of MVTiles, nodes are discarded to leave only enough to render the relevant parts of the feature as a visibly reasonable facsimile at the particular zoom level. When you extract a feature back out of MVT, you aren't guaranteed to get your original geometry back, though at the highest zoom levels you might (or at least it should be close). So editing would require, at a minimum, always drilling down to the highest-zoom tile, grabbing all of the adjacent tiles that contain relevant bits of the feature you want to edit, converting that to an editable representation (the internal MVT protobuffer representation is a crazy Logo Turtle-style language that would be hard to manipulate), and then replacing it in those tiles, as well as re-simplifying it to push it back into the lower-zoom tiles. Not a nice process at all, and even if you pull it off not guaranteed to retain all unedited nodes of the original feature.
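For readers curious about the "Logo Turtle-style language" mentioned above, here is a minimal sketch of decoding the geometry of one MVT feature, based on the command encoding in the Mapbox Vector Tile specification: each command integer packs a command id in its low 3 bits and a repeat count in the rest, and the parameters are zigzag-encoded deltas from the previous cursor position. The function names are mine, and the input is assumed to be the already-extracted list of integers from a feature (protobuf parsing is not shown).

```python
COMMANDS = {1: "MoveTo", 2: "LineTo", 7: "ClosePath"}


def zigzag_decode(n: int) -> int:
    """Undo zigzag encoding: 0, 1, 2, 3, 4 -> 0, -1, 1, -2, 2."""
    return (n >> 1) ^ -(n & 1)


def decode_geometry(ints: list) -> list:
    """Turn an MVT geometry integer stream into absolute tile-local operations."""
    ops = []
    x = y = 0  # the cursor starts at the tile origin
    i = 0
    while i < len(ints):
        name = COMMANDS[ints[i] & 0x7]  # low 3 bits: command id
        count = ints[i] >> 3            # remaining bits: repeat count
        i += 1
        for _ in range(count):
            if name == "ClosePath":
                ops.append(("ClosePath",))
            else:
                # Parameters are deltas relative to the current cursor.
                x += zigzag_decode(ints[i])
                y += zigzag_decode(ints[i + 1])
                i += 2
                ops.append((name, x, y))
    return ops
```

The resulting coordinates are integers on the tile's own grid, not longitude and latitude, which is part of why round-tripping an edited feature through MVT is awkward: every fragment has to be re-projected, re-clipped, and re-simplified per tile and per zoom level.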

This still leaves the question of styling MVT. There are various options for that, but I think the simplest is to have a default styling similar to the rendering done by OsmAnd, Maps.me, or the Mapbox SDK, with an optional sidecar file specifying a custom styling using MapCSS or something like it.

Once MVT is done, we could theoretically implement OpenMapKit-style adding of attributes to polygons right away, just using the feature ID of the polygons in the MVT (no editing of actual geometry, but adding attributes in a form containing the feature ID). However, while it might be nice, I think that's a dead end because it'll never be smooth to access existing attributes or edit geometry. I'd just as soon head straight toward an extensible solution, which is an editable file for particular layers that are targets of the survey.

I could see lots of arguments for Shapefile or GeoPackage, but I think GeoJSON is the best option.

Shapefile is very compact and computationally efficient, but it truncates attribute headers/keys to 10 characters (a very common column name in GIS data is "descriptio" due to this truncation). This is particularly unfortunate for data like OpenStreetMap that often has quite long keys (e.g. building:construction, which gets truncated to building:c, indistinguishable from the truncated building:commercial). Shapefile is also a bit of a bear to parse due to its arcane, ancient spec; there are some good libraries around, but the underlying machinery is pretty hairy.

GeoPackage is lovely, reasonably efficient and compact, and is implemented on top of SQLite, which makes it a cinch on Android, which has solid SQLite support. However, a lot of users will be intimidated by the requirements to create a GeoPackage. Not all data can be translated smoothly to GeoPackage, even for a moderately skilled GIS user (GeoPackage freaks out when confronted with topological errors or non-unique id columns, which unfortunately are often found in useful datasets).
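The Shapefile key truncation described above is easy to reproduce. This is a small sketch that applies a plain 10-character cut to a few keys and groups the ones that collide; the helper name is made up, and real Shapefile writers may rename colliding fields in their own ways rather than silently merging them.

```python
from collections import defaultdict

# Attribute names in a Shapefile's .dbf table are limited to 10 characters.
DBF_FIELD_NAME_LIMIT = 10


def truncate_keys(keys: list) -> dict:
    """Group original keys by the name they would get after a plain truncation."""
    grouped = defaultdict(list)
    for key in keys:
        grouped[key[:DBF_FIELD_NAME_LIMIT]].append(key)
    return dict(grouped)


if __name__ == "__main__":
    osm_keys = ["building:construction", "building:commercial", "description"]
    for short, originals in truncate_keys(osm_keys).items():
        note = "  <-- collision" if len(originals) > 1 else ""
        print(f"{short!r} <- {originals}{note}")
```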

GeoJSON is not very compact and not particularly computationally efficient, but it's really clear and straightforward in structure, easy to parse without relying on esoteric libraries, and easy to generate. Lots of Web people are familiar with GeoJSON, and there are tons of tools to generate it that aren't big scary GIS packages. Most data can be translated to GeoJSON without complaint, as the GeoJSON drivers don't care if you have topological errors or weird keys in your data (which would likely trip up a GeoPackage writer and maybe a Shapefile writer). It contains only one representation of any given feature, and can happily contain all attributes and metadata in the same file. As a generalist GIS format, it covers most uses reasonably without offering particularly high performance in any specific use.
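As an illustration of how little machinery GeoJSON needs, the sketch below parses a tiny FeatureCollection with nothing but a standard JSON library. The two features and their attribute values are invented sample data; the only structural facts relied on are the ones in the GeoJSON format itself (a `features` array, each feature with `geometry` and `properties`, coordinates in longitude, latitude order).

```python
import json

# Invented sample data: one point and one polygon, each carrying its own attributes.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [39.28, -6.81]},
      "properties": {"name": "Example Hospital", "beds": 40}
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[39.0, -7.0], [39.1, -7.0], [39.1, -6.9], [39.0, -7.0]]]
      },
      "properties": {"name": "Example Habitat", "habitat": "wetland"}
    }
  ]
}
"""


def summarize(geojson_text: str) -> list:
    """Return (name, geometry type) for every feature in a FeatureCollection."""
    collection = json.loads(geojson_text)
    return [
        (feature["properties"].get("name"), feature["geometry"]["type"])
        for feature in collection["features"]
    ]


if __name__ == "__main__":
    for name, kind in summarize(SAMPLE):
        print(f"{name}: {kind}")
```

Geometry and attributes live together in each feature, which is the property that makes "tap a feature, open a form prefilled with its attributes" straightforward compared with MVT.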

So compared to an MVT (or Shapefile), a GeoJSON is large and inefficient for pure display purposes. You wouldn't want to put all of the OSM data for a city, or some equally large dataset, into it! However, you only need to put the layers you want to edit into the GeoJSON format, which in most cases vastly reduces the amount of data needed in that layer. And it's easy to create, style, edit, add/modify attributes to, and ingest after editing.

So my dream scenario is a three-layer cake, each layer being optional:

  • Raster MBTiles on the bottom for satellite imagery or other useful high-res stuff in specific areas
  • MVT in the middle, above the rasters, for a broad vector basemap, default styled with an optional sidecar file for styling
  • Editable GeoJSON on top for features that will actually be mapped and/or tagged.
    • Editable meaning the ability to:
      • Modify geometry using the GPS position
      • Modify geometry using fingers (long-press or cross-hairs, either way is fine)
      • Trigger a form to populate attributes
      • Trigger a form with existing attributes already pre-loaded for modification or not

However, that's my big ambitious dream scenario! I feel that just doing the MVT basemaps already adds a lot of value, which is not lost if the rest is not done (or takes a lot of time).

Hi @Ivangayton,

Your dream scenario suits me, but for the moment what I plan to do with such a feature is first to show homemade vector data over the map, to find it in the field, trigger a form to populate attributes, and maybe later modify preloaded attributes...

Hi @Marena,

Please forgive me if I've not understood this in your response, but are you saying that Mapbox is happy for @zestyping and the ODK community to proceed with the use of the Mapbox SDK, understanding that this may be for using their own custom vector tiles in some cases, without a Mapbox account or token?

I think most of us are delighted to see the Mapbox tiles as the default datasource, but as far as I understand the idea of using the Mapbox SDK is not to have "the fall-back to OSMdroid" anymore for custom or user-specific vector tiles, but rather to fall back to the Mapbox SDK without necessarily having an access token or using the Mapbox online data to render such tiles.

This would mean that people forking ODK would end up with a version of the Mapbox SDK, albeit without an access token for the Mapbox data but able to render arbitrary vector tiles. At the moment @zestyping is doing this by using a blank access token; is this acceptable from Mapbox's perspective? No one in this community wants to make use of the Mapbox SDK in a way that Mapbox isn't comfortable with, but there's a strong desire for the ODK app itself to be a fully functional stand-alone tool, even for people who for whatever reason might fork it and not have a Mapbox account (without, of course, expecting to have free access to anyone's tiles or data, only to be able to use their own tiles and data).

There are a number of libraries that can render vector tiles, but I think the desire here is to use the Mapbox SDK both as a way to access the Mapbox tiles from the official Play Store ODK Collect app, but also as a standalone component to render tiles (and we can't help but notice that it's also capable of rendering GeoJSON, which makes it a rather strong candidate for the likely next step you've mentioned, which is interacting with users' own feature layers, which of course should not be dependent on an access token).

Sorry for the rather obsessive querying on these details; I think we really need to get this exactly right to ensure no one misunderstands or is sad later!

Hi @mathieubossaert,

That's my most critical use-case as well! I can't wait to visit a building, road, tent, tree, or village in the field, click on it, and fill out the form triggered by it.

That said, I also can't wait to draw a new building and add a missing segment of road.

The only reason I suggest MVT first is because I'm pretty sure it's a quick win, and it'll probably be easier to implement interaction with features after we've got some kind of vector layer rendering.

I agree with you that GeoPackage and GeoJSON are the best options for user-generated layers; as I mentioned I lean toward GeoJSON to make it easier for non-GIS people. A person can use something like geojson.io to create a really basic layer without having any GIS knowledge or tools.

It may be that some sophisticated users want to put really detailed layers in, which could result in performance issues using GeoJSON. For this, GeoPackage would probably work better, but I suspect that such a benefit to users of heavy datasets won't be worth the added burden on those wishing to use light ones quickly and easily.

Maybe there's a chance that GeoPackage support could come later as an added functionality if a lot of us hit the limit of GeoJSON performance!

Hi @Ivangayton - thank you for asking to confirm, it is good to be certain about these things!

I have confirmed with several colleagues today (thanks @langstonsmith) that yes, it is acceptable within Mapbox ToS to use the Mapbox SDK without a Mapbox token. The Mapbox SDK is open source, so if you develop with it in such a way that does not require a Mapbox token that is fine.

That approach does limit how much we can be involved and supportive (if the implementation starts to fork away from our SDK it is hard to guarantee what we can support). But we would not see that as a violation of our terms.

If the best situation for the ODK community is to use a version of the Mapbox SDK without a token, but in such a way that creates the option for people to more easily use Mapbox basemap tiles in cases where people want to, that's :+1: by us.

(Perhaps it would be a good idea to include info for developers that fork ODK that it contains a modified version of the Mapbox SDK - and that if developers want the full-featured, updated Mapbox SDK they should return to the source for that. I wouldn't want to see a developer thinking that they were using the full Mapbox SDK without realizing that it was a modified version.)

Thank you so much, @Marena! That's wonderful news!

That's the confirmation we need to move ahead with this plan, and I feel much better knowing that it's okay with you and your colleagues at Mapbox. I really appreciate the time you've taken to do the extra checking into that.

It's certainly my intention to avoid forking the SDK for as long as we can—it makes everyone's life easier. Right now, there is no need to modify the SDK at all; we can achieve what we want using it exactly as is (just without an access token). To my thinking, the only situation in which it would really be necessary to fork the Mapbox SDK is if all four of the following are true:

  • Mapbox releases a new version of the SDK in which the current code path that allows a blank access token is removed.

  • In that new version, there is no other way to instantiate a MapView through public API calls.

  • We want to upgrade to using that version in ODK Collect because it does something great and new that we need.

  • We still want to use the Mapbox SDK for the fallback situation (as opposed to implementing the fallback with Google Maps or reintroducing OSMdroid or some other mapping SDK).

If we did ever fork the SDK, it would absolutely make sense to document that clearly and point developers back at the original Mapbox SDK.

Thanks @Marena, indeed this is great news; many thanks to you and to your colleagues at Mapbox.

Thanks for the clarity, and most of all thanks for creating and sharing such great software (the SDK), standards (the MVT format itself, as well as the other open formats Mapbox has created), and for giving us access to your tileservers and data.

Thanks everyone, this is exciting all around! @Ivangayton makes a strong case for supporting MVT basemaps as a first step. And just so we're on the same page, we're actually talking about MVT MBTiles (Mapbox Vector Tiles stored within a SQLite container), right?
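For anyone unfamiliar with the container, here is a minimal sketch of what "MVT stored within SQLite" looks like in practice. It follows the MBTiles layout as I understand it (a `metadata` table and a `tiles` table keyed by zoom, column, and a bottom-up TMS row, with vector tiles stored gzip-compressed under `format` = `pbf`). The helper names are hypothetical and the tile payload is placeholder bytes, not a real vector tile.

```python
import gzip
import sqlite3
from typing import Optional


def create_sample_mbtiles(conn: sqlite3.Connection, tile_bytes: bytes) -> None:
    """Build a one-tile MBTiles database for demonstration purposes."""
    conn.execute("CREATE TABLE metadata (name TEXT, value TEXT)")
    conn.execute(
        "CREATE TABLE tiles (zoom_level INTEGER, tile_column INTEGER, "
        "tile_row INTEGER, tile_data BLOB)"
    )
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?)",
        [("name", "sample"), ("format", "pbf")],
    )
    # XYZ tile z=1, x=0, y=0 is stored at TMS row 1, because rows count from the bottom.
    conn.execute(
        "INSERT INTO tiles VALUES (?, ?, ?, ?)", (1, 0, 1, gzip.compress(tile_bytes))
    )


def read_tile(conn: sqlite3.Connection, z: int, x: int, y: int) -> Optional[bytes]:
    """Fetch one tile by XYZ coordinates, or None if the file does not contain it."""
    tms_row = (1 << z) - 1 - y  # flip the row from XYZ (top-down) to TMS (bottom-up)
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (z, x, tms_row),
    ).fetchone()
    if row is None:
        return None
    data = bytes(row[0])
    # Vector tiles are normally gzip-compressed; check the gzip magic bytes first.
    return gzip.decompress(data) if data[:2] == b"\x1f\x8b" else data


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    create_sample_mbtiles(db, b"placeholder-not-a-real-mvt")
    print(read_tile(db, 1, 0, 0))
```

Because the whole tile set is one SQLite file, it can be copied to a device as a single asset, which fits the offline use case discussed in this thread.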

Here is a summary of what I think is going to happen in the short term:

  • @Marena and @yanokwa will coordinate in private to get the API key that Mapbox has generously offered into the build process with the documentation she describes in this comment.
  • @langstonsmith will finish his effort to get online Mapbox basemaps in Collect as described in Adding Mapbox vector tile basemaps - #36 by zestyping. This support will only work on the release build which will have the API key mentioned above. Alternately, those who fork can add their own API key as described in @Marena's documentation.
  • @zestyping will work off of the same branch and add support for offline vector mbtiles. This will work without requiring an API key.
  • I will be ready to code review as needed.

Did I get all that right?

@Marena, is there any way that you can restrict an API key to a specific Android app package name? I'm guessing probably not but thought I'd ask because that could make it very easy to use the special API key. It wouldn't need to be protected because it wouldn't work in any app but the officially released Collect. That's what Google has done (but of course they're entirely Android-centric).

As part of this work, I think it makes sense to rethink how users specify the various layers that should be shown, as @zestyping alluded to in his original post. Currently, there's a Mapping section in user interface settings with two preferences: Mapping SDK and Basemap. I'd like to propose that we remove "Mapping SDK" since that is not something a user should care about and instead have the following preferences:

  • Online basemap. Options will include the Google basemaps, OSM basemaps, Mapbox basemaps and none. If a selection is made here, that will determine which SDK is used behind the scenes.
  • Offline basemap. These can be rasters or vectors. If no online basemap was selected, the Mapbox SDK will be used. If a user selects a Google online basemap and a vector offline basemap, some kind of error message will be shown.
  • (Eventually) Editable vector layer (wording to be determined)
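The rules in the proposed preferences can be sketched as one small function. The names and string values are hypothetical, and one detail is my assumption rather than something stated above: that the OSM online basemap would be drawn by the Mapbox SDK (consistent with the idea of the Mapbox SDK replacing OSMdroid).

```python
from typing import Optional

# Hypothetical identifiers for the SDK chosen behind the scenes.
GOOGLE_SDK = "google-maps"
MAPBOX_SDK = "mapbox"

ONLINE_CHOICES = (None, "google", "osm", "mapbox")  # None means no online basemap
OFFLINE_KINDS = (None, "raster", "vector")          # None means no offline basemap


def resolve_sdk(online: Optional[str], offline_kind: Optional[str]) -> str:
    """Decide which SDK to use from the two user-facing preferences."""
    if online not in ONLINE_CHOICES:
        raise ValueError(f"unknown online basemap: {online!r}")
    if offline_kind not in OFFLINE_KINDS:
        raise ValueError(f"unknown offline basemap kind: {offline_kind!r}")
    if online == "google":
        # The proposal calls for an error message in this combination.
        if offline_kind == "vector":
            raise ValueError(
                "A Google online basemap cannot be combined with an offline vector basemap."
            )
        return GOOGLE_SDK
    # Mapbox, OSM (assumption), or no online basemap at all: use the Mapbox SDK.
    return MAPBOX_SDK
```

Keeping the SDK choice derived from the basemap preferences, rather than exposed as its own setting, is the point of the proposal: the user picks what they want to see and the app works out how to draw it.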

@Ivangayton, I'm particularly interested in your feedback on that proposal. Note that I'm suggesting one online basemap from a fixed list and one offline basemap that is user-specified.

We can start thinking about selectable and editable GeoJSON support in parallel, but that will be considered a separate feature, or more likely two (select may be a pretty low bar). Sample data files and desired behavior from @mathieubossaert and @Ivangayton will help guide that. No urgency, so it can wait until @mathieubossaert can breathe again!
