Collecting forum stats

Hello all,

I am amazed by the number of questions and the amount of activity in this forum! As part of the ODK documentation work, I came up with an idea to collect this data for further analysis.

For example, we could check which area of ODK is most active, or estimate the number of questions per month, the number of solved and unsolved issues, etc.
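A minimal sketch of the "questions per month" count, assuming the topics have already been collected as records with an ISO 8601 `created_at` timestamp. The field name and the sample topics are made up for illustration and are not real forum data.

```python
from collections import Counter
from datetime import datetime


def topics_per_month(topics):
    """Count topics by the YYYY-MM of their creation date."""
    counts = Counter()
    for topic in topics:
        # Assumption: timestamps look like "2017-09-12T10:15:30.000Z",
        # so the first 10 characters are the calendar date.
        created = datetime.strptime(topic["created_at"][:10], "%Y-%m-%d")
        counts[created.strftime("%Y-%m")] += 1
    # Return the months in chronological order.
    return dict(sorted(counts.items()))


# Hypothetical sample records, for illustration only.
sample_topics = [
    {"title": "Form not uploading", "created_at": "2017-09-12T10:15:30.000Z"},
    {"title": "Server setup question", "created_at": "2017-09-28T08:00:00.000Z"},
    {"title": "Skip logic help", "created_at": "2017-10-03T14:45:00.000Z"},
]

print(topics_per_month(sample_topics))
```

The same grouping could be applied per category to see which area is most active.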

Web scraping might be a convenient method to get the data.

What do you think of the idea?

Thanks & Regards,
Nada

@yanokwa, @LN I'm wondering if there are any APIs available, or direct DB access, that would make it easier to access this information in a structured way.

1 Like

Hi @adammichaelwood. I tried scraping the site using Selenium WebDriver, which imitates a web browser. I could extract some data (categories, # of topics).

odkforum.csv (227 Bytes)

The documentation for the forum's API is at http://docs.discourse.org and I'd prefer that to scraping.
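As a rough sketch of what the API route could look like: Discourse serves the public category list as JSON at `/categories.json`, so the category names and topic counts can be read without a browser. The base URL is left as a parameter, the sample payload is invented, and the helper names are only illustrative.

```python
import csv
import io
import json
from urllib.request import urlopen


def fetch_categories(base_url):
    """Download the public category list as parsed JSON (needs network access)."""
    with urlopen(f"{base_url}/categories.json") as response:
        return json.load(response)


def category_rows(payload):
    """Extract (name, topic_count) pairs from a categories.json payload."""
    categories = payload["category_list"]["categories"]
    return [(c["name"], c["topic_count"]) for c in categories]


def rows_to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "topics"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical payload in the shape categories.json returns; the numbers are made up.
sample_payload = {
    "category_list": {
        "categories": [
            {"id": 1, "name": "Support", "topic_count": 120},
            {"id": 2, "name": "Development", "topic_count": 45},
        ]
    }
}

print(rows_to_csv(category_rows(sample_payload)))
```

Against a live forum, `category_rows(fetch_categories(base_url))` would replace the sample payload, where `base_url` is the forum's address.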

If that API is inadequate, the forum also has a data explorer that lets admins run queries against the live data. I don't think that level of access would be appropriate for a non-committer, but perhaps that access could be mediated by an admin.

1 Like

Nada, I think this is an interesting idea. What actions do you expect the community would take from this data? Put another way, what problem does this effort solve?

1 Like

Thanks, Yaw, I will check this.

I think we could improve the documentation once we learn about the forum's topics. The forum is the practical side of ODK: people share their experiences here and give examples. A new user might come and ask the same question that another user asked before. If that question is documented well, it will help users solve their problems faster.

I love the idea of being data-driven about the documentation, but we don't have to scrape the forum to get that data. I would propose we take a first pass by looking at the publicly available data.

For example, in the support category, over the last year:

Do you think that would help inform the gaps?

This would be helpful. I like that there are several filters (year/month/day) for top topics.
I think we need to refine topics using tags. Suppose I want to get all ODK Build or ODK Aggregate topics. It would be helpful to make tags mandatory.
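To illustrate how tags could refine the analysis, here is a small sketch that groups topics by tag. It assumes each topic record carries a `tags` list of strings, and it puts topics without tags in an "untagged" bucket. The tag names and topics are invented for the example.

```python
from collections import defaultdict


def topics_by_tag(topics):
    """Group topic titles under each of their tags."""
    grouped = defaultdict(list)
    for topic in topics:
        # Assumption: "tags" is a list of strings; it may be missing or empty
        # because tagging is optional.
        tags = topic.get("tags") or ["untagged"]
        for tag in tags:
            grouped[tag].append(topic["title"])
    return dict(grouped)


# Hypothetical sample records, for illustration only.
sample_topics = [
    {"title": "Cannot export form", "tags": ["build"]},
    {"title": "Submission upload fails", "tags": ["aggregate", "collect"]},
    {"title": "General question", "tags": []},
]

for tag, titles in topics_by_tag(sample_topics).items():
    print(tag, len(titles))
```

The size of the "untagged" bucket gives a quick sense of how much optional tagging currently misses.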

Also, is there a way to know if a question is answered without reviewing the whole thread?

Thank you

There is currently no way to require tags, but I bet that feature will be added to Discourse soon. See https://meta.discourse.org/t/the-option-to-enforce-tagging/69527/6.

You can filter by solved and unsolved like so:

But note that not all unsolved problems are actually unsolved. It might be that the answer wasn't flagged as solved by the original poster. It might also be a question that can't really be solved.
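A small sketch of how the solved/unsolved split could be summarized with that caveat in mind. It assumes each topic record has a boolean `has_accepted_answer` field (an assumption about how the solved feature exposes this), and it labels the remainder "no accepted answer" rather than "unsolved". The sample topics are made up.

```python
def solved_summary(topics):
    """Count topics with and without an accepted answer."""
    total = len(topics)
    # Assumption: a missing flag means no answer was marked as the solution.
    accepted = sum(1 for t in topics if t.get("has_accepted_answer"))
    return {
        "accepted_answer": accepted,
        # Not the same as "unsolved": the poster may simply not have marked it.
        "no_accepted_answer": total - accepted,
        "accepted_share": round(accepted / total, 2) if total else 0.0,
    }


# Hypothetical sample records, for illustration only.
sample_topics = [
    {"title": "Form not uploading", "has_accepted_answer": True},
    {"title": "Server setup question", "has_accepted_answer": False},
    {"title": "Skip logic help"},
    {"title": "Feature idea", "has_accepted_answer": False},
]

print(solved_summary(sample_topics))
```

Treating the second count as a list of topics to review, rather than a count of open problems, keeps the numbers honest.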

1 Like

@nada_gh I agree with you.
Also, it would be good if people could try as much as possible to let the community know whether their problems are solved, and with which solution.

2 Likes