Calculating week-of-year for a date


(Dr. Gareth S. Bestor) #1

This resulted from a recent Forum question posted by @Godsway_Sackey, asking how to determine the week number of a given date; that is, in which week of the year (eg 1..52) a date occurs. Presently, there is no build-in ODK function to computes this, nor does the existing format-date() ODK function have a option for displaying the week number. However, it appeared that ODK possessed all the necessary pieces of the puzzle to solve the problem... It was a fun and interesting exercise, so I thought I'd share with the wider community. Plus it gives you an idea of what you can actually accomplish in ODK forms with a bit of creative thinking! :slight_smile:

To restate the problem as originally posed:

"get a week number automatically when I select date"

per ISO8601, weeks start on a Monday, where the first Monday of the year determines week #1 [these are important and will dictate the precise result].

Determining the week of year is certainly no new problem in computer science, and there are a lot of existing tools which already perform this calculation. I chose to look at the standard Linux/C function strftime() for the formula:

Description

The C library function size_t strftime(char *str, size_t maxsize, const char *format, const struct tm *timeptr) formats the time represented in the structure timeptr according to the formatting rules defined in format and stored into str.

Declaration

Following is the declaration for strftime() function.

size_t strftime(char *str, size_t maxsize, const char *format, const struct tm *timeptr)
...
%W Week number with the first Monday as the first day of week one (00-53)

The specific formula that strftime() uses to compute %W can be found in the source code for strftime.c:

case 'W':
    if (!_conv((t->tm_yday + 7
         - (t->tm_wday ? (t->tm_wday - 1) : 6))
	 / 7, 2, '0'))
		return(0);
    continue;

So our desired formula is: (t->tm_yday + 7 - (t->tm_wday ? (t->tm_wday - 1) : 6)) / 7

We can looking at the header file time.h to find out what these variables mean:

struct tm {
	int	tm_sec;		/* seconds after the minute [0-60] */
	int	tm_min;		/* minutes after the hour [0-59] */
	int	tm_hour;	        /* hours since midnight [0-23] */
	int	tm_mday;	        /* day of the month [1-31] */
	int	tm_mon;		/* months since January [0-11] */
	int	tm_year;	        /* years since 1900 */
	int	tm_wday;        /* days since Sunday [0-6] */
	int	tm_yday;        /* days since January 1 [0-365] */
	int	tm_isdst;	/* Daylight Savings Time flag */
	long	tm_gmtoff;	/* offset from CUT in seconds */
	char	*tm_zone;	/* timezone abbreviation */
};

To summarize, if we can get the day-of-year (tm_yday) and day-of-week (tm_wday) we can plug it into this formula to give us the week-of-year!

Going back to ODK-land... there isn't an existing function to calculate the day-of-year; however, we can calculate this by subtracting from the numeric representation of our target date, obtained using ODK's decimal-date-time(), the numeric representation of Jan 1 of the same year, which will give us the number of days between; hey presto day-of-year!

So the next problem is how to generate a date for Jan 1 of the same year... This can be accomplished by extracting just the year from our target date, using ODK's format-date() with a format of '%Y'. To this we just need to add "-01-01" for Jan 1. That is:

year = format-date(${date},"%Y")
dayofyear = decimal-date-time(${date}) - decimal-date-time(concat(${year}, "-01-01"))

Next, we need to get the day-of-week... ODK's format-date() can also be used for this, using the '%a' format, but unfortunately it gives us the weekday text abbreviation ("Mon", "Tue", ...) whereas we need the weekday number (0,1,2,...) for our formula. Rather than write a long messy nested if() expression to do this conversion, instead I made a simple lookup table using a select_one question's label-value pairs.

label value
-----+-----
0     Mon
1     Tue
2     Wed
3     Thu
4     Fri
5     Sat
6     Sun

[to prevent this select_one question from actually appearing in the form, you can just set its relevant="false()"]

Once I have the weekday text obtained from format-date() I can lookup the corresponding weekday number with a single jr:choice-name() call. That is:

weekday = format-date(${date},"%a")
weekdaynum = jr:choice-name(${weekday},'${lookuptable}')

where ${lookuptable} is a reference to my hidden select_one question.

And now, finally, we have all the data we need to plug into our week formula!!!

weekofyear = int((${dayofyear} + 7 - ${weekdaynum}) div 7)

(the original C formula does integer division, which we mimic in ODK by dropping the decimal from the result using the int() function)

There you have it. Here's a form which implements all the steps of the calculation - have a play!

weekyear.xls (20.5 KB)
weekyear.xml (2.9 KB)

[bonus question: who can tell me why I labelled Monday as 0 in my select_one, and not Sunday... :wink: ]


Auto generate week number from date
Validate issue for decimal-date-time
(Godsway Sackey) #2

Wow! This is great! I will try this out

Thank you

Nurture your mind with great thoughts for you will never go any higher than you think


(Dr. Gareth S. Bestor) #3

This is probably worth emphasizing. As a consequence of how ISO8601 defines the first week of the year, it is possible for some of the first days of January to be calculated in week #0 (!) [which I guess you can interpret as actually falling in the last week of the previous year...]. For example, in 2019, week #1 starts Monday Jan 7, so Jan 2 (a Wednesday) is considered to fall in week #0.


(ARIF AZAD KHAN) #4

Though it is easier for me to use if function than lookup table function :slightly_smiling_face:

Thanks for sharing this with detail explanation. I tested the xml and it works fine. But while trying to convert the xls to XML it returns an error.


(Dr. Gareth S. Bestor) #5

Hmm... dunno. I exported both the XLS and XML - from the original Kobo form - just fine. I'll poke around, thank you for the heads-up.


(Aurelio) #6

Hi Gareth,

i think the issue on converting to xml is a bug of the Validate probably. (it fails on validate the "decimal-date-time(concat(${year}, "-01-01"))" )
Running XLSform offline 1.8.1 unchecking the option to Validate the Xform will produce an xml that then works on Collect.

I understand you got it from Kobo, but this is an Open Data Kit forum and not a Kobo one.
It is a very interesting form and a good showcase, but it is important also to show something that works for users that don't use Kobo and that face problems on converting the xlsform through the usual way (XLSform Offline or online).

@ARIF_AZAD_KHAN if you want to use the form editing according to your need, then at the moment you need to use the XLSform offline and uncheck the Validate option.

I'll open a new thread on the Support Channel and try to address an issue to GitHub.

@Xiphware thanks a lot for the useful example (I'll use it with my students)


(Dr. Gareth S. Bestor) #7

Sorry. I tested the XML in Collect (and XLS via Enketo preview), but alas not actually loading into Aggregate which would have triggered the Validate bug(s). :slightly_frowning_face: I'll edit the form to workaround; decimal-date-time() is arguably optional since ODK's XPath eval will do the same conversion automatically, so the form should still work OK without it.


(Aurelio) #8

Hi @Xiphware, I didn't talk of Aggregate. Try to upload your xls on http://opendatakit.org/xlsform/ and you'll see what I mean.
I opened a post on support on the issue:


(Dr. Gareth S. Bestor) #9

I've replaced decimal-date-time() in the form(s) with number() in the interim whist the Validate bug is fixed [number() performs same conversion]. And renamed the form to avoid the other bug [that filename was just last minute when I put form files into posting. D'oh]. Tested against http://opendatakit.org/xlsform/


(ARIF AZAD KHAN) #10

@Xiphware attached xls does not give expected result in ODK collect but works fine in enketo.


(Dr. Gareth S. Bestor) #11

It looks like there's some bugs lurking around decimal-date-time() support. I've pulled the forms till I can figure out a common workaround that'll work against both Collect, Validate, and Enketo... sorry folks.


(Dr. Gareth S. Bestor) #12

After determining the bug in Validate, I've been able to formulate a workaround and re-posted a new XLS and XML form, which have been tested against both Validate, Enketo, and Collect.

For those interested, I've highlighted the change in the XLS form that was necessary to 'fake out' Validate and prevent it from throwing a bogus type mismatch error when evaluating date-related XPath expressions. Specifically, I changed

dayofyear = decimal-date-time(${date}) - decimal-date-time(concat(${year}, "-01-01"))

to

dayofyear = decimal-date-time(${date}) - decimal-date-time(concat(coalesce(${year},"2019"), "-01-01"))

Replacing ${year} with coalesce(${year},"2019") means Validate gets a valid date string when it evaluates the decimal-date-time() argument during its (static) type checking phase; the 2019 year will be ignored as soon as you run the form and enter an actual date. This is only needed as bug workaround and is not part of the algorithm.