Entries in Specification (47)

Thursday
Mar 24, 2011

Compatibility Kludges

From part 1 of the Google Browser Security Handbook:

Like many other text protocols of that time, early takes on HTTP made little or no effort to mandate a strict adherence to a particular understanding of what a text-based format really is, or how certain "intuitive" field values must be structured. Because of this, implementations would recognize, and often handle in incompatible ways, technically malformed inputs ........ In later versions, as outlined in RFC 2616 section 19.3 ("Tolerant Applications"), the standard explicitly recommends, but does not require, lax parsing of certain fields and invalid values. One of the most striking examples of compatibility kludges is Firefox prtime.c function used to parse HTTP Date fields, which shows a stunning complexity behind what should be a remarkably simple task.

Here then, in all its glory, is the code that the writers of prtime.c found necessary to cope with this 'stunning complexity' created by the specifiers of HTTP:

/*
* The following code implements PR_ParseTimeString(). It is based on
* ns/lib/xp/xp_time.c, revision 1.25, by Jamie Zawinski.
*/

/*
* We only recognize the abbreviations of a small subset of time zones
* in North America, Europe, and Japan.
*
* PST/PDT: Pacific Standard/Daylight Time
* MST/MDT: Mountain Standard/Daylight Time
* CST/CDT: Central Standard/Daylight Time
* EST/EDT: Eastern Standard/Daylight Time
* AST: Atlantic Standard Time
* NST: Newfoundland Standard Time
* GMT: Greenwich Mean Time
* BST: British Summer Time
* MET: Middle Europe Time
* EET: Eastern Europe Time
* JST: Japan Standard Time
*/

typedef enum
{
TT_UNKNOWN,

TT_SUN, TT_MON, TT_TUE, TT_WED, TT_THU, TT_FRI, TT_SAT,

TT_JAN, TT_FEB, TT_MAR, TT_APR, TT_MAY, TT_JUN,
TT_JUL, TT_AUG, TT_SEP, TT_OCT, TT_NOV, TT_DEC,

TT_PST, TT_PDT, TT_MST, TT_MDT, TT_CST, TT_CDT, TT_EST, TT_EDT,
TT_AST, TT_NST, TT_GMT, TT_BST, TT_MET, TT_EET, TT_JST
} TIME_TOKEN;

/*
* This parses a time/date string into a PRTime
* (microseconds after "1-Jan-1970 00:00:00 GMT").
* It returns PR_SUCCESS on success, and PR_FAILURE
* if the time/date string can't be parsed.
*
* Many formats are handled, including:
*
* 14 Apr 89 03:20:12
* 14 Apr 89 03:20 GMT
* Fri, 17 Mar 89 4:01:33
* Fri, 17 Mar 89 4:01 GMT
* Mon Jan 16 16:12 PDT 1989
* Mon Jan 16 16:12 +0130 1989
* 6 May 1992 16:41-JST (Wednesday)
* 22-AUG-1993 10:59:12.82
* 22-AUG-1993 10:59pm
* 22-AUG-1993 12:59am
* 22-AUG-1993 12:59 PM
* Friday, August 04, 1995 3:54 PM
* 06/21/95 04:24:34 PM
* 20/06/95 21:07
* 95-06-08 19:32:48 EDT
*
* If the input string doesn't contain a description of the timezone,
* we consult the `default_to_gmt' to decide whether the string should
* be interpreted relative to the local time zone (PR_FALSE) or GMT (PR_TRUE).
* The correct value for this argument depends on what standard specified
* the time string which you are parsing.
*/

PR_IMPLEMENT(PRStatus)
PR_ParseTimeString(
const char *string,
PRBool default_to_gmt,
PRTime *result)
{
PRExplodedTime tm;
TIME_TOKEN dotw = TT_UNKNOWN;
TIME_TOKEN month = TT_UNKNOWN;
TIME_TOKEN zone = TT_UNKNOWN;
int zone_offset = -1;
int date = -1;
PRInt32 year = -1;
int hour = -1;
int min = -1;
int sec = -1;

const char *rest = string;

#ifdef DEBUG
int iterations = 0;
#endif

PR_ASSERT(string && result);
if (!string || !result) return PR_FAILURE;

while (*rest)
{

#ifdef DEBUG
if (iterations++ > 1000)
{
PR_ASSERT(0);
return PR_FAILURE;
}
#endif

switch (*rest)
{
case 'a': case 'A':
if (month == TT_UNKNOWN &&
(rest[1] == 'p' || rest[1] == 'P') &&
(rest[2] == 'r' || rest[2] == 'R'))
month = TT_APR;
else if (zone == TT_UNKNOWN &&
(rest[1] == 's' || rest[1] == 'S') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_AST;
else if (month == TT_UNKNOWN &&
(rest[1] == 'u' || rest[1] == 'U') &&
(rest[2] == 'g' || rest[2] == 'G'))
month = TT_AUG;
break;
case 'b': case 'B':
if (zone == TT_UNKNOWN &&
(rest[1] == 's' || rest[1] == 'S') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_BST;
break;
case 'c': case 'C':
if (zone == TT_UNKNOWN &&
(rest[1] == 'd' || rest[1] == 'D') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_CDT;
else if (zone == TT_UNKNOWN &&
(rest[1] == 's' || rest[1] == 'S') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_CST;
break;
case 'd': case 'D':
if (month == TT_UNKNOWN &&
(rest[1] == 'e' || rest[1] == 'E') &&
(rest[2] == 'c' || rest[2] == 'C'))
month = TT_DEC;
break;
case 'e': case 'E':
if (zone == TT_UNKNOWN &&
(rest[1] == 'd' || rest[1] == 'D') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_EDT;
else if (zone == TT_UNKNOWN &&
(rest[1] == 'e' || rest[1] == 'E') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_EET;
else if (zone == TT_UNKNOWN &&
(rest[1] == 's' || rest[1] == 'S') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_EST;
break;
case 'f': case 'F':
if (month == TT_UNKNOWN &&
(rest[1] == 'e' || rest[1] == 'E') &&
(rest[2] == 'b' || rest[2] == 'B'))
month = TT_FEB;
else if (dotw == TT_UNKNOWN &&
(rest[1] == 'r' || rest[1] == 'R') &&
(rest[2] == 'i' || rest[2] == 'I'))
dotw = TT_FRI;
break;
case 'g': case 'G':
if (zone == TT_UNKNOWN &&
(rest[1] == 'm' || rest[1] == 'M') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_GMT;
break;
case 'j': case 'J':
if (month == TT_UNKNOWN &&
(rest[1] == 'a' || rest[1] == 'A') &&
(rest[2] == 'n' || rest[2] == 'N'))
month = TT_JAN;
else if (zone == TT_UNKNOWN &&
(rest[1] == 's' || rest[1] == 'S') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_JST;
else if (month == TT_UNKNOWN &&
(rest[1] == 'u' || rest[1] == 'U') &&
(rest[2] == 'l' || rest[2] == 'L'))
month = TT_JUL;
else if (month == TT_UNKNOWN &&
(rest[1] == 'u' || rest[1] == 'U') &&
(rest[2] == 'n' || rest[2] == 'N'))
month = TT_JUN;
break;
case 'm': case 'M':
if (month == TT_UNKNOWN &&
(rest[1] == 'a' || rest[1] == 'A') &&
(rest[2] == 'r' || rest[2] == 'R'))
month = TT_MAR;
else if (month == TT_UNKNOWN &&
(rest[1] == 'a' || rest[1] == 'A') &&
(rest[2] == 'y' || rest[2] == 'Y'))
month = TT_MAY;
else if (zone == TT_UNKNOWN &&
(rest[1] == 'd' || rest[1] == 'D') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_MDT;
else if (zone == TT_UNKNOWN &&
(rest[1] == 'e' || rest[1] == 'E') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_MET;
else if (dotw == TT_UNKNOWN &&
(rest[1] == 'o' || rest[1] == 'O') &&
(rest[2] == 'n' || rest[2] == 'N'))
dotw = TT_MON;
else if (zone == TT_UNKNOWN &&
(rest[1] == 's' || rest[1] == 'S') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_MST;
break;
case 'n': case 'N':
if (month == TT_UNKNOWN &&
(rest[1] == 'o' || rest[1] == 'O') &&
(rest[2] == 'v' || rest[2] == 'V'))
month = TT_NOV;
else if (zone == TT_UNKNOWN &&
(rest[1] == 's' || rest[1] == 'S') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_NST;
break;
case 'o': case 'O':
if (month == TT_UNKNOWN &&
(rest[1] == 'c' || rest[1] == 'C') &&
(rest[2] == 't' || rest[2] == 'T'))
month = TT_OCT;
break;
case 'p': case 'P':
if (zone == TT_UNKNOWN &&
(rest[1] == 'd' || rest[1] == 'D') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_PDT;
else if (zone == TT_UNKNOWN &&
(rest[1] == 's' || rest[1] == 'S') &&
(rest[2] == 't' || rest[2] == 'T'))
zone = TT_PST;
break;
case 's': case 'S':
if (dotw == TT_UNKNOWN &&
(rest[1] == 'a' || rest[1] == 'A') &&
(rest[2] == 't' || rest[2] == 'T'))
dotw = TT_SAT;
else if (month == TT_UNKNOWN &&
(rest[1] == 'e' || rest[1] == 'E') &&
(rest[2] == 'p' || rest[2] == 'P'))
month = TT_SEP;
else if (dotw == TT_UNKNOWN &&
(rest[1] == 'u' || rest[1] == 'U') &&
(rest[2] == 'n' || rest[2] == 'N'))
dotw = TT_SUN;
break;
case 't': case 'T':
if (dotw == TT_UNKNOWN &&
(rest[1] == 'h' || rest[1] == 'H') &&
(rest[2] == 'u' || rest[2] == 'U'))
dotw = TT_THU;
else if (dotw == TT_UNKNOWN &&
(rest[1] == 'u' || rest[1] == 'U') &&
(rest[2] == 'e' || rest[2] == 'E'))
dotw = TT_TUE;
break;
case 'u': case 'U':
if (zone == TT_UNKNOWN &&
(rest[1] == 't' || rest[1] == 'T') &&
!(rest[2] >= 'A' && rest[2] <= 'Z') &&
!(rest[2] >= 'a' && rest[2] <= 'z'))
/* UT is the same as GMT but UTx is not. */
zone = TT_GMT;
break;
case 'w': case 'W':
if (dotw == TT_UNKNOWN &&
(rest[1] == 'e' || rest[1] == 'E') &&
(rest[2] == 'd' || rest[2] == 'D'))
dotw = TT_WED;
break;

case '+': case '-':
{
const char *end;
int sign;
if (zone_offset != -1)
{
/* already got one... */
rest++;
break;
}
if (zone != TT_UNKNOWN && zone != TT_GMT)
{
/* GMT+0300 is legal, but PST+0300 is not. */
rest++;
break;
}

sign = ((*rest == '+') ? 1 : -1);
rest++; /* move over sign */
end = rest;
while (*end >= '0' && *end <= '9')
end++;
if (rest == end) /* no digits here */
break;

if ((end - rest) == 4)
/* offset in HHMM */
zone_offset = (((((rest[0]-'0')*10) + (rest[1]-'0')) * 60) +
(((rest[2]-'0')*10) + (rest[3]-'0')));
else if ((end - rest) == 2)
/* offset in hours */
zone_offset = (((rest[0]-'0')*10) + (rest[1]-'0')) * 60;
else if ((end - rest) == 1)
/* offset in hours */
zone_offset = (rest[0]-'0') * 60;
else
/* 3 or >4 */
break;

zone_offset *= sign;
zone = TT_GMT;
break;
}

case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9':
{
int tmp_hour = -1;
int tmp_min = -1;
int tmp_sec = -1;
const char *end = rest + 1;
while (*end >= '0' && *end <= '9')
end++;

/* end is now the first character after a range of digits. */

if (*end == ':')
{
if (hour >= 0 && min >= 0) /* already got it */
break;

/* We have seen "[0-9]+:", so this is probably HH:MM[:SS] */
if ((end - rest) > 2)
/* it is [0-9][0-9][0-9]+: */
break;
else if ((end - rest) == 2)
tmp_hour = ((rest[0]-'0')*10 +
(rest[1]-'0'));
else
tmp_hour = (rest[0]-'0');

/* move over the colon, and parse minutes */

rest = ++end;
while (*end >= '0' && *end <= '9')
end++;

if (end == rest)
/* no digits after first colon? */
break;
else if ((end - rest) > 2)
/* it is [0-9][0-9][0-9]+: */
break;
else if ((end - rest) == 2)
tmp_min = ((rest[0]-'0')*10 +
(rest[1]-'0'));
else
tmp_min = (rest[0]-'0');

/* now go for seconds */
rest = end;
if (*rest == ':')
rest++;
end = rest;
while (*end >= '0' && *end <= '9')
end++;

if (end == rest)
/* no digits after second colon - that's ok. */
;
else if ((end - rest) > 2)
/* it is [0-9][0-9][0-9]+: */
break;
else if ((end - rest) == 2)
tmp_sec = ((rest[0]-'0')*10 +
(rest[1]-'0'));
else
tmp_sec = (rest[0]-'0');

/* If we made it here, we've parsed hour and min,
and possibly sec, so it worked as a unit. */

/* skip over whitespace and see if there's an AM or PM
directly following the time.
*/
if (tmp_hour <= 12)
{
const char *s = end;
while (*s && (*s == ' ' || *s == '\t'))
s++;
if ((s[0] == 'p' || s[0] == 'P') &&
(s[1] == 'm' || s[1] == 'M'))
/* 10:05pm == 22:05, and 12:05pm == 12:05 */
tmp_hour = (tmp_hour == 12 ? 12 : tmp_hour + 12);
else if (tmp_hour == 12 &&
(s[0] == 'a' || s[0] == 'A') &&
(s[1] == 'm' || s[1] == 'M'))
/* 12:05am == 00:05 */
tmp_hour = 0;
}

hour = tmp_hour;
min = tmp_min;
sec = tmp_sec;
rest = end;
break;
}
else if ((*end == '/' || *end == '-') &&
end[1] >= '0' && end[1] <= '9')
{
/* Perhaps this is 6/16/95, 16/6/95, 6-16-95, or 16-6-95
or even 95-06-05...
#### But it doesn't handle 1995-06-22.
*/
int n1, n2, n3;
const char *s;

if (month != TT_UNKNOWN)
/* if we saw a month name, this can't be. */
break;

s = rest;

n1 = (*s++ - '0'); /* first 1 or 2 digits */
if (*s >= '0' && *s <= '9')
n1 = n1*10 + (*s++ - '0');

if (*s != '/' && *s != '-') /* slash */
break;
s++;

if (*s < '0' || *s > '9') /* second 1 or 2 digits */
break;
n2 = (*s++ - '0');
if (*s >= '0' && *s <= '9')
n2 = n2*10 + (*s++ - '0');

if (*s != '/' && *s != '-') /* slash */
break;
s++;

if (*s < '0' || *s > '9') /* third 1, 2, or 4 digits */
break;
n3 = (*s++ - '0');
if (*s >= '0' && *s <= '9')
n3 = n3*10 + (*s++ - '0');

if (*s >= '0' && *s <= '9') /* optional digits 3 and 4 */
{
n3 = n3*10 + (*s++ - '0');
if (*s < '0' || *s > '9')
break;
n3 = n3*10 + (*s++ - '0');
}

if ((*s >= '0' && *s <= '9') || /* followed by non-alphanum */
(*s >= 'A' && *s <= 'Z') ||
(*s >= 'a' && *s <= 'z'))
break;

/* Ok, we parsed three 1-2 digit numbers, with / or -
between them. Now decide what the hell they are
(DD/MM/YY or MM/DD/YY or YY/MM/DD.)
*/

if (n1 > 31 || n1 == 0) /* must be YY/MM/DD */
{
if (n2 > 12) break;
if (n3 > 31) break;
year = n1;
if (year < 70)
year += 2000;
else if (year < 100)
year += 1900;
month = (TIME_TOKEN)(n2 + ((int)TT_JAN) - 1);
date = n3;
rest = s;
break;
}

if (n1 > 12 && n2 > 12) /* illegal */
{
rest = s;
break;
}

if (n3 < 70)
n3 += 2000;
else if (n3 < 100)
n3 += 1900;

if (n1 > 12) /* must be DD/MM/YY */
{
date = n1;
month = (TIME_TOKEN)(n2 + ((int)TT_JAN) - 1);
year = n3;
}
else /* assume MM/DD/YY */
{
/* #### In the ambiguous case, should we consult the
locale to find out the local default? */
month = (TIME_TOKEN)(n1 + ((int)TT_JAN) - 1);
date = n2;
year = n3;
}
rest = s;
}
else if ((*end >= 'A' && *end <= 'Z') ||
(*end >= 'a' && *end <= 'z'))
/* Digits followed by non-punctuation - what's that? */
;
else if ((end - rest) == 4) /* four digits is a year */
year = (year < 0
? ((rest[0]-'0')*1000L +
(rest[1]-'0')*100L +
(rest[2]-'0')*10L +
(rest[3]-'0'))
: year);
else if ((end - rest) == 2) /* two digits - date or year */
{
int n = ((rest[0]-'0')*10 +
(rest[1]-'0'));
/* If we don't have a date (day of the month) and we see a number
less than 32, then assume that is the date.

Otherwise, if we have a date and not a year, assume this is the
year. If it is less than 70, then assume it refers to the 21st
century. If it is two digits (>= 70), assume it refers to this
century. Otherwise, assume it refers to an unambiguous year.

The world will surely end soon.
*/
if (date < 0 && n < 32)
date = n;
else if (year < 0)
{
if (n < 70)
year = 2000 + n;
else if (n < 100)
year = 1900 + n;
else
year = n;
}
/* else what the hell is this. */
}
else if ((end - rest) == 1) /* one digit - date */
date = (date < 0 ? (rest[0]-'0') : date);
/* else, three or more than four digits - what's that? */

break;
}
}

/* Skip to the end of this token, whether we parsed it or not.
Tokens are delimited by whitespace, or ,;-/
But explicitly not :+-.
*/
while (*rest &&
*rest != ' ' && *rest != '\t' &&
*rest != ',' && *rest != ';' &&
*rest != '-' && *rest != '+' &&
*rest != '/' &&
*rest != '(' && *rest != ')' && *rest != '[' && *rest != ']')
rest++;
/* skip over uninteresting chars. */
SKIP_MORE:
while (*rest &&
(*rest == ' ' || *rest == '\t' ||
*rest == ',' || *rest == ';' || *rest == '/' ||
*rest == '(' || *rest == ')' || *rest == '[' || *rest == ']'))
rest++;

/* "-" is ignored at the beginning of a token if we have not yet
parsed a year (e.g., the second "-" in "30-AUG-1966"), or if
the character after the dash is not a digit. */
if (*rest == '-' && ((rest > string && isalpha(rest[-1]) && year < 0)
|| rest[1] < '0' || rest[1] > '9'))
{
rest++;
goto SKIP_MORE;
}

}

if (zone != TT_UNKNOWN && zone_offset == -1)
{
switch (zone)
{
case TT_PST: zone_offset = -8 * 60; break;
case TT_PDT: zone_offset = -7 * 60; break;
case TT_MST: zone_offset = -7 * 60; break;
case TT_MDT: zone_offset = -6 * 60; break;
case TT_CST: zone_offset = -6 * 60; break;
case TT_CDT: zone_offset = -5 * 60; break;
case TT_EST: zone_offset = -5 * 60; break;
case TT_EDT: zone_offset = -4 * 60; break;
case TT_AST: zone_offset = -4 * 60; break;
case TT_NST: zone_offset = -3 * 60 - 30; break;
case TT_GMT: zone_offset = 0 * 60; break;
case TT_BST: zone_offset = 1 * 60; break;
case TT_MET: zone_offset = 1 * 60; break;
case TT_EET: zone_offset = 2 * 60; break;
case TT_JST: zone_offset = 9 * 60; break;
default:
PR_ASSERT (0);
break;
}
}

/* If we didn't find a year, month, or day-of-the-month, we can't
possibly parse this, and in fact, mktime() will do something random
(I'm seeing it return "Tue Feb 5 06:28:16 2036", which is no doubt
a numerologically significant date... */
if (month == TT_UNKNOWN || date == -1 || year == -1)
return PR_FAILURE;

memset(&tm, 0, sizeof(tm));
if (sec != -1)
tm.tm_sec = sec;
if (min != -1)
tm.tm_min = min;
if (hour != -1)
tm.tm_hour = hour;
if (date != -1)
tm.tm_mday = date;
if (month != TT_UNKNOWN)
tm.tm_month = (((int)month) - ((int)TT_JAN));
if (year != -1)
tm.tm_year = year;
if (dotw != TT_UNKNOWN)
tm.tm_wday = (((int)dotw) - ((int)TT_SUN));

if (zone == TT_UNKNOWN && default_to_gmt)
{
/* No zone was specified, so pretend the zone was GMT. */
zone = TT_GMT;
zone_offset = 0;
}

if (zone_offset == -1)
{
/* no zone was specified, and we're to assume that everything
is local. */
struct tm localTime;
time_t secs;

PR_ASSERT(tm.tm_month > -1
&& tm.tm_mday > 0
&& tm.tm_hour > -1
&& tm.tm_min > -1
&& tm.tm_sec > -1);

/*
* To obtain time_t from a tm structure representing the local
* time, we call mktime(). However, we need to see if we are
* on 1-Jan-1970 or before. If we are, we can't call mktime()
* because mktime() will crash on win16. In that case, we
* calculate zone_offset based on the zone offset at
* 00:00:00, 2 Jan 1970 GMT, and subtract zone_offset from the
* date we are parsing to transform the date to GMT. We also
* do so if mktime() returns (time_t) -1 (time out of range).
*/

/* month, day, hours, mins and secs are always non-negative
so we don't need to worry about them. */
if(tm.tm_year >= 1970)
{
PRInt64 usec_per_sec;

localTime.tm_sec = tm.tm_sec;
localTime.tm_min = tm.tm_min;
localTime.tm_hour = tm.tm_hour;
localTime.tm_mday = tm.tm_mday;
localTime.tm_mon = tm.tm_month;
localTime.tm_year = tm.tm_year - 1900;
/* Set this to -1 to tell mktime "I don't care". If you set
it to 0 or 1, you are making assertions about whether the
date you are handing it is in daylight savings mode or not;
and if you're wrong, it will "fix" it for you. */
localTime.tm_isdst = -1;
secs = mktime(&localTime);
if (secs != (time_t) -1)
{
#if defined(XP_MAC) && (__MSL__ < 0x6000)
/*
* The mktime() routine in MetroWerks MSL C
* Runtime library returns seconds since midnight,
* 1 Jan. 1900, not 1970 - in versions of MSL (Metrowerks Standard
* Library) prior to version 6. Only for older versions of
* MSL do we adjust the value of secs to the NSPR epoch
*/
secs -= ((365 * 70UL) + 17) * 24 * 60 * 60;
#endif
LL_I2L(*result, secs);
LL_I2L(usec_per_sec, PR_USEC_PER_SEC);
LL_MUL(*result, *result, usec_per_sec);
return PR_SUCCESS;
}
}

/* So mktime() can't handle this case. We assume the
zone_offset for the date we are parsing is the same as
the zone offset on 00:00:00 2 Jan 1970 GMT. */
secs = 86400;
(void) MT_safe_localtime(&secs, &localTime);
zone_offset = localTime.tm_min
+ 60 * localTime.tm_hour
+ 1440 * (localTime.tm_mday - 2);
}

/* Adjust the hours and minutes before handing them to
PR_ImplodeTime(). Note that it's ok for them to be <0 or >24/60

We adjust the time to GMT before going into PR_ImplodeTime().
The zone_offset represents the difference between the time
zone parsed and GMT
*/
tm.tm_hour -= (zone_offset / 60);
tm.tm_min -= (zone_offset % 60);

*result = PR_ImplodeTime(&tm);

return PR_SUCCESS;
}

Really, reading a date from a text string should only take a few lines of code, even if you do have to take things like time zones into account.

(The above code is licensed under the Mozilla Public License 1.1, GPL 2.0 and LGPL 2.1. See here for more details.)
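By way of contrast, here is a rough sketch of what a parser might look like if it only had to accept the single fixed-length form that RFC 2616 prefers, for example "Sun, 06 Nov 1994 08:49:37 GMT". The function names (parse_http_date_strict, matches_template, days_from_civil) are my own inventions for illustration and have nothing to do with prtime.c. The sketch assumes the Gregorian calendar throughout, rejects everything that is not exactly in this layout, and does not check that the weekday name agrees with the date.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Days between 1970-01-01 and the given Gregorian date (month is 1..12). */
static long days_from_civil(long y, int m, int d)
{
    long era, yoe, doy, doe;
    if (m <= 2)
        y -= 1;                      /* treat Jan and Feb as the end of the previous year */
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = y - era * 400;                                   /* 0..399 */
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  /* 0..365, counted from 1 March */
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           /* 0..146096 */
    return era * 146097 + doe - 719468;
}

/* Checks the exact layout: 'a' stands for a letter, '9' for a digit,
   and every other template character must match literally. */
static int matches_template(const char *s)
{
    static const char tmpl[] = "aaa, 99 aaa 9999 99:99:99 GMT";
    size_t i;
    if (strlen(s) != sizeof(tmpl) - 1)
        return 0;
    for (i = 0; tmpl[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (tmpl[i] == 'a') {
            if (!isalpha(c)) return 0;
        } else if (tmpl[i] == '9') {
            if (!isdigit(c)) return 0;
        } else if (s[i] != tmpl[i]) {
            return 0;
        }
    }
    return 1;
}

/* Reads two decimal digits that matches_template() has already verified. */
static int two(const char *p)
{
    return (p[0] - '0') * 10 + (p[1] - '0');
}

/* Returns 0 and stores seconds since 1970-01-01 00:00:00 GMT in *out,
   or returns -1 if the string is not in the one accepted format. */
int parse_http_date_strict(const char *s, long long *out)
{
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    static const int mdays[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int day, month = 0, year, hour, min, sec, leap, i;

    if (!s || !out || !matches_template(s))
        return -1;

    for (i = 0; i < 12; i++)
        if (memcmp(s + 8, months + 3 * i, 3) == 0)
            month = i + 1;
    if (month == 0)
        return -1;

    day  = two(s + 5);
    year = two(s + 12) * 100 + two(s + 14);
    hour = two(s + 17);
    min  = two(s + 20);
    sec  = two(s + 23);

    leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    if (day < 1 || day > mdays[month - 1] + (month == 2 && leap))
        return -1;
    if (hour > 23 || min > 59 || sec > 59)
        return -1;

    *out = (long long)days_from_civil(year, month, day) * 86400LL
         + hour * 3600 + min * 60 + sec;
    return 0;
}
```

Everything here is decided by position, so there is nothing to guess: no two-digit years, no MM/DD versus DD/MM, no time zone abbreviations. That is the whole difference between this sketch and the listing above.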

Tuesday
Jun 29, 2010

Another Reason to Learn Haskell

From Real World Haskell (O'Sullivan, Goerzen and Stewart, O'Reilly, 2009, page 38):

In "Function Types and Purity" on page 27 we talked about figuring out the behaviour of a function based on its type signature.  We can apply the same kind of reasoning to polymorphic functions.  Let's look again at fst:

ghci> :type fst
fst :: (a, b) -> a

First of all, notice that its argument contains two type variables, a and b, signifying that the elements of the tuple can be of different types.

The result type of fst is a.  We've already mentioned that parametric polymorphism makes the real type inaccessible.  fst doesn't have enough information to construct a value of type a, nor can it turn an a into a b.  So the only possible valid behaviour (omitting infinite loops or crashes) it can have is to return the first element of the pair.

...

There is a deep mathematical sense in which any nonpathological function of type (a,b) -> a must do exactly what fst does.  ...

If this doesn't surprise you when you first come across it, then you haven't been paying attention.

Tuesday
Mar 09, 2010

Leslie Lamport Interviewed by Erik Meijer

On MSDN Channel 9, here.

Monday
Apr 06, 2009

E-voting Technology is Too Advanced

"Ten years ago, the replacement of pencils and ballot papers by machines was seen as a badge of modernity. But technology was not sufficiently advanced to guarantee security of the new system."

I disagree with this assertion. I think the technology is /too/ advanced to guarantee security of the system. I'm reminded of Clarke's 3rd law: "Any sufficiently advanced technology is indistinguishable from magic." We cannot have a trusted system of democracy if voting works by magic. Voting needs to work in a way that everyone can fully understand.

Robert 'Jamie' Munro, commenting here on this post by Peter G. Neumann.

Sunday
Feb 03, 2008

Where's a Mathematician when you need one?

Software version control systems play a crucial role in modern software development: they allow developers to track and control changes to the large numbers of files that make up modern software systems.  Given that version control systems have been in existence for several decades now, one would have thought that the theory behind their operation would be well-developed, but this is not so.  A quote from a recent talk on DARCS, an innovative modern version control system, given by Ganesh Sittampalam to the London Haskell Users' Group:

Patch Theory:  This is the theory underlying DARCS or, rather, what we would like the theory underlying DARCS to be but we cannot quite figure out what the theory should be.

And another:

We would like to know a consistent set of rules that actually guarantee the behaviour that we want from DARCS.

The formalization of DARCS Patch Theory would appear to be a worthwhile little project for any passing mathematician.