Date/time formatting notation

This is a specification for a notation that describes the format in which to output a date, a time or a combination of the two.

As of Stewart's Utility Library 0.05, all features have been implemented except for the ZZZ/zzz (time zone TLA) format specifier.

Format specifiers

The letters of the basic Latin alphabet are reserved for format specifiers.  Each specifier is a letter, or two or more of the same letter consecutively (picked out by maximal munch before lookup in the following table).  The letter denotes what piece of information to format, and the capitalisation and length denote how to format it.  Where alternative capitalisations are given for a single format, they denote the output capitalisation of the corresponding datum.

Letter  Datum                 Specifier                Format
Y       Year                  yy                       Two digits
                              yyy                      Exact required length
                              yyyy                     Four digits (longer if necessary)
                              YYY                      Astronomical notation (BC represented as negative numbers)
B       BC/AD                 B or b                   BC if necessary
                              BB or bb                 BC or AD
                              BBB or bbb               BCE or CE
                              BBBB or bbbb             BCE if necessary
M       Month                 m                        numeric without leading zero
                              mm                       numeric with leading zero
                              MMM, Mmm or mmm          abbreviated name
                              MMMM, Mmmm or mmmm       full name
D       Day of month          d                        no leading zero
                              dd                       leading zero
T       Ordinal suffix        T or t                   Ordinal suffix (TH, ST, ND or RD) of last formatted datum
W       Day of week           WWW, Www or www          abbreviated name
                              WWWW, Wwww or wwww       full name
H       Hour                  H                        24-hour without leading zero
                              HH                       24-hour with leading zero
                              h                        12-hour without leading zero
                              hh                       12-hour with leading zero
A       AM/PM                 A or a                   A or P
                              AA or aa                 AM or PM
I       Minute                i                        no leading zero
                              ii                       leading zero
S       Second                s                        no leading zero
                              ss                       leading zero
F       Fraction of a second  f                        deciseconds
                              FF                       centiseconds with leading zero
                              ff                       centiseconds without leading zero
                              FFF                      milliseconds with leading zero
                              fff                      milliseconds without leading zero
Z       Time zone             ZZZ or zzz               TLA
                              zzzz                     offset from UTC
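
The maximal-munch rule can be illustrated with a short Python sketch.  It is not the library's implementation: the function name split_specifiers is hypothetical, and treating the upper and lower case forms of one letter as part of the same run (so that Mmm stays in one piece) is an assumption drawn from the table.  Literal quoting and the bracket syntaxes are ignored here.

```python
def split_specifiers(fmt):
    """Split a format string into runs of one repeated letter and single other characters."""
    tokens = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        j = i + 1
        if ch.isascii() and ch.isalpha():
            # Maximal munch: extend the run while the same letter repeats.
            # Assumption: case is ignored when comparing, so "Mmm" is one run.
            while j < len(fmt) and fmt[j] in (ch.lower(), ch.upper()):
                j += 1
        tokens.append(fmt[i:j])
        i = j
    return tokens


print(split_specifiers("Www dt Mmm yyyy BB"))
```

Each letter run would then be looked up in the table as a whole, which is why dt is read as the two specifiers d and t while dd is read as one.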

Examples

Format string                 Sample formatted date/time
dd/mm/yy                      08/09/05
Www dt Mmm yyyy BB            Thu 8th Sep 2005 AD
h:ii AA                       4:51 PM
yyyy-mm-dd HH:ii:ss zzzz      2005-09-08 16:51:09 +0100
HH:ii:ss.FFF ZZZ              16:51:09.427 BST
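
As a rough illustration of how such format strings map to output, here is a minimal Python sketch that handles only the numeric specifiers, the ordinal suffix and AA/aa.  It is an assumption-laden toy, not the library's code: format_subset and ordinal_suffix are hypothetical names, names of days and months, time zones, fractions and BC/AD are left out, and any other letter raises an error.

```python
from datetime import datetime
from itertools import groupby


def ordinal_suffix(n):
    """English ordinal suffix in lower case, e.g. 1 -> 'st', 11 -> 'th', 22 -> 'nd'."""
    if 11 <= n % 100 <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def format_subset(fmt, when):
    """Format `when` using a small subset of the notation (hypothetical helper)."""
    hour12 = when.hour % 12 or 12
    # specifier -> (value, minimum number of digits)
    numbers = {
        "yy": (when.year % 100, 2), "yyyy": (when.year, 4),
        "m": (when.month, 1), "mm": (when.month, 2),
        "d": (when.day, 1), "dd": (when.day, 2),
        "H": (when.hour, 1), "HH": (when.hour, 2),
        "h": (hour12, 1), "hh": (hour12, 2),
        "i": (when.minute, 1), "ii": (when.minute, 2),
        "s": (when.second, 1), "ss": (when.second, 2),
    }
    out = []
    last_number = None  # the T/t specifier refers back to the last formatted datum
    # Group consecutive equal characters, ignoring the case of ASCII letters.
    for _, run in groupby(fmt, key=lambda c: c.lower() if c.isascii() else c):
        token = "".join(run)
        if token in numbers:
            value, width = numbers[token]
            out.append(str(value).zfill(width))
            last_number = value
        elif token in ("t", "T") and last_number is not None:
            suffix = ordinal_suffix(last_number)
            out.append(suffix.upper() if token == "T" else suffix)
        elif token in ("AA", "aa"):
            marker = "AM" if when.hour < 12 else "PM"
            out.append(marker if token == "AA" else marker.lower())
        elif token[0].isascii() and token[0].isalpha():
            raise ValueError("specifier not handled in this sketch: " + token)
        else:
            out.append(token)  # punctuation and spaces pass through unchanged
    return "".join(out)


sample = datetime(2005, 9, 8, 16, 51, 9)
print(format_subset("dd/mm/yy", sample))   # 08/09/05
print(format_subset("h:ii AA", sample))    # 4:51 PM
```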

Literals

Any non-alphabetic character that doesn't have a special meaning is a literal, i.e. it will be placed as is in the generated string.  You can literalise a single character by placing a backquote (`) immediately before it, or any number of consecutive characters by enclosing them in single quotes ('...').

All letters of the basic Latin alphabet, even those that are not defined as format specifiers, must be literalised if they are to occur as is in the generated string.  This is to prevent accidental use of letters that may become format specifiers in future versions of the notation.

The following characters will always be literals: - : , . / space and all Unicode codepoints beyond U+007F.

Examples

Format string                 Sample formatted date/time
yyyy-mm-dd`THH:ii:ss          2005-09-08T16:51:09
Wwww 'the' dt 'of' Mmmm       Thursday the 8th of September
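
The two literalising forms can be sketched as a small scanner in Python.  This is only an illustration under stated assumptions: scan is a hypothetical name, the notation does not say how a single quote would be written inside a quoted literal so that case is not handled, and square and curly brackets are simply passed through as literals here.

```python
def scan(fmt):
    """Split a format string into ('spec', run) and ('lit', text) pairs."""
    pieces = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "`":
            # A backquote literalises exactly one following character.
            if i + 1 >= len(fmt):
                raise ValueError("backquote at end of format string")
            pieces.append(("lit", fmt[i + 1]))
            i += 2
        elif ch == "'":
            # Single quotes literalise everything up to the next single quote.
            end = fmt.find("'", i + 1)
            if end < 0:
                raise ValueError("unterminated quoted literal")
            pieces.append(("lit", fmt[i + 1:end]))
            i = end + 1
        elif ch.isascii() and ch.isalpha():
            # Unquoted letters form specifiers: take the longest run of one letter.
            j = i + 1
            while j < len(fmt) and fmt[j] in (ch.lower(), ch.upper()):
                j += 1
            pieces.append(("spec", fmt[i:j]))
            i = j
        else:
            pieces.append(("lit", ch))
            i += 1
    return pieces


print(scan("dd`THH"))  # [('spec', 'dd'), ('lit', 'T'), ('spec', 'HH')]
```

Note how the backquote keeps the T from being read as the ordinal suffix specifier.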

Alignment fields

Square brackets ([ ]) can be used to create fields.  A field is a piece of formatting that is padded to a specific width.  Inside a pair of square brackets, the input consists of up to three parts: the left padding, content and right padding.  Either padding part consists of the same literal character one or more times in succession.  The total number of padding characters determines the width of the field.  The field alignment is determined by which paddings are present: left padding only gives a right-aligned field, right padding only gives a left-aligned field, and padding on both sides gives a centred field.

An alternative is to place a number just inside the square bracket, next to a single padding character.  Then the number is the field width.  For left-aligned and right-aligned fields, the number must be on the same side as the padding character.  For centred fields, the padding character is given on each side, and the side on which the number is placed will receive the extra padding character if the content cannot be centred exactly.

In the unlikely event that you wish to use a digit as a padding character, it must be literalised.

Examples

[------Wwww.....]

This is a centred, eleven-character field containing the name of the day of the week.  The generated string will be one of these:

---Sunday..
---Monday..
--Tuesday..
-Wednesday.
--Thursday.
---Friday..
--Saturday.

The format string [11-Wwww.] has the same meaning.

The content can contain any number of format specifiers and literals. For example:

[d/m/yyy           ]HH:ii:ss

creates a column for the date and a column for the time when a list of date/time values is output.  Sample output:

24/9/1979  03:05:42
15/11/1983 21:43:05
3/4/991    13:57:24
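
The padding arithmetic behind these examples can be sketched in Python.  This is a hedged illustration rather than the library's algorithm: pad_field and its parameters are hypothetical, the alignment rule (left padding only means right-aligned, right padding only means left-aligned, both means centred) is inferred from the examples, and content that is too long is left untouched, which is an arbitrary choice because the notation leaves that case open.

```python
def pad_field(content, width, left="", right="", extra="left"):
    """Pad `content` to `width` characters.

    `left` and `right` are the padding characters ("" when that side has none).
    `extra` names the side that receives the odd character in a centred field.
    """
    spare = max(width - len(content), 0)  # over-long content is not truncated here
    if left and right:
        small = spare // 2
        big = spare - small
        n_left, n_right = (big, small) if extra == "left" else (small, big)
        return left * n_left + content + right * n_right
    if left:
        return left * spare + content   # padding on the left: right-aligned
    return content + right * spare      # padding on the right: left-aligned


for day in ("Sunday", "Wednesday", "Thursday"):
    print(pad_field(day, 11, "-", "."))
```

With extra="left" this reproduces the day-of-week lines shown earlier, and pad_field("3/4/991", 11, right=" ") gives the eleven-character date column.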

Collapsible portions

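Curly brackets ({ }) can be used to create a collapsible portion.  This is a portion that will appear in the output only if at least one of the format specifiers within it actually generates something.  Under the current spec, the only conditions in which a format specifier may generate nothing are when the datum it formats is absent from the value being formatted (for example, the time of day in a date-only value), and when a B/b or BBBB/bbbb specifier is applied to an AD year.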

Examples

d Mmm yyy{ B}{ HH:ii:ss zzzz}

Formats the date, and the time if present, with only one space between them whether the date is BC or AD.

{d }{Mmm }yyy BB

Prevents leading spaces in applications in which an exact date, just the month and year, or just the year may be known.

HH:ii{:ss}{.FFF}

For applications in which a time is always known to the precision of minutes, but may sometimes be known more precisely.
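
The collapsing rule itself is small enough to sketch in Python.  The representation used here is an assumption made for illustration: render_portion is a hypothetical helper that receives the contents of one pair of curly brackets after each specifier has already been evaluated, with None standing for a specifier that generated nothing.

```python
def render_portion(pieces):
    """Render one collapsible portion.

    `pieces` is a list of ('lit', text) and ('spec', text_or_None) pairs.
    The portion appears only if at least one specifier generated something.
    """
    generated = any(kind == "spec" and text for kind, text in pieces)
    if not generated:
        return ""  # the literals inside the portion collapse along with it
    return "".join(text or "" for _, text in pieces)


# HH:ii{:ss}{.FFF} with the seconds known but no fraction available
seconds = render_portion([("lit", ":"), ("spec", "09")])
fraction = render_portion([("lit", "."), ("spec", None)])
print("16:51" + seconds + fraction)  # 16:51:09
```

Because the separator sits inside the braces, it disappears together with the datum it belongs to, which is what keeps stray colons, full stops and spaces out of the output.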

Open issues

When using fields, what should happen if the content is too long for the field?  Possibilities:

The notation could eventually be used to format dates and times in almost any language.  However, many languages don't capitalise names of days and months as English does.  Sometimes an application might want the day or month in the mid-sentence case appropriate to the language.  Should we invent a notation to do this?  Or rely on the application to offer a format string appropriate to the language being used at the time?

I'm not so sure about the way of choosing BC/AD and BCE/CE.  The existence of two ways to express the information might not extend to all other languages, which might have only one notation or even more than two.  An alternative is to provide one notation as the default for any language, and a means for applications to tweak aspects of the language data.

A similar formatting system could be devised to apply to date/time intervals rather than absolute instants in time.  But this would open up a whole new can of worms.

