"Newick's 8:45" Tree Format Standard Interpretation by Gary Olsen Aug. 30, 1990: My interpretation from discussions and a copy of "Committee" notes. Oct. 4, 1991: Revised to reflect discussions with Joseph Felsenstein, David Madison and David Swofford at 1991 Woods Hole MBL Molecular Evolution Workshop. Jan. 24, 1992: Text revised. Jan. 20, 1994: Revised to reflect discussions with David Swofford regarding quotation marks in comments (they will have no special meaning; thus, [Newick's 8:45 Tree Standard] is a legal comment). Aug. 23, 1994: Text revised. Oct. 16, 2003: Branch length in "Printer Plot" of tree example fixed to match value (thanks to Al Gernon). Minor text revision. Conventions Used in Syntax Diagram: Items in { } may appear zero or more times. Items in [ ] are optional, they may appear once or not at all. All other punctuation marks (colon, semicolon, parentheses, comma and single quote) are required parts of the format. Rough Syntax Diagram: tree ==> descendant_list [ root_label ] [ : branch_length ] ; descendant_list ==> ( subtree { , subtree } ) subtree ==> descendant_list [internal_node_label] [: branch_length] ==> leaf_label [: branch_length] root_label ==> label internal_node_label ==> label leaf_label ==> label label ==> unquoted_label ==> quoted_label unquoted_label ==> string_of_printing_characters quoted_label ==> ' string_of_printing_characters ' branch_length ==> signed_number ==> unsigned_number Notes: Unquoted labels may not contain blanks, parentheses, square brackets, single_quotes, colons, semicolons, or commas. Underscore characters in unquoted labels are converted to blanks. Single quote characters in a quoted label are represented by two single quotes. Blanks or tabs may appear anywhere except within unquoted labels or branch_lengths. Newlines may appear anywhere except within labels or branch_lengths. Comments are enclosed in square brackets and may appear anywhere newlines are permitted. 

Other notes:
   PAUP (David Swofford) allows nesting of comments. My software supports
        this as well.
   TreeAlign (Jotun Hein) writes a root node branch length (with a value
        of 0.0). Most other software (including my own) seems to as well.
   PHYLIP (Joseph Felsenstein) requires that an unrooted tree begin with a
        trifurcation; it will not "uproot" a rooted tree.

Example of rooted tree:
   (((One:0.2,Two:0.3):0.3,(Three:0.5,Four:0.3):0.2):0.3,Five:0.7):0.0;

         +-+ One
      +--+
      |  +--+ Two
   +--+
   |  | +----+ Three
   |  +-+
   |    +--+ Four
   +
   +------+ Five

Addendum (October 4, 1991):

At the 1991 Woods Hole Marine Biology Laboratory Molecular Evolution
Course, the following special comments were defined (by Joseph
Felsenstein, David Madison, Gary Olsen and David Swofford):

   [&rooted]
   [&unrooted]

One of these two comments may precede a tree to define whether it is meant
to be read as a rooted or unrooted tree. The default treatment, when
neither of these comments is present, may be context and/or application
specific.

   [&&ApplicationID: Application_specific_comments ]

This form permits users of the Newick 8:45 format to tag comments that are
meant to be machine readable by specific programs. There is no
registration of IDs, though it is expected that users of this convention
will choose sufficiently descriptive IDs that coincidental conflicts are
unlikely.

Other forms of comments beginning with "[&" are reserved to the
"Standard".

It was also decided that names embedded within single quotes can contain
any printable character and the space character. If a name is quoted, this
must be done in its entirety. All compliant programs must be able to
handle names of at least eight characters.

Addendum (January 20, 1994):

In response to discussions with David Swofford, quotation marks in
comments will have no special meaning. Thus,

   [Newick's 8:45 Tree Standard]

is a legal comment. On the other hand,

   [('B. subtilis':0.1, 'E. coli rrnB]':0.2):0.3]

is not legal because the square bracket in the quotation marks ends the
comment. Because comments can be nested, the following would be a legal
comment:

   [('B. subtilis':0.1, 'E. coli [rrnB]':0.2):0.3]
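
The comment rules in this addendum can be illustrated with a short scanner. This is a sketch in Python under stated assumptions, not the behaviour of any particular program: the function name strip_comments is hypothetical, nested comments are accepted as in PAUP, and square brackets that occur inside a quoted label outside of any comment are treated as part of the label (an interpretation of the rule that quoted names may contain any printable character). Inside a comment, single quotes are ignored and only square brackets are counted.

```python
def strip_comments(text: str) -> str:
    """Return text with bracketed comments removed (hypothetical helper)."""
    out = []
    depth = 0          # current comment nesting level
    in_quote = False   # inside a quoted label, outside any comment
    i = 0
    while i < len(text):
        ch = text[i]
        if depth > 0:
            # Inside a comment: quotes have no special meaning,
            # only square brackets change the nesting level.
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
        elif in_quote:
            out.append(ch)
            if ch == "'":
                if text[i + 1:i + 2] == "'":
                    out.append("'")   # doubled quote stays in the label
                    i += 1
                else:
                    in_quote = False  # closing quote
        elif ch == "[":
            depth = 1                 # a comment starts here
        elif ch == "]":
            raise ValueError(f"unmatched ']' at offset {i}")
        else:
            if ch == "'":
                in_quote = True
            out.append(ch)
        i += 1
    if depth > 0:
        raise ValueError("unterminated comment")
    if in_quote:
        raise ValueError("unterminated quoted label")
    return "".join(out)
```

With this sketch, the nested example above is removed as a single comment, while the example that is not legal ends its comment at the first closing bracket and then fails on the unterminated quote that is left behind.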