bibsort - sort a BibTeX bibliography file

     bibsort [optional sort(1) switches] <infile >outfile

     bibsort filters a BibTeX bibliography, or bibliography frag-
     ment,  on	its standard input, printing on	standard output	a
     sorted bibliography.

     Sorting is	by BibTeX tag name, or by @String macro	name, and
     letter case is ignored in the sorting.

     If	no command-line	switches are provided for  sort(1),  then
     -f	 is  supplied to cause letter case to be ignored.  If you
     also want to remove duplicate entries, you	could specify the
     switches -f -u.

     The input stream is conceptually divided  into  four  parts,
     any of which may be absent.

          1.  Introductory material such as comments, file
              headers, and edit logs that are ignored by BibTeX.
              No line in this part begins with an at-sign, ``@''.

          2.  Preamble material delineated by ``@Preamble{'' and
              a matching closing ``}'', intended to be processed
              by TeX.  Normally, there is only one such entry in
              a bibliography file, although BibTeX, and bibsort,
              permit more than one.

          3.  Macro definitions of the form ``@String{...}''.  A
              single macro definition may span multiple lines,
              and there are usually several such definitions.

          4.  Bibliography entries such as ``@Article{...}'',
              ``@Book{...}'', ``@Proceedings{...}'', and so on.
              For bibsort, any line that begins with an ``@''
              immediately followed by letters and digits and an
              open brace is considered to be such an entry.

     The order of these	parts is preserved in the output  stream.
     Part  1  will  be	unchanged,  but	parts 2--4 will	be sorted
     within themselves.
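
     The four-part division can be illustrated with a short
     Python sketch.  It is not bibsort's own code (which is a
     shell and nawk(1) pipeline); the helper name classify_lines
     and the regular expression are assumptions made only for
     illustration.  Each line is labelled with the part that it
     starts or continues.

```python
import re

# Assumed pattern: "@" at the start of the line, then letters and
# digits, then an open brace, as described for part 4 above.
ENTRY_START = re.compile(r"@([A-Za-z0-9]+)\{")


def classify_lines(text):
    """Return a list of (part, line) pairs, with part in 1..4."""
    part = 1  # introductory material until the first "@keyword{" line
    labelled = []
    for line in text.splitlines():
        match = ENTRY_START.match(line)
        if match:
            keyword = match.group(1).lower()
            if keyword == "preamble":
                part = 2
            elif keyword == "string":
                part = 3
            else:
                part = 4
        # A line that does not start an entry stays with the part of
        # the most recent entry (or with part 1 if none was seen yet).
        labelled.append((part, line))
    return labelled
```

     Lines that do not begin an entry are labelled with the part
     of the entry that precedes them, which is why commentary
     between entries travels with the preceding entry.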

     The sort key of ``@Preamble'' entries is their initial line,
     of	 ``@String''  entries,	the macro name,	and of all BibTeX
     entries, the citation tag between the open	curly  brace  and
     the trailing comma.
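
     As a rough illustration of these sort keys, here is a small
     Python sketch.  The function name sort_key is hypothetical,
     and lowercasing is only an approximation of the case folding
     done by sort(1) with -f; the real keys are produced by a
     nawk(1) program that is not reproduced here.

```python
import re

# Assumed pattern for the first line of an entry: "@keyword{rest".
HEAD = re.compile(r"@([A-Za-z0-9]+)\{(.*)")


def sort_key(entry):
    """Return a case-folded sort key for one entry (illustrative only)."""
    first_line = entry.splitlines()[0]
    match = HEAD.match(first_line)
    if match is None:
        raise ValueError("not an @keyword{ entry: %r" % first_line)
    keyword, rest = match.group(1).lower(), match.group(2)
    if keyword == "preamble":
        # @Preamble entries: the key is the whole initial line.
        return first_line.lower()
    if keyword == "string":
        # @String entries: the key is the macro name before "=".
        return rest.split("=", 1)[0].strip().lower()
    # All other entries: the citation tag between "{" and the comma.
    return rest.split(",", 1)[0].strip().lower()
```

     For example, sort_key('@String{JACM = "J. ACM"}') yields the
     macro name in lower case.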

     bibsort will correctly handle UNIX	files with LF line termi-
     nators, as	well as	IBM PC DOS files with CR LF line termina-
     tors; the essential requirement is	that input lines be  del-
     ineated by	LF characters.

     BibTeX permits looser syntax than the current simple
     implementation of bibsort supports.  In particular, outer
     parentheses may not be used in place of braces following
     ``@keyword'' patterns, nor may there be whitespace before
     the ``@'' or embedded within the ``@keyword{'' pattern.

     If	you have such a	file, you can use bibclean(1) to  pretty-
     print it into a form that bibsort can handle successfully.

     The user must be aware that sorting a  bibliography  is  not
     without peril, for	at least these reasons:

          1.  BibTeX has a requirement that entry tags given in
              crossref = tag pairs in a bibliography entry must
              refer to entries defined later, rather than
              earlier, in the bibliography file.  This regrettable
              implementation limitation of the current (pre-1.0)
              BibTeX prevents arbitrary ordering of entries when
              crossref values are present.

          2.  If the BibTeX file contains interspersed commentary
              between ``@keyword{...}'' entries, this material
              will be considered part of the preceding entry, and
              will be sorted with it.  Leading commentary is more
              common, and will be moved elsewhere in the file.

              This is normally not a problem for the part 1
              material before the ``@Preamble'', since it is kept
              together at the beginning of the output stream.

          3.  Some kinds of bibliography files should be kept in
              a different order than alphabetically by tags.  A
              good example is a bibliography file with the
              contents of a journal, for which publication order is
              likely more suitable.

     While a much more sophisticated  implementation  of  bibsort
     could  deal  with	the  first  point, solving the second one
     requires human intelligence and natural language understand-
     ing that computers	lack.
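
     Before sorting, one might check for the crossref hazard
     described in the first point.  The Python sketch below is
     not part of bibsort; the name risky_crossrefs and both
     regular expressions are assumptions, and plain lowercase
     string comparison only approximates the ordering of sort(1)
     with -f.  It reports entries whose crossref target would
     sort before the entry that refers to it.

```python
import re

# Assumed patterns: an entry head at the start of a line, and a
# crossref value delimited by quotes or braces.
ENTRY = re.compile(r"^@([A-Za-z0-9]+)\{([^,\s]+),", re.MULTILINE)
CROSSREF = re.compile(r'crossref\s*=\s*["{]([^"}]+)["}]', re.IGNORECASE)


def risky_crossrefs(text):
    """List (tag, target) pairs whose target would sort before the tag."""
    starts = [m for m in ENTRY.finditer(text)
              if m.group(1).lower() not in ("string", "preamble")]
    risky = []
    for i, m in enumerate(starts):
        # The entry body runs up to the start of the next entry.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        body = text[m.end():end]
        for ref in CROSSREF.finditer(body):
            # After sorting, a target with a smaller tag ends up
            # earlier in the file than the entry that cites it.
            if ref.group(1).lower() < m.group(2).lower():
                risky.append((m.group(2), ref.group(1)))
    return risky
```

     An empty result suggests, but does not prove, that sorting by
     tag keeps every cross-referenced entry after its referrers.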

     bibsort uses ASCII	control	characters 001	through	 007  for
     temporary	modifications  of  the	input  stream.	If any of
     these are already present in the input, they will be altered
     on	 output.  This is unlikely to be a problem, because those
     characters	have neither a printable representation, nor  are
     they  conventionally used to mark line or page boundaries in
     text files.
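
     A cautious user could verify beforehand that a file is free
     of these characters.  The following Python sketch is an
     illustration only; the function name find_reserved_controls
     is hypothetical.

```python
def find_reserved_controls(data):
    """Return the sorted byte values in 1..7 that occur in data (bytes)."""
    # Iterating over a bytes object yields integers, so a set
    # intersection with 1..7 finds the reserved control characters.
    return sorted(set(data) & set(range(1, 8)))
```

     An empty list means that none of the characters 001 through
     007 is present in the input.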

     Some text editors permit application of an	arbitrary  filter
     command  to a region of text.  For	example, in GNU	emacs(1),
     the   command   C-u    M-x	   shell-command-on-region,    or
     equivalently,  C-u	 M-|,  can  be	used  to run bibsort on	a
     region of the buffer that is devoid of cross references  and
     other material that cannot	be safely sorted.

     Some  implementations  of	BibTeX	editing	 support  in  GNU
     emacs(1)  have  a	sort-bibtex-entries command that is func-
     tionally similar to bibsort.  However, the	 file  size  that
     can  be  processed	by emacs(1) is limited,	while bibsort can
     be	used on	arbitrarily large  files,  since  it  acts  as	a
     filter,  processing  a  small amount of data at a time.  The
     sort stage	needs the entire data  stream,	but  fortunately,
     the  UNIX sort(1) command is clever enough	to deal	with very
     large inputs.

     The current implementation of bibsort follows the UNIX
     tradition of combining simple already-available tools.  A
     six-stage pipeline of egrep(1), nawk(1), sort(1), and tr(1)
     accomplishes the job in one pass with about 70 lines of
     shell script, 60 lines of which are a nawk(1) program for
     insertion of sort keys.  bibsort was written and tested on
     several large bibliographies in a couple of hours.  By
     contrast, bibtex(1) is more than 11 000 lines of code and
     documentation, and bibclean(1) is about 1500 lines long.
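
     The general idea of attaching a sort key, folding each entry
     into one line, sorting, and unfolding can be sketched in
     Python.  This is not a translation of the actual pipeline:
     the function name sort_entries, the choice of \x01 and \x02
     as markers, and the handling of leading material are all
     assumptions, and only @keyword{...} entries are keyed by
     their tag (there is no special treatment of @Preamble or
     @String).

```python
import re

ENTRY_START = re.compile(r"@[A-Za-z0-9]+\{")
NEWLINE_MARK = "\x01"  # assumed stand-in for newlines inside an entry
KEY_MARK = "\x02"      # assumed separator between sort key and entry


def sort_entries(text):
    """Sort the entries of text by tag, ignoring letter case."""
    header, entries, current = [], [], None
    for line in text.splitlines():
        if ENTRY_START.match(line):
            current = [line]
            entries.append(current)
        elif current is None:
            header.append(line)   # material before the first entry
        else:
            current.append(line)  # stays with the preceding entry
    folded = []
    for lines in entries:
        tag = lines[0].split("{", 1)[1].split(",", 1)[0].strip().lower()
        # Decorate: key, separator, then the entry folded into one line.
        folded.append(tag + KEY_MARK + NEWLINE_MARK.join(lines))
    folded.sort()  # stands in for the sort(1) stage
    unfolded = [rec.split(KEY_MARK, 1)[1].replace(NEWLINE_MARK, "\n")
                for rec in folded]
    pieces = header + unfolded
    return "\n".join(pieces) + "\n" if pieces else ""
```

     Because the whole entry rides along on the same line as its
     key, the sorting step never needs to understand BibTeX
     syntax.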

     bibsort may fail on  some	UNIX  systems  if  their  sort(1)
     implementations  cannot  handle very long lines, because for
     sorting purposes, each complete bibliography entry	 is  tem-
     porarily  folded  into  a	single	line.  You may be able to
     overcome this problem by adding  a	 -znnnnn  switch  to  the
     sort(1)  command (passed via the command line to bibsort) to
     increase the maximum line size to some larger value of nnnnn.
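
     To choose a value for nnnnn, it may help to know how long
     the longest entry is.  The Python sketch below is an
     illustration under stated assumptions: the function name
     longest_entry_length is hypothetical, and the result is only
     a lower bound, because the folded line used for sorting also
     carries a sort key.

```python
import re

# Assumed entry-start pattern, applied to bytes.
ENTRY_START = re.compile(rb"@[A-Za-z0-9]+\{")


def longest_entry_length(data):
    """Return the byte length of the longest entry in data (bytes)."""
    longest = current = 0
    for line in data.splitlines(keepends=True):
        if ENTRY_START.match(line):
            # A new entry begins: close off the previous record.
            longest = max(longest, current)
            current = 0
        current += len(line)
    return max(longest, current)
```

     Material before the first entry is counted as a record of
     its own, which errs on the side of a larger estimate.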

     bibclean(1),   bibtex(1),	 egrep(1),   emacs(1),	 nawk(1),
     sort(1), tr(1).

     Nelson H. F. Beebe, Ph.D.
     Center for	Scientific Computing
     Department	of Mathematics
     University	of Utah
     Salt Lake City, UT	84112
     Tel: (801)	581-5254
     FAX: (801)	581-4148
     Email: <beebe@math.utah.edu>