.I file
]
.SH DESCRIPTION
-HTML comes in various character set encodings
+HTML comes in various character-set encodings
and has special forms to encode characters. To
-make it easier to process html, uhtml is used
-to normalize it to a unicode only form.
+make it easier to process HTML, uhtml is used
+to normalize it to a Unicode-only form.
.LP
-Uhtml detects the character set of the html input
+Uhtml detects the character set of the HTML input
.I file
and calls
.IR tcs (1)
-to convert it to utf replacing html-entity forms
-by ther unicode character representations except for
-.B lt
-.B gt
-.B amp
-.B quot
+to convert it to UTF, replacing HTML-entity forms
+with their Unicode character representations except for
+.BR lt ,
+.BR gt ,
+.BR amp ,
+.BR quot ,
and
-.B apos .
-The converted html is written to
+.BR apos .
+The converted HTML is written to
standard output. If no
.I file
was given, it is read from standard input. If the
.B -p
option is given, the detected character set is printed and
the program exits without conversion.
-In case character set detection fails, the default (utf)
+If character-set detection fails, the default (UTF)
is assumed. This default can be changed with the
.B -c
option.