3 doc2txt, doc2ps, wdoc2txt, xls2txt, olefs, mswordstrings, msexceltables
4 \- extract printable text from Microsoft documents
34 .IB mtpt /WordDocument
58 to extract the printable text from the body of a Microsoft Word document
59 and write it on the standard output.
61 is similar, but emits PostScript corresponding to the document.
67 to send the output to a new
71 performs a similar function for Microsoft Excel documents.
73 Microsoft Office documents are stored in OLE (Object Linking and Embedding)
74 format, which is a scaled-down version of Microsoft's FAT file system.
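As a rough illustration of recognizing such a container before mounting it, the sketch below tests for the 8-byte signature that begins an OLE compound file. It is not taken from any of the programs described here; the function names and the idea of checking a path up front are assumptions made for the example.

```python
# The 8-byte signature found at the start of an OLE compound file.
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def looks_like_ole(header: bytes) -> bool:
    """Return True if the given leading bytes carry the OLE signature."""
    return header[:8] == OLE_MAGIC


def file_is_ole(path: str) -> bool:
    """Read the first 8 bytes of path (a hypothetical file) and test them."""
    with open(path, "rb") as f:
        return looks_like_ole(f.read(8))
```

A caller might use file_is_ole to skip files that are not OLE containers (for example, newer zip-based Office documents) before trying to extract text from them.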
76 presents the contents of an MS Office document as a file system
84 may then be used to parse the files inside, extracting
87 may be given options to control the formatting of its output.
91 Attempt conversion of non-tabular sheets in the workbook (charts).
94 Sets the inter-field delimiter to the string
96 by default a single space.
99 Enables debugging output.
103 is a comma-separated list of column numbers and ranges.
104 Ranges are separated by dashes.
105 Limit processing to just those columns named;
106 by default all columns are output.
109 Disables field padding to column width.
112 Disable quoting of textual fields (see
116 Truncate fields to the column width.
120 is a comma-separated list of worksheet numbers and ranges that
121 limits the sheets output, using the same syntax as the
124 Suppressed chart pages are always included in the sheet count.
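The interaction of the delimiter, padding, and truncation options can be pictured with the following sketch. It is an illustration only, not the program's implementation: the function name, its parameters, and the choice of left-aligned padding are assumptions, and quoting of textual fields is left out.

```python
def format_row(fields, widths, delim=" ", pad=True, truncate=False):
    """Join one row of cell strings into an output line.

    fields   -- cell text for each column (hypothetical input)
    widths   -- column width for each column (hypothetical input)
    delim    -- inter-field delimiter, a single space by default
    pad      -- pad each field with spaces up to its column width
    truncate -- cut each field down to its column width
    """
    out = []
    for text, width in zip(fields, widths):
        if truncate:
            text = text[:width]
        if pad:
            # Left alignment is an assumption made for this sketch.
            text = text.ljust(width)
        out.append(text)
    return delim.join(out)
```

With padding on and truncation off, a field longer than its column pushes later fields to the right; turning truncation on keeps the columns aligned at the cost of losing text.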
126 Extract pieces of an MS Excel spreadsheet.
132 msexceltables -q -w 1,7,9-14 -c 3-5 -n -d '@' /mnt/doc/Workbook > rpt.txt
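The two list arguments in the command above, 1,7,9-14 and 3-5, use the comma-and-dash syntax described earlier. The sketch below shows one way such a list might be expanded. It is not the program's own parser: the function name is hypothetical, and treating ranges as inclusive at both ends is an assumption.

```python
def parse_ranges(spec):
    """Expand a list such as "1,7,9-14" into a sorted list of integers.

    Items are separated by commas; a range is two numbers joined by a
    dash and is taken to include both ends (an assumption).
    """
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError("empty item in list: %r" % spec)
        if "-" in part:
            lo_text, hi_text = part.split("-", 1)
            lo, hi = int(lo_text), int(hi_text)
            if lo > hi:
                raise ValueError("descending range: %r" % part)
            selected.update(range(lo, hi + 1))
        else:
            selected.add(int(part))
    return sorted(selected)


print(parse_ranges("1,7,9-14"))  # [1, 7, 9, 10, 11, 12, 13, 14]
print(parse_ranges("3-5"))       # [3, 4, 5]
```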
137 .TF "\fL/sys/src/cmd/aux "
153 ``Microsoft Word 97 Binary File Format'',
154 at Microsoft's developer (MSDN) home page.
156 ``LAOLA Binary Structures'',
157 .B http://user.cs.tu-berlin.de/~schwartz/pmh
159 ``OpenOffice.Org's Excel Documentation'',
161 .B http://sc.openoffice.org/excelfileformat.pdf