CommunityData:LaTeX to Word



LaTeX is a wonderful system for producing beautiful PDF documents. However, many journals (including many Communication journals) do not accept PDFs and only accept Word documents (which, upon acceptance, they will dutifully convert to LaTeX documents for publication).

There are a few options for getting a Word document from your LaTeX document. No matter what option you choose, there will almost certainly be some manual post-conversion work, so it's usually wise to save this step until the very end (i.e., after proofreading).

PDF to Word

One option is to use Adobe's PDF to Word converter. This often works fairly well, but the result usually needs a lot of cleanup to match the journal's formatting.

LaTeX to Word with Pandoc

Pandoc is an incredible cross-platform utility that allows you to convert documents between formats. It can convert a .tex document to a Word document, using a template document to match fonts, heading styles, etc.
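For a plain .tex file, the conversion can be sketched as a single pandoc call wrapped in a small shell function. This is a minimal sketch, not the CDSC workflow itself: the function name tex_to_docx and the file name paper.tex are hypothetical, while refs.bib, template_file.docx, and the pandoc options are the ones used in the Makefile rule further down this page.

```shell
#!/bin/sh
# Hypothetical helper: convert one .tex file to a .docx file next to it.
# Assumes refs.bib and template_file.docx sit in the current directory.
tex_to_docx() {
    src="$1"
    out="${src%.tex}.docx"   # paper.tex -> paper.docx
    pandoc "$src" \
        --citeproc \
        --bibliography=refs.bib \
        --reference-doc=template_file.docx \
        -o "$out"
}
```

Calling tex_to_docx paper.tex should then write paper.docx, with citations resolved from refs.bib and styles taken from the template.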

The following instructions describe how to do this with an .Rtex document created using the cdsc_tex library:

  1. Download the template document to the directory that has your paper and rename it template_file.docx (the name the Makefile below expects)
  2. Edit your Makefile to add a word target. I also added some code to the clean target to remove the output file. Here is an example Makefile:

all: $(patsubst %.Rtex,%.pdf,$(wildcard *.Rtex)) 

word: $(patsubst %.Rtex,%.docx,$(wildcard *.Rtex)) 

%.tex: %.Rtex
	Rscript -e "library(knitr); knit('$<')"

%.pdf: %.tex 
	latexmk -f -pdf $<

clean:
	latexmk -C *.tex
	rm -f *.tmp *.run.xml
	rm -f vc
	rm -f *.bbl
	# the following lines are useful for Rtex/knitr
	rm -rf cache/ figure/
	rm -f *.tex
	# set the Word template aside so that only generated .docx files are removed
	mv template_file.docx template_file.bak
	rm -f *.docx
	mv template_file.bak template_file.docx

viewpdf: all
	evince *.pdf

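%.docx: %.tex template_file.docx
	# strip the cdsc chapter style and cdsc-memoir package lines so that pandoc works
	perl -i -p0e 's/(.chapterstyle\{cdsc.*)|(.usepackage\{cdsc-memoir.*)//g' $<
	# strip the three-line block that starts at "published" and ends on the line mentioning permission
	perl -i -p0e 's/.published.*\n.*\n.*permission.*//gm' $<
	# convert to Word, resolving citations from refs.bib and taking styles from the template
	pandoc $< --bibliography=refs.bib -o $@ --citeproc --reference-doc=template_file.docx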

vc:	resources/vc-git

pdf: all

.PHONY: clean all word
.PRECIOUS: %.tex

Note that part of what the code does is remove pieces of the cdsc-memoir package from the .tex file, in place, so that pandoc works successfully. Before building the paper again, you will want to run $ make clean so that a fresh .tex file is generated from the .Rtex source.

To create the Word document, run:

   $ make word
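The word target relies on the command-line tools that appear in the Makefile above (Rscript, latexmk, perl, and pandoc). If make word fails early, a quick check along the following lines can show whether one of them is missing. The function name missing_tools is hypothetical and not part of the CDSC setup.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that is not found on PATH,
# one per line. Prints nothing if everything is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, missing_tools Rscript latexmk perl pandoc lists whichever of the four still needs to be installed.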