<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" >

  <title>Erik L. Arneson — Writer and Software Developer</title>
  <subtitle>Erik L. Arneson is a freelance writer and software developer with WordPress experience. He is located in Portland, Oregon.</subtitle>
  <generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator>
  <link href="https://arnesonium.com/feeds/pandoc.xml" rel="self" type="application/atom+xml" />
  <link href="https://arnesonium.com/" rel="alternate" type="text/html" />
  <updated>2026-06-18T15:03:10+00:00</updated>
  <id>https://arnesonium.com/feeds/pandoc.xml</id>
  <author>
    <name>Erik L. Arneson</name>
  </author>
      <entry>
        
        <title>Update: Org to DOCX with Citations</title>
        <author>
          <name>Erik L. Arneson</name>
        </author>        
        <link href="https://arnesonium.com/2023/06/org-to-docx-with-citations" rel="alternate" type="text/html" title="Update: Org to DOCX with Citations" />
        <updated>2023-06-20T00:00:00+00:00</updated>
        <id>https://arnesonium.com/2023/06/org-docx-citations</id>
          <category term="org-mode" />
        
          <category term="pandoc" />
        
          <category term="emacs" />
        
          <category term="writing" />
        <content type="html" xml:base="https://arnesonium.com/2023/06/org-to-docx-with-citations">&lt;p&gt;Last year, I wrote about &lt;a href=&quot;/2022/10/org-mode-to-docx-pipeline&quot;&gt;converting Org to DOCX with pandoc&lt;/a&gt;. Well, that particular method has needed some improvements. I needed to also support converting Markdown files, and more vitally, I needed to support the new-ish &lt;a href=&quot;https://orgmode.org/manual/Citation-handling.html&quot;&gt;org-cite citation format&lt;/a&gt;.&lt;/p&gt;

&lt;!--more--&gt;

&lt;p&gt;The first thing I did was update to the &lt;a href=&quot;https://pandoc.org/installing.html&quot;&gt;latest version of Pandoc&lt;/a&gt;. Next, I had to learn how &lt;a href=&quot;https://pandoc.org/MANUAL.html#citations&quot;&gt;Pandoc’s citations&lt;/a&gt; work. Note that you have to enable the &lt;a href=&quot;https://pandoc.org/MANUAL.html#org-citations&quot;&gt;citations extension&lt;/a&gt; as well.&lt;/p&gt;

&lt;p&gt;For citations to work, you need to have a &lt;a href=&quot;https://docs.citationstyles.org/en/stable/specification.html&quot;&gt;Citation Style Language&lt;/a&gt; (CSL) file. Zotero comes with a ton of them, so check your Zotero installation for examples.&lt;/p&gt;

&lt;p&gt;In the updated fish shell function below, you will want to update both &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;refdoc&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;csldoc&lt;/code&gt; to point to your reference DOCX file and your CSL file, respectively.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-fish&quot;&gt;function org2docx --description &apos;Generate a DOCX file using a custom reference document&apos;
    set -l refdoc &quot;$PATH_TO_REFERENCE_DOCX&quot;
    set -l csldoc &quot;$PATH_TO_CSL&quot;
    # File extension of the input: org or md
    set -l ext (string match -r &apos;(?:org|md)$&apos; $argv)
    set -l base (basename -s .$ext $argv)

    # Pandoc names its Markdown reader &quot;markdown&quot;, so map the md extension to it
    set -l fromfmt $ext
    if test &quot;$ext&quot; = md
        set fromfmt markdown
    end

    echo Generating $base.docx ...

    pandoc --from &quot;$fromfmt&quot;+citations \
        --citeproc --csl $csldoc \
        --reference-doc $refdoc -o $base.docx $argv
end
&lt;/code&gt;&lt;/pre&gt;
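
&lt;p&gt;The extension handling is the part most likely to need adjusting, so here is a minimal sketch of it as a standalone POSIX shell helper rather than fish. The name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pandoc_from_for&lt;/code&gt; is hypothetical, and the sketch assumes only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.org&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.md&lt;/code&gt; inputs. It prints the value to pass to Pandoc’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--from&lt;/code&gt; option, using Pandoc’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;markdown&lt;/code&gt; reader name for Markdown files.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;# Hypothetical helper: print the pandoc --from value for a source file.
# Assumes only .org and .md inputs; anything else is rejected.
pandoc_from_for() {
    case &quot;$1&quot; in
        *.org) echo &quot;org+citations&quot; ;;
        *.md)  echo &quot;markdown+citations&quot; ;;
        *)     return 1 ;;
    esac
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The result can be passed straight to Pandoc, for example as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--from &quot;$(pandoc_from_for draft.md)&quot;&lt;/code&gt;, where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;draft.md&lt;/code&gt; is a placeholder file name.&lt;/p&gt;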

&lt;p&gt;And there you have it! Now you can convert both Org files and Markdown files to DOCX. And I am sorry that you have to use DOCX!&lt;/p&gt;</content>
      </entry>
    
      <entry>
        
        <title>Writing and Reviewing Jupyter Notebooks</title>
        <author>
          <name>Erik L. Arneson</name>
        </author>        
        <link href="https://arnesonium.com/2023/05/reviewing-jupyter-process" rel="alternate" type="text/html" title="Writing and Reviewing Jupyter Notebooks" />
        <updated>2023-05-18T00:00:00+00:00</updated>
        <id>https://arnesonium.com/2023/05/reviewing-jupyter-process</id>
          <category term="jupyter" />
        
          <category term="writing" />
        
          <category term="emacs" />
        
          <category term="pandoc" />
        <content type="html" xml:base="https://arnesonium.com/2023/05/reviewing-jupyter-process">&lt;p&gt;A recent project involves delivering a finished product as a collection of &lt;a href=&quot;https://jupyter.org/&quot;&gt;Jupyter Notebooks&lt;/a&gt;. This process involves using Emacs for writing, Git for version control, and a slightly tricky process for enabling non-Jupyter, non-Emacs users to perform document review.&lt;/p&gt;

&lt;!--more--&gt;

&lt;p&gt;Writing—just like programming—ideally includes a review process before anything is delivered to the client. Even first drafts need at least two readers before delivery. I’ve previously discussed how &lt;a href=&quot;https://arnesonium.com/2022/10/org-mode-to-docx-pipeline&quot;&gt;I use Org Mode and Pandoc to deliver DOCX files&lt;/a&gt;, and DOCX or ODF files unfortunately remain the easiest way to track changes and edits among word processor users.&lt;/p&gt;

&lt;p&gt;Since Jupyter Notebooks are basically JSON documents, the best way to keep track of changes and revisions is using some kind of version control. &lt;a href=&quot;https://towardsdatascience.com/how-to-use-git-github-with-jupyter-notebook-7144d6577b44&quot;&gt;Here is one process using Git and GitHub.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When my notebook files are ready for review, converting them to DOCX files is pretty straightforward.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;First, I use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jupyter&lt;/code&gt; command line tool to convert to Markdown, like this:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;jupyter nbconvert --to markdown *.ipynb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Next, I use &lt;a href=&quot;https://pandoc.org/&quot;&gt;Pandoc&lt;/a&gt; to convert to DOCX using a reference document.&lt;/p&gt;

    &lt;pre&gt;&lt;code class=&quot;language-fish&quot;&gt;for file in *.md
    pandoc --reference-doc $path_to_refdoc -o $file.docx $file
end
&lt;/code&gt;&lt;/pre&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;renaming-files-with-dired&quot;&gt;Renaming files with Dired&lt;/h2&gt;

&lt;p&gt;At this point, I needed to rename all of the DOCX files and move them to the proper shared folder, so my reviewer could get to them and know what’s going on. We have a &lt;a href=&quot;https://www.computerworld.com/article/2833158/4-rules-for-naming-your-files.html&quot;&gt;naming format for filenames&lt;/a&gt; that helps us track projects and versions, so all of the files needed to have at least a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v1&lt;/code&gt; in them.&lt;/p&gt;

&lt;p&gt;Emacs has a file manager called Dired, which contains powerful features that allow you to modify directory contents just like any other buffer. I now had a bunch of files that ended in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.md.docx&lt;/code&gt; that needed to instead end in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-v1.docx&lt;/code&gt;. Here is the process I used to easily rename them.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;In Emacs, use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;M-x dired&lt;/code&gt; to open the directory.&lt;/li&gt;
  &lt;li&gt;Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C-x C-q&lt;/code&gt; to run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-toggle-read-only&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;M-%&lt;/code&gt; to run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query-replace&lt;/code&gt;, and replace &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.md.docx&lt;/code&gt; with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-v1.docx&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Finish “writing” the directory with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C-c C-c&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
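
&lt;p&gt;For anyone working outside Emacs, the same rename can be sketched as a small POSIX shell function. This is only an illustration, not the Dired process described above: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rename_to_v1&lt;/code&gt; is a hypothetical helper name, and it assumes every file to rename ends in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.md.docx&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;# Hypothetical helper: rename every NAME.md.docx in a directory to NAME-v1.docx.
# The directory defaults to the current one when no argument is given.
rename_to_v1() {
    dir=${1:-.}
    for file in &quot;$dir&quot;/*.md.docx; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e &quot;$file&quot; ] || continue
        # Strip the .md.docx suffix and append -v1.docx instead.
        mv -- &quot;$file&quot; &quot;${file%.md.docx}-v1.docx&quot;
    done
}
&lt;/code&gt;&lt;/pre&gt;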

&lt;p&gt;All done! It was nice and simple. The DOCX files were finally properly named and ready for review.&lt;/p&gt;</content>
      </entry>
    
      <entry>
        
        <title>An Org-mode to DOCX Pipeline</title>
        <author>
          <name>Erik L. Arneson</name>
        </author>        
        <link href="https://arnesonium.com/2022/10/org-mode-to-docx-pipeline" rel="alternate" type="text/html" title="An Org-mode to DOCX Pipeline" />
        <updated>2022-10-26T00:00:00+00:00</updated>
        <id>https://arnesonium.com/2022/10/org-to-docx</id>
          <category term="org-mode" />
        
          <category term="pandoc" />
        
          <category term="emacs" />
        
          <category term="writing" />
        <content type="html" xml:base="https://arnesonium.com/2022/10/org-mode-to-docx-pipeline">&lt;p&gt;Freelance writers need to deliver documents in the format requested by clients. However, the
requested format is frequently not the writer’s preferred working format. I like to write in Org Mode, but
many clients prefer delivery in Microsoft Word’s DOCX format.&lt;/p&gt;

&lt;p&gt;This is how I generate DOCX files for my clients.
&lt;!--more--&gt;&lt;/p&gt;

&lt;h2 id=&quot;what-is-org-mode&quot;&gt;What is Org Mode?&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://orgmode.org/&quot;&gt;Org Mode&lt;/a&gt; is an Emacs package for writing and working with Org files. Org
files are highly structured plain text files that may appear to be a text outline, but can do so
much more. Org Mode is incredibly versatile, and can be used to track projects, manage schedules,
write outlines, and even create documents.&lt;/p&gt;

&lt;h2 id=&quot;choosing-between-org-export-and-pandoc&quot;&gt;Choosing between Org Export and Pandoc&lt;/h2&gt;

&lt;p&gt;Org Export runs inside Emacs and is capable of converting Org files to &lt;a href=&quot;https://orgmode.org/guide/Exporting.html&quot;&gt;a variety of other
formats&lt;/a&gt;. While it is very powerful, it also has its
idiosyncrasies. For instance, when converting Org files to DOCX files, it uses its own style names
such as “Org Title” and “Org Heading 1”.&lt;/p&gt;

&lt;p&gt;A second option for converting Org files to DOCX files is &lt;a href=&quot;https://pandoc.org/&quot;&gt;Pandoc&lt;/a&gt;. Pandoc
prides itself on being the Swiss army knife of document format conversion. It handles an impressive
variety of document formats and offers a dizzying collection of configuration options.&lt;/p&gt;

&lt;p&gt;Since the DOCX files that I create need to be shared with other writers, editors, and reviewers, I
need to make sure that they are easy to work with. This influenced my decision. Since Pandoc uses
more &lt;a href=&quot;https://pandoc.org/MANUAL.html#option--reference-doc&quot;&gt;standard style names&lt;/a&gt;, I decided to use
it for Org conversion.&lt;/p&gt;

&lt;h2 id=&quot;setting-up-pandoc&quot;&gt;Setting up Pandoc&lt;/h2&gt;

&lt;p&gt;To generate nice-looking DOCX files with Pandoc, you will need to configure a template
document. The recommended method for doing this is to generate a default template using Pandoc, and
then edit it in Word. I used &lt;a href=&quot;https://www.libreoffice.org/&quot;&gt;LibreOffice Writer&lt;/a&gt; for this, and it
worked just fine.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Install the latest Pandoc &lt;a href=&quot;https://pandoc.org/installing.html&quot;&gt;using these instructions&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Run the following command to generate &lt;strong&gt;custom-reference.docx&lt;/strong&gt;:
    &lt;div class=&quot;language-shell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;pandoc &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; custom-reference.docx &lt;span class=&quot;nt&quot;&gt;--print-default-data-file&lt;/span&gt; reference.docx
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;Open &lt;strong&gt;custom-reference.docx&lt;/strong&gt; in your word processor and edit the styles so they meet your needs.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;converting-from-org-to-docx&quot;&gt;Converting from Org to DOCX&lt;/h2&gt;

&lt;p&gt;One option for running the conversion is to take advantage of the
&lt;a href=&quot;https://github.com/emacsorphanage/ox-pandoc&quot;&gt;ox-pandoc&lt;/a&gt; package for Emacs. If you will always be
using the same configuration for your exports, this is a great option.&lt;/p&gt;

&lt;p&gt;However, I need to use a number of different configurations for converting documents, so I tend to
run Pandoc from the command line. Recent versions of ox-pandoc support &lt;a href=&quot;https://github.com/emacsorphanage/ox-pandoc#passing-options-to-pandoc&quot;&gt;passing options via Org
headers&lt;/a&gt;, but I still haven’t
bothered to set that up. It should be very easy to template this using 
&lt;a href=&quot;/2022/09/yasnippet-emacs-writing&quot;&gt;Yasnippet&lt;/a&gt;, though.&lt;/p&gt;

&lt;p&gt;I use a custom &lt;a href=&quot;https://fishshell.com/&quot;&gt;fish shell&lt;/a&gt; function that looks like this:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-fish&quot;&gt;function org2docx --description &apos;Generate a DOCX file using a custom reference document&apos;
    set -l refdoc &quot;$PATH_TO_REFERENCE_DOCX&quot;
    set -l base (basename -s .org $argv)
    echo Generating $base.docx ...
    pandoc --reference-doc $refdoc -o $base.docx $argv
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;From my fish shell command line, I can then just run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org2docx whatever.org&lt;/code&gt; to generate
&lt;strong&gt;whatever.docx&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I have not found a level of automation that makes my converted DOCX files completely perfect,
unfortunately. After conversion, I always open the new file in my word processor to make final
tweaks and fixes.&lt;/p&gt;

&lt;h2 id=&quot;have-fun-converting-files&quot;&gt;Have fun converting files!&lt;/h2&gt;

&lt;p&gt;The method I’ve outlined in this blog post is straightforward and fits my needs. There are definitely
improvements to be made, such as using templates to pass the proper options to Pandoc. Switching to
ox-pandoc would mean one fewer reason to leave Emacs, after all.&lt;/p&gt;

&lt;p&gt;In recent years, more and more clients have asked for files to be delivered via Google Docs.
I have yet to find a good conversion pipeline to get Org files into Google Docs easily. My method
right now takes too many manual steps. That’s a problem I would love to solve.&lt;/p&gt;

&lt;p&gt;Do you have a conversion pipeline for documents that works for you? Leave me a comment and let me
know!&lt;/p&gt;</content>
      </entry>
    
</feed>
