Threaded Word File Conversion in Office Developer Automation

I have tested your code carefully. The function SaveToHtml converts ONE
doc to ONE html file. According to its code, every call to the function
does the following:
Step 1. Create an instance of Word (word = new Word.Application()).
Step 2. Convert the document format.
Step 3. Release/close the Word instance (NAR(word)).

Step 2 is very fast in my experience. However, Steps 1 and 3 generally
take a long time, because every call of new Word.Application() creates a
new winword.exe process. This explains why converting the 1000 Word files
is very slow in single-thread mode, and it also explains why you see a lot
of Word processes in Task Manager when using multiple threads to do the
conversion.

The recommended approach is to create a single Word.Application object, or
get the active Word.Application instance running on your computer, re-use
this instance to convert all the files, and close/release it at the end.
This way the whole process should become much faster even without
multi-threading. (To be honest, I don’t think multi-threading will make it
faster in this case.)

Here is an example for your reference:

// Requires: using System.IO; using System.Runtime.InteropServices;
//           using Word = Microsoft.Office.Interop.Word;
private void ConvertFiles()
{
    if (!Directory.Exists(txtSourceDirectory.Text))
        txtStatus.Text = "Source Directory does not exist.";
    else if (!Directory.Exists(txtDestinationDirectory.Text))
        txtStatus.Text = "Destination Directory does not exist.";
    else
    {
        // start from an empty destination directory
        Directory.Delete(txtDestinationDirectory.Text, true);
        Directory.CreateDirectory(txtDestinationDirectory.Text);

        Word._Application word = null;
        object Missing = System.Reflection.Missing.Value;

        try
        {
            // try to get a running instance of winword.exe
            word = (Word._Application)Marshal.GetActiveObject("Word.Application");
        }
        catch
        {
            // if no winword is already running, we start a new instance of Word
            word = new Word.Application();
        }

        Word._Document doc = null;

        object fileToOpen, fileToSave;
        object FileFormat = Word.WdSaveFormat.wdFormatFilteredHTML;
        object readOnly = true;
        object isVisible = false;

        string[] SourceFiles =
            Directory.GetFiles(txtSourceDirectory.Text, "*.doc");
        foreach (string fileName in SourceFiles)
        {
            // skip files whose names start with "."
            if (!Path.GetFileName(fileName).StartsWith("."))
            {
                fileToOpen = (object)fileName;
                // change only the extension; Replace(".doc", ".html") on the
                // whole path would also rewrite ".doc" inside directory names
                fileToSave = (object)Path.Combine(txtDestinationDirectory.Text,
                    Path.ChangeExtension(Path.GetFileName(fileName), ".html"));

                // open the document read-only and invisible
                doc = word.Documents.Open(ref fileToOpen,
                    ref Missing, ref readOnly, ref Missing, ref Missing,
                    ref Missing, ref Missing, ref Missing, ref Missing,
                    ref Missing, ref Missing, ref isVisible, ref Missing,
                    ref Missing, ref Missing, ref Missing);

                // save it as filtered HTML
                doc.SaveAs(ref fileToSave,
                    ref FileFormat, ref Missing, ref Missing, ref Missing,
                    ref Missing, ref Missing, ref Missing, ref Missing,
                    ref Missing, ref Missing, ref Missing, ref Missing,
                    ref Missing, ref Missing, ref Missing);

                // close the document
                doc.Close(ref Missing, ref Missing, ref Missing);
                NAR(doc);
                doc = null;
            }
        }

        // quit the instance of Word in the end (closes all documents).
        // Note: if we attached to an already running Word above, this
        // closes that instance as well.
        word.Quit(ref Missing, ref Missing, ref Missing);
        NAR(word);
        word = null;
    }
}
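
The example above calls NAR, which comes from your code and is not shown
in this reply. In case it helps other readers, here is a minimal sketch of
what such a helper usually looks like. I am assuming it is a thin wrapper
around Marshal.ReleaseComObject; the class name ComHelper is only for
illustration, and your own version may differ.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComHelper
{
    // Assumed behavior: release the COM reference held by a runtime
    // callable wrapper and ignore anything that is not a COM object.
    public static void NAR(object o)
    {
        if (o == null)
            return;

        try
        {
            if (Marshal.IsComObject(o))
                Marshal.ReleaseComObject(o);
        }
        catch
        {
            // releasing is best effort; a failure here should not
            // abort the conversion loop
        }
    }
}
```

After calling it, set the variable to null (as the example does) so that
the released wrapper cannot be used again by mistake.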

If you insist on using multi-threading, we need to create and close the
Word.Application instance in the ConvertFiles function, and do the
conversion job in the SaveToHtml function by re-using that Word instance.
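
A rough sketch of that split is below. Please treat it as an outline
under assumptions, not as your actual code: the class name, the
SaveToHtml and ConvertFiles signatures, the HtmlPathFor helper and the
WordLock object are all hypothetical. The lock is a conservative choice of
mine so that only one thread talks to the shared Word instance at a time.
The loop is sequential here; worker threads would call SaveToHtml in the
same way.

```csharp
using System.IO;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

public static class SharedWordConverter
{
    private static readonly object WordLock = new object();
    private static object Missing = System.Reflection.Missing.Value;

    // Hypothetical helper: destination path with the extension swapped.
    public static string HtmlPathFor(string destinationDirectory, string sourceFile)
    {
        return Path.Combine(destinationDirectory,
            Path.ChangeExtension(Path.GetFileName(sourceFile), ".html"));
    }

    // Converts one file with a Word instance owned by the caller.
    // It never creates or quits Word itself.
    public static void SaveToHtml(Word._Application word,
        string sourceFile, string destinationDirectory)
    {
        object fileToOpen = sourceFile;
        object fileToSave = HtmlPathFor(destinationDirectory, sourceFile);
        object fileFormat = Word.WdSaveFormat.wdFormatFilteredHTML;
        object readOnly = true;
        object isVisible = false;

        lock (WordLock) // assumption: serialize access to the shared instance
        {
            Word._Document doc = word.Documents.Open(ref fileToOpen,
                ref Missing, ref readOnly, ref Missing, ref Missing,
                ref Missing, ref Missing, ref Missing, ref Missing,
                ref Missing, ref Missing, ref isVisible, ref Missing,
                ref Missing, ref Missing, ref Missing);
            try
            {
                doc.SaveAs(ref fileToSave,
                    ref fileFormat, ref Missing, ref Missing, ref Missing,
                    ref Missing, ref Missing, ref Missing, ref Missing,
                    ref Missing, ref Missing, ref Missing, ref Missing,
                    ref Missing, ref Missing, ref Missing);
            }
            finally
            {
                doc.Close(ref Missing, ref Missing, ref Missing);
                Marshal.ReleaseComObject(doc);
            }
        }
    }

    // Owns the Word lifetime: one instance for the whole batch.
    public static void ConvertFiles(string sourceDirectory, string destinationDirectory)
    {
        Word._Application word = new Word.Application();
        try
        {
            foreach (string file in Directory.GetFiles(sourceDirectory, "*.doc"))
                SaveToHtml(word, file, destinationDirectory);
        }
        finally
        {
            word.Quit(ref Missing, ref Missing, ref Missing);
            Marshal.ReleaseComObject(word);
        }
    }
}
```

The point of the layout is that winword.exe is started and stopped once per
batch, not once per file, whichever thread ends up calling SaveToHtml.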

If you have any other concerns or questions, please feel free to let me
know.

Regards,
Jialiang Ge (jialge@online.microsoft.com, remove ‘online.’)
Microsoft Online Community Support