Extract Formatted Text From Excel Cell With C# (Rich Text Format)

by Filip Stanek 16. September 2009 17:17

I was writing an application that needed to convert text in a cell in an Excel workbook to HTML. It is fairly trivial to get formatting for the entire cell, but each individual character in the cell could have different formatting itself, so I needed something more specific than cell-level formatting info.

At first, I started using the Excel.Range.get_Characters( pos, len ) method to get info out of the cell.  The code would loop through all characters, get them one by one, and check the formatting.  For example:

   1: Microsoft.Office.Interop.Excel.Range Range = (Microsoft.Office.Interop.Excel.Range)Cell;
   2: int TextLength = Range.Text.ToString().Length;
   3: for (int CharCount = 1; CharCount <= TextLength; CharCount++)
   4: {
   5:     Microsoft.Office.Interop.Excel.Characters charToTest = Range.get_Characters(CharCount, 1);
   6:     bool IsBold = (bool)charToTest.Font.Bold;
   7:     bool IsItalic = (bool)charToTest.Font.Italic;
   8:     // other formatting tests here
   9: }

However, that method proved to be incredibly slow for cells that have more than just a few characters.  For cells that have 1000+ characters, it would take several minutes to run the test across all characters. I kept playing around with different ways to speed up the whole process, but it just became apparent that making the call to Excel to get all of this information was not going to be acceptable.

Finally, I think I’ve found the solution. It is possible to copy the text from a cell to the clipboard, and then use the Clipboard class to retrieve the formatted text, and parse it with C#. I ended up using the System.Windows.DataFormats.Rtf format to extract the data from the clipboard in the following way:

string rtf = (string)System.Windows.Clipboard.GetData(System.Windows.DataFormats.Rtf);

Then, I create a System.Windows.Forms.RichTextBox, and use that to parse the data. The following is a sample of the solution, and it is reasonably quick.

   1: Microsoft.Office.Interop.Excel.Range Range = (Microsoft.Office.Interop.Excel.Range)Cell;
   2: Range.Copy(System.Reflection.Missing.Value);
   3:             
   4: string rtf = (string)System.Windows.Clipboard.GetData(System.Windows.DataFormats.Rtf);
   5: System.Windows.Forms.RichTextBox rtb = new System.Windows.Forms.RichTextBox();
   6: rtb.Rtf = rtf;
   7:             
   8: int CharCount = rtb.Text.Length;
   9:  
  10: for (int CharNum = 0; CharNum < CharCount; CharNum++)
  11: {
  12:    rtb.Select(CharNum, 1);
  13:    System.Drawing.Font Font = rtb.SelectionFont;
  14:    bool IsCharBold = Font.Bold;
  15:    bool IsCharUnderline = Font.Underline;
  16:    bool IsCharItalic = Font.Italic;
  17:  
  18:    // other code here
  19: }

Tags: , , ,

Windows Development

Vista Sidebar Gadget, Transparent Backgrounds, and Font Color

by Filip Stanek 9. May 2008 12:30

We've had a request from a client to build a Windows Vista Sidebar Gadget.  The background on the gadget is to be somewhat see-through.  So I've played around with gadgets, and noticed that when I make a semi-transparent background on the gadget, the font color gets all messed up.  For example, I wanted white text on a blue background with 50% opacity, and the font appeared blueish, even though it was supposed to be white.  If I turn off the opacity on the background, the font becomes white, but as long as there's opacity, the font is messed up.

Well, after doing some research, it appears there is a way to display the proper font color on transparent backgrounds, although I didn't find this anywhere on the Microsoft site.  The trick is to use JavaScript and the addTextObject() method.  Here's an example of some code, and the resulting gadget:

 

   1: <html> 
   2:   <head> 
   3:     <title>Sample Gadget</title> 
   4:     <style> 
   5:       body { width: 120px; height: 200px; font-size: 12px; margin: 0; font-family: Verdana; color: #FFFFFF; } 
   6:       .gadget { margin: 3px; } 
   7:     </style> 
   8:   </head> 
   9:   <body> 
  10:     <div class="gadget"> 
  11:       <div>This text is not white.</div> 
  12:     </div> 
  13:     <g:background id="background" src="images\bgblue.png" style="width:100%;height:100%;z-index: -1; position: absolute; top: 0; left: 0;" opacity="50"></g:background> 
  14:     <script language="javascript" type="text/javascript">
   1: document.getElementById("background").addTextObject("This text is white.", "Verdana", 12, "white", 3, 50);
</script>
  15:   </body> 
  16: </html>

Tags: ,

Windows Development

Tag cloud

About Filip Stanek

Death Note Pic I'm a developer at ACG Multimedia in Cincinnati, OH. Besides working with ASP.NET, Flash, and other web technologies, I enjoy playing chess, video games, etc.

Currently playing:
- Final Fantasy XIII
E-mail me Send mail

Recent Comments

Comment RSS

Month List

Page List