Excel Services 101
I guess that many of you know about the Excel Calculation Services, one of the key technologies that are available with the Enterprise Edition of Microsoft Office SharePoint Server 2007 and SharePoint Server 2010.
Even if they are often considered only for one of the features they provide – the rendering of an Excel document inside the browser – the Excel Calculation Services are indeed much more than this.
They provide a remote API, based on SOAP Web Services, that you can use in order to start a calculation (based, of course, on an Excel document) leveraging the resources (CPU and memory) of your SharePoint Server.
Moreover, starting from SharePoint Server 2010, a REST based API is available that makes part of the Excel document (yes, charts included!) available as data objects that can be retrieved by using simple HTTP GET calls, making this API a perfect tool for some client-side “mashup” logic.
Combining these features with some Excel Sheets manipulation API can help you create complex solutions with a very limited effort.
Open Xml 101
Since Office 2007, the legacy “xls” binary file format has been superseded by the brand-new Office OpenXml specification, which is an ECMA and ISO standard and is implemented through the “xlsx” file format.
Xlsx files are just an archive, so that if you rename an xslx file with a .zip extension and try to open it with your favorite software, you’ll end up with a nested folder structure containing images, media files and… xml files!
Those XML file are the core structure of the document, and can be manipulated in order to edit the contents of the Excel document itself.
Although you could do this “by hand”, if you have to develop against an Office OpenXml document you should definitely leverage the OpenXml SDK, a managed .NET library that makes available a bunch of classes that expose entities such as Cells, Sheets, Charts and so on.
These APIs mimic the internal representations of these entities, and you will notice that there is very little abstraction over the XML file structure, but anyway they are definitely easier to use and, of course, to learn, so I would suggest you to follow this way if you need to develop against Excel 2007/2010 documents.
Using OpenXml to produce ECS compatible documents
So… after this very (very, very) short introduction to the Excel Calculation Services and the OpenXml API, let’s try to combine them in a simple scenario.
Imagine you want to collect information from your users leveraging the OOTB SharePoint user interface (i.e. Lists and List Forms), but later use that information to drive some complex calculation and provide an output based on charts and pivot tables. That complex calculation is, indeed, the core of your solution.
You would use the Excel Calculation Services as the “brain” of your solution, so the first step is to prepare a “template” that can be rendered by the browser (the Excel Calculation Services do not support all the features of an Excel document, so be prepared to test your template as long as you implement it).
Your template will contain all the calculation logics (i.e. formulas, etc…) that will be applied to a set of cells where the user will be able to insert its own data. But of course, in the template those are cells “null”.
Once ready, you could just save it into a common repository (a document library, of course) that you make “unreachable” by the user (i.e. protected by authorization).
The reason why we might want to “hide” the main document to the user is to protect our intellectual property: if we would let the user download the template, edit it and upload it back to SharePoint, the user would be able to inspect the document and grab the logics behind it, which is not always acceptable.
Now… the template is there, it cannot be downloaded and it contains tons of computation formulas that are applied to null cells… mmhhh… how will it ever work??!
Well… here you start writing code J
First of all, you should collect the data entered by the user through the OOTB SharePoint user interface: since you are developing a server component, you could just use the SharePoint Server Object Model to access the list and its items, maybe feeding some in-memory objects for later use.
Then, and here comes the interesting part, you could open the calculation template using OpenXml. Since the document is not accessible by the user, if the code you are developing is supposed to run within the security context of the user (a web part? An application page triggered by some Ribbon button?) you should use some impersonation to access it.
I’m not displaying all of the previous steps since I wish to focus on the OpenXml part, as displayed by the following code snippet:
using (MemoryStream templateStream = new MemoryStream())
templateStream.Write(templateBytes, 0, (int)templateBytes.Length);
using (var xlsxDoc = SpreadsheetDocument.Open(streamTemplate, true))
Here I’m supposing you have loaded a Byte instance (templateBytes) with the contents coming from the XLSX template, and I’m using that object to open an instance of SpreadsheetDocument, the main entry point for our logics.
From there, you can iterate through the sheets using some straightforward LINQ syntax. Below you can find a short snippet that gets access to a Sheet given its name (sheetName):
IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == sheetName).FirstOrDefault();
Then you can obtain the reference to a cell, identified by a row index and a column name:
var row = worksheet.GetFirstChild<SheetData>().Elements<Row>().Where(r => r.RowIndex == rowIndex).FirstOrDefault();
var cell = row.Elements<Cell>().Where(c => string.Compare(c.CellReference.Value, columnName + rowIndex, true) == 0).FirstOrDefault();
Setting the cell value is of course the last operation you should perform:
if (cell.DataType != "s")
cell.CellValue.Text = inputFromUser;
Be careful in this last step: string cells (the ones identified by an “s” data type) are treated differently since they are stored in a separate worksheet part, the Shared String Table Part, and the cell value just contains a numeric identifier that identifies the string in the Shared String Table. I’m not showing you how to copy a string value, but you can imagine it’s just a matter of writing some additional lines of code in order to fin (or insert) the string in the Shared String Table before setting the cell value.
Once you have finished copying the values inserted by the user, you just need to save the resulting stream as a document in another SharePoint document library, which should be – of course – accessible to the user. Before uploading the final document, just remember to “finalize” the modifications:
xlsxDoc.WorkbookPart.Workbook.CalculationProperties.ForceFullCalculation = true; xlsxDoc.WorkbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
xlsxDoc.WorkbookPart.Workbook.CalculationProperties.CalculationOnSave = true;
At the end of this process, the user will find the “merge results” of its own data with the calculation logics applied by the template, which makes it possible to display charts, pivot tables and reports based on the data collected through a SharePoint List user interface.
But… won’t the user be able to read the formulas that sit behind the Excel document?
It’s just a matter of authorizations! If you just assign the View Only permissions to the user, he will only be able to open the document in the browser or download a “snapshot” of it: opening the snapshot with the Excel client will only display the results of the computation, without any reference to formulas that are completely hidden (I should say “flattened”) from the user.
Of course, I have just scratched the surface of what you can do using OpenXml and the Excel Calculation Services.
For example, I did not mention Data Connections, which could be an alternative approach to the scenario I have just described: I could have defined a data connection based on Web Services that would read the user data directly from the SharePoint list, obtaining similar results. Although that would have been a “no-code” solutions, it would have been much less flexible: suppose you wish to hide/show a sheet based on some condition, well, that is only possible through the use of some API.
Also, I did not talk about the infrastructure requirements for the Excel Service Service Application, nor I talked about the settings that you need to apply (trusted libraries, global settings, identity, etc…).
But I hope that the topics I talked you about can at least give you an idea of what you can accomplish with these two powerful technologies J