Getting a list of files from a MOSS document library using a SharePoint web service - Sparks from the Anvil

Getting a list of files from a MOSS document library using a SharePoint web service

My challenge was simple.  I needed to develop an SSIS package that would download and extract data from every Excel file held in document libraries across several SharePoint sites.  SSIS was the natural choice as the data needed to be cleaned and validated before being imported into a database. However, SSIS is not great with web services – especially in the data flow. As I not worked with the SharePoint web services much, I started with a good old Console application.
 
MOSS, or more accurately, WSS provides a whole host of web services to obtain information about SharePoint sites.  However, figuring out which method to invoke and what parameters to pass is more problematic.  Especially as many of the parameters are chunks of Collaborative Application Mark-up Language (CAML) – a dialect of XML developed by Microsoft specifically for use with SharePoint.
 
A False Start
My first console app simply obtained the GUID of the document library using the GetListCollection() method of the Lists web service.  The GUID was then passed to the GetListItems() method which duly provided all documents and folders at the top level of the document library.  It then seemed logical to me to recursively call the GetListItems() method using the GUID of each sub-folder.  On no, how wrong could I be!  The GetListItems() method simply chokes on these folder GUIDs.
 
On searching the internet I found many other incorrect forum posts and blog entries about the same topic – but no working solutions.  I also made an extensive search of my eBook collection – but again no solutions – which overall motivated me to write this blog entry.
 
The solution - RTFM
Well, if I had read the whole page in the manual, I would have got to the solution earlierCrying.  The key to my puzzle was the QueryOptions XML fragment which has both a Folder element and the all important <ViewAttributes Scope="Recursive" /> element. Using these elements together, it is possible to obtain a list of all documents in all subfolders in the list. Indeed, it does not even bother returning the subfolder details!
 
So here is the code for my working C# sample.
 

using System;

using System.Collections.Generic;

using System.Text;

using System.Xml;

using System.Web.Services;

using System.Web;

using System.Net;

 

namespace ConsoleApplication1

{
    class Program
    {

        static void Main(string[] args)
        {

string siteUrl = @"http://yourserver/sites/yoursite";

string documentLibraryName = @"Shared Documents";

SharePointList.Lists wsList = new SharePointList.Lists();

wsList.Credentials = System.Net.CredentialCache.DefaultCredentials;

WebProxy proxyObj = new WebProxy("yourproxy", 80);

wsList.Proxy = proxyObj;

wsList.Url = siteUrl + @"/_vti_bin/lists.asmx";

 

// get a list of all top level lists

XmlNode allLists = wsList.GetListCollection();

// load into an XML document so we can use XPath to query content

XmlDocument allListsDoc = new XmlDocument();

allListsDoc.LoadXml(allLists.OuterXml);

// allListsDoc.Save(@"c:\allListsDoc.xml"); // for debug

XmlNamespaceManager ns = new XmlNamespaceManager(allListsDoc.NameTable);

ns.AddNamespace("d", allLists.NamespaceURI);

 

// now get the GUID of the document library we are looking for

XmlNode dlNode = allListsDoc.SelectSingleNode("/d:Lists/d:List[@Title='" + documentLibraryName + "']", ns);

if (dlNode == null)

{

Console.WriteLine("Document Library '{0}' not found!", documentLibraryName);

}

else

{

// obtain the GUID for the document library and the webID

string documentLibraryGUID = dlNode.Attributes["ID"].Value;

string webId = dlNode.Attributes["WebId"].Value;

 

Console.WriteLine("Opening folder '{0}' GUID={1}", documentLibraryName, documentLibraryGUID);

 

// create ViewFields CAML

XmlDocument viewFieldsDoc = new XmlDocument();

XmlNode ViewFields = AddXmlElement(viewFieldsDoc, "ViewFields", "");

AddFieldRef(ViewFields, "GUID");

AddFieldRef(ViewFields, "ContentType");

AddFieldRef(ViewFields, "BaseName");

AddFieldRef(ViewFields, "Modified");

AddFieldRef(ViewFields, "EncodedAbsUrl");

//viewFieldsDoc.Save(@"c:\viewFields.xml"); // for debug

 

// create QueryOptions CAML

XmlDocument queryOptionsDoc = new XmlDocument();

XmlNode QueryOptions = AddXmlElement(queryOptionsDoc, "QueryOptions", "");

AddXmlElement(QueryOptions, "Folder", documentLibraryName);

AddXmlElement(QueryOptions, "IncludeMandatoryColumns", "FALSE");

// this element is the key to getting the full recusive list

XmlNode node = AddXmlElement(QueryOptions, "ViewAttributes", "");

AddXmlAttribute(node, "Scope", "Recursive");

queryOptionsDoc.Save(@"c:\queryOptions.xml"); // for debug

 

// obtain the list of items in the document library

XmlNode listContent = wsList.GetListItems(documentLibraryGUID, null, null, ViewFields, null, QueryOptions, webId);

 

XmlDocument xmlResultsDoc = new XmlDocument();

xmlResultsDoc.LoadXml(listContent.OuterXml);

ns = new XmlNamespaceManager(xmlResultsDoc.NameTable);

ns.AddNamespace("z", "#RowsetSchema");

// xmlResultsDoc.Save(@"c:\listContent.xml"); // for debug

 

XmlNodeList rows = xmlResultsDoc.SelectNodes("//z:row", ns);

if (rows.Count == 0)

{

Console.WriteLine("No content found");

}

foreach (XmlNode row in rows)

{

Console.WriteLine(row.Attributes["ows_ContentType"].Value + " " + row.Attributes["ows_GUID"].Value + " :: " + row.Attributes["ows_BaseName"].Value);

}

}

Console.WriteLine("Done");

Console.Read();

}

 

public static XmlNode AddXmlElement(XmlNode parent, string elementName, string elementValue)

{

XmlNode element = parent.AppendChild(parent.OwnerDocument.CreateNode(XmlNodeType.Element, elementName, ""));

if (elementValue != "")

element.InnerText = elementValue;

return (element);

}

 

public static XmlNode AddXmlElement(XmlDocument parent, string elementName, string elementValue)

{

XmlNode element = parent.AppendChild(parent.CreateNode(XmlNodeType.Element, elementName, ""));

if (elementValue != "")

element.InnerText = elementValue;

return (element);

}

 

public static XmlNode AddXmlAttribute(XmlNode element, string attrName, string attrValue)

{

XmlNode attr = element.Attributes.Append((XmlAttribute)element.OwnerDocument.CreateNode(XmlNodeType.Attribute, attrName, ""));

if (attrValue != "")

attr.Value = attrValue;

return (attr);

}

 

public static void AddFieldRef(XmlNode viewFields, string fieldName)

{

    XmlNode fieldRef = AddXmlElement(viewFields, "FieldRef", "");

    AddXmlAttribute(fieldRef, "Name", fieldName);

}

}

}


Published Friday, November 2, 2007 12:06 AM by DrJohn
Filed under: ,

Comments

# SSIS: Calling SharePoint web services from the data flow

Sunday, December 9, 2007 5:45 PM by Sparks from the Anvil

In my previous post , I outlined how to call a SharePoint web service to obtain a list of all Excel files

# re: Getting a list of files from a MOSS document library using a SharePoint web service

Monday, April 7, 2008 6:39 PM by reskatze

Thanks for this blog.

One question:

How do I get the names of the (recursive) folders?

# re: Getting a list of files from a MOSS document library using a SharePoint web service

Friday, June 4, 2010 8:02 PM by jazzyg

I am getting error at sharepointList reference. What lib needs to be referenced?

Thanks

# re: Getting a list of files from a MOSS document library using a SharePoint web service

Saturday, June 5, 2010 7:27 AM by DrJohn

Since I wrote these original posts, a set of sample applications has been published on CodePlex which provide integration between SharePoint and SSIS.  See technet.microsoft.com/.../dd365137%28SQL.100%29.aspx and sqlsrvintegrationsrv.codeplex.com

# Export a SharePoint 2007 wiki to the filesystem &laquo; Jeremy Branham&#039;s Blog

Pingback from  Export a SharePoint 2007 wiki to the filesystem &laquo; Jeremy Branham&#039;s Blog

# Export a SharePoint 2007 wiki to the filesystem &laquo; Jeremy Branham&#039;s Blog

Pingback from  Export a SharePoint 2007 wiki to the filesystem &laquo; Jeremy Branham&#039;s Blog

# just click the following internet page

Wednesday, November 19, 2014 10:38 PM by just click the following internet page

Getting a list of files from a MOSS document library using a SharePoint web service - Sparks from the Anvil

# How To Export Document Library In Sharepoint 2007 | Information

Pingback from  How To Export Document Library In Sharepoint 2007 | Information