Dissecting the SharePoint 2010 Taxonomy fields

Tags: SharePoint 2010

An intense Twitter conversation initiated by Fabian about how Managed Metadata is updated in SharePoint 2010 gave me the idea to note down a few interesting bits about the Taxonomy Fields and how they work within a Site Collection. I hope/guess that Fabian will write a good post (as usual) about his findings as well.

Introduction

The possibility to tag documents in SharePoint is one of my favorite features and one of the reasons that I think you should move to SharePoint 2010 as soon as possible. As with every new feature added to the huge SharePoint spectrum, I have an urge to dive deep into it to really know how to use it fully. I've spent some time with the Managed Metadata Service Application and the taxonomy fields used with it. I can't say it has been a smooth ride all the way - but digging into the actual bits and understanding how it all works made it a whole lot easier. And you know what - why keep everything a secret!

Taxonomy Fields are Lookup columns!

Yes, you heard it right! It is a smart and clever implementation from the SharePoint Team (some say the opposite though). In order to get good performance from the taxonomy and managed metadata fields, all used keywords and terms within that Site Collection are stored in a hidden list (in the top-level root site).

If you've read my previous posts about how to create Taxonomy Site Columns you've probably seen that we use two different fields; one of the type TaxonomyFieldType and one hidden using the type Note. And when defining a field of the type TaxonomyFieldType we need to specify a reference to a list called TaxonomyHiddenList.

Looking at the properties with SharePoint Designer

This hidden list contains all used keywords and terms for the Site Collection, and SharePoint uses this list for fast retrieval of the labels of the keywords and terms. You can find the list by browsing to /Lists/TaxonomyHiddenList, by using SharePoint Designer to get the list id from the Site Properties (see image to the right), or by using SharePoint Manager 2010.

If we take a look at the columns in that list we will see that it contains a number of interesting things. The first interesting columns are the IdFor* (1-3). IdForTermStore (1) is the Guid of the term store used to store this term and IdForTerm (2) is the Guid for the term. Keywords don't belong to a term set, so the IdForTermSet (3) is an empty Guid, while managed metadata terms have a Guid corresponding to the term set.

Columns in the hidden list

Also worth noticing here is that the hidden list also contains the localized labels. I have the French language pack installed, so I can see both the English (4) term and path as well as the French one (5).

SharePoint uses this list as a lookup column so that it does not have to query the Managed Metadata Service all the time, but instead just looks it up in the local Site Collection.
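To make the column layout concrete, here is a minimal Python sketch of one hidden list item as described above. It is only a model for illustration, not SharePoint API code: the class and method names are hypothetical, the French column is assumed to follow the same Term<LCID> naming as Term1033 (so LCID 1036), and the fallback to the English label is my assumption.

```python
from dataclasses import dataclass, field
from typing import Dict
from uuid import UUID

# The "empty Guid" (all zeros) that the post says keywords carry in IdForTermSet.
EMPTY_GUID = UUID(int=0)


@dataclass
class HiddenListItem:
    """Hypothetical model of one row in TaxonomyHiddenList."""
    item_id: int                 # list item ID, which acts as the lookup ID
    id_for_term_store: UUID      # IdForTermStore
    id_for_term: UUID            # IdForTerm
    id_for_term_set: UUID        # IdForTermSet (empty Guid for keywords)
    labels: Dict[int, str] = field(default_factory=dict)  # LCID -> label

    def is_keyword(self) -> bool:
        # Keywords do not belong to a term set, so the term set Guid is empty.
        return self.id_for_term_set == EMPTY_GUID

    def label(self, lcid: int, fallback_lcid: int = 1033) -> str:
        # Assumption: fall back to English when no localized label is cached.
        return self.labels.get(lcid, self.labels[fallback_lcid])
```

With such a model, rendering a tag in display mode is just a local dictionary lookup by item ID, which is the whole point of keeping the labels in the Site Collection.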

You need proof?

Ok, let's make a really simple scenario. I have one document with Enterprise Keywords and a Managed Metadata column like this:

A document with metadata

Let's change the Term1033 column in the hidden taxonomy list to all uppercase letters and save the list item.

Change the hidden list info

When the list is reloaded you will immediately see that the column value has changed to the value we updated in the hidden list:

Display info is changed!

What happens if I delete one of the items in the hidden list?

Removed a tag!

As you probably guessed, it was removed from the file. Let's look at the document from another point of view - the edit properties view!

Edit info is not changed

What! As you can see, the term for which I changed the label is back to its normal state, but the deleted one is still missing (permanently). What's really happening here is that in edit mode the taxonomy fields query the Managed Metadata service directly - they do not use the local hidden list.

So, how do I get it back to where it was? The short answer is you can't. But by default a timer job is executed every hour. It is called Taxonomy Update Scheduler, and its job is to push down the term store changes to the hidden lists (very much like the sync between the site collection user list and the UPA). Unfortunately it only pushes down changed items, so no luck here. Instead you actually need to go change the term in the Term Store Management tool before running the timer job.

Warning: Under normal conditions you should never ever fiddle with the items in this hidden list. I'm just doing this to show you some stuff some of you have never seen or even thought about.
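The label part of the experiment above can be summarized in a small Python model. This is a toy sketch, not SharePoint code: all class and function names are hypothetical, it tracks only term labels (deleting hidden list items is not modeled), and the change tracking is a simplification of whatever the real timer job does internally.

```python
class TermStore:
    """Stands in for the Managed Metadata term store."""

    def __init__(self):
        self.labels = {}      # term id -> current label
        self.changed = set()  # term ids changed since the last sync

    def set_label(self, term_id, label):
        self.labels[term_id] = label
        self.changed.add(term_id)


class SiteCollection:
    """Stands in for a Site Collection with its hidden taxonomy list."""

    def __init__(self, store):
        self.store = store
        self.hidden_list = {}  # term id -> locally cached label

    def tag(self, term_id):
        # Using a term copies its label into the local hidden list.
        self.hidden_list[term_id] = self.store.labels[term_id]

    def display_label(self, term_id):
        # Display mode: read the local hidden list only.
        return self.hidden_list[term_id]

    def edit_label(self, term_id):
        # Edit mode: ask the term store directly.
        return self.store.labels[term_id]


def run_update_scheduler(store, site):
    """Push down only the terms that changed in the term store."""
    for term_id in store.changed:
        if term_id in site.hidden_list:
            site.hidden_list[term_id] = store.labels[term_id]
    store.changed.clear()
```

In this model, overwriting a cached label by hand changes what display mode shows while edit mode is unaffected, and running the scheduler does nothing until the term itself is changed in the term store.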

What about the Note field then?

Let's take a look at what is stored in these two taxonomy fields. The TaxonomyField, which is the lookup, looks quite similar to a lookup column. It has the lookup id and the value:

Taxonomy column in SharePoint Manager 2010
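For readers who want to pick such a value apart outside of SharePoint, here is a minimal Python sketch. It assumes the commonly seen single-value raw format `<lookup id>;#<label>|<term guid>`, which is not spelled out in this post, so treat the format and the function name as assumptions. Multi-valued fields are not handled.

```python
from typing import NamedTuple
from uuid import UUID


class TaxonomyValue(NamedTuple):
    lookup_id: int   # ID of the item in the hidden list
    label: str       # cached label of the term
    term_guid: UUID  # Guid of the term in the term store


def parse_taxonomy_value(raw: str) -> TaxonomyValue:
    """Parse a single value of the assumed form 'id;#label|guid'."""
    lookup_part, sep, rest = raw.partition(";#")
    if not sep:
        raise ValueError("missing ';#' separator: %r" % raw)
    # Split on the last '|' so the Guid is always the final segment.
    label, sep, guid_part = rest.rpartition("|")
    if not sep:
        raise ValueError("missing '|' separator: %r" % raw)
    return TaxonomyValue(int(lookup_part), label, UUID(guid_part))
```

Inside SharePoint you would of course use the taxonomy classes rather than string parsing; the sketch only shows how the lookup id and the term identity sit side by side in one value.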

The Note field (the hidden field) on the other hand contains just the term identifier and is the actual field used to store the connection to the term store. If you copy, move or upload a document to another Site Collection, SharePoint uses this value to update the TaxonomyField with the correct lookup values.

Hidden Note field in SharePoint Manager 2010

I've seen the TaxCatchAll field, what's that!?

If you use managed metadata there is a hidden column on all your list items or documents called TaxCatchAll. This field contains the IDs, in the hidden lookup list, of all terms and keywords used on the list item, and it is used by SharePoint when adding and updating items.
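Conceptually, the content of TaxCatchAll can be thought of as the union of the lookup IDs from every taxonomy field on the item. The Python sketch below illustrates just that idea; the function name and the input shape (field name mapped to a list of hidden list IDs) are hypothetical and not how SharePoint stores the data.

```python
from typing import Dict, List


def tax_catch_all(field_lookup_ids: Dict[str, List[int]]) -> List[int]:
    """Collect each hidden list ID used on an item once, in first-seen order."""
    collected: List[int] = []
    for lookup_ids in field_lookup_ids.values():
        for lookup_id in lookup_ids:
            if lookup_id not in collected:
                collected.append(lookup_id)
    return collected
```

For example, an item with an Enterprise Keywords field holding IDs 3 and 5 and a managed metadata column holding ID 5 would end up with the IDs 3 and 5 in this model.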

Who manages the hidden list?

Good question, it isn't you! The hidden list is automagically managed by two internal event receivers (they can be found in the Microsoft.SharePoint.Taxonomy assembly). These event receivers are responsible for adding items to the hidden list and cleaning up old and unused ones. There's also a feature called TaxonomyFieldAdded, stapled to the site definitions, which is responsible for creating the hidden list as well as adding the item receivers.

Summary

This was it - a quick introduction to how the SharePoint feature set is very cleverly used to make the tagging functionality available in SharePoint. Essentially it is just plain ol' lookup columns with some event receivers that do all the magic for us!

Comments

  • Deepak said

    This kind of metadata architecture is a really good feature for tagging. But the project I took over this month from another developer might have overused the taxonomy concept: all the relationships between the 10-15 lists are maintained through taxonomy terms. The issue is that all these relationships create so many list items in the hidden list that it is reaching the list threshold limit. My question: is there a way to reduce the number of list items, or can I clear items in the hidden list? I know this is not a solution. One more idea that could make sense: can I split this hidden list? Could you advise on this? Your input is highly appreciated.

  • Wictor said

    You should never manipulate the taxonomy hidden list! How many items do you have in that list? First of all there are indexed columns in the list and there is the list throttling as well - so can you explain what issues you are seeing?

  • Patrik said

    Great description! I have a problem that the values in the hidden list aren't updated by the Taxonomy Update Scheduler job. Anyone who has got a solution for this problem? http://social.msdn.microsoft.com/Forums/en-US/sharepoint2010general/thread/435d2806-d1d4-45ba-9f78-c6b73167976a

  • Kevin said

    I can see how this might prove to be a real nightmare for those of us trying to migrate data from a database into a site. Since this is really just a lookup column, then a lot of the issues we see with secondary apps (e.g. Access) are repeated here. Those MMD columns cannot be edited except directly in the browser forms. So I'm wondering how to write either workflows or code to solve this issue?

  • Wictor said

    You can for sure edit the values of a MMD column in code. Just use the correct class (TaxonomyField) to update the values - don't try to manually set the correct string value unless you know how to do it. (which might be a good idea for a new post :-)

  • Rajesh said

    Hi Wictor We are using a managed metadata field to tag a page in SharePoint. In our case we need to tag the content to a huge list, so there would be more than 500 tags. When we try to save the page, we get an RPC error - "The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100". We tried to handle this in ItemUpdating event, by moving the items from the taxonomy field to a text field and setting the taxonomyfieldvalue to empty in the AfterProperties. But that too throws the same exception. Would you know if there is any workaround? Thanks

  • Cas van Iersel said

    Hey Wictor, great article about the Taxonomy field! Used a lot of information from this blog. However, now I'm running into a problem with Sandboxed Solutions. I can deploy Taxonomy fields and I can see the Hidden Field is created by SharePoint. However, when I retract & re-deploy my solution there's an error saying the field already exists. The error is actually saying that the Hidden Field for my Taxonomy field already exists. Any solutions for this? Or should I just write a FeatureReceiver which will clean up the hidden fields for me? Thanks Cas

  • mahesh said

    Hi,

    I am migrating a 2010 site collection to a new site collection and facing a weird problem. Migration went well. Then I created some Managed Metadata columns and content types and assigned them to a document library. I could upload the document and fill in the MM columns too, but somehow the TaxCatchAll column is empty. Is there anything I have to do to get it working? Any help would be really appreciated.

    Thanks,
    Mahesh

  • Daniel Tshin said

    So, the List GUID for the TaxonomyHiddenList changes per site collection. That is a bit problematic for me, as I'm creating lists and libraries using a saved STP through a feature receiver. In the Manifest.xml for the STP, the taxonomy fields reference the TaxonomyHiddenList via GUID - can I make the reference more generic?

    I was able to hack the STP for lookups to local lists (change it from:
    List="{[Some GUID]}"
    to:
    List="Lists/LookupSourceListName"

    How do I reference the Site Collection's TaxonomyHiddenList in the STP?

    Or is this even a good approach?

  • Sushmi said

    Hi Wictor,

    I have this problem, wherein we had a Site Column pointing to a MMS Term which had labels only in English. Language packs were added later. Ideally this site column should be pointing to the right language labels in the corresponding language sites. But it is Not working. It loads only English terms.

    Although, a new Site Column gives the right results.

    How do I solve this?

About Wictor...

Wictor Wilén is a Director and SharePoint Architect working at Connecta AB. Wictor has achieved the Microsoft Certified Architect (MCA) - SharePoint 2010, Microsoft Certified Solutions Master (MCSM) - SharePoint and Microsoft Certified Master (MCM) - SharePoint 2010 certifications. He has also been awarded Microsoft Most Valuable Professional (MVP) for four consecutive years.
