Cloned VM slow due to Distributed Cache issues in SharePoint 2013

9 Jul

Recently I worked on a SharePoint 2013 VM for development purposes. This cloned VM had 16 GB of memory, which isn’t a lot for an SP2013 environment. To get workable performance out of the VM I stopped the Search service, but that didn’t speed things up as much as I had hoped.

Since we’re using VMware, the next thing I tried was checking ‘Reserve all guest memory (All locked)’. This worked like a charm, even with Search enabled, but only for a very short time. I then started monitoring the ULS logs and immediately noticed issues with the Distributed Cache, such as:

  • There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers. For on-premises cache clusters, also verify the following conditions. Ensure that security permission has been granted for this client account, and check that the AppFabric Caching Service is allowed through the firewall on all cache hosts. Also the MaxBufferSize on the server must be greater than or equal to the serialized object size sent from the client.). Additional Information : The client was trying to communicate with the server : net.tcp://<servername>:22233

The server name in the above message wasn’t the name of the machine I was working on; it still pointed to the machine from which this one was cloned.

Checking the available cache hosts with PowerShell confirmed this:

#Set context to cluster
Use-CacheCluster
#List all cache host services present in cluster
Get-CacheHost

The cache host listed is the one on the ‘old’ server, with service status UNKNOWN:

HostName : CachePort      Service Name               Service Status
--------------------      ------------               --------------
<old_server>:22233        AppFabricCachingService    UNKNOWN

Since a cache cluster is already present, the current server can be added to it as a cache host:

#Stop the distributed cache service instance
Stop-SPDistributedCacheServiceInstance -Graceful
#add the server as a cache host
Add-CacheHost -ConnectionString "Data Source=<new_server>;Initial Catalog=SP_CONFIG;Integrated Security=True;Enlist=False" -ProviderType "SPDistributedCacheClusterProvider"

The connection string can be found in the registry under:

  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\AppFabric\V1.0\Configuration
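Rather than opening the registry editor, the same key can be read from PowerShell. This is only a small sketch: the value names ConnectionString and Provider are an assumption about what the key contains, so run Get-ItemProperty on its own first to see which values are actually present.

```powershell
# Sketch: read the AppFabric cluster settings from the registry key mentioned above.
$key = 'HKLM:\SOFTWARE\Microsoft\AppFabric\V1.0\Configuration'

# Lists every value stored under the key.
Get-ItemProperty -Path $key

# Assumed value names; adjust them if the listing above shows different ones.
Get-ItemProperty -Path $key | Select-Object ConnectionString, Provider
```

The connection string returned here is the one to pass to Add-CacheHost and Register-CacheHost.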

There were multiple cloned servers to fix, and on some of them this message appeared:

Service is already configured on this host.

In that case, run Remove-CacheHost to unconfigure the host and proceed with the next steps.

The next step is to register the server as a cache host:

#register the server as a cache host
Register-CacheHost -ConnectionString "Data Source=<new_server>;Initial Catalog=SP_CONFIG;Integrated Security=True;Enlist=False" -ProviderType "SPDistributedCacheClusterProvider"
#Check if cache host is registered successfully
Get-CacheHost

Now two cache hosts are listed:

HostName : CachePort      Service Name               Service Status
--------------------      ------------               --------------
<old_server>:22233        AppFabricCachingService    UNKNOWN
<new_server>:22233        AppFabricCachingService    DOWN

The configuration of the cache cluster now has to be exported and adjusted so that it no longer references the old server:

#Stop the cluster
Stop-CacheCluster
#This may return one or both of the following messages:
#  Invalid operation encountered on <old_server>:AppFabricCachingService : Cannot open
#  Service Control Manager on computer '<old_server>'. This operation might require other privileges
#  No hosts running in cluster
#Both are expected, because <old_server> cannot be reached. Just proceed.

#export the cluster configuration so changes can be made
Export-CacheClusterConfig D:\CODE\clusterconfig.xml

The cluster configuration file needs to be modified. The old server reference has to be deleted; the new server reference was already added when the cache host was registered.
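The edit can be done by hand in a text editor, but with several cloned servers it may be easier to script. Below is a minimal sketch of that idea. Remove-HostFromClusterConfig is a hypothetical helper written for this post, not an AppFabric cmdlet, and it assumes the exported file keeps its host entries as <host> elements inside a <hosts> element, as in the fragment shown in this post.

```powershell
# Hypothetical helper (not part of AppFabric): removes the <host> entries of one
# server from an exported cluster configuration and saves the result to a new file.
# Use full paths: XmlDocument resolves relative paths against the process
# working directory, not the current PowerShell location.
function Remove-HostFromClusterConfig {
    param(
        [Parameter(Mandatory = $true)][string]$Source,
        [Parameter(Mandatory = $true)][string]$Target,
        [Parameter(Mandatory = $true)][string]$HostName
    )

    $config = New-Object System.Xml.XmlDocument
    $config.Load($Source)

    # local-name() keeps the query working whether or not the file declares an XML namespace.
    $xpath = "//*[local-name()='hosts']/*[local-name()='host'][@name='$HostName']"

    # Copy the matches into an array first so nodes can be removed while looping.
    $staleHosts = @($config.SelectNodes($xpath))
    foreach ($node in $staleHosts) {
        [void]$node.ParentNode.RemoveChild($node)
    }

    $config.Save($Target)

    # Return the number of removed entries so the caller can verify the result.
    return $staleHosts.Count
}

# Example call, using the file names from this post:
# Remove-HostFromClusterConfig -Source 'D:\CODE\clusterconfig.xml' `
#     -Target 'D:\CODE\clusterconfigmodified.xml' -HostName '<old_server>'
```

The helper writes to a separate target file so the original export stays available as a backup. If it returns 0, nothing matched and the host name should be checked against the name attribute in the exported file.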

Part of the exported configuration (the <old_server> host entry is the one to remove):

<hosts>
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234" hostId="114149731" size="819" leadHost="true" account="<account>" cacheHostName="AppFabricCachingService" name="<old_server>" cachePort="22233" />
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234" hostId="1975933372" size="8191" leadHost="true" account="<account>" cacheHostName="AppFabricCachingService" name="<new_server>" cachePort="22233" />
</hosts>

Save the modified configuration under a new file name, then import it and restart the cluster:

#import the modified cluster configuration
Import-CacheClusterConfig -file D:\CODE\clusterconfigmodified.xml
#Start the cluster
Start-CacheCluster
#and check if the service is UP
Get-CacheHost

The service status of the new server should now be UP, and the old server should no longer be listed as a cache host:

HostName : CachePort      Service Name               Service Status
--------------------      ------------               --------------
<new_server>:22233        AppFabricCachingService    UP

Also check that the AppFabric Caching Service is started in Services and that the Distributed Cache service is started in Central Administration.
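Both checks can also be done from PowerShell. The snippet below is a rough sketch: the Windows service name is taken from the Get-CacheHost output, while matching the SharePoint service instance on the type name 'Distributed Cache' is an assumption about how it is labeled in your farm.

```powershell
# Windows service; the name matches the Service Name column of Get-CacheHost.
Get-Service -Name AppFabricCachingService

# SharePoint service instance; needs the SharePoint snap-in to be loaded.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the instance is identified by the type name 'Distributed Cache'.
Get-SPServiceInstance |
    Where-Object { $_.TypeName -eq 'Distributed Cache' } |
    Select-Object Server, TypeName, Status
```

The Windows service should report Running and the service instance should report Online.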

In the ULS logs the following messages now appeared:

Calling… SPDistributedCacheClusterCustomProvider:: BeginTransaction
Successfully executed… SPDistributedCacheClusterCustomProvider:: BeginTransaction

And SharePoint is responding quite a lot faster than before!

Summary

This post described how to fix Distributed Cache service issues on a cloned SharePoint machine where the cache host still pointed to the ‘old’ server.

Several caches depend on the Distributed Cache service: Login Token Cache, Feed Cache, Last Modified Cache, Search Cache, Security Trimming Cache, View State Cache, and more. It is therefore quite important that the Distributed Cache service works properly.
