Tuesday, 17 July 2018
Tuesday, 10 July 2018
Wednesday, 27 June 2018
In particular, when it comes to Team Foundation Server, this is a list of errors and problems that all go away with one common fix: the right certificate.
The number one offender is of course the out-of-domain machine. If you have domain-joined machines these problems simply do not happen because the internal certificate is deployed by the domain GPO - hence you don't have to fiddle with it. When your machine is not domain-joined, things can easily go south.
Bear in mind - these are not security tips, this is just a collection of situations which you will face if you deploy HTTPS with TFS.
Non domain-joined machines
If you don't have your certificate installed on both the Agent (if outside the domain) and the target machine (again, if outside the domain) then you will get this cryptic error:
The Deploy Test Agent task in Build and Release
The running command stopped because the preference variable "ErrorActionPreference" or common parameter is set to Stop: Exception calling ".ctor" with "2" argument(s): "One or more errors occurred."
C:\>git clone https://myserver/Collection/_git/Project
Cloning into 'Project'...
fatal: unable to access 'https://myserver/Collection/_git/Project/': SSL certificate problem: unable to get local issuer certificate
Wednesday, 20 June 2018
Like...deploying an application with a pipeline. Everybody talks about it, right? And everybody (including myself!) has some demo-ready stuff to show around in case it might be required.
I am working on a sample application right now, and I realised how blind I was - even if I am deploying stuff to different slots and environments and whatnot, I am still treating everything as a single monolith. Not really what you want these days, right?
Well, let's sort it out. Say that you have an API component and a Frontend component: the best thing to do is to decouple the two of them so they can be independently deployed *and* mix-matched depending on the requirement.
It is .NET Core in my case, so in my Frontend component's appsettings.json I created this section:
Of course I modified the application so I could add the configuration in my ConfigureServices method and consume it in my Controller. The variable part in this case is the Slot property.
Now comes the fun side of the story - of course I have a pipeline in place. How do I handle these settings?
The best approach here, given the relative complexity of this exercise, is to scope the relevant value by environment. The Dev environment will always point at the Dev environment, Staging to Staging, and the last two environments are effectively production so I do not need to worry about adding a slot. It's not like I have cross-environment settings here.
The variables are named that way because I am using the JSON variable substitution option in the Azure App Service Deploy task: as my property is not on the first level, its full path needs to be written out explicitly in the variable name.
Doing it ensures that each environment has its own setting, and it also makes sure you remain sane while handling internal app settings across your applications and environments 😉 It is really easy to do as well, so there is really no reason to skimp on it.
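To make the naming rule concrete, here is a minimal Python sketch of how a dotted variable name can be matched against a nested appsettings.json property. It is only an illustration of the idea, not the Azure App Service Deploy task's actual code. The `ApiSettings` section, the `BaseUrl` property and all the values are hypothetical; only the `Slot` property name comes from this post.

```python
import json


def substitute(settings, variables):
    """Return a copy of settings with dotted variable names applied.

    Illustrative only: a variable such as "ApiSettings.Slot" replaces the
    value of settings["ApiSettings"]["Slot"] if that key already exists.
    """
    result = json.loads(json.dumps(settings))  # cheap deep copy
    for name, value in variables.items():
        *parents, leaf = name.split(".")
        node = result
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # path does not exist, skip this variable
        else:
            if isinstance(node, dict) and leaf in node:
                node[leaf] = value
    return result


# Hypothetical appsettings.json content (section and URL are made up).
settings = {"ApiSettings": {"BaseUrl": "https://example.invalid", "Slot": ""}}

# Hypothetical pipeline variables scoped to a Dev environment.
variables = {"ApiSettings.Slot": "dev", "Unrelated.Name": "ignored"}

updated = substitute(settings, variables)
print(json.dumps(updated, indent=2))
```

In this sketch a variable that does not match an existing key is simply ignored, which is my understanding of how the task behaves: it replaces values for keys that are already in the file rather than creating new ones.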
Saturday, 16 June 2018
That was the answer I gave to a friend of mine who asked me how to feed some baseline database for testing purposes with VSTS in Azure.
The obvious one would be to have your versioned SQL scripts in a dedicated repository which you can use to rebuild the whole thing from code (which is by all accounts the most correct solution to this problem). But in this case there are other avenues.
Databases have been treated like second class citizens for years - by tools and practices. For example, why not use BACPAC files for this exercise? At the end of the day, a BACPAC file contains the packaged version of a database at a certain point in time, including its data.
So if you have your BACPAC somewhere, get it into an Azure storage account and run this SQLPackage command inside a VSTS PowerShell Script task (of course you need to replace the variables and provide the actual path):
& 'C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\Extensions\Microsoft\SQLDB\DAC\130\sqlpackage.exe' /Action:Import /TargetServerName:$(DBUrl) /TargetDatabaseName:$(DBName) /TargetUser:$(DBAdmin) /TargetPassword:$(DBPassword) /SourceFile:"<your location>/sample.bacpac"
Don't get me wrong, I love seeing a database fully integrated with the pipeline and that's how it should be. But in this specific case, I feel the tradeoff is worth it.
Also - this is a baseline database, nobody prevents us from running delta scripts against it depending on needs. But given it was for testing purposes, I highly doubt there is going to be much development on it in the future!
Thursday, 7 June 2018
In order to enable a machine to run UI tests you need to make sure your InteractiveSession capability is set to true.
In order to do so, you need to re-configure or manually change the script used to add a machine to the Deployment Group. Given a standard script the first step is removing the --runasservice switch from it.
Once you run the configuration script the process will guide you to configure the agent for interactive interaction. You will set it to auto-start so you will get an unattended experience when rebooting the machine, but you will be able to run interactive sessions on it.
Finally, I always recommend using the VSTest Platform Installer task to make sure you have a consistent environment to run your tests from:
and referring to the tools installed by it in the Visual Studio Test task:
Wednesday, 30 May 2018
Long story short, this was due to a peculiar condition related to a high volume of transactions during that operation, not something you see every day. Microsoft Support was really good at helping us get back to normality.
In retrospect, what really hit me was how resilient TFS was thanks to SQL Server AlwaysOn. As you know, I am a huge fan of AlwaysOn because of how transparent it makes High Availability.
For us, maintaining availability meant a simple failover to the other node. Given that we are running the Availability Group with Synchronous-Commit Mode (my default choice when it comes to TFS) the then-Primary Replica was already updated to the latest transaction, so there was no data loss.
Team Foundation Server did not lose a single heartbeat. When things go south like this, you will get a self-explanatory JobInitializationError if you are doing something during the issue itself and during the failover. As this is a transactional system by design, nothing is left hanging in the balance like good ol' SourceSafe :)
Of course we were in limited availability while we were troubleshooting and fixing this problem (always change the Failover Mode to Manual when you are doing so), but there was no downtime.
Also talking recovery, at the end of the day we had to restore backups on the Secondary Replica to get back to a proper synchronisation. Again, a bit tedious and time consuming given the sizes involved, but it was flawless.
Tuesday, 22 May 2018
Friday, 11 May 2018
Monday, 30 April 2018
Wednesday, 11 April 2018
Wednesday, 4 April 2018
Thursday, 29 March 2018
Wednesday, 28 March 2018
We are talking about a plain, empty instance, so... it was a bit of a needle in a haystack!
Let's take a step back: SQL Server AlwaysOn Automatic Seeding is a new feature of SQL Server 2016 and above that manages to sync up a database in an Availability Group without leveraging backup and restore. This is a life saver in certain situations, so that you can avoid the computational load of a backup and of a restore that might take a long time.
There are some constraints - above all, the instances making up the Availability Group must be *identical*. Yes, identical in everything, including paths used by SQL Server. It is a very cloud-first approach at the end of the day, where you have identical, commodity resources at your disposal and your actual target is to provide a frictionless experience to whoever is going to consume the service you'll offer.
So cool, right? Still, for some reason, my Configuration database didn't stream from Primary to Secondary replica. I checked the DMV, and I got an obscure 1200 failed_state error - Internal Error.
The first thing I did (as the instances are really identical, they were provisioned the day before) was to check that I was on the latest CU, as there are fixes available for Automatic Seeding. Check.
I had a look at the script used by the wizard to add the databases to the Availability Group, nothing too fancy to be fair. Reading around, it seems that there is still a chance that things might suddenly break, so I took another path.
Yes, a Full Backup (taken with the TFS Administration Console nonetheless) was supposed to be enough to enable Automatic Seeding as the recovery chain is started. Would another Transaction Log backup hurt? I don't think so.
After taking the faulty database off the Availability Group, I ran the speedy Transaction Log backup and added the database back in the Availability Group with the script. Guess what, it worked! And my new TFS instance is up-and-running.
Of course this is totally transparent as usual for TFS, as the configuration wizard is smart enough to set the right connection string from the beginning. But you still need to make sure the Availability Group is correctly set, otherwise at the first failover you will be left with nothing.
Wednesday, 14 March 2018
Thursday, 1 March 2018
I stopped the Incremental Analysis, and the Optimize Databases job completed successfully. Fine.
But – for whatever reason – my SSAS cube got corrupted! I couldn’t even connect to the Analysis Engine with SSMS. I also found errors in the Event Viewer pointing at a corrupted cube:
Errors in the metadata manager. An error occurred when loading the 'Team System' cube, from the file, '\\?\<path>\Tfs_Analysis.0.db\Team System.3330.cub.xml'.
Errors in the metadata manager. An error occurred when loading the 'Test Configuration' dimension, from the file, '\\?\<path>\Tfs_Analysis.0.db\Configuration.254.dim.xml'.
Now, what to do? It looked like a full-blown rebuild was in order, and it is a costly operation: the rebuild drops both the data warehouse and the SSAS cube, rebuilds the warehouse with data from the TFS databases and then rebuilds the cube.
It is not like being without source code or Work Items, but still… it is an outage, and it is painful to swallow.
Now, in this case the data warehouse was perfectly healthy – the report showed an update age just a few minutes old. So all the raw data in this case is fine, and all you need to do is to rebuild how you look at this data.
The SSAS cube is just a way of looking at the data warehouse. If your warehouse is fine, just wait for the next scheduled Incremental Analysis Database Sync job to run: it will recreate the cube (thus making the Analysis Database Sync job a Full one rather than an Incremental one) without going through the full rebuild.
Why didn’t I process this myself by using the WarehouseControlService? Simply because the less you mess with the scheduled jobs the better it is. Hiccups happen, but the system is robust enough to withstand such problems and pretty much self-heal once the stumbling block is removed.
Tuesday, 27 February 2018
Friday, 16 February 2018
Last week SonarSource released a new version of their tasks for TFS and VSTS, with a couple of very welcome additions.
Up to v3, we basically had to do everything manually – especially passing parameters with the /d:… switch.
v4 introduces a context-aware switch where you can specify what you are using for your build:
The Use standalone scanner option is quite interesting, as it guides you towards providing a .properties file:
Also, gone are the days of using /d:… inline. There is a very handy Additional Properties textbox to use with a line-by-line parsing, which makes property override very easy to do:
Tasks are also split now into Prepare Analysis, Run Code Analysis and Publish Analysis Result, to allow a more streamlined design of your Build Definition.
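As a rough illustration of the move from inline /d:… switches to the line-by-line Additional Properties textbox, here is a small Python sketch that turns a list of old-style switches into one key=value pair per line. The helper name and the sample values are hypothetical, and it assumes the textbox takes exactly one key=value pair per line as described above.

```python
from typing import List


def switches_to_properties(args: List[str]) -> str:
    """Convert old-style /d:key=value switches to one key=value per line.

    Illustrative helper: anything that does not start with /d: is skipped.
    """
    prefix = "/d:"
    lines = [arg[len(prefix):] for arg in args if arg.startswith(prefix)]
    return "\n".join(lines)


# Hypothetical v3-style arguments (property values are made up).
old_args = [
    "/d:sonar.exclusions=**/*.generated.cs",
    "/d:sonar.verbose=true",
]

additional_properties = switches_to_properties(old_args)
print(additional_properties)
```

The result can then be pasted into the textbox as-is, one property per line.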
Thursday, 8 February 2018
A quick one I am dealing with these days – if you switch the Public URL of your Team Foundation Server to HTTPS you might see your Build Agents losing connection with the server.
This usually happens because of a known bug in TFS: an OAuth token isn’t registered, so all the authentication tokens on the agents expire.
Of course YMMV, so always double check with Support before running a Stored Procedure on your production instance.
If you happen to get into this problem, you can mitigate it by reverting your HTTPS switch-on and changing the Public URL back to the HTTP version. Doing that will re-establish the connection between the server and the agents.
Monday, 22 January 2018
Despite the push we’ve seen in the last few years, the Hosted Build Service might not be the right product for you for whatever reason.
Then, if you are in a situation where your agents aren’t running in the same domain as Team Foundation Server’s and you want to use the Test Agent, then you really risk opening Pandora’s box, courtesy of WinRM and PowerShell remoting.
And to be completely clear – I have nothing against them. The only downside is that they need to be approached in the right way, otherwise the can-of-worms effect is just around the corner.
First and foremost, remember that whenever you target a machine for Test Agent deployment you only need to consider the Build Agent-Test Agent relationship. All the errors you will get are going to be from the Test box, not the build box.
So when you need to configure WinRM, the Test box is the machine that is going to be accepting the connections. While it sounds straightforward, sometimes things happen and one is tempted to look at the Build box first: don’t.
Also, if you really want to use HTTP and WinRM, remember that this is the trickiest combination – so think twice before going down that route!
Then in terms of errors – you will likely face WinRM errors of all sorts. The most common is this:
If you are outside a domain then REMEMBER about Shadow Accounts – it is the only way to keep identity issues to a minimum. You’ll also need to set the TrustedHosts value to the machines pushing the agent.
Remember that passwords need to match, and that mixing users at setup time isn’t really a good idea if you are going down the workgroup/non-trusted domain route.
Always triple check passwords, and I recommend using the same account for both provisioning and execution, at least as a baseline. This will make sure you have a safety net in case things don’t pan out as expected.
Finally, there is this error, which really puzzles me:
This is actually an aggregated exception:
Look at UAC and execution context for this – it always happens when you are not running as Administrator something that’s supposed to be elevated. It always drives me mad.