File Server Migration to Server 2012 Part 9: Execution and Lessons Learned

I've talked a lot about file servers lately. I went over what a file server is in general, how to move the data between servers, walk-throughs of configuring features, and the planning. This last post on the topic covers the execution, the aftermath and what we didn't see coming.

The migration was set to happen Saturday evening. I was told that I could boot everyone off the server at 5pm and we expected the whole thing to take 4-6 hours. I actually got in about 30 minutes before to start collecting reports on the permissions before I started the actual migration.


I had written detailed step-by-step instructions for myself days prior. My boss had reviewed and tweaked them as well. I was to run a report listing the permissions for every folder and share on each of the 12 volumes before and after the migration. This way, instead of simply trusting that all the permissions were the same, I could actually verify everything was still intact by comparing the two reports. I had specific steps for closing all open connections, taking the drives offline and removing the iSCSI connections. I'd written up how to unassign the old server in our LeftHand SAN management program and then assign the new server. I'd jotted down instructions for creating a DNS alias for the old server so anything trying to call the old server would find the new one.
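The before/after comparison can be automated rather than eyeballed. Below is a minimal sketch in Python, assuming each report has been exported as CSV with one row per permission entry. The column names (`Path`, `Identity`, `Rights`) and the sample paths and groups are hypothetical, not the format of the reports I actually ran.

```python
import csv
import io


def load_report(csv_text):
    """Parse a permissions report into {path: set of (identity, rights)}."""
    acls = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        acls.setdefault(row["Path"], set()).add((row["Identity"], row["Rights"]))
    return acls


def compare_reports(before, after):
    """List paths that disappeared, appeared, or whose entries changed."""
    shared = set(before) & set(after)
    return {
        "missing": sorted(set(before) - set(after)),
        "added": sorted(set(after) - set(before)),
        "changed": sorted(p for p in shared if before[p] != after[p]),
    }


# Hypothetical sample data standing in for the two exported reports.
before_csv = r"""Path,Identity,Rights
E:\Shares\HR,CORP\HR-Staff,Modify
E:\Shares\HR,CORP\Domain Admins,FullControl
E:\Shares\IT,CORP\IT-Staff,Modify
"""

after_csv = r"""Path,Identity,Rights
E:\Shares\HR,CORP\HR-Staff,Modify
E:\Shares\HR,CORP\Domain Admins,FullControl
E:\Shares\IT,CORP\IT-Staff,ReadAndExecute
"""

result = compare_reports(load_report(before_csv), load_report(after_csv))
print(result)
```

Comparing sets of entries per path means the order of rows in the export does not matter, only whether the same entries are present on both sides.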

The migration itself went pretty smoothly. The biggest issue I faced was that I didn't have RSAT installed on my workstation so I couldn't actually open DNS. Opening it on the domain controller worked fine though.

The first issue came with time. In my testing, moving the iSCSI connections from one server to another took about 10 minutes tops. On our production machine, every step took far longer and had to be repeated for each of the 12 volumes. What took 10 minutes in testing took 2+ hours in production.

The second issue came when trying to test someone's logon script. (Yes, a better way to do this is to assign printers and drive mappings via Group Policy but we still have individual logon scripts in my environment, so, there.) For some reason, the logon script would fail when calling the alias and worked fine when calling the actual hostname. I was going to leave editing the logon scripts for our Helpdesk on Monday but doing that would've meant a lot of people coming in to no mapped drives (which translates into lots of phone calls and general chaos). I decided to edit the scripts myself.

When I was editing the scripts, I'd open scripts starting with the letters A-D, E-J, K-P, etc. With each range open, I'd find "FileServer" and replace it with "FileServer01". All seemed well but the next morning we realized that I should've used some kind of delimiter when finding and replacing. For example, I should've searched for "FileServer/" and replaced it with "FileServer01/". Without the "/", any script that went through the replace more than once was changed again, and I ended up with a good chunk of scripts saying "FileServer0101" or "FileServer010101". I'll consider that the first lesson learned.
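To show what a safer replace looks like, here is a minimal sketch in Python. This is not what I ran that night (I used plain find and replace in an editor). It assumes the old name should only match as a complete host name, and the sample `net use` line is hypothetical.

```python
import re

OLD_HOST = "FileServer"
NEW_HOST = "FileServer01"

# Match the old name only as a whole host name: it must not be preceded or
# followed by a letter, digit, underscore or hyphen. "FileServer01" therefore
# never matches, so running the replace a second time changes nothing.
PATTERN = re.compile(
    r"(?<![\w-])" + re.escape(OLD_HOST) + r"(?![\w-])",
    re.IGNORECASE,
)


def retarget(script_text):
    """Point references to the old server at the new one, idempotently."""
    return PATTERN.sub(lambda match: NEW_HOST, script_text)


# Hypothetical logon script line.
line = r"net use H: \\FileServer\Home\%USERNAME%"
once = retarget(line)
twice = retarget(once)

print(once)
print(twice == once)
```

The point of the lookarounds is the same as the trailing "/" in my example: the search text has to stop matching once the replacement has been made.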

By this time, I was already at the 6 hour mark, so everything had taken longer than we'd planned for. Because of this, I decided to hold off on setting up the auditing agent from our File System Auditor program. I still had File Screening, Deduplication and VSC to set up.

Deduplication was pretty simple. Turned it on, set it to background optimization, waited for days and boom! Space savings. When it finally did finish (which was Tuesday, BTW), we'd saved a combined 1,461 GB of space out of 4.3 TB, and achieved an average deduplication rate of 28% across the volumes (the lowest and highest being 14% and 40%, respectively). VSC was equally easy to set up.

My next issue was setting up File Screening. The File Screening feature was actually something I got a lot of push-back about from other members in my group. It started a larger discussion about user impacts, file organization and other topics that are above my pay-grade. We decided to enable it but to use passive screening rather than active screening. When the policy is triggered, we wanted it to email me and my supervisor. The issue I ran into was that I couldn't figure out the SMTP settings for our mail server. I tried a few things but kept getting errors. I decided that this too could wait until Monday to work on further.
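One way to narrow down SMTP settings is to try them outside the file screening configuration with a small test script. The following is a rough sketch in Python using the standard library. The host name, port and addresses are placeholders, not our real settings, and it assumes the mail server accepts unauthenticated relay from the file server.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipients):
    """Build a plain-text test notification."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = "File screening notification test"
    msg.set_content("Test message to confirm the SMTP relay settings.")
    return msg


def send_test(host, port, msg, use_starttls=False):
    """Try one host/port combination; raises an smtplib or socket error on failure."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        if use_starttls:
            smtp.starttls()
        smtp.send_message(msg)


# Placeholder addresses; replace with real ones before sending.
msg = build_test_message(
    "fileserver01@example.com",
    ["admin@example.com", "supervisor@example.com"],
)

# Uncomment to actually send through a (placeholder) relay:
# send_test("mail.example.com", 25, msg)
```

The error raised by `send_test` (connection refused, relay denied, TLS required and so on) usually says more about which setting is wrong than a generic failure in a settings dialog.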

Come Monday, I started to get File System Auditor working and figured out the proper SMTP settings to set up file screening. What we didn't see coming was that most of the scanners stopped working. In the meetings leading up to the migration, all we'd heard from the people who've configured and managed our scanners/copiers/MFPs was that they contacted the file server by IP address, which was a large part of why we kept the same IP address for the new server. However, there were two issues that unfolded with the scanners.

One of our PC Techs had been investigating the issue. While contacting vendors to get firmware updates (since apparently it's not like just going to Xerox's site and downloading the latest firmware), he found an update that's supposed to fix this issue. A lot of the scanners had been using SMB 1.0 (the version in use around the XP/Server 2003 days), which didn't work on Server 2012. Apparently, it only supports SMB 2.0 and up. There's an update to resolve this but the update was only for Server 2012 R2. So, we plotted an in-place upgrade to R2 so we could install the update and scan again. The upgrade went smoothly aside from resetting the network adapter information (something it didn't do in testing).

This fixed the issue for some of the scanners but most of them were still experiencing it. After doing some digging, we found out that those scanners were initiating their SMB connection by calling the NetBIOS name of the file server, and we had disabled NetBIOS because we don't use it. Once we enabled NetBIOS on the file server, all the remaining scanners worked like a boss.

My friends and family who knew all the steps I had to perform to make this migration happen all basically said "If the only thing that didn't work were some scanners then I'd say you did pretty well." I'd have to agree. Even my boss was surprised at how smoothly everything went. He mentioned that no matter how much you plan, there's always going to be something little you didn't account for.

I learned a lot with this project and even got to utilize some of my MCSA studies (70-411 talks a lot about file sharing stuff). I went from not knowing anything about file servers to being able to have an in-depth discussion about it. It was also my second major project and my first time migrating such a public service. I learned a lot about time-management (something else that's always a work in progress), presentations and coordinating with multiple teams. Not only were my technical skills put to the test but my communication skills were tested even more.

Lastly, when I first started this project, I found very little information on file servers in general. That's why I wrote 9 blog posts about it. If some other Jr Admin out there gets this kind of project, I hope that my research, testing, execution and experiences can help them out.
