| Russ Kaufmann |
|
|
June 19
Tech-Ed and the High Availability Pre-Conference Session
I have learned over the years that a successful presentation depends on solid planning, good input from many sources, and preparation. So, what do you do when things go wrong despite all preparations going right? What do you do when:
Yes, it was challenging. Would you believe that it was still a great deal of fun, and that everyone I saw during the rest of TechEd who had been in the session said they still learned a great deal? I am shocked that I didn't burst out in a tirade of obscene statements. [:D] Somebody asked me if I would do it again knowing that the same circumstances would come up, and I said that I would. Really, I had a great time, and it appears that the attendees were still happy despite all of the facility issues. BTW, I heard that another Pre-Conference session was cancelled during the first few minutes because of problems that they had.
Windows Server 2008 Failover Clustering - Microsoft Official Courseware
Microsoft has released its first Windows Server 2008 course based on the RTM version. Lucky for us high availability geeks, it happens to be the course on Failover Clustering. The course will be available May 15th, 2008. In the meantime, I strongly suggest everyone take a look at the syllabus for the class. You can find it here.
April 8
CCR and Multi-Site Environments
I have been hearing more and more people talk about the virtues of using CCR with a node in each site. This talk has escalated now that Windows Server 2008 has released to manufacturing. With Windows Server 2008, Failover Cluster environments can now have nodes in multiple sites without having to use Virtual LANs (VLANs) to provide the networking support. On the surface, CCR and Windows Server 2008 in a multi-site cluster sounds like the answer to many organizations' needs. Obviously, I am setting up the argument against this kind of implementation. OK, maybe it wasn't obvious to some of you. <G> Anyways, here is a rough sketch (this means that lots of non-discussed components are not shown, i.e. CAS, DC/GC, DNS, etc.) of how this would look if you had two physical locations, both in the same AD site, to support CCR.
In the drawing, Node1 is the active node and replication traffic flows over the WAN link to Node2, which is the passive node. If you look at the drawing, you should immediately see some issues.
Consideration number 1. Where should you put the FSW? In this drawing, it is in the site on the left. Well, what if that is the site that goes down in a flood, tornado, meteor strike, or whatever? If the FSW is lost along with one of the nodes, there will not be an automated failover. OK, this is fixable since we can manually force the cluster to start, but it will impact life in the real world if there is a major disaster, especially if you lose your administrators along with the site. Make sure you document the process in your DR documentation, as somebody else might need to perform the task.
Consideration number 2. How do you know which Hub Transport to use for the transport dumpster in order to back fill the surviving node? After all, HT1 and HT2 are in the same AD site, which means that they would be used in a load balanced manner, so it is not possible to use one of them to provide full replay of lost transactions. Yes, you can hard code which HT to use, but that makes no sense to me in an HA environment: if you did that, you would lose the redundancy/load balancing functionality gained by having multiple HTs in a site. Of course, you might even have two in the same physical location. Also, let's say you hard code HT1 for the CMS and it is active on Node1. If you do that, then you lose the transport dumpster along with the location in the event of a major disaster. OK, so let's say you hard code HT2 for the CMS which is active on Node1. That would mean all of your traffic would be going across the WAN link, which is not exactly a good idea.
Consideration number 3. What about the use of the Wide Area Network (WAN) and its uncontrolled use by many different services?
After all, if both physical locations are in the same AD site, will you have issues with clients logging on and authenticating across the WAN link? Will you have problems with the Clustered Mailbox Server (CMS) using the Hub Transport (HT) on the other side of the WAN link? What about the HT using the wrong Domain Controller/Global Catalog server and thus all of its queries being run over the WAN link? Again, you can hard code some of these settings for some applications and services, but even if you do that, there is again the issue of potentially losing redundancy/load balancing.
Consideration number 4. Using Windows Server 2008 and its multi-site improvements impacts DNS and resolution. For example, when Node1 is active, its VIP address is registered with the CMS name. If there is a failover, then the other VIP (for the physical location of Node2) must be registered within DNS, and the DNS updates need to be replicated to all DNS servers in the organization. During the time of the updates and shortly after, there will be clients that have the old VIP address in their caches, so the name will resolve incorrectly until those caches are updated. This is not an Exchange issue, but something else that should be considered.
So, what do I recommend? I am glad you asked that question. If you didn't, too bad, I will answer it anyways. I highly recommend using CCR within a single physical site that is also an AD site. For disaster recovery reasons, I recommend using Standby Continuous Replication (SCR) to copy transactions to a remote site's Exchange mailbox server.
FYI, I updated based on some of Scott Schnoll's comments to me. Scott had some excellent points regarding my concerns listed above. I won't go through them one by one, but it basically came down to my making the assumption that CCR in a multi-site (stretched AD site) environment would be configured for automatic failover.
I did make this assumption because if we were looking for a manual process that would require administrator intervention to get it up and running, then we should be talking SCR, not CCR. High Availability (HA) and Disaster Recovery (DR) are very different in my mind. HA means that processes are automated to reduce downtime to a minimal amount. DR is something that is done when there is a major disaster that requires steps to be taken to recover the environment. CCR is an HA technology and SCR is a DR technology, in my opinion.
January 28
TechEd 2008 PreCon: High Availability Planning with Windows Server 2008
I just wanted to point out that there will be an excellent Pre-Conference session on Windows Server 2008 High Availability with several demonstrations at the upcoming TechEd in Orlando. You can see the pre-conference sessions here: https://www.msteched.com/itpro/public/precons.aspx
I have multiple reasons for promoting this session:
Reason Number 1: It will have lots of great information on Failover Clustering and using Failover Clustering with key applications such as Exchange and SQL.
Reason Number 2: While Manish Kalra is not the speaker, he is the Business Lead for High Availability for all of Microsoft, and he is responsible for making this seminar happen. Manish will absolutely be involved in the presentation, and will make himself available during the presentation time as well as during the entire time of TechEd.
Reason Number 3: You will get the best bang for the buck as you, the attendee, will get a chance to see several demos and get to hear lots of pointers on the proper way to deploy Failover Clustering as well as Network Load Balancing.
The real reason I am promoting this, besides that it is great stuff on clustering? OK, I confess! I will be presenting the content of this session along with my good friend Rodney R. Fournier. While I am no longer a Microsoft MVP (because I am now a Microsoft employee), you can still see my MVP Profile for some more info about me.
Updated: Feb 3, 2008. I also wanted to add that Microsoft will have some of its key players available during and after the Pre-Con seminar to answer questions. At this time, it looks like we will have some excellent hardware as part of the demo. I will provide more information as it is solidified.
August 3
eLearning - Windows Server 2008 Failover Clustering
Microsoft recently published a two-hour online course on Failover Clustering for Windows Server 2008 (formerly known as Longhorn). You can access this content here, and for $39.99 you can spend as much time as you want over the next three years reviewing the content. This course, Course 6051: Implementing High Availability and Virtualization in Windows Server 2008, is part of a larger group of courses that can be purchased as a group, or you can purchase this individual course separately from any other eLearning course. Rod Fournier and I are working on very similar material, with much more depth, for our ClusterHelp.com course. However, we will not be releasing it until Windows Server 2008 is released to manufacturing. Look for more information here as Windows Server 2008 comes closer to release.
Windows Server 2008 Failover Clustering - Top Questions
The number one question by far at the cluster booth during TechEd was, "What are the differences between Windows Server 2003 Server Clustering and Windows Server 2008 Failover Clustering?"
The second most asked question (yeah, yeah, I didn't bother actually tracking) appeared to be about Virtual Server 2005 R2 and whether virtual servers could be clustered. The answer is yes, they can be clustered. The process is called Host clustering since the host machines are clustered, and the virtualized servers can be moved (or failed) over to other hosts. You can read all about it at http://technet2.microsoft.com/windowsserver/en/library/9a3de6d0-c820-41ac-860c-de950d271f8d1033.mspx?mfr=true.
Windows Server 2008 Failover Clustering - The New Quorum Model
One of the big changes in Windows Server 2008 Failover Clustering is the new quorum model. In Windows Server 2003, we had only two choices, either the single disk quorum that has been around since NT 4.0 or Majority Node Set (MNS). Actually, there are three if you consider MNS with the File Share Witness (FSW) as a separate option.
In Windows Server 2008 Failover Clustering, administrators now have four choices on how to implement the quorum.
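To make the vote counting behind a majority-style quorum a bit more concrete, here is a minimal sketch. It assumes a plain majority rule, where the cluster keeps running only while more than half of all configured votes (nodes plus any witness such as an FSW) can be reached. The function name `has_quorum` and the vote counts are hypothetical and only for illustration; this is not a cluster API.

```python
def has_quorum(total_votes: int, reachable_votes: int) -> bool:
    """Return True when strictly more than half of all votes are reachable.

    Assumption: every node and the witness each carry exactly one vote.
    """
    return reachable_votes > total_votes // 2


if __name__ == "__main__":
    # Hypothetical layout: two nodes plus a file share witness = 3 votes.
    print(has_quorum(total_votes=3, reachable_votes=2))  # one vote lost
    print(has_quorum(total_votes=3, reachable_votes=1))  # a node and the witness lost
    # Two nodes and no witness: losing either node leaves exactly half.
    print(has_quorum(total_votes=2, reachable_votes=1))
```

Under these assumptions, a two-node cluster with a witness survives the loss of any single vote, while losing a node together with the witness leaves the survivor without a majority.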
It has been a few days since I have seen the GUI, so I can't tell you off the top of my head which order they appear in within the GUI. Two notes that caught my attention the other day when talking about these options are that it is not possible to use DFS as the file share witness, and that with the changes to the quorum there aren't any checkpoints, so there is no longer a need for the -resetquorumlog switch when starting the cluster service.
Windows Server 2008 Failover Clustering - Storage Changes
In a previous blog, I talked a little bit about some of the major changes to disk storage for Windows Server 2008 Failover Clustering. Now, while I wait for dinner to cook, is a good time to cover some of the changes.
All in all, there have been some pretty significant changes when it comes to the way Windows Server 2008 Failover Clusters work with disk storage.
Standby Continuous Replication for Exchange Server 2007
As previously discussed, when SP1 for Exchange Server 2007 ships, it will include some new technologies, too. One is Standby Continuous Replication (SCR). I am completely psyched by this new technology. Myself, I see SCR as the perfect remote site Disaster Recovery solution for Exchange Server 2007. What would make it the perfect solution would be having a hot site available for the implementation. Please read more about SCR on the Exchange Team's blog. Scott Schnoll wrote a wonderful post about it last week. I am sure you will love it, too.
Building a Windows Server 2008 Failover Cluster - finally!
I have been dying to get my first Windows Server 2008 Failover Cluster built. Of course, I want to do it on the cheap. That means using virtualization. The problem is that Virtual Server 2005 R2 does not provide support for Serial Attached SCSI (SAS) and there is just a complete dearth of virtualized SANs out there with virtualized HBAs. :) OK, that isn't going to happen anytime soon. So, my first Failover Cluster didn't have any shared disks. That makes it pretty worthless for testing. Microsoft, through its acquisition of Stringbean Software (WinTarget), put an iSCSI target into the new version of Windows Storage Server, but it has not been released and is not available outside of Microsoft. So, Microsoft has been the only real source of testing outside of using real hardware. Well, our partner, Rocket Division Software, is in the final stages of upgrading their iSCSI target software (Starwind) to support persistent reservations. It works wonderfully, so it should be released pretty soon. The only problems I have found have to do with a user interface issue that they already know about. Anyways, now I have my first Failover Cluster using iSCSI.
It passed the validate tests and runs like a charm. Keep an eye out for Rocket Division's release and then you will be able to try it, too.
Windows Server 2008 Default Share Permissions
I was working on the June CTP for Windows Server 2008 when I created a basic file share. One of my students, looking over my shoulder, asked me to check the default share permissions.
In Windows Server 2008, there are no default permissions. The only permission that is there when you create the share is the Administrator with Ownership, but that is it. When you create the share, it brings up the window to create the permissions right away. I tell you, the more I see of Windows Server 2008, the more I like it.
Windows Server 2008 Failover Clustering Virtual Lab
Microsoft posted a virtual lab today for Failover Clustering. This lab walks through the following procedures:
You can get to the virtual lab here: http://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?EventID=1032345932&EventCategory=3&culture=en-US&CountryCode=US The lab should take about an hour to an hour and a half to complete. That, of course, depends on how many times you get interrupted with the kids wanting you to take them to the store to buy candy. I went through part of the lab, and it looks like it will work just fine for everyone except it will not pass validate. This is because of an issue that seems to have come up in the June CTP. BTW, I recorded a step-by-step today using Camtasia. I will probably put it someplace on the web for download early next week. Keep an eye out for it.
July 18
Upper and Lower and Mixed Case in SQL Cluster
I had a well-known geek in my cluster class last week. Ben Miller, a former Microsoft MVP lead and SQL expert, sat in on our cluster class, www.clusterhelp.com.
During our many conversations, Ben told me about a problem he ran into recently in SQL Server 2005 clustering. Basically, the issue is that the way the server names were recognized differed depending on what tool pulled the name since the different applications did not pull the name from the same place in the registry. So, if one node was all upper case, and the other node was mixed case or lower case, clustering would install and work just fine. However, SQL Server 2005, which pulls the name from a completely separate key in the registry, does not install properly unless both nodes are all upper case.
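As a rough pre-install sanity check for the situation described above, a sketch like the following could flag node names that are not entirely upper case. It only compares the strings you pass in and does not read the registry or query the cluster; the function name and the sample node names are hypothetical.

```python
def nodes_not_upper_case(node_names):
    """Return the node names that are not entirely upper case."""
    return [name for name in node_names if name != name.upper()]


if __name__ == "__main__":
    # Hypothetical node names, typed in as each node reports them.
    suspects = nodes_not_upper_case(["NODE1", "Node2", "node3"])
    for name in suspects:
        print(f"Check the name casing on {name} before installing SQL Server 2005")
```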
Ben says he is going to blog more about the issue after he has a chance to do some further testing and documents the issue completely. During class, he was able to replicate the problem with complete predictability.
See Ben Miller's blog for more detail.
June 12
TechEd 2007 - Orlando
I am still in Orlando trying to figure out how I missed my plane. Oh well, there is always tomorrow.
Anyways, putting that aside, it has been a fantastic week. It was great to see old friends again, it was great to see current friends, and it was fantastic meeting some new people that I have been dying to meet. For example, Evan Dodds has always been a great person when it comes to getting information on some off-the-wall attribute setting, or even more mainstream stuff. Evan has been great to talk to on the phone and via email, but I had never met him until this last week. Evan is definitely a great guy, and I am glad to say that I finally met him and did my best to not blow cigar smoke on him. Another person that I have been wanting to meet for a few years is Eileen Brown. I was walking around the Technical Learning Center (Yellow) area saying hello to several people that I know when I was accosted by Jane. OK, accosted is a bit strong, but she stopped me to introduce herself. While talking, I asked her if she had seen Eileen around at all, as I was dying to meet her. About 20 minutes passed, and poof, there was Eileen. She is just like I imagined her. What a wonderful lady and just chock full of knowledge.
OK, side question to over 50% of the world, "Why are there so few women in the IT field?" The question is addressed to women in general. I just don't get why there are so few women in the field, when women have been proven to excel in the field without having to have any specialized scientific or math skills.
I have to admit that this was a strange TechEd for me as I didn't seem to like any of the parties and spent many evenings with close friends smoking cigars by the pool of one hotel or another. I was a little ticked that I missed out on a meeting with the MCTs and Ken Rosen which was basically a goodbye roast for Ken. I missed out on that one because I was working the booth for Windows Server 2008 Failover Clustering. We had a great time at the booth helping all sorts of people. One question that I dreaded was the, "I know nothing about clustering, can you tell me everything I need to know?" This has got to be the worst question to answer as I can go on for several days, but I wasn't able to get some people to be more specific. My favorite question was the, "So, what is different between Windows Server 2003 Server Clustering and Windows Server 2008 Failover Clustering." At least this one could be answered with some basic information and why the changes are important.
I am already looking forward to next year. Bring on the geeks!
February 20
Fibre Channel Information Tool (fcinfo)
Released just yesterday (Feb 19th, 2007), this tool can be run from Windows Server 2003 or Windows 2000 to enumerate disks and configuration information for attached SANs. Download it here: http://www.microsoft.com/downloads/details.aspx?familyid=73d7b879-55b2-4629-8734-b0698096d3b1&displaylang=en&tm
February 6
Cluster Prep
YES! It is finally released to the public. The Microsoft Cluster Configuration Wizard was released to production today. Clusprep (Clus Prep), as it is known with affection, can be used to test the configuration before configuring for clustering. The tool can be installed on either node or on another computer altogether. It should be installed on a 32-bit server; even so, it can still inventory and test the configuration of both 32-bit and 64-bit systems. Clusprep tests the hardware configuration and evaluates the OS, patches, and hot fixes. While clusprep is not 100% foolproof, if the potential nodes all pass through the tool properly, you can be pretty confident that clustering will configure without any issues. Good luck everyone!
January 14
Cluster Training in London
The contract is finally complete. ClusterHelp.com will be partnering with Global Knowledge in the UK to provide a cluster training class. Come join us March 6-9, this year. Global Knowledge issued this press release today: http://www.trainingpressreleases.com/newsstory.asp?NewsID=2487. The actual location has not yet been selected. I have requested that it be someplace close to Heathrow airport to make it easier for those coming from other countries in Europe. It may end up in downtown London, though, depending on availability and classroom size. So, anyone in Europe that would like to attend the training provided by ClusterHelp.com can now save some money and attend the class in the UK instead of flying across the pond.
We are all excited about doing this class and hope to fill up the classroom like we do in New York at Netlan and in Denver at Ameriteach. I am really looking forward to this trip.
October 19
DFS and Clustering
There seems to be some confusion around the Distributed File System (DFS) and Windows Server 2003 server clustering. First, let's look at the terms for DFS so we start from the same foundation in this discussion.
OK, granted this is just some very high level and basic information, but let's get rolling with it. What does this all have to do with clustering? Clustering is used to achieve high availability for certain resources. As a business requirement, we may be told to provide solutions that can help us achieve our goals for the company. One of the requirements is to make certain files highly available as they are needed all the time to keep the business running smoothly. We can achieve our goal a few different ways:
It is important to note that clusters cannot host domain roots. They can only host server roots. Damn, I hope I got that right. If not, email me.
DNS Round Robin and IIS
Open up the attachment below. It demonstrates how DNS round robin works. DNS round robin is a common solution for enabling load balancing for Internet server farms. Consider the following example in which there are three IP address entries for the same host name on a DNS server.
You can also replicate this example some time just by playing with your own DNS server. If you create three host records with the same name but with three different IP addresses, you will have implemented DNS round robin as a solution. What happens is that the first client receives the first address, the second client receives the second address, the third client receives the third address, the fourth client receives the first address, and they continue to loop. Using DNS round robin, it is possible to spread the load among multiple servers. The problem with round robin DNS is that it is completely unable to handle a down server. In the event one of the servers fails, its address will continue to be given to clients, and that portion of the clients will be pointed to an invalid address and fail to connect.
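The rotation and its failure mode can be sketched in a few lines. This is a simplified model, not how any particular DNS server is implemented: it assumes each client request simply gets the next address in the list, and the three addresses (taken from the 192.0.2.0/24 documentation range) and the function names are made up for illustration.

```python
from itertools import cycle


def round_robin_answers(addresses, num_clients):
    """Hand out addresses in rotation, one per client request."""
    rotation = cycle(addresses)
    return [next(rotation) for _ in range(num_clients)]


def failed_fraction(addresses, down, num_clients):
    """Fraction of clients that were handed the address of a down server."""
    answers = round_robin_answers(addresses, num_clients)
    failures = sum(1 for address in answers if address in down)
    return failures / num_clients


if __name__ == "__main__":
    # Hypothetical host records for one name (documentation address range).
    servers = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
    # The fourth client loops back to the first address.
    print(round_robin_answers(servers, 4))
    # With one of three servers down, about a third of the clients fail.
    print(failed_fraction(servers, {"192.0.2.2"}, 300))
```

In this model, the share of clients that fail is the number of down servers divided by the total number of servers, since the rotation has no idea which servers are actually alive.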
Round Robin DNS is not a high availability solution.
DNS Round Robin and File and Print Servers
A common question comes up all the time in the public newsgroups on Windows clustering: "Can I use DNS round robin to provide high availability for printers or file shares?" The answer is usually, "No." The reason is that NetBIOS names are used for these types of connections and the client must know the NetBIOS name of the target server. So, when you try to connect to a UNC path, i.e. \\servername\sharename, this is treated the same as if you were to run the Net command, i.e. "net use * \\servername\sharename", to connect. The * in the command is normally replaced with a specific drive letter and then the drive is mapped. The result is that an attempt is made to resolve the name using normal NetBIOS resolution methods. If those processes fail, then it is possible to use DNS for the resolution if the "Enable DNS for Windows Name Resolution" check box is enabled.
Another reason that you would not want to use DNS round robin for highly available printer or file shares is that DNS resolution of the name to IP address will continue even if the server is not available. For example, if two client computers connect using DNS round robin, the first client will get the IP address of the first server, the second client will get the IP address of the second server, and so on. If the second server goes down, DNS will continue to supply every other client request with the IP address of the server that is down. Instead of highly available resources, what you get is halfly available resources. I wonder if I can trademark that term, "halfly available", for this kind of discussion. Of course, the fraction of failures will change depending on the number of servers used.
If your organization needs highly available file resources, you can look at a couple of solutions. First would be DFS with DFS replicas and the second would be server clustering.
For highly available printing solutions, server clustering should be your main focus. |
|
|