We asked Clive, one of our lead Solution Architects with a lifetime of delivering IBM FileNet solutions, to take us on his personal journey: from overseeing traditional platform installs of IBM FileNet P8, through Docker containers and Kubernetes, to IBM Cloud Pak on Red Hat OpenShift.
Many organisations are in the process of modernising existing applications to support hybrid cloud and integrate new technologies and services into mission-critical applications.
In this blog Clive shares:
- Why containers help move applications across various infrastructures
- The challenges
- How Red Hat OpenShift has simplified container deployment and management
- The new challenges!
So, take us back. What is a traditional platform install?
Traditionally, the way to install the platform was:
- Get one or more Windows or Linux based servers
- Put a database on the side of that – Oracle, SQL Server or DB2
- Go through each of the packages that make up the platform and run the install.
- Configure the system so all of the different bits talk to each other – the database, Active Directory and so on.
They could be hosted on a single server or multiple servers and could be bare metal or virtual machines.
And then there was Docker. What was that?
Actually, Docker has been around for a long time and is a way to build, share and run applications using containers. It’s very portable: Docker containers can run across any desktop, data centre or cloud environment. The initial release was in 2013, but the first container-related technologies were available decades before. I think I started messing about with Docker containers in Spring 2018. Containers package up code and all of its dependencies so that the application runs quickly and reliably from one computing environment to another, keeping application dependencies separate from the infrastructure. (There’s an introduction to containers you can watch here.)
With Docker containers:
- You take a server and strip it down to its bare minimum, releasing the resources normally held for the GUI and day-to-day management tasks. Having reduced the amount of CPU and RAM the server uses, you have a lean base on which to install the applications.
- Packages can now be built up in a non-persistent way: the container itself is self-contained with no persistence. Outside it sits your persistence layer – your data stores, databases and file stores, the things that hold the configuration.
- Then the idea is to run multiple containers of the same thing to give you the additional resources required under load, while using a single version of the truth in the persistence layer.
So that’s the way that Docker handled that sort of application deployment. It was a much smaller footprint. And it was quite simple.
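To make that pattern concrete, here is a minimal docker-compose sketch of the kind of layout Clive describes: a throwaway, stateless application container at the front, with the persistence (the database data) held in a named volume outside it. The image names, port and volume path are purely illustrative assumptions, not an actual FileNet deployment.

```yaml
# Minimal sketch: stateless app container, persistence kept outside it.
version: "3.8"
services:
  app:
    image: example/content-app:latest   # illustrative stateless application container
    ports:
      - "8080:8080"
    depends_on:
      - db
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example
    volumes:
      - dbdata:/var/lib/postgresql/data   # persistence lives in a named volume, not the container
volumes:
  dbdata:
```

The application container can be destroyed and recreated at will; as long as the volume and database survive, nothing is lost.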
Comparing Containers and Virtual Machines
In an IBM FileNet P8 world, rather than having a big bare-metal server or a VMware-type virtual server, you could use Docker as the backend platform with separate containers building up the functionality. You could add Content Navigator as the GUI, the Content Platform Engine, Content Search, External Share and GraphQL, all in containers that interact with each other. Whilst IBM indicated they were moving into a container world, initially it was purely for the Content side. You didn’t have any of the Case Management or Process Management functionality, which was a major drawback.
I was using a really nice demo environment from IBM called CPIT, which worked well on a laptop. It didn’t work on Windows – it needed to be Linux or macOS – so I was using it on a Mac. I started working on a CPIT deployment for Windows but never got to the end of it. I was close! And then IBM started promoting the lighter Kubernetes route, supporting any CNCF (Cloud Native Computing Foundation) compliant Kubernetes environment.
Kubernetes
Kubernetes, sometimes abbreviated to K8s, was released as an open-source project in 2014. Unlike Docker, where the management of containers is all manual, Kubernetes starts to manage the ability to add containers automatically (autoscaling). It also gives you Pods, which can contain multiple containers. For example, for a WordPress website you would have the user interface, such as an Nginx web server, and then the WordPress application itself. For that, you might create a single pod with two containers, which would then talk to an external database. If you start to get an increase in user traffic, you’d create a new pod to handle the additional workload – again containing the Nginx and the WordPress – talking to the same database at the back end. Similarly, if the database starts to become heavily utilised, you would create a new database pod.
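As a rough illustration of that WordPress example, a single pod holding two containers might look something like the sketch below. The image tags and the database host are illustrative assumptions (and the Nginx configuration that would proxy requests to the WordPress container is omitted).

```yaml
# Minimal sketch: one Pod, two containers, external database.
apiVersion: v1
kind: Pod
metadata:
  name: wordpress-site
spec:
  containers:
  - name: nginx
    image: nginx:1.25             # illustrative tag
    ports:
    - containerPort: 80
  - name: wordpress
    image: wordpress:php8.2-fpm   # illustrative tag
    env:
    - name: WORDPRESS_DB_HOST
      value: mysql.example.internal   # external database, not in the pod
```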
Kubernetes manages scaling automatically. You can set it up to say: if this pod gets to 75% CPU utilisation, spin up a new pod. On top of that, it will even load balance the traffic across the different pods to maintain performance and stability. So it is seamless, in theory.
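That 75% rule can be expressed with a HorizontalPodAutoscaler. The sketch below assumes the WordPress workload is running as a Deployment named wordpress; the name and replica counts are illustrative.

```yaml
# Minimal sketch: scale the wordpress Deployment on CPU utilisation.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: wordpress-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: wordpress      # assumed Deployment name
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75   # add another pod once average CPU passes 75%
```

Kubernetes then adds or removes pods between the minimum and maximum counts as average CPU crosses that target.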
I started looking at a community version of Kubernetes that was really easy and lightweight, called Rancher, which has since been bought by SUSE. It has very limited disk space requirements, runs in MBs of memory rather than GBs with very limited CPU, and yet it does all of that orchestration for you. I started deploying the extended CPIT environment, which now began to include Case Management. I was doing well and getting really good traction with it, and then…
IBM acquired Red Hat – and with it OpenShift – for $34 billion (acquisition completed July 2019)
OpenShift is fully supported by Red Hat to run in any infrastructure environment, from mainframes to private cloud to any commercial public cloud. With the acquisition, it was time to change tack to OpenShift. That’s where things got interesting!
With OpenShift, IBM has done lots of clever things on top of Kubernetes. They’ve got a really nice GUI for managing the cluster and seeing what’s going on. But along with that come the mega amounts of resources that OpenShift requires. With Rancher, I had been able to install Kubernetes on a single machine, which made it ideal for demo machines. With OpenShift, I could do a single-node deployment called CodeReady Containers, which I started experimenting with. Then I tried to install IBM Cloud Pak for Automation v19.0.1, released in Nov 2019, and soon exhausted the amount of hardware I’d got in my home lab!
Installing IBM Cloud Pak for Automation
I had a couple of rack-mounted servers that I was running VMware on – one with 96 GB of RAM, the other with 128 GB of RAM, both with 24-core processors. CodeReady Containers was installed as a virtual machine on one of those. I kept increasing the memory and the number of CPUs allocated to it. Frustratingly, it just failed to deploy in CodeReady Containers, repeatedly pausing and saying there was not enough resource available.
I have since discovered that for a full OpenShift environment you need at least four machines: three just managing the environment (master nodes) and at least one doing the work (worker nodes). To scale out, you add more worker nodes to cater for the workload. We went from using just one virtual machine to spreading the worker nodes across both of my servers. So I’ve now got five worker nodes and three master nodes running. That just gave us the horsepower to install a very small demo environment of Cloud Pak V19.
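For context, that three-masters-plus-workers shape is what you describe up front in OpenShift's install-config.yaml. The sketch below assumes a user-provisioned, bare-metal style install (platform: none); the domain, cluster name and worker count are illustrative, not Clive's actual values.

```yaml
# Minimal sketch of an OpenShift install-config.yaml for a UPI/bare-metal style cluster.
apiVersion: v1
baseDomain: lab.example.com      # illustrative domain
metadata:
  name: demo                     # illustrative cluster name
controlPlane:
  name: master
  replicas: 3                    # the three machines just managing the environment
compute:
- name: worker
  replicas: 5                    # the machines doing the actual work
networking:
  networkType: OVNKubernetes
platform:
  none: {}
pullSecret: '<your pull secret>'
sshKey: '<your ssh public key>'
```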
Then came Cloud Pak V20, and then Cloud Pak V21, which, as you’d expect, added more functions. The more functions that were installed, the more horsepower it needed. We’re currently running a demo environment on the Insight 2 Value office system, running across three master nodes and eight worker nodes. Each worker node has 32 GB of RAM and 8 processor cores. Now, it doesn’t actually use that amount of resource; it just requests that you have it available so that it can bring on more should it need to. It is currently using around 6 CPU cores and 5 GB of memory.
In the configuration, you can specify limits. For example, instead of a pod requesting 4 GB of memory, you can limit it to 1 GB. There are tables that say what you can change. We’re still on a journey of working out what the best values would be. For a demo environment, we can’t see that we’d need all of this capacity, but the way that the demo environment is laid out, you don’t have any control over it. On top of the standard IBM P8 environment, it puts something called IBM Common Services to monitor the licences that you’re using, plus lots of different things in the background that you have no control over. Again, that requests more CPU. So, on a technical level, it’s a beast!
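To show the distinction Clive is drawing, here is a minimal sketch of the resources section of a container spec: requests are what the scheduler reserves up front and what drives node sizing, while limits are the hard ceiling. The container name, image and numbers are illustrative, not IBM's documented sizing values.

```yaml
# Illustrative fragment of a pod's container spec showing requests vs limits.
containers:
- name: demo-app              # illustrative name
  image: example/app:latest   # illustrative image
  resources:
    requests:
      cpu: "500m"       # reserved up front; the pod only schedules where this is free
      memory: "1Gi"
    limits:
      cpu: "2"          # hard ceiling the container may burst up to
      memory: "4Gi"     # exceeding this gets the container killed and restarted
```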
Knowledge shift
Outside of the Cloud Pak knowledge building we’re doing at Insight 2 Value, we’ve had to build our knowledge around OpenShift. OpenShift seems to work really well in the cloud on Microsoft Azure, AWS or IBM Cloud. All of the networking, infrastructure and everything you need to wrap around the OpenShift environment is sorted for you.
When you deploy OpenShift locally, the first thing you have to do is learn a shedload about networking, DNS servers and load balancers. It was quite a steep learning curve, but after the first few installs it’s become much easier. The latest version has an assisted installer, which helps with installing on bare metal. For the first few installs, though, you had to create a Bastion server – a server that hosts the load balancer and the DNS server for the cluster. So on top of the 11 nodes that we need, we also needed a further machine to act as the Bastion, so now we’re looking at 12 machines to run a demo environment.
So that’s the journey.
To recap
We’ve gone from a little virtual machine running the CPIT environment, which needed a couple of CPUs and 4 GB of memory, to 12 machines needing a direct link to the electric grid!
All of these containers would run in a plain Docker environment. However, the pre-packaged intelligence IBM has built around the operators and the install and deployment process requires an enterprise-grade Red Hat OpenShift environment.
We’ve moved from putting in a CD, running setup.exe and needing the know-how to configure your environment, plus proprietary knowledge around FileNet, to needing knowledge around Red Hat management and deployments into a Red Hat OpenShift environment – for example, looking through logs to try to determine why a pod is not running: is it because it hasn’t got the access it needs?
OpenShift is definitely a knowledge shift, but the benefits for the longer term are plain to see. All systems will be created in the same way by the Operator, which has the knowledge to deploy the components and make them work with each other; you just change the configuration YAML (a human-friendly data serialisation standard for all programming languages) to make connections to outside services such as databases and directory services.
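As a rough illustration of what "just changing the configuration YAML" looks like, here is a generic sketch of the kind of custom resource an operator consumes, with connection details for an external database and a directory service. The kind, group and field names are invented for illustration; they are not the exact Cloud Pak for Business Automation schema.

```yaml
# Illustrative sketch only: a custom resource the operator reads to wire up external services.
apiVersion: example.com/v1
kind: ContentPlatformCluster      # illustrative kind, not the real Cloud Pak CR
metadata:
  name: demo
spec:
  database:
    type: db2
    host: db2.lab.example.com     # illustrative host
    port: 50000
  directoryService:
    type: activeDirectory
    url: ldaps://ad.lab.example.com:636   # illustrative URL
    baseDN: DC=lab,DC=example,DC=com
```

The operator watches this resource and does the deployment and wiring itself; changing the connection details here is the whole configuration exercise.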
Thank you Clive for the history lesson.
Our next session with Clive will go into more detail on the lessons learned from installing Cloud Pak for Business Automation version 21, released in March 2021.