Installation workaround when Elasticsearch pods hang

Describes a workaround for an installation error


Author(s): László Czap | Created: 22 June 2020 | Last modified: 22 June 2020
Tested on: Cloud Pak for Security 1.3.0

When installing CP4S v1.3.0 (IBM Entitled Registry version) on OCP 4.3.24, we encountered the following symptoms: the pod named ibm-dba-ek-isc-cases-elastic-ibm-dba-ek-client-xxxxxxxxxx-xxxxx was stuck in the ContainersNotReady state, and its log was filled with entries like this:

[ibm-dba-ek-isc-cases-elastic-ibm-dba-ek-client-xxxxxxxxxx-xxxxx] Authentication finally failed for admin from...

The workaround in the troubleshooting guide did not help. It turned out that setting the admin password for the Elasticsearch pod had failed, yet the configured clients still tried to use that password. The quick-and-dirty solution was to reconfigure the clients to use the default password (which is simply admin). Here is what was done.

There is a secret named isc-cases-elastic-ibm-dba-ek-creds, where the username and password are stored. The password was replaced with admin.
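As a rough sketch of this step: values under data in a Kubernetes secret are stored Base64-encoded, so the new password has to be encoded before it is written. The key name password, the helper function names and the use of oc patch are assumptions made for illustration; the change can equally be made by editing the secret by hand. The commands assume the current project is the one CP4S is installed in.

```shell
#!/bin/sh
# Name of the secret, taken from the text above.
SECRET=isc-cases-elastic-ibm-dba-ek-creds

# Base64-encode a value the way it must appear under `data:` in a secret.
# printf is used instead of echo so that no trailing newline gets encoded.
encode_secret_value() {
    printf '%s' "$1" | base64
}

# Overwrite the stored password with the default one.
# ASSUMPTION: the key inside the secret is called "password"; verify with
#   oc get secret "$SECRET" -o yaml
# before running this function.
reset_password_to_default() {
    oc patch secret "$SECRET" --type merge \
        -p "{\"data\":{\"password\":\"$(encode_secret_value admin)\"}}"
}

# Usage (not run automatically):
#   reset_password_to_default
```

Nothing is changed until reset_password_to_default is called explicitly, which leaves room to check the key name first.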

Then, there is a job called ibm-dba-ek-isc-cases-elastic-ibm-dba-ek-security-config. We deleted this job and recreated it with the same parameters by copying and pasting its .yaml config. (Note that the generated UIDs must be removed from the file before the job is recreated.)
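The delete-and-recreate step could be scripted roughly as follows. This is only a sketch: the function names and the file name job.yaml are made up for illustration, and the line filter is a crude stand-in for editing the file by hand. Besides the UIDs mentioned above, it also drops resourceVersion, which is an addition here, since a create request that carries a resourceVersion is rejected. The filter removes every uid line at any indentation, so the resulting file should be reviewed before it is used.

```shell
#!/bin/sh
# Name of the job, taken from the text above.
JOB=ibm-dba-ek-isc-cases-elastic-ibm-dba-ek-security-config

# Drop lines holding server-generated identifiers from a manifest on stdin.
# This is a plain line filter, not a YAML parser: it removes any line whose
# key is uid, controller-uid or resourceVersion, at any indentation.
strip_generated_ids() {
    grep -v -E '^[[:space:]]*(uid|controller-uid|resourceVersion):'
}

# Export the job, delete it, and create it again from the cleaned manifest.
# The && chain stops before the delete if the export produced no output.
recreate_job() {
    oc get job "$JOB" -o yaml | strip_generated_ids > job.yaml &&
    oc delete job "$JOB" &&
    oc create -f job.yaml
}

# Usage (not run automatically; review job.yaml between the steps if unsure):
#   recreate_job
```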

There is yet another place where this password is set: the deployment called ibm-dba-ek-isc-cases-elastic-ibm-dba-ek-client stores it in a basic authentication header. This header is sent with the readiness probe, which determines the container status. Here is the relevant part of the config:

            httpGet:
              path: /_cluster/health
              port: 9200
              scheme: HTTPS
              httpHeaders:
                - name: Authorization
                  value: Basic YWRtaW46YWRtaW4=

Here the value of the header must be the word Basic followed by the Base64-encoded string admin:admin. Simply replacing the original header with the above value and saving the config will do the job: all affected pods will be recreated automatically.
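To double-check the header value, or to build one for a different password, the encoding can be reproduced on the command line. A minimal sketch, where basic_auth_header is a hypothetical helper name:

```shell
#!/bin/sh
# Build the value of an HTTP Basic Authorization header from a user name
# and a password: "Basic " followed by base64("user:password").
# printf is used so that no trailing newline ends up in the encoded string.
basic_auth_header() {
    printf 'Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header admin admin
# prints: Basic YWRtaW46YWRtaW4=
```

The printed string is what goes into the value field of the Authorization header in the deployment, for example when editing it with oc edit deployment.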

After these steps, the installation continued and cases came up.