!!! Rabbitmq reported unrecoverable state , recovery.dets corrupted !!!



Unable to start RabbitMQ after an outage?

Are you seeing an exception similar to the one below?

2018-07-26T09:39:17.273888+00:00 <> [cluster-rabbitmq-monitor] - ERROR - Rabbitmq reported unrecoverable state: [Error]: {could_not_start,rabbit,
{{badmatch,
{error, {{{badmatch,
{error,
{not_a_dets_file,
"/var/lib/rabbitmq/mnesia/rabbit@<>/recovery.dets"}}},
[{rabbit_recovery_terms,open_table,0,
[{file,"src/rabbit_recovery_terms.erl"},{line,126}]},
{rabbit_recovery_terms,init,1,
[{file,"src/rabbit_recovery_terms.erl"},{line,107}]},
{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,247}]}]},
{child,undefined,rabbit_recovery_terms,
{rabbit_recovery_terms,start_link,[]},
transient,30000,worker,
[rabbit_recovery_terms]}}}},
[{rabbit_queue_index,start,1,
[{file,"src/rabbit_queue_index.erl"},{line,464}]},
{rabbit_variable_queue,start,1,
[{file,"src/rabbit_variable_queue.erl"},{line,455}]},
{rabbit_priority_queue,start,1,
[{file,"src/rabbit_priority_queue.erl"},{line,92}]},
{rabbit_amqqueue,recover,0,
[{file,"src/rabbit_amqqueue.erl"},{line,239}]},
{rabbit,recover,0,[{file,"src/rabbit.erl"},{line,756}]},
{rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1,
[{file,"src/rabbit_boot_steps.erl"},{line,49}]},
{rabbit_boot_steps,run_step,2,
[{file,"src/rabbit_boot_steps.erl"},{line,49}]},
{rabbit_boot_steps,'-run_boot_steps/1-lc$^0/1-0-',1,
[{file,"src/rabbit_boot_steps.erl"},{line,26}]}]}}

2018-07-26T09:39:17.898241+00:00 vasydp161 su: (to rabbitmq) root on /dev/pts/4

The exception above states that RabbitMQ could not start because it failed to read the recovery.dets file (not_a_dets_file).

If you browse to /var/lib/rabbitmq/mnesia and run ls -ltrh, you will see that the recovery.dets file is corrupt or 0 bytes in size.
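As a rough sketch of that check, the snippet below looks for empty recovery.dets files instead of reading the ls output by eye. The function name find_empty_recovery_dets is hypothetical, and the default path is taken from the log above; adjust it if your data directory differs. It only detects the 0-byte case, not other forms of corruption.

```shell
# List recovery.dets files under a mnesia directory that are empty (0 bytes).
# find_empty_recovery_dets is an illustrative helper name, not a RabbitMQ tool.
find_empty_recovery_dets() {
    # Default path assumed from the log excerpt; pass another directory to override.
    mnesia_dir="${1:-/var/lib/rabbitmq/mnesia}"
    # -size 0 matches only files with no content.
    find "$mnesia_dir" -type f -name recovery.dets -size 0
}

# Example (run on the affected node):
#   find_empty_recovery_dets
```

If the function prints a path, that file is the 0-byte recovery.dets described above.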

The recovery.dets file contains recovery metadata written when the node is stopped gracefully. There is a high chance of it being corrupted if the RabbitMQ node is stopped abruptly.

To remediate, delete this 0-byte file or move it to another location (e.g. /tmp/), and then reboot the node, in this case the vRealize Automation appliance.
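The move step could look something like the sketch below. The helper name move_recovery_dets_aside and the timestamped backup name are assumptions made for illustration; the post itself only says to delete the file or move it to a location such as /tmp/. Moving rather than deleting keeps the original around in case it needs to be inspected later.

```shell
# Move a corrupt recovery.dets out of the way instead of deleting it.
# move_recovery_dets_aside is an illustrative helper name, not a RabbitMQ tool.
move_recovery_dets_aside() {
    dets_file="$1"             # full path to the recovery.dets file
    backup_dir="${2:-/tmp}"    # destination directory; /tmp as in the text
    # Refuse to continue if the file is not where we expect it.
    [ -f "$dets_file" ] || { echo "not found: $dets_file" >&2; return 1; }
    # A timestamp suffix (an assumption) avoids overwriting an earlier backup.
    mv -- "$dets_file" "$backup_dir/recovery.dets.$(date +%Y%m%d%H%M%S)"
}

# Example (substitute the real node directory from your own log):
#   move_recovery_dets_aside "/var/lib/rabbitmq/mnesia/<node dir>/recovery.dets"
```

After the file has been moved, reboot the node as described above.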

Once this was done, all services, including RabbitMQ, started successfully during the boot process.

#vRealizeAutomation