[389-users] 389ds can't start after "db error (no disk space)" ... space problem has been resolved
Zarko D
2018-12-10 20:59:10 UTC
Hi there, we have four IPA servers (4.4.0) running 389-ds 1.3.5.10-11, with multi-master replication among some of them.

There is a daily backup via ipa-backup, and on one server it failed because the disk ran out of space. /var/log/dirsrv/slapd-EXAMPLE-COM/errors shows:

- NSMMReplicationPlugin - changelog program - _cl5WriteEntryCount: failed to write count entry for file /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb/2acb5f15-a8ef11e6-81cbc137-643887ad_57be0c5f000000040000.db; db error - 28 No space left on device
- NSMMReplicationPlugin - changelog program - _cl5WriteRUV: failed to write purge RUV for file /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb/2acb5f15-a8ef11e6-81cbc137-643887ad_57be0c5f000000040000.db; db error - 28 (No space left on device)
- NSMMReplicationPlugin - changelog program - _cl5WriteRUV: failed to write upper bound RUV for file /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb/2acb5f15-a8ef11e6-81cbc137-643887ad_57be0c5f000000040000.db; db error - 28 (No space left on device)
- NSMMReplicationPlugin - changelog program - _cl5WriteEntryCount: failed to write count entry for file /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb/6070479f-a8ef11e6-81cbc137-643887ad_57be0cb8000000600000.db; db error - 28 No space left on device

The disk space problem was resolved by growing the logical volume, but 389-ds now fails to start with these messages:

- NSMMReplicationPlugin - changelog program - cl5Open: failed to open changelog
- NSMMReplicationPlugin - changelog program - changelog5_init: failed to start changelog at /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb
- Failed to start object plugin Multimaster Replication Plugin
- Error: Failed to resolve plugin dependencies

Can you please advise on a possible resolution? Thanks in advance, Zarko
_______________________________________________
389-users mailing list -- 389-***@lists.fedoraproject.org
To unsubscribe send an email to 389-users-***@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/
Marc Sauton
2018-12-10 21:21:57 UTC
A restore is one way to proceed.
A quick way to recover the LDAP service in this situation is to remove the
changelog files and let 389-ds create new ones at startup.
Then check for consistency as much as possible.
Eventually, re-initialize from another replica.
The recovery process may take "some time", depending on disk I/O latency,
cache tuning, and db size.
Thanks,
M.
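
The "remove the changelog files" step above could be sketched roughly as follows. This is only an illustration under stated assumptions: the thread does not say exactly which files to remove, so the sketch assumes every regular file in the cldb directory named in the error messages, and it moves them to a backup directory instead of deleting them so the step can be undone. The function name `move_changelog_aside` and both directory arguments are hypothetical, and the directory server is assumed to be stopped before it runs.

```shell
#!/bin/sh
# Sketch only: move the files in a changelog (cldb) directory aside so
# that the server can create new ones at startup.
# Assumption: the directory server has been stopped before this runs.
# Assumption: "changelog files" means every regular file in the directory.

# move_changelog_aside CLDB_DIR BACKUP_DIR   (hypothetical helper)
move_changelog_aside() {
    cldb_dir=$1
    backup_dir=$2

    # Refuse to continue if the changelog directory does not exist.
    [ -d "$cldb_dir" ] || { echo "no such directory: $cldb_dir" >&2; return 1; }
    mkdir -p "$backup_dir" || return 1

    for f in "$cldb_dir"/*; do
        [ -f "$f" ] || continue          # skip subdirectories and an unmatched glob
        mv "$f" "$backup_dir"/ || return 1
    done
}
```

A call might look like `move_changelog_aside /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb /root/cldb-backup`, where the backup location is a placeholder. Moving instead of deleting keeps the old files available if the restart does not go as hoped.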
Zarko D
2018-12-10 22:08:21 UTC
Thanks Marc. Which files / directories exactly should be removed?

[1] I see many db files in the /var/lib/dirsrv/slapd-US-COM/db/changelog/ directory:
aci.db
ancestorid.db
changenumber.db
cn.db
DBVERSION
entryrdn.db
entryusn.db
id2entry.db
nsuniqueid.db
numsubordinates.db
objectclass.db
parentid.db
seeAlso.db
targetuniqueid.db

[2] What are the files inside cldb?

/var/lib/dirsrv/slapd-US-COM/cldb
2acb5f15-a8ef11e6-81cbc137-643887ad_57be0c5f000000040000.db
2acb5f15-a8ef11e6-81cbc137-643887ad.sema
6070479f-a8ef11e6-81cbc137-643887ad_57be0cb8000000600000.db
DBVERSION
Zarko D
2018-12-11 01:36:06 UTC
I was able to restore the 389-ds data and start the service with the steps below.

[ on failed-server] ipa-restore -d --data /var/lib/ipa/backup/ipa-data-yyyy-mm-dd-hh-mm-ss
[ on failed-server] systemctl start ***@US--COM.service
[ on failed-server] ipa-replica-manage -v re-initialize --from="good-server"
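
For anyone repeating this, the three steps above could be wrapped in a small script along these lines. It is a sketch, not a tested recovery procedure: the helper names `run` and `recover_replica` and the `DRY_RUN` switch are made up for illustration, and the backup directory, service unit name, and source server are placeholders the operator has to supply. The command options are the ones used in the steps above.

```shell
#!/bin/sh
# Sketch only: run the three recovery steps in order, stopping at the
# first failure. All three arguments are operator-supplied placeholders.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recover_replica BACKUP_DIR DIRSRV_UNIT GOOD_SERVER   (hypothetical helper)
recover_replica() {
    if [ $# -ne 3 ]; then
        echo "usage: recover_replica BACKUP_DIR DIRSRV_UNIT GOOD_SERVER" >&2
        return 2
    fi
    backup_dir=$1
    unit=$2
    good_server=$3

    # 1. Restore the directory data from the ipa-backup data backup.
    run ipa-restore -d --data "$backup_dir" || return 1
    # 2. Start the directory server instance.
    run systemctl start "$unit" || return 1
    # 3. Re-initialize this replica from a known-good server.
    run ipa-replica-manage -v re-initialize --from="$good_server" || return 1
}
```

Setting `DRY_RUN=1` before calling `recover_replica` prints the commands without running them, which makes it easier to check the arguments before touching the server.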
