[389-users] Monthly internal scheduled task failure resulting in segfault

Discussion:

Nelson Bartley

2021-05-26 05:42:52 UTC

Good day,

I previously messaged about this issue, but didn't have a core dump to provide.

Almost exactly once a month, to the minute, a scheduled task starts in
our 389-ds and results in a segfault.

We are currently using Fedora 33, 389 packages 1.4.4.15-1.fc33. This
bug also occurred with an earlier package set (I do not remember the
version; same FC33). We have now experienced this exact segfault three
times, on schedule.

I have attached to the email the cockpit information from the crash.

You can get the coredump from this link: http://gofile.me/4kovq/KZB9Waixm

I was hoping it was possible to identify what scheduled service is
crashing, and if possible how to disable it temporarily until the
actual cause of the crash can be fixed in an updated binary?

Nelson

Mark Reynolds

2021-05-26 12:15:26 UTC

Hi Nelson,

I'm working on a DB compaction improvement. DB compaction now occurs
every 30 days, and I found a bug: if you don't have replication set up,
the server crashes when trying to compact a changelog (that does
not exist). This only happens on 389-ds-base-1.4.3 or newer, and only
if you don't have replication set up. Can you confirm whether you are
using replication on this server?

Mark

_______________________________________________
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure

--
389 Directory Server Development Team

Mark Reynolds

2021-05-26 13:12:00 UTC

If this is the situation you are running into, then you can change the
compact interval to 0 for nsslapd-db-compactdb-interval under dn:
cn=bdb,cn=config,cn=ldbm database,cn=plugins,cn=config. I would suggest
stopping the server and manually editing the dse.ldif in this case,
because changing the value with ldapmodify while the server is running
causes the compaction to start, and it will crash your server again.
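
For anyone who would rather script the offline edit than do it by hand, here is a minimal sketch in Python. It is only an illustration of the workaround described above, not an official tool: the function names are made up for this example, it assumes the server is already stopped, and it assumes the entry's dn line in dse.ldif is not folded across lines and is spelled as quoted above (ignoring case). The DN and attribute name are taken from the message; everything else is an assumption.

```python
from pathlib import Path
import shutil

# DN and attribute exactly as quoted in the message above.
TARGET_DN = "cn=bdb,cn=config,cn=ldbm database,cn=plugins,cn=config"
ATTR = "nsslapd-db-compactdb-interval"


def set_compact_interval(ldif_text: str, value: int = 0) -> str:
    """Return ldif_text with ATTR set to value inside the TARGET_DN entry.

    Simplification (assumption): no folded lines or base64-encoded DNs.
    """
    out = []
    in_target = False   # True while we are inside the TARGET_DN entry
    replaced = False    # True once ATTR has been rewritten in that entry
    for line in ldif_text.splitlines():
        if line.lower().startswith("dn:"):
            in_target = line[3:].strip().lower() == TARGET_DN.lower()
            replaced = False
        elif in_target and line.lower().startswith(ATTR.lower() + ":"):
            line = f"{ATTR}: {value}"
            replaced = True
        elif in_target and line == "":
            # A blank line ends the entry; add the attribute if it was absent.
            if not replaced:
                out.append(f"{ATTR}: {value}")
            in_target = False
        out.append(line)
    # The target entry may be the last one in the file, with no blank line after it.
    if in_target and not replaced:
        out.append(f"{ATTR}: {value}")
    return "\n".join(out) + "\n"


def rewrite_dse(path: str) -> None:
    """Copy dse.ldif to dse.ldif.bak, then rewrite it in place.

    Only run this while the server is stopped.
    """
    src = Path(path)
    shutil.copy2(src, src.with_name(src.name + ".bak"))
    src.write_text(set_compact_interval(src.read_text()))
```

You would call rewrite_dse() with the path to your instance's dse.ldif (commonly something like /etc/dirsrv/slapd-INSTANCE/dse.ldif, but check your own install) and then start the server again. Keeping the .bak copy makes it easy to restore the original interval once a fixed package is released.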

--
389 Directory Server Development Team

Mark Reynolds

2021-05-26 16:21:33 UTC

Post by Nelson Bartley
Thank you for your assistance. I will be trying it in our test
environment. Is there anything I can do to artificially trigger
this issue, to ensure it's no longer a problem?
Cheers
Nelson.

With my suggestion you are turning compaction "off", so it should
never run. There is no way to verify it other than the server no longer
crashing.

HTH,

Mark

Post by Nelson Bartley
I can confirm we do not have replication on these servers.

Post by Mark Reynolds
Ok, so you can use the workaround I mentioned (setting the
compact interval to 0) until we get the proper fix released.
Thanks,
Mark

--
389 Directory Server Development Team

Nelson Bartley

2021-05-26 16:38:41 UTC

Understood.

Thank you for your help. We will disable this setting until we hear on
this mailing list that the issue has been fixed.

Cheers

Nelson.
