OpenLDAP sucks

Originally posted to the Scary Devil Monastery on 2004.02.09. To be fair, OpenLDAP has gotten much more stable since then.

Rune Kristian Viken wrote:
> 
> I suddenly realize that the ldap database contains far too few users, 
> that is, no users since November seem to exist in the database.  Curses
> fly through the room.  

    OpenLDAP, at least, seriously sucks.  If it weren't better than
all the alternatives for our setup, I'd be tempted to find all of the
authors and go back in time to kill all four grandparents of each one,
just to be sure.

    I discovered the marvelous failure-to-sync behaviour in a couple
different ways as I underwent the joys of a NIS-to-LDAP migration.
The first time around, the database appeared to have taken some
corruption and was no longer writing anything to disk, although it was
handling new entries fine from the memory allocated by the processes.
I discovered this when investigating why simple operations that used
to be merely annoyingly slow[1] had become sanity-sucking slow, and
when I brought it down for a few seconds to reindex it, as I had done a
few times before when I was still working out which attributes to
index, slapindex hung.  So, puzzled, and not wanting users on my back
while I figured it out[2], I restarted slapd.

    Or tried to, anyway.  I knew I should never have let the stock of
black candles get so low.  There weren't enough left for a proper
ritual.

    Anyway, after much cursing, and the discovery that of course the
direct backups of the database directory are *also* corrupt, and
more cursing, and telling users who can no longer log in to go away
and leave me alone for more than a minute at a time so I CAN ACTUALLY
GET SOME WORK DONE, I manage to get a slapcat out of it, that hangs
*IN THE MIDDLE OF SPITTING OUT AN ATTRIBUTE IN THE MIDDLE OF AN ENTRY,
AND, IMPOSSIBLY, IN A DIFFERENT FSCKING PLACE DEPENDING ON HOW I PIPE
IT!*

    Ahem, excuse me.  I've been told that since the last budget cuts,
any more increases on the graduate student health insurance premiums
will come out of my paycheck, so I haven't been getting my necessary
stress relief in.  It's been so bad that I haven't even bothered to
replace the batteries in the cattle prod.

    So anyway, the last few entries that we added were unrecoverable,
including an account made for an attractive undergraduate that my PFY
has been getting friendly with.  PFY isn't entirely unhappy, because
she'll have to come in to see him to get her password reset.  Still,
I'm a little nervous, so I make a cronjob to start making straight
text LDIF dumps of the database so that in the case that something
like this happens again, I'll have an up-to-date backup from the
previous night that can't be corrupt.
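A minimal sketch of what such a nightly dump job could look like, with a sanity check added so that a truncated dump never replaces the previous good one. This is not the author's actual cronjob. The function names, the ".tmp" suffix, and the minimum-entry threshold are all hypothetical, and the check assumes the dump ends with a blank line after the last entry, as slapcat output normally does.

```shell
#!/bin/sh
# Hypothetical helpers for a nightly LDIF dump; names and paths are assumptions.

# Count entries by counting the lines that start one ("dn:" or base64 "dn::").
ldif_entry_count() {
    grep -c '^dn:' "$1"
}

# Succeed only if the dump is non-empty, ends with a blank line,
# and holds at least the given minimum number of entries.
ldif_looks_complete() {
    file=$1
    min_entries=$2
    [ -s "$file" ] || return 1
    # A dump cut off in the middle of an entry does not end with an empty line.
    [ -z "$(tail -n 1 "$file")" ] || return 1
    [ "$(ldif_entry_count "$file")" -ge "$min_entries" ]
}

# Dump to a temporary file first, and only replace the previous
# dump if the new one passes the checks above.
nightly_dump() {
    out=$1
    min_entries=$2
    tmp="$out.tmp"
    slapcat > "$tmp" || return 1
    ldif_looks_complete "$tmp" "$min_entries" || return 1
    mv "$tmp" "$out"
}
```

A cron entry would then call something like "nightly_dump /var/backups/ldap.ldif 1000" (path and threshold made up) and mail the admin on a non-zero exit. The entry-count threshold is there because a dump that happens to be cut off exactly between two entries would still end with a blank line.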

   By now, all the monks here are probably laughing at me, but that's
okay.  I've laughed at a lot of you, too.  What would the monastery be
without the camaraderie of mutual schadenfreude?

    So of course, a month later, I come in and note that the system is
frightfully sluggish again.  I fight the adrenaline hit and calmly
clone the ldap data to a second server to check it out.  Corrupt.
Okay, no problem, I know exactly how to get around it now, so downtime
will be on the order of seconds.  No problem.  I'll just check on the
backup LDIF just to make sure that everything is ... WHAT DO YOU MEAN
IT TRUNCATED IN THE MIDDLE OF AN ENTRY *EVEN BEFORE* THE END OF MY
LAST MANUAL DUMP THAT WAS FINE?!  IT WAS RUNNING PERFECTLY WHEN I LEFT
LAST NIGHT!

    *cough* Anyway, no helping it now, so I recover as much as I can,
and reload the system.  The users don't even notice, with the
exception of the owners of a couple recently recreated accounts,
including the previously mentioned undergraduate, who is now starting
to get a little suspicious at how often she's ending up in the PFY's
office.  PFY takes one look at my expression and wisely keeps his
comments neutral.

    So, anyway, I have a much more solid workaround in place now, but
describing it would be UI, so all you buggers who were snickering
earlier can just hang if you need it.  It should have been
transparently obvious to me in the first place, so it should be
transparently obvious to you by now.  And I haven't had any problems
with it since then, probably because whatever mocking power it is that
keeps poking at my life has decided that there's easier targets.

    Like the versioning issues with perl and db, for instance, that
cause perl to spin its wheels endlessly on some indexing operations,
or the half-dozen lines of code I added to my automatic updates system
that failed last Saturday in a manner utterly inconsistent with the
perfectly clean results I observed on my independent test systems only
last Friday.  But that's another rant.


> I need some whisky.

    Annoyingly, I now find nauseating all the stuff I used to like
when I was a teenager, though at last year's Usenix I discovered a new
fondness for margaritas if sufficiently smooth (a concept which nobody
seems to understand out where I live, where margaritas seem to be made
almost entirely out of poor tequila, though I suppose I can start
making my own).  What are people recommending now?


[1] Lightweight my ass.  The fact that X.509 has the weight of an
    18-wheel rig doesn't make a minivan something you shove in your
    backpack.

[2] Yes, I know what a replica server is.  It wasn't working for
    reasons I'm not going to get into.  Go away.

OpenLDAP Really Really SUCKS

I agree, and have experienced all the same issues ...
If the OpenLDAP team blames Berkeley DB, why the hell don't they use some other backend database as the default?

My problem is simple but very critical: my LDAP data regularly gets corrupted, with errors pointing to some indexing issue. I tried turning off all indexing and it worked fine, but LDAP_SEARCH was damn slow. In my case indexing is vital.

I don't know why OpenLDAP even exists; it's good only as a toy for a very small number of entries (I am talking about less than 100).

If there exists a solution, I would be more than glad to know. Else I will be shifting to the Great Microsoft World! Where everything at least is usable.

I am really really pissed off with OpenLDAP data corruption.
Posted on several mailing lists and http://www.ldapguru.com, but no one has ever replied ... !!!

If you know any solution other than turning indexes off, that will be worth more than a bottle of wine to me.

It's a pity...

... that everything else is worse. Even with the problems, I'm very happy to be finally through with NIS, and I'm not ready to move to a complete Kerberos infrastructure. That didn't keep me from being royally pissed at the time, of course.

To be fair, with the current version (2.1.30) my only problem seems to be with very sporadic performance breakdowns (it grinds nearly to a halt every other month or so), and on SMP systems only (my uniprocessor servers are fine). Those can be fixed simply by restarting slapd periodically. Changing backends didn't help; I experienced exactly the same problem with LDBM as with BDB. I suspect a very obscure race condition somewhere.

I'm running about 1200 entries on the SMP system, and only 200 on the uniprocessor system, though, which may have something to do with it.

By the way, I have seen ActiveDirectory collapse as well, and it's far harder to clean up, so going to Microsoft won't help you there, plus it gets you all of the baggage of a Microsoft server. If you want commercial support, you might want to give Netscape Directory Server a try, or go with one of the places that provides support for OpenLDAP. (I'm not endorsing any of the places in those links, but just noting them as a possibility.)

It's scary how close our situations are...

I also have a script to backup my ldap daily.

I even modified the postfix config NOT to start up if LDAP is in its successfully-started-but-hung state.
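One way to detect the started-but-hung state is to send slapd a trivial query and give up if no answer arrives within a few seconds. The sketch below is an assumption about how such a check might be done, not the commenter's actual configuration. The function names, the five-second limit, and the default URI are hypothetical, it relies on the "timeout" utility from GNU coreutils, and it assumes the server allows an anonymous read of the root DSE.

```shell
#!/bin/sh
# Hypothetical health check; names, limits and the URI are assumptions.

# Run a command, discard its output, and fail if it errors out
# or takes longer than the given number of seconds.
responds_within() {
    limit=$1
    shift
    timeout "$limit" "$@" > /dev/null 2>&1
}

# Ask slapd for its root DSE, which is about the cheapest query there is.
# A hung server accepts the connection but never answers, so the
# timeout is what actually catches that state.
ldap_is_answering() {
    uri=${1:-ldap://localhost}
    responds_within 5 ldapsearch -x -H "$uri" -s base -b "" \
        "(objectClass=*)" namingContexts
}
```

A start script for a dependent service could then run "ldap_is_answering || exit 1" before doing anything else.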

Here is my restore script:

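# Stop slapd, and kill any hung process the init script could not stop.
/etc/init.d/slapd stop
killall -9 slapd
# Throw away the corrupt database directory and recreate it empty.
rm -rf /var/lib/openldap-data
mkdir /var/lib/openldap-data
chown ldap:ldap /var/lib/openldap-data
# Start with an empty database, then reload it from the last good LDIF dump.
/etc/init.d/slapd start
ldapadd -x -D cn=admin,dc=blah,dc=co,dc=za -W < backup.ldif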

I've done this many times, and all forums do is blame filesystem/os/db instability... but no clue on how to resolve it.

-cry-

I think LDAP is the only really good and neat solution, but if I cannot get some stability (the LDAP db corrupts practically after every system crash and some plain reboots)

Marius van Wyk

LDAP backups

Make sure your backup scripts generate staggered backups over the last week or two. It's quite possible to generate two days in a row worth of corrupt backups without realizing it.
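The staggering can be sketched as date-stamped copies plus a pruning step that keeps only the newest few. The function names, the "ldap-YYYY-MM-DD.ldif" naming scheme, and the directory layout below are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical rotation helpers; names and file layout are assumptions.

# Store today's dump under a date-stamped name, e.g. ldap-2004-02-09.ldif.
save_dated_copy() {
    src=$1
    dir=$2
    cp "$src" "$dir/ldap-$(date +%Y-%m-%d).ldif"
}

# Delete the oldest dated dumps until at most "keep" remain.
# ISO dates sort lexicographically, so the glob expands oldest-first.
prune_old_backups() {
    dir=$1
    keep=$2
    total=0
    for f in "$dir"/ldap-????-??-??.ldif; do
        [ -e "$f" ] && total=$((total + 1))
    done
    excess=$((total - keep))
    for f in "$dir"/ldap-????-??-??.ldif; do
        [ "$excess" -gt 0 ] || break
        [ -e "$f" ] || continue
        rm -- "$f"
        excess=$((excess - 1))
    done
    return 0
}
```

Running "save_dated_copy" followed by "prune_old_backups /var/backups/ldap 14" each night (directory name made up) would keep two weeks of dumps, so a couple of bad nights in a row still leave older good copies to fall back on.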

Also, I've only ever had problems on SMP machines. You may want to make your main auth server single-processor and see if that helps.
