Sending group messages

I am working on a modification to the messaging plugin to allow a PM to be sent to all the users (1000+) on my site. I have used the "Messages improved - Groups & Collections of friends support" plugin as a basis and have it working.

The issue I have is that when I look at the db, I now have the same message saved in the objects_entity table once for every user, which seems a waste of space.

Is it possible to modify the way an Elgg object is created such that the message is saved only once and all the db entries for the users who have received that message refer to that single row in objects_entity?

Thanks in advance!

 

  • First thing that comes to mind is using relationships to track recipients.  You'd likely have to do a lot more custom coding in order to get things like "delete" to work as expected.
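
To make the relationship idea concrete, here is a minimal sketch in plain PHP. It is an in-memory illustration only, not Elgg's API: the class SharedMessageStore and its methods are hypothetical names, and a real plugin would keep the recipient links in Elgg relationships rather than in arrays. It shows why "delete" needs custom handling: deleting for one user must remove only that user's link.

```php
<?php
// Hypothetical in-memory model (not Elgg's API): the message body is stored
// once, and each recipient only gets a lightweight link to it.
final class SharedMessageStore
{
    /** @var array<int, string> message id => body, stored a single time */
    private array $messages = [];
    /** @var array<int, array<int, bool>> message id => set of recipient ids */
    private array $recipients = [];
    private int $nextId = 1;

    /** Store the body once and link every recipient to it. */
    public function send(string $body, array $userIds): int
    {
        $id = $this->nextId++;
        $this->messages[$id] = $body;
        $this->recipients[$id] = array_fill_keys($userIds, true);
        return $id;
    }

    /** Messages visible to one user: only those the user is linked to. */
    public function inbox(int $userId): array
    {
        $out = [];
        foreach ($this->recipients as $id => $set) {
            if (isset($set[$userId])) {
                $out[$id] = $this->messages[$id];
            }
        }
        return $out;
    }

    /** "Delete" removes one user's link; the body goes when no links remain. */
    public function deleteFor(int $userId, int $messageId): void
    {
        unset($this->recipients[$messageId][$userId]);
        if (empty($this->recipients[$messageId])) {
            unset($this->recipients[$messageId], $this->messages[$messageId]);
        }
    }

    /** Number of message bodies currently stored. */
    public function storedBodies(): int
    {
        return count($this->messages);
    }
}
```

Sending to three users with this sketch stores one body and three links, and the body survives until the last recipient deletes it.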

  • If it works similarly to the stock message plugin, it actually creates two objects for each message: one for the sender and one for the recipient.  The reason for this is access permissions--the message entities' owners are set to the sender and the recipient, and then the access is set to private. This allows only those users to read the message.

    Because of how Elgg's access system works, you can't override access when getting entities. If an entity is saved with an access level that hides it from a user, you can't (cleanly) make it visible to that user.

    I wouldn't worry too much about wasted space (storage is cheap) but there are major performance implications when creating message entities like this.

  • Thanks guys - that makes sense.

    Re performance, I went through a test run on my local site with about 800 messages and it took about 15 minutes... this is OK for now if I send them out at low usage times. 

    I think ultimately I need to create a different plugin for this as I don't need the option of reply or the privacy option for an all users PM. 

  • I think also the adminshout plugin is meant to provide similar functionality to what you're going for.  You might consider checking it out before building something yourself.

  • @jj01 - The larger problem is that while those messages are being sent, db tables will be locked, so the site will slow to a crawl for other users.

  • Hmm.. both the PlugIns mentioned above actually send out emails ;)
    I think JJ is looking at only sending internal messages.
    The performance issue is one we have already faced with fbfkids.com
    when we started considering mass-mailouts -
    our problems were compounded by the fact that we have 151K users
    and so sending out that many emails would have bogged our server down
    The solution we devised was to "batch" the mailouts to 200 per basket and
    wrote some basic CRON bash script to call PHP CLI every 5 or 10 minutes
    to send out the smaller basket of emails.
    It would work but did mean that the CRON would run continually
    over about 24 - 36 hours before completing all the emails ;-)
    JJ might want to look into creating some sort of "batched" PHP script
    to run off CRON -- which can pull in engine start.php
    and send the internal messages -
    this should help the database locking issue
    If you really don't like the same message on the db for each user..
    well, as Brett said - space *is* cheap.
    Only way to solve such a space situation would be
    to customize the messages PlugIn to fetch the generic message
    from some other store - but the code effort for this would be
    most likely not worth the time.
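
The batching described above might look roughly like the following. This is only a sketch under assumptions: send_next_batch is a hypothetical helper, $send stands in for whatever delivers one message or email, and the offset would have to be persisted somewhere (a file or a db row) between CRON runs.

```php
<?php
/**
 * Hypothetical helper: deliver the next "basket" of recipients and return
 * the offset at which the following CRON run should resume.
 */
function send_next_batch(array $userIds, int $offset, int $batchSize, callable $send): int
{
    // Take at most $batchSize recipients starting at the saved offset.
    $batch = array_slice($userIds, $offset, $batchSize);
    foreach ($batch as $userId) {
        $send($userId);
    }
    // When the list is exhausted the batch is empty and the offset stops moving.
    return $offset + count($batch);
}

// Simulate successive CRON runs over 450 made-up user ids, 200 per run.
$userIds = range(1, 450);
$offset = 0;
$runs = 0;
while ($offset < count($userIds)) {
    $offset = send_next_batch($userIds, $offset, 200, function (int $userId): void {
        // Placeholder: the real script would send the internal message here.
    });
    $runs++;
}
// The 450 ids are covered in three runs of 200, 200 and 50.
```

Each CRON invocation would do only one call to send_next_batch, so the work per run stays small no matter how many users there are.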

  • Dhrup, 

    At 1.5K users we already ran into issues with email. We switched to a very hackish message queue setup to send those mails, and it has worked well so far. I will soon switch to a pcntl_fork-based setup that I am using on another non-Elgg project that is doing a lot of crawling for me. It gives MUCH better performance than the single CLI script. The messages are being pulled from another store, but it is not a hack on top of the Elgg messaging system.

  • Hmm. I had not looked into pcntl_fork before - will study it now ;) I'd be curious to see how you coded for this to send messages/emails. Are you on Dedicated/VPS/Shared? The batched queue seems to be OK for us - b/c we run only 200 emails every 5-10 mins (we're still in testing phase) - though I've not yet tested this on very high loads (24-36 hr runs ;-)

  • The Elgg stuff is running on a 1GB VPS, load is minimal on it since it processes it sequentially. I'd assume that it is not very different from how you would have done it.

    Code for the existing mass mailer works like this:

    1. Grab all the users

    $users = get_data("SELECT * FROM  elggusers_entity where banned = 'no'"); 

    2. Process message/job for each user.

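    foreach ($users as $user) {
        // $payload and $bucket are built per user (not shown); escape them before interpolating into the SQL.
        $query = insert_data("INSERT INTO elgg_job_queue (type, payload, bucket, done) VALUES ('email', '$payload', '$bucket', 'no')");
    }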

    3. Handle exceptions/errors.

    4. A PHP CLI script runs in an endless loop with sleep(1). It pulls from the elgg_job_queue table, processes any jobs it finds, deletes the processed jobs, and leaves failed jobs marked as not done. It never exits. I'm planning to use supervised later for a more elegant solution.

    Since I have a field mentioning 'type' in the table, I can do various things with the same queue. The major downside is that it is sequential and implementing locking on your own is reinventing the wheel. It is better to use a proper job queue for it.
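
One pass of a worker like the one in step 4 could be sketched as below. This is an in-memory illustration with assumed names: process_pending and the handler map are hypothetical, and the array stands in for rows of elgg_job_queue. The real worker would wrap a pass like this in a loop with sleep(1) and read and delete rows in the db.

```php
<?php
/**
 * Hypothetical single pass over the queue. Each job is an array with
 * 'type', 'payload' and 'done' keys, mirroring the table columns above.
 * Returns the jobs that must stay in the queue.
 */
function process_pending(array $queue, array $handlers): array
{
    $remaining = [];
    foreach ($queue as $job) {
        // The 'type' field selects the handler, so one queue serves many job kinds.
        $handler = $handlers[$job['type']] ?? null;
        $ok = false;
        if ($handler !== null) {
            try {
                $ok = (bool) $handler($job['payload']);
            } catch (Throwable $e) {
                $ok = false; // a failing handler must not kill the worker
            }
        }
        if (!$ok) {
            // Failed or unknown jobs stay queued, marked as not done.
            $job['done'] = 'no';
            $remaining[] = $job;
        }
        // Successful jobs are simply dropped, i.e. deleted from the queue.
    }
    return $remaining;
}
```

Because this pass is sequential and takes no locks, running two such workers against the same table could process a job twice, which is the downside mentioned above.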

    The code that is using pcntl_fork uses the library from Kore Nordmann: http://kore-nordmann.de/blog/0098_native_job_queue.html

    It does require PHP 5.3 and pcntl_fork support compiled in.

    What I do here is to have the main daemon-like process spawn worker CLIs that have a strict timeout.
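
As a rough illustration of that parent/worker pattern, here is a minimal sketch using PHP's pcntl functions directly. It is not the API of the library linked above: run_workers is a hypothetical name, it assumes PHP CLI with the pcntl extension, and it starts one worker per job at once with no concurrency limit. The timeout relies on SIGALRM's default action, which terminates a child that has not installed its own handler.

```php
<?php
/**
 * Hypothetical sketch: fork one worker per job, give each a hard timeout,
 * and report which workers exited cleanly.
 *
 * @return array<int|string, bool> job key => true if the worker exited with 0
 */
function run_workers(array $jobs, int $timeoutSeconds, callable $work): array
{
    $pids = [];
    foreach ($jobs as $key => $job) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            // Child: schedule SIGALRM; if the work overruns, the signal kills us.
            pcntl_alarm($timeoutSeconds);
            $work($job);
            exit(0);
        }
        // Parent: remember which child handles which job.
        $pids[$pid] = $key;
    }

    $results = [];
    foreach ($pids as $pid => $key) {
        pcntl_waitpid($pid, $status);
        // A worker killed by the alarm did not exit normally, so it counts as failed.
        $results[$key] = pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0;
    }
    return $results;
}
```

A long-running daemon would call something like this repeatedly with small groups of jobs, so a stuck worker can never hold the queue for longer than the timeout.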