Performance of elgg_get_entities_from_annotations (feedback required)

I've opened a ticket on GitHub (https://github.com/Elgg/Elgg/issues/6638) and I need your feedback.

The problem is a serious performance issue with elgg_get_entities_from_annotations(). It can bring the database down.

The problem seems to be related to the custom select in this function ("max(n_table.time_created) as maxtime").

I use the following options for my problem query:

$options = array(
    "type" => "group",
    "limit" => 1,
    "site_guids" => false,
    "annotation_name_value_pairs" => array(
        array(
            "name" => "email_invitation",
            "value" => $invite_code
        ),
        array(
            "name" => "email_invitation",
            "value" => $invite_code . "|%",
            "operand" => "LIKE"
        )
    ),
    "annotation_name_value_pairs_operator" => "OR"
);

This query takes around 10 sec on a large DB. If I remove the custom select (and the resulting group_by and order_by), the query takes 0.047 sec.

I asked Brett why the default behaviour of this function is different from the other elgg_get_entities* functions. He could only find the commits, not the reason.

I would like your input on whether we should change the default behaviour to be in line with the other elgg_get_entities* functions.

  • The reason is here: https://github.com/Elgg/Elgg/issues/2022. It was for BC with 1.6, but particularly to order by the annotations rather than the entities.

    Brett's suggestion SGTM: ("...order by e.last_action by default, but if someone passes maxtime, we fall back to the old behavior with a deprecation notice").

  • It's a function I haven't used extensively, so I'm not sure what BC issues may arise from the change. That said, such a change wouldn't affect me, and given how much I use Elgg (i.e. all day, every day), I doubt it would affect many people.

  • I have a problem with the suggestion of sorting by e.last_action.

    For example, I want to list all liked entities. With that sorting, the order of the result is affected by when someone last edited a blog or group.

    Why not just use the default sorting that is used in all elgg_get_entities* functions? I could still build in the deprecation for the old code, but let the new code be in line with the rest of Elgg.

  • I would try to keep it as close to the other egef* functions as possible, so default to e.time_created.

  • None of my use cases require the current behaviour.

    I'm definitely in favor of changing it and getting the performance gain.

  • +1 on using default behaviour. Not seen any sites which insist on this behaviour.

  • +1 for moving towards egef*. As for order_by, it seems elgg_get_metastrings_based_objects() uses time_created ASC. The last time I used elgg_get_annotations(), that produced results I was not expecting, and I had to debug until I realized it was the non-default ordering.

  • Tidypics use case:

    elgg_list_entities_from_annotations() and elgg_list_entities_from_annotation_calculation() are used to sort image lists by recent comments, recent views and recent votes. The sorting of images depends on the order of the annotations (their time_created) and not on the creation time of the corresponding images / entities.

    Changing the behaviour of elgg_get_entities_from_annotations() would surely break the Tidypics plugin, in both the Elgg 1.8 and 1.9 versions.

  • @iionly thanks for digging this up

  • It seems that more people will benefit if we change the behavior and the plugins needing the old behavior are updated to pass in an extra parameter.

    @jeabakker Are you able to make this change for 1.9 so that it's trivial to update e.g. Tidypics to work with the modified egefa*()?
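
A rough sketch of the change proposed above, for illustration only. The helper name sketch_annotation_query_defaults() and the opt-in flag "order_by_annotation_time" are hypothetical and not part of Elgg; the group_by column is also an assumption, since the thread only names the custom select. The helper only builds an options array and does not touch the database.

```php
<?php
// Hypothetical helper, NOT an Elgg function: shows how the defaults could be chosen.
function sketch_annotation_query_defaults(array $options = array()) {
    // Hypothetical opt-in flag for plugins (e.g. Tidypics) that need the old ordering.
    $legacy = !empty($options['order_by_annotation_time']);
    unset($options['order_by_annotation_time']);

    if ($legacy) {
        // Old behaviour: order entities by their most recent annotation.
        $options['selects'][] = 'max(n_table.time_created) as maxtime';
        $options['group_by'] = 'n_table.entity_guid'; // assumed column
        $options['order_by'] = 'maxtime desc';
        return $options;
    }

    // Proposed default: same ordering as the other elgg_get_entities* functions.
    if (!isset($options['order_by'])) {
        $options['order_by'] = 'e.time_created desc';
    }
    return $options;
}

// Default call: no custom select, entities ordered by creation time.
$default = sketch_annotation_query_defaults(array('type' => 'group', 'limit' => 1));

// Plugin that still wants annotation-time ordering passes the extra parameter.
$legacy = sketch_annotation_query_defaults(array(
    'type' => 'object',
    'order_by_annotation_time' => true,
));
```

With something along these lines, the slow grouped query would only run for callers that explicitly ask for it.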

Feedback and Planning

Discussions about the past, present, and future of Elgg and this community site.