Data model limitations while moving a large community over to Elgg

Hi guys, we are a large hospitality network (HospitalityClub.org) and are seriously considering a migration to Elgg. From what we can tell, HC would be the biggest site using Elgg so far (we have 650,000 members, and this number would probably increase considerably once we have a new site).

We hope to get some feedback to our questions here :-)

The first big question we have concerns the data model. Since we already have a database structure serving our site for the above-mentioned number of users, we would like to know how viable you Elgg experts consider a migration from a standard relational database design to the abstract, entity-based Elgg data model.

Our concern is that it might be more difficult to adapt our database model to Elgg than to build a new site on top of a framework while keeping our current database model.

We'd appreciate any feedback, ideas, links or comments :-)

Thanks and greetings from Buenos Aires,

Veit (HC Founder) and Ariel (Lead Developer)

  • Have you tried installing Elgg on localhost before going through with it? When you install Elgg, you can see its database structure and so on. There are plugins here that you can use to import from CSV or other formats. Good luck~

  • Hi Cim,

    Yes indeed, we installed it on a development box. I have gone through the DB schema, especially the entities, metadata and metastrings tables. The last one especially caught my attention: after adding a user, the metastrings table got populated with twenty-something rows. Imagine this number times the number of users we have, plus around 20 custom profile fields.
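
    A rough back-of-the-envelope sketch of that multiplication, in Python. The inputs are assumptions: 20 stands in for the "twenty something" metastrings seen per new user, and each custom field is assumed to add one metadata row plus up to two metastrings (a name and a value). If identical strings are stored only once, the real metastrings figure would be lower, so treat this as an upper bound rather than a measurement.

```python
def estimate_rows(users, default_metastrings=20, custom_fields=20):
    """Return (metadata_rows, metastring_rows) as a worst-case estimate.

    Assumptions (not measured on a real install):
    - each custom profile field adds one metadata row per user
    - each metadata row needs up to two metastrings (name + value)
    - no string is shared between users
    """
    metadata_rows = users * custom_fields
    metastring_rows = users * (default_metastrings + 2 * custom_fields)
    return metadata_rows, metastring_rows


metadata_rows, metastring_rows = estimate_rows(650_000)
print(f"metadata rows:   {metadata_rows:,}")
print(f"metastring rows: {metastring_rows:,}")
```

    Under these assumptions the result is in the tens of millions of rows, which is the scale the question is about.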

    I know the data model / performance question has already been discussed in the past. The Elgg scalability wiki page states (regarding the data model):

    There is an efficiency cost to doing this, however from looking at the typical usage of an Elgg install, this appears to be very minor.

    We'd like to know if a 650,000 member network would be considered among typical Elgg deployments.

    More advice is welcome. Thanks!

    Ariel

  • Twenty custom profile fields -> 20 metadata rows and 40 metastring rows being added is of no real concern. This gives only 60 rows per user, and considering that MySQL uses a B+tree structure and that the meta tables are indexed quite properly, even with 1 billion metadata/metastring rows an indexed lookup would need only about 2 to 3 disk accesses.
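
    To see roughly where a figure like that comes from: the number of pages read in a B+tree lookup is bounded by the height of the tree, which grows with the logarithm of the row count. A minimal Python sketch follows; the fanout values (keys per index page) are assumptions for illustration, not measurements of MySQL, and upper tree levels that stay cached in memory would reduce the actual disk reads further.

```python
def btree_levels(rows: int, fanout: int) -> int:
    """Smallest number of levels such that fanout ** levels >= rows.

    Integer arithmetic only, so there is no floating-point rounding.
    `fanout` (keys per index page) is an assumed value.
    """
    if rows < 1 or fanout < 2:
        raise ValueError("rows must be >= 1 and fanout >= 2")
    levels, capacity = 1, fanout
    while capacity < rows:
        capacity *= fanout
        levels += 1
    return levels


for fanout in (100, 500, 1000):
    print(fanout, btree_levels(1_000_000_000, fanout))
```

    The point is only that the height stays small even for a billion rows; the exact number depends on key size and page size.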

    The real scalability issue is the hardware/server setup being used. A $5/month hosting plan will scale to only a few users, a VPS to maybe 20,000 users, and a dedicated server, depending on its power, to perhaps 50,000 - 200,000 users. Of course, the actual concurrent users hitting the server will cost resources, and by that time one will most likely be looking at load balancers and Apache/MySQL fine-tuning as well. You have 650,000 on a non-Elgg platform right now? We are running almost 200,000 users on Elgg + a dedicated server + heavy Apache/MySQL technical support.

    Elgg by itself does not have any substantive issues with scaling. Install Elgg on an IBM mainframe running Linux and you'll find no scalability problems at all with the plain vanilla hardware and OS ;-) and with no software tuning needed. Put Elgg on a shared hosting server and... scalability goes to the loo! The difference in initial and operational costs? Most likely a few $MM up from $5 ;-)

  • All in all, it comes down to hardware. Elgg can handle it as long as you've got a lot of juice.

  • Thanks for the input about scalability! Looks like that would be doable. However, do you have any input on the data model question? Basically, we're not starting a new site from scratch but have to move all our records to an (apparently) completely new data model, and the question seems to be if it's worth it :-)

  • 1. How viable?
       It can be done with the right brain power.
    2. (a) Adapt our database model to Elgg rather than (b) building a new site on top of a framework?
       Ditto.
    The process will need some serious analysis of the database structures and logical field matching.
    As for 2 (a) or (b): it depends on the purpose of the migration. Either option depends on the budget, and there is lots of work to be done anyway...
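
    To make "logical field matching" a bit more concrete, here is a minimal Python sketch of splitting one relational row into an entity record plus name/value metadata rows, with strings stored once in a lookup table. Everything in it is an assumption for illustration: the column names (`city`, `languages`), the record layouts and the `MetastringTable` helper are hypothetical and heavily simplified, not Elgg's actual schema or API.

```python
class MetastringTable:
    """Hypothetical string lookup table: each distinct string gets one id."""

    def __init__(self):
        self._ids = {}

    def intern(self, text):
        # Return the existing id, or assign the next one for a new string.
        return self._ids.setdefault(text, len(self._ids) + 1)

    def __len__(self):
        return len(self._ids)


def row_to_entity(row, guid, strings):
    """Split one relational row into an entity record and metadata rows."""
    entity = {"guid": guid, "type": "user"}
    metadata = [
        {
            "entity_guid": guid,
            "name_id": strings.intern(column),
            "value_id": strings.intern(str(value)),
        }
        for column, value in row.items()
        if value is not None  # empty columns produce no metadata row
    ]
    return entity, metadata


# Hypothetical legacy rows; real profile tables would have many more columns.
legacy_rows = [
    {"city": "Buenos Aires", "languages": "es"},
    {"city": "Buenos Aires", "languages": "de"},
]

strings = MetastringTable()
all_metadata = []
for guid, row in enumerate(legacy_rows, start=1):
    entity, metadata = row_to_entity(row, guid, strings)
    all_metadata.extend(metadata)

print(len(all_metadata), "metadata rows,", len(strings), "distinct strings")
```

    The actual work in a migration is deciding, column by column, which legacy fields become metadata like this and which need something else (relationships, files, and so on).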