QListBox holds all objects making it slow in certain situations - possible performance improvement?

thread: 15 messages  |  last: about 2 years ago  |  started: monday, december 28, 2009, 7:27 pm pst


#1  |  tronics (VIE, AUT) Austria
Monday, December 28, 2009, 7:27 PM PST

Hello!

I think one possible performance improvement would be to move away from holding the complete object in QListBox (and probably other).

Having a 1000 item only mysql table that consists of 142 columns and a lot of data in each row, I'm experiencing 4 seconds to generate the QListBox.
When making a QListBox just consisting of the ids behind this it is literarily not measurable.
There are indexes on the data and making a select * involving these fields in phpmyadmin takes only 0.0037 sec. I really tried to make this query slow by different experiments with the query and it was always lightening fast. Not so with the autogenerated QListBox.

I wonder if it would not be much better to do this in a different way.

UPDATE:
I found out that this has more likely to do with Qquery then with QListBox (where you only might see the results of this performance issue).
Please see below.

Cheers,
tronics

----------------

The following code is autogenerated in the same way that almost any application we build with the framework..

This takes 1,8 seconds alone:

$start = microtime(true);
$asdf=Business::LoadAll();
echo microtime(true)-$start.'<br />';

Stepping through to make the QListBox takes another 2,2 seconds.

public function lstRelatedBusinessObject_Create($strControlId = null) {
            $this->lstRelatedBusinessObject = new QListBox($this->objParentObject, $strControlId);
            $this->lstRelatedBusinessObject->Name = QApplication::Translate('Zugehoerigkeit Geschaeft Object');
            $this->lstRelatedBusinessObject->AddItem(QApplication::Translate('- Select One -'), null);
            $objRelatedBusinessObjectArray = Business::LoadAll();
            if ($objRelatedBusinessObjectArray) foreach ($objRelatedBusinessObjectArray as $objRelatedBusinessObject) {
                $objListItem = new QListItem($objRelatedBusinessObject->__toString(), $objRelatedBusinessObject->Id);
                if (($this->objGeschaefte->RelatedBusinessObject) && ($this->objGeschaefte->RelatedBusinessObject->Id == $objRelatedBusinessObject->Id))
                    $objListItem->Selected = true;
                $this->lstRelatedBusinessObject->AddItem($objListItem);
            }
            return $this->lstRelatedBusinessObject;
        }
.bc
#2  |  tronics (VIE, AUT) Austria
Tuesday, December 29, 2009, 7:41 PM PST

ADDITION:

I have now extracted the Query and applied it manually and measured it..

This is the complete code it took

*QueryArray TEST: 2.3054418563843
*LoadAll TEST: 2.2675719261169
*PLAIN PHP TEST: 0.036140918731689
* in phpmyadmin it took only 0.0050 but that might be automatically restricted to the first 30 rows.


protected function Form_Create() {


//-----QueryArray TEST-----------------------

QApplication::$Database[1]->EnableProfiling();
$start1 = microtime(true);
$test1= Geschaefte::QueryArray(
                                QQ::All(),QQ::OrderBy(QQN::Geschaefte()->Id,true),array(QQ::LimitInfo('10000')));    
        
print 'QueryArray TEST: ';
echo microtime(true)-$start1;
print '<br />';
            
            
            
//------LoadAll TEST------------------------------
            
$start2 = microtime(true);
$test2= Geschaefte::LoadAll();
print 'LoadAll TEST: ';
echo microtime(true)-$start2;
print '<br />';        

// output profiling for the 2 above..
                QApplication::$Database[1]->OutputProfiling();
                
//------Plain old PHP TEST---------------------------------
            
    
$link = mysql_connect("mysql.asdfasdf.info", "43_61", "1234");
mysql_select_db("34_61", $link);
$start3 = microtime(true);        
            
$query = '
    
SELECT
    `t0`.`id` AS `a0`,
... and so on ...
    `t0`.`modified` AS `a139`,
    `t0`.`created` AS `a140`
    
        FROM
            `Geschaefte` AS `t0`
            
        ORDER BY
            `t0`.`id` ASC
            
            LIMIT 0,100000000
            ';
            
            
$result = mysql_query($query);
$array = mysql_fetch_assoc($result);


print 'PLAIN PHP TEST: ';
echo microtime(true)-$start3;
print '<br />';            
            
            
}    // end form create

Cheers
tronics


#3  |  tronics (VIE, AUT) Austria
Tuesday, December 29, 2009, 7:49 PM PST

Now I also tried QueryArrayCached..

It is slower then this $result = mysql_query($query); $array = mysql_fetch_assoc($result);

1 time (not cached yet): slower then all of the above by at least .3 seconds
QueryArrayCached TEST: 2.5453979969025

2nd time and subsequent (cached): 16 times slower then Plain PHP
QueryArrayCached TEST: 0.56282591819763


//------QueryArrayCached TEST--------------

$start4 = microtime(true);
$test1= Geschaefte::QueryArrayCached(
QQ::All(),QQ::OrderBy(QQN::Geschaefte()->Id,true),array(QQ::LimitInfo('10000')));
print 'QueryArrayCached TEST: ';
echo microtime(true)-$start4;
print '
';

And last but not least stepping through all of the rows in plain PHP (I only stepped through one of them in the previous example..):

PLAIN PHP TEST: 0.066386938095093 very fast


//------Plain PHP TEST stepping trough all rows for comparison--------------

$start5 = microtime(true);
$result = mysql_query($query);
//$array = mysql_fetch_assoc($result);
while($ris=mysql_fetch_assoc($result)) {
   
    }
echo microtime(true)-$start5;

.bc
#4  |  tronics (VIE, AUT) Austria
Thursday, December 31, 2009, 12:48 AM PST

From collegue Akrohn:

Generated around 4000 rows for the projects table of the sample database and was curious what the profiler would say. (used QSqlServer2005Database)

For a Project::LoadAll() it got me this for a total time of 21.569 ms:

- first for “Avg. self” was ProjectGen::InstantiateDbResult (1206 ms for 1 call)
- first for “Total self” by a large margin was ProjectGen::InstantiateDbRow (11.115 ms for 4006 calls)
- first for most called function besides php internals is GetColumn() (around 56000 times, total self is 3.102 ms)

So I guess these 3 functions should be the first ones to look into for optimization.

#5  |  tronics (VIE, AUT) Austria
Friday, January 1, 2010, 1:49 AM PST


Further analysis from Akrohn..

The only difference between the database adapters using GetColumn() is the handling of boolean values. So maybe one could call GetColumn() only if it's a boolean in InstantiateDbRow().

Otherwise the value of the column could be determined in InstantiateDbRow() like that:


$strColumnArray = $objDbRow->GetColumnNameArray();
...
$strAliasName = array_key_exists($strAliasPrefix . 'id', $strColumnAliasArray) ? $strColumnAliasArray[$strAliasPrefix . 'id'] : $strAliasPrefix . 'id';
// for all other then boolean, directly access the value
// TODO: maybe a null check
$objToReturn->intId = $strColumnArray[$strAliasName];

.bc
#6  |  tronics (VIE, AUT) Austria
Saturday, January 2, 2010, 8:57 PM PST

It comes down to 2 problems:

1) Qcodo ORM is generating a lot of objects when doing eg. a LoadAll?()
This has also proven to hamper performance on other ORMs like Doctrine.

2) QListBox is hardly usable for large datasets - regarding usability & performance
What works really good on small datasets makes for a very lagging system as soon as you have a couple of hundred entries.

Performance is hampered in several ways:

  • Using the Qquery LoadAll?() method is really very, very slow
  • The more data there is in the set the larger the Formstate is on the clientside. I have one form where the HTML hits 2,7 MB because of this reason
  • The selection of this UI Element for a large dataset is poor. Looking at advanced listbox replacements from other frameworks it is obvious we have some room to catch up: http://www.grails.org/screencast/show/7 (I know it might look like autocomplete at first sight but this or something similar is also a high performance UI replacement for Listboxes)

I submitted those as 2 separate tickets:

ORM performance
http://www.qcodo.com/issues/view.php/43/lastpage

QListBox enhancements
http://www.qcodo.com/issues/view.php/44/lastpage


All the Best,
tronics

#7  |  Mike Ho (Sunnyvale, CA) United States of America Qcodo Administrator
Monday, January 4, 2010, 8:40 AM PST

Tronics... thanks for your posts and your analysis.

Yes, I agree that we should do a better job with the metacontrol codegen templates for QListBox in that it should try and only save the PK Column's data value as the listboxitem's value instead of the entire object.  Plus, it makes sense to use QueryArrayCached for the qlistbox creates in the metacontrols.

However, for the other performance issues that you mention, I would like to suggest an alternative approach.

Your Geschaefte table alone is 140 columns -- which is a very, VERY wide column.  That, combined with the likely case that you have many many rows, will undoubtedly cause slow downs with any ORM-type of framework (Qcodo, as you mention Doctrine, etc.)

The value of Qcodo and its approach to application development is that it has a primary concern/focus of helping to keep a clean, consistent architecture throughout your code and thus should minimize maintenance issues in the long term life of your application.  But what is implicit in that premise is that the data model stay agile, clean and consistent as well.

Having tables that are very large with millions or billions of rows is unavoidable, especially for very large applications.  Things like Qcodo's encouragement of proper use of indexes and codegenerating loadarray shortcuts on those indexes helps the developer to ensure that their database is as optimal as possible.

However, having tables that are very wide (e.g. 140 columns) does impede on the “agile data model” premise quite a bit.  But the good news is that this approach is avoidable... in fact, I will always manage to refactor a data model if I start to see tables getting beyond 15 or 20 columns.

I believe that with a properly factored, “agile” data model and the right use of indexes (which Qcodo definitely encourages), a lot of the performance issues you are seeing can be avoided and/or minimized.  Plus, you get the added benefit of continuing to allow for a clean, consistent, agile codebase that can be easily augmented and maintained.

On the flip side, when you get to tables that have millions of rows and 40+ columns, then you almost HAVE to start using custom PHP in order to have a system optimal-enough to be usable.  But, when that happens, you lose out on the consistency and the ease-of-maintenance aspects of the framework.

Hope that makes sense... please post with any additional thoughts or comments.

#8  |  tronics (VIE, AUT) Austria
Monday, January 4, 2010, 7:24 PM PST

Hello Mike,


Thank you.
Yes you are totally right that the datastructure used in that application should be improved.
I posted it because it was the first application where I found this issue and I was unsure if it was only having to do with my setup.

--

But then I found actually the performance issue has nothing to do with that structure, it also shows with your recommended sample database.

Akrohn tested with just 4000 rows (which is not very much) with the projects table of the sample database and that took 5 seconds.
Avoiding GetColumn() calls in InstantiateDbRow() did get him down to 3.2 seconds

And he had the same issues that make an application too slow for nowadays standards just because of one generated QListBox.
Or better to say, not because of the QListBox but because of one ->LoadAll() statement

Quote from Akrohn:


Generated around 4000 rows for the projects table 
of the sample database and was curious what the 
profiler would say. 

--

Hey, I've got to add, that it needed 21 seconds 
only because the xdebug-profiler was enabled, without 
it needs only around 5 seconds.

Avoiding GetColumn() calls in InstantiateDbRow() did 
get me to 3.2 seconds for a 6 column table. 
So this is a lot for such a little change.

Related to this is avoiding of "Select *" discussion, 
that should enhance performance too.

--

Apart from the problem that ->LoadAll() is slow, there should be a different UI element to address large datasets.

QListBox containing more then say 100 entries, cannot be selected or sorted by the user from the UI aspect plus without the performance issue above.

So giving QListBox as the default is a choice that will make the developer thiknk it is ok to start with that and to use ->SelectedValue and having an application rely on QListBox.

There is no indication on the code or website, that this is not a good choice if you are going to have more than 100 rows . But there is probably no application that is intended to have less then 100 rows.

I'm sure there are workarounds for example to use a Qdialog/Qdatagrid combination or pagination.
Qcodo does not give any indication how to use that.

Most useful would be to have optimal UI elements autogenerated in the drafts already.


Cheers,
tronics

#9  |  Mike Ho (Sunnyvale, CA) United States of America Qcodo Administrator
Tuesday, January 5, 2010, 7:51 AM PST

Yes -- definitely, the code generated templates for the metacontrol involving listbox creating should and will definitely be updated to be made more efficient.

I see that you've already opened up an issue ticket for this... thanks!

#10  |  Fernando Lordán (Barcelona, CAT, Spain) Spain
Tuesday, January 5, 2010, 8:25 AM PST

The standard QFormStateHandler is not usable in pages with big controls or lots of controls, for the reason you're exposing. Use the QFormSessionStateHandler or QFormFileStateHandler, so the entire state is not transfered to the client but cached ni the server.

The GRails control you reference is not a listbox, it's clearly an autocomplete box. Not the same functionality. And don't trust in the fantastic performance seen in a demo, as it's done with just a couple of records.

I faced the same problem you're facing, and I'll explain the solution I implemented:

For the QQuery part, optimize the database (indexes, number of records per table, number or fields per record...). Currently there's no other way.

For the HTML part, a listbox with tenths of values is simply not usable (I can't imagine how does it work with thousands of them). This is a bit complicated to solve in a generalized manner, but it's doable (in fact I did it for my own framework):

- Implement popup pages that serve only for record selection (using a data grid that allows a JavaScript action when you click on a row, for instance).

- Implement a listbox replacement control that simlpy opens that popup and receives the result.

The complex part is that the popup must return a value or some values, so the control must specify in the URL it opens where to put them, and the popup must put that values in the parent window when they're selected. It's a bit hard to do it right, cross browser and in a generalized manner, but we've faced more complicated JavaScript wars.

Maybe it's less complicated if you use a panel or a QDialogBox instead of a popup. I haven't tried those.



Copyright © 2005 - 2012, Quasidea Development, LLC
This open-source framework for PHP is released under the terms of The MIT License.