
Posts

Showing posts from March, 2015

Ignoring duplicate inserts with Postgres when processing a batch

I'm busy on a project that involves importing fairly large datasets of about 3.3GB at a time. I have to read a CSV file, process each line, and generate a number of database records from the results. Users are expected to be able to rerun batches, and there is overlap between datasets: for example, the dataset of "last year" overlaps with the dataset of "all time". This means we need an elegant way to handle duplicate inserts. Checking whether a record exists (by PK) is fine until the row count in the table gets significant. At just over 2 million records it was taking my development machine 30 seconds to process 10,000 records, and that time steadily increased as the row count grew. I had to find a better way to do this and happened across the option of using a database rule to ignore duplicates. With the rule in place there is a marked improvement in performance, as I no longer need to search the database for a r
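As a rough illustration of the rule approach, here is a minimal sketch that prints the SQL for such a rule so it can be piped into psql. The helper name ignore_duplicates_rule_sql, the table name import_rows and the key column id are all hypothetical placeholders, and the rule shown is one common way to write it, not necessarily the one used on this project.

```shell
#!/bin/sh
# Hypothetical helper: prints a Postgres rule that silently drops an INSERT
# when a row with the same primary key value already exists.
# $1 = table name, $2 = primary key column (both are placeholders).
ignore_duplicates_rule_sql() {
  local table="$1" pk="$2"
  cat <<SQL
CREATE OR REPLACE RULE ${table}_ignore_duplicates AS
    ON INSERT TO ${table}
    WHERE EXISTS (SELECT 1 FROM ${table} WHERE ${table}.${pk} = NEW.${pk})
    DO INSTEAD NOTHING;
SQL
}
```

It could be applied with something like: ignore_duplicates_rule_sql import_rows id | psql -d my_database (the database name is an assumption). From PostgreSQL 9.5 onwards, INSERT ... ON CONFLICT DO NOTHING covers the same case without a rule.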

Adding info to Laravel logs

I am coding a queue worker that handles some pretty large (2GB+) datasets, so I wanted some details in my logs that vanilla Laravel didn't offer. Reading the documentation at http://laravel.com/docs/4.2/errors wasn't much help until I twigged that I could manipulate the log object returned by Log::getMonolog(). Here is an example of adding memory usage to Laravel logs. In app/start/global.php make the following changes:

    Log::useFiles(storage_path().'/logs/laravel.log');
    $log = Log::getMonolog();
    $log->pushProcessor(new Monolog\Processor\MemoryUsageProcessor);

You'll find the Monolog documentation on the repo

Support for Postgres broken in HHVM 3.6.0

On my desktop machine I run my package upgrades every day. The other day my Hiphop version got updated to 3.6.0 and suddenly my Postgres support died. Running Hiphop gave a symbol not found error in the postgres.so file ( undefined symbol: _ZTIN4HPHP11PDOResourceE\n ), exactly like the issue reported on the driver repository (here). I tried to recompile the Postgres driver against Hiphop 3.6.0 but hit a number of problems, mostly to do with hhvm-pgsql-master/pdo_pgsql_statement.cpp, it seems. The fix for the incompatibility was, unfortunately, rolling back to my previous version of Hiphop. To do this on Mint/Ubuntu:

1. Run cat /etc/*-release to get your release information.
2. Download the appropriate package for your distro from http://dl.hhvm.com/ubuntu/pool/main/h/hhvm/
3. Remove your 3.6.0 installation of hhvm: sudo apt-get remove hhvm
4. Install the package you downloaded: sudo dpkg -i <deb package>

After that everything should be installed properly and y
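The remove-and-reinstall steps above can be sketched as a small script. The function names run and rollback_hhvm, the DRY_RUN switch and the file name hhvm_older.deb are hypothetical. The final apt-mark hold step is an addition that is not part of the original steps: it is one way to stop the daily upgrade from pulling 3.6.0 back in.

```shell
#!/bin/sh
# Sketch of the rollback. With DRY_RUN set to a non-empty value the commands
# are printed instead of executed, which makes it easy to check them first.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = path to the older hhvm .deb downloaded from dl.hhvm.com
rollback_hhvm() {
  deb="$1"
  run sudo apt-get remove hhvm &&
  run sudo dpkg -i "$deb" &&
  # Assumption, not from the original steps: pin the package so the next
  # upgrade does not reinstall the broken version.
  run sudo apt-mark hold hhvm
}
```

For example, DRY_RUN=1 followed by rollback_hhvm hhvm_older.deb prints the three commands without touching the system. Once a fixed release is available, sudo apt-mark unhold hhvm lets upgrades resume.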