ics-parser: (Overly?) Slow parsing

  • PHP Version: 7.1.16
  • PHP date.timezone: UTC
  • ICS Parser Version: 2.1.5
  • Windows/Mac/Linux

Description of the Issue:

Parsing the attached exchange.ics.txt takes about 3 s. For my use case this is far too slow, but I have no idea whether this is considered “normal” for this library. Before I download and configure Xdebug (PHP is not my home turf) to see where those milliseconds are lost, I would like to know how you assess this case.

  • File size: 160KB
  • Number of events: 242
  • When expanded with span 1: 512

Steps to Reproduce:

<?php
date_default_timezone_set("UTC");
require_once "./vendor/autoload.php";

use ICal\ICal;

$calendarString = "";

try {
    $ical = new ICal(array(
        "defaultSpan"                 => 1,     // Default value: 2
        "defaultWeekStart"            => "MO",  // Default value
        "disableCharacterReplacement" => false, // Default value
        "skipRecurrence"              => false, // Default value
        "useTimeZoneWithRRules"       => false, // Default value
    ));
    $millis = getMillis();
    $ical->initString($calendarString);
    echo "ical->initString() took " . (getMillis() - $millis) . "ms\n";
    $events = $ical->events();
    echo "calendar contains " . sizeof($events) . " events (recurring span: 1y)\n";
} catch (\Exception $e) {
    die($e);
}
function getMillis() {
    return round(microtime(true) * 1000);
}

I pasted the content of the file exchange.ics.txt into $calendarString.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

Before I continue allow me to explain my use case.

I run a script that is fed remote iCalendar URLs (Google, iCloud, Office 365, etc.) through a query string parameter. It loads and parses that calendar and returns a new calendar containing only the events from yesterday, today, and tomorrow. It’s a kind of calendar filter or reducer.

I like your library a lot; it does so many things right. However, looking at a “clean” call graph, it’s obvious that its design doesn’t fit my requirements well.

[Screenshot of the call graph, 2018-09-21]

I can understand that in many cases it makes sense to first parse the entire calendar and build an internal representation. It is unfortunate, though, that in eventsFromRange the library first instantiates a new Event for every event and only then reduces the collection to the desired range. For my case it would be best to reduce the list of events during parsing, e.g. before processDateConversions, as that is also costly.

The challenge is that before you do all the tricky date, time, and time zone computations, you don’t really have “good” data for discarding events based on a range predicate. If you allow the optimization to be fuzzy, though, you can achieve a significant performance boost. Idea:

  • while parsing, keep only events from -2 to +5 days, ignoring time and time zone (I am still playing with the exact numbers; the window just has to be large enough)
  • use eventsFromRange later to reduce precisely to -1d to +1d

I have a working prototype that does the following.

In the constructor:

$now = time();
$this->min = $now - 172800; // -2d
$this->max = $now + 432000; // +5d

At the end of initLines:

...
        $this->trim();
        $this->processDateConversions();
    }
}

private function trim()
{
    $events = (isset($this->cal['VEVENT'])) ? $this->cal['VEVENT'] : array();
    if (empty($events)) {
        return;
    }
    foreach ($events as $key => $anEvent) {
        // Drop events whose start date is missing, invalid, or outside the coarse window.
        if (!isset($anEvent['DTSTART']) || !$this->isValidDate($anEvent['DTSTART']) || $this->outOfRange($anEvent['DTSTART'])) {
            unset($events[$key]);
            $this->eventCount--;
        }
    }
    $this->cal['VEVENT'] = $events;
}

private function outOfRange($dtstart)
{
    // TODO handle date with time zone id e.g. DTSTART;TZID=US-Eastern:19980119T020000
    $start = strtotime(explode("T", $dtstart)[0]);
    return $start < $this->min || $start > $this->max;
}
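
One way the TODO above could be handled is sketched below as a standalone function, so it can be tried outside the library. The function name isOutsideCoarseWindow and its parameters are hypothetical and not part of ics-parser. The sketch assumes the DTSTART string may still carry a parameter prefix such as TZID=US-Eastern: before the value, and it interprets the date part as UTC midnight, which matches the prototype only when date.timezone is UTC.

```php
<?php
// Hypothetical helper, not part of ics-parser: coarse window check on a DTSTART string.
// Accepts a bare value ("19980119T020000", "20180921") or a value that still
// carries its parameters ("TZID=US-Eastern:19980119T020000").
function isOutsideCoarseWindow(string $dtstart, int $min, int $max): bool
{
    // Keep only what follows the last colon, which drops any TZID=... prefix.
    $pos = strrpos($dtstart, ':');
    $value = ($pos === false) ? $dtstart : substr($dtstart, $pos + 1);

    // Use only the date digits (YYYYMMDD); time and zone are ignored on purpose.
    if (!preg_match('/^(\d{4})(\d{2})(\d{2})/', $value, $m)) {
        return true; // unparsable start: discard, as the prototype does for invalid dates
    }

    // Assumption: treat the date as UTC midnight.
    $start = gmmktime(0, 0, 0, (int) $m[2], (int) $m[3], (int) $m[1]);

    return $start < $min || $start > $max;
}

// Same window as in the prototype's constructor.
$now = time();
$min = $now - 172800; // -2d
$max = $now + 432000; // +5d

var_dump(isOutsideCoarseWindow('TZID=US-Eastern:19980119T020000', $min, $max)); // bool(true)
```

Because the window is deliberately fuzzy, ignoring the zone here only shifts the boundary by a few hours; the precise cut is still left to eventsFromRange. If the trimming ever ran before recurring events are expanded, events carrying an RRULE would presumably have to be exempt from this check, since their DTSTART can lie far in the past.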

WDYT? How would such a trimming-parser mode have to be implemented to stand a chance at being added to the library?