Despite the rapid increase in mobile web traffic, page loads still fall short of user performance expectations. Numerous solutions have attempted to optimize web performance; however, these state-of-the-art techniques are either ineffective or impractical in real-world settings due to complexity or deployment challenges. Inspired by mobile apps, which provide faster user-perceived performance, we target two significant bottlenecks in the page load process. First, clients must endure multiple round trips and server processing delays to fetch a page's main HTML; during this time, the browser cannot display any visual content, which frustrates users. Second, pages include large amounts of JavaScript code in order to offer users a dynamic experience. These scripts often make pages slow to load, partly due to a fundamental inefficiency in how browsers process JavaScript: execution fails to leverage the multiple CPU cores that are readily available even on low-end phones. We pursue a programmatic approach that works on legacy web pages and unmodified browsers. We built two fully automatic, complementary systems, Fawkes and Horcrux, each of which optimizes one of the above contributors to slow page loads. Fawkes leverages a finding from our measurement study: 75% of HTML content remains unchanged across page loads spread 1 week apart. With Fawkes, web servers extract static, cacheable HTML templates (e.g., layout templates) for their pages offline. Upon a client request, the server immediately returns the static template, which the browser renders while the server generates the dynamic content (e.g., news headlines); that content is expressed as the updates required to transform the rendered template into the latest version of the page.
Fawkes reduces the startup delays incurred while fetching a page's HTML and improves interactivity metrics such as Speed Index and Time-to-first-paint by 46% and 64%, respectively, at the median in warm cache settings; the corresponding improvements are 24% and 62% in cold cache settings.
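To make the template-plus-updates idea concrete, the following is a minimal sketch under simplifying assumptions, not Fawkes's actual algorithm. It treats a page as a flat list of HTML tokens rather than a DOM tree, derives a template from the tokens shared by two snapshots, and encodes the latest page as a list of edits against that template. The function names (`extract_template`, `make_patch`, `apply_patch`) and the sample pages are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def extract_template(old_page, new_page):
    """Return the tokens shared, in order, by two snapshots of a page."""
    matcher = SequenceMatcher(a=old_page, b=new_page, autojunk=False)
    template = []
    for block in matcher.get_matching_blocks():
        template.extend(old_page[block.a:block.a + block.size])
    return template


def make_patch(template, page):
    """Record the edits that turn the template into the given page."""
    matcher = SequenceMatcher(a=template, b=page, autojunk=False)
    return [
        (tag, i1, i2, page[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(template, patch):
    """Apply the edits to a copy of the template (client-side step)."""
    result = list(template)
    # Walk the edits back to front so earlier indices remain valid.
    for _tag, i1, i2, content in reversed(patch):
        result[i1:i2] = content
    return result


# Hypothetical snapshots of the same page taken a week apart.
last_week = ["<html>", "<body>", "<h1>News</h1>",
             "<p>Headline A</p>", "</body>", "</html>"]
today = ["<html>", "<body>", "<h1>News</h1>",
         "<p>Headline B</p>", "</body>", "</html>"]

template = extract_template(last_week, today)  # computed offline, cacheable
patch = make_patch(template, today)            # computed per request
latest = apply_patch(template, patch)          # reconstructed on the client
```

In this sketch the template can be sent and rendered before the patch exists, which mirrors the ordering described above; a real system would additionally have to handle DOM structure, scripts, and rendering state, all of which this example ignores.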
Our second system, Horcrux, addresses client-side computation overheads. On the server side, it analyzes all of a page's JavaScript code offline to conservatively identify how that code accesses page state across all loads of the page. On the client side, Horcrux's JavaScript scheduler uses this information to judiciously parallelize JavaScript execution while ensuring correctness, accounting for both the non-determinism intrinsic to web page loads and the constraints imposed by the browser's API for parallelism. Horcrux reduces median browser computation delays by 31-44% and page load times by 18-37%.
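As an illustration of how state-access information can drive safe parallelization, the sketch below groups functions into stages such that functions within a stage touch disjoint state and conflicting functions keep their original order. It is a simplified model under stated assumptions, not Horcrux's scheduler: the `Signature` type, the read/write sets, and the function names are hypothetical, and the sketch ignores non-determinism and browser API constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """Hypothetical summary of the page state a JavaScript function touches."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(a, b):
    """Two functions conflict if either writes state the other reads or writes."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


def schedule(signatures):
    """Group functions (given in original execution order) into stages.

    Functions in the same stage do not conflict and could run in parallel;
    a function is always placed in a later stage than any earlier function
    it conflicts with, so the original ordering of conflicting work is kept.
    """
    levels = []
    stages = []
    for i, sig in enumerate(signatures):
        level = 0
        for j in range(i):
            if conflicts(signatures[j], sig):
                level = max(level, levels[j] + 1)
        levels.append(level)
        if level == len(stages):
            stages.append([])
        stages[level].append(sig.name)
    return stages


# Hypothetical signatures for four functions on a page.
page_functions = [
    Signature("initCart", writes=frozenset({"cart"})),
    Signature("loadAds", writes=frozenset({"ads"})),
    Signature("renderCart", reads=frozenset({"cart"}),
              writes=frozenset({"dom.cart"})),
    Signature("trackAds", reads=frozenset({"ads"})),
]

plan = schedule(page_functions)
```

Here `initCart` and `loadAds` land in the first stage because they touch unrelated state, while `renderCart` and `trackAds` are deferred to the second stage because each reads state written earlier. Being conservative about the read and write sets can only add conflicts, which costs parallelism but not correctness.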