
EACH IMAGE MATTERS, EVEN AMONG MILLIONS: SCALING UP QOE-DRIVEN DELIVERY OF IMAGE-RICH WEB APPLICATIONS

BY PARVEZ AHAMMAD

It comes as no surprise that in image-rich web applications, every single image matters in defining the quality of experience (QoE) for the end user. So how does one offer individually-tuned settings for optimal image delivery while scaling up to millions of images across the entire web delivery pipeline? There's some fun math that goes into answering this question; at Instart Logic, we call it SmartVision technology. Today, aligned with the public release of our first formal academic publication describing SmartVision technology, let me use this blog post to explain the basic ideas behind the technical core of this technology and how it enables optimized delivery of image-rich web applications as a whole, while selecting individually-tuned settings for each image within a given web application.

Intuitively speaking, the key to optimal delivery of an image is to have a content-dependent signature (or hash code) for computing the impact of web delivery on the given image, and to use that signature to prioritize the various constituent parts of the image file. In our work, we developed a simple computational signature that captures the impact of the web delivery pipeline on image quality; we call it VoQS (variation of quality signature). In our experiments, we also discovered that large corpora of images can be effectively split into coherent clusters based on VoQS similarity. Taken together, these two simple insights combine into an efficient algorithmic approach, SmartVision, for finding adaptive settings for each individual image delivered via a web delivery service.
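To make the signature idea concrete, here is a minimal illustrative sketch (not the paper's exact VoQS formulation): probe each image with a fixed set of delivery settings and record how much quality degrades under each one. The quantization steps, the MSE-based distortion measure, and the toy "images" below are all stand-ins chosen for illustration.

```python
# Hypothetical VoQS-style signature: per-setting distortion vector.
# An "image" is a flat list of pixel intensities; each delivery setting
# is simulated by quantizing pixel values with a given step size.

QUANT_STEPS = [1, 4, 16, 64]  # stand-in for real delivery settings

def quantize(pixels, step):
    """Simulate a lossy delivery setting by quantizing pixel values."""
    return [round(p / step) * step for p in pixels]

def mse(a, b):
    """Mean squared error between two equal-length pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def voqs(pixels):
    """Signature: distortion incurred under each delivery setting."""
    return [mse(pixels, quantize(pixels, s)) for s in QUANT_STEPS]

flat = [128] * 64                            # flat region: quantizer-friendly
busy = [(i * 37) % 256 for i in range(64)]   # busy region: distorts readily
sig_flat, sig_busy = voqs(flat), voqs(busy)
```

Two images with different content produce different signatures, which is exactly what makes the signature useful for grouping images by how they respond to the delivery pipeline.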
For technical details on the algorithm and experimental results on empirical datasets, please see the academic publication that we are presenting today at the ACM (Association for Computing Machinery) Multimedia Conference. While there is a large body of research on the topics of image categorization and computer-vision-based image content analysis, our paper is, to our knowledge, one of the first publications that directly addresses quality-dependent image categorization in the context of web delivery.

The following flowchart shows how the SmartVision algorithm works:

As you can see in the flowchart, the categorization part can be done offline (with intermittent updates) to adapt to a changing image corpus pooled across the web delivery service. The real-time aspect simply depends on efficient computation of VoQS and a nearest-neighbor lookup against the pre-stored exemplars that were estimated during the offline categorization step. While message-passing algorithms such as Affinity Propagation [Frey & Dueck, 2007] offer the advantage that one doesn't need to pre-specify the number of expected clusters, and they produce cluster-specific exemplars as a side product, the algorithmic complexity of Affinity Propagation makes it impractical for very large image datasets (such as the ones we encounter with our Software-Defined Application Delivery service). In scenarios where the image corpus is very large, one can use faster algorithms such as K-means (with appropriate care and safety checks) for clustering, and choose the image exemplars by minimizing aggregate distance in the VoQS metric space. It is worth noting that the entire algorithmic flow (including the categorization aspect) happens in an unsupervised fashion, so it is highly amenable to automation in the context of an always-on web delivery service. In our experiments, we found that we could quickly find optimal delivery thresholds for a large corpus of images while minimizing the loss of visual quality (see Figure 3 in our ACM Multimedia paper). In addition, our approach is not dependent on any particular image format; thus, we can apply it to most of the popular image formats used by the web community.
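The offline/online split can be sketched in a few lines (an illustrative toy, not our production code): offline, each cluster keeps one exemplar, the member that minimizes aggregate distance to its cluster-mates; online, a new image's signature is matched to the nearest exemplar, whose pre-computed delivery settings are then reused. The toy clusters below are hand-made for illustration.

```python
# Offline: exemplar selection per cluster. Online: nearest-neighbor lookup.

def dist(u, v):
    """Euclidean distance between two signature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def pick_exemplar(cluster):
    """Offline step: member with smallest aggregate distance to the rest."""
    return min(cluster, key=lambda c: sum(dist(c, m) for m in cluster))

def nearest_exemplar(sig, exemplars):
    """Real-time step: one distance computation per stored exemplar."""
    return min(exemplars, key=lambda e: dist(sig, e))

# Toy corpus: two obvious groups of signature vectors.
cluster_a = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05]]
cluster_b = [[5.0, 5.1], [5.1, 5.0]]
exemplars = [pick_exemplar(cluster_a), pick_exemplar(cluster_b)]

new_sig = [4.9, 5.2]                       # signature of a newly seen image
best = nearest_exemplar(new_sig, exemplars)  # matches the second group
```

Because the online path touches only the small exemplar set rather than the whole corpus, the per-image cost stays constant as the corpus grows.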

At Instart Logic, we use the SmartVision algorithmic pipeline in two related but different contexts. One application scenario (termed True Fidelity Image Streaming) is to divide the image into parts such that the bits of the image file most relevant to the user's quality of experience (QoE) are delivered up-front in a first pass. This quick first pass allows an image-rich web application to load quickly and delivers fast user interaction. Meanwhile, Instart Logic's client-cloud architecture continually works in the background to enable a seamless backfill so that the remaining details are incorporated into the image quickly, without impacting the interaction time, while ensuring that the full quality of the original image is delivered. (Note, though, that such a streaming approach requires the user to have our thin JavaScript-based client, Nanovisor.js, running in their web browser.)
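At its core, the streaming scenario is a split of the image file at a per-image threshold, as in this minimal sketch (it assumes a progressive encoding where an early prefix of the file already renders a usable preview; the byte counts are made up for illustration):

```python
# Two-pass delivery: serve a prefix up-front, backfill the rest later.

def split_for_streaming(image_bytes: bytes, threshold: int):
    """Split an image file at a per-image threshold (chosen by SmartVision)."""
    first_pass = image_bytes[:threshold]  # delivered up-front for a fast paint
    backfill = image_bytes[threshold:]    # streamed later by the client
    return first_pass, backfill

img = bytes(range(256)) * 4               # stand-in for a 1 KB progressive image
head, tail = split_for_streaming(img, threshold=300)
restored = head + tail                    # client reassembles full quality
```

The interesting work is in choosing `threshold` per image, which is exactly what the VoQS-based categorization provides; the split itself is trivially cheap at serving time.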
So what can you do when the client isn't installed on the target device, as is the case with a native mobile application? For users who do not have an environment that can run our JavaScript client, we can use the SmartVision technology to automatically determine the optimal threshold on the server side, and send just the part of the image file that delivers a good QoE compared to the original. In congested mobile networks, for users with low-complexity devices, or in other scenarios where network footprint comes at a premium, such a server-side approach can deliver a dramatic improvement in web application interactivity without significantly sacrificing the visual QoE. We term this application scenario Image Transcoding with SmartVision. This approach allows us to improve application delivery performance through a server-side transformation.
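The server-side threshold selection can be sketched as follows. The `predicted_quality` curve here is a made-up concave function of the byte fraction delivered, standing in for a measured per-image (or per-exemplar) rate-quality relationship; in practice that curve would come from the VoQS-based analysis, not from a formula.

```python
# Pick the smallest byte prefix whose predicted quality meets a QoE target.

def predicted_quality(fraction_of_bytes):
    """Hypothetical stand-in for a measured rate-quality curve."""
    return 1.0 - (1.0 - fraction_of_bytes) ** 3  # diminishing returns

def choose_threshold(total_bytes, target_quality, steps=20):
    """Smallest prefix (at step granularity) meeting the quality target."""
    for k in range(1, steps + 1):
        frac = k / steps
        if predicted_quality(frac) >= target_quality:
            return int(total_bytes * frac)
    return total_bytes  # no prefix suffices: send the whole file

cutoff = choose_threshold(total_bytes=100_000, target_quality=0.95)
```

Because quality typically rises steeply in the early bytes of a progressive encoding, a substantial fraction of the file can often be withheld while still meeting the target.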
For further technical details and empirical experimental results, click on this link to access our ACM Multimedia publication.
