Improving Applications with Browser Extensions

Nowadays many applications and services are accessed with a browser. This is good for users because the interface is familiar. It is good for developers too, because their applications are accessible from any device or operating system where a browser is available.

A less well-known advantage of browsers is that they make it possible to improve an application without modifying the application's own software in any way. This customization is done using browser extensions, the subject of this post. Extensions are also known as add-ons and are often confused with plugins. Before going into any technicalities, let me tell you two stories.

Story #1 – Improving existing functions

This is a hypothetical story, with you as the hero. You often complain about a well-known enterprise application used in your company. Let’s call this application TLA (the famous Three Letter Acronym). Every few weeks you must pick facts and figures from TLA to write a report. The information is scattered across multiple pages, requiring much clicking and copy-pasting to get the work done. The process is repetitive, time-consuming, boring, and error-prone. What you want from TLA is easier access, with all the information consolidated into one page.

Let’s look at two different solutions. First, there is the bureaucratic way: you make a request for improvement to the vendor of the software, then you pray. Second, there is the do-it-yourself solution. Luckily, the user interface of TLA is not fully proprietary, but is built on standard browser technology. All the data for your reporting is already in the browser. The real problem is getting the relevant information out. Because a browser is programmable, it can be instructed to perform all the necessary steps automatically, quickly, and without errors. If you are really lucky, somebody has already solved a similar problem and has made the solution available as a browser extension.

Many browser extensions are available to read tabular data from websites into a spreadsheet. Most are designed for specific sites, but some are more general. When the data is on multiple pages, some extensions are smart enough to navigate automatically until all pages have been loaded. Maybe the TLA interface is too clumsy for a standard extension to find the data. In this case, it is necessary to create a custom extension. It is usually not a very difficult task. If you cannot do it yourself, you can pay someone to do it for you.

So, our two solutions for improving the TLA user experience can be summarized as “pray or pay”. In the happy case where an existing extension solves the problem, the price is usually small, sometimes even zero. More importantly, “pay” certainly beats “pray” on delivery, since vendors, especially big ones, are unlikely to modify their product for your particular use case.

Story #2 – Providing new functions

Now a real story. Once upon a time, a company had people spending many hours every single day reviewing hundreds and hundreds of web pages, hunting for very specific information. The really interesting information could be found only on a small subset of websites, and then had to be handled by following complicated guidelines. A large part of the job was mechanical and repetitive, but identifying the relevant information required specialized knowledge. The whole process was time-consuming, and days frequently ended with a backlog of unfinished work. As part of a larger solution for the company, PENTAG created a custom browser extension to partially automate the task. Thanks to that simple solution, the time required by the task was massively reduced and the experts could now apply their talent to other important activities. For the company it was a significant increase in productivity at a modest price.

How does it work?

A browser is a complex and sophisticated product, which provides numerous services to applications, like message passing between unrelated websites, local database storage, and much more. Applications use these services through well-defined Application Programming Interfaces (APIs).

Let’s try a simplified and non-technical description of the internals of a browser. Each domain runs in a private execution environment in the browser. (A domain corresponds roughly to a website.) Usually, multiple domains are present in the browser at the same time. Each one has its own environment, strongly separated from others. The browser itself runs in its own protected environment, from which it can supervise and provide services to all others. Browser extensions can be installed in the browser by the user and each one runs in yet another separate environment. Extensions request services from the browser through various APIs and can modify the behavior of websites loaded in the browser, but in a very controlled manner.

Concretely: how can an extension save a web page as a spreadsheet? The extension must explicitly declare its intentions to the browser in a file called a manifest. In our case, it tells the browser to perform a specific action whenever the user, in any window, presses the Shift and F7 keys together (this is just an example). The action is a piece of JavaScript code provided by the extension, which is executed in the private environment of a window on each occurrence of the Shift+F7 event. The code uses something called the DOM API to inspect the web page. When it finds a table, it extracts all rows into lines of text, separating cell content with commas (and properly dealing with any comma or quote present in the data). The extension then uses the Download API and prompts the user to save the captured text into a CSV file or to cancel the action. Job done.
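For the curious, here is a minimal sketch of the "rows into lines of text" step, written in TypeScript. It is only an illustration under simple assumptions: the function names escapeCsvCell and rowsToCsv are mine, and the rows are given as plain arrays of strings rather than read from a real page through the DOM API. A cell is wrapped in double quotes only when it contains a comma, a quote, or a line break, and any quote inside it is doubled.

```typescript
// Hypothetical helper: quote a cell only when it contains a comma,
// a double quote, or a line break; inner quotes are doubled.
function escapeCsvCell(cell: string): string {
  if (/[",\r\n]/.test(cell)) {
    return '"' + cell.replace(/"/g, '""') + '"';
  }
  return cell;
}

// Hypothetical helper: turn rows of cell text into CSV text,
// one line per row, cells separated by commas.
function rowsToCsv(rows: string[][]): string {
  return rows
    .map((row) => row.map(escapeCsvCell).join(","))
    .join("\r\n");
}

// Example input standing in for the text of a table found on a page.
const exampleRows: string[][] = [
  ["Name", "Comment"],
  ["Smith, John", 'said "hello"'],
];
const exampleCsv: string = rowsToCsv(exampleRows);
```

In a real extension, the rows would come from the table elements of the page, and the resulting text would be handed to the browser's download mechanism so that the user can choose where to save the file.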

Security and installation

Extensions are not the same as browser plugins. Plugins are no longer used. Some modern browsers do not even support them any more, because they were a constant source of security bugs. Remember Flash? Plugins are binary programs which cannot be properly supervised by the browser. In contrast, an extension comes in source form, can be verified and reviewed, and can only perform actions through well-defined APIs. Most importantly, an extension must request explicit permissions in its manifest, and these permissions are clearly described to the user during installation. As with smartphone apps, security-conscious users do not install extensions which require incomprehensible permissions. Which brings us to installation: where do you find a browser extension?
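Before turning to stores, a short aside on what "explicit permissions" can look like. A real manifest is a JSON file; the sketch below writes a hypothetical one as a TypeScript object so that a small check can run against it. The extension name, the ManifestSketch type, and the unexpectedPermissions function are assumptions made for illustration. The permission names "activeTab" and "downloads" follow the naming used by Chrome-style extension manifests.

```typescript
// Hypothetical, trimmed-down view of a manifest. A real manifest.json
// has more fields; only a few well-known ones are shown here.
interface ManifestSketch {
  manifest_version: number;
  name: string;
  version: string;
  permissions: string[];
}

const manifest: ManifestSketch = {
  manifest_version: 3,
  name: "Table to CSV (example)",
  version: "1.0.0",
  // Capabilities the extension declares up front.
  permissions: ["activeTab", "downloads"],
};

// Return the declared permissions that are not in the list the
// reviewer expected, the kind of check a cautious user does by eye.
function unexpectedPermissions(m: ManifestSketch, expected: string[]): string[] {
  return m.permissions.filter((p) => expected.indexOf(p) === -1);
}

const surprises: string[] = unexpectedPermissions(manifest, ["activeTab", "downloads"]);
```

An empty result means the extension asks for nothing beyond what the reviewer expected. Anything left over deserves a second look before installing.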

Every browser vendor has a web store where you can get publicly available extensions, free or paid. The extensions you find in a store have undergone a strict review procedure and should be fine from a security point of view, at least in theory. This is the general model. Depending on the vendor, it is possible to set up an enterprise web store to distribute privately developed extensions inside a company. Alternatively, with some vendors, extensions can be published “discreetly” in their public store, without being listed in the catalog. Such extensions are not searchable and can only be found using their exact URL. Yet another variant is the possibility of defining a closed circle of users.

Finally, there are completely private extensions which can be installed using the development tools available in all browsers. Such extensions are the only ones not subject to any review procedure.