I feel like the Page Visibility API should be locked behind a permission toggle, which would ask whenever the website tries to listen to any of those events and would not trigger them until the user agrees to grant the permission (whether or not the user grants it should not be revealed to the website). The ability for a website to see when you click on things outside of the page itself (either by changing active window or tab) is quite unexpected from an end-user perspective, and its use in various "ed tech" websites to detect alleged cheating[1] is frightening, to say the least. Other than that, it's also used in many other anti-user implementations of ads or other content in order to force you to watch them in their entirety.
Edit: If you want to protect yourself from such abuse, try an extension that spoofs the API.
You can try: "Always active window" on Firefox or "Don't make me watch" on Chrome.
There are other ways to detect if a tab/window is in the foreground. One hack is abusing timer slowdown: setInterval() resolution is clamped to 1000ms for background tabs, so you can compare against system time to detect this.
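A rough sketch of that hack (assuming a browser that clamps background timers to about a second):

    let last = performance.now();
    setInterval(() => {
      const now = performance.now();
      // We asked for 100 ms; ~1000 ms gaps suggest a backgrounded tab.
      if (now - last > 900) {
        console.log('tab is probably in the background');
      }
      last = now;
    }, 100);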
I've thought a lot about this api because I've recently both used it as a developer to reduce the resource intensity of my web app and been subject to it as a user taking an online exam.
The way I see it, all the uses where this is "tracking the user" would simply require the permission and all the ones where it's just an optimization would never be used because the permission adds friction.
Like, if I'm setting up an online exam and my requirement is that I need to know if the window is in focus, I won't let students who don't give that permission participate. And if the browsers don't inform me of the permission state and simply don't send the events, I'd scrap the project and force students to use locally installed proctoring software.
On the other hand, if I'm trying to save resources by suspending live updates and heavy rendering when a user tabs away, asking for a "track your activity" permission would cause a lot of friction and user loss (because even just asking for it sounds creepy). Very few users will notice the resource usage, or if they do, will think it's unavoidable, so I simply won't implement those optimisations.
You can't require a permission if the default action is spoofing. This is how I think all privacy-related features should be implemented. If I don't want a website to use cookies, the browser shouldn't let the website know it can't. It should just pretend to allow it, while forgetting the cookie when the browser is closed.
As a side note, I think prompts like 'letting website track your identity' are misleading and infantilizing. Just let the user know what is being accessed, like:
- Do you want this website to know when you left the page? We won't let them know if you don't.
- Do you want to allow this website to save cookies? If you don't, all cookies from this site will be cleared when you close the browser.
At this point, we need to acknowledge that browsing the web safely is essentially an adversarial game against adtech and spyware companies. If we don't fight with all our might, we'll just lose. This isn't 2005 anymore. I too wish we could go back, but we can't.
Hard agree. The proper place for cookie permission management is in the browser. The web as a whole needs to move on from these obnoxious pop-up cookie warnings ASAP.
> The web as a whole needs to move on from these obnoxious pop-up cookie warnings ASAP.
Those obnoxious cookie pop-ups can go away today. No browser intervention necessary.
You know why? They only exist because the greedy industry really wants to collect and sell your private information at scale. No other reason. So the companies could remove those popups today if they cared.
If you move the dialog to the browser, you will have both the browser dialogs and the non-cookie dialogs (because they will still want to fingerprint you, and collect your data, and sell it)
And you could optimize the use of resources further if you knew the user wasn’t sitting at their desk but that doesn’t mean we should open up webcam APIs as permissionless.
It’s good that you’re using the API in a noble way, but many will not, so it should be up to the user and the settings of the browser they choose to optimize the use of their computer’s resources.
You optimize your system as best you can without spying on the user, and the rest is up to them.
The reason we don't do that is because of user privacy. Knowing whether a tab is shown or not has no impact on privacy. Should we add permissions for click events and mouse movements too?
>its use in various "ed tech" websites to detect alleged cheating[1] is frightening, to say the least
Why is this frightening? Proctoring software has long been used and is much more invasive than this, and the alternative is to simply not allow students to use their own devices, or even to take tests remotely. When I was in school, the alternative to this sort of software was an exam proctor sitting behind you and watching your every move to make sure you didn't cheat. I get that outside of this sort of context it's bad, but that doesn't make it bad in the contexts where it actually is appropriate.
(In the case of Canvas) This is done surreptitiously, without the student's knowledge. As far as I know, there is a separate lockdown proctoring browser for actual exams, but there are also other quizzes that do not require the locked-down browser, yet they nevertheless log a bunch of stuff that gets sent to the instructor, who often may not know what it actually indicates and proceeds to accuse students of cheating.
On your second point about the comparison of human proctor and this:
Lots of things are fine to do on a small scale, but become terribly invasive when done on a massive scale with the aid of computers. You may accept that the person driving behind you may take note of your license plate, but would you say the same if someone installed a high-resolution webcam on a pedestrian footbridge to record every single car that went through, forever? The general issue with computer-assisted surveillance is that it is cheaply scalable and nearly impossible to erase.
Lastly, is there really a point in preventing the use of external information aids (textbooks, notes, Google) in an age where information is so widely available? Rote memorization has never been a good way to truly learn things, and maybe we should do away with this tradition of tests based on information reproduction.
I don't follow why you are so opposed to the visibility api. I agree that it's annoying if used to ensure that I watch a certain video, but aside from that it doesn't seem malicious to me.
>they nevertheless log a bunch of stuff that gets sent to the instructor
...what sort of stuff? And how is it related to the visibility api?
> I don't follow why you are so opposed to the visibility api.
It's in the second sentence.
> The ability for a website to see when you click on things outside of the page itself (either by changing active window or tab) is quite unexpected from an end-user perspective
> what sort of stuff? And how is it related to the visibility api?
It's in the linked article. At least a few of those events are implemented with the visibility api.
Good question. I've always assumed that when I take an exam online that I'm granting access to BOAT LOADS of PII (mouse movements, IP address, cookies, my video camera, my video camera as I point it around my room)
I've always chalked this up to "of course you need to be vigilant that I'm not cheating", but with the twist that I hold you to a high ethical bar with the data you collect about me.
Mind you, it's not just about the user explicitly switching away from the tab/window; the Page Visibility API also considers the page to be invisible if you completely cover the viewport by moving another window in front of it. Or, for that matter, if the viewport is owned by an iframe, and that iframe is scrolled out of the bounds of its parent viewport.
The intention behind the Page Visibility API is supposed to be to allow the page to stop bothering to run JS intended solely to calculate DOM updates, if said DOM isn't currently being rendered where the user can see it. Like how "damage region"-based redraw event-pumping worked in native apps before window managers started doing compositing; or how culling works in 3D rendering. The browser asks the OS to notify it whenever the tab's viewport is entirely occluded; and when the browser receives said notification, it translates that to a Page Visibility API event for the tab.
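As an illustration, the honest use boils down to a couple of lines (startUpdating/stopUpdating are hypothetical stand-ins for whatever update loop the page runs):

    document.addEventListener('visibilitychange', () => {
      if (document.hidden) {
        stopUpdating();  // hypothetical: pause the DOM-updating loop
      } else {
        startUpdating(); // hypothetical: resume it
      }
    });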
And the annoying thing about that is that an opt-in in this case would do away with all the subtle improvements to battery life that are the "correct" use-case of this API. It'd be like having to explicitly prompt the user to opt in to allowing the use of WEBMs for displaying animations on pages, while allowing actual .GIF-file animations all the time. Websites that are just websites, not long-use apps, would know that users don't care enough to take the time to understand what's being asked and opt in to using extra APIs with them, so 80% of the time they'd get refused; so, for a "clean experience", they just wouldn't bother to ask, and would instead consider the API verboten and stick with doing the worse, more CPU-intensive thing.
Meanwhile, the malicious use-cases would just ask; because for the malicious use-case, even just those 20% of users who naively default to "yes" to prompts instead of "no" would be enough to ensure they siphon off enough data to sell for sweet, sweet ad-tech dollars. (At least, that's how things seem to be for the similar Background Notification API.)
Effectively, you'd punish honest use-cases of the API (because the devs were only doing it out of the goodness of their hearts, and so a little thing like "80% of users won't benefit" is enough to make those devs not bother), while not really thwarting malicious use-cases (because they're highly incentivized.)
That would be a loss for sure, but I think the dangers of surveillance far outweigh the reduced CPU load from not rendering content sometimes. Wouldn't it be better to have the browser do this instead, by somehow ignoring the rendering when the tab is not visible?
As a side note, can you give a website where this is done? I feel like most modern websites are so bloated that no amount of these subtle optimizations would be better than just having a more minimalistic design, such as not fetching things continuously by default and doing away with unneeded animations.
> Wouldn't it be better to have the browser do this instead, by somehow ignoring the rendering when the tab is not visible?
The browser does already do this. Browsers are already smart enough to not "reflect" DOM updates to currently-hidden parts of the DOM into re-rendering passes. (Which is one reason a lot of people think "virtual DOM" frameworks are silly.)
What the Page Visibility API allows you to save is the expense of running the JavaScript logic that translates into those DOM update calls. (Think: the logic that decides what horrible banner ad to fetch next, actually pre-fetches it, and then updates the DOM to tell the browser it's the ad that should be displayed. Disabling that logic when the banner ad isn't visible => no more background network fetches for new ads.)
This requires an API, because the browser can't just be reaching into a Turing machine (i.e. some arbitrary Javascript program that could really be trying to do anything, given that webapps exist) and randomly shutting parts of it off. The browser needs a clear signal to hand that page-embedded Turing machine to say "if you have a part that's just for the sake of figuring out what to show next on the screen, then please stop running that part for now." That signal is a Page Visibility event.
(Though, this isn't the only way to accomplish that. If said Javascript program has provided a clear rendering API to the browser, that works too! I.e., if the page is pumping its logic using Window.requestAnimationFrame, then IIRC the browser can take as long as it likes to call the requestAnimationFrame callback back — and if the window is occluded, that would be a perfect time for the browser to wait on that. But if you think about it, "requestAnimationFrame taking longer to get called back" exposes basically the same data-leak that the Page Visibility API does, so...)
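A sketch of that alternative pump (updateUI is a hypothetical stand-in for the per-frame work):

    function tick() {
      updateUI();                  // hypothetical per-frame DOM work
      requestAnimationFrame(tick); // the browser simply defers this while the tab is occluded
    }
    requestAnimationFrame(tick);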
> As a side note, can you give a website where this is done?
YouTube, I believe, has independent video and audio streams, precisely so that whenever you leave the viewport occluded for a few seconds, it can stop bothering to fetch the video stream (or can drop the video stream down to the lowest resolution+bitrate version), while continuing to play the previous highest-negotiated-bitrate audio stream.
I agree. I've used Firefox extensions to override these APIs and stop sites like YouTube from pausing playback. The cheating detection thing in proctoring software is pretty disgusting as well.
These web sites are getting way too abusive with the powerful APIs we've given them. It's time to reassert control.
It is important to note that reasserting control has to be done at the browser-vendor level -- trying to spoof APIs with extensions or whatnot, as I mentioned, may be a good mitigation for a specific behavior, but it is often detectable in a way that makes it an excellent fingerprint.
Unfortunately, a browser cannot realistically be controlled by a non-corporate-funded entity (if you don't count a government). Modern web browsers are incredibly complex and the web standards are made in a way that largely prevents the creation of new browsers. Firefox is mostly funded by Google, and Mozilla being the ineffective organization it is, this is unlikely to ever change.
Wow thanks. I'll try that when I can to avoid having random extensions inject scripts. How would you exclude a website for this if it doesn't work properly, especially with blur and focus?
Let's say you want to allow duckduckgo.com to listen to the visibilitychange event but all other sites should be blocked. Then you could use something along the lines of the following sketch (assuming a userscript that runs at document-start):
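    // A sketch; host allow-list and spoofed properties as discussed above.
    const allowed = ['duckduckgo.com'];
    if (!allowed.includes(location.hostname)) {
      // Always report the page as visible...
      Object.defineProperty(document, 'hidden', { get: () => false });
      Object.defineProperty(document, 'visibilityState', { get: () => 'visible' });
      // ...and swallow visibilitychange before page listeners see it.
      window.addEventListener('visibilitychange',
        (e) => e.stopImmediatePropagation(),
        true); // capture phase, so this runs before document-level listeners
    }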
There's also StopTheMadness which can prevent pages from using the visibility API and also removes a whole host of annoying behaviors from “modern” websites.
Another concerning web API is the Battery Status API. See demo[1]. Firefox and WebKit have thankfully not implemented it, but Google refuses to remove it for some odd reason. How it ever came to be boggles the mind; a low-power-mode API would make much more sense. The Battery Status API does not have a user-facing toggle in Chrome, unlike the sensor APIs, let alone actual permissions. Speaking of which, I can't believe the Chrome team still hasn't done anything about the sensor APIs. I've turned it off and see so many analytics tools on websites try to use it. Another surprising browser API that's only in Chrome is the Network Information API - again, permissionless, and it cannot be opted out of.
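For context, reading the battery status takes two permissionless lines in Chromium:

    navigator.getBattery().then((battery) => {
      // Charge fraction (0..1) and charging state, exposed to any page.
      console.log(battery.level, battery.charging);
    });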
A lot of the specs say that user agents must provide users with a way to disable support for certain APIs. CSP reporting is one that many uBO users will recognise. But of course, Google doesn't follow that. And many of the aforementioned APIs are not W3C standards, merely working drafts.
I work in Ed tech and we have gotten requests from schools to implement anti-cheat methods along these lines. We’ve thus far declined.
At least in our case you’re largely only cheating yourself. Our whole goal is to meet you where you are so you can succeed. We level your options to your ability. By cheating, you’re just going to get presented with harder materials and make it harder for yourself than it already was.
It's almost trivial to do the same without this API: focus/blur events, user input from mouse/keyboard. A permission toggle will add nothing.
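For instance, a rough equivalent without the Page Visibility API at all:

    window.addEventListener('blur',  () => console.log('user probably left the page'));
    window.addEventListener('focus', () => console.log('user is back'));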
> its use in various "ed tech" websites to detect alleged cheating[1] is frightening, to say the least
Because? Monitoring what happens during an exam seems normal and reasonable. There is a limit to what is "reasonable" – monitoring webcam and microphone is not IMHO, but seeing if another tab was opened is fairly reasonable.
It's not during an exam. If you read the link referenced, it notes that Canvas can log activity even when in non-proctored tests, for which there is no indication or expectation of this happening. For proctored tests they just record with your webcam and microphone anyways.
It's still during an exam. The expectation of complete privacy of what happens on an exam webpage while you're taking the exam is an unreasonable expectation.
> The ability for a website to see when you click on things outside of the page itself
It doesn't let pages see what you're clicking on outside of the page. That would indeed be a serious privacy violation. It merely tells the page that it's not on-screen any more. Personally I don't see any problem with that.
The person you replied to literally posted an example of it being misused:
> The ability for a website to see when you click on things outside of the page itself (either by changing active window or tab) is quite unexpected from an end-user perspective, and its use in various "ed tech" websites to detect alleged cheating
I once used it (on a personal project) to add a sad-face emoji to the page’s <title> if the page wasn’t visible. It would change to a heart when they returned.
(Edit: since the emoji was prefixed, you could see it on the background tab)
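Roughly like this, presumably (a sketch, not the original code):

    const baseTitle = document.title;
    document.addEventListener('visibilitychange', () => {
      // Prefix, so the emoji is visible on the background tab's title.
      document.title = (document.hidden ? '😢 ' : '❤️ ') + baseTitle;
    });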
Asking for permission increases user-experience friction by a ton, especially since web permissions aren't granted all at once but are page-dependent.
Even something like asking for camera permission isn't obvious all the time. I tried to build a PWA with the camera as the focus, but users weren't always aware of the cam permission. Sometimes it glitched and the cam took a few seconds to load, or didn't load at all.
Page Visibility API is used in tools like React Query, where it will show the page with the existing cached data, but refetch in the background upon a page visibility event. It feels natural for some apps to have this update-on-focus behavior, and it's nicer than polling with setInterval.
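For reference, that behavior is a per-query option in TanStack Query; the query key and fetcher below are made up:

    import { useQuery } from '@tanstack/react-query';

    function Todos() {
      const { data } = useQuery({
        queryKey: ['todos'],        // made-up key
        queryFn: fetchTodos,        // made-up fetch function
        refetchOnWindowFocus: true, // the update-on-focus behavior
      });
      return null; // render omitted
    }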
Intl API is great, but I feel it's somewhat hamstrung because Node doesn't have matching APIs.
Page Visibility is also often abused to force you to watch videos/ads in their entirety without switching to another page. If anyone here suffers from something like this, be sure to check out extensions like "Always active window" on Firefox, which spoofs it to make it look like you are on the page the entire time.
Years ago I used this on my web application to work around a bug/leak. There was an auto update feature that ran on a regular basis. I made a mistake somewhere when replacing page elements that slowly increased memory usage. While developing I would close everything at the end of the day but my users would leave the page up for days. Eventually the small leak would crash the tab.
I "fixed" it by shutting off auto update on page blur then immediately doing an update on focus.
I'm obsessed with stripping out dependencies, so when I'm working on a new project I try to figure out which APIs are commonly supported and use them when I can. What amazes me is when I run into something that seems new to me, only to discover it has been around since Chrome 1.0 or earlier. The JavaScript and DOM APIs are decades old and obscenely huge at this point, you never use all of them on every project, and it's too easy to just go back to what you're already familiar with.
Case in point: Element.insertAdjacentHTML(). Not sure how I missed this pretty basic method, but there it is, helping turn a mess of code I had into a one-liner. Itemscope and itemprop attributes as well. Wow, where did those come from? Oh, like 1999? Did I just forget they were there? Maybe. Or maybe I just never noticed.
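For anyone else who missed it, it looks like this (the element id is made up):

    const list = document.querySelector('#items'); // made-up element
    // Parses and inserts markup as the last child of #items, without
    // round-tripping the element's existing innerHTML.
    list.insertAdjacentHTML('beforeend', '<li>New item</li>');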
That doesn't even count the continuous new APIs being added during each Chromium release, which are now pretty much everywhere. Mind boggling.
If applications are using it for non-nefarious reasons, the fallback behaviour for people like you who block it is to use more of your bandwidth or battery power to render non-visible elements. It's an optimization for most people, so it's still worth implementing even if some clients block the API.
I just tend to wrap my timer callbacks in a requestAnimationFrame, which should delay processing when the page isn’t visible. I guess this is a hack, but it’s been with me for so long now.
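The wrap in question, roughly (render is a hypothetical stand-in; note that callbacks queued while hidden flush together on the next visible frame):

    setInterval(() => {
      requestAnimationFrame(() => {
        render(); // hypothetical expensive DOM update, deferred while the tab is hidden
      });
    }, 1000);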
Intl is the ECMAScript Internationalization API (ECMA-402). Browsers are not the only JS implementations that include it. Node.js does, Deno does, probably others do too.
Looking a bit deeper, there is the reverse option if you have your code registered as an installed PWA, allowing your app to be a share target without being a native app. Could be useful for a couple of things I might tinker with at some point.
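Roughly the manifest entry involved, if I read the Web Share Target draft right (the action route is illustrative):

    {
      "share_target": {
        "action": "/share",
        "method": "GET",
        "params": { "title": "title", "text": "text", "url": "url" }
      }
    }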
I made one of these web-based Android share target things. I found the JS API a little tough but now all of my on-phone apps can share into my custom thing (a personal KB)
Semi-funny story: I was still using Win7 at the time, and the share button from a browser extension crashed itself due to not handling unsupported systems properly.
This is how I learnt that desktop operating systems (Win10 in this case) nowadays have a share feature.
- Page Visibility is fully supported by all browsers except Opera Mini, and has been for a while
- Same with Intl (additionally except KaiOS)
- BroadcastChannel is supported by everyone but Opera Mini and Internet Explorer
- Web Share is only fully supported by Safari, Edge, and some Android browsers (Chrome, Samsung, Firefox, Opera Mobile); desktop Chrome only has partial support because it's limited to Windows and ChromeOS
That means you can use the first 3 rather reliably, while Web Share is a lot more of a crapshoot.
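Which means feature detection is a must for Web Share; a minimal sketch:

    if (navigator.share) {
      // Hands off to the OS share sheet; must be called from a user gesture.
      navigator.share({ title: document.title, url: location.href })
        .catch(() => { /* user cancelled or share failed */ });
    } else {
      // Fall back to e.g. a copy-link button.
    }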
Maybe not an “API” and maybe not lesser known but every developer should be aware of Proxy. Don’t abuse it. But be aware it exists. It’s a beautiful escape hatch for things when you need advanced capabilities but still need to fit an existing object shape.
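A minimal taste of the escape hatch (a sketch):

    const settings = { theme: 'dark' };
    const tracked = new Proxy(settings, {
      get(target, prop, receiver) {
        console.log(`read: ${String(prop)}`);       // intercept every property read
        return Reflect.get(target, prop, receiver); // then behave like the plain object
      },
    });
    tracked.theme; // logs "read: theme", still returns 'dark'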
Don't abuse it? That's half the fun! I made a toy library a while back that allows you to call "any" function you want, and it tries to implement and execute it at runtime based on the name! You can do stuff like this, where it will map and filter a list of nested objects.
WeakMap also works really well for that, because it means you don't add any extra properties or symbols to the object itself. Instead you store it in the WeakMap with the object as a key, and it'll still get garbage collected with the object as if you'd added it to the object. I've found it's occasionally useful when you're trying to get things from different frameworks to interact with each other, where both frameworks want to do things like assigning extra fields or setting up proxies, and then they start conflicting.
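The pattern in miniature (the metadata shape is made up):

    // Per-object metadata without touching the object itself; the entry
    // is garbage-collected along with its key.
    const meta = new WeakMap();

    const node = {};
    meta.set(node, { owner: 'framework-a' }); // made-up metadata
    meta.get(node); // -> { owner: 'framework-a' }, and node has no extra properties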
Good point. I haven't seen an analysis of the tradeoffs of using WeakMap vs Symbol properties. It would be cool if someone knowledgeable did a write-up on that. I imagine that performance could swing either way depending on lots of subtle factors.
Proxy has been abused a lot but always in very interesting ways. ObjectBuffer[0] is a great example, which I also tried to (poorly) reimplement myself.
It creates a new data structure that behaves as an object but is backed by SharedArrayBuffer, in order to support parallel computation over it.
Basically, YouTube sends ad data in a window.ytInitialPlayerResponse variable. My extension loads first and intercepts this variable with a proxy to prevent the window.ytInitialPlayerResponse.adPlacements field from being set.
When I was writing this code, I looked into uBlock Origin filters, so this idea, or maybe even the code, could have been borrowed from there, I don't really remember.
I also override the JSON.parse and Response.json functions, because YouTube can also send ad data with AJAX. One of those might be unnecessary, I'm not really sure.
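Presumably something in this direction (a sketch using a property setter; the variable and field names are the ones mentioned above):

    let actual;
    Object.defineProperty(window, 'ytInitialPlayerResponse', {
      configurable: true,
      get: () => actual,
      set(value) {
        if (value && value.adPlacements) {
          delete value.adPlacements; // strip the ad data before the page sees it
        }
        actual = value;
      },
    });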
Verified...this is very nifty indeed! Thank you for sharing, now I can finally remove my AdBlock extension as it kept polluting my dev console with browser-polyfill errors ;)
It was very slow before, but browsers have improved performance a lot, and multiple frameworks use proxies for their reactive primitives under the hood: Vue, MobX and so on. SolidJS is one of the fastest frameworks out there and is built mainly on proxies, so proxies themselves can't be that slow.
Granted, in a very, very tight loop a regular object get/set will be faster, but performance is more than good enough today. If you have a use case for it, use it.
It lets you intercept behaviors as they are requested of your object and do whatever you want to respond to them. Think of them as extensions of descriptors and you won't be far off (except they're designed to be a wrapper instead of methods on an existing object so that they can also be used for security, IIRC)
> In general, we want to use the Page Visibility API to stop unnecessary processes when the user doesn’t see the page or, on the other hand, to perform background actions.
Hahahaha no. That's going to be put to use for surveillance and watching when people switch back and forth to the page to maximize ad revenue. I can imagine Twitch using it to determine if the viewer is really looking at the page or if they have gone off to another window, and using that to gauge "engagement" or some other "monetization" metric.
I don't know about twitch but I work for a streaming company and we're using the page visibility API mainly to lower video quality when the user is not actively watching the corresponding tab for more than a minute.
This allows us to keep audio while lowering bandwidth, CPU, and memory usage.
So yes it may be used for nefarious purposes but it also provides very nice features in a media streaming case.
I have no doubt there are good uses. Of course, that kind of screws people who have say, two monitors, with the video open in one while they are in another window. You could certainly argue they are not actively watching the tab, but they could be watching a show while they work or chat with family.
Also, how long before the video streaming services use the api to adjust streamer compensation rates based on active/not active watchers?
As far as I know, it shouldn't be triggered when you lose focus due to not being the active monitor, though I cannot really test it right now as I've only one monitor at home.
> Also, how long before the video streaming services use the api to adjust streamer compensation rates based on active/not active watchers?
Good remark. It wouldn't surprise me if they already do this on such services, by the way, though I'm not at all familiar with stream compensation rules.
Out of curiosity, you made me want to check whether they collect it server-side on Twitch, and they do seem to regularly send some engagement-related metrics (beyond what I would assume is useful for monitoring whether the player is doing a good job, such as bitrate and frame-drop-related matters). For example, a base64-encoded JSON with properties such as "minutes_logged" (which I guess are the minutes since I logged in), a "chat_visible" boolean and, more interestingly here, "time_spent_hidden" seems to be POSTed at intervals. That whole object is also conveniently associated with an "event" name called "minute-watched".
What's strange, though, is that the URL makes it look like they're requesting a usual ".ts" media segment, even though it is a POST request, it returns an HTTP 204 No Content and, more importantly, the request is not performed by their usual media player script but by another mysterious p.js script, which makes it seem that this is not actually loading a media segment at all.
Maybe this camouflage is there to prevent people from messing with it, as wrong data would probably mess with their internal logic, but I guess it indicates that they do collect that data on their servers, though I don't know what they do with it.
Moreover, they already have features to influence users so they stay active on the tab (for example the bonus points you win by clicking regularly on the treasure chest on the bottom of the chat), so that's not that surprising that they monitor this.
Some protips on Intl APIs (to simplify the examples in the article):
1) They default to the user's locale as set in their browser settings, and you don't ever need to look up navigator.language yourself.
The trick to passing an options object to something like Intl.DateTimeFormat, where the options object comes after the locale name, is to pass `undefined` as the locale to opt in to the default behavior. (I wish they had swapped the order in the constructors to make this easier/more obvious.)
In the article's case, you can save a line:
const dateFormatter = new Intl.DateTimeFormat(undefined, { timeZone: 'UTC' })
Also, constructing a formatter like this allows better reuse of the formatter with saved options (i.e., moving it up and out of the function defined in the article instead of building a new one on every function call).
2) There's one more important way to shrink this article's example down: many built-in JS types, including `Date` as all these examples show, now include a method that formats directly using the appropriate Intl formatter, called `toLocaleString`. In this case, Date.prototype.toLocaleString(locale, options) takes all the same options. You can replace the article's formatDate function entirely with toLocaleString:
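    // Something like this, with the same options the article passed to the formatter:
    const formatDate = (date) => date.toLocaleString(undefined, { timeZone: 'UTC' });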
For what it's worth, Page Visibility doesn't detect switching windows under the Sway Wayland compositor. It does detect switching tabs within the same window.
I wonder how many terabytes of storage would be saved by people realising these exist and not relying on thousands of tiny npm packages for their own implementations of what some of these APIs do.
Node itself needs to better enable some of these. Offhand, I am aware that much of Intl was still flaky in Node until much too recently, despite being rock solid in browsers for a lot longer. This is an area where competition from Deno seems to be helping.
Also, related to Intl replacing tons of code, Moment.js has this disclaimer up top of its README and npmjs.org listing today:
> Moment.js is a legacy project, now in maintenance mode.
But this is not big or bold enough, and it is amazing how many projects still rely on a legacy package like this, which is huge because it bundles its own Intl replacement, having predated Intl. (Please use Luxon or date-fns or Intl.DateTimeFormat directly in 2022 rather than Moment.js.)
There could be some NPM packages where the readme is “Don’t use this. This is an empty lib that only outputs this sentence. Use the XYZ api to do this instead, with docs found at: https:// …”
I cleared 4gb out of my npm package cache the other day. Never in any version of the history of software was I expecting to end up with gigabytes of javascript on my machine, but here we are.
Intl.Collator is a web UI superpower but apparently a well-kept secret: it gives you string sorting that takes into account local conventions for things like accented characters and even sorts strings containing numerical parts into natural rather than lexicographical order (that is, it correctly sorts “Item 1” before “Item 2” before “Item 10”).
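For example (the numeric option is what buys the natural sort):

    const collator = new Intl.Collator(undefined, { numeric: true });
    ['Item 10', 'Item 2', 'Item 1'].sort(collator.compare);
    // -> ['Item 1', 'Item 2', 'Item 10']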
Intl.DateTimeFormat is also really nicely done, because it limits the developer to "skeletons": you say what fields you want / need, and the browser will render what it has with at least those. Meaning the OS / user can override your whims if that's better or more readable. There's also formatRange which can provide localised range notations.
Although later on formatToParts was added which allows the developer to ignore the settings and really only use the bits they wanted anyway.
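Both styles side by side (a sketch):

    const fmt = new Intl.DateTimeFormat(undefined, {
      year: 'numeric', month: 'long', day: 'numeric', // the fields you need
    });
    fmt.format(new Date());        // the locale decides order and separators
    fmt.formatToParts(new Date()); // [{ type: 'month', value: ... }, ...] for manual assembly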
BroadcastChannel was the only thing new to me. I only knew the LocalStorage based hack (listening to the storage event and exchanging messages via the LocalStorage).
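For anyone else in the same boat, the replacement is pleasantly small (the channel name is arbitrary):

    const channel = new BroadcastChannel('app-events'); // arbitrary same-origin name
    channel.onmessage = (e) => console.log('got', e.data);
    // Delivered to every other same-origin tab/worker listening on
    // 'app-events', but not echoed back to this instance:
    channel.postMessage({ type: 'logout' });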
Test it with this website (not mine): https://testdrive-archive.azurewebsites.net/Performance/Page...
1. https://prioritylearn.com/teachers-canvas-tabs/