A new API for work during unload #10997
It is a bit unclear why a worklet would be used and not a (temporary) dedicated worker. Do we need SharedWorker?
A dedicated worker is pretty tightly tied to its owner document's lifetime. Are you thinking we could somehow loosen that, instead of using shared workers? I thought using shared workers would be easier, both from a spec and implementation point of view, since they are already separate from any single document's lifetime. But maybe that's not necessarily the case?
Colleagues and I are rather wary of the longish timeout service workers have today, and as I understand it, a big reason service workers are even kept alive is to reduce the cost of the ongoing navigation. This was discussed quite a bit as part of the
SharedWorker seems least bad for this use case [1], especially since MessagePorts can now generate close events, which provides symmetry to the SharedWorkerGlobalScope "connect" event, and this is conceivably something sites could already be doing.

1: In particular, I agree that a pagehide event with a waitUntil for a document would have terrifying lifetime implications. And a PageHideWorklet would be at odds with worklets currently not using tasks.
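The connect/close symmetry mentioned above can be made concrete outside a browser. Below is a minimal sketch of how a shared worker's script might track live clients and decide when its unload work should run; the `ClientTracker` class and its method names are illustrative, not any spec's API, and the real `connect`/`close` event wiring is shown only in comments so the sketch stays runnable in plain Node.

```javascript
// Sketch: how a SharedWorkerGlobalScope might track connected clients.
// In a real worker, add()/remove() would be driven by the "connect"
// event and by the MessagePort "close" event that now provides
// symmetry with it. Names here are illustrative.
class ClientTracker {
  constructor(onAllClosed) {
    this.ports = new Set();
    this.onAllClosed = onAllClosed;
  }
  // Real wiring: self.addEventListener("connect", e => tracker.add(e.ports[0]));
  add(port) {
    this.ports.add(port);
    // Real wiring: port.addEventListener("close", () => this.remove(port));
  }
  remove(port) {
    this.ports.delete(port);
    if (this.ports.size === 0) {
      // All documents are gone; flush pending unload work now, within
      // whatever extended lifetime the browser grants.
      this.onAllClosed();
    }
  }
}

// Simulated usage, with plain objects standing in for MessagePorts:
let flushed = false;
const tracker = new ClientTracker(() => { flushed = true; });
const portA = {}, portB = {};
tracker.add(portA);
tracker.add(portB);
tracker.remove(portA); // one client left, no flush yet
tracker.remove(portB); // last client closed, flush runs
```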
Hmm, that
Understood. I tried to address this in the OP by suggesting that this new mode for shared workers would be subject to whatever implementation-defined limits a browser places on service workers today. Basically, we should not make this new mode any worse than service workers, as otherwise web sites will need to continue to use service workers for this use case.
Correct. They apply to related but separate use cases. But there are cases where neither
Let's tentatively call this API What do we do if multiple clients have mismatched values of
For our use case, any of these will work. We expect people to be using a specific shared worker for unloading purposes, and always calling with

(1), (3), and (4) are pretty easy to implement. (2) adds a bit more complexity, but is kind of theoretically nice in some ways. (4) might be simplest to start with, since it can evolve into any of the others later.

Edit: @yoshisatoyanagisawa reminded me that we error the worker if
Thanks @domenic for listing the possible behaviors on the option mismatches. I also came up with another corner case while checking the Chromium code.
Focusing on the use case mentioned in #10997 (comment), Option 2 might be enough. However, considering the case using the
Good catch. I agree (1) seems nicer and less wasteful. If there are implementation/architecture reasons why it is especially hard, then we could consider (2), since we don't have strong use cases for (1) behavior. But I would default to (1) if possible.
@domenic (4) above (exceptions if the lifetimes don't match) means that changing from non-extended to extended, or vice versa, is tricky and maybe impossible for sites that people keep open constantly (e.g. Gmail, Calendar, Facebook).
@fergald Can I ask you to elaborate more on the situation? I think the keys to look up SharedWorker are:
Then, I believe a new SharedWorker is unlikely to match the existing SharedWorkers, because it might have a different script URL.
@yoshisatoyanagisawa Changing the URL would mean that there can be a period where two different SharedWorkers are active. This could be a problem if the SharedWorker manages some global state and this state is also involved in the extended-lifetime task. I don't know what people do with SharedWorkers in reality, so maybe this is not a real concern.
@fergald I feel this is a general issue with updating a SharedWorker script, even without the extendedLifetime support. However, since matching does not do a byte-to-byte script comparison like ServiceWorkers, an update can currently only happen after all clients have gone, and the proposal breaks that. Let me go with this limitation (i.e. prevent mixing extendedLifetime status), and revisit based on real-world feedback from the OT.
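The matching rule this part of the thread converges on, option (4), can be sketched as a small pure function. Everything here is illustrative: `extendedLifetime` is the proposed option under discussion, and `matchOrCreate` plus its registry are stand-ins for the browser's internal SharedWorker matching, not a real API.

```javascript
// Sketch of option (4): a new SharedWorker() call only matches an
// existing worker if the proposed extendedLifetime option agrees;
// otherwise it errors rather than silently mixing lifetimes.
function matchOrCreate(registry, name, scriptURL, options = {}) {
  const wantExtended = Boolean(options.extendedLifetime);
  const key = `${name}|${scriptURL}`; // simplified matching key
  const existing = registry.get(key);
  if (existing) {
    if (existing.extendedLifetime !== wantExtended) {
      // Mismatched lifetimes are an error under option (4).
      throw new TypeError("extendedLifetime mismatch with existing SharedWorker");
    }
    return existing;
  }
  const worker = { name, scriptURL, extendedLifetime: wantExtended };
  registry.set(key, worker);
  return worker;
}

// Usage: two matching constructions share a worker; a mismatch throws.
const registry = new Map();
const w1 = matchOrCreate(registry, "sync", "/worker.js", { extendedLifetime: true });
const w2 = matchOrCreate(registry, "sync", "/worker.js", { extendedLifetime: true });
let mismatchError = null;
try {
  matchOrCreate(registry, "sync", "/worker.js", { extendedLifetime: false });
} catch (e) {
  mismatchError = e;
}
```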
What problem are you trying to solve?
It's a known issue that many sites want to perform some work during document unloading. This usually includes writing to storage, or sending information to servers. (Previous discussion: #963.)
Some of the simpler use cases are solved by APIs like `fetch(..., { keepalive: true })`, or by using synchronous storage APIs like `localStorage`. But the more complex cases, such as writing to async storage APIs like IndexedDB, or performing some sort of async operation before the HTTP request, require more work. (Examples of such pre-request async operations include using WebCrypto to hash/encrypt data, or using `CompressionStream` to compress it.)

What solutions exist today?
The current best solution the platform offers for this is service workers. The page sends messages to the service worker, which can then use its ability to run in the background to perform the appropriate actions. The message can carry along any data necessary to perform those actions, e.g., the not-yet-encrypted-or-compressed payload, or the user's unsaved changes that need to be written to IndexedDB.
However, requiring service workers is pretty heavyweight for this use case. Even if there are no fetch handlers, the disk space consumption of the service worker registration database means this is hard to deploy at scale. And it uses up a process's worth of memory: not only during unload time, when the work is being performed, but the entire time any document from that origin is open.
Additionally, the developer experience of service worker registration, installation, upgrading, client claiming, etc. is a lot to manage when the goal is just to run some background code after unload. For example, service workers require a separate same-origin service worker file to be hosted, instead of allowing creation from `blob:` URLs, which means that libraries for this functionality need to consist of two files, not just one. (And it gets worse if the library needs to integrate with a site's existing service worker script!)

How would you solve it?
We (myself, @fergald, @pmeenan) think there are possible new APIs which could allow sites to perform this kind of processing, but at a lower cost. Here are the ideas we've come up with:
1. A `PageHideWorklet`. This would be a special type of worklet which a document would register early in its lifetime. Then, during unload, right after firing the `pagehide` event, it gets spun up and runs its code. The document would need to synchronously set the data that the worklet plans to consume, either continuously throughout the document's lifetime, or in the `pagehide` handler (or both). But the worklet could run asynchronously for some amount of time (see below).

   On the implementation level, this could be done either with a separate process for the worklet, which gets spun up at unload time, or with an in-process worklet plus some code that keeps the unloading document's process alive, even while stopping its event loop and freeing up most of its resources.
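Since `PageHideWorklet` is only an idea, the data-handoff ordering it implies can be modeled with a toy class. Nothing below is a real API; `FakePageHideWorklet`, `setData`, and `firePageHide` are invented names that just make concrete how the document deposits data synchronously and the worklet later runs against the last snapshot.

```javascript
// Toy model of the hypothetical PageHideWorklet flow: the document
// synchronously deposits data during its lifetime, and at unload the
// worklet body runs against the last snapshot it saw.
class FakePageHideWorklet {
  constructor(run) {
    this.run = run;       // the worklet's code
    this.snapshot = null; // data set synchronously by the document
  }
  setData(data) {
    // Called throughout the document's lifetime, or in a pagehide handler.
    this.snapshot = structuredClone(data);
  }
  firePageHide() {
    // What the browser would do right after firing pagehide.
    return this.run(this.snapshot);
  }
}

// Usage: later setData() calls overwrite the snapshot; only the final
// snapshot is seen at unload.
const written = [];
const worklet = new FakePageHideWorklet(data => {
  written.push(data); // stand-in for async IndexedDB writes, fetches, etc.
});
worklet.setData({ draft: "v1" });
worklet.setData({ draft: "v2" });
worklet.firePageHide();
```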
2. A minor extension to `SharedWorker`. `SharedWorker`s are already reasonable tools for this: they don't have heavyweight registrations and persistent state like service workers, and they are allowed per spec to stay alive after document unloading for some time.

   In theory, this could involve no spec changes, just implementation changes to allow shared workers to stay alive for longer. In practice, it would probably be better to include a hint at construction time that this `SharedWorker` is intended to perform post-unload processing, and so the browser should keep it alive for a longer time. Something like `new SharedWorker(url, { usageHint: "after-pagehide-processing" })`.

We're currently leaning toward (2), as it seems like a simple extension of what exists today.
With regards to privacy/security, the intent here would be to be exactly as powerful as service workers are today. Today there are implementation-defined limits on how long service workers stay alive after all of the documents from their origin are closed, and different browsers have made different choices for them. (And I believe some have discussed changing these limits over time, or in reaction to other signals.) We would allow these mechanisms to operate for those same implementation-defined amounts of time.
Anything else?
We also considered extendable `pagehide` event handlers, but we don't like them very much.
The idea: inside a `pagehide` event, `event.waitUntil(promise)` would allow you to extend the document's lifetime and continue running JavaScript while the document unloads. This would be up to some implementation-defined limit (per above).

In parallel, the user agent would be loading the new document. This would not block loading the new document in any way: even if the new document completely finishes before the promise from the old document settles, we could visibly swap in the new document while the old document continues running. It is sort of like keeping the old document in bfcache, except JavaScript continues to run. Chrome already has this sort of document state for ~3 seconds for some cross-process navigations, and I believe other implementations might as well.
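The `waitUntil` mechanics described above can be modeled with a toy event class, similar in spirit to the service worker `ExtendableEvent`. `ExtendablePageHideEvent` is an invented name for this rejected idea; the sketch only shows how promises would be collected to hold the hidden document alive until they settle or the implementation-defined deadline passes.

```javascript
// Toy model of the rejected extendable-pagehide idea: an event whose
// waitUntil() collects promises that keep the already-hidden document
// alive until they settle or a deadline passes. Not a real API.
class ExtendablePageHideEvent {
  constructor() {
    this.pending = [];
  }
  waitUntil(promise) {
    // Non-promise values are wrapped, as with real ExtendableEvents.
    this.pending.push(Promise.resolve(promise));
  }
  allSettled() {
    // The browser would race this against its implementation-defined
    // deadline before tearing the document down.
    return Promise.allSettled(this.pending);
  }
}

// Usage inside a hypothetical handler: each waitUntil() extends the
// document's lifetime a bit further.
const event = new ExtendablePageHideEvent();
event.waitUntil(Promise.resolve("flush-storage"));
event.waitUntil("send-beacon");
```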
This is probably the most convenient option for web developers, as they can colocate all their code into the `pagehide` handler. But keeping unloaded documents alive in that way, even with opt-in, is scary. And the memory consumed by the document, with all the DOM and JS objects it has built up throughout its lifetime, is probably quite high.