Work needed on the "Start session algorithm"

There are multiple issues in the start session algorithm (https://webaudio.github.io/web-speech-api/#start-session-algorithm) that make it hard to implement.

Quoting my colleagues @pehrsons and @jan-ivar (copying / paraphrasing their comments):

@pehrsons says:

> "not-allowed" is the error fired if permission code says "denied". gUM has several paths leading to `NotAllowedError` while the one explicitly covered by webspeech is... just one. E.g., see https://w3c.github.io/mediacapture-main/ and search for Permission Failure.
> Given the [start session algorithm](https://webaudio.github.io/web-speech-api/#start-session-algorithm) it seems to me that if the microphone request fails with anything other than an explicit user rejection, the app is left hanging, as there's no "start" event (once the system is successfully listening to the recognition) nor an "error" event.

@jan-ivar answers:

> I agree. The SpeechRecognition spec is confusing the permission subsystem, which just returns a state without obtaining any microphone resources at all, for the media subsystem which it never calls, leaving questions unanswered.
> Questions like which microphone to open (the default)? What if the user has no microphone (NotFoundError)? What to do in case of hardware error (NotReadableError)? Are microphone [privacy indicators](https://w3c.github.io/mediacapture-main/#privacy-indicator-requirements) shown? For how long? What determines the lifetime of capture? No algorithm in the spec appears to fire the [audiostart](https://webaudio.github.io/web-speech-api/#eventdef-speechrecognition-audiostart) event.

First thing we need to do: we need to handle all failure cases when calling into `getUserMedia`.
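
To make the gap concrete, here is a minimal sketch of what handling every `getUserMedia` failure could look like. The mapping itself is an assumption for illustration and is not taken from either spec; `toSpeechErrorCode` is a hypothetical helper name. The exception names are ones `getUserMedia` can reject with, and the three codes are existing `SpeechRecognitionErrorCode` values.

```typescript
// Subset of SpeechRecognitionErrorCode values used by this sketch.
type SpeechErrorCode = "not-allowed" | "audio-capture" | "aborted";

// Hypothetical mapping from a getUserMedia rejection name to an error code.
// The spec currently only covers the explicit "denied" path; every other
// branch below is an assumption about what a fix might choose.
function toSpeechErrorCode(domExceptionName: string): SpeechErrorCode {
  switch (domExceptionName) {
    case "NotAllowedError":
    case "SecurityError":
      return "not-allowed";
    case "AbortError":
      return "aborted";
    case "NotFoundError":        // no microphone present
    case "NotReadableError":     // hardware or OS-level failure
    case "OverconstrainedError": // no device satisfies the constraints
      return "audio-capture";
    default:
      // Assumed fallback so the app always receives some "error" event.
      return "audio-capture";
  }
}
```

The point of the default branch is that no failure path leaves the page without either a "start" or an "error" event.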

> Also I wonder whether it's ok in the spec to write the start session algorithm steps as if they're sync ("MUST run the following steps") but in reality it's async (the request permission to use algorithm is inherently async since it involves prompting). All other specs I looked at (mediacapture, geolocation, notifications) do it in parallel.

Second thing we need to do: we need to call into `getUserMedia` "in parallel".
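
As a rough illustration of the "in parallel" shape, the sketch below returns to the caller immediately and only dispatches "start" or "error" once the microphone request settles. All names here (`startSession`, `acquireMicrophone`, `dispatch`, `SessionEvent`) are hypothetical and do not appear in the Web Speech API; `acquireMicrophone` stands in for whatever media-subsystem call the spec ends up using.

```typescript
// Hypothetical event shape for this sketch only.
type SessionEvent =
  | { type: "start" }
  | { type: "error"; reason: string };

function startSession(
  acquireMicrophone: () => Promise<void>,
  dispatch: (event: SessionEvent) => void,
): Promise<void> {
  // Nothing is dispatched synchronously: promise reactions run as
  // microtasks, after the caller's current script has finished.
  return acquireMicrophone().then(
    () => dispatch({ type: "start" }),
    (reason: unknown) => {
      // Every rejection, not only a user denial, ends in an "error" event.
      const name = reason instanceof Error ? reason.name : "UnknownError";
      dispatch({ type: "error", reason: name });
    },
  );
}
```

The returned promise exists only so the sketch can be observed in tests; an actual `start()` would return nothing and communicate solely through events.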

Finally, @jan-ivar says:

> Also questions like: should it expose devices in enumerateDevices()? The spec says nothing of this.

I just tested, and Chrome does seem to grant access to devices, but I don't see why this is the case.
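
One way to check for this kind of exposure, sketched under assumptions: before any capture permission is granted, `enumerateDevices()` returns entries with empty labels, so non-empty microphone labels after starting recognition suggest that device access was granted. `DeviceLike` and `exposesMicrophoneLabels` are hypothetical names, and this is not necessarily how the test above was done.

```typescript
// Minimal structural type so the helper also runs outside a browser;
// MediaDeviceInfo objects have these two fields.
interface DeviceLike {
  kind: string;
  label: string;
}

// Returns true if any audio input device carries a non-empty label.
function exposesMicrophoneLabels(devices: DeviceLike[]): boolean {
  return devices.some((d) => d.kind === "audioinput" && d.label !== "");
}
```

In a browser, the result of `await navigator.mediaDevices.enumerateDevices()` could be passed to this helper before and after calling `SpeechRecognition.start()` to compare the two states.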

Third thing we need to do: either Chrome changes this behaviour (which is detrimental to user privacy, because it significantly increases the fingerprinting surface), or we spec it. I would say the former.

Work needed on the "Start session algorithm" #175
